URBAN ECONOMICS AND FISCAL POLICY
HOLGER SIEG
PRINCETON UNIVERSITY PRESS Princeton and Oxford
Copyright © 2020 by Princeton University Press
Requests for permission to reproduce material from this work should be sent to [email protected]
Published by Princeton University Press
41 William Street, Princeton, New Jersey 08540
6 Oxford Street, Woodstock, Oxfordshire OX20 1TR
press.princeton.edu
All Rights Reserved
Library of Congress Cataloging-in-Publication Data
Names: Sieg, Holger, 1966- author.
Title: Urban economics and fiscal policy / Holger Sieg.
Description: Princeton : Princeton University Press, 2020. | Includes bibliographical references and index.
Identifiers: LCCN 2020001418 (print) | LCCN 2020001419 (ebook) | ISBN 9780691190846 (hardback) | ISBN 9780691199979 (ebook)
Subjects: LCSH: Urban economics. | Metropolitan government. | Fiscal policy.
Classification: LCC HT321.S544 2020 (print) | LCC HT321 (ebook) | DDC 330.9173/2-dc23
LC record available at https://lccn.loc.gov/2020001418
LC ebook record available at https://lccn.loc.gov/2020001419
British Library Cataloging-in-Publication Data is available
Editorial: Joe Jackson and Jacqueline Delaney
Production Editorial: Natalie Baan
Text and Jacket Design: Lorraine Donekier
Production: Erin Suydam
Copyeditor: Jennifer McClain
Jacket art: Shutterstock
This book has been composed in Palatino for text and DinPro for display
Printed on acid-free paper. ∞
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
CONTENTS

Preface
Acknowledgments

1. Introduction
1.1 Why Cities?
1.2 New York City versus the United States of America
1.3 The City as a Public Sector Corporation
1.4 An International Perspective
1.5 Moving Forward
1.6 Problem Sets

I. The Economic Rationale of Cities

2. Agglomeration, Productivity, and Trade
2.1 Motivation
2.2 Economic Rationales for Geographic Concentration
    2.2.1 Transportation, Commuting, and Communication Costs
    2.2.2 Economies of Scale and Scope
    2.2.3 Knowledge Spillovers and Agglomeration Externalities
2.3 Modeling Agglomeration Externalities
2.4 Transportation Costs and Trade
2.5 Measuring the Impact of Agglomeration Externalities on Firm Productivity
2.6 The Local Nature of Agglomeration Externalities
2.7 The Effects of Government Regulations
2.8 A Case Study: The Impact of Brexit on the City of London
2.9 Conclusions
2.10 Technical Appendix: Cost Functions
    2.10.1 Cost-Efficient Production and Competition
    2.10.2 Multiple Input Factors
2.11 Debate: Amazon's Second HQ
2.12 Problem Sets

3. Fiscal Federalism and Decentralization
3.1 Motivation
3.2 A Brief History of Legal Doctrines of Federalism in the US
3.3 Fiscal Federalism
3.4 Centralization
3.5 The Budget of the Federal Government
3.6 A Case Study: Federal Flood Insurance
3.7 Decentralization
3.8 Heterogeneity in Preferences over Policies
    3.8.1 Testing the Central Tenet of Economic Federalism
    3.8.2 Heterogeneity in Education Policies among US States
    3.8.3 Heterogeneity in Fiscal Policies among the Largest US Cities
3.9 Social Learning and Experimentation
3.10 A Case Study: Special Economic Zones
3.11 Conclusions
3.12 Technical Appendix: Deriving Optimal Policies
3.13 Debate: Flood Insurance
3.14 Problem Sets

II. Efficient and Voluntary Provision of Public Goods in Cities

4. Efficient Provision of Local Public Goods and Services
4.1 Motivation
4.2 Efficient Public Good Provision
    4.2.1 Defining Local Public Goods
    4.2.2 Pure Public Goods
    4.2.3 Congestion
4.3 Implementation and Mechanism Design
    4.3.1 The Lindahl Mechanism
    4.3.2 The Vickrey-Clarke-Groves Mechanism
4.4 Stated Preferences over Public Good Provision
4.5 Practical Implementation: Optimal Class Size
4.6 A Case Study: Berlin Brandenburg Airport
4.7 Conclusions
4.8 Technical Appendix A: Deriving the Optimality Conditions for the Baseline Model
4.9 Technical Appendix B: Deriving the Optimality Conditions for the Model with Congestion
4.10 Technical Appendix C: Applying the VCG Mechanism
    4.10.1 The Planner's Problem
    4.10.2 Side Payments
    4.10.3 Taxes
    4.10.4 Incentives to Tell the Truth
4.11 Debate: Efficient Cities
4.12 Problem Sets

5. Voluntary Provision of Local Public Goods and Services
5.1 Motivation
5.2 A Model of Voluntary Provision of Public Goods
5.3 Empirical Evidence of Crowd-Out
5.4 Warm Glow and Private Benefits
5.5 Empirical Evidence: Private Benefits versus Warm Glow
5.6 Tax Incentives and Matching
5.7 Conclusions
5.8 Technical Appendix: Deriving the Nash Equilibrium
5.9 Debate: Tax Deductibility of Charitable Donations
5.10 Problem Sets

III. Political Economy of State and Local Governments

6. Local Political Institutions in the US
6.1 Motivation
6.2 A Brief History of Local Governments in the US
6.3 Characterizing Municipal Governments in the US
6.4 Legal Foundations of Municipal Governments
6.5 Direct Democracy
    6.5.1 Initiative, Referendum, and Recall
    6.5.2 A Case Study: Fracking in Athens, Ohio
6.6 Representative Democracy
    6.6.1 Local Forms of Government
    6.6.2 Partisan versus Nonpartisan Elections
    6.6.3 A Case Study: Electoral Reform in Asheville, NC
    6.6.4 Term Limits
6.7 Conclusions
6.8 Debate: Mayor-Council versus Council-Manager
6.9 Problem Sets

7. Voting over Local Public Good Provision
7.1 Motivation
7.2 Majority Rule in a Direct Democracy
    7.2.1 The Median Voter Theorem
    7.2.2 Sequential Voting
    7.2.3 Vote Buying, Vote Trading, and Log Rolling
7.3 Representative Democracy
7.4 Is the Median-Income Voter Decisive? Empirical Evidence
7.5 Discussion
    7.5.1 Dimensionality of the Policy Space
    7.5.2 Ideology and Competence
    7.5.3 Accountability and Competence
    7.5.4 Voter Turnout
7.6 A Case Study: The Role of Money in State and Local Politics
7.7 Conclusions
7.8 Technical Appendix: The Public Good Provision Problem
7.9 Debate: City Politics
7.10 Problem Sets

8. Household Mobility and Fiscal Competition
8.1 Motivation
8.2 Sorting and Competition among Municipalities
8.3 Capitalization: Empirical Evidence
8.4 Competition and Efficiency: Empirical Evidence
8.5 A Case Study: The Benefits of Consolidation
8.6 Income Stratification and Voting
8.7 Segregation and Sorting by Race
8.8 Conclusions
8.9 Technical Appendix: Optimal Locational Choices
    8.9.1 Modeling Fiscal Competition
    8.9.2 Imperfect Sorting by Income
8.10 Debate: City-County Merger
8.11 Problem Sets

9. Spillovers, Fiscal Inequality, and Intergovernmental Transfers
9.1 Motivation
9.2 Heterogeneity in Intergovernmental Transfers among the Largest US Cities
9.3 Fiscal Spillover Effects
9.4 Inequality and Fairness
9.5 Different Types of Intergovernmental Grants
9.6 Poverty and Intergovernmental Transfers
9.7 Conclusions
9.8 Technical Appendix: Solving the Model with Spillovers
9.9 Debate: School Finance Equalization Laws
9.10 Problem Sets

10. Rent-Seeking Behavior
10.1 Motivation
10.2 Modeling Rent-Seeking Behavior
10.3 Empirical Evidence
    10.3.1 State Bond Ratings
    10.3.2 Corruption and Accountability: Evidence from Brazil
10.4 A Case Study: Procurement Auctions in Puerto Rico
10.5 Conclusions
10.6 Technical Appendix: Computing the Equilibrium of the All-Pay Auction
10.7 Debate: Term Limits
10.8 Problem Sets

11. Labor Relations and Collective Bargaining
11.1 Motivation
11.2 Employer and Union Rights and Obligations
11.3 The Theory of Bargaining and Negotiations
    11.3.1 A Bargaining Model
    11.3.2 Employment and Wage Negotiations
    11.3.3 Discussion
11.4 Components of Municipal Labor Policy
    11.4.1 Wages and Salaries
    11.4.2 Employment and Work Rules
    11.4.3 Benefits and Pension Funding
11.5 Funding of Pension Plans
11.6 A Case Study: Collective Bargaining in Philadelphia
11.7 Conclusions
11.8 Technical Appendix: A Bargaining Model
11.9 Debate: Pay-as-You-Go
11.10 Problem Sets

IV. The Determination of City Taxes

12. Property Taxation
12.1 Motivation
12.2 Justifications of the Property Tax
    12.2.1 The Property Tax as a Benefit Tax
    12.2.2 The Property Tax as a Tax on Capital
    12.2.3 The Impact of Property Taxes on Renters
    12.2.4 Fairness
    12.2.5 Other Administrative Advantages
12.3 Property Tax Compliance
    12.3.1 Some Evidence
    12.3.2 Modeling Property Tax Compliance Behavior
    12.3.3 The Effectiveness of Nudge Strategies
12.4 Property Tax Limitations and Commercial Property Tax Exemptions
12.5 Alternatives to the Property Tax
    12.5.1 Is a Land Tax a Better Alternative?
    12.5.2 A Case Study: The Soda Tax in Philadelphia
12.6 Conclusions
12.7 Debate: Property versus Income Taxation
12.8 Problem Sets

13. Business Taxation and Economic Development
13.1 Motivation
13.2 Empirical Evidence on Firm Sorting
13.3 A Model of Firm Location Choices
13.4 Tax Policy and Firm Location
13.5 Business Taxation in Practice
13.6 Tax Increment Financing and Community Development
13.7 A Case Study: The Relocation of UBS
13.8 Conclusions
13.9 Technical Appendix: The Derivation of the Profit Function
13.10 Debate: Reforming Business Taxation
13.11 Problem Sets

V. The Practice of Urban Fiscal Policies

14. Municipal Budgeting and Planning
14.1 Motivation
14.2 Priorities and Mission Statement
14.3 Operating Budget
    14.3.1 Revenues
    14.3.2 Expenses
    14.3.3 Flexible Budgets
14.4 Capital Budget
14.5 Forecasting
    14.5.1 Revenue Forecasting
    14.5.2 Cost and Expenditure Forecasting
14.6 Benefit-Cost Analysis
14.7 Conclusions
14.8 Debate: Hosting the Super Bowl
14.9 Problem Sets

15. Fiscal Policies and Fiscal Crisis
15.1 Motivation
15.2 Data
15.3 Expenditure Policies
15.4 Revenue Policies
15.5 Common Policy Mistakes
    15.5.1 Labor Policies
    15.5.2 Redistribution and Tax Policies
    15.5.3 Economic Development Policies
15.6 Municipal Bankruptcies
15.7 A Case Study: The Detroit Bankruptcy
15.8 A Case Study: The Fiscal Crisis and Recovery of Pittsburgh
15.9 Conclusions
15.10 Debate: Fiscal Policies in NYC under Mayor de Blasio
15.11 Problem Sets

16. Debt Finance and Municipal Bond Markets
16.1 Motivation
16.2 Key Players in the Municipal Bond Markets
16.3 Bond Characteristics
16.4 New York City's Participation in Municipal Bond Markets
    16.4.1 Data
    16.4.2 Volume and Amount Issued over Time
    16.4.3 Municipal Bond Ratings over Time
16.5 A Case Study: Build America Bonds
16.6 Conclusions
16.7 Debate: Tax Exemption of Municipal Bonds
16.8 Problem Sets

VI. Managing Urban Challenges

17. Urban Poverty
17.1 Motivation
17.2 Defining Poverty
17.3 Differences in Poverty by Race
17.4 Human Capital and Poverty
17.5 How Do Welfare Programs Change the Budget Set?
17.6 The Adverse Incentive Effects of Welfare Programs
17.7 Reforming Welfare Programs
    17.7.1 Work Incentives
    17.7.2 Welfare Limits and TANF
    17.7.3 Extending the Earned Income Tax Credit
    17.7.4 Work Requirements for Public Housing
17.8 Discrimination and Affirmative Action Programs
17.9 Place-Based Policies
17.10 Conclusions
17.11 Technical Appendix: Incentive Effects
17.12 Debate: Time Limits and Work Requirements
17.13 Problem Sets

18. The Provision of Education in Urban School Districts
18.1 Motivation
18.2 Some Facts about Urban School Districts
18.3 Education as Human Capital Investment
18.4 How Large Are the Returns to Human Capital?
18.5 Public Provision of Primary and Secondary Education
18.6 Two Case Studies: Philadelphia and Pittsburgh
18.7 Early Childhood Education
18.8 Reforming Urban Primary and Secondary Schools
    18.8.1 Accountability and No Child Left Behind
    18.8.2 Classroom Size Reductions
    18.8.3 Magnet Schools
    18.8.4 Charter Schools
    18.8.5 Teacher Quality and Compensation
    18.8.6 Vouchers and Private Schools
18.9 Conclusions
18.10 Debate: Pay for Performance for Teachers
18.11 Problem Sets

19. Crime and Public Safety
19.1 Motivation
19.2 An Economic Model of Crime
19.3 Policy Implications
19.4 The Economics of Organized Crime
19.5 Police Effectiveness
19.6 The Demand for Addictive Goods
19.7 A Case Study: The Prohibition Experience
19.8 Decriminalization of "Soft" Drugs
19.9 Conclusions
19.10 Technical Appendix: Optimal Gang Size
19.11 Debate: Legalizing Marijuana
19.12 Problem Sets

20. Urban Environmental Challenges
20.1 Motivation
20.2 Negative Production Externalities
20.3 Empirical Evidence on Measuring the Negative Effects of Air Pollution
20.4 Heterogeneity in Abatement Costs
20.5 A Case Study: The Clean Air Act
20.6 Regulation under Uncertainty
20.7 A Case Study: The Flint Water Crisis
20.8 The Impact of Global Warming on Cities
20.9 A Case Study: Redesigning Flood Zones in NYC
20.10 A Case Study: Rebuilding New Orleans after Hurricane Katrina
20.11 Conclusions
20.12 Technical Appendix: Solving the Model of Externalities
20.13 Debate: Lessons from Flint
20.14 Problem Sets

21. Managing Cities in Developing Countries
21.1 Motivation
21.2 The Origins of Power, Prosperity, and Poverty
21.3 Trust and Making Credible Commitments
21.4 The Consequences of Weak Institutions
    21.4.1 Local Corruption
    21.4.2 Lack of Local Fiscal Capacity
    21.4.3 Lack of Physical Capital
    21.4.4 Rural to Urban Migration
    21.4.5 A Case Study: The Hukou System
21.5 Natural Hazards
21.6 Conclusions
21.7 Technical Appendix: Solving the Optimal Taxation Problem
21.8 Debate: Transportation Infrastructure Investments in Jakarta
21.9 Problem Sets

VII. Urban Land, Housing, and Labor Markets

22. The Internal Structure of Cities
22.1 Motivation
22.2 Traffic Congestion in Large US Cities
22.3 Modeling Internal City Structure
22.4 The Price of Land in New York City
22.5 Modern Models of the Internal Structure of Cities
22.6 Transportation Networks and City Structure
22.7 A Case Study: Congestion Pricing in Singapore and New York City
22.8 A Case Study: The NYC Subway Crisis
22.9 Conclusions
22.10 Technical Appendix: Endogenous Land Use
22.11 Debate: Public Transportation Infrastructure
22.12 Problem Sets

23. Land and Housing Markets
23.1 Motivation
23.2 The Hedonic Model of Housing
23.3 Using Hedonic Models in Empirical Work
23.4 Housing Prices in LA
23.5 Land Prices in NYC
23.6 Housing Policies and Regulation
    23.6.1 Housing Supply Regulations
    23.6.2 Rent Control and Rent Stabilization
    23.6.3 Public Housing Policies
23.7 Conclusions
23.8 Technical Appendix: Computing the Equilibrium in the Hedonic Model
23.9 Debate: Zoning Policies in NYC
23.10 Problem Sets

24. Local Labor Markets
24.1 Motivation
24.2 The Urban Wage Premium
24.3 Modeling Differences in Local Labor Markets
    24.3.1 A Baseline Model
    24.3.2 Extensions
24.4 Theory and Measurement
24.5 Location-Based Policies
24.6 Conclusions
24.7 Debate: Attracting the "Creative Class"
24.8 Problem Sets

25. Homeownership, Mortgage Markets, and Default
25.1 Motivation
25.2 Measuring the Evolution of Housing Prices in a Market
25.3 The Moral Hazard of Renting and the Cost of Homeownership
25.4 A Case Study: Why Are Young Americans Not Buying Houses?
25.5 Mortgages and Default
25.6 Mortgage Markets
25.7 A Case Study: The Bailout of Fannie Mae and Freddie Mac
25.8 Mortgage Insurance
25.9 Conclusions
25.10 Debate: Subsidies for Homeownership
25.11 Problem Sets

26. Epilogue

Appendix: Some Useful Techniques in Empirical Microeconomics
A.1 Motivation
A.2 Correlation versus Causation
A.3 Probability Theory
    A.3.1 Random Variables
    A.3.2 Variance and Standard Deviation
    A.3.3 Multiple Random Variables
    A.3.4 Correlation
    A.3.5 Marginal and Joint Distributions
    A.3.6 Conditional Distributions
    A.3.7 Conditional Expectations
    A.3.8 Independence
    A.3.9 Some Useful Rules
A.4 Statistics
    A.4.1 Random Sampling, Estimation, and Inference
    A.4.2 Estimating Conditional Expectations
    A.4.3 Linear Regressions
    A.4.4 Instrumental Variables
    A.4.5 Panel Data and Difference-in-Difference Estimation
A.5 Causality and Social Experiments
    A.5.1 The Potential Outcome Model
    A.5.2 Average Treatment Effects
    A.5.3 An Example
    A.5.4 Selection Bias
    A.5.5 Randomized Experiments
A.6 The Potential Outcome Model and the Regression Model
A.7 Regression Discontinuity Design
A.8 Discrete Choice Fundamentals
A.9 An Application: Locational Choice Models
A.10 Problem Sets

References
Index
PREFACE
With over half of global output produced by approximately four hundred large metropolitan areas, studying the economics of cities is vital to understanding economic prosperity. Almost all successful societies are organized around cities, which are essential in the generation of new ideas and products. Urban economics is exciting because it provides us with the analytical and empirical tools to understand complicated issues that we encounter in our daily lives, including the provision of quality education, the access to affordable housing, and the protection from crime and natural hazards. The objective of this book is to introduce students to the field of urban economics and fiscal policy using a modern approach that provides an integrated treatment of theoretical and empirical analysis. Economic theory requires students to think rigorously and differently about challenging problems. Much of the recent progress in applied economics has been driven by improved data quality and improved methods for causal inference, which allow students to determine whether the theories fit the data well or whether new ideas and theories are required. New undergraduate textbooks are needed that reflect these advances in research. I have developed the material for this book based on my experience teaching Econ 237: Urban Fiscal Policy at the University of Pennsylvania since 2010.
Urban fiscal policies are key to understanding many salient economic issues that arise in cities. When it is necessary or desirable to go beyond the scope of the city, this book ventures into state and federal policies and their impacts on cities. Thus the book also contains some material typically covered in state and local public finance, public economics, and political economy. The content is suitable for a one- or two-semester course in urban economics and fiscal policy. The vast majority of the material covered in this book should be accessible to any student with a high school knowledge of mathematics and statistics, as well as a good working knowledge of the principles of economics. Knowledge of intermediate microeconomic theory and game theory is useful to understand some of the more difficult models that are discussed in more detail in technical appendixes. To understand modern economic analysis, students also need to be familiar with basic concepts in statistics and econometrics, material typically covered in the first two years of most economics undergraduate programs. Discussion of empirical research requires knowledge of concepts such as causality, endogeneity, and instrumental variables. These topics are covered in any basic econometrics textbook. I provide a detailed appendix at the end of the book, which gives an overview of useful tools in microeconometric analysis. Any rigorous quantitative course in urban economics may want to cover this material at the beginning of the course as a review or preview of relevant methods or cover the material over the course of the semester.
Each chapter provides material for at least one or two lectures and follows roughly the same pattern. It begins with the motivation for the topic and an exposition of key questions of interest. It then provides the stylized facts and institutional background that are essential for understanding the topic. The sections that follow formalize the main ideas using simple economic models. Here I typically use examples that rely on convenient functional form assumptions. In my experience, students learn more quickly when they face concrete examples with closed-form solutions. I try to avoid general theories that require burdensome notation and excessive abstraction. PhD-level courses are a more appropriate setting for such abstraction and generalization.
I provide a simplified exposition of the theoretical models in most chapters and rely on graphics to illustrate the main concepts. I also rely on a smaller number of key equations. The main sections of the book are accessible to students with a good knowledge of algebra and a basic knowledge of calculus. Most programs, regardless of the university, require one college-level mathematics class that would cover the concepts on which this book relies. I have placed most of the harder mathematical derivations, those that require a working knowledge of calculus or probability theory, in technical appendixes at the end of the chapters or the empirical methods appendix at the end of the book. Most undergraduate students at selective research universities and liberal arts colleges should be able to understand the material covered in these appendixes, with some proper guidance from the instructor.
Economic theory is useful for students because it forces them to think rigorously. It also provides a vital consistency check for the ideas. Moreover, it offers new qualitative insights into problems whose answers are not immediately obvious.
However, economic theory rarely (if ever) allows us to fully answer the questions of interest. Only empirical work can refute or validate the ideas, concepts, and effects the models predict. The next sections examine data and describe tests of the hypotheses that emerge from the models. They also explore ways to quantify the magnitudes of the main treatment effects. The focus is on either causal or predictive analysis. I typically focus on a small number of significant and influential empirical studies. Broad literature surveys are more suitable for handbook chapters that appeal to graduate students and established researchers. The empirical sections discuss the fundamental methods and challenges researchers encounter in identification and estimation. I then discuss the main findings of some key papers and conclusions that can be drawn from the analysis. The main empirical sections require some practical knowledge of regression analysis and statistics. Extended case studies supplement the empirical analysis.
The last sections of each chapter explore the policy implications of the analysis. What can we learn from the empirical studies about economic policy? The text focuses on ways to avoid common policy mistakes and to improve upon existing policies. The key challenge is to integrate the theoretical and empirical analyses and to derive the main policy implications from these insights.
This book is different from old-style textbooks in economics, which target an audience with minimal quantitative and mathematical skills. The quantitative approach in this book is by design, not by mistake. Economics as a science has evolved into a largely quantitative and empirical science. The demand for economists with quantitative skills, like modeling and empirical analysis, is expanding.
Undergraduate education must meet this challenge lest it become outdated and ultimately irrelevant. Many textbooks currently in use do not sufficiently prepare students for the demands of the twenty-first century, where modeling and analysis skills are required. This book should ease some of the burden; rather than having to develop a course completely from scratch to meet these needs, faculty can rely on this text. Slides are available to instructors upon request. The hope is that this book will enable economics departments (and junior faculty, in particular) to embrace a more quantitative approach to economics education.
The book is suitable for courses in urban economics, urban fiscal policy, and state and local public finance. A course in urban economics should cover chapters 1, 2, 4, 5, 8, 13, and 17-25. A course in local public finance should focus on the material in chapters 3-15. A course in urban fiscal policy could be built around chapters 1-4 and 7-18. Finally, the book can be used as a supplement for courses in policy analysis, public economics, or political economy.
The book's primary audience is undergraduate and master's students. It is suitable for a third- or fourth-year elective in a typical undergraduate economics curriculum. Alternatively, it can be used in a first- or second-year course in a master of public administration or master of economics program. It is my experience that PhD students have fairly limited backgrounds in urban economics when they enter doctoral programs; many majored in mathematics, statistics, or other STEM fields. The book can also be used as supplemental reading for PhD students to broaden their understanding of the applications of economics and introduce them to main research topics in urban economics.
ACKNOWLEDGMENTS
I could not have done this without the help of family, friends, and colleagues. Special thanks to my colleague Robert Inman, who shared his lecture notes for Urban Fiscal Policy when I started teaching this course at Penn in 2010. This book would not have been possible without his help, support, and deep insights into the material. I would also like to thank Dennis Epple for introducing me to research in local public finance and urban economics; my editor at Princeton University Press, Joe Jackson, for his unwavering help and support throughout the process; Jennifer McClain for excellent copyediting; Natalie Baan for managing the final stages of the production process; Elisabeth Widdicombe and Peter Dougherty for many helpful conversations about the publication process and the market for economics textbooks; Joseph M. Cohen for generously supporting my research and my endowed chair at the University of Pennsylvania; my parents for encouraging me to pursue my dreams; and always my wife, Carolyn, for invaluable help, support, and assistance during all stages of writing this book.
I received many useful comments from my colleagues in the profession who read various chapters of the book and discussed numerous topics with me. In particular, I would like to thank Gabriel Ahlfeldt, Spencer Banzhaf, Nate Baum-Snow, Jeff Brinkman, Stephen Coate, Daniele Coen-Pirani, Greg Crawford, Morris Davis, Gilles Duranton, Dennis Epple, Fernando Ferreira, Camilo Garcia-Jimeno, Judy Geyer, Stephan Heblich, James Heckman, John MacDonald, Enrico Moretti, Andrew Postlewaite, Stephen Redding, Richard Romano, Esteban Rossi-Hansberg, Norman Schurhoff, V. Kerry Smith, Will Strange, Koleman Strumpf, Anita Summers, Matt Turner, Rakesh Vohra, and Chamna Yoon. Special thanks to my undergraduate research assistants Antonio Canales, Leila Feldman, Nathan Jiang, Kristina Kulik, Ammar Plumber, Zach Rovner, and Aimee Stephenson for all the excellent work.
Special thanks to all my undergraduate students in Urban Fiscal Policy at the University of Pennsylvania, especially those who took the course in the spring of 2019 and helped me debug the final draft of the textbook. In particular, I would like to mention Matthew Coen, Jesscia Futoran, Justin Mignatti, and Jacob Rieber. Thanks a lot!
1. Introduction
1.1 Why Cities?
Economists typically pay little attention to geography. In most textbooks, it is implicitly understood that the relevant geographic unit of analysis is a country or nation. While countries are undoubtedly important, a strong argument can be made that the most important geographic unit of economic analysis is not the country, but the city or the metropolitan area. This view is motivated by the fact that urbanization and economic development are closely related. Urbanization refers to the population shift from rural to urban areas. As an economy develops, we observe that the proportion of people living in urban areas increases. This trend toward urbanization creates new challenges that societies must meet.
In 2011 more than half of the world population lived in cities and urban areas. The United Nations predicts that by 2050 about 64 percent of the developing world and 86 percent of the developed world will be urbanized. Figure 1.1 shows urbanization rates around the world in 2015. In most developed countries, these rates exceed 70 percent, while rates are less than 30 percent in most developing countries. Urbanization and economic development are, therefore, strongly positively correlated.
The importance of studying large cities or metropolitan areas becomes even more compelling when we take a look at the location of economic activity in most developed countries. Metropolitan statistical areas, or metro areas, are delineated in the US by the Office of Management and Budget, which estimated that more than 83 percent of all Americans lived in metro areas in 2010. Table 1.1 shows ten major metropolitan areas in the US. It ranks them according to their total gross domestic product (GDP), a commonly used measure of aggregate output, produced in 2016. There are not any real surprises at the top of the list. New York, Los Angeles, and Chicago are the most populous cities in the country. They also have the largest economic output.
Even smaller metropolitan areas in the US are large by international comparison. For example, the greater Atlanta metropolitan area has a GDP of $320 billion, bigger than Denmark's. The ten largest metro areas alone combine for 34 percent of the country's total GDP despite the fact that they account
FIGURE 1.1. Urbanization rates around the world. Map legend: 0%-20%, 20%-40%, 40%-60%, 60%-80%, 80%-100%. (Akantamn/Wikimedia Commons)
TABLE 1.1. Top Ten US Metropolitan Areas Ranked According to GDP
Rank  Metropolitan Area                              GDP       Population
#1    New York-Newark-Jersey City, NY-NJ-PA          $1,430    20.1
#2    Los Angeles-Long Beach-Anaheim, CA             $885      13.3
#3    Chicago-Naperville-Elgin, IL-IN-WI             $569      9.5
#4    Dallas-Fort Worth-Arlington, TX                $471      7.2
#5    Washington-Arlington-Alexandria, DC-VA-MD-WV   $449      6.1
#6    Houston-The Woodlands-Sugar Land, TX           $442      6.7
#7    San Francisco-Oakland-Hayward, CA              $406      4.7
#8    Philadelphia-Camden-Wilmington, PA-NJ-DE-MD    $381      6.1
#9    Boston-Cambridge-Newton, MA-NH                 $371      4.8
#10   Atlanta-Sandy Springs-Roswell, GA              $320      5.8
      Top 10 Metropolitan Areas                      $5,700    84.3
      USA                                            $16,800   323.4
Source: US Department of Commerce. Note: 2016 GDP is measured in billions of dollars at 2009 prices. Population is measured in millions.
for only 26 percent of total population. GDP per capita, a commonly used measure of output per person, is approximately $68,000 in the ten largest metropolitan areas. Outside these metropolitan areas it falls to approximately $47,000. The main takeaway from this table is that large cities are the engines of prosperity in the US. That is not to say that rural areas and small towns are unimportant for the economy. Urban areas are, however, the most essential places in terms of total output. Ask yourself, how many of your friends, who will graduate with you from college, will decide to live in one of those ten areas? Where are you likely to work and live for the next five or ten years?
Of course, some important economic activities occur outside of cities. Examples are agriculture, mining, forestry, most heavy manufacturing, and recreation. As the economies of most developed and some developing countries have shifted toward technology and services, most of the important higher-value-added industries tend to be located in and around large metropolitan areas.
Another way to measure the importance of metropolitan areas is to analyze labor productivity, measured as output per unit of labor. Parilla and Muro (2017) conducted a systematic analysis of the differences in labor productivity among US metropolitan areas. They found that average labor productivity is more than 20 percent larger in large metro areas with employment exceeding 2,000,000 than in small- and medium-sized metropolitan areas with employment less than 1,000,000. Hence larger cities tend to either be more productive or attract more productive individuals than smaller cities.
To appreciate how much output can be produced in a very small area, it is useful to consider Midtown Manhattan. According to Glaeser (2011), the five zip codes that occupy a single mile between 41st and 59th Streets in Manhattan employ 600,000 workers, which is more than the number of workers in New Hampshire or Maine.
The average earnings were more than $100,000 in 2010, giving that piece of real estate a larger annual payroll than the entire state of Oregon or Nevada.
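The shares and per capita figures quoted from table 1.1 can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it uses the rounded totals printed in the table, and the variable names are our own. With these rounded inputs, output per person outside the ten largest metro areas comes out near $46,000, a little below the approximate figure quoted in the text; the gap presumably reflects rounding in the published totals.

```python
# Totals copied from Table 1.1: GDP in billions of 2009 dollars,
# population in millions. Variable names are illustrative.
TOP10_GDP, TOP10_POP = 5_700, 84.3
US_GDP, US_POP = 16_800, 323.4

gdp_share = TOP10_GDP / US_GDP
pop_share = TOP10_POP / US_POP

# Billions of dollars divided by millions of people gives
# thousands of dollars per person, hence the factor of 1,000.
top10_per_capita = TOP10_GDP / TOP10_POP * 1_000
rest_per_capita = (US_GDP - TOP10_GDP) / (US_POP - TOP10_POP) * 1_000

# Midtown Manhattan (Glaeser, 2011): a lower bound on the annual payroll,
# because average earnings are reported only as "more than $100,000".
midtown_payroll_floor = 600_000 * 100_000

print(f"Top-10 share of US GDP:        {gdp_share:.1%}")
print(f"Top-10 share of US population: {pop_share:.1%}")
print(f"GDP per capita, top 10:        ${top10_per_capita:,.0f}")
print(f"GDP per capita, rest of US:    ${rest_per_capita:,.0f}")
print(f"Midtown payroll, lower bound:  ${midtown_payroll_floor:,}")
```

The same two-line pattern (a share, then a per capita ratio) applies to any row of the table, for example to compare a single metro area with the national average.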
TABLE 1.2. The Population of NYC and the US
        NYC                      US
Year    Population   Change     Population     Change
1920    5,620,048    17.9%      106,021,568    15.0%
1930    6,930,446    23.3%      123,202,660    16.2%
1940    7,454,995    7.6%       132,165,129    7.3%
1950    7,891,957    5.9%       151,325,798    14.5%
1960    7,781,984    -1.4%      179,323,175    18.5%
1970    7,894,862    1.5%       203,211,926    13.3%
1980    7,071,639    -10.4%     226,545,805    11.5%
1990    7,322,564    3.5%       248,709,873    9.8%
2000    8,008,288    9.4%       281,421,906    13.2%
2010    8,175,133    2.1%       308,745,538    9.7%
2017    8,550,405    4.6%       325,365,189    5.4%
Source: US Census Bureau.
Since economic development is closely related to urbanization, economists should pay close attention to the role that cities play in the economy. From a practical perspective, it seems obvious that we need to make cities work. The objective of this book is to study the role of cities in the economy and the problem of effectively organizing economic activities within cities and metropolitan areas.
1.2 New York City versus the United States of America
To illustrate the importance of cities, let's compare the US with its largest city, New York City (NYC). One measure of economic activity is the size of an economy, measured by its population. Table 1.2 reports the population for NYC and the US by decade over the past century. It may surprise you to learn that a hundred years ago, the relative economic importance of NYC was even larger than it is today. Approximately one out of twenty Americans lived in NYC in 1920. The growth of NYC outpaced the growth of the US until the 1930s, as NYC was the hub for European immigration into the United States. After the end of World War II, the economic dominance of NYC started to decline. In the 1950s NYC was still the most important hub of the manufacturing industry in the US. However, since the 1950s most large cities in the US have undergone a transformation from a manufacturing-oriented economy to a service sector and technology-based economy. As discussed in detail in Glaeser (2011), the number of manufacturing jobs in NYC declined from 1,082,188 in 1950 to 146,291 in 2000. Employment in manufacturing accounted for 36 percent of all jobs in 1950, but only 5.3 percent in 2000. The decline of the manufacturing industry was a large economic shock for NYC and almost all other large cities in the US. As a consequence of this economic shock, population growth in NYC started to slow down and lagged behind the rest of the country. Moreover, since the 1960s most immigrants to the United States have no longer come from Europe. Instead, a large number of immigrants to the US now come from Latin America and Asia. Hence NYC has not been a natural entry point for these new immigrants.
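The growth rates in table 1.2 and the claim that roughly one in twenty Americans lived in NYC in 1920 follow directly from the population counts. A minimal sketch, assuming only the figures printed in the table (the helper `pct_change` is our own name, not something from the text):

```python
# Population counts copied from Table 1.2.
nyc = {1920: 5_620_048, 1930: 6_930_446, 1940: 7_454_995, 1950: 7_891_957,
       1960: 7_781_984, 1970: 7_894_862, 1980: 7_071_639, 1990: 7_322_564,
       2000: 8_008_288, 2010: 8_175_133, 2017: 8_550_405}
us = {1920: 106_021_568, 1930: 123_202_660, 1940: 132_165_129,
      1950: 151_325_798, 1960: 179_323_175, 1970: 203_211_926,
      1980: 226_545_805, 1990: 248_709_873, 2000: 281_421_906,
      2010: 308_745_538, 2017: 325_365_189}


def pct_change(series):
    """Percentage change between consecutive observations, keyed by the later year."""
    years = sorted(series)
    return {later: (series[later] / series[earlier] - 1) * 100
            for earlier, later in zip(years, years[1:])}


nyc_growth = pct_change(nyc)
us_growth = pct_change(us)

# NYC residents as a percentage of the US population in each year.
nyc_share = {year: nyc[year] / us[year] * 100 for year in nyc}

# Change in NYC manufacturing jobs, 1950 to 2000 (Glaeser, 2011).
manufacturing_change = (146_291 / 1_082_188 - 1) * 100

# Note: the last interval (2010 to 2017) covers seven years, not a full decade.
for year in sorted(nyc_growth):
    print(f"{year}: NYC {nyc_growth[year]:+5.1f}%  US {us_growth[year]:+5.1f}%  "
          f"NYC share {nyc_share[year]:.1f}%")
print(f"Manufacturing jobs, 1950-2000: {manufacturing_change:+.1f}%")
```

The 1920 row of the table cannot be recomputed this way because the 1910 counts are not reported.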
FIGURE 1.2. The skyline of the financial district in NYC. (pixabay.com/ pexels.com)
The economic stagnation in the United States during the 1970s hit New York City particularly hard. The city was close to bankruptcy in 1975. As a result, the city created the Municipal Assistance Corporation, which attempted to refinance the debts of NYC. However, these attempts ultimately failed. The State of New York then appointed an Emergency Financial Control Board, which took full control of the city's budget and made drastic cuts in municipal services and spending. It also cut city employment and froze salaries. Yet these measures were still not sufficient to resolve the financial crisis. The city then appealed for financial aid from the federal government. Initially, President Gerald R. Ford refused to bail out NYC in October 1975, believing other alternatives existed. As the financial crisis continued, the US Congress reversed course in December 1975 and passed the New York City Seasonal Financing Act. The financial crisis reinforced the beliefs of many citizens that New York City was in serious trouble. From 1950 to the end of the 1970s, nearly a million people left NYC. It is fair to say that NYC in the 1970s seemed as dire as Detroit in the 2010s. Although the fiscal situation improved during the late 1970s, NYC continued to suffer from a drug and crime epidemic, which lasted through much of the 1980s. The growth and rejuvenation of NYC was largely due to the fact that NYC's economy was successfully transformed from manufacturing toward technology and services. According to Glaeser (2011), employment in the service sector in NYC increased from 25.4 percent to 41.1 percent of total employment. Modern cities such as NYC serve as regional, national, or even global hubs of the service sector industry. Manufacturing typically plays a negligible role in most US cities and metropolitan areas. NYC is a leading example.
TABLE 1.3. Income Inequality in NYC and the US
                          United States                New York City
Amount of income          Percentage   Fraction of     Percentage   Fraction of
                          of filers    total income    of filers    total income
                                       (in %)                       (in %)
Under $20,000             34.9         3.6             36.6         5.0
$20,000-$30,000           13.3         6.1             13.0         4.7
$30,000-$40,000           10.2         6.6             10.2         5.2
$40,000-$50,000           7.7          6.3             7.7          5.5
$50,000-$75,000           13.3         15.1            12.9         11.5
$75,000-$100,000          8.2          13.0            7.1          8.8
$100,000-$200,000         9.6          23.6            8.2          16.1
$200,000-$500,000         2.3          11.9            2.5          10.9
$500,000-$1,000,000       0.4          4.4             0.6          5.5
$1,000,000-$5,000,000     0.2          5.1             0.4          10.5
$5,000,000 or more        0.0          4.4             0.1          16.2
Source: Author's calculations based on IRS data.
NYC has experienced an amazing renewal since the mid-1970s. It has attracted and retained a disproportionate fraction of high-skill households. A measure of the number of high-skill households can be obtained by studying the distribution of adjusted gross income provided by the Internal Revenue Service (IRS). That is the measure of income used by the IRS to determine federal income tax obligations. The IRS data reveal, for example, that the fraction of households with incomes exceeding $500,000 is almost twice as large in NYC as in the US as a whole. We can also use the share of total income of different income groups to measure inequality. Consider table 1.3, which reports the income shares of different income brackets. The table shows that the richest households in NYC, those with adjusted gross incomes of $500,000 or more, account for more than 32 percent of total adjusted gross income in the NYC metropolitan area. These households account for a much smaller share of total income in the United States overall. While the stereotypical wealthy Upper Eastsider is prevalent in Hollywood movies, NYC is also home to a large number of low-skill households. Table 1.3 shows that almost 50 percent of all households in NYC in 2009 had incomes of less than $30,000. By almost any measure of income inequality, NYC tends to be an island of extremes.
The large income inequality in cities is not an accident. It is intrinsically linked to how the city economy is organized. Large cities attract a disproportionate fraction of high-skill individuals. High-skill individuals work long hours and often do not engage much in home production. Many high-skill households do not cook, clean, or spend much time on other household chores. Instead, they rely on the service sector to provide these services. Many of these services are provided by low-skill workers. High- and low-skill individuals complement each other in the city economy (Eeckhout, Pinheiro, and Schmidheiny, 2014).
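The three statements drawn from table 1.3 (the income share of filers above $500,000, the relative size of that group, and the share of filers below $30,000) are sums over rows of the table. The sketch below assumes the column assignment implied by the text, namely that the columns with a 16.2 percent income share in the top bracket belong to NYC; all names are illustrative.

```python
# Rows run from "Under $20,000" up to "$5,000,000 or more" (Table 1.3).
# Assumption: the NYC/US column assignment is inferred from the text.
us_filers = [34.9, 13.3, 10.2, 7.7, 13.3, 8.2, 9.6, 2.3, 0.4, 0.2, 0.0]
us_income = [3.6, 6.1, 6.6, 6.3, 15.1, 13.0, 23.6, 11.9, 4.4, 5.1, 4.4]
nyc_filers = [36.6, 13.0, 10.2, 7.7, 12.9, 7.1, 8.2, 2.5, 0.6, 0.4, 0.1]
nyc_income = [5.0, 4.7, 5.2, 5.5, 11.5, 8.8, 16.1, 10.9, 5.5, 10.5, 16.2]

TOP = slice(-3, None)   # the three brackets of $500,000 or more
BOTTOM = slice(0, 2)    # the two brackets below $30,000

nyc_top_income_share = sum(nyc_income[TOP])
us_top_income_share = sum(us_income[TOP])
nyc_top_filers = sum(nyc_filers[TOP])
us_top_filers = sum(us_filers[TOP])
nyc_bottom_filers = sum(nyc_filers[BOTTOM])

print(f"Income share of filers above $500,000: "
      f"NYC {nyc_top_income_share:.1f}%, US {us_top_income_share:.1f}%")
print(f"Filers above $500,000: NYC {nyc_top_filers:.1f}%, US {us_top_filers:.1f}% "
      f"(ratio {nyc_top_filers / us_top_filers:.2f})")
print(f"NYC filers below $30,000: {nyc_bottom_filers:.1f}%")
```

Because the published percentages are rounded to one decimal place, the ratio of top-bracket filers is only a rough guide to "almost twice as large."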
An important function of New York City is to match high-skill individuals with high-productivity firms. A large fraction of an individual's wage and salary depends on ability and skills. Becker (1964) referred to these skills as human
FIGURE 1.3. Broadway at night. (Yuting Gao/pexels.com)
capital. Another important component of income, especially for high-productivity individuals, is determined by the match between the firm and the individual. Suppose you are the best patent lawyer in the world and work in a place in which there is little innovation. That would mean the law firm you work for gets few patent cases. Then the services of your firm are not sought by inventors, and your talents are completely wasted. You need to move! One of the important functions of large cities is to match worker talents to firm needs. Looking at the industry structure of NYC, we find that the local economy is heavily dominated by firms in the financial and insurance sectors, as well as technology and entertainment. In addition, NYC is home to some of the largest law, consulting, and accounting firms. These firms provide important professional services for other large companies. As a consequence, it is not surprising that the New York economy attracts a disproportionate share of high-skill individuals whose skills match well with the needs of these high-productivity firms. The large concentration of firms in key industries such as finance, entertainment, and technology is not an accident. We will learn that cities tend to make firms and individuals more productive. These productivity gains arise because of a variety of externalities, such as knowledge spillovers. Economists typically refer to these externalities as agglomeration externalities since they are confined to a small geographic location.
While there are many advantages of city life, there is also a host of problems that arise because of high population densities that are inevitable in cities. One drawback of living in New York City is that the cost of living is much higher than in average towns and cities in the US. Wallace (2016) provides some examples to determine the true cost of living in New York City. For example, transportation costs were significantly higher in NYC than in the rest of the country.
The average parking rate in downtown Manhattan was $533 per month in 2015. Purchasing a
parking space for your car in Manhattan was roughly as expensive as the median house value in the US, which was $236,600. The average insurance rate for drivers in Manhattan exceeded $4,000 per year. A monthly transit pass in New York cost $116.50, about 75 percent higher than the national average. Groceries in New York cost approximately 30 percent more than the national average. The price for a meal for two at a moderate restaurant was 67 percent higher than the national average. Your best bet may be to eat at one of the city's many food trucks, although even those can be expensive. Sneakers, jeans, movies, and bowling were, respectively, 24, 44, 40, and 100 percent more expensive than the national average. The Elliman Report provides a detailed characterization of the cost of owning real estate in Manhattan.¹ The average price per square foot of co-ops and condos in the first quarter of 2015 was $1,263. Luxury apartments cost more than twice that amount, at $2,587 per square foot. The average monthly rent for a two-bedroom apartment in Manhattan was $3,895, which is roughly equal to the entire monthly income of the typical US worker. State and local income taxes are also among the highest in the country and range from 7 percent to 12 percent. Wallace (2016) concludes that, taking all expenses into account, the cost of living in New York City is at least 70 percent higher than the national average. If you live in Manhattan rather than an outer borough, the cost of living can be even higher, approximately double the national average. That means Manhattanites must make twice as much money to afford the same consumption bundle. The price of success is, therefore, substantial. Of course, some of these costs reflect the fact that you get to live in one of the most attractive, thriving, electrifying cities in the world. For individuals who like urban life, replete with cultural attractions, excellent cuisine, and luxury shopping, this may be fine.
For those who value such services less, the suburbs may be more desirable, but they come with a significant and lengthy commute. For those employed in NYC, the trade-offs are clear and there is no free lunch.
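The statement that Manhattanites must earn twice as much to buy the same bundle is a simple price-index adjustment: the income needed in a city equals income elsewhere multiplied by the city's cost-of-living index relative to the national average. The sketch below applies that adjustment. The $60,000 reference salary is a hypothetical number chosen only for illustration, and the index values 1.70 and 2.00 restate the "at least 70 percent higher" and "approximately double" figures attributed to Wallace (2016).

```python
def equivalent_income(income_elsewhere: float, cost_index: float) -> float:
    """Income needed to buy the same bundle where prices are cost_index times the national average."""
    return income_elsewhere * cost_index


REFERENCE_SALARY = 60_000   # hypothetical salary at national-average prices
NYC_INDEX = 1.70            # "at least 70 percent higher" (lower bound)
MANHATTAN_INDEX = 2.00      # "approximately double"

nyc_equivalent = equivalent_income(REFERENCE_SALARY, NYC_INDEX)
manhattan_equivalent = equivalent_income(REFERENCE_SALARY, MANHATTAN_INDEX)

# Two figures implied by the prices quoted in the text.
annual_parking = 533 * 12                  # monthly parking rate, annualized
implied_national_transit = 116.50 / 1.75   # pass is about 75 percent above average

print(f"Salary needed in NYC:       ${nyc_equivalent:,.0f}")
print(f"Salary needed in Manhattan: ${manhattan_equivalent:,.0f}")
print(f"Annual parking cost:        ${annual_parking:,}")
print(f"Implied national transit pass: ${implied_national_transit:.2f} per month")
```

A single index treats every household as buying the same bundle; households that spend relatively more on housing or parking would face a larger adjustment than this sketch suggests.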
¹ https://www.elliman.com.
1.3 The City as a Public Sector Corporation
This book focuses on the important role that cities and local governments play in the economy. In order to better understand how cities form their operations, it is useful to compare cities to private corporations. According to Inman (2008), each city has a population and this population can be viewed as the number of "shareholders." These shareholders do not own stocks as they do in private sector corporations. Instead, they are either owners or renters of residential land and property. While the shareholders in a private sector company own voting rights for each share of the company owned, the "shareholders" in a democratic city are each entitled to only one vote, regardless of the amount of property they own. In a private sector company, the board of directors is elected by the shareholders. The city council serves the same function in a city as the board of directors of a private corporation. The board chair is similar to the mayor of a city, who is either elected by the city council or directly elected by popular vote in the US. In other countries, the mayor is not elected but appointed. And much like a private business usually structures its organization, the mayor and the city council typically appoint a management team for the city's various "product" activities: street cleaning, street maintenance, public education, recreation, libraries, etc.
Private sector companies need to satisfy their consumers by offering quality products at reasonable prices. This may be a shocking concept for some firms, but competition usually drives these firms out of business. Cities need to accomplish a similar goal. For cities to maintain a satisfied population, they must protect and enhance their economic competitive advantage. They need to efficiently provide public goods and services for residents and businesses. Public goods include the provision of infrastructure such as roads, ports, airports, and communication systems. Important services are determined by significant consumer demand, or dependencies, like safety, environmental protection, and public education. Each of these goods and services requires inputs like labor, capital, and materials for adequate production. Cities need to manage these production processes.
Another important aspect of managing a private sector company is human resource management, which includes the broader administration of employment. The recruitment and selection of workers, as well as performance management, training and development, occupational health and safety, and resolving employee conflicts, all fall under the responsibility of a human resource department. Comparable issues arise in a city and can play an even larger role. A city must hire and manage workers and employees to produce the goods and services that are essential for its success. Additionally, municipal employees are typically unionized, which creates more complicated obstacles for their management. Municipal employees are also city residents and hence they "own" a share of the city and can try to influence city politics in their favor.
To pay for the labor and to acquire the capital needed in the production of public goods and services, the city collects taxes and fees and issues debt in the form of municipal bonds. You can view the exchange of taxes and fees for public goods and services as a contract in which residents are the consumers and the city is the supplier. Citizens, as well as businesses and residents of neighboring suburbs, can all be considered potential customers of the city. It is important to understand that cities do not operate in isolation. Just as firms face competition from other firms that offer similar products, each city is in direct competition with other cities that offer similar public goods, services, and locational amenities. We call this fiscal competition since it largely affects the feasibility and suitability of fiscal policies. If a city does not provide its services efficiently, consumers will leave the city and go elsewhere. A declining population then leads to a fall in land prices. Therefore, we can compare the price of land or housing in a city to the share value of a publicly traded company. A city may go into bankruptcy if there is not a sufficient tax-paying population to supply the revenues to cover costs. This notion of competition can be extended to apply across countries. Efficient cities attract residents and businesses, creating prosperity and growth. For example, New York City competes for firms and residents not only with San Francisco and Chicago but also with foreign cities, such as Toronto, London, Berlin, Tokyo, and Shanghai. Countries need to properly design a system of city and corporate governance that encourages efficient city management. If the overarching federal government does not allow cities to function properly, then cities cannot compete against cities in different countries. An inefficient
system of city governance will lead to a decline in overall prosperity and welfare and can create a systemic poverty problem. Well-run cities require good internal management, favorable rules of city governance, and informed and diligent shareholders (Inman, 2008). Cities should have honest and well-trained leaders, appropriate federal and state policies that support them, and informed voters who participate in local elections. Investors, or shareholders, of the city are owners of real estate. Lenders are banks and municipal bondholders, who do not necessarily have to live or operate in the city. Customers are residents or businesses that operate in the city. All of these groups have a vested interest in ensuring the efficient operations of the city. If the city is not well run, housing values will collapse and residents will face the risk of losing their jobs, losing most of their wealth, and confronting personal bankruptcy. The residents in poorly run cities are also at risk of higher crime rates and face greater safety hazards, including injury or losing their lives. Finally, these residents may sacrifice their children's future prosperity if the city fails to fund local schools. As a consequence, there is much at stake in city management! In this book, we try to understand how to finance and manage a city-corporation to enhance the consumption and investment benefits of its citizens.
1.4 An International Perspective
Despite the ups and downs of US cities and metropolitan areas, economic growth and prosperity in the US appear to be largely a function of urbanization. This observation is true not just for the US, but applies to other countries as well. Consider, for example, the recent history of the People's Republic of China. Economic growth started to accelerate in China in the early 1980s after the country fully embraced an economic development concept that was centered around the major coastal cities. This policy was designed and implemented under the leadership of the chairman of the Chinese Communist Party, Deng Xiaoping, in the late 1970s. This policy was in stark contrast to economic policies under chairman Mao Zedong that focused primarily on agriculture and heavy industry. Many key industries were moved inland, away from the coastal areas, primarily for military reasons. In contrast, the new economic policies opened China to the world economy and were initially implemented in coastal cities such as Shanghai, Guangzhou, and Shenzhen. These policies drew heavily on lessons learned from Singapore and Hong Kong, two other coastal Asian cities that had been successful urban centers for many decades. This example also shows that cities can serve as laboratories for experimentation in a decentralized system of government. The Chinese government first studied the viability of the new economic approach in a small number of places before adopting it later nationwide.
Cities have also played a prominent role in Europe in reshaping the economic agenda. As in the US, most successful European cities have embraced an economic strategy built around high-end services, technology, and entertainment. Cities such as London, Paris, Berlin, and Madrid are international hubs for many service sector firms. However, it is not all about city size.
While size is typically a good indicator of productivity, there are many small- and medium-sized cities in Europe that have thrived for many centuries. For example, Frankfurt, Luxembourg, and Zurich have served important functions in international finance for many decades.
FIGURE 1.4. Bangkok street life. (Suzukii Xingfu/pexels.com)
Maybe most surprisingly, a small suburb of Gelsenkirchen called Schalke has been the center of gravity for football in Germany since 1904.
Cities may play an even more important role in many developing economies. Think about famous cities such as New Delhi, Mexico City, Nairobi, or Rio de Janeiro. How does economic policy in such cities differ, if at all, from policy in cities such as New York, Tokyo, or London? All these cities are centers of economic innovation, production, and growth in their respective countries. What might potentially undermine the advantages that arise from urbanization in developing economies are the same forces that can lead to inefficient diffusion of economic activity in general: misguided government spending, tax inefficiencies, and high labor costs of municipal workers, as well as the adverse effects of urban poverty and crime. These problems undermine urbanization advantages and thus the economic potential of cities. But there are some additional challenges that primarily arise in developing countries where big-city inefficiencies are also the result of corruption. Of course, not all cities in developing countries suffer from corruption, and there can be corruption in developed countries as well. However, we observe stronger legal and law enforcement institutions in countries with higher levels of income. As an economy develops, citizens often need to find ways and means to protect themselves from expropriation of property rights by a corrupt and powerful elite. In addition, urban poverty tends to be a much more severe problem in most developing countries. Cities in developing countries often need to deal with the influx of poor migrant workers. When rural poverty is widespread, the economic incentives are for poor rural families to migrate to the more prosperous cities in search of economic opportunity. Urban poverty is, therefore, pervasive. To deal
with the negative consequences of uncontrolled rural-urban migration, countries often try to restrict migration. China, for example, uses a sophisticated residency system, called hukou, to limit the rights of nonresident citizens. Finally, cities may lack fiscal capacities. In particular, local taxation of income and profits can be challenging. The informal sector, which consists of unregistered enterprises such as street vendors, repair shops, or small farms, is typically much larger in developing countries, and it can be difficult to enforce taxation and regulation in this sector of the economy. Other than licenses and fees (an invitation for local corruption, by the way), significant local taxation is often not feasible. Central government taxation is the primary means for raising revenues for government services in most developing countries. That causes problems if these resources are not properly transferred to state and local governments.
1.5 Moving Forward
According to Christaller's (1933) central place theory, the primary purpose of a market town or city is the provision of a variety of goods and services for the surrounding market areas. Cities appear to be the engines of economic growth and prosperity. We will discuss in this book why the geographic concentration of economic activity can be beneficial to firms and workers. Geographic concentration not only lowers transportation and communication costs but also potentially creates positive spillover effects that increase productivity. For a city to reap the benefits of agglomeration, the city must efficiently provide a variety of goods and services: protection from crime and natural hazards, the provision of affordable education, infrastructure, the protection of the environment, and the enforcement of property rights and contracts. Private nonprofit organizations that rely on the generosity of donors and volunteers often supplement the provision of public goods and services by local governments. To provide key public goods and services, cities typically rely on mechanisms in which participation is not voluntary. The local government can levy taxes, impose regulations, and force individuals and firms to comply. In democratic systems, we tend to appoint our leaders via elections. Fair and open elections have the potential to generate reasonable outcomes. However, pork barrel politics, rent-seeking behavior, and corruption will lead to policy failures. With cities, there is the potential not only for market failure but also for government failure! How do citizens monitor their elected leaders and the large bureaucracy that oversees and implements government programs? Who holds politicians accountable? We will see that fiscal competition and decentralization of government power provide key economic advantages since they encourage competition between cities, regions, states, and even countries.
Fiscal competition provides a potential solution to the problem of monitoring and identifying ineffective or corrupt leaders. Households can also move away from inefficiently operated or corrupt cities and states. Individuals can not only vote at the ballot box but also vote with their feet by moving to a different location. The downside of fiscal competition is that it potentially creates large inequities, when households sort into neighborhoods and communities based on income. Moreover, segregation by race and ethnicity can have additional undesirable effects for a society. If the quality of local public goods and services is based on
the local tax base, segregation by income will give rise to large differences in public goods provision and thus unequal access to economic opportunity. A cleverly designed system of intergovernmental transfers can alleviate most of the problems created by inequality. Nevertheless, we often observe that cities are not properly managed and face serious problems. High levels of crime, failing urban schools, poor infrastructure, and a lack of quality and affordable housing are just a few problems that have haunted many cities for a number of decades. Effective management of cities is exceedingly difficult. It requires ingenuity, commitment, intelligence, creativity, and grit. Many cities all over the world come up woefully short and do not meet even minimal efficiency standards. There is no invisible hand that makes sure that a city will function and succeed. If a society cannot manage its cities, economic decline is the likely consequence. New York City is a prime example of the rise and fall, decline and renewal of a major city. Some cities may triumph and rise to great prominence, but many will fail along the way. Understanding the role that cities play in the economy and learning how to take full advantage of the opportunities that cities offer will make you a smarter and more successful economist, manager, policymaker, or city resident. This book provides you with the skills, tools, and concepts you will need to achieve this goal.
1.6 Problem Sets
1. Why are urbanization and economic prosperity positively correlated? Discuss two possible explanations.
2. What are the characteristics of shrinking cities, such as Detroit or Buffalo? What policies do you recommend for these cities?
3. What types of challenges do successful cities currently face? What policy options do you recommend to deal with these challenges?
4. How would you distinguish cities that are likely to succeed in the future from those that will struggle?
5. Explain why cities need both low-skill and high-skill workers to function properly.
6. Who are the main stakeholders in cities and what interests do they have?
7. What are the objectives of the mayor and the city council of a city?
8. Why do you think unionization rates are higher among municipal workers than workers in the private sector?
9. What are some additional challenges that arise in managing cities in developing countries? Discuss two.
10. Cities compete for mobile workers and firms. Discuss some advantages of this type of competition.
PART I THE ECONOMIC RATIONALE OF CITIES
2
Agglomeration, Productivity, and Trade
2.1 Motivation To begin this chapter, ask yourself, What is the economic rationale of cities? Or, take a broader perspective and ask, What is the economic rationale of countries or states? The answers to these questions are not obvious, yet typical economics textbooks do not address them. Urbanization and economic prosperity are closely related to each other. Given that, can we go so far as to say that well-managed cities drive economic growth and success, and poorly managed cities lead to decline and decay? Put differently, does the management and organization of a city play a meaningful role in the economic success of the firms and citizens in its boundaries? We can define a city as a center of population and commerce with a significant size and importance. From an economic perspective, the key concepts that define a city are size and density. Note that the US had a population density of 89.5 individuals per square mile in 2013. New York City, the largest and most densely populated city in the US, had a density of approximately 27,781 individuals per square mile. Why is it desirable to cram so much economic activity into such a confined space? What are the advantages and disadvantages of spreading out individuals and firms across all available space? Some activities are conducted outside of cities. For example, agriculture, forestry, mining, and a large number of tourism and sports-related activities (golf, hiking, skiing, sailing, etc.) cannot or should not be done within the confines of cities because they demand open spaces. On the other hand, the most successful businesses in finance, insurance, technology, and other sectors of the economy tend to be clustered in large cities. How can we explain the geographic concentration observed in many industries and sectors of the economy? When individuals, households, and firms operate in close proximity, they create cities and metropolitan areas.
If we believe that cities are somehow "optimal," the economic rationale must be that cities make individuals, firms, and even governments more productive. If there were no significant efficiency gains that arose from proximity, citizens of the United States could spread out far more than they currently do. As we discuss in detail in this chapter, efficiency gains primarily arise in cities due to sharing, matching, and learning (Duranton and Puga, 2004). Cities facilitate the sharing of common, indivisible resources. They improve matches between workers and firms in labor markets. They help in the
FIGURE 2.1. London Tower. (Photo by author)
diffusion and accumulation of knowledge. We refer to these efficiency gains as agglomeration externalities, which are external economies of scale in contrast to traditional internal economies of scale. Hence cities exist because of these externalities and despite the fact that there are many dispersion forces working against the formation of cities, i.e., cities exist and thrive despite the high cost of living, traffic congestion, vulnerability during epidemics, and violent crime. In this chapter, we focus on productivity benefits to individuals and firms from geographic concentration. In the next chapter, we begin our exploration of how cities make governments more effective.
2.2 Economic Rationales for Geographic Concentration 2.2.1 Transportation, Commuting, and Communication Costs Early cities arose in places with natural locational advantages. For example, early towns and cities were often located on hills or other locations that provided security against enemies because they were easy to defend or hard to attack. Two important natural determinants for location are bridges over key rivers, and fertile land. Examples abound. London is the location of the first bridge over the Thames. Paris is also the location of the first bridge across the Seine. Moreover, Paris is located in the middle of France's most fertile region. Bleakley and Lin (2012) point out that many cities in North America were founded at obstacles
FIGURE 2.2. The ten largest cities in the US in 1860. (Vertical axis: population, from 0 to 900,000.)
to water navigation, where continued transport required overland hauling or portage. They argue that portage sites attracted commerce and supporting services. In addition, the nearby waterfalls provided water power, which attracted manufacturing during early industrialization. The main economic advantages of these early cities were the ability to buy and sell goods in a single location and the ability for individuals to specialize their skills, advantages that remain important even today! As trade became more important, cities were built up around transportation hubs, such as ports, rivers, and (later) railroad yards.1 To understand the importance of natural advantages in the development of cities, try to guess the three largest cities in the United States in 1860. Many of you will get two out of three correct: New York and Philadelphia. Unless you are familiar with the history of New York City, it is unlikely that you would have guessed the third one correctly. The third largest city in 1860 was Brooklyn, which used to be an independent city. The ten largest cities are presented in figure 2.2. The population estimates are based on the US Census. The common characteristics of these cities were an important harbor and a location either near the Atlantic Ocean or along a large river. Recall that before railroads were built, the only way to transport heavy goods over long distances was by ship. Marshall (1920) discusses three different types of transportation costs: the costs of moving goods, people, and ideas. Transportation and commuting costs are still important. Firms that ship goods to other locations need to be located near transportation hubs. Workers and employees need to live close to their employers to avoid onerous commutes. Thus firms and households tend to be highly concentrated in cities, reducing transportation and commuting costs. The geographic concentration of economic activity thus arises due to the efficiency gains generated
1 Henderson, Squires, Storeygard, and Weil (2018) provide a detailed analysis of the role of natural characteristics. They study the worldwide spatial distribution of economic activity using light intensity at night as a proxy for output.
FIGURE 2.3. Container cargo ship in port. (Albin Berlin/ pexels.com)
when a firm is located close to its suppliers, workers, and consumers. Transportation and commuting costs have declined rapidly over the past two hundred years (Glaeser and Kohlhase, 2003). Since one of the benefits of cities is relatively low transportation and commuting costs, what are the implications of declining transportation costs? Communication costs are also lower in cities. For example, a firm may have to communicate on a regular basis with suppliers, accountants, lawyers, or consultants. Proximity facilitates face-to-face interactions and thus leads to efficiency gains. As the importance of physical meetings diminishes, and meetings can be conducted virtually, you may wonder whether this benefit of cities will still be important in the future.
2.2.2 Economies of Scale and Scope Another rationale for cities is related to economies of scale and scope. Recall that economies of scale arise when a good or a service can be produced more efficiently on a larger scale. Broadly speaking, economies of scale arise if average costs decrease with output. Economies of scope arise when there are synergies associated with producing a variety of goods. Specifically, economies of scope imply that the average cost of one good produced by a firm decreases when that firm produces an increasing range of products. An example is the automobile industry: the production costs for sports cars may decrease if the firm also produces minivans, sedans, and SUVs, since similar technology is used in all these products. If these effects are large, a diversified company may have significant advantages over a firm that is not diversified.
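The two definitions above can be illustrated with a small numerical sketch. It assumes a deliberately simple cost structure that is not taken from the text: total cost equals a fixed cost plus a constant unit cost, and economies of scope arise because several products can share one fixed "platform" cost. All function names and numbers are hypothetical.

```python
def average_cost(quantity, fixed_cost, unit_cost):
    """Average cost when total cost = fixed_cost + unit_cost * quantity (assumed form)."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return fixed_cost / quantity + unit_cost


def joint_production_saves(shared_fixed_cost, separate_fixed_costs):
    """Economies of scope in this sketch: one shared platform costs less
    than the sum of the stand-alone platforms it replaces."""
    return shared_fixed_cost < sum(separate_fixed_costs)


# Economies of scale: average cost falls as the fixed cost is spread over more units.
for quantity in (100, 1_000, 10_000):
    ac = average_cost(quantity, fixed_cost=1_000.0, unit_cost=2.0)
    print(f"Q = {quantity:>6}: average cost = {ac:.2f}")

# Economies of scope: sports cars and minivans sharing one (hypothetical) platform.
print(joint_production_saves(shared_fixed_cost=1_500.0,
                             separate_fixed_costs=[1_000.0, 1_000.0]))
```

With a fixed cost, average cost falls toward the unit cost as output grows; richer cost functions would give different shapes, so the linear form here is only a convenient assumption.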
Similar effects may arise in a diversified city economy. Economies of scale and scope in the production of a variety of services and consumption goods may depend on the size of the local market. Activities and interests that appeal to only a fraction of the population need a large potential audience to be economically feasible. For example, baseball teams, opera companies, and avant-garde art museums may be attractive to only a small percentage of consumers; without a large pool to draw from, stadiums, performance halls, and galleries will be empty. Specialized retail or restaurants can only be supported in communities large enough to have a critical mass of customers. Urban density thus facilitates consumption. Modern cities have become centers of a variety of consumption activities (Glaeser, Kolko, and Saiz, 2001). The advantages of scale economies and specialization are also clear in many service industries. For example, large cities can sustain law firms that specialize in mergers and acquisitions or litigation. Costly diagnostic machines can only be purchased by hospitals with a sufficient number of patients to benefit from them. Similarly, input-output linkages may play an important role in generating agglomeration effects. For example, a parts supplier can sell the same good to multiple automobile manufacturers. Or a corporate law firm can provide legal services for multiple corporate headquarters. To create and operate a successful start-up company, an entrepreneur needs access to venture capital and loans (banking services), a marketing strategy, legal support to protect ideas (patent lawyers), general management and consulting advice, and accounting and auditing services. It is much easier to obtain all these services in a large diversified city.
Larger cities can also offer goods and services that cannot be offered in smaller cities where the demand for certain specialized goods and services, as a percentage of the population, is too small. City size matters! Fujita (1988) provides a nice formalization of these ideas in a model in which firms have increasing returns to scale, which gives rise to monopolistic competition. Individuals benefit from product variety and firms benefit from scale economies in specialized production. As a consequence, scale economies at the individual firm level are transformed into increasing returns at the city level. This produces strong incentives for spatial economic concentration of firms and households. Finally, size also matters when it comes to labor market pooling (Marshall, 1920). Workers and firms derive advantages from sharing a labor market. In a thicker local labor market, workers are matched to firms faster, thus reducing unemployment. In addition, firms and workers are likely to find better matches in terms of skills and experience. (The same is true for dating or marriage markets.)
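The critical-mass argument from this section (a niche activity needs enough interested customers to be viable) can be sketched as a back-of-the-envelope calculation. This is an illustration under assumed numbers, not a model from the text: the function names, the customer threshold, and the share of interested residents are all hypothetical.

```python
import math
from fractions import Fraction


def minimum_market_size(customers_needed, share_interested):
    """Smallest population that yields `customers_needed` interested residents,
    assuming a fixed share of the population is interested."""
    if not 0 < share_interested <= 1:
        raise ValueError("share_interested must lie in (0, 1]")
    return math.ceil(customers_needed / share_interested)


def can_support(city_population, customers_needed, share_interested):
    """True if the city is large enough to reach the critical mass of customers."""
    return city_population * share_interested >= customers_needed


# Hypothetical example: an opera company needs 5,000 regular patrons,
# and 1 percent of residents are interested in opera.
share = Fraction(1, 100)  # exact arithmetic avoids floating-point rounding
print(minimum_market_size(5_000, share))
print(can_support(100_000, 5_000, share))
print(can_support(1_000_000, 5_000, share))
```

The smaller the interested share, the larger the city has to be, which is one way to read the claim that city size matters for specialized goods and services.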
2.2.3 Knowledge Spillovers and Agglomeration Externalities Another important rationale of cities is related to the existence of knowledge spillovers among individuals. This concept is a bit more tricky, but most urban economists consider knowledge spillovers to be a primary justification for cities. Knowledge spillovers give rise to agglomeration externalities and exist whenever an individual's knowledge rises by virtue of being near other individuals or firms. Knowledge spillovers are a key transmission device for new ideas. A person may learn from a neighbor, a friend, a coworker, or an acquaintance.
Innovation and imitation may also be easier in densely populated cities (Duranton and Puga, 2001). Experimentation with new concepts and ideas may require finding like-minded, creative people who live and work in close proximity. Chatterji, Glaeser, and Kerr (2014) report that 92 percent of all patents in the US were granted to residents of metropolitan areas, and virtually all venture capital investments were made in major cities between 1990 and 2005. Just four metropolitan areas (Boston, New York City, San Francisco, and Los Angeles) accounted for almost half of the new product innovation. Moretti (2011) reports that firms in Santa Clara (San Jose) generate 3,390 (1,906) new patents in a typical year, while the median US city generates not even 1 patent per year. Innovation is not restricted to the high-tech sector. New trends in modern art and pop music are prime examples of creativity-driven processes that tend to originate in cities. Agglomeration economies can thus explain the existence of cities even when natural advantages do not exist. Knowledge spillovers are particularly important in open environments, such as universities, that do not rigidly enforce intellectual property rights. To understand the importance of knowledge spillovers, it is useful to distinguish among formal, informal, and tacit knowledge. Formal knowledge is captured by theories and formulas that are available in written form. A textbook is a great source for formal knowledge. It is the type of knowledge that can be taught in school and is easy to codify. Informal knowledge is made up of rules of thumb or tricks of the trade. This knowledge is more frequently transmitted verbally through anecdotes and stories, but it is possible to transmit it in written form, using guidebooks or manuals.
In the past five years, YouTube has become an invaluable source of informal knowledge; individuals seeking advice on how to set up their sound system or make frosting flowers can watch a video where individuals demonstrate the process to others. Tacit knowledge is another type. It is the kind of knowledge that is difficult to transfer to another person. Why? It is not easy to write down tacit knowledge. Sometimes it is even difficult to verbalize it since it is primarily rooted in practice and experience. Tacit knowledge may be transmitted by learning on the job, apprenticeship, or other forms of training. Tacit knowledge primarily relates to skills, ideas, and experiences that people have in their minds but cannot easily articulate. Tacit knowledge is difficult to access without direct contact. Knowledge spillovers are most important for tacit knowledge and can arise between firms and organizations by sharing a common labor pool. What are some industries in which tacit knowledge may be important and spillovers large? Sharing knowledge among firms is a double-edged sword. Arrow's (1962) information or disclosure paradox is an early recognition of the problems that companies face when managing intellectual property across their boundaries. Firms often need to seek external technologies for their business or external markets for their own technologies. These types of transactions often require knowledge transfers. Once the customer has acquired the knowledge of the new technology, the seller has in effect transferred the technology to the customer. These spillover effects have important implications for the value of technology and innovations. Patents can be used to protect the innovating firms. But patent enforcement can be difficult. (Think about trade wars and the problem of enforcing property rights in an international context.)
In a knowledge-based economy, knowledge provides comparative advantages over the competition. In some cases, firms have strong incentives to prevent knowledge spillovers. They typically use the full force of the legal system to prevent other companies from copying their success. They obtain patents and copyrights for their ideas and products and strictly enforce their property rights through legal channels. Nevertheless, firms may not completely succeed in preventing knowledge from spilling over to competitors or suppliers. Smaller firms find it especially difficult to enforce patents and other types of intellectual property rights because of the high cost of litigation. Turning to the beneficial aspects of knowledge spillovers for firms, Marshall (1920) argues that agglomeration arises because of synergies between firms in the same industry. Silicon Valley is a primary example of these types of externalities. One may think that innovations in transportation and communication technologies tend to weaken the importance of agglomeration effects, especially in the manufacturing sector. It is then a remarkable fact that the most famous modern example of an industrial agglomeration is in an industry that has the best access to new technologies. If there is an industry that should have been able to disperse with access to modern information and communication technologies, it is software development. Yet, it is exactly in this industry that firms are heavily spatially concentrated. Jacobs (1969) argues that agglomeration externalities also arise due to synergies between different industries. We have thus seen a number of compelling economic reasons why firms and households prefer to be located in cities. To gain some additional insights, we will formalize some of these ideas. We will use a simple cost-function approach to model the productivity-enhancing benefits associated with locating in a city.
2.3 Modeling Agglomeration Externalities Economic proximity makes for more efficient production and trade via agglomeration externalities. Firms that locate in efficient cities have a competitive advantage over firms that are located in inefficient cities. This competitive advantage translates into higher productivity and greater profit. Since firms and individuals will bid for the right to locate in efficient cities, these cities have higher land values and higher wages than cities that are not economically efficient. Let's try to explore some of these ideas within a more formal framework. In a standard economic model, technology and production functions do not depend on geography or firm location. This simplification overlooks the importance of local agglomeration effects and can lead to inaccurate predictions. To obtain a better model of firm productivity, we need to acknowledge that location matters. Here we focus on two channels. First, a city can make costly investments into infrastructure that provide efficiency gains. Second, the magnitude of knowledge spillovers and other agglomeration externalities also depends on the location. To capture these two additional factors that may have an impact on workers' productivity, we expand on a basic model of a firm's technology or production function. Table 2.1 summarizes the notation for the key variables of the model. Consider a firm that produces a single homogeneous good. Let Q denote the quantity of that output good. Let us, for simplicity, adopt a Marxian perspective
TABLE 2.1. Chapter Notation

Variable      Definition
Q             Quantity of an output good
L             Amount of labor used in production
$\alpha$      Measure of labor productivity
K             Capital
$\gamma$      Measure of capital productivity
M             Inputs: productivity-enhancing city services
$\beta$       Measure of productivity of city services
N             Number of firms
$\varphi$     Measure of endogenous externality
A             Agglomeration externalities
C(·)          Cost function
w             Wage
m             Price of infrastructure (i.e., city services M)
D             Resource disadvantage of a location
AC(·)         Average costs
MC(·)         Marginal costs
p             Price of output for the good produced and sold in the city
i             Index for cities
TC            Per unit transportation costs
in which labor is the only scarce production factor or input factor. (In the technical appendix at the end of this chapter, we provide a model with both labor and capital. The only additional complexity that arises is determining the optimal input ratio of capital and labor.) Let L denote the amount of labor used in production. The technology of a firm determines how inputs (L) are transformed into outputs (Q). If geography and firm location do not matter, Q would only be a function of L. For example, a convenient functional form for a production technology is given by

$$Q = L^{\alpha} \tag{2.1}$$

where $\alpha$ is a parameter that measures labor productivity. If capital matters, we could expand the model and write output as $Q = L^{\alpha} K^{\gamma}$, where $\gamma$ measures the productivity of capital, $K$. We use this function in the empirical analysis in section 2.5 and provide more details in the technical appendix at the end of the chapter. But let's keep it simple for now and ignore capital and other inputs chosen by the firm. This model ignores location. As we discussed earlier, location matters because the city provides important inputs to the firm, such as infrastructure and other productivity-enhancing city services. Let us denote these inputs by $M$. For example, investments in local transportation networks allow workers to reduce commuting time and stress, and hence increase workers' and firms' productivities. These city services are costly to produce. We assume that the city charges the firm a price equal to $m$ for each unit of infrastructure. Since the city and not the firm determines the level of $M$, we treat $M$ as a fixed or predetermined input factor. In addition, we have seen above that knowledge spillovers and other agglomeration externalities make workers and firms more productive. Note that these
agglomeration externalities improve worker productivity, yet the firms get them virtually for free, i.e., we assume that firms do not have to pay for the benefits that arise due to knowledge spillovers. We denote the externalities by A. The magnitude of the agglomeration externality is often a function of city size or employment density. For example, we could assume that the externality is given by A(N) = A N l, marginal costs are decreasing in Q.
FIGURE 2.5. Competition and efficient production. (Axes: price p against output Q; the MC curve and the output level Q*, where p = MC = AC, are marked.)
Marginal costs are increasing in the resource disadvantage of the city ($D$) and thus decreasing in agglomeration externalities ($A$) and the city's natural advantages or endowment ($M$). Let $Q^*$ denote the output level that is cost-efficient and minimizes average costs. We show how to derive this level in the technical appendix at the end of the chapter. Let $p$ denote the price of output for the good produced and sold in the city. Let us assume that firms compete on prices and that transportation costs are negligible. Moreover, firms within the city have access to the same technology and can enter and exit the market at no additional cost. Competition then implies that

$$AC(Q^*) = MC(Q^*) = p \tag{2.9}$$
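The equality of average and marginal costs at the efficient scale can be checked numerically. The sketch below rests on assumptions of its own rather than on the chapter's derivation: it takes the technology to be Q = A·M^β·L^α with α < 1, so that the labor requirement is L(Q) = (Q / (A·M^β))^(1/α), and the cost function to be C(Q) = w·L(Q) + m·M. Function names and parameter values are hypothetical.

```python
def labor_needed(q, A, M, beta, alpha):
    """Labor required to produce q, assuming Q = A * M**beta * L**alpha."""
    return (q / (A * M**beta)) ** (1.0 / alpha)


def average_cost(q, w, m, A, M, beta, alpha):
    """AC(Q) = (w * L(Q) + m * M) / Q under the assumed cost function."""
    return (w * labor_needed(q, A, M, beta, alpha) + m * M) / q


def marginal_cost(q, w, m, A, M, beta, alpha):
    """MC(Q) = dC/dQ = (w / alpha) * L(Q) / Q; the charge m * M is fixed."""
    return (w / alpha) * labor_needed(q, A, M, beta, alpha) / q


def efficient_scale(w, m, A, M, beta, alpha):
    """Output level that minimizes average cost (requires 0 < alpha < 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("this closed form needs decreasing returns, 0 < alpha < 1")
    k = (A * M**beta) ** (-1.0 / alpha)
    return (m * M / (w * k * (1.0 / alpha - 1.0))) ** alpha


# Hypothetical parameter values.
params = dict(w=1.0, m=1.0, A=1.0, M=4.0, beta=0.5, alpha=0.5)
q_star = efficient_scale(**params)
print(q_star, average_cost(q_star, **params), marginal_cost(q_star, **params))
```

At the computed scale, average and marginal costs coincide, which is the condition a competitive price has to satisfy in this setup.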
Firms that do not operate at the efficient level cannot compete in this equilibrium. In the case of decreasing returns to scale ($\alpha < 1$), the equilibrium is illustrated in figure 2.5. Workers also benefit from efficient cities and agglomeration externalities. In a competitive labor market, workers' wages will be equal to the value of the marginal product of labor. Using this equilibrium condition, wages must satisfy

$$w = p\,\alpha A M^{\beta} L^{\alpha - 1} \tag{2.10}$$
Note that the marginal product of labor increases in agglomeration externalities $A$ and local resources $M$. Workers will thus earn more when they work in efficient cities.
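The wage condition (2.10) can be explored with a short numerical sketch. It assumes the production function Q = A·M^β·L^α, which is the form implied by the wage equation; the function names and all parameter values are hypothetical.

```python
def output(L, A, M, beta, alpha):
    """Assumed technology: Q = A * M**beta * L**alpha."""
    return A * M**beta * L**alpha


def wage(p, L, A, M, beta, alpha):
    """Value of the marginal product of labor: p * alpha * A * M**beta * L**(alpha - 1)."""
    return p * alpha * A * M**beta * L ** (alpha - 1.0)


def numerical_marginal_product(L, A, M, beta, alpha, step=1e-6):
    """Central finite difference of output with respect to labor (a sanity check)."""
    upper = output(L + step, A, M, beta, alpha)
    lower = output(L - step, A, M, beta, alpha)
    return (upper - lower) / (2.0 * step)


# Two hypothetical cities that differ only in the agglomeration externality A.
base = dict(p=1.0, L=4.0, M=4.0, beta=0.5, alpha=0.5)
print(wage(A=1.0, **base))  # less efficient city
print(wage(A=2.0, **base))  # more efficient city
```

Because A and M enter multiplicatively, raising either one scales the wage up proportionally at a given employment level, which matches the claim that workers earn more in efficient cities.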
2.4 Transportation Costs and Trade Thus far we have considered a city in isolation and ignored spatial competition. In practice, firms in one city compete against firms in other cities. Goods can be shipped across space at certain costs. We can have trade between cities, just as we have trade between countries. Let's extend our model and consider an example with two cities, $i = 1, 2$. To simplify the analysis, let us assume that each city has one firm that has approximately constant returns to scale technology, $\alpha = 1$. Hence average costs are equal to

$$AC_i = w_i D_i + \frac{m_i M_i}{Q_i} \tag{2.11}$$

Let us simplify the analysis by assuming that the second term in the expression above is negligible. In that case, average costs are approximately equal to

$$AC_i \approx w_i D_i \tag{2.12}$$

This approximation is valid for large values of $Q_i$ or small values of $m_i$ and $M_i$. In that case, average costs are primarily a function of the local wages ($w_i$) and the magnitude of the externality ($D_i$). Each firm can potentially sell its product in both cities. Per unit transportation costs between cities are symmetric and given by $TC$. Market prices in city $i$ are given by $p_i$. Notice that they do not necessarily have to be equal due to the existence of transportation costs and differences in local demand. Let us consider an equilibrium with autarky, i.e., an equilibrium without trade. For such an equilibrium to exist, the following condition must hold in city 1:

$$AC_2 + TC > p_1 > AC_1 \tag{2.13}$$

The first inequality states that firm 2 cannot ship its goods to city 1 and make a profit. The second inequality states that firm 1 can make a profit by selling in its home city. A similar condition holds for city 2:

$$AC_1 + TC > p_2 > AC_2 \tag{2.14}$$

Without loss of generality let us assume that firm 2 is more productive than firm 1, i.e., $AC_1 - AC_2 > 0$. The key condition that firm 2 will not enter market 1 can be rewritten as

$$TC > p_1 - AC_2 > AC_1 - AC_2 \tag{2.15}$$
The first inequality states that transportation costs must be high relative to the markup (the difference between price and average costs) for firm 2. The second inequality states that the markup must be larger than the difference in average costs. Heterogeneity in costs must, therefore, be small. Firm 2 cannot be much more efficient than firm 1. In a nutshell, markets need to be geographically far apart so that transportation costs are large, and firms must have similar technologies no matter where they locate. Whenever these two conditions are not met, we would expect that trade occurs and firms would naturally operate in the location that provides the lower cost. In that case, we get geographic concentration of economic production in city 2, which is the more productive city. Only the most productive firms will engage in trade and sell their goods in many markets. The local, less productive firms may not be able to compete with more efficient firms located elsewhere. In that case, competition from efficient cities drives firms in inefficient cities out of business. High transportation costs can serve as a barrier to trade and hence protect inefficient firms in local markets. What happens if firms have increasing returns to scale? In that case, marginal and average costs decrease with firm output. In the absence of transportation costs, the efficient number of firms would be equal to one. In a model with transportation costs, we would expect to observe an equilibrium with a small number of regional monopolists that carve up the world market. Increasing returns to scale can thus explain the geographic concentration of economic activity even if agglomeration externalities are small.
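The no-trade conditions of this section can be written as a small check. The sketch takes the verbal description at face value: a foreign firm stays out when its average cost plus the transportation cost exceeds the local price, and the home firm stays in when the local price exceeds its own average cost. Strict inequalities, the function names, and the numbers are assumptions made for illustration.

```python
def no_entry_in_city(price, ac_home, ac_foreign, tc):
    """Autarky condition for one city: the foreign firm cannot profitably ship in
    (ac_foreign + tc > price) and the home firm can profitably sell (price > ac_home)."""
    return ac_foreign + tc > price > ac_home


def firm2_stays_out(p1, ac1, ac2, tc):
    """Rewritten condition for city 1: transportation costs exceed firm 2's markup,
    and that markup exceeds the cost difference between the two firms."""
    return tc > p1 - ac2 > ac1 - ac2


def is_autarky(p1, p2, ac1, ac2, tc):
    """No trade in either direction."""
    return (no_entry_in_city(p1, ac_home=ac1, ac_foreign=ac2, tc=tc)
            and no_entry_in_city(p2, ac_home=ac2, ac_foreign=ac1, tc=tc))


# Hypothetical numbers: firm 2 is the more productive firm (ac2 < ac1).
print(is_autarky(p1=11.0, p2=9.0, ac1=10.0, ac2=8.0, tc=5.0))  # high transport costs
print(is_autarky(p1=11.0, p2=9.0, ac1=10.0, ac2=8.0, tc=2.0))  # low transport costs
```

Lowering the transportation cost in the second call lets the more productive firm undercut the local firm in city 1, so the no-trade equilibrium breaks down.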
2.5 Measuring the Impact of Agglomeration Externalities on Firm Productivity The theory of agglomeration is compelling. Moreover, there is some anecdotal evidence that agglomeration externalities are important. However, answering the question of how important they are is not easy. Casual empiricism suggests that agglomeration externalities may be large. We observe that the most profitable firms in the financial sector tend to be located in New York City. Similarly, we observe a large concentration of tech firms in Silicon Valley. Wages are also very high in these places, which is strong evidence of high productivity as suggested by our model. We cannot conclude from these observations that agglomeration externalities are large. Correlation does not imply causation. There are clearly many reasons why one would like to live in New York City or San Francisco that have nothing to do with agglomeration externalities. Maybe all the bankers want to live in NYC because they like to indulge themselves in consumption and entertainment opportunities that only NYC can offer. Maybe all the tech geeks love the climate of the Bay Area and like to hang out in Napa and Tahoe or go surfing. We will have to include these types of amenities in our models that characterize how individuals and firms make locational decisions. The problem is clearly that firms and households do not make random locational decisions. As a consequence, there is a lot of sorting or selection of workers and firms among different locations. Some of this is driven by factors that we
FIGURE 2.6. Geographic concentration of economic activity in central business districts. (Burst/ pexels.com)
observe and can easily control for (such as weather). Other factors are much harder to quantify and measure correctly. In addition, it is hard to disentangle exogenous shifts in productivity due to, for example, market access and durable infrastructure from endogenous changes in productivity that arise due to spillovers, input sharing, or learning. Not surprisingly, there is a very large empirical literature in urban economics that applies different approaches to estimate the importance of agglomeration externalities. We will not try to summarize all these papers but instead will focus on a small number of studies to illustrate some of the opportunities and challenges in trying to measure the importance of agglomeration externalities. One of the key problems encountered in the estimation of agglomeration externalities is that it is impossible to conduct a valid field experiment. Firms do not randomly relocate their headquarters or main production facilities. The fact that we observe a lot of sorting of firms by productivity does not necessarily imply that this sorting is driven by agglomeration externalities. What would be the perfect social experiment that we would like to conduct? We would need to randomly relocate a large sample of firms. Then it would be possible to measure the importance of agglomeration externalities. (If you don't believe me, please skip ahead to the section on social experiments in the appendix.) First, we would divide the sample of cities into a treatment group (cities that will receive new firms and thus an increase in agglomeration) and a control group (cities that do not receive new firms). By comparing the outcomes, such as firm profits, costs, output, wages, or productivity, between the treatment and the control group, we could estimate the effect of agglomeration. While it is impossible
to conduct these types of social experiments, we can look for ways to mimic this research design. A clever quasi-experimental design was proposed and implemented by Greenstone, Hornbeck, and Moretti (2010) (GHM). Here is the idea. In the early 1990s BMW decided to enter the market for sports utility vehicles. Better late than never! Since most of the demand for these types of vehicles was in the US, BMW decided to produce these new cars in the US. Since it did not operate a single production plant in the US at that time, it was fairly unconstrained in its choice. BMW announced in 1992 that it would build a 1,150-acre manufacturing facility in Spartanburg County, South Carolina. The plant opened in 1994. We can observe and measure how the opening of the plant in Spartanburg affected the productivity of existing plants in the county. The main question that arises is, then, what counties are similar to Spartanburg and can act as a control group? A natural approach is to look at other cities and counties that BMW was evaluating but did not choose. These counties were in the race to attract the new plant and must have been considered by BMW to have similar characteristics. Generalizing this example, we can view the race to attract a new plant as a quasi-auction, in which cities bid for the right to host the new plant. We observe the winner of the auction, i.e., the city that attracted the new plant. Suppose we also know the city that came in second. GHM suggested using cities that won the bidding process as the treatment group and cities that came in second as the control group of this quasi experiment. GHM collected a dataset consisting of a sample of forty-seven firms that relocated a large plant or facility to a different location or opened a new facility. They observed the county in which the plant ultimately chose to locate (the winning county), as well as the one or two runner-up counties (the losing counties).
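The winner-versus-runner-up comparison can be illustrated with a simple difference-in-differences calculation. This is a stylized sketch of the comparison logic only, not the estimator GHM actually implemented; the function name and the productivity numbers are hypothetical.

```python
from statistics import mean


def difference_in_differences(winner_before, winner_after, loser_before, loser_after):
    """Change in average productivity in winning counties minus the change
    in losing counties over the same period."""
    winner_change = mean(winner_after) - mean(winner_before)
    loser_change = mean(loser_after) - mean(loser_before)
    return winner_change - loser_change


# Hypothetical log productivity of incumbent plants, before and after a plant opening.
winners_before = [1.0, 1.2]
winners_after = [1.3, 1.5]
losers_before = [1.0, 1.1]
losers_after = [1.1, 1.2]

effect = difference_in_differences(winners_before, winners_after,
                                   losers_before, losers_after)
print(round(effect, 3))
```

Subtracting the change in the runner-up counties removes shocks common to both groups, which is why the choice of a comparable control group matters so much in this design.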
They could, therefore, analyze how the productivity of the firms changed before and after the opening of the new plant in both types of counties. More specifically, we can estimate the production functions of firms in the winning and losing counties by using a Cobb-Douglas production function:
\[ Q = A L^{\alpha} K^{\gamma} \tag{2.16} \]
Taking logs, we obtain
\[ \ln(Q) = \ln A + \alpha \ln L + \gamma \ln K \tag{2.17} \]
Measuring output is difficult, especially for multiproduct firms. As a consequence, we are often forced to use firm revenues as an output measure in empirical studies. (To account for differences in local prices, you may want to use local price deflators to convert revenues into output.) Given these measurement problems, it makes sense to assume that the log of output is measured with error, denoted by u:
\[ \ln(\tilde{Q}) = \ln(Q) + u \tag{2.18} \]
We observe \(\tilde{Q}\) and not \(Q\). Hence we can write observed output as
\[ \ln(\tilde{Q}) = \ln A + \alpha \ln L + \gamma \ln K + u \tag{2.19} \]
We observe output and input of firm i in city j at time t. Adding subscripts we obtain the following regression model:
\[ \ln(\tilde{Q}_{ijt}) = \ln A_{ijt} + \alpha \ln L_{ijt} + \gamma \ln K_{ijt} + u_{ijt} \tag{2.20} \]
The term \(A_{ijt}\) is sometimes called total factor productivity (TFP). It can be measured as the systematic component of the residual of the production function that cannot be explained by labor and capital (and other observed inputs). It serves as a common measure of productivity of a firm. We can decompose that term into a location-time specific component, denoted by \(\delta_{jt}\), and a firm-time specific shock, denoted by \(\epsilon_{it}\):
\[ \ln A_{ijt} = \delta_{jt} + \epsilon_{it} \tag{2.21} \]
Changes in agglomeration externalities then primarily affect all firms within a given location. We would, therefore, expect that \(\delta_{jt}\) increases if agglomeration externalities increase in city j at time t and vice versa. In particular, we would expect that \(\delta_{jt}\) increases in the "winning" county and decreases in the "losing" county. So how can we identify and estimate the common component of TFP denoted by \(\delta_{jt}\)? Suppose we knew \(\alpha\) and \(\gamma\); we could rearrange terms and obtain
\[ \ln(\tilde{Q}_{ijt}) - \alpha \ln L_{ijt} - \gamma \ln K_{ijt} = \delta_{jt} + \epsilon_{it} + u_{ijt} \tag{2.22} \]
By construction, the last two terms on the right-hand side reflect idiosyncratic shocks and have mean zero, i.e., \(E[\epsilon_{it} + u_{ijt}] = 0\). So if we knew \(\alpha\) and \(\gamma\), we could estimate \(\delta_{jt}\) just by averaging over the left-hand-side terms:
\[ \hat{\delta}_{jt} = \frac{1}{N_{jt}} \sum_{i=1}^{N_{jt}} \left[ \ln(\tilde{Q}_{ijt}) - \alpha \ln L_{ijt} - \gamma \ln K_{ijt} \right] \tag{2.23} \]
where \(N_{jt}\) is the number of firms in location j at time t. Of course, we don't know \(\alpha\) and \(\gamma\). Hence we need to estimate them simultaneously with the \(\delta\)'s, which adds some complications to the analysis. We do not need to get into these details here. The sample is restricted to the period from seven years before the opening to five years after the opening. The paper then documents two important findings. First, in the years before the new plant opening, TFP trends among incumbent plants were similar in winning and losing counties. Hence there is no evidence that firms in winning counties systematically differ from firms in losing counties. Or, in the language of econometrics, there is no evidence of selection based on firm location prior to the new plant opening. Second, beginning in the year of the plant opening, there is a sharp upward break in the difference in TFP between firms in winning and losing counties. The authors find that this relative improvement is mainly due to the continued TFP decline in losing counties and a flattening of the TFP trend in winning counties. This underscores the importance of the availability of losing counties as a counterfactual. If we only look at the (relatively) constant average productivity in
the treatment group, we might conclude that there are no benefits to attracting new firms. However, if the counties would otherwise be declining, attracting a new plant would stem that decline and effectively be beneficial. Five years after the opening, the TFP of incumbent plants in winning counties is approximately 5-7 percent higher than the TFP of incumbent plants in losing counties. Consistent with the theory of agglomeration, this effect is larger for incumbent plants that share similar labor and technology pools with the new plant. GHM also find evidence of a relative increase in skill-adjusted labor costs in winning counties since the demand for labor will increase as productivity rises. While interesting, these results need to be interpreted with caution. The results are identified based on the relocation of a small number of large manufacturing plants. Patrick (2016) provides a detailed discussion of how sensitive the results are to different robustness checks. Her preferred estimates suggest that the relevant effects on output, employment, and earnings are much smaller, but still significant. She concludes that it would take a $79,500 subsidy per job to attract one of the large plants studied in GHM, which is roughly twice as large as the estimate in GHM. In addition, the results are largely driven by the fact that average productivity of firms significantly declines in losing counties, and not because of productivity increases in the winning counties. That finding is somewhat counterintuitive. It may be due to the fact that firms in both losing and winning counties may have been in decline for reasons completely unrelated to agglomeration externalities. Moretti (2004b) also studies the magnitude of human capital spillovers. He estimates a production function using a firm-worker matched dataset.
He finds that the productivity of plants in cities that experience large increases in the share of college graduates rises more than the productivity of similar plants in cities that experience small increases in the share of college graduates. But note that some of these productivity gains are offset by increased labor costs. He also finds that within a city, spillovers between industries that are economically close are larger than spillovers between industries that are economically distant. We study the role of local labor markets and agglomeration externalities in more detail in later chapters of this book. Using French establishment-level data, Combes, Duranton, Gobillon, Puga, and Roux (2012) find that differences in firm productivity across space are primarily explained by agglomeration externalities. Larger cities may also toughen competition, allowing only the most productive to survive. Surprisingly, they find that this selection effect plays almost no role in their data.
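The averaging estimator in equation (2.23) and the winner-versus-loser comparison used by GHM can be illustrated with simulated data. The following is only a minimal sketch: the input elasticities, the values of the location-time components, the noise levels, and all function names are hypothetical choices made for illustration, and the elasticities are treated as known, which sidesteps the joint estimation problem mentioned above. None of the numbers are taken from GHM.

```python
import random
from statistics import fmean

# Assumed known input elasticities (hypothetical values, not estimates).
ALPHA, GAMMA = 0.6, 0.3

# Hypothetical location-time TFP components: flat in the winning county,
# declining in the losing county after the plant opening.
TRUE_DELTA = {
    ("win", "pre"): 1.00, ("win", "post"): 1.00,
    ("lose", "pre"): 1.00, ("lose", "post"): 0.94,
}


def simulate_cell(delta, n_firms, rng):
    """Draw (ln L, ln K, observed ln Q) for n_firms plants that share delta."""
    plants = []
    for _ in range(n_firms):
        ln_l = rng.gauss(4.0, 0.5)
        ln_k = rng.gauss(5.0, 0.5)
        eps = rng.gauss(0.0, 0.1)  # firm-time productivity shock
        u = rng.gauss(0.0, 0.1)    # measurement error in log output
        ln_q_obs = delta + eps + ALPHA * ln_l + GAMMA * ln_k + u
        plants.append((ln_l, ln_k, ln_q_obs))
    return plants


def estimate_delta(plants):
    """Average the input-adjusted log output over plants, as in (2.23)."""
    return fmean(q - ALPHA * l - GAMMA * k for l, k, q in plants)


rng = random.Random(0)
delta_hat = {cell: estimate_delta(simulate_cell(d, 5000, rng))
             for cell, d in TRUE_DELTA.items()}

# Difference-in-differences: change among winners minus change among losers.
did = ((delta_hat[("win", "post")] - delta_hat[("win", "pre")])
       - (delta_hat[("lose", "post")] - delta_hat[("lose", "pre")]))

print({cell: round(v, 3) for cell, v in delta_hat.items()})
print("relative TFP gain in the winning county (log points):", round(did, 3))
```

In this simulated example the winning county shows no productivity gain of its own, yet the comparison with the losing county reveals a positive relative effect, which is the reason a counterfactual group matters.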
2.6 The Local Nature of Agglomeration Externalities
One important challenge is to determine the geographic scale on which agglomeration externalities operate. Is it the region, the city, or the local neighborhood? Rosenthal and Strange (2003) were the first to address this question in a systematic analysis. They focus on a variety of high-tech and manufacturing industries and find that the benefits of agglomeration are highly localized. For many industries, agglomeration benefits disappear outside of a radius of 1 mile. Arzaghi and Henderson (2008) consider advertising agencies in Manhattan, which accounts for 24 percent of advertising agency receipts in the US. While
there are some large and well-known agencies on Madison Avenue, there are over a thousand smaller agencies in different clusters spread over the southern half of Manhattan. Advertising is known for the key role that networking plays in the operation of agencies. We can interpret localized networking as a case of information spillovers. Arzaghi and Henderson examine the effect on productivity and profitability of having nearer advertising agency neighbors. The idea is that proximity provides information exchange. They show that there is a rapid spatial decay in the benefits of nearer neighbors even in the close quarters of southern Manhattan. This suggests that having a high density of similar commercial establishments is important in enhancing local productivity for industries where information sharing plays a critical role. Of course, rents are much higher in Manhattan than in other parts of New York City. Hence advertising agencies are paying higher rents for the benefits of higher spillovers or better opportunities to network. More generally, most urban economists think that manufacturing spillovers operate at a city or even regional scale, while many service and technology sectors, such as advertising, software development, or finance, are much more locally clustered.
2.7 The Effects of Government Regulations
Geographic concentration of economic activities is also driven by government regulations. The federal government exercises much power in regulating specific industries. To understand the impact of federal regulatory policies on geographic concentration of economic activity, it is useful to consider some examples. The banking industry is an interesting one. After the Great Depression, the US Congress passed the Banking Act of 1933, commonly referred to as the Glass-Steagall Act. This law created a separation of commercial and investment banking. It also prevented securities firms and investment banks from taking deposits. Finally, it prevented commercial banks from dealing in nongovernment securities and investing in noninvestment-grade securities for themselves. The Financial Services Modernization Act of 1999 repealed part of the Glass-Steagall Act of 1933. In particular, it removed barriers in the market among banking companies, securities companies, and insurance companies, blurring the distinction between investment banks, commercial banks, and insurance companies. The recent economic recession that started in 2008 was largely driven by the housing crisis and the collapse of major banks. One criticism of the recent deregulation of the banking sector is that federal policies created strong incentives for banks to grow large and become "too big to fail." Before the deregulation of the banking sector, there were many more mid-sized commercial banks, headquartered in cities across the United States. The majority of these medium-sized banks have been taken over by a small number of very large banks. These mergers and acquisitions increased the geographic concentration of the financial service industry in the US and were not necessarily driven by knowledge spillovers or other agglomeration externalities.
The deregulation of the airline industry is another helpful example that illustrates how federal policies have helped create the geographic concentration of
economic activity. The deregulation of the airline industry led to a consolidation among US carriers and favored a small number of large cities that serve as primary hubs for the modern airline industry. The hub-and-spoke system that large carriers use may or may not be cost-efficient. That's somewhat of an open question in transportation economics. However, this concentration of the airline industry can be attributed to changes in federal regulatory policy, which encouraged the creation of monopolies on many routes. Moreover, the cost of additional security that was required following the 9/11 attacks also impacted the viability of many small airports. The entertainment sector is also highly concentrated. For example, the largest company in Philadelphia is Comcast, a regional cable monopolist. Geographic concentration is driven by technology, monopoly power, and lack of federal regulation, and not agglomeration externalities. Similarly, recent changes in patent policies have shifted the power to large tech companies that can afford to engage in expensive litigation and patent enforcement. This may explain the large concentration of quasi-monopolistic technology firms in a small number of places such as Silicon Valley. In a nutshell, agglomeration externalities are important, but there are other important reasons for the economic concentration of firms in cities as well.
2.8 A Case Study: The Impact of Brexit on the City of London
Uncertainty about future regulatory policies and market access can also be problematic. This point can be illustrated by the recent vote of the UK electorate to leave the European Union (EU), referred to as Brexit. London's financial center plays a large role in the British economy. As one of the global hubs of the financial services sector, it accounts for about 12 percent of Britain's economic output. In addition, it pays more taxes than any other industry in the UK. In a referendum on June 23, 2016, 51.9 percent of the UK electorate voted to leave the EU. Banks located in London potentially have a lot to lose from the end of easy access to the EU's market of 440 million people. According to Reuters, about a third of the transactions that take place in London involve clients in the EU. Article 50 of the Treaty on the European Union determines the rules for withdrawing from the EU. Once the UK notifies the European Council of its intention to withdraw, the EU is required to negotiate an agreement with the UK, setting out the arrangements for its withdrawal. At the same time, the new agreement will specify the future relationship of the UK with the European Union. Until the transition is over, there remains much policy uncertainty. Firms may suffer as a result since it makes it difficult to form long-term strategies or evaluate the feasibility of long-term projects. Brexit is likely to cause major disruptions for the financial industry, which could lead to significant firm relocations. Given the importance of agglomeration externalities, a large loss of jobs could be devastating and create a vicious cycle of decline. Of course, there are not only risks but also opportunities. For example, Brexit could allow the UK government to create a more favorable regulatory
environment. Moreover, it could force firms in the financial sector to streamline their operations and thus generate some efficiency gains.
2.9 Conclusions
Modern economies are organized around cities and metropolitan areas since cities can make firms and individuals more productive. Some of the advantages of cities arise naturally due to proximity to important waterways. Other advantages require heavy investments in infrastructure such as airports, subway systems, convention centers, and industrial zones. These investments often reduce transportation, commuting, and communication costs. Agglomeration arises due to knowledge spillovers. Knowledge spillovers can explain why firms located in certain cities have higher productivity and profits than similar firms in other cities. Knowledge spillovers are more likely to be important when knowledge is tacit. Hence knowledge spillovers are more essential in high-tech and creative industries. Agglomeration externalities can, therefore, explain the existence of cities when there are no natural or man-made advantages to the location. In addition, federal policies often favor the creation of regional or national monopolists or market structures with a small number of large players. These policies then provide strong incentives to create a geographic concentration of economic activities even when agglomeration externalities or natural advantages are not important. Agglomeration economies tend to increase with city size. That fact seems to suggest that cities should be large. As we will see in subsequent chapters, there are also many problems and costs that increase with city size. For example, larger cities become more expensive as the cost of housing and urban transportation rises. The price of other nontradable goods also rises. In addition, amenities (such as parks) and public goods (such as schools) suffer from congestion effects. These forces argue for smaller cities. Henderson (1974) argues that there should be an optimal city size that balances these two forces. Overall, good urban fiscal policy is needed to balance the benefits of size with its costs.
Let us end this chapter with a word of caution. The economic advantages of a city should not be viewed as "static" or "deterministic." There is no guarantee that the benefits of agglomeration will be realized in any given city. It would be foolish to think that an individual's productivity automatically increases by 10 percent just by taking the train from Philadelphia to New York! Agglomeration externalities are largely a function of information flows, learning, and sharing and thus require the active participation of individuals in formal and informal information networks.
2.10 Technical Appendix: Cost Functions
2.10.1 Cost-Efficient Production and Competition
The cost-efficient output level minimizes the average costs. We can derive the efficient level of production by solving the following problem:
(2.24)
Taking the derivative of the function above with respect to Q and setting it equal to zero yields the cost-efficient level of output, denoted by Q*. Convince yourself that the solution to this problem is given by (2.25)
Note that at Q*, we have AC = MC. As M rises and D falls, Q* rises. As agglomeration rises (that is, as A rises), Q* rises. As labor becomes more productive (that is, as α rises), Q* rises and firms become bigger.
2.10.2 Multiple Input Factors
Let us now extend the model to allow for multiple input factors. For simplicity we ignore fixed costs, i.e., M = 0. Let's consider an extended version of the production function with labor and capital, denoted by K. (2.26)
Note that we use A to denote a constant baseline productivity and N
0, m is income, p is the price of the membership, G is the quality of the club, and N is the number of members. The utility of not joining the club is
U(0,0,m,0) = a+m
(4.58)
a) What is the maximum willingness to pay for the membership? How does your willingness to pay change with G and N?
b) Assume that the club is operated by a monopolist who chooses membership, N, and quality, G, to maximize profits. If the costs of running the club are G + N, what are the profit-maximizing choices of N and G?
c) What choices of N and G maximize the welfare of a typical member of the club if the costs of the club are shared equally?
d) Compare the monopolistic and welfare-maximizing equilibrium and discuss the differences.
6. Rudi and Heidi are roommates and are thinking of buying a sofa for their apartment. Rudi's utility function is
\[ U_r(s, z_r) = (1 + s)\, z_r \tag{4.59} \]
Heidi's utility function is (4.60)
where s = 0 if they do not buy the sofa and s = 1 if they do. \(z_r\) and \(z_h\) denote private consumption. The price of private consumption is equal to 1. Each of them has disposable income equal to $100. Let \(P_r\) denote the amount that Rudi pays and \(P_h\) the amount that Heidi pays if the sofa is purchased.
a) Derive the budget constraint for both Rudi and Heidi. What are the utilities of Rudi and Heidi if they buy the sofa? (Hint: These should depend on \(P_h\) and \(P_r\), respectively.) What are the utilities if they do not buy the sofa?
b) What is the maximum amount that Rudi is willing to pay for the sofa, i.e., what is the price at which he is indifferent between owning and not owning the sofa? What is the maximum that Heidi is willing to pay for the sofa? What is the maximum that they are both willing to pay together?
c) Suppose the sofa costs $100. Will they buy the sofa? Suppose the price of the sofa drops to $75. What do you predict now? Suppose the price is $20. What do you predict in that scenario? Explain your answers.
7. Donald and Hillary are roommates and are thinking of buying an old radio for their apartment so that they can follow the general election results. The radio costs $20. Donald's utility function is (4.61)
Hillary's utility function is (4.62)
where r = 0 if they do not buy the radio and r = 1 if they do. \(z_d\) and \(z_h\) denote private consumption. The price of private consumption is equal to 1. Each of them has disposable income equal to $100. Let \(P_d\) denote the amount that Donald pays and \(P_h\) the amount that Hillary pays if the radio is purchased.
a) What is the maximum willingness to pay for the radio for each person (i.e., the amount that makes each of them indifferent between owning and not owning the radio)?
b) What is the equilibrium if they must reach unanimous agreement? What is the equilibrium if one of them can unilaterally buy the radio?
8. Consider three consumers who care about the consumption of a private good and their consumption of a public good. Their utility functions are given by (4.63)
where \(z_i\) is consumer i's consumption of the private good and G is the amount of the public good consumed by all. The unit cost of the private good is $1 and the unit cost of the public good is $10. Individual income levels are \(m_1 = 30\), \(m_2 = 50\), and \(m_3 = 20\).
a) Compute the marginal rate of substitution between G and \(z_i\) for each of the three consumers.
b) Derive the Samuelson condition for this model.
c) Derive the aggregate resource constraint and compute the optimal level of public good consumption.
d) Explain why it is difficult to implement an efficient level of public good provision in practice.
9. Consider a model with two players. The utility function is given by
\[ U_i(z_i, G) = z_i + \alpha_i G - \frac{1}{2} G^2 \tag{4.64} \]
where \(\alpha_2 > \alpha_1\).
a) Compute the Lindahl equilibrium.
b) Show that the equilibrium is Pareto optimal.
c) What is the optimal sharing rule?
d) Why is it difficult to implement the optimal sharing rule?
10. Suppose you observe that the total amount of taxes raised by the Vickrey-Clarke-Groves (VCG) mechanism equals the total amount of subsidies (or
side payments) that are needed to incentivize the players. Can you then conclude from that observation that the outcome that is implemented by VCG is efficient? Explain. 11. Consider the three-player model discussed in section 4.10 of this chapter. The utility function is given by (4.65)
a) Show that utility monotonically decreases as we move away from the bliss point. (Hint: You can do that analytically or by drawing a graph.)
b) Show that the taxes for players 2 and 3 are given by \(t_2 = 1600\) and \(t_3 = 900\).
c) What would be the outcome under majority rule in this example? (Hint: Assume that player 2's preferred policy is implemented under majority rule. We will show this result in chapter 7 of this book when we discuss the median voter theorem.)
d) Show that the majority rule outcome is more efficient than VCG in this example. What's the intuition?
5
Voluntary Provision of Local Public Goods and Services
5.1 Motivation
In the previous chapter, we derived the gold standard for the provision of local public goods and services. We have seen that efficient levels of public good provision need to satisfy the Samuelson condition. In a world with congestion, there is another efficiency condition that describes the ideal size of the city. It is difficult to implement these efficiency conditions. At least two challenges arise. First, individuals find it difficult to state their preferences or their willingness to pay for public goods. Second, individuals may have strong incentives to disguise themselves and not truthfully report their own willingness to pay for the good. The incentives to misrepresent arise whenever the tax burden depends on stated preferences. As a consequence, it is rather challenging to implement the efficient level of public goods. How, then, do we implement the provision of public goods in practice? Do we really need to use the force of the government to overcome the free rider problem and provide public goods and services at a reasonable level? In this chapter, we primarily study public good provision by individuals when the government plays either no role or a relatively weak role in the provision of public goods and services. We consider mechanisms in which the provision of the public good is purely voluntary: individuals are not forced to participate in the public good provision process and can opt out. These mechanisms rely on voluntary contributions to the common cause. We would like to know whether and under what conditions these mechanisms can produce reasonable or desirable outcomes. There are many examples of voluntary provision of local public goods and services. Many private and public universities in the US rely on donations from alumni as well as grants from private foundations and the federal government.
Social, cultural, and environmental not-for-profit organizations such as the United Way, which collects donations for many social charities, rely heavily on private donations. Volunteer fire departments and neighborhood crime watch programs rely on individuals donating their time. Under what conditions can we expect that voluntary provision of public goods leads to a reasonably efficient outcome?
FIGURE 5.1. Mummers Parade in Philadelphia: Voluntary provision of entertainment. (Photo by author)
5.2 A Model of Voluntary Provision of Public Goods
To gain some insights into this problem, we consider a model with two individuals in which a public good is financed through their private contributions (\(g_1\) and \(g_2\)). This model, which was developed by Bergstrom, Blume, and Varian (1986), determines the level of public goods as follows: (5.1)
Let us assume for simplicity that preferences are Cobb-Douglas: (5.2)
FIGURE 5.2. Best-response function.
Moreover, private consumption for both players is (5.3)
Note that we assume for simplicity that both individuals have the same amount of income and have identical preferences. We discuss the impact of heterogeneity among individuals below. We can substitute the budget constraint and the expression that determines public good provision into the utility function and obtain (5.4)
Determining the individuals' contributions is a hard problem to solve. The private contributions of individual 1 crucially depend on the level of contributions of individual 2. For example, if the first individual anticipates that the second individual will contribute a lot of resources to the common cause, she will likely not contribute much herself and vice versa. To deal with these types of strategic interactions, we can conduct the following thought experiment. We can solve the decision problem of player 1 for any arbitrary level of \(g_2\). That way, we obtain a best-response function \(g_1(g_2)\). This function is downward sloping because individual 1 will donate less if she anticipates that individual 2 will donate at a higher level. This function is illustrated in figure 5.2.
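To see why the best-response function slopes downward, it may help to work through one concrete specification. The derivation below is only a sketch under assumptions that are not confirmed by the text: utility is taken to be the logarithmic form of Cobb-Douglas preferences, contributions are measured in dollars so that the public good level is total contributions divided by the unit cost c, and private consumption is income minus the individual's own contribution.

```latex
% Assumed specification (hypothetical, chosen for illustration):
%   U_1 = \alpha \ln G + (1 - \alpha) \ln z_1,
%   G   = (g_1 + g_2)/c,
%   z_1 = m - g_1 .
\begin{align*}
\max_{g_1 \ge 0} \;\; & \alpha \ln\!\left(\frac{g_1 + g_2}{c}\right)
                        + (1 - \alpha) \ln (m - g_1) \\[4pt]
\text{first-order condition:} \quad
  & \frac{\alpha}{g_1 + g_2} = \frac{1 - \alpha}{m - g_1} \\[4pt]
\Longrightarrow \quad
  & \alpha (m - g_1) = (1 - \alpha)(g_1 + g_2) \\[4pt]
\Longrightarrow \quad
  & g_1(g_2) = \alpha m - (1 - \alpha)\, g_2 .
\end{align*}
```

Under these assumptions the slope of the best-response function is \(-(1 - \alpha)\), which lies strictly between \(-1\) and 0 whenever \(0 < \alpha < 1\), and the intercept \(\alpha m\) rises with income and with the weight placed on the public good. The expression applies only while the implied contribution is nonnegative.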
The exact shape of this best-response function, such as the slope and intercept, depends on the preferences of the individual, his or her income, and the cost of providing the public good. For the example considered above, we show in the technical appendix (section 5.8) that the best-response function of individual 1 is given by the following equation: (5.5)
Thus individual 1's optimal contribution to the public good is decreasing in individual 2's contribution to the public good, increasing in income, and increasing in preferences for public goods (\(\alpha\)). Since both individuals are identical in our example, it is not surprising that individual 2's best-response function is given by (5.6)
John Nash suggested that an equilibrium for these types of games is achieved when individual 1 plays a best response to individual 2, and individual 2 plays a best response to individual 1. In that case, nobody wants to change their behavior. In game theory, we call this a Nash equilibrium. A geometric representation of the Nash equilibrium of this game is given by the intersection of the two best-response functions. The analytic representation of the equilibrium is, therefore, given by the solution to the system of two equations above. It is not hard to show that the equilibrium levels of contributions are given by (5.7)
Note that this equilibrium is symmetric because the individuals are identical. If individuals differ in their incomes or preferences for the public good, the equilibrium is not necessarily symmetric. Figure 5.3 illustrates an asymmetric equilibrium of this model. The total provision of public goods is given as follows:
\[ G^n = \frac{2 g^n}{c} = \frac{2 \alpha m}{c\,(2 - \alpha)} \tag{5.8} \]
In section 4.8 we showed that the efficient level of public good provision is given by (5.9)
Comparing the provision under private contributions to the Pareto efficient provision levels, we find that \(G^n < G^e\). This follows from the fact that \(\alpha < 1\). If \(\alpha\) is small, we obtain slightly more than 50 percent of the efficient level in equilibrium. This result shows that private provision of public goods leads to underprovision. The individuals would both like to increase the public good to the Pareto efficient level, but each prefers that the other pay for this increase. In equilibrium, neither is willing to make an additional contribution.
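The underprovision result can be checked numerically. The sketch below assumes the logarithmic Cobb-Douglas specification with the public good equal to total contributions divided by the unit cost c, which is one specification consistent with equation (5.8); the function names and parameter values are hypothetical. It finds the Nash equilibrium by iterating best responses and compares the outcome with the efficient level implied by the Samuelson condition under the same assumptions.

```python
def best_response(g_other, alpha, m):
    """Contribution maximizing alpha*ln(G) + (1 - alpha)*ln(m - g),
    where G = (g + g_other) / c. Truncated at zero (corner solution).
    The unit cost c rescales G but does not change the best response."""
    return max(0.0, alpha * m - (1.0 - alpha) * g_other)


def nash_contributions(alpha, m, rounds=200):
    """Iterate simultaneous best responses starting from zero contributions."""
    g1 = g2 = 0.0
    for _ in range(rounds):
        g1, g2 = best_response(g2, alpha, m), best_response(g1, alpha, m)
    return g1, g2


# Hypothetical parameter values chosen only for illustration.
alpha, m, c = 0.3, 100.0, 2.0

g1, g2 = nash_contributions(alpha, m)
G_nash = (g1 + g2) / c
# Efficient level implied by the Samuelson condition under these assumptions.
G_eff = 2.0 * alpha * m / c

print("equilibrium contribution per person:", round(g1, 3))
print("share of the efficient level provided:", round(G_nash / G_eff, 3))
```

Because the best-response slope is smaller than one in absolute value, the iteration converges, and the share of the efficient level that is provided equals 1/(2 - alpha) in this specification, which approaches one half as alpha becomes small.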
FIGURE 5.3. Nash equilibrium.
This incentive to let others pay for the public good while still enjoying the benefits is an illustration of the free rider problem. The "private market" will, therefore, fall short of providing the efficient amount of the public good. Voluntary provision of public goods leads to a serious underprovision, i.e., not enough public goods are provided in the economy. A natural question to ask is whether a government intervention can improve the outcome. Let us consider an extended version of our model in which both the government and individuals provide public goods. Suppose the government makes a contribution to the public good equal to gp. In this case, the level of public goods is determined as follows: (5.10)
The government needs to finance its contributions. In this model, this is only possible by taxing the two individuals. We assume for simplicity that government contributions are funded through lump-sum taxes on individuals equal to \(g_p/2\). Taking individual 2's and the government's contribution as given, individual 1 chooses her contribution (\(g_1\)) to maximize the following utility function: (5.11)
Note that the government intervention has two effects on the individual. First, it lowers the after-tax income to \(m - g_p/2\) since the individual has to pay taxes
given by \(g_p/2\). There is no free lunch here; somebody has to pay taxes! We would expect that this negative income effect would lower the voluntary contributions of individuals. Second, the individual recognizes that an additional amount of the public good will be provided, no matter what she does. This second effect also lowers the willingness of an individual to donate her own resources since the utility function is concave in G. We can illustrate these two effects by shifting the individual's best-response function down. For the example considered above, we show in the technical appendix that individual 1's best-response function is now given by (5.12) Individual 2's reaction function is similar. Private contributions in the symmetric equilibrium are given as follows: (5.13) Adding the contributions of both individuals gives us the total private contributions to the public good: (5.14) In this example, the government contributions to the public good crowd out the private contributions. For every dollar contributed by the government, total individual contributions fall by one dollar. Thus government interventions designed to increase public goods are fully offset by private reductions. The example considered above is somewhat extreme since it implies full or one-for-one crowd-out. In reality, it is unlikely that the crowd-out effect is that large. The relevant policy question is, then, how large is the crowd-out effect in practice? To answer that question, we need to conduct some careful empirical analysis.
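The one-for-one crowd-out result can also be illustrated numerically. As before, this is a sketch under an assumed specification (logarithmic Cobb-Douglas utility, a public good equal to total private and government contributions divided by the unit cost, and a lump-sum tax of half the government contribution on each individual). The function names and parameter values are hypothetical.

```python
def best_response(g_other, g_gov, alpha, m):
    """Best response when the government contributes g_gov and finances it
    with a lump-sum tax of g_gov / 2 per person (assumed specification)."""
    after_tax_income = m - g_gov / 2.0
    return max(0.0, alpha * after_tax_income
               - (1.0 - alpha) * (g_other + g_gov))


def total_private_contributions(g_gov, alpha, m, rounds=200):
    """Sum of equilibrium private contributions, found by iterating
    simultaneous best responses from zero."""
    g1 = g2 = 0.0
    for _ in range(rounds):
        g1, g2 = (best_response(g2, g_gov, alpha, m),
                  best_response(g1, g_gov, alpha, m))
    return g1 + g2


# Hypothetical parameter values chosen only for illustration.
alpha, m = 0.3, 100.0
totals = {g_gov: total_private_contributions(g_gov, alpha, m)
          for g_gov in (0.0, 10.0, 20.0)}

for g_gov, private in totals.items():
    print("government:", g_gov,
          "private:", round(private, 3),
          "combined:", round(private + g_gov, 3))
```

In this specification the combined amount stays constant as the government contribution rises, as long as private contributions remain positive. Once the government contributes more than individuals would have given on their own, private contributions hit zero and further government spending does raise the total.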
5.3 Empirical Evidence of Crowd-Out
Faith-based organizations play a large role in the voluntary provision of welfare services in the United States. Half of all charitable giving in the US goes to religious and faith-based organizations that supplied social services to seventy million Americans in 2005 with estimated expenditures equal to $24 billion. Hungerman (2005) studies the crowd-out of church-provided welfare, such as soup kitchens, by government welfare spending. He uses a panel dataset of 11,000 Presbyterian Church congregations with 2.5 million members in his sample. Donations to church i in county k at time t are given by \(D_{ikt}\). Government welfare expenditures in county k at time t are given by \(G_{kt}\). Hungerman then considers the following regression model: (5.15)
where $X_{ikt}$ denotes observable characteristics and $u_{ikt}$ denotes an error term. He also estimates a similar model for a church's social spending as the dependent variable. Hungerman is mainly concerned about the possible correlation between $G_{kt}$ and $u_{ikt}$ due to omitted variables. Hence ordinary least squares may not give us a reasonable estimator for $\alpha_1$. To construct an instrument for government welfare spending, Hungerman turns to a provision in the Personal Responsibility and Work Opportunity Reconciliation Act, which changed welfare laws in the US in 1996. We study this welfare reform law in much more detail in chapter 17. This law changed the eligibility criteria for welfare services from legal residency to legal citizenship. Hence the bill greatly decreased the availability and use of welfare services by noncitizens. For example, most noncitizens in the United States lost eligibility for food stamps. States were given the option to decide whether or not to extend welfare or Medicaid services to noncitizens. As a consequence of these severe reductions in government welfare spending, we observe an increase in welfare spending of not-for-profit groups after welfare reform. Hungerman then adopts an instrumental variable estimation strategy. Broadly speaking, he exploits the variation in government welfare spending at the county level that is due to the provisions of the welfare reform law that affected noncitizens. In particular, the first stage of Hungerman's two-stage least squares estimator is given by (5.16) where $Z_{kt}$ is the instrument. Hungerman defines a postreform time dummy variable, which is equal to 1 if time is greater than 1996 and 0 otherwise. He then interacts this postreform dummy with the percentage of noncitizens in the county to obtain his instrument. Note that an instrument is just a "special regressor" that is included in the regression in equation (5.16) but is excluded from the main regression in equation (5.15).
The instrument predicts government spending but does not predict donations, once we control for other regressors. Hungerman finds that the estimate of $\gamma_1$ is negative and economically large in magnitude. This confirms his conjecture that eligibility restrictions on noncitizens led to relative declines in welfare spending in communities with large noncitizen populations after the 1996 welfare law was passed. Next, he uses the first-stage regression model to predict the value of government spending, denoted by $\hat{G}_{kt}$. Broadly speaking, the second stage of the estimator then replaces $G_{kt}$ by $\hat{G}_{kt}$ in the regression model in equation (5.15) and uses least squares. Hungerman does not find evidence supporting a crowd-out effect of government spending on donations. In contrast, he finds that the estimated crowd-out effect of government spending on church spending lies between 20 and 38 cents on the dollar. He concludes that church activities partially substitute for government activities. However, his findings raise questions about how churches financed increased charitable spending in response to the welfare reform law of 1996. Overall, the empirical findings of this literature suggest that government spending has a modest crowd-out effect on individual donations. As a consequence, the simple Bergstrom-Blume-Varian model considered above seems to overpredict the importance of the crowd-out effect.
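The two-stage procedure described above can be sketched on simulated data. This is not Hungerman's specification or data: the sample size, the coefficient values, and the names `z`, `u`, `g`, `d`, and `TRUE_ALPHA1` are all hypothetical, and control variables are omitted. The sketch only illustrates why least squares is biased when government spending is correlated with the error term, and how an instrument that shifts spending but is excluded from the donation equation removes that bias.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process; every number here is an assumption.
z = rng.normal(size=n)   # instrument (stand-in for post-reform dummy x noncitizen share)
u = rng.normal(size=n)   # unobserved factor that moves both spending and donations
g = 1.0 - 0.8 * z + 0.9 * u + rng.normal(size=n)   # government spending
TRUE_ALPHA1 = -0.3       # assumed effect of government spending on donations
d = 2.0 + TRUE_ALPHA1 * g + 1.5 * u + rng.normal(size=n)   # donations


def ols(y, x):
    """Regress y on a constant and x; return (intercept, slope)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1]


# Naive least squares of D on G is biased because G is correlated with u.
_, alpha1_ols = ols(d, g)

# First stage: regress G on the instrument and form fitted values.
gamma0, gamma1 = ols(g, z)
g_hat = gamma0 + gamma1 * z

# Second stage: regress D on the fitted values instead of G itself.
_, alpha1_2sls = ols(d, g_hat)

print(f"first-stage slope: {gamma1:.3f}")
print(f"OLS estimate:      {alpha1_ols:.3f}")
print(f"2SLS estimate:     {alpha1_2sls:.3f} (assumed true value {TRUE_ALPHA1})")
```

Running the second stage by hand gives the right point estimate but not the right standard errors, which is one reason applied work typically relies on a dedicated instrumental variables routine.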
5.4 Warm Glow and Private Benefits
Some private donations are motivated by the desire of individuals to provide local public goods and services. These types of donations can be explained by the model we considered above. Most private donations, however, do not seem to be motivated by the desire of individuals to contribute to a public good. Very few individuals have ever sent a check to the City of Philadelphia asking the city to use the money to hire additional police officers or firefighters. We therefore need to look for alternative explanations of charitable behavior. We often observe that individuals tend to donate to causes that are clearly not linked to their personal welfare. For example, many alumni support their universities without planning to go back. Other individuals support poverty relief efforts in countries or regions that are far away from their home residences. It is possible but unlikely that these types of donations are driven by the desire to provide public goods. These donors gain satisfaction from knowing that they have contributed to a worthy cause. Andreoni (1989, 1990) coined the term "warm glow" for this satisfaction. It is possible to modify our model and incorporate warm glow in the specification of the utility function:
$$U(g, z) = \beta \ln g + (1 - \beta) \ln z \qquad (5.17)$$
Note that we are treating the donations to a public cause just like a private good. The magnitude of the warm glow effect is now measured by the coefficient $\beta$. If individuals are driven by warm glow, the primary benefit in return for their donation is the feeling that they are doing the right thing. In many cases, individuals also obtain real private benefits that are associated with donating to a charitable organization. For example, theaters may give donors special tickets on opening night, or operas may give donors access to private performances. Harbaugh (1998) provides a framework in which donors are primarily motivated by tangible or intangible private benefits from their gifts. Let b(g) denote the private benefits associated with donating g to your favorite charity. Utility in this model can then be written as
$$U(b(g), z) = \gamma \ln b(g) + (1 - \gamma) \ln z \qquad (5.18)$$
This model captures the fact that charitable organizations rely on sophisticated fund-raising strategies to attract more donors. The more generous the donation, the more lavish the private benefit package.
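To see what the warm glow specification in equation (5.17) implies, consider a donor with income m who splits it between the gift g and private consumption z = m - g. The first-order condition $\beta / g = (1 - \beta)/(m - g)$ gives $g = \beta m$. The sketch below checks this closed form against a brute-force grid search. The budget constraint with prices normalized to 1 and the values of `m` and `beta` are assumptions made for illustration.

```python
import numpy as np


def warm_glow_utility(g, m, beta):
    """Utility from equation (5.17) with z = m - g (prices normalized to 1)."""
    return beta * np.log(g) + (1.0 - beta) * np.log(m - g)


def optimal_gift(m, beta):
    """Closed-form gift implied by the first-order condition beta/g = (1-beta)/(m-g)."""
    return beta * m


m, beta = 100.0, 0.05  # hypothetical income and warm glow weight

# Brute-force check: evaluate utility on a fine grid of feasible gifts.
grid = np.linspace(0.01, m - 0.01, 100_000)
g_numeric = grid[np.argmax(warm_glow_utility(grid, m, beta))]
g_closed = optimal_gift(m, beta)

print(f"grid search gift: {g_numeric:.3f}")
print(f"closed-form gift: {g_closed:.3f}")
```

Because only the donor's own gift enters utility in this specification, the optimal gift does not depend on what other donors contribute, in contrast to the public good model considered earlier in the chapter.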
5.5 Empirical Evidence: Private Benefits versus Warm Glow
Some individuals may support their favorite charities regardless of the incentive structures used to attract donors. Others may be motivated to give conditional on the benefits the organization offers. How important are these different incentives or motivations? Sieg and Zhang (2012) analyze donor lists to the ten largest
FIGURE 5.4. Free provision of classical music. (Kaique Rocha/ pexels.com)
cultural organizations in Pittsburgh from the 2004-2005 donation cycle. They estimate a discrete choice model of donor behavior that accounts for warm glow and private benefits. Discrete choice means that you have to choose among a small number of well-defined alternatives. We consider these types of models in more detail when we study neighborhood and housing choices. More specifics about discrete choice estimation are provided in the appendix (section A.8). Most private charities offer distinct levels of giving (a discrete choice) that are associated with some perks. For example, if you want to be a member of the Founder's Society of the Carnegie Museum in Pittsburgh, you need to donate $25,000 or more (as of 2018). With that donation you obtain free museum admission, free unlimited educational films at a cinema for you and an unlimited number of guests, free parking, a private museum tour with the president for you and four guests, and other benefits, which include invitations to private receptions
TABLE 5.1. The Importance of Private Benefits

Charity                            Model Predictions      Number of Donors   Median Donations   Average Donations
Ballet                             Status quo                          323                250              818.11
                                   No private benefits                 202                250              629.66
Carnegie Museum                    Status quo                          804               1000             1930.97
                                   No private benefits                 402                500             1116.73
Opera                              Status quo                          369                500             2029.13
                                   No private benefits                 192                215              913.12
Symphony                           Status quo                          443               1000             2161.40
                                   No private benefits                 165               1000             1627.12
Western Pennsylvania Conservancy   Status quo                          832                100              343.99
                                   No private benefits                 919                100              389.58
Public Theater                     Status quo                          718                 50              402.09
                                   No private benefits                 793                 95              404.71
and other closed events. For the President's Society, a donation of $10,000 will do. You get the idea. There is a strong quid pro quo. Sieg and Zhang find that private benefits that provide social status are good incentives for donors to give to their favorite charities. The prospect of influence in an organization through board membership is also a good way to secure contributions. After estimating a model of donor behavior, the authors simulate donors' choices without benefits. Table 5.1 is based on Sieg and Zhang (2012) and summarizes some key predictions of that study. "Status quo" refers to the predictions of the model when private benefits are available, while "no private benefits" refers to the predictions of the model without private benefits. The empirical findings show that charities that heavily rely on special events that provide social status to attract wealthy donors would receive much lower donations without these perks. Examples are the opera and the symphony. The model predicts that these organizations will experience large declines in donations if private benefits are not allowed. Altruism or warm glow accounts for approximately 50 percent of the observed donations.
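One rough way to read Table 5.1 is to multiply the number of donors by the average donation in each scenario and compare the two implied totals. The sketch below does this for each charity. These implied totals are a back-of-the-envelope calculation from the rounded table entries, not figures reported by Sieg and Zhang (2012), and the names `TABLE_5_1`, `implied_total`, and `retained_share` are introduced here for illustration only.

```python
# (number of donors, average donation) taken from Table 5.1.
TABLE_5_1 = {
    "Ballet": {"status_quo": (323, 818.11), "no_benefits": (202, 629.66)},
    "Carnegie Museum": {"status_quo": (804, 1930.97), "no_benefits": (402, 1116.73)},
    "Opera": {"status_quo": (369, 2029.13), "no_benefits": (192, 913.12)},
    "Symphony": {"status_quo": (443, 2161.40), "no_benefits": (165, 1627.12)},
    "Western Pennsylvania Conservancy": {
        "status_quo": (832, 343.99),
        "no_benefits": (919, 389.58),
    },
    "Public Theater": {"status_quo": (718, 402.09), "no_benefits": (793, 404.71)},
}


def implied_total(donors, average):
    """Implied total donations: number of donors times the average donation."""
    return donors * average


def retained_share(entry):
    """Implied total without private benefits relative to the status quo."""
    return implied_total(*entry["no_benefits"]) / implied_total(*entry["status_quo"])


shares = {charity: retained_share(entry) for charity, entry in TABLE_5_1.items()}
total_status_quo = sum(implied_total(*e["status_quo"]) for e in TABLE_5_1.values())
total_no_benefits = sum(implied_total(*e["no_benefits"]) for e in TABLE_5_1.values())
overall_share = total_no_benefits / total_status_quo

for charity, share in shares.items():
    print(f"{charity:34s} retains {share:6.1%} of implied donations")
print(f"{'All six charities':34s} retains {overall_share:6.1%}")
```

On this rough measure the opera and the symphony lose the largest share of implied donations, which matches the pattern described in the text, while the two charities with small median gifts are predicted to do slightly better without private benefits.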
5.6 Tax Incentives and Matching
Charitable donations are tax deductible, which provides additional incentives to make contributions since the effective cost of a $1 donation is only 1 minus the marginal tax rate. Auten, Sieg, and Clotfelter (2002) estimate a model of charitable donations using a fifteen-year panel of individual tax returns collected by the Internal Revenue Service. The study implies that taxes affect the level of contributions by way of a price effect and an income effect, each of which has two components: a transitory one and a persistent one. Auten, Sieg, and Clotfelter find that persistent income changes have substantially larger impacts on charitable behavior than transitory changes. Additionally,
there are substantial effects of persistent changes in tax prices, with elasticities ranging from -0.79 to -1.26. The effects of transitory tax changes are smaller. The findings of this study, therefore, suggest that persistent shocks in income have a substantially larger impact on charitable donations than do their transitory counterparts. The most important behavioral aspect for considerations of tax policy is the persistent tax effect, since transitory effects are, by their nature, passing. Through this effect, tax reforms can have a long-lasting influence on charitable giving. In addition to tax incentives, charitable organizations often provide additional incentives by matching individual donations based on commitments by large donors. Huck, Rasul, and Shephard (2015) consider the impact of alternative fund-raising schemes. They conducted a field experiment with the Bavarian State Opera. They mailed 25,000 opera attendees a letter describing a charitable fund-raising project organized by the opera house. They found that charitable donations are maximized by simply announcing that a lead donor has made a large donation to the new project. This simple announcement is better than using the funds to match the donations of others in some way. Maybe the announcement of the large lead donor serves as a signal that the project is deserving.
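The tax price and the elasticities quoted above can be combined in a short calculation. The sketch below assumes a constant-elasticity relationship between giving and the tax price, which is a simplification adopted here and not the estimating equation of Auten, Sieg, and Clotfelter (2002). The marginal tax rates of 35 and 25 percent and the function names are hypothetical.

```python
def tax_price(marginal_tax_rate):
    """After-tax cost of giving one dollar when the gift is deductible."""
    if not 0.0 <= marginal_tax_rate < 1.0:
        raise ValueError("marginal tax rate must lie in [0, 1)")
    return 1.0 - marginal_tax_rate


def giving_ratio(old_rate, new_rate, elasticity):
    """New giving relative to old giving under an assumed constant price elasticity."""
    return (tax_price(new_rate) / tax_price(old_rate)) ** elasticity


# Hypothetical persistent reform: the donor's marginal rate falls from 35% to 25%,
# so the tax price of giving rises from 0.65 to 0.75.
for elasticity in (-0.79, -1.26):  # range of persistent price elasticities in the text
    ratio = giving_ratio(0.35, 0.25, elasticity)
    print(f"elasticity {elasticity:+.2f}: giving changes by {ratio - 1.0:+.1%}")
```

A lower marginal tax rate raises the price of giving, so both elasticities imply lower donations, with the larger elasticity (in absolute value) implying the larger decline.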
5.7 Conclusions
In this chapter, we have studied the properties of purely voluntary mechanisms of public good provision. These mechanisms play a large role in the urban economy. While city governments are not funded by voluntary contributions, we rely on voluntary contributions for a large variety of not-for-profits that provide important local public goods and services. Prominent examples are homeless shelters, churches, clubs, theaters, symphonies, and many other nongovernmental organizations. Voluntary mechanisms of public provision can be effective tools to finance these organizations. Economic theory suggests that most of these organizations are underfunded. Noncooperative behavior of individuals generates a free rider problem in which many individuals do not provide any voluntary contributions despite the fact that they benefit from the public good or service. Voluntary mechanisms tend to work better if individuals are altruistic or if donations generate strong warm glow. Voluntary mechanisms are also used to supplement or enhance public good provision by a government. Donations for cancer research or disaster relief are prominent examples. However, the incentives to free-ride are strong, and the crowd-out effect may be substantial. The upshot of this chapter is that we cannot expect to obtain viable and thriving cities by relying solely on voluntary mechanisms for local public good provision. In practice, we typically delegate the decision-making process on taxes and expenditures to an elected city council and mayor. The council and the mayor then have the power to force individuals who live in the city to pay for the costs of providing public goods and services by charging user fees and levying taxes on a variety of activities. This arrangement clearly solves the free rider problem. Does it guarantee that the outcome is reasonable and fair? To answer these types of questions, we need to study the properties of involuntary mechanisms of local public good
provision. Hence we turn our attention to topics in political economy, which we explore in the third part of this book.
5.8 Technical Appendix: Deriving the Nash Equilibrium
Recall that the level of public goods is determined as follows: (5.19) To capture the strategic interactions among players, we use some basic game theory. We consider the noncooperative game that is being played between two individuals. We are looking for a Nash equilibrium in pure strategies. Taking individual 2's contribution as given, individual 1 chooses her contribution $g_1$ to maximize the following utility function: (5.20)
where $z_1 = m - g_1$ is private consumption. Next, we compute the best response for individual 1 for any level of donations of individual 2. The first-order condition resulting from this maximization problem is given by (5.21) Solving for $g_1$ gives us the best-response function of player 1: (5.22) Thus individual 1's optimal contribution to the public good is declining in individual 2's contribution to the public good. Person 2's reaction function is similar: (5.23) Solving this system of two equations gives us the symmetric equilibrium:
$$g_1^n = g_2^n = \frac{\alpha m}{2 - \alpha} \qquad (5.24)$$
Note that this provision level is symmetric only because the individuals are identical. If individuals differ in their income or preferences for the public good, the equilibrium is not necessarily symmetric. The total provision of public goods is given as follows:
$$G^n = \frac{2 g^n}{c} = \frac{2 \alpha m}{c(2 - \alpha)} \qquad (5.25)$$
Recall that the efficient level of public good provision for this example is (5.26)
Comparing the provision under private contributions to the Pareto efficient provision levels, we find that $G^n < G^e$. This shows the private provision of public goods leads to an underprovision of public goods. Next, we consider an extended version of our model in which both the government and individuals provide public goods. Suppose the government makes a public contribution to the public good equal to $g_p$. In this case, the level of public goods is determined as follows: (5.27)
These government contributions are funded through lump-sum taxes on the individuals equal to $g_p/2$. Taking individual 2's and the government's contribution as given, individual 1 chooses her contribution $g_1$ to maximize the following utility function: (5.28)
Player 1's best-response function is given as follows: (5.29)
Person 2's reaction function is similar. Private contributions in the symmetric equilibrium are given as follows: (5.30)
Multiplying this expression by the number of individuals gives us the total private contributions to the public good:
$$2 g^n = \frac{2 \alpha m}{2 - \alpha} - g_p \qquad (5.31)$$
For every dollar contributed by the government, total individual contributions fall by one dollar: (5.32)
Thus government interventions designed to increase public goods to the Pareto efficient level may be offset by private responses.
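The equilibrium and crowd-out results of this appendix can be checked numerically. The sketch below assumes log utility, $U = \alpha \ln G + (1 - \alpha) \ln z$, with $G = (g_1 + g_2 + g_p)/c$ and $z_1 = m - g_p/2 - g_1$. This functional form is an assumption chosen because it reproduces equations (5.24), (5.25), and (5.31). Under it, the interior best response is $g_1 = \alpha (m - g_p/2) - (1 - \alpha)(g_2 + g_p)$ and the symmetric efficient level is $G^e = 2 \alpha m / c$. The function names and the parameter values are hypothetical.

```python
def best_response(g_other, m, alpha, g_p=0.0):
    """Best response under the assumed log utility, truncated at zero."""
    interior = alpha * (m - g_p / 2.0) - (1.0 - alpha) * (g_other + g_p)
    return max(interior, 0.0)


def nash_contribution(m, alpha, g_p=0.0, tol=1e-12, max_iter=10_000):
    """Symmetric Nash contribution found by iterating on the best response."""
    g = 0.0
    for _ in range(max_iter):
        g_new = best_response(g, m, alpha, g_p)
        if abs(g_new - g) < tol:
            return g_new
        g = g_new
    raise RuntimeError("best-response iteration did not converge")


def efficient_level(m, alpha, c):
    """Efficient provision for two identical individuals under the assumed utility."""
    return 2.0 * alpha * m / c


m, alpha, c = 100.0, 0.4, 1.0  # hypothetical income, preference weight, unit cost

g_n = nash_contribution(m, alpha)
print(f"private equilibrium: g = {g_n:.2f}, formula (5.24) = {alpha * m / (2 - alpha):.2f}")
print(f"total provision G^n = {2 * g_n / c:.2f} versus efficient G^e = {efficient_level(m, alpha, c):.2f}")

g_p = 10.0  # hypothetical government contribution, financed by a tax of g_p/2 each
g_with_gov = nash_contribution(m, alpha, g_p)
print(f"with g_p = {g_p:.0f}: private total = {2 * g_with_gov:.2f}, overall total = {2 * g_with_gov + g_p:.2f}")
```

With these parameter values the government contribution leaves total spending on the public good unchanged, which is the one-for-one crowd-out described in the text. Full crowd-out requires an interior equilibrium: once the government contribution is large enough that private contributions reach zero, further increases raise total provision.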
5.9 Debate: Tax Deductibility of Charitable Donations
Charitable donations are tax deductible. Thus donating to one's favorite cause effectively lowers the total tax burden of an individual. The drawback is that these deductions reduce tax revenues. The pro side should argue for the status quo. The con side should argue that charitable donations should not be tax deductible. Below are some leading questions that may help structure the debate:
1. What are the marginal tax rates for different taxpayers in the US? What does that imply for the after-tax price of donations?
2. Why are donations only deductible for those who itemize?
3. What fraction of taxpayers itemizes charitable donations?
4. Who are the main charities that benefit from this tax break?
5. Who are the largest beneficiaries of donations?
6. How much tax revenue is lost because donations are tax deductible?
7. What could you do with these lost tax revenues? How would you spend the additional tax revenues that you obtained from changing this policy?
5.10 Problem Sets
1. Suppose you are the president of a local charity called Friends of Schuylkill River Park. You are in the process of organizing a campaign to raise $20,000 from local residents for an improvement of the baseball field in the park. You have a tight budget of $1,000 that you can use to cover fund-raising expenditures. You have a list of names and addresses of previous donors. You want to expand the base of donors as part of the fund-raising campaign. How would you accomplish that while still staying within budget?
2. After the fund-raising is over, you want to test the hypothesis that old donors are more generous than new donors. How would you do that and what data would you need to collect?
3. Consider the game of voluntary public good provision discussed in the text. Explain why it is more likely to obtain a corner solution, in which only one player provides a positive contribution, when heterogeneity among the two players is large.
4. Why does voluntary provision rarely lead to an efficient provision of public goods and services?
5. Discuss two fund-raising strategies that are popular among local charities and not-for-profits. How do you think the recent recession affected the effectiveness of these strategies?
6. Suppose you are the CEO of the Philadelphia Orchestra. What measures would you take to ensure the long-term viability of the orchestra?
7. A city funds its public good provision solely from voluntary contributions from its two residents, Phil and Hannah. Each of its two residents has a utility function over private goods z and public goods G given by
$$U_P(z_P, G) = \ln(z_P) + \ln(G) \qquad (5.33)$$
$$U_H(z_H, G) = 2 \ln(z_H) + \ln(G) \qquad (5.34)$$
Hannah's income is 50. Phil's income is 100. The marginal cost of producing one unit of the public good is equal to 1 unit of income.
a) Compute the reaction functions of Hannah and Phil.
b) Suppose Hannah donates nothing; how much does Phil donate? Suppose Phil donates nothing; how much does Hannah donate?
c) Assuming that Hannah and Phil must donate either zero or a positive amount of money, compute the equilibrium of the model.
d) Is the equilibrium efficient? Discuss.
8. Consider a city with N identical consumers who have the utility function given by (5.35) where $z_n$ is private consumption and G is public good consumption. Each consumer has income equal to 1. The marginal cost of providing the public good is 1.
a) Compute the symmetric Nash equilibrium of the private provision game. Denote this solution by $G^N$.
b) Assume that the social planner is computing an efficient solution by maximizing the following welfare function:
$$W = \sum_{n=1}^{N} U_n \qquad (5.36)$$
given the constraint that costs must be shared equally among the individuals. Compute the optimal solution to the problem. Denote this solution by $G^E$.
c) Compute the ratio $G^N / G^E$ and analyze what happens to this ratio as N grows large.
d) What conclusions do you draw from this analysis with respect to the suitability of voluntary provision mechanisms?
9. Consider two identical consumers denoted by 1 and 2, each with income equal to m. Consumers can purchase two goods, z and g. The price of each good is normalized to be equal to 1. Preferences are given by
$$U_1 = \ln(g_1 + \alpha g_2) + z_1 \qquad (5.37)$$
$$U_2 = \ln(g_2 + \alpha g_1) + z_2 \qquad (5.38)$$
where $0 \le \alpha \le 1$.
a) Provide an interpretation of the parameter $\alpha$.
b) Find the Nash equilibrium for this game, i.e., solve for the equilibrium levels of $g_1$ and $g_2$.
c) Compute the symmetric allocation that maximizes the following utilitarian welfare function: (5.39)
subject to the aggregate resource constraint (and that treats both players equally).
d) Under what conditions is the Nash equilibrium efficient? Explain your answer.
10. The town of Vienna funds its free music series solely from individual contributions from its two residents, Falco and Mozart. Each resident has a utility function over private goods z and concerts G. Both residents have identical utility functions given by
$$U(z, G) = \ln(z) + \ln(G) \qquad (5.40)$$
Both residents have an income of 70. The price of both the private good and the concert is equal to 1.
a) How many concerts are given if the government does not intervene?
b) Suppose the government is not happy with the private equilibrium and decides to provide 10 concerts in addition to what Falco and Mozart may choose to provide on their own. It taxes Falco and Mozart equally to pay for the new concerts. What is the new total number of concerts? How does your answer compare to a)? Have we achieved the social optimum? Why or why not?
c) Suppose that, instead, an anonymous benefactor pays for 10 concerts. What is the new total number of concerts?
d) Is this the same provision as in a)? Why or why not?
11. Why is it difficult to empirically determine the degree to which government spending crowds out private provision of public goods?
PART III POLITICAL ECONOMY OF STATE AND LOCAL GOVERNMENTS
6
Local Political Institutions in the US
6.1 Motivation
We have seen that the provision of local public goods and services requires a strong and effective local government. In this chapter, we take a closer look at the local political institutions in the United States. Most of you are somewhat familiar with the political institutions that make up the US federal government. A lot of the basic material is covered in high school. Most of you have some knowledge of the political institutions in your state. What's the name of the current governor in your state? Does your state have a term limit for governors? Even less is typically known about political institutions at the local level. We will see that there is much institutional heterogeneity among municipalities, which makes it necessary to classify different forms of local government. Most of the larger cities in the US have been granted a charter, which serves as the "constitution" of the city government. A municipal charter is a legal document that establishes a municipality as a city or town and defines the organization, powers, functions, and essential procedures of the city government. A municipal charter is inferior in rank to state laws but superior to all ordinances enacted by that municipality.[1] There are two primary forms of city governments: mayor-council and council-manager. Many large American cities have a mayor-council government with a strong mayor who oversees the administration and a city council that serves as the legislative branch. In mayor-council governments, the mayor's executive powers typically allow him or her to appoint and dismiss department heads without council approval or public input and to prepare and execute the city budget. The alternative model of local government is known as the council-manager form. In that form of municipal government, the city council is an elected body that hires a professional city manager who advises and oversees the administration of the city and is in charge of daily operations.
The mayor in this model is usually weak and largely ceremonial. He or she may be selected by the council from among its members or elected as an at-large council member with no executive functions.
[1] North Bay Construction, Inc. v. City of Petaluma, 143 Cal. App. 4th 552 (Cal. App. 1st Dist. 2006).
Cities also differ along many other institutional dimensions that shape the political competition within them and the recruitment of political talent. Some cities and towns allow for elements of direct democracy, such as initiatives, referenda, and recalls. Other cities have binding term limits for mayors and members of the city council. Finally, many cities have tried to embrace nonpartisan democracy, a system of representative government such that elections take place without reference to political parties. Candidates for mayor or city council compete without party affiliation. This is in sharp contrast to state and federal elections, which are dominated by partisan democracy (i.e., candidates are strongly affiliated with political parties).
6.2 A Brief History of Local Governments in the US
Let's start our exploration of local governments with a brief historical overview.[2] Before more structured systems were set in place for municipal governments in the US in the eighteenth century, towns and cities were commonly governed by a city council or board of aldermen. Aldermen were elected by "property owners" or, in this era, white males over the age of twenty-one. Some cities and towns imposed more stringent voting requirements so that, in practice, few persons were able to participate in elections. Women and African Americans were the largest groups that were prevented from voting, despite a few exceptions. During this time, the council functioned similarly to a parliamentary or congressional legislative body. It proposed bills, held votes, and passed laws to help govern the city. The council also selected a mayor of the city to lead, although the mayor was given very few exclusive powers. Change started to occur after the American Revolution. As the US adopted and embraced a democratic form of government, the local electorate began to choose the governing councils in almost every American municipality. While the direct election of aldermen has a long history in the US, direct election of mayors (or executives) did not begin to emerge until the 1820s. Between 1820 and 1840 many cities began adopting direct mayoral elections. These gained in popularity as an outgrowth of the Jacksonian Democracy movement, which advocated for greater democracy for the "common man" and pushed for a stronger executive branch and a weaker legislative branch. During this period, more and more immigrant populations moved from Europe to the United States. They settled in concentrated neighborhoods, creating ethnically and geographically based political powers. Until this time, urban power had been confined to the wealthy Protestant communities.
As minority groups grew in number, political leaders could no longer neglect their representation. More professional municipal politicians started to dominate the 1850s and 1860s, which also saw the rise of political machines in larger cities. A political machine is an organization in which one leader guides the allegiance of local voters, obtaining enough support to establish political control in a city or metropolitan
[2] This section is largely based on Lloyd, Norris, and Vicino (2007), which should be consulted for more details and references to the literature.
FIGURE 6.1. Statue of Liberty. (Jamie Mclnall/www.pexels.com)
area. Although political machines were mainly single-partisan groups, their purpose was not only to advance a particular political party but also to sustain a monopoly control on political power. Machine politics share many commonalities. A strict party hierarchy governs the machine. Nominations to public offices are controlled by political parties. The party has a leadership that usually does not hold office. The party is supported by a core group of party workers and voters, whose loyalty is maintained through material rewards (patronage, jobs, contracts, etc.) and nonideological rewards (recognition, camaraderie, etc.). Those residents and businesses that did not associate themselves with a particular machine were at a clear disadvantage. The distribution of resources and rewards was based on loyalty to the machine rather than merit. Political machines controlled virtually every large city in the US between 1870 and 1910. Tammany Hall, the leading Democratic Party organization in New York City, became famous for its political domination. It was led by William Magear Tweed, who built overwhelming voter support by offering followers benefits like jobs and housing. Because these benefits were reserved for loyal party members, it is easy to see why many Americans viewed the political machines as corrupt and undemocratic. Not surprisingly, a reform movement started during the early twentieth century. The calls for change were supported by Progressive reformers and advocated a restructuring of municipal institutions. They attempted to remove the dictatorial machine leaders from the electoral process in order to ensure that the government administration maintained democratic and honest values.
The Progressive movement called for at-large elections, requiring city council members to be elected through a citywide vote rather than at the precinct or ward level, as had previously been the practice. Precinct elections encouraged council members to tend to only the district's needs and not citywide issues. Reformers advocated that elections were to be nonpartisan, removing any and all party affiliation from local elections. Progressives also supported strong mayors that were given executive power and exclusive responsibilities in personnel and policy decisions. Finally, reformers embraced professional managers at the top of the local administration. Following World War II, local governments grew in size and scope. Voters required municipal governments to provide a variety of public goods and services, including water, roads, parks, welfare, and public schools. Most large cities have now adopted a system in which elected officials craft current public policies, while public managers implement the policies and oversee the daily activities of the local government.
6.3 Characterizing Municipal Governments in the US
Next, we turn our attention to characterizing the political institutions that we currently observe in the United States. The US has nearly 20,000 municipalities. Note that two-thirds of American municipalities serve populations of fewer than 2,500 residents. Municipal governments can be classified into cities, towns, townships, villages, boroughs, districts, and plantations. These local governments differ significantly in their political and institutional structure. The International City/County Management Association (ICMA) is an association of local government officials and professionals whose goal is to improve local communities worldwide. One function of the ICMA is to survey local municipalities throughout the United States on a host of different topics. The survey of interest in this section is the Form of Government Survey, which the ICMA has conducted for over thirty years. This survey collects a variety of information pertaining to local government structure, including the form of government, elections, and term limits. Here we focus on the survey that was conducted in 2011. The survey was divided into three portions. The first asked questions about the general form of the local government, the second focused on the powers afforded to the chief elected official of the municipality, and the third addressed the role of the city council. The data were collected on the basis of voluntary responses from the municipalities. The survey was sent to 8,813 municipalities of varying sizes and locations throughout the United States. The ICMA received 3,566 responses for a response rate of 40.5 percent. Municipalities in the United States can be classified differently based on population size. Cities are the municipalities with the largest populations. Towns typically have fewer people than cities, while villages are smaller in size. Note that these divisions are only generalities.
Different states have different systems of classification for the various forms of municipalities. For instance, states like Alabama and Louisiana define towns by population size, while Nevada classifies towns as municipalities with no municipal charter. In other states like California and Maryland, the term town is completely synonymous with the term city. The ICMA data in table 6.1 indicate that nearly two-thirds of American municipalities are organized as cities, while an additional one-fifth are classified as towns. The general pattern for the classification of municipalities in the ICMA dataset can be tied to the population sizes of those municipalities. For instance, every municipality with a population above 250,000 and all but three with a population above 100,000 in the ICMA dataset are labeled as cities. As population decreases, the number of municipalities classified as cities decreases. For municipalities with population sizes below 2,500, only half are categorized as cities.

Local Political Institutions

TABLE 6.1. Frequency of US Municipality Types

Form of Municipality    Frequency    Percentage
City                         2284         64.05
Town                          739         20.72
Village                       267          7.49
Township                      149          4.18
Borough                       126          3.53
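The percentages in table 6.1 can be checked with a few lines of arithmetic. The sketch below assumes that each percentage is the type's frequency divided by the 3,566 survey responses (an assumption: the text does not state the denominator, and the five listed frequencies sum to 3,565). It also recomputes the response rate from the 8,813 surveys sent. The helper name `percent` is illustrative.

```python
# Survey counts taken from the text and from table 6.1.
SURVEYS_SENT = 8813
RESPONSES = 3566

frequencies = {
    "City": 2284,
    "Town": 739,
    "Village": 267,
    "Township": 149,
    "Borough": 126,
}


def percent(part, whole, digits=2):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


# Response rate: responses received relative to surveys sent.
response_rate = percent(RESPONSES, SURVEYS_SENT, digits=1)

# Assumption: shares are computed relative to all responses, not relative
# to the sum of the five listed municipality types.
shares = {name: percent(count, RESPONSES) for name, count in frequencies.items()}

print(f"Response rate: {response_rate} percent")
for name, share in shares.items():
    print(f"{name:<10}{share:>7.2f}")
```

Under this assumption the recomputed shares agree with the published column, which suggests that one response falls outside the five listed types.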
6.4 Legal Foundations of Municipal Governments

The Tenth Amendment to the United States Constitution states that "the powers not delegated to the United States by the Constitution, nor prohibited by it to the States, are reserved to the States respectively, or to the people." The purpose of this amendment was to establish a division of power between the federal government and state governments. As a consequence, local government is a matter of state rather than federal law. States have adopted a wide variety of systems of local government. From a legal perspective, municipal governments have no power except what is granted to them by their states. This legal doctrine is called Dillon's Rule. It was established by Judge John Forrest Dillon in 1872 and was upheld by the US Supreme Court in Hunter v. Pittsburgh, 207 US 161 (1907). The Supreme Court decided that Pennsylvania had the power to consolidate the City of Allegheny into the City of Pittsburgh. This consolidation was against the explicit wishes of the majority of Allegheny residents.

Just as the federal government must abide by the rules of the US Constitution, local governments must follow the rules and powers given to them by their municipal charters. Laws passed by local governments are called ordinances. A code of ordinances is the set of all laws passed by the government of a municipality. In many cases, a municipality has both a municipal charter and a
code of ordinances that work in tandem to establish the legal framework of the municipality. Since municipalities are a part of larger states, each municipality must abide by all the laws passed by the state government as well. Data drawn from the ICMA dataset indicate that the majority of cities in the United States have municipal charters that define the powers and organization of the cities' governments. Table 6.2 demonstrates the frequency of the types of legal foundations found within the cities of the ICMA dataset. The distribution of legal foundations can be further organized by the population size of cities as shown in table 6.2. The largest cities in the United States are overwhelmingly regulated by municipal charters. As city population size decreases, organization by other mechanisms becomes more frequent. Notably, city ordinances regulate over one-fifth of the smallest cities in the ICMA data, but they are mostly absent from medium-sized and larger cities.

TABLE 6.2. Distribution of Different Legal Foundations by Population Size

Population        Municipal Charter    State Law    Ordinance    Other
Over 100,000      80.53%               14.16%       3.54%        1.77%
10,000-100,000    63.27%               23.24%       10.19%       3.30%
Under 10,000      50.19%               26.19%       21.15%       2.47%
6.5 Direct Democracy

6.5.1 Initiative, Referendum, and Recall

While the vast majority of municipalities throughout the United States have some elected body that is tasked with managing the municipality's affairs, many cities also have mechanisms in place for the populace to impact the business of the municipality. This is in contrast to the US Constitution, which, for all practical purposes, does not allow for any elements of direct democracy. Even the US president is elected not by direct popular vote but by the Electoral College. There are three popular elements of direct democracy: initiative, referendum, and recall.

An initiative is a mechanism through which a citizen of a municipality can craft his or her own legislation or ordinance and have it placed on a ballot. In general, an initiative requires that the citizen gather a number of signatures of approval before the legislation can enter the ballot. Once the bill makes it onto the ballot, the entire populace of the city has the opportunity to vote on whether or not to pass it. Over half of the cities in the ICMA survey give their residents some form of initiative power. There are two types of initiatives. A direct initiative requires that measures with enough signatures be placed directly onto the ballot for a popular vote. An indirect initiative allows the city council to debate the measure before it gets onto the ballot. If the legislation makes it onto the ballot and is approved by the local popular vote, then the legislative body that governs the municipality is typically bound to enact the designated changes. In around 15 percent of cities with initiative power,
the strength of the initiative is nonbinding. This means that even if a citizen's bill is approved in the popular election, the legislative body is not required to actually go through with the bill. In other words, a nonbinding vote does not necessarily lead to any action.

FIGURE 6.2. Political activism. (Rosemary Ketchum/www.pexels.com)

A legislative referendum is a means through which the legislative body of a municipality puts measures onto the ballot for popular vote. This process may overcome gridlock within the legislature, or it may be required by the governing charter. Indeed, some municipalities require that certain measures be passed using a legislative referendum. For instance, almost 60 percent of cities in the ICMA data require that measures changing the status of city bonds be approved through a legislative referendum, and half dictate that amendments to the city charter go through the process. A legislative referendum is present in 70 percent of municipalities in the ICMA sample. Prominent examples are Pittsburgh and New Orleans.

A popular referendum, often referred to as the people's veto, is a process through which the citizens of a municipality can repeal legislation passed by the city council. By gathering enough signatures on a petition, the residents of a municipality can force measures onto the ballot for a popular vote. If the populace rejects the measure, the legislature is obliged to repeal the measure. Less than 40 percent of the municipalities in the ICMA survey data grant this power to their residents. Some cities that do grant popular referendum power are Milwaukee and Fort Worth.

A recall is a power given to the residents of a municipality to forcibly remove an elected official from office before the end of his or her term. By gathering enough signatures on a petition, citizens can put onto the ballot a measure to remove the
official. If that measure is approved in the popular vote, the elected official is removed, and through a separate process, a new official is elected. Approximately 56 percent of municipalities in the ICMA sample grant their residents recall power. Some examples include Raleigh and Omaha. However, the power is rarely used. Since 2006, fewer than 8 percent of municipalities with recall have employed it. Furthermore, within that time frame only about one-fourth of filed recalls have been successful.
6.5.2 A Case Study: Fracking in Athens, Ohio

To gain some additional insights into how an initiative works, it is useful to study an example. Here we focus on a series of initiatives in Athens, OH, that dealt with fracking. Ever since the Industrial Revolution, cheap energy has played a central role in economies worldwide. Recently, techniques have been invented to reach and obtain less accessible oil and gas supplies. Hydraulic fracturing, better known as fracking, was developed to recover oil and gas from shale rock. The process involves drilling into the ground and injecting a fluid into the rock at extremely high pressure. This creates cracks in the rock formation, allowing the extraction of the oil and natural gas. In Ohio, approximately 75 percent of counties have access to commercial gas resources. Oil drilling in the state began in 1895 and the use of fracking in 1952. However, not all municipalities are content with this oil drilling. While proponents of fracking extol its economic benefits, many have pointed to the potential damage that the chemicals used in fracking can cause to the environment and human health.

As such, the Athens Bill of Rights Committee (BORC) created an initiative in 2013 to prevent fracking from taking place in the city of Athens, OH. Note that a local government can regulate oil field operations on the surface, but it has no authority over the operations below the surface due to mineral rights. However, initiatives are regulated by local law. The proposed initiative in Athens was struck down for overreaching: not only did the initiative call for a ban on fracking in Athens, but it also asked for such a ban within 20 miles of the city's boundaries. A revised version was submitted in January of 2014, which required the BORC to collect hundreds of signatures of support from residents. While proponents of the measure hoped it would be part of the May primary election, Athens law stated that initiatives could only be voted on during November general elections. As such, voters had to wait several months before even voting on the initiative. Finally, during the November 2014 election, voters overwhelmingly elected to ban fracking within the city's limits. With 78 percent of the vote in approval of the measure, Athens became the fifth municipality in Ohio to ban fracking.

The ban of fracking within the city of Athens shows the limitations of local governments and the need for statewide regulations. Oil drilling is not limited to drilling directly over the oil reserve. Improvements in drilling technology allow for horizontal drilling, which makes it possible for the operator to reach oil reserves from an adjacent site outside of city limits. The 20-mile limit would have prevented this, but, of course, the legality of such a ban imposed by a local government is questionable. Further, the issue is not simply fracking. Oil fields are
susceptible to subsidence and ground movement when oil is extracted. Athens is still vulnerable to these impacts.
6.6 Representative Democracy

6.6.1 Local Forms of Government

While many municipalities have a strong tradition of direct democracy, the vast majority of larger cities primarily rely on representative democracy to govern themselves. There are two common types of municipal government. The first type follows the familiar mayor-council model. The legislative body is composed of officials elected by the populace of the municipality. The chief elected official, usually known as the mayor, becomes the head of the government and is given significant authority. Typically, the mayor is given additional powers beyond his or her role within the city council or legislature.

The second type of government follows a council-manager model. This model is closely related to a corporate structure in which a firm is run by a chief executive, and a board of directors provides oversight. Applying these ideas to city government, one obtains a structure in which the daily operations of the city are run by a city manager. Oversight is provided by the city council. The mayor often serves only representative functions and is selected from among the members of the council. The legislative body appoints and monitors a professional administrator to run the operations of the government. Notable examples of this system are Dallas, Las Vegas, and Oklahoma City.

Some municipalities use commissions instead. In commission systems, the voters elect commissioners, each responsible for one aspect of city governance, such as police or public works. One of the commissioners is then selected to be the mayor. Finally, some smaller municipalities like Andover employ town meetings as their system of government. Town meetings are a form of direct democracy. In this system of government, voters convene to vote on policy decisions. The voters also select a board to carry out and enforce the decisions made during the town meetings.
Table 6.3 shows the distribution of the forms of government among the cities in the ICMA data. The majority operate under a council-manager form of government, and the vast majority of cities utilize either a council-manager or mayor-council system. In contrast, commissions are very rare and only occur in smaller municipalities.
TABLE 6.3. Distribution of Forms of Government by Population Size

Population of City    Council-Manager    Mayor-Council    Town Meeting    Commission    Other
> 250,000             42.86%             57.14%           -               -             -
25,000-250,000        71.74%             25.35%           0.61%           0.92%         1.38%
< 25,000              56.08%             34.67%           7.34%           1.49%         0.42%
Indeed, the form of government found in municipalities varies as the population of those municipalities changes, as shown in table 6.3. Among the large cities in the ICMA data with populations over 250,000, approximately 57 percent utilize the mayor-council system of government while the other 43 percent employ a council-manager form. In cities of such magnitude, other forms of governing are completely absent. In contrast, over 71 percent of medium-sized cities employ a city manager. As city populations decrease, alternative systems of government, like commissions and town meetings, emerge. Almost all the municipalities that utilize town meetings have population sizes below 25,000, are located in the Northeast, and are classified as "towns" rather than "cities."

What are the main differences between a mayor-council and a council-manager form of government? In a mayor-council form of government, the power is shared between the mayor and the council. Sometimes we want to distinguish between cities with strong mayors and cities with weak mayors. Independent of that distinction, it is likely that the power-sharing arrangement implies that both the mayor and the city council have to agree for a large project to be implemented. In contrast, all the political power rests with the city council in a council-manager form of government. This should make it easier to push through expensive new projects. However, there are additional agency problems that arise in the council-manager form of government, especially if the council is not effective in monitoring the manager. Standard models of bureaucracy would also suggest that a city manager would prefer larger over smaller budgets (Niskanen, 1968). Both effects suggest that spending should be higher in cities that have adopted a council-manager form of government. Coate and Knight (2011) study spending differences between the two types of government using a large sample of cities in the US.
They find that mayor-council cities have significantly lower spending than council-manager cities. The overall magnitude of that effect is large: approximately 9 percent of per capita spending. However, there are other important differences. Levin and Tadelis (2010) show that council-manager cities are more likely to privatize services than mayor-council cities. Enikolopov (2014) finds that the number of full-time public employees is significantly higher in mayor-council cities. That may be due to the fact that mayor-council cities place a higher value on political patronage.
6.6.2 Partisan versus Nonpartisan Elections

We have seen above that the Progressive Era during the early 1900s was a period of sweeping reforms as a reaction to the problems created by corruption throughout the country. As a countermeasure to political machines, nonpartisan local elections were instituted in municipalities all across the nation. Since political machines relied upon their ties to political parties to sustain their power, they were constrained by the removal of party affiliation in local contests. Of the municipalities in the ICMA dataset, about 80 percent have nonpartisan local elections, and that ratio seems to be little affected by the municipality's population.

Do political parties matter in the twenty-first century in local politics? We know that cities are not as polarized as many states and countries. This observation seems to suggest that ideology may be less important at the local level than the
state or federal level. This is not that surprising since most of the really divisive topics are typically decided at the federal level. Nevertheless, we need to be careful. Ferreira and Gyourko (2009) explore this question. The empirical research design compares policy outcomes from cities where Democrats barely won an election with cities where Democrats barely lost an election to a Republican. Hence they use a regression discontinuity design to estimate the effect of partisanship on outcomes. If you are not familiar with a regression discontinuity design, you may want to consult appendix A.7 of this book, which provides a brief introduction to that research method. This approach potentially deals with the endogeneity that exists due to unobserved factors, such as the true underlying political leanings of the voters, influencing electoral or political outcomes. The empirical analysis is based on 4,543 direct mayoral elections between 1950 and 2005 in over four hundred cities with populations of at least 25,000 residents. The paper finds that whether the mayor is a Democrat or a Republican does not affect the size of city government, the allocation of local public spending, or crime rates. Hence the authors conclude that partisan politics are less important on the city level.
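To make the logic of this research design concrete, the following is a minimal sketch of a sharp regression discontinuity estimate. It is not the estimator used by Ferreira and Gyourko (2009). It assumes a running variable `margin` (the Democratic margin of victory, positive when the Democrat wins) and fits a separate straight line on each side of the zero cutoff using only close elections; the estimated effect is the gap between the two fitted lines at the cutoff. The function names, the bandwidth, and the simulated data are all hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(margins, outcomes, bandwidth):
    """Jump in the outcome at margin = 0, using only close elections."""
    left = [(m, y) for m, y in zip(margins, outcomes) if -bandwidth <= m < 0]
    right = [(m, y) for m, y in zip(margins, outcomes) if 0 <= m <= bandwidth]
    intercept_left, _ = fit_line(*zip(*left))
    intercept_right, _ = fit_line(*zip(*right))
    return intercept_right - intercept_left


# Simulated (not real) data. The slope on the margin stands in for
# unobserved voter leanings that move outcomes regardless of who wins.
rng = random.Random(0)
TRUE_JUMP = 0.0  # hypothetical: the mayor's party has no effect
margins = [rng.uniform(-0.5, 0.5) for _ in range(2000)]
outcomes = [
    1.0 + 2.0 * m + TRUE_JUMP * (m >= 0) + rng.gauss(0, 0.2)
    for m in margins
]

print(f"Estimated jump at the cutoff: {rd_estimate(margins, outcomes, 0.1):.3f}")
```

A naive comparison of all Democratic and Republican cities in this simulation would pick up the slope and suggest a party effect; restricting attention to the cutoff removes that confounding, which is the point of comparing barely won with barely lost elections.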
6.6.3 A Case Study: Electoral Reform in Asheville, NC

To get some additional insights into nonpartisan politics, it is useful to consider the case of Asheville, which is the largest city in western North Carolina with a population of approximately 80,000. Just like most other cities in North Carolina, it once held partisan elections for local offices. However, in 1994 the city council voted unanimously to convert to a nonpartisan model. The movement to hold nonpartisan elections in Asheville was led by then vice mayor Chris Peterson, who believed that political parties had too much control over election results by directing campaign funding one way or another.

Then in 2007, in a highly controversial decision, the city council voted to return to partisan local elections. Those in favor of the change argued that a return to party affiliation would make it easier for the populace to obtain information on the candidates and cast a more educated ballot. Furthermore, they claimed that a partisan model would improve the city's low voter turnout rate. In addition, they argued that the nonpartisan election model had been ineffective at delivering on its promise of a more diverse city council. While local elections were officially nonpartisan, that was only nominally true. Political parties remained relevant in the elections, and it was easy for a voter to identify a candidate's party affiliation. The switch would effectively make the elections more politically transparent.

On the other hand, those in opposition to the measure claimed that the proposition was nothing but a power grab by the current council members. Indeed, the proposal would have made it significantly more difficult for independent candidates to run. While Democratic and Republican candidates would only have to pay a $75 fee to run, third-party and independent candidates would have to collect a sizable number of signatures to even appear on the ballot.
Many argued that this imbalance served to advantage the major political parties in the election process. Following the passage of the measure through the city council, a group called Let Asheville Vote was formed in protest. According to local and state law, the
group was given thirty days to collect the number of signatures required to trigger a popular referendum. Although over a thousand signatures were disqualified or invalidated, the group collected enough signatures with twenty-two to spare. This pushed the issue to a referendum, and nonpartisan elections were preserved.
6.6.4 Term Limits

Term limits also play a significant role in state and local elections. In the United States, term limits on elected officials were present even in the colonial era. Colonies such as Connecticut imposed restrictions on their governors. The states' delegates to the First and Second Continental Congresses were also often term-limited by Article V of the Articles of Confederation (Rausch, 2009). However, term limits were not incorporated into the original Constitution of the United States for a variety of reasons. Some framers of the document opposed term limits because they felt that representatives needed a certain amount of time to develop know-how. Others simply did not expect that representatives would be reelected over and over again. In fact, at the time there was a tradition of voluntary rotation, and there was an implicit expectation for elected officials to turn over their roles after one or two terms. President Washington stepping down after two terms only served to solidify that tradition of voluntary rotation, which lasted until the 1940s.

The debate over term limits returned to the American spotlight with the reelection of President Franklin D. Roosevelt for a third and fourth term. The president's second and third reelections shattered the long-standing two-term precedent. Legislators responded by passing legislation imposing a strict two-term limit on the presidency. This change came in the form of the Twenty-Second Amendment, which was ratified in 1951. A debate regarding term limits for positions other than the president began with the passage of the Twenty-Second Amendment, and a small number of states and cities enacted term limits in the 1950s. Many cities and states adopted term limits during the 1980s and 1990s.
The movement for term limits began within the states, as state legislatures across the nation passed resolutions for term limits in response to a growing concern about the rising rate of incumbent reelection. Currently, term limits are in place for governors in roughly half of all US states and for mayors in more than 50 percent of large US cities.

To gain some additional insights into the discussion about term limits, we consider New York City, which has a very peculiar history regarding term limits on its elected officials. The city went through three referenda regarding term limits in less than two decades. In each case, elected officials sought extended term limits while the populace was clearly in favor of more restrictions.³ Riding the term limits movement of the 1990s, a debate regarding term limits for elected officials in New York City emerged in 1993. At the time the debate began, officials could serve for unlimited amounts of time. Several mayors had served twelve years, and some on the council had been members since the 1970s.
³ This section is largely based on Mastro (2013), which should be consulted for more details and references to the literature.
An organization called New Yorkers for Term Limits headed by Ronald Lauder launched a campaign to limit New York City's elected officials to serving two consecutive four-year terms. Proponents of term limits sought to limit the power of incumbency. With a reelection rate of over 90 percent for incumbents, many career politicians were accused of homesteading, securing their own jobs while accomplishing very little and ignoring their electorates. On the other hand, opponents argued that term limits would only serve to deprive the legislature of experienced members. In an effort to trigger a popular referendum, Lauder and his team spent over $1 million and collected over 130,000 signatures in support of the proposed term limit legislation. However, the city council resisted the effort and blocked the referendum. As such, the electorate was forced to go to the courts. After a series of challenges, the New York Court of Appeals, the state's highest court, decided to put the measure onto the ballot just two weeks prior to the 1993 election. During the popular election, the term limits were soundly enacted, with almost 60 percent of the electorate voting in approval. Hence a limit of two consecutive four-year terms was imposed on elected officials beginning in 1993.

In 1996, however, the sitting council members attempted once more to extend their grasp on their seats. They passed a motion exempting themselves from the 1993 measure and granting themselves a third term. The bill was crafted and passed in the council, resulting in a legislative referendum. The measure was rejected, with approximately 54 percent of the electorate voting against it, and the two-term limit was maintained.

Then in 2008, in the midst of an economic recession, then mayor Michael Bloomberg successfully cleared the way for a third term in office, despite the 1993 and 1996 referenda. Elected after the terrorist attacks on September 11, 2001, Mayor Bloomberg guided the city out of the ensuing economic crisis. With a new economic downturn, Bloomberg argued that his business acumen was needed to once again lead New York City. At the mayor's request, the city council passed a bill temporarily extending term limits to a third term. The measure was created and passed in the legislature by a 29-22 vote. The motion never went to referendum, despite almost 90 percent of the electorate wanting to vote on the proposition. In 2010 a return to two-term limits was once again on the ballot due to referendum, and voters overwhelmingly voted to go back to a two-term limit.
6.7 Conclusions

In this chapter, we studied the political institutions of local governments in the United States. From a legal perspective, all powers that are given to a city or a town have to be granted by the state. As a consequence, it is not surprising that there is much heterogeneity in forms of city governments. The legal basis of most cities in the US is a municipal charter. A state grants a city certain rights in this document, which serves as the municipal constitution. Ordinances are local laws that are passed by a city council to regulate local activities.

We have seen that there are important differences between municipal political institutions and those that we observe for the federal government. Municipalities rely much more heavily on elements of direct democracy than does the federal
government. Forms of direct democracy include recalls, referenda, and initiatives. Direct democracy plays no role at the federal level of government, reflecting the widespread skepticism of the founding fathers regarding its use. Most large US cities heavily rely on representative democracy for governance. Although each municipality is unique, there are some common trends throughout regions and demographics. More populous cities are more likely to employ a mayor-council form of government, while less populous areas tend to prefer the council-manager alternative. Mayors in large US cities tend to be directly elected by popular vote. These mayors tend to be "strong" mayors and have a fair amount of discretionary power. Another important difference between municipal and federal elections is that elections tend to be nonpartisan on the local level, except in the Northeast of the United States. Many cities have also adopted term limits to encourage turnover among mayors and council members. Term limits restrict the possible dangers of political entrenchment. In general, we find that political parties tend to play a less important role at the local level than at the state or federal level of government. That was not always the case in the history of the US, which once saw cities dominated by political machines. These machines are no longer as powerful as they used to be. Nevertheless, there are still many large cities in the US that are dominated by a single political party.
6.8 Debate: Mayor-Council versus Council-Manager

Discuss which form of local municipal government is better for a large city. The pro side should argue for a mayor-council format while the con side should argue for a council-manager format. The following questions may help structure the debate:

1. Which large cities have adopted a mayor-council format?
2. Which large cities have adopted a council-manager format?
3. Why do we want different institutions at the municipal level than at the state or federal level?
4. How does the institutional design affect policies adopted in the city?
5. What are the incentives of elected officials?
6. What are the incentives of appointed bureaucrats?
6.9 Problem Sets

1. What is the role of a city charter?
2. What are the main types of city government?
3. Discuss two potential advantages of direct democracy in state and local politics.
4. Discuss two potential disadvantages of direct democracy in state and local politics.
5. Explain the difference between a referendum and an initiative.
6. Discuss two potential advantages of term limits for mayors (governors).
7. Discuss two potential disadvantages of term limits for mayors (governors).
8. Do large cities need strong mayors? Discuss potential advantages and disadvantages of this institutional design.
9. What's the rationale for having nonpartisan elections in cities?
10. How do political institutions differ among geographic regions in the United States?
7
Voting over Local Public Good Provision
7.1 Motivation

In the US and most other democratic countries, local residents of a city typically elect a mayor and members of the city council. Together the mayor and the city council then determine expenditure and tax policies. Hence participation in the provision of public goods is no longer voluntary but enforced by the elected government. As a consequence, these types of collective choice mechanisms offer the potential to overcome the free rider problem that has plagued purely voluntary mechanisms that we studied in chapter 5. Given the importance of majority and local democratic rule, we turn to the formal study of local government institutions, which is the domain of political economy. In this chapter, we focus on studying voting models and their applications to the public good provision problem.

The most common form of collective choice mechanism is based on some form of majority rule, which means that a majority of votes must favor a measure to gain approval. We can use majority rule in direct democracy. Examples are referenda or initiatives in which all voters directly decide tax and expenditure policies. We also use majority rule in representative or indirect democracy when voters elect a mayor and a city council and delegate the decision-making powers to these political institutions.

Although majority rule is a familiar concept, it has a number of potential problems. For example, a feasible policy that beats all other feasible policies in a pairwise vote does not always exist. Hence we need to establish conditions under which majority rule leads to a stable outcome. The main theoretical result in this chapter is the median voter theorem. Once we understand the basic ideas behind the median voter theorem, it is important to understand why real-world tax and expenditure policies may not be in line with its predictions. We spend the rest of this chapter understanding some important modifications and limitations of the median voter theorem.
Finally, we review some empirical evidence.
Voting over Public Good Provision
7.2 Majority Rule in a Direct Democracy 7.2.1 The Median Voter Theorem We start our investigation by considering models of direct democracy. As discussed in chapter 6, this is a form of democracy in which citizens decide or vote directly on policy initiatives. In the next section, we consider models of indirect or representative democracy in which voters elect persons who represent them in decision-making institutions. So how does majority rule work in a direct democracy? Well, the answer to this question is fairly complicated. Let's start with a simple example. There are three individuals who have to choose among three levels of provision of public education, denoted by low (L), medium (M), and high (H). Preferences are shown in table 7.1. Andy prefers low expenditure over medium expenditure over high expenditure on education. Maybe he does not have school-aged kids, or he does not like paying the taxes that are necessary to finance the expenditures. George is exactly the opposite of Andy. He prefers high expenditure over medium expenditure over low expenditure. Petra is somewhat in the middle. She prefers medium expenditure over high expenditure and high expenditure over low expenditure. Figure 7.1 illustrates the shape of the preferences if we just use linear interpolation between the three feasible policies. Note that all voters have preferences that are well behaved. They are strictly monotonically decreasing (Andy); strictly monotonically increasing (George); or decreasing in both directions from a single preferred policy (Petra). If the preferences satisfy these properties, we say they are single-peaked. Recall that the preferences in our decentralization model were also single-peaked. If preferences are single-peaked, each voter has an ideal choice in the set, which we call the bliss point. In the example, Andy's bliss point is L, George's bliss point is H, and Petra's bliss point is M. For each voter, outcomes that lie further from his or her bliss point are strictly less preferred. Next, we need to determine which policy would be chosen by majority rule. In general, this may depend on the rules of the voting process. Who decides on the alternatives, and in which order do we vote? It would be desirable if these issues did not matter, i.e., if there existed a unique alternative that would always win as long as it was included in the choice set. Let's start with pairwise comparisons of alternatives. It seems to be desirable that the policy that is chosen by majority rule beats any other feasible alternative in a pairwise vote. If not, then we are clearly in trouble because there exists a feasible policy that the majority would prefer over the alternative that is adopted. Pairwise
TABLE 7.1. An Example of Single-Peaked Preferences

Choice    Andy    George    Petra
First     L       H         M
Second    M       M         H
Third     H       L         L
FIGURE 7.1. Single-peaked preferences. [Figure: utility (vertical axis) plotted against spending (horizontal axis, with levels L, M, and H) for Andy, George, and Petra.]
voting also gets around complications that arise if we vote on more than two alternatives at the same time. Suppose we vote on three alternatives L, M, and H at the same time; then this vote would result in a tie since each alternative would get one vote. By restricting attention to pairwise votes, we are effectively ruling out these scenarios.¹ Let us assume that each voter behaves sincerely, i.e., a voter votes for policy L over policy M if and only if the voter's preferences rank policy L higher than policy M. In this example, M beats L by a vote of 2-1. George and Petra vote for M and only Andy votes for L. Similarly, M also beats H by a vote of 2-1. Petra and Andy vote for M and George votes for H. Policy M thus beats any other feasible policy in a pairwise vote. We therefore say that policy M is the majority rule equilibrium outcome. It is also called a Condorcet winner, named after the Marquis de Condorcet, who was an early French political scientist who studied election rules. Unfortunately, majority rule does not always produce a clear winner. To illustrate this result, let us change the preferences in the example above. New preferences are given in table 7.2.

TABLE 7.2. An Example of Double-Peaked Preferences

Choice    Andy    George    Petra
First     L       H         M
Second    M       L         H
Third     H       M         L

¹ Plurality rules deal with voting procedures with more than two alternatives. We consider an example of plurality rule in one of the problem sets below.
FIGURE 7.2. Double-peaked preferences. [Figure: utility (vertical axis) plotted against spending (horizontal axis, with levels L, M, and H) for Andy, George, and Petra.]
Note that George's preferences are the only ones that have changed. Figure 7.2 provides graphs of the preferences. Andy and Petra still have single-peaked preferences, but George has preferences that are no longer single-peaked. L is a local maximum. The utility decreases until we get to M, which is the worst possible policy for George. Utility then increases monotonically until we reach H. Let us verify that a Condorcet winner does not exist if voters behave sincerely and preferences are given by table 7.2. Policy L wins in an election of L versus M by a vote of 2-1. Policy M wins in an election of M versus H by a vote of 2-1. Finally, policy H wins in an election of L versus H by a vote of 2-1. Hence, there is no Condorcet winner. That's bad news since we cannot rule out that voters may behave like George. What differentiates the first example from the second example? In the first example, all voters have single-peaked preferences, while in the second example, only two of three voters have single-peaked preferences. The two examples, therefore, suggest that a majority rule equilibrium will exist if preferences of all voters are single-peaked. We have the following general result that is due to Black (1948). A majority rule equilibrium typically exists if all individuals have single-peaked preferences and the vote is about a one-dimensional policy. Moreover, the median voter (the "middle" voter if we order voters by their bliss points) is the decisive voter who determines the outcome. This result is known as the median voter theorem. When preferences are not single-peaked or if the policy space is multidimensional, there is no guarantee that a majority rule equilibrium exists. The median voter theorem only tells us about the existence of a majority rule equilibrium. It does not tell us anything about potential efficiency properties of the majority rule equilibrium. It should not be surprising that it is difficult to derive general results.
In the examples above, we cannot say much about efficiency since we have only specified rankings of preferences and not the intensity of preferences. We need to relax this assumption.
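The pairwise comparisons in tables 7.1 and 7.2 can be checked mechanically. The following is a minimal sketch, assuming sincere voting and an odd number of voters; the function names `pairwise_winner` and `condorcet_winner` are illustrative choices and do not come from the text.

```python
# Rankings from tables 7.1 and 7.2, listed from most to least preferred.
SINGLE_PEAKED = {
    "Andy": ["L", "M", "H"],
    "George": ["H", "M", "L"],
    "Petra": ["M", "H", "L"],
}
DOUBLE_PEAKED = {
    "Andy": ["L", "M", "H"],
    "George": ["H", "L", "M"],
    "Petra": ["M", "H", "L"],
}


def pairwise_winner(rankings, a, b):
    """Return the policy that wins a sincere majority vote of a versus b.

    Assumes an odd number of voters, so ties cannot occur.
    """
    votes_a = sum(1 for r in rankings.values() if r.index(a) < r.index(b))
    votes_b = len(rankings) - votes_a
    return a if votes_a > votes_b else b


def condorcet_winner(rankings, policies=("L", "M", "H")):
    """Return the policy that beats every other policy pairwise, or None."""
    for candidate in policies:
        others = [p for p in policies if p != candidate]
        if all(pairwise_winner(rankings, candidate, o) == candidate for o in others):
            return candidate
    return None


print(condorcet_winner(SINGLE_PEAKED))  # table 7.1
print(condorcet_winner(DOUBLE_PEAKED))  # table 7.2
```

With the rankings of table 7.1 the search returns M, and with those of table 7.2 it returns `None`, which is the voting cycle described above.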
In the technical appendix, we consider a public good provision game in which voters need to determine the level of expenditures that are financed by a lump-sum tax. Recall that this is a tax in which all taxpayers pay the same amount of taxes. We can derive preferences over expenditure policies and provide conditions under which these preferences are single-peaked. As a consequence, the median voter is decisive. We then compare the preferred policy of the median voter with the efficient policy that satisfies the Samuelson condition. (Recall we derived this condition in chapter 4 to characterize the efficient level of public good provision.) In general, there is no guarantee that the median voter's preferred policy satisfies the Samuelson condition. Depending on the voters' distribution of preferences and income, it is possible to get either over- or underprovision of public goods in equilibrium. As a consequence, there is no guarantee that majority rule will implement the efficient level of public good provision.
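To see how over- or underprovision can arise, consider a small numerical sketch. The functional form is an assumption made here for illustration and is not the specification of the technical appendix: voter i has utility theta_i * ln(g) + income_i - g / n, where g is public spending, the price of the public good is 1, and the cost is split equally through a lump-sum tax. Under this assumption preferences over g are single-peaked, voter i's bliss point is n * theta_i, and the Samuelson condition (the sum of theta_i / g equals 1) gives an efficient level equal to the sum of the theta_i.

```python
from statistics import median


def preferred_level(theta_i, n_voters):
    """Bliss point of a voter with u = theta*ln(g) + income - g/n (assumed form)."""
    return theta_i * n_voters


def majority_rule_level(thetas):
    """Spending chosen under majority rule: the median voter's bliss point."""
    return preferred_level(median(thetas), len(thetas))


def efficient_level(thetas):
    """Samuelson condition sum_i(theta_i / g) = 1 implies g* = sum_i(theta_i)."""
    return sum(thetas)


# Hypothetical taste parameters, chosen only to illustrate the two cases.
right_skewed = [1.0, 2.0, 6.0]  # median taste below mean taste
left_skewed = [1.0, 5.0, 6.0]   # median taste above mean taste

for thetas in (right_skewed, left_skewed):
    print(thetas, majority_rule_level(thetas), efficient_level(thetas))
```

In this sketch the majority rule outcome is n times the median taste while the efficient level is n times the mean taste, so the direction of the distortion depends on whether the median lies below or above the mean.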
7.2.2 Sequential Voting The existence of a majority rule equilibrium is closely related to other assumptions that we make regarding the voting environment. For example, we can interpret the previous exercise as a simultaneous voting game, i.e., all votes over all possible policy pairs are taken at the same time. We look for a policy that does not lose a single pairwise contest. Suppose we now adopt a sequential voting game. In this game, we eliminate an alternative if it is beaten by another alternative in a pairwise vote. The votes continue until all but one alternative has been eliminated. Let us consider the equilibria that arise in the double-peaked preferences voting game. Consider the example in table 7.2. Note that the equilibrium of this game depends crucially on the agenda or the sequence of the votes. Suppose Andy is chosen as the agenda setter. Andy would like policy L to be chosen by majority rule. Can he devise a sequence of votes that delivers this result? The answer is yes. Suppose Andy suggests that the first vote pits alternative M against alternative H. Convince yourself that M beats H under sincere voting. Hence alternative H has been eliminated. As a consequence, the only possible vote in the second round of the game pits L against M. Note that L defeats M by a 2-1 vote, which leaves L as the equilibrium of this sequential voting game. However, this is not the only equilibrium of this sequential game. Suppose George is chosen as the agenda setter. George would like to have policy H adopted. He can accomplish that by proposing a vote between M and L in the first round and H and L in the second round. Finally, consider the case in which Petra is chosen as the agenda setter. Petra would like to have policy M adopted. She can accomplish that by proposing a vote between H and L in the first round and M and H in the second round. We have thus verified that this game has three equilibria.
Moreover, the policy that is not considered in the first round of the voting game will ultimately win in a sequential voting game in this example. The person who controls the agenda or the sequence of the vote then becomes a de facto dictator. Since the rules of the voting process matter, we now understand why the rules in the US Congress regarding bringing bills up to a vote and amending existing bills are so complicated and seemingly archaic! Why? Because all these rules matter in determining the final outcome of the voting process!
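The three agendas discussed above can be enumerated in a few lines. This is a sketch under the same assumptions as the text (sincere voting, preferences from table 7.2); the helper names are hypothetical.

```python
# Rankings from table 7.2, listed from most to least preferred.
DOUBLE_PEAKED = {
    "Andy": ["L", "M", "H"],
    "George": ["H", "L", "M"],
    "Petra": ["M", "H", "L"],
}


def pairwise_winner(rankings, a, b):
    """Return the policy that wins a sincere majority vote of a versus b."""
    votes_a = sum(1 for r in rankings.values() if r.index(a) < r.index(b))
    return a if votes_a > len(rankings) - votes_a else b


def sequential_outcome(rankings, first_pair, policies=("L", "M", "H")):
    """Vote on first_pair, then pit the survivor against the remaining policy."""
    survivor = pairwise_winner(rankings, *first_pair)
    (remaining,) = [p for p in policies if p not in first_pair]
    return pairwise_winner(rankings, survivor, remaining)


for first_pair in [("M", "H"), ("M", "L"), ("H", "L")]:
    print(first_pair, "->", sequential_outcome(DOUBLE_PEAKED, first_pair))
```

Each agenda ends with the policy that was held back in the first round, so the agenda setter only needs to keep his or her preferred policy out of the first vote.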
7.2.3 Vote Buying, Vote Trading, and Log Rolling Another key assumption of the median voter theorem is that each agent votes sincerely. We do not allow for side payments at the voting stage, i.e., one person cannot "buy" the vote of another person by offering him or her a payment or transfer of resources on the side. Let's relax this assumption. To make more progress, it is useful to allow for an intensity measure of preferences over alternatives. Let's consider the following example with three alternatives, illustrated in table 7.3. For the sake of concreteness assume that the voters need to decide whether or not to build a pool, a hospital, a library, or do nothing. Let us assume that the net benefits that accrue to each voter can be measured in dollars. That allows us to measure the intensity of the preferences. We can think about this measure as each voter's willingness to pay for each project. Notice that the net willingness to pay can be negative since voters have to pay taxes to finance the construction of each project. Consider table 7.3. Let us assume that the status quo is to do nothing. It will continue unless one project is adopted. Andy is willing to pay $200 if the hospital is built. If the library is built, you would have to pay him $40 to keep him indifferent. You need to pay him $120 if the pool is built to keep him indifferent. By construction, zero payments are required if none of the three projects are implemented. In a nutshell, Andy cares about health care but does not care much about reading or swimming. In the last column of the table, we just add the three dollar amounts to obtain the total willingness to pay. It is fairly straightforward to show that the only equilibrium in the simultaneous voting game without side payments is that none of the three projects are built. Each project loses against the alternative "do nothing" by a 2 to 1 vote. The willingness to pay is negative for two voters in each case and positive for one. 
What happens if we allow for side payments? The answer depends on the rules of the voting process. Let us assume the following rules. In the first stage, one individual is randomly chosen to become the "proposer." In the second stage of the game, the proposer makes a take-it-or-leave-it offer selecting a project to be determined by an up-and-down vote. At the same time, the proposer can offer side payments to the other two voters. So the implicit assumption here is that all of the voters have enough money in the bank so that they can finance these side payments. What happens under these voting rules? It is easy to see that the proposer can always get his or her favorite project to be adopted by majority rule. For example, if Andy is chosen as a proposer, he would pick the hospital and offer George a side payment of $50. In that case, the proposal would be accepted by a 2 to 1 vote since both Andy and George would be weakly better off in this case.

TABLE 7.3. An Example with Side Payments

Choice      Andy    George    Petra    Total
Hospital     200      -50      -55       95
Library      -40      150      -30       80
Pool        -120      -60      400      220
Nothing        0        0        0        0
This result assumes that Petra cannot make a counterproposal to Andy's offer. If she could, she would offer George up to $55 to vote against the proposal. In that version of the game, Andy would have to offer George at least $55 to get the hospital built. Note that similar issues arise if the proposer is allowed to create a bill that bundles two or more projects together. Suppose we do not allow side payments but we do allow for bundling of projects. Again, the outcome will largely depend on the identity of the proposer. Suppose again Andy is chosen as the proposer; then he would propose to build the hospital and the library. Andy's willingness to pay is equal to 160 = 200 - 40. George's willingness to pay is 100 = 150 - 50 and Petra's willingness to pay is -85 = -30 - 55. As a consequence, this proposal would pass by a 2 to 1 vote. Also, note that this proposal is the one that maximizes Andy's willingness to pay among all possible proposals that can get approved by majority rule. Closely related are the issues of log rolling and vote trading. Log rolling, especially popular in politics, is the practice of exchanging favors by reciprocally voting for each other's proposed legislation. This is closely related to the concept of vote trading. Effectively, these games allow a majority of voters to form a winning coalition. The coalition can then implement the projects that serve their special interests. Note that the costs are typically borne by the minority voters. In the two examples considered above, Petra would be strictly worse off in both equilibria relative to a majority rule outcome of the simultaneous vote game without side payments. At this stage, you may wonder why voting mechanisms with side payments are not commonly observed in practice. The reason is that these mechanisms are hard to implement as the number of voters increases. In a large enough electorate, it is hard to imagine that voters would know each other's preferences.
Direct democracy becomes infeasible when there are too many voters. As a consequence, representative democracy is common in large cities.
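The side-payment and bundling examples of section 7.2.3 reduce to simple arithmetic on table 7.3. The sketch below assumes sincere up-or-down votes against the status quo, no counterproposals, and that a voter supports a bundle only if his or her net willingness to pay is strictly positive; all function names are illustrative.

```python
from itertools import combinations

PROJECTS = ("Hospital", "Library", "Pool")

# Net willingness to pay in dollars, from table 7.3 (status quo "Nothing" = 0).
NET_BENEFIT = {
    "Andy": {"Hospital": 200, "Library": -40, "Pool": -120},
    "George": {"Hospital": -50, "Library": 150, "Pool": -60},
    "Petra": {"Hospital": -55, "Library": -30, "Pool": 400},
}


def bundle_value(voter, bundle):
    """Net willingness to pay of a voter for a bundle of projects."""
    return sum(NET_BENEFIT[voter][p] for p in bundle)


def passes(bundle):
    """True if a strict majority gains from the bundle relative to doing nothing."""
    yes_votes = sum(1 for v in NET_BENEFIT if bundle_value(v, bundle) > 0)
    return yes_votes > len(NET_BENEFIT) / 2


def cheapest_side_payment(proposer, project):
    """Other voter who is cheapest to make indifferent, and the required payment."""
    costs = {
        v: max(0, -NET_BENEFIT[v][project]) for v in NET_BENEFIT if v != proposer
    }
    voter = min(costs, key=costs.get)
    return voter, costs[voter]


def best_bundle(proposer):
    """Passing bundle (no side payments) that maximizes the proposer's net benefit."""
    bundles = [
        b for k in range(1, len(PROJECTS) + 1) for b in combinations(PROJECTS, k)
    ]
    return max((b for b in bundles if passes(b)), key=lambda b: bundle_value(proposer, b))


print([passes((p,)) for p in PROJECTS])          # single projects, no side payments
print(cheapest_side_payment("Andy", "Hospital"))  # whom Andy pays, and how much
print(best_bundle("Andy"))                        # Andy's preferred passing bundle
```

Replacing "Andy" with another proposer shows how strongly the outcome depends on who makes the proposal.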
7.3 Representative Democracy The median voter model may also apply to representative democracies as shown by Downs (1957). To illustrate this result, let us consider the following simple game. Suppose voters need to choose between two candidates or two parties. Each candidate can sign a binding contract with the voters, i.e., each candidate announces a policy that he or she will implement when in office. The voters observe the two announcements by the two candidates, and each voter chooses the politician that promises the more desirable policy. After the election is over, the winner implements the policy that he or she announced before the election. Note that this is a model in which politicians can commit themselves to implement a policy before the election is held. What will happen in this game? Well, the answer depends on how much each candidate wants to win relative to his or her own policy preferences. Suppose for simplicity that politicians only care about winning and do not care what policies are implemented. We will have to relax that assumption below. In that case, we can think about them as maximizing the number of votes they get. Also, note that the commitment and politicians' objectives are closely related. If politicians only care about winning, there is no trade-off for them and hence no real commitment problem. If politicians have their own policy preferences, then the commitment issue becomes important. How do you make sure that the elected politician will actually make good on his promises if that is not in his own policy interest? Downs's analysis abstracts from these problems. Under these simplifying assumptions, Downs shows that both politicians will announce the same policy prior to the election. It will correspond to the preferred policy of the median voter. The argument follows from the same logic as Hotelling's (1929) model of spatial competition. Suppose both candidates do not locate at the position of the median voter. This cannot be an equilibrium because one of the candidates can move her platform toward the center of the ideological spectrum, i.e., toward the position of the median voter and increase her vote share. The only equilibrium of that game is then obtained when both candidates adopt platforms that correspond with the position of the median voter. Any deviation implies that the deviating candidate will lose the election for sure. In equilibrium, both candidates will win with 50 percent probability. To illustrate this result, consider the example in table 7.1. We already know that M is the voting equilibrium under majority rule. Let us consider the Downsian game with two candidates, denoted by D and R. Suppose candidate D proposes to the voters to implement policy M. What should candidate R do? Suppose he promises a policy to the left of M. Then he will lose the election since at least two voters will vote for candidate D. Similarly, if he proposes a policy to the right of M, he will lose the election by a 2-1 or 3-0 margin. As a consequence, the best strategy for candidate R is also to move to the center and propose M. In that case both candidates propose the same policy and the three voters are indifferent between the two candidates. They may as well flip a coin.
No matter who wins the coin toss, the elected politician will implement policy M. The extension of the median voter theorem by Downs then implies that a two-party system tends to be stable in the sense that both parties stake out positions near the center of the ideological spectrum. Replacement of direct democracy by a representative system has no effect on the policy outcomes. Both systems mirror the preferences of the median voter. There are many useful extensions of these results. If you are interested in these topics, I suggest that you take a course in political economy or social choice. Here I just want to mention one important paper by Besley and Coate (1997). They assume that politicians are selected by the people from those citizens who present themselves as candidates for public office. That means that the preferences of politicians cannot be arbitrary. There must be at least one citizen that has these preferences. That makes some basic sense. The main advantage of this model is that it is tractable. Most importantly, it can handle multidimensional issues and policy spaces. This framework, therefore, relaxes a number of the strong assumptions that are necessary to derive the simple median voter theorem.
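The Downsian argument for the example in table 7.1 can be verified by brute force. The sketch below restricts both candidates to the three feasible platforms L, M, and H, assumes that candidates care only about winning, and treats an election with identical platforms as a coin toss; these simplifications and the function names are assumptions made for illustration.

```python
from itertools import product

POLICIES = ("L", "M", "H")

# Rankings from table 7.1, listed from most to least preferred.
RANKINGS = {
    "Andy": ["L", "M", "H"],
    "George": ["H", "M", "L"],
    "Petra": ["M", "H", "L"],
}


def win_probability(own, rival):
    """Probability that the candidate with platform `own` beats platform `rival`."""
    if own == rival:
        return 0.5  # voters are indifferent and may as well flip a coin
    votes = sum(1 for r in RANKINGS.values() if r.index(own) < r.index(rival))
    return 1.0 if votes > len(RANKINGS) / 2 else 0.0


def is_equilibrium(d, r):
    """True if neither candidate can raise his or her win probability by deviating."""
    best_d = max(win_probability(p, r) for p in POLICIES)
    best_r = max(win_probability(p, d) for p in POLICIES)
    return win_probability(d, r) == best_d and win_probability(r, d) == best_r


equilibria = [(d, r) for d, r in product(POLICIES, repeat=2) if is_equilibrium(d, r)]
print(equilibria)
```

The only platform pair that survives the check is the one in which both candidates propose M, the bliss point of the median voter.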
7.4 Is the Median-Income Voter Decisive? Empirical Evidence How do we test the predictions of the median voter theorem? It turns out that this is not as straightforward as one may think. The main problem is that we do not know the identity of the median voter in the real world. In a world in which
individuals differ by income and preferences over public and private goods, it is not obvious who the "median voter" is since we do not observe preferences. Consider the question of whether the median voter theorem can explain observed tax and expenditure policies in US municipalities. If we ignore heterogeneity in preferences, then it is reasonable to think that the voter with median income in the municipality should be decisive. There is a substantial empirical literature that has tried to test that hypothesis. Bergstrom and Goodman (1973) provide an empirically tractable model of public good provision financed by a proportional income tax. They prove a theorem that gives a closed-form solution for the level of public good provision adopted under majority rule and specify conditions under which the median-income household is decisive. Inman (1978) tests the validity of the Bergstrom-Goodman theorem for a sample of fifty-eight Long Island school districts. He rejects the restriction imposed by the median-income voter hypothesis for one-fourth of the districts. Even in those cases where he rejects the median-income-voter-as-decisive hypothesis, the deviations between observed and predicted expenditures are less than 20 percent. Compelling indirect evidence in favor of the median voter theorem is also provided by Lott and Kenny (1999), who examine the growth of state governments as a result of giving women the right to vote. Using panel data for 1870-1940, they examine state government expenditures and revenue. They also study voting by US House and Senate state delegations and the passage of a wide range of different state laws. They find that suffrage (the extension of voting rights to women) coincided with immediate increases in state government expenditures and revenue. They also find evidence in favor of more liberal voting patterns for federal representatives.
Similarly, Miller (2008) finds that suffrage rights for American women helped children benefit from technological innovations in health care and significantly decreased child mortality in the US. These findings are consistent with the implications of the median voter theorem as long as we assume that (a) women prefer higher state and local expenditures than men; and (b) women care more about the welfare of young children than men. These are not unreasonable assumptions. As women become eligible to vote, the median voter theorem predicts that policy should move in their preferred direction.
7.5 Discussion 7.5.1 Dimensionality of the Policy Space The median voter theorem relies on a variety of assumptions that can be challenged. Representatives of federal, state, and local legislatures are elected to vote on a range of different issues, from education policy to social policy. We may, therefore, conjecture that a high-dimensional voting model would be necessary to explain the observed voting patterns of elected politicians. It turns out that this conjecture is false. We have compelling evidence for members of the US House of Representatives and the US Senate as well as members of most state legislatures that a low-dimensional voting model can explain almost all the observed voting patterns.
TABLE 7.4. Voting Patterns

Vote    Feingold    Snowe    Coburn
1       Yea         Nay      Nay
2       Yea         Yea      Nay
3       Nay         Yea      Yea
4       Nay         Nay      Yea
5       Yea         Yea      Yea
6       Nay         Nay      Nay
How is that possible? Note that there are excellent data on roll call votes in which all present representatives must answer with "yea" or "nay." The challenge is to infer the preferences of politicians from the observed voting patterns. To illustrate the basic ideas, consider the following example due to McCarty (2010) in which three senators vote on six bills. The voting outcomes are described in table 7.4. Let's consider a one-dimensional spatial voting model and assume that each senator has a bliss point given by $x_i$. Let $y_j$ denote the position of the "yea" outcome of vote $j$ and $n_j$ the position of the "nay" outcome. Suppose senators have quadratic preferences and vote sincerely. Senator $i$ will vote "yea" on roll call $j$ if and only if

$$-(x_i - y_j)^2 > -(x_i - n_j)^2 \tag{7.1}$$

The question is then the following: Can we use a classification algorithm that can determine the unknown three bliss points ($x_1$, $x_2$, and $x_3$) as well as the unknown twelve positions of each bill, $(y_1, n_1), \ldots, (y_6, n_6)$, from the observed voting data in table 7.4? Try to convince yourself that this in fact is possible for the example above. This is a fun way to spend thirty minutes of your life! Pick an ideological scale from [-1, 1]. Then pick the three bliss points of the senators. You probably want Feingold on the left, Snowe in the middle, and Coburn on the right of the scale. Then determine values for $(y_1, n_1)$ that generate the votes for the first roll call. Continue to the next vote. You should do this exercise in a spreadsheet. In practice we see a lot of votes, and a simple voting model along the lines described in equation (7.1) will not be consistent with all observed voting patterns. The solution to this problem is then to adopt a probabilistic voting model in which we add random utility shocks to each voting outcome. Let us denote these random shocks by $\epsilon_{ij}^y$ and $\epsilon_{ij}^n$ and assume for simplicity that they follow a normal distribution. Think about these random shocks as adding noise to the voting process. Senator $i$ chooses yea on vote $j$ if

$$-(x_i - y_j)^2 + \epsilon_{ij}^y > -(x_i - n_j)^2 + \epsilon_{ij}^n \tag{7.2}$$

Since we do not observe the idiosyncratic shocks, voting patterns appear to be probabilistic. Integrating out these shocks, we can derive the probability that senator $i$ will vote yea on vote $j$. Hence we obtain a well-behaved econometric model that we can use to estimate its underlying parameters. In section A.8 of
the appendix, we provide a detailed discussion of these types of random utility models and how to estimate them. Research by Poole and Rosenthal (1985) finds that a one-dimensional probabilistic voting model explains a surprisingly large share of all roll call votes that deal with economic issues taken in the US Congress. Moreover, that dimension has a nice interpretation: it captures differences in ideologies about redistribution and the optimal size of the federal government. A two-dimensional model can explain almost all observed voting patterns including noneconomic votes. While there are clear differences between Democrats and Republicans along the first dimension, there are almost no differences along the second dimension. You can go to the website www.voteview.com to see the full history of Poole and Rosenthal scores of any representative or senator that has ever served in the US Congress. Pick your favorite member of Congress and find out where he or she is placed on the ideological scale! We conclude that the politicians we elect to represent us in Washington are not that complicated. On most economic issues, they seem to vote along ideological lines most of the time.
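The spreadsheet exercise for table 7.4 can also be sketched in code. The bliss points below are hypothetical values picked for illustration (Feingold left, Snowe center, Coburn right) and are not estimates. Positions are stored in tenths of the [-1, 1] scale so that all comparisons use exact integers. The last function shows one way to turn the sincere voting rule into a probability, assuming independent normal shocks with a common standard deviation `sigma`; that distributional detail is an assumption of this sketch.

```python
from math import erf, sqrt

# Hypothetical bliss points in tenths of the [-1, 1] scale (assumed values).
BLISS = {"Feingold": -8, "Snowe": 0, "Coburn": 8}

# Roll calls from table 7.4, in the order Feingold, Snowe, Coburn.
ROLL_CALLS = {
    1: ("Yea", "Nay", "Nay"),
    2: ("Yea", "Yea", "Nay"),
    3: ("Nay", "Yea", "Yea"),
    4: ("Nay", "Nay", "Yea"),
    5: ("Yea", "Yea", "Yea"),
    6: ("Nay", "Nay", "Nay"),
}


def votes_yea(x, y, n):
    """Sincere vote with quadratic preferences: yea if y is closer to x than n."""
    return -(x - y) ** 2 > -(x - n) ** 2


def predicted_pattern(y, n):
    """Votes of the three senators for yea position y and nay position n."""
    return tuple("Yea" if votes_yea(x, y, n) else "Nay" for x in BLISS.values())


def fit_roll_call(pattern):
    """Grid search over positions in tenths; return the first (y, n) that fits."""
    grid = range(-10, 11)
    for y in grid:
        for n in grid:
            if y != n and predicted_pattern(y, n) == pattern:
                return y, n
    return None


def yea_probability(x, y, n, sigma=1.0):
    """P(yea) if each outcome's utility receives an independent N(0, sigma^2) shock."""
    gap = -(x - y) ** 2 + (x - n) ** 2   # utility of yea minus utility of nay
    z = gap / (sigma * sqrt(2))          # the shock difference has sd sigma * sqrt(2)
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF evaluated at z


for vote, pattern in ROLL_CALLS.items():
    y, n = fit_roll_call(pattern)
    print(vote, pattern, "y =", y / 10, "n =", n / 10)
```

The grid search returns only one of many position pairs that rationalize each roll call, since any pair with the same midpoint and the same ordering of yea and nay produces identical votes. This is one reason why estimation in practice relies on a probabilistic model and many votes.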
7.5.2 Ideology and Competence Nevertheless, ideology is not the only thing that matters, especially in state and local elections. Levitt (1996) analyzes Democratic and Republican senators from the same state. Since they face the same electorate, they should vote the same. In fact, Levitt shows that they vote differently from each other. Why do voters from the same state elect senators with different ideologies? Why do conservative states sometimes elect moderate governors and vice versa? There is strong evidence at the state and local level that voters care about other characteristics besides ideology. Politicians differ in competence or valence. There are good ones and bad ones. Competence is particularly important for governors and mayors, who need to have some executive or managerial skills to succeed. Since competence matters, voters sometimes trade off competence for ideology. A more competent politician can be more extreme in equilibrium and still expect to be reelected. The promise of reelection can serve as a disciplining device and provide incentives for elected officials to moderate their behavior, as long as they sufficiently care about holding office, i.e., as long as the benefits of holding office are sufficiently high. Sieg and Yoon (2017) analyze all gubernatorial elections held in the US between 1950 and 2011. They focus on states with a binding two-term limit. They find that the benefits that elected officials obtain from holding office are significant and large in magnitude. As a consequence, the prospects of reelection provide strong incentives for moderate governors to move toward the center of the ideological spectrum during their first term in office. Voters are willing to accept significant trade-offs in ideology to obtain a more capable governor. Hence they conclude that ideology and competence are two important characteristics of governors. Recall that Ferreira and Gyourko (2009) also show that ideology is less important than competence in mayoral elections.
7.5.3 Accountability and Competence In addition to eliminating incompetent and extreme candidates, elections may also mitigate moral hazard by creating accountability. What does that mean?
Politicians often are required to take costly actions on behalf of voters. For example, they need to work hard to attract new businesses to their communities. Voters need to reward politicians who exert effort and work hard. Similarly, voters need to penalize politicians who shirk their responsibility. Managers are often incentivized by stock options and bonus payments. Politicians are incentivized by reelection. How important are these reelection incentives for politicians? Alt, Bueno de Mesquita, and Rose (2011) study the impact of US gubernatorial term limits across states. They are interested in two separate effects of elections on government performance. Holding tenure in office constant, differences in performance by reelection-eligible and term-limited incumbents identify an accountability effect. If a governor can run for reelection, he or she has greater incentives to exert costly effort on behalf of voters. Holding term limit status constant, differences in performance by incumbents in different terms identify a competence effect. Consider two governors that both face a binding term limit. Since the incentives to exert effort are the same, differences in outcome must be due to differences in ability or competence. Formalizing these ideas, we obtain the following regression model:

$$\text{performance}_{it} = \alpha_0 + \alpha_1\,\text{first-term reelection-eligible}_{it} + \alpha_2\,\text{second-term lame duck}_{it} + \alpha_3 X_{it} + \theta_i + \delta_t + \nu_{it} \tag{7.3}$$

where performance is measured by variables such as economic growth, debt finance costs, taxes per capita, and expenditures per capita. $X_{it}$ are control variables such as population growth, $\theta_i$ are state fixed effects, and $\delta_t$ are time fixed effects. Finally, $\nu_{it}$ is the error term of the model. The omitted category is first-term lame ducks. The coefficient on first-term reelection-eligible provides an estimate of the accountability effect since it compares the performance of first-term reelection-eligible governors to that of first-term lame duck governors. The coefficient on second-term lame ducks provides an estimate of the competence effect since it compares reelected lame duck governors to first-term lame duck governors. Since performance is expected to be worst under first-term lame duck governors, we expect both of these coefficients to be negative for spending, taxes, and borrowing costs and positive for economic growth. Alt, Bueno de Mesquita, and Rose estimate the model using a sample of one-term- and two-term-limit states. Table 7.5 summarizes some key findings of the study. Robust standard errors are reported in parentheses. The first and second columns show that per capita spending and taxes are 3-5 percent lower under both first-term reelection-eligible governors and second-term lame ducks than under first-term lame ducks, supporting the accountability and competence effects, respectively. Column 3 shows that borrowing costs are six to seven basis points lower under both first-term reelection-eligible governors and second-term lame duck governors, compared to first-term lame duck governors. Column 4 shows that the economic growth rate is nearly 0.7 percentage point higher under first-term reelection-eligible governors than under first-term lame duck governors. Overall, we conclude that these results support our view
Chapter 7

TABLE 7.5. Accountability and Competence

| | (1) Log of per capita Spending | (2) Log of per capita Taxes | (3) Borrowing Cost | (4) Economic Growth |
|---|---|---|---|---|
| First-Term Reelection-Eligible (Accountability) | -0.048 (0.012) | -0.039 (0.014) | -5.81 (2.18) | 0.66 (0.27) |
| Second-Term Lame Duck (Competence) | -0.041 (0.012) | -0.030 (0.015) | -6.75 (2.47) | 0.45 (0.29) |
| Observations | 686 | 686 | 686 | 686 |
| R-Squared | 0.98 | 0.98 | 0.72 | 0.69 |
that governors respond to reelection incentives by working harder. Policy is thus responsive to reelection incentives.
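To make the mechanics of equation (7.3) concrete, here is a minimal sketch on simulated data. It is not a replication of Alt, Bueno de Mesquita, and Rose (2011). The coefficient values in `TRUE_ALPHA`, the panel dimensions, the noise level, and the rule that every simulated state moves from a one-term to a two-term limit at some date are all assumptions made for illustration. The regime switch supplies the within-state variation needed to separate the two coefficients from the state fixed effects. The fixed effects are removed by two-way demeaning, which in a balanced panel gives the same slope estimates as including state and time dummies.

```python
import random

def two_way_demean(panel):
    """Within transformation for a balanced panel stored as panel[state][period]."""
    n, t = len(panel), len(panel[0])
    state_mean = [sum(row) / t for row in panel]
    time_mean = [sum(panel[i][s] for i in range(n)) / n for s in range(t)]
    grand_mean = sum(state_mean) / n
    return [panel[i][s] - state_mean[i] - time_mean[s] + grand_mean
            for i in range(n) for s in range(t)]

def solve(a, b):
    """Solve a small linear system a x = b by Gaussian elimination with pivoting."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * k
    for r in reversed(range(k)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, k))
        x[r] = (m[r][k] - tail) / m[r][r]
    return x

def ols_no_intercept(regressors, outcome):
    """Least squares via the normal equations (X'X) b = X'y."""
    k = len(regressors)
    xtx = [[sum(u * v for u, v in zip(regressors[a], regressors[b]))
            for b in range(k)] for a in range(k)]
    xty = [sum(u * v for u, v in zip(regressors[a], outcome)) for a in range(k)]
    return solve(xtx, xty)

TRUE_ALPHA = (-0.05, -0.04, 0.5)  # hypothetical alpha_1, alpha_2, alpha_3

def simulate_and_estimate(n_states=50, n_periods=30, seed=0):
    """Simulate a balanced state-by-period panel and estimate alpha_1..alpha_3."""
    rng = random.Random(seed)
    a1, a2, a3 = TRUE_ALPHA
    delta = [rng.gauss(0, 1) for _ in range(n_periods)]   # time fixed effects
    eligible, lame2, control, perf = [], [], [], []
    for _ in range(n_states):
        theta = rng.gauss(0, 1)                            # state fixed effect
        switch = rng.randrange(5, n_periods - 5)           # assumed reform date
        e_row, l_row, x_row, y_row = [], [], [], []
        for t in range(n_periods):
            two_term = t >= switch
            # Before the reform every governor is a first-term lame duck (omitted category).
            e = 1.0 if two_term and rng.random() < 0.5 else 0.0  # first-term, reelection-eligible
            l = 1.0 if two_term and e == 0.0 else 0.0            # second-term lame duck
            x = rng.gauss(0, 1)                                  # control variable
            y = a1 * e + a2 * l + a3 * x + theta + delta[t] + rng.gauss(0, 0.02)
            e_row.append(e); l_row.append(l); x_row.append(x); y_row.append(y)
        eligible.append(e_row); lame2.append(l_row)
        control.append(x_row); perf.append(y_row)
    columns = [two_way_demean(p) for p in (eligible, lame2, control)]
    return ols_no_intercept(columns, two_way_demean(perf))

estimates = simulate_and_estimate()
```

Comparing `estimates` with `TRUE_ALPHA` shows how the two dummies recover the simulated accountability and competence effects relative to first-term lame ducks. If no state ever changed its term-limit regime, the two dummies would sum to a constant within each state and could not both be separated from the state fixed effects.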
7.5.4 Voter Turnout

The median voter theorem assumes that there are no substantial differences in turnout. Understanding voter turnout is a central problem in political economy. In large elections, the probability of being pivotal or deciding the election is, for all practical purposes, zero. Hence voters should not turn out to vote. Nevertheless, we see significant voter turnout in important elections. This is sometimes described as the "paradox of voting." Why do people vote when they know they are not decisive? The answer is of course that voters tend to show up at the ballot box because of a sense of duty. Democracy only works if individuals vote! From a purely strategic point of view, if nobody is voting, then each individual voter has a profitable deviation of being the only one turning out so that zero turnout cannot be an equilibrium. Bursztyn, Cantoni, Funk, and Yuchtman (2018) exploit naturally occurring variations in the existence, closeness, and dissemination of preelection polls to identify a causal effect of anticipated election closeness on voter turnout in Swiss referenda. They find evidence that voters do react to the perceived competitiveness of the election. Political parties understand the importance of turnout. They fashion their policy platforms and political campaigns to motivate their own base and to discourage the opposition's base. Accordingly, turnout partially determines which party or candidate wins the election, but it also shapes the policy options from which voters select. Coate and Conlin (2004) develop a model based on a "group rule-utilitarian approach." What does that mean? Elections naturally divide the electorate into distinct groups. Examples are liberals against conservatives or free-traders against protectionists. In a referendum or ballot initiative, these groups are the supporters and opposers of the proposal. On the local level, there are, for example, many referenda that affect school financing. 
This naturally pits advocates of public education against opponents. In a candidate election, the supporters of the different candidates make up opposing camps. Think about Hillary Clinton versus Donald Trump. Elections create contests between these different groups. The winner is the
Voting over Public Good Provision
FIGURE 7.3. Local organizers. (Rosemary Ketchum/ www.pexels.com)
group that delivers the most votes on election day. The key assumption is that individual group members want to do their part to help their group win. Coate and Conlin analyze data from local liquor referenda in Texas. They find that this model provides a good explanation for the observed turnout patterns. This result is important since it provides a more compelling explanation as to why voters show up at the ballot box. Yes, they do their civic duty, but they also like their groups to win!
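The claim behind the paradox of voting, that a single vote is almost never decisive in a large election, can be illustrated with a short calculation. The sketch below assumes a deliberately simple setting that is not taken from the text: every other voter turns out and independently votes for one side with probability `p`, so one additional vote is pivotal only if the others split exactly evenly. The function name and the electorate sizes in the usage note are illustrative.

```python
import math

def tie_probability(n_others, p=0.5):
    """Probability that n_others voters split exactly evenly.

    Assumes each voter independently supports side A with probability p,
    so the count for A is Binomial(n_others, p). Computed in logs to avoid
    overflow for large electorates.
    """
    if n_others % 2 != 0:
        raise ValueError("n_others must be even for an exact tie")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    k = n_others // 2
    log_prob = (math.lgamma(n_others + 1) - 2.0 * math.lgamma(k + 1)
                + k * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_prob)
```

With two other voters and `p = 0.5` the tie probability is one half. With one million other voters it falls below one in a thousand even in a perfectly balanced race, and it collapses toward zero as soon as `p` moves slightly away from one half, for example `tie_probability(1_000_000, 0.51)`.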
7.6 A Case Study: The Role of Money in State and Local Politics

The median voter theorem also ignores the role of money in politics. Of course, federal elections are more expensive than state and local elections. Nevertheless, money matters in state and local politics as well. Let's consider the 2014 gubernatorial election in Pennsylvania, which accumulated the most campaign financing the state had ever seen, a total of $57.9 million. The two final candidates competed to raise these sizable funds on top of their own contributions to their campaign. Democrat Tom Wolf used $10 million of his personal funds to launch his candidacy and raised $20.9 million more to ensure success at the ballot box. Meanwhile, Republican Tom Corbett fell just short of Wolf both in campaign contributions and votes. He received $27 million from outside organizations but only personally funded $3.5 million. The two opponents rallied donations from wealthy individuals along with labor unions, lobbyists, and political party officials and fund-raisers. Across the US, all but six of the thirty-six 2014 state governor's
races resulted in a victory for the candidate with the highest funding. Whether or not the large amounts of cash are productive in campaign strategy, the correlation of money and winning cannot be ignored. Running for office is not cheap even in city-level elections. The 2017 New York City mayoral race ended in a landslide win for Bill de Blasio with a $4.8 million backing. He successfully shut out his opponents, Sal Albanese and Robert Gangi, who only collected $2.6 million and $80,000, respectively. Six percent of de Blasio's total financing came from over 4,000 small donors giving $175 or less, with the majority of the remainder collected from wealthy donors who were accused of seeking access to government officials to benefit their personal businesses. But compared to the gubernatorial election in Pennsylvania, the local election in NYC looks cheap. Of course, that may be partially due to the fact that the NYC elections are dominated by a single party. Where does this cash go? Are these large amounts of money necessary to win? For local elections, most of the costs are attributable to consulting services and the administration of the campaign itself. De Blasio spent 35 percent of his funds, or $862,000, on campaign consultants and 30 percent on wages for his campaign team. The remaining 35 percent was distributed across fund-raising, polling, rent, and professional services. On the other hand, state election spending is dominated by TV advertising. Over $580 million was spent on TV commercials for governor campaigns in 2014, with many candidates committing over 90 percent of their total funds to this promotional tactic. To better quantify the value of TV advertising spending, we can calculate how much the candidates spent for each vote they received. Across the 2014 governor races, Florida governor Rick Scott spent approximately $21.50 on TV ads for every vote, while his losing opponent, Charlie Crist, spent $12.25. 
In Illinois, Bruce Rauner spent $21 per vote and outpaced Pat Quinn, who spent $22.75. There is little information to determine whether these candidates are successfully "buying" these votes, but they consistently allocate large amounts of cash to these advertisements. However, the funds going to the advertising agencies come primarily from the independent Republican and Democratic party groups and other independent national committees, like the National Rifle Association and the Next Generation Climate Action Committee. Also, these groups spend more on negative advertising against the opposition than on the incumbent's own campaign. Approximately 71 percent of the outside group-funded ads attacked the opposition.
7.7 Conclusions

Voluntary mechanisms are likely to yield inefficient levels of public good provision, i.e., the levels are too low relative to the efficient level. As a consequence, city residents face the difficult task of aggregating the preferences of a large set of diverse citizens and determining city policies. In most US cities, residents delegate the decision-making authority to members of a city council and a mayor who is the chief executive of the city. The elected government can then use coercive forces to levy and collect taxes to try to overcome the free rider problem associated with voluntary mechanisms.
The core model of democracy suggests that elected governments will pursue policies preferred by the median voter. As a consequence, elections and party competition efficiently solve the delegation problem encountered in representative democracies. Moreover, elections allow voters to remove from power politicians who turn out to be too extreme, incompetent, or corrupt. This removal occurs at the ballot box and does not require the use of excessive force or even a public uprising against the elite, which could be very costly. There are additional complications in a representative democracy that may drive a wedge between the policies preferred by the median voter and those adopted by elected politicians. These wedges can be sustained for a variety of reasons. Low voter turnout due to voter fatigue is the most common problem. If voters fail to show up at the polls, it is possible that candidates who do not represent the preferences of the median voter get elected. Finally, there is the absence of a social norm among citizens of punishing politicians for misbehavior. The lack of punishment may explain why democracies in poor countries are often ineffective. Political outcomes and electoral competition are also shaped by political institutions. There are some significant differences in the institutions among city governments. Term limits have the advantage of reducing the dangers of political entrenchment and the electoral advantages of incumbency. However, they may also remove some capable and moderate politicians from office. Similarly, professional city managers may improve the cost efficiency of city administration, but they are also sheltered from electoral competition and hence may not be responsive to changes in voters' needs. A big threat to city government comes from rent-seeking or corrupt officials who abuse their offices for personal enrichment and patronage. We study these issues in more detail later in this book. 
Next, we turn our focus to the roles that household mobility plays in local politics. Households not only vote at the ballot box but also with their feet. This creates some additional problems and complications.
7.8 Technical Appendix: The Public Good Provision Problem

We can illustrate the power of the median voter theorem by considering the public good provision problem that we studied in detail in the previous chapters. Let us assume for simplicity that the number of voters $N$ is odd. Hence the median voter is unique as long as the preferences are distinct and the policy is unidimensional. Individuals have quasi-linear preferences over a public good and a private good:
$i = 1, \ldots, N$ (7.4)
Individuals differ in their preferences for public goods. Without loss of generality, let us assume that we have ordered individuals such that (7.5)
Each individual has an exogenous income denoted by $m_i$. The public good can be produced by converting one unit of income into one unit of the public good,
i.e., $c = 1$. The costs of public good provision are shared equally among the $N$ individuals, i.e., every individual pays $\frac{1}{N}$th of the total level of public good provision. We can, therefore, write each individual's budget constraint as (7.6) Substituting the budget constraint into the utility function yields (7.7) Recall that quasi-linear preferences imply that the demand for public goods does not depend on income. Hence differences in income turn out to be irrelevant in this example. Differences in preferences are driving all the key results. The first-order condition that characterizes the bliss point of each voter can be obtained by taking the derivative of the utility function in equation (7.7) with respect to G: (7.8)
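The logic of this appendix can be checked numerically under one concrete specification. The sketch below assumes a logarithmic quasi-linear form, utility equal to private consumption plus $\beta_i \ln G$, which is an assumption chosen for illustration and not necessarily the form used in the text. With equal cost sharing, private consumption is $m_i - G/N$, the first-order condition $-1/N + \beta_i / G = 0$ gives the bliss point $G = N \beta_i$, and income drops out. The taste parameters, incomes, and the grid of alternatives are hypothetical.

```python
import math
import statistics

def utility(g, beta, income, n_voters):
    """Assumed quasi-linear utility after substituting the budget constraint."""
    return income - g / n_voters + beta * math.log(g)

def bliss_point(beta, n_voters):
    """Maximizer of utility(): -1/N + beta/G = 0 implies G = N * beta."""
    return n_voters * beta

def votes_for(a, b, betas, incomes):
    """Number of voters who strictly prefer provision level a to level b."""
    n = len(betas)
    return sum(utility(a, beta, m, n) > utility(b, beta, m, n)
               for beta, m in zip(betas, incomes))

def median_wins_all_pairwise(betas, incomes, alternatives):
    """True if the median voter's bliss point beats every alternative by majority."""
    n = len(betas)
    g_median = bliss_point(statistics.median(betas), n)
    return all(votes_for(g_median, g, betas, incomes) > n / 2
               for g in alternatives if g != g_median)

betas = [1.0, 2.0, 3.0, 4.0, 5.0]            # hypothetical taste parameters
incomes = [30.0, 80.0, 45.0, 120.0, 60.0]    # hypothetical incomes; they cancel in every vote
alternatives = [0.5 * k for k in range(1, 101)]
```

Under these assumptions the median voter's preferred level is `bliss_point(3.0, 5)`, and `median_wins_all_pairwise(betas, incomes, alternatives)` checks that it defeats every other level on the grid in a pairwise majority vote. Changing the incomes leaves every vote unchanged, which is the sense in which income differences are irrelevant here.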
Hence the preferred level of public good provision for each individual is given by (7.9) Moreover, we can show that individuals' preferences are single-peaked. Note that the second-order condition for the optimality condition is given by
$$-\beta_i \frac{1}{G^2} < 0 \quad \forall i.$$

Households prefer lower housing prices and higher levels of public goods. Next, we consider two distinct communities that offer different combinations of public goods and services. We can think about community 1 as a low-frills place that offers few services but housing is reasonably cheap. Community 2 offers much better public goods and services than community 1, i.e., $G_2 > G_1$. Hence it also has higher prices of housing, i.e., $p_2 > p_1$. Housing prices need to be higher in community 2 than community 1, otherwise nobody would want to live in community 1 in equilibrium. We say that public goods are capitalized in the housing market: if you want to move to a community with higher levels of public goods, you have to pay higher prices per unit of housing. The key question, then, is the following: Where do different households choose to live? The answer clearly has to depend on the income of the households as well as how strongly households care about local public goods and services. For simplicity, let us first ignore differences in preferences and just focus on differences
Mobility and Fiscal Competition

FIGURE 8.2. Single-crossing condition. (Horizontal axis: G; vertical axis: p; indifference curves labeled High income and Low income.)
in income. In that case, we would expect that households with higher levels of income would prefer living in the community with high-quality services, while households with low levels of income would prefer to live in the community with lower services but much cheaper housing prices. Under what conditions can we sustain a sorting equilibrium in which households stratify by income between the two communities? The key assumption we need is called a single-crossing property. This allows us to compare the shape of indifference curves for individuals with different levels of income. As income increases, indifference curves must become steeper. This condition implies that households with higher levels of income have steeper indifference curves than households with lower levels of income. Figure 8.2 illustrates this case. Note that these are indifference curves for two different individuals, one with high and one with low income. Recall that indifference curves for the same individual, illustrated in figure 8.1, do not cross. A steeper indifference curve means that you are willing to accept a larger increase in housing prices for a given change in public goods. Hence you have a higher willingness to pay for public goods. Higher-income households should have steeper indifference curves than lower-income households if the public good is a normal good. Convince yourself that this makes sense. We would like to characterize the sorting of households among communities 1 and 2. Recall that higher-income households have a greater willingness to pay for the public good. Hence there exists a household with income $\hat{m}$ that is exactly indifferent between living in community 1 or community 2. This household is characterized by the following indifference condition:

$$V(G_1, p_1, \hat{m}) = V(G_2, p_2, \hat{m}) \tag{8.5}$$
Figure 8.3 illustrates that result. The choice set is given by the two points $(G_1, p_1)$ and $(G_2, p_2)$. The indifference curve of the indifferent household, whose income is $\hat{m}$,
Chapter 8

FIGURE 8.3. Sorting equilibrium. (Horizontal axis: G; vertical axis: p; indifference curve labeled V(G, p, m).)
goes through both points. Hence the household is indifferent between the two communities. Consider a household that has income $m > \hat{m}$, where $\hat{m}$ is the income of the indifferent household. That household will prefer community 2 over community 1. Recall from figure 8.2 that this household will have steeper indifference curves than the household with income $\hat{m}$. That household, therefore, strictly prefers $(G_2, p_2)$ over $(G_1, p_1)$. Similarly, consider a household that has income $m < \hat{m}$. That household will prefer community 1 over community 2. From figure 8.2 we know that this household will have flatter indifference curves than the household with income $\hat{m}$. Hence that household strictly prefers $(G_1, p_1)$ over $(G_2, p_2)$. Households thus voluntarily sort themselves into communities by income. We call that voluntary segregation or stratification by income. Note that the income cutoff $\hat{m}$ depends on prices and levels of public good provision in both communities. It is easy to verify the following result. If we make the first community more efficient by either lowering housing prices (holding public goods fixed) or increasing public good provision (holding housing prices fixed), households will move from community 2 to community 1. The opposite result holds if we lower prices or increase public goods in community 2. In that sense, our model captures the intuition that more efficient communities tend to be larger in equilibrium since they attract more households. The model considered above is restrictive since households only differ in income. As discussed in the technical appendix at the end of the chapter, we can extend the model and allow heterogeneity in preferences for public goods. Demand for public goods then depends on income and the strength of preferences for public goods. Nevertheless, the same logic applies to household sorting among communities. Households with similar demand for public goods and services live in the same communities. 
However, this more general model can explain a higher degree of income mixing, since some moderate-income households choose to live in community 2 as long as they have strong enough preferences for public goods.
Similarly, some high-income households choose community 1 if they have low enough tastes for public goods. We can also extend the model and allow for the fact that households care not only for public goods but also for other amenities, such as distance to work or proximity to parks and areas of recreation. Households face trade-offs between public goods and local amenities. For example, all communities with high-quality schools may be located in a suburb of the metropolitan area and require significant commuting to the city center. These models then give rise to more complicated sorting patterns. We revisit these issues later in the book.
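The income-sorting argument of this section can be sketched numerically. The example assumes a specific indirect utility function that is not taken from the text: each household buys one unit of housing at price p and has utility ln(m - p) + gamma * ln(G). Under this assumption the slope of an indifference curve in (G, p) space is gamma * (m - p) / G, which rises with income, so the single-crossing property holds. The function names, the parameter gamma, and the numbers in the usage note are hypothetical.

```python
import math

def indirect_utility(g, p, m, gamma=0.5):
    """Assumed form: one housing unit at price p, utility ln(m - p) + gamma * ln(G)."""
    if m <= p:
        return float("-inf")   # housing in this community is unaffordable
    return math.log(m - p) + gamma * math.log(g)

def cutoff_income(g1, p1, g2, p2, gamma=0.5):
    """Income at which V(G1, p1, m) = V(G2, p2, m), assuming G2 > G1 and p2 > p1.

    Setting the two utilities equal gives (m - p1) / (m - p2) = (G2 / G1) ** gamma.
    """
    k = (g2 / g1) ** gamma
    return p2 + (p2 - p1) / (k - 1.0)

def chosen_community(m, g1, p1, g2, p2, gamma=0.5):
    """Return 2 if the household strictly prefers community 2, otherwise 1."""
    v1 = indirect_utility(g1, p1, m, gamma)
    v2 = indirect_utility(g2, p2, m, gamma)
    return 2 if v2 > v1 else 1

m_hat = cutoff_income(1.0, 10.0, 2.0, 15.0)
```

With these illustrative numbers, households with income above `m_hat` choose community 2 and those below choose community 1. Lowering the housing price in community 1, as in `cutoff_income(1.0, 8.0, 2.0, 15.0)`, or raising its public good level, as in `cutoff_income(1.2, 10.0, 2.0, 15.0)`, raises the cutoff, so community 1 attracts more households.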
8.3 Capitalization: Empirical Evidence

Let's consider whether public goods and services are capitalized in housing prices. This question was first posed by Oates (1969), who studied a sample of municipalities in New York and New Jersey. His evidence suggests that if a community increases its tax rates and employs the receipts to improve its school system, the increased benefits from the expenditure side of the budget roughly offset the depressive effect of the higher tax rates on local property values. He viewed these results as broadly consistent with the predictions of the types of models we have considered in this chapter. Of course, it is very hard to disentangle the effects of public policies on home values since expenditures, taxes, and home values are all simultaneously determined in equilibrium. In addition, the composition of communities is also endogenous if households are mobile. As a consequence, we need to be careful when we try to measure the magnitude of the capitalization effect. One way to learn about capitalization effects is to study large policy changes or a "natural experiment." For example, an important change occurred in California in 1978 when voters approved Proposition 13. This proposition limited the maximum amount of any ad valorem tax on real property to 1 percent of the "full cash value" of the property, defined as the value of the house as of 1976, with annual increases of 2 percent at most. Finally, Proposition 13 prohibited both state and local governments from imposing any additional ad valorem taxes on real estate. As a consequence, Proposition 13 immensely reduced property taxes in many municipalities in California. To study the impact of Proposition 13 on capitalization rates, Rosen (1982) regresses the changes in housing values on changes in taxes, controlling for a variety of observables. 
He finds that each $1 of property tax reduction increases house values by about $7, roughly equal to the present discounted value of a permanent $1 tax cut. Note that the fall in property taxes will undoubtedly result in a future reduction in public goods and services. We are not holding expected future services constant in this exercise. These expected cuts in the provision of local public goods and services should reduce, not increase, the value of a house! Rosen finds that house prices rose by almost the present discounted value of the taxes. This stark finding suggests that Californians did not care much about the reduction in future public good provision. While these findings are consistent with the importance of capitalization effects, you may still be concerned that important unobserved neighborhood characteristics may confound the interpretation of the results. Do households really sort
FIGURE 8.4. Proposition 13 affected property values in San Francisco. (pixabay.com/www.pexels.com)
based on local public policies or do they care more about neighborhood characteristics, such as distance to work or proximity to parks? How do we disentangle these effects? One influential paper is by Sandra Black (1999). The basic idea behind this paper is to compare the prices of houses located on the boundaries of adjacent school districts. These houses are all in the same neighborhood and hence have the same neighborhood characteristics. Differences in housing prices cannot, therefore, be explained by differences in local amenities. However, these houses are in different school attendance zones or different school districts, which creates a discontinuous change in school quality along the boundary. If we can control for differences in housing size and housing quality, then differences in housing prices should be primarily driven by differences in school quality. So then how do we measure school quality? As we discuss in detail in subsequent chapters, we typically use some measure of student test scores. Ideally, all students participate in the same mandatory statewide test. Using this type of "spatial regression discontinuity design," Black (1999) finds that parents are willing to pay a significant housing premium for an increase in test scores. This approach has been extended by Bayer, Ferreira, and McMillan (2007). That paper embeds a boundary discontinuity design in a heterogeneous residential choice model. Residential choice implies that different types of households live on either side of the boundary. This sorting may cause some additional problems, especially if households care about the quality of their neighbors or
FIGURE 8.5. Do better schools matter? (pixabay.com/www.pexels.com)
peers. These effects are also called peer effects or neighborhood effects. They play a large role in explaining sorting within communities as we discuss in detail later. Bayer, Ferreira, and McMillan estimate their model using census data from the San Francisco Bay Area. They find that households are willing to pay 1 percent more in house prices when the average performance of the local school increases by 5 percent. These estimates are smaller in magnitude than the ones reported in Black (1999). They explain this finding by pointing out that much of the apparent willingness to pay for more educated and wealthier neighbors is explained by the correlation of these sociodemographic measures with unobserved neighborhood quality. Overall, we conclude that there is compelling empirical evidence that suggests that public goods and services, as well as local amenities, are capitalized into housing values. Mobile households will bid up the prices for houses in desirable school districts, municipalities, and neighborhoods. If you want to live in a nicer neighborhood, you'll have to pay for it! You don't get anything for free in urban housing markets.
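The Proposition 13 result discussed earlier in this section, that a $1 property tax cut raised house values by about $7, can be connected to a discount rate with a short calculation. The sketch assumes the tax cut is a perpetuity whose first payment arrives one year from now and that it is discounted at a constant annual rate; both are simplifying assumptions, and the function names are illustrative.

```python
def pdv_perpetuity(payment, rate):
    """Present value of a constant payment received every year forever: payment / rate."""
    return payment / rate

def pdv_finite(payment, rate, years):
    """Present value of a constant payment received at the end of each of `years` years."""
    return sum(payment / (1.0 + rate) ** t for t in range(1, years + 1))

def implied_discount_rate(capitalized_value, payment):
    """Discount rate at which a perpetual payment is worth capitalized_value today."""
    return payment / capitalized_value

rate = implied_discount_rate(7.0, 1.0)
```

Under these assumptions a capitalization of $7 per $1 of annual tax savings corresponds to a discount rate of one seventh, roughly 14 percent a year. `pdv_finite(1.0, rate, 500)` confirms that the finite sum converges to the same value as the perpetuity formula.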
8.4 Competition and Efficiency: Empirical Evidence

Testing for capitalization is the easy part, although you may disagree at this point. Determining whether competition among municipalities increases efficiency is much harder. Needless to say, the empirical evidence is much less clear than the evidence on capitalization. There are many reasons why this hypothesis is so hard to test. Not only do you run into all the problems that you encounter when you test the capitalization hypothesis, but, in addition, there are a few other headaches you need to deal with.
Most obviously, you need to figure out how to measure efficiency or productivity differences among local municipalities or school districts. Firm productivity can be measured as a residual of a production function that controls for all relevant input factors. We used that approach in chapter 2 to study agglomeration externalities, which are nothing but location-specific shifts in firm productivity. In principle, one could try to estimate these types of production functions for public goods and services as well. For example, in the context of schools, we have fairly good measures of student achievement. Attention then focuses on estimating education production functions. Suppose we can do that; the question is then whether competition among school districts or public-private school competition increases test scores. How do we measure competition? How do we get some exogenous variation in the amount of competition faced by a municipality or school district? Most of the evidence that supports the efficiency hypothesis comes from the analysis of public school districts competing against private schools. Epple, Romano, and Urquiola (2017) review the literature and conclude that there is compelling evidence that public schools improve when they are threatened by the entry of voucher schools. There is also much research on the effects of competition between charter schools and traditional public schools. We review this evidence in more detail in chapter 18.
8.5 A Case Study: The Benefits of Consolidation

Efficiency gains may arise from competition among municipalities. But sometimes they can also arise from consolidation. Sometimes there are too many school districts or municipalities. Less can be more! We can view consolidations just as mergers and acquisitions among firms. Sometimes they are good; sometimes they are bad. We need to carefully look at the details. Consolidation can be desirable if it lowers costs or eliminates the replication of services. In addition, it can help resolve some externality or spillover problems that we discuss in more detail in the next chapter. Here we consider a case study to illustrate some of the key ideas that drive consolidation. In 2003 the city of Louisville, Kentucky, and Jefferson County merged to create the Louisville-Jefferson County Metro Government (Louisville Metro for short). This was the first significant merger between a city and county in a number of years, and it occurred as a result of heated debate and deep divisions between the city of Louisville and its county. Leading up to the merger, Louisville had begun experiencing significant problems. Its population had peaked in the 1960s and had been declining, and people had moved out of the city and into the suburbs. This meant that in the 2000 Census, Lexington had overtaken Louisville as the largest city in the state of Kentucky since Lexington had undergone its own merger with Fayette County. In addition, Louisville's tax base had shrunk, which led to troubles in the city budget and difficulties providing services to its residents (Wachter, 2013). Moreover, Louisville clashed with Jefferson County in a multitude of aspects. Politically, the city and the county had separate and independent governments that ruled over the same area. Residents paid taxes to both entities and received
different services from each. Needless to say, this led to tensions between the two governments. Socially, the white population in Louisville experienced a significant decrease beginning in the 1960s, revamping the socioeconomic dynamic between city and county. Finally, economic incentives led to competition between Louisville and Jefferson County when it came to attracting businesses, as each wanted the businesses to base themselves within its realm. Prior to the 2003 merger, Louisville and Jefferson County had already experienced joint ventures. In the 1970s the school districts were merged by a court order. Initially, the merger caused strained racial tensions to flare, as Louisville's school district was primarily black while Jefferson County's was mostly white. Nevertheless, by the 1990s the city and the county had grown accustomed to the merged school district. Additionally, in 1985 the city and the county eased economic competition between the two. Notably, a new tax plan was enacted, allowing both entities to benefit from new businesses and economic development, regardless of where physically that growth had taken place. The initiative to merge the city with the county had bipartisan support. Democrats, like then mayor of Louisville Jerry Abramson, and Republicans, such as then county chief executive Rebecca Jackson and Senator Mitch McConnell, publicly supported the plan to unify. The arguments in favor of unification hinged on potential benefits in terms of economic development and government efficiency. Notably, many argued that a merger would enable the unified city to better attract and retain jobs, improve government operations without raising taxes, and create a better sense of community within the then separated populations. On the other hand, a multitude of groups opposed the merger. The running theme behind the opposition was the uncertainty of what happens after a merger. 
For instance, the local NAACP chapter feared that African Americans would lose representation in the legislature due to a restructuring of the government. The police and firefighters unions did not want to renegotiate their then favorable contracts. Due to the uncertainty and lack of specificity at the time about the finer points of the merger, opposition groups claimed that taxes would increase and social services would decay. In the end, the proposition to merge the city and the county passed with 54 percent of the voters in favor. A new government was established, made up of a mayor and a council of twenty-six members, one for each district. Well over a decade later the effects of the merger are still unclear, as the benefits and the costs of the merger are still being debated. Nevertheless, there are some conclusions that can be drawn about the outcomes of the merger. Prior to the merger, the population of Louisville had been in decline. In contrast, the population of Louisville Metro has grown at a pace of over 6 percent a year, better than other cities in the United States of comparable size. However, the population within the old city bounds has continued to decrease. The effects of the merger on the economic development of the area are hotly contested. Since the merger, the city seems to have repositioned itself better for the future. Jobs have been lost in the manufacturing sector, while they have been gained in other aspects, like the service sector and business services. The impacts of the merger on government expenditure are just as unclear. There is still no consensus on whether spending increased or decreased as a result of the union.
Currently, the merged Louisville metropolitan area is the eighteenth largest city in the United States, and a government survey in 2011 concluded that 56 percent of the residents feel positive about the outcomes of the merger, compared to the original 54 percent approval of the unification. For the most part, the citizens still have the same concerns and optimism for the city. Calabrese, Cassidy, and Epple (2002) provide a political economy model of consolidations that explains why they are so rare. The key insight is that consolidation creates winners and losers, even if in the aggregate it is desirable to consolidate. If the losers are in the majority in one of the districts or communities, they effectively can veto the consolidation since it is hard to design transfer payments that would make everybody better off.
8.6 Income Stratification and Voting

How much stratification or household sorting by income do we actually observe within large metropolitan areas? The basic model that we considered above predicts perfect stratification by income among communities. That cannot be a good description of the real world. While we would expect some stratification by income to occur in the real world, there are clearly other factors that affect residential choices. Hence we would expect to observe some imperfect income stratification in the data. We need a model with additional sources of household heterogeneity to generate more realistic income sorting patterns. We can generate more income mixing within each community by adding heterogeneity in preferences for public goods to the model. The idea is simple: if you care a lot about public goods, then you are willing to pay a higher housing price to move to a community with high levels of public goods. As a consequence, the demand for public goods is driven not only by income but also by the strength of preferences for public goods. Consider a community with low levels of public goods. This community primarily attracts low- and moderate-income households, but it also attracts some higher-income households that do not care for public goods and services and just want to enjoy the low housing prices. Similarly, communities with high levels of public goods also attract some low- and moderate-income households that care a lot about public goods. For example, these households may have school-age children, want to send them to good schools, and are willing to pay the higher housing prices. In the appendix, we study this model in more detail. Epple and Sieg (1999) explore whether such a model can generate a sufficient amount of income heterogeneity within communities to explain the observed sorting of households across municipalities in the Boston metropolitan area. They use data from the 1980 Census.
This estimation exercise requires some nonlinear estimation techniques that are outside the scope of this book. The basic intuition for the approach is, however, simple. The central idea of the estimation strategy is to match the observed quantiles of the income distributions in ninety-two communities in the Boston metropolitan area with those predicted by the model. Epple and Sieg show that their model fits the observed income distributions in all communities in their sample remarkably well. We conclude that residential choice models can be consistent with the observed sorting by income within a metropolitan area. Does that mean that local tax and
expenditure policies reflect the preferences of the median voter in each community as suggested by Barr and Davis (1966)? Not surprisingly, answering that question is not easy. Thus far we have treated property taxes and expenditures as predetermined. We can endogenize taxes and expenditures via majority rule for each community in equilibrium. That requires some algebra, but the basic insights from the previous chapter apply. In each community, there is a set of decisive voters that determines the property tax rate and the level of expenditures: 50 percent of the households that live in the community want more, 50 percent want less. How do we map this model into the data and test its voting implications? Recall that we also observe education expenditures per capita as well as property tax rates. We can also measure aggregate rental expenditures and the total amount of tax revenues generated by the property tax in each community. The model should explain not only the sorting by income but also the observed tax and expenditure policies given the housing market equilibrium that we observe in each community. Calabrese, Epple, Romer, and Sieg (2006) show that the model's predictions are consistent with the observed data in the Boston metropolitan area. Other researchers have applied similar models to study other US metropolitan areas and have found similar results. We conclude that we can develop equilibrium models that capture household sorting among communities and explain public policies via majority rule. These models not only are consistent with the observed sorting of households by income among communities but also can explain the observed property tax and expenditure policies in each community.
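The quantile-matching idea described in this section can be illustrated with a small sketch. The code below is not the estimator used by Epple and Sieg, which relies on nonlinear techniques outside the scope of this book; it only shows what it means to compare observed and model-predicted income quantiles community by community. The function names, the choice of quartiles, and any data passed in are hypothetical.

```python
def quantile(values, q):
    """Empirical quantile using linear interpolation between order statistics."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def quantile_distance(observed, predicted, probs=(0.25, 0.5, 0.75)):
    """Sum of squared gaps between observed and predicted income quantiles.

    Both arguments map a community label to a list of household incomes.
    A smaller value means the model reproduces the observed sorting better.
    """
    total = 0.0
    for community, obs_incomes in observed.items():
        for q in probs:
            gap = quantile(obs_incomes, q) - quantile(predicted[community], q)
            total += gap ** 2
    return total
```

An estimation routine would search over model parameters to make this distance small; here the predicted incomes are simply taken as given.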
8.7 Segregation and Sorting by Race

In the United States, communities and neighborhoods tend to be segregated not only by income but also by race and ethnicity. Neighborhoods are usually segregated by income due to individual voluntary sorting, but the separation can also be explained by zoning restrictions that inhibit poorer households from living in relatively wealthier neighborhoods with high public good provisions. We study the impact of zoning restrictions in greater detail in chapter 12. Segregation by race in the US is more difficult to understand than segregation by income. To get an idea of the magnitude of segregation by race, figure 8.6 considers the city of Milwaukee. Each dot or point in the figure accounts for two hundred people. Black dots represent African Americans. Gray dots indicate white Americans. We find that most neighborhoods in the city and surrounding suburbs are either predominantly white or predominantly black or Hispanic. Surprisingly, there are very few mixed neighborhoods in Milwaukee. How representative is Milwaukee? It is one of the most segregated cities in the US, but similar patterns hold in many other US cities. Racial segregation can be involuntary or voluntary. Before the late 1960s segregation by race was mainly driven by explicit discrimination against African Americans in housing markets. While some racial discrimination still has an underlying effect on US housing markets, this type of open discrimination has been illegal in the US since the late 1960s. Since then the federal government
has actively enforced existing antidiscrimination laws. Cutler, Glaeser, and Vigdor (1999) provide a description of the rise and fall of ghettos in the United States. They discuss how economic trends led to the migration of African American groups to northern cities between 1890 and 1970. They also point out the policies that enforced the involuntary segregation of blacks into a small number of undesirable neighborhoods. They argue that segregation of blacks has declined since the 1970s and that the percentage of black families has increased in communities that were once characterized as all-white suburbs. Nonetheless, they argue that the historical legal barriers that enforced segregation have been replaced by voluntary segregation. The argument describes how certain neighborhoods become predominantly white as white families tend to pay more than blacks to live there. In general, racial minorities often confront a dilemma when choosing between being among people like themselves versus choosing a neighborhood with better public goods and services (Sethi and Somanathan, 2004).

FIGURE 8.6. Residential segregation in Milwaukee in 2000. Map legend: census tract and county boundaries; Black or African American; White, non-Hispanic; 1 dot = 200 people. Primary Metropolitan Statistical Area (PMSA) boundaries and names are those defined by the Federal Office of Management and Budget on June 30, 1999. All other boundaries and names are as of January 1, 2000. (Racial and Ethnic Residential Segregation in the United States: 1980-2000 by John Iceland, Daniel H. Weinberg, and Erika Steinmetz, August 2002, US Census Bureau)
Voluntary segregation is almost as common an occurrence as involuntary segregation. For example, immigrants coming to the US have historically banded together for mutual benefit; it has allowed them to maintain a sense of community in the new country. More generally, voluntary segregation can arise due to homophily, or the tendency of like individuals to associate with similar others. Individuals in homophilic relationships share common characteristics (beliefs, values, education, etc.) that make communication and relationship formation easier. Many of these similar groups can benefit and sometimes experience positive trickle-down effects. US cities that receive more immigrants have often experienced a decrease in crime, and some immigrant groups have shown overachieving scores in public schools. It is not obvious whether there is any need for government intervention in this case. There is substantial evidence that households in the US sort based on preferences over the racial composition of neighborhoods. Sociologists have referred to this phenomenon as the neighborhood-level effect, in which an individual's neighborhood can explain and predict the socioeconomic outcomes of his or her life. Cities with the greatest geographical segregation allow for a concentration of isolation and poverty, which intensifies the community's overall economic hardships. This congealed mix of unattractive neighborhood characteristics leads to increases in crime, gang formation, and violence. This is especially evident in the heavily segregated city of Milwaukee, as over 50 percent of black men between the ages of thirty and forty have been incarcerated. Segregation may also occur partially as a result of "statistical discrimination." Here individuals are using a person's race as a statistical signal of that person's likely behavior. Statistical discrimination is common not only in housing markets but also in labor and financial markets. 
For example, employers may use race or gender as a signal for the expected performance on the job. Banks may use it to determine the likelihood of repaying a car loan or mortgage. This explains how racial segregation still has an impact in institutional systems today. In housing markets, race may be used as a signal of how well the buyer "fits" into a given community. These assessments are often based on the real estate agent's racial stereotypes. For example, growing up in a neighborhood with prevalent crime puts an individual at a much higher risk of obtaining a mark on their criminal record. This individual will be subject to a future disadvantage when trying to purchase a home, as an assessment of their credit risk is much more likely to deny them a loan with any information of a criminal record. These types of behavior are problematic since they tend to stigmatize individuals of a particular race. There is strong empirical evidence that segregation is not desirable from a purely economic perspective. Research by Cutler and Glaeser (1997) shows that negative labor market effects (lower wages and longer spells of unemployment) persist for those individuals who live in highly segregated metropolitan areas. Suburbanization tends to foster this segregation within cities. Policies such as the Community Reinvestment Act have tried to force banks to extend loans to households in neighborhoods that used to be "redlined" and still are characterized by heavy segregation. There is some evidence that these policies have increased access of minority businesses to small loans. However, there is still a long way to go before households in poor and segregated neighborhoods have equal access to mortgage markets.
8.8 Conclusions

Households vote with their feet by moving to cities and municipalities that offer the mix of housing prices, public goods and services, and other amenities that they find attractive. Cities compete for households against other cities and suburban municipalities. Note that cities thus face a very different financial environment than countries. Even if Greece goes into bankruptcy or significant economic decline, few Greeks will leave the country. Mobility is difficult given significant cultural, legal, and language barriers. Moving within the US, on the other hand, is easier and cheaper relative to international mobility. The mobility of households provides incentives for communities to eliminate inefficiencies. Some suburban municipalities operate fairly efficiently. Larger cities face competition from suburbs, although they offer certain services and goods that are unique to the metro area. The poor level of public education in many urban districts or the relatively high level of crime in some cities has driven many middle-class families from these cities into the suburbs. If a city does not provide an adequate level of public goods and services, households who value these services will prefer to live elsewhere. Therefore, it is essential that a city operates and manages the key concerns of its households. We need to take household mobility into consideration when we study the impact of various policy interventions. For example, if we replace a flat property tax with a more progressive income tax in a community, we should expect to attract lower-income households and lose some of the higher-income households. That may not happen immediately due to mobility costs, but it should happen in the long run. We cannot treat the composition of each city or community as fixed when households are mobile. There also exists competition among large cities, although mobility costs are larger across metropolitan areas than within metro areas.
For example, Philadelphia competes with its suburbs but also with other large cities on the East Coast. For some highly mobile and skilled young households, cities may even face competition from outside the US. For example, financial service providers in New York City compete for talented college graduates with similar firms that are located in London, Frankfurt, Tokyo, or Shanghai. As we will see in the next chapters, modern cities face a number of daunting challenges, including providing adequate public education, dealing with poverty, providing affordable housing, fighting crime, and keeping neighborhoods safe and healthy. Mobility and fiscal competition may help exert pressure on city administrators to solve these problems. It would be foolish to expect that competition alone would be sufficient to eliminate all inefficiencies at the city level. One of the biggest problems generated by mobility is that it can create large differences in income. Hence there are large differences in fiscal capacity among cities and municipalities. Fiscal competition and household sorting lead to unequal outcomes. High-income households tend to live together and send their children to higher-quality schools, placing their children at an advantage over low-income households in future financial well-being. Residential segregation by income, therefore, reinforces income inequality across generations. To guarantee a more equal level of expenditure among communities, many states have adopted equalization policies that impose taxes on high-income communities and subsidize low-income communities. We discuss these issues in detail in the next chapter.
8.9 Technical Appendix: Optimal Locational Choices

8.9.1 Modeling Fiscal Competition

Let us consider the representation of preferences

(8.6)

where ρ < 0 and 1 > β > 0. You should convince yourself that the demand for housing can be expressed as

$$h(p, m) = \frac{\beta m}{p} \tag{8.7}$$

Note that β is the constant expenditure share of housing. Substituting the housing demand function and the budget constraint into the utility function, we obtain the indirect utility function, which can be written as

(8.8)

where Ψ is a function of β and ρ. Consider a model with two communities, 1 and 2, with indirect utility given by V(G1, p1, m) and V(G2, p2, m). There exists a household with income m that is exactly indifferent between living in 1 or 2:

(8.9)

Collect terms containing m on the left-hand side:

(8.10)

Solving for m, we obtain

(8.11)

Note that the income cutoff m depends on prices and levels of public good provision in both communities. Convince yourself that the following conditions hold:

$$\frac{\partial m}{\partial p_1} < 0 \tag{8.12}$$

$$\frac{\partial m}{\partial G_1} > 0 \tag{8.13}$$

If we make community 1 more efficient by either lowering housing prices (holding public goods fixed) or increasing public good provision (holding housing prices fixed), more households will move to that community.
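The cutoff and its comparative statics can be checked numerically. The sketch below assumes one functional form that is consistent with the description above but is not taken from the text: indirect utility V(G, p, m) = [G^ρ + Ψ (m p^(-β))^ρ]^(1/ρ), with Ψ = [β^β (1 - β)^(1 - β)]^ρ, and community 1 as the low-price, low-public-good community (G1 < G2 and p1 < p2). The function names and all parameter values are illustrative assumptions.

```python
def psi(beta, rho):
    # Assumed constant that depends only on beta and rho.
    return (beta ** beta * (1 - beta) ** (1 - beta)) ** rho


def indirect_utility(G, p, m, beta=0.3, rho=-0.5):
    # Assumed CES form: [G^rho + Psi * (m * p^(-beta))^rho]^(1/rho).
    inner = G ** rho + psi(beta, rho) * (m * p ** (-beta)) ** rho
    return inner ** (1 / rho)


def income_cutoff(G1, p1, G2, p2, beta=0.3, rho=-0.5):
    # Income at which a household is indifferent between communities 1 and 2.
    # Derived by equating the two indirect utilities and solving for m.
    # Requires G1 < G2 and p1 < p2 so that the ratio below is positive.
    numerator = G1 ** rho - G2 ** rho
    denominator = psi(beta, rho) * (p2 ** (-beta * rho) - p1 ** (-beta * rho))
    return (numerator / denominator) ** (1 / rho)


# Hypothetical values: community 1 is cheaper and offers fewer public goods.
base = income_cutoff(G1=1.0, p1=1.0, G2=2.0, p2=1.5)
cheaper_housing = income_cutoff(G1=1.0, p1=0.9, G2=2.0, p2=1.5)
more_public_goods = income_cutoff(G1=1.2, p1=1.0, G2=2.0, p2=1.5)
```

Under these assumptions, households with income below the cutoff choose community 1, so a higher cutoff means that community 1 attracts more households. Both `cheaper_housing` and `more_public_goods` exceed `base`, in line with the statement above.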
8.9.2 Imperfect Sorting by Income

Let's consider a more realistic version of the model in which households differ by income m and preferences α. Heterogeneity among households is given by the joint distribution of m and α. Let's write the indirect utility function V(p, G, m, α). The analysis proceeds just as before. Households that are indifferent between both communities are still characterized by the following condition:

(8.14)
Given our utility function, we can rewrite this indifference condition as
(8.15)
Rearranging and taking logs yields

$$\ln(\alpha) = \rho \ln(m) + \ln\left(\frac{p_2^{-\beta\rho} - p_1^{-\beta\rho}}{G_1^{\rho} - G_2^{\rho}}\right) \tag{8.16}$$

The boundary indifference condition is thus a line in the (ln m, ln α) space. The slope is determined by ρ. Note that ρ < 0, so the slope is negative. Plotting this indifference locus, the equilibrium sorting will then look as shown in figure 8.7. C1 denotes the set of households that live in community 1, while C2 denotes the set of households that live in community 2. Holding tastes for public goods fixed at some level α, we have perfect sorting of households by income. Since households differ in tastes, there will not be perfect sorting by income in equilibrium. Higher-income households must have lower preferences for public goods to be indifferent between the two communities. Hence the model can generate a fair amount of income heterogeneity within communities. This model provides more realistic predictions about the distribution of households by income across communities. Epple and Sieg (1999) and Epple, Romer, and Sieg (2001) estimate a similar version of this model using data from the Boston metro area.
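A short simulation can make the imperfect sorting result concrete. The sketch below reads the boundary condition (8.16) as ln α = ρ ln m + ln[(p2^(-βρ) - p1^(-βρ)) / (G1^ρ - G2^ρ)] and assigns a household to community 2 when its ln α lies above that line. This reading of the boundary, the lognormal distributions for m and α, and all parameter values are assumptions made for illustration only.

```python
import math
import random

# Hypothetical parameters: community 2 has more public goods and higher prices.
BETA, RHO = 0.3, -0.5
G1, P1, G2, P2 = 1.0, 1.0, 1.2, 1.8


def boundary_log_alpha(log_m):
    # Boundary line in (ln m, ln alpha) space; slope RHO is negative.
    k = (P2 ** (-BETA * RHO) - P1 ** (-BETA * RHO)) / (G1 ** RHO - G2 ** RHO)
    return RHO * log_m + math.log(k)


def choose_community(m, alpha):
    # Households above the boundary value public goods enough to pay for them.
    return 2 if math.log(alpha) > boundary_log_alpha(math.log(m)) else 1


def simulate(n=5000, seed=0):
    # Draw income and taste independently and record incomes by community.
    rng = random.Random(seed)
    incomes = {1: [], 2: []}
    for _ in range(n):
        m = rng.lognormvariate(0.0, 0.5)      # assumed income distribution
        alpha = rng.lognormvariate(0.0, 0.5)  # assumed taste distribution
        incomes[choose_community(m, alpha)].append(m)
    return incomes


incomes = simulate()
mean_income = {j: sum(v) / len(v) for j, v in incomes.items()}
```

Holding α fixed, the assignment is a pure income cutoff, but across the population the income ranges of the two communities overlap: community 2 is richer on average, yet some of its residents have lower incomes than some residents of community 1.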
FIGURE 8.7. Imperfect sorting by income. (Axes: ln(m) and ln(α).)

8.10 Debate: City-County Merger

Debate the benefits of a city-county merger. Consider a situation such as the city of Pittsburgh and Allegheny County. The pro side should argue in favor of a merger. The con side should argue against a merger. The following questions should help structure the debate:

1. Why did the city of Pittsburgh experience significant financial problems in the past?
2. What was the solution that was adopted to improve city finances?
3. What are some of the structural problems of the city?
4. How many municipalities and school districts are there in the county?
5. What types of municipalities are unlikely to favor a merger? Why?
6. Are there any examples of successful city-county mergers?
7. What have we learned from these mergers?
8. Why is it that the city of Pittsburgh may benefit a lot from a merger while other cities would probably benefit less?

8.11 Problem Sets

1. Explain why housing prices often include a premium that reflects the value of local public goods and amenities to the residents. Why is housing more expensive in neighborhoods with more amenities than in neighborhoods with fewer amenities?
2. Consider two households that differ by income. The households have the same indirect utility function defined over a local public good, the price of housing, and income. Provide a graphical analysis of the shape of the indifference curves and explain the single-crossing property.
3. Consider the fiscal competition model with two communities, 1 and 2, such that G1 < G2 and p1 < p2. Consider two individuals, A and B, with incomes mA < mB. Show that it is not possible that individual A chooses community 2 while individual B chooses community 1, if preferences satisfy the single-crossing property, i.e., higher-income individuals have steeper indifference curves than lower-income individuals.
4. Suppose we have three communities and there is a unique equilibrium in which all three communities have distinctly different income levels. Provide an illustration of the sorting of households by income that arises in equilibrium and characterize the median voter in each community assuming a uniform distribution of income.
5. Explain why household mobility and competition between different jurisdictions may lead to segregation by income in local housing markets.
6. Explain why the median voter is endogenous in the model discussed above.
7. Consider a society with many persons that can choose freely to live in either region 1 or region 2. It is more expensive to live in region 2: it costs c1 to live in region 1 and c2 = c1 + Δ to live in region 2. Individuals differ in their incomes, denoted by m. Income takes on the values between 0 and 1 and is uniformly distributed. Individuals care about the mean income of those living in their region, denoted by $\bar{m}_j$. We can think of $\bar{m}_j$ as measuring the attractiveness of the peers in the region. The utility of an individual living in region j is given by

   (8.17)

   Individuals make simultaneous residential decisions. In equilibrium, nobody wishes to move.
   a) Suppose all individuals live in region 1. Derive the mean income in region 1. Consider the richest person with m = 1. Show that this person has an incentive to move if Δ < 1.
   b) Suppose all individuals live in region 2. Derive the mean income in region 2. Consider the poorest person with m = 0. Show that this person has an incentive to move if Δ > 0.5.
   c) Consider the case in which 0.5 < Δ < 1. Show that in equilibrium where the mean incomes differ across regions, every individual in region 1 must have a lower income than every individual in region 2. (Hint: Consider two individuals with m' < m''. Assume m'' lives in 1 and m' lives in 2 with $\bar{m}_1 < \bar{m}_2$ and show that this leads to a contradiction.)
   d) Let m* denote the income of an individual who is indifferent between living in region 1 and region 2. Show that there exists a critical m* = 2Δ - 1 such that all individuals with m > m* live in region 2 and all individuals with m < m* live in region 1.
9
Spillovers, Fiscal Inequality, and Intergovernmental Transfers
9.1 Motivation

We learned in the previous chapter that fiscal competition among municipalities and cities provides some advantages. In particular, it gives households the option to leave inefficiently run cities and municipalities. There are, however, some serious problems that are also caused by fiscal competition. We discuss two such problems in this chapter. First, fiscal competition does not necessarily provide reasonable outcomes if there are large spillover effects among regions, cities, or communities. Equilibrium allocations may not be efficient since each local government has a tendency to ignore the spillover effects that it imposes on other communities. Spillovers can be problematic in a variety of policies ranging from crime fighting to providing air quality. An efficient organization of government, then, requires a combination of taxes and transfers among jurisdictions that internalizes the positive or negative spillover effects. Second, fiscal competition and decentralization can lead to large inequalities among state and local governments. Inequalities raise fairness issues that can undermine the legitimacy of a decentralized organization of government. In some cases, these inequalities are so large that a centralized solution may be more desirable. In many cases, this drastic solution is neither necessary nor desirable. We will see in this chapter that we can, in principle, design a transfer system that alleviates inequality among local communities. We show that the federal or state government can use taxes and transfers to reallocate resources to poor local governments. These taxes and transfers tend to decrease spending in high-income communities and increase spending in low- and moderate-income communities and, therefore, reduce fiscal inequalities. In practice, intergovernmental transfers often appear to be insufficient to deal with either spillovers or inequalities.
One key problem is that federal and state policies often favor suburban and rural communities and penalize large, lower-income cities. These existing policies largely reflect the distribution of political power within the country or within a state.

9.2 Heterogeneity in Intergovernmental Transfers among the Largest US Cities

To characterize heterogeneity in intergovernmental transfers among the largest US cities, we turn to the data collected by the Lincoln Institute of Land Policy.

FIGURE 9.1. Differences in intergovernmental transfers among large US cities. Bar chart of mean annual intergovernmental revenue per capita, 2005-2014, in dollars per capita (axis from 0 to 4,000). Cities as listed in the chart: Austin, TX; Louisville, KY; Ft. Worth, TX; Dallas, TX; Houston, TX; Jacksonville, FL; San Antonio, TX; Charlotte, NC; Phoenix, AZ; Indianapolis, IN; San Diego, CA; Columbus, OH; San Jose, CA; Chicago, IL; Detroit, MI; Los Angeles, CA; San Francisco, CA; Philadelphia, PA; New York, NY. (Fiscally Standardized Cities database/Lincoln Institute of Land Policy)

Figure 9.1 plots ten-year averages for the period between 2005 and 2014. We find that cities obtain transfers that range between $1,000 and $4,000 per capita on average during that period. Recall that the vast majority of transfers come from state governments. Differences in intergovernmental transfers thus primarily reflect differences in state policies. We find that states that are considered to be more liberal, such as New York and California, tend to have larger state governments. Hence they provide larger transfers to cities. Conservative states, such as Texas and Florida, tend to have smaller state governments and less redistribution among municipalities. As a consequence, intergovernmental transfers are small for cities such as Austin, Dallas, Houston, and Jacksonville.

9.3 Fiscal Spillover Effects

Fiscal spillover effects arise in the provision of local public goods and services if a community benefits from the provision of public goods in nearby or geographically adjacent communities. A compelling example is air quality. Suppose the prevailing wind direction is toward the west and a community engages in serious air pollution. Then due to the prevailing wind conditions, all other communities that are located to the west of that community will suffer worse air quality. This is an example of a negative spillover effect. Similarly, a positive spillover effect can arise when a community improves its own air quality. Positive or negative spillovers can also arise in law enforcement and crime fighting. Suppose a community hires a number of additional police officers. As a consequence, the crime rate drops in that community. Adjacent communities may also benefit from this policy if the reduction in crime in one community discourages, for example, organized crime or drug dealing in the region. In that case, the crime rate also drops in adjacent communities. If criminals just relocate from one community to nearby communities, then this policy could have a negative externality on nearby communities. To formalize these ideas, let us consider a model with two communities, where each community provides public goods equal to Gi, for i = 1, 2.¹ Spillover effects exist if the utility of a representative household in community 1, denoted by U1, depends on the level of G2. If U1 increases (decreases) in G2, we have a positive (negative) spillover effect. Consider the following specification:

(9.1)

The parameter a measures the direct benefits from the provision of public goods. In contrast, b measures the indirect effect that arises due to the spillover effect from G2. Note that if G2 = 0 the second term is zero, and we have a standard model of public good provision. The parameter c can be interpreted as the marginal cost of providing the good. Also note that G1 and G2 are modeled as complementary goods in the utility function above. An increase in G2 increases the marginal utility of G1, assuming that b > 0. Similarly, the second community has the following utility function:

(9.2)

Note that this model of interjurisdictional spillovers is similar to our model of voluntary provision of public goods. Here we are facing a similar free riding problem, which generates an inefficiency. As in the model of voluntary public good provision, the inefficiency arises because the two players do not internalize the externality that they create for the other player. In the model of voluntary public good provision, the individual contributions are substitutes. Here we consider a more general specification of the model that also allows for potential complementarities. More formally speaking, we can solve the decision problem for community 1 for any arbitrary level of G2. Doing that gives us the best-response function for community 1, denoted by G1(G2), which characterizes the optimal strategy for community 1. Similarly, we can compute the best-response function, denoted by G2(G1), which characterizes the optimal strategy for community 2.

¹ The model that we consider is discussed in detail in Hindriks and Myles (2006).

FIGURE 9.2. The Nash equilibrium with spillovers.

In the technical appendix at the end of the chapter, we show that the best-response functions are given by

$$G_1(G_2) = \left(\frac{a + b\sqrt{G_2}}{c}\right)^2 \tag{9.3}$$

$$G_2(G_1) = \left(\frac{a + b\sqrt{G_1}}{c}\right)^2 \tag{9.4}$$

Let us assume that b is positive and that there is a positive spillover effect. In that case, we see that an increase in G2 leads to an increase in G1. This follows from the assumption that we treat G1 and G2 as complementary goods in the utility function. An equilibrium is then given by the intersection of the two best-response functions. In the technical appendix, we show that the Nash equilibrium is given by

$$G^N = \left(\frac{a}{c - b}\right)^2 \tag{9.5}$$

Note that the equilibrium level of public good provision increases in a and b and decreases in c as we would expect. Figure 9.2 illustrates the resulting equilibrium. It shows the best-response functions of each community. Note that these curves are upward sloping if b > 0. The Nash equilibrium is given by the intersection of the two curves.
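The intersection of the two best-response curves can also be found numerically. The sketch below reads the best-response functions (9.3) and (9.4) as G_i = ((a + b√G_j)/c)², which is the reading consistent with the Nash level in (9.5), and lets both communities repeatedly best-respond to each other until their choices stop changing. The function names, the starting point, and the parameter values are illustrative assumptions; the iteration is meant for the case 0 ≤ b < c.

```python
import math


def best_response(G_other, a, b, c):
    # Optimal own provision given the other community's provision.
    return ((a + b * math.sqrt(G_other)) / c) ** 2


def nash_by_iteration(a, b, c, G_start=0.0, tol=1e-12, max_iter=10_000):
    # Both communities best-respond simultaneously until a fixed point is reached.
    G1 = G2 = G_start
    for _ in range(max_iter):
        G1_new = best_response(G2, a, b, c)
        G2_new = best_response(G1, a, b, c)
        if abs(G1_new - G1) < tol and abs(G2_new - G2) < tol:
            return G1_new, G2_new
        G1, G2 = G1_new, G2_new
    raise RuntimeError("best-response iteration did not converge")


def nash_closed_form(a, b, c):
    # Symmetric Nash level from (9.5).
    return (a / (c - b)) ** 2


# Hypothetical parameters with a positive spillover (b > 0).
G1_star, G2_star = nash_by_iteration(a=1.0, b=0.5, c=2.0)
```

With these values the iteration settles at the same level as `nash_closed_form(1.0, 0.5, 2.0)`, and raising a or b, or lowering c, raises the equilibrium level.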
If two communities do not coordinate their policies, the outcome will in general not be efficient. Consider the case of a positive spillover. If community 1 ignores the positive spillover that it generates for community 2, then the provision of public goods and services will be too low in community 1. In the case of a negative spillover, the provision will be too large. In a decentralized system of government, we can solve this problem by designing a system of fiscal transfers, consisting of subsidies and taxes. In the case of a positive spillover, we need to subsidize the provision of public goods in both communities. The subsidy lowers the cost of providing public goods and, therefore, encourages each community to spend more resources on public goods. In the technical appendix, we also compute the efficient allocation and the optimal level of the subsidy that decentralizes the efficient allocation. This subsidy is best offered by the state or federal government. Spillovers arise not only in public good provision but also due to locational clustering of firms. As we learned in chapter 2, the most important spillover effect in a metropolitan area arises due to agglomeration externalities. For these externalities to occur, the city needs to provide a variety of local public goods and services that benefit all residents who work or consume in the city, not just those who live in the city. Fiscal transfers can, therefore, improve the efficiency within metropolitan areas. In practice, we need transfers from more affluent suburbs to the central city to reap the full benefits of agglomeration and to compensate the city for a variety of goods and services that the city provides to suburban commuters. Sharing that tax burden improves agglomeration and hence the productivity within the metro area.
Improved productivity of the city economy means higher wages for suburban commuters and lower costs for city goods that are "exported" to the suburbs. If fiscal transfers from suburbs to large cities are inadequate, city tax rates are too high. If the city government provides inadequate services or provides its services at too high a tax rate, the city is not competitive in the global economy and the region will suffer. In case of a negative spillover, we need to discourage the provision of those goods and services. We discuss negative spillovers in more detail when we address the environmental challenges faced by cities. We will see that the state and federal government should either use quantity regulations or tax the activity that generates the negative spillover effect.
9.4 Inequality and Fairness

We saw in the previous chapter that decentralization leads to sorting by households based on income and/or tastes for local public goods and amenities. These sorting patterns can lead to socioeconomic segregation. Consider, for example, two communities in the Boston metropolitan area: Weston and Lakeville. Weston is an affluent suburban community with median household income of more than $200,000. It spent $21,334 per student in 2011 for primary and secondary education. Lakeville, which has a median household income of approximately $75,000, spent $11,800. Note that Lakeville is not a poor community; its median income is well above the US median household income. Nevertheless, we find that there are large differences in spending that are driven by differences in fiscal capacities
FIGURE 9.3. Fragmentation: School districts in Allegheny County. (US Census/WikiMedia Commons)
among communities. While some of these differences may be desirable, we need to make sure that poor and disadvantaged children have access to quality education. Your family income and your place of birth should not be the main factors that govern whether you have access to economic opportunities. Inequality is sometimes a consequence of political fragmentation. Figure 9.3 illustrates the problem of fragmentation, focusing on Allegheny County. The map shows that there are forty-two different school districts in Allegheny County. The population of the county was 1,223,048 in 2017. It is hard to believe that the optimal number of school districts for this county is forty-two. With too many independent school districts, each largely financing local education by local taxation, political fragmentation naturally creates inequality in access to public education. While it is possible to merge districts and create a better design of the system, it is often easier and politically more convenient to create a transfer system among the districts. Transfers can offset the natural tendency to inequality that arises in such a fragmented system.
If you want to design a fairer system of local financing, you need to design a transfer system in which the state or federal government levies taxes on higher-income communities and subsidizes expenditures on local public services in low- and moderate-income communities. To reduce the gap in economic opportunities, grants must be inversely related to income, and taxation must be proportional to income or even progressively increasing with income. School finance equalization laws often mandate redistribution of funds across school districts in a state to ensure more equal financing of schools. Finance equalization schemes differ across states. California effectively redistributes all revenues. New Jersey redistributes most revenue from locations whose revenues are above the eighty-fifth percentile. Approximately half of the budget for public schools in Philadelphia comes from the state government.
9.5 Different Types of Intergovernmental Grants

Intergovernmental transfers or grants come in multiple forms with different implications. A block grant is a grant of some amount with no mandate as to how it is to be used. A matching grant is a grant in which the amount of the grant is tied to the amount of spending by the local community. Different grant types affect incentives in different ways. Block grants shift out the entire budget constraint, raising spending on all goods. They are primarily used for redistribution of resources and the municipality decides how to use the additional resources, either raising expenditures or lowering taxes. Matching grants graphically rotate out the budget constraint, acting as a subsidy. These grants help with externalities since they are targeted to increase spending in a specific area. They are often used for infrastructure investments or education.

To illustrate the impact of the different types of grants, consider a community that has a budget of $10,000 (per capita). It has to make a decision on how to spend these resources. It can either spend on education or on other governmental programs (welfare, housing, parks, etc.). This decision problem is illustrated in figure 9.4. Given the preferences of the community, it is optimal to spend $5,000 on education and $5,000 on other expenditures. Now let us assume that the community obtains a $2,000 block grant from the state or federal government. The new budget constraint is illustrated in figure 9.4. Note that the block grant shifts the budget line out but does not affect the slope of the budget. In the new equilibrium, education expenditures increase by $1,000, while other expenditures increase by $1,000. Finally, consider a matching grant in which the state or federal government subsidizes education expenditures by, let's say, 25 percent. Note that the matching grant primarily affects the slope of the budget constraint.
The effective price of education for the community is now $0.75 per dollar of education spending. Given the preferences shown in figure 9.5, education expenditures increase from $5,000 to $6,200. Note that the community's own education outlays fall from $5,000 to $4,650 = 0.75 × $6,200. The difference is spent on other expenditures, which increase from $5,000 to $5,350. In this example, the matching grant increases not only total spending on education but also other spending. In general, the magnitude of the effects depends on the shape of preferences as well as the size of the grants. But note that a block grant only has an income
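FIGURE 9.4. Spending with a block grant. (Axes: education spending and other spending; spending on each moves from 5,000 to 6,000.)

FIGURE 9.5. Spending with a matching grant. (Axes: education spending and other spending; education spending moves from 5,000 to 6,200 and other spending from 5,000 to 5,350.)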
effect since it does not affect the slope of the budget line. A matching grant has an income effect (i.e., it makes the community richer); but it also has a substitution effect since it affects the price of education faced by the community. Block grants may crowd out local spending. How large are these crowd-out effects? Knight (2002) analyzes highway grants from the federal government to states. He argues that these grants are primarily determined by the strength of the state's political representatives. He uses an instrumental variable that is based on changes in a state's congressional delegation and the resulting gains or losses of political power. He finds that each $1 increase in federal grant money due to rising congressional power leads to a $0.90 reduction in the state's own spending. These estimates are rather large. Other studies in the literature find smaller crowd-out effects.
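The arithmetic of the grant example in this section can be checked with a few lines of code. The sketch below is only an illustration: the function names are hypothetical, the equal split of the budget under the block grant is an assumption that happens to match the numbers in the text, and the education level chosen under the matching grant ($6,200) is taken from the text rather than derived from preferences.

```python
def block_grant(budget, grant, education_share):
    """Block grant: the budget line shifts out, its slope is unchanged.

    education_share is an assumed preference parameter (the share of
    total resources the community chooses to spend on education).
    """
    total = budget + grant
    education = education_share * total
    other = total - education
    return education, other


def matching_grant(budget, match_rate, education_total):
    """Matching grant: the grantor pays match_rate of each education dollar.

    education_total is the total education spending the community
    chooses; it is an input here, not the result of an optimization.
    """
    local_education = (1 - match_rate) * education_total  # paid locally
    other = budget - local_education                       # what is left over
    grant = match_rate * education_total                   # paid by the grantor
    return local_education, other, grant
```

With a budget of 10,000, `block_grant(10_000, 2_000, 0.5)` reproduces the 6,000/6,000 split, and `matching_grant(10_000, 0.25, 6_200)` reproduces local education outlays of 4,650 and other spending of 5,350. The implied size of the matching grant, 1,550, is not stated in the text; it follows from 0.25 × 6,200.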
9.6 Poverty and Intergovernmental Transfers

Perhaps one of the most difficult issues facing mayors of large cities is urban poverty. Certainly, a mayor wishing to serve all city residents wants to do something about poverty. Fighting poverty is in the interest of not only low-income households but also high-income households, who are negatively affected by the adverse fiscal implications of city poverty. City poverty has three adverse effects on city finances: (1) poor families pay only minimal amounts in city taxes; (2) poor families need city services specifically for lower-income families, such as health care, housing, and child services; and (3) poor families may create negative externalities for middle-class families living in the city: higher crime rates, children not school-ready, and less housing and neighborhood maintenance. While the federal and state governments pay the greatest share of the direct costs of poverty (through welfare and health care grants), city budgets have a significant fiscal responsibility for additional poverty spending because of unfunded state mandates and the city's own decisions to provide poverty services. Summers and Jakubowski (1997) found the city of Philadelphia paid $134 million in 1994 in unreimbursed expenditures for low-income households. That would be approximately $278 million today, or about 7 percent of total city spending. If the city is not sufficiently reimbursed for delivering welfare services, the city needs to raise property tax rates paid by middle-class families and businesses or reduce the level and quality of public services received by all city residents. High rates of poverty make the city a less attractive place to live and to do business. This fact lowers city property values significantly. Home values may fall by as much as 25 percent in response to an increase of one standard deviation in the city's rate of poverty. Cities are often the most efficient providers of redistributive services. 
As cities already provide education services, police services, and other neighborhood services, they are in closer touch with the needs of lower-income families. This fact enhances a city's ability to provide complementary redistributive services, such as family counseling or tutoring. Thus the city government is likely to have a comparative advantage in service provision to lower-income families. To close the circle between efficient financing and efficient provision, we need to use intergovernmental transfers from the state or region to the city for the provision of redistributive services. In practice, these transfers tend to be too low. Cities are
significantly limited in their ability to finance redistributive services to lower-income families from their own tax bases. Significant city tax payments for lower-income services can lead to the exit of middle-class residents and businesses from the city. If fiscal transfers from suburbs to large cities are inadequate, city tax rates can become burdensomely high. If the city government provides inadequate services or provides its services at "too high" a tax rate, it generates incentives for families and firms to move to cities with lower taxes. To help large cities finance their budgets, we can allow municipal and state governments to facilitate the use of transfers between central cities and suburban communities. Sharing the tax burden brings firms and households back into the city, further improving city agglomeration and productivity. Improved productivity of the city economy leads to higher wages for suburban commuters and lower costs for city goods that are "exported" to the suburbs. Residents in both the suburbs and city centers could benefit from this policy action. Excessive suburbanization leads to large differences in incomes and hence tax bases among cities and municipalities in the US. This income inequality often creates large inequalities in the provision of local public goods and services. In principle, we can reduce the inequality by a carefully designed system of state or regional transfers. However, voters in affluent municipalities are typically unwilling to finance these transfers, especially if they perceive that cities are poorly managed. As a consequence, we often observe an inefficient and unequal provision of local public goods and services. This not only undermines the credibility of fiscal decentralization but also implies that the potential benefits of agglomeration are not realized.
9.7 Conclusions

Fiscal transfers play an important role in a decentralized system of government. The largest positive spillover in a metropolitan area arises due to agglomeration externalities. As a consequence, fiscal transfers from state and federal governments to a city can be justified on efficiency grounds, as a mechanism to encourage agglomeration. Fiscal transfers can also be justified based on fairness and equality since large inner cities often serve a large fraction of low- and moderate-income households. Fiscal transfers redistribute resources from wealthy suburbs to central cities and thus allow for a more equitable sharing of the burden of financing welfare services and educating children who have experienced great hardship. Intergovernmental transfers are, therefore, an important aspect of a well-run decentralized government. We have seen that the federal government is primarily responsible for taxation, defense and homeland security, and redistribution. A majority of important public goods and services are actually provided by cities and local municipalities: education, public safety, infrastructure, utilities, welfare, and public housing. Given the federal government's natural advantage in taxation and redistribution, it plays an integral role in financing many important local goods and services. Some of these transfers should come as block grants that provide unconditional transfers to state and local governments. In many cases, matching grants are more desirable since they make sure that state and local governments have some skin in the game.
Finally, some federal grants should be awarded based on competitive bidding or past performance. These grants reward cities and communities that do the right thing and efficiently provide public goods and services. A promising example is the Race to the Top initiative, which provided more than $4 billion in competitive grants. The program was designed to spur and reward innovation and reforms in state and local district K-12 education. It is important that intergovernmental transfers are not primarily used to fill budget holes of inefficient and poorly run cities, but that they reward those cities that work. One glaring weakness of the US political system is that there is no strong political government at the metropolitan level. As a consequence, there is no political mechanism for internalizing the spillovers and externalities that arise in a large metropolitan area. Instead, political power is allocated in the US to the state governments, which are often dominated by suburban and rural voters who tend to be rather selfish and often unwilling to support large cities. It is unlikely that we will see a constitutional reform in our lifetimes. But moving political power from the state capitals to metropolitan area governments would have to be at the top of the political agenda. In the absence of constitutional reform, we need to rely on local, regional, and state initiatives to improve policy coordination. For example, most metropolitan areas in the US have organizations that coordinate regional transportation planning and funding. Some metropolitan areas have gone beyond that. A promising example is the Metropolitan Council of Minneapolis-Saint Paul, which is the regional governmental agency and metropolitan planning organization. The Council is granted regional authority powers in state statutes by the Minnesota legislature. 
These powers are unique in that, unlike regional development commissions, they can supersede decisions and actions of local governments. The Council focuses on providing public transportation, sewage treatment, regional planning, urban planning for municipalities, forecasting population growth, ensuring adequate affordable housing, and maintaining a regional park and trails system. It also provides a framework for regional systems including aviation, transportation, parks and open space, water quality, and water management.
9.8 Technical Appendix: Solving the Model with Spillovers

We would like to understand in more detail why spillover effects can cause significant welfare losses in a decentralized economy. Consider the game described in section 9.3. We are looking for a Nash equilibrium in pure strategies. Taking city 2's contribution as given, city 1 chooses its public good provision level $G_1$ to maximize the following welfare function:

$$U_1 = 2a\sqrt{G_1} + 2b\sqrt{G_1 G_2} - cG_1 \tag{9.6}$$
The first-order condition resulting from this maximization problem is given below:

$$a\frac{1}{\sqrt{G_1}} + b\frac{\sqrt{G_2}}{\sqrt{G_1}} = c \tag{9.7}$$
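Solving for $G_1$ gives us the reaction function:

$$G_1 = \left(\frac{a + b\sqrt{G_2}}{c}\right)^2 \tag{9.8}$$

Since the cities are identical, city 2's reaction function is similar:

$$G_2 = \left(\frac{a + b\sqrt{G_1}}{c}\right)^2 \tag{9.9}$$

From the symmetry of the problem we can calculate the equilibrium provision level $G^N$ by solving the following equation:

$$G^N = \left(\frac{a + b\sqrt{G^N}}{c}\right)^2 \tag{9.10}$$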
Hence, the Nash equilibrium is given by

$$G^N = \left(\frac{a}{c-b}\right)^2 \tag{9.11}$$
The utility for each community in equilibrium is the same and is given by

$$U^N = \frac{a^2}{c-b} + b\left(\frac{a}{c-b}\right)^2 \tag{9.12}$$
The efficient allocation can be calculated by maximizing the sum of the utility functions:

$$\max_{G_1, G_2} \; 2a\sqrt{G_1} + 2a\sqrt{G_2} + 4b\sqrt{G_1 G_2} - c\,(G_1 + G_2) \tag{9.13}$$
The first-order conditions of this problem are

$$a\frac{1}{\sqrt{G_1}} + 2b\frac{\sqrt{G_2}}{\sqrt{G_1}} = c \tag{9.14}$$

$$a\frac{1}{\sqrt{G_2}} + 2b\frac{\sqrt{G_1}}{\sqrt{G_2}} = c \tag{9.15}$$
From the symmetry of the problem we can calculate the efficient provision level $G^E$ by solving the following equation:

$$a\frac{1}{\sqrt{G^E}} + 2b = c \tag{9.16}$$
Hence we have

$$G^E = \left(\frac{a}{c-2b}\right)^2 \tag{9.17}$$
The utility for each community evaluated at the efficient levels of public good provision is given by

$$U^E = \frac{2a^2}{c-2b} - \frac{a^2}{c-2b} = \frac{a^2}{c-2b} \tag{9.18}$$
Note that the provision of public goods is higher in the efficient allocation:

$$G^E = \left(\frac{a}{c-2b}\right)^2 > \left(\frac{a}{c-b}\right)^2 = G^N \tag{9.19}$$
The inequality follows from the fact that c - 2b < c - b for b > 0 (with c > 2b, so that both denominators are positive). Welfare is also higher:

$$U^E = \frac{a^2}{c-2b} = \frac{a^2}{c-b}\cdot\frac{c-b}{c-2b} > \frac{a^2}{c-b} + \frac{a^2 b}{(c-b)^2} = U^N \tag{9.20}$$
In the decentralized case, the individual city does not value the positive spillover effect it brings to the other city. However, in the centralized provision, the social benefits from spillover effects are internalized and hence more public goods are provided. To implement the efficient allocation, consider the following subsidy rates: (9.21) Taking city 2's contribution as given, city 1 chooses its public good provision level $G_1$ to maximize the following welfare function: (9.22) The first-order condition resulting from this maximization problem is given below:

$$a\frac{1}{\sqrt{G_1}} + b\frac{\sqrt{G_2}}{\sqrt{G_1}} = c - b \tag{9.23}$$
It is then straightforward to show that the Nash equilibrium with subsidies is given by

$$G^{N,S} = \left(\frac{a}{c-2b}\right)^2 \tag{9.24}$$
which is equal to the efficient level of public good provision. The model thus shows that a conditional transfer by the state or federal government to each of the two cities can overcome the underprovision problem and implement the efficient allocation of public goods.
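The results of this appendix can also be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: the welfare function in the code is the functional form implied by the first-order conditions (9.7) and (9.14), the subsidy is modeled as a per-unit reduction of the marginal cost from c to c - b as in (9.23), the best-response iteration is one convenient way to find the fixed point (it converges when c > 2b), and all function names are hypothetical.

```python
import math


def utility(g_own, g_other, a, b, c):
    """Welfare of one city, evaluated at the resource cost c (assumed form)."""
    return 2 * a * math.sqrt(g_own) + 2 * b * math.sqrt(g_own * g_other) - c * g_own


def best_response(g_other, a, b, c, subsidy=0.0):
    """Reaction function: solves the first-order condition for the own level.

    A per-unit subsidy lowers the marginal cost from c to c - subsidy.
    """
    return ((a + b * math.sqrt(g_other)) / (c - subsidy)) ** 2


def nash_level(a, b, c, subsidy=0.0, iterations=200):
    """Symmetric Nash level found by iterating the best-response function."""
    g = 1.0
    for _ in range(iterations):
        g = best_response(g, a, b, c, subsidy)
    return g


def closed_form_nash(a, b, c):
    """Equation (9.11)."""
    return (a / (c - b)) ** 2


def closed_form_efficient(a, b, c):
    """Equation (9.17)."""
    return (a / (c - 2 * b)) ** 2
```

For example, with a = 1, b = 0.25, and c = 1, `nash_level(1, 0.25, 1)` approaches the closed-form Nash level (9.11), while `nash_level(1, 0.25, 1, subsidy=0.25)` approaches the efficient level (9.17). Evaluating `utility` at the two levels reproduces the welfare ranking in (9.20).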
9.9 Debate: School Finance Equalization Laws

Pick a state and discuss the school finance equalization laws in that state. The pro side should defend the status quo. The con side should argue for a different system. The following questions may help structure the debate:

1. What types of communities are net payers? What types of communities are net receivers?
2. How large are the fiscal transfers for different communities?
3. How large are differences in educational outcomes (measured by test scores)?
4. What do we know about the effectiveness of additional spending to increase test scores?
5. What would you use additional funds for?
6. Is there evidence of waste in the school districts that you have studied?
7. How could you make these districts more efficient without increasing spending?
9.10 Problem Sets

1. Why does the US system of fiscal decentralization have a tendency to create inequality in spending per capita among municipalities?
2. How does California's approach to local taxation differ from those of most other states in the United States?
3. Explain the concept of a fiscal spillover. Provide an example to illustrate the problem associated with local spillovers.
4. Explain the role that federal grants to state and local governments play in a decentralized economy.
5. How would you design a more efficient system of intergovernmental grants?
6. Explain why block grants may crowd out local spending.
7. Explain the empirical approach used by Knight (2002) to estimate the impact of block grants on spending.
8. Why does Knight use an instrumental variable estimator?
9. What are some problems with the simple ordinary least squares estimator in Knight's application?
10. Explain why most cities are not fully reimbursed for the burden of housing and educating a large fraction of low- and moderate-income households.
10
Rent-Seeking Behavior
10.1 Motivation

Local governments do not always work in the best interest of their citizens. Elected officials, politicians, and bureaucrats are self-interested and sometimes abuse the power of the government to enhance their own interests at the expense of the welfare of the city's residents. Governments impose many restrictions on economic activities. For example, you need to have a liquor license to sell alcoholic beverages in many cities. You can give it away for free without a license but you cannot sell it! These types of restrictions give rise to rents in a variety of forms. In this example, it can be rather lucrative to operate a bar, especially near college campuses. Hence firms often compete for these rents. Sometimes such competition is perfectly legal: anybody can try to get a liquor license. In other instances, however, rent seeking takes other forms, such as bribery and corruption (Krueger, 1974). Why don't you try to get the contract to haul trash in your favorite city? Lobbying politicians to obtain favors for you or your clients is another form of rent-seeking behavior. Rent seeking is an activity in which a party takes a costly action with the purpose of redistributing resources from others to himself. Rent-seeking behavior is an attempt to derive economic rents by manipulating the political environment. Note that this behavior generates no real value. It typically redistributes resources. If anything, it creates an environment that hampers economic development and thus lowers welfare. Another example of rent-seeking behavior is creating and sustaining artificial monopolies in an economy. Taxi medallions used to create local transportation monopolies before ride-sharing services such as Uber or Lyft existed. Government agents may not be passive. Instead, they sometimes solicit bribes from individuals or firms. Firms that seek monopoly power behave somewhat rationally since they gain from having access to special economic privileges. 
Of course, they use these privileges to exploit the consumer once they have established a protected monopoly. Rent-seeking behavior often diverts resources away from more productive employment. It is easier to exploit a local resource than to run a successful
FIGURE 10.1. World map of Perception of Corruption index. (Transparency International/WikiMedia Commons)
company that has to innovate and compete against serious rivals. In that sense, rent seeking is not a zero-sum game. It lowers economic welfare. Figure 10.1 illustrates differences in perceptions of corruption as computed by Transparency International. These perception indexes are based on surveys of managers who conduct business in the relevant countries. In this chapter, we explore the impact of rent-seeking behavior on economic outcomes and welfare. Our main workhorse model, outlined in section 10.2, is an all-pay auction in which firms are bribing a local official to obtain access to a lucrative contract or, more generally speaking, an economic "prize." This model will help us understand how corrupt officials can use the power of the local government to extract resources for themselves. In section 10.3, we review some empirical studies that have attempted to measure the extent of rent-seeking behavior at the local level.
10.2 Modeling Rent-Seeking Behavior

To illustrate the inefficiencies generated by rent-seeking or corrupt behavior, let us consider a situation in which the city has access to a valuable resource. This resource could be a liquor license, an attractive commercial property, or a lucrative contract to provide important services. Let us assume for simplicity that the control of this resource generates a profit equal to P. In a competitive or noncorrupt environment, the city could sell the resource and generate revenues. One would expect that the selling price for this resource would be approximately equal to P. Let us now consider what may happen in a situation in which the resource is controlled by a rent-seeking or corrupt politician. Instead of selling the resource in an open market and generating public revenues, the corrupt politician may decide to give the valuable resource away for free in exchange for some bribes. Just to keep things simple, let us assume that there are two potential businesses competing for the resource that are willing and capable of bribing the politician. Bribing takes place behind closed doors and away from public scrutiny. Politicians then have the advantage of collecting bribes from all parties in return for awarding the resource to the one "winning" party. We can, therefore, view the bribing and resource allocation process as an all-pay auction. Every bidder pays a bribe, but only one bidder wins the prize. Table 10.1 summarizes the notation of the key variables of the model. The outcome of the bribing process is uncertain. We can view the allocation process as a lottery. The corrupt official does not have to allocate the resource to the person who pays the highest bribe. There is often uncertainty about the outcome.

TABLE 10.1. Chapter Notation

Variable    Definition
P           Value of the prize
B           Bribe
Pr{·}       Probability of winning the auction
E(Π)        Expected payoff
FIGURE 10.2. Bribes are payments in all-pay auctions. (pixabay.com/ pexels.com)
It is, however, reasonable to assume that higher bribes will increase the bidder's chances of winning this lottery. To capture this feature of the bribing process, let us assume that the probability of winning for firm 1 is equal to the firm's share of the total bribes:
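$$\Pr(\text{firm 1 wins}) = \frac{B_1}{B_1 + B_2} \tag{10.1}$$

A similar equation holds for firm 2:

$$\Pr(\text{firm 2 wins}) = \frac{B_2}{B_1 + B_2} \tag{10.2}$$

The expected profits of both firms are then given by

$$E(\Pi_1) = \left[\frac{B_1}{B_1 + B_2}\right] P - B_1, \qquad E(\Pi_2) = \left[\frac{B_2}{B_1 + B_2}\right] P - B_2 \tag{10.3}$$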
Note that in a regular auction you only pay your bid if you win the auction-that is, if you are the highest bidder. In an all-pay auction, you always pay your bid, no
matter whether you win or lose. In addition, you might not win the auction even if you are the highest bidder. Each firm then determines an optimal bidding strategy by maximizing expected profits, taking the bid of the other firm as given. You may have recognized that the mathematical structure of this game is similar to the voluntary public good provision game that we studied before. It is just a little bit more sinister. Note that each firm's behavior is optimal from a private perspective, despite the fact that the overall outcome of the rent-seeking activity is typically undesirable from society's perspective. Again, we solve the decision problem of firm 1 for each level of bribe from firm 2 and vice versa. Solving this problem generates the best-response function of each firm. The best-response function of firm 1 is the optimal bid of firm 1 given any bid of firm 2. A Nash equilibrium of this game is given by the intersection of the two best-response functions. It has the property that, given the other firm's behavior, each firm behaves optimally. In the technical appendix at the end of the chapter, we show that, in equilibrium, bribes of both firms are equal and given by

$$B_1 = B_2 = \frac{P}{4} \tag{10.4}$$
Thus each firm offers and pays a bribe that is equal to a quarter of the total profits that can be made from the valuable resource. The corrupt official thus obtains total bribes equal to B1 + B2 = P / 2, or half the value of the profits, quite a lucrative business! The incentives of engaging in corruption can thus be large. It is, therefore, not surprising that rent-seeking behavior is quite prevalent, especially in cities and countries with weak political institutions and a lack of accountability. Our model offers two additional insights. First, as we increase the number of firms that participate in the bribing process, each firm will bid less aggressively since the probability of winning goes down as more firms join the bidding. However, the total amount of bribes goes up. It is, therefore, in the interest of the local politician to encourage firms to participate in the bidding process. The more bidders, the better! Second, if the corrupt politician is shrewd, he may actually want to offer the prize to the highest bidder. In that case, our equilibrium concept becomes more complicated as all firms will randomize among possible bribes. This version of the all-pay auction generates even higher revenues for the corrupt official. See the technical appendix for more details.
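The equilibrium bribes can be verified with a short computation. The sketch below is an illustration with hypothetical function names. The best response is obtained from the first-order condition of the expected profit in (10.3); the formula for more than two firms, (n - 1)P/n² per firm, is the standard symmetric solution of this lottery contest and is an assumption in the sense that it is not derived in this section.

```python
import math


def expected_profit(own_bribe, other_bribes_total, prize):
    """Expected profit as in (10.3); assumes total bribes are positive."""
    win_probability = own_bribe / (own_bribe + other_bribes_total)
    return win_probability * prize - own_bribe


def best_response(other_bribes_total, prize):
    """Bribe that maximizes expected profit, given the rivals' total bribes.

    From the first-order condition
        prize * other / (own + other)**2 = 1
    we get own = sqrt(prize * other) - other, truncated at zero.
    """
    return max(0.0, math.sqrt(prize * other_bribes_total) - other_bribes_total)


def symmetric_bribe(prize, n_firms=2):
    """Symmetric equilibrium bribe per firm (standard n-firm formula)."""
    return (n_firms - 1) * prize / n_firms ** 2


def total_bribes(prize, n_firms=2):
    """Total revenue collected by the corrupt official."""
    return n_firms * symmetric_bribe(prize, n_firms)
```

With a prize of 100 and two firms, `symmetric_bribe(100)` is 25 and `total_bribes(100)` is 50, in line with (10.4). With four firms, each firm's bribe falls to 18.75 while the total rises to 75, which illustrates the first of the two insights above.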
10.3 Empirical Evidence

10.3.1 State Bond Ratings

It is hard to provide direct evidence of the negative consequences of rent-seeking behavior or similar acts of corruption. We encounter a slew of measurement problems when trying to conduct empirical analyses. One interesting approach is that taken by Depken and Lafountain (2006), who use state bond ratings to investigate potential public corruption effects in the US. You may ask yourself why debt
would be more costly to obtain in the bond market for corrupt governments than for honest governments. Consider a local government that uses debt to finance a public goods project, such as a new recreation center. Suppose the public official in charge of awarding the project accepts kickbacks from a firm. This bribe effectively is part of the cost of building the recreation center. In addition, the official may award the contract to the firm with the highest bribe, not the firm with the lowest costs. As a consequence, the costs of building the center are higher than in the case without corruption. Finally, a corrupt government may not choose projects to maximize welfare but may be more interested in its own income. As a consequence, it provides lower-quality public services for the same amount of spending. The state or city is in worse financial shape, which increases the chance of default, and hence investors will need to be compensated via a higher interest rate for the additional risk. We study municipal bond markets in more detail in chapter 16. Depken and Lafountain use the number of federal public corruption convictions per 100,000 state residents in a given year to measure public corruption. After controlling for various economic influences on bond ratings, the paper finds that the number of public corruption convictions per 100,000 residents is inversely related to the average state bond rating. They estimate that the overall decline in bond ratings by one standard deviation of corruption translates into an increase in the cost of debt service of $18 per $1 million of debt. While this number is not large, it should be taken into consideration that state bonds have been historically considered very safe investments with an extremely low default risk. Consider a state with a debt load of approximately $11 billion. Thus an increase of one standard deviation in corruption increases annual interest payments by roughly $200,000. 
Maybe that's not much, but then again that is hardly the total cost of corruption.
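The back-of-the-envelope calculation behind the $200,000 figure can be written out as a short sketch. The function name is hypothetical; the inputs are the $18 per $1 million estimate and the $11 billion debt load quoted above.

```python
def added_debt_service(debt_dollars, cost_per_million=18):
    """Extra annual debt service for a one-standard-deviation rise in corruption.

    cost_per_million is the estimated increase in debt service, in
    dollars, per $1 million of outstanding debt.
    """
    return debt_dollars / 1_000_000 * cost_per_million
```

For a debt load of $11 billion, `added_debt_service(11_000_000_000)` gives 198,000 dollars per year, which the text rounds to roughly $200,000.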
10.3.2 Corruption and Accountability: Evidence from Brazil

More progress can be made if one has access to direct measures of corruption. Ferraz and Finan (2011) study electoral accountability and its impact on corruption in Brazil, which has a long tradition of decentralization. Local governments receive, on average, $35 billion per year from the federal government to provide a significant share of public services in education, health, transportation, and local infrastructure. Moreover, the mayor, in conjunction with local legislators, typically decides how to spend much of the resources. The overall process in Brazil is not that different from that in the US. The mayor of a city proposes a detailed budget, which itemizes spending on all programs. We discuss the details of the budgeting process in chapter 14. Once the mayor submits the budget, the local legislature analyzes the budget proposal and votes on approving it. Once the budget is approved, the mayor implements the fiscal policy. Brazil introduced several institutional changes that facilitate a test of whether electoral accountability impacts political corruption. In particular, Brazil passed a constitutional amendment in 1997, which introduced reelection incentives that enabled mayors to run for a second consecutive term. Mayoral elections tend to be competitive in Brazil. Approximately 73 percent of all mayors run for reelection. Only 40 percent of mayors have been reelected since the passage of legislation permitting a second consecutive term. Since reelection
TABLE 10.2. The Effects of Reelection Incentives on Corruption

                               1        2        3        4        5        6
Mayor in First Term         -0.019   -0.020   -0.020   -0.024   -0.026   -0.027
                            (0.009)  (0.010)  (0.010)  (0.011)  (0.011)  (0.011)
R-Squared                    0.01     0.08     0.10     0.12     0.14     0.20
Number of Observations       476      476      476      476      476      476
Mayor Characteristics        No       Yes      Yes      Yes      Yes      Yes
Municipal Characteristics    No       No       Yes      Yes      Yes      Yes
Institutions                 No       No       No       Yes      Yes      Yes
Lottery Intercepts           No       No       No       No       Yes      Yes
State Intercepts             No       No       No       No       No       Yes
is not guaranteed, mayors who must seek reelection have a strong incentive to perform well while in office. The federal government also created an ambitious anticorruption program in 2003. The objective of this program was to audit municipalities on their use of federal funds. These audit reports provide objective measures of corruption at the municipal level for the 2001-2004 electoral term. The audits in Brazil were designed to be random so that mayors were effectively facing a lottery in terms of whether any corruption might be found out or not. These data combined with the 1997 constitutional amendment allow Ferraz and Finan to compare the corruption levels between municipalities with first-term mayors and those with second-term mayors. We have seen previously that mayors and governors who can run for reelection tend to work harder and perform better in office. For the same reason, they should also be less corrupt. This suggests that we should run the following regression to test our hypothesis that political accountability decreases corruption:
\[ C_i = \alpha + \beta I_i + X_i'\gamma + Z_i'\theta + \varepsilon_i \tag{10.5} \]
Note that $C_i$ is the level of corruption for municipality $i$ that is determined by the federal audit. The variable $I_i$ indicates whether the mayor is in his first term and hence captures the reelection incentives. Our main hypothesis is that the coefficient on $I_i$, denoted $\beta$, is negative, i.e., $\beta < 0$. To control for additional observed heterogeneity, we include a vector $X_i$, which captures a set of municipal characteristics, and a vector $Z_i$, which measures mayors' characteristics. The term $\varepsilon_i$ denotes the error term of the regression. Table 10.2 is based on Ferraz and Finan (2011) and summarizes some of the main empirical findings. The dependent variable is the share of audited resources that involve corruption. In particular, the authors find significantly less corruption in municipalities where mayors are up for reelection, confirming our initial hypothesis. First-term mayors are associated with a 1.9 percentage point decrease in corruption. The average corruption level of second-term mayors is 0.074. Mayors with reelection incentives misappropriate 27 percent fewer resources than mayors without reelection incentives. Summarizing, this study suggests that imposing electoral rules that enhance political accountability can play a crucial role in constraining a politician's corrupt behavior.
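To see how a percentage-point estimate translates into a relative effect, one can divide each coefficient in Table 10.2 by the average corruption level of second-term mayors (0.074). The sketch below does only that arithmetic; the names are illustrative. Under this reading, the 27 percent figure quoted above appears to line up with an estimate of -0.020 rather than -0.019, although the text does not state which column it is based on.

```python
# Point estimates on "Mayor in First Term" from Table 10.2, columns 1-6.
coefficients = [-0.019, -0.020, -0.020, -0.024, -0.026, -0.027]
SECOND_TERM_MEAN = 0.074  # average corruption share of second-term mayors (from the text)


def relative_effect(coefficient, baseline=SECOND_TERM_MEAN):
    """Coefficient expressed as a fraction of the baseline corruption level."""
    return coefficient / baseline


for column, beta in enumerate(coefficients, start=1):
    print(f"column {column}: {100 * relative_effect(beta):.1f} percent")
```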
10.4 A Case Study: Procurement Auctions in Puerto Rico
In the fall of 2017 Puerto Rico was hit with a powerful hurricane named Maria that devastated the local power grid. After Maria, the island's power authority, known as PREPA, did not follow the path of most other public utilities after a disaster. It did not activate mutual aid agreements to bring in utility workers and equipment from the outside to restore power. In Florida, for example, 18,000 workers were called in from out of state to help restore power after Hurricane Irma. Instead, PREPA gave contracts to rebuild the power grid to a number of private contractors that had been asked to submit bids. One contract, valued at $300 million, was awarded to Whitefish Energy, a small company located in Montana. The New York Times reported on October 24, 2017, that "in an interview on Oct. 10, Mr. Techmanski, the owner of Whitefish Energy, said he got the job because he was the first to show up on the island-on Sept. 26, six days after the storm hit-and because he didn't ask for any payment in advance." The size of the contract and the scale of the job awarded to a company that had just two full-time employees soon raised questions about how Puerto Rico awarded contracts in the aftermath of Hurricane Maria. The contract was also scrutinized because the town of Whitefish, Montana, is also home to the US secretary of the interior, Ryan Zinke. Interior officials, as well as Mr. Techmanski, said
FIGURE 10.3. Victim of a natural disaster. (Denniz Futalan/pexels.com)
the secretary had not helped Whitefish Energy obtain the contract or taken any action on the company's behalf. The contract was canceled on October 26 after further inspection from a growing number of government committees. Following the cancellation, congressional questioning continued to search for the root of the initial contracting decision. Despite its failure to discover the cause of this suspicious contract, Puerto Rico's federally mandated oversight board appointed an emergency manager for PREPA to prevent any future tainted deals. On October 30, 2017, the Federal Bureau of Investigation announced an investigation into this case. Agents from the FBI's San Juan field office started to look into circumstances surrounding the deal that the public power monopoly signed with Whitefish Energy Holdings LLC. To date, no charges have been filed. In preparation for additional storms, Congress is being asked to supply even more funds to ensure a robust rebuilding of Puerto Rico's infrastructure. However, the attention given to the Whitefish deal brought many concerns to American taxpayers regarding the use of their dollars in Puerto Rico.
10.5 Conclusions
Corruption and rent-seeking behavior pose some real challenges for city management. City governments in the US have a long history of patronage and machine politics that tend to provide conditions favoring corruption. As we have seen, elections serve the important function of removing unqualified politicians from office. Strong democratic institutions are undoubtedly the best protection against corruption and rent-seeking behavior. Elections may not serve this function if the democratic process is compromised. Electoral competition fails if the city is run by a small political elite that controls the election process and is thus not accountable to the voters. Rent-seeking activities are not merely zero-sum games that redistribute resources to corrupt officials. They also distort resource allocations and create real inefficiencies. This point is also emphasized by Bertrand, Djankov, Hanna, and Mullainathan (2007), who study the allocation of driver's licenses in India by randomly assigning applicants to one of three groups. The first group is offered a bonus for obtaining a license quickly. The second group is offered free driving lessons. The third group is a comparison group. They find that the participants in the bonus and lesson groups are more likely to obtain licenses. However, bonus group members are more likely to make illegal payments and obtain licenses without knowing how to drive. Fiscal decentralization and competition can help reduce the negative impact of corruption. Individuals can vote not only at the ballot box but also with their feet. Voters can benchmark the performance of elected officials against the performance of officials in similar cities and states. By comparing the performance of elected politicians and appointed bureaucrats, we can learn about the qualifications and performance of those in office.
Finally, a strong commitment to law enforcement and an independent judiciary are essential to deal with those who think they can get away with breaking the law. It would, however, be naive to think that corruption and rent-seeking behavior are not a problem in most countries. The temptation for politicians and bureaucrats to enrich themselves is strong and should not be underestimated. Empirical evidence on the magnitude of corruption is harder to come by. One compelling measure of corruption at the country level relies on surveys of business owners and managers who have professional interests in foreign countries. By surveying foreigners conducting business in other countries, we can obtain some insights into the extent of corruption and rent-seeking behavior. Transparency International, a watchdog organization, provides corruption perception indexes for many countries. The evidence suggests that many developing countries are clearly plagued by corrupt, inefficient, and often semidictatorial regimes. But the problems are not confined to these countries. Mancur Olson (1982) traces the historic consequences of rent seeking in The Rise and Decline of Nations. He conjectures that as a country becomes increasingly dominated by organized special interest groups, it loses economic vitality and tends to fall into decline. Countries that experience a collapse of the political regime and of the interest groups that have coalesced around it can radically improve productivity and increase income as they start with a clean slate in the aftermath of the collapse. Examples include Germany and Japan after World War II, or the People's Republic of China after the death of Mao. New coalitions form over time, however, once again shackling society in order to redistribute wealth and income to special interest groups and ruling elites.
10.6 Technical Appendix: Computing the Equilibrium of the All-Pay Auction
Here we consider two versions of the all-pay auction. Our discussion closely follows the exposition in Hindriks and Myles (2006), which should be consulted for details. First, let us derive the equilibrium for the two-player game studied above. Taking the derivative of the objective function with respect to $B_1$ yields
\[ [B_1 + B_2]^{-1} P - B_1 [B_1 + B_2]^{-2} P = 1 \tag{10.6} \]
Now consider a symmetric equilibrium with
\[ B_1 = B_2 = B^* \tag{10.7} \]
Substituting the symmetry condition into the first-order condition above, we obtain
\[ [2B^*]^{-1} P - B^* [2B^*]^{-2} P = 1 \tag{10.8} \]
Hence, the unique Nash equilibrium satisfies
\[ B^* = \frac{P}{4} \tag{10.9} \]
Next, consider the same game with $n$ identical players. Again, we are looking for a symmetric, unique equilibrium in pure strategies. Consider player $i$ and let $B_{-i}$ denote the bribe of one of player $i$'s competitors. Hence $(n-1)B_{-i}$ denotes the total amount of bribes of all competitors. The expected payoffs of player $i$ are
\[ E(\Pi_i) = \frac{B_i}{B_i + (n-1)B_{-i}} P - B_i \tag{10.10} \]
Taking the derivative of the objective function with respect to $B_i$ yields
\[ [B_i + (n-1)B_{-i}]^{-1} P - B_i [B_i + (n-1)B_{-i}]^{-2} P = 1 \tag{10.11} \]
Now consider a symmetric equilibrium with
\[ B_i = B_{-i} = B^* \tag{10.12} \]
Substituting the symmetry condition into the first-order condition, we obtain
\[ [nB^*]^{-1} P - B^* [nB^*]^{-2} P = 1 \tag{10.13} \]
Hence the optimal strategy is given by
\[ B^* = \frac{n-1}{n^2} P \tag{10.14} \]
and the total bribe is
\[ nB^* = \frac{n-1}{n} P < P \tag{10.15} \]
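The symmetric equilibrium can be checked numerically. The sketch below assumes the payoff $\frac{B_i}{B_i + (n-1)B_{-i}} P - B_i$, in which a player wins with probability equal to her share of total bribes, together with the closed form $B^* = (n-1)P/n^2$. It uses a simple grid search (an illustrative choice, not part of the original derivation) to confirm that $B^*$ is a best response when all rivals play $B^*$, and it prints the total bribe $nB^*$ for several values of $n$.

```python
def expected_payoff(own_bribe, rival_bribe, n, prize):
    """Expected payoff of one player when each of the n - 1 rivals pays rival_bribe."""
    total = own_bribe + (n - 1) * rival_bribe
    return own_bribe / total * prize - own_bribe


def symmetric_bribe(n, prize):
    """Closed-form symmetric equilibrium bribe, B* = (n - 1) P / n^2."""
    return (n - 1) * prize / n ** 2


def best_response(rival_bribe, n, prize, grid_size=100_000):
    """Grid search for the payoff-maximizing bribe on (0, prize]."""
    candidates = (prize * k / grid_size for k in range(1, grid_size + 1))
    return max(candidates, key=lambda b: expected_payoff(b, rival_bribe, n, prize))


PRIZE = 100.0
for n in (2, 3, 10, 100):
    b_star = symmetric_bribe(n, PRIZE)
    response = best_response(b_star, n, PRIZE)
    # Columns: n, closed-form B*, grid best response to B*, total bribe n * B*.
    print(n, round(b_star, 3), round(response, 3), round(n * b_star, 3))
```

The total bribe stays below the prize for every finite $n$ and approaches it as $n$ grows.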
Thus the firms obtain some of the benefits associated with the prize. In a game with a small number of players, the bureaucrat cannot steal the full amount of the prize. However, note that with free entry and $n \to \infty$, the corrupt bureaucrat can extract the full value of the prize.
Suppose we change the economic environment and assume that the prize is awarded to the player with the highest bribe, i.e., to the highest bidder. Let us first consider the two-player game. The expected payoffs for player 1 are then given by
\[ E(\Pi_1) = \begin{cases} P - B_1 & \text{if } B_1 > B_2 \\ \tfrac{1}{2}P - B_1 & \text{if } B_1 = B_2 \\ -B_1 & \text{if } B_1 < B_2 \end{cases} \tag{10.16} \]
The payoffs for player 2 are similar to the ones of player 1. Note that the payoffs are not continuous anymore. They have two discrete jumps! Here the incentives of the payoffs imply that you always want to outbid your opponent by a small amount. As a consequence, we can show that there is no equilibrium of this game in pure strategies. To show this result, suppose the optimal pure strategy for player 1 is $B_1 = B$. If player 2 plays $B_2 = B + \varepsilon$, then player 2 will win the prize for sure, and the payoffs are
\[ E(\Pi_1) = -B \tag{10.17} \]
\[ E(\Pi_2) = P - B - \varepsilon \tag{10.18} \]
Hence this cannot be an equilibrium since player 1 would be better off bidding $B_1 = B + 2\varepsilon$. Each player always wants to leapfrog its competitor. What about bidding the maximum possible amount, $B_1 = P = B_2$? Then the payoffs are
\[ E(\Pi_1) = E(\Pi_2) = \frac{P}{2} - P = -\frac{P}{2} \tag{10.19} \]
That cannot be an equilibrium since both players would be better off bidding 0. Given that there is no equilibrium in pure strategies, it is reasonable to ask whether there is an equilibrium in mixed strategies. In a mixed-strategy equilibrium, we use a randomization device to select an optimal strategy. Recall that in the simple game of Rock-Paper-Scissors you want to play each action with probability 1/3. In most sports, such as football, baseball, and tennis, we use mixed strategies to keep the opponent guessing. Let's apply the concept of mixed strategies to this version of the all-pay auction. Note that we cannot bid less than zero and we will never bid more than $P$. We lose money if we bid more than $P$. Hence we can restrict attention to those strategies that assign positive probabilities to the range $[0, P]$. To construct a mixed-strategy equilibrium, let's use a guess-and-verify approach. Suppose both players use a strategy that assigns equal probability to each value in the interval $[0, P]$:
\[ f(B) = \frac{1}{P} \tag{10.20} \]
We need to verify that there are no incentives to deviate from this strategy. Consider a pure strategy for player 1, i.e., a bribe equal to $B^*$. She wins the bidding if she outbids player 2:
\[ \Pr(B_2 \le B^*) = \int_0^{B^*} \frac{1}{P}\,dx = \frac{B^*}{P} = F(B^*) \tag{10.21} \]
The expected payoff for player 1 is then
\[ \frac{B^*}{P}(P - B^*) - \left(1 - \frac{B^*}{P}\right)B^* = \frac{B^*}{P}P - B^* = 0 \tag{10.22} \]
Player 2 thus uses the randomization device to make sure that player 1 always receives a zero payoff in equilibrium, no matter what value of B* is played. Note that the strategy above is no longer optimal when we add a third player. With three players the probability of winning is 1/3 if all three players play the same strategy. Hence the expected payoff is P /3. The cost of playing the mixed strategy is P /2. Consequently, we need to adjust the strategy to give less weight to higher values of bribes.
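A small Monte Carlo experiment illustrates both claims. The sketch below assumes that rivals draw their bids uniformly on $[0, P]$, that the highest bid wins, and that every bid is paid. The prize value, the number of draws, and all names are illustrative assumptions. Against one uniform rival, any fixed bid should earn roughly zero on average. Against two uniform rivals, the same bid should lose money, which is why the uniform strategy no longer works with three players.

```python
import random


def simulated_payoff(bid, prize, rivals, draws=100_000, seed=0):
    """Average payoff of a fixed bid when each rival bids uniformly on [0, prize].

    The highest bid wins the prize, and the bid is paid whether or not it wins.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        highest_rival = max(rng.uniform(0.0, prize) for _ in range(rivals))
        total += (prize if bid > highest_rival else 0.0) - bid
    return total / draws


PRIZE = 100.0
for bid in (10.0, 50.0, 90.0):
    two_players = simulated_payoff(bid, PRIZE, rivals=1)
    three_players = simulated_payoff(bid, PRIZE, rivals=2)
    print(f"bid {bid:5.1f}: two players {two_players:7.2f}, three players {three_players:7.2f}")
```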
We will construct a symmetric equilibrium for the $n$-player game. Suppose you bid $B^*$. The probability of beating one other player is $F(B^*)$, where $F$ is the distribution function that characterizes the optimal mixed strategy. The probability of beating $(n-1)$ players is $F(B^*)^{n-1}$. The expected payoff must be zero:
\[ P\,F(B^*)^{n-1} - B^* = 0 \tag{10.23} \]
which implies that the distribution that characterizes the mixed strategy in the $n$-player game is given by:
\[ F(B^*) = \left(\frac{B^*}{P}\right)^{\frac{1}{n-1}} \tag{10.24} \]
In the two-player game, the expected bribe is $P/2$ per person. Hence the expected total bribe is $P$. In the $n$-player game the expected bribe per player is $P/n$, and hence the expected total bribe is also $P$. As long as the number of players is at least two, we find that the total expected bribes equal the value of the prize that is to be obtained. From a social perspective, there is nothing to be gained from the existence of the prize. The corrupt official is the only one benefiting from its existence, even if there are only a small number of players who compete for the prize.
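The claim that each player bids $P/n$ on average can be verified directly from the mixed strategy $F(B) = (B/P)^{1/(n-1)}$. The sketch below assumes this distribution and computes the expected bid by integrating the quantile function $P u^{n-1}$ over $u \in [0, 1]$ with a midpoint rule. The integration method and the names are illustrative choices. It also checks that any pure bid earns a zero expected payoff against rivals who use $F$.

```python
def cdf(bid, n, prize):
    """Mixed-strategy bid distribution F(B) = (B / P) ** (1 / (n - 1))."""
    return (bid / prize) ** (1.0 / (n - 1))


def payoff_of_pure_bid(bid, n, prize):
    """Expected payoff of a fixed bid when the n - 1 rivals draw bids from F."""
    return prize * cdf(bid, n, prize) ** (n - 1) - bid


def expected_bid(n, prize, steps=100_000):
    """E[B] as the integral of the quantile function P * u ** (n - 1) over [0, 1]."""
    width = 1.0 / steps
    return sum(prize * ((k + 0.5) * width) ** (n - 1) for k in range(steps)) * width


PRIZE = 100.0
for n in (2, 3, 5, 10):
    per_player = expected_bid(n, PRIZE)
    # Columns: n, expected bid per player, expected total bribes, payoff of bidding 40.
    print(n, round(per_player, 3), round(n * per_player, 3),
          round(payoff_of_pure_bid(40.0, n, PRIZE), 6))
```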
10.7 Debate: Term Limits
The pro side should argue in favor of a binding two-term limit for mayors and governors. The con side should argue against term limits. The following questions may help structure the debate:
1. What is the impact of term limits on politicians who are corrupt?
2. What is the impact of term limits on politicians who are likely to be rent seeking?
3. How do term limits affect politicians who have a centrist or moderate ideology?
4. How do term limits affect politicians who are capable or competent?
5. How do term limits affect incentives of incumbents to moderate while in office?
6. If politicians value being in office, what are the incentive effects of the possibility of seeking reelection?
7. How do term limits affect the incentives to work hard in office?
10.8 Problem Sets
1. Consider the second version of the rent-seeking model discussed in section 10.6 in which the highest bidder wins the government contract but both bidders have to pay the bribes. Explain why there is no equilibrium in pure strategies.
2. Explain the research design used by Ferraz and Finan (2011) in their study of corruption in Brazil. Discuss one potential problem associated with this approach.
3. Three firms have applied for the franchise to operate cable TV in a city. The annual cost of operating the system is 250. The (inverse) demand curve is given by P = 500 - Q, where P is the price and Q is the number of cable subscribers. The firm that wins the right for the franchise can operate as a monopolist for one year.
a) Compute the value of the franchise denoted by V, i.e., compute the monopoly profits of operating the franchise.
b) The city government awards the license to the applicant that spends the most money lobbying the city government. Show that there is no equilibrium of the bidding game in pure strategies.
c) Let B denote the bid of a player. Assume that the bid distribution is symmetric among players, i.e., all players use the same bid distribution in equilibrium. Show that the mixed-strategy equilibrium must satisfy
\[ F(B) = \left(\frac{B}{V}\right)^{1/2} \]
(Hint: If the equilibrium strategies of players 2 and 3 are given by the equation above, any bid B of player 1 must yield an expected utility of zero, i.e., player 1 must be indifferent among any pure strategies.)
d) What is the expected bid of any player in the mixed-strategy equilibrium? What is the expected revenue of the city government?
4. Consider an all-pay auction with a rent given as R. Each of the two players (i = 1, 2) spends resources competing for the rent. Let $x_i$ denote the amount of resources spent by player i. Let $p_1$ denote the probability that player 1 will win the rent, where $p_1$ is given by
with $\varphi \ge 1$.
a) Derive the expected payoffs and the best-response function for both players.
b) Derive the symmetric equilibrium of the game.
c) Which player is more likely to win in equilibrium and why?
d) Compare the total equilibrium spending with $\varphi > 1$ and $\varphi = 1$. Should we expect more or less spending when players are identical? Why or why not?
5. Suppose you are competing for a government contract with one rival in an all-pay auction. Assume for simplicity that the highest bidder will win the auction for sure. Your valuation of the contract is equal to $2,000. You do not know your rival's valuation. It is equally likely to be between $0 and $5,000. Suppose you expect that your rival will submit a bid that is half of the valuation.
a) What is the distribution of possible bids that you face?
b) Suppose you submit a bid of B. Compute the probability of winning the bidding as a function of B.
c) What is your expected profit of bidding B?
d) What is the bid that is maximizing your expected profit? Explain your findings.
6. Two firms are bidding for the right to operate a liquor store franchise in a state. The annual total costs of operating the system are 225 for firm 1 and 125 for firm 2. The (inverse) demand curve is given by P = 50 - Q, where P is the price and Q is the amount of liquor sold. The firm that wins the right for the franchise can operate as a monopolist for one year.
a) What are the potential profits for firm 1 and firm 2 if they obtain the franchise?
b) Both firms are bidding in a regular second-price auction for the right to operate the franchise (i.e., the winner pays the price that is equal to the second-highest bid; the loser pays nothing). Assume that potential profits are common knowledge. What is the highest bid that firm 1 must submit so that it will win the license for sure?
c) Suppose there is a third firm that has operating costs equal to 275. Should the city encourage this firm to enter into the bidding process? Why or why not?
d) Suppose now that the third firm has operating costs equal to 175. Should the city encourage this firm to enter into the bidding process? Why or why not?
7. Explain why corruption can be considered to be a sign of market failure. What are some government policies that can be implemented to reduce corruption? Why are societies often not successful in combating corruption?
11
Labor Relations and Collective Bargaining
11.1 Motivation
Labor costs account for a large share of a city's annual expenditures. The key to sound fiscal policy begins with sound labor management policies. Let's look at some empirical evidence. Figure 11.1 shows that expenditures on wages and benefits varied between $2,500 and $5,000 per capita in large US cities, on average, during the period between 2005 and 2014. That accounts for 30-50 percent of all current expenditures. As a consequence, good labor relations and negotiating sustainable collective bargaining agreements are key to good city management. The Bureau of Labor Statistics estimated that private industry employers spent an average of $27.64 per hour worked for employee compensation in June 2010. Wages and salaries averaged $19.53 per hour worked and accounted for 70.6 percent of these costs, while benefits averaged $8.11 and accounted for the remaining 29.4 percent. It is useful to compare the compensation of private workers with the compensation of public workers. Total compensation costs for state and local government workers averaged $39.74 per hour worked in June 2010. Of that total cost, 65.7 percent is due to wages and 34.3 percent is due to benefits. Health benefits in the private (public) sector were 8.3 (11.4) percent of the salary. Retirement and savings benefits were 4.4 (8.0) percent. State and local government workers got a significantly larger fraction of their total compensation in the form of benefits in 2010. We start our analysis of labor relations and collective bargaining by reviewing the federal and state laws that define employer and union rights and obligations. Federal law gives states much latitude to define these rights for state and municipal workers. Not surprisingly, there is much heterogeneity among states. To gain some additional insights into labor disputes, we turn to a bargaining model that was developed by Stahl (1972) and Rubinstein (1982).
We then study more practical aspects of collective bargaining and labor relations. Here we focus on wage negotiations, work rules, and pension funding. We review some relevant empirical evidence and discuss some possible reform options.
FIGURE 11.1. Wages and benefits in large US cities, in dollars per capita (A) and share of total expenditures (B); mean annual wages and benefits, 2005-2014. Cities shown: Austin, Charlotte, Chicago, Columbus, Dallas, Detroit, Ft. Worth, Houston, Indianapolis, Jacksonville, Los Angeles, Louisville, New York, Philadelphia, Phoenix, San Antonio, San Diego, San Francisco, and San Jose. (Fiscally Standardized Cities database/Lincoln Institute of Land Policy)
11.2 Employer and Union Rights and Obligations
Many municipal and state employees are unionized in the United States. There are separate unions for police officers, firefighters, teachers, blue collar workers, and white collar employees. A labor union is an organized association of workers formed to act in their interests and to protect and expand their rights. It accomplishes this through collective bargaining, a process of negotiation between an employer and a labor union to ultimately agree upon a regulation of workers' salaries, working conditions, benefits, and other aspects of compensation. Employer obligations and union rights for private sector workers are primarily determined by the National Labor Relations Act (NLRA).¹ States and some local authorities regulate the collective bargaining for state and local public sector workers. As a consequence, state and local rules significantly differ among regions and cities. For example, some states, such as North Carolina, South Carolina, Tennessee, and Virginia, have made it illegal for police officers and firefighters to engage in collective bargaining. Most states, however, give state and municipal employees the option to unionize. While there are many differences among states, there are also similarities to the rules laid out for private sector workers in the NLRA. Therefore, it is useful to review the common rules that apply to private sector workers outlined in the NLRA. Other important differences between private and public sector rules are discussed at the end of this section. The NLRA forbids employers from interfering with employees in the exercise of rights relating to organizing. As a consequence, workers can form a labor organization for collective bargaining purposes. Employees must contribute dues to finance the union. These payments are subject to federal and state laws as well as court rulings.
The NLRA allows employers and unions to enter into union security agreements, which require all employees in a firm to become union members. These agreements benefit the union since it can collect payments from employees who may not support the union. Employees can object to full union membership. In that case, they pay only the share of dues used directly in collective bargaining and contract administration. These individuals are known as objectors. They do not receive full benefits as other members do, but are still covered by the union contract. Most recently, the US Supreme Court has questioned the legality of these types of union security agreements for public employees. This ruling may affect the ability of some public sector unions to finance themselves. As of 2012, twenty-four states had banned union security agreements and passed so-called right-to-work laws. Each employee can then decide whether or not to join the union and pay the respective dues. The primary purpose of a union is to serve as a bargaining representative of the workers. The law requires that the employer and the union representatives bargain in "good faith" about wages, hours, vacation time, insurance, safety practices, and other mandatory subjects. You may ask yourself, What does "good faith" mean? That is obviously subject to interpretation. There is widespread agreement among legal scholars that it is an unfair labor practice to refuse to bargain. Hence the NLRA imposes a "duty to bargain" on employers. However, parties are not compelled to reach an agreement. Often the parties do not initially have an equal share
¹ For details go to https://www.nlrb.gov.
FIGURE 11.2. Municipal union headquarters. (Photo by author)
of economic power, and one side has little incentive to negotiate. If no agreement can be reached, the employer can eventually declare an impasse. This, in turn, gives the employer the right to implement the last offer presented to the union. If an employer declares an impasse, the process does not come to an end. The union will typically disagree that a true impasse has been reached. In that case, the National Labor Relations Board (NLRB) assesses the negotiations and determines whether a true impasse exists. If the agency finds that an impasse was not reached, the employer will be asked to return to the bargaining table. Note that the NLRB can seek a federal court order to force the employer to bargain, although this rarely happens. The primary way of resolving labor disputes is by bilateral negotiations and bargaining between the employer and a union. If bargaining fails, there is often an option for arbitration or mediation. In that case, a third party is asked to help resolve the differences among the two parties. Arbitration can be either binding or nonbinding. Nonbinding arbitration is similar to mediation in that a decision cannot be imposed on the parties. In the event that bargaining fails, labor disputes usually end up in court. A judge needs to determine whether both sides have negotiated sufficiently long
enough in "good faith." Typically, unions argue that more time is necessary to reach a consensus while employers argue that the time spent is sufficient and that they have the right to unilaterally impose a new contract. Of course, the dispute may not necessarily end with a court verdict. Union employees can initiate strikes in which they cease work throughout the labor dispute. Alternatively, employers can initiate lockouts in which they deny employment during the dispute. Most states in the US have made it illegal for police officers, firefighters, and teachers to go on strike. However, strikes of blue collar and even white collar municipal workers are often allowed. Lockouts hardly ever happen in the public sector and are rare in the private sector as well. Another option for the union is to engage in a slowdown, in which workers deliberately work less and reduce their productivity in order to gain concessions from the employer.
11.3 The Theory of Bargaining and Negotiations
11.3.1 A Bargaining Model
To gain some additional insights into collective bargaining processes, we consider a game-theoretic model that was developed by Stahl (1972) and Rubinstein (1982). In this bargaining game, two players must agree on how to share a pie by making alternating offers. Think about the pie as the surplus that needs to be shared by the city and its municipal employees. The city may want to hand the surplus back to its residents in the form of lower taxes or better services. The union wants higher wages and better work conditions for its members. Without loss of generality, let us normalize the size of the pie to be equal to 1. The model is not about the size of the surplus; it primarily focuses on how the surplus is shared between the two parties. In odd periods ($t = 1, 3, 5, \ldots$), player 1 (the city) proposes a sharing rule $(x, 1 - x)$ that player 2 (the union) can accept or reject. If player 2 accepts any offer, the game ends. In even periods ($t = 2, 4, 6, \ldots$), player 2 proposes a sharing rule $(x, 1 - x)$ that player 1 can accept or reject. Hence we have a model with alternating rights to make an offer. This setup implies that both players are treated more or less the same, although player 1 has a first-mover advantage in this specification. Let us initially consider the game with a finite number of $T$ periods. Players have discount factors equal to $\delta_1$ and $\delta_2$. The discount factor reflects the degree of impatience of a player. We say player 1 is more patient than player 2 if $\delta_1 > \delta_2$. If the players agree on a sharing rule $(x, 1 - x)$ in period $t$, the payoffs are $\delta_1^{t-1} x$ for player 1 and $\delta_2^{t-1}(1 - x)$ for player 2. Figure 11.3 illustrates the structure of this game. To characterize the equilibrium of this game, let us assume for simplicity that $T = 3$. In that case, player 1 is the last player to make an offer since we are in an odd period. To determine the optimal strategy, we need to specify the payoffs for both players when no agreement is reached in the final period. It is convenient to normalize these payoffs to be 0 for both players. Rejecting the last offer of player 1 in $T = 3$ implies that player 2 will get a payoff of 0. Hence this structure of the model implies that player 1 is in a very strong
FIGURE 11.3. The Rubinstein-Stahl bargaining game. (Game tree: in period 1, player 2 accepts or rejects player 1's offer; after a rejection, player 1 accepts or rejects player 2's offer in period 2; and so on until agreement is reached.)
bargaining position if the last period is ever reached. We call this the last-mover advantage. You should convince yourself that player 1 will make a take-it-or-leave-it offer of $(1, 0)$ in the last period and thus obtain the whole pie. We are implicitly assuming that a player will accept an offer if the player is indifferent between accepting and rejecting the offer. If you do not like this assumption, then you would require player 1 to make an offer to player 2 that makes player 2 strictly better off. Hence player 1 would have to offer player 2 a very small positive amount of the pie to break the indifference. Since this amount offered to player 2 in the last period can be arbitrarily small, offering 0 is a reasonable approximation of the optimal strategy here as well. Next, consider period $T - 1 = 2$. It is player 2's turn to make an offer. He knows that if he makes an offer that is rejected, he will get nothing in the last period and player 1 will get everything. Since player 1 is impatient, the whole pie in period 3 is worth $\delta_1$ to player 1 in current period terms. Player 2 will, therefore, propose a sharing rule that makes player 1 indifferent between accepting and rejecting the offer:
\[ (x, 1 - x) = (\delta_1, 1 - \delta_1) \tag{11.1} \]
Finally, consider the first period. It's player 1's turn to make an offer. By the same logic used above, player 1 recognizes that player 2's payoff of $(1 - \delta_1)$ in the following period ($T - 1 = 2$) is only worth $\delta_2(1 - \delta_1)$ in the current period ($T - 2 = 1$). Player 1 will, therefore, propose the following sharing rule in the first period:
\[ (x, 1 - x) = (1 - \delta_2(1 - \delta_1), \delta_2(1 - \delta_1)) \tag{11.2} \]
We have, therefore, characterized an equilibrium for this game. Note that in this equilibrium, the players agree on a sharing rule in the first period. Hence there will be no delay due to bargaining along the equilibrium path of this game. If you are familiar with game theory, you will recognize that this equilibrium is the unique subgame perfect Nash equilibrium for this game.
Subgame perfection requires that a Nash equilibrium is played at every stage of the game, which rules out empty threats to reject an offer. So what's a Nash equilibrium for this game that does not satisfy subgame perfection? Suppose player 2 threatens to reject any offer in which he does not obtain at least half of the pie. In that case, it is a Nash equilibrium if player 1 just offers to evenly split the pie in the first period. However, this strategy of player 2 relies on an empty threat in the last stage of the game. Threatening to reject any offer in which player 2 does not get at least half the pie is not a Nash equilibrium for the last stage of the game, as we have seen above. The same logic can be applied for any finite time horizon T. The solution technique that we have used is called backward induction since we solve the game starting with the last period and work backward. The equilibrium payoffs depend on the order in which offers are made. There is a first- and a last-mover advantage in this game. The last mover can make a take-it-or-leave-it offer, which gives the player a lot of power. It is also desirable to be the first mover, especially when the players are impatient. The more impatient a player is, the lower is the equilibrium payoff in this game. In the technical
FIGURE 11.4. Municipal workers. (Photo by author)
appendix at the end of this chapter, we consider an infinite horizon version of this game in which there is no last-mover advantage. The benefits of patience are even more pronounced in the infinite horizon game than in the game we considered above. Summarizing our discussion, we find that bargaining outcomes are determined by the patience or, more generally, the bargaining power of each player. In the context of municipal bargaining, the union is likely to be much more patient than the city administration. Politicians need to be reelected every four years. Unions provide money and workers during political campaigns. This support provides a significant advantage to those candidates who receive a union endorsement. A union may also control a significant fraction of local votes given that voter turnout is low in most municipal elections. Union support may be a decisive factor in determining the outcome of close elections (Sieg and Wang, 2013). Moreover, the city government is likely to face significant pressure from disgruntled residents if basic services such as snow removal or trash collection are suspended due to a strike of the local municipal union. The main bargaining advantage of the city government is the ability to unilaterally impose a new contract on the union if time runs out. In that sense, it can make the last "take-it-or-leave-it" offer to the union. Of course, imposing such a unilateral contract will likely result in a strike, if strikes are allowed. In practice, the decision of whether a city government has sufficiently bargained in good faith is in the hands of the state courts. As we will see in a case study below, cities tend
to be in a weak bargaining position unless they are close to bankruptcy or face a serious financial crisis.
11.3.2 Employment and Wage Negotiations

Cities and municipal unions typically negotiate about wages, denoted by $w$, and employment levels, denoted by $L$. We can generalize our bargaining model to account for these more realistic features. To accomplish this task, we need to specify union and city preferences over outcomes. It is safe to assume that the union prefers to obtain higher compensation for union members. Compensation is broadly defined and includes salaries and other benefits. The union also typically cares about employment levels. That is less obvious. Why should union members care about the employment levels? From the employees' perspective, employment levels are closely related to their job protection. In many low-skill city jobs, employment decisions are based on seniority. The last person to get hired is the first person to get fired if the local economy gets hit by a negative shock. If the city follows that rule, the job security of an employee depends on how many employees are below that person in seniority. Older workers tend to have more job security than younger workers. (Is that fair?) Hence a larger workforce implies more job security for each individual union member. Let us consider the following extension of the bargaining game, in which we represent the preferences of the union by a Leontief utility function:
$$V_u(w, L) = \min\{w, \alpha L\} \tag{11.3}$$

The union, therefore, wants to set $w = \alpha L$ in labor negotiations. Convince yourself that the smaller $\alpha$ is, the more the union cares about employment. This is maybe a little counterintuitive, but that's how this specification works. In contrast, the city has preferences that reflect the preferences of its residents. It is fair to assume that residents prefer higher levels of services (up to a point) and hence higher values of $L$:

$$V_c(L) = L \tag{11.4}$$
For simplicity let us assume that there is a fixed budget for labor expenses given by $B$. Hence total labor expenditures, denoted by $wL$, must satisfy the following budget constraint:

$$B = wL \tag{11.5}$$

The budget constraint implies that $L = B/w$. Substituting the budget constraint into the optimality condition for the union gives us the optimality condition for the union wage:

$$w_u = \alpha \frac{B}{w_u} \tag{11.6}$$

Solving this equation implies that

$$w_u = \sqrt{\alpha B} \tag{11.7}$$
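To see how the union's preferred wage and the implied employment level respond to the preference parameter, the following minimal sketch evaluates the formulas above. The budget of 100 and the two values of alpha are hypothetical numbers chosen only for illustration, and the function names are not from the text.

```python
import math


def union_wage(alpha, budget):
    """Union's preferred wage w_u = sqrt(alpha * B), from w = alpha * L and B = w * L."""
    return math.sqrt(alpha * budget)


def employment(wage, budget):
    """Employment level implied by the budget constraint B = w * L."""
    return budget / wage


budget = 100.0  # hypothetical labor budget
for alpha in (0.25, 0.04):  # hypothetical preference parameters
    w_u = union_wage(alpha, budget)
    L_u = employment(w_u, budget)
    # At the union optimum the two arguments of the Leontief function coincide.
    print(alpha, w_u, L_u, alpha * L_u)
```

In this illustration the smaller alpha yields a lower wage and a higher employment level, which is the sense in which a smaller alpha means the union cares more about employment.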
Note that the optimal union wage increases in the size of the budget and increases in $\alpha$. Let us also assume that there exists a minimum or reservation wage, denoted by $w_c$. This wage is the lowest possible wage that the city can pay its workers. You can interpret $w_c$ as the wage that a union member can obtain in the private sector. Hence workers are willing to work for the city as long as the city pays at least this reservation wage. It is reasonable to assume that this reservation wage is well below the optimal union wage, $w_c < w_u$. Convince yourself that the city maximizes employment, given the budget constraint, if it pays workers the reservation wage. In that case, the optimal employment level, from the perspective of the city, is given by $L_c = B/w_c$. The objective of the bargaining is then to negotiate the municipal wage. Note that the municipal wage then determines the employment level because of the budget constraint.

So let's apply the logic of our bargaining game to this scenario. Consider again the last period of the game $T$. Suppose it is the city's right to make the final offer. Convince yourself that the city will make a take-it-or-leave-it wage offer equal to $w_T = w_c$, which implies an employment level of $L_T = L_c$. This is the best possible outcome for the city. Next, consider the second-to-last period, $T - 1$. The union can make a counteroffer. The union knows that the payoffs for the city along the equilibrium path are $V_c = L_c$ in period $T$. As before, note that this payoff is only worth $\delta_c L_c$ in current period utility terms because of discounting and impatience. Hence the union will offer a slightly smaller employment level in period $T - 1$, given by $L_{T-1} =$ […]

[…] 1, revenues are expected to grow over time, with the future period's revenues exceeding the prior period's revenues, with a growth rate of $\alpha_1 - 1$. A special case of this model, called a random walk, has $\alpha_0 = 0$ and $\alpha_1 = 1$.
Then the best predictor of future revenue is current revenue. If you are interested in learning more about statistical forecasting, you should take a course in time series analysis at your university or college.
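The role of the two coefficients can be illustrated with a short forecasting sketch. It assumes a first-order autoregressive rule in which next period's revenue equals alpha0 plus alpha1 times current revenue; that functional form, the function name, and all numbers are assumptions made for this illustration rather than a specification taken from the text.

```python
def forecast_revenue(current, alpha0, alpha1, horizon):
    """Iterate the assumed rule R[t+1] = alpha0 + alpha1 * R[t] for `horizon` periods."""
    path = []
    revenue = current
    for _ in range(horizon):
        revenue = alpha0 + alpha1 * revenue
        path.append(revenue)
    return path


current_revenue = 100.0  # hypothetical current revenue, in millions of dollars

# Random walk: alpha0 = 0 and alpha1 = 1, so every forecast equals current revenue.
random_walk = forecast_revenue(current_revenue, alpha0=0.0, alpha1=1.0, horizon=3)

# alpha1 slightly above 1 (with alpha0 = 0): forecasts grow each period.
growing = forecast_revenue(current_revenue, alpha0=0.0, alpha1=1.02, horizon=3)

print(random_walk)
print(growing)
```

In practice the coefficients would be estimated from historical revenue data; the sketch only shows how a fitted rule is rolled forward.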
14.5.2 Cost and Expenditure Forecasting

The basic issues that we encounter in expenditure and cost forecasting are the same as when we forecast revenues. In addition, there are a couple of other issues that deserve to be mentioned. Understanding the cost structure is essential for good expenditure forecasting. Some expenses are fixed: they will be incurred regardless of the volume of goods or services provided. For example, the cost of repairing a fire station depends on its age rather than how many fires are extinguished. Other costs are variable: the more activity, the greater the expenses. Fuel costs for the fire trucks depend on their usage; the more emergencies they serve, the higher the costs. Other costs are step costs: they increase only at certain volume levels. For example, suppose a city-run day care center requires 1 employee for every 20 children served. Therefore, 6 employees are needed whether 101 or 120 children attend the center. The costs jump by the salary of 1 additional employee if the day care center plans to serve 121 children rather than 120.

Recall that city administrations are organized into departments. Each department has a budget, and the departments need to account for direct and indirect costs. Cost accounting is the process of valuing the overall costs of producing a good or providing a service. Cities need to figure out the full cost of providing each good or service to make good decisions. Direct costs can be traced to a specific cost object. Labor and materials used in a specific park are direct costs for parks and recreation. Indirect costs are not readily identifiable but are necessary for the operation of the organization. An example is the salary of the parks commissioner (who oversees all parks). Indirect costs must be allocated to cost objects to arrive at the full cost of providing the service. Two main concepts for allocating indirect costs are the pool (the total accumulated indirect costs that need to be distributed) and the base (the criteria upon which to allocate the cost pool). Let's consider occupancy costs as an example. The pool consists of total expenditures for rent, utilities, and maintenance. The base can be the percentage of square footage. Let's consider executive compensation as another example. The pool consists of the total salaries of executives. The base is often the personnel headcount in each department.
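Both the step-cost example and the pool-and-base allocation reduce to simple arithmetic. The sketch below is one way to write them down; the function names, department names, square footage, and the occupancy pool of $120,000 are hypothetical values chosen for illustration.

```python
import math


def staff_needed(children, children_per_employee=20):
    """Step cost driver: one employee is required for every 20 children served."""
    return math.ceil(children / children_per_employee)


def allocate_pool(pool, base):
    """Distribute an indirect cost pool across departments in proportion to the base."""
    total_base = sum(base.values())
    return {dept: pool * amount / total_base for dept, amount in base.items()}


# Day care example from the text: staffing jumps only when enrollment passes 120.
print(staff_needed(101), staff_needed(120), staff_needed(121))

# Hypothetical occupancy cost pool allocated by square footage.
occupancy_pool = 120_000.0
square_feet = {"parks": 2_000, "police": 6_000, "library": 4_000}
print(allocate_pool(occupancy_pool, square_feet))
```

The same `allocate_pool` function covers the executive compensation example if the base dictionary holds personnel headcounts instead of square footage.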
14.6 Benefit-Cost Analysis

Capital budgeting should be based on a careful analysis of costs and benefits of alternative uses of funds. A city needs to decide if a project is deserving and should be undertaken. More generally, a city needs practical methods that help the city evaluate different projects. Benefit-cost analysis is widely used in practice. Let's illustrate some of the key concepts and problems encountered in determining the benefits and costs of a project using a simple, stylized example of road construction. Suppose the city needs to build and maintain a new large road. There are three stages: planning, construction, and maintenance. The benefits are largely twofold: time saved by travelers and lives saved due to a lower accident rate. Suppose the city's time horizon is ten years. Table 14.5 summarizes the key tasks.

First, the city needs to hire a consulting or an engineering firm that can provide a feasibility study. Suppose the company works on the study for a year and charges the city a flat fee of $10 million. The feasibility study may conclude that the road cannot be built due to environmental concerns or large resentment and expected protests of affected residents. In that case, the city has wasted $10 million of taxpayers' money. But hopefully the study concludes that the road can be built, and the city can continue with the process.

Next, the city needs to measure the costs of the construction process. That's relatively easy. By state law, the city will be required to allocate the construction contract via a competitive national bidding process. Bids will be submitted by large construction companies. The city typically is required to choose the most competitive bid among the qualified bidders. Suppose that bid is $100 million. It takes a year to build the road. Roads do not last forever. As a consequence, the city will need to budget for repairs. It will also allocate the maintenance contract via a competitive bidding process.
Bids will only be submitted by local construction companies. Let us assume that the cost of the maintenance contract is $5 million per year. How can we then measure the total costs associated with the project? To convert all costs into one number, we use a concept called present discounted value (PDV). In particular, we need to account for the fact that the costs occur at different points in time and that a dollar today is not the same as a dollar tomorrow. A dollar today is worth 1 + r dollars tomorrow because the dollar could earn interest (r) if invested. By the same argument 1 + r dollars tomorrow are only worth 1 dollar
TABLE 14.5. Road Construction: Tasks and Timing

Costs
Planning | Consultant | One time | Year 1
Construction | Competitive bidding | One time | Year 2
Maintenance | Competitive bidding | Annual | Years 3-10

Benefits
Time saved | Traffic simulations | Annual | Years 3-10
Lives saved | Accident statistics | Annual | Years 3-10
today if you need to take out a loan today and pay it back tomorrow with interest. Note that we are assuming here that you can borrow and lend at the same interest rate, which is not always the case. The social discount rate is the appropriate value of r to use. In our example, let's set r = 0.1. Note that discounting measures the opportunity costs. These costs are the value of the resource in its next best use. In current period dollars, the present discounted value of the costs (in millions of dollars) is given by the following equation:

$$PDV_C = 10 + \frac{100}{1.1} + \frac{5}{1.1^2} + \cdots + \frac{5}{1.1^9} \tag{14.3}$$
Hence the PDV of the costs is $125.16 million. Not really cheap! The city should not be surprised if it needs to provide a careful explanation of the benefits associated with this project. What are the benefits of this road? Measuring costs is the easy part. Measuring the benefits is more tricky. Suppose the city can actually come up with some reasonable estimates of time and lives saved by the road. It still needs to answer the following two questions: How do we value time saved? How do we value lives saved? Those are the $125.16 million questions!

Ideally, we would like to use market-based measures to value time. Suppose we can show that the time that individuals save from driving faster is spent at work. Then we can value the time saved using the wage rate of a representative individual. An alternative to using wages as measures of the opportunity costs of time is to use a revealed preference method. Deacon and Sonstelie (1985) analyze how much money individuals save by standing in line to buy price-controlled gasoline. They report an estimate of approximately $20 per hour. Suppose the city estimates that the road will reduce annual travel times by 1 million hours. The estimates are typically derived from a traffic simulation model. Suppose the city values each hour at $20; the annual flow value is then $20 million. Over the first ten years of the project, the present discounted value of time saved is given by

$$PDV_T = 0 + \frac{0}{1.1} + \frac{20}{1.1^2} + \cdots + \frac{20}{1.1^9} \tag{14.4}$$

Hence the PDV of the benefits associated with time saved is equal to $97 million.
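Both figures can be checked by hand with the closed form of a finite geometric series. The worked steps below are a sketch of that arithmetic, using r = 0.1 and rounding intermediate values to two or four decimals:

```latex
\begin{align*}
\sum_{t=2}^{9} \frac{1}{1.1^{t}}
  &= \frac{1.1^{-1} - 1.1^{-9}}{0.1} \approx 4.8499, \\
PDV_C &\approx 10 + \frac{100}{1.1} + 5 \times 4.8499
       \approx 10 + 90.91 + 24.25 = 125.16, \\
PDV_T &\approx 20 \times 4.8499 \approx 97.0 .
\end{align*}
```

The same discount sum of roughly 4.85 applies to any constant annual flow received in years 3 through 10, so only the annual amount changes from one calculation to the next.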
FIGURE 14.1. Road construction. (Photo by author)
The second benefit of the road is that it reduces the number of accidents. For simplicity let us just consider fatal accidents. (For nonfatal accidents we can estimate the medical costs, the costs of repairing the vehicles, and the working time lost and thus can proceed as discussed above.) Saving or extending lives is a central benefit of many interventions. Valuing human lives is the single most difficult issue in benefit-cost analysis. Many would say that human life is priceless. Nevertheless, we make decisions on a daily basis that involve significant risks. Every possible intervention has a chance of saving or extending lives. The value of a statistical life (VSL) is an economic value used to quantify the benefit of avoiding a fatality. How do we estimate the VSL? Consider the case of dangerous pickup trucks produced by General Motors. Some General Motors pickup trucks produced between 1973 and 1987 had a dangerous, side-mounted gas tank. Consumer groups demanded GM recall these trucks. The recall cost $1 billion and saved approximately 32 lives. Using these estimates, the cost per life saved by the recall is $1 billion/32 = $31.25 million. Alternatively, we can study jobs with higher risks of injury and mortality, such as miners, police officers, or firefighters. Compensating differentials are the additional wage payments to workers to compensate
them for the negative amenities of a job, such as increased risk of mortality. This approach suggests the VSL is approximately $9.3 million. Suppose the new road saves, in expectation, one life per year. Over the first ten years of the project, the present discounted value of the lives saved is given by

$$PDV_L = 0 + \frac{0}{1.1} + \frac{9.3}{1.1^2} + \cdots + \frac{9.3}{1.1^9} \tag{14.5}$$
Hence the PDV of the estimated benefits associated with saving lives is $45.10 million. The PDV of all benefits (reduced travel times plus saved lives) is given by

$$PDV_B = PDV_T + PDV_L = 142.1 > 125.16 = PDV_C \tag{14.6}$$
Based on these calculations, we conclude that it may make economic sense for the city to build the road. However, there could be other projects that have larger net benefits, and funds may be severely limited.
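The whole road example can be reproduced with a short script. This is a minimal sketch: the function name `pdv` and the list layout (one entry per year, with the first entry undiscounted) are choices made here for illustration, while the dollar figures and r = 0.1 come from the example above.

```python
def pdv(flows, r):
    """Present discounted value of flows[t], where flows[t] occurs t years from now."""
    return sum(flow / (1.0 + r) ** t for t, flow in enumerate(flows))


r = 0.1  # discount rate used in the example

# Millions of dollars per year over the ten-year horizon.
costs = [10.0, 100.0] + [5.0] * 8      # planning, construction, then maintenance
time_saved = [0.0, 0.0] + [20.0] * 8   # benefits start once the road is open
lives_saved = [0.0, 0.0] + [9.3] * 8   # one statistical life per year at $9.3 million

pdv_costs = pdv(costs, r)
pdv_benefits = pdv(time_saved, r) + pdv(lives_saved, r)
net_benefit = pdv_benefits - pdv_costs

print(round(pdv_costs, 2), round(pdv_benefits, 2), round(net_benefit, 2))
```

Changing `r` or the assumed values of time and life in this script is a quick way to see how sensitive the conclusion is to those inputs.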
14.7 Conclusions

We have seen that each city has an operating budget that delineates the projected revenues and expenditures required to provide the ongoing services and goods for the citizens of a city. The operating budget includes a variety of expenditures, such as education, welfare, housing, parks and recreation, public safety, and administrative services. City expenditures need to be financed by a mix of taxes, charges, fees, and intergovernmental transfers. Most cities use property taxes to generate revenues. In addition, cities use sales taxes, business taxes, and wage taxes. Charges are typically directly linked to specific services provided by a city. Taxes, charges, and fees are also called own revenue. In addition to that, cities receive intergovernmental transfers from state and federal governments. Each city also must have a capital budget that contains projected revenues and expenditures for investments in infrastructure and other long-term projects. Cities often use the bond market to raise the large amounts of revenue that are needed for capital-intensive projects.

The efficient operation of state and local governments requires precise forecasting of revenues and expenditures. If historical data exist, we can use a variety of statistical tools to help make accurate predictions about the future. There are basically two types of forecasting models. The first type tries to identify an underlying causal relationship. The second type is purely based on correlations that seem to be stable over time.

Planning also requires the forecasting of costs and benefits. If resources are scarce, a city will need to prioritize and cannot implement all projects that appear to be desirable. Benefit-cost analysis provides a useful tool to evaluate projects. As a minimum requirement, the benefits associated with a new project should be higher than the associated costs. Predicting costs can be difficult. Predicting benefits is even harder. We do not know how individuals and households will value the benefits associated with different public investments. Nevertheless, there are many cases when we can use revealed or stated preference methods to make some progress and come up with some reasonable estimates of potential benefits.
14.8 Debate: Hosting the Super Bowl

Super Bowl LII, which was decisively won by the mighty Philadelphia Eagles, was hosted by the city of Minneapolis. The debate should focus on the potential costs and benefits of hosting this event. The pro side should argue that hosting the Super Bowl was beneficial to Minneapolis, while the con side should argue the opposite. The following questions may help structure the debate:

1. How many individuals traveled to the city to watch the event?
2. How much did prices for hotel rooms and other local services increase during the event?
3. How did local residents adjust their behavior during the event?
4. What are some additional costs that the city had to incur because of the event?
5. What benefits and costs are hard to measure?
6. What are likely distractions caused by the event?
7. How would you forecast the additional costs associated with hosting a Super Bowl?
8. How would you forecast the additional city revenues that are likely to be generated by hosting a Super Bowl?
14.9 Problem Sets

1. What are the main sources of revenues and expenditures for a typical town in the US?
2. What is the purpose of an operating budget?
3. What is the purpose of a capital budget?
4. Explain the need for forecasting revenues and expenditures.
5. Find the budget of a city and determine how accurate the recent revenue and expense forecasts were.
6. How would you forecast revenues for a new tax, such as a soda tax?
7. Explain the difference between fixed and variable costs, using a concrete example.
8. How do we estimate the value of time?
9. What is the "value of a statistical life"? When do we need to use that concept?
15
Fiscal Policies and Fiscal Crisis
15.1 Motivation

The budget provides a useful summary of a city's planned fiscal policies. In this chapter, we take a look at historical data that provide a characterization of fiscal policies that were implemented in the largest US cities between 1980 and 2010. We can view these fiscal policies as realizations of the budget process that we discussed in chapter 14. What did city governments spend money on? How did large cities finance their expenditures? The objective of this chapter is to try to answer both questions using a retrospective analysis of the fiscal policies of large US cities.

The first question focuses on expenditure policies. Expenditures per capita are convenient measures of government spending. We will see in this chapter that expenditures per capita differ significantly among cities in the United States. Different cities have different needs. For example, cities with high crime levels need to spend more resources on policing. Cities with a large number of poor households need to spend more on welfare and social services. These insights suggest that we need to take a more disaggregated look at different expenditure categories such as education, social services, public safety, or transportation. How much do cities spend on these categories, and what are the relevant expenditure shares? We have seen in the previous chapter that we can classify expenditures as operational expenditures and capital expenditures. Operational expenditures are any expenditures that are incurred in the day-to-day operation of the city. Most of these expenditures come in the form of wages, salaries, and benefits for city employees. In addition to operational expenditures, cities invest in durable assets, such as infrastructure. These expenditures can be classified as capital expenditures, which tell how much cities are investing in the future.
We can characterize differences in operational and capital expenditures within large US cities based on historical data. The second question focuses on revenue policies. We have also seen in the previous chapter that cities generate revenues from four primary sources. First, they can generate revenues from taxing individuals and businesses. The main sources of tax revenue in the US are property taxes, sales taxes, personal income or wage taxes, and corporate income taxes. Second, cities rely on charges and fees for a
variety of the services they provide. Some examples are fees for waste management and trash collection, or parking charges. Third, cities can obtain transfers from both the state and federal governments. Since intergovernmental transfers can be large, expenditures typically exceed own revenues by a large margin. That does not necessarily mean, however, that state and federal governments "subsidize" cities. It may just be a reflection of the fact that cities provide a variety of goods and services for nonresidents and low-income residents. These first two sources of revenue are typically classified as own source revenue since they are directly generated by the city. Finally, some cities operate their own businesses and can generate revenues for business-related activities. One main source of business income comes from public utilities, such as water, electricity, or gas. These utilities are often owned and operated by a city. We will see that there are large differences in revenue policies among large US cities. Poor fiscal policies can lead to fiscal crisis and municipal bankruptcy. Fortunately, fiscal crises of cities happen rarely. But when they do, the consequences can be dramatic. While a fiscal crisis can arise for many different reasons, there are some commonalities. It often takes a combination of mistakes. Most of the mistakes fall into one of the following categories: bad labor policies, bad debt policies, bad tax and redistributive policies, and, finally, bad economic development policies. It is important for city managers and local politicians to stay away from these common mistakes. We discuss them in detail in this chapter.
15.2 Data

Fiscal data for large cities used in this chapter come from the Fiscally Standardized Cities dataset, produced by the Lincoln Institute of Land Policy (LILP). (We have used these data in previous chapters. In this section, we provide a detailed description of this data source.) Because city governments differ widely in how they are structured, these data solve a major problem in the study of local governments. The researchers at the LILP have consolidated revenue and expenditure data from the Census of Governments for a selection of 150 large cities for the period 1977 to 2014. The data from individual city governments have been combined with the share from their corresponding counties, school districts, and special districts to obtain measures that are easily comparable among cities.

The need for this consolidation is apparent in the examination of Baltimore's and Minneapolis's city spending. If we were to solely examine city spending, we would find that Minneapolis spent $2,334 per capita in 2012 while Baltimore spent $5,782 per capita. However, closer examination reveals relative parity in local spending in both cities. Once we consolidate the city with the school district and the county, both cities spent approximately $6,350 per capita in 2012. The difference lies in Minneapolis's allocation of responsibilities among local governance regimes. Both the county and the local school district account for a significant expenditure share in Minneapolis but not in Baltimore.

We aggregate data by decade and consider ten-year averages for the 1980s, 1990s, and 2000s. In doing so, we eliminate any year-to-year variation stemming from business cycle fluctuations. For organizational purposes in distilling the data provided by the LILP, expenditures are grouped by type, using such categories as
operational expenses or benefits, or by purpose, which concerns social objectives like education or health. The cities in the sample were selected if they satisfied one of a few qualifying criteria. First, cities qualified if they had populations above 200,000 in 2010 with the exception of those that had populations below 100,000 in 1980. Second, those with 1980 populations above 150,000 were selected irrespective of their 2010 populations. There were 112 cities that fulfilled these two criteria. To expand the sample to 146 cities, LILP added cities so that at least two from each state were included. Those chosen were the two most populous cities in each state in 2010. The remaining four were the largest state capitals that were not already included in the sample. No cities from Hawaii or New Jersey, however, were included in the sample because these cities' school districts are state administered, which makes it difficult if not impossible to ascribe spending and revenue at the city level.
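The two population criteria can be written as a simple rule. The sketch below is only an illustration of the rule as described above: the function name and the example populations are hypothetical, and the additional steps (two cities per state, the largest remaining state capitals, and the exclusion of Hawaii and New Jersey) are not modeled.

```python
def meets_population_criteria(pop_1980, pop_2010):
    """Return True if a city satisfies either of the two population criteria."""
    # Criterion 1: above 200,000 in 2010, unless the city was below 100,000 in 1980.
    large_in_2010 = pop_2010 > 200_000 and pop_1980 >= 100_000
    # Criterion 2: above 150,000 in 1980, irrespective of the 2010 population.
    large_in_1980 = pop_1980 > 150_000
    return large_in_2010 or large_in_1980


# Hypothetical cities given as (1980 population, 2010 population).
examples = {
    "fast_grower": (80_000, 250_000),     # excluded by the exception in criterion 1
    "steady_city": (120_000, 210_000),    # qualifies under criterion 1
    "shrinking_city": (180_000, 140_000)  # qualifies under criterion 2
}
for name, (pop_1980, pop_2010) in examples.items():
    print(name, meets_population_criteria(pop_1980, pop_2010))
```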
15.3 Expenditure Policies

Economists typically focus on comprehensive expenditure definitions to measure the size of the local government. We report two such measures in table 15.1. Total expenditures per capita are defined as all spending on all categories by the local government. General expenditures include all spending categories except intergovernmental expenditures, utility expenditures, liquor store expenditures, and employee retirement trust expenditures. General expenditures, therefore, arguably provide a good measure of total expenditures on core government functions.

TABLE 15.1. City Expenditures per Capita

Expenditure Categories | 1980s Mean | 1980s Std. dev. | 1980s 90-10 | 1990s Mean | 1990s Std. dev. | 1990s 90-10 | 2000s Mean | 2000s Std. dev. | 2000s 90-10
Total Expenditures | 4294.5 | 1522.9 | 3362.3 | 5041.3 | 1690.7 | 4061.2 | 6076.4 | 2043.2 | 4732.1
General Expenditures | 3541.0 | 1122.0 | 2525.4 | 4281.5 | 1333.5 | 3269.7 | 5144.7 | 1656.1 | 3503.7
Education | 1158.0 | 326.4 | 575.4 | 1447.1 | 354.6 | 766.6 | 1779.6 | 460.9 | 1162.7
Social Services | 385.0 | 399.3 | 812.1 | 484.7 | 577.0 | 1024.4 | 555.9 | 739.8 | 1250.1
Public Safety | 450.5 | 170.0 | 360.6 | 555.3 | 207.5 | 432.7 | 680.7 | 232.1 | 541.1
Environment & Housing | 557.2 | 223.2 | 530.5 | 642.3 | 257.4 | 668.1 | 771.6 | 316.4 | 769.3
Administration | 204.3 | 83.8 | 174.4 | 237.1 | 97.9 | 217.4 | 292.1 | 125.6 | 278.1
Interest | 215.2 | 130.6 | 308.6 | 264.0 | 166.8 | 355.6 | 245.2 | 146.3 | 341.7
Utilities | 652.7 | 814.7 | 1828.4 | 609.3 | 647.9 | 1561.5 | 716.6 | 744.5 | 1767.4
Miscellaneous | 234.1 | 212.5 | 332.8 | 273.8 | 215.8 | 417.2 | 365.7 | 245.2 | 517.1
Current | 3612.9 | 1284.6 | 2872.9 | 4321.1 | 1505.3 | 3497.8 | 5159.5 | 1771.0 | 4186
Operational | 3174.9 | 1084.1 | 2437 | 3787.4 | 1224.3 | 2892.3 | 4619.1 | 1496.3 | 3514.6
Wages & Salaries | 1730.7 | 516.5 | 1064.1 | 1953.6 | 548.7 | 1238.6 | 2187.2 | 572.7 | 1339.2
Benefits | 77.3 | 119.1 | 190.3 | 115.0 | 170.2 | 290.9 | 171.4 | 222.0 | 459.3
Capital | 681.6 | 394.7 | 715.2 | 720.3 | 320.5 | 725.1 | 916.9 | 431.2 | 994.3

Note: All expenditures are per capita and in 2016 dollars.
Table 15.1 provides an overview of these two expenditure measures for our sample of large and mid-sized cities in the US. We use ten-year averages for each decade shown. We find that total expenditures per capita, as well as general expenditures per capita, have increased significantly at similar rates over the course of three decades. The standard deviation measures heterogeneity in policies. The 90-10 range provides a measure of inequality among cities. Both measures indicate there is much heterogeneity in expenditures in the sample. Moreover, inequality in spending has increased over time.

Next, we focus on the different expenditure categories. The main categories are education, social services, public safety, environment and housing, government administration, and interest on debt. Education includes primary and secondary education as well as libraries and higher-education spending. Social services include welfare and health care expenditures. Public safety includes expenditures for police, fire, and correctional facilities. Environment and housing includes public housing, parks and recreation, and waste management. Table 15.1 also reports descriptive statistics for these expenditure categories. We find that cities tend to spend most heavily on education services. The second greatest expenditure per capita is utilities, which has a high degree of variation as indicated by the standard deviation: some cities spend significantly more per capita on utility costs than others. Public safety, environment and housing, and social services represent the next three largest spending categories, and their respective increases in per capita spending since the 1980s are consistent with the trend of increasing total and general expenditures by cities over this time period. Also notable is the large standard deviation in social service expenditures. Some cities clearly spend much more per capita on social services than do others.
We can also observe that administrative expenses have increased over these three decades at a slower percentage rate than most of the other spending categories. Perhaps managerial and service jobs have been rendered more efficient by advances in computing or the internet.

It is also useful to differentiate between current expenditures and capital expenditures. Current expenditures are all expenditures that are not capital outlays. Current operational expenditures are all spending categories, except for capital outlays, interest on debt, insurance benefits and repayments, assistance and subsidies, and intergovernmental expenditures. These expenditures are recurring and tend to be less volatile from year to year than general expenditures, largely because of the exclusion of capital outlays. The main subcomponents of operational expenditures are wages, salaries, and benefits. Capital outlays are direct expenditures for construction of buildings, grounds, and other improvements, and purchases of equipment, land, and existing structures. These include amounts for additions, replacements, and major alterations to fixed works and structures. However, expenditures for repairs to such works and structures are classified as current operational expenditures.

Table 15.1 also reports descriptive statistics for current, operational, and capital expenditures as well as wages and salaries. We find that operational expenditures constitute the bulk of city spending whereas capital outlays and benefits are comparatively much smaller. Most operational expenditures come in the form of wages and salaries. Running a city is a labor-intensive production process!
TABLE 15.2. City Expenditure Shares

                                            1980s                            1990s                            2000s
Expenditure Categories               Mean  Std. dev.      90-10       Mean  Std. dev.      90-10       Mean  Std. dev.      90-10
Education                          0.3394     0.0732     0.1734     0.3528     0.0787     0.1911     0.3603     0.0841     0.2009
Social Services                    0.0978     0.0781     0.1838     0.0988     0.0857     0.2082     0.0915     0.0859     0.1771
Public Safety                      0.1279     0.0241     0.0607     0.1304     0.0234     0.0526     0.1343     0.0263     0.0667
Environment & Housing              0.1583     0.0448     0.1228     0.1510     0.0441     0.1106     0.1518     0.0457     0.1265
Administration                     0.0583     0.0150     0.0377     0.0559     0.0153     0.0393     0.0574     0.0170     0.0454
Interest                           0.0598     0.0311     0.0792     0.0612     0.0319     0.0727     0.0472     0.0225     0.0495
Note: Shares by purpose as a percentage of general expenditures.

Current                            0.8436     0.0533     0.1409     0.8575     0.0438     0.1069     0.8498     0.0475     0.1231
Operational                        0.7475     0.0653     0.1661     0.7584     0.0595     0.1489     0.7670     0.0599     0.1463
Wages & Salaries                   0.4173     0.0608     0.1464     0.3992     0.0551     0.1389     0.3728     0.0559     0.1439
Benefits                           0.0160     0.0193     0.0387     0.0195     0.0222     0.0526     0.0247     0.0278     0.0647
Capital                            0.1564     0.0533     0.1409     0.1425     0.0438     0.1069     0.1502     0.0475     0.1231
Note: Shares by type as a percentage of total expenditures.
Table 15.2 repeats the exercise for expenditure shares. Current expenditures account for the bulk (85 percent) of expenditures. The remaining 15 percent are capital expenditures. We find that education accounts on average for 33-36 percent, public safety for 13-14 percent, and environment and housing for 15-16 percent of general expenditures. Wages and salaries account on average for approximately 40 percent of total expenditures.
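As a quick consistency check on the shares by type, the sketch below uses the decade means as read from Table 15.2. It verifies that current and capital shares sum to one and computes wages and salaries as a fraction of operational expenditures. The dictionary layout and variable names are illustrative assumptions, and the assignment of each row of values to a decade follows our reading of the table.

```python
# Mean expenditure shares by type, as read from Table 15.2
# (shares of total expenditures, by decade).
shares = {
    "1980s": {"current": 0.8436, "operational": 0.7475, "wages": 0.4173, "capital": 0.1564},
    "1990s": {"current": 0.8575, "operational": 0.7584, "wages": 0.3992, "capital": 0.1425},
    "2000s": {"current": 0.8498, "operational": 0.7670, "wages": 0.3728, "capital": 0.1502},
}

checks = {}
for decade, s in shares.items():
    # Current and capital expenditures should exhaust total expenditures.
    total = s["current"] + s["capital"]
    # Wages and salaries as a fraction of operational expenditures.
    wage_fraction = s["wages"] / s["operational"]
    checks[decade] = (total, wage_fraction)
    print(f"{decade}: current + capital = {total:.4f}, "
          f"wages / operational = {wage_fraction:.3f}")
```

On these numbers, wages and salaries make up roughly half of operational spending in each decade, which is the sense in which running a city is labor intensive.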
15.4 Revenue Policies

General revenues are all government revenue except liquor store revenue, insurance trust revenue, and utility revenue. The four main types of city revenue are taxes, charges, intergovernmental revenues or transfers, and revenues from operating utilities; the first three are components of general revenue. Taxes are defined as compulsory contributions exacted by a government for public purposes. Note that local government tax revenues exclude any amounts from shares of state-imposed and collected taxes, which are classified as intergovernmental revenue. Intergovernmental revenues are amounts received from other governments, such as fiscal aid in the form of shared revenues and grants-in-aid, as reimbursement for performance of general government functions and specific services for the paying government, or in lieu of taxes. Charges are general revenue other than taxes and intergovernmental revenue. Utilities are government owned and operated water supply, electric light and power, gas supply, or transit systems. Utility revenue is the fourth source of revenue for a city. It consists of revenues from the sale of utility commodities and services to the public and to other governments. Table 15.3 provides an overview of these revenues.
TABLE 15.3. City Revenues per Capita

                                            1980s                            1990s                            2000s
Revenue Categories                   Mean  Std. dev.      90-10       Mean  Std. dev.      90-10       Mean  Std. dev.      90-10
Total Revenue                      4296.2     1503.5     3521.8     5032.4     1687.0     3938.6     5902.3     1950.3     4475.5
General Revenue                    3659.4     1202.5     2525.6     4330.6     1394.9     3073.1     5196.9     1718.1     3595.7
Intergovt. Revenue                 1401.3      650.7     1542.9     1593.9      721.9     1931.9     2028.2      889.0     2310.3
Federal Aid                         324.3      310.5      352.6      251.0      355.2      344.6      354.2      432.8      453.8
State Aid                          1077.0      579.6     1405.3     1342.9      643.5     1800.3     1674.0      800.9     2168.9
Own Source Revenue                 2258.2      722.2     1426.2     2736.7      926.0     1883.6     3168.7     1213.6     2325.7
Tax Revenue                        1335.6      525.5      991.2     1620.1      650.1     1171.6     1899.7      815.7     1354.6
Property Tax Revenue                927.6      334.5      846.6     1106.6      416.0     1006.4     1247.2      443.6     1136.3
Sales Tax Revenue                   252.6      235.2      513.0      319.7      283.1      662.8      407.1      341.9      838.7
Income Tax Revenue                   76.4      217.5      403.4       91.8      259.6      470.7      107.2      307.7      491.8
Total Charges                       922.6      439.1     1020.4     1116.6      550.8     1307.9     1269.0      679.5     1455.3
Sewerage Charges                    112.2       64.0      150.4      164.2       86.9      201.2      189.3       93.3      219.7
Hospital Charges                    132.6      245.8      331.7      158.7      385.9      391.5      171.8      477.0      355.7
Solid Waste Mgmt. Charges            32.3       39.5       89.8       64.2       63.1      156.7       71.6       70.9      163.3
Utility Revenue                     513.8      734.9     1632.9      502.3      640.1     1426.5      556.8      701.2       1522
Electric Utility Revenue            300.6      683.8     1437.1      273.5      602.6     1175.8      286.1      638.3     1317.9
Water Utility Revenue               145.7       85.4      230.2      171.2      102.3      266.9      199.6      124.1      312.7
As we would expect, total revenues and general revenues have increased since the 1980s at roughly the same rate as total and general expenditures. Intergovernmental revenue, on average, represents a significant percentage of total revenue, which is more apparent in table 15.4, where it is given as a share of total revenue. State aid rather than federal aid constitutes the bulk of intergovernmental revenue. Own source revenue is mostly collected through taxes, which constitute roughly 60 percent of own source revenue. Property taxes are the largest source of tax revenue, dwarfing sales and income tax revenue. The small amount of income tax revenue is likely explained by the fact that most income tax is collected at the state and federal levels. States accordingly redistribute funds to cities via state aid. Charges also make up a large portion of own source revenue. A more disaggregated look at the charges reveals which specific charges are largest. Utility revenue is also significant, though not as significant as charges and tax revenue. Interestingly, utility revenue has not increased at the same rate as tax revenue and charges.

Table 15.4 repeats the exercise for revenue shares. Note that state (federal) aid accounts for approximately 25-29 (5-8) percent of city revenues, so that intergovernmental revenue as a whole accounts for 32-35 percent. Own source revenues account for approximately 55 percent of revenue. Own source revenue can be further divided into taxes, which are approximately 33 percent of all revenue, and charges and fees, which account for another 22 percent of revenue. Among taxes, property taxes are the largest source of income, followed by sales and income taxes. Finally, approximately 9-10 percent of city revenues come from utilities, in particular, electric and water utilities.
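The claim that utility revenue has grown more slowly than taxes and charges can be checked with simple percentage changes. The sketch below uses the 1980s and 2000s decade means as read from Table 15.3; the helper `percent_change` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
# Mean revenues per capita, as read from Table 15.3 (1980s and 2000s columns).
revenue = {
    "Total revenue": (4296.2, 5902.3),
    "Tax revenue": (1335.6, 1899.7),
    "Total charges": (922.6, 1269.0),
    "Utility revenue": (513.8, 556.8),
}


def percent_change(start, end):
    """Percentage change from start to end."""
    return 100.0 * (end - start) / start


growth = {name: percent_change(*values) for name, values in revenue.items()}
for name, pct in growth.items():
    print(f"{name}: {pct:.1f}% change from the 1980s to the 2000s")
```

On these figures, taxes and charges each grew by well over a third between the two decades, while utility revenue grew by less than a tenth.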
TABLE 15.4. City Revenue Shares

                                            1980s                            1990s                            2000s
Revenue Categories                   Mean  Std. dev.      90-10       Mean  Std. dev.      90-10       Mean  Std. dev.      90-10
Intergovt. Revenue                 0.3300     0.0925     0.2462     0.3171     0.0940     0.2621     0.3492     0.1114     0.3086
Federal Aid                        0.0774     0.0392     0.0837     0.0467     0.0317     0.0669     0.0566     0.0328     0.0714
State Aid                          0.2527     0.0923     0.2444     0.2704     0.0957     0.2549     0.2926     0.1128     0.2901
Own Source Revenue                 0.5415     0.1047     0.2834     0.5580     0.1070     0.2777     0.5470     0.1033     0.2730
Tax Revenue                        0.3271     0.0996     0.2687     0.3363     0.0997     0.2485     0.3336     0.0893     0.2338
Property Tax Revenue               0.2340     0.1049     0.2518     0.2357     0.1047     0.2363     0.2253     0.0871     0.2034
Sales Tax Revenue                  0.0590     0.0520     0.1260     0.0648     0.0559     0.1250     0.0701     0.0572      0.139
Income Tax Revenue                 0.0159     0.0424     0.0790     0.0159     0.0420     0.0767     0.0160     0.0419     0.0765
Total Charges                      0.2144     0.0702     0.1689     0.2217     0.0739     0.1713     0.2135     0.0709     0.1593
Sewerage Charges                   0.0271     0.0143     0.0345     0.0338     0.0155     0.0411     0.0339     0.0160     0.0376
Hospital Charges                   0.0290     0.0533     0.0823     0.0276     0.0621     0.0700     0.0238     0.0600     0.0507
Solid Waste Mgmt. Charges          0.0080     0.0098     0.0213     0.0138     0.0127     0.0330     0.0132     0.0123     0.0290
Utility Revenue                    0.1032     0.1231     0.2922     0.0911     0.1015     0.2608     0.0879     0.0973     0.2447
Electric Utility Revenue           0.0543     0.1185     0.2627     0.0458     0.0983     0.2191     0.0424     0.0918     0.1859
Water Utility Revenue              0.0353     0.0204     0.0545     0.0354     0.0208     0.0501     0.0351     0.0206     0.0541
Note: Shares are given as a percentage of total revenue.
15.5 Common Policy Mistakes

At this stage of the analysis, it is useful to review some commonly made policy mistakes that can easily lead to fiscal problems and eventually to a fiscal crisis.
15.5.1 Labor Policies

There are many examples of flawed urban policies. The most common mistake is to mismanage labor relations. Wages, salaries, and benefits typically account for at least 40 percent of total expenditures. Keeping labor costs under control is, therefore, important if the city wants to provide public goods and services at reasonable costs. If political leadership is weak, unions tend to have too much influence over labor policies, leading to labor contracts that are too generous for municipal employees. These costs have to be borne by city residents and typically lead to an unreasonable tax burden. If taxes are too high relative to the quality of public goods and services that are provided in a city, the city becomes unattractive to medium- and high-income households as well as to firms that typically bear the vast majority of the municipal tax burden. If benefits are too generous, some cities try to hide these costs by underfunding their health and pension plans. If health and pension plans for retirees are not sufficiently funded, the city effectively takes on debt that may threaten its financial viability in the future. Again, unfunded liabilities increase future tax obligations and thus make the city less attractive to future residents. We discussed these issues in detail in chapter 11.
15.5.2 Redistribution and Tax Policies

Another common mistake is to adopt poor tax and redistribution policies. Local redistribution and economic development tend to be at odds with each other. Tax and redistributive policies often undermine the fiscal capacity of the city. These policies will typically take the economic surplus earned by efficient agglomeration and redistribute it to those who do not produce a surplus. That is, bad policies are those that excessively tax productive firms and households but do not provide those firms and households public services of compensating quality. When redistribution gets large enough and imposes a significant tax burden on high-productivity households, they will leave the city. This causes the tax base to shrink, which amplifies the problem.

Poor redistributional and tax policies typically lead to unfavorable demographics. That means that the city has high rates of poverty and a high share of elderly residents. Why should these demographic factors matter? Both groups have a high demand for government services that are often mandated by state or federal law and thus impose unfunded mandates on the city. Both groups typically have relatively low tax bases. Thus both groups will consume more in public goods and services than they contribute in tax revenues. This is typically not a serious problem as long as the share of city population in these two groups is modest. Most cities can afford to finance a limited degree of redistribution to the poor and the elderly. However, if their combined share exceeds approximately 30 percent, then the city has real fiscal problems. The transfers from the taxpayers (firms and middle-class households) to these residents will often be sufficient to undermine the inherent economic advantages of the city's location or agglomeration.
15.5.3 Economic Development Policies

There is a strategic skill complementarity in production that arises in cities. High-skill and high-productivity workers have a comparative advantage in working long hours in their occupations. Think about entrepreneurs, R&D engineers, managers, lawyers, consultants, investment bankers, or fund managers. These individuals need to purchase a lot of services that serve as substitutes for home production. If you work 60-70 hours a week, you do not want to spend much time cooking, cleaning, or doing laundry. Similarly, you may not have a lot of time to watch and entertain your children. These types of home production activities have to be outsourced. You eat in restaurants, you bring your clothes to a cleaner, and you hire a nanny for your children or send them to a day care center. As a consequence, many high-productivity cities also tend to be home to many low-skill service sector firms and establishments.

The existence of low-skill jobs in a city is not a problem per se. As a matter of fact, it is a necessary requirement for cities to function well. Problems only arise when the local economy is dominated by these low-skill sectors and there is not a sufficient number of jobs in high-skill sectors to complement the low-skill sector. These types of economic imbalances often arise if the industries of the city are cyclically sensitive so that a deep recession has a significant impact on the city's skilled employment base and its tax base. In some cases, city economies are dominated by a single industry. These cities are more at risk than cities with a diversified economy. If that industry is hit by a large negative shock, many of the high-skill jobs may disappear from the city. Struggling cities often have economies that are dominated by low-skill occupations. As a consequence, real wages, home values, and capital intensity are low.

Economic development policies, therefore, need to primarily target high-skill workers and high-productivity firms. Low-skill workers will automatically benefit from the presence of high-skill workers. We will see later that high-skill workers provide important agglomeration externalities so that one can make an argument for offering some relocation subsidies. The same is true for some high-productivity firms that engage in research and development or provide high-quality services. Some cities do not seem to understand this basic concept and offer large-scale relocation subsidies to low-skill sectors. That seems to be a poor policy, especially when the city primarily relies on taxing high-skill and high-income households to pay for these subsidies.
15.6 Municipal Bankruptcies

In the case of a severe fiscal crisis, federal bankruptcy laws give municipalities the right to file for bankruptcy. The US Congress added this legislation in 1937 in order to allow municipalities access to court protection in a fiscal crisis. The federal government guarantees basic municipal government functions during the period of debt restructuring. Chapter 9 of the bankruptcy law focuses on protecting the city government, in contrast to the remaining chapters, which protect individuals and businesses.[1]

Municipalities are allowed to file for Chapter 9; states are not. To gain legal eligibility, a municipality must be insolvent. It must have made a "good-faith" effort to negotiate with its creditors, and typically these efforts have failed to produce results. Finally, the municipality must be open to accepting a plan to resolve its debts. Some cities need additional explicit permission from their state government to file for bankruptcy under Chapter 9. Only fifteen states have given their municipalities the right to file for Chapter 9 at their own discretion. The remaining states retain the power to decide whether a municipality may file. Georgia is the only state that does not allow its municipalities to file for bankruptcy under any circumstances.

Chapter 9 recognizes the issue of state versus federal control under the Tenth Amendment of the US Constitution. It limits the power of the federal bankruptcy court to interfere with municipal operations. For example, creditors cannot foreclose on a municipal building to recover their debt. The federal courts do not have the authority to make spending or other policy decisions on behalf of the municipality. That power remains with the locality itself. This places the responsibility to create a specific debt restructuring plan on the municipality. The law gives courts the limited ability to approve or reject a bankruptcy plan and to obtain input from other stakeholders.

Filing for bankruptcy gives policymakers some breathing room. It places some of the problems associated with the restructuring process on the judicial side of government. It also can affect the outcome of lawsuits that are likely to ensue if a municipality defaults on its debts. In addition, it gives municipalities the ability to rewrite collective bargaining agreements, which can override state labor protections and allows cities to renegotiate unsustainable pensions or other benefits packages during recessions.

Municipal bankruptcies are extremely rare. Since the initiation of Chapter 9 in 1937 about 620 municipalities have filed for bankruptcy.[2] To convey the scale of municipal bankruptcies, we list below the five largest bankruptcies in recent US history, as documented by Forbes.

1. Detroit (2013), $18 billion, which we discuss in detail below.
2. Jefferson County, Alabama (2011), $4 billion. The county's debt burden accumulated after a failed investment in a local sewer system, amounting to nearly $7,000 for each of the 658,000 residents.
3. Orange County, CA (1994), $2 billion. A relatively wealthy suburban area south of Los Angeles, Orange County filed for bankruptcy in 1994. After heavy borrowing and risky investments, its investment pool substantially declined in value with rising market interest rates. The investments had been made to generate additional revenues to pay for government services. This strategy failed when the fund, with about $8 billion in assets and with borrowings of $12 billion, faced billions of dollars in losses.
4. Stockton, CA (2012), $1 billion. Stockton experienced extreme fiscal mismanagement over the course of two decades. It went into bankruptcy after a spending spree accumulated significant debt. The city went into an economic trough when its housing market collapsed with the Great Recession. Home prices skyrocketed to an average of $400,000 in 2006 from an average of $110,000 six years earlier. However, median home prices slumped back to $110,000 in 2009, erasing nearly a decade's worth of gains.
5. San Bernardino, CA (2012), $500 million. The main reason for this city's bankruptcy was poor local government structure. Between 2004 and 2014, the city had five city managers, five police chiefs, four finance directors, and five public works directors. This is a sign of a highly disorganized city that is clearly mismanaged.

[1] Under a Chapter 7 bankruptcy filing, the debtor's assets are sold off to pay the lenders. This is called liquidation. Under Chapters 11 and 13, the debtor negotiates with creditors to alter the terms of the loan without having to liquidate assets. Reorganization bankruptcy is designed for debtors with regular income who can pay back at least a portion of their debts through a repayment plan.
[2] In comparison, there were nearly 12,000 bankruptcy filings under Chapter 11 in 2011. In addition, there were 418,000 filings under Chapter 13, which is the most popular form of personal or commercial bankruptcy. Of course, there are a lot more households than municipalities in the US.

15.7 A Case Study: The Detroit Bankruptcy

Following six decades of decline and unable to service or renegotiate its debt, the city of Detroit filed for bankruptcy in July 2013. It was the largest municipal bankruptcy in US history. The city of Detroit was remarkably badly run during the decade that preceded the bankruptcy. It attempted to deliver public goods and services to a large area with a low population density, characterized by vacant lots and abandoned buildings. Detroit had failed to recognize and fund $7.2 billion in retiree health costs. The city's main courthouse had $280 million worth of uncollected fines and fees. Public safety accounted for half of the city's budget, yet no one knew how many police officers were patrolling the streets. The department of transportation provided unreliable bus service. The fire department had staffing and equipment problems that made many stations nonoperational. The electrical grid and the water and sewer services needed to be upgraded, and in the case of the latter environmental standards had been neglected and violated.

FIGURE 15.1. View of downtown Detroit. (Anon/pexels.com)

City workers earned relatively high incomes. These incomes kept pace with increases the United Auto Workers union won for its members. With a large tax base, Detroit could afford these salaries. As auto jobs moved elsewhere, the region aged, and the costs of health care skyrocketed, it could not. In an attempt to reduce current period spending, the city offered more generous pensions. This was an easy way to retain workers as it could not afford to increase current compensation. At the same time the city was not contributing enough to the pension plan. To make up for the shortfalls, Detroit borrowed large amounts of money.

Tax collection suffered. The remaining residents of Detroit were often unskilled or aging. Without a source of income, many could not afford to pay their property taxes, and the collection rate dropped to about 50 percent. The high debt service burden and low tax collection put Detroit on a path to default. The city also misestimated revenues. Detroit's budget was developed based on expected revenues of $1.48 billion, yet the city only generated $1.17 billion in revenues in 2010. To make up for the shortfall and reduce the deficit, the city raised $250 million in bonds and utilized $120 million from other government funds.

In April 2012 Detroit mayor Dave Bing, with the support of the city council, signed an agreement with Michigan governor Rick Snyder. This agreement allowed for greater fiscal oversight by the state government in exchange for the state providing Detroit help with its finances. After months of negotiations, a financial review team was appointed in December 2012. It conducted a sixty-day review; a report on the financial health of Detroit was released in May 2013. The emergency manager summarized the financial condition of Detroit in that report as follows:

"Excluding proceeds from debt issuances, the City's expenditures have exceeded revenues from fiscal year 2008 to fiscal year 2012 by an average of $100 million annually. These financial shortfalls have been addressed with long-term debt issuances (e.g., $75 million in fiscal year 2008, $250 million in fiscal year 2010, and $137 million in fiscal year 2013) and by deferring payments of certain City obligations, such as contributions to the City's two pension funds. As of April 26, 2013, the City had actual cash on hand of $64 million but had current obligations of $226 million to other funds and entities in the form of loans, property tax distributions, and deferred pension contributions and other payments. Therefore, the City's net cash position was actually negative $162 million as of April 26, 2013. The City has been deferring, and will need to continue to defer, payments on its current obligations in order to avoid running out of cash."
In June 2013 the city of Detroit stopped making payments on some of its unsecured debts, including pension obligations. It filed for Chapter 9 bankruptcy on July 18, 2013. The US Bankruptcy Court ruled in December of 2013 that Detroit was eligible for Chapter 9. The total outstanding debt at that point was $18.5 billion. The city then started to negotiate with different creditors and developed an adjustment plan. After a two-month trial, Judge Steven W. Rhodes confirmed the city's plan of adjustment on November 7, 2014. This paved the way for Detroit to exit bankruptcy. Creditors and insurers were expected to absorb losses totaling $7 billion. Creditors received between 14 and 75 cents on the dollar.
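The cash-flow arithmetic behind the Detroit case can be reproduced directly from the figures quoted above. The sketch below is only a bookkeeping illustration: all variable names are our own, and pairing the 2010 revenue shortfall with the $250 million in bonds and $120 million in fund transfers is our reading of the text, which says these funds were used to cover the shortfall and reduce the deficit.

```python
# Figures quoted in the text, in millions of dollars.
cash_on_hand = 64
current_obligations = 226
net_cash_position = cash_on_hand - current_obligations

expected_revenue_2010 = 1480   # $1.48 billion budgeted
actual_revenue_2010 = 1170     # $1.17 billion collected
revenue_shortfall = expected_revenue_2010 - actual_revenue_2010

# Financing cited in the text: bonds plus transfers from other funds.
gap_financing = 250 + 120

print(f"Net cash position (April 26, 2013): {net_cash_position} million")
print(f"2010 revenue shortfall: {revenue_shortfall} million")
print(f"Bonds and fund transfers raised: {gap_financing} million")
```

The first calculation matches the negative $162 million net cash position reported by the emergency manager. The second shows that the one-time financing exceeded the revenue shortfall for that year, which is consistent with the funds also being used to reduce the accumulated deficit.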
15.8 A Case Study: The Fiscal Crisis and Recovery of Pittsburgh

The city of Pittsburgh has also faced serious problems in implementing sustainable fiscal policies. In November 2003 Pittsburgh mayor Tom Murphy requested designation as a financially distressed municipality. The fiscal problems of the city of Pittsburgh had been building up during the previous decades. By the end of 2003 Pittsburgh had a $398.6 million operating budget with a $42 million deficit. Mayor Murphy laid most of the blame for the city's financial troubles on state laws. The main problem was that the state laws gave the city limited opportunities to tax commuters who worked in the city but lived outside the city limits. The only tax was a $10 occupation tax that had not changed in thirty-eight years. He also put some blame on the firefighters union, which had not agreed to $15 million in concessions that he had requested in earlier negotiations.

Fiscally distressed communities in Pennsylvania can apply for help from the state government under the Municipalities Financial Recovery Act of 1987. This alternative is commonly known as Act 47.[3] The Murphy administration was forced to apply for distressed status under Act 47 after the state legislature refused to approve the city tax reform packages in 2003. These reforms would have allowed the city to raise taxes on nonresident commuters. When Pittsburgh entered into Act 47 in 2003, it had a debt burden of more than 20 percent of its operating budget. Under Act 47, a state coordinator wrote a financial recovery plan that included higher commuter taxes on nonresidents: $52 instead of $10. It also called for tighter controls on future union contracts. The existing contracts were not affected. Finally, the plan imposed rigid spending controls. The recovery plan was approved by the mayor and city council. A common pleas judge also approved the new commuter taxes.

During the next fourteen years, the city successfully managed to improve its fiscal position by negotiating more favorable union contracts, increasing fees and taxes, reducing costs, and cutting services. In addition, the size of the city's workforce shrank by 26 percent between 2003 and 2017. Mayor Bill Peduto asked the state to end the supervision in 2018, arguing that the city had sufficiently stabilized its finances and practices.

[3] Municipalities in Pennsylvania cannot unilaterally declare bankruptcy. They need approval from the state government. For example, the city of Harrisburg tried to opt for Chapter 9 of the US bankruptcy code to gain relief from its debts. However, the filing was challenged in court by the Commonwealth of Pennsylvania. In November 2011 a federal judge ruled that Harrisburg could not file for bankruptcy.

15.9 Conclusions

When there is much heterogeneity in preferences and needs within a country, decentralization is desirable. The large amount of heterogeneity in tax and expenditure policies among large US cities suggests that fiscal federalism and decentralization play a large and important role in the United States. Despite these differences in fiscal policies, some common themes emerge. It is clear that all cities spend a large portion of their annual budgets on education. The second largest expenditure goes toward environment and housing purposes, but there is significant variation in this spending category. Public safety expenditures nearly match environment and housing expenditures. Taxes are the largest source of own revenue, with property taxes dwarfing other means of taxation. Charges also represent a significant portion of revenue collection.

When examining city revenues, intergovernmental revenues are important. Some states like California and New York are heavy taxers and thus are responsible for apportioning funds to their cities. Intergovernmental revenue can sometimes serve as a buffer when a city's own revenues dip due to economic fluctuations. On the other hand, states like Texas raise less revenue, and city spending in these states is therefore driven primarily by own revenues.

Bad fiscal policies and insufficient political leadership often cause fiscal crises. Some fiscal crises are clearly homemade, generated by bad labor policies that create an excessive number of overpaid municipal employees. Political corruption, rent seeking, and political patronage can also cause serious inefficiencies, especially if there is insufficient political competition within a city. When a number of these factors come together, a city will stagnate and be vulnerable in an economic recession. If bad policies go with bad economics and bad demographics, a fiscal crisis is often unavoidable. State bailouts may work with small- or medium-sized cities, but they are difficult and often ineffective with large cities. Federal bailouts are uncommon and difficult to implement. As a consequence, fiscal crises are costly and often take decades to resolve.
15.10 Debate: Fiscal Policies in NYC under Mayor de Blasio

Mayor Bill de Blasio's first budget was based on a $75 billion spending plan approved by the city council in June 2014. The budget that was adopted in June 2018 called for $89 billion in spending, approximately 19 percent larger than de Blasio's first budget. This is undoubtedly a fairly large expansion of local fiscal policies during a four-year period. Discuss whether this expansion was necessary and desirable. The pro side should argue in support of Mayor de Blasio's spending increases while the con side should argue against the policies. The following questions may help structure the debate:

1. What are the significant contributors to the increase in the budget during the four-year period?
2. What happened to city employment and payroll during this expansion?
3. How expensive was an additional hire on average?
4. How was this expansion financed?
5. What was the impact of transportation problems?
6. Who benefited the most from the fiscal expansion?
7. Who bore the costs of this expansion?
8. What was the role of local unions in this process?
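As a starting point for the debate, the size of the expansion can be checked with a short calculation. The sketch below reproduces the roughly 19 percent total increase from the two budget figures and also computes an implied average annual growth rate. The annualized figure is our own addition: it assumes four years of compounding between the two budgets and makes no adjustment for inflation.

```python
first_budget = 75.0   # billions of dollars, adopted June 2014
later_budget = 89.0   # billions of dollars, adopted June 2018
years = 4             # assumed number of years between the two budgets

total_growth = later_budget / first_budget - 1
# Compound annual growth rate implied by the two endpoints.
annual_growth = (later_budget / first_budget) ** (1 / years) - 1

print(f"Total increase: {100 * total_growth:.1f}%")
print(f"Implied average annual increase: {100 * annual_growth:.1f}%")
```

Comparing the annualized rate with inflation and with the growth of the city's tax base over the same period is one way to frame the first and fourth debate questions.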
15.11 Problem Sets

Download the official budget document of a city of your choice for the past budget year.

1. What are the main sources of revenue for this city?
2. What are the main sources of expenditures for this city?
3. What do we know about labor costs in the city?
4. Has the city sufficiently invested in infrastructure during the past decade?
5. How do the city's finances compare to those of similar cities?
6. What can we say about the city's tax structure?
7. What are the main structural problems that the city faces?
8. What do these problems imply for city finances?
9. How do these problems affect the revenue side?
10. How do these problems affect the expenditure side?
11. What measures has the city taken to improve revenue collection?
12. What measures has the city taken to control costs and expenditures?
13. What are some viable options to improve the fiscal capacity of the city in the future?
16
Debt Finance and Municipal Bond Markets
16.1 Motivation
Cities and municipalities need access to credit markets to finance infrastructure and other long-term capital investments. To raise large amounts of funds, they typically need to issue municipal bonds. Two-thirds of infrastructure projects (such as bridges, roads, ports, hospitals, schools, and water treatment facilities) are funded through municipal bonds. Note that cities cannot issue stocks! If they could, would you buy stocks from your hometown? Municipal bonds are fairly standard debt obligations. Over the past decade, an average of approximately $442 billion in new municipal securities was issued each year, culminating in $495 billion in new municipal securities issued in 2017 alone. There are over 80,000 issuers of municipal bonds in the US, accounting for $3.7 trillion outstanding to bondholders as of 2015. This is approximately 20 percent of the total US debt. According to the Federal Reserve, there was a 133.8 percent increase from 2001 to 2011 in outstanding municipal debt in the US alone. Municipal bonds have had a historically low rate of default, with less than 1 percent of these bonds going into default over the last fifty years. Municipal bond markets, therefore, play an important role in financing local infrastructure projects.
In this chapter, we discuss how municipal bond markets work in the US. We then study New York City to get a better understanding of how a large city relies on access to these financial markets.
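To put the 133.8 percent increase in outstanding municipal debt between 2001 and 2011 into perspective, it can help to convert it into an average annual growth rate. The following is a minimal sketch of that arithmetic; the function name is ours, and the calculation assumes constant compound growth over the ten years, which the text does not claim.

```python
def annualized_growth(total_increase_pct: float, years: int) -> float:
    """Return the constant yearly growth rate (percent) that compounds
    to the given cumulative increase (percent) over `years` years."""
    growth_factor = 1.0 + total_increase_pct / 100.0
    return (growth_factor ** (1.0 / years) - 1.0) * 100.0


# 133.8 percent cumulative increase between 2001 and 2011 (ten years).
rate = annualized_growth(133.8, 10)
print(f"Implied average annual growth: {rate:.1f} percent")
```

Under the constant-growth assumption, the cumulative increase corresponds to growth of roughly 9 percent per year.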
16.2 Key Players in the Municipal Bond Markets
There are essentially three main groups that participate in municipal bond markets: issuers, underwriters, and purchasers.¹ The first group of market participants are the issuers of bonds. These include cities, school districts, and special authorities. The municipalities, or the issuers of these bonds, use them to fund specific projects or entire organizations. The issuer promises to pay investors back their initial investment with interest.
The second group are the underwriters, who assist municipal issuers in selling the bonds in the market to investors. They are the intermediaries between the issuers and investors. Oftentimes underwriters will transact millions of dollars in one deal, requiring a significant amount of legal documentation and paperwork to be completed. Investment banks typically serve as underwriters. The underwriter buys the bonds from the issuer at a discounted price and then offers the bonds to the public at a stipulated price. Typically, the price paid by the underwriter is substantially less than the price at which bonds are sold to customers, as shown in Green, Hollifield, and Schürhoff (2007). The difference is also called the underwriter's discount, which determines the profit margin for the underwriter.
The third important group of market participants are investors who buy and hold the municipal debt as part of their investment portfolios. Investors are wealthy individuals and institutional bondholders. The interest rate and yield for the investors depend on the characteristics of the bonds, such as maturity and riskiness. Investors are the persons and organizations that effectively loan the municipality (the issuer) the funds it needs to implement the desired capital investments.
Two other intermediaries also play important roles in these markets. Stockbrokers help investors manage their portfolios and execute trades. Any stockbroker can purchase municipal bonds; however, some brokers choose to specialize in municipal bond markets. Brokers typically collect a fee between 0.5 percent and 3 percent of the purchasing price in the municipal bond market. This is slightly lower than the typical commission on stocks, where brokers typically receive between 1 and 4 percent commission. Ratings agencies such as Moody's Investors Service provide municipal bond ratings. The goal of the ratings agencies is to provide a credit rating to make it easier for market participants to evaluate the risk associated with a given bond. The ratings agencies are supposed to consider all economic factors with regard to the issuer. These ratings are useful since municipal bond markets often lack the transparency and access of other bond markets, as discussed in detail below.
¹ Green (2007) provides a detailed discussion of the main players in the market.
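The underwriter's discount and the broker fees described in this section are simple percentage calculations. The sketch below illustrates them with hypothetical numbers: the $100 million issue, the prices of 99 and 100 percent of par, and the $50,000 purchase are our assumptions for illustration, not figures from the data. Only the 0.5 to 3 percent fee range comes from the text.

```python
def underwriter_discount(par_amount: float,
                         purchase_price_pct: float,
                         offer_price_pct: float) -> float:
    """Dollar difference between what the public pays and what the
    underwriter paid the issuer; prices are quoted in percent of par."""
    return par_amount * (offer_price_pct - purchase_price_pct) / 100.0


def broker_fee(purchase_amount: float, fee_rate_pct: float) -> float:
    """Broker fee in dollars for a purchase of the given size."""
    return purchase_amount * fee_rate_pct / 100.0


# Hypothetical issue: $100 million par, bought at 99 and resold at 100.
discount = underwriter_discount(100_000_000, 99.0, 100.0)

# Fee range from the text (0.5 to 3 percent) on a hypothetical $50,000 purchase.
low_fee = broker_fee(50_000, 0.5)
high_fee = broker_fee(50_000, 3.0)

print(f"Underwriter's discount: ${discount:,.0f}")
print(f"Broker fee range: ${low_fee:,.0f} to ${high_fee:,.0f}")
```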
16.3 Bond Characteristics
There are two basic types of municipal bonds: general obligation bonds and revenue bonds. Principal and interest payments of general obligation bonds are secured by the full faith and credit of the issuer. These bonds are usually supported by the issuer's general taxing authority. In many cases, general obligation bonds need to be approved by voters. This gives general obligation bond issuers the flexibility to pay back their debt obligations through either taxation or project-related revenues. In contrast, principal and interest of revenue bonds are secured by revenues derived from a facility built with the proceeds of the bond issue, such as tolls, charges, or rents. These bonds are typically issued by special authorities, such as school districts or transit authorities. Sometimes these authorities are created for that particular purpose. For example, if revenue bonds were issued for a new bridge, the investors would be paid back with the tolls generated by the bridge.
Interest income from municipal bonds is typically exempt from federal income taxation. As long as the investor lives in the issuing municipality or state, both general obligation and revenue bonds are also exempt from state and local taxation. However, municipal bonds are taxable if the federal government does not find a clear benefit to the public, as with a sports facility. These taxable bonds have higher interest rates.²
Municipal bonds differ by maturity. Maturities matter because long- and short-term bonds tend to have different interest rates. These differences are typically illustrated by a yield curve, which is a line that plots the interest rates for differing maturity dates. Most municipal bonds are long-term bonds since they are designed to finance long-term investments. Often these bonds fall in the range of five- to thirty-year maturities.
The coupon rate is the interest rate an issuer pays for the life of a bond. The coupon rate is not always a fixed rate. Floating or variable rate bonds possess interest rates that reset daily, weekly, monthly, or yearly. Interest rate fluctuations for these types of bonds depend on prevailing market conditions. On the other hand, zero coupon bonds do not pay any interest until the bond matures. Upon maturity, a zero coupon bond provides a lump sum to the investor consisting of both the principal investment and interest.
Historically, municipal bonds have been one of the least transparent assets in the bond market (Schultz, 2012). While US Treasury bonds can be bought and sold within seconds on exchange platforms, municipal bonds are much harder to buy and sell given the current absence of widespread secondary market platforms for their exchange.
Municipal bonds are typically considered to be low- or moderate-risk investments. The main risk arises from the likelihood that the municipality defaults on the bond. This may happen in a fiscal crisis or bankruptcy. Long-term bonds are riskier than short-term bonds. Other risks for municipal bonds include the interest rate risk, which is the risk associated with bond price fluctuations as market interest rates change.
If a city faces financial difficulties, it typically needs to issue bonds with credit enhancements, which improve the creditworthiness of a bond issue. The credit enhancer honors payment of a debt in the event that the issuer is unable to do so. For insured bonds, a type of credit enhancement, the insurer guarantees that even if the original issuing municipality defaults, the insurance company will continue to pay interest for the life of the bond. Since there is a lower risk associated with enhanced bonds, these bonds have a lower interest rate than nonenhanced bonds.
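Several of the bond characteristics discussed in this section (coupon payments, zero coupon bonds, interest rate risk, and the tax exemption) can be illustrated with a standard present-value calculation. The sketch below is a textbook pricing formula with annual coupons and a flat market rate, not a method taken from this chapter; all numerical inputs and the 25 percent marginal tax rate are hypothetical.

```python
def bond_price(face: float, coupon_rate: float,
               market_rate: float, years: int) -> float:
    """Present value of annual coupons plus the face value at maturity."""
    coupon = face * coupon_rate
    pv_coupons = sum(coupon / (1 + market_rate) ** t
                     for t in range(1, years + 1))
    pv_face = face / (1 + market_rate) ** years
    return pv_coupons + pv_face


def taxable_equivalent_yield(tax_exempt_yield: float,
                             marginal_tax_rate: float) -> float:
    """Yield a taxable bond must pay to match a tax-exempt yield after tax."""
    return tax_exempt_yield / (1 - marginal_tax_rate)


# A 5 percent coupon bond trades at par when the market rate is also 5 percent.
par_price = bond_price(1000, 0.05, 0.05, 10)

# A zero coupon bond pays nothing until maturity, so it sells below face value.
zero_price = bond_price(1000, 0.0, 0.05, 10)

# Interest rate risk: a rise in the market rate from 5 to 6 percent lowers
# prices, and the loss is larger for the longer maturity.
loss_short = par_price - bond_price(1000, 0.05, 0.06, 5)
loss_long = par_price - bond_price(1000, 0.05, 0.06, 30)

# Hypothetical investor with a 25 percent marginal tax rate.
equivalent = taxable_equivalent_yield(0.03, 0.25)

print(f"Par bond: {par_price:.2f}, zero coupon bond: {zero_price:.2f}")
print(f"Price loss, 5-year: {loss_short:.2f}; 30-year: {loss_long:.2f}")
print(f"Taxable-equivalent yield: {equivalent:.2%}")
```

The last calculation is one way to see why taxable municipal bonds carry higher interest rates: an investor who pays tax on the interest needs a higher stated yield to earn the same after-tax return.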
16.4 New York City's Participation in Municipal Bond Markets
Large cities rely on access to bond markets to satisfy their financial needs. We analyze data for municipal bonds that were issued by the top one hundred cities in the US during the past four decades. Our main focus is New York City, which is the largest single issuer of municipal bonds. We study the characteristics, the volume, and the amount of municipal debt that has been issued by NYC since the early 1980s. Finally, we analyze the evolution of credit ratings, which can be used to measure the financial health of a city.
² If you are interested in underlying theory, I would suggest you start with Green (1993), who provides a simple model of the taxable and tax-exempt yield curves.
FIGURE 16.1. Skyline of New York City. (Reynaldo Brigantty / pexels.com)
16.4.1 Data
The main dataset that we use is Thomson Reuters SDC Platinum, which provides detailed information about municipal bond issues. Table 16.1 provides some descriptive statistics of the main variables that we observe in the sample for NYC. It summarizes the characteristics of different municipal bonds that were issued by NYC between 1980 and 2017. Note that the majority of the bonds are general obligation bonds that were tax exempt and did not require credit enhancements.

TABLE 16.1. Descriptive Statistics: New York City Bond Details

                                      Proportion of Bonds   Bonds Issued
Enhanced                              0.296                 189
Not Enhanced                          0.704                 449
Taxable                               0.203                 130
Tax Exempt                            0.797                 508
Short Maturity (0-5 years)            0.318                 203
Intermediate Maturity (5-10 years)    0.071                 45
Long Maturity (10+ years)             0.611                 390
Generalized Bond                      0.976                 623
Specialized Bond                      0.024                 15
Fixed Rate                            0.768                 490
Variable Rate                         0.168                 107
Other                                 0.064                 41

                                      Average
Average Bond Rating (S&P)             5.19
Average Bond Rating (Moody's)         5.27
Average Bond Rating (Fitch)           4.59
Average Coupon Rate (percent)         6.45
Average Amount of Issue (millions)    272.643

It is useful to compare NYC to the average of the top one hundred US cities. The majority of bonds in this dataset are not enhanced, reflecting the good credit rating of most cities. New York City has a larger portion of taxable bonds than the average large US city in the sample. As mentioned above, municipal bonds are taxable if the federal government does not see a clear benefit to the public for the project at hand. New York City also issues a large number of short-term bonds when compared to all top one hundred cities in the dataset. The biggest difference between New York City and the other top one hundred cities is found in the bond purpose distribution. New York City rarely utilizes specialized bonds. Only 2.4 percent of bonds issued by New York City are specialized bonds. In contrast, most cities typically use general-purpose bonds and specialized bonds at near equal rates. Specialized bonds typically do not use the taxing power of the issuer to pay back the investor. Most cities issue municipal bonds with fixed coupon rates. New York City, by comparison, issues more bonds with variable coupon rates, where the interest rate changes depending on prevailing economic conditions.
The average amount of an issue in New York City is approximately $272 million. This is more than four times the average for the entire dataset of the top one hundred cities. The average amount of issue for the largest twenty-five cities in the US is $102 million while this figure is merely $21 million for the smallest twenty-five cities in the dataset. Of course, New York City is much larger than the average US city. Hence it has much larger projects that need to be financed.
Table 16.2 compares the average amount of different bonds issued by New York City with the average of other top one hundred cities. For the full sample of top one hundred cities, enhanced bonds have an average issue amount that is statistically significantly higher than the average issue amount of nonenhanced bonds. For New York City, the average amount issued for enhanced bonds is greater, but the difference is not statistically significant. Additionally, for New York City, long-term bonds have a statistically significantly higher average issue amount than both short-term and intermediate-term bonds. For the dataset as a whole, however, this is not the case. Short-term bonds have a higher average issue amount than both long-term and intermediate bonds.
16.4.2 Volume and Amount Issued over Time Not surprisingly, New York City has issued by far the most bonds of any of the largest one hundred US cities since 1980.

TABLE 16.2. Average Bond Amount Issued

New York City - Enhanced          296.01
New York City - Nonenhanced       262.80
Full Dataset - Enhanced            71.47
Full Dataset - Nonenhanced         50.95

New York City - Long-Term         342.58
New York City - Intermediate      107.31
New York City - Short-Term        174.92
Full Dataset - Long-Term           58.74
Full Dataset - Intermediate        25.49
Full Dataset - Short-Term          64.13

Note: Amounts are in millions of dollars.

FIGURE 16.2. New York City - Enhanced versus nonenhanced frequencies. [Line plot: number of bond issues per year (vertical axis, 0 to 30) against year (horizontal axis, 1980 to 2020), with one series for enhanced and one for not enhanced bonds.]

We observe the number of times that a city issued bonds in a given year. Figure 16.2 depicts the number of bond issues by enhancement type. New York City begins and ends the time period issuing a majority of its bonds that are not enhanced. But in between these two clear trends, from around 2000 to 2010, the numbers of enhanced and nonenhanced bonds issued are essentially equal. This trend is similar to that of a number of other cities with relatively good credit ratings. However, a majority of other cities issued significantly more enhanced bonds. Enhanced bonds are unavoidable if a city faces serious fiscal constraints.
Figure 16.3 depicts the number of bond issues in New York City by maturity length of the bonds. Maturity lengths are categorized as short-term (0-5 years), intermediate (5-10 years), and long-term (10+ years). The majority of bonds issued by New York City are long-term bonds. New York City issued more long-term bonds after the year 2000, reflecting the improved financial capacity of the city.

FIGURE 16.3. New York City - bond maturity over time. [Line plot: number of bond issues per year (vertical axis, 0 to 30) against year (horizontal axis, 1980 to 2020), with separate series for maturities of 0-5 years, 5-10 years, and 10+ years.]

Figure 16.4 depicts the total par amount of bonds issued by year for New York City. The total amount issued is recorded in millions of dollars. Note that there is a large variation in the amount issued during the sample period.
16.4.3 Municipal Bond Ratings over Time
Thomson Reuters also provides bond ratings from three ratings agencies: Fitch, Moody's, and Standard & Poor's. These are the major bond ratings agencies and are considered to provide the most reliable ratings. Each of the agencies attempts to measure the financial strength of a bond issuer and also evaluates the issuer's ability to both make interest payments and pay back the principal of the bond when the date of maturity arrives. Bond ratings agencies use a discrete, alphabetical scale for their ratings. It is common practice to convert the alphabetical scales used by ratings agencies into numerical scores that range from 1 (very high credit quality) to 22 (in default). This scale is simple, essentially providing a numerical value for each rating in ascending order. Note that high numbers refer to worse ratings and low numbers refer to better ratings. We average the ratings of the three agencies using a common conversion scale.
Enhanced bonds do not have significant variability in their bond ratings. More importantly, their ratings do not reflect the fiscal health of the city. We therefore exclude these bonds from our analysis. We also focus on long-term bonds with a maturity of at least ten years. Finally, we remove a small number of data points that seem to be subject to measurement error. This leaves us a sample of 198 data points for New York City to study the evolution of bond ratings over time.
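The conversion from letter ratings to numerical scores can be sketched as follows. The text does not reproduce the conversion table, so the 22-grade letter scale below is our assumption: it follows S&P-style notation, and ratings from Moody's, which uses different symbols, are assumed to have been translated into this notation beforehand. The function and variable names are hypothetical.

```python
# Assumed S&P-style scale, ordered from best (score 1) to default (score 22).
# This is an illustrative mapping, not the conversion table used in the book.
SP_STYLE_SCALE = [
    "AAA", "AA+", "AA", "AA-", "A+", "A", "A-",
    "BBB+", "BBB", "BBB-", "BB+", "BB", "BB-",
    "B+", "B", "B-", "CCC+", "CCC", "CCC-", "CC", "C", "D",
]
SCORE = {letter: rank for rank, letter in enumerate(SP_STYLE_SCALE, start=1)}


def average_score(ratings):
    """Average numerical score across agencies, skipping missing ratings
    (given as None). Higher scores mean worse credit quality."""
    scores = [SCORE[r] for r in ratings if r is not None]
    if not scores:
        raise ValueError("no ratings available for this bond")
    return sum(scores) / len(scores)


# Hypothetical bond rated by all three agencies, and one rated by only two.
three_agencies = average_score(["AA", "AA-", "A+"])
two_agencies = average_score(["AA", None, "A"])
print(three_agencies, two_agencies)
```

Skipping missing ratings rather than treating them as zeros is a design choice of this sketch; the text does not say how bonds rated by fewer than three agencies are handled.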
\[ w - c + \frac{w + r}{1 + i} > w + \frac{w}{1 + i} \qquad (18.12) \]
Solving this inequality for c, we obtain
\[ c < \frac{r}{1 + i} \qquad (18.13) \]
So the returns adjusted by the interest rate must be larger than the costs. In that case, the human capital investment increases the present discounted value of lifetime earnings. Figure 18.5 illustrates this case. We see that the budget constraint with human capital investment is to the right of the budget constraint without human capital investment. Hence individuals who invest in human capital can consume more in both periods than individuals who do not make the investment. As a consequence, the investment makes sense for all individuals, independently of the shape of the preferences. That result is worth repeating: if credit markets work well, then the decision to invest in human capital does not depend on preferences. It only depends on the relative magnitude of costs and returns as well as the prevailing interest rate.
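The decision rule can be checked numerically. In the sketch below, w denotes earnings without investment, c the cost of the investment in the first period, r the return in the second period, and i the interest rate; the specific values are hypothetical. The two functions compare the discounted return with the cost and, separately, the present values of lifetime earnings, which should always agree.

```python
def present_value(first_period: float, second_period: float,
                  interest_rate: float) -> float:
    """Present discounted value of a two-period income stream."""
    return first_period + second_period / (1 + interest_rate)


def invest_in_human_capital(cost: float, earnings_gain: float,
                            interest_rate: float) -> bool:
    """Invest when the discounted return exceeds the cost."""
    return earnings_gain / (1 + interest_rate) > cost


# Hypothetical values: w = 50, c = 10, r = 15.
w, c, r = 50.0, 10.0, 15.0

for i in (0.05, 0.60):
    pv_without = present_value(w, w, i)
    pv_with = present_value(w - c, w + r, i)
    decision = invest_in_human_capital(c, r, i)
    print(f"i = {i:.2f}: PV without = {pv_without:.2f}, "
          f"PV with = {pv_with:.2f}, invest = {decision}")
```

At the low interest rate the investment raises lifetime earnings, while at the (deliberately extreme) high rate the same return no longer covers the cost. Preferences play no role in either case.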
Provision of Education
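FIGURE 18.6. Optimal consumption choices with credit markets. [Diagram: budget lines for current and future consumption; the marked values are w - c and w for the first period and w and w + r for the second period.]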
Finally, it is straightforward to determine the optimal consumption choices in both cases. These choices are illustrated in figure 18.6. In the model with credit markets, we can transform the intertemporal decision problem into a problem that looks like a static decision problem by just reinterpreting the two goods as current and future consumption. Credit markets allow us to move resources between the two periods and achieve our preferred consumption bundles. Assuming educational investment only affects future earnings, efficient educational investment shifts out the lifetime earnings budget line the farthest. This maximizes the lifetime consumption possibilities of the individual. We could generalize the model and allow for continuous investments in human capital with heterogeneous costs and returns among individuals. After all, individuals differ by ability. We would not expect that the returns to education are constant. Similarly, some children love to learn while others hate it. Hence the effort costs are different as well. The model with heterogeneous individuals then predicts that individuals with high returns and low costs will invest the most in human capital. In practice, it is difficult to borrow against future earnings. This is largely due to the uncertainty about future income. Banks must be concerned about the default of student loans. We look at the default problem in more detail in chapter 25 when we study housing investments and mortgage markets. At this point, you only need to know that the federal government provides loans for students who want to attend college but cannot borrow in the private credit market.
Chapter 18
FIGURE 18.7. Music education. (Photo by author)
18.4 How Large Are the Returns to Human Capital?
We typically observe that individuals with higher levels of education have higher wages and salaries. Does that mean that education increases wages as suggested by the human capital model? Not necessarily! There are other reasons why wages rise with education. Spence (1973) argues that education also serves as a screening device: in a world with incomplete information, education provides a means of separating high- from low-ability individuals. It is less costly for high-ability types to go to college than for low-ability types because high-ability types can succeed academically with less effort. Education not only improves skills but also allows high-ability types to signal their relative productivity to potential employers. In a nutshell, high-ability types are more likely to invest in education. Also, ability may not be measured very accurately in most datasets. When we estimate the returns to education, we need to be careful when analyzing the data and cannot necessarily expect that simple least squares regressions will give us the answers we seek.
As we discussed in detail in chapter 17, the wage or the earnings of a person are a function of a number of factors, such as inherent ability at birth, skills and experience acquired after birth in school, investments in human capital by the family, work experience and on-the-job training, health status, and good and bad luck (i.e., chance and other random events). We do not necessarily observe all these factors. Hence empirical work is difficult. Comparing wages of individuals with different levels of education suffers from ability bias if we do not measure ability correctly. Higher levels of education may also reflect intelligence or motivation, biasing the comparison. Empirical research has focused on finding variation in education that may not be correlated with inherent ability. Twin studies are an example of this strategy. Card (1999) provides an extensive review of the literature and concludes that each extra year of schooling increases wages by approximately 10 percent. We thus conclude that the average returns to another year of education are relatively large.
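To get a sense of the magnitude, the sketch below compounds a 10 percent return per year of schooling over several additional years. Treating the return as constant and multiplicative across years is our simplifying assumption, and the base wage of $40,000 is hypothetical.

```python
def wage_with_extra_schooling(base_wage: float, extra_years: int,
                              return_per_year: float = 0.10) -> float:
    """Wage after `extra_years` of additional schooling, assuming each
    year raises the wage by the same proportion (an assumption)."""
    return base_wage * (1 + return_per_year) ** extra_years


base = 40_000.0  # hypothetical annual wage without the extra schooling
four_years = wage_with_extra_schooling(base, 4)
premium_pct = (four_years / base - 1) * 100

print(f"Wage after four extra years: ${four_years:,.0f}")
print(f"Implied wage premium: {premium_pct:.1f} percent")
```

Under these assumptions, four additional years of schooling raise the wage by about 46 percent, which illustrates why a return of 10 percent per year is considered large.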
18.5 Public Provision of Primary and Secondary Education
The basic human capital model explains why individuals should invest in human capital, but it does not provide a compelling justification for public provision of primary and secondary education. To do that, we need to model the interaction between parents and children, taking into consideration that parents make decisions on their children's behalf. As a consequence, we need to allow for the possibility that parents may not make the decisions that are in the best interest of the children. So how should we model parent behavior? A simple model assumes that parents have preferences over their own consumption C and the "quality" of the children Q. For simplicity, let us assume that parents have Cobb-Douglas preferences:
\[ U(C, Q) = \alpha \ln Q + (1 - \alpha) \ln C \qquad (18.14) \]
where $\alpha$ measures the weight that parents put on the quality of the education obtained by their children. Broadly speaking, $\alpha$ measures the degree of altruism parents feel toward their children. The quality of the education obtained by the children depends on the human capital they accumulate. Hence it depends, for example, on the quality of the school that the children attend and other parental inputs. This model captures the fact that parents typically pay for the human capital investment of their children, either in the form of tuition, local taxes, or direct expenditures. Hence parents bear the costs of the human capital investments, while the children are the primary beneficiaries of these investments. If parents are sufficiently altruistic, i.e., if they sufficiently care about Q, then this is not a problem.
What happens if parents are not that altruistic and do not care much about the educational quality that their children obtain? Some parents may have low preferences for educational quality and may primarily care about themselves. As a consequence, they will act selfishly and pick low levels of investment for their children. Should children be punished for the fact they happen to have selfish parents who do not invest in their future? Some parents may also not fully understand or appreciate the importance of human capital investments.
Another consequence of the fact that parents make decisions for children is that high-income parents tend to invest a lot in their children while low-income parents tend to invest less in their children. Child education quality is likely to be a normal good from the perspective of the parents. In the absence of government intervention, inequality will then be passed through generations. Children of high-income households will obtain a good education, and children of poor households will obtain a bad education. That is hardly fair! Educational opportunities should be available for all children.
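Both arguments can be made concrete by solving the parents' problem under an explicit budget constraint. The constraint C + pQ = Y, with income Y and a price p per unit of education quality, is our assumption and does not appear in the text. With the logarithmic preferences in (18.14), the standard solution is that parents spend the share given by the altruism weight on quality, so Q = (weight)(Y)/p and C = (1 - weight)(Y). A minimal sketch with hypothetical numbers:

```python
import math


def parental_choice(income: float, price_of_quality: float, alpha: float):
    """Utility-maximizing (C, Q) for U = alpha*ln(Q) + (1 - alpha)*ln(C)
    subject to the assumed budget constraint C + p*Q = Y."""
    quality = alpha * income / price_of_quality
    consumption = (1 - alpha) * income
    return consumption, quality


def utility(consumption: float, quality: float, alpha: float) -> float:
    return alpha * math.log(quality) + (1 - alpha) * math.log(consumption)


# Hypothetical households: same altruism, different incomes.
c_low, q_low = parental_choice(income=100.0, price_of_quality=2.0, alpha=0.3)
c_high, q_high = parental_choice(income=300.0, price_of_quality=2.0, alpha=0.3)

# Same income as the first household, but less altruistic parents.
c_selfish, q_selfish = parental_choice(income=100.0, price_of_quality=2.0,
                                       alpha=0.1)

print(f"Low income:  C = {c_low:.1f}, Q = {q_low:.1f}")
print(f"High income: C = {c_high:.1f}, Q = {q_high:.1f}")
print(f"Low altruism: C = {c_selfish:.1f}, Q = {q_selfish:.1f}")
```

In this sketch, quality rises in proportion to income, which is the sense in which education quality is a normal good, and it falls when parents place less weight on their children.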
Both of these arguments suggest that a more efficient and definitely a fairer allocation of resources can be obtained if primary and secondary education is provided by local governments and is financed through local taxes. In that case, all children who live in a community will have access to the same schools, and differences in education quality will largely be a function of other parental investments, holding ability fixed. Despite equalization of expenditures within districts, we still have to deal with differences in school quality across districts. The main source of inequality that arises in the current US system is due to sorting and segregation. As we discussed in previous chapters, we need sufficiently large interjurisdictional transfers from high- to low-income communities within a state or a country. If these transfers are sufficiently generous, then differences in educational spending may not be large within a state. In practice, the main objective of fiscal transfers is not necessarily to fully equalize education spending but to make sure that students in low- and moderate-income school districts obtain an education that is of sufficiently high quality.
Finally, it should be pointed out that education not only affects the productivity and earnings of individuals, but there are also some positive externalities or spillovers. For example, more educated individuals tend to be more informed voters. They are more likely to be involved in their communities and, therefore, provide positive neighborhood externalities. They are less likely to become criminals since they have better outside options. It is hard to quantify the magnitude of these positive externalities. However, these externalities provide another justification for public provision of primary and secondary education.
Summarizing the discussion above, there are a number of justifications for government intervention in the market for education. 
First, credit market imperfections make it difficult to borrow against future earnings for many students. Second, there is an agency problem: parents may not choose an appropriate level of education for their children. In that case, the government may want to provide incentives or force parents via mandates to invest more in their children. Third, if child education quality is a normal good, private investments in human capital provide a mechanism to sustain inequality across generations. The society should not penalize children just because they were born into low-income families. There are two types of government interventions for this situation: (1) the public provision of free or highly subsidized education (public schools and universities) and (2) the provision of loans and grants (vouchers) to students. When it comes to primary and secondary education, state and local governments in the US primarily rely on the first strategy. In higher education, the government uses a combination of both strategies.
18.6 Two Case Studies: Philadelphia and Pittsburgh
To gain some additional insights into the challenges faced by urban school districts, it is useful to look at a couple of districts in more detail. Here we study the two biggest urban school districts in Pennsylvania: Philadelphia and Pittsburgh.
In 2011-2012, there were 149,535 students in 242 public schools as well as 55,625 students in 84 charter schools in Philadelphia, with 54.4 percent of the students classified as African American, 7.8 percent as Asian, 18.5 percent as Hispanic, and 14.3 percent as white. Of all students, 82.2 percent were economically disadvantaged. The district had 20,309 employees, among them 9,552 teachers and 471 principals and assistant principals. The total budget was approximately $2.77 billion; 33 percent of that amount came from the city, 48 percent from the state, and 19 percent from the federal government.
In contrast, the Pittsburgh Public School District is much smaller than the Philadelphia Public School District. This district operated 54 schools with 3,900 full-time employees, among them 1,875 teachers. Total enrollment was 26,463 students with an operating budget of $529.8 million. Of all students in the Pittsburgh Public Schools, 71 percent were eligible for free or reduced lunch; 55 percent of the students were African American and 33 percent were white.
Figure 18.8 shows enrollment and expenditures per student in the Pittsburgh and Philadelphia Public Schools between 1995 and 2008. Note that both districts experienced a significant decline in enrollment. As enrollments dropped by 25-30 percent, expenditures per student increased significantly.

FIGURE 18.8. Pittsburgh and Philadelphia Public Schools: Enrollment (A) and expenditures per student (B). (Author's calculations) [Panel A: enrollment per year as a percentage of enrollment in 1995, for 1995 to 2008. Panel B: expenditures per student per year, for 1995 to 2008.]

One consequence of the decline in enrollment is that both districts had to close a significant number of schools. Closing underperforming schools and reallocating students to better schools is also not as easy as it may sound. In 2006 Pittsburgh closed twenty-two elementary and middle schools. Closing schools and reducing excess capacity is difficult for political reasons since parents and students often are attached to local neighborhood schools even if the schools are not producing good outcomes or are expensive to maintain. School performance was a central consideration in closing decisions. This intervention was studied by Engberg, Gill, Zamarro, and Zimmer (2012). They find that the transition to new schools can have an adverse effect on attendance and achievement gains for students from closed schools.
Another consequence of the decline in enrollment was that spending per student increased significantly during the time period. Philadelphia (Pittsburgh) was spending on average $17,000 ($23,000) per student by the end of the sample period in 2008. Therefore, lack of financial resources does not appear to have been the biggest problem faced by these districts, especially not Pittsburgh.
Despite these relatively high expenditures per student, achievement tended to be low. Figure 18.9 shows the distribution of math and reading scores for eleventh-grade students in the Pittsburgh Public Schools in four academic years. The achievement measures are based on the Pennsylvania System of School Assessment (PSSA), which was a standardized test administered to public high schools in the state of Pennsylvania until 2013.¹ The four categories in the PSSA are advanced, proficient, basic, and below basic. For all practical purposes, we can assume that most high school students who do not score in the proficient or advanced categories have some serious reading, writing, and basic math deficiencies. 
PSSA scores were important because districts were supposed to meet "adequate yearly progress standards" in reading and math. In 2010 the targets for the share of students scoring proficient or advanced were 56 percent in math and 63 percent in reading. As we can tell from figure 18.9, Pittsburgh Public Schools were far away from meeting those basic standards. Moreover, it should be pointed out that there is quite a large attrition rate in high school in many urban districts, including Pittsburgh Public Schools. So, if anything, the numbers reported in figure 18.9 provide an upper bound on how well students performed since many of the worst-performing students would either have dropped out prior to reaching the eleventh grade or would be absent from school for extended periods, missing the standardized tests. The city of Pittsburgh and the Pittsburgh Public Schools in collaboration with the University of Pittsburgh Medical Center created the Pittsburgh Promise in 2006. The objective was to retain higher-income students in the district and to motivate students to graduate from high school by providing financial incentives. The Promise program was modeled after the Kalamazoo Promise, which was the first of these types of programs adopted in the United States. The Pittsburgh Promise provides graduates of the Pittsburgh Public Schools with up to $40,000 as a scholarship to pursue higher education.
¹ Note that since 2013 high school students have taken the Keystone Exam in place of the PSSA for their standardized testing.
To be eligible you must be a student
[Figure 18.9: two panels showing the percent of students in each PSSA category (below basic, basic, proficient, advanced) for grade 11 in the academic years 2006-2007, 2007-2008, 2008-2009, and 2009-2010. Panel A: reading. Panel B: math.]
FIGURE 18.9. Pittsburgh Public Schools: Eleventh-grade PSSA scores in reading (A) and math (B). (Pennsylvania Department of Education)
in the district and a resident of Pittsburgh continuously since at least the ninth grade; earn a minimum GPA of 2.5; maintain a minimum attendance record of 90 percent; and earn admission to any accredited public or private postsecondary school located in Pennsylvania. The University of Pittsburgh Medical Center provided up to $100 million in seed money for the campaign, which raised a total of $250 million. In 2008 the Promise started to give out its first scholarships. Philadelphia and Pittsburgh are fairly representative of many other large school districts in the US. Many large urban school districts are also struggling with poor student performance. These districts confront the challenges of increasing student achievement and adapting the teacher workforce to changing needs. The purpose of the remaining sections of this chapter is to understand potential reform options and discuss the empirical evidence regarding the efficacy of these options.
18.7 Early Childhood Education

Early childhood education is targeted to young children before they enter kindergarten. The idea is to extend the formal teaching of young children by professionals in settings outside of the home. We observe large differences in school readiness scores for young children. The gap in student achievement then widens substantially during elementary and middle school. Given these stylized facts, many researchers and policymakers are convinced that early interventions are essential.

The HighScope Perry Preschool Project provided early childhood education for disadvantaged children starting in 1962. It focused on children from poor neighborhoods who did poorly on district-wide, standardized tests and also received low scores in their IQ assessments. In contrast to existing nursery school programs, the Perry program focused on a child's intellectual maturation rather than social and emotional advances. The program was evaluated in a randomized controlled trial of 123 children in five cohorts. Fifty-eight of the children were randomly assigned to a treatment group. Notice that the sample size is not very large. Moreover, there were some deviations from randomization in the implementation of the experiment. The main advantage of the Perry experiment is that we can observe not only short-run outcomes in school but also long-run outcomes at ages twenty-seven and forty.

Heckman, Moon, Pinto, Savelyev, and Yavitz (2010) provide a detailed analysis of the returns to the Perry program. They find that the program promoted educational attainment in several ways. First, it increased total years of education. Second, it increased rates of progression to a given level of education. The effects are larger for females than males. Treated females received less special education. They progressed more quickly through grades. They earned higher GPAs. Additionally, they attained higher levels of education than their control-group counterparts.
For males, however, the impact of the program on schooling attainment is weak at best. Despite the small effect on educational attainment, Heckman and his colleagues report large effects on lifetime earnings for both females and males. Gross earnings before the age of twenty-eight are low in both the treatment group and the control group, and the differences between the groups become significantly larger later in life. Between the ages of twenty-eight and forty-one (forty-two and sixty-five) gross earnings are $83,000 ($160,000) higher for males and $66,000 ($122,000) higher for females in the treatment group relative to the control group. In addition to the income gains by the participants, higher earnings translate into higher absolute amounts of tax payments that are also beneficial to the general public. They also find that crime reduction is a major benefit of the Perry program. Overall, they estimate that the social rate of return to the Perry program is in the range of 7-10 percent. A significant portion of these returns is due to lower criminal activities in the treatment sample. Lower crime rates then translate into higher earnings during the more productive years in the labor market. It is also hard to be a good parent when you are in prison. As a consequence, lower crime rates also translate into more stable families and better outcomes for children. Garcia, Heckman, Leaf, and Prados (2016) analyze the costs and benefits from two other early childhood programs evaluated by randomized trials: the Carolina Abecedarian Project (ABC) and the Carolina Approach to Responsive Education (CARE). Both were launched in the 1970s and had long-term follow-ups
through the participants' mid-thirties. The programs started when children were eight weeks old and engaged participants to age five. The study estimates an overall rate of return of 13.7 percent per annum for these two programs. The Perry program and the ABC-CARE programs provide compelling evidence that early childhood education is desirable. Since these early experiments, the federal government in collaboration with state and local governments has significantly increased the availability of early childhood education programs. Most urban school districts offer some preschool and kindergarten programs through the Head Start program of the US Department of Health and Human Services. It is intended to promote healthy development for low-income children, ages three to five. A large number of these programs are similar in design to the Perry program. Families must earn less than 130 percent of the federal poverty level to qualify for the program. Head Start, as the program currently operates, costs about $9,000 per participating child per year. These costs are typically covered by a combination of federal, state, and local funds. There is some experimental evidence on Head Start that is based on a recent randomized experimental evaluation. Starting in 2002 nearly 4,700 children between the ages of three and four participated in the evaluation and were assigned to either a treatment or control group. The households in the control group could participate in other local preschool programs if slots were available. Hence they may have also been exposed to some early childhood education. However, it is fair to assume that the majority of children in the control group received much less exposure to formal early childhood education. There were eighty-four Head Start centers that participated in the experiment. They were selected to be representative of all programs in operation across the country. Most of these programs were popular and had waiting lists.
Ludwig and Phillips (2008) summarize the findings in a report to the New York Academy of Sciences. They conclude that the results regarding the effectiveness of Head Start are disappointing. They find that the program improved cognitive skills, although many of the point estimates reported in the paper are not statistically significant. In summary, quality early childhood education programs are likely to provide persistent boosts in a variety of noncognitive and emotional skills, even if they may not result in large and significant gains in cognitive skills in the short run. The Perry and the ABC-CARE experiments suggest that these gains translate into better education, health, and economic achievement as well as lower crime rates. These gains in labor market productivity and overall welfare do not necessarily require persistent gains in test scores. To be clear, there are few valid alternatives to high-quality early childhood education. Nevertheless, we need more research to determine what kind of early childhood programs are most promising.
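The rates of return quoted in this section (7-10 percent for Perry, 13.7 percent for ABC-CARE) can be read as internal rates of return: the discount rate at which the present value of a program's benefits equals the present value of its costs. The following is a minimal sketch of that calculation. The function names and the cash-flow stream are hypothetical and chosen only for illustration; they are not taken from the studies cited above, and the sketch assumes that the net present value changes sign exactly once between the two bounds.

```python
def npv(rate, cash_flows):
    """Net present value of cash_flows[t], received t years from now."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def internal_rate_of_return(cash_flows, lo=0.0, hi=1.0, tol=1e-9):
    """Find the discount rate with zero NPV by bisection on [lo, hi].

    Assumption: NPV has opposite signs at lo and hi and crosses zero once.
    """
    f_lo = npv(lo, cash_flows)
    f_hi = npv(hi, cash_flows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid              # the root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # the root lies in [mid, hi]
    return (lo + hi) / 2.0


# Hypothetical stream (not data from any study): two years of program costs,
# fourteen years without measurable benefits, then forty years of benefits
# such as higher earnings, higher tax payments, and lower crime costs.
program_flows = [-10_000.0, -10_000.0] + [0.0] * 14 + [3_000.0] * 40
rate = internal_rate_of_return(program_flows)
print(f"Illustrative internal rate of return: {rate:.1%}")
```

Because the costs of early childhood programs come early and the benefits arrive decades later, the computed rate is sensitive to how long the benefits persist, which is one reason long-run follow-ups matter for these evaluations.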
18.8 Reforming Urban Primary and Secondary Schools

18.8.1 Accountability and No Child Left Behind

The George W. Bush administration introduced the No Child Left Behind Act in 2001 in order to incentivize public schools to improve educational outcomes through assessment-based federal funding. The act set procedures to measure
the academic progress of students, and rather than establishing a set national standard, it allowed each state to develop its own educational standards. To receive federal school funding, states had to give these assessments to all students at select grade levels. Annual standardized tests were created to assess the effectiveness of public schools and their teachers. This allowed each state to define which teachers and school districts were "highly qualified" or "in need of improvement" and tracked reports over time to determine where restructuring was necessary. Also, if a school had been labeled as less than proficient for consecutive years, the act allowed students the opportunity to transfer to better-performing local schools. While the legislation increased the accountability of school districts to perform and document improvements, it was criticized for its narrow focus on standardized testing. When test performance is directly linked to federal grants, teachers may be driven to structure classes only around anticipated test material and neglect many other essential topics and project-based processes. After a continued assessment, there was evidence of improvement in some districts, but many others continued to struggle, and fewer students than expected used the opportunities to switch schools. NCLB was replaced by the Every Student Succeeds Act in 2015. The objective of the new act was to implement varied evaluations of schools, broadening the NCLB's focus on reading and math scores. The new act gives even more power to individual states to administer their assessments based on other factors like student engagement, completion of advanced coursework, graduation rates, school safety, and postsecondary readiness. States are able to prioritize what areas need improvement and focus their funding and efforts toward these. The act also gives greater importance to educationally disadvantaged groups, like special education students and English-learners.
Schools with underperforming subgroups are labeled as "focus schools," and the bottom 5 percent of schools are labeled as "priority schools." These labels allow states to administer specialized programs to help the specific schools improve.
18.8.2 Classroom Size Reductions

Another common policy intervention that is aimed at increasing achievement is based on class size reduction. The basic intuition is simple: students will obtain more attention from teachers and suffer from fewer distractions in smaller classrooms. As we have seen before, there is some evidence that smaller class sizes may be useful. Here we consider the evidence that is based on a famous social experiment conducted in Tennessee. The Tennessee legislature considered a statewide initiative to reduce class sizes in 1983. Policymakers and politicians were concerned whether smaller class sizes would result in consistent improvement in student achievement. To evaluate the effectiveness of class size reductions, the legislature created Project STAR. The idea of this experiment was to test whether class sizes that averaged fifteen students in kindergarten through third grade would result in improved student performance. The outcomes were compared to students who attended classes that averaged twenty-four students and were the historical norm. The experiment randomly assigned children to either a small-sized class of approximately thirteen to seventeen students or to a regular-sized class of
twenty-two to twenty-five students. A third option was a regular-sized class with a full-time aide. Seventy-nine schools participated in this experiment. These schools were divided into four geographic groups: inner city, suburban, urban, and rural. Researchers measured student achievement each year based on two tests: the nationally normed Stanford Achievement Test (SAT) and the curriculum-based Tennessee Basic Skills First Test. In addition, they administered several surveys and questionnaires to participating teachers and school site administrators. Roughly one-third of the children in the sample continued to participate in the same class for three or four years. The evaluation linked data on children across school years to determine whether students in small classes made greater achievement gains than children in regular-sized classes and whether those gains were cumulative. Follow-up studies determined whether achievement gains experienced at the lower grade levels persisted throughout grades four through eight. What did we learn from this large-scale experiment? Krueger (2003) finds that children in small classes consistently outperformed children in large classes. At the end of third grade, students in small classes in inner-city schools, on average, scored 18 points higher on the SAT reading test than their counterparts in regular-sized classes. This compared to differences in suburban schools of 6 points, rural schools of 7 points, and urban schools of 4 points. Comparable differences also existed for the SAT math test. Krueger also finds that students retained their achievement advantage over children in large classes in later years. There were, however, no significant additional gains in subsequent years relative to the children in regular classes.
Follow-up studies of students in the demonstration show that students from small classes continued to outperform children from large classes through the eighth grade, although the difference by then was quite small. Critics of this experiment have argued that the evidence applies to a specific set of larger elementary schools in Tennessee. It is not known whether the results can be generalized. The experiment was based on a large class size reduction that moves classes down considerably below the levels commonly used in California, for example. These large reductions are undoubtedly very expensive, as we have seen in chapter 4. The positive impacts of class size reduction appear to be limited to kindergarten and possibly first grade in elementary school. Nevertheless, we conclude that smaller class sizes seem to work. It is not surprising that most private schools and high-income public schools base instruction on small classes that allow for more student-teacher interaction.
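To see why reductions of the size tested in Project STAR are expensive, it helps to count classrooms. The short sketch below compares the number of classes needed at the two average class sizes reported above (twenty-four and fifteen students). The enrollment figure and the function name are hypothetical and serve only as an illustration; the sketch also assumes one teacher per class and ignores the additional classroom space that smaller classes require.

```python
import math


def classes_needed(enrollment, class_size):
    """Number of classes (one teacher each) needed to seat all students."""
    return math.ceil(enrollment / class_size)


enrollment = 1200  # hypothetical number of K-3 students in a district

regular = classes_needed(enrollment, 24)  # historical norm in Tennessee
small = classes_needed(enrollment, 15)    # small classes in Project STAR

extra_share = small / regular - 1.0
print(f"Regular classes: {regular}, small classes: {small}")
print(f"Additional classes and teachers required: {extra_share:.0%}")
```

Under these assumptions, moving from twenty-four to fifteen students per class raises the number of classes, and hence teachers, by roughly 60 percent for any given enrollment.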
18.8.3 Magnet Schools

Magnet schools are public schools with specialized courses or curricula. "Magnet" refers to how the schools draw students from across the normal school attendance zones that feed into regular public schools. Magnet schools emerged in the United States in the 1960s as one means of remedying racial segregation in public schools. Some magnet schools have a competitive entrance process, requiring an entrance examination, interview, or audition. Other magnet schools admit all students who apply or use a lottery system, or a system combining some elements of a competitive entrance exam and a lottery. Most magnet schools concentrate
on a particular discipline or area of study while others, such as International Baccalaureate schools, have a more general focus. Many of the most popular magnet programs are oversubscribed, so school districts run a variety of lotteries to determine access. A lottery can be viewed as a randomized experiment, where randomization occurs at the admission stage and not at the participation stage. The lottery is binding in the sense that students with lower numbers are accepted and higher-numbered students are rejected. If the lotteries are fair and balanced, the randomization should create treatment and control groups that have similar observed and unobserved characteristics. Suppose the individuals comply with the intended treatment: all students who win the lottery attend the magnet school and all students who lose the lottery do not attend a magnet school. From the lottery winners, we obtain the mean outcome for the "treatment group." From the lottery losers, we obtain the mean outcome for the "control group." Hence lotteries give rise to a quasi-experimental design with proper randomization. Cullen, Jacob, and Levitt (2006) study oversubscribed magnet schools in Chicago. Engberg, Epple, Imbrogno, Sieg, and Zimmer (2014) focus on Pittsburgh. Both studies find that oversubscribed magnet programs have no significant effect on test scores and student achievement. However, they find that these programs do have significant positive effects on behavioral outcomes, such as attendance and suspensions. Moreover, magnet schools are valued by parents, and hence they help urban school districts attract and retain families with high income levels.
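The comparison of lottery winners and lottery losers described above can be sketched in a few lines. The example computes the difference in mean outcomes between the two groups together with a conventional standard error. The function name and the attendance figures are hypothetical and are not taken from the Chicago or Pittsburgh studies; the sketch also assumes full compliance, as in the text. If some winners do not enroll or some losers enroll elsewhere, the same difference measures only the effect of winning the lottery, not the effect of attending the school.

```python
from math import sqrt
from statistics import mean, variance


def lottery_estimate(outcomes, won_lottery):
    """Difference in mean outcomes between lottery winners and losers.

    outcomes:    one outcome per applicant (for example, attendance rate)
    won_lottery: True if the applicant won the admission lottery
    Returns the estimated effect and its standard error.
    """
    winners = [y for y, w in zip(outcomes, won_lottery) if w]
    losers = [y for y, w in zip(outcomes, won_lottery) if not w]
    effect = mean(winners) - mean(losers)
    # Standard error for a difference in means of two independent samples.
    se = sqrt(variance(winners) / len(winners) + variance(losers) / len(losers))
    return effect, se


# Hypothetical attendance rates (percent) for eight lottery applicants.
attendance = [95.0, 93.0, 96.0, 92.0, 90.0, 91.0, 89.0, 94.0]
won = [True, True, True, True, False, False, False, False]

effect, se = lottery_estimate(attendance, won)
print(f"Estimated effect of winning: {effect:.1f} points (s.e. {se:.1f})")
```

Randomization is what justifies reading this simple difference as a causal effect: because admission is decided by lottery, winners and losers should not differ systematically in motivation or family background.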
18.8.4 Charter Schools

Charter schools have gained in popularity in the US since the first one was implemented in Minnesota in 1991. Now more than 2.8 million students attend charter schools throughout the country. Forty-three states have allowed charter schools to operate. The states with the largest public charter school enrollment as a percentage of all public school students as of fall 2015 are Arizona (16%), Colorado (12%), Utah (10%), Louisiana (10%), and Florida (10%). The District of Columbia is the highest of all regions at 43 percent. Charter schools were set up to offer opportunities to students who would have been underserved in traditional district-run schools. They receive government funding similar to their public school counterparts, but they are given more freedom to structure curriculum and teachers' practices while adhering to less constrictive labor regulations. Charter schools usually receive approximately 80-90 percent of the funding received by traditional public schools. They are subject to fewer federal regulations and are allowed to rent facility space, which avoids large start-up costs. Additionally, corporations have stepped in to provide capital at lower rates as charter schools gain successful reputations. Charter schools also give families more ability to choose where they want their children to attend school. Oversubscribed schools must use a lottery system to ensure equal opportunity. Do charter schools improve student achievement? Angrist, Cohodes, Dynarski, Fullerton, Kane, Pathak, and Walters (2012) focus on Boston. They exploit random assignment to oversubscribed charter schools and provide evidence that attending oversubscribed charter schools improved educational outcomes.
However, oversubscribed charter schools may not be representative of typical charter schools. By definition, they are the most popular ones. If parents know what they are doing, oversubscribed charter schools are also likely to be better than average charter schools. Another shining example of what is possible is the Harlem Children's Zone Promise Academy. This is a charter school where children from a disadvantaged neighborhood routinely outperformed their peers around New York City and New York State. While the Promise Academy has found ways to hire highly qualified teachers and has trained them "how to think rather than how to teach," other charters fall short in implementing beneficial management techniques. At a minimum, effective charter schools provide some parents of low-income and minority students the opportunity to get a good education for their children. Why should only middle- and high-income households have the privilege to choose good schools for their children? An interesting question, then, is the following: Can we use the insights of "successful" charter schools to help improve poorly performing urban schools? Fryer (2014) examines the impact on student achievement of implementing five best practices from high-performing charter schools in low-performing, traditional public schools in Houston, Texas. These five practices are increased instructional time, more effective teachers and administrators, high-dosage tutoring, data-driven instruction, and a culture of high expectations. Fryer's findings show that injecting best practices from charter schools into traditional Houston public schools significantly increased student math achievement in elementary and secondary schools. The effects are approximately 0.15-0.18 of a standard deviation per year. Fryer (2014) also finds that the intervention had only a small effect on reading achievement.
Similar practices are found to significantly raise math achievement in analyses for public schools in a field experiment in Denver and a program in Chicago. We thus conclude that charter schools serve as laboratories for experimentation in the classroom. We can learn from both the successes and failures of these schools. However, it is often not easy to transfer the successful strategies of small schools with highly motivated teachers and administrators to a large-scale public school system.
18.8.5 Teacher Quality and Compensation

Another option is to improve the quality of the teacher pool. We know from the Teach for America program that young, bright college graduates can make a big impact. Teach for America (TFA) is a nonprofit organization that enlists graduates from top colleges and universities in the US to teach two years in public or charter schools in low-income communities. Clark and coauthors (2013) find that TFA has been very effective in raising math scores. But again, that program cannot be easily scaled up to solve the problem of creating a sufficient supply of high-quality teachers across the United States. To systematically improve the quality of the teacher pool would require substantially higher teacher salaries, which cities and states have been loath to implement. More political support has recently been given to policies intended to motivate the existing staff to perform better, to reward high-performing teachers,
and to hold low-performing teachers to account. If there are schools that consistently perform poorly, the supervising agent can close these schools and reassign teachers to new schools. The key assumption in this reform strategy is that we can measure teacher performance or teacher quality. It is popular to use "value-added" measures or teacher fixed effects in student test score equations to measure teacher quality. The question is then whether the current pool of teachers will respond to positive financial incentives. Negative incentives are much harder to implement in public schools since it can be difficult to fire or discipline bad teachers. A teacher's salary typically depends on the attained education level and years of service in the profession. Recently, some districts have started to experiment with performance-based pay. President Obama implemented the Race to the Top program in 2009, a Department of Education federal grant of $4.35 billion to incentivize reform and improvements in public schools. A large portion of the grants was distributed as teacher pay based on individual teacher performance. Performance was evaluated using a complicated formula that mainly focused on student performance growth on tests throughout a school year, while controlling for factors such as a student's family income and race. The program created a measurement to hold teachers accountable for their jobs with the end goal of improving American public education. Districts were now able to base 40 percent of teacher pay on these metrics, and some began to deny tenure and fire unproductive teachers. As with any federal legislation, the evaluation system received criticism regarding the fairness of its measurements and the program's overall effectiveness. RAND Corporation's research found that it did not improve student achievement at any grade level, nor did it change the behavior of teachers.
Many other concerns centered on whether underlying social issues and other uncontrollable factors were taken into account in these metrics. Some teachers have more students with special needs or at-home disadvantages that disproportionately pull down their scores, despite any extra effort they give to help these students. Other critics argue that, eventually, highly qualified teachers will move to affluent school districts, where their students' access to advantageous resources positively affects the teachers' annual evaluation scores, and the students in underprivileged areas will be systematically deprived of effective teaching. The question remains as to how to reward teachers for actively improving teaching methods without adding disincentives into the system that discourage people from obtaining a teaching position. Teacher tenure is a key part of this line of work, enticing individuals with the certainty of job security. However, tenure has been argued to neglect true teaching value and the interests of the students. Removing tenure-protected teachers involves an expensive court process, usually resulting in schools keeping poorly performing teachers. Three states (Florida, North Carolina, and Kansas) have eliminated teacher tenure, and ten other states have banned teacher tenure and seniority as the main reason behind deciding layoffs. Proponents of tenure claim that it allows teachers to develop innovative techniques without facing the threat of losing their job, while opponents argue that it increases complacency and decreases effort. School districts must decide how to recruit qualified teachers while also protecting the promised educational opportunities given to each student.
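The "value-added" idea discussed in this section can be illustrated with a deliberately simplified gain-score calculation: a teacher's value added is measured as the average test score gain of his or her students relative to the average gain of all students. This is only a sketch under strong assumptions. It ignores the controls for family income and other student characteristics mentioned above, and it is not the formula used in Race to the Top or in any particular district. The function name, teacher labels, and scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def gain_score_value_added(records):
    """Simplified value-added measure based on test score gains.

    records: iterable of (teacher_id, prior_score, current_score) tuples,
             one per student.
    Returns a dict mapping each teacher to the mean gain of that teacher's
    students minus the mean gain across all students.
    """
    gains = defaultdict(list)
    for teacher, prior, current in records:
        gains[teacher].append(current - prior)
    overall = mean(g for teacher_gains in gains.values() for g in teacher_gains)
    return {teacher: mean(g) - overall for teacher, g in gains.items()}


# Hypothetical scores: (teacher, last year's score, this year's score).
students = [
    ("Teacher A", 50.0, 60.0),
    ("Teacher A", 40.0, 50.0),
    ("Teacher B", 50.0, 52.0),
    ("Teacher B", 60.0, 62.0),
]

for teacher, value_added in gain_score_value_added(students).items():
    print(f"{teacher}: {value_added:+.1f} points relative to the average gain")
```

The fairness concerns raised above show up directly in such a measure: if the students assigned to one teacher would have gained less under any teacher, the simple gain-score comparison attributes that difference to the teacher unless student characteristics are controlled for.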
FIGURE 18.10. After-school programs. (Photo by author)
18.8.6 Vouchers and Private Schools

Private schools are a popular alternative to public schools in the United States. As of 2015, 5.8 million students, or 10.2 percent of all students, were enrolled in private elementary or secondary schools. Note that 36 percent of private school students were enrolled in Catholic schools, 39 percent were enrolled in other religiously affiliated schools, and 24 percent were enrolled in nonsectarian schools. In the school year 2011-2012 average tuition in private schools was $10,940: average Catholic school tuition was $7,020; other religiously affiliated schools charged, on average, $8,850; and nonsectarian private schools had average tuition of $21,910. Many conservative policymakers advocate the use of vouchers to increase school choice, allowing students to move from public to private schools. Note that the Pell grant system that we use in higher education is basically a voucher system that is not controversial. When it comes to primary and secondary education, vouchers remain highly controversial. Most policymakers agree that school choice is facilitated by a per child voucher that can be redeemed at any school of choice, private or public. In a voucher system, parents can more easily pull their students out of failing public schools and send them to private alternatives. Advocates of school vouchers argue that increased competition between private and public schools leads to higher achievement and a lower-cost provision of quality education (Friedman, 1955).
Rouse (1998) provides an evaluation of the Milwaukee Parental Choice Program, the longest-running of several voucher programs in US cities. Her findings suggest that there are some potential benefits from private school competition, particularly in math scores. A common concern about voucher programs is that the best students will disproportionately exit public schools because private schools will recruit them (Epple and Romano, 1998). It seems odd to deny low-income households the opportunity to find good schools for their children, a privilege that is taken for granted by most middle- and high-income households. A system without vouchers or charter schools typically forces most low-income households to send their children to poorly performing public neighborhood schools. Why are low-income children the only ones that are stuck waiting for local public schools to improve? What if Godot never shows up?
18.9 Conclusions

Investments in human capital involve intertemporal trade-offs and depend on the costs of education, the returns to human capital, and the imperfections of credit markets. One problem is that it is often difficult to borrow against future labor earnings. Moreover, some parents may not sufficiently invest in the education of their children. Finally, there are important positive externalities in the workplace and society that arise from higher levels of education. As a consequence, there is some scope for public intervention in the provision of education. There are two broad types of interventions: the public provision of free or highly subsidized education (public schools and universities); and the provision of loans, grants, and vouchers for those attending private schools and universities.

We have seen that the primary and secondary education system in the US is segregated based on socioeconomic status, race, and religion. This segregation results in unequal access to economic opportunities. Problems are most acute in large urban public school districts that are struggling to educate their students. The lack of success of urban districts is generally perceived as the main impediment to growth and prosperity for many large US cities. If public schools in inner cities are not sufficiently attractive for most middle-class households, these cities have a serious problem. Only high-income households who can afford to send their children to private schools, households without children, and large numbers of low- and moderate-income households who have few decent and affordable housing choices in the suburbs will live in inner cities. High-income households that send their children to private schools have few incentives to invest time or resources in improving urban public schools. Most inner-city schools, with the exception of a few magnet and gifted schools, will then primarily serve children from low- and moderate-income families.
That creates a difficult educational environment to say the least. There are many troubling questions: How can we fix urban schools if we cannot attract a sufficient number of moderate- and high-income households, who provide some positive peer externalities, to attend these schools? How can we convince low- and moderate-income households to invest more time and
resources into their children? How can we provide incentives for high-quality teachers to stay in urban districts that have been plagued with behavioral problems? How can we make sure that high-achieving students who attend urban districts have the incentives and the means to finish high school and go to college? There is no magic bullet that will solve the problems of many large urban school districts. Nevertheless, we can be cautiously optimistic that improvements are possible. We have seen in this chapter that there is no substitute for high-quality early childhood education. There is some compelling evidence that carefully designed, high-quality programs have high returns and benefit-to-cost ratios. Moreover, these programs are relatively affordable. The interventions have to start at an early age. The longer we wait to address the existing gaps in skills and achievement, the harder it gets. There is also no substitute for a long-term, process-driven strategy that tries to improve the quality of education within urban schools. At a minimum, we need to use intergovernmental transfers to provide the funds and resources so that students in the urban school districts have a chance to succeed. Without sufficient funds, urban school districts cannot attract and retain high-quality teachers, principals, and administrators. The financial burden of educating the majority of students of low-income families should not fall on a small number of high-income households that prefer to live in central cities and urban school districts. Unfortunately, many states do not provide the resources to urban school districts that they need to turn the schools around. As a consequence, many children that attend these poorly performing urban schools are left behind. Money alone will not solve the problems of most urban school districts. 
Most districts are in need of some serious reforms and will need to make significant changes in their curricula and teaching practices to help students succeed. Elementary and middle school students are likely to benefit from mentoring programs, which can provide a substitute for parental involvement. Programs that focus on noncognitive or vocational skills are promising for high school students. Moreover, partnerships between public schools, firms, and trade associations can help high school students transition from school to work. More research is needed to determine the efficacy of these types of interventions. Not all public school districts will have the willingness or capacity to change. There are often entrenched interest groups that are primarily interested in preserving the current status quo. Vouchers and charter schools are, therefore, necessary to provide incentives for local politicians, administrators, and teachers who are reluctant to embrace change. If the reform of public schools fails or does not produce significant improvements, parents should have the opportunity to opt out of the public school system and look for alternatives. Increased school choice tends to lead to an exodus of the most talented students from traditional public schools. If peer effects are important, the departure of high-income or high-ability students from urban schools may adversely affect those who remain. This is the classic peer effect dilemma. Nevertheless, there is no substitute for increased school choice and competition if urban school districts are unwilling to undergo or incapable of internal reform. Opponents of school choice need to ask themselves why low-income, inner-city parents should be the only ones who have no choice in determining educational opportunities for their children.
18.10 Debate: Pay for Performance for Teachers
The pro side should argue that a significant fraction of teachers' compensation should be based on performance measures. The con side should argue in favor of the current system, which pays teachers a fixed salary. The following questions may help structure the debate:
1. In what other professions do we pay based on performance?
2. Are these professions similar to or different from teaching?
3. How do we measure teacher performance?
4. What are some advantages of the commonly used measures?
5. What are some disadvantages of the commonly used measures?
6. What are the consequences for hiring and retaining teachers?
7. What are the consequences for overall compensation?
8. What's the evidence that suggests that pay for performance improves student achievement and attainment?
18.11 Problem Sets
1. Suppose a researcher has provided compelling empirical evidence that students who attend schools with smaller class sizes are less likely to commit crimes. Briefly discuss how you would incorporate this additional information into the benefit-cost analysis of optimal class size discussed in chapter 4. Does the optimal class size increase or decrease as a result of these additional positive effects? Explain.
2. What are magnet schools? How are magnet schools financed? How do they increase school choice?
3. Provide two arguments in favor of adopting a test-score-based, pay-for-performance approach for teacher compensation. What policy would you implement if you were the decision maker? Explain.
4. What was the basic idea behind the Perry preschool experiment? Discuss one of the key problems associated with the design used in the Perry experiment.
5. Mandatory statewide student assessment programs are commonly used to evaluate public schools and to determine whether they are improving. Explain one potential problem associated with these assessment systems.
6. It has been suggested that we should pay teachers based on the measured achievement gains of their students. Suppose the achievement test that you use will only give you a ranking of students in a given class or cohort but not an absolute (cardinal) measure of the achievement gain that can be compared across time or cohorts. How does that affect your ability to pay teachers based on measured student performance?
7. Suppose you are evaluating the effectiveness of charter schools by regressing a standardized test score $T_i$ on student characteristics $X_i$ and a dummy variable $D_i$ that is equal to 1 if student $i$ attended a charter school and 0 otherwise:
$$T_i = X_i'\beta + \alpha D_i + u_i \qquad (18.15)$$
where $u_i$ is the error term in the regression.
a) Explain how the coefficient $\alpha$ is often used as a measure of the effectiveness of charter schools.
b) Rewrite this model using two potential outcomes. (Hint: The potential outcome model is discussed in appendix A.5 at the end of this book.)
c) Derive the average treatment effect and the average treatment effect on the treated.
d) Provide an example of a variable that is likely to have an impact on test scores but is very difficult to measure. How does the existence of such a variable affect the interpretation of the least squares estimate of $\alpha$?
8. Consider a sharp regression discontinuity design in which students are admitted to a gifted program if their ability $Z$ is above a threshold given by $Z_0$, i.e., $\Pr\{D = 1 \mid Z\} = 0$ for all $Z < Z_0$ and $\Pr\{D = 1 \mid Z\} = 1$ for all $Z \geq Z_0$, where $Z$ is a discrete random variable. (See appendix A.7 for a detailed discussion of the regression discontinuity design.)
a) Suppose you do not know $Z_0$; how would you determine its value?
b) How can you estimate the treatment effect in this case?
c) How would you determine whether students above and below the cutoff point are "similar," i.e., have similar observed characteristics? Why is it important that individuals on either side of the cutoff are not that different?
d) Provide another example of an empirical application that fits into the regression discontinuity design framework. Explain.
19
Crime and Public Safety
19.1 Motivation
Keeping cities safe and protecting residents from crime is one of the most challenging and expensive tasks faced by every city. Most forms of crime are local. Thus it makes sense that public safety is primarily a responsibility of cities and their municipal police forces. Municipal police range from one-officer departments in small towns to the 36,600-person-strong New York City Police Department. Most larger municipalities in the US have their own police departments. In small municipalities and more rural areas, law enforcement is typically provided by the sheriff's department and the county police.
Some forms of crime, however, require a more centralized response. As a consequence, the US federal government also maintains a significant presence in law enforcement. The federal police forces are part of the US Department of Justice and the Department of Homeland Security. They include the US Marshals Service, the Federal Bureau of Investigation, the Drug Enforcement Administration, the Bureau of Alcohol, Tobacco, Firearms and Explosives, and the Federal Bureau of Prisons. US states also operate government agencies that provide law enforcement services. State police and highway patrol are part of the State Department of Public Safety. Duties include uniform patrol, crash investigation, criminal investigation, and response to other criminal incidents. As of October 2016 the state police of Pennsylvania, for example, employed 4,233 state troopers.
Crime is prevalent in many large US cities. Table 19.1 reports the crime rate in large cities in 2013. The crime rate is typically measured as the number of crimes per 100,000 individuals. The statistics are compiled by the FBI Uniform Crime Reports. The table reports a variety of violent crimes (murder, rape, robbery, assault) and nonviolent or property crimes (burglary and theft).
Chicago has the dubious reputation of being the murder capital of the US although many other large cities also have high murder rates. Property crime is also prevalent in many cities. Surprisingly, San Francisco had the highest property crime rate of the cities in table 19.1. Crime imposes large costs not only on victims but also on taxpayers. Figure 19.1 shows that public safety expenditures ranged from $500 to $1,200 per capita
Table 19.1. Crime Rates in Large Cities (crimes per 100,000 individuals)

State           City           Population  Violent  Murder   Rape  Robbery  Assault  Property  Burglary   Theft
New York        New York        8,550,861    585.8     3.0   14.0    198.2    357.2    1518.7     164.9  1267.4
California      Los Angeles     3,962,726    634.8     7.1   55.7    225.9    346.0    2359.6     407.8  1544.2
Illinois        Chicago         2,728,695    903.8    28.7   52.5    353.6    480.2    2946.3     482.0  2089.7
Texas           Houston         2,275,221    966.7    13.3   43.3    451.7    458.3    4397.5     872.8  2928.7
Pennsylvania    Philadelphia    1,567,810   1029.0    17.9   84.3    431.5    495.3    3147.4     515.6  2310.7
Nevada          Las Vegas       1,562,134    920.7     8.1   70.9    320.7    521.0    2995.3     952.3  1537.0
Arizona         Phoenix         1,559,744    593.8     7.2   65.1    193.6    327.8    3491.3     820.5  2198.3
Texas           San Antonio     1,463,586    587.2     6.4   71.7    135.7    373.4    5029.5     794.8  3812.8
California      San Diego       1,400,467    398.6     2.6   40.4     98.4    257.1    2082.0     366.2  1351.9
Texas           Dallas          1,301,977    694.2    10.4   60.1    320.8    302.8    3440.2     854.2  2002.8
California      San Jose        1,031,458    329.6     2.9   36.4    110.5    179.8    2427.1     474.7  1273.7
Hawaii          Honolulu          999,307    243.9     1.5   31.8     89.7    120.9    3110.7     428.7  2294.6
Texas           Austin            938,728    372.5     2.5   51.9     99.0    219.2    3771.0     532.6  2990.0
North Carolina  Charlotte         877,817    677.6     6.9   24.6    221.8    424.2    3767.9     769.5  2744.4
Florida         Jacksonville      867,258    648.3    11.2   54.3    161.2    421.6    3673.0     701.3  2704.6
California      San Francisco     863,782    776.8     6.1   39.8    417.9    312.9    6138.0     600.4  4737.1
Indiana         Indianapolis      863,675   1288.0    17.1   78.4    440.2    752.3    4790.8    1283.5  2929.5
Ohio            Columbus          847,745    546.3     9.1   95.1    264.2    177.9    3934.3     851.6  2715.6

Source: FBI Uniform Crime Reports.
FIGURE 19.1. Public safety expenditures in large US cities, mean annual spending 2005-2014. Spending in dollars per capita (A) and as a share of general expenditures (B). (Fiscally Standardized Cities database/Lincoln Institute of Land Policy)
FIGURE 19.2. Police officer. (Photo by author)
in large US cities during the period between 2005 and 2014. Most cities spent 10-15 percent of their total expenditures on public safety. Overall, local governments alone spent $173 billion on police and fire protection compared to $549 billion on education in 2015. Crime also imposes large costs on the criminal justice system. The United States has the highest incarceration rate in the world. As of 2013 a total of 7 million individuals were behind bars, on probation, or on parole. A total of 2.2 million people were incarcerated in federal and state prisons as well as county jails. In recent decades, the US has experienced a surge in its prison population, quadrupling since 1980. This is largely due to mandatory sentencing that came about during the "war on drugs." Of those incarcerated in federal prisons, 57 percent were sentenced for drug-related offenses. Those of you who have watched The Wire or Breaking Bad may have some doubts whether the US will win the war on drugs any time soon.1
1 The Wire is a crime drama series set and produced in Baltimore, Maryland. It was created and primarily written by former police reporter David Simon. Breaking Bad is another highly acclaimed crime drama, which was created by Vince Gilligan and takes place in Albuquerque, New Mexico.
Prisons are expensive. Mai and Subramanian (2017) report that among the forty-five responding states, the total state expenditure on prisons was just under $43 billion in 2015. The state of New York spends $60,000 per inmate and around $18,000 per student. Compare those costs to the roughly $80,000 it takes to pay for a year in an assisted living facility. What do we conclude from that? Prison guards are expensive: not quite as expensive as nurses, but definitely not cheap.
In this chapter, we use the tools of economic analysis to gain some insights into urban crime. Our focus is on economically motivated crime, acknowledging the fact that economists have little to say about many forms of violent crime. We develop a model of criminal behavior that allows us to structure our thoughts with respect to possible policy interventions. We then turn to the empirical literature to assess the effectiveness of some popular policies.
19.2 An Economic Model of Crime
To help organize our thinking about how to best manage a city's efforts to deter urban crime, we need an analytic framework. We consider a model that is due to Freeman, Grogger, and Sonstelie (1996). The model focuses on crime that is primarily motivated by economic interests, such as drug sales, illegal gambling, racketeering, prostitution, extortion, people smuggling, and other forms of organized crime.2
Table 19.2 summarizes the notation of the main variables of our model. Let n denote the number of criminals committing crimes. Let m denote the number of police officers seeking to prevent crime. F is the monetary cost that criminals bear when they are caught, so we can think of this term as capturing fines and jail time. Let p(n, m) denote the probability of being caught. Note that it increases in m and decreases in n. Finally, let Z(n) denote the economic return of crime per criminal, which decreases in n. Competition is not good for most criminal businesses either. Note that these are the returns if the criminal is not caught. The return is zero if the criminal is caught. Summarizing, the returns from crime are uncertain and depend on the number of criminals in business and the number of police officers on the street.
The behavioral assumption is that criminal behavior is primarily an economic activity. Now obviously you may disagree, but let's try to pursue this approach and see where it leads us. We focus on individuals in this economy who have the potential to be criminals and need to make a choice between honest work and crime. Let us assume that all potential criminals are risk neutral. If you are really risk averse, you should probably not contemplate a career in crime. The expected return to crime, $R_c$, is given by
$$R_c(n, m) = (1 - p(n, m))\,Z(n) + p(n, m)\,(0 - F) \qquad (19.1)$$
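As a purely illustrative calculation of the expected return in (19.1), suppose the arrest probability is 0.2, the return from crime when not caught is 50,000, and the penalty is 30,000. These three numbers are hypothetical and are not taken from the text or from any dataset.

```latex
\begin{align*}
R_c &= (1 - p)\,Z + p\,(0 - F) \\
    &= (1 - 0.2)\times 50{,}000 + 0.2 \times (0 - 30{,}000) \\
    &= 40{,}000 - 6{,}000 = 34{,}000 .
\end{align*}
```

Under these assumed values, a risk-neutral individual would value a career in crime at 34,000, and it is this single number that is compared with the return to honest work.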
2 Alternative models of crime have stressed social norms, peer effects, and social interactions (Glaeser, Scheinkman, and Sacerdote, 1996). Cook (1986) stresses the fact that criminals tend to be selective in choosing a target and are most attracted to targets that appear to offer a high payoff with little effort or risk of legal consequences.
TABLE 19.2. Chapter Notation

Variable     Definition
n            Number of criminals
m            Number of police officers
p(n, m)      Probability of arrest
Z(n)         Returns from crime
F            Fine
R_c(n, m)    Expected return of crime
W            Wage
G            Nonpecuniary benefits of honest work
Π(n, m)      Profit function
The economic return to honest work, $R_h$, is the wage paid, denoted by W, plus the personal gain of not being a crook, denoted by G:
$$R_h = W + G \qquad (19.2)$$
The personal gain of not being a crook may be positive if doing the right thing is valued by the society. However, G can be negative due to peer pressure or a culture that glorifies crime and violence. Notice the importance of family, friends, and social environment in determining G. How do liberals and conservatives differ in assessing the importance of these factors? An individual chooses honest work if the returns of honest work are higher than the expected returns of crime:
$$W + G > R_c(n, m) \qquad (19.3)$$
An individual chooses a criminal life if the opposite holds:
$$W + G < R_c(n, m) \qquad (19.4)$$
Note that markets typically clear by adjusting prices. For example, one could imagine that wages have to rise if firms have problems attracting enough workers, i.e., if crime is too lucrative. We abstract from this mechanism and consider a different way to clear markets. Let us suppose that wages are rigid or fixed in the short run but there is free entry and exit into the criminal profession. Hence we assume that there are plenty of individuals who could succeed as criminals if it were lucrative enough. This might be a strong assumption, but it helps us derive a few useful stylized properties of the model. If we assume free entry and exit, then n will adjust so that the returns of honest work and criminal activity are more or less equal. Broadly speaking, criminals are indifferent between a career in crime and honest work in equilibrium. Equilibrium thus requires that
$$W + G = R_c(n, m) \qquad (19.5)$$
FIGURE 19.3. Competitive equilibrium.
Given an exogenous level of m, this equilibrium condition determines the number of criminals n that operate in equilibrium. Let us focus on the case in which the returns from criminal activity are decreasing in n, i.e., more criminals mean lower returns per criminal. In this case, there is a stable, unique equilibrium as illustrated in figure 19.3. For crime levels $n < n_c$ the expected returns from crime are larger than the returns from honest work. We would expect entry into the profession and higher levels of crime. If $n > n_c$ honest work is more desirable than crime, and we would expect exit to occur. We conclude that $n_c$ is the unique, stable equilibrium of this model if the expected returns are strictly monotonically decreasing in the level of crime.
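The free-entry equilibrium described above can be sketched numerically. The text does not specify functional forms, so the following is only a minimal illustration under assumed ones: the shapes of `arrest_probability` and `gross_return`, and every parameter value, are hypothetical and were chosen so that the expected return falls in n over the range searched. The entry and exit logic of the text is mirrored by a bisection search.

```python
def arrest_probability(n, m):
    """Hypothetical p(n, m): rises with police m, falls with criminals n."""
    return m / (m + 50.0 + 0.05 * n)


def gross_return(n):
    """Hypothetical Z(n): return per criminal if not caught, falling in n."""
    return 100_000.0 / (1.0 + n / 200.0)


def expected_return(n, m, fine=10_000.0):
    """Expected return to crime as in (19.1): (1 - p) Z(n) + p (0 - F)."""
    p = arrest_probability(n, m)
    return (1.0 - p) * gross_return(n) - p * fine


def equilibrium_crime(m, wage, gain=0.0, fine=10_000.0, n_max=10_000.0):
    """Solve R_c(n, m) = W + G for n by bisection (free entry and exit)."""
    target = wage + gain
    lo, hi = 0.0, n_max
    # Bisection needs crime to pay more than honest work at lo and less at hi.
    if not (expected_return(lo, m, fine) > target > expected_return(hi, m, fine)):
        raise ValueError("no interior equilibrium in [0, n_max]")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_return(mid, m, fine) > target:
            lo = mid  # crime still pays more than honest work: entry
        else:
            hi = mid  # honest work pays more: exit
    return 0.5 * (lo + hi)


n_base = equilibrium_crime(m=50, wage=15_000.0)
n_more_police = equilibrium_crime(m=60, wage=15_000.0)
n_higher_wage = equilibrium_crime(m=50, wage=18_000.0)
n_higher_fine = equilibrium_crime(m=50, wage=15_000.0, fine=20_000.0)
print(round(n_base), round(n_more_police), round(n_higher_wage), round(n_higher_fine))
```

With these assumed forms, each of the three policy changes (more police, a higher wage, a higher fine) yields a smaller equilibrium number of criminals than the baseline. The magnitudes carry no empirical meaning.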
19.3 Policy Implications
We can ask ourselves how the equilibrium of the model changes as we increase or decrease the model's key parameters. We call these types of exercises comparative static analysis since we compare the properties of two different equilibria. An increase in the level of local policing reduces the returns from crime. An example would be an increase in federal transfers to local law enforcement communities. For example, the Community Oriented Policing Services program provided grants to states and localities to pay up to 75 percent of the cost for new police hires for three years. Evans and Owens (2007) report that this program awarded almost $5 billion in hiring grants paying for 64,000 new police officers between 1994 and 2006.
Convince yourself that increasing W or G increases the returns from honest work and shifts the line W + G in figure 19.3 upward. As the wage increases, crime goes down. Given that many criminals have low levels of education, increasing the minimum wage is a policy option. Of course, drug dealers make considerably more than minimum wage. Other than the danger of being caught or robbed, the work is easier than standing over a hot range at McDonald's. In addition, there is evidence that suggests that selling drugs tends to complement legal earnings. We could also invest in job training programs that increase the human capital of young adults who live in neighborhoods with high crime. As these individuals acquire more human capital, their wage offers should increase as well. These types of policies tend to be favored by many liberals.
Similarly, as G goes up, crime goes down. We are talking about "family values." More broadly speaking, any policies that strengthen the family, churches, and communities or improve peer quality should reduce crime. These policies tend to be favored by many conservatives.
One of the nice features of this model is that it also provides a justification for policies that are preferred by "law and order" types. For example, as m goes up, p(n, m) goes up and the returns from crime fall. As a consequence, the $R_c(n, m)$ curve shifts downward in figure 19.3. The model, therefore, predicts that hiring more police officers decreases crime levels. You should convince yourself that an increase in fines and penalties also shifts the $R_c(n, m)$ curve downward. Hence tougher penalties reduce criminal activity in this model. We call this a deterrence effect.
More educated individuals have better outside options than their less educated peers. Hence they are less likely to commit crimes. How big are the effects of education on crime? Lochner and Moretti (2004) study the relationship between high school graduation and crime.
They use compulsory schooling laws to instrument for endogenous education choices. They use the NLSY, which is a panel dataset that follows young adults over time. They find that the social returns of high school graduation (reduction of crime and reliance on welfare) are about 14-26 percent of the private returns (increases in wages and salaries). Of course, we want to invest in primary and secondary education for reasons that have nothing to do with criminal behavior, but it is also reassuring to know that a good school system helps us reduce crime! Finally, note that victim precautions also reduce the returns from crime. For example, Apple ID makes it less valuable to steal a smartphone. Installing a home security system reduces the returns from property crime. Registering your bike makes it easier to convict a thief who is caught in the act of selling your stolen property.
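The comparative statics discussed in this section can also be written compactly. The sketch below assumes, beyond what the text states, that the expected return to crime is differentiable. It implicitly differentiates the equilibrium condition (19.5), using the fact that the expected return falls in n and that the arrest probability rises in m.

```latex
\begin{align*}
R_c(n_c, m) &= W + G, \qquad
\frac{\partial R_c}{\partial n} < 0, \qquad
\frac{\partial R_c}{\partial m} = -\frac{\partial p}{\partial m}\,\bigl(Z(n) + F\bigr) < 0, \qquad
\frac{\partial R_c}{\partial F} = -p(n, m) < 0, \\[4pt]
\frac{d n_c}{d m} &= -\frac{\partial R_c / \partial m}{\partial R_c / \partial n} < 0, \qquad
\frac{d n_c}{d F} = -\frac{\partial R_c / \partial F}{\partial R_c / \partial n} < 0, \qquad
\frac{d n_c}{d W} = \frac{d n_c}{d G} = \frac{1}{\partial R_c / \partial n} < 0 .
\end{align*}
```

Each sign matches the verbal argument: more police, tougher penalties, higher wages, and a larger nonpecuniary gain from honest work all lower the equilibrium number of criminals.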
19.4 The Economics of Organized Crime
Another nice feature of this model is that it can be used to explain the existence of organized crime. Note that the competitive equilibrium in figure 19.3 allows for free entry and exit. There is no enforcer here and anybody can be a criminal if they choose. As a consequence, the returns of criminal activity and honest work are equalized in equilibrium. This is clearly not the profit-maximizing equilibrium. If
we could restrict entry or exit into the profession, we could potentially make a lot of money in this market. To see why it is lucrative to restrict entry into criminal life, consider the problem of a gang leader or a boss of an organized crime cartel who controls a local market, which can be a certain territory, town, region, or country. This means that nobody can operate as a criminal in this territory without the explicit permission of the crime boss. As a true businessperson, the boss aims to maximize profits. To learn how to run the business, the boss may actually need to study some basic economics, just like Stringer Bell in The Wire. (Is the demand price elastic or not? What do you do when you have an inferior product relative to your competition?) Like anyone in business, the boss needs to hire individuals to join the gang and commit crimes, perhaps selling drugs or other illegal substances or services. Let n now denote the number of employees that are on the payroll of the criminal organization. $R_c(n, m)$ is the revenue per employee. The gang leader is the residual claimant of all profits, but the leader has to pay the members their reservation wages. In our model, the reservation wage is given by W + G. The profit function of the criminal organization can be specified as
$$\Pi(n, m) = n\,R_c(n, m) - n\,(W + G) \qquad (19.6)$$
The boss maximizes profits by choosing n, taking W + G and m as given. The optimal gang size requires that marginal revenues equal marginal costs. (See the technical appendix at the end of the chapter for a formal derivation and some additional discussion.) This condition is similar to the profit-maximizing choice of a monopolist in a regular market that faces a downward-sloping demand curve. The main difference is that a standard monopolist chooses the level of output so that the marginal revenues equal marginal costs. We assume that the crime boss picks the number of criminals such that marginal revenues equal marginal costs.
The model predicts that the level of crime will be lower if crime is organized and entry is controlled by the crime boss than if there is free entry into the profession. This result follows from the fact that a monopolist typically chooses a lower level of output than the level that results in competitive equilibrium. Adding more criminals to the street reduces the chances of a sale for each worker. This cuts into revenues per worker. One could make the same argument for just about any illegal activity: bookmaking, burglary, prostitution, or liquor sales during Prohibition.
Why do gang leaders stress the role of the gang as a substitute for the family? We can think of this as an attempt to lower G. Why are gang leaders not interested in economic development in their area? Any economic development presumably increases competition for labor and hence increases W. The gang leader has every incentive to keep W + G low.
Are these assumptions reasonable? We do not know. For fairly obvious reasons, there is not much systematic empirical evidence on the economics of criminal organizations. One exception is the work by Levitt and Venkatesh (2000), who analyze a unique dataset that details the financial activities of a drug-selling street gang. They find that the gang leader runs a fairly profitable operation.
They also find that wage earnings for runners in the gang are somewhat higher than legal market alternatives, but do not offset the increased risks associated with selling drugs.
Those findings are consistent with our model. The workers get their reservation wages, which are fairly low in most economically depressed neighborhoods. The gang leader captures most of the profits. Levitt and Venkatesh suggest that the prospect of moving up in the hierarchy and becoming a gang leader with high future earnings is the primary economic motivation for joining a gang. So a more sophisticated model may need to incorporate career ladder aspects of criminal activity. Of course, our findings should not be viewed as evidence suggesting that gangs are good. Organized crime clearly has many undesirable effects on the overall welfare of society that we have not captured in our simple model of crime. Gangs tend to increase violent crime. We are not considering this aspect in the simple model above. Overall, there is plenty of evidence that suggests that a world without gangs and organized crime would be a more perfect world. But then again organized crime tends to operate in highly profitable markets. If we succeed in taking away the financial gains and the profits of organized crime, we will probably also succeed in destroying the economic rationale for the existence of organized crime.
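The prediction in this section that a crime boss employs fewer criminals than free entry would deliver can be sketched from the first-order condition of the profit function (19.6). The chapter's technical appendix contains the formal derivation. What follows is only a compact sketch that assumes a differentiable, strictly decreasing expected return and an interior optimum. The symbol n_m for the boss's chosen gang size is introduced here for illustration.

```latex
\begin{align*}
\frac{\partial \Pi}{\partial n}
  &= R_c(n_m, m) + n_m\,\frac{\partial R_c(n_m, m)}{\partial n} - (W + G) = 0 \\[4pt]
\Longrightarrow\quad
R_c(n_m, m) &= (W + G) - n_m\,\frac{\partial R_c(n_m, m)}{\partial n}
  \;>\; W + G \;=\; R_c(n_c, m) .
\end{align*}
```

Because the expected return falls in n, a higher return at n_m than at n_c implies that n_m is smaller than n_c. The gap between the return per member and the reservation wage W + G is the margin that the gang leader keeps as profit.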
19.5 Police Effectiveness
One important empirical research question, then, is how to measure police effectiveness. Should we invest in law enforcement to reduce crime? Or should we use alternative, softer strategies to reduce crime? To answer these questions, we need to estimate the impact of hiring additional police officers on crime and the effectiveness of alternative crime-fighting strategies.3
3 Police effectiveness is only one component of criminal deterrence. A great overview of criminal deterrence is given by Chalfin and McCrary (2017).
What are some alternatives to police enforcement? Many policymakers believe that the crime problem is largely due to a drug problem. For example, the popularity of crack cocaine in many large cities in the US in the 1990s created a large number of addicts who needed to commit crimes to finance their addictions. In response to the crack epidemic and soaring crime rates, there was a large increase in resources devoted to drug control. What's more effective in reducing crime, police enforcement or drug treatment?
One key problem encountered in regression analysis is that police hiring is endogenous since it is often triggered by an increase in crime. The same unobserved factors that influence crime also determine police hiring. Regressing crime statistics on the number of police officers typically gives a positive coefficient. But that just restates the obvious: more crime goes along with larger police forces and higher arrest rates. To obtain some estimates of police effectiveness, we need to be smarter and try to exploit some exogenous variation in police hiring and law enforcement. Of course, these are complicated issues that are subject to much ongoing research.
To illustrate the key issue, we focus on a study by Corman and Mocan (2000), who study crime, drug use, arrests, and police hiring in NYC. They use a dataset that consists of monthly observations in New York City for nearly thirty years. They deal with the endogeneity problem by using lagged variables. It takes time to train a new police officer. How long? Well, that depends on the state. First, there is some variation in educational requirements. The typical police department requires a minimum of a high school diploma. In addition, some police departments require recruits to have some college credit before applying. These requirements are often waived for applicants with military service or prior law enforcement experience. Second, applicants need to pass a process that involves an entrance examination as well as physical and background checks. Most police departments use a polygraph, or lie detector test, to assess the honesty of applicants. (Should colleges and universities do the same?) Finally, most police hiring programs require applicants to attend a police academy. Large law enforcement departments have their own academies. Smaller departments outsource this training. Recruits are required to attend training for a period of twelve to twenty-four weeks before going out into the field. The purpose of the police academies is to train recruits in criminal law, community policing, firearms training, and investigation and defensive tactics. So in a nutshell it can easily take six to twelve months for a newly hired police officer to be placed in the field.
FIGURE 19.4. SWAT vehicle. (Photo by author)
Most police departments assign a new officer to a more experienced field training officer. This mentor administers the additional field training to the new officer. As a consequence, there is a lag between police hiring and police effectiveness. It takes another six to twelve months before the officer can operate without supervision. Some police hiring can, therefore, be fairly exogenous. It is often more directly related to collective bargaining contracts and
cycles in the age of the police force. So even if a city wants to respond to crime and hire police, it often cannot do so for a number of months or even years. The results by Corman and Mocan indicate that murders, robberies, burglaries, and motor vehicle thefts decline in response to increases in arrests. An increase in the size of the police force also generates a decrease in robberies and burglaries. They find no significant relationships between drug use measures and violent crime, such as assault and murder. They also find no causal relationship between drug use and motor vehicle thefts. On the other hand, they find a positive relationship between drug use and robberies and burglaries. Criminal drug users apparently go where the easy money is! In particular, a 10 percent decrease in drug use, proxied by drug deaths, generates a 1.8-2.8 percent decrease in robberies, while a 10 percent increase in robbery arrests brings a 7.1-9.4 percent decrease in robberies. These results suggest that increased law enforcement is a more effective method of crime prevention than efforts targeted at reducing drug use. There is no evidence that drug-prevention or rehabilitation programs decrease drug use on a scale that can be measured in the aggregate, as is evident from the continued high demand for drugs in the industrialized world. Corman and Mocan argue that no alternative policy could guarantee a reduction in drug use by a given magnitude, whereas an increase in law enforcement is relatively straightforward to implement.
Evans and Owens (2007) study the impact of the Community Oriented Policing Services (COPS) program that we discussed in section 19.3. They use the size of COPS grants as an instrument for the size of the police force in their crime regressions. They find that police added to the force generated statistically significant reductions in auto thefts, burglaries, robberies, and aggravated assaults.
The magnitude of the effects is similar to the ones reported in Corman and Mocan (2000). We conclude that effective policing reduces crime. That should not come as a shocking result. However, the overall magnitude of the effects is reassuring.
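To see why an instrument such as grant size helps, consider a small simulation. It is only a sketch under assumed numbers, not the specification used by Evans and Owens: the true effect of police on crime (-0.5), the strength of the feedback from crime shocks to police hiring (0.8), and all function and variable names are hypothetical. Because cities hit by a crime shock hire more police, least squares tends to understate the crime-reducing effect of police, whereas the simple instrumental variables ratio should land close to the true effect.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equally long lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate_estimates(n_cities=20_000, true_effect=-0.5, seed=0):
    """Simulate hypothetical cities; return (OLS, IV) estimates of the
    effect of police on crime."""
    rng = random.Random(seed)
    grants, police, crime = [], [], []
    for _ in range(n_cities):
        grant = rng.gauss(0, 1)        # instrument: shifts police, not crime directly
        crime_shock = rng.gauss(0, 1)  # unobserved shock that raises crime
        # Assumed feedback: cities hit by a crime shock hire more police.
        officers = grant + 0.8 * crime_shock + rng.gauss(0, 1)
        offenses = true_effect * officers + crime_shock + rng.gauss(0, 1)
        grants.append(grant)
        police.append(officers)
        crime.append(offenses)
    # Least squares slope of crime on police (biased by the feedback).
    ols = covariance(police, crime) / covariance(police, police)
    # IV ratio: only grant-driven variation in police is used.
    iv = covariance(grants, crime) / covariance(grants, police)
    return ols, iv


ols_estimate, iv_estimate = simulate_estimates()
print(f"OLS estimate: {ols_estimate:.3f}")
print(f"IV estimate:  {iv_estimate:.3f} (true effect: -0.5)")
```

The design choice is deliberate: the grant enters the police equation but not the crime equation, which is exactly the exclusion restriction that any real instrument has to satisfy.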
19.6 The Demand for Addictive Goods
The markets for many illicit goods are only profitable because the demand for these goods is high and fairly price inelastic. Three competing theories have been proposed for explaining the consumption of potentially harmful and addictive goods. Early approaches typically attribute consumption of these goods to a lack of self-control or myopic behavior (Thaler and Shefrin, 1981). In contrast, Becker and Murphy (1988) forcefully argue that addiction can be modeled as an outcome of rational behavior of forward-looking individuals with stable preferences. A third alternative assumes that individuals are forward-looking but have a "present bias." These models then give rise to time-inconsistent consumption paths (Gruber and Koszegi, 2001). The three competing theories primarily differ in their assumptions regarding the length of the planning horizon that is attributed to individuals. The myopic model assumes that the planning horizon is short and consists, in the limiting case, of only one time period. Individuals care only about today and do not internalize the negative effects of harmful consumption on health in the future. In contrast, rational addiction theory relies on the notion that individuals are
forward-looking and rational. Thus individuals take into consideration the future risks associated with smoking, heavy drinking, or drug use. In the standard rational, forward-looking model, the individual can simply choose a plan that maximizes lifetime utility at the start, without worrying about later selves disagreeing with or overturning it. For a time-consistent individual, if it is beneficial to do something next week, it is typically also beneficial to do it now. Many people do not act that way. Excessively favoring gratification now at the expense of future gratification can usefully be modeled via "present-biased" preferences. So a present-biased individual will start working out, save more, eat healthier food, and be more pleasant next year, just not today. At least, that's what they say. Of course, when next year comes around, the time-inconsistent person will typically continue to delay making the hard choices. We studied a model of procrastination in our discussion of property tax compliance in chapter 12. A similar model explains why drug users have a hard time quitting even if the addictive effects are not that large. Of course, quitting is really hard once you are addicted. It should be pointed out that in all three models some individuals get addicted and will experience regret once addiction or negative health shocks occur. If we are ex-ante rational, we still may feel ex-post regret! The policy implications of these models are quite different. Theories that emphasize lack of self-control focus on paternalistic policies that prevent initiation or reduce consumption by sometimes drastic means. Rational addiction theory favors tax and price instruments to recover external costs. Models with present bias emphasize the role of commitment devices that help individuals avoid time-inconsistent choices. What's the empirical evidence?
First, it should be pointed out that almost all the evidence is based on smoking and drinking behavior since data on illegal drug consumption are a lot more problematic to obtain and less reliable. Second, an important characteristic of tobacco consumption, in particular consumption of cigarettes, is the long latency period between the time of initiation and the onset of adverse events.⁴ Relatively few adverse health events occur in the first half of life. To illustrate, at age thirty-five, the cumulative probability of survival is the same for males who have never smoked and smokers. At age forty-five (sixty-five, eighty-five), the corresponding ratio is 1.02 (1.18, 2.11) as discussed in Arcidiacono, Sieg, and Sloan (2007). It is, therefore, almost impossible to distinguish among these theories using data from younger individuals. Nevertheless, most of the initiation happens at a relatively young age. What do we know about younger individuals? There is some serious doubt that young individuals are sufficiently rational and forward-looking to fully internalize all the risks associated with illicit drug consumption. Moreover, teenagers and young adults like to take risks. As a consequence, policies need to be implemented to protect children and teenagers.
What about older individuals? Arcidiacono, Sieg, and Sloan focus their analysis on a sample of men in late middle age from the Health and Retirement Study. Individuals over the age of fifty start to experience negative health shocks. These are at least partially due to past smoking and heavy drinking habits. Much of the uncertainty about future health and the link between smoking or drinking and health outcomes is resolved during the later years of life. The question, then, is the following: Are these individuals sufficiently rational to change their behavior as they experience negative health shocks? This provides a compelling test of the rational addiction hypothesis. If heavy smokers or drinkers do not quit when they start to experience bad health outcomes, then maybe we can safely rule out that they are rational about their consumption choices of these addictive goods. The empirical findings of Arcidiacono and colleagues suggest that a forward-looking model fits the data much better than a myopic model. They find that out-of-sample predictions of the forward-looking model clearly dominate those of the myopic model. They thus conclude that forward-looking models provide better within-sample fits and out-of-sample predictions than myopic models. So maybe smoking and heavy drinking by older individuals are not completely irrational. Maybe these people are more rational than many policymakers and some leading health economists think.
What about younger individuals, such as teenagers and children? Obviously, we should have some serious doubts that they behave in a fully rational way. As a consequence, some special protection for children and teenagers is desirable. Where do you draw the line? At what age can we safely assume that individuals are sufficiently mature and responsible so that they can act on their own? Different societies come up with different answers. In Germany, the legal drinking age is sixteen while in the US it is twenty-one. In many countries the consumption of "soft" drugs such as marijuana is legal or penalties are not enforced. One key question is whether the use of drugs should be legal or not. We turn to the Prohibition experiment for some guidance.
⁴ The latency period for alcohol can be substantially less than for smoking, for example, due to accidents while being intoxicated.
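The procrastination logic behind present bias in this section can be made concrete with a small sketch. It uses the standard quasi-hyperbolic (beta-delta) formulation, which is an assumption here and not necessarily the exact model of chapter 12; the function name and all numbers (a quitting cost of 10 today, a health benefit of 30 one period later) are hypothetical. A naive agent compares quitting today with a plan to quit tomorrow and believes that tomorrow's self will follow through.

```python
def first_action_day(beta, delta, cost, benefit, horizon):
    """Return the first day on which a naive agent quits, or None if the
    agent keeps postponing for the whole horizon.

    Quitting costs `cost` on the day it happens and yields `benefit` one
    period later. beta < 1 captures present bias; beta = 1 is the
    time-consistent case. The problem is stationary, so the same
    comparison is repeated every day.
    """
    for day in range(horizon):
        # Value today of quitting now: immediate cost, discounted benefit.
        quit_now = -cost + beta * delta * benefit
        # Value today of a plan to quit tomorrow: cost and benefit both lie
        # in the future, so both are scaled by beta.
        quit_tomorrow = beta * delta * (-cost + delta * benefit)
        # Quit only if it is worthwhile and not worse than waiting one day.
        if quit_now >= 0 and quit_now >= quit_tomorrow:
            return day
    return None


time_consistent = first_action_day(beta=1.0, delta=0.99, cost=10, benefit=30, horizon=365)
present_biased = first_action_day(beta=0.7, delta=0.99, cost=10, benefit=30, horizon=365)
print("Time-consistent agent quits on day:", time_consistent)
print("Present-biased agent quits on day:", present_biased)
```

With these numbers quitting is worthwhile for both agents, yet the present-biased agent always prefers to start tomorrow. This is why commitment devices, which remove the option to postpone, can help such an individual.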
19.7 A Case Study: The Prohibition Experience
Recall from your US history lessons that the Eighteenth Amendment to the US Constitution banned the production, importation, transportation, and sale of alcoholic beverages in 1920. The Volstead Act then set down the rules for enforcing the federal ban. It defined the types of alcoholic beverages that were prohibited. Note that local laws were often much stricter than federal laws. Some states in the US banned possession outright. In the 1920s the laws were widely disregarded. The Eighteenth Amendment neither improved moral standards nor put an end to excessive alcohol consumption. Maybe most troublingly, organized criminal gangs took control of the beer and liquor supply in many cities, unleashing a crime wave that shocked the nation. An entertaining TV show that covers this subject is Boardwalk Empire, which takes place in Atlantic City-one of the "sin cities" of that time period-during the Prohibition era. Garcia-Jimeno (2016) uses Prohibition-era city-level data on police enforcement, crime, and alcohol-related legislation. His results show that a 15 percent increase in the homicide rate can be attributed to Prohibition enforcement. By the late 1920s opposition mobilized nationwide against Prohibition, which ended with the ratification of the Twenty-First Amendment in 1933. The US Prohibition experience, therefore, showed a remarkable policy reversal. In only fourteen years, a drastic shift in public opinion required two constitutional amendments. On a positive
note, the country gave it a shot and tried a somewhat controversial policy. Once it recognized that the policy did not work, it managed to reverse course fairly quickly.
19.8 Decriminalization of "Soft" Drugs
Some lessons can be learned from the Netherlands, which has experimented with decriminalizing consumption of marijuana and tolerating the existence of commercial outlets for low-volume retail sales (coffee shops) for a number of decades. According to Hoorens (2017), this policy has been more or less successful in its aim to separate the markets for hard and soft drugs. However, it has also created problems. First, the unregulated coffee shops stock their supply on the illegal market. That trend has indirectly contributed to the Netherlands becoming a major producer of herbal cannabis and a transit hub for cannabis resin. Drug syndicates have been among the main beneficiaries of this model, while law enforcement is costing the government millions of euros. In 2015 alone, the Dutch police dismantled 5,856 cannabis plantations, almost sixteen per day. Second, it is not easy to regulate the official price of cannabis. If the price at the counter is too high, the street trade will thrive. If the price in coffee shops drops after regulation, then demand is likely to increase. Finally, drug-related tourism in cities such as Amsterdam tends to be unpopular with many local residents.
19.9 Conclusions
Crime is a prevalent problem in many US cities. From an economic perspective, the core problem is the existence of a number of very lucrative but illegal business opportunities, such as drug selling, illegal gambling, racketeering, prostitution, extortion, and people smuggling. These illegal businesses generate sufficiently large profits to attract an organized form of crime. Organized crime then typically tries to enforce a "monopoly" or "cartel" with a small number of players, each controlling a separate market or territory. However, these equilibria may not be stable. They will often be challenged either by new entrants or by other existing organized crime units that try to expand their market size. Entry and exit in these markets then create violent crime. Violence tends to be particularly bad if there is a lot at stake. The bad news is that crime is prevalent. The good news is that cities and states are not helpless. There exist a number of proven policy options. First, we can adopt policies that discourage criminals from entering into the profession. As we increase the value of the outside option for potential criminals via schooling and training, we make it not only harder but also much more costly for criminal organizations to operate. In addition, we need to encourage economic development and opportunity in areas that are particularly disadvantaged and often provide breeding grounds for criminal organizations. Second, we need to strictly enforce existing laws. The empirical evidence suggests that law enforcement is fairly effective and that high levels of crime are largely a function of lack of enforcement. Hiring more police officers increases arrests and clearance rates and provides a strong deterrence for crime. The main
drawback here is that police officers are highly paid professionals. The current yearly salary for a police officer recruit in the police academy in Philadelphia is $49,477. After graduating from the academy as a police officer, the salary increases to $51,245.⁵ Third, we need to structure and potentially reform the criminal justice system to avoid high rates of recidivism. As long as the demand for these illicit goods, in particular illegal drugs, remains high, the only hope is to limit the damage. Legalizing the use of some softer drugs, such as marijuana, may turn out to be a reasonable policy. We clearly need more experience with legalization to pass a judgment. It is hard to argue that legalizing harder, more destructive drugs makes much sense. We need stricter regulations on the production and distribution of drugs for medical purposes.
19.10 Technical Appendix: Optimal Gang Size
The optimal gang size is obtained by maximizing profits in equation (19.6) with respect to n. The first-order condition can be written as follows:
\[ MR = R_c(n, m) + n \frac{\partial R_c(n, m)}{\partial n} = W + G = MC \tag{19.7} \]
Marginal costs are just given by W + G, the fixed amount that you have to pay to hire another worker. Marginal revenues are more complicated and have two components. The first term, \(R_c(n, m)\), is just the revenue generated by an additional criminal (= worker). The second term, \(n \, \partial R_c(n, m) / \partial n\), captures the "externality" of hiring another criminal, since the returns for all other criminals (= workers) go down as we increase the size of the gang. Note that
\[ \frac{\partial R_c(n, m)}{\partial n} < 0 \tag{19.8} \]
Adding another worker imposes a negative externality on other criminals! Equation (19.7) determines the optimal size, denoted by \(n_g\). This condition is illustrated in figure 19.5. Note that if \(n_g\) is less than \(n_c\), the gang will restrict access to the profession. Hence criminal activity is lower than in the competitive equilibrium. In the profit-maximizing equilibrium, each criminal generates positive returns for the boss. Excess returns are given as \(R_c(n_g, m) - W - G > 0\). The total profits of the gang are then given by \(n_g (R_c(n_g, m) - W - G)\).
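A numerical illustration may help to fix ideas. The sketch below assumes a linear return function, R_c(n) = a - b n, with the second argument m held fixed and absorbed into the intercept a. This functional form, the function name, and all parameter values are hypothetical and chosen only so that the first-order condition (19.7) has a closed-form solution.

```python
def gang_sizes(a, b, wage, g):
    """Compare free entry with a profit-maximizing gang leader.

    Assumes linear per-criminal returns R_c(n) = a - b * n (hypothetical).
    `wage` and `g` correspond to W and G in the text.
    """
    marginal_cost = wage + g
    # Free entry: criminals enter until R_c(n) = W + G.
    n_competitive = (a - marginal_cost) / b
    # Gang leader: R_c(n) + n * dR_c/dn = a - 2 * b * n = W + G.
    n_gang = (a - marginal_cost) / (2 * b)
    # Excess return per criminal and total profits at the gang optimum.
    excess_return = (a - b * n_gang) - marginal_cost
    total_profit = n_gang * excess_return
    return n_competitive, n_gang, excess_return, total_profit


n_c, n_g, excess, profit = gang_sizes(a=100.0, b=0.5, wage=30.0, g=20.0)
print(f"competitive size: {n_c}, gang size: {n_g}")
print(f"excess return per criminal: {excess}, total profit: {profit}")
```

Under the linear assumption the gang leader admits exactly half as many criminals as would enter under free entry, which is the familiar monopoly result with linear demand.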
⁵ There are scheduled increases in pay to the present maximum of $64,459 a year. Overtime is paid at the level of time and a half. The starting salary for a PA state trooper is $58,962. The police commissioner of Philadelphia drew a salary of $240,000 in 2016. That made him the second highest paid employee of the city. Only the medical examiner had a higher salary of $260,730. The mayor of Philadelphia, Jim Kenney, came in fourth with a salary of $217,820. These numbers were reported by the website billypenn.com.
FIGURE 19.5. Optimal gang size.
19.11 Debate: Legalizing Marijuana
Debate the pros and cons of legalizing the use of marijuana. The following questions may help structure the debate:
1. What does the medical literature have to say about the negative health effects of regular or excessive marijuana consumption?
2. What are the potential spillover or externality effects?
3. What are reasonable regulations that should be put into place if use is legalized?
4. What are some of the potential negative side effects from legalizing marijuana?
5. What have we learned from countries and states that have already decriminalized or legalized its use?
19.12 Problem Sets
1. Suppose you analyze the relationship between crime and wages paid in the regular economy. You want to test the hypothesis that higher wages discourage criminal activity. Suppose you collect data on crime, local wages, and a number of socioeconomic city characteristics that are related to crime, such as poverty and unemployment, for a large, random sample of cities. You estimate a linear regression model using least squares. You find that the coefficient that measures the impact of wages is positive and statistically significantly different from zero. Provide a plausible explanation for this empirical result.
2. Consider the model of criminal activity discussed in this chapter. Under what conditions do you get a unique equilibrium of the model?
3. Consider the economic model of crime discussed in this chapter in which a gang leader controls a territory. What happens to criminal activity in this version of the model if we increase the wage for honest work? Explain your answer.
4. An empirical study suggests that low-ranking members of gangs who primarily sell drugs on the street may not earn much more than minimum wage. Suppose this result is true; what may account for this finding?
5. Consider our model of crime discussed in this chapter and focus on a unique, stable, competitive equilibrium with free entry and exit. Suppose the potential criminals are risk averse instead of risk neutral. How would that affect the equilibrium? Provide a graphical explanation.
6. Explain why an increase in the minimum wage in a city may lead to a reduction in overall crime.
7. How does Garcia-Jimeno (2016) explain the drastic shift in public policy during the Prohibition era in the US?
20
Urban Environmental Challenges
20.1 Motivation
Many environmental challenges are local or regional issues. As a consequence, cities are important players in meeting these challenges. Two of the most important local environmental challenges are providing clean air and clean water. Water supply is particularly important in the American West, which is for all practical purposes a semidesert. Even if there is a decent amount of water, it can be difficult to allocate the scarce resource when there are many competing needs and the price of water does not necessarily reflect its scarcity. Providing safe drinking water to residents has also been problematic in many of the older cities in the Northeast and Midwest because the infrastructure is old. A recent example is the Flint water crisis that we discuss in section 20.7, which led to contamination of the water supply for tens of thousands of local citizens. In addition, cities must also offer protection against natural disasters, especially if they are vulnerable to extreme weather conditions, such as hurricanes or flooding, which may become more prevalent due to global warming. As pointed out by Kahn (2010), "green" cities are not necessarily "safe" cities. In addition, there are conservation issues, such as providing open space, parks, and areas of recreation, and protecting rivers, wildlife, lakes, wetlands, beaches, and other natural resources that are inside the city limits. Large US cities spend between $100 and $300 per capita on the environment. These amounts are small relative to the amounts spent on education or public safety. These low expenditures can be misleading for a number of reasons. First, many of the expenditures to provide clean water and treat and remove sewage do not show up as environmental spending but are listed as current or capital expenditures for utilities. We have seen before that expenditures on utilities are much higher. Second, regulations can have a large impact on many environmental activities.
Take construction codes as an example. Most cities in Pennsylvania have adopted the Uniform Construction Code passed by the state assembly in 2004. The Pennsylvania building code is often blamed by builders for the high cost of new construction in Philadelphia. (Strong unions in the construction sector are typically cited as the other culprit.) Alternatively, pollution control requires firms to make large investments in technology to reduce emissions or treat water. These costs are substantial and not reflected in public accounts. The total costs imposed on businesses by these types of regulations can, therefore, be substantial and provide another measure of the stringency of environmental regulations. Zoning laws and restrictions on urban planning are other regulatory policies that matter. Houston is the largest US city without zoning laws. Of course, Houston did not really fare that well during Hurricane Harvey. Some of the catastrophic flooding could have been prevented by better urban planning and sensible regulation. Urban growth, including in flood-prone areas, had diminished the land's natural ability to absorb water. Houston's drainage system consisted of a network of reservoirs, bayous, and, as a last resort, roads that hold and drain water. The drainage system was not coordinated with the restrictions on development, and it was clearly not designed to handle a massive storm such as Harvey. Unfortunately, these types of storms are increasingly common. Of course, the probability that a hurricane hits any given city is still small.
FIGURE 20.1. Clean water. (Photo by author)
20.2 Negative Production Externalities
Most urban environmental problems arise due to negative production or consumption externalities. Negative production externalities arise when a firm's production reduces the well-being of others who are not compensated by the firm. A firm should be accountable for all the costs it imposes on other agents. Next, we model the behavior of a firm that produces a negative externality. Table 20.1 summarizes the notation of the key variables of the model.
TABLE 20.1. Chapter Notation
Variable    Definition
x           Production output
C(x)        Private costs
C'(x)       Marginal private costs
E(x)        External costs
E'(x)       Marginal external costs
p           Price of output
t           Tax rate
π           Profits
r           Price per permit
Negative production externalities drive a wedge between private and social costs. The private costs, denoted by C(x), are the internal costs of producing the good. Social costs are the sum of the private costs to producers plus any external costs, denoted by E(x), associated with the production of the good that are imposed on others. For example, consider the case of coal power plants, which are still popular in many parts of the country despite the fact that they cause significant air pollution. The health costs associated with air pollution are imposed on others by the operators of these power stations. How do externalities affect efficiency? The intuition is quite simple. In competitive markets, firms will set prices equal to marginal costs. From the perspective of a firm, the only costs that are decision relevant are the costs that are directly borne by the firm. If the firm is not held accountable for some of the costs that it imposes on other agents in the economy, then we cannot really expect that markets are efficient. Instead, firms will produce too much output since the private costs faced by the firm do not reflect the full costs associated with the production of the good. Markets are only efficient if there is not a large difference between private costs and social costs. Let us try to formalize this intuition. In a competitive market, efficiency requires that prices equal marginal social costs:
\[ p = C'(x_e) + E'(x_e) \tag{20.1} \]
Efficiency requires that the price reflects all marginal social costs, which include the marginal private costs that accrue to the firm and marginal externalities it imposes on others. Figure 20.2 illustrates the efficient allocation.
FIGURE 20.2. Private costs, external costs, and equilibrium.
In an unregulated market equilibrium, prices will be set equal to marginal private costs:
\[ p = C'(x_c) \tag{20.2} \]
Production externalities, therefore, lead to inefficiencies if \(E'(x_e)\) is substantial. In that case, production is too large. In our leading example, we obtain too much pollution, as is the case in many cities around the world that have struggled with providing clean air. Figure 20.2 also illustrates the competitive allocation. The problem arises because property rights for goods such as air quality or water quality are poorly defined and not well enforced by the federal and state governments. Coase (1960) points out that if property rights were well defined one would expect that negotiations between the party creating the externality and the party affected by the externality should bring about the socially optimal market quantity. Consider the case in which the property right is given to the agents that are affected by the externality. If they own the property rights to the resource that is polluted, then they can either require the polluting firm to cease the activity or ask for compensation. In our model, the compensation would be at least equal to E(x), the magnitude of the damage caused by the firm. It is then easy to see that the
firm would voluntarily choose an efficient outcome if total private costs are given by C(x) + E(x). There are several difficulties with Coasian solutions, making them less likely to arise as more people become involved. Transaction costs and negotiating problems need to be taken into consideration. It is hard to negotiate when there are large numbers of individuals on one or both sides of the negotiation. Moreover, there can be a holdout problem. If the shared ownership of property rights gives each owner power over all the others, each person has veto power and so may demand unreasonable payments. As a consequence, we cannot necessarily rely on private negotiations to solve these externality problems. Instead, public policymakers need to intervene and enforce property rights for air, water, and other natural resources. The idea is that federal, state, or local governments need to act on behalf of a large number of citizens that are affected by pollution and enforce the property rights against the firms causing the negative externality. A government can employ two types of remedies to resolve the problems associated with negative externalities: (1) corrective taxation to discourage use; or (2) quantity restrictions and quotas to directly limit the output level. Taxes that correct externalities are called Pigouvian taxes, after the English economist Arthur Cecil Pigou. It is straightforward to see that any positive per unit tax discourages the production of output. Moreover, the efficient Pigouvian tax satisfies:
\[ t = E'(x_e) \tag{20.3} \]
Note that the total costs for a firm with a Pigouvian tax are equal to C(x) + tx. Hence the firm sets
\[ p = C'(x) + t \tag{20.4} \]
Substituting equation (20.3) into equation (20.4), we obtain
\[ p = C'(x) + E'(x_e) \tag{20.5} \]
This condition is satisfied at \(x = x_e\), where it coincides with the efficiency condition (20.1). Hence we find that we can implement the efficient outcome if we set the tax rate equal to the correct amount of the marginal damage. Similarly, a policymaker can directly regulate the quantity that can be produced by a firm in a market.
These types of command and control approaches do not use market incentives. Instead, a policymaker typically imposes a quota for each firm, which determines the maximum amount of pollution that is admissible under the regulatory regime. If the firm exceeds the limit, it will be subject to fines and penalties. In the extreme case, the regulator could shut the operation of the firm down, either temporarily or permanently. Under full information, a regulator could require a firm to cap output at \(x_e\). In an ideal world with complete information, Pigouvian taxation and regulation produce identical results. Regulation has been the traditional choice for addressing environmental externalities in the United States and around the world. In reality, a regulator may not know either the cost function of the firm or the magnitude of the
external costs. Good regulations are, therefore, only possible if we can overcome these informational problems. Efficiency does not imply political feasibility, as we have learned from a variety of protest movements that were triggered against higher taxes that were sold as Pigouvian interventions. The gilets jaunes movement in France is one prominent recent example.
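The Pigouvian tax argument of this section can be checked with a small numerical sketch. It assumes quadratic cost functions, C(x) = 0.5 c x^2 and E(x) = 0.5 e x^2, so that marginal private and external costs are linear. These functional forms, the function name, and the parameter values are hypothetical and serve only as an illustration.

```python
def pigouvian_outcomes(price, c, e):
    """Compare unregulated, efficient, and taxed output levels.

    Assumes C(x) = 0.5 * c * x**2 and E(x) = 0.5 * e * x**2 (hypothetical),
    so that C'(x) = c * x and E'(x) = e * x.
    """
    # Unregulated market: price equals marginal private cost.
    x_competitive = price / c
    # Efficiency: price equals marginal private plus marginal external cost.
    x_efficient = price / (c + e)
    # Pigouvian tax: marginal external cost at the efficient output.
    tax = e * x_efficient
    # Taxed firm: price equals marginal private cost plus the tax.
    x_taxed = (price - tax) / c
    return x_competitive, x_efficient, tax, x_taxed


x_c, x_e, t, x_t = pigouvian_outcomes(price=12.0, c=2.0, e=1.0)
print(f"unregulated output: {x_c}, efficient output: {x_e}")
print(f"Pigouvian tax: {t}, output under the tax: {x_t}")
```

In this example the unregulated firm overproduces, and the tax moves output exactly to the efficient level. Note that computing the tax requires knowledge of both cost functions, which is precisely the informational problem discussed above.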
20.3 Empirical Evidence on Measuring the Negative Effects of Air Pollution
The EPA has designed and maintained a comprehensive system of air quality measuring stations. The monitors automatically produce high-resolution measurements of hourly pollutant concentrations at a single point. Pollutants analyzed include ozone, nitrogen oxides, sulfur dioxide, carbon monoxide, and particulates. Researchers can filter these measurements through an air diffusion model to obtain realistic estimates of air quality at places and locations that are not directly monitored. As a consequence, the EPA has a fairly good assessment of air quality in most locations in the US. To translate the air quality measures into environmental costs and damage is more difficult. The most direct approach is based on measuring the negative health effects on humans. For example, asthma is an increasingly common respiratory disease that may be triggered by air pollution. The challenge is to link a deterioration of air quality to an increase in the incidence of asthma in the population. Once we have established this causal link, we need to translate the higher incidence of asthma into monetary costs. Bad air quality can also be linked to higher mortality rates. The Great London Smog in 1952 lasted for five days and led to approximately four thousand more deaths than usual. The deaths were attributed to the dramatic increase in air pollution during the period, with extremely high levels of sulfur dioxide and smoke. In response to the Great London Smog, the UK government passed its first Clean Air Act in 1956. The objective of this law was to control domestic sources of smoke pollution. In addition, the law helped introduce a cleaner version of coal that led to a significant reduction in sulfur dioxide pollution. The benefits of this policy were undoubtedly significant. Translating lower mortality rates into monetary values is rather tricky, as we discussed in section 14.6 on benefit-cost analysis.
Recall that health economists typically use a concept called "the value of a statistical life" to measure the monetary costs associated with higher mortality rates. Let's review the basic ideas. Economists typically focus on the risk-money trade-off for small risks of death. Suppose a worker has a choice between two jobs. One is perfectly safe. The other involves a very small risk of death. Let us assume this risk is 1/10,000. In a population of 10,000 each facing a 1/10,000 risk, we would expect one death on average. We observe that the second job pays an extra $700. The worker is indifferent between the two jobs. If we divide the willingness to accept the risk by the probability of death, we obtain the value of a statistical life. So in the example the willingness to accept the risk is $700. The probability of death is 0.0001. Hence the value of a statistical life is $7 million (equal to $700 times 10,000, which is the number of persons exposed that will result in one death on average). The $700 is also referred to as a compensating differential, which is the additional wage payment to the worker to compensate him or her for the higher mortality risk. The tricky part is the following. Just because some individuals are willing to roll the dice in exchange for some cash does not mean they are willing to do it if we increase the risk. Would the same worker be willing to take the job for an additional $70,000 if the risk were 0.01? On the other hand, many individuals are willing to take very risky actions if they believe they are doing the right thing, such as rescuing a child from a burning house. These are difficult ethical issues. How would you make these trade-offs if you were in charge? Direct measures of the benefits of air pollution reduction are, then, essentially based on premature mortality estimates. The main culprits for these mortality effects are increased risks of lung cancer and chronic obstructive pulmonary disease. One classic paper on this topic is Dockery, Pope, Xu, Spengler, Ware, Fay, Ferris, and Speizer (1993). (As you can tell from the really long list of authors, this paper must be a medical study.) They study the association between mortality and air pollution, adjusting for smoking and other risk factors. They report statistically significant and robust associations between air pollution and mortality. The adjusted mortality rate ratio for the most polluted city as compared with the least polluted city is 1.26. That is a rather large effect.
FIGURE 20.3. Air pollution. (Photo by author)
Urban Environmental Challenges An alternative approach is based on indirect or revealed preference methods. For example, we can measure how much households are willing to pay to move from a neighborhood with bad air quality to a neighborhood with good air quality. Since neighborhood amenities are capitalized into land and housing prices, we can measure this willingness to pay fairly accurately. Sieg, Smith, Banzhaf, and Walsh (2004) provide some compelling evidence that air quality is capitalized into housing values in Los Angeles. We discuss the hedonic approach for measuring the value of environmental amenities in more detail in chapter 23.
20.4 Heterogeneity in Abatement Costs In many cases, we would like to reduce the amount of pollutant emissions, but different firms face different costs associated with abatement or emission reduction. In that case, we need to take this heterogeneity in abatement costs into consideration when designing sensible regulations. To illustrate the key issues, let's consider an example with two firms, A and B, that have different abatement technologies. The cost curves of firms A and B are given by
\[
C_A(x_A) = x_A^2, \qquad C_B(x_B) = 0.5\,x_B^2 \tag{20.6}
\]
Note that x no longer denotes the output of a firm but the reduction in pollution that is achieved. The marginal cost curves of firms A and B are, therefore, given by
\[
C_A'(x_A) = 2x_A, \qquad C_B'(x_B) = x_B \tag{20.7}
\]
Hence firm B has lower abatement costs and is, therefore, more efficient than firm A. Suppose we would like to reduce total emissions by 300 units. If we mandate an equal reduction of 150, we obtain
\[
C_A + C_B = 150^2 + 0.5 \times 150^2 = 22{,}500 + 11{,}250 = 33{,}750 \tag{20.8}
\]
Is that efficient? The answer is obviously no since we ignore the differences in technology among firms. The marginal costs of firms A and B are
\[
C_A'(150) = 300, \qquad C_B'(150) = 150 \tag{20.9}
\]
Firm A is reducing pollution by too much and firm B is doing too little. There are potential gains from trade. It makes sense to shift abatement from firm A to firm B. As long as the marginal costs for firm A are larger than the marginal costs for firm B, these incentives remain. Efficiency, therefore, requires that firms equalize the marginal abatement costs.
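To see the gains from trade in numbers, consider shifting a single unit of abatement from firm A to firm B, starting from the equal mandate of 150 units each. This is only an illustrative calculation using the cost curves given above for the two firms.

```latex
\begin{align*}
C_A(149) + C_B(151) &= 149^2 + 0.5 \times 151^2 \\
                    &= 22{,}201 + 11{,}400.5 = 33{,}601.5 \\
\text{Savings}      &= 33{,}750 - 33{,}601.5 = 148.5
\end{align*}
```

The savings of 148.5 are close to the gap between the two marginal costs at the mandate (300 minus 150). The gap shrinks with every additional unit that is shifted, and it disappears once marginal costs are equal.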
FIGURE 20.4. Equalizing marginal costs.
The idea of equalizing marginal costs to achieve efficiency is illustrated in figure 20.4. Broadly speaking, we need to move the dotted line up so that the sum \(x_A + x_B\) is equal to the aggregate reduction target. Alternatively, the efficient allocation can be computed by solving the following system of equations:
\[
C_A' = 2x_A = x_B = C_B', \qquad x_A + x_B = 300 \tag{20.10}
\]
Substituting the second equation into the first equation implies that \(x_A = 100\) and \(x_B = 200\). Total abatement costs at the efficient allocation are
\[
C_A + C_B = 100^2 + 0.5 \times 200^2 = 30{,}000 \tag{20.11}
\]
How do we implement this allocation? The easiest way is to use a cap and trade system. We initially assign responsibilities of abatement to the two firms and then let the firms trade in the permit market to reap potential gains that arise from differences in costs. So suppose both firms are initially responsible for 150 units of abatement. Then it makes sense for firm A to purchase 50 permits from firm B. As long as markets for permits are competitive, we would expect that this outcome could be implemented at a price of $200 per permit.
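The cap and trade logic above can be checked with a short sketch. It uses the two cost curves from this section. The function names and the bookkeeping of each firm's gain from trade are our own illustrative assumptions, not part of the original exposition.

```python
# Two-firm abatement example: equal mandate versus the efficient allocation.
# Cost curves are taken from the text: C_A(x) = x^2 and C_B(x) = 0.5 * x^2.

def cost_a(x: float) -> float:
    return x ** 2

def cost_b(x: float) -> float:
    return 0.5 * x ** 2

def efficient_split(target: float) -> tuple[float, float]:
    """Solve 2*x_a = x_b and x_a + x_b = target, which gives x_a = target / 3."""
    x_a = target / 3
    return x_a, target - x_a

target = 300.0
mandate = target / 2                      # each firm must abate 150 units

mandate_cost = cost_a(mandate) + cost_b(mandate)
x_a, x_b = efficient_split(target)
efficient_cost = cost_a(x_a) + cost_b(x_b)

# With competitive permit trading, the price equals the common marginal cost.
permit_price = 2 * x_a                    # C_A'(x_a) = 2 * x_a = x_b = C_B'(x_b)

# Firm A pays firm B to take over (mandate - x_a) units of abatement.
traded = mandate - x_a
gain_a = cost_a(mandate) - cost_a(x_a) - permit_price * traded
gain_b = permit_price * traded - (cost_b(x_b) - cost_b(mandate))

print(x_a, x_b)                   # 100.0 200.0
print(mandate_cost, efficient_cost)  # 33750.0 30000.0
print(permit_price, gain_a, gain_b)  # 200.0 2500.0 1250.0
```

Both firms are better off after trading, and their gains add up to the 3,750 difference between the two total cost figures.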
20.5 A Case Study: The Clean Air Act The US Clean Air Act of 1963 is a federal law that was designed to control air pollution. Amendments to the law were approved by Congress in 1970, which greatly expanded the federal mandate. These amendments required comprehensive federal and state regulation of all pollution sources. They also significantly expanded federal enforcement. The Environmental Protection Agency (EPA) was established on December 2, 1970, for the purpose of consolidating all federal activities related to the environment into one agency. The activities included monitoring air and water pollution, setting environmental standards, and enforcing standards and laws. The early versions of the Clean Air Act reduced SO2 emissions but encouraged the use of older plants that were subject to less stringent regulation. This provided some strange incentives for electricity producers to use old power plants. The 1990 amendments encouraged emissions trading. This was an attempt to rectify this problem. The law created an SO2 allowance system that granted plants permits to emit SO2 in limited quantities. More importantly, it allowed them to trade those permits. As we have seen above, these types of trading systems can be useful when firms face heterogeneity in costs in pollution abatement.
20.6 Regulation under Uncertainty There is significant uncertainty about the potential benefits and costs of abatement. You may ask yourself, How does uncertainty about abatement costs affect the optimal corrective strategies? If costs are high, regulation could be expensive since plants are forced to comply. Using a price mechanism avoids this problem since firms set the marginal cost of adjustment equal to the subsidy rate. If costs are uncertain, so is the amount of pollution reduction that a subsidy achieves. A mandate implements a given reduction, but the costs that it imposes on firms are uncertain. Weitzman (1974) provides a detailed analysis of price and quantity regulations in the presence of asymmetric information. The basic challenge here is that the regulator does not know the "type" of firm she is facing. There are good types of firms that have access to cheap abatement technology. These firms can reduce pollution at a low cost. The bad type of firm only has access to expensive technology. What should the regulator do when she does not know whether she is facing a good or a bad firm? Should she fix the quantity and lose control over the abatement costs? Or should she fix the price and lose control over the emission level? There is no clear answer to that question. It depends on the degree of uncertainty about both costs and benefits. For example, if it is costly to produce too much output, then we probably want to use a quantity control such as a quota. If it is costly to produce too little output, then a price instrument may be preferable. We conclude that price and quantity instruments are not necessarily equivalent and do not generate the same outcome if there is uncertainty about costs and benefits. The advantage of a price instrument is that it allows firms to adjust, which can lead to lower welfare losses than mandates. As firms adjust to price signals, the regulator has less control over the amount of pollution reduction.
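The trade-off between fixing the quantity and fixing the price can be illustrated with a small sketch. We assume, purely for illustration, that abatement costs are theta times x squared, that the good type has theta = 0.5 and the bad type has theta = 1.0 (the same curves as firms B and A in section 20.4), and that the regulator chooses between a mandate of 150 units and a subsidy of $200 per unit. These numbers are hypothetical, and the sketch is not a welfare comparison in the sense of Weitzman (1974).

```python
# Price versus quantity regulation when the regulator does not know the firm's type.
# Assumed abatement cost: C(x) = theta * x^2, so marginal cost is 2 * theta * x.

def abatement_cost(theta: float, x: float) -> float:
    return theta * x ** 2

def abatement_under_subsidy(theta: float, subsidy: float) -> float:
    """The firm abates until marginal cost equals the subsidy: 2*theta*x = subsidy."""
    return subsidy / (2 * theta)

firm_types = {"good": 0.5, "bad": 1.0}   # hypothetical cost parameters
mandate = 150.0                          # quantity instrument
subsidy = 200.0                          # price instrument

quantity_rule = {}
price_rule = {}
for name, theta in firm_types.items():
    # Mandate: abatement is known in advance, the cost is not.
    quantity_rule[name] = (mandate, abatement_cost(theta, mandate))
    # Subsidy: marginal cost is known in advance, abatement is not.
    x = abatement_under_subsidy(theta, subsidy)
    price_rule[name] = (x, abatement_cost(theta, x))

for name in firm_types:
    print(name, "mandate:", quantity_rule[name], "subsidy:", price_rule[name])
```

Under the mandate, abatement is 150 for both types, but the cost is either 11,250 or 22,500. Under the subsidy, the marginal cost is $200 for both types, but abatement is either 200 or 100 units.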
In a dynamic environment, additional complications arise due to the possibility that a regulator may change regulation in the future. Consider a simple two-period model in which firms have the opportunity to make a costly investment in the first period that reduces abatement cost in the second period. Let us assume that this investment is optimal as long as the regulator does not impose additional new regulations in the second period. If the regulator can commit to such a policy, firms find it in their interests to invest in the new technology. If the regulator, however, cannot commit herself to a regulation policy in period 2, firms may not want to invest in anticipation of a subsequent tightening of the regulations in period 2. This is sometimes called a "ratchet effect."
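A stylized numerical version of the two-period argument is sketched below. All functional forms and numbers are hypothetical assumptions chosen for illustration: period-2 abatement costs are theta times the mandate squared, the investment lowers theta from 1.0 to 0.5 at a cost of 5,000, and a regulator who cannot commit is assumed to raise the mandate from 150 to 200 once she observes the investment.

```python
# Ratchet effect in a two-period model (illustrative numbers only).

def period2_cost(theta: float, mandate: float) -> float:
    """Abatement cost in period 2 for a firm with cost parameter theta."""
    return theta * mandate ** 2

def firm_invests(
    mandate_if_invest: float,
    mandate_if_not: float,
    theta_high: float = 1.0,
    theta_low: float = 0.5,
    investment_cost: float = 5_000.0,
) -> bool:
    """The firm invests only if doing so lowers its total cost."""
    cost_invest = investment_cost + period2_cost(theta_low, mandate_if_invest)
    cost_wait = period2_cost(theta_high, mandate_if_not)
    return cost_invest < cost_wait

# Commitment: the mandate stays at 150 whether or not the firm invests.
with_commitment = firm_invests(mandate_if_invest=150.0, mandate_if_not=150.0)

# No commitment: the regulator tightens the mandate to 200 after the investment.
without_commitment = firm_invests(mandate_if_invest=200.0, mandate_if_not=150.0)

print(with_commitment)     # True
print(without_commitment)  # False
```

With commitment, investing costs 16,250 in total and not investing costs 22,500, so the firm invests. Without commitment, investing costs 25,000 because of the tighter mandate, so the firm prefers to keep the old technology.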
20.7 A Case Study: The Flint Water Crisis The city of Flint, Michigan, is located seventy miles from central Detroit and is home to about 99,000 residents. Almost 42 percent of residents in Flint live below the poverty line, and the city has faced extreme economic debilitation, which led to a deficit of around $25 million in 2011. The city of Flint had historically been home to numerous automobile manufacturers, which supported the city by providing jobs and generating economic activity. By the mid-1980s, General Motors and other automobile manufacturers had shut down several of their factories in Flint, laying off thousands of the city's workers and drastically shocking the city's economy. Flint had been struggling for years to get back on its feet. State legislators were reluctant to bail out the city and refused to provide the needed funds that would help Flint with its deficit. In an effort to save the city's economy, Flint's government officials decided to decrease spending, place various regulations on businesses, and raise taxes. Flint's economy continued to decline, and the city quickly became bankrupt. It was placed into a state of financial receivership in December 2011. Under this status, Governor Rick Snyder appointed emergency financial managers who were then granted the power to override local elected officials. Bankrupt and in a state of receivership, Flint was desperately looking for ways to cut costs. By 2011 the Flint water supply fund was facing a $9 million deficit, causing the state of Michigan to reevaluate Flint's finances and water supply costs. Since 1967 the city's primary water supply had come from Lake Huron and was treated and operated by the Detroit Water and Sewerage Department (DWSD). In 2011 Flint officials took a vote on whether to renew their thirty-year contract with DWSD and decided to switch to the Karegnondi Water Authority (KWA), which would drastically cut costs. 
Government officials decided to build a new pipeline connecting Flint to Lake Huron, which would be operated by KWA rather than the city of Detroit. The new pipeline was said to take two years to complete and would save the city $4 million per year. In the long run, Flint officials hoped that the switch would save enough money to eventually pull the city out of its deficit. However, since the city had ended its contracts with DWSD, it needed an alternate water source for Flint during the new pipeline construction. So in April 2014 the city decided to switch Flint's primary water supply to the Flint River, which was processed and treated in Flint and had previously been used as the city's emergency backup source.
Throughout 2014 there were various complaints from residents about the city's tap water. Warnings were sent to residents notifying them of numerous contaminations (such as E. coli and other bacteria) and advising them to boil water before consumption. After the water had been treated several times for these contaminants, families with children were told to consult with a physician to decide whether their children should drink the tap water. In addition to these contaminants, the city soon discovered that the water from the Flint River was far more corrosive than the Lake Huron water supplied by Detroit, which caused large amounts of lead from the piping materials to leach into the city's tap water. The EPA ignored citizen complaints for months; then once it did the testing, it tried to bury its own report and fired an employee who was trying to get it to act. Because of the policy failures, water with dangerous levels of lead was supplied to Flint households for over a year. As a result, an estimated 8,000 children in Flint are now at risk of learning and developmental problems caused by lead consumption. The Flint water crisis sparked heated debate over government accountability and serves as a paradigm of government failure on all three levels: federal, state, and local. Numerous hearings brought into question whether state or federal regulators were to blame for leaving the crisis unaddressed long after they learned of the dangerous lead content in Flint's water supply. Lawmakers from both parties pushed for various officials in the EPA and the state of Michigan to resign. More specifically, Republicans blamed EPA administrator Gina McCarthy, who failed to address this issue until January of 2015, months after the EPA cited high levels of lead in Flint's water. Democrats focused on state legislators and blamed Michigan's governor Rick Snyder for knowing about the issue and failing to address it for almost a year. 
Darnell Earley, the appointed emergency manager at the time, denied his responsibility and claimed that the decision to switch water supplies was part of a long-term plan that had been approved by Flint's mayor, Dayne Walling, and by a city council vote months before he began his term. Following a series of hearings and investigations into the Flint water problem, the city switched its water supply back to Lake Huron in October 2015. The Flint case highlights a potential problem of decentralization. Low- and moderate-income households tend to live in places with low-quality environmental amenities, and hence tend to suffer worse health outcomes. The environmental justice movement has focused on bringing these issues to the attention of policymakers and has actively lobbied for changes in policies that address this type of inequality.
20.8 The Impact of Global Warming on Cities The prevailing scientific assessment of climate change is that at least some of the warming observed over the past decades is attributable to population and economic growth. Economic expansion has increased the amounts of carbon dioxide (CO2) and other greenhouse gases (GHGs), which are the primary causes of the human-induced component of warming. GHGs are released by the burning of fossil fuels, land clearing, agriculture, and other activities, and they strengthen the greenhouse effect. The greenhouse effect is the process in which the absorption of infrared radiation by an atmosphere warms a planet.
In 2005 the world produced 28.1 billion metric tons of CO2. The number is predicted to rise to 42.3 billion tons by 2030. Some of the leading researchers have concluded that we need to stabilize the atmospheric carbon dioxide concentration at 500 parts per million (ppm) or even as low as 350 ppm. That would mean reducing our total CO2 emissions to no more than 19.1 billion tons per year, or approximately 2.5 tons per person. Note that a Toyota Corolla hits that limit after 7,500 miles per year. (The average US driver travels 12,000 miles annually.) Global warming is a function of the total emission of GHGs of all individuals and firms. In principle, we could use a global carbon tax or a cap and trade system to reduce the emission of GHGs. In practice, the problems that have to be overcome to design and implement such a system are rather daunting. Under international law, obligations may be imposed on a sovereign state only with its consent. Treaties thus require for all practical purposes unanimity among all major players. Possible mechanisms are noncooperative solutions in which each country operates its own system. This best describes the current laissez-faire approach. Alternatives are nonbinding voluntary agreements, specific and binding treaties as attempted in the failed Kyoto Protocol or the Paris Treaty, or limited delegation of regulatory authority to a supranational body. It is difficult to reach an agreement on global warming since any policy involves estimating and balancing costs and benefits that are subject to much uncertainty. Kahn (2010) discusses the likely implications of global warming for cities and their urban development. He explains how cities will have to transform as we change our behavior and as our natural surroundings respond to the changing climate. "Safe" or climate-resilient cities are defined as those that are likely to be immune to climate change or may actually benefit if they are in an exceptionally cold climate. 
Examples are Minneapolis or Milwaukee. "Green" cities will take proactive steps to deal with the negative consequences of climate change and to capitalize on the opportunities that come with these changes. Climate change will trigger migration from cities that are heavily affected to cities that are less affected, thus potentially leading to large changes in respective property values. Cities that take proactive steps to deal with rising energy and transportation costs will be rewarded with the changing climate. On the upside, global warming will trigger innovation in products that will help people cope with some of the negative consequences of climate change. Individuals and firms that design new helpful and applicable products to battle global warming will undoubtedly make a huge profit. Innovation will also be triggered by the change in relative prices for natural resources as well as electricity. As greater demand shifts toward products that use renewable energy, those energy sources will also experience an increase in demand. Cities can take proactive steps to speed up the transformation of the local economy. An example is Seoul, South Korea, which has suffered from serious traffic congestion and air pollution problems for some time. In an innovative step to alleviate traffic jams and improve air quality, the metropolitan government made electric cars available for rent throughout Seoul starting in 2013, under its Electric Vehicle Sharing program. Registered drivers can book a Kia Ray EV online for approximately $5 an hour. Charging is free and the rental expenses include insurance costs.
20.9 A Case Study: Redesigning Flood Zones in NYC After Hurricanes Sandy, Harvey, Irma, and Maria hit the northeastern states, Texas, Florida, and Puerto Rico between 2012 and 2017, many cities in the US began rethinking their flood plans. The added threat of global warming has also caused the hurricane-prone Atlantic coast to treat the issue with greater urgency as the incoming storms are forecast to be more frequent and more damaging. Specifically, New York City has started to redraw the flood maps surrounding the city in an attempt to begin an almost complete reconstruction of its drainage system. FEMA, the Federal Emergency Management Agency, has taken the lead in New York's flood planning. The agency has given a realistic estimate of which areas need protection. The plan has received pushback from communities that will be affected by the proposed changes. Most of the city was built before flood planning. Under the new plan developed by FEMA, many more homes and establishments will fall under the expanded flood zone and will be required to purchase flood insurance. While some view this as a valuable investment in preventing future damage and costs to the city, many others argue that the inclusion of 26,000 additional buildings is excessive and unnecessary. In addition, the city has adopted more stringent building codes for homes in a flood zone. For example, new regulations require that electrical and mechanical systems be moved above a stated height. These renovation costs are not included in the city's total estimates. Developers have started to build new homes in the affected neighborhoods that satisfy the new regulations. While more affluent individuals can afford these flood-protected homes and apartments, lower-income residents will ultimately be displaced from these neighborhoods.
20.10 A Case Study: Rebuilding New Orleans after Hurricane Katrina The city of New Orleans experienced one of the most devastating urban natural disasters in modern history when Hurricane Katrina flooded 80 percent of the land in 2005. Since then, the city has been praised for its ability to rebuild businesses and revive the anchoring tourism industry. With the help of federal recovery spending, volunteers, nonprofits, and other government programs, New Orleans was able to regain footing and create jobs in specific industries that outpaced prestorm levels. There is much speculation over whether the decade-long recovery established a base for continuing economic growth. The city focused on recovering lost tourism through restaurant, hospitality, and service businesses. As a consequence, it disproportionately created many low-wage jobs in the region. Similarly, the economic progress has been experienced mostly by white residents, while a third of the black residents who fled the city following the storm did not return. The rebound has resulted in greater economic inequality for New Orleans, which Bloomberg has labeled the country's most unequal city. Also, rebuilding the destroyed public infrastructure increased production in construction and contracting, but after over a decade the need for these services declined. The influx of manufacturing and industrial jobs initially stabilized the
city, but in order for New Orleans to prosper, other industries are needed to generate a diversified local economy. The city overcame catastrophic destruction and in ten years almost reached its prestorm population and production levels. Its recovery gave hope to many other declining US cities.
20.11 Conclusions Most urban environmental problems are due to negative production externalities, which arise when a firm's actions negatively affect another party (households or other firms) and the firm does not fully compensate the other parties for the additional costs it imposes. If externalities are present, then the market will not produce an efficient outcome and a government intervention is potentially justified. The standard externality framework is useful for understanding basic urban problems, such as providing clean air and clean water. The solution to these problems typically includes a combination of taxes, subsidies, mandates, and other regulations. Cities need to work together with state and federal agencies to solve these environmental problems. While most cities in the US and other developed countries have made significant progress in accomplishing the dual goals of providing clean air and water, many cities in developing countries are still struggling to achieve these objectives. Global warming poses another set of problems since it affects the varied climates of the world differently. Obviously, cities and states cannot solve this global problem on their own. A basic plan of action is the use of multinational treaties to deal with global warming. Unfortunately, strategies have been unclear thus far in creating an effective deal that would spread the costs of reducing greenhouse gases among countries. Greenhouse gas reduction serves as an example of a global public good and helps illustrate the massive problems associated with providing these types of goods. Without a serious reduction in greenhouse gases in the near future, it is likely that temperatures will continue to rise. In that case, cities need to prepare themselves for the rapidly approaching consequences of global warming. We have seen that there is a difference between a "green" city and a "safe" city. 
"Safe" or climate-resilient cities are those that are likely to be unaffected by climate change or may even partially benefit because they are in an unattractive colder climate. "Green" cities are those that will take proactive steps to deal with the negative consequences of climate change and to capitalize on the opportunities that are associated with these changes. Americans are mobile and approximately 3 percent of Americans move to different states each year. In 1900 38 percent of the US population lived in the South and the West. By 2000 this percentage had grown to 58 percent. In 1950 Las Vegas had a population of 25,000 people. By 2008 the population was 560,000, with more than 1.8 million in the greater metro area. Phoenix and Dallas show similar growth patterns. Coastal areas located within fifty miles of an ocean or a great lake account for 50 percent of the population in the US and 56 percent of income, but account for only 13 percent of the land. Individuals and firms have migrated from cities that are climate resilient to cities that are at risk (Kahn, 2010). These mobility patterns create a number of public policy challenges. The recent examples of Hurricanes
Katrina (New Orleans), Sandy (NYC), and Harvey (Houston) illustrate how vulnerable many coastal cities in the US are to flooding. Cities need to take proactive measures to avoid even larger potential disasters. Bad and careless policies can be disastrous.
20.12 Technical Appendix: Solving the Model of Externalities To formalize these ideas, let us consider an economy with two firms. The first firm produces one good \(x_1\). The cost function is given by \(C(x_1)\). The profit-maximizing output for this firm is given by
\[
\max_{x_1} \Pi_1 = p\,x_1 - C(x_1) \tag{20.12}
\]
There is a negative production externality, and for every unit of \(x_1\), a second firm incurs external costs given by
\[
E(x_1) \tag{20.13}
\]
We are assuming for simplicity that 1 unit of output creates 1 unit of pollution. We also assume for simplicity that firm 2 is passive and does not engage in production. Convince yourself that this assumption can be easily relaxed. To compute the optimal allocation in this model, let us consider a planner's problem that maximizes total profits:
\[
\max_{x_1} \; p\,x_1 - C(x_1) - E(x_1) \tag{20.14}
\]
Optimal allocations need to satisfy
\[
p = C'(x_1) + E'(x_1) \tag{20.15}
\]
The price must equal total marginal costs, which are the sum of marginal private and external costs. One way to implement this allocation is to merge firms 1 and 2. The new merged or integrated firm now faces the same decision problem as the planner and will implement the first best outcome. In a competitive market, firm 1 maximizes profits. Hence firm 1 sets outputs such that
\[
p = C'(x_1) \tag{20.16}
\]
The price equals marginal private costs. Note that marginal private costs are smaller than marginal social costs. The market, therefore, produces too much of the private good. Firm 1 does not take the negative externality into consideration when making decisions. One way to reduce the output of firm 1 is to introduce a tax that must be paid for each unit of output. Let t denote this per unit tax. Firm 1 now faces the following decision problem:
\[
\max_{x_1} \Pi_1 = p\,x_1 - t\,x_1 - C(x_1) \tag{20.17}
\]
The first-order condition is now
\[
p = t + C'(x_1) \tag{20.18}
\]
Hence, taxes reduce output. What would be the optimal tax in this model? From the previous analysis, we see that we need to set the tax rate such that it equals the marginal external costs:
\[
t = E'(x_1) \tag{20.19}
\]
Note that it may be hard to implement the optimal tax in practice, because the external costs of pollution may be hard to measure. Market allocations are inefficient because firm 2 cannot influence the behavior of firm 1. Adding a market for permits provides another mechanism for an efficient allocation. Suppose you need permits to engage in an activity that causes pollution. If the market price for a pollution permit is r, then firm 1 can decide how many permits to buy from firm 2. Firm 2 can decide how many permits it wants to sell to firm 1. The profit maximization problem for firm 1 then becomes
\[
\max_{x_1} \Pi_1 = p\,x_1 - r\,x_1 - C(x_1) \tag{20.20}
\]
First-order conditions are then given by
\[
p = r + C'(x_1) \tag{20.21}
\]
Similarly, firm 2 maximizes
\[
\max_{x_2} \Pi_2 = r\,x_2 - E(x_2) \tag{20.22}
\]
which implies that
\[
r = E'(x_2) \tag{20.23}
\]
In equilibrium we need \(x_1 = x_2\), which yields
\[
p = C'(x_1) + E'(x_1) \tag{20.24}
\]
The allocation of resources under trading is thus efficient. Thus creating property rights and a market for pollution rights can solve the externality problem. This result is attributed to Coase (1960), who conjectures that market equilibria will be efficient if all property rights are defined properly (Coase Theorem).
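The appendix results can be checked numerically with a minimal sketch. The functional forms and parameter values below are our own assumptions for illustration and do not appear in the text: private costs are 0.5·c·x², external costs are 0.5·e·x², and the output price p is fixed.

```python
# Externality model with assumed quadratic costs (illustrative parameters only).
#   private cost   C(x) = 0.5 * c * x^2  ->  C'(x) = c * x
#   external cost  E(x) = 0.5 * e * x^2  ->  E'(x) = e * x

p = 12.0   # output price (assumed)
c = 1.0    # slope of marginal private cost (assumed)
e = 0.5    # slope of marginal external cost (assumed)

# Competitive market: price equals marginal private cost, p = c * x.
x_market = p / c

# Planner: price equals marginal social cost, p = (c + e) * x.
x_optimal = p / (c + e)

# Corrective tax: set t equal to marginal external cost at the optimum.
tax = e * x_optimal

# Taxed firm: p = t + c * x.
x_taxed = (p - tax) / c

# Permit market: firm 2 supplies permits until r = E'(x), firm 1 demands
# permits until p = r + C'(x); clearing gives p = (c + e) * x and r = e * x.
x_permits = p / (c + e)
permit_price = e * x_permits

print(x_market, x_optimal)       # 12.0 8.0
print(tax, x_taxed)              # 4.0 8.0
print(permit_price, x_permits)   # 4.0 8.0
```

In this example the unregulated market produces 12 units, while the tax and the permit market both deliver the planner's 8 units. The optimal tax and the equilibrium permit price coincide because both equal the marginal external cost at the optimum.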
Finally, consider the model with heterogeneous technologies. The abatement cost minimization problem can be written as
\[
\min_{x_A,\,x_B} \; C_A(x_A) + C_B(x_B) \quad \text{subject to} \quad x_A + x_B = \bar{x} \tag{20.25}
\]
where \(\bar{x}\) is the aggregate reduction target. Taking partial derivatives with respect to \(x_A\) and \(x_B\) implies that
\[
C_A'(x_A) = C_B'(x_B) \tag{20.26}
\]
Hence we want to equalize marginal abatement costs across firms.
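As a worked example, we can apply this condition to the two cost curves from section 20.4 with a reduction target of 300. The Lagrange multiplier \(\lambda\) is notation introduced here for illustration.

```latex
\begin{align*}
\mathcal{L} &= x_A^2 + 0.5\,x_B^2 + \lambda\,(300 - x_A - x_B) \\
\frac{\partial \mathcal{L}}{\partial x_A} &= 2x_A - \lambda = 0
  \quad\Rightarrow\quad x_A = \lambda/2 \\
\frac{\partial \mathcal{L}}{\partial x_B} &= x_B - \lambda = 0
  \quad\Rightarrow\quad x_B = \lambda \\
x_A + x_B &= \tfrac{3}{2}\lambda = 300
  \quad\Rightarrow\quad \lambda = 200,\; x_A = 100,\; x_B = 200
\end{align*}
```

The multiplier equals the common marginal abatement cost, which is also the $200 permit price found in section 20.4.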
20.13 Debate: Lessons from Flint Discuss the key lessons and takeaways from the Flint water crisis. The pro side should argue that we need to centralize, strengthen the EPA, and provide better federal monitoring of local communities. The con side should argue that we need to decentralize, strengthen and empower local communities, and decrease the importance of federal oversight of essentially local activities. The following questions should help structure the debate:
1. What's the role of the state of Michigan in Flint's bankruptcy?
2. Why was Flint in financial trouble?
3. Who are the leading players in the crisis?
4. Who made the critical decisions that led to the crisis?
5. What were the decisions that led to the mistakes?
6. What's the role of the EPA?
7. What expertise should be provided at the federal level and why?
8. What do we know about the quality of the water supply in other US cities?
20.14 Problem Sets 1. What are two advantages of taxing people based on actual car use compared
to a gasoline or carbon tax? 2. What would be a potentially simpler and less intrusive alternative to a carbon tax that could also reduce congestion in cities? Explain your answers. 3. Explain why the lack of well-defined property rights can be a problem if there are environmental externalities. 4. Provide two real-world examples w here we use taxes to correct for negative externalities. 5. Explain how global warming may affect N ew York City. Discuss two policies that you would advocate that the mayor of NYC should implement. 6. A competitive refinery that operates in a city sets prices equal to marginal costs. The inverse demand function is given by
p = 20 - q
(20.27)
379
Chapter 20 where q is the quantity and p is the price paid by the consumer. Suppose the private costs faced by the refinery are C(q) = 2q
+ 0.5q2
(20.28)
a) Compute the competitive market output q111 and price level p111 • b) Suppose the refinery imposes an externality on the residents of the city, given by E(q)= 0.25q 2
(20.29)
What is the socially optimal output q0 and price level p0 ? (Hint: Set prices equal to marginal social costs.) c) Suppose the government imposes an emissions fee oft per unit of output. How large must this fee be if the market is to produce a socially efficient level of output? d) Somebody has suggested the following welfare measure for the loss associated with the externality: q,n
L=
1 qo
[MSC(q) - p(q)] dq
(20.30)
where the marginal social costs MSC( q) = C' (q) + E' (q). Compute the loss L for the example above and provide a graphical illustration of the magnitude of the loss. 7. The newly elected mayor of a city has pledged to reduce air pollution. The city has no close neighbors. The only source of air pollution are the two domestic plants run by firm A and firm B. Firm A has pollution abatement (= reduction) costs of x 3, where xis a unit of pollution. Firm B has a pollution abatement cost of x 2 . Assume that neither firm is initially engaging in pollution abatement. The per unit benefit to a unit of pollution abatement experienced by the city's citizens is constant at $300. a) What is the socially optimal level of pollution abatement? How is the socially optimal level of abatement split between the two firms? b) The mayor considers engaging in command and control style quantity regulation and declares that each firm must engage in 80 units of pollution abatement. Is this optimal? Why or why not? c) Alternatively, the mayor considers providing a subsidy of $300 per unit of pollution abatement. What is the per firm and total level of pollution abatement? Is this socially optimal? d) The mayor also considers issuing pollution permits and establishing a market for these permits. For reasons associated with the relative generosity of the firms to his recent election campaign, firm A is given permits such that it must engage in 100 units of pollution abatement if it fails to enter the market. Firm B is given permits such that it must engage in 60 units of pollution abatement if it fails to enter the market. Each unit allows the firm holding the unit to produce 1 unit of pollution. Assume that the market for pollution rights is perfectly competitive.
380
Urban Environmental Challenges To derive the demand for pollution rights of firm A, solve the following cost minimization problem: (20.31)
Solve the following problem to derive the demand for permits of firm B:

$$\min_{a_B} \; (60 - a_B)^2 + p\,a_B \qquad (20.32)$$
A market equilibrium is achieved when $a_A = -a_B$. Compute the market equilibrium. Is the market equilibrium socially optimal? Explain.
21
Managing Cities in Developing Countries
21.1 Motivation

According to World Urbanization Prospects published by the United Nations in 2012, the world population is expected to increase by 2.3 billion over the next four decades, from 7.0 billion in 2010 to 9.3 billion in 2050. Urban areas of the world are expected to absorb all the population growth while drawing in some of the rural population. Furthermore, most of the population growth expected in urban areas will be concentrated in the cities and towns of less developed regions. Asia, in particular, is projected to see its urban population increase by 1.4 billion, Africa by 0.9 billion, and Latin America and the Caribbean by 0.2 billion. Population growth is, therefore, becoming largely an urban phenomenon concentrated in the developing world. Managing cities in developing countries is one of the biggest economic challenges.

Table 21.1 reports the size and per capita GDP of some of the largest metropolitan areas in the world. These statistics were computed based on the New York City Global City database. Population estimates are from 2010. GDP per capita estimates are for 2013. The top panel focuses on high-income metropolitan areas, and the lower panel focuses on low- to moderate-income metropolitan areas. The table shows that there are some significant differences between metropolitan areas in developed and developing economies.

The key question that we try to answer in this chapter is the following: How does fiscal policy management in cities located in developing economies differ, if at all, from policies in developed economies? Of course, one of my friends who grew up in Philadelphia argues that life in some parts of Philadelphia resembles life in developing countries: bad schools, bad infrastructure, bad housing, lots of poverty, lots of crime, and not much hope for progress. But at the end of the day, you will agree that a comparison between poor and neglected neighborhoods in the US and those in developing economies will not get us very far.
So where shall we begin? Most urban economists agree that cities are also the centers of economic innovation, production, and growth in most developing economies. The main objective of a city is to protect and enhance the economic
TABLE 21.1. Large International Metropolitan Areas

Rank   City            Country           Population   GDP per Capita

High-Income Cities
1      Tokyo           Japan             34.9         41.4
2      Seoul           South Korea       23.6         32.2
3      New York        United States     18.1         67.7
4      Moscow          Russia            14.9         44.8
5      Paris           France            11.6         50.0
6      Osaka           Japan             11.3         35.2
7      Chicago         United States     9.6          55.0
8      London          United Kingdom    9.0          52.0
9      Los Angeles     United States     9.0          66.2
10     Hong Kong       China             7.1          48.7

Low- and Moderate-Income Cities
1      Jakarta         Indonesia         27.9         7.2
2      Delhi           India             24.0         9.5
3      Mexico City     Mexico            19.9         19.9
4      Sao Paulo       Brazil            19.8         23.7
5      Beijing         China             19.6         20.3
6      Shanghai        China             19.2         21.4
7      Karachi         Pakistan          18.0         21.9
8      Mumbai          India             18.0         5.9
9      Buenos Aires    Argentina         13.3         26.1
10     Istanbul        Turkey            12.5         22.8

Source: Author's calculations based on data from Global City database. Note: Population is in millions; GDP per capita is in thousands (USD).
advantages of the local economy by efficiently providing a variety of public goods and services. It is essential that the advantages that arise from proximity and agglomeration are protected, no matter where a city is located. What might potentially undermine these advantages in developing economies are the same forces that can generally lead to inefficient diffusion of economic activity: government corruption; inefficient tax and spending policies; a bloated, incompetent, or overpaid bureaucracy; excessive congestion; air and water pollution; underfunded schools; and the adverse effects of urban poverty and crime. These factors undermine agglomeration advantages and thus the economic potential of cities. We conclude that there are many commonalities between managing cities in developing countries and dealing with those in developed countries. Can we go as far as saying that the basic principles that apply to New York, Berlin, and Tokyo also apply to Sao Paulo, Delhi, and Karachi? The answer to this question has to be "probably not." There are fundamental differences between rich and poor countries. We need to take these differences into consideration when we try to understand urban fiscal policies in developing economies. Otherwise, we are likely to confuse cause and effect and, therefore, make poor policy recommendations. Before we ask ourselves how we should
FIGURE 21.1. Twin Towers in Kuala Lumpur. (Umar Mukhtar/pexels.com)
manage cities in developing economies, we need to take a step back and ask ourselves a different set of questions. What are the origins of poverty? Why are some countries rich and other countries poor? To answer these questions, we largely draw on recent research in political economy.
21.2 The Origins of Power, Prosperity, and Poverty

Why do some countries succeed and other countries fail? What are the origins of economic prosperity? What causes poverty? These questions are at the core of some exciting research in modern political economy. Acemoglu and Robinson (2012) summarize the findings of research initiatives pertinent to this topic conducted over a period of two decades. Let us summarize their basic arguments.

To obtain some intuition and to rule out common misconceptions, it is useful to study some quasi experiments that were recently conducted through the course of historical events. We can compare the economic performance of East and West Germany after World War II. We can study North and South Korea after the end of the Korean War. Alternatively, we can try to draw some lessons from China, comparing economic performance under Mao Zedong and under Deng Xiaoping. Finally, some useful examples can be found in recent African history following the decolonization of the continent. For example, one can compare Botswana, often considered to be an economic success story, with many similar countries such as Zimbabwe or the Congo, stuck in a cycle of poverty and violence. What can we learn from these case studies?
Let's start with Germany. It is not a developing country, but its industries and cities were heavily damaged during World War II. Moreover, the country was partitioned largely along geographic lines into East and West Germany. Note that both countries were very similar in culture, history, the work ethos of their people, and initial knowledge or human capital. Nevertheless, we observe very different paths of economic development after the partitioning of Germany. West Germany thrived, while East Germany had a brief period of success, stagnated in the 1970s, then eventually collapsed at the end of the 1980s. An even more striking pattern is observed in Korea after the partitioning of the country at the end of the Korean War. North Korea has performed very poorly by any standards since the very beginning. South Korea started out in the 1950s as a largely agricultural society. If anything, most of the heavy industry and natural resources were in North Korea. South Korea developed at a very rapid pace and created much prosperity in an astonishingly short period of time. The same is true for China, which was stuck in poverty during the 1950s and 1960s under the leadership of Mao Zedong. China started to thrive following a reversal of economic policies under Deng Xiaoping.

The main tenet of research in political economy, summarized in Acemoglu and Robinson (2012), is that differences in political and economic institutions that result from policy choices made by the ruling elites can largely explain the differences in prosperity among countries. Neither guns, germs, and steel nor religion, natural resources, ethnicity, or race appears to be of first-order importance. The claim is that policy choices matter most as they shape the economic, political, and cultural institutions of a country.
Broadly speaking, the ruling elites of North Korea, East Germany, and Maoist China faithfully followed the works of Karl Marx, while West Germany, South Korea, and modern China embraced modern economic principles. These examples also suggest that there may be multiple pathways to economic success since successful countries adopt different social, cultural, and political institutions. Now you may say that the West German (and South Korean) political elites were strongly influenced by the Americans when they adopted their postwar policies. How much choice did the Americans really give the West Germans in 1948? Most leading German historians would agree that the policy choices made in the late 1940s and early 1950s in Germany were not inevitable. It is hard to believe, but the fate of West German economic policies was largely decided by a handful of liberal economists under the leadership of Professor Ludwig Erhard, who was the main architect of West German economic policy during that time. The majority of the political elites and the vast majority of the voters were not enamored with market-based capitalism in postwar Germany. Sometimes a well-organized minority can actually change the course of a country. It helps to be right and have better arguments than your opponents. It also helps to have the Americans on your side. It is kind of ironic how some myths become deeply ingrained in the social fabric of a country. Many Germans are still under the illusion that the economic success of their country was primarily a result of hard work and common sacrifices, completely missing the point that hard work and sacrifices have never been sufficient conditions for economic success. The lessons from China are similar. It is important to recognize that it took a fairly controversial policy reversal to conduct China onto a path of growth and prosperity. Recent Chinese history illustrates that a country is not bound to be
stuck in poverty. Instead, the ruling elites can take active and decisive actions to improve policies. The fate of countries is in the hands of the political, economic, and military elites that are in power and control the political agenda. There are no deterministic inevitabilities! Once you adopt this notion as a working hypothesis, then you should ask yourself a few questions. Why do the ruling elites behave as they do? What are the incentives of those in power? What are the existing conflicts among different elites and between the elites and the residents of the country? How are these conflicts resolved? These types of questions then lead us to the study of political competition and institutions that are at the heart of research in modern political economy.

Let's go back to the US to illustrate an important insight. Let me highlight three important features of the political and economic institutions in the US. First, one amazing feature of political competition in the US is that, with the exception of the Civil War, there has been a peaceful transfer of power among the different political elites in the country. Just recently, the transfer of political power from President Obama to President Trump was painful for many Democrats, who had placed their hopes in Senator Clinton. But it was peaceful. Nobody went to the barricades. Almost everybody learned to accept the verdict of the voters. In most developing countries, the transfer of power from one ruling elite to another is anything but peaceful. Instead, it is often violent and highly disruptive. Second, the US has developed a legal system that provides strong protection of private property against expropriation and encourages risk taking and innovation. Again, these types of institutions are lacking in many developing countries around the globe. Third, the US embraced and sustained a tradition of state and local autonomy that is enshrined in the US Constitution.
This enabled the US to develop institutions that turned out to be essential for an efficient form of urbanization. According to Henderson (2002), these institutions included mechanisms for the internal governance and financing of cities, intergovernmental arrangements, regulatory and financial instruments for intercity communications and transport networks, a civil service with technical expertise in urban and regional planning and service provision, and institutions for an efficient functioning of national and local land markets.

What does that imply for managing cities in developing economies? It is far from easy to develop political and economic institutions necessary to sustain successful cities. We need to acknowledge the fact that cities in developing economies often exist within a structure of weak political and economic institutions. This institutional design imposes severe limitations on feasible policies. We need to take these additional constraints into consideration when discussing the viability of different urban policies. Moreover, real reform and progress often require a better institutional environment in which cities operate.
21.3 Trust and Making Credible Commitments

To illustrate the importance of political and economic institutions, let us consider an example of dynamic capital taxation. This section of the book requires a more advanced understanding of economic theory and can be skipped by readers who are not interested in these more technical issues.
Households need to make an investment decision to accumulate capital. Once the capital stock is in place, it becomes an asset with a fixed or inelastic supply. As a consequence, even a benevolent government will have strong incentives to tax capital once it is in place. These ex-post incentives to tax capital can pose a problem when the government cannot credibly commit itself to maintaining a reasonably low capital tax rate. In the absence of commitment or trust, households anticipate high future capital taxes. Hence they will underinvest in capital accumulation.

To formalize these ideas, let us consider a dynamic model with two periods, based on Persson and Tabellini (2000). Suppose a government needs to finance some exogenous expenditures using a combination of labor and capital taxes. Both taxes are potentially distortionary and thus create welfare losses. The government is benevolent and tries to maximize household welfare. In that sense, this model is an example of an optimal taxation problem.

Let us first solve the model under policy commitment. In this case, the timing of decisions is as follows: In the first stage, the government announces a tax policy, which consists of a tax on capital and a tax on labor. After the announcement, households make investment decisions in period 1. In the second period, households make labor supply and consumption decisions that are subject to government taxation. We can solve the model using backward induction.

Let's start with the household decision problem for a given tax system. A representative household of a city has preferences defined over consumption in both periods, denoted by $c_1$ and $c_2$, and leisure in the second period. Let us denote leisure by $1 - l$, where the time endowment in period 2 is normalized to be equal to 1, and $l$ is the amount of labor supplied in period 2. For simplicity, assume that preferences are given by

$$U = \tfrac{1}{2}\ln c_1 + c_2 + \tfrac{1}{2}\ln(1 - l) \qquad (21.1)$$
In the first period, the household can either consume the exogenous income, normalized to be equal to 1, or invest in a capital good, denoted by $k$. For simplicity, we treat this capital good as a storage technology that pays no interest. Hence the budget constraints for both periods are given by

$$c_1 + k = 1, \qquad c_2 = (1 - \tau_k)\,k + (1 - \tau_l)\,l \qquad (21.2)$$
where $\tau_k$ is the tax the government imposes on capital and $\tau_l$ the tax on labor. For simplicity, we also normalize wages in the second period to be equal to 1. Substituting the budget constraints into the utility function, we can write the household's maximization problem as

$$\max_{k,\,l} \; \tfrac{1}{2}\ln(1 - k) + (1 - \tau_k)\,k + (1 - \tau_l)\,l + \tfrac{1}{2}\ln(1 - l) \qquad (21.3)$$

Households take the tax rates as given and maximize utility. Suppose we have an interior solution for both $k$ and $l$; the first-order conditions for $k$ and $l$ are given by
$$\frac{1}{2}\,\frac{1}{1 - k} = 1 - \tau_k, \qquad \frac{1}{2}\,\frac{1}{1 - l} = 1 - \tau_l \qquad (21.4)$$

Solving the equations above for $k$ and $l$, we obtain the optimal capital accumulation and labor supply choices:

$$k = \frac{0.5 - \tau_k}{1 - \tau_k}, \qquad l = \frac{0.5 - \tau_l}{1 - \tau_l} \qquad (21.5)$$

Note that tax rates must be less than 0.5 to have an interior solution. We can show that the higher the capital tax (labor tax), the lower the capital investments (labor supply). Thus taxes distort the household's decision problem by lowering savings and labor supply. The higher the taxes, the larger the distortions.

Let us consider the government's decision problem at the beginning of the model. Suppose the government needs to finance an exogenous level of expenditures in the second period, given by $G$. Hence the government budget constraint is given by

$$\tau_k\,k + \tau_l\,l = G \qquad (21.6)$$

Suppose the government sets taxes to maximize household utility. In that sense, taxes are "optimal." This is the best-case scenario from the perspective of the household. To see how that works, we can substitute the optimal capital accumulation and labor supply functions in equation (21.5) into the household objective function and write household utility as a function of $\tau_k$ and $\tau_l$. The government then maximizes the utility function with respect to $\tau_k$ and $\tau_l$ subject to the government budget constraint in equation (21.6). In the technical appendix (section 21.7), we show that the government will want to set

$$\tau_k^c = \tau_l^c \qquad (21.7)$$

Note that we use superscript $c$ to denote the equilibrium under commitment. This implies that $l^c = k^c$. Hence we have (21.8). The distortions that the tax policy creates in the labor market and the capital market are equal from the perspective of the household.

We can think about an environment in which the government can credibly commit itself to a tax policy as a "good" institutional environment. The government has managed to convince its residents that it can be trusted and will not renege on the promises it made at the beginning of the model.

Let us see what happens if commitment is not possible and the government cannot be trusted. In that case, the timing of decisions changes. First, households
decide how much capital to accumulate in period 1 in anticipation of a tax rule that will be implemented in period 2. The government then implements a tax policy for period 2. Finally, households make consumption and leisure decisions in period 2. In equilibrium, the expectations of households regarding the second-period tax policies must be correct and consistent with the choices made by the government. Lack of commitment, therefore, means that the government cannot commit to a tax policy before households make investment decisions. As we will see, this has some strong implications.

Again, we need to solve the model backward starting in period 2. The household's maximization problem can be written as

$$\max_{l} \; (1 - \tau_k)\,k + (1 - \tau_l)\,l + \tfrac{1}{2}\ln(1 - l) \qquad (21.9)$$

Notice that $k$ is predetermined from the perspective of the household in period 2 since it was determined in period 1, i.e., $k$ is not a choice variable in period 2. It just provides a fixed, predetermined level of income.¹ Hence the household only needs to determine its labor supply in period 2. Again, you should convince yourself that the optimal labor supply in period 2 is given by

$$l = \frac{0.5 - \tau_l}{1 - \tau_l} \qquad (21.10)$$
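As a quick numerical check of the closed form in equation (21.10), the sketch below compares it with a brute-force search over a grid of labor choices. The function names (`labor_supply`, `period2_payoff`, `grid_argmax`) and the grid size are illustrative assumptions, not part of the model in the text.

```python
import math


def labor_supply(tau_l):
    """Closed-form labor supply from equation (21.10); zero if the tax is 0.5 or more."""
    return max(0.0, (0.5 - tau_l) / (1.0 - tau_l))


def period2_payoff(l, tau_l):
    """Part of period-2 utility that depends on l.

    The term (1 - tau_k) * k is a constant in period 2 and is left out,
    because it does not affect the maximizing choice of l.
    """
    return (1.0 - tau_l) * l + 0.5 * math.log(1.0 - l)


def grid_argmax(tau_l, n=100_000):
    """Brute-force maximizer of the payoff over l = 0, 1/n, ..., (n - 1)/n."""
    grid = (i / n for i in range(n))
    return max(grid, key=lambda l: period2_payoff(l, tau_l))


if __name__ == "__main__":
    for tau in (0.0, 0.1, 0.3):
        print(tau, labor_supply(tau), grid_argmax(tau))
```

With no labor tax the household supplies half of its time endowment, and the supply falls toward zero as the tax rate approaches 0.5, which is the distortion described in the text.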
Next, consider the problem of the benevolent government that picks $\tau_k$ and $\tau_l$ to maximize second-period utility. Note that $k$ is predetermined at the beginning of period 2. Hence a tax on capital is equivalent to a tax on a fixed factor and is not distortionary in period 2.² However, a tax on labor is distortionary. Since $k$ is predetermined at the beginning of period 2, it provides a nonelastic tax base. The optimal choice of the government is to set $\tau_k$ as high as necessary or possible:

$$\tau_k^n = \min\left\{\frac{G}{k^n},\; 1\right\} > \tau_k^c \qquad (21.11)$$

The first case corresponds to the equilibrium where $k^n$ is sufficiently high so that a tax on capital can fully pay for all government expenditures $G$. The second case arises when $k^n$ is not large enough and the government also needs to tax labor.

Solving the model backward, households anticipate this tax policy in period 1. Hence there are two possible equilibria. In the first equilibrium, $\tau_k^n = 1$. In that case, the optimal choice for households in period 1 is clearly to set $k^n = 0$. Hence households do not save at all in this equilibrium. The second equilibrium arises if $\tau_k^n = G/k^n < 1$. Note that this tax rate is higher than the tax rate under commitment. As a consequence, it leads to higher distortions of the household decisions and lower overall welfare. In both equilibria, households underinvest in capital relative to the equilibrium with commitment. Hence, the lack of commitment reduces overall welfare in the economy. This model is an example of a holdup problem that arises when the government cannot commit itself to the ex-ante optimal tax on capital.

The key insight of this analysis is that even a benevolent government that maximizes household welfare will be forced to adopt inefficient policies if it cannot credibly solve the commitment problem that it faces. The lack of trust in the government can lead to large inefficiencies. Of course, trust has to be earned! We would expect that problems are even worse when the government has objectives that differ from the interests of the households, i.e., we do not have a benevolent government that maximizes households' utility. Instead, the government may pursue more limited, selfish objectives. More generally, Acemoglu and Robinson (2012) argue that nondemocratic institutions tend to serve an entrenched elite and in consequence suffer from a similar holdup problem: the government cannot commit to be honest and guarantee the private property of its citizens. Hence households and firms often fail to make productive investments, with lower growth and economic welfare as a consequence. If the government is run by crooks, you need to plan accordingly and protect yourself against expropriation.

¹ Aside: This is true for the model under commitment as well. But since the government does not make a decision in period 2 in that model, it is easier to directly solve for period 1 and period 2 decisions. But we could have solved the commitment model sequentially and obtained the same results as above.

² Of course, the capital tax is still distortionary from the perspective of period 1 as we will see below.
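To make the welfare comparison concrete, the following sketch computes both regimes for one illustrative level of spending. It rests on several assumptions that are not in the text: spending is set to G = 0.06, the lower of the two budget-balancing tax rates is selected in each regime, and without commitment the capital tax alone covers G, so the labor tax is zero. All function names are hypothetical.

```python
import math


def supply(tau):
    """Interior factor supply from equation (21.5); zero once tau reaches 0.5."""
    return (0.5 - tau) / (1.0 - tau) if tau < 0.5 else 0.0


def lowest_tax(revenue):
    """Smallest tau solving tau * supply(tau) = revenue.

    Rearranging gives the quadratic tau^2 - (0.5 + revenue) * tau + revenue = 0.
    """
    b = 0.5 + revenue
    disc = b * b - 4.0 * revenue
    if disc < 0:
        raise ValueError("no single tax rate raises this much revenue")
    return (b - math.sqrt(disc)) / 2.0


def utility(tau_k, tau_l, k, l):
    """Household objective from equation (21.3)."""
    return (0.5 * math.log(1.0 - k) + (1.0 - tau_k) * k
            + (1.0 - tau_l) * l + 0.5 * math.log(1.0 - l))


def commitment(G):
    """Equal tax rates on capital and labor, each base raising G / 2."""
    tau = lowest_tax(G / 2.0)
    x = supply(tau)
    return {"tau_k": tau, "tau_l": tau, "k": x, "l": x,
            "U": utility(tau, tau, x, x)}


def no_commitment(G):
    """Capital tax alone pays for G; households anticipate it when saving."""
    tau_k = lowest_tax(G)
    k = supply(tau_k)
    l = supply(0.0)
    return {"tau_k": tau_k, "tau_l": 0.0, "k": k, "l": l,
            "U": utility(tau_k, 0.0, k, l)}


if __name__ == "__main__":
    G = 0.06  # illustrative value, chosen so that an interior solution exists
    print("commitment:   ", commitment(G))
    print("no commitment:", no_commitment(G))
```

Under these assumptions the capital tax is roughly 0.064 with commitment and roughly 0.144 without it, capital accumulation is lower in the second case, and household utility is lower as well, in line with the argument above.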
21.4 The Consequences of Weak Institutions

21.4.1 Local Corruption

In most developing economies, corruption of local politicians and bureaucrats appears to be an important challenge for the efficient provision of public goods and services in cities. Of course, we need to be careful while examining corruption. In a system with overall weak institutions, a certain amount of corruption is unavoidable and may even be beneficial. For example, corruption is often a way to generate income for local public officials who are grossly underpaid. In that sense, income from corruption may just be a convenient way of paying local public officials when the local government lacks the fiscal capacity to raise taxes or other revenues. Corruption is not necessarily the cause of poverty; it may just be a consequence of a poor legal and criminal justice system or the inability to finance the government through taxes.

Nevertheless, local corruption can be a serious problem. The typical solution to local corruption is a stronger central government. Federal oversight can work as long as the federal government is less corrupt than the state or local governments. We studied a fairly successful anticorruption program in Brazil in chapter 10. However, note that this program was partially successful because there are competitive elections in Brazil, which allow voters to eliminate corrupt and inefficient local incumbents. In many developing countries, elections are either noncompetitive or rigged by the political elite. Anticorruption campaigns have a better chance of success if politicians face proper reelection or reappointment incentives, which requires fairly advanced political institutions.

Providing information can help as well. Reinikka and Svensson (2004) studied school aid in Uganda using an experimental design. In the treatment group, the central government informed villagers and local newspapers of aid sent to local officials for education.
In the control group, local residents received no information. Villages with information had eighty-five cents of every dollar of aid spent on school supplies. Villages without information received only twenty cents of each
dollar of aid. Middle-tier bureaucrats stole the other eighty cents. We conclude that lack of policy transparency can encourage corruption.

Unfortunately, the central government may not have strong incentives to combat local corruption. Rent seeking and urban bias by the ruling elites result in centralization. We observe that capital markets and licensing for exports, imports, plant production, and material allocations are often centralized in the nation's capital. This system forces a centralized location of production. This is convenient for a rent-seeking elite. It is a lot easier to generate rents when economic activity is highly concentrated in one city. The rest of the country may be of little concern to those in power as long as these areas do not threaten the status quo.

To illustrate this point, it is useful to ask why there are so many poor countries that are rich in natural resources. To put it bluntly, is there a natural resource curse? The empirical evidence is thought provoking. Lane and Tornell (1996) point out that Saudi Arabia's real per capita GDP actually declined between 1970 and 1999. Gylfason (2001) notes that per capita GDP in OPEC countries fell 1.3 percent per year from 1965 to 1998, while all lower- and middle-income countries were growing at an average rate of 2.2 percent. Van der Ploeg (2011) reports that income became highly concentrated during the oil price run-up in Nigeria. In 1970 the richest 2 percent earned as much as the poorest 17 percent; by 2000 the share of income controlled by the richest 2 percent of the population equaled that of the poorest 55 percent. The number of Nigerians who subsisted on $1 per day or less rose from 26 percent to 70 percent over the same period. The problem is that the existence of a natural resource gives rise to rent-seeking behavior, as we discussed in chapter 10.
When it is easier for the ruling elite to exploit the natural resources than to invest in economic growth, it is not surprising that we observe economic stagnation. By contrast, institutions that have been relatively effective in discouraging rent-seeking activity can explain the more favorable outcomes in resource-rich countries such as Norway, Chile, Malaysia, and Botswana. The experiences of these countries suggest that the resource curse phenomenon is neither universal nor inevitable. Whether resource abundance is a curse or blessing appears to hinge on the host country's institutions and on the particular resource involved.

As we have seen in previous chapters, decentralization of political decision making and fiscal competition among state and local governments is one of the most effective tools in combating corruption and rent seeking. If one state or local government is corrupt, there are large benefits of moving on and dealing with less corrupt officials in a different state or city. Of course, if corruption exists at the federal or centralized level of government, this mechanism is less effective and often does not work at all. As long as there is a division of power at the federal government level, there is hope that an independent criminal justice system led by a strong judiciary may help prevent excesses of corruption and rent seeking. However, political institutions are not sufficiently strong in many developing countries to provide an effective check against corruption. In that case, it may still be desirable to have a strong central government. The argument is similar to the one we used to study the impact of gangs on crime (chapter 19). As a "gang leader," a strong centralized government has the incentive to act as a monopolist to keep bribes high and restrict access. As in the case of organized crime, this may in fact lead to less corruption than a weak central government that cannot control its bureaucracy and
local politicians. We are not talking about first- or second-best outcomes here, but again outcomes need to be evaluated within the context of feasible institutional design.

It is not surprising that we periodically observe strong and prolonged "anticorruption" campaigns in many developing countries. This is largely due to the fact that (a) corruption is prevalent; (b) corruption is deeply unpopular among the population; and (c) the political elites have strong incentives to portray themselves as being honest, even if they are highly corrupt. Almost all political strongmen across the globe have promised to clean up the "swamp" and eliminate political corruption. Very few have seriously tried to accomplish that promise once they rise to power. Even fewer have ever succeeded. Anticorruption campaigns are often thinly disguised attempts by the elites to consolidate power, to eliminate political opponents, and to gain financial advantages.
21.4.2 Lack of Local Fiscal Capacity

Another consequence of poor economic and political institutions is a lack of local fiscal capacity, which can be defined as the ability of a city to extract revenues and provide local public goods and services. Fiscal capacity is, therefore, the capacity or the power to tax, as taxes are the main but not the only source of public revenues. The central difficulty in financing government services is tax administration. Recall from US history that a tax revolt against the Stamp and Tea Acts provided the impetus for the war of independence against Britain. Moreover, the infamous Whiskey Rebellion against the liquor tax in 1794 created the first major internal crisis of the young nation that required the use of military force against the rebels by the federal government.

Without formal active markets with accurate record keeping, one cannot administer an income tax, wage tax, profits tax, or property tax. The best one can hope for is to impose taxes where record keeping is most widespread. This is in formal consumption markets. Taxation of land or property should be feasible, although it is even less popular in developing countries. Land and property ownership is often more concentrated in the hands of the political elites, which makes this form of taxation inconvenient for the ruling class. Almost all people prefer to shift the tax burden to others.

The most practical tax strategy in most developing economies is to tax the formal market sector. Sales and consumption taxes are, therefore, the norm. The only administratively feasible tax is the sales tax since a value-added tax is too complicated. The sales tax is administered at the point of sale. It requires that retail outlets keep good records. Unfortunately, having cities administer and collect sales taxes is not only difficult but also very inefficient. Taxed activities exit the city for an alternative location or disappear into the informal sector.
Other than licenses and fees, which are an invitation to local corruption, significant local taxation is often not feasible. Central government taxation is, therefore, the primary means to raise revenues for state and local governments. Hence a disproportionate share of all taxes and other revenues is collected by the national government in developing countries. Because of local expertise, many services continue to be provided at the city level. This then raises the question of how we can finance city governments under these circumstances.
Cities in Developing Countries
FIGURE 21.2. Local markets. (Nicole Law / pexels.com)
In principle, the answer is intergovernmental aid, i.e., payments by the central government to state and local governments for the provision of city services, such as education, health care, police and fire, sanitation, water, and roads. The same theory of intergovernmental transfers applies to developed and developing countries. In practice, we often observe that policies favor megacities that also serve as capitals. A well-functioning decentralized system of government requires significant fiscal discipline by the central or federal government. This discipline is often lacking in developing countries. As a consequence, we observe inefficient fiscal transfers and selective fiscal bailouts. Smaller cities cannot raise the fiscal resources and provide the services needed to compete with the main cities for firms and households. Thus the allocation of households across space is inefficient. In most developing countries, there are too many households living in a small number of megacities. Emerging democracies (South Africa is one example) will often use block and matching grants. Incentive-based block grants provide transfers for the provision of key city services. Lump-sum block grants are a means to encourage institutional (i.e., democratic) development. Much like one might do with young children, a central government may give the city residents in this young democracy an "allowance" so that they can begin to learn to budget on their own with democratic oversight.
21.4.3 Lack of Physical Capital Many developing countries lack significant amounts of capital and do not have the savings potential to quickly accumulate capital. This scarcity of capital imposes important restrictions on urban development. It is impossible to invest enough in public infrastructure to support widespread urban agglomeration. As a consequence, a developing country cannot develop a system of interconnected cities that are joined by highways or rail transit. It also struggles with building large communication networks. According to Henderson (2002), an efficient allocation of economic infrastructure requires concentration in just one or two large cities. With accompanying economic growth, the country will develop some appropriate institutions and a pool of skilled technocrats. Eventually, the country will be able to invest outside the capital, allowing other major urban centers to develop, as well as smaller and medium-sized cities.
21.4.4 Rural to Urban Migration Another consequence of poor institutions is that developing countries tend to have a large share of their population living in rural areas under subsistence agriculture. This is a self-sufficiency farming system in which the farmers focus on growing enough food to feed themselves and their families. The output is mostly for local requirements with little or no surplus for trade. This type of agriculture is fairly uncommon in developed countries since agriculture tends to be much more industrialized and land ownership highly concentrated. As a consequence, there exists a large and often widening gap between the poor, predominantly agricultural hinterlands and the more prosperous cities in many developing countries. When rural poverty is widespread, the economic incentives encourage at least some members of a poor rural family to migrate to a more affluent city in search of economic opportunity. This type of migration is necessary to efficiently reallocate labor across space and to provide manufacturing firms with a deep labor pool. However, uncontrolled rural to urban migration is often not desirable since there are just too many poor households that live in rural areas. Moreover, cities do not have the infrastructure to deal with a large number of poor rural migrants. In the absence of migration controls, ghettos and shantytowns with poor services, health care, and overall quality of life are unavoidable. Everyone suffers from an excess of rural migration. Congestion costs include traffic accidents, exposure to high levels of air and water pollution, and time lost to long commutes. One possible solution to this problem is to create special economic development zones, where the benefits of agglomeration can be realized but the adverse effects of poverty minimized through limited access. The Philippines, Korea, and China have all used these strategies with success. 
Public services are often financed by central government taxation on profits or by user fees. Some services can also be provided by private contractors with renewable service contracts.[3] Establishing economic development zones also encourages competition between them. Firms want to be located in an efficient zone that offers low-cost, quality public services and efficient agglomeration economies. Thus we approach the virtues of fiscal competition that can lead to efficient cities, but on a manageable scale.
[3] "Contracting out" is an effective strategy for the provision of public services if (1) the service quality is easy to monitor (the trash did get picked up; the water is clean and always available; the roads are passable); and (2) the contract is competitively bid as a limited-time (say, 5 years) fee-for-service contract.
21.4.5 A Case Study: The Hukou System While mobility is considered to be desirable in many developed countries, it can be more problematic when there are large numbers of poor households living in rural areas. Consider, for example, the case of China. Hundreds of millions of poor households have migrated from rural areas to coastal cities in search of economic opportunities. These workers are called migrant workers, and their mobility is restricted because of the hukou. This is a somewhat unique residency system that limits access to a variety of local public services and housing markets based on the birthplace of the person. The hukou was initially established in 1954 to limit the mobility of China's large rural population. Since China introduced its own form of capitalism in the late 1970s, China's economy has grown at an astonishingly high rate. Economic growth has created not just prosperity for many urban residents but also economic opportunities for rural migrants. Hundreds of millions of workers moved during the past several decades to coastal cities to become low-wage laborers in new manufacturing businesses. The new job opportunities promised an escape from poverty in China's countryside. The hukou system implies that rural migrants do not enjoy the same privileges as urban residents. As a result, it helps stem and, maybe more importantly, manage the flow from poor rural communities to more prosperous urban areas. According to the China Labor Bulletin, the migrant worker population is currently approximately 277 million. China is facing the difficult task of providing social services to this demographic group. Moreover, the hukou system is often considered to be unfair and discriminatory. In response to these challenges, the Chinese government announced ambitious goals to remedy this situation. The objective of the new policy was to give urban residency to 100 million migrant workers by 2020. Achieving this goal will not be easy.
Local governments in China's largest cities, such as Beijing, Shanghai, Guangzhou, and Shenzhen, still make it rather difficult for migrant workers to obtain equal rights. These cities use an allocation system based on an applicant's education level, tax payments, and work experience. These cities may also not be good targets for new migration flows given their relatively large sizes and high housing prices. Second- and third-tier Chinese cities are typically less developed, have smaller populations, and are more affordable than first-tier cities. These cities introduced comparatively weaker regulations that are in line with the central government's goal of channeling migrants to lower-tier cities.
21.5 Natural Hazards Cities are often located in areas that are subject to natural hazards. Common hazards include flooding, droughts, earthquakes, landslides, volcanoes, and cyclones. According to World Urbanization Prospects (United Nations, 2012), the five most populated cities in 2011 located in areas with exposure to at least one major natural hazard are Tokyo, Delhi, Mexico City, New York, and Shanghai. Tokyo is located in a region with high risk of floods and cyclones; Delhi is potentially affected by high risk of floods and medium risk of droughts; Mexico City has a high risk of floods, medium risk of landslides, and low risk of droughts; New York is at high risk of floods and medium risk of cyclones; and Shanghai is at high risk of floods. In general, the vast majority of cities are exposed to at least one hazard. We saw in the previous chapter that climate change threatens to increase the likelihood and the severity of some of these natural hazards, especially cyclones, floods, and droughts. Cities in developing countries are very vulnerable to natural disasters for at least four reasons. First, they often lack the resources to prepare for a potential disaster. Second, they do not have the luxury to invest in preventive mechanisms that may limit the damage. Third, they do not have the infrastructure and capacities to deal with a disaster once it occurs. Finally, urban population density tends to be much higher in developing economies than in developed economies. According to Behrman and Kohler (2014), Dhaka (Bangladesh) has a population density of 44,000 people per square kilometer. Hong Kong has a density of 26,000 people per square kilometer. The high population density of megacities in developing countries is likely to cause severe problems in case of a natural disaster.
21.6 Conclusions When we study cities in developing countries, we need to be aware of the fact that these countries often have weak political and economic institutions, which are the result of bad policies pursued by the ruling elites. Of course, this does not mean that the elites do not benefit from these policies. It also does not mean that the ruling elites have strong incentives to change course. These political constraints need to be taken into consideration when trying to manage cities in developing economies. Ignoring these constraints is likely to yield foolish policy recommendations. A key challenge, then, is to transform a country with weak institutions that are largely a function of bad economic and social policies. Despite these problems, we can draw some important insights from traditional urban economics. The key role for cities, as always, is to facilitate efficient production through agglomeration economies and proximity. The evidence is as clear for developing economies as it is for developed economies. The engines of growth and innovation will be in cities. As for developed economies, cities need to provide supporting infrastructure (transit, ports, communication) and public services (health care, water, sanitation, and property protection) for economic production. We have seen that these services need to be primarily financed by local sales taxes, user fees, and transfers from the central government. There is one very important complication to the financing and management of successful cities in developing economies: as the city becomes successful, because of agglomeration economies and accumulation of capital, private sector wages rise, and poor rural populations migrate into the city. Given that there are simply not enough resources to address all the ills of poverty, and the only hope for doing so is economic growth, the question becomes: how can we protect the economic benefits of urban proximity in the midst of widespread poverty?
We have seen that it can be essential to impose restrictions on mobility. As the country becomes more affluent, it can afford better welfare services, and it can relax some of the mobility restrictions.
FIGURE 21.3. Poverty. (fancycrave.com/ pexels.com)
Given the potentially destructive forces of congestion, overpopulation, pollution, and other negative externalities, it is even more important that prices, taxes, and subsidies provide the correct incentives for households and firms. Most cities in developing countries do not rely on proper price incentives to deal with these problems, which often explains the observed policy failures. If the local government does not want to rely on a price mechanism, it needs to impose strict quantity controls and migration restrictions. But again, cities in developing countries also struggle with enforcing quantity controls and quotas. As the country develops, it becomes essential to spread welfare and create a system of cities that are linked by a properly functioning transportation network. These policies then lift the pressure off the main megacities and allow for a more decentralized organization of the economy. In developing economies, poverty is by definition much more pervasive than in developed countries. Transfer programs that would be sufficient to make any difference in the economic lives of poor households would require excessively high tax rates on high-productivity and wealthy households, with adverse effects on economic incentives and economic development. Poverty programs for developing countries are long run, lasting at least a generation, and need to focus on human and physical capital accumulation and investment. But there is hope. The cycle of poverty and violence is not inevitable. Policies can change. China is a great example that a large country can change policies and embrace an economic strategy that builds around coastal cities open to the world economy, while at the same time raising the living standards of many poor rural households.
21.7 Technical Appendix: Solving the Optimal Taxation Problem Consider the case under commitment. Convince yourself that the optimal savings and labor supply functions are given by

$$k = \frac{0.5 - \tau_k}{1 - \tau_k}, \qquad l = \frac{0.5 - \tau_l}{1 - \tau_l}$$
Substituting these functions into the household utility yields
We can then write the government's problem under commitment as

$$\max_{\tau_l,\, \tau_k} \; V(\tau_l, \tau_k)$$

$$\text{s.t.} \quad G = \tau_k \, \frac{0.5 - \tau_k}{1 - \tau_k} + \tau_l \, \frac{0.5 - \tau_l}{1 - \tau_l}$$
Note that we have also substituted the optimal savings and labor supply functions into the government's budget constraint. This looks like a complicated optimization problem, and, to some degree, it is. But notice that the optimization problem is completely symmetric with respect to $\tau_l$ and $\tau_k$. In general, that does not have to be the case, but we have constructed this example so that it is. The symmetry of the problem then implies that
For arbitrary functional form assumptions, the model would not necessarily be symmetric. In that case, you would have to solve the problem above using standard techniques from constrained optimization. Next, consider the case under noncommitment. Substituting the optimal labor supply into the second-period utility function, we obtain
and the government budget constraint is now given by
398
Cities in Developing Countries The timing of decisions implies that the government will treat k as constant when making decisions in the second period. As a consequence, the tax on capital is a lump-sum tax that does not distort labor supply decisions. Note that the nondistortionary nature follows from the assumption that we have quasi-linear preferences and hence there is no income effect. That assumption is important! Therefore, Tf: = 1 if k < G or Tf: = GI k if k > G. In the second case, Tf1 = 0. Again, you can formally derive this result by solving the constrained optimization problem and verifying that the optimal solution involves a corner solution for TkFor a detailed general derivation of the equilibrium under general specifications, see the discussion in Persson and Tabellini (2000), chapter 11.
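To make the two cases of the appendix concrete, the following is a minimal numerical sketch, not the book's own solution method. It takes the commitment budget constraint in the form $G = \tau_k (0.5 - \tau_k)/(1 - \tau_k) + \tau_l (0.5 - \tau_l)/(1 - \tau_l)$, imposes the symmetric solution $\tau_l = \tau_k = \tau$, and assumes that the government picks the lower of the two rates that raise the same revenue. The function names and the value of G are hypothetical choices made for illustration.

```python
import math


def revenue(tau_k, tau_l):
    """Tax revenue from the budget constraint, with k and l at their optimal values."""
    def base(tau):
        # Tax base (0.5 - tau) / (1 - tau), the assumed form of savings and labor supply.
        return (0.5 - tau) / (1.0 - tau)
    return tau_k * base(tau_k) + tau_l * base(tau_l)


def symmetric_commitment_tax(G):
    """Common tax rate tau that raises G when tau_l = tau_k = tau.

    Setting G = 2 * tau * (0.5 - tau) / (1 - tau) gives the quadratic
    2 * tau**2 - (1 + G) * tau + G = 0. The smaller root is returned
    (assumption: the lower of two revenue-equivalent rates is preferred).
    """
    disc = (1.0 + G) ** 2 - 8.0 * G
    if disc < 0:
        raise ValueError("G is larger than the maximum revenue a symmetric tax can raise")
    return ((1.0 + G) - math.sqrt(disc)) / 4.0


def second_period_capital_tax(k, G):
    """Corner solution without commitment: tax all capital if k < G, else G / k."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 1.0 if k < G else G / k


if __name__ == "__main__":
    G = 0.1  # hypothetical revenue requirement
    tau = symmetric_commitment_tax(G)
    print(f"commitment: tau_l = tau_k = {tau:.4f}, revenue = {revenue(tau, tau):.4f}")
    print(f"no commitment, k = 0.4: tau_k = {second_period_capital_tax(0.4, G):.2f}, tau_l = 0")
```

The comparison illustrates the time-consistency problem: under commitment both bases are taxed at a moderate rate, while in the second period the government shifts the entire burden onto the capital that has already been accumulated.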
21.8 Debate: Transportation Infrastructure Investments in Jakarta Consider the city of Jakarta, which is one of the largest megacities in the world. The metropolitan area is home to more than thirty million people. Another seven million are expected to migrate to the city over the next fifteen years. Jakarta has been deemed one of the world's most congested cities for the past decade. Drivers stop on average ninety-one times per day. Recently, the city completed the first phase of the Jakarta Mass Rapid Transit (MRT), a rail-based transit system that connects communities to the central business district. The pro side should argue that more investments in rail-based public transportation systems are needed. The con side should argue in favor of more investments in roads and highways. To structure the debate, both teams are asked to answer the following questions:
1. How does the public transportation infrastructure of Jakarta compare to those of similar cities?
2. How has the city invested its funds recently?
3. How well does the TransJakarta Bus Rapid Transit system work?
4. Is there evidence of corruption or mismanagement?
5. What types of households are more likely to rely on public transportation?
6. What types of consumers are more likely to rely on private transportation (cars, motorcycles, bikes, etc.)?
7. Are toll roads or higher gasoline taxes feasible alternatives to reduce congestion?
8. What are other policies that could act as complements to a better transit policy?
21.9 Problem Sets
1. What are the main differences between cities in developing and developed economies?
2. Why are political and economic institutions important factors that determine economic prosperity? Illustrate your arguments using a specific country.
3. Is corruption the cause or the consequence of poverty? Discuss.
4. How do you measure the fiscal capacity of a city?
5. Consider a large city in a developing country and determine what fraction of revenues are own revenues, i.e., under the direct control of the city.
6. Explain why many cities in developing economies appear to be "too large."
7. Explain the problems caused by rural to urban migration in developing economies. Discuss some policy options that have been used to restrict mobility.
8. Explain how development aid can help to improve cities in developing countries.
9. Discuss how the recent creation of a network of high-speed trains in China affected lower-tier cities and economic development.
PART VII URBAN LAND, HOUSING, AND LABOR MARKETS
22
The Internal Structure of Cities
22.1 Motivation Land use patterns and the spatial sorting of households and firms in cities play a large role in urban economics. It is fairly obvious that these sorting patterns are closely related to transportation networks and the prices for land and housing in different locations. If transportation and commuting are costly, we would expect to see dense residential and firm locational patterns and fairly high land prices near the employment centers. If it is fairly easy to commute longer distances within a city (for example, because of the existence of a sophisticated transportation network), then we would expect firms and households to spread out in space as long as there is enough land that can be used relatively cheaply. The spatial distribution of firms and households is thus driven by the magnitude of agglomeration externalities, the availability of land, the need for workers to commute to their places of work, and the costs of commuting, in terms of both money and time. Transportation networks are largely determined by investments in highways, roads, rails, bridges, and tunnels. The costs for larger investments in infrastructure are typically shared by municipalities, states, and the federal government. Small and local investments are typically solely financed by cities and counties. Cities spend a large fraction of their operating expenditures on transportation. Figure 22.1 shows that transportation expenditures range between $400 and $1,400 per capita in large US cities. The expenditure share ranges between 5 and 17 percent. Recall that most cities spend around 5-15 percent on public safety. Hence the expenditure share for transportation is of similar magnitude. In addition to these operating expenditures, a large share of capital expenditures is also devoted to infrastructure investments. This chapter focuses on the internal structure of cities, which broadly looks at the arrangement of land use in urban areas.
The objective is to explain where different types of households and businesses tend to locate within metropolitan areas. We have seen in previous chapters that some sorting of households is clearly driven by the availability of public goods and services, such as education and public safety. We abstract from these factors in this chapter and focus instead on the trade-off between commuting costs and land prices. So an implicit assumption of the model considered in this chapter is that the distribution of the quality of public goods and services is fairly homogeneous within the city. That is undoubtedly a very strong assumption that we need to relax to obtain more realistic models. We consider a monocentric city model, which assumes that all households work in the city center and, therefore, need to commute to the city center from more remote residential areas in the city. This model is one of the most famous models in urban economics. It goes back to research done by William Alonso (1964), Richard Muth (1969), and Edwin Mills (1967). Most forms of neoclassical economics do not account for spatial relationships between individuals and organizations. To understand cities, it is useful to explicitly model space. This also helps us understand topics such as the impact of transportation technologies on land prices and commuting patterns. We then study whether the basic predictions of this model are valid. This leads us to review a fairly large empirical literature that studies the relationship between land prices and distances to employment centers in different metropolitan areas. While the monocentric city model provides some useful insights into the trade-off between land prices and commuting costs, it is too simple to provide a full characterization of city structures. We therefore briefly discuss more recent approaches to modeling the structure of a city. Modern models can account for the existence of multiple employment centers within a city as well as mixed residential-commercial land use patterns. We conclude our discussion with a detailed case study of the NYC subway crisis, which illustrates that public investments in transportation networks are often inefficient and inadequate. Underinvestment in transportation networks leads to serious distortions in land use patterns within metropolitan areas.
FIGURE 22.1. Transportation expenditures in large US cities: mean annual transportation spending, 2005-2014. Spending in dollars per capita (A) and as a share of general expenditures (B). (Fiscally Standardized Cities database / Lincoln Institute of Land Policy)
22.2 Traffic Congestion in Large US Cities Most large cities in the world are highly congested. Commuters lose significant amounts of time being stuck in traffic or overcrowded public transportation systems. According to a recent study by INRIX (2019), the average US driver faced a total driving cost of $10,288 in 2017, made up of direct costs (maintenance, fuel, insurance, and parking and toll fees) and indirect costs (wasted time and carbon, parking fines and overpayments). Interestingly, traffic- and parking-related costs made up 45 percent of the total cost of vehicle ownership. In large US cities these costs can be much higher. For example, average driving costs in New York City (Los Angeles) were $18,926 ($14,824) according to this study. Table 22.1 reports some descriptive statistics of traffic patterns in large US cities. It shows that most Americans commute by car to work. New York City is the only large city in the US in which less than 50 percent of all workers commute by car. Washington, DC, is second with 56.9 percent of commuters primarily relying on cars. Not surprisingly, New York and Washington have extensive rail and bus systems, and traffic patterns are highly congested. The average commute time by car in most large US cities ranges between 28 and 35 minutes. These numbers can be rather misleading since they do not reflect the fact that a large number of workers have to commute during rush hour.

TABLE 22.1. Traffic Congestion in Large US Cities

City             Hours Stuck in Traffic      Average Driving Commute     Percentage of Commuters
                 Congestion (per year)       Time (in minutes)           Who Drive
Boston           164                         29.7                        75.9
Washington       155                         34.3                        56.9
Chicago          138                         30.9                        79.3
New York         133                         34.7                        39.7
Los Angeles      128                         28.7                        84.2
San Francisco    116                         30.3                        70.7
Philadelphia     112                         28.8                        81.2

INRIX (2019) also calculates time lost in congestion by comparing peak, off-peak, and free-flow traffic patterns. Peak traffic corresponds to the worst portion of the morning and afternoon commute. Off-peak traffic is the low point between the peak periods. According to these estimates, Boston is the most congested city in the US. On average, peak traffic in the morning and afternoon adds another 164 hours per year to the commute. Assuming that a worker commutes 240 days a year and makes two trips per day, peak traffic adds approximately 20 minutes to each one-way commute in Boston. Hence the average peak commuting time is 50 minutes instead of 30 minutes. New York City is the slowest US city, with "last mile" speeds of 9 mph, meaning it is faster to bike than to drive or take the bus. The most congested road in the US is the Cross Bronx Expressway, from the Bruckner Expressway to the Trans-Manhattan Expressway, with an average daily delay of 29 minutes. The term "expressway" is clearly meant as a euphemism. Compared to the Cross Bronx Expressway, the stretch of I-76 that runs through Philadelphia is a true speedway with an average daily delay of only 13 minutes. Of course, it could be worse. You could live in Bogota (London), where you lose 272 (227) hours each year in peak traffic. Public transportation systems only offer limited relief in most large US cities. These systems often rely on old and outdated technologies that frequently break down. We discuss the poor state of the subway system in NYC in a case study below. Moreover, there is no reliable light rail system in many US metropolitan areas. The US also lacks a high-speed train system that connects major employment centers along the coasts.
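The Boston calculation above can be repeated for every city in table 22.1. The short sketch below converts annual hours lost in congestion into added minutes per one-way trip. The 240 commuting days come from the text; the two trips per day and the function name are assumptions made for this illustration.

```python
def added_minutes_per_trip(hours_lost_per_year, commute_days=240, trips_per_day=2):
    """Convert annual hours lost in congestion into extra minutes per one-way trip."""
    return hours_lost_per_year * 60.0 / (commute_days * trips_per_day)


# Hours stuck in congestion per year and average driving commute (minutes), from table 22.1.
cities = {
    "Boston": (164, 29.7),
    "Washington": (155, 34.3),
    "Chicago": (138, 30.9),
    "New York": (133, 34.7),
    "Los Angeles": (128, 28.7),
    "San Francisco": (116, 30.3),
    "Philadelphia": (112, 28.8),
}

if __name__ == "__main__":
    for city, (hours_lost, avg_commute) in cities.items():
        extra = added_minutes_per_trip(hours_lost)
        print(f"{city:14s} +{extra:4.1f} min per trip, peak commute about {avg_commute + extra:4.1f} min")
```

For Boston this gives 164 x 60 / 480 = 20.5 extra minutes per trip, which is the source of the "approximately 20 minutes" quoted in the text.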
22.3 Modeling Internal City Structure To model the spatial distribution of workers and the importance of transportation technologies, we consider a version of the Alonso-Muth-Mills model as discussed in more detail in Glaeser (2008). Consider a city in a featureless plain. The optimal shape of a city is, then, a circle since it minimizes total commuting times as we will see below. The city has a fixed population level, denoted by N. Let us assume for simplicity that all workers must commute to the central business district (CBD) to earn a wage W. The CBD is located at the center of the circle, as illustrated in figure 22.2. The area of the CBD is negligible and equal to zero.
FIGURE 22.2. The spatial structure of a city.
Preferences are defined over consumption C and land L and can be represented by a standard utility function given by U(C, L). For simplicity, we assume that each worker consumes a fixed amount of land. As a consequence, the demand for land is constant and does not depend on the price of land. Table 22.2 summarizes the notation of the key variables of the model.
TABLE 22.2. Chapter Notation

Variable      Definition
C             Consumption
L             Land
N             Population
d             Distance
$\bar{d}$     Maximum distance
$\bar{r}$     Rent outside the city
W             Wage
t(d)          Travel cost function
r(d)          Rental price function
U()           Utility function
F             Fixed costs of commuting
t             Per unit cost of commuting
Workers can locate anywhere in the city but must commute to the CBD to earn a wage. A location is, therefore, defined by a point in the plane. To simplify the analysis, we assume that all points that are located the same distance from the city center (d) are identical. This is a reasonable assumption if commuting patterns are radial, i.e., you can basically commute along the ray that connects your location with the city center. You do not have to take any serious detours to get to work! Under these assumptions, distance is the only factor that determines the length or the cost of the commute to the city center. As a consequence, households will be indifferent between locations that have the same distance to the CBD. We can then restrict attention to symmetric equilibria of the model in which choices only depend on the distance to the CBD. Total land demand is given by $NL$. Let $\bar{d}$ denote the maximum distance from the city center. Land supply is given as $\pi \bar{d}^2$. Equilibrium requires that land supply equal land demand:

$$\pi \bar{d}^2 = NL \qquad (22.1)$$

which implies that $\bar{d} = \sqrt{NL/\pi}$. The space required for a city increases with population size and the residential land use per worker. Commuting costs are an increasing function of distance, denoted by $t(d)$, with $t'(d) > 0$. Land rents are also a function of distance, $r(d)$. The budget constraint is given by
$$W = C + t(d) + r(d)L \qquad (22.2)$$

where $r(d)L$ are total expenditures on land. Hence consumption is given by

$$C = W - t(d) - r(d)L \qquad (22.3)$$

Substituting equation (22.3) into the utility function yields the utility associated with living at distance $d$:

$$U(d) = U(W - t(d) - r(d)L,\ L) \qquad (22.4)$$
Workers are identical and thus must be indifferent in equilibrium between all locations. As a consequence, the utility function in equation (22.4) must be constant. This implies that the derivative with respect to d must be 0. Using the chain rule for derivatives, we have
$$\frac{\partial U(W - t(d) - r(d)L,\ L)}{\partial d} = \frac{\partial U}{\partial C}\left(-t'(d) - r'(d)L\right) = 0 \qquad (22.5)$$

Note that the first term is the outer derivative and the second term is the inner derivative. The first term is the marginal utility of consumption, which is always positive, i.e., $\partial U / \partial C > 0$. Hence the second term in the equation above must be equal to 0:

$$-t'(d) - r'(d)L = 0 \qquad (22.6)$$

Solving this equation for the slope of the rental function, we obtain

$$r'(d) = -\frac{t'(d)}{L} \qquad (22.7)$$
FIGURE 22.3. The equilibrium land rent function.

Note that $t'(d) > 0$ since commuting costs are increasing in distance. Hence rents are decreasing in distance, which implies that $r'(d) < 0$. An increase in commuting costs must be offset by a reduction in land rents. Let us assume for simplicity that the transportation technology is linear:
$$t(d) = td \qquad (22.8)$$

Substituting $t'(d) = t$ into equation (22.7) implies that

$$r'(d) = -\frac{t}{L} \qquad (22.9)$$

Solving this differential equation yields

$$r(d) = r(0) - \frac{t}{L}\, d \qquad (22.10)$$
where r(0) is the rent in the CBD. Land rents are, therefore, a linear function of distance and decline from the city center. The resulting equilibrium land rent function is illustrated in figure 22.3.
To close the model, we need to determine the rent in the city center, denoted by r(0). Let us assume that there is an alternative land use at the edge of the city that produces land rents of $\bar{r}$. This alternative land use is determined by activities like agriculture, mining, forestry, or recreation. This rent $\bar{r}$ then determines the land rent at the boundary $\bar{d}$ of the city. Evaluating the land rent function (22.10) at the boundary, we obtain
$\bar{r} = r(0) - \frac{t}{L}\bar{d}$
(22.11)
Using the fact that $\bar{d} = \sqrt{NL/\pi}$, we get
$r(0) = \bar{r} + \frac{t}{L}\sqrt{\frac{NL}{\pi}}$
(22.12)
The rental rate in the city center increases in transportation costs, population, and the opportunity cost of land use on the city fringe. The equilibrium rent function is, therefore, given by
$r(d) = \bar{r} + \frac{t}{L}\left(\sqrt{\frac{NL}{\pi}} - d\right)$
(22.13)
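As a rough numerical check of equations (22.10) to (22.13), the following Python sketch evaluates the linear rent function for a circular city. The parameter values and the function names (city_radius, center_rent, rent) are hypothetical and chosen only for illustration. The sketch assumes that N workers each occupy a lot of size L, so that the city boundary lies at the square root of NL/π, and it writes r_bar for the alternative land rent at the fringe.

```python
import math

def city_radius(N, L):
    """Boundary of a circular city in which N workers each use L units of land."""
    return math.sqrt(N * L / math.pi)

def center_rent(t, L, N, r_bar):
    """Rent in the CBD: fringe rent plus the commuting cost saved per unit of land."""
    return r_bar + (t / L) * city_radius(N, L)

def rent(d, t, L, N, r_bar):
    """Linear equilibrium rent at distance d from the CBD."""
    return center_rent(t, L, N, r_bar) - (t / L) * d

# Hypothetical parameter values, for illustration only.
t, L, N, r_bar, W = 2.0, 0.5, 100_000, 10.0, 1_000.0
d_bar = city_radius(N, L)

for d in (0.0, d_bar / 2, d_bar):
    consumption = W - t * d - rent(d, t, L, N, r_bar) * L
    print(f"d = {d:7.2f}   r(d) = {rent(d, t, L, N, r_bar):7.2f}   C = {consumption:7.2f}")
```

In this sketch the rent at the boundary equals the fringe rent, and consumption net of commuting and land costs is the same at every distance, which is the indifference condition that pins down the rent gradient.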
Figure 22.3 illustrates the determination of the land rent function in equilibrium. At the edge of the city, the land rent must be equal to $\bar{r}$. Moving from the fringe of the city toward the center, land rents must increase by $t'(d)/L$ per unit of distance to offset the reduction in commuting costs. With a linear commuting technology, we obtain a linear land rent function. Substituting the equilibrium land rent function into the utility function, we obtain the following expression for the utility in equilibrium:
$U = U\left(W - t\sqrt{\frac{NL}{\pi}} - \bar{r}L,\; L\right)$
(22.14)
You should convince yourself that utility is increasing in W. Utility is decreasing in population size N, commuting costs t, and the alternative land rent $\bar{r}$.
A useful extension of the model considered by Glaeser (2008) introduces two different transportation technologies. The first technology involves no fixed costs but has a high cost per distance equal to $\bar{t}$. The second technology has a fixed cost of K and a low per unit distance cost of $\underline{t}$ ($< \bar{t}$). We can think about the first technology as walking or taking public transportation and the second technology as commuting by car. These technologies are illustrated in figure 22.4. Workers want to minimize travel costs and, therefore, choose the travel mode with lower costs. Figure 22.4 then implies that workers will use the first technology if $d < \hat{d}$. Similarly, if $d > \hat{d}$ workers will use the second technology. At distance $\hat{d}$ workers are indifferent between the two technologies. Hence we have
$\bar{t}\,\hat{d} = K + \underline{t}\,\hat{d}$
(22.15)
or
$\hat{d} = \frac{K}{\bar{t} - \underline{t}}$
(22.16)
FIGURE 22.4. Transportation technologies.
Workers who live close to their jobs will walk or take public transportation. Workers who live far away will travel by car. The equilibrium condition $r'(d) = -t'(d)/L$ still holds, but the gradient now depends on distance. For short distances households walk and, therefore, we have $r'(d) = -\bar{t}/L$. For longer distances households drive and hence we have $r'(d) = -\underline{t}/L$. This result is illustrated in figure 22.5. The rent gradient is steeper closer to the city center. Individuals shift from the low-fixed-cost, high-variable-cost transportation technology to the high-fixed-cost, low-variable-cost technology as they live farther away from the city center. As a consequence, the land rent function flattens once workers start commuting by car.
In the technical appendix (section 22.10), we relax the assumption of fixed land consumption. Households consume less land in locations with high rents and more land in places with low rents. As a consequence, population density is highest near the city center and decreases with distance to the city center. This pattern is commonly observed in many metropolitan areas.
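The mode choice and the resulting kink in the rent gradient can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names t_high, t_low, and K stand for the per-distance cost of walking or transit, the per-distance cost of driving, and the fixed cost of driving, and all numerical values are hypothetical.

```python
def switch_distance(t_high, t_low, K):
    """Distance at which the two transportation technologies cost the same."""
    return K / (t_high - t_low)

def commuting_cost(d, t_high, t_low, K):
    """Workers pick the cheaper of walking/transit and driving."""
    return min(t_high * d, K + t_low * d)

def rent_slope(d, t_high, t_low, K, L):
    """Slope of the rent function: minus the marginal commuting cost per unit of land."""
    marginal_cost = t_high if d < switch_distance(t_high, t_low, K) else t_low
    return -marginal_cost / L

# Hypothetical parameter values, for illustration only.
t_high, t_low, K, L = 3.0, 1.0, 10.0, 0.5
d_hat = switch_distance(t_high, t_low, K)

for d in (1.0, d_hat, 8.0):
    print(f"d = {d:4.1f}   cost = {commuting_cost(d, t_high, t_low, K):5.1f}   "
          f"slope = {rent_slope(d, t_high, t_low, K, L):5.1f}")
```

Inside the switching distance the slope is larger in absolute value than outside it, which is the flattening of the rent function described in the text.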
FIGURE 22.5. Equilibrium with different transportation technologies.
22.4 The Price of Land in New York City
Does the monocentric city model accurately explain observed land gradients in the US? There is plenty of evidence showing that the land in the city center of many metropolitan areas is more expensive than land farther from the center. For example, Haughwout, Orr, and Bedoll (2008) report that a 3.4-acre parcel of land in Manhattan sold for $345 million in 2000. That is roughly equal to $2,300 per square foot. The parcel was located 1.3 miles from the Empire State Building at the southwest corner of Central Park. It formerly housed an exposition and convention center known as the New York Coliseum. The buyers quickly demolished that complex to make way for construction of the Time Warner Center, a large, 2.8-million-square-foot commercial development that includes two office towers, a hotel, retail stores, and a parking garage. Along with its close proximity to the city center, the location gained value from its attractive retail and commercial development. The Time Warner Center is now one of the most valuable properties in New York.
One problem associated with measuring the price of land is that data on land sales are not abundant. Vacant land is rare in many urban areas, making it difficult to conduct empirical analyses. Haughwout, Orr, and Bedoll were able to study the price of land in the NY metropolitan area based on a sample of 6,186 land sales between 1999 and 2006. A total of 623 transactions, or roughly 10 percent of the sample, were in Manhattan. In addition, 1,639 transactions, or about 25 percent, took place in other parts of New York City. The remaining sales took place in northern and central New Jersey.
Figure 22.6 plots the logarithm of price per square foot of land against the distance from the Empire State Building. This location was chosen as the base for the study. The price per square foot of land ranges from less than $1 to more than $12,647. Don't get fooled by the scale of figure 22.6. The logarithm of the price of land is on the y-axis. Just to give you an idea, if ln(p) = 9, then p = 8,103. Similarly, if ln(p) = 4, then p = 54.6. So the land-distance gradient does not look steep if we use the logarithm of price. It would look much steeper if we used price levels. Plots can sometimes be deceiving if you don't pay attention to the scales!
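The conversion between log prices and price levels mentioned above is easy to reproduce. The short Python sketch below is only an illustration; the helper name price_from_log is hypothetical.

```python
import math

def price_from_log(log_price):
    """Convert a natural-log price back into a price level."""
    return math.exp(log_price)

for log_price in (9, 4):
    print(f"ln(p) = {log_price}  ->  p = {price_from_log(log_price):,.1f}")

# A gap of 5 on the log scale corresponds to a large ratio in levels.
ratio = price_from_log(9) / price_from_log(4)
print(f"price ratio between ln(p) = 9 and ln(p) = 4: {ratio:.1f}")
```

A difference of five units on the log axis thus corresponds to a price ratio of roughly 150 in levels, which is why the gradient looks far flatter on the log scale.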
412
Internal Structure of Cities ....... 0
E
~
ro
:::J O'"
12 10 0
0
OQo
V,
.... QJ
8
C. QJ
u
·;:::
8
6
o
C.
-a C
~
.....0 E
o"'
0
000
4
0
0
I
fl
0
0
o 0
2
.L:
.......
·;:::
ro
0
0
Ol
0
~ :::J .......
ro
z
-2
0
0
-4
0
20
40
60
0
80
100
Distance from Empire State Building (kilometers) FIGURE 22.6 . Land prices in the New York metropolitan area. (Federal Reserve Bank of NY)
We therefore conclude that there is a strong inverse, nonlinear relationship between price and distance from the city center in NYC. The plot also suggests a wide variation in prices at any distance from the city center. While distance is clearly a significant factor affecting land prices, it does not appear to be the sole factor. In the next chapter, we return to this dataset and further study the determination of land prices in NYC. We will see that there are many other land characteristics that determine the price of land, not just the proximity to the city center.
22.5 Modern Models of the Internal Structure of Cities
While the monocentric city model can broadly explain the declining rent gradient that we observe in many cities, it is obviously too simplistic to describe the internal structure of modern cities. Most of the employment in a city is not located in the CBD. Many firms have followed households and moved to the edges of the traditional cities (Garreau, 1991). The main problem of the monocentric city model is that it assumes that all firms are located in the CBD for purely exogenous reasons and that all workers need to commute to the CBD. The competition between firms and households for land anywhere else in the city is not explored in that model. Hence a central feature of the internal structure of cities is resolved by assumption rather than deduced from basic economic principles. To put it differently, we should write down a model in which firms and households are free to locate anywhere in the city. The concentration of firms in a given location should arise endogenously in the model and follow from the locational choices that firms and workers make.
How can we fix this problem and have firms and workers make endogenous locational decisions within a city and explain the observed spatial agglomeration of jobs in central business districts? Needless to say, it is difficult to answer this question. Not surprisingly, it took economic theorists a while to come up with a suitable model. One of the most compelling models that addresses these concerns was developed by Lucas and Rossi-Hansberg (2002). To fully understand this model, we need mathematical tools that are outside the scope of this book. Let me briefly sketch the main arguments.
The basic idea behind Lucas and Rossi-Hansberg is that firms benefit from agglomeration externalities, as discussed in earlier chapters of the book. Furthermore, the model illustrates a key trade-off that firms face: productivity versus land prices. If there are no agglomeration benefits in this model, producers and workers move to locations where land for production and residential use is cheap. In practice, we observe the opposite. The most productive firms are often located where land is very expensive. That's the main reason why Lucas and Rossi-Hansberg incorporate endogenous agglomeration externalities into the model. They are endogenous since they depend on the density of firms in a location. These types of agglomeration externalities then imply that firms prefer to locate in close proximity to one another. As a consequence, land rents tend to be higher in locations that have high firm densities. Note that there is nothing special about any particular part of the city. Agglomeration can occur anywhere in the city. In particular, an equilibrium need not take the form of a central business district surrounded by residential areas. The model can generate equilibria with multiple business districts that spread out in space. Households care about land consumption and distance to commute to work. Hence the model accounts for differences in commuting costs. In their model, local wage rates adjust to make workers indifferent with respect to different commuting patterns. The model thus nicely captures the two key externalities encountered in urban life: agglomeration externalities due to knowledge spillovers and congestion externalities due to commuting. Given the importance of agglomeration externalities, there is no reason to believe that land rents and location-specific wage rates give firms and households the correct incentives for making land use decisions. As a consequence, the allocation of firms and workers across space may not be efficient. Optimal land use restrictions can, therefore, improve the efficiency of allocations in this model.
Another modern attempt to develop a quantitative model of the internal city structure is given by Ahlfeldt, Redding, Sturm, and Wolf (2015). They develop a different model that also captures agglomeration and dispersion forces, allowing for trade within a city. One key advantage of this model is that it can more realistically capture the geography and transportation networks that we observe in real cities. They use data on thousands of city blocks in Berlin for 1936, 1986, and 2006 and exogenous variation from the city's division after World War II and the reunification of Germany in 1990. The paper finds substantial and highly localized production and residential externalities. The estimated model can both qualitatively and quantitatively account for the changes in city structure that we observe in Berlin during the time period. Finally, it shows how the quantitative framework can be used to undertake counterfactuals for changes in the organization of economic activity in Berlin in response to changes in the transport network.1
In summary, modern models of internal city structure capture both agglomeration and dispersion forces that are crucial to understanding city economies. Much progress has been made in capturing these forces in compelling general equilibrium models. These models can actually explain the observed data patterns and can thus be used for counterfactual policy analysis.
22.6 Transportation Networks and City Structure
One upshot of this literature is that the structure of transportation networks affects the city structure and the spatial allocation of workers within cities. Between 1950 and 1990, the US made massive investments in infrastructure, especially roads and highways to accommodate the increased use of the automobile. Most of these investments were financed by the federal government. As the US grew in size and prosperity, the population of metropolitan areas increased by 72 percent during that time period. During this boom, the aggregate population of central cities declined by 17 percent. These facts suggest that these massive investments in infrastructure primarily favored suburbs and smaller municipalities at the fringe of metropolitan areas. These transportation policies may have also caused or accelerated the decline of inner-city neighborhoods by facilitating urban flight.
Baum-Snow (2007) assesses the extent to which the construction of new highways contributed to the decline of the central city population. Of course, it is hard to determine the impact of highway construction on the spatial allocation of households since highways are not randomly constructed. Instead, growing areas tend to attract a larger amount of new highway construction, which creates an endogeneity problem in estimation. Despite these empirical challenges, we can make some progress. The national road network was not designed to facilitate local commuting. Instead, the objective was to link faraway places to other population centers. Some metropolitan areas received more interstate highways than others simply because they are located nearer to other population centers. In particular, the paper uses the number of proposed highways in a 1947 national interstate highway plan as an instrument for the current highway network. Note that most highway construction in the US did not really take off until the passage of the 1956 Interstate Highway Act.
The idea here is that the plan that was drawn up in 1947 could not have been correlated with locational demand and supply shocks that occurred later and that may have driven much of the relocation between 1950 and 1990. Baum-Snow's estimates indicate that one new highway passing through a central city reduces central city population by about 18 percent. As a consequence, aggregate central city population would have grown by about 8 percent had the interstate highway system not been built, rather than declining by 17 percent.
1 For those of you who are interested in learning how to use quantitative spatial models, I suggest that you start with Redding and Rossi-Hansberg (2017). They also provide some MATLAB code for implementing a simple version of these models.
FIGURE 22.7. Public transportation. (Photo by author)
Of course, the results do not mean that the building of the interstate highway system was a mistake. Overall, these investments made metropolitan areas more attractive. Duranton and Turner (2012) find that a 10 percent increase in a city's initial stock of highways causes about a 1.5 percent increase in its employment over a twenty-year period. We thus conclude that infrastructure investments are essential for metropolitan areas but do not necessarily lead to population growth in central cities. They may in fact cause the opposite effect, creating more urban sprawl. If we want to avoid this type of urban sprawl, we need to encourage higher-density residential and commercial real estate development, which then also makes public transportation systems more viable and effective.
It is not surprising that population growth leads to an increase in traffic and more road congestion. Somewhat surprisingly, increases in highway construction may not lead to faster or smoother travel patterns. Better roads attract more traffic, and each extension of highways is typically met with a proportional increase in traffic. As a consequence, road capacity expansions alone may not be appropriate policies with which to combat traffic congestion. Duranton and Turner also provide some evidence that investments in public transit do not necessarily reduce travel times. Again, the problem is that better public transportation may initially shorten travel times, but driving patterns adjust as using the car becomes more attractive. So in equilibrium we may only see small overall improvements in travel time. Again this finding reinforces the view that higher residential density patterns are essential to keep traffic flows under control, to make public transportation systems effective, to discourage urban sprawl, and to improve the overall energy efficiency of metropolitan areas.
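The instrumental-variable logic behind the Baum-Snow study discussed above can be illustrated with a small simulation. The sketch below is not a replication: the data are entirely synthetic, the variables are stylized standardized quantities, and the "true effect" of -0.18 is a hypothetical number chosen only to make the output easy to read. It uses a simple IV estimator, which coincides with two-stage least squares when there is one instrument and one endogenous regressor.

```python
import random

def covariance(a, b):
    """Sample covariance of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def ols_slope(x, y):
    """Slope from regressing y on x."""
    return covariance(x, y) / covariance(x, x)

def iv_slope(z, x, y):
    """Slope of y on x using z as an instrument for x."""
    return covariance(z, y) / covariance(z, x)

random.seed(0)
TRUE_EFFECT = -0.18   # hypothetical effect of highways on central city population
n = 20_000
planned, built, pop_change = [], [], []
for _ in range(n):
    plan = random.gauss(0, 1)    # highways in the early plan (instrument)
    shock = random.gauss(0, 1)   # unobserved local demand shock
    # Growing areas attract more highways, so the shock enters both equations.
    highways = plan + 0.8 * shock + random.gauss(0, 1)
    change = TRUE_EFFECT * highways + shock + random.gauss(0, 1)
    planned.append(plan)
    built.append(highways)
    pop_change.append(change)

ols = ols_slope(built, pop_change)
iv = iv_slope(planned, built, pop_change)
print(f"true effect:  {TRUE_EFFECT:.2f}")
print(f"OLS estimate: {ols:.2f}")
print(f"IV estimate:  {iv:.2f}")
```

Because the simulated demand shock raises both highway construction and population, the OLS slope is biased upward, while the IV slope stays close to the assumed effect as long as the plan is unrelated to the shock.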
22.7 A Case Study: Congestion Pricing in Singapore and New York City
From a transportation policy perspective, Singapore is one of the most proactive large cities in the world. It was one of the first cities to implement a congestion pricing scheme. The basic idea is that it charges motorists for driving into the central business district during morning rush hour to account for traffic externalities. When individuals make transportation decisions, they typically ignore the congestion that they cause for other commuters. Congestion pricing provides a simple solution for internalizing these externalities. Going forward, Singapore is considering a new regulation that will require all vehicles to have a GPS device that can calculate exact driving distances. Such a system will make it possible for the city to adjust tolls depending on traffic conditions at different times of the day.
Other cities are paying attention. New York City's municipal leaders are under constant scrutiny regarding if and how they are tackling the persistent traffic problem facing daily commuters. City residents and employees attempting to get into their downtown offices must decide whether to take the inefficient and faulty public subway system, hire drivers to take them through the congestion, or drive their own cars and face additional parking costs. While the government struggles with finding a cost-effective way to restructure the underground subway and rail system, Governor Andrew Cuomo has presented the alternative of a central business district zone fee. This would impose a charge on all vehicles entering this highly trafficked area of New York, hopefully reducing congestion and increasing revenues that would be directed toward supporting subway operations. Under the proposal, the fee varies depending on whether the vehicle is classified as a taxi service, a trucking service, or a for-hire driver like Uber or Lyft. Cars could be paying above $10 to enter the city, while some trucks could face a fee above $25. The companies offering ride-hailing apps have agreed to the fee, but only on the condition that all vehicles are charged. The transportation tax creates greater costs for those living outside the city compared to Manhattan residents. Other large cities have implemented a form of ride-hailing service taxes in order to raise revenues for investments in public transportation.
Congestion pricing is a useful tool for transportation economists and urban planners. However, we need to be careful. Without viable public transit options, congestion pricing will induce relocation of businesses and loss of productivity (Brinkman 2016). Alternatively, it places costs on drivers who have to adjust their schedules. It makes sense that these two policies, congestion pricing and investments in public transit, need to be used together to reap positive effects.
22.8 A Case Study: The NYC Subway Crisis
Large negative spillovers in a metropolitan area arise due to traffic congestion. New York City's population has continued to grow over the past several decades, but an important source of transportation, the subway system, is still running on the same technology and equipment it did in the 1930s. This combination of factors has led to the system's usual overcrowding and delays. Commuters have learned to expect unreliability. In comparison to other public transportation systems in cities similar in size and density to New York, like the Tube in London and the Metro in Paris, the subway seems to be operating with vast inefficiencies. Board members of the Metropolitan Transportation Authority (MTA) have claimed that the issues stem from heavy regulations, deteriorating utilities, rising land prices, and an increasing population. But researchers have found that, despite collective concerns to improve the subway, the inability to collect the necessary funding and poor management have hindered its progress.
The proposed costs are far greater than those of other comparable construction projects due to the high demands by labor unions and inflated prices of private construction companies. There are few contracting companies that specialize in underground railroad construction and its corresponding hazards, allowing them to demand higher prices. Construction workers and subway managers have argued that their generous wages and benefits are justified, given their daily exposure to health risks and consumer complaints. While subway workers make an average of $100,000 in Boston, Chicago, Los Angeles, and Washington, those in New York make an average of $170,000. Additionally, the number of workers hired per job in New York is much greater than for projects in these other cities. The unions have argued that the increased number of people on the job is intended to reduce malfunctions and assure greater safety, but the construction process has still resulted in the same number of failures.
The MTA has also experienced funding cuts from the city and the state. Sixty percent of its revenue is from rider fares and the rest is from taxpayer dollars, but as these sources faltered during the recession in 2008, the city government decreased capital allocated to the system's operations and improvement. The MTA collected $1 billion from the city in 1990 but now takes in only $250 million. Similarly, the state has continued to trim subway maintenance funding since the 1990s. The subway's upkeep has been neglected, resulting in more delays and postponing any realization of improvements.
22.9 Conclusions
The development of the automobile and a highly efficient network of modern roads radically changed city structures in the twentieth century. These innovations led to a breakdown of the traditional monocentric city structure. Employment spread out to the suburbs and thus followed residential patterns. A decentralization of employment gave rise to cities with multiple business centers or "edge" cities (Garreau, 1991). Nowadays 75 percent of employment is located more than 3 miles away from historical city centers. Modern cities are polycentric, with many areas that have mixed commercial and residential land use patterns. Nevertheless, distance to the historical city center is still an important factor that affects the locational decisions of households and firms in many urban areas. Households commute to the historical city center not only for work but also for recreation, entertainment, and a variety of other purposes.
FIGURE 22.8. Bike-sharing programs are popular in many cities. (Photo by author)
More recently, the trend toward decentralization of employment supported by extensive road and highway networks has been reversed in some metro areas. Revitalized city centers across the nation have started to draw new, younger residents and new businesses. Some cities, such as Rochester, Milwaukee, and Portland, have demolished and replaced some older highways that are expensive to maintain, divide urban neighborhoods, and hinder efforts to create pedestrian-friendly spaces.
Broadly speaking, the spatial distribution of firms and households determines the commuting patterns and hence infrastructure needs. The spatial structure of cities also depends on transportation technologies and energy prices. For example, rising energy prices lead to higher commuting and transportation costs. Development of better technologies may partially offset these higher commuting costs.
Global warming will have a significant impact on future city structures. There are some plausible scenarios that will require societies to drastically reduce CO2 emissions. Recall that the average automobile commuter emits ten times as much CO2 as the average urban commuter who uses public transportation. Approximately 20 percent of all CO2 emissions are due to household commuting. Another 20 percent are due to residential housing, such as heat and air conditioning. Global warming may, therefore, call for higher-density development, less urban sprawl, and a decreased use of automobiles.
Given the importance of cities, one would expect that federal policies would subsidize cities at the expense of their more affluent suburbs. Unfortunately, this is not the case in the US. Consider, for example, the subsidy for homeownership via the mortgage interest deduction. This deduction is only useful for households that have incomes sufficiently high to itemize their tax deductions. We have seen that in most metropolitan areas in the US high-income households tend to live in suburban communities outside central cities. Effectively, the mortgage interest deduction is a tax subsidy for housing consumption of affluent suburban households. This policy, therefore, subsidizes urban sprawl, decreases agglomeration, and hence seems to be counterproductive.
Land and house prices depend not only on the distance of the house to important centers of employment but also on a variety of other location-specific and structural characteristics. We need a more flexible model that can explain price differences when houses or land differ by a variety of characteristics. Sherwin Rosen's hedonic model provides one elegant solution to that problem. We will discuss that model in the next chapter.
22.10 Technical Appendix: Endogenous Land Use
Glaeser (2008) considers a version of the monocentric city model with endogenous land use. Let us assume that utility is defined over consumption and land and is given by
$U(C, L) = C + \alpha \ln L$
(22.17)
Substituting the budget constraint into the utility function implies that
$U(d) = W - td - r(d)L + \alpha \ln L$
(22.18)
Now we treat land use as a choice variable. This makes sense since households substitute consumption for land as land gets more expensive. Think about New York City. In Manhattan, most individuals live in high-rise apartment buildings that use little land. In the suburbs, most households live in detached single-family houses. Why? Land is much more expensive in the center city than in the suburbs. Optimal land use then must satisfy the following first-order condition:
$r(d) = \frac{\alpha}{L}$
(22.19)
or $L = \alpha/r(d)$. Hence we have
$U(d) = W - td - \alpha + \alpha\ln(\alpha) - \alpha\ln(r(d))$
(22.20)
As before we can take the derivative with respect to d and obtain the spatial optimality condition, which is given by
$r'(d) = -\frac{t}{L} = -\frac{t\,r(d)}{\alpha}$
(22.21)
Solving this differential equation yields
$r(d) = r(0)\,e^{-(t/\alpha)d}$
(22.22)
Taking logarithms of both sides of the equation yields
$\ln(r(d)) = \ln(r(0)) - (t/\alpha)d$
(22.23)
The logarithm of land use rises linearly with distance to the city center since rents decrease at an exponential rate. This specification is often used for empirical work.
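The appendix results can be checked numerically with a short Python sketch. It is an illustration under stated assumptions: alpha denotes the preference parameter on land in the utility function of equation (22.17), r0 denotes the rent in the city center, and all parameter values are hypothetical.

```python
import math

def rent(d, r0, t, alpha):
    """Exponential rent function from equation (22.22)."""
    return r0 * math.exp(-(t / alpha) * d)

def land_use(d, r0, t, alpha):
    """Optimal lot size implied by the first-order condition (22.19)."""
    return alpha / rent(d, r0, t, alpha)

def utility(d, W, r0, t, alpha):
    """Utility at distance d when land is chosen optimally."""
    L = land_use(d, r0, t, alpha)
    return W - t * d - rent(d, r0, t, alpha) * L + alpha * math.log(L)

# Hypothetical parameter values, for illustration only.
W, r0, t, alpha = 100.0, 50.0, 2.0, 10.0

for d in (0.0, 5.0, 10.0):
    print(f"d = {d:4.1f}   r(d) = {rent(d, r0, t, alpha):6.2f}   "
          f"L(d) = {land_use(d, r0, t, alpha):5.2f}   "
          f"U(d) = {utility(d, W, r0, t, alpha):6.2f}")
```

Utility is the same at every distance, while the logarithm of lot size increases by t/alpha for each additional unit of distance, in line with the log-linear specification used in empirical work.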
22.11 Debate: Public Transportation Infrastructure
Consider a city such as London or Berlin that has recently upgraded the public transportation system. Then discuss the pros and cons of these investments. The pro side should argue that more investment in public transportation makes sense. The con side should argue that public transportation is inefficient and infrastructure investment should focus on roads and highways. The following questions may help structure the debate.
1. Who gains and loses from investments in public transportation?
2. What is the impact of investments in public transportation on traffic congestion?
3. What have we learned from cities that have invested heavily in public transportation?
4. What are some alternative policies that could be implemented to reduce traffic congestion in city centers?
5. How many people work in the city center and need to commute by car?
6. What are the alternative modes of transportation that are feasible for suburban commuters?
22.12 Problem Sets
1. What are the key assumptions of the monocentric city model?
2. Which cities in the US are approximately shaped like a circle? Which are not and what explains this discrepancy?
3. Explain how a change in the rental price of land outside the city affects the rental price function in equilibrium.
4. Explain what will happen to the price of land as a function of distance to the city center if the price of gasoline permanently increases due to an increase in carbon taxation.
5. Suppose preferences are given by
(22.24)
where C is private good consumption and L is land consumption, which is assumed to be fixed. The transportation technology is given by
(22.25)
where I is a parameter and d is the distance to the city center. Households are identical and earn a wage equal to W by commuting to the city center. Let r(d) denote the rent function.
a) Derive the utility of a representative household as a function of d.
b) Derive the differential equation that characterizes the slope of the equilibrium rent function.
c) Derive the equilibrium rent function, assuming that the rent at the edge of the metro area is exogenously given by $\bar{r}$.
d) How does the shape of the equilibrium rent function depend on this parameter? Provide a graphical analysis.
6. There are a large number of commuters who decide to travel either by car or by train. Commuting by train takes 70 minutes whatever the number of commuters on the train. Commuting by car takes
$T(x) = 20 + 60x$
(22.26)
where x is the proportion of commuters taking their car, $0 \le x \le 1$.
a) Plot the curves of the commuting time by car and train as a function of the proportion of car users.
b) At what level of x does it take equal amounts of time if you take the train or the car? Why would that be the market equilibrium?
c) What is the proportion of car users that minimizes the total travel time?
d) Interpret the difference between (b) and (c) and suggest some policies that could achieve an efficient allocation.
7. What are alternatives to congestion pricing?
8. Why are congestion pricing policies unpopular with local voters?
9. Consider the city of Atlanta. How many areas in Atlanta could be considered to be central business districts?
10. Explain why agglomeration externalities may explain why we observe firm concentrations outside the central business district.
23
Land and Housing Markets
23.1 Motivation

We have seen that land and housing prices are partially determined by local public goods and services as well as amenities such as the distance to important places of employment or central business districts. However, these are not the only factors that determine the price or the value of a house. In general, houses differ by many structural dimensions, such as size, location, and quality. To make progress and track the observed heterogeneity in housing, we must develop a new model.

The key concept that we use in industrial organization to model demand for consumer goods is called product differentiation. This is the process of distinguishing a product or service from other similar products. Think about cars. There are many mid-sized sedans that are similar but not identical. Product differentiation arises because consumers have heterogeneous tastes and do not necessarily want to purchase a uniform, standardized version of a product. As a consequence, most consumer goods, such as cars, computers, clothes, and food, are differentiated products. The same logic applies to housing. Households have different needs for housing and different income and wealth levels. As a consequence, housing comes in all shapes, forms, and prices.

We need a simple, tractable model that can capture the heterogeneity we observe in housing markets. To reduce the complexity or dimensionality of the problem, we use a characteristics-based approach. This approach fully describes a product by a finite number of product characteristics.1 The most popular model of housing is the hedonic model, which is due to Sherwin Rosen (1974). The basic idea is to treat housing as a differentiated product that is valued for its characteristics. The characteristics include all relevant dimensions of housing quality, such as the size of the house, size of the lot, location, amenities, access to public goods and services, etc. In equilibrium, we need to match households that want to purchase specific houses to the developers that build these houses. The resulting match is achieved via a potentially nonlinear pricing function that maps the characteristics of the housing unit into its price.

1 In appendix A.9, we discuss how to use discrete choice techniques to model and estimate the demand for housing.
In this chapter, we introduce a simple version of the hedonic model in which houses only differ by a one-dimensional quality index. We then discuss how to extend the model to allow for more than one characteristic. While it is not easy to characterize the equilibria of hedonic models, it is fairly straightforward to estimate hedonic price functions in practice. We can use readily available data on housing or land transactions. These models provide the backbone of many empirical studies in housing economics. To illustrate the power of the hedonic approach, we present two applications. First, we consider housing prices in Los Angeles. Second, we continue our analysis of land prices in NYC. We estimate the hedonic pricing functions using data on housing or land transactions.

Housing prices and rental rates for housing can be high in many popular metropolitan areas. That makes it difficult for many low- and moderate-income households to live in adequate housing units if they have to pay market prices. As a consequence, there are a variety of government programs that help to provide affordable housing for low- and moderate-income households that live in large metropolitan areas. We first consider public housing communities and housing voucher programs that are administered by the US Department of Housing and Urban Development (HUD). In addition, municipalities often use their own housing market regulations to keep housing affordable. The most notable program in the US is the rent stabilization program in NYC. We present that program as a case study.
23.2 The Hedonic Model of Housing

To start out, let us assume that houses differ by a single attribute or characteristic that we call quality. Table 23.1 summarizes the notation for the key variables of the model. Let us denote quality by z. This assumption basically implies that there exists a simple ranking of all housing units that all households easily agree upon. Note that this assumption is probably too restrictive, but it is a good starting point for our analysis. The price of a house of quality z is given by p(z). Note that prices do not have to be linear in z, i.e., we no longer assume that p(z) = pz, where p is a constant price per quality unit.

TABLE 23.1. Chapter Notation

Variable    Definition
z           Housing quality
p(z)        Housing price function
$\Pi$       Profits of housing developer
C()         Cost function
$\theta$    Heterogeneity in costs
U()         Utility function
h(z)        Utility from housing
v           Heterogeneity in preferences
m           Household income

There are a large number of firms that differ in productivity $\theta$. We need heterogeneity in productivity for two reasons. First, it captures an important aspect of reality. Not all firms that operate in a market are equally productive. Second, it helps us explain why different firms produce different types of goods. In this model, high-productivity firms build high-quality houses while low-productivity firms build low-quality houses in equilibrium, as we will see in detail below. The costs of building a house of quality z are given by $C(z; \theta)$. Each firm produces a single house. Profits are thus given by

$$\Pi(z, p(z); \theta) = p(z) - C(z; \theta) \tag{23.1}$$

FIGURE 23.1. The optimization problem of a firm. [Figure: price p plotted against quality z, showing an isoprofit curve and the direction of increasing profits.]
Each firm chooses a level of z to maximize profits. The firm decision problem is illustrated in figure 23.1. It shows the isoprofit curves, which are combinations of price and quality that yield the same level of profits. It also shows the price function, which characterizes all levels of z and p(z) that are feasible for the firm. The figure also shows the optimal choice of quality for one firm with a given level of productivity. At the optimum, the isoprofit curves and the hedonic price function satisfy a standard tangency condition. We show in the technical appendix at the end of this chapter that the slope of the price function, which is given by p'(z), must be equal to the slope of the cost function, given by $C'(z; \theta)$. Hence profit-maximizing firms choose z such that

$$p'(z) = C'(z; \theta) \tag{23.2}$$
Firms differ in productivity. Hence different firms will locate at different points of the hedonic price function.
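The tangency condition (23.2) can be checked numerically. The following is a minimal sketch, not the chapter's own example: it assumes a linear price function p(z) = 2z and a cost function C(z; θ) = z²/θ, both chosen only for illustration. Under these assumptions the first-order condition 2 = 2z/θ gives z = θ, so more productive firms should pick higher quality.

```python
import numpy as np

def optimal_quality(theta, z_grid):
    """Grid-search the quality level that maximizes profit for productivity theta."""
    price = 2.0 * z_grid            # assumed price function p(z) = 2z (illustrative)
    cost = z_grid ** 2 / theta      # assumed cost function C(z; theta) = z^2 / theta
    profit = price - cost           # equation (23.1)
    return z_grid[np.argmax(profit)]

z_grid = np.linspace(0.0, 2.0, 20001)
thetas = [0.25, 0.50, 0.75, 1.00]
choices = [optimal_quality(t, z_grid) for t in thetas]

for t, z in zip(thetas, choices):
    print(f"theta = {t:.2f} -> chosen quality z = {z:.4f}")
```

Each firm ends up at a different point of the same price function, which is the sorting by productivity described above.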
FIGURE 23.2. The optimization problem of a household. [Figure: price p plotted against quality z, showing the price function p(z) and the direction of increasing utility.]
Households differ in preferences for quality. Let us denote the heterogeneity of households by v. Let's assume that preferences are quasi-linear and given by
$$U(z, c; v) = h(z; v) + c \tag{23.3}$$

where c is nonhousing consumption. The budget constraint is given by m = c + p(z). Substituting the budget constraint into the utility function, we obtain

$$U(z, p(z); v) = h(z; v) + m - p(z) \tag{23.4}$$
where m denotes income. Note that income does not affect the demand for housing in this specification because preferences are quasi-linear. Figure 23.2 illustrates the solution to the household's maximization problem. It shows indifference curves that are defined over z and p(z). The optimal consumption bundle must satisfy the condition that the slope of the indifference curve is equal to the slope of the pricing function:
$$p'(z) = h'(z; v) \tag{23.5}$$

The slope of the indifference curve measures the marginal willingness to pay for quality. Since households have different preferences for quality, they line up at different points of the price function. Households with strong preferences for housing (a high value of v) purchase high-quality houses. Households with weak preferences for housing (a low value of v) purchase low-quality houses.
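The role of quasi-linearity can be illustrated with a small numerical sketch. The functional forms here are assumptions made for illustration only: h(z; v) = v z^0.5 and a linear price function p(z) = z. Condition (23.5) then gives z = (v/2)², which depends on v but not on income m.

```python
import numpy as np

ALPHA = 0.5  # assumed curvature of h(z; v) = v * z**ALPHA (illustrative)

def chosen_quality(v, income, z_grid):
    """Grid-search the quality that maximizes h(z; v) + income - p(z), as in (23.4)."""
    price = z_grid                                   # assumed price function p(z) = z
    utility = v * z_grid ** ALPHA + income - price
    return z_grid[np.argmax(utility)]

z_grid = np.linspace(0.0, 2.0, 20001)
results = {}
for v in (1.0, 1.5, 2.0):
    results[v] = [chosen_quality(v, income, z_grid) for income in (50.0, 100.0)]
    print(f"v = {v:.1f}: z at income 50 = {results[v][0]:.4f}, "
          f"z at income 100 = {results[v][1]:.4f}")
```

Doubling income leaves the chosen quality unchanged, while a higher taste parameter v moves the household up the price function.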
FIGURE 23.3. Matching of a single firm with a household. [Figure: price p plotted against quality z, showing an isoprofit curve.]
Finding an equilibrium for this market is, then, solving a matching problem. We need to find a pricing function so that each firm is matched to one and only one household. The matching of a single firm to a single household is illustrated in figure 23.3. In equilibrium, we need to match every firm to a household such that the matching is stable, i.e., no firm and no household can obtain a better match by deviating unilaterally.2 Characterizing an equilibrium price function that implements a stable match can be rather complicated. In the technical appendix, we provide an example of a model that has a closed-form solution for the equilibrium. We show that high-demand households are matched to high-productivity firms and vice versa. This type of allocation is called positive assortative matching.

The hedonic model can be extended to allow for cases in which houses differ by a large number of different characteristics. Not surprisingly, these versions of the model must be solved on a computer. However, the basic insights of the analysis above extend to a more complicated model. For example, equilibrium requires that the slope of the hedonic price function must be equal to the marginal willingness to pay for a characteristic for each household. Similarly, the slope must be equal to the marginal costs for each firm. We are still looking for a stable matching between producers and consumers that can be decentralized by a nonlinear pricing function. We discuss more details of this model in the technical appendix at the end of the chapter.

2 A similar model can also be used to capture the marriage market, where we need to find stable matches across partners. Stability means there does not exist a pair that has an incentive to deviate.
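Positive assortative matching can be illustrated with a toy example. This is only a sketch under strong assumptions that are not taken from the chapter: the surplus of matching a firm with productivity θ to a household with taste v is assumed to be the product θ·v (so productivity and taste are complements), and payoffs are assumed to be transferable through the price, so that a stable assignment is one that maximizes total surplus. The code checks every possible one-to-one assignment by brute force.

```python
import itertools
import numpy as np

theta = np.array([0.2, 0.5, 0.7, 0.9])  # firm productivities, sorted (illustrative)
v = np.array([0.1, 0.4, 0.6, 0.8])      # household tastes, sorted (illustrative)

def total_surplus(assignment):
    """Total assumed surplus when firm i is matched to household assignment[i]."""
    return float(sum(theta[i] * v[j] for i, j in enumerate(assignment)))

# Enumerate all one-to-one assignments and keep the surplus-maximizing one.
best = max(itertools.permutations(range(len(v))), key=total_surplus)
reverse = tuple(reversed(range(len(v))))

print("surplus-maximizing assignment:", best)
print("surplus under sorted matching:", round(total_surplus(best), 4))
print("surplus under reversed matching:", round(total_surplus(reverse), 4))
```

With both lists sorted in increasing order, the best assignment pairs the i-th firm with the i-th household, which is the sorted pattern described in the text.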
23.3 Using Hedonic Models in Empirical Work

One appealing property of the hedonic model is that we can learn something about unobserved household preferences for housing characteristics from the observed price function. The first step of the empirical analysis is to get a sample of housing transactions and housing characteristics. These types of data are typically available from a variety of different companies that track housing markets. The second step of the empirical analysis is then to estimate the hedonic pricing function for a market. Notice that different local housing markets have different pricing functions. Plotting the data against some important characteristics, such as the size of the house or the size of the lot, can be helpful in determining suitable functional forms. Suppose, for example, we observe that the relationship between prices and the size of the house is convex. In that case, we can estimate a quadratic housing price function, such as

$$p = \alpha_0 + \alpha_1 z + \alpha_2 z^2 + u \tag{23.6}$$
where u is the error term of the regression model. If $\alpha_2 > 0$, the pricing function is convex. If $\alpha_2 = 0$, the pricing function is linear. Another popular specification is a log-log regression:

$$\ln(p) = \gamma_0 + \gamma_1 \ln(z) + u \tag{23.7}$$

Alternatively, we can impose convexity and estimate an exponential function, given by

$$p = e^{\beta_0 + \beta_1 z + u} \tag{23.8}$$

Taking logs, we obtain

$$\ln(p) = \beta_0 + \beta_1 z + u \tag{23.9}$$
The log-linear or log-log specifications are commonly used in many empirical studies since researchers have found that they often fit the data quite nicely. We can also extend the log-linear model to account for more than one characteristic. Suppose we observe J different characteristics, denoted by $(z_1, \ldots, z_J)$. The log-linear specification is then given by

$$\ln(p) = \beta_0 + \beta_1 z_1 + \cdots + \beta_J z_J + u \tag{23.10}$$

where $\beta_1$ through $\beta_J$ are the parameters that correspond to the different characteristics in the regression function. We can estimate the parameters of this regression model using ordinary least squares as long as we have access to a random sample of housing or land transactions in a local market.
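Estimating a log-linear hedonic regression by ordinary least squares takes only a few lines. The sketch below uses synthetic data, since no dataset accompanies the text: the two characteristics (house size in thousands of square feet and age in decades), the "true" coefficients, and the noise level are all hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic characteristics (hypothetical units and ranges).
size = rng.uniform(0.8, 3.5, n)   # thousands of square feet
age = rng.uniform(0.0, 5.0, n)    # decades

# Assumed data-generating process: ln(p) = 12.0 + 0.3*size - 0.05*age + u.
true_beta = np.array([12.0, 0.3, -0.05])
X = np.column_stack([np.ones(n), size, age])
log_price = X @ true_beta + rng.normal(0.0, 0.2, n)

# Ordinary least squares estimate of the log-linear specification.
beta_hat, *_ = np.linalg.lstsq(X, log_price, rcond=None)
residuals = log_price - X @ beta_hat
r_squared = 1.0 - residuals.var() / log_price.var()

print("estimated coefficients:", np.round(beta_hat, 3))
print("R-squared:", round(float(r_squared), 3))
```

With real transaction data, the synthetic block would be replaced by the observed log prices and a matrix of observed characteristics; the estimation step is unchanged.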
23.4 Housing Prices in LA

To illustrate the usefulness of hedonic housing price regressions, we consider some results from Sieg, Smith, Banzhaf, and Walsh (2002). They analyze housing transactions in ninety-two school districts in the greater Los Angeles metropolitan area.
TABLE 23.2. Housing Prices in LA

Variable             Coefficient
Bathrooms             0.0366 (0.0018)
Bedrooms             -0.0390 (0.0012)
Pool                  0.0656 (0.0021)
Fireplace             0.0747 (0.0018)
Building (sq. ft)    -2.0506 (0.068)
Lot (sq. ft)          0.1424 (0.033)
Age                  -0.3120 (0.017)
Age^2                -0.0170 (0.0006)
Lot^2                -0.0096 (0.0014)
Building^2            0.1749 (0.0050)
Age x lot             0.0326 (0.0014)
Age x building        0.0050 (0.0023)
Lot x building        0.0098 (0.0046)
Constant             16.93 (0.295)
N                    221,296
R^2                   0.6295

Note: Estimated standard errors are in parentheses.
The dataset contains housing characteristics and transaction prices for virtually all housing transactions between 1988 and 1992. Table 23.2 is based on their findings and reports the results from a log-log regression that includes fixed effects for each of the school districts in the sample. Most of the estimated coefficients have the expected sign and plausible magnitudes. The price is decreasing in age and increasing in building and lot size (over the relevant range). You may be surprised to see that the coefficient on building size is negative but that the quadratic term is positive. If you want to compute the marginal effect of increasing building size, you need to take this into consideration as well as all the interaction effects. An additional bathroom increases the housing price, as do fireplaces and pools. The R^2 of the regression is approximately 0.63, indicating that the structural characteristics observed in our sample explain a large fraction of the observed variation in prices. Nevertheless, there is a fair bit of price variation that is not explained by structural characteristics or school district fixed effects.

Many researchers include not only structural housing characteristics in the hedonic regression but also a variety of neighborhood characteristics, such as amenities, local public goods, distance to the city center, or measures of the quality of the environment, such as air or water quality. These additional variables often explain much of the remaining variation. This confirms our intuition that housing and neighborhood choices are closely related. Moreover, we can use these types of regressions to get some insights into how much households value different amenities and public goods. That is useful information since we do not observe the direct demand for these types of goods.
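The marginal effect of building size mentioned above can be sketched as follows. The coefficients are those reported in table 23.2. Everything else is an assumption made for illustration: that building size, lot size, and age all enter the regression in natural logs, that the squared and interaction terms are products of those logs, and that the example house (1,500 sq. ft building, 6,000 sq. ft lot, 30 years old) is representative. Under these assumptions the elasticity of price with respect to building size is the building coefficient, plus twice the quadratic coefficient times log building size, plus the two interaction terms.

```python
import math

# Coefficients reported in table 23.2.
B_BUILDING = -2.0506
B_BUILDING_SQ = 0.1749
B_AGE_X_BUILDING = 0.0050
B_LOT_X_BUILDING = 0.0098

def building_elasticity(building_sqft, lot_sqft, age_years):
    """Derivative of ln(price) with respect to ln(building size).

    Assumes all three variables enter in natural logs (an assumption,
    not something stated in the table).
    """
    return (B_BUILDING
            + 2.0 * B_BUILDING_SQ * math.log(building_sqft)
            + B_AGE_X_BUILDING * math.log(age_years)
            + B_LOT_X_BUILDING * math.log(lot_sqft))

# Hypothetical example house.
example = building_elasticity(building_sqft=1500, lot_sqft=6000, age_years=30)
print(f"implied elasticity of price with respect to building size: {example:.3f}")
```

For this hypothetical house the implied elasticity is positive even though the linear coefficient is negative, because the quadratic term dominates at realistic building sizes.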
23.5 Land Prices in NYC

Hedonic price regressions can be estimated for almost any durable consumption good that is a differentiated product. We use these types of regressions to study not only housing markets but also many other markets. To illustrate the usefulness of hedonic methods, let's take another look at the land transactions data in New York that we studied in the previous chapter. We would like to control for heterogeneity in characteristics among land parcels. To investigate these issues in more detail, we can estimate a hedonic land price regression that controls for time trends and observed characteristics of land. These characteristics include the type of property, the condition of the property, the characteristics of the transaction, and location-specific characteristics such as distance to the city center.

Table 23.3 is based on Haughwout, Orr, and Bedoll (2008) and summarizes the parameter estimates of a log-linear price regression. As we would expect from the discussion in the previous chapter, the estimated coefficient on the log of the distance to the Empire State Building (ESB) is negative and highly significant, but it is not the only significant variable in the regression model. Not surprisingly, structural characteristics, such as the condition of the property, the type of property, and characteristics of the transaction or intended use patterns are also important explanatory variables that determine the price of land in the NYC metro area. As a consequence, we conclude that there are many factors that influence the price of land in NYC besides distance to the city center. This is not that surprising, but it also clearly shows that we need to develop fairly rich models if we want to explain the observed variation in land and housing prices.
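The coefficients in table 23.3 can be translated into percentage effects. The sketch below assumes that the dependent variable is the natural log of the land price, as the term "log-linear price regression" suggests; the helper function names are hypothetical. For a 0/1 indicator with coefficient b, the implied price change is exp(b) − 1. For a log-distance coefficient e, multiplying distance by a factor k changes the price by k^e − 1.

```python
import math

def percent_change_from_indicator(coef):
    """Percent price change from switching a 0/1 indicator on in a log-price regression."""
    return 100.0 * (math.exp(coef) - 1.0)

def percent_change_from_distance(elasticity, distance_ratio):
    """Percent price change when distance is multiplied by distance_ratio."""
    return 100.0 * (distance_ratio ** elasticity - 1.0)

# Coefficients taken from table 23.3.
environmental = percent_change_from_indicator(-0.81)
doubling = percent_change_from_distance(-0.95, 2.0)
doubling_residential = percent_change_from_distance(-0.95 - 0.32, 2.0)

print(f"significant environmental problems: {environmental:.1f}%")
print(f"doubling distance to the ESB: {doubling:.1f}%")
print(f"doubling distance, residential land: {doubling_residential:.1f}%")
```

Under these assumptions, doubling the distance to the ESB roughly halves the land price, and the gradient is steeper for residential land because the interaction coefficient adds to the baseline elasticity.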
23.6 Housing Policies and Regulation

To finish our discussion, it is useful to address a number of policies that affect land and housing markets. First, and most importantly, the price of housing is largely affected by housing supply regulations. Second, the price of housing may deviate from the competitive equilibrium price due to rent control or rent stabilization policies. These types of policies are popular in large cities to control the price for rental properties. Third, public housing programs that are largely financed by the Department of Housing and Urban Development provide quality housing to low- and moderate-income households. We discuss each of these policies in this section.
TABLE 23.3. Land Prices in NYC

Parameter                                                    Estimate   Standard Error
Constant                                                       6.82       (0.19)

Type of Property
  Residential land                                             0.09       (0.25)
  Industrial land                                             -0.75       (0.23)

Condition of Property
  Lot is graded                                                0.45       (0.06)
  Lot is paved                                                 0.45       (0.09)
  Lot is "finished"                                            0.45       (0.05)
  Lot is "fully improved"                                      0.38       (0.07)
  Lot was previously developed                                 0.55       (0.06)
  Lot is currently partially developed                         0.55       (0.31)
  Lot is platted and engineered                                0.23       (0.37)
  Lot has a structure present                                 -0.11       (0.19)
  Structure present times log of distance to ESB               0.03       (0.07)
  Improvements not available                                   0.23       (0.05)

Characteristics of Transaction
  Lot sold as part of expansion plans by buyer                 0.17       (0.07)
  Foreclosure transaction                                     -0.38       (0.17)
  Eminent domain transaction                                   0.38       (0.18)
  Lot has significant environmental problems                  -0.81       (0.14)
  Lot was not sold on the open market                          0.04       (0.06)

Intended Use
  Buyer intends to hold lot for investment                    -0.21       (0.07)
  Lot is intended for public use                              -0.48       (0.08)
  Lot will be held as open space                              -1.24       (0.08)
  Intended use unknown                                        -0.19       (0.07)

Location
  Natural logarithm of distance from ESB                      -0.95       (0.05)
  Natural logarithm of distance from ESB * residential land   -0.32       (0.04)

23.6.1 Housing Supply Regulations

The price of housing strongly depends on regulations that affect housing construction and, therefore, the supply of housing. These regulations differ significantly across states and metropolitan areas. In general, coastal cities in the West and Northeast tend to be heavily regulated while the rest of the country is less regulated. Among the least regulated cities is Houston.

Let us consider California to illustrate the impact of regulation on housing markets. Demand for housing has skyrocketed in many urban areas in California for the past several decades. Supply has been lagging behind. As a consequence, California is one of the most expensive housing markets in the United States. Limiting the supply of housing tends to drive up home values. On average, houses cost more than 150 percent of the national average.3

What explains these developments in California housing markets? Here we focus on the question of why the construction of new houses and apartments has been so difficult in some parts of California. California has the most stringent regulations on new housing construction in the country. More than two-thirds of cities and counties in coastal metropolitan areas have policies that are explicitly aimed at restricting housing growth and impose limits on density. Local governments impose a multilayered review process for each new construction project. Often developers need to obtain approval from multiple departments within the same city, including the municipal building department, the health department, the fire department, and the planning commission. In addition, neighbors can delay or block projects using the state's 1970 Environmental Quality Act. As a consequence, it takes three times as long to get a housing permit in San Francisco as in the typical American city. Local governments also charge development fees that are three to four times as high in California as in the rest of the country. Labor costs are approximately 20 percent higher in California than in the rest of the country. Communities often require developers to pay workers "prevailing wages" determined by local unions. Stringent building codes and energy efficiency standards increase construction costs. Summing up all these factors, the costs of building a median quality home in California may be $50,000 to $75,000 higher than in other metropolitan areas.

3 See the Wall Street Journal article by Allysia Finley (September 29, 2017) for details.

FIGURE 23.4. New housing construction. (fancycrave.com/pexels.com)
In general, the supply of housing depends largely on zoning laws, building codes, labor rules, and other regulations. The example of California illustrates that these regulations can significantly increase the costs of construction. More importantly, they can severely limit the amount of new construction. Both of these factors tend to drive up the price of new and existing housing. Homeowners are thus forced to pay large premiums for houses in California. Once they have purchased a house they have no incentives to vote for deregulating the housing market. A large increase in housing supply would lower the overall price level and hence create capital losses for all current homeowners. As a consequence, it is not surprising that local politicians are reluctant to deregulate the local housing markets and increase housing supply.
23.6.2 Rent Control and Rent Stabilization

Rental markets are also heavily regulated in many cities and countries. Rent control policies are arguably among the most important regulations in rental housing markets. These policies protect tenants in privately owned residential properties from large rent increases. By keeping rental rates low, these policies aim to provide affordable housing to low- and moderate-income households. Hence rent control policies primarily redistribute resources from landlords to renters.

To learn about the likely impact of rent control on housing markets, it is useful to study a recent policy change that happened in Cambridge, Massachusetts. All rental units in Cambridge built prior to 1969 were regulated by a rent control ordinance until 1994. These regulations placed caps on rent increases and restricted the removal of units from the rental stock. Rent-controlled units typically rented at 40 percent below the price of nearby noncontrolled properties. In November 1994 the Massachusetts electorate passed a referendum to eliminate rent control in the state. Autor, Palmer, and Pathak (2014) study the impact of abolishing rent control on property values. They use data on residential properties between 1988 and 2005. Not surprisingly, they find that abolishing rent control generated substantial price appreciation of units that used to be rent controlled. We would expect that these properties appreciated in value since landlords could charge much higher rents for the units in the building. They also find significant spillovers on the housing values of nearby units that were never subject to rent control. They estimate that abolishing rent control accounted for a quarter of the $7.8 billion in Cambridge residential property appreciation during this period. The majority of this contribution stems from an induced appreciation of properties that were not previously rent controlled.
This finding illustrates that rent control policies not only significantly lower rents below their equilibrium values but, more surprisingly, also generate negative externalities on property values of nearby, unregulated housing units. Rent stabilization policies are often less draconian than rent control policies but impose similar restrictions on landlords. To illustrate the impact of rent stabilization policies, it is useful to study housing markets in NYC, which have also been heavily regulated since the 1930s. Under New York State's rent stabilization law, any city in the state of New York may declare a housing emergency whenever the city's rental vacancy rate drops below 5 percent. This law was most recently
renewed in June 2015 and affects units with a maximum rent of $2,700. De facto, this is a law that only applies to NYC. This policy determines maximum rates for annual rent increases. It also gives tenants the right to have their leases renewed. The rent guidelines board meets every year to determine how much landlords can increase rents on the leases. As a consequence, rent-stabilized units are approximately 50 percent cheaper than nonstabilized units. Not surprisingly, it is rather difficult to find a rent-stabilized unit in New York. Renters are reluctant to leave the units even if they are too large or too small given their personal needs. This is inefficient since housing units are not properly matched to households.

Sieg and Yoon (2020) report that over one million units were rent stabilized in 2011, representing roughly 47 percent of the rental housing stock in NYC. A small fraction of the units are voluntarily subjected to rent stabilization by their owners. Under the 421-a program, developers currently have to set aside 20 percent of new apartments for poor and working-class tenants in order to receive tax abatements lasting thirty-five years. The vast majority of units are involuntarily stabilized since the city declared a housing emergency in 1974.

In 2016 NYC mayor Bill de Blasio proposed a ten-year plan to build 200,000 affordable or rent-stabilized housing units in the New York City area through various rezoning laws. According to the plan, two hundred blocks of East New York, Brooklyn, would be rezoned under "mandatory inclusionary zoning" that would allow developers to build much taller buildings on the condition that they also build a number of affordable housing units and apartments for seniors. The zoning laws are designed in a way that no longer favors the construction of parking lots, thus incentivizing developers to build more affordable housing units instead.
The plan also includes certain rules and restrictions that would protect the character of the neighborhood and ensure better-quality housing for low-income and senior residents. For example, one subsection of the plan requires that space be set aside for "better ground retail," like day care centers and supermarkets, in order to improve the quality of life in these communities. In March 2016 the city council approved this plan. However, the question remains whether this plan will address the long-term problem that New York City faces. If developers choose not to build or not to honor the agreements to build affordable housing, these new zoning laws will not lead to the creation of new affordable housing units and the city will have to take other measures to solve this ongoing issue.
23.6.3 Public Housing Policies

In addition to private housing markets, there is a market for publicly provided housing. As we discussed in chapter 17, the US government currently offers a variety of housing assistance programs to low-income households. The US Public Housing Program was formed in 1937 under the US Housing Act for the purpose of funding local governments to own and manage buildings for low-income residents with subsidized rents (Olsen, 2003). Currently, the US Department of Housing and Urban Development (HUD) supports the efforts of hundreds of local housing authorities in the United States. To give a perspective on the extent of the coverage that HUD supplies, in the state of Pennsylvania alone there are ninety-two distinct housing authorities. In 2006 the estimated
HUD budget for public housing was $24.6 billion.4 Within the public housing program, these funds finance administration, building maintenance, and even law enforcement. The development of public housing communities started in the 1940s, initially established as part of a slum clearance program. Eventually, the emphasis shifted toward increasing and upgrading the housing stock for low- and moderate-income households. In the 1960s and 1970s public housing communities were considered to be more desirable than other low-cost alternatives. By the 1980s this common perception changed as public housing communities began to develop a reputation for violence, drug use, prostitution, and political patronage. Now some of the worst public housing communities have been demolished. However, there still exists an excess demand for public housing in almost all major cities.

In addition to providing and maintaining public housing communities, local housing authorities also have a role in the private rental market. The Housing Choice Voucher Program, formerly known as Section 8, provides rental assistance to low-income families. The program was created by the Housing and Community Development Act of 1974 to increase opportunities for better living conditions for low-income families while maintaining rent payments at an affordable level. While private owners maintain these buildings, the vouchers are intended to promote freedom of housing choice and integrate lower-income and minority families into higher-quality communities. In many large cities, these vouchers have drawn in such a large demand that some waiting lists for Section 8 vouchers have been closed.

Let's consider NYC to obtain some more insights into the functioning of these programs, which are administered by the New York City Housing Authority (NYCHA). Households whose incomes do not exceed 80 percent of median income are eligible for the public housing program.
The threshold is 50 percent for the Section 8 voucher program. In addition, these income thresholds take family size into account. For example, in 2011 the program's qualifying income limit for a single-person household was $45,850, while it was $65,450 for a family of four living in NYC. The NYCHA currently operates 177,666 public housing apartments throughout New York, providing 403,000 residents with affordable housing. In addition, the NYCHA Section 8 program subsidizes rental assistance in private housing for another 235,000 residents. While public housing often has a poor reputation among middle-class households, these units tend to be popular among low- and moderate-income households. The NYCHA reported that 270,201 families were on the waiting list for conventional public housing and 121,356 families were on the waiting list for Section 8 vouchers. Sieg and Yoon (2020) estimate that it takes seventeen to nineteen years on the waiting list to obtain access to public housing in NYC. They also find that the willingness to pay for a public housing unit by low-income households is approximately equal to $50,000.

4 This number does not include housing voucher programs, low-income community development programs, or other nonstate owned and managed housing programs.

Now you may argue that NYC is one of the most expensive metropolitan areas in the world. Clearly, the situation must be less severe in other US cities. That conjecture is partially correct. Nevertheless, we observe an excess demand for public housing in almost all metropolitan areas. Geyer and Sieg (2013) study public housing in Pittsburgh, a large city without the skyrocketing housing and rental prices of NYC. They find that for each family that leaves public housing there are on average 3.8 families that would like to move into the vacated unit. Demolitions of existing units increase the degree of rationing and potentially cause a welfare loss. Hence significant excess demand for public housing exists even in cities such as Pittsburgh.

Kling, Liebman, and Katz (2007) analyze the Moving to Opportunity experiment and find that moving to neighborhoods with lower poverty rates improved physical and mental health but produced mixed outcomes for children's behavior, with minimal impact on their employment outcomes. Geyer (2017) analyzes voucher recipients in Pittsburgh. She finds that individuals using vouchers live in neighborhoods with lower crime rates and better schools than the neighborhoods of public housing residents, suggesting that at least with respect to neighborhood quality, vouchers offer an improvement over public housing.

Note that public housing is more prevalent and less controversial in many countries outside the United States. A compelling example is Singapore. The vast majority of the residential housing developments in Singapore are publicly governed and developed. As of March 31, 2015, 82 percent of the resident population lived in long-term lease accommodations that were managed by the Singapore Housing and Development Board. The remainder of the population lived in rental flats, which are reserved for low-income households who are unable to afford the cheapest forms of public housing despite financial support.
23.7 Conclusions

Hedonic models are used to characterize the demand and supply for heterogeneous houses. In equilibrium, a household is matched with a firm that is willing to build the exact house that the household demands. The matching process can be decentralized via a hedonic pricing function. This function expresses the price of the house as a function of its characteristics. In equilibrium, the slope of the hedonic pricing function must be equal to the household's marginal willingness to pay for that characteristic. Housing characteristics typically include structural characteristics, such as the number of bedrooms and the lot size, as well as neighborhood amenities, local public goods, and services. We can estimate hedonic price functions using data on housing prices and observed characteristics. The estimated slopes of the hedonic price functions are informative about household preferences. We can thus estimate the marginal willingness to pay for improvements in amenities and public goods using hedonic regressions.

In addition to private housing, we also consider public housing. Public housing is provided by the Department of Housing and Urban Development in combination with local housing authorities that are managed by cities. In addition, HUD offers housing vouchers to low-income households. These vouchers let households rent in the private market at highly subsidized rates. Eligibility for public housing is primarily determined by income. There is a large shortage of public housing and vouchers in most cities.
436
Land and Housing Markets Finally, housing is subject to many local regulations. Zoning laws impose restrictions on what types of houses can be built in certain neighborhoods. Building codes affect the price and quality of new housing. Sometimes cities directly interfere with the market process by imposing rent control or rent stabilization policies. These types of restrictions affect both the demand and the supply of housing and can cause serious distortions in housing markets. These policies create mismatches in the housing market and reduce incentives for new housing construction.
23.8 Technical Appendix: Computing the Equilibrium in the Hedonic Model
Recall that we assume that houses differ by a one-dimensional index of quality, denoted by $z$. The price of a house of quality $z$ is given by $p(z)$. Note that prices do not have to be linear in $z$. There are a large number of firms that differ by productivity $\theta$. The costs of building a house of quality $z$ are given by
$$c(z; \theta) = \frac{z^{\beta}}{\theta} \tag{23.11}$$
For simplicity, let's assume that $\theta \sim U(0, 1)$. Each firm produces 1 house. Profits are thus given by
$$\Pi(z, p(z); \theta) = p(z) - \frac{z^{\beta}}{\theta} \tag{23.12}$$
Each firm chooses $z$ to maximize profits. The first-order condition of the profit maximization problem is given by
$$p'(z) = p_z = \frac{\beta z^{\beta-1}}{\theta} \tag{23.13}$$
The slope of the pricing function is equal to the marginal cost of quality. Solving for $\theta$ gives us the inverse supply function:
$$\theta(z) = \frac{\beta z^{\beta-1}}{p_z} \tag{23.14}$$
Households differ in preferences for quality, $v \sim U(0, 1)$. Each household buys 1 house and utility is given by
$$U(z, p(z); v) = v z^{\alpha} + m - p(z) \tag{23.15}$$
where $m$ denotes income. Note that utility is quasi-linear. Hence income does not affect the demand for housing in this specification. The first-order condition of the utility maximization problem is given by
$$v \alpha z^{\alpha-1} = p_z \tag{23.16}$$
FIGURE 23.5. Matching of firms and households in equilibrium. (Axes: $\theta$ and $v$.)
The slope of the pricing function is equal to the marginal willingness to pay for quality. Solving for $v$ gives us the inverse demand function:
$$v(z) = \frac{p_z}{\alpha z^{\alpha-1}} \tag{23.17}$$
Let us conjecture that higher-demand households (higher $v$) are matched in equilibrium with more efficient firms (higher $\theta$). Since both of these parameters have the same uniform distribution, it is natural to conjecture that the solution to the optimal assignment or matching problem is given by
$$v = \frac{p_z}{\alpha z^{\alpha-1}} = \frac{\beta z^{\beta-1}}{p_z} = \theta \tag{23.18}$$
Matching is along the 45-degree line, as illustrated in figure 23.5. This type of matching is also called perfect assortative matching in the literature: higher-demand types match with higher-supply types. To verify this conjecture, note that the fraction of households that would like to purchase a house with quality less than or equal to $z$ is given by
$$F_v\left(\frac{p_z}{\alpha z^{\alpha-1}}\right) = \frac{p_z}{\alpha z^{\alpha-1}} \tag{23.19}$$
The equality in (23.19) follows from the assumption that the distribution is uniform. Similarly, the fraction of firms that build houses with quality less than or equal to $z$ is given by
$$F_\theta\left(\frac{\beta z^{\beta-1}}{p_z}\right) = \frac{\beta z^{\beta-1}}{p_z} \tag{23.20}$$
Equilibrium requires that the demand equal the supply for each value of $z$:
$$\frac{p_z}{\alpha z^{\alpha-1}} = \frac{\beta z^{\beta-1}}{p_z} \tag{23.21}$$
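Solving the equation above implies that the slope of the equilibrium price function is given by
$$p_z = (\alpha\beta)^{1/2} z^{\frac{\alpha+\beta-2}{2}} \tag{23.22}$$
Integrating with respect to $z$ gives the equilibrium price function:
$$p(z) = (\alpha\beta)^{1/2} \, \frac{2}{\alpha+\beta} \, z^{\frac{\alpha+\beta}{2}} + C \tag{23.23}$$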
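where the constant of integration is determined by the condition that the lowest-productivity firm makes zero profits. Substituting the slope of the equilibrium price function into the first-order condition of the household problem yields
$$z^{d} = \left(\frac{\alpha}{\beta}\right)^{\frac{1}{\beta-\alpha}} v^{\frac{2}{\beta-\alpha}} \tag{23.24}$$
Similarly, a firm's supply is given by
$$z^{s} = \left(\frac{\alpha}{\beta}\right)^{\frac{1}{\beta-\alpha}} \theta^{\frac{2}{\beta-\alpha}} \tag{23.25}$$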
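The closed-form solution of the one-dimensional model can be checked numerically. The following is a minimal sketch, not part of the original derivation: the parameter values $\alpha = 0.3$ and $\beta = 0.7$, the choice $C = 0$, and all function names are illustrative assumptions. It evaluates the quality chosen by each type, plugs it into the inverse demand (23.17) and inverse supply (23.14), and confirms that both return the original type, so that households and firms with $v = \theta$ choose the same quality.

```python
import numpy as np

# Illustrative parameter values (assumptions); the model needs 0 < alpha < beta < 1.
ALPHA, BETA = 0.3, 0.7


def price(z, alpha=ALPHA, beta=BETA, c=0.0):
    """Equilibrium price function (23.23); c is the integration constant (assumed 0)."""
    return np.sqrt(alpha * beta) * 2.0 / (alpha + beta) * z ** ((alpha + beta) / 2.0) + c


def price_slope(z, alpha=ALPHA, beta=BETA):
    """Slope of the equilibrium price function (23.22)."""
    return np.sqrt(alpha * beta) * z ** ((alpha + beta - 2.0) / 2.0)


def quality(t, alpha=ALPHA, beta=BETA):
    """Quality chosen by type t, where t is v in (23.24) or theta in (23.25)."""
    return (alpha / beta) ** (1.0 / (beta - alpha)) * t ** (2.0 / (beta - alpha))


types = np.linspace(0.05, 1.0, 20)   # grid of household / firm types
z = quality(types)                   # quality chosen by each type
pz = price_slope(z)                  # hedonic price slope at that quality

v_implied = pz / (ALPHA * z ** (ALPHA - 1.0))     # inverse demand (23.17)
theta_implied = BETA * z ** (BETA - 1.0) / pz     # inverse supply (23.14)

print("demand side recovers v:    ", np.allclose(v_implied, types))
print("supply side recovers theta:", np.allclose(theta_implied, types))
```

Both checks hold for any admissible pair of parameters, which is the numerical counterpart of matching along the 45-degree line.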
Hence the matching is characterized by $v = \theta$ along the equilibrium path. Note that we need $0 < \alpha < \beta < 1$ for the model to be well behaved.
One nice feature of the hedonic model is that it can be extended to allow for more than one characteristic of housing. Let us think about $z = (z_1, \ldots, z_J)$ as a vector of observed characteristics such as size of the lot, size of the house, and quality features of the house. The optimization problem of the household can then be written as
$$\max_{c,\, z_1, \ldots, z_J} U(c, z_1, \ldots, z_J) \quad \text{s.t.} \quad c + p(z_1, \ldots, z_J) = y \tag{23.26}$$
Substituting the budget constraint into the utility function, we obtain
$$\max_{z_1, \ldots, z_J} U\big(y - p(z_1, \ldots, z_J),\, z_1, \ldots, z_J\big) \tag{23.27}$$
The first-order conditions are now given by
$$-U_c \, p_{z_j} + U_{z_j} = 0, \quad j = 1, \ldots, J \tag{23.28}$$
where $p_{z_j} = \partial p / \partial z_j$ and $U_{z_j} = \partial U / \partial z_j$. Rewriting the equation above, we obtain
$$p_{z_j} = \frac{U_{z_j}}{U_c} = MRS_{z_j, c}, \quad j = 1, \ldots, J \tag{23.29}$$
The slope of the hedonic price function is equal to the marginal willingness to pay for the attribute $z_j$, measured by the marginal rate of substitution. Again, we can solve this system of equations and obtain a system of inverse demand functions for each characteristic. Similarly, we can extend the decision problem of the firms and obtain the inverse supply functions for each characteristic. Setting demand equal to supply, we obtain a system of partial differential equations that need to be solved for the equilibrium pricing function.
In general, these models do not have closed-form solutions but must be solved numerically on the computer. An exception is the model in which preferences and technologies are quadratic functions in $z$. In that case, the first-order conditions are linear in $z$. If we also assume that heterogeneity in preferences and technology is normally distributed, then we can obtain a closed-form solution of the hedonic price function and the equilibrium matching rule. The derivation of this result is not difficult but involves some knowledge of matrix algebra. The interested reader should consult, for example, Epple (1987). Note that the matching of firms to households can be much more complicated in this model. In particular, we do not necessarily get the simple assortative matching equilibrium of the previous unidimensional model.
23.9 Debate: Zoning Policies in NYC
Should New York City actively use rezoning policies to force developers to provide subsidized affordable housing for lower-income households? The following questions may help structure the debate.
1. Is it a reasonable policy to require new developments in New York to also provide affordable housing?
2. Does the government have the right to impose such constraints on private companies and developers?
3. Should affordable housing be a burden on local taxpayers?
4. What is the benefit of having low- and moderate-income people living in New York City that makes it so essential that the local government accommodate them?
5. What would happen if these lower-income people lived somewhere else?
6. With higher wages and longer commutes (if lower-income people lived outside of NYC), how would we incentivize workers to commute to NYC?
7. How does government-subsidized housing affect the quality of life in low-income neighborhoods?
8. How would the policy of requiring higher minimum wages (instead of subsidized housing) affect the quality of life in low-income neighborhoods?
23.10 Problem Sets
1. Explain the concept of product differentiation. Use an example to illustrate this concept.
2. Explain why the slope of the hedonic equilibrium function provides you with some information about how households value housing quality on the margin.
3. Suppose you observe a random sample of housing prices and housing quality measures. Explain how you can estimate the unknown hedonic pricing function for this market, accounting for the fact that the underlying relationship between prices and quality may be nonlinear.
4. Explain why we can interpret the hedonic model as a matching model.
5. Suppose a city mayor proposes to pay for the acquisition and construction of public parks by implementing a seven-year local sales tax. Someone argues that the financing mechanism should instead be something that is spread out over more time, noting that a person who moves here eight years from now would get to enjoy the parks but under the current proposal wouldn't pay for them. Similarly, this individual argues that those who live in the community now pay for it, but if they move, they won't benefit at all from it after a few years. The mayor argues that those who move out of this area will in fact benefit from the parks beyond their personal use and that those who move to this area after the seven-year period will implicitly pay for access to the parks. What does the mayor mean?
6. You observe that almost identical houses on different sides of the border between two municipalities cost very different amounts of money. Provide an explanation that may rationalize the price differential. How would you measure the price differential using actual data?
7. Explain how housing market regulations can drive up the cost of homeownership and limit the supply of housing.
8. What are some policy options to discourage homeownership by foreign investors? What are the potential drawbacks of these policies?
9. Explain under what conditions rent stabilization programs may suppress investment in new housing.
10. Explain how rent stabilization programs can create "mismatch," i.e., a situation in which households prefer to live in a housing unit that is either too small or too large for their needs.
24
Local Labor Markets
24.1 Motivation
Metropolitan areas serve as local labor markets, matching workers to firms. When we analyze mobility and locational choices within a city or within a single local labor market, it makes sense to assume that household income is exogenous. All workers in a city participate in the same local labor market and, therefore, face the same wages. Despite the rise of the internet and telecommuting, most workers still tend to live in the same metropolitan area where their employers are located. We have seen that workers commute to jobs. Locational choices within a city primarily affect the distance and time to commute, but these choices typically do not have a significant impact on earnings.
In this chapter, we study mobility among metropolitan areas or local labor markets. We have already shown that there are large differences in firms' productivity and innovation rates across metropolitan areas. These differences in productivity imply that there are large differences in wages and earnings across local labor markets. Moreover, we know that the differences in local wages and earnings increase with the amount of human capital, or skills of workers. As a consequence, not all local labor markets are created equal. There is potentially much to be gained from finding the metropolitan area that offers the best match to a person's skills or the best career opportunities.
To illustrate this point, suppose you have been offered a job by Google. The firm gives you a choice: you can work in the headquarters in Mountain View, California, or in an office located in Pittsburgh, Pennsylvania. Which location do you choose? It is likely that the job in Mountain View will pay a higher salary than the job in Pittsburgh. (Why?) However, living in Silicon Valley is also considerably costlier than living in Pittsburgh. 
Not only is housing more expensive in Silicon Valley, but so are other nontradable goods and services, such as restaurant meals, dry cleaning, memberships in fitness studios, or haircuts. Hence you face a clear trade-off: the higher earnings in Silicon Valley are at least partially offset by the higher cost of living. Which location is better for you in the short run? Which location may be better for you in the long run? Note that the answers to these two questions do not necessarily have to be the same!
More generally, we show in this chapter that there exists a large urban wage premium. Firms located in large cities typically pay higher wages than firms located in small cities, holding observed skills and other personal characteristics constant. Moretti (2012) stresses that the size of the local labor market affects wages and earnings in at least three ways. First, "thick" local labor markets provide better matches between employers and employees and therefore tend to be more productive, more creative, and more innovative than "thin" markets. Second, large markets provide better matches between firms and suppliers of intermediate goods and services. Consider, for example, a start-up technology company located in Silicon Valley. It is a lot easier to find specialized lawyers and venture capitalists in the Bay Area than in Cleveland. Specialization also encourages productivity gains, which translate into higher wages and earnings. Finally, there are knowledge spillovers that arise due to random and not-so-random encounters in the city. The magnitude of these spillovers may also depend on city size.
Differences in nominal wages will overstate differences in welfare among cities. The reason for this is simple. Not only are wages higher in larger cities, but so is the cost of living. In particular, housing rents tend to be much higher in large metropolitan areas. As a consequence, you need to account for the fact that your earnings, as well as your cost of living, depend on local labor market conditions.
We have studied housing demand and supply, labor demand and supply, and locational choices of firms and households separately. We are now in a position to put the different components of our model together and characterize the equilibrium that arises in a system of cities with mobile firms and workers. The model also helps us disentangle the consumption and production value of a city. 
Cities differ not only in productive externalities but also in consumption amenities. Wages and rents are endogenous and reflect the attractiveness and productivity of a city in equilibrium. The workhorse in this literature is a model that goes back to Rosen (1979) and Roback (1982), which captures an intuitive notion of equilibrium across local labor markets within a country. We have already discussed Rosen's basic hedonic model in the previous chapter. We build on his insights in this chapter. Not surprisingly, these spatial general equilibrium models can be rather complicated since they have multiple markets and potentially much heterogeneity among firms and households. We develop a simple version of the model, which largely abstracts from heterogeneity, and then discuss how to make the model more realistic. After that, we explore the policy implications of these models.
24.2 The Urban Wage Premium
We begin our explorations of local labor markets by reviewing some basic empirical evidence. Local labor markets in the US exhibit large differences in wages and earnings. Moretti (2011) uses 2000 census data to study the difference in local wages for full-time workers in 288 metropolitan areas. He finds that the average high school graduate living in the median metropolitan area earned $14.10 for each hour worked in 2000. The tenth and ninetieth percentiles of the wage distribution across metropolitan areas were $12.50 and $16.50, respectively. For college
graduates, the average wages in the tenth and ninetieth percentiles were $20.50 and $28.50. Wages are thus not equalized across local labor markets, and large differences persist over time. Differences in nominal wages are partially offset by differences in the cost of living. Typically, we can measure these differences in price levels using a local consumer price index that accounts for differences in housing rents. Real wages are then computed as the ratio of nominal wages to the local price index. Moretti reports that the tenth and ninetieth percentiles of the real wage distribution for high school graduates were $10.00 and $11.70, respectively. For college graduates, he reports estimates of $16.70 and $20.40. We therefore conclude that there are significant differences in wages and earnings among metropolitan areas, even after adjusting for differences in the cost of living. Some of these spatial wage differences may be driven by differences in the distribution of observed characteristics, such as ability, experience, and other forms of human capital that may not be captured by the level of formal education. Moreover, there may be substantial differences by race, ethnicity, or gender among local labor markets. The empirical challenge is then to determine what fraction of wages or earnings can be explained by observed workers' characteristics and what fraction is due to differences across local labor markets. Suppose we have access to a large sample of workers, i = 1, ..., N. Some workers work in small cities, others in large cities. We observe each worker's wage w_i and the worker's characteristics x_i. Let d_i denote a dummy that is equal to 1 if worker i is in a large urban labor market and 0 otherwise. A simple approach to estimate the urban wage premium is to use the following regression model: (24.1) Note that this regression is another example of the hedonic model we used before to study land and housing markets. 
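To make the regression approach concrete, here is a minimal sketch on simulated data. Everything in it is an illustrative assumption rather than the specification or the data used in the studies cited: the dependent variable is taken to be the log wage, the worker characteristics are years of schooling and experience, and the "true" premium `alpha_true` is chosen arbitrarily. The coefficient on the large-city dummy is the estimated urban wage premium.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000  # number of simulated workers

# Simulated worker data (all values are illustrative assumptions).
educ = rng.integers(10, 19, size=n)      # years of schooling, 10 to 18
exper = rng.uniform(0, 30, size=n)       # years of experience
d = rng.integers(0, 2, size=n)           # 1 if the worker is in a large urban labor market
alpha_true = 0.10                        # assumed urban wage premium in log points

# Assumed data-generating process: log wage is linear in the dummy and the characteristics.
log_w = 1.5 + alpha_true * d + 0.08 * educ + 0.02 * exper + rng.normal(0, 0.3, size=n)

# Design matrix: constant, large-city dummy, observed characteristics.
X = np.column_stack([np.ones(n), d, educ, exper])
coef, *_ = np.linalg.lstsq(X, log_w, rcond=None)   # ordinary least squares
alpha_hat = coef[1]

print(f"estimated urban wage premium: {alpha_hat:.3f} (assumed true value {alpha_true})")
```

In this simulation location is assigned at random, so least squares recovers the assumed premium up to sampling noise; with real data that assumption is exactly what is in doubt.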
The coefficient α then measures differences in wages across the two types of labor markets that are not explained by differences in observed characteristics (such as education, experience, gender, or race). We can estimate similar regressions using total earnings or labor productivity as a dependent variable. Estimates of spatial differences in wages and earnings that are based on these types of hedonic regression models can be problematic for a variety of reasons. There are important factors that affect wages and earnings that are hard to measure. Ability, motivation, grit, and personality traits are examples of variables that are known to predict earnings but are not well measured in most datasets. In addition, workers do not make random locational choices. If the omitted variables are correlated with locational choices, then the measurement model above will result in a biased estimator of α. For example, suppose high-ability workers are primarily recruited by high-productivity firms that tend to be located in large cities. That may induce a positive correlation between (unobserved) ability and (observed) locational choice. As a consequence, our estimator will overstate the importance of location. One possible approach to deal with unobserved time-invariant characteristics is to use panel data, as we discuss in detail in the appendix. Using panel data, we can identify the urban wage premium using movers, i.e., individuals who moved
between local labor markets. Since ability is constant, it cannot explain the systematic and persistent wage gain or wage loss of a mover. But then again, the decision to move is not necessarily random either. Moretti (2012) documents that local labor market conditions have been diverging for the past three decades. At one extreme of the distribution are the cities that he calls "brain hubs." These include New York, San Francisco, Boston, and Raleigh-Durham's Research Triangle. These cities tend to attract and retain workers who are on average more productive and creative and hence higher paid than workers in cities that have struggled. At the other extreme of the distribution are former manufacturing cities, which are rapidly losing jobs and residents. Alternatively, empirical studies have tried to directly capture the magnitude of the externalities provided by a local labor market, using proxies such as firm density, employment density, or the percentage of college-educated workers. Again, all these direct measures of agglomeration externalities are potentially endogenous, requiring the use of an instrumental variable strategy. Let us briefly discuss a few well-known empirical studies that have attempted to characterize spatial differences in productivity, earnings, or wages. An early study is Ciccone and Hall (1996), who show that labor productivity varies dramatically across counties and states. Output per worker in the most productive state was two-thirds larger than in the least productive state in 1988. They investigate whether these differences are due to agglomeration externalities, using employment density as a measure of agglomeration externalities. As instruments they suggest historical variables such as the presence of railroads or the population density in the 1860s. They find that a doubling of the employment density in a county leads to a 6 percent increase in labor productivity. 
Rosenthal and Strange (2008) focus on the urban wage premium. They partially explain these spatial wage differences by human capital spillovers of college-educated workers. They use 2000 census data to estimate the impact of agglomeration and proximity to human capital on labor productivity. Agglomeration is measured in concentric rings from the place of work while productivity is measured by wages. Geologic features such as landslide hazard, seismic hazard, bedrock, and surface water are used as instruments to address measurement error in the agglomeration variables and endogeneity in the wage-agglomeration relationship. The paper finds that the benefits of spatial concentration are driven by proximity to college-educated workers. To give you an idea of the magnitude of this human capital spillover, transforming 50,000 less-than-college-educated workers within a five-mile radius into college-educated workers elevates the average worker's wage by 10 percent. This is roughly equivalent to the private returns from one additional year of schooling. Learning is also an important component of agglomeration economies. Economists tend to explain the returns on experience by a "learning-by-doing" argument. You can acquire general skills at school or in college. In contrast, job- or occupation-specific skills are typically learned on the job. Since it takes time to acquire these job-specific skills, they tend to be correlated with experience. Baum-Snow and Pavan (2012) show that there are large differences in the returns on experience between workers in different cities. Moreover, these differences are systematically linked to city size, i.e., the returns on experience are higher in large cities than in small cities. The evidence reported in Baum-Snow and Pavan then suggests that cities may facilitate the learning process. De La Roca and Puga (2017) provide some additional evidence from Spain that suggests workers obtain more valuable experience in large cities. Moreover, workers keep on benefiting from this experience even if they move back to smaller cities later in life.
24.3 Modeling Differences in Local Labor Markets
24.3.1 A Baseline Model
Let us consider a general equilibrium model with two cities. The model is in the spirit of Rosen (1979) and Roback (1982). These were the first papers that systematically analyzed mobility across local labor markets within a well-defined equilibrium model. To simplify the exposition of the main ideas, we abstract from heterogeneity among workers and firms as much as possible. The two cities are denoted $j = 1, 2$. There are two types of consumption goods: a composite good that is nationally traded and land that is locally traded. The composite good is produced by firms in both cities. A firm that is located in city $j$ produces output $Q_j$ using a production function given by
$$Q_j = A_j f(L_j, H_j) \tag{24.2}$$
where $L_j$ is labor input and $H_j$ is the amount of commercial land used in production.¹ $A_j$ measures the productivity of firms located in city $j$. At this stage, we assume that the productivity differences among cities are exogenous. We discuss how to incorporate endogenous agglomeration externalities below. For simplicity let us assume that city 2 is more productive than city 1. Hence we have $A_2 > A_1$. We also assume that there are no transportation costs for traded goods. Hence the price of the composite output good is the same in both cities. We can normalize the price of output to be equal to 1 because only relative prices matter. Firms need to pay the local wage $w_j$ to hire labor and the local rent $p_j$ to purchase land. The cost minimization problem of firms in city $j$ can be written as
$$\min_{L, H} \; w_j L + p_j H \tag{24.3}$$
$$\text{s.t.} \quad Q = A_j f(L, H) \tag{24.4}$$
Solving the firm's cost minimization problem, we obtain the cost function. The unit or average cost function for each firm is denoted by
$$C_j(w_j, p_j) \tag{24.5}$$
This function expresses the average cost of producing 1 unit of output in location $j$. For simplicity let us assume that the production function has constant returns to scale so that the per unit or average cost does not depend on the level of output $Q$. We assume that the output market is competitive and there is free entry among firms. Hence profits have to be 0. As a consequence, the average cost of producing the composite output good has to be equal to the price of output in both cities:
$$C_j(w_j, p_j) = 1, \quad j = 1, 2 \tag{24.6}$$
This condition defines downward-sloping indifference curves in the $(p, w)$ space for firms in city 1 and city 2. These indifference curves are downward sloping since a firm needs to face low wages if rents are high. It can afford high wages if rents are low. Figure 24.1 illustrates two downward-sloping indifference curves in the $(p, w)$ space. The downward-sloping lines show combinations of wages and rents that hold firms' unit costs constant within a city. For city 1, this requires that $C_1(p, w) = 1$. This locus is characterized by the lower line. For city 2, this requires that $C_2(p, w) = 1$. That locus is characterized by the upper line. Note that firms in city 2 are more productive than firms in city 1. Hence, they can afford to pay higher wages at the same level of rents to obtain the same unit costs.
¹ For simplicity we abstract from capital. But as long as the rental price of capital is the same in both locations, it is fairly easy to introduce capital into the model.
FIGURE 24.1. Equilibrium in a system of cities.
Let's consider an example that is based on a Leontief production function to illustrate these concepts. The production function is given by
$$Q_j = A_j \min\{\gamma_L L_j, \gamma_H H_j\} \tag{24.7}$$
where $\gamma_L$ measures labor productivity and $\gamma_H$ measures the productivity of land. Optimal input ratios must satisfy
$$\gamma_L L_j = \gamma_H H_j \tag{24.8}$$
With a Leontief production function, you need to use inputs in fixed ratios. Substituting the optimality condition into the production function gives us the following conditional factor demand functions:
$$L_j = \frac{1}{A_j \gamma_L} Q_j, \qquad H_j = \frac{1}{A_j \gamma_H} Q_j \tag{24.9}$$
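Substituting the conditional factor demand functions into the definition of costs yields the following cost function:
$$w_j L_j + p_j H_j = \left(\frac{w_j}{A_j \gamma_L} + \frac{p_j}{A_j \gamma_H}\right) Q_j \tag{24.10}$$
The unit or average cost function for each firm is given by
$$C_j(w_j, p_j) = \frac{w_j}{A_j \gamma_L} + \frac{p_j}{A_j \gamma_H} \tag{24.11}$$
Equilibrium requires that the price equal average cost:
$$1 = \frac{w_j}{A_j \gamma_L} + \frac{p_j}{A_j \gamma_H} \tag{24.12}$$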
Solving for $w_j$, we have the following indifference curve:
$$w_j = A_j \gamma_L - \frac{\gamma_L}{\gamma_H} p_j \tag{24.13}$$
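The Leontief zero-profit locus is easy to tabulate. The short sketch below uses made-up parameter values (all numbers and function names are assumptions for illustration) to compute the wage that firms in each city can pay at a given rent and to confirm that unit costs equal the output price of 1 along the locus.

```python
def unit_cost(w, p, A, gamma_L, gamma_H):
    """Leontief unit cost as in (24.11): w / (A * gamma_L) + p / (A * gamma_H)."""
    return w / (A * gamma_L) + p / (A * gamma_H)


def zero_profit_wage(p, A, gamma_L, gamma_H):
    """Wage at which unit cost equals the output price of 1, as in (24.13)."""
    return A * gamma_L - (gamma_L / gamma_H) * p


# Illustrative values (assumptions): city 2 is more productive than city 1.
A1, A2 = 1.0, 1.5
gamma_L, gamma_H = 2.0, 4.0

for p in (0.5, 1.0, 2.0):
    w1 = zero_profit_wage(p, A1, gamma_L, gamma_H)
    w2 = zero_profit_wage(p, A2, gamma_L, gamma_H)
    cost1 = unit_cost(w1, p, A1, gamma_L, gamma_H)
    print(f"rent {p:.1f}: city 1 wage {w1:.2f}, city 2 wage {w2:.2f}, city 1 unit cost {cost1:.2f}")
```

Both loci have the same slope; productivity only shifts the intercept, so at any given rent the wage in city 2 exceeds the wage in city 1.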
This indifference curve is downward sloping since a firm needs to face low wages if rents are high. Also, note that the intercept of the indifference curve is higher in the more productive city. Each firm can afford to pay higher wages at the same rents if it is located in a more productive city. Next, consider workers. We assume that workers' utility depends on a composite consumption good $c$ and residential land $h$. (We use lowercase letters for workers and uppercase letters for firms.) Assume a standard Cobb-Douglas utility function. Hence we have
$$U(c, h) = (1 - \beta) \ln c + \beta \ln h \tag{24.14}$$
In this case, $\beta$ measures the expenditure share of housing. The budget constraint is given by
$$w = c + p h \tag{24.15}$$
Variation in the cost of living only depends in this model on variation in the cost of land, $p$, since we assume that the price of consumption is equal to 1 in both locations. In practice, there are many other local goods and services that are not traded (haircuts, tennis lessons, etc.). All these nontraded goods and services have prices that depend on the location. In this model, we can just view land as a composite nontraded good. Workers maximize utility subject to a budget constraint by choosing quantities of consumption and residential land. Substituting the demand functions into the utility function, we can represent workers' preferences by an indirect utility, which is a function of the nominal wage and the price of land in a location. For the Cobb-Douglas case, we obtain
$$V(p, w) = \ln(w) - \beta \ln(p) + (1 - \beta) \ln(1 - \beta) + \beta \ln(\beta) \tag{24.16}$$
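The step from direct to indirect utility can be verified numerically. The sketch below assumes the standard Cobb-Douglas demand functions, $c = (1 - \beta) w$ and $h = \beta w / p$, which the text uses but does not write out; the parameter values and function names are illustrative assumptions. It checks that the demands exhaust the budget and that plugging them into the direct utility function reproduces the indirect utility.

```python
import math


def demands(p, w, beta):
    """Cobb-Douglas demands: share beta of income goes to land, 1 - beta to consumption."""
    c = (1.0 - beta) * w
    h = beta * w / p
    return c, h


def utility(c, h, beta):
    """Direct utility as in (24.14)."""
    return (1.0 - beta) * math.log(c) + beta * math.log(h)


def indirect_utility(p, w, beta):
    """Indirect utility as in (24.16)."""
    return (math.log(w) - beta * math.log(p)
            + (1.0 - beta) * math.log(1.0 - beta) + beta * math.log(beta))


beta, p, w = 0.3, 2.0, 10.0          # illustrative values (assumptions)
c, h = demands(p, w, beta)

print("budget spent:", c + p * h)
print("direct utility at the optimum:", utility(c, h, beta))
print("indirect utility:             ", indirect_utility(p, w, beta))
```

Any other affordable bundle yields lower utility, which can be confirmed by shifting spending between the two goods along the budget line.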
If we set the indirect utility function equal to a constant $\bar{V}$, we can derive the workers' indifference curves in the $(p, w)$ space. For the Cobb-Douglas case, indifference curves are given by
$$\ln(w) = \beta \ln(p) + \bar{V} - (1 - \beta) \ln(1 - \beta) - \beta \ln(\beta) \tag{24.17}$$
Indifference curves are upward sloping since a worker needs a higher wage to pay for higher rents to stay indifferent. With two locations, the indirect utility in city 1 is $V(p_1, w_1)$, and the indirect utility in city 2 is $V(p_2, w_2)$. Figure 24.1 also illustrates an upward-sloping indifference curve of a worker. It is useful to highlight the differences between firms and workers. Both firms and workers prefer lower rents. However, workers prefer higher wages while firms prefer lower wages. That explains the difference in the shapes of the indifference curves. The equilibrium that arises in this model is illustrated in figure 24.1. Equilibrium wages and rents are denoted by $(p_1, w_1)$ and $(p_2, w_2)$. Note that both rents and wages are higher in the more productive city (city 2). The upward-sloping line, denoted by $V(p, w) = \bar{V}$, represents the indifference curve for a worker evaluated at the equilibrium utility level, denoted by $\bar{V}$. Since workers are mobile, they must receive the same utility in both cities. Hence equilibrium requires that
$$V(p_1, w_1) = V(p_2, w_2) = \bar{V} \tag{24.18}$$
If this condition is violated, free mobility implies that all workers would prefer to live in the city with the higher utility. Another appealing feature of the model is that the high-productivity city is typically larger in size than the low-productivity city. This theoretical result is then consistent with the urban wage premium for large cities that we observe in the data.
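The equilibrium in figure 24.1 can be computed for the Leontief and Cobb-Douglas example. The sketch below is an illustration under stated assumptions, not the text's own procedure: the common utility level `v_bar` is taken as given, all parameter values are made up, and the function names are hypothetical. For each city it finds the rent at which the firms' zero-profit wage delivers exactly utility `v_bar` to workers, using bisection.

```python
import math


def indirect_utility(p, w, beta):
    """Cobb-Douglas indirect utility as in (24.16)."""
    return (math.log(w) - beta * math.log(p)
            + (1.0 - beta) * math.log(1.0 - beta) + beta * math.log(beta))


def zero_profit_wage(p, A, gamma_L, gamma_H):
    """Leontief zero-profit wage as in (24.13)."""
    return A * gamma_L - (gamma_L / gamma_H) * p


def city_equilibrium(A, v_bar, beta, gamma_L, gamma_H, iterations=200):
    """Find (p, w) where the firm locus crosses the worker indifference curve.

    Along the firm locus utility falls monotonically as the rent rises,
    so bisection on the rent finds the unique crossing.
    """
    lo = 1e-12                           # rent close to zero: utility is very high
    hi = A * gamma_H * (1.0 - 1e-12)     # rent at which the affordable wage is close to zero
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        gap = indirect_utility(mid, zero_profit_wage(mid, A, gamma_L, gamma_H), beta) - v_bar
        if gap > 0:
            lo = mid                     # workers are too well off: rent must be higher
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    return p, zero_profit_wage(p, A, gamma_L, gamma_H)


# Illustrative values (assumptions): city 2 is more productive than city 1.
beta, gamma_L, gamma_H, v_bar = 0.3, 2.0, 4.0, 0.0
p1, w1 = city_equilibrium(1.0, v_bar, beta, gamma_L, gamma_H)
p2, w2 = city_equilibrium(1.5, v_bar, beta, gamma_L, gamma_H)

print(f"city 1: rent {p1:.3f}, wage {w1:.3f}")
print(f"city 2: rent {p2:.3f}, wage {w2:.3f}")
```

Taking `v_bar` as given keeps the sketch short; a complete model would pin it down from the total number of workers and the supply of land in each city.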
24.3.2 Extensions
We can add a lot of bells and whistles to the model specification to make it more realistic. Here are some useful extensions that researchers have considered. First, we can allow for heterogeneity among workers along the lines discussed, for example, in Moretti (2004b). One promising approach assumes that workers differ exogenously by skill level. For example, there may be high-skill and low-skill workers. The production function uses high- and low-skill workers as input factors, in addition to land. These production functions are extensions of the examples we considered above. The wages for the different skilled workers are determined in a competitive equilibrium by the marginal product of labor. Second, we would like to add endogenous agglomeration externalities to the model. One way to accomplish that is to assume that the productivity in location j depends on the sorting of firms and workers. For example, we can assume that the externality parameter Aj is a function of the number of high-skill workers in the local labor market. Third, we can add exogenous differences in spatial amenities and public good provision to the model. In these models, cities affect production via agglomeration externalities and consumption via amenities and local public goods. Observed rents and wages can tell us about the relative importance of productivity and the amenities of cities. For example, Detroit has relatively high wages and low rents. This observation suggests high productivity and low amenity values. San Francisco has high rents and high wages, which suggests high productivity and high amenity values. Myrtle Beach and Honolulu have low wages and high rents, which suggests low productivity and high amenities. The basic intuition that we learn from the simple model above carries over to these more realistic models. Attractive cities tend to be larger in size or population than less attractive cities. Since workers and firms often do not internalize the externalities that they create, equilibria are not necessarily efficient. Hence there is some scope for public policies to improve the overall efficiency of the equilibrium allocation. Next we turn our attention to some promising attempts that have tried to quantify the magnitude of the relevant effects.
24.4 Theory and Measurement

Diamond (2016) considers a model with two types of skilled workers. She documents a large increase in the difference in wages between high school and college graduates over the past three decades: wages of high school graduates have stagnated while wages of college graduates have increased. The difference in wages between low- and high-skill workers is called a wage gap. The increase in the wage gap has been accompanied by a substantial increase in the geographic sorting of workers by skill. Cities that used to have a history of attracting college-educated workers are doing well, while blue collar cities such as Detroit, Cleveland, and Buffalo have struggled to stay competitive. Diamond shows that the increase in demand for high- and low-skill labor across cities is primarily due to local productivity changes. For example, Detroit experienced a negative shock due to the decline of manufacturing industries. At the same time, San Francisco and Boston benefited from positive shocks to the technology and service sectors. Cities that had a history of attracting college graduates became more desirable places to live and more productive for both high- and low-skill labor. The combination of desirable wages and amenities made college-educated workers willing to pay high housing costs to live in these cities. Lower-skill workers also found these cities desirable. However, they were less willing to pay high housing costs. As a consequence, they preferred to live in more affordable cities.
FIGURE 24.2. College-educated workers. (Christina Morillo/pexels.com)
Note that local amenities and prices for nontraded goods introduce a wedge between real and nominal income across cities. As a consequence, the college wage gap does not necessarily reflect a gap in economic well-being. As pointed out by Moretti (2004a, 2012), college graduates increasingly live in cities with high housing costs. He points out that the higher local price levels partially offset some of the benefits of higher wages. In a nutshell, it is true that you have higher earnings in New York, but most of the goods and services that you consume there are also much more expensive than in other cities, as we discussed in the chapter's introduction. These models provide a compelling explanation for these facts. We therefore need to be careful when we make welfare comparisons across cities. Desmet and Rossi-Hansberg (2013) also consider locational choices within a system of cities. There are three determinants in their model that affect the size of a city: productivity, amenities, and friction. Their findings are consistent with the notion that higher production efficiency and better amenities lead to larger cities but also to greater friction through congestion and other negative effects of urbanization. Surprisingly, they find that the potential welfare gains from optimally allocating households across space are small in the US. Differences in earnings are offset to a large degree by differences in local prices and negative friction. In contrast, they find much larger potential welfare gains from reallocating households in China. Albouy (2009) explores another friction that distorts locational decisions. He notes that workers in cities offering above-average wages pay approximately 27 percent more in federal taxes than otherwise identical workers in cities offering
below-average wages. This follows from the fact that income taxes are based on nominal annual income. As a consequence, the federal income tax provides incentives for households to move from high-cost to low-cost cities, holding earnings constant. Simulations of his model suggest that federal taxes lower long-run employment levels in high-wage areas by 13 percent and land and housing prices by 21 and 5 percent, respectively, causing locational inefficiencies costing 0.23 percent of total income, or $28 billion in 2008. Employment is shifted from the North to the South and from high-cost to low-cost cities. He suggests that the federal income tax should allow for deductions that index tax liabilities by the local cost of living. These changes in the tax code would improve locational sorting and overall aggregate efficiency. Another interesting study of regional labor markets is Yoon (2017), who studies the decline of the Rust Belt. He considers a model with multiple sectors that allows him to differentiate between the manufacturing sector and the service sector. In 1960 manufacturing firms in the Rust Belt were approximately 10-15 percent more productive than firms in the rest of the United States. By 2010 this relative advantage had been almost completely eliminated. As a consequence, the decline of the Rust Belt is partially due to the fact that firms in other regions in the US caught up with firms in the Rust Belt. This story is clearly true for the automobile sector. Many foreign car makers invested heavily outside the Rust Belt and built plants that were as productive or even more productive than plants operated by car manufacturers in Michigan and Ohio. At the same time places outside the Rust Belt also had weaker labor unions, which added to the cost advantages of these locations (Holmes, 1998). According to Yoon, the transition of the US economy to a service sector economy is a less significant factor in the decline of the Rust Belt.
Overall, the decline primarily affected less educated and less mobile, older households.
24.5 Location-Based Policies

Moretti (2011) defines location-based policies as government interventions aimed at reallocating resources from one location to another. These policies are widespread both in the US and in the rest of the world. State and local governments in the US spend tens of billions of dollars each year to entice firms to relocate, to attract new firms, and to retain older firms. The recent bidding war for Amazon's second headquarters provides a great example of state and local governments trying to influence the locational choice of a large firm. The example also highlights the potential political problems associated with offering large subsidies to highly profitable firms. As we have discussed in previous chapters, local and state governments use a variety of tools to attract firms, such as subsidies, tax incentives, and subsidized loans. They can build infrastructure and industrial parks; state and local governments typically play important roles in training the local workforce by financing primary, secondary, and higher education; and, more generally, states and cities compete based on personal and corporate tax policy as well as labor and environmental regulations. In a world with significant agglomeration spillovers, it may be efficient from the perspective of a local government to target the relocation of high-productivity
companies. The results in Greenstone, Hornbeck, and Moretti (2010) suggest that there can be substantial spillover effects on the productivity of existing incumbent firms. Moreover, attracting new firms and workers creates additional demand for local goods and services and increases the fiscal capacity of the city. All that can be desirable from the perspective of the local government. Nevertheless, these types of policies do not necessarily increase aggregate welfare. A relocation of a firm benefits the "winning" city and harms the "losing" city. These types of relocations are not necessarily zero-sum games, but can create aggregate losses if firms move to low-productivity cities. The US federal government also promotes several location-based policies. The Tennessee Valley Authority and the Appalachian Regional Commission are important historical examples of large federal programs that targeted poor rural areas for development aid. As we discussed in detail in chapter 17, it makes more sense to target workers than locations as long as workers are sufficiently mobile. If workers are not mobile, place-based or location-based policies may be necessary to address large inequalities within a country.
24.6 Conclusions

In this chapter, we have characterized differences in local labor markets. We have seen that there are large and persistent differences in nominal and real wages among metropolitan areas. We have also studied spatial equilibrium models of local labor markets along the lines pioneered by Rosen (1979) and Roback (1982). These models focus on household and firm choices within a system of cities. In equilibrium, local wages and rents reflect differences in both productivity and agglomeration externalities among cities. More productive cities, then, have higher wages and higher rents in equilibrium. They also tend to be larger in size. In modern versions of these models, households differ exogenously by skill. Cities differ by their production technologies and, therefore, offer different wages for skilled and unskilled workers. These models also account for endogenous spillovers that may arise from the composition of the workforce. More educated and more skilled workers tend to generate positive externalities for other workers, which can act as a multiplier effect. Spatial equilibrium models provide useful insights into a variety of topics, such as the importance of local and regional economic shocks as well as agglomeration externalities and endogenous amenities. They explain the observed recent trends in mobility and sorting of firms and workers in the US and other countries. For example, they help us understand why cities with a high percentage of college-educated workers, such as Boston, Washington, DC, San Francisco, and Austin, have continued to improve over the past several decades, while cities with a low number of college-educated workers, such as Cleveland and Detroit, have continued to decline. Local governments have strong financial incentives to attract high-productivity firms and high-skill workers that are likely to provide significant positive externalities for the local economy.
As a consequence, we see much competition among states and cities in the United States and other countries. Some of this competition is clearly desirable since it forces municipal and state governments to offer attractive conditions for firms and high-skill workers. This type of fiscal competition
clearly imposes limits on state and local taxation of high-productivity firms and high-skill workers. However, not all competition is good. While some types of subsidies may be individually rational, they do not necessarily increase aggregate welfare. Federal policies also tend to have an impact on the spatial distribution of economic activity. Some federal policies are desirable to address inequalities in income and welfare within the country. However, some policies have unintended negative consequences. Federal income tax policies determine the individual tax burden based on nominal rather than real income. If there are significant price differences across cities, the federal income tax creates undesirable incentives for households to relocate to less productive, lower-cost cities.
24.7 Debate: Attracting the Creative Class

Richard Florida (2002) conjectures that cities and metropolitan areas with high concentrations of creative individuals, such as artists, musicians, researchers, scientists, professionals, and entrepreneurs, have a higher level of economic development. Florida argues that the "creative class" fosters an open and dynamic urban environment that stimulates economic growth and innovation. Discuss some of the pros and cons of policies that try to appeal to the creative class. The following questions may help structure the debate.

1. What types of individuals constitute the creative class, according to Florida?
2. What do these individuals have in common?
3. How do these individuals differ?
4. What is the empirical evidence that these different types of individuals have a disproportionate impact on the local economy?
5. What are the policy implications that follow from his analysis?
6. What are some potential policy mistakes that follow from his analysis?
24.8 Problem Sets

1. Define the concept of a local labor market.
2. Why does it make sense to differentiate between different geographical labor markets in the US?
3. What's the empirical evidence that suggests that differences in local labor markets are important?
4. Derive the unit cost function for a Cobb-Douglas technology.
5. Derive the indifference curves for firms and verify that they are downward sloping.
6. Discuss policy options for cities to attract high-skill young workers.
7. What are location-based policies? Provide two examples and explain the objective of these policies. Under what circumstances do we prefer location-based policies over people-based policies?
25
Homeownership, Mortgage Markets, and Default
25.1 Motivation

There are a number of idiosyncratic aspects of housing that arise from the fact that houses not only provide consumption flows but also are durable assets. For the majority of US households, real estate investments are the only significant assets held in their portfolios once we control for mandatory retirement savings. Owner-occupied housing solves the moral hazard problem of renting, which arises from the fact that renters have only limited incentives to maintain properties. Moreover, it can be difficult to evict renters who have failed to make the contractual rental payments. Owner-occupied housing solves these problems but creates a financing problem. Most potential owners do not have a sufficient level of wealth to purchase a house. Many owners, therefore, need loans to purchase houses. Loans that are secured by real estate assets are called mortgages. Loans solve the financing problem since the owner only needs a small fraction of the housing value for a down payment to purchase the house. A bank will typically lend the owner the rest. Hence housing is the only asset that allows moderately wealthy households to speculate in an asset market. Mortgage markets can only function if default rates are sufficiently low. This requires banks to scrutinize mortgage applicants. Banks need to make sure that only those applicants are approved for a mortgage who are likely to pay it back. As a consequence, many potential homeowners may have problems receiving a mortgage. To increase the liquidity in the mortgage market, the federal government created a number of specialized agencies that purchase mortgages from banks and issue mortgage-backed securities. These agencies also provide subsidized mortgage insurance for moderate-income households. As a consequence, the US government de facto owns or guarantees the value of a large portion of the national housing stock. The market for mortgage-backed securities lacks transparency.
Mortgage brokers and local banks may not have strong incentives to screen mortgage applicants since their primary incomes are based on commissions. As long as housing prices
are stable or increase, default rates will be low. In good times, mortgage-backed securities yield higher returns than comparable investments, such as long-term corporate bonds. Everything changes when housing markets collapse, as happened in the US in 2007. Due to an increase in default rates, banks and government agencies that held large portfolios of mortgage-backed securities experienced significant losses and went into bankruptcy. Some of them had to be bailed out by the federal government to avoid a banking crisis. We discuss these issues in more detail in this chapter.
25.2 Measuring the Evolution of Housing Prices in a Market

To understand real estate investments, we need to be able to measure housing prices across time in different local markets; we need to construct a time series of housing prices for each market. It is difficult to compute housing price indexes since houses are heterogeneous and differ by many characteristics, as we saw in chapter 23. Moreover, only a small fraction of all houses is sold each year. While we do not have transaction prices for all houses at each point in time, we have a sufficient number of housing transactions to estimate a housing price index using a modified hedonic regression model. The most famous price index is calculated from data on multiple or repeat sales of single-family homes, an approach developed by economists Karl Case, Robert Shiller, and Allan Weiss. To construct a repeat-sales index, you need to observe at least two transactions for the same housing unit. To illustrate the main issues that arise in the construction of a repeat-sales index, we consider the following example, which is discussed in detail in Silverstein (2014). Table 25.1 shows three different housing transactions. House 1 was sold in periods 0 and 1, house 2 was transacted in periods 1 and 2, and house 3 in periods 0 and 2. Note that houses sell at different points of time, and the time interval between two sales is not necessarily the same. Moreover, the houses clearly differ in quality since they transact at very different price levels. Some of the price changes observed in the data may be due to renovations or depreciation. Let us, therefore, restrict our sample to transactions for which the quality of the houses is approximately constant between the two sales. For these properties, the observed price changes must be due to changes in the relative price of housing over time, not changes in housing quality. Let us denote the level of the price index at time $t$ by $I_t$.
We can normalize the price index in the baseline period ($t = 0$) to be equal to 100 ($I_0 = 100$) since we only care about relative prices. If the prices of all properties followed the same law of motion as the index, then the index would satisfy the following system of three equations:

$$\frac{194}{200} = \frac{I_1}{I_0}, \qquad \frac{400}{420} = \frac{I_2}{I_1}, \qquad \frac{130}{110} = \frac{I_2}{I_0} \tag{25.1}$$
TABLE 25.1. Example of a Repeat-Sales Index

           Period 0     Period 1     Period 2
House 1    $200,000     $194,000     (no sale)
House 2    (no sale)    $420,000     $400,000
House 3    $110,000     (no sale)    $130,000
Note that this system of equations does not have a solution for the example above. Hence we need to add some noise to the model to account for differences between the law of motion of the price index and the law of motion of prices of individual houses. Taking the logarithm of both sides of the equations above and adding an error term for each house yields the following regression model:

$$\begin{aligned}
\ln\left(\frac{194}{200}\right) &= \ln(I_1) - \ln(I_0) + u_1 \\
\ln\left(\frac{400}{420}\right) &= \ln(I_2) - \ln(I_1) + u_2 \\
\ln\left(\frac{130}{110}\right) &= \ln(I_2) - \ln(I_0) + u_3
\end{aligned} \tag{25.2}$$
where $u_i$ is the error term of house $i = 1, 2, 3$. We can estimate $I_1$ and $I_2$ using ordinary least squares (OLS). In this example, we obtain $I_1 = 105.3$ and $I_2 = 108.9$. Note that the index is increasing despite the fact that two of the three houses experienced relatively small price decreases during the time period. More generally, let us denote the transaction price of house $i$ at time $t$ by $V_{it}$. We can write the law of motion of two transactions at time $t$ and $t + s$ using the following regression model:

$$\ln\left(\frac{V_{it+s}}{V_{it}}\right) = \ln(I_{t+s}) - \ln(I_t) + u_{it} \tag{25.3}$$
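The three-house example can be checked numerically. The following is a minimal sketch, not the procedure used by any index provider: it builds the regression of log price ratios on period indicators and solves the OLS normal equations with a small hand-written solver. The function names (`solve`, `repeat_sales_index`) and the tuple layout of `sales` are assumptions made for illustration, and the sketch assumes every period after the baseline appears in at least one sale pair.

```python
import math

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def repeat_sales_index(sales, n_periods, base=100.0):
    """OLS estimate of I_1, ..., I_T with I_0 normalized to `base`.

    `sales` holds one tuple per house: (first period, second period,
    first price, second price).
    """
    k = n_periods - 1  # unknowns: ln(I_t / I_0) for t = 1, ..., T
    X, y = [], []
    for t0, t1, p0, p1 in sales:
        row = [0.0] * k
        if t1 > 0:
            row[t1 - 1] = 1.0    # +ln(I_{t+s})
        if t0 > 0:
            row[t0 - 1] = -1.0   # -ln(I_t)
        X.append(row)
        y.append(math.log(p1 / p0))
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    beta = solve(XtX, Xty)
    return [base] + [base * math.exp(b) for b in beta]

# Transactions from Table 25.1 (prices in dollars)
sales = [
    (0, 1, 200_000, 194_000),  # house 1
    (1, 2, 420_000, 400_000),  # house 2
    (0, 2, 110_000, 130_000),  # house 3
]
index = repeat_sales_index(sales, n_periods=3)
print([round(level, 1) for level in index])
```

Under these assumptions the estimates come out at roughly 105.3 and 108.9, in line with the values reported in the text. With a larger sample the same function can be called with more sale pairs and a larger `n_periods`.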
If we have a large enough sample of repeat-sales transactions that cover the full period from 0 to $T$, we can estimate the sequence of price indexes $I_0, I_1, \ldots, I_T$ using OLS. One problem associated with this approach is that housing transactions are not random events. This methodology can be problematic during a housing market boom when professional investors start "flipping" houses for short-term gains. That can create a serious selection problem that may bias the estimator for the index. Nevertheless, most economists and professionals in real estate finance routinely use these types of price indexes. Figure 25.1 shows the evolution of housing prices in the US since 1996. These estimates are based on data provided by Zillow Inc. Note that real housing prices fluctuate over time. The rise in housing prices during the "bubble period" (1998-2007) was unprecedentedly large. During that time period, housing prices almost doubled in the US, more than doubling in some selected metropolitan areas. Similarly, the collapse of housing prices during the Great Recession
457
Chapter 25 250,000
200,000 X
~
150,000
C:
$ ..Q
N
100,000
50,000
FIGURE 25.1. Housing prices in the US since 1996. (Author's calculations based on Zillow data)
(2007-2011) was also unprecedented in its steepness.1 Since 2011 housing prices have recovered. In some places, they have reached new all-time highs. Note that there is a lot of heterogeneity in housing price evolution. Some cities, such as NYC, Washington, DC, Miami, and San Francisco, have experienced large swings in housing prices during the past twenty years, while other cities, such as Dallas, Houston, Cleveland, and Detroit, have had fairly stable price patterns. We have seen in previous chapters that local institutional features, such as zoning laws and building codes, as well as the availability of land for new construction determine the price elasticity of housing supply. Cities that have a fairly price elastic housing supply rarely experience large swings in housing prices.
25.3 The Moral Hazard of Renting and the Cost of Homeownership

Given that a majority of households do not have enough resources to purchase a house and housing is a rather risky investment, this raises the following question: Why don't we all rent houses? What drives risk-averse individuals into assuming the risks of ownership? The main answer is that renting a house creates a serious moral hazard problem for the owner since the renter has little incentive to maintain the house. Renting also creates a monitoring and screening problem since it is difficult and costly to evict a renter who has defaulted on rental payments. Owner-occupied housing solves both problems. We consider a model to compute the costs of home ownership. Table 25.2 summarizes the notation of the main variables of the model.

1 A comprehensive survey of the literature that summarizes the impact of housing finance on the macroeconomy is given by Davis and van Nieuwerburgh (2015).
TABLE 25.2. Chapter Notation

Variable    Definition
V           House value (transaction price)
UC(V)       User cost of owning a house
v           Rental rate of housing
r           Interest rate
τ           Property tax rate
δ           Depreciation rate
d           Down payment
p           Default probability
k           Recovery rate
You may ask yourself how expensive it is to own a property with a given value of $V$. How big are the annual costs in a perfect world without uncertainty? There are at least three factors that you need to consider when purchasing a house. First, most households do not have a sufficient amount of assets to make the purchase. Instead, they have to borrow money from a bank. Even if you use some of your own wealth as part of a down payment, there are opportunity costs. So let us assume that the cost of borrowing is equal to the opportunity cost of investing and equal to the interest rate $r$. So the total interest payments per period are given by $rV$. In addition, you have to purchase homeowners insurance at a given rate, but these payments are relatively low compared to the interest payments. So let's ignore insurance for the time being. Second, you have to pay property taxes. Let us assume that the city uses a constant property tax rate that is equal to $\tau$. Total property tax payments per period are then given by $\tau V$. Finally, you need to pay for repair and maintenance to keep the quality of the house constant. Let us assume that there is a constant depreciation rate equal to $\delta$. Hence total repairs per period are given by $\delta V$. If we ignore federal taxes and the fact that mortgage payments are tax deductible, we obtain the following equation for the total annual user costs of owning a house:

$$UC(V) = (r + \tau + \delta)\, V \tag{25.4}$$
The per dollar user cost of housing is, therefore, given by $(r + \tau + \delta)$. Currently, long-term interest rates are approximately 4 percent. Taxes and insurance can be another 1.5 percent. Maintenance can be somewhere between 1 and 2 percent for most houses unless they are really old or have been poorly maintained. In this example, the user cost of housing should be somewhere between 6.5 and 7.5 percent of the house value. Of course, you do not have to purchase a house. You can always rent an equivalent unit. Let $v$ denote the rental price of the same house. A measure of how expensive ownership is relative to renting is given by the ratio of the rent divided by the value of the house, denoted by $v/V$. Using the techniques discussed in Epple, Quintero, and Sieg (2020), we can estimate the rent-to-value function. The results for the city of Miami are shown in figure 25.2. Recall that Miami was one of the cities that experienced a very large run-up in asset prices during the "bubble period."
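To make the comparison between the user cost in equation (25.4) and the rent-to-value ratio concrete, here is a small sketch. The rates are the ones quoted in the text (4 percent interest, 1.5 percent taxes and insurance, 1 to 2 percent maintenance); the house value of $300,000 and the annual rent of $18,000 are hypothetical numbers chosen only for illustration, and the function name `user_cost` is likewise an assumption.

```python
def user_cost(value, r, tau, delta):
    """Annual user cost of owning, equation (25.4): (r + tau + delta) * V."""
    return (r + tau + delta) * value

r = 0.04      # interest rate quoted in the text
tau = 0.015   # taxes and insurance quoted in the text
value = 300_000       # hypothetical house value V
annual_rent = 18_000  # hypothetical annual rent v for an equivalent unit

low = user_cost(value, r, tau, delta=0.01)   # maintenance at 1 percent
high = user_cost(value, r, tau, delta=0.02)  # maintenance at 2 percent
rent_to_value = annual_rent / value

print(f"user cost: ${low:,.0f} to ${high:,.0f} per year")
print(f"rent-to-value ratio: {rent_to_value:.3f}")
# Renting is the cheaper option whenever v/V lies below the per dollar user cost.
print("renting cheaper at low maintenance:", rent_to_value < r + tau + 0.01)
```

With these hypothetical inputs the rent-to-value ratio of 0.06 falls below the per dollar user cost range of 0.065 to 0.075, so renting would be the cheaper option in this frictionless comparison.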
FIGURE 25.2. Rent-to-value functions in Miami, 1995-2007. (Horizontal axis: house value in thousands of dollars, from 50 to 500; vertical axis: rent-to-value ratio, from 0.025 to 0.075.)
Figure 25.2 shows the estimated rent-to-value function for three time periods: 1995, 2002, and 2007. Housing quality is measured by house values in 1995 and ranges between $50,000 and $500,000. The rent-to-value function is flat and ranges between 0.06 and 0.07 in 1995. We see quite a drastic change in these rent-to-value functions in a fairly short period of time. By the end of 2007 the ratios range between 0.035 and 0.046. This is consistent with the notion that rents were fairly stable during the bubble period, while housing prices increased by 50-70 percent. How can we explain this drastic change in the rent-to-value functions? The average thirty-year mortgage rate was 7.95 percent in 1995. It declined to 6.34 percent in 2007. Hence credit became cheaper between 1995 and 2007. There is also some compelling evidence that applicants with low credit ratings had a much easier time obtaining mortgages during the bubble period. Figure 25.2 shows that the largest changes in the rent-to-value ratio are for lower-quality houses. This finding is consistent with the notion that demand for these assets may have increased more strongly due to changes in credit markets. The costs of owning a house also depend on federal income tax policies. First, the home mortgage interest deduction allows taxpayers who own their homes to reduce their taxable income by the amount of interest paid on the mortgage, thus effectively lowering the after-tax interest payments. A home mortgage interest deduction is, therefore, a subsidy to homeowners that is likely to increase the demand for owner-occupied housing. Who is likely to benefit from this policy? Does this policy favor inner cities or suburban communities? Coen-Pirani and Sieg (2019) study the impact of the Tax Cuts and Jobs Act (TCJA) of 2017, which represents the most comprehensive reform of the federal
FIGURE 25.3. Should you buy or should you rent? (Photo by author)
tax code since 1986. The primary objective of this tax policy change is to lower corporate and personal income tax rates in the US. To partially finance these reductions, the TCJA caps the deductibility of state and local taxes for the purpose of federal taxation. From a geographic perspective, these deductions are unevenly distributed across states, with two states characterized by relatively high income and high taxes-California and New York-jointly accounting for about one-third of nationwide state and local tax deductions in 2014. By capping state and local tax deductions, the TCJA effectively and significantly raises local taxes and the effective price of housing for high-income households in New York and California.
25.4 A Case Study: Why Are Young Americans Not Buying Houses?

Since the bursting of the housing bubble, many households between the ages of twenty-five and thirty-four have not purchased houses or condos. Homeownership rates hit forty-eight-year lows in 2015. According to a Zillow research report
released in August 2015, first-time homebuyers are older and more likely to be single than first-time homebuyers in the 1970s and 1980s. Zillow's study found that younger households in 2015 preferred to rent for approximately 6 years before buying their first homes. In the 1970s they rented for an average of 2.6 years. Real estate has clearly become less affordable. In the 1970s first-time homebuyers bought homes that cost about 1.7 times their annual income. Now they're buying homes that cost 2.6 times their annual income. What explains these recent trends in housing markets? Younger, higher-productivity households prefer to live in coastal cities with growing job markets (Moretti, 2012). Coastal cities tend to be much more expensive than cities that are not near the coast. Think about New York, Boston, Washington, Seattle, San Francisco, and Los Angeles as some of the primary targets of younger and more educated households that are most likely to delay homeownership. Zillow also reports that the average age of first-time homebuyers is about thirty-three. Their median income is $54,340, which is about the same as what first-time homebuyers made in the 1970s when adjusted for inflation. However, real estate prices tend to be much higher in 2015 than in 1979, even after adjusting for inflation. As a consequence, a simple application of the user cost formula that we studied in the previous section suggests that renting is a much more desirable alternative for most young households that live in coastal cities. The relative increase in housing prices also implies that younger households are struggling to save for a down payment. Since 1990 the median home price has gone up roughly $40,000, while median income has only risen $2,000. In addition, many young households have accumulated student debt, which makes the prospect of taking on mortgage debt less appealing.
In addition, there are other social changes that are likely to contribute to the low rate of homeownership among young adults. Marriage and cohabitation rates are also at or near an all-time low among twenty-five- to thirty-four-year-old households. Not surprisingly, married or cohabitating young adults rarely live with their parents. Singles are much more likely to live with their parents or with friends and roommates in rental units. The lack of interest in homeownership may, therefore, be the consequence of a larger cultural shift in attitudes toward marriage and children.
25.5 Mortgages and Default

Most prospective homeowners cannot afford to buy a house without receiving a loan or some sort of financing from a bank. Loans that are used for the purpose of buying houses are typically called mortgages. Home financing creates some serious complications, the most important of which is the potential default of a homeowner. In a perfect world, default does not occur and home financing would be simple. However, we do not live in a perfect world. Default occurs when a homeowner stops making payments on a loan. When a borrower defaults on a loan, the lender runs the risk of losing money. The bigger the loan, the larger the potential loss. There are two mechanisms that a bank uses to protect itself from potential losses in case of a default. First, the bank typically does not lend the full amount of the
purchase price to the owner. Instead, it requires the new homeowners to invest some of their own wealth when they purchase a house. The owner's share of the initial investment is called a down payment. Down payment requirements range from 5 to 15 percent in most cases. Because of the down payment requirements, most young households cannot become owners. They first need to save enough in order to have enough wealth to use as a down payment. Of course, if you have rich parents or grandparents, they may help you out! If you don't, you may be able to get mortgage insurance to help you out or you may qualify for a subsidized first-time homeowner program; we'll get to that later. Second, the bank secures its loan using the house as collateral. What does that mean? A mortgage is a secured loan in which the borrower pledges the house as collateral for the loan. In the event that the borrower defaults, the bank takes possession of the house. The bank may sell it to regain some or all of the amount originally lent to the borrower. This process of selling the house to recover the mortgage in case of default is also called foreclosure. That's the bad news. If you default on your mortgage, the bank will eventually force you out of the house, assume ownership of the house, and sell it to recover some or all of the outstanding debt. The good news is that a mortgage is a non-recourse loan. What does that mean? The house that is used as collateral is the only security or claim the bank has against you. The bank has no further recourse against the borrower for any deficiency remaining after foreclosure against the property. To put this into plain English, if you default on a mortgage you do not have to declare personal bankruptcy. The worst thing that can happen to you is that you lose your house and with it your down payment and any other amount you may have invested in the house. Let's consider an example. 
Suppose you want to buy a house that costs $500,000, putting 10 percent down. You obtain a thirty-year mortgage of $450,000 with a 5 percent interest rate. Your monthly payment is $2,415.70. Property taxes can add another $500-$1,000 to your monthly payment. Homeowners insurance adds another $100. The good news is that you can deduct your mortgage interest payment from your taxable income when you compute federal income tax payments as long as you do not have to pay the alternative minimum tax. Before the tax reform in 2017, you could also deduct state and local taxes in full. Now these deductions are capped at $10,000. Over the course of the thirty years, you will pay $419,651 in interest payments and $450,000 in principal payments. As a rule of thumb, the home-value-to-income ratio should be approximately 3 to 1. Hence, you need to make about $167,000 in annual household income. Your after-tax monthly income is approximately $10,000. Your monthly payments are approximately $3,000, which means you spend 30 percent of your after-tax income on your home. If nothing really bad happens, you will fully own the house after thirty years.
Let's suppose the economy tanks right after you purchase the house. The value of your house falls to $380,000, you lose your job, and hence you cannot make any additional mortgage payments. That's bad luck. I hope it never happens to you. In that case, you would probably have to default on the mortgage. Eventually, the bank sells the house for $380,000 in the market. You lose $50,000, which equals your down payment. The bank loses another $70,000, which is the difference between the outstanding loan amount and the new sales price.
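The arithmetic in this example can be checked with a short script. The sketch below assumes a standard fully amortizing, fixed-rate loan with monthly compounding, which the text does not state explicitly; the function name `monthly_payment` and the variable names are illustrative only.

```python
def monthly_payment(principal, annual_rate, years):
    """Fixed monthly payment on a fully amortizing loan (standard annuity formula).

    Assumes monthly compounding at annual_rate / 12; this is an assumption,
    not something spelled out in the text.
    """
    r = annual_rate / 12          # monthly interest rate
    n = years * 12                # number of monthly payments
    if r == 0:
        return principal / n      # no interest: spread principal evenly
    growth = (1 + r) ** n
    return principal * r * growth / (growth - 1)


price = 500_000
down_payment = 0.10 * price       # 10 percent down
loan = price - down_payment       # amount borrowed

payment = monthly_payment(loan, 0.05, 30)
total_interest = payment * 30 * 12 - loan

# Default scenario from the text: the house is resold right after purchase,
# so the outstanding loan is still (approximately) the full amount borrowed.
resale_value = 380_000
owner_loss = down_payment
bank_loss = loan - resale_value

print(f"monthly payment: {payment:,.2f}")
print(f"total interest:  {total_interest:,.0f}")
print(f"owner loss:      {owner_loss:,.0f}")
print(f"bank loss:       {bank_loss:,.0f}")
```

Under these assumptions the script should come out close to the figures quoted in the example: a payment of roughly $2,415.70 per month, an owner loss of $50,000, and a bank loss of $70,000.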
FIGURE 25.4. Call options. [Figure: payoff of a call option plotted against the price of the underlying asset, with the strike price marked.]
To gain some additional intuition into the default process, it is useful to compare purchasing a house with purchasing a call option in the stock market. What is a call option? The buyer of the call option has the right but not the obligation to buy an agreed quantity of a particular asset from the seller of the option at a certain time (the expiration date) for a certain price (the strike price). The price of a call option is difficult to compute since it depends on the expectations about the future price of the asset. Even if the call is out of the money, i.e., if the current price of the asset is below the strike price, the call option still has a positive value since it is possible that the price of the asset will rise over the strike price before the expiration date of the option. 2 Figure 25.4 illustrates the payoffs of a call option as a function of the underlying price of the asset. To understand a transaction in which an individual buys a home using a large mortgage, it is useful to think about it in terms of buying a call option. The bank de facto owns the house and rents the house to the owner. The owner can buy the house back from the bank at any time by paying off the mortgage (purchase option). Effectively, the owner does not own the house until the mortgage is paid off, but owns a call option. The initial down payment is the price of the call option. The monthly interest payments are the rental payments. The strike price is the remaining principal. Each monthly payment also reduces the strike price of the call. So when you pay back your thirty-year mortgage you are effectively buying a sequence of up to 360 different call options. 2
In an advanced finance course, you can learn how to use continuous-time stochastic processes to model asset prices and then use stochastic calculus to price derivatives such as options.
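The analogy between a mortgaged house and a call option can be made concrete with a small sketch. It only computes the payoff at the moment of exercise, not the price of the option, and it assumes (as the analogy does) that walking away is costless. The function names are hypothetical.

```python
def call_payoff(asset_price, strike):
    """Payoff of a call option at exercise: buy only if the asset is worth
    more than the strike price, otherwise let the option expire."""
    return max(asset_price - strike, 0.0)


def owner_payoff(house_value, remaining_principal):
    """Owner's position under the call-option view of a mortgage.

    The remaining principal plays the role of the strike price. Assumes
    default carries no further cost to the owner.
    """
    return call_payoff(house_value, remaining_principal)


# In the money: the house is worth more than the remaining principal.
print(owner_payoff(500_000, 450_000))

# Out of the money: the house is worth less than the remaining principal,
# so the owner's payoff from "exercising" is zero.
print(owner_payoff(380_000, 450_000))
```

Each monthly payment lowers `remaining_principal`, which corresponds to the falling strike price described above.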
To understand the potential default in the mortgage market, let us consider a simple two-period model. Suppose you purchase a house in period 1 at value $V_1$. The down payment constraint is binding. You invest a fraction $d$ of the house value out of your own wealth and borrow the remaining fraction $1 - d$ from the bank at a mortgage interest rate equal to $r_m$. That means if you want to assume full ownership of the house in the second period, you need to make a payment of $(1 + r_m)(1 - d)V_1$. Between the first and second period, the economy is subject to an economic shock. As a consequence, the value of the house in period 2, denoted by $V_2$, is a random variable. For simplicity let us assume that there are only two possible states in period 2: good or bad. Hence the value of the house is $V_2^b$ in the bad state of the economy or $V_2^g$ in the good state of the economy. At the beginning of the second period, you then have to make a decision whether to take full ownership of the house and "purchase the house from the bank" or whether to default on the mortgage and walk away. Let us assume for simplicity that there are no long-term costs of default, i.e., you do not take a hit on your credit rating and you do not have to declare personal bankruptcy. In that case, the decision of whether or not to default is fairly easy. You purchase the house from the bank if and only if the value of the house at the beginning of period 2 is larger than the outstanding mortgage payment. Suppose the opposite condition is true:
$$V_2 < (1 + r_m)(1 - d)V_1 \tag{25.5}$$
In that case, we say the mortgage is "under water." If you sold the home in the market and paid back the loan, you would take an economic loss. To put it differently, you would be better off defaulting on the existing mortgage and buying the house in the market at the lower price with a new mortgage. Note that the smaller the down payment, the higher the mortgage interest rate, and the larger the negative economic shock, the more likely it is that this condition will hold. In the real world, there are more than two periods. A household can make the mortgage payment and stay in the house for another period and hope that the economy and the house value will recover. That may make sense if transaction costs are large. In our simple two-period model that is not an option. To gain some additional insights into the determination of mortgage interest rates, let us assume that this condition holds if the economy is in the bad state, i.e., if
$$V_2^b < (1 + r_m)(1 - d)V_1 \tag{25.6}$$
but not if the economy is in the good state:
$$V_2^g \geq (1 + r_m)(1 - d)V_1 \tag{25.7}$$
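The default rule of the two-period model can be sketched in a few lines. The numbers below are hypothetical and chosen only to illustrate the comparative statics mentioned above; the function names are likewise illustrative.

```python
def repayment_due(v1, d, r_m):
    """Payment needed in period 2 to take full ownership: (1 + r_m)(1 - d) V_1."""
    return (1 + r_m) * (1 - d) * v1


def defaults(v2, v1, d, r_m):
    """True if the house is worth less in period 2 than the payment due,
    i.e. the mortgage is under water. Assumes default is costless."""
    return v2 < repayment_due(v1, d, r_m)


v1, r_m = 500_000, 0.05            # hypothetical purchase price and mortgage rate

# Bad state with a 10 percent down payment: the owner walks away.
print(defaults(380_000, v1, d=0.10, r_m=r_m))

# Good state with the same mortgage: the owner pays off the loan.
print(defaults(550_000, v1, d=0.10, r_m=r_m))

# Same bad state, but with a 30 percent down payment: no default.
print(defaults(380_000, v1, d=0.30, r_m=r_m))
```

The third call illustrates why a larger down payment makes default less likely: it lowers the payment due relative to the value of the house.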
Our simple model predicts that the household will default in the bad state of the economy. Let $p$ denote the probability that a household will default, i.e., that we will reach the bad state of the economy. Moreover, let $r_f$ denote the risk-free interest rate. A risk-neutral bank needs to generate at least as much expected revenue from mortgage lending as from risk-free lending activities. Let's assume that the bank recovers $V_2^b$ in the bad state of the world and $(1 + r_m)V_1$ in the good state of the world. The risk-free investment pays off $(1 + r_f)V_1$. The bank is indifferent between both investments if and only if
$$p V_2^b + (1 - p)(1 + r_m)V_1 = (1 + r_f)V_1 \tag{25.8}$$
Let us define the recovery rate, denoted by $k$, as
$$k = \frac{V_2^b}{V_1} \tag{25.9}$$
Then we have
$$pk + (1 - p)(1 + r_m) = 1 + r_f \tag{25.10}$$
Solving this equation for $r_m$, we have
$$r_m = \frac{r_f + (1 - k)p}{1 - p} \tag{25.11}$$
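A quick numerical check of equation (25.11) is sketched below. The parameter values are hypothetical, and the helper `expected_gross_return` simply evaluates the bank's expected payoff per dollar lent under the model's assumptions, so that the indifference condition can be verified.

```python
def mortgage_rate(r_f, p, k):
    """Break-even mortgage rate from equation (25.11).

    r_f: risk-free rate, p: default probability, k: recovery rate.
    """
    if not 0 <= p < 1:
        raise ValueError("default probability p must satisfy 0 <= p < 1")
    return (r_f + (1 - k) * p) / (1 - p)


def expected_gross_return(r_m, p, k):
    """Bank's expected payoff per dollar lent: k with probability p,
    (1 + r_m) with probability 1 - p."""
    return p * k + (1 - p) * (1 + r_m)


r_f, p, k = 0.03, 0.02, 0.76       # hypothetical values for illustration
r_m = mortgage_rate(r_f, p, k)

print(f"break-even mortgage rate: {r_m:.4%}")
# At the break-even rate the bank earns the risk-free gross return 1 + r_f.
print(f"expected gross return:    {expected_gross_return(r_m, p, k):.4f}")

# Comparative statics: higher default risk raises the rate,
# a higher recovery rate lowers it.
print(mortgage_rate(r_f, 0.05, k) > r_m)
print(mortgage_rate(r_f, p, 0.90) < r_m)
```

With no default risk (`p = 0`) the formula collapses to the risk-free rate, which is a useful sanity check on the derivation.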
Hence the mortgage interest rate is increasing in the default probability and decreasing in the recovery rate. Bankers are not stupid; they understand that some households will default and will try to price the default risk into the mortgage interest rate. Of course, that requires them to have reasonably good and reliable estimates of default rates. If banks have wrong beliefs and underestimate the default risk, they will lose money. In the extreme, the banks themselves will be in serious financial trouble and may have to declare bankruptcy if they have taken on too many bad risks, i.e., if they own too many bad mortgages. When banks start to default, we often have a banking crisis since banks borrow from and lend to each other. More importantly, they use complex financial arrangements to share risk among themselves. That's exactly what happened during the Great Recession.
Default occurs when households stop making payments on mortgages. When would you default? That's clearly up to you and how ethical you are. But from a purely economic perspective, an owner should consider default when there is no hope that the price of the house will ever equal the amount of outstanding principal. In that case, the owner should default even if he or she is capable of making the payments. It is in the financial interest of the owner to walk away and leave the house to the bank that holds the mortgage. At that stage, the bank will start a foreclosure process. Figure 25.5 shows foreclosure activity during the Great Recession. As we can see, a normal level of foreclosures is approximately 250,000 per quarter. During the Great Recession, we had roughly four times that volume during extended periods of time. Clearly, many households decided to walk away from their homes and default on their mortgages because they did not expect to sell the house at a price that would recover at least the outstanding principal.
However, a surprisingly large number of owners decided to stay in their houses and continued to make payments despite the fact that it would have made more financial sense to default. The economy experienced fewer defaults than economic theory predicted.
FIGURE 25.5. US properties with foreclosure activity during the Great Recession. [Figure: foreclosures in thousands, vertical axis from 0 to 1,000.]
25.6 Mortgage Markets How do you get a loan to purchase a house? In a traditional mortgage, a local commercial bank lends money to an individual to buy a house using funds deposited at that bank. The credit or default risk is entirely borne by the local bank. The main advantage of this arrangement is that there is little moral hazard since the local bank has strong incentives not to make bad loans. This market design is also called the primary mortgage market. This market has three potential disadvantages. First, small or growing markets may not have enough liquidity during periods of growth and expansion. Second, local markets can be subject to negative shocks and lending may shut down if local banks are in trouble. Third and most importantly, local banks only invest in the local real estate market. As a consequence, they do not hold a diversified portfolio. The local banks are thus exposed to a risk that could be diversified. The drawbacks of the primary mortgage market became apparent during the savings and loan crisis that hit the US during the 1980s. The secondary mortgage market is a market for the sale of bonds that are collateralized by the value of mortgage loans. Intermediaries buy loans from local banks and repackage the loans for resale via mortgage-backed securities (MBS). The intermediaries typically are the issuers of mortgage-backed securities. Investors can buy and hold a diversified portfolio of mortgage-backed securities. The local banks can pass the mortgage or default risks to the investors in the secondary
market. Figure 25.6 illustrates the relationships among the key players in the secondary market. Note that the borrower does not make payments to the original bank or lender but to a service agency that passes the payments of all borrowers to the issuer of the mortgage-backed security.
FIGURE 25.6. How the secondary mortgage market operates. [Figure: diagram of the flows among loans, the servicer, the issuer, the rating agency, and investors.]
The secondary mortgage market was primarily created by the US federal government to add liquidity to the mortgage market. The Federal National Mortgage Association, commonly known as Fannie Mae, was founded as a government agency in 1938. The success of this agency convinced the federal government to create the Federal Home Loan Mortgage Corporation, commonly known as Freddie Mac, in 1970. The objective was to expand the secondary market for mortgages and create competition to Fannie Mae. These two government-sponsored enterprises (GSEs) were successful during the 1980s and 1990s. As a consequence, both agencies were converted into publicly traded companies. Broadly speaking, GSEs intervene in the mortgage market by buying mortgages from local banks and other financial service providers in the primary market. They hold some of these mortgages in their own portfolios. Other loans are packaged into mortgage-backed securities that may be sold to outside investors. As a result, they expand the pool of funds available for housing. With Fannie Mae and Freddie Mac buying up mortgages, local banks no longer have to hold onto the mortgages they originated, but can sell them in the secondary market shortly after origination. This allows local banks to make additional loans and mortgages to creditworthy borrowers. As a result, Fannie Mae and Freddie Mac add liquidity to the primary mortgage market and improve the risk allocation of these mortgages. By doing so, they have increased homeownership rates in the US.
25.7 A Case Study: The Bailout of Fannie Mae and Freddie Mac US housing and mortgage markets became increasingly stressed during 2007 and 2008 as a result of a weakening economy. This led to the sharp decline in housing prices illustrated in figure 25.1. As the prices of homes began dropping below the balances left on mortgages, a wave of mortgage defaults and foreclosures ensued. During the same time period, Fannie Mae and Freddie Mac increased their leverage and began investing in subprime securities that credit agencies had misleadingly characterized as low-risk investments. As the US housing crisis persisted, Fannie Mae and Freddie Mac became financially distressed. A nationwide decline in housing prices and a simultaneous increase in mortgage default rates were disastrous for these GSEs, whose concentrated exposure to US residential mortgages and high leverage left them in a vulnerable state. The GSEs lost a combined $47 billion in 2008 in their single-family mortgage businesses. Unlike other investment firms, they were only involved in residential mortgage finance and, therefore, did not have any other sources of income to compensate for their losses. However, the situation got much worse. Between January 2008 and March 2012, the GSEs lost a combined $265 billion. Moreover, more than 60 percent of the loss was attributable to risky products that the GSEs had purchased in 2006 and 2007. As the housing market continued to collapse, the market for nonagency-issued mortgage-backed securities, which are securities that are not explicitly or implicitly guaranteed by the federal government, also collapsed. Figure 25.7 shows that issuance of residential mortgage-backed securities (RMBS) that were not issued by one of the governmental agencies rapidly declined after 2008. The US government recognized that failure for Freddie Mac and Fannie Mae would be detrimental to the entire financial system.
As a result, in September 2008 the US Treasury Department decided to put both GSEs into conservatorship under the control of the Federal Housing Finance Agency. The government thus effectively took control of the two firms and absorbed their losses in an effort to stabilize housing and financial markets before more damage was done. Government aid for Fannie Mae and Freddie Mac eventually amounted to $187.5 billion as of March 2016. Under the terms of the bailout, the GSEs issued senior preferred stock to the government. With the rebound of the US economy starting in 2011 and the increase in housing prices shown in figure 25.1, Fannie Mae and Freddie Mac returned to profitability, and the US government managed to recover most of the costs associated with the bailout. Today Fannie Mae and Freddie Mac guarantee almost $6 trillion of securities and enable thirty-year fixed rate loans. The market for nonagency-backed securities is slowly starting to recover, but overall levels are small compared to the pre-bubble era.
FIGURE 25.7. Stock and flows of mortgage-backed securities: Issuance (A) and outstanding (B), in billions (USD). (Author's calculations) [Figure: two panels, each comparing agency MBS with nonagency RMBS.]
25.8 Mortgage Insurance If borrowers cannot make a sufficient down payment, some lenders decide to insure themselves against the default risk with private mortgage insurance (PMI), which borrowers are then required to purchase. Historically, most home purchasers who buy a house with a down payment of less than 20 percent are required to buy PMI. PMI is expensive: typical rates are $55 per month per $100,000 financed. In our example, suppose you buy the $500,000 home with no down payment; you will have to pay an additional $275 for your monthly payment. Moreover, there is no guarantee that you will be approved by the insurance company to qualify for PMI. In that case, you will not obtain the mortgage, and you will not be able to buy the house. Since private mortgage insurance is expensive and hard to get for moderate-income households, the federal government also offers public mortgage insurance. A public mortgage insurance premium equals 1 percent of the loan amount at closing. The Federal Housing Administration (FHA) then essentially makes up any payments to the bondholder that the borrower misses. It also makes up for any loss for the loan not being fully repaid either by the borrower or from the sale of the house. The risk of a downturn in housing markets is shifted to the taxpayers, who guarantee the GSEs. There are many federal, state, and local programs that help and encourage first-time homebuyers or buyers with disadvantaged backgrounds. One federal program is provided by the Federal Housing Administration. In an FHA loan, mortgages and insurance are bundled together. These loans have smaller down payment requirements and lower closing costs than conventional loans. FHA loans are also available to homebuyers with relatively low credit scores who may not be able to obtain mortgages in the private market. Fannie Mae also provides homeownership education for first-time homebuyers through its HomePath Ready Buyer program.
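The insurance costs quoted in this section can be reproduced with simple arithmetic. The sketch below uses only the rates stated above ($55 per month per $100,000 financed for PMI and a 1 percent premium at closing for public insurance); actual rates vary, and the function names are illustrative.

```python
def monthly_pmi(loan_amount, rate_per_100k=55.0):
    """Monthly private mortgage insurance cost at a flat rate per $100,000 financed."""
    return loan_amount / 100_000 * rate_per_100k


def public_premium_at_closing(loan_amount, premium_rate=0.01):
    """Public mortgage insurance premium paid at closing, as a share of the loan."""
    return premium_rate * loan_amount


loan = 500_000                     # the $500,000 home bought with no down payment
print(f"monthly PMI:        {monthly_pmi(loan):,.2f}")
print(f"premium at closing: {public_premium_at_closing(loan):,.2f}")
```

For the $500,000 loan this gives the $275 per month mentioned in the text for PMI, and a premium of $5,000 at closing under the 1 percent rule.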
25.9 Conclusions Encouraging homeownership has been an explicit policy objective of almost all US administrations since the early 1950s. Many households prefer owning a house to renting, even if they can barely afford it. A nice house in a good school district or friendly neighborhood has been part of the American Dream for many generations. There may even be some social benefits of homeownership. There is some evidence that suggests that homeowners are more responsible neighbors than renters. Overall, homeownership is much more affordable in the US than in almost all other countries in the world. Even expensive cities such as NYC or San Francisco are surprisingly "cheap" by international standards. When it comes to homeownership, the US is clearly still the most affordable country in the world. Holding some real estate in a properly diversified portfolio makes a lot of financial sense, especially for wealthy households. Housing is one of the few investments that allows moderate-income households with little or no wealth to speculate in asset markets. If you are a risk lover, that's great news. Try to go to your bank and ask for a loan to play the stock market or purchase some pork belly options. No dice! Despite the affordability of homeownership in many US cities, there are some doubts that the current policy of heavily subsidizing homeownership makes sense. First, real estate developers are likely the biggest beneficiaries from these subsidies, especially in markets with limited supply. Second, the federal government has not really solved the problems that partially caused the last recession. The
GSEs still hold or guarantee a large percentage of all mortgages in the United States. Hence taxpayers are directly exposed to the housing market risk. While local banks have increased their lending standards, they still do not have strong incentives to screen mortgage applicants since they earn commissions from selling mortgages. Many large commercial banks still do not seem to hold enough capital to absorb a large negative shock. However, they are still "too big to fail."
A casual look at the economic history of the US suggests that many economic crises were driven by, or at least accompanied by, serious housing or land "bubbles." A housing crisis often leads to a sequence of bank failures. The Great Recession was neither the first nor will it be the last economic crisis at least partially caused by excessive speculation in housing or land markets. If land and housing speculation were just a zero-sum game in which some win and others lose, any society could live with that. Unfortunately, the economic costs associated with these housing and banking crises are real and not trivial.
Even if you have faith that the government has learned from its past mistakes and are convinced that housing prices will be stable in the future, you should not necessarily rush into homeownership. First, few of us have the skills and talents to maintain our own houses. When was the last time that you fixed anything around your apartment? Second, housing markets often lack transparency and are subject to serious asymmetric information problems. What do you really know about the quality of the house that you are about to purchase? The seller clearly knows a lot more about the house than the buyer. A home inspection can alleviate, but not eliminate, this asymmetric information problem. Third, there are large transaction costs involved in a home purchase. You need to pay realtor fees or commissions, which range between 4 and 6 percent. In most markets, there are also real estate transfer taxes.
In Pennsylvania, for example, these taxes are 1 percent of the value of the house. Then there are all kinds of other closing costs that add up. Finally, you need to understand the incentives of the other market participants. Your bank or mortgage broker, your real estate agent, and your government all want you to purchase real estate. If you don't, they are not making any money. Keep all of that in mind when you take the plunge and pursue happiness via homeownership.
25.10 Debate: Subsidies for Homeownership Should the federal government subsidize homeownership? The pro side should argue that there are significant and large externalities that come with homeownership. As a consequence, the subsidies can be justified. The con side should argue that the externalities are small and that the primary beneficiaries of this policy are not households but housing developers. The following questions should help structure the debate.
1. Why does the federal government currently subsidize homeownership?
2. What are some alternatives to promote homeownership besides handing out subsidies?
3. What are the reasons behind the recent shift from homeownership to renting housing?
4. Do housing subsidies distort the local provision of public goods?
5. Do housing subsidies have a beneficial effect on local economies?
6. Who is bearing the cost of these subsidies?
25.11 Problem Sets
1. Explain how government housing policies contributed to the recent housing crisis.
2. Suppose you purchase a home with a 20 percent down payment, financing the remaining 80 percent of the purchase price with a thirty-year mortgage provided by your local bank. Explain why this transaction can be viewed as purchasing a call option, i.e., an option that gives you the right to purchase the house from the bank at the price that is given by the remaining principal that you owe the bank.
3. Explain why it is difficult to construct a reliable price index for housing. What are some potential problems with the commonly used repeat-sales index?
4. Suppose the government is debating whether or not to provide full insurance for all mortgages, i.e., if a borrower defaults on a mortgage, the government will make up the payments to the lender. Use what you have learned in this chapter to evaluate the pros and cons associated with such a policy.
5. Explain how secondary mortgage markets can increase the liquidity in local housing markets. Why can it be difficult for outside investors to assess the riskiness of mortgage-backed securities?
6. How much involvement should the federal government have in the housing market?
7. What are the benefits of having federally regulated mortgage enterprises like Fannie Mae and Freddie Mac?
8. What are the implications resulting from the existence of Fannie Mae and Freddie Mac for private firms?
9. How can the government expand access to affordable mortgages without having taxpayers bear the risk?
10. Should the US mortgage market be privatized? What are the potential benefits? What are the disadvantages?
11. Are there other ways that the federal government could stimulate the housing market without Fannie Mae or Freddie Mac?
26
Epilogue
We do not have a blueprint for the perfect design of a city. If I have to offer an educated guess, we will probably never have one. All attempts to build new utopian cities from scratch will fail miserably, even if the smartest urban planners and economists take on this task. Cities have evolved over time into important economic, political, and cultural institutions. It is hard to conceive of successful economies without cities. We can use retrospective analysis to partially understand why some cities succeeded and others failed. These insights together with our theoretical knowledge of economics and related sciences allow us to design potential policies that promise to address some of the main shortcomings and deficiencies of urban life. How do we improve urban schools? How do we fix roads and infrastructure? How do we invest in green and sustainable cities? In most developed economies, realistic new policies hardly ever lead to fundamental or revolutionary change in institutional design. Instead, we need to be patient and careful to improve our existing institutions based on trial and error. In developing countries, more radical departures from the status quo may be needed to overcome some of the more fundamental flaws of existing policies. In any case, there is no substitute for a careful empirical evaluation of existing and potential policies. Empirical research helps us to determine what works and what does not.
While there is no optimal or efficient design of a city that could be implemented by a social planner, several major themes arise from the study of urban economics that should guide current and future policies. Proximity and geography matter for economic development. Almost all successful societies are organized around cities, prominent centers of production and consumption, which are essential in the generation of new ideas and products.
Not all problems have been solved, but urban economists have made progress and provide some compelling explanations for the observed realities. Urbanization reduces transportation costs for people, goods, and ideas. It creates larger local markets that can sustain a diverse supply of products. Large local labor markets help match workers to firms. Urbanization also creates a number of positive externalities, such as knowledge spillovers that increase the productivity of individuals and firms. Still, there may be other important channels that we do not understand that may also contribute to the explanation of why cities are so crucial for economic development.
Cities can only function with cost-effective local governments, which provide a variety of public goods and services that enhance and protect the comparative advantages of the local economy. The list of important goods and services is long: infrastructure, education, public safety, protection of the environment, provision of affordable housing, public health, and welfare. In addition, the local economy cannot function without sensible regulation, such as building codes, safety standards, and environmental rules. To provide these goods and services, a city must generate enough revenues through taxes, fees, and user charges. Finally, cities must rely on intergovernmental transfers to deal with inequality, poverty, redistribution, and a variety of other issues.
Cities do not operate in isolation. Instead, cities compete with both nearby and more distant cities for workers and firms. Individuals and firms are mobile and will leave a city if the tax burden is unreasonable relative to the quality of public goods and services. Mobility and fiscal competition can, therefore, serve as useful disciplining devices. Successful cities cannot rely on excessive taxes on mobile factors, such as high-productivity businesses and younger, high-skill households. Fiscal competition can also create problems; tax competition among cities can undermine the fiscal capacity of a city. Mobility can generate additional problems in developing economies if the gap between poor rural areas and relatively affluent cities becomes too large. Even in the best-case scenario, successful cities face some daunting challenges that arise due to high population density, the heterogeneity of tastes among the city population, and the unequal distribution of economic welfare.
Traffic, public health, and environmental problems can put stress on the city, crime can spin out of control, and poverty can become overwhelming. Housing and sanitary conditions can be problematic, especially for low- and moderate-income residents. Even competent and well-meaning local politicians and administrators will struggle with these problems on a daily basis. Political competition and fair local elections tend to reward moderate, competent, and hard-working politicians. However, we cannot take for granted that elected or appointed officials will act in the interest of city residents, or that they will be sufficiently capable to run a city. Mismanagement by corrupt or incapable politicians and administrators can destroy a city's potential, and the quality of life can deteriorate quickly. Free elections provide a way to remove ineffective politicians from office. If local politicians are entrenched and all else fails, mobility or "voting with your feet" is often the last way out. There is no invisible hand that guarantees that cities will work or thrive. However, there are reasons to be cautiously optimistic. It is difficult to imagine any viable alternative to city life. While the rich and famous can afford to leave cities behind and enjoy a life of luxury in more pastoral environments, the overwhelming majority of individuals and households do not have this option. They need to live in close proximity to cities, where most of the jobs are. Very few of us could prosper in rural communities; almost all attempts to create rural communities that foster freedom, happiness, and prosperity have failed, most of them miserably. As a consequence, we are bound to make cities work, whether we like it or not.
FIGURE 26.1. City life. (Photo by author)
While many will never fully embrace city life, more and more individuals learn to appreciate the diversity and richness that only cities can provide. Even many small cities offer a variety of experiences, goods, and services that are remarkable. The key challenge is to make sure that economic opportunities are not reserved for the elite or a relatively small number of wealthy and educated individuals, but that they are available to everybody. While we are far away from reaching this lofty goal, much progress has been made, not just in the United States but also in many cities and countries around the globe. Nevertheless, there is still much more to be done. To say that the US federal government has not always played the most supportive or constructive role in cities' development is a gross understatement. In fact, the federal government has pursued policies that subsidize suburban and rural communities at the expense of cities and metropolitan areas. The federal income tax code subsidizes low-productivity places and encourages urban sprawl and excessive land use. Transportation policies have primarily focused on providing access from remote suburbs to employment centers and have neglected urban and regional public transportation systems. Welfare policies place an undue financial burden on inner cities and create ghettos of poverty. Educational policies do not provide enough financial and other support for urban school districts and disadvantaged children who have experienced hardship. Many regulatory policies have weakened cities, with the exception of a small number of large superstar cities. Criminal and drug enforcement policies have been disastrous for man y minority neighborhoods in cities. And the beat goes on. Since neither Democratic nor
Republican administrations have a better track record supporting cities, it is hard to point to ideology for these policy failures. The US political institutions do not favor cities. The US Constitution was written by landed gentry and wealthy merchants during the preindustrial era. Cities were often unhealthy and undesirable places to live in those days. These ruling elites were more than skeptical of the large number of immigrants that crowded the larger cities. Maybe it is not surprising that the US Constitution shifts power from cities to rural areas. We have argued that the US needs stronger regional governments that facilitate the coordination of policies within a metropolitan area. Where does that leave us? Cities can only thrive if residents are involved in the economic, social, cultural, and political fabric of society. The current younger generations may have learned from the mistakes of their parents, who abandoned inner cities in search of the elusive suburban utopia: a house with a white picket fence, green grass, and a two-car garage. Younger generations are again embracing city life, preferring the buzz of the city over the deceiving tranquility of suburbia. If you have managed to read this book all the way to the bitter end, you are already prepared to make a difference. Do not take the benefits of city life for granted; get involved and make things happen! Many cities around the globe have been blessed with capable and inspiring local politicians, civic leaders, and activists. This book is dedicated to Jim Relken, who served the city of Port Huron as mayor, member of the city council, and member of the school board for many decades, and all of those who have devoted their lives to improving the quality of life in their beloved hometowns and cities. They do so without much outside recognition, medals, or awards, without visits to the White House or invitations from Oprah, and without an entry in Wikipedia.
These people are the forgotten heroes of this country.
Appendix
Some Useful Techniques in Empirical Microeconomics
A.1 Motivation
Economic theory typically provides insufficient guidance on whether a proposed intervention is desirable or not. Even if theory provides a clear, qualitative prediction regarding the effects of an intervention, we need to know whether the effects are small or large. Empirical work is necessary to quantify the magnitude of the impact of an existing intervention or to predict the likely effects of future interventions. Modern economics is, therefore, primarily an empirical science. To obtain a good understanding of the empirical methods that we use and discuss in this book, you have to be familiar with a variety of concepts in probability theory, statistics, and econometrics. Every department of economics, therefore, offers a variety of courses that provide these foundations for empirical work. This appendix presents a review of some of the essential concepts. Note that it cannot replace proper training in probability theory and statistics. Sections A.2-A.4 cover basic material in probability theory and statistics. Sections A.5-A.7 discuss experimental design. Sections A.8-A.9 focus on discrete choice.
A.2 Correlation versus Causation
You have heard many times that correlation does not imply causation. Despite the simplicity of this statement, it can be rather difficult to sort out the difference between correlation and causation in any application. We can typically observe or measure correlations in our datasets. To learn something about unobserved causal relationships, we need to proceed carefully and avoid many pitfalls that are associated with empirical analysis. There are many examples where causation and correlation are easily confused. For example, some empirical papers have documented the existence of a "marriage premium." These papers find that married males have approximately
10 percent higher earnings than similar males who are not married. Does that mean marriage increases productivity? It's possible but highly unlikely. My employer definitely did not give me a 10 percent raise when I got married. I am confident in predicting that unmarried men reading this book will also not get a 10 percent earnings bump if they ever get married. So what's going on? A more plausible explanation of the marriage premium is the conjecture that women prefer to marry males who have higher earnings and salaries. (By the way, it turns out that males also tend to prefer more highly educated women who tend to have higher earnings. So it works both ways.) Hence the marriage premium is an artifact of how marriage markets work. It has little to do with the impact of marriage on productivity. We say that the marriage premium is due to a selection effect. Women "select" men based on productivity or earnings potential, and vice versa. If you read the newspaper or follow the news carefully, you will encounter many instances in which individuals fall into the marriage premium trap and draw wrong inferences based on observed correlations. In marriage market equilibrium, not every woman can be matched with a high-productivity male. There are just not enough high-productivity males available. Becker was the first to conjecture that there is assortative matching in the marriage market. As a consequence, low-productivity males tend to be matched with low-productivity females or remain single, struggling to find suitable partners. In a nutshell, the higher your productivity, the better your chances at marriage, holding other things constant. The causality, therefore, runs from productivity to marriage and not the other way around! These concepts are covered in detail in a course about the economics of the family.
Two economic variables are correlated if they move together.
Two economic variables are causally related if the movement of one variable causes the movement of the other variable. In statistics, this is called the identification problem: Given that two random variables are correlated, how do you determine or "identify" whether one variable is causing another variable to move in a certain direction? Whenever we see a correlation between two variables A and B, there are at least three possible explanations:
1. A is causing B.
2. B is causing A.
3. Some third factor C is causing both A and B to move together.
The general problem that empirical economists face is trying to distinguish among these three explanations. Only in the first case does the correlation between A and B imply that A causes B. Here is a concrete example from urban economics. Suppose we observe that cities with high crime rates hire large numbers of police officers.
1. Does high crime induce cities to hire more police officers (A causes B)?
2. Does hiring police officers come with higher corruption and, therefore, cause higher crime rates (B causes A)?
3. Does some third factor cause both police hiring and crime to move together, such as drug dealing and the "war on drugs"?
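The third explanation can be made concrete with a small simulation. The sketch below is not taken from the text: the normal distributions, the unit coefficients, the sample size, and the seed are all hypothetical choices. It generates two variables A and B that each depend on a common factor C but not on each other, and shows that they are nevertheless correlated until the influence of C is removed.

```python
import random

def correlation(xs, ys):
    """Sample correlation: covariance divided by both standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx ** 0.5 * vy ** 0.5)

def simulate_confounding(n=20_000, seed=0):
    """Draw A and B that both depend on C but not on each other."""
    rng = random.Random(seed)
    c = [rng.gauss(0, 1) for _ in range(n)]   # hypothetical third factor C
    a = [ci + rng.gauss(0, 1) for ci in c]    # A is driven by C plus noise
    b = [ci + rng.gauss(0, 1) for ci in c]    # B is driven by C plus noise
    return a, b, c

a, b, c = simulate_confounding()
corr_ab = correlation(a, b)                   # clearly positive (0.5 in theory)

# Subtract the contribution of C; what is left is independent noise.
res_a = [ai - ci for ai, ci in zip(a, c)]
res_b = [bi - ci for bi, ci in zip(b, c)]
corr_resid = correlation(res_a, res_b)        # close to zero
print(round(corr_ab, 2), round(corr_resid, 2))
```

In real data the third factor is usually not observed, so it cannot simply be subtracted as in this sketch. That is why the research designs discussed in this appendix are needed.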
How can we reliably estimate the impact of hiring another police officer on crime within a city? If you just run a cross-sectional regression of crime on the size of the police force, controlling for some observables such as poverty or income, you
are unlikely to obtain a good answer. You need to be smarter than that. Levitt (1997) proposes to use variation in police hiring due to political business cycles. Mayors who run for reelection like to be "tough on crime." This is an example of an instrumental variable approach that we discuss in more detail below. That is a clever idea. Unfortunately, it does not generate sufficiently precise estimates of police effectiveness. Similar questions that we have explored in the book are the following:
• What is the effect of early childhood education on labor market outcomes?
• What is the effect of pay-for-performance on student achievement?
• What is the impact of agglomeration externalities on firm productivity?
We have seen that we need to conduct clever empirical analysis to answer these types of questions. How do we go about this?
A.3 Probability Theory
Before we review some of the most common empirical methods that can be used to learn about causal relationships, it is useful to remind ourselves of some basic concepts in probability theory. Again, I provide a rather informal discussion here. I suggest that you take more rigorous courses in probability theory if you are interested in these topics. Risk and uncertainty are the joys of life.
A.3.1 Random Variables
Consider a discrete random variable D that can take on two values, 0 and 1. We call such random variables binary. A simple example of such a random variable is a coin toss that can be either heads (= 0) or tails (= 1) with equal probability. It is obvious that the outcome of a coin toss is completely random. There are many situations where the outcome is not completely random but we still want to think about these events as random. Consider, for example, the decision to go to college. We all know that this decision depends on a number of observable factors, such as grades, parental income, and others. Nevertheless, the observable factors do not perfectly predict college attendance. So there must be some factors that are really hard to measure that also partially determine college attendance. From the perspective of the researcher who studies college attendance decisions, these factors add some randomness to the decision process. As a consequence, we may want to use probability theory to describe these decisions and outcomes as well. Let's continue with the college example. Let D = 1 if the student attends college and D = 0 otherwise. Let Pr{D = 1} = p denote the probability that outcome 1 occurs, i.e., the student goes to college. Likewise let Pr{D = 0} = 1 - p denote the probability that outcome 0 occurs. The expected value of D, denoted by E[D], is then given by
$$E[D] = p \cdot 1 + (1 - p) \cdot 0 = p = \Pr\{D = 1\} \tag{A.1}$$
So in this case, the expected value of the binary random variable is just the probability that a person attends college.
We can generalize the concept of a discrete random variable to describe events that have more than two possible outcomes. For example, consider a soccer match: here we have three possible outcomes: a win, a draw, or a loss. Similarly, there are many events that can be described by discrete random variables, denoted by X, that have a finite number of outcomes, denoted by $x_1, \dots, x_J$. Each outcome $x_j$ occurs with probability $p_j$. The mean of the random variable X is defined as
$$E[X] = \sum_{j=1}^{J} p_j x_j \tag{A.2}$$
Broadly speaking, the expected value of a discrete random variable is the probability-weighted average of all possible outcome values. There are other random events in which the number of possible outcomes is very large. Think about income, for example. Some individuals have very low incomes and others have very high incomes. We could treat income as a discrete random variable with a large but finite number of possible outcomes. That would be fine but somewhat tedious. There are just too many potential outcomes. As a consequence, it is more convenient to treat income as a "continuous" random variable, i.e., a random variable that can take infinitely many values. The main problem we run into with continuous random variables is the following: How do we extend the notion of probability from a discrete random variable to a continuous random variable? The obvious problem is that every possible event should have probability zero. We need a trick here. The solution is amazingly simple. Instead of assigning probability to single events, we now assign probability to intervals of events. So consider the example of assigning probability to income. We can ask ourselves the following question: What is the probability that an individual has an income that is between $50,000 and $60,000? Clearly, that event should have a strictly positive probability. How do we assign probability to arbitrary intervals in an internally consistent way? Let's take a continuous random variable Y and consider a closed interval [a, b]. Define a density function f(y) such that for any possible interval the following definition of probability holds:
$$0 \le \Pr\{a \le Y \le b\} = \int_a^b f(y)\,dy \le 1 \tag{A.3}$$
Mathematically, we use integration to define the probability associated with an interval. The function f(y) is called the density function of the random variable. It serves the same function as the probability $p_j$ in the discrete case. As long as the density integrates to 1 as we integrate over the largest possible interval that makes sense, we should be in good shape. If 0 is the lowest possible value that Y can take, we can define the distribution function associated with Y as follows:
$$F(b) = \Pr\{0 \le Y \le b\} = \int_0^b f(y)\,dy \tag{A.4}$$
We have extended the concept of probability from a discrete to a continuous random variable. We can also define the expected value of Y, denoted by E[Y], as
$$E[Y] = \int_0^\infty y\,f(y)\,dy \tag{A.5}$$
Basically, we have just replaced summation by integration. Here is an example. Suppose Y has a uniform distribution on the interval [0, 1]. Hence the density is constant and equal to 1. Moreover, the distribution function of a uniform random variable is given by
$$F(b) = \int_0^b 1\,dy = b \tag{A.6}$$
for any b in [0, 1]. Finally, the mean is given by
$$E[Y] = \int_0^1 y \cdot 1\,dy = \left.\frac{y^2}{2}\right|_0^1 = \frac{1}{2} \tag{A.7}$$
To make sure you understand the argument, redo this exercise for a uniform random variable on the interval [a, b]. (We'll need these results below.)
A.3.2 Variance and Standard Deviation
The expected value is also called the first moment of the distribution. It roughly tells us what we can expect to obtain "on average" if we generate many draws from the distribution and compute the average outcome. Given that there is some uncertainty about the outcome, the first moment does not provide all the relevant information about the random variable. For example, you may want to know how much possible dispersion there is in outcomes. The expected value does not tell you anything about that. If you want to learn about dispersion, you need to perform a different analysis. Let's consider a discrete random variable that has a bunch of outcomes $x_1, \dots, x_J$. For each outcome consider the difference between the potential outcome and the mean of the random variable: $x_j - E(X)$. For some outcomes $x_j > E(X)$ and for others $x_j < E(X)$. Recall the coin toss. The two possible outcomes are 0 and 1. If it is a fair coin and both outcomes occur with probability 0.5, then E[X] = 0.5. So in this example half the outcomes are above the mean and the other half are below the mean. Note that we never observe the mean since 0.5 is not a possible outcome of a coin toss. We see either 0 or 1. To measure dispersion, we do not care whether the difference is positive or negative. So let's define the squared difference $(x_j - E(X))^2$. The mean of the squared difference can be used as a reasonable measure of dispersion:
$$V(X) = \sum_{j=1}^{J} p_j \,(x_j - E(X))^2 \tag{A.8}$$
This measure is called the variance of the distribution. Let's consider the simple coin toss. Here we obtain
$$V(X) = (1 - p)(0 - p)^2 + p\,(1 - p)^2 = p\,(1 - p) \tag{A.9}$$
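The mean and variance formulas for discrete random variables translate almost word for word into code. The following is a minimal sketch, with hypothetical function names and a hypothetical value p = 3/10, that computes both quantities as probability-weighted sums and can be used to check the coin-toss result p(1 - p). Exact fractions are used so that the comparison involves no rounding.

```python
from fractions import Fraction

def expected_value(outcomes, probs):
    """Probability-weighted average of the outcomes, as in (A.2)."""
    return sum(p * x for x, p in zip(outcomes, probs))

def variance(outcomes, probs):
    """Probability-weighted average squared deviation from the mean, as in (A.8)."""
    mean = expected_value(outcomes, probs)
    return sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))

p = Fraction(3, 10)                     # hypothetical Pr{D = 1}
outcomes, probs = [0, 1], [1 - p, p]    # binary random variable
mean_d = expected_value(outcomes, probs)
var_d = variance(outcomes, probs)
print(mean_d, var_d)                    # 3/10 21/100
```

Here 21/100 equals p(1 - p) for p = 3/10. Replacing `outcomes` and `probs` with a longer list handles any discrete random variable with finitely many outcomes.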
Another commonly used measure of dispersion is the square root of the variance, which is known as the standard deviation, denoted by $\sigma_X$:
$$\sigma_X = \sqrt{V(X)} \tag{A.10}$$
For a continuous random variable, we just use the density instead of the probabilities and replace summation by integration:
$$V(X) = \int_0^\infty (x - E(X))^2 f(x)\,dx \tag{A.11}$$
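In the case of the uniform random variable on the interval [0, 1], we have
$$\begin{aligned} V(X) &= \int_0^1 (x - 0.5)^2 \cdot 1\,dx \\ &= \int_0^1 (x^2 - x + 0.25)\,dx \\ &= \left.\left(\frac{x^3}{3} - \frac{x^2}{2} + 0.25\,x\right)\right|_0^1 \\ &= \frac{1}{3} - \frac{1}{2} + \frac{1}{4} = \frac{1}{12} \end{aligned} \tag{A.12}$$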
We can go on and define third- and higher-order moments. Since we do not need them in this book, let's move on to higher ground.
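If you attempt the exercise for a uniform random variable on [a, b], a numerical check of your answer can be useful. The sketch below is only an illustration: the helper names and the number of steps are hypothetical choices. It approximates each integral by a midpoint sum over many small subintervals, which mirrors the idea that integration replaces summation.

```python
def integrate(g, lower, upper, steps=20_000):
    """Midpoint Riemann sum: approximates the integral by a weighted sum."""
    width = (upper - lower) / steps
    return sum(g(lower + (i + 0.5) * width) for i in range(steps)) * width

def uniform_moments(a, b):
    """Total probability, mean, and variance of a uniform variable on [a, b]."""
    density = lambda y: 1.0 / (b - a)            # constant density on [a, b]
    total = integrate(density, a, b)             # should be very close to 1
    mean = integrate(lambda y: y * density(y), a, b)
    var = integrate(lambda y: (y - mean) ** 2 * density(y), a, b)
    return total, mean, var

total, mean, var = uniform_moments(0.0, 1.0)
print(round(total, 6), round(mean, 6), round(var, 6))
```

For the interval [0, 1] the output should agree with the values 1/2 and 1/12 derived above, up to rounding. Calling `uniform_moments(a, b)` with other endpoints lets you compare against your own closed-form solution.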
A.3.3 Multiple Random Variables
In general, we are not just interested in the behavior of a single random variable but we care about multiple random variables. Let's consider another example. Some students work hard; some students shirk. Some students graduate from high school; some drop out. Here we have two discrete random variables. What's the probability that a student works hard in high school? What's the probability that a student will graduate from high school? The two events are clearly related, but they are not the same. There are some slackers who will graduate from high school and some hard workers who may drop out. How do we model these two random variables, taking into consideration that the outcomes should be related to each other? Let X be equal to 0 if the person shirks and 1 if the person works hard. Similarly, let Y be equal to 0 if the person drops out and 1 if the person graduates from high school. We can now ask ourselves the question, What is the probability that a person will shirk and drop out? This probability is denoted by
$$\Pr\{X = 0, Y = 0\} = p_{00} \tag{A.13}$$
Similarly, we can define the three other probabilities:
$$\begin{aligned} \Pr\{X = 0, Y = 1\} &= p_{01} \\ \Pr\{X = 1, Y = 0\} &= p_{10} \\ \Pr\{X = 1, Y = 1\} &= p_{11} \end{aligned} \tag{A.14}$$
More generally, we can define for two discrete random variables the joint probability distribution as
$$\Pr\{X = x_j, Y = y_k\} = p_{jk} \tag{A.15}$$
Note that probabilities need to sum to 1, so we have
$$\sum_{j} \sum_{k} p_{jk} = 1 \tag{A.16}$$
What happens if X and Y are continuous random variables? Here is an example. What is the probability that your starting salary is between $50,000 and $60,000 and that your salary five years later is between $70,000 and $80,000? We can define the joint density f(x, y) and ask ourselves the question, What's the probability that X is in the interval [a, b] and Y is in the interval [c, d]? Using the joint density function, we obtain
$$\Pr\{a \le X \le b,\; c \le Y \le d\} = \int_a^b \int_c^d f(x, y)\,dy\,dx \tag{A.17}$$
That looks difficult, but that's why we have computers that can figure out those integrals fairly easily. You need to take a multivariate calculus course to learn how to evaluate integrals analytically.
A.3.4 Correlation
Why did we need to do all this heavy lifting? Well, here is the answer. Now that we understand multiple random variables, we can define the concept of correlation. We want to measure whether two random variables move in the same direction or in opposite directions. In both of our examples above, we would expect that these variables move together. If you work hard, you are more likely to graduate from high school and vice versa. If your initial starting salary is high, you would expect your salary five years after graduation to also be high. To formalize this idea, consider the term (X - E(X))(Y - E(Y)). If the two variables move together, we would expect that X - E(X) > 0 implies that Y - E(Y) > 0 and vice versa. Let's average over all possible outcomes. This discussion suggests that we should define the following measure of co-movement:
$$\begin{aligned} \mathrm{Cov}(X, Y) &= \sum_{j} \sum_{k} p_{jk}\,(x_j - E(X))\,(y_k - E(Y)) \\ &= E\{(X - E(X))\,(Y - E(Y))\} \end{aligned} \tag{A.18}$$
As many of you remember, this measure is called the covariance between X and Y. A positive covariance means that the two random variables move in the same direction on average, while a negative covariance means that they move in opposite directions on average. It turns out that the magnitude of the covariance depends on the scale of the random variables, i.e., the covariance between starting salary and salary after five years depends on whether we measure income in dollars or thousands of dollars. In many cases, we want a measure of the co-movement that is properly standardized and does not depend on the scaling of the random variables. The obvious way to obtain a standardized measure of co-movement is to divide the covariance by the two standard deviations of the two random variables:
$$\mathrm{Cor}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X\,\sigma_Y} \tag{A.19}$$
where $\sigma_X$ ($\sigma_Y$) denotes the standard deviation of X (Y). This measure is called the correlation between X and Y. We can show that the correlation is always between -1 and 1. For example, you should verify that Cor(X, X) = 1 and Cor(X, -X) = -1. The correlation between two random variables just measures the co-movement between those two random variables. If Cor(X, Y) > 0, they move in the same direction. Going back to one of our initial examples, the size of the police force and the level of crime in a city appear to be positively correlated.
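To see why the correlation is preferred over the covariance as a standardized measure, it can help to compute both from a joint probability table. The sketch below uses made-up joint probabilities for effort (X) and graduation (Y); the numbers and function names are hypothetical. It then rescales X by 1,000, much like switching from thousands of dollars to dollars.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint probabilities p_jk for X (works hard) and Y (graduates)
joint = {(0, 0): Fraction(2, 10), (0, 1): Fraction(2, 10),
         (1, 0): Fraction(1, 10), (1, 1): Fraction(5, 10)}

def expect(g, joint):
    """E[g(X, Y)]: average g over all outcome pairs, weighted by p_jk."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def covariance(joint):
    """Covariance as the expected product of deviations from the means."""
    ex = expect(lambda x, y: x, joint)
    ey = expect(lambda x, y: y, joint)
    return expect(lambda x, y: (x - ex) * (y - ey), joint)

def correlation(joint):
    """Covariance divided by the product of the two standard deviations."""
    ex = expect(lambda x, y: x, joint)
    ey = expect(lambda x, y: y, joint)
    vx = expect(lambda x, y: (x - ex) ** 2, joint)
    vy = expect(lambda x, y: (y - ey) ** 2, joint)
    return float(covariance(joint)) / sqrt(vx * vy)

# Rescaling X by 1000 changes the covariance but not the correlation.
scaled = {(1000 * x, y): p for (x, y), p in joint.items()}
cov, cov_scaled = covariance(joint), covariance(scaled)
cor, cor_scaled = correlation(joint), correlation(scaled)
print(cov, cov_scaled, round(cor, 3), round(cor_scaled, 3))
```

The covariance grows by the scaling factor while the correlation stays the same, which is the scale dependence described above.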
A.3.5 Marginal and Joint Distributions
Suppose you know the joint distribution of two random variables. How do you recover the distribution of each individual random variable? Let's consider the simple example of graduating from high school and working hard. In this section and the next, let X = 1 if the person graduates from high school and Y = 1 if the person works hard. Suppose you want to know what the probability is that a person will graduate from high school. Well, there are two cases: either the person works hard or the person does not work hard. Summing up the two cases, we obtain
$$\Pr\{X = 1\} = \Pr\{X = 1, Y = 0\} + \Pr\{X = 1, Y = 1\} \tag{A.20}$$
More generally, using our notation for two arbitrary discrete random variables, we have
$$\Pr\{X = j\} = \sum_{k} \Pr\{X = j, Y = k\} = \sum_{k} p_{jk} \tag{A.21}$$
We are just summing over all possible outcomes of the random variable that we are trying to eliminate and we obtain the marginal distribution. We can again extend these ideas to continuous random variables. Suppose we have two random variables X and Y with joint density f(x, y). For example, Y is the starting salary and X is the GPA that a student obtained at a university. Again, we need to replace summation by integration. The marginal density of Y is then defined as
$$f(y) = \int_0^\infty f(x, y)\,dx \tag{A.22}$$
Note that we integrate over the support of X. In the example above, we obtain the marginal income density by integrating the joint density of income and GPA with respect to the GPA.
A.3.6 Conditional Distributions
In many cases, we want to control for certain events in our analysis. For example, we would like to know what the probability of graduating from high school is given that you work hard. Recall that X = 1 if you graduate from high school and X = 0 otherwise. Similarly, let Y = 1 if you work hard and Y = 0 otherwise. A reasonable way to define this conditional probability is to do the following:
$$\Pr\{X = 1 \mid Y = 1\} = \frac{\Pr\{X = 1, Y = 1\}}{\Pr\{Y = 1\}} \tag{A.23}$$
Similarly, define
$$\Pr\{X = 0 \mid Y = 1\} = \frac{\Pr\{X = 0, Y = 1\}}{\Pr\{Y = 1\}} \tag{A.24}$$
Note that the renormalization in the denominator just guarantees that the conditional probabilities sum to 1. More generally, define
$$\Pr\{X = j \mid Y = k\} = \frac{\Pr\{X = j, Y = k\}}{\Pr\{Y = k\}} \tag{A.25}$$
Again, we can extend the concept of conditional probabilities to continuous random variables. The conditional density of Y given X is defined as
$$f(y \mid x) = \frac{f(x, y)}{f(x)} \tag{A.26}$$
In the example considered above, this is the density of income, holding GPA fixed at a given level x.
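For discrete random variables, marginal and conditional probabilities can be read off a joint probability table with a few sums and one division. The following sketch uses hypothetical joint probabilities, with X = 1 for graduating and Y = 1 for working hard. The function names are illustrative and not from the text.

```python
from fractions import Fraction

# Hypothetical joint probabilities Pr{X = j, Y = k}, with
# X = 1 if the person graduates and Y = 1 if the person works hard
joint = {(0, 0): Fraction(2, 10), (0, 1): Fraction(1, 10),
         (1, 0): Fraction(2, 10), (1, 1): Fraction(5, 10)}

def marginal_x(joint, j):
    """Pr{X = j}: sum the joint probabilities over all values of Y (A.21)."""
    return sum(p for (x, _), p in joint.items() if x == j)

def marginal_y(joint, k):
    """Pr{Y = k}: sum the joint probabilities over all values of X."""
    return sum(p for (_, y), p in joint.items() if y == k)

def conditional_x_given_y(joint, j, k):
    """Pr{X = j | Y = k} = Pr{X = j, Y = k} / Pr{Y = k} (A.25)."""
    return joint[(j, k)] / marginal_y(joint, k)

pr_grad = marginal_x(joint, 1)                            # 7/10
pr_grad_given_hard = conditional_x_given_y(joint, 1, 1)   # 5/6
pr_drop_given_hard = conditional_x_given_y(joint, 0, 1)   # 1/6
print(pr_grad, pr_grad_given_hard, pr_drop_given_hard)
```

The two conditional probabilities given Y = 1 add up to 1, which is exactly what the renormalization by Pr{Y = 1} is meant to guarantee.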
A.3.7 Conditional Expectations
Once we have defined conditional distributions, we can then proceed as above and define moments replacing unconditional distributions with conditional distributions. The expected value of Y conditional on X, denoted by E[Y|X], is then given by
$$E[Y \mid X] = \int_{-\infty}^{\infty} y\,f(y \mid x)\,dy \tag{A.27}$$
Conditional expectations of Y given X can be a complicated function of X. To simplify the analysis, we often approximate a conditional expectation by a linear function, i.e., we assume that E[Y|X] is approximately linear in its parameters:
$$E[Y \mid X] = \alpha + \beta X \tag{A.28}$$
where $\alpha$ and $\beta$ are parameters of the model. The main advantage of this simplification is that we can use linear regression techniques to estimate this function, as we discuss in detail below. Suppose we have a discrete random variable D and a continuous random variable Y. Let $f_i(y)$ denote the conditional density of Y given D = i. For example, let $f_1(y)$ denote the income density of university graduates and $f_0(y)$ the density of individuals that do not attend college. We then define the conditional expectation as
$$E[Y \mid D = i] = \int_{-\infty}^{\infty} y\,f_i(y)\,dy \tag{A.29}$$
Hence in our example E[Y|D = 1] denotes the expected income of a university graduate. If we want to analyze whether college increases earnings, it would make sense to compare the two conditional income distributions. However, we need to take into consideration that individuals do not randomly decide to go to college.
A.3.8 Independence
Consider two discrete random variables X and Y. We say that X and Y are independent if
$$\Pr\{X = x_j, Y = y_k\} = \Pr\{X = x_j\}\,\Pr\{Y = y_k\} \tag{A.30}$$
for all j, k. Independence plays an important role in random sampling. For example, if we toss the same coin twice in a row, then the two outcomes are independent of each other. More generally, if we use a computer to draw a sequence of random numbers from the same distribution, then those events are independent. In statistics, we commonly deal with random samples. A random sample can be thought of as a set of objects that are chosen randomly. More formally, it's a sequence of independent, identically distributed random variables.
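Independence of two discrete random variables can be checked mechanically: compute the two marginal distributions and test whether every joint probability equals the product of the corresponding marginals. The sketch below assumes this standard product definition. The tables and function names are hypothetical illustrations.

```python
from fractions import Fraction
from itertools import product

def marginals(joint):
    """Marginal probabilities of X and Y from a joint table {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py

def is_independent(joint):
    """True if every joint probability equals the product of its marginals."""
    px, py = marginals(joint)
    return all(joint.get((x, y), 0) == px[x] * py[y]
               for x, y in product(px, py))

half = Fraction(1, 2)
# Two tosses of a fair coin: all four outcome pairs are equally likely.
two_tosses = {(x, y): half * half for x, y in product([0, 1], repeat=2)}
# Hypothetical dependent pair: hard work makes graduation more likely.
effort_grad = {(0, 0): Fraction(2, 10), (0, 1): Fraction(2, 10),
               (1, 0): Fraction(1, 10), (1, 1): Fraction(5, 10)}
print(is_independent(two_tosses), is_independent(effort_grad))  # True False
```

Exact fractions are used so that the equality test is not affected by floating-point rounding. With estimated probabilities from data, a statistical test would be needed instead of an exact comparison.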
A.3.9 Some Useful Rules
Before we turn our attention to statistics, let me mention some properties of random variables that come in handy. Convince yourself that the following property holds for two random variables X and Y and constants a and b:
$$E[aX + bY] = a\,E[X] + b\,E[Y] \tag{A.31}$$
Moreover,
$$\mathrm{Cov}(aX, bY) = ab\,\mathrm{Cov}(X, Y) \tag{A.32}$$
The next two results in equations (A.33) and (A.34) require independence, i.e., they only hold if X and Y are independent of each other:
$$\mathrm{Cov}(X, Y) = 0 \tag{A.33}$$
i.e., independence implies that the covariance is 0. Moreover, we have
$$V(aX + bY) = a^2\,V(X) + b^2\,V(Y) \tag{A.34}$$
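One way to convince yourself of the rules for means and variances is to verify them exactly on a small example. The sketch below builds the distribution of aX + bY for two independent discrete random variables and compares both sides of the mean and variance rules. The distributions, the constants a and b, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def mean(dist):
    """Expected value of a discrete distribution {outcome: probability}."""
    return sum(p * x for x, p in dist.items())

def var(dist):
    """Variance of a discrete distribution {outcome: probability}."""
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())

def linear_combination(dist_x, dist_y, a, b):
    """Distribution of aX + bY when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        z = a * x + b * y
        out[z] = out.get(z, 0) + px * py   # independence: joint = product
    return out

# Hypothetical distributions: a fair coin and a three-outcome variable
dist_x = {0: Fraction(1, 2), 1: Fraction(1, 2)}
dist_y = {-1: Fraction(1, 4), 0: Fraction(1, 4), 2: Fraction(1, 2)}
a, b = 3, -2
dist_z = linear_combination(dist_x, dist_y, a, b)
lhs_mean, rhs_mean = mean(dist_z), a * mean(dist_x) + b * mean(dist_y)
lhs_var, rhs_var = var(dist_z), a**2 * var(dist_x) + b**2 * var(dist_y)
print(lhs_mean == rhs_mean, lhs_var == rhs_var)  # True True
```

The variance check relies on the independence assumption built into `linear_combination`. If the joint probabilities were not products of the marginals, a covariance term would appear and the second comparison would generally fail.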
A.4 Statistics
A.4.1 Random Sampling, Estimation, and Inference
We can use probability theory to describe events that we observe in the real world. How do we learn about the properties of distributions of random variables? For example, suppose we want to know the distribution of income of college graduates in the United States. We cannot determine the exact shape of this income distribution by introspection. Here is the basic problem. The underlying distribution of this random variable in the overall population of college graduates is unobserved or unknown. How can we make some progress without having to ask the income of every single college graduate in the US? Statistics provides the answer to these types of questions. We don't ask everybody, just a sufficiently large number of college graduates, and we hope that this gives us an approximately correct answer to our broader questions. In statistics, we are typically satisfied with the approximate truth. Let's formalize the basic ideas. Suppose we are interested in learning the value of the mean of a random variable, denoted by E[Y]. We do not know E[Y]. How can we figure out or "estimate" E[Y]? Intuitively, we need to conduct the following experiment. Suppose we generate random draws from the underlying distribution of Y and we observe a bunch of these draws. Let's call this collection of random draws a sample. Then the sample should be informative about the underlying population as long as it is large enough. Consider a random sample of size N. Recall that a random sample consists of N independent draws from the same distribution with density function f(y). Independence in the context of sampling just means that the draws are unrelated. We need to avoid repeatedly drawing from the same part of the distribution. For example, suppose we only sample employees of Goldman Sachs. That's not a random sample of college graduates. These are not independent draws. The sample mean is defined as
$$\bar{Y}_N = \frac{1}{N} \sum_{i=1}^{N} Y_i \tag{A.35}$$
Note that we use the formula for the mean of discrete random variables, where each observation has an equal weight of 1/N. The sample mean is a random variable. Each random sample gives us a different realization of a sample mean. Of course, the key feature is that we can observe realizations of the sample mean. Each observed sample gives us one realization of the sample mean or one estimate of the sample mean. That's a little bit confusing, but then again nobody said statistics is easy. The key question that we need to answer is the following: What is the relationship between the sample mean and the population mean? First, note that the population mean is a number or an unknown parameter of a distribution, while the sample mean is a random variable. I bet that sounds confusing as well, but the distinction is essential! Note that the expectation of the sample mean is equal to the population mean:
$$E[\bar{Y}_N] = E[Y] \tag{A.36}$$
On average, we are right on the money! What about the variance? Using the variance rule in equation (A.34) for independent random variables, convince yourself that the following holds:
$$V(\bar{Y}_N) = \frac{1}{N}\,V(Y) \tag{A.37}$$
The larger the sample size, the smaller the variance of the sample mean. In the limit, as N goes to infinity, the variance goes to 0. The distribution of the sample mean collapses to a point mass around the population mean. This result is known as the law of large numbers (LLN). This theorem states that the sample mean $\bar{Y}_N$ converges in some probabilistic sense to the population mean E[Y] as our sample size gets large (more formally, as N goes to infinity). Broadly speaking, the LLN means that the difference between the sample mean and the population mean is small in large samples, no matter what sample we generate. This is a cool result since it tells us that we will always be close in large samples no matter what sample we use. That's good to know! You can treat this as an identification result. Whenever the LLN applies, we can treat expectations as "known" or "identified," as long as we have access to a sufficiently large random sample. The next step of the analysis then typically tries to approximate the magnitude of the error that we make if we use the sample mean instead of the population mean. We already know from the LLN that the error is relatively small in very large samples. Unfortunately, the sample sizes that we use in the real world are often not that large. How big is the difference between the sample mean and population mean in a reasonable sample? We don't really know how to exactly answer this question. However, we can try to approximate the difference. The central limit theorem (CLT) typically allows us to approximate the distribution of the sample mean around the population mean for large enough samples. Broadly speaking, the CLT provides conditions under which you can approximate the distribution of the sample mean using a normal distribution.
Thus the CLT allows you to approximate confidence intervals for the population mean assuming you can also estimate the standard deviation of that distribution. That is essential if you want to do any type of hypothesis testing or inference. How do you construct confidence intervals and do hypothesis testing? Let's consider the problem of estimating an unknown population mean. Suppose we can appeal to a CLT that tells us that the following approximation is accurate in large samples:
$$\sqrt{N}\,(\bar{Y}_N - \mu_Y) \;\overset{\text{approx.}}{\sim}\; N(0, \sigma_Y^2) \tag{A.38}$$
where $\mu_Y = E[Y_i]$ and $\sigma_Y^2 = V[Y_i]$ for all i = 1, ..., N. Then we can obtain the following 95 percent confidence interval for the unknown parameter $\mu_Y$:
$$\left[\bar{Y}_N - 1.96\,\frac{\sigma_Y}{\sqrt{N}},\;\; \bar{Y}_N + 1.96\,\frac{\sigma_Y}{\sqrt{N}}\right] \tag{A.39}$$
That means that if the data were generated from a distribution with mean $\mu_Y$ and variance $\sigma_Y^2$, there is a 95 percent probability that $\mu_Y$ will be in the random interval
above. Now we only need an estimator for $\sigma_Y^2$ and we are done. Well, the obvious estimator for $\sigma_Y^2$ is $\frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y}_N)^2$. We could keep going, but at this stage a more advanced statistics course would be required. From an applied perspective, most software packages compute standard errors and confidence intervals for you. However, you need to be careful. Just because the computer spits out some numbers does not mean that the underlying assumptions needed to make these approximations make sense.
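The law of large numbers and the confidence interval logic can both be illustrated with simulated data. The sketch below is an illustration under stated assumptions rather than a procedure from the text: it draws from a uniform distribution on [0, 1] (true mean 0.5), uses the usual normal critical value 1.96 for a 95 percent interval, and estimates the variance with the 1/N formula. The sample sizes, seeds, and function names are hypothetical.

```python
import random
from math import sqrt

def sample_mean_and_ci(draws):
    """Sample mean and an approximate 95 percent confidence interval."""
    n = len(draws)
    ybar = sum(draws) / n
    var_hat = sum((y - ybar) ** 2 for y in draws) / n   # 1/N variance estimator
    half_width = 1.96 * sqrt(var_hat / n)               # assumes normal approximation
    return ybar, (ybar - half_width, ybar + half_width)

def coverage(n, reps=2000, seed=1):
    """Share of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    true_mean = 0.5                                     # uniform on [0, 1]
    hits = 0
    for _ in range(reps):
        draws = [rng.random() for _ in range(n)]
        _, (lo, hi) = sample_mean_and_ci(draws)
        hits += lo <= true_mean <= hi
    return hits / reps

# Law of large numbers: estimation error for increasing sample sizes
rng = random.Random(0)
errors = {n: abs(sum(rng.random() for _ in range(n)) / n - 0.5)
          for n in (100, 10_000, 100_000)}
cover = coverage(n=200)
print(errors, cover)
```

The errors tend to shrink as the sample size grows, although not necessarily monotonically in any single run, and the simulated coverage should be close to 0.95. Replacing the uniform draws with a heavily skewed distribution and a small n is one way to see when the normal approximation becomes unreliable.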
A.4.2 Estimating Conditional Expectations
Estimating conditional expectations is a little bit harder than estimating unconditional expectations, but the same basic principles apply. We need a random sample to appeal to a law of large numbers and a central limit theorem to make proper inference, i.e., to construct confidence intervals for the parameters of interest. Consider the simple example in which we want to estimate the mean of income Y conditional on having graduated from high school D = 1. Note that the conditioning variable D is discrete. So we can just generate a random sample from the conditional distribution of Y given D = 1. We just need to find a clever way of randomly picking a bunch of high school graduates and asking them about their incomes. We can then proceed as in the case of estimating the mean of the unconditional income distribution. If the conditioning variable is continuous, then we face the additional problem that E[Y|X] can be a fairly complicated object. How do we proceed in that case? The answer requires two steps. First, we need to sample from the joint distribution of Y and X or from the conditional distribution of Y given X. Second, we need to come up with a strategy to estimate E[Y|X]. If you take more advanced statistics courses, you'll learn that there are more sophisticated methods that allow you to estimate this conditional expectation without imposing any functional form assumptions. For the purposes of this book, you do not need to know how to do that. In the empirical papers we study in this book, researchers feel confident to make simplifying assumptions on the shape of E[Y|X]. Typically, they assume that the function is linear in its parameters. We can then use regression analysis techniques to estimate the conditional expectation.
A.4.3 Linear Regressions
In most econometrics courses, you learn how to estimate conditional expectations using regression analysis. Let's review the essentials. To understand some of the potential problems, let us consider the simple linear model with one regressor, given by
\[Y_n = \alpha + \beta X_n + U_n \tag{A.40}\]
where \(U_n\) is the error term of the regression model. We assume that the conditional expectation of \(Y\) given \(X\) is linear:
\[E[Y_n \mid X_n] = \alpha + \beta X_n \tag{A.41}\]
Another way of saying this is that we assume that
\[E[U_n \mid X_n] = 0 \tag{A.42}\]
The unobserved error term \(U_n\) is uncorrelated with the observed regressor \(X_n\). The problem of estimating the conditional expectation then simplifies to the problem of estimating the parameters \(\alpha\) and \(\beta\). To simplify the math, let us assume that the intercept of the regression model is known to be zero, i.e., \(\alpha = 0\). Note that this makes sense if we have normalized the dependent and independent variables such that they have a mean of zero, \(E[X] = 0 = E[Y]\). Sometimes we can easily impose these normalizations. Of course, none of the results that we discuss here depend on this assumption, and all of them can be extended to the general multivariate regression model discussed in any econometrics textbook. We just like to keep things simple to focus on the key intuition behind the main results. Setting the intercept of the regression equal to 0, we can write the "data-generating process" that creates our random samples as
\[Y_n = \beta X_n + U_n, \qquad n = 1, \ldots, N \tag{A.43}\]
where \(\beta\) is the parameter under which the data or the random samples were generated. We observe one random sample \(\{X_n, Y_n\}_{n=1}^{N}\). Recall that the least squares estimator for \(\beta\) is given by
\[\hat{\beta}_{LS} = \frac{\frac{1}{N}\sum_{n=1}^{N} X_n Y_n}{\frac{1}{N}\sum_{n=1}^{N} X_n^2} \tag{A.44}\]
Note that the numerator is for all practical purposes just the sample covariance between \(X\) and \(Y\). The denominator just rescales the sample covariance using the sample variance of \(X\). Under what conditions does \(\hat{\beta}_{LS}\) provide a good estimator for \(\beta\)? Hopefully, we can appeal to a law of large numbers so that in large samples we should be fine. Let's try that. We can substitute the linear model in equation (A.43) into the definition of the least squares estimator in (A.44) and obtain
\[\hat{\beta}_{LS} = \beta + \frac{\frac{1}{N}\sum_{n=1}^{N} X_n U_n}{\frac{1}{N}\sum_{n=1}^{N} X_n^2} \tag{A.45}\]
From the equation above we can conclude that the least squares estimator is a good estimator for \(\beta\) if and only if the numerator of the second term vanishes as the sample becomes large. We need that
\[\frac{1}{N}\sum_{n=1}^{N} X_n U_n \longrightarrow 0 \tag{A.46}\]
In other words, we need to be able to appeal to a law of large numbers so that the sample average in equation (A.46) goes to zero. This will only be the case when \(X_n\) and \(U_n\) are not systematically related to each other. Independence between \(U_n\) and \(X_n\) is a sufficient condition for that to hold since
\[E[X_n U_n] = E[X_n]\, E[U_n] = 0 \tag{A.47}\]
If that is not the case, for example, if \(U_n\) and \(X_n\) are systematically positively or negatively correlated, we are in trouble and least squares is not a reliable estimator. A simple test is to ask yourself the following question: Can you tell a plausible story why \(X_n\) and \(U_n\) should be correlated? If yes, you may be in trouble. Correlation between \(X_n\) and \(U_n\) will arise if there are important omitted variables that are not included in the regression model but that are correlated with your regressor \(X_n\). For example, consider the problem of regressing wages on education. Education is often measured by years of schooling, which is positively correlated with ability. Note that ability is inherently hard to measure accurately. Since education and ability are positively correlated, the least squares estimator of \(\beta\) will pick up not only the returns to education but also the fact that higher-ability students tend to have higher levels of education. The second effect is problematic: because of it, the ordinary least squares (OLS) estimator tends to overstate the returns to education in this example. Once we have estimated our regression model, we also need to construct estimated standard errors for the parameters. Recall that these standard errors come from the approximation that we can obtain from a CLT.[1]
Here is another useful example. Suppose that \(X_n\) is a categorical variable. Let's assume for simplicity that it can only take on two values: 0 and 1. Consider the original regression model with intercept
\[Y_n = \alpha + \beta X_n + U_n \tag{A.48}\]
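We can think about \(X_n = 0\) as the comparison group and \(X_n = 1\) as the treatment group. Let \(N_0\) denote the sample size of the comparison group and \(N_1\) denote the sample size of the treatment group. In that case, it is straightforward to show that the OLS estimators are given by
\[\hat{\alpha} = \frac{1}{N_0}\sum_{n=1}^{N} 1\{X_n = 0\}\, Y_n \tag{A.49}\]
\[\hat{\beta} = \frac{1}{N_1}\sum_{n=1}^{N} 1\{X_n = 1\}\, Y_n - \frac{1}{N_0}\sum_{n=1}^{N} 1\{X_n = 0\}\, Y_n \tag{A.50}\]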
where \(1\{\cdot\}\) is an indicator function that is equal to 1 if the event in brackets is true and 0 otherwise. The intercept is the mean outcome in the comparison group. The slope parameter is the difference between the mean outcomes in the treatment group and the comparison group. Note that individuals are typically not randomly assigned to the treatment and the comparison group. Think about the example of high school graduates who go and do not go to college. That is not a random decision. If the assignment to the treatment and the comparison group is not random, we cannot give the OLS estimates a causal interpretation. One solution to this problem is to run an experiment. Random assignment then guarantees that \(X_n\) and \(U_n\) are independent of each other. In the example with a discrete treatment, the experiment, if done correctly, yields a control group that has the same observed and unobserved characteristics as the treatment group. In that case, OLS can be used to estimate the treatment effect. We discuss experiments in more detail below.
[1] Computing these standard errors can be a tricky business. If you get a PhD in statistics or economics, you will spend a fair bit of time trying to learn the underlying theory.
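The claim that OLS with a binary regressor reduces to group means can be checked numerically. The sketch below uses made-up data; the function name `ols_with_intercept`, the true coefficients (2.0 and 1.5), and the sample size are assumptions for illustration only.

```python
import random


def ols_with_intercept(x, y):
    """OLS intercept and slope for a regression with a single regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = cov_xy / var_x
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical data: X_n is 0 (comparison group) or 1 (treatment group).
rng = random.Random(1)
x = [rng.randint(0, 1) for _ in range(5_000)]
y = [2.0 + 1.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

alpha_hat, beta_hat = ols_with_intercept(x, y)

# Group means computed directly from the two subsamples.
mean_0 = sum(yi for xi, yi in zip(x, y) if xi == 0) / x.count(0)
mean_1 = sum(yi for xi, yi in zip(x, y) if xi == 1) / x.count(1)

print(f"intercept={alpha_hat:.4f}  comparison mean={mean_0:.4f}")
print(f"slope={beta_hat:.4f}  difference in means={mean_1 - mean_0:.4f}")
```

In this simulation the regressor is assigned at random, so the slope has a causal interpretation; with self-selected groups the same arithmetic holds but the interpretation does not.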
A.4.4 Instrumental Variables
Of course, it is a little bit tricky to randomly assign different levels of education to individuals. Maybe we need a different approach if we want to measure the returns to education. Another approach is to use an instrumental variable (IV), denoted by \(Z_n\). The IV estimator for the regression model without intercept is defined as
\[\hat{\beta}_{IV} = \frac{\frac{1}{N}\sum_{n=1}^{N} Z_n Y_n}{\frac{1}{N}\sum_{n=1}^{N} Z_n X_n} \tag{A.51}\]
Take a moment to compare the LS estimator in (A.44) with the IV estimator in (A.51). Convince yourself that the LS estimator is a special case of the IV estimator in which we use \(X_n\) as an instrument, that is, we set \(Z_n = X_n\). Again, we can substitute the linear model in equation (A.43) into the definition of the IV estimator in (A.51) and obtain
\[\hat{\beta}_{IV} = \beta + \frac{\frac{1}{N}\sum_{n=1}^{N} Z_n U_n}{\frac{1}{N}\sum_{n=1}^{N} Z_n X_n} \tag{A.52}\]
For the IV to work, we need the last term to vanish as the sample gets large. That will be the case if (1) \(Z_n\) and \(U_n\) are not systematically related to each other (the numerator converges to 0), and (2) the denominator of the expression does not converge to 0 (\(Z_n\) predicts \(X_n\)). Note that the first assumption cannot be tested, while the second can be tested by regressing \(X_n\) on \(Z_n\).
What would be a good instrument for education in this context? Researchers have tried different approaches. For example, some individuals live farther away from a college than others. Suppose these distances are due to locational decisions that are not correlated with ability. Then maybe distance to the nearest college can serve as an instrument for having obtained a college education.
A simple example of an IV estimator uses a discrete instrument that can only take on two values, denoted by 0 and 1. In that case, we can show that the IV estimator is given by
\[\hat{\beta}_{Wald} = \frac{\bar{Y}_1 - \bar{Y}_0}{\bar{X}_1 - \bar{X}_0} \tag{A.53}\]
where \(\bar{Y}_1\) (\(\bar{Y}_0\)) is the conditional sample mean of \(Y\) if \(Z_n = 1\) (\(Z_n = 0\)), and \(\bar{X}_1\) (\(\bar{X}_0\)) is the conditional sample mean of \(X\) if \(Z_n = 1\) (\(Z_n = 0\)). This IV estimator is also called a Wald estimator, named after the person who first suggested it. In the schooling example, suppose we use data from two school districts that have similar characteristics. One district has a promise program that incentivizes students to stay in school and go to college; the other district does not. The Wald estimator is given by the differences in mean earnings divided by the differences in mean schooling among the two districts.
We have seen many applications of IV estimators in this book. Some may be more compelling than others. After reading this book, you will understand why IV estimation is popular in empirical economics. IV works when OLS fails, but finding good instruments can be tricky.
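To see how least squares, IV, and the Wald estimator behave when an omitted variable drives both schooling and wages, consider the following simulation sketch. Everything in it is assumed for illustration: the true coefficient `BETA = 1.0`, the way unobserved ability enters both equations, and the binary instrument (think of it as a hypothetical indicator for living near a college).

```python
import random

rng = random.Random(42)
N = 200_000
BETA = 1.0  # assumed true return to schooling in this made-up example

x, y, z = [], [], []
for _ in range(N):
    ability = rng.gauss(0.0, 1.0)        # unobserved by the researcher
    z_n = rng.randint(0, 1)              # binary instrument, independent of ability
    x_n = 0.5 * z_n + 0.8 * ability + rng.gauss(0.0, 1.0)  # schooling
    u_n = ability + rng.gauss(0.0, 1.0)  # error term contains ability
    x.append(x_n)
    z.append(z_n)
    y.append(BETA * x_n + u_n)           # model without intercept


def ratio_estimator(w, x, y):
    """sum(w*y) / sum(w*x): least squares if w = x, IV if w is the instrument."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(wi * xi for wi, xi in zip(w, x))


def group_mean(values, z, group):
    selected = [v for v, zi in zip(values, z) if zi == group]
    return sum(selected) / len(selected)


beta_ls = ratio_estimator(x, x, y)
beta_iv = ratio_estimator(z, x, y)
beta_wald = (group_mean(y, z, 1) - group_mean(y, z, 0)) / (
    group_mean(x, z, 1) - group_mean(x, z, 0)
)

print(f"LS:   {beta_ls:.3f}")
print(f"IV:   {beta_iv:.3f}")
print(f"Wald: {beta_wald:.3f}")
```

Under these assumptions, least squares overstates the return to schooling because ability sits in the error term, while both instrument-based estimators land close to the assumed value of `BETA`.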
A.4.5 Panel Data and Difference-in-Difference Estimation
Panel data can also be useful to deal with omitted variables, especially when we do not have decent instrumental variables. Consider again the problem of estimating a wage function. Recall that we have a really hard time measuring ability. As it turns out, it can be helpful to have access to panel data. Let us assume that we observe the same individual for at least two time periods. Let's add time \(t\) as a subscript to each random variable in the sample. We therefore obtain the following data-generating process:
\[Y_{nt} = \alpha + \beta X_{nt} + \gamma W_{nt} + U_{nt} \tag{A.54}\]
Note that I have added a second regressor \(W_{nt}\). Let us assume that we do not observe \(W_{nt}\). In general, that creates the well-known omitted variable problem in estimation. It will be hard to get a good estimator for \(\beta\) if we do not observe \(W_{nt}\). However, there is one exception. Suppose that \(W_{nt} = W_n\) for at least two time periods, i.e., \(W_{nt}\) does not vary significantly over time. In that case, we can transform the data and obtain
\[Y_{nt+1} - Y_{nt} = \beta (X_{nt+1} - X_{nt}) + U_{nt+1} - U_{nt} \tag{A.55}\]
We say we have "differenced" the data. While we cannot estimate \(\alpha\) or \(\gamma\) this way, we can estimate \(\beta\) with OLS using the data in first differences, as long as \(X_{nt+1} - X_{nt}\) is not constant. This is an example of a difference estimator. We use first differences to eliminate time-invariant unobserved regressors. This estimator is closely related to a fixed effects estimator that uses individual-level dummy variables to account for unobserved time-invariant regressors.
Sometimes it is helpful to difference the data more than once. Suppose now that \(X_{nt+1} = D_{nt+1}\) is a discrete random variable that measures whether an individual obtained a treatment or not. For example, it may indicate whether an individual participated in a job training program. Let us also assume that \(D_{nt} = 0\) for all \(n\), i.e., nobody was treated in the baseline period. The model above can then be written as
\[Y_{nt+1} - Y_{nt} = \beta D_{nt+1} + U_{nt+1} - U_{nt} \tag{A.56}\]
In that case, \(\beta\) can be estimated using a simple difference-in-difference estimator. We estimate \(\beta\) as the difference between the mean of \(Y_{nt+1} - Y_{nt}\) for the subsample of participants (\(D_{nt+1} = 1\)) and the corresponding mean for the nonparticipants (\(D_{nt+1} = 0\)).
Let's consider an example to illustrate how difference and difference-in-difference estimators work. (Note that the numbers in this example are made up and are not based on any real empirical analysis.) Consider the Tax Cuts and Jobs Act that was signed into law in December 2017. It capped the deductions of state and local taxes for married households at $10,000. As a consequence, households that live in cities with high state and local taxes now have to pay higher federal income taxes. We would expect that this tax policy change reduced housing prices in areas with high state and local taxes due to a standard capitalization argument that we discussed in the book. Let \(D_{n,2018} = 1\) if a house is located in a high-tax metropolitan area (such as NY or SF) and \(D_{n,2018} = 0\) if the house is in a low-tax area (such as Dallas or Phoenix). Suppose we observe in high-tax cities that mean housing prices are given by
\[
\begin{aligned}
E[Y_{n,2018} \mid D_{n,2018} = 1] &= 450{,}000 \\
E[Y_{n,2017} \mid D_{n,2018} = 1] &= 440{,}000
\end{aligned}
\tag{A.57}
\]
Hence the average housing price in high-tax cities increased by $10,000 between 2017 and 2018. The simple difference estimator would then conclude that the cap on state and local tax deductions increased housing prices by $10,000. That's not a good estimator since the aggregate economy changed as well. We need to account for housing price changes that have nothing to do with the tax policy change. So we need to difference the data one more time to get a better estimate of the impact of the tax reform. Suppose we observe in low-tax areas that mean housing prices are given by
\[
\begin{aligned}
E[Y_{n,2018} \mid D_{n,2018} = 0] &= 420{,}000 \\
E[Y_{n,2017} \mid D_{n,2018} = 0] &= 400{,}000
\end{aligned}
\tag{A.58}
\]
In that case, the difference-in-difference estimator is equal to
\[\hat{\beta}_{DiD} = (450{,}000 - 440{,}000) - (420{,}000 - 400{,}000) = -10{,}000 \tag{A.59}\]
Hence we would conclude that the tax reform lowered average housing prices by $10,000 in high-tax cities. The logic is simple. The comparison group of houses in low-tax cities tells us that housing prices should have gone up by $20,000 in high-tax areas in the absence of the tax change. If housing prices in high-tax cities only went up by $10,000, then the new tax law must have suppressed housing prices by $10,000 in those cities. Hence a key identifying assumption is that the differences in housing prices in low-tax cities can be used to measure the price change that would have happened in high-tax cities in the absence of the tax reform. Again, these types of difference-in-difference estimators can be made more sophisticated by controlling for time fixed effects and changes in other observables. However, the basic logic applies to all panel data estimators discussed in the applications of this book.
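The arithmetic of this example can be written out as a short sketch. The helper `diff_in_diff` and the dictionary layout are assumptions made for illustration. The low-tax means are entered in the order implied by (A.59) and by the $20,000 increase described above, that is, 400,000 in 2017 and 420,000 in 2018.

```python
# Made-up mean housing prices from the example, keyed by (group, year).
mean_price = {
    ("high_tax", 2017): 440_000,
    ("high_tax", 2018): 450_000,
    ("low_tax", 2017): 400_000,
    ("low_tax", 2018): 420_000,
}


def diff_in_diff(means, treated, control, before, after):
    """Change in the treated group minus change in the comparison group."""
    treated_change = means[(treated, after)] - means[(treated, before)]
    control_change = means[(control, after)] - means[(control, before)]
    return treated_change - control_change


# Simple difference estimator: uses the high-tax cities only.
simple_difference = mean_price[("high_tax", 2018)] - mean_price[("high_tax", 2017)]

# Difference-in-difference estimator: nets out the common price trend.
beta_did = diff_in_diff(mean_price, "high_tax", "low_tax", 2017, 2018)

print("simple difference:", simple_difference)
print("difference-in-difference:", beta_did)
```

The sign flip between the two estimates rests entirely on the assumption that low-tax cities reveal the trend that high-tax cities would have followed without the reform.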
A.5 Causality and Social Experiments
One objective of empirical analysis is to learn about causal relationships. We want to make statements such as "Hiring another police officer reduced crime X percent" or "Graduating from a selective college increases salaries by Y percent." How do we get from correlations to causality? We reexamine these issues in this section and provide a clean statistical definition of causality.
A.5.1 The Potential Outcome Model
Following Neyman (1923) and Fisher (1935), we adopt a standard notation in the program evaluation literature and consider a model with two potential outcomes. Let \(D\) be a dummy variable that is equal to 1 if a person receives a treatment and 0 otherwise. \(Y_1\) denotes the outcome with the treatment. \(Y_0\) denotes the outcome without the treatment. We treat \(D\), \(Y_0\), and \(Y_1\) as random variables. The researcher observes
\[Y = D\, Y_1 + (1 - D)\, Y_0 \tag{A.60}\]
The individual gain from receiving the treatment or participating in the program is defined as
\[\Delta = Y_1 - Y_0 \tag{A.61}\]
Note that for any individual we cannot observe this treatment effect since we never observe both \(Y_1\) and \(Y_0\) at the same time. You will never know what your starting salary would have been if you had not attended college. Hence you will never know whether attending college raised your salary! More generally, we cannot answer the question of whether a specific individual benefits from a program or not.
A.5.2 Average Treatment Effects
Attention shifts from individual treatment effects to average treatment effects. We should be able to answer the question of whether college graduates have higher earnings "on average" than high school graduates. The average treatment effect is defined as
\[ATE = E[\Delta] = E[Y_1 - Y_0] \tag{A.62}\]
The average treatment effect is thus the expected treatment effect for a person that we have randomly drawn from an underlying population. If you had to offer an educated guess, is the ATE positive or negative for attending college? We typically cannot force an individual to participate in a program. Not everybody wants to go to college! If program participation is voluntary, we primarily care about the effectiveness of the program for those who are willing and able to participate in the program. The average treatment effect on the treated can be defined as
\[TT = E[\Delta \mid D = 1] = E[Y_1 - Y_0 \mid D = 1] \tag{A.63}\]
Note that \(D = 1\) means that the individual is willing and eligible to participate in the program. Sometimes we would like to know whether the program is more effective for certain groups of the population. For example, do female students gain more from attending college than male students? Let \(X\) denote some characteristics observed by the researcher, such as race, gender, intelligence, etc. In that case, we care about the average treatment effect conditional on \(X\),
\[ATE(X) = E[Y_1 - Y_0 \mid X] \tag{A.64}\]
or the treatment on the treated conditional on \(X\),
\[TT(X) = E[Y_1 - Y_0 \mid D = 1, X] \tag{A.65}\]
A.5.3 An Example
The two potential outcomes for an individual are given by
\[
\begin{aligned}
Y_0 &= \alpha_0 + U_0 \\
Y_1 &= \alpha_1 + U_1
\end{aligned}
\tag{A.66}
\]
where \(U_1\) and \(U_0\) are random variables. In contrast, \(\alpha_0\) and \(\alpha_1\) are parameters. Let's assume that \(E(U_1) = 0 = E(U_0)\). Let us assume that the parameters satisfy the following restriction:
\[0 < \alpha_0 - \alpha_1 < 1 \tag{A.67}\]
What does this restriction imply? Let \(D\) be an indicator that is equal to 1 if the individual participates in the program and 0 otherwise. Let's assume that individuals maximize income. Each individual decides to participate in the program (\(D = 1\)) if and only if \(Y_1 \geq Y_0\). Hence we can write the decision rule as
\[
\begin{aligned}
Y_1 &\geq Y_0 \\
\alpha_1 + U_1 &\geq \alpha_0 + U_0
\end{aligned}
\tag{A.68}
\]
Hence we have:
\[U_1 - U_0 \geq \alpha_0 - \alpha_1 \tag{A.69}\]
We also say that individuals self-select into the program if program participation is voluntary. From the perspective of the researcher, participation is random since only the individuals, but not the researchers, observe \(U_0\) and \(U_1\). Let's assume that \(U_1 - U_0\) is a uniformly distributed random variable with support \((-1, 1)\). Note that the density of a uniform \((-1, 1)\) random variable is \(1/2\). The participation or selection probability is then given by
\[\Pr\{D = 1\} = \Pr\{U_1 - U_0 \geq \alpha_0 - \alpha_1\} = \int_{\alpha_0 - \alpha_1}^{1} \frac{1}{2}\, du = \frac{1 + \alpha_1 - \alpha_0}{2} \tag{A.70}\]
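The average treatment effect of the program is given by
\[ATE = E[Y_1 - Y_0] = \alpha_1 - \alpha_0 \tag{A.71}\]
Note that by assumption \(ATE < 0\). Is that a crazy assumption for attending college? The average treatment effect of the program on the treated is given by
\[
\begin{aligned}
TT &= E[Y_1 - Y_0 \mid D = 1] \\
&= \alpha_1 - \alpha_0 + E[U_1 - U_0 \mid D = 1] \\
&= \alpha_1 - \alpha_0 + \frac{1 + \alpha_0 - \alpha_1}{2} \\
&= \frac{1}{2} - \frac{\alpha_0 - \alpha_1}{2} > 0
\end{aligned}
\tag{A.72}
\]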
Note that TT > 0. The average treatment effect is negative while the average treatment on the treated is positive. These differences are due to selection based on U1 - Uo: only individuals with sufficiently large realizations of U1 - Uo will participate in the program. The participants of the program differ from the nonparticipants based on these unobserved factors. The selection is based on variables that are not observed by the econometrician. That makes the analysis somewhat complicated.
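A small simulation can make the gap between the ATE and the TT concrete. The sketch below draws \(U_1 - U_0\) from a uniform distribution on \((-1, 1)\), as in the example. The parameter values `ALPHA_0 = 1.0` and `ALPHA_1 = 0.6` are assumptions chosen so that the average gain is negative while some individuals still gain from participating.

```python
import random

rng = random.Random(7)
N = 200_000
ALPHA_0, ALPHA_1 = 1.0, 0.6  # assumed values, so alpha_1 - alpha_0 = -0.4

# Individual gain Y1 - Y0 = (alpha_1 - alpha_0) + (U1 - U0), with U1 - U0 ~ U(-1, 1).
gains = [ALPHA_1 - ALPHA_0 + rng.uniform(-1.0, 1.0) for _ in range(N)]

# Income maximizers participate if and only if the gain is nonnegative.
participant_gains = [g for g in gains if g >= 0.0]

share_hat = len(participant_gains) / N
ate_hat = sum(gains) / N
tt_hat = sum(participant_gains) / len(participant_gains)

# Closed-form values implied by the uniform distribution.
gap = ALPHA_0 - ALPHA_1
share_theory = (1.0 - gap) / 2.0
ate_theory = -gap
tt_theory = 0.5 - gap / 2.0

print(f"participation share: simulated {share_hat:.3f}, theory {share_theory:.3f}")
print(f"ATE: simulated {ate_hat:.3f}, theory {ate_theory:.3f}")
print(f"TT:  simulated {tt_hat:.3f}, theory {tt_theory:.3f}")
```

With these assumed parameters the average gain in the whole population is negative, yet the individuals who choose to participate gain on average, which is exactly the selection pattern described above.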
A.5.4 Selection Bias
Individuals voluntarily choose or self-select into a group that participates in the program or into another group that does not participate in the program. This self-selection process does not imply random assignment of the treatment status. Hence we have to be careful when we evaluate the effectiveness of the program. Can we use a naive approach and compare outcomes for participants and nonparticipants? For example, does a simple OLS estimator that uses data on participants and nonparticipants provide a reliable estimator of the average treatment effect? The answer to both questions is likely to be no, as we argued above. Using the language of the potential outcome model, we can gain some additional insights into the problem of characterizing the selection bias. Suppose we ignore the selection problem. For those who participate in the program, we observe
\[E[Y_1 \mid D = 1] \tag{A.73}\]
For those who do not participate in the program, we observe
\[E[Y_0 \mid D = 0] \tag{A.74}\]
We can estimate both of these conditional means based on samples of participants and nonparticipants. In our example, we would have to get a random sample of students at selective universities and a random sample of students at nonselective universities in the US.
Based on the observed outcomes, we can compute the difference in expected outcomes between participants and nonparticipants:
\[
\begin{aligned}
E[Y_1 \mid D = 1] - E[Y_0 \mid D = 0] &= E[Y_1 \mid D = 1] - E[Y_0 \mid D = 1] \\
&\quad + E[Y_0 \mid D = 1] - E[Y_0 \mid D = 0] \\
&= TT + E[Y_0 \mid D = 1] - E[Y_0 \mid D = 0] \\
&= TT + \text{Bias}
\end{aligned}
\tag{A.75}
\]
The second term measures the extent of the evaluation, or selection, bias. In our example, this term is given by
\[\text{Bias} = E[U_0 \mid D = 1] - E[U_0 \mid D = 0] \neq 0 \tag{A.76}\]
In general, we have no reason to believe that the Bias term above is 0. Individuals who are willing to participate in the program have different outcomes in the baseline case than individuals who are not willing to participate in the program. The comparison between outcomes of participants and nonparticipants does not provide us with a clean estimator of the treatment effect. Is the selection bias term positive or negative? Note that our model implies that Uo is negatively correlated with D. Why?
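The decomposition of the naive comparison into the TT and a bias term can be verified in a simulation where, unlike in real data, both potential outcomes are visible. The sketch below is one possible construction and rests on labeled assumptions: the parameter values, and the choice \(U_0 = -V/2\), \(U_1 = V/2\) with \(V\) uniform on \((-1, 1)\), which is just one way to make \(U_1 - U_0\) uniform with mean-zero shocks.

```python
import random

rng = random.Random(11)
N = 200_000
ALPHA_0, ALPHA_1 = 1.0, 0.6  # assumed parameter values

y1_participants, y0_participants, y0_nonparticipants = [], [], []
for _ in range(N):
    v = rng.uniform(-1.0, 1.0)
    u0, u1 = -v / 2.0, v / 2.0        # assumed construction: U1 - U0 = v
    y0, y1 = ALPHA_0 + u0, ALPHA_1 + u1
    if y1 >= y0:                      # self-selection into the program (D = 1)
        y1_participants.append(y1)
        y0_participants.append(y0)    # counterfactual, never seen in real data
    else:
        y0_nonparticipants.append(y0)


def mean(values):
    return sum(values) / len(values)


naive = mean(y1_participants) - mean(y0_nonparticipants)
tt = mean(y1_participants) - mean(y0_participants)
bias = mean(y0_participants) - mean(y0_nonparticipants)

print(f"naive comparison: {naive:.3f}")
print(f"TT: {tt:.3f}, bias: {bias:.3f}, TT + bias: {tt + bias:.3f}")
```

In this construction participants have lower baseline outcomes than nonparticipants, so the bias term is negative and the naive comparison understates the TT.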
A.5.5 Randomized Experiments
One way to deal with the evaluation bias is to conduct a randomized social experiment. Let \(D = 0\) denote the nonparticipants, who are also called the "comparison" group. Consider the set of individuals who are eligible and willing to participate in the program, i.e., \(D = 1\). Let us randomize the potential participants into a "treatment" group (\(R = 1\)) and a "control" group (\(R = 0\)). For the treatment group, we observe
\[E[Y_1 \mid D = 1, R = 1] = E[Y_1 \mid D = 1] \tag{A.77}\]
For the control group, we observe
\[E[Y_0 \mid D = 1, R = 0] = E[Y_0 \mid D = 1] \tag{A.78}\]
For the comparison group, we observe
\[E[Y_0 \mid D = 0] \tag{A.79}\]
The main difference between an experimental design and a nonexperimental design is that we observe the mean outcome for the control group in the experimental design. Comparing the outcomes between the treatment and the control group, we can estimate the TT:
\[TT = E[Y_1 \mid D = 1, R = 1] - E[Y_0 \mid D = 1, R = 0] \tag{A.80}\]
Comparing the outcomes between the control and the comparison group, we can estimate the evaluation bias:
\[\text{Bias} = E[Y_0 \mid D = 1, R = 0] - E[Y_0 \mid D = 0] \tag{A.81}\]
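A hedged sketch of how such an experiment recovers both quantities from observable group means follows. The data-generating process is an assumption made for illustration (potential outcomes \(Y_0 = 1.0 - V/2\) and \(Y_1 = 0.6 + V/2\) with \(V\) uniform on \((-1, 1)\)), as is the 50/50 randomization among willing participants.

```python
import random

rng = random.Random(23)
N = 200_000
ALPHA_0, ALPHA_1 = 1.0, 0.6  # assumed parameter values

treatment_y1, control_y0, comparison_y0 = [], [], []
for _ in range(N):
    v = rng.uniform(-1.0, 1.0)
    y0 = ALPHA_0 - v / 2.0            # outcome without the program
    y1 = ALPHA_1 + v / 2.0            # outcome with the program
    if y1 >= y0:                      # D = 1: willing to participate
        if rng.random() < 0.5:        # R = 1: randomized into treatment
            treatment_y1.append(y1)
        else:                         # R = 0: randomized into control
            control_y0.append(y0)
    else:                             # D = 0: comparison group
        comparison_y0.append(y0)


def mean(values):
    return sum(values) / len(values)


# Each estimate uses only outcomes a researcher would actually observe.
tt_hat = mean(treatment_y1) - mean(control_y0)
bias_hat = mean(control_y0) - mean(comparison_y0)

print(f"estimated TT: {tt_hat:.3f}")
print(f"estimated evaluation bias: {bias_hat:.3f}")
```

The sketch assumes perfect compliance and no attrition, so none of the complications that arise in actual social experiments appear here.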
Randomization may be violated, due to either poor research design or noncompliance. In contrast to medical experiments, social experiments are not double blind. Individuals know whether they are in the treatment or control group. Contamination bias arises when members of the control group seek alternative forms of treatment. Ethical considerations may lead to opposition to the experiment: some individuals may refuse to participate after they learn whether they are assigned to the treatment or control group. Attrition bias can arise if some members in the treatment or control group drop out before completing the program.
A.6 The Potential Outcome Model and the Regression Model
Let's go back to our regression model. What's the relationship between regression analysis and the potential outcome model? Suppose we assume that
\[
\begin{aligned}
Y_0 &= \alpha + U \\
Y_1 &= \alpha + \gamma + U
\end{aligned}
\tag{A.82}
\]
where \(\alpha\) and \(\gamma\) are parameters. \(U\) is the error term that captures factors not observed by the econometrician, with \(E[U] = 0\). We are assuming that the error term is the same in both outcome equations and that each individual benefits the same from the treatment. We can write the model above as the familiar regression model:
\[Y = D\, Y_1 + (1 - D)\, Y_0 \tag{A.83}\]
\[\phantom{Y} = D\,(\alpha + \gamma + U) + (1 - D)\,(\alpha + U) = \alpha + \gamma D + U \tag{A.84}\]
Hence we have rewritten the potential outcome model as a simple regression model with outcome \(Y\) and regressor \(D\). If \(D\) is uncorrelated with \(U\), which is not necessarily a plausible assumption, we can estimate the parameters of this model using OLS. What does this model imply about the ATE and the TT? Note that the model implies that
\[Y_1 - Y_0 = \gamma \tag{A.85}\]
and hence
\[ATE = \gamma = TT \tag{A.86}\]
There is no difference between the ATE and the TT in the simple regression model. The simple regression model does not seem to be a good model to think about the benefits associated with attending college! How can we fix the simple regression model? Suppose we assume that
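\[
\begin{aligned}
Y_0 &= \alpha + U_0 \\
Y_1 &= \alpha + \gamma + U_1
\end{aligned}
\tag{A.87}
\]
with \(E[U_d] = 0\) for \(d = 0, 1\). The random shocks now differ by treatment. Some students may have very high realizations of \(U_1\), which means that they would benefit a lot from attending college. In that case, the treatment effect for an individual is
\[\Delta = Y_1 - Y_0 = \gamma + U_1 - U_0 \tag{A.88}\]
The individual treatment effect now also depends on \(U_1 - U_0\), i.e., there is heterogeneity in the effectiveness of the program. We can still compute the average treatment effect:
\[ATE = E[Y_1 - Y_0] = \gamma \tag{A.89}\]
The treatment on the treated is given by
\[TT = E[Y_1 - Y_0 \mid D = 1] = \gamma + E[U_1 - U_0 \mid D = 1] \tag{A.90}\]
Note that there is a difference between ATE and TT, if
\[E[U_1 - U_0 \mid D = 1] \neq 0 \tag{A.91}\]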
How can we control for observed heterogeneity in treatment? Suppose we assume that
\[
\begin{aligned}
Y_0 &= \alpha + \beta X + U_0 \\
Y_1 &= \alpha + \beta X + \gamma X + U_1
\end{aligned}
\tag{A.92}
\]
and hence we obtain
\[Y = \alpha + \beta X + \gamma X D + D\, U_1 + (1 - D)\, U_0 \tag{A.93}\]
This model is also called a switching regression model because the indicator \(D\) switches between regression 0 and regression 1. Note that we cannot estimate the parameters of the model using OLS since we have two different error terms and \(D\) may be correlated with \(U_0\) and \(U_1\). The average treatment effect is given by
\[ATE(X) = \gamma X \tag{A.94}\]
and the treatment on the treated is given by
\[TT(X) = \gamma X + E[U_1 - U_0 \mid D = 1, X] \tag{A.95}\]
In this model the ATE and the TT depend on \(X\). In addition, the difference between TT and ATE is due to a selection effect.
A.7 Regression Discontinuity Design
Another popular quasi-experimental estimator is obtained from a regression discontinuity (RD) design. Let \(Z\) denote an observed outcome that determines admission to a program. For example, admission to a gifted program is often determined by an IQ score. The probability of participating in the program is defined as
\[E[D \mid Z] = \Pr\{D = 1 \mid Z\} \tag{A.96}\]
The RD design is then based on the assumption that \(\Pr\{D = 1 \mid Z\}\) is known to be discontinuous at \(Z_0\). We can think of \(Z_0\) as a threshold that determines admission to the program. For example, in gifted education a commonly used threshold level is an IQ score of 130, i.e., students are admitted to the program if and only if they have an IQ of at least 130. The basic idea behind the RD design is then to compare outcomes of individuals who are just below and just above the cutoff value. Let us rewrite the outcome as
\[Y = Y_0 + \Delta D \tag{A.97}\]
Hence we obtain
\[E[Y] = E[Y_0] + E[\Delta D] \tag{A.98}\]
For simplicity let us assume that \(\Delta\) is constant among individuals who receive the treatment:
\[E[Y] = E[Y_0] + \Delta\, E[D] \tag{A.99}\]
This is not necessarily a great assumption in many applications, but it makes the math easier. Let \(\varepsilon > 0\) denote a small positive number. In that case, \(E[Y \mid Z = Z_0 + \varepsilon]\) is the average outcome for individuals who are just above the admission threshold. In contrast, \(E[Y \mid Z = Z_0 - \varepsilon]\) is the average outcome for individuals who are just below the admission threshold. The difference in these outcomes is largely due to the fact that individuals above the threshold are much more likely to participate in the program than individuals below the threshold. In the constant treatment case, we have
\[
\begin{aligned}
E[Y \mid Z = Z_0 + \varepsilon] - E[Y \mid Z = Z_0 - \varepsilon] &= \Delta \left(E[D \mid Z = Z_0 + \varepsilon] - E[D \mid Z = Z_0 - \varepsilon]\right) \\
&\quad + E[Y_0 \mid Z = Z_0 + \varepsilon] - E[Y_0 \mid Z = Z_0 - \varepsilon]
\end{aligned}
\tag{A.100}
\]
If we assume that \(E[Y_0 \mid Z]\) is continuous at \(Z_0\), the second term in the equation above vanishes as \(\varepsilon\) goes to 0. As a consequence, the treatment effect is identified from the following equation:
\[\Delta = \lim_{\varepsilon \to 0} \frac{E[Y \mid Z = Z_0 + \varepsilon] - E[Y \mid Z = Z_0 - \varepsilon]}{E[D \mid Z = Z_0 + \varepsilon] - E[D \mid Z = Z_0 - \varepsilon]} \tag{A.101}\]
Note that the denominator is just the difference in the participation probabilities. Hence we can estimate a local treatment effect by comparing outcomes just above and just below the cutoff while adjusting for the differences in participation probabilities. Note that the numerator does not go to 0 because of the discontinuity of participation at \(Z_0\).
Let us illustrate this estimator using a simple example. Consider the case of a gifted program with a threshold of 130. In that case, \(Z_0 = 130\). Let's set \(\varepsilon = 1\). Hence we compare students who attended the same school district and had IQ scores of 131 with students of the same district who had IQ scores of 129. Suppose we observe that
\[
\begin{aligned}
E[D \mid Z = 131] &= 0.9 \\
E[D \mid Z = 129] &= 0.4
\end{aligned}
\tag{A.102}
\]
Of those students with IQ = 131, 90 percent were in the gifted program; 40 percent of students with IQ = 129 were in the gifted program. The denominator of our estimator is, therefore, given by
\[E[D \mid Z = 131] - E[D \mid Z = 129] = 0.5 \tag{A.103}\]
Suppose we measure \(Y\) as attending a highly selective college or university. Suppose we observe
\[
\begin{aligned}
E[Y \mid Z = 131] &= 0.7 \\
E[Y \mid Z = 129] &= 0.6
\end{aligned}
\tag{A.104}
\]
This tells us that 70 percent of students with IQ = 131 attend a selective university. In contrast, 60 percent of students with IQ = 129 are enrolled at a selective university. The difference is likely due to differences in participation in the gifted program. The numerator of the estimator is then
\[E[Y \mid Z = 131] - E[Y \mid Z = 129] = 0.1 \tag{A.105}\]
Hence our estimator of the local treatment effect is given by
\[\hat{\Delta} = 0.1 / 0.5 = 0.2 \tag{A.106}\]
We conclude that the gifted program increases attendance at selective universities by 20 percentage points for students who have IQ scores near 130.
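The same calculation can be sketched as a tiny function. The name `rd_local_effect` and its argument names are hypothetical; the numbers are the made-up values from the gifted program example.

```python
def rd_local_effect(y_above, y_below, d_above, d_below):
    """Ratio of the jump in mean outcomes to the jump in participation
    rates, both measured just above and just below the cutoff."""
    participation_jump = d_above - d_below
    if participation_jump == 0:
        raise ValueError("participation must be discontinuous at the cutoff")
    return (y_above - y_below) / participation_jump


# Values from the example: IQ scores of 131 (above) and 129 (below).
delta_hat = rd_local_effect(y_above=0.7, y_below=0.6, d_above=0.9, d_below=0.4)
print(f"estimated local treatment effect: {delta_hat:.2f}")
```

The explicit check on the participation jump mirrors the requirement that participation is discontinuous at the threshold; without that jump the ratio is undefined.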
A.8 Discrete Choice Fundamentals
Let's start by reviewing some of the fundamental results regarding discrete choice and random utility models. Consider a general specification of a discrete choice problem with two discrete choices. Let \(x_1, x_2\) denote the two discrete alternatives between which the consumer is choosing. For example, you need to determine whether to live in Philadelphia or New York. If you live in Philadelphia, the size of your apartment will be \(x_1\). If you live in New York, the size of your apartment will be \(x_2\). The prices per unit of housing in Philadelphia and New York are given by \(p_1\) and \(p_2\). The key constraint is that the consumer can only purchase one of the two goods. You cannot live in Philadelphia and New York at the same time. We assume that it does not make any sense to rent apartments in both places at the same time. You are not rich enough. At least, not yet! Hence if \(x_2 > 0\) then \(x_1 = 0\), and if \(x_1 > 0\) then \(x_2 = 0\). Those constraints can be written as
\[x_1 x_2 = 0 \tag{A.107}\]
Either the consumer buys good 1 or she buys good 2 but not both. Let \(c\) denote the numeraire good, with its price normalized to 1, and let \(m\) denote income. Hence the budget constraint is given by
\[p_1 x_1 + p_2 x_2 + c = m \tag{A.108}\]
The consumer's maximization or choice problem can be written as
\[\max_{x_1, x_2, c} U(x_1, x_2, c) \tag{A.109}\]
subject to
\[p_1 x_1 + p_2 x_2 + c = m \tag{A.110}\]
\[x_1 x_2 = 0 \tag{A.111}\]
That's a complicated choice problem. How do we solve this problem using techniques that we learned in intermediate microeconomics? Suppose we condition on \(x_1 = 0\), i.e., we assume that the consumer picks alternative 2. In this case, the choice problem is a standard utility maximization problem:
\[\max_{x_2, c} U(0, x_2, c) \tag{A.112}\]
subject to
\[p_2 x_2 + c = m \tag{A.113}\]
Now you are confronted with a standard optimization problem, which you can handle using tools you know. Let us denote the solution for the demand functions by \(x_2(p_2, m)\) and \(c(p_2, m)\).
Plugging these demand functions back into the utility function \(U(\cdot)\) yields the indirect utility from facing this choice given price \(p_2\) and income \(m\):
\[V_2(p_2, m) = U(0, x_2(p_2, m), c(p_2, m)) \tag{A.114}\]
Doing the same conditioning on \(x_2 = 0\), we obtain an analogous expression:
\[V_1(p_1, m) = U(x_1(p_1, m), 0, c(p_1, m)) \tag{A.115}\]
\(V_1(p_1, m)\) and \(V_2(p_2, m)\) are called conditional indirect utility functions. The unconditional optimum of the original problem is then determined by which of the two "solutions" leads to an overall higher utility:
\[\max_{j \in \{1, 2\}} V_j(p_j, m) \tag{A.116}\]
The problem becomes significantly easier if we assume that \(x_1\) and \(x_2\) can only be 0 or 1. This assumption really says that you cannot choose the amount of the product, or that all products are the same size and you can only buy one. For example, you can only buy one car per year, live in one house, or go on one vacation. In that case, we have
\[
\begin{aligned}
V_1(p_1, m) &= U(1, 0, m - p_1) \\
V_2(p_2, m) &= U(0, 1, m - p_2)
\end{aligned}
\tag{A.117}
\]
Consequently, the consumer will choose alternative 1 if and only if
\[V_1(p_1, m) \geq V_2(p_2, m) \tag{A.118}\]
In most cases, we treat the conditional indirect utility functions as the primitives in our discrete choice models of demand. Recognize that we can derive them from a standard consumer choice problem with multiple goods. By now you may have recognized that the locational choice model we considered in chapter 8 falls into this framework.
In many settings, it is desirable to have a model in which consumers who look identical make different decisions. For example, we often observe two consumers with the same income, yet one lives in New York while the other lives in Philadelphia. The difference in the choices must be due to differences in tastes or preferences, including the extent to which they value amenities like public goods and housing. We can try to model these differences, as we discuss in detail below. Other differences in preferences are purely idiosyncratic. We can capture these idiosyncratic taste differences by adding random shocks to the conditional indirect utility function. Let \(\epsilon_j\) thus denote the random shock associated with the conditional indirect utility of product \(j\). If these shocks are additively separable from the conditional indirect utility function, then we obtain the following specification of the consumer choice problem:
\[\max_{j \in \{1, 2\}} V_j(p_j, m) + \epsilon_j \tag{A.119}\]
Utility functions that arise when we add random preference shocks are called random utility functions. Let's consider a simple example with two products and assume that the conditional indirect utility functions are constant (i.e., they do not depend on income or prices). Here a consumer chooses alternative 1 over alternative 2 if and only if

$$V_1 + \varepsilon_1 > V_2 + \varepsilon_2 \tag{A.120}$$

which we can rewrite as

$$\varepsilon_1 - \varepsilon_2 > V_2 - V_1 \tag{A.121}$$

So the difference in the idiosyncratic random preference shocks $\varepsilon_1 - \varepsilon_2$ must be larger than the difference in the common components $V_2 - V_1$. Suppose we now want to compute the market share of product 1. Let's assume there are a large number of consumers. The market share of product 1 is given by the conditional choice probability:

$$s_1 = \Pr\{\varepsilon_1 - \varepsilon_2 > V_2 - V_1\} \tag{A.122}$$
The conditional choice probability is the probability that an individual will choose product 1. It depends on the distribution of the idiosyncratic preference shocks $\varepsilon_1$ and $\varepsilon_2$. It also depends on the magnitude of the common components $V_1$ and $V_2$. Following McFadden (1974), we assume that the $\varepsilon_j$ are independent type I extreme value errors. The distribution function is given by $F(\varepsilon_j) = \exp[-\exp(-\varepsilon_j)]$. This distribution may look strange, but it is not that different from a normal distribution. It has a unique mode at 0 and a mean of approximately 0.577. In contrast to the normal distribution, it is not perfectly symmetric. With this functional form, the market share of product 1 has a closed-form solution and is given by

$$s_1 = \frac{\exp(V_1)}{\exp(V_1) + \exp(V_2)} \tag{A.123}$$
Hence the market share of product 1 increases in $V_1$ and decreases in $V_2$. However, even when the common utility of product 1 ($V_1$) is much larger than that of product 2 ($V_2$), some people still choose product 2 because of their idiosyncratic preferences. More generally, in a model with K products, the market share of product j is given by

$$s_j = \frac{\exp(V_j)}{\sum_{k=1}^{K} \exp(V_k)} \tag{A.124}$$
This is the familiar logit model.
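The closed-form shares can be checked against a direct simulation of the random utility model. The sketch below is only an illustration: the utilities, the number of simulated consumers, and the seed are arbitrary choices. It draws type I extreme value shocks by inverting $F(\varepsilon) = \exp[-\exp(-\varepsilon)]$, lets each simulated consumer pick the product with the highest total utility, and compares the resulting shares with the logit formula.

```python
import math
import random

def logit_shares(v):
    """Closed-form logit market shares for a list of common utilities."""
    v_max = max(v)  # subtracting the maximum avoids overflow and leaves shares unchanged
    weights = [math.exp(x - v_max) for x in v]
    total = sum(weights)
    return [w / total for w in weights]

def draw_type1_extreme_value(rng):
    """Inverse-CDF draw from F(e) = exp(-exp(-e))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def simulated_shares(v, n_consumers=200_000, seed=0):
    """Share of simulated consumers choosing each product."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(n_consumers):
        utilities = [vj + draw_type1_extreme_value(rng) for vj in v]
        counts[utilities.index(max(utilities))] += 1
    return [c / n_consumers for c in counts]

v = [1.0, 0.0]                 # hypothetical common utilities V_1 and V_2
exact = logit_shares(v)        # exact[0] = e / (e + 1), about 0.731
approx = simulated_shares(v)
print(exact, approx)
```

Even though product 1 has the higher common utility, a bit more than a quarter of the simulated consumers pick product 2, which is the role of the idiosyncratic shocks described above.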
A.9 An Application: Locational Choice Models

Often we would like to jointly study housing and locational choices. Locations can be metropolitan areas, cities, communities, or neighborhoods. We have considered a variety of factors that influence locational choices, including the quality of local public goods and services, the quality and price of housing and other nontradable goods, and the amenity values of a location, such as distance to employment centers or areas of entertainment. The model we considered in chapter 8 can be thought of as a locational choice model since it models not only the demand for housing but also the choice of a community. In this section, we discuss how to apply random utility models developed by McFadden (1978) to study neighborhood choice. These models are attractive since they allow for idiosyncratic shocks in utility functions. These shocks capture heterogeneity in preferences for locations that is not easily linked to observed characteristics of the city. For example, you may have grown up in San Francisco and can't imagine living anywhere other than California. We don't observe that, but your childhood memories may affect your preferences for location as an adult. Alternatively, you may be a lifelong supporter of the Philadelphia Eagles and wish to live in Philly so you can get season tickets to the games. All of us have idiosyncratic preferences that affect where we decide to live.

Let us apply our discrete choice framework to the problem of choosing a city or a neighborhood within a city. In the model, there are a number of different cities indexed by $j = 1, \ldots, J$. Within each city or neighborhood, there is a finite number of different housing types indexed by $k = 1, \ldots, K$. For example, if we have two cities and ten types of houses, then $J = 2$ and $K = 10$. The total number of possible locations you can choose from is $JK = 20$. Let us assume that cities differ by a single variable, denoted by $a_j$. Think about this variable as the amenity value that the city provides. So if you live in San Francisco, you can go surfing, you can go skiing in Tahoe, the weather is pleasant and moderate, and so on. It is not very difficult to allow for more than one amenity, but let's keep things simple for now.
In addition, each city has a bunch of different housing types. Some are nice and some are not that great. Let's denote the quality of the house by $q_k$. Again, we simplify and assume that houses can be characterized by a single index; it is not hard to extend the model to include more housing characteristics. As before, households have idiosyncratic preferences for each possible location choice. These preferences capture factors that we as researchers do not observe, but they are observed by the individual. All idiosyncratic preferences are captured by a random utility shock, denoted by $\varepsilon_{jk}$. Let us denote the (rental) price of house k in city j by $p_{jk}$. For simplicity let us initially assume that income does not depend on the location and is given by m. Then your expenditures on consumption in city j living in house k are given by

$$m - p_{jk} \tag{A.125}$$
Let us assume for simplicity that the conditional indirect utility function for community j and house k is linear and given by

$$V_{jk} = \alpha (m - p_{jk}) + \beta a_j + \gamma q_k \tag{A.126}$$
Each household chooses the neighborhood-housing pair that maximizes utility. Given that there are a finite number of possible choices (JK) for each household, there typically exists a unique city-house combination that maximizes utility.
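Here is a small sketch of how the choice over city-house pairs can be computed once the shocks are type I extreme value, so that the logit formula (A.124) applies to the $JK$ pairs. It assumes a linear conditional indirect utility of the form $\alpha (m - p_{jk}) + \beta a_j + \gamma q_k$; which coefficient attaches to which characteristic, and all numbers below, are assumptions made for illustration. Indices in the code start at 0 rather than 1.

```python
import math

def choice_probabilities(prices, amenities, qualities, m, alpha, beta, gamma):
    """Logit probabilities over all (city j, house k) pairs.

    prices[j][k] is the rent of house k in city j; amenities[j] and
    qualities[k] are the single-index measures a_j and q_k.
    """
    v = [[alpha * (m - prices[j][k]) + beta * amenities[j] + gamma * qualities[k]
          for k in range(len(qualities))]
         for j in range(len(amenities))]
    v_max = max(max(row) for row in v)  # stabilizes the exponentials
    w = [[math.exp(x - v_max) for x in row] for row in v]
    total = sum(sum(row) for row in w)
    return [[x / total for x in row] for row in w]

# Hypothetical example: J = 2 cities and K = 3 housing types.
amenities = [1.0, 0.5]
qualities = [0.0, 1.0, 2.0]
prices = [[1.0, 1.6, 2.4], [0.8, 1.3, 2.0]]
params = dict(m=10.0, alpha=1.0, beta=0.8, gamma=0.6)

base = choice_probabilities(prices, amenities, qualities, **params)

# Raise the rent of one house (city index 0, house index 1) and recompute.
higher = [row[:] for row in prices]
higher[0][1] += 0.5
shifted = choice_probabilities(higher, amenities, qualities, **params)

print(base[0][1], shifted[0][1])
```

Raising one rent lowers the probability of that city-house pair and raises the probability of every other pair, which is the substitution pattern the logit structure implies.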
The share of households with income m that choose city j and house k is given by the conditional choice probability

$$\Pr\{d_{jk} = 1 \mid m\} = \frac{\exp(V_{jk})}{\sum_{l=1}^{J} \sum_{n=1}^{K} \exp(V_{ln})} \tag{A.127}$$

where $V_{jk}$ is the conditional indirect utility from (A.126). This is a natural generalization of equation (A.124). Note that the market shares depend on the parameter values $(\alpha, \beta, \gamma)$. Different models have different values of $(\alpha, \beta, \gamma)$ and will, therefore, give us different market shares. For example, if households are really price sensitive (large value of $\alpha$), then expensive houses tend to have smaller market shares than if households are price insensitive (low value of $\alpha$). We have suppressed this dependence to keep the notation simple, but this property of conditional choice probabilities plays a large role in estimation as discussed below in detail. Holding the parameters of the model fixed, the market share of location jk increases in the own amenity $a_j$ and quality $q_k$ and decreases in $p_{jk}$. Similarly, the market share increases if we increase the prices in other locations. It decreases as amenities in other locations increase.

How do we estimate the parameters of these models? We need a technique that falls outside the standard regression framework that we have used thus far. The standard approach to estimate the discrete choice model is called maximum likelihood estimation. How does this work? Suppose we observe a sample of households indexed by $i = 1, \ldots, N$. For each household i we observe locational choices, denoted by $\{d_{i11}, \ldots, d_{iJK}\}$, where $d_{ijk} = 1$ if household i chooses house k in city j and 0 otherwise, as well as income, denoted by $m_i$. For each house we observe $q_k$. For each city we observe $a_j$. Finally, we observe $p_{jk}$ for each house-city pair. The logarithm of the likelihood function, denoted by L, is then given by

$$L(\alpha, \beta, \gamma) = \sum_{i=1}^{N} \sum_{j=1}^{J} \sum_{k=1}^{K} d_{ijk} \ln\left(\Pr\{d_{ijk} = 1 \mid m_i\}\right) \tag{A.128}$$
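The log-likelihood in (A.128) is easy to evaluate on simulated data. The sketch below assumes the linear specification $\alpha (m - p_{jk}) + \beta a_j + \gamma q_k$ with logit probabilities; in that case $\alpha m_i$ is common to all alternatives and cancels, so income does not enter the code. The data, the "true" parameters, and the use of plain gradient ascent are all choices made for this illustration; in practice one would use a numerical optimizer.

```python
import math
import random

# Hypothetical data: J = 2 cities and K = 2 housing types (0-based indices).
AMENITY = [0.0, 1.0]                # a_j
QUALITY = [0.0, 1.0]                # q_k
PRICE = [[1.0, 2.0], [2.0, 1.0]]    # p_jk
PAIRS = [(j, k) for j in range(2) for k in range(2)]
# Regressors of each pair; alpha * m is omitted because it cancels in the logit ratio.
X = [(-PRICE[j][k], AMENITY[j], QUALITY[k]) for j, k in PAIRS]

def choice_probs(theta):
    """Logit probabilities of the pairs for theta = (alpha, beta, gamma)."""
    v = [sum(t * x for t, x in zip(theta, xs)) for xs in X]
    v_max = max(v)
    w = [math.exp(u - v_max) for u in v]
    total = sum(w)
    return [u / total for u in w]

def log_likelihood(theta, choices):
    """Equation (A.128): d_ijk picks out the log probability of the chosen pair."""
    p = choice_probs(theta)
    return sum(math.log(p[c]) for c in choices)

def maximize(choices, steps=500, rate=1.0):
    """Gradient ascent on the average log-likelihood, which is concave for the logit."""
    n = len(choices)
    observed = [choices.count(c) / n for c in range(len(X))]
    theta = [0.0, 0.0, 0.0]
    for _ in range(steps):
        p = choice_probs(theta)
        grad = [sum((observed[c] - p[c]) * X[c][d] for c in range(len(X)))
                for d in range(3)]
        theta = [t + rate * g for t, g in zip(theta, grad)]
    return theta

true_theta = [1.0, 0.5, 0.8]        # (alpha, beta, gamma), chosen for illustration
rng = random.Random(0)
choices = rng.choices(range(len(X)), weights=choice_probs(true_theta), k=5000)
theta_hat = maximize(choices)
print(theta_hat, log_likelihood(theta_hat, choices))
```

Each entry of `choices` is the index of the pair a simulated household picked. The gradient is the gap between observed and predicted shares weighted by the regressors, so the ascent stops when the model reproduces the average characteristics of the chosen pairs.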
Note that the value of the likelihood function depends on the parameters of the model $(\alpha, \beta, \gamma)$ since the conditional choice probabilities are functions of these parameters as we discussed above. The maximum likelihood estimator is obtained by maximizing the log-likelihood function. Hence we choose the parameters that make the observed outcomes most likely given the model structure. Note that this estimator is just an application of the feasible maximum likelihood estimator for the conditional logit model derived by McFadden (1974).

More recently, urban economists have focused on the role of unobserved housing and neighborhood characteristics. For example, we as econometricians probably do not know which neighborhoods are "hip" or popular, while households that rent or purchase houses in a neighborhood are probably aware of these factors. We can maybe come up with some proxies, but at the end of the day we do not observe all relevant characteristics of a house, a neighborhood, or a city. When our measures of neighborhood or housing quality are incomplete or noisy, we need to be careful in estimation. One way to deal with this problem is to introduce a variable, denoted by $\xi_{jk}$, which captures all unobserved characteristics of the neighborhood-housing pair jk. Adding this variable to the utility function
is fairly straightforward:

$$V_{jk} = \alpha (m - p_{jk}) + \beta a_j + \gamma q_k + \xi_{jk} \tag{A.129}$$

These models are more difficult to estimate because the $\xi_{jk}$ must be treated as an omitted variable. Berry (1994) and Berry, Levinsohn, and Pakes (1995) derive a general method for estimating these types of discrete choice models. The key insight of those papers is that the unobserved product characteristic can be recovered from the observed market share of the product. The intuition is simple. If you observe that a product has a large market share but does not have appealing observed characteristics, then it must have appealing unobserved characteristics, and vice versa. The remaining parameters of the model can be estimated by using an instrumental variable estimator that controls for the correlation between price and unobserved product characteristics. In our example, we would be concerned that housing prices are systematically higher in neighborhoods with attractive unobserved characteristics. If we ignore this correlation in the analysis, we are likely to make some systematic mistakes.

Bayer, McMillan, and Rueben (2004) and Bayer, Ferreira, and McMillan (2007) have applied these techniques to estimate locational choice models using data from the Bay Area. These papers then provide estimates that allow us to determine how households trade off different amenities. Moreover, we can use these models to evaluate potential policy changes. For example, we may want to know how much households are willing to pay for a 10 percent reduction in crime or a 10 percent improvement in public school quality.
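The idea that unobserved characteristics can be backed out of market shares is easiest to see in the plain logit case, where the log of the share ratio between two alternatives equals the difference in their mean utilities. The sketch below illustrates only this inversion step. It assumes that the mean utility of a reference pair is normalized to zero and that the observed part of utility is already known; in an actual application the parameters would be estimated with instruments, which is not shown. All numbers are hypothetical.

```python
import math

def shares_from_mean_utilities(delta):
    """Logit shares implied by mean utilities delta (observed part plus xi)."""
    d_max = max(delta)
    w = [math.exp(d - d_max) for d in delta]
    total = sum(w)
    return [x / total for x in w]

def mean_utilities_from_shares(shares, reference=0):
    """Plain-logit inversion: delta_c - delta_ref = ln(s_c) - ln(s_ref)."""
    log_ref = math.log(shares[reference])
    return [math.log(s) - log_ref for s in shares]

# Hypothetical market with four neighborhood-housing pairs.
observed_part = [0.0, 0.4, 0.7, 1.1]   # observed component of utility, relative to pair 0
xi_true = [0.0, 0.9, -0.3, 0.2]        # unobserved quality, normalized to 0 for pair 0
delta_true = [o + x for o, x in zip(observed_part, xi_true)]
shares = shares_from_mean_utilities(delta_true)

# Step 1: invert the observed shares to recover mean utilities.
delta_hat = mean_utilities_from_shares(shares)
# Step 2: subtract the observed part to recover the unobserved characteristic.
xi_hat = [d - o for d, o in zip(delta_hat, observed_part)]
print(xi_hat)
```

In this example the second pair has modest observed characteristics but a large market share, so the inversion assigns it a large unobserved quality, which matches the intuition described above.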
A.10 Problem Sets

1. Consider the problem of program evaluation. Explain why we typically cannot evaluate the efficiency of the program by comparing outcomes between participants and nonparticipants.
2. A recent labor market study has found that married males have 11 percent higher wages than males who are not married after controlling for observed differences in human capital and experience. The study concludes that marriage increases wages by fostering important skills among males. Use what you have learned in this book to evaluate the validity of this conclusion.
3. Suppose more students wanted to attend a magnet school than the magnet school could accept. You want to identify the effect of attending a magnet school by comparing the outcomes of students who attended the magnet school with those of students who wanted to go but could not. Would a comparison between participants and nonparticipants be acceptable if the magnet school were allowed to design its own admission policy and accepted students on the basis of their characteristics? Would your answer change if students were accepted on the basis of a lottery instead?
4. A recent labor market study found that unionized workers have 11 percent higher wages than workers who are not unionized after controlling for observed differences in human capital and experience. The study concludes that unionization increases wages by fostering collective bargaining. Use what you have learned in this book to evaluate the validity of this conclusion.
5. Explain why you would be concerned about differential attrition rates in an experimental study.
6. Suppose you use nonexperimental data to try to assess whether a training program increases workers' productivity. After you collect your data, you observe that, after controlling for observed heterogeneity among workers, workers who participated in the program had significantly lower earnings in the three months before the training program started than similar workers who did not participate in the program. What can you conclude from these findings? What does that imply for the evaluation of the program?
References
Aaron, H. J. (1975). Who Pays the Property Tax? Washington, DC: Brookings Institution.
Acemoglu, D., and J. Robinson (2012). Why Nations Fail: Origins of Power, Poverty and Prosperity. New York: Crown Publishers (Random House).
Admati, A., and M. Perry (1987). "Strategic Delay in Bargaining." Review of Economic Studies, 54 (3), 345-364.
Ahlfeldt, G., Redding, S., Sturm, D., and N. Wolf (2015). "The Economics of Density: Evidence from the Berlin Wall." Econometrica, 83 (6), 2127-2189.
Albouy, D. (2009). "The Unequal Geographic Burden of Federal Taxation." Journal of Political Economy, 117 (4), 635-667.
Alonso, W. (1964). Location and Land Use: Towards a General Theory of Land Rent. Cambridge, UK: Cambridge University Press.
Alt, J., Bueno de Mesquita, E., and S. Rose (2011). "Disentangling Accountability and Competence in Elections: Evidence from US Term Limits." Journal of Politics, 73 (1), 171-186.
Anas, A., and I. Kim (1996). "General Equilibrium Models of Poly-centric Urban Land Use with Endogenous Congestion and Job Agglomeration." Journal of Urban Economics, 40 (2), 232-256.
Andreoni, J. (1989). "Giving with Impure Altruism: Applications to Charity and Ricardian Equivalence." Journal of Political Economy, 97, 1447-1458.
Andreoni, J. (1990). "Impure Altruism and Donations to Public Goods: A Theory of Warm-Glow Giving." Economic Journal, 100, 464-477.
Angrist, J., Cohodes, S., Dynarski, S., Fullerton, J. B., Kane, T. J., Pathak, P. A., and C. R. Walters (2012). "Student Achievement in Massachusetts Charter Schools." Working Paper, Center for Education Policy Research, Harvard University.
Arcidiacono, P., Sieg, H., and F. Sloan (2007). "Living Rationally under the Volcano? Heavy Drinking and Smoking among the Elderly." International Economic Review, 48 (1), 37-65.
Arrow, K. (1962). "Economic Welfare and the Allocation of Resources for Invention." In: The Rate and Direction of Inventive Activity, 609. Cambridge, MA: National Bureau of Economic Research.
Arzaghi, M., and V. Henderson (2008). "Networking Off Madison Avenue." Review of Economic Studies, 75 (4), 1011-1038.
Au, C., and V. Henderson (2006). "Are Chinese Cities Too Small?" Review of Economic Studies, 73 (3), 549-576.
Auten, J., Sieg, H., and C. Clotfelter (2002). "Charitable Giving, Income and Taxes: An Analysis of Panel Data." American Economic Review, 92 (1), 371-382.
Autor, D., Palmer, C., and P. Pathak (2014). "Housing Market Spillovers: Evidence from the End of Rent Control in Cambridge, Massachusetts." Journal of Political Economy, 122 (3), 661-717.
Barnett, R., and H. Gerken (2010). "Article I, Sec. 8: Federalism and the Overall Scope of Federal Power." National Constitution Center, Philadelphia.
Barr, J., and O. Davis (1966). "An Elementary Political and Economic Theory of Expenditures of State and Local Governments." Southern Economic Journal, 33 (2), 149-165.
Bartel, A., and D. Lewin (1981). "Wages and Unionism in the Public Sector: The Case of Police." Review of Economics and Statistics, 63, 53-59.
Baum-Snow, N. (2007). "Did Highways Cause Suburbanization?" Quarterly Journal of Economics, 122 (2), 775-805.
Baum-Snow, N., and R. Pavan (2012). "Understanding the City Size Wage Gap." Review of Economic Studies, 79 (1), 88-127.
Baum-Snow, N., and R. Pavan (2013). "Inequality and City Size." Review of Economics and Statistics, 95 (5), 1536-1548.
Bayer, P., Ferreira, F., and R. McMillan (2007). "A Unified Framework for Measuring Preferences for Schools and Neighborhoods." Journal of Political Economy, 115 (4), 588-638.
Bayer, P., McMillan, R., Murphy, A., and C. Timmins (2016). "A Dynamic Model of Demand for Houses and Neighborhoods." Econometrica, 84 (3), 893-942.
Bayer, P., McMillan, R., and K. Rueben (2004). "An Equilibrium Model of Sorting in an Urban Housing Market." NBER Working Paper 10865.
Bayer, P., and C. Timmins (2005). "On the Equilibrium Properties of Locational Sorting Models." Journal of Urban Economics, 57 (3), 462-477.
Becker, G. (1957). The Economics of Discrimination. Chicago: University of Chicago Press.
Becker, G. (1964). Human Capital: A Theoretical and Empirical Analysis, with Special Reference to Education. Chicago: University of Chicago Press.
Becker, G., and K. Murphy (1988). "A Theory of Rational Addiction." Journal of Political Economy, 96 (4), 675-700.
Behrman, J., and H. P. Kohler (2014). "Population Quantity, Quality and Mobility." In: Towards a Better Global Economy: Policy Implications for Global Citizens in the 21st Century, F. Allen, J. Behrman, N. Birdsall, S. Fardoust, D. Rodrik, A. Steer, and A. Subramanian (eds.), 138-204. Oxford, UK: Oxford University Press.
Bergstrom, T., Blume, L., and H. Varian (1986). "On the Private Provision of Public Goods." Journal of Public Economics, 29, 25-49.
Bergstrom, T., and R. Goodman (1973). "Private Demand for Public Goods." American Economic Review, 63 (3), 280-296.
Bergstrom, T., Rubinfeld, D., and P. Shapiro (1982). "Micro-based Estimates of Demand Functions for Local School Expenditures." Econometrica, 50 (5), 1183-1205.
Berry, S. (1994). "Estimating Discrete-Choice Models of Product Differentiation." RAND Journal of Economics, 25 (2), 242-262.
Berry, S., Levinsohn, J., and A. Pakes (1995). "Automobile Prices in Market Equilibrium." Econometrica, 63 (4), 841-890.
Bertrand, M., Djankov, S., Hanna, R., and S. Mullainathan (2007). "Obtaining a Driving License in India: An Experimental Approach to Studying Corruption." Quarterly Journal of Economics, 122, 1639-1676.
Besley, T., and A. Case (1995a). "Incumbent Behavior: Vote-Seeking, Tax Setting, and Yardstick Competition." American Economic Review, 85, 25-45.
Besley, T., and A. Case (1995b). "Does Electoral Accountability Affect Policy Choices? Evidence from Gubernatorial Limits." Quarterly Journal of Economics, 110 (3), 769-798.
Besley, T., and S. Coate (1992). "Workfare versus Welfare: Incentive Arguments for Work Requirements in Poverty-Alleviation Programs." American Economic Review, 82 (1), 249-261.
Besley, T., and S. Coate (1997). "An Economic Model of Representative Democracy." Quarterly Journal of Economics, 112 (1), 85-114.
Besley, T., and S. Coate (2003). "Centralized versus Decentralized Provision of Local Public Goods: A Political Economy Approach." Journal of Public Economics, 87, 2611-2637.
Bissinger, B. (1997). A Prayer for the City. New York: Vintage Books.
Black, D. (1948). "On the Rationale of Group Decision-Making." Journal of Political Economy, 56, 23-34.
Black, S. (1999). "Do Better Schools Matter? Parental Valuation of Elementary Education." Quarterly Journal of Economics, 114 (2), 577-599.
Bleakley, H., and J. Lin (2012). "Portage and Path Dependence." Quarterly Journal of Economics, 127 (2), 587-644.
Boozer, M., and C. Rouse (2001). "Intra-school Variation in Class Size: Patterns and Implications." Journal of Urban Economics, 50, 163-189.
Bretschneider, S., Gorr, W., Grizzle, G., and E. Klay (1989). "Political and Organizational Influences on the Accuracy of Forecasting State Government Revenues." International Journal of Forecasting, 5, 307-319.
Brennan, G., and J. Buchanan (1980). The Power to Tax. New York: Cambridge University Press.
Brinkman, J. (2016). "Congestion, Agglomeration and the Structure of Cities." Journal of Urban Economics, 94, 13-31.
Brinkman, J., Coen-Pirani, D., and H. Sieg (2015). "Firm Dynamics in an Urban Economy." International Economic Review, 56 (4), 1135-1164.
Brinkman, J., Coen-Pirani, D., and H. Sieg (2018). "The Political Economy of Municipal Pension Funding." American Economic Journal: Macroeconomics, 10 (3), 215-246.
Browning, M., Chiappori, P., and Y. Weiss (2014). Family Economics. Cambridge, UK: Cambridge University Press.
Brueckner, J. (2011). Lectures on Urban Economics. Cambridge, MA: MIT Press.
Brueckner, J., and S. Rosenthal (2009). "Gentrification and Neighborhood Housing Cycles: Will America's Future Downtowns Be Rich?" Review of Economics and Statistics, 91 (4), 725-743.
Brueckner, J., and L. Saavedra (2001). "Do Local Governments Engage in Strategic Property-Tax Competition?" National Tax Journal, 54 (2), 203-229.
Brueckner, J., Thisse, J., and Y. Zenou (1999). "Why Is Central Paris Rich and Downtown Detroit Poor? An Amenity-Based Theory." European Economic Review, 43 (1), 91-107.
Brunner, E., and S. Ross (2010). "Is the Median Voter Decisive? Evidence from Referenda Voting Patterns." Journal of Public Economics, 94, 898-910.
Bursztyn, L., Cantoni, D., Funk, P., and N. Yuchtman (2018). "Polls, the Press, and Political Participation: The Effects of Anticipated Election Closeness on Voter Turnout." NBER Working Paper 23490.
Calabrese, S., Cassidy, G., and D. Epple (2002). "Local Government Fiscal Structure and Metropolitan Consolidation." Brookings-Wharton Papers on Urban Affairs, (1), 1-43.
Calabrese, S., Epple, D., and R. Romano (2012). "Inefficiencies from Metropolitan Political and Fiscal Decentralization: Failures of Tiebout Competition." Review of Economic Studies, 79 (3), 1081-1111.
Calabrese, S., Epple, D., Romer, T., and H. Sieg (2006). "Local Public Good Provision: Peer Effects, Voting, and Mobility." Journal of Public Economics, 90 (6-7), 959-981.
Card, D. (1999). "The Causal Effect of Education on Wages." In: Handbook of Labor Economics, vol. 3, O. Ashenfelter and D. Card (eds.). Amsterdam: Elsevier.
Cestau, D., Epple, D., and H. Sieg (2017). "Admitting Students to Selective Education Programs: Merit, Profiling, and Affirmative Action." Journal of Political Economy, 125 (3), 761-797.
Cestau, D., Green, R., and N. Schürhoff (2013). "Tax-Subsidized Underpricing: The Market for Build America Bonds." Journal of Monetary Economics, 60, 593-608.
Chalfin, A., and J. McCrary (2017). "Criminal Deterrence: A Review of the Literature." Journal of Economic Literature, 55 (1), 5-48.
Chatterji, A., Glaeser, E., and W. Kerr (2014). "Clusters of Entrepreneurship and Innovation." In: Innovation Policy and the Economy, vol. 14, J. Lerner and S. Stern (eds.). Chicago: University of Chicago Press.
Chay, K., and M. Greenstone (2005). "Does Air Quality Matter? Evidence from the Housing Market." Journal of Political Economy, 113 (2), 376-424.
Chetty, R., Hendren, N., and L. Katz (2016). "The Effects of Exposure to Better Neighborhoods on Children: New Evidence from the Moving to Opportunity Experiment." American Economic Review, 106 (4), 855-902.
Chirico, M., Inman, R., Loeffler, C., MacDonald, J., and H. Sieg (2016). "An Experimental Evaluation of Notification Strategies to Increase Property Tax Compliance: Free-Riding in the City of Brotherly Love." Tax Policy and the Economy, 30, 129-161.
Chirico, M., Inman, R., Loeffler, C., MacDonald, J., and H. Sieg (2019). "Deterring Property Tax Delinquency in Philadelphia." National Tax Journal, 72 (3), 479-506.
Christaller, W. (1933). Die zentralen Orte in Süddeutschland. Jena, Germany: Gustav Fischer.
Ciccone, A., and R. Hall (1996). "Productivity and the Density of Economic Activity." American Economic Review, 86 (1), 54-70.
Clark, M., Chiang, H., Silva, T., McConnell, S., Sonnenfeld, K., Erbe, A., and M. Puma (2013). The Effectiveness of Secondary Math Teachers from Teach For America and the Teaching Fellows Programs (NCEE 2013-4015). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, US Department of Education.
Clarke, E. (1971). "Multipart Pricing of Public Goods." Public Choice, 11 (1), 17-33.
Coase, R. (1960). "The Problem of Social Cost." Journal of Law and Economics, 3 (1), 1-44.
Coate, S., and M. Conlin (2004). "A Group Rule-Utilitarian Approach to Voter Turnout: Theory and Evidence." American Economic Review, 94, 1476-1504.
Coate, S., and B. Knight (2011). "Government Form and Public Spending: Theory and Evidence from US Municipalities." American Economic Journal: Economic Policy, 3 (3), 82-112.
Coen-Pirani, D., and H. Sieg (2019). "The Impact of the Tax Cut and Jobs Act on the Spatial Distribution of High Productivity Households." Journal of Monetary Economics, 105, 44-71.
Combes, P., Duranton, G., Gobillon, L., Puga, D., and S. Roux (2012). "The Productivity Advantages of Large Cities: Distinguishing Agglomeration from Firm Selection." Econometrica, 80 (6), 2543-2594.
Cook, P. (1986). "The Demand and Supply of Criminal Opportunities." Crime and Justice, 7, 1-27.
Corak, M. (2013). "Income Inequality, Equality of Opportunity, and Intergenerational Mobility." Journal of Economic Perspectives, 27 (3), 79-102.
Corman, H., and H. Naci Mocan (2000). "A Time-Series Analysis of Crime, Deterrence, and Drug Abuse in New York City." American Economic Review, 90 (3), 584-604.
Crane, C., and M. Manville (2008). "People or Place. Revisiting the Who versus the Where of Urban Development." Lincoln Institute: Land Lines, July, 1-7.
Cullen, J., Jacob, B., and S. Levitt (2006). "The Effect of School Choice on Participants: Evidence from Randomized Lotteries." Econometrica, 74 (5), 1191-1230.
Cunha, F., and J. Heckman (2007). "The Technology of Skill Formation." American Economic Review, 97 (2), 31-47.
Cutler, D., and E. Glaeser (1997). "Are Ghettos Good or Bad?" Quarterly Journal of Economics, 112, 827-872.
Cutler, D., Glaeser, E., and J. Vigdor (1999). "The Rise and Decline of the American Ghetto." Journal of Political Economy, 107 (3), 455-506.
Dale, S., and A. Krueger (2011). "Estimating the Returns to College Selectivity over the Career Using Administrative Data." NBER Working Paper 17159.
Davis, B., Engberg, J., Epple, D., Sieg, H., and R. Zimmer (2013). "Bounding the Retention Effects of a Gifted Program Using a Modified Regression Discontinuity Design." Annals of Economics and Statistics, 111-112, 10-34.
Davis, J. C., and J. V. Henderson (2008). "The Agglomeration of Headquarters." Regional Science and Urban Economics, 38 (5), 445-460.
Davis, M., and S. van Nieuwerburgh (2015). "Housing, Finance and the Macroeconomy." In: Handbook of Regional and Urban Economics, vol. 5A, G. Duranton, V. Henderson, and W. Strange (eds.), 753-811. Amsterdam: Elsevier.
De La Roca, J., and D. Puga (2017). "Learning by Working in Big Cities." Review of Economic Studies, 84, 106-114.
Deacon, R., and J. Sonstelie (1985). "Rationing by Waiting and the Value of Time: Results from a Natural Experiment." Journal of Political Economy, 93 (4), 627-647.
Depken, C., and C. Lafountain (2006). "Fiscal Consequences of Public Corruption: Empirical Evidence from State Bond Ratings." Public Choice, 126 (1-2), 75-85.
Desmet, K., and E. Rossi-Hansberg (2013). "Urban Accounting and Welfare." American Economic Review, 103 (6), 2296-2327.
Diamond, R. (2016). "The Determinants and Welfare Implications of US Workers' Diverging Location Choices by Skill: 1980-2000." American Economic Review, 106 (3), 479-524.
Diermeier, D., Eraslan, H., and A. Merlo (2003). "A Structural Model of Government Formation." Econometrica, 71 (1), 27-70.
Dockery, D., Pope, A., Xu, X., Spengler, J., Ware, J., Fay, M., Ferris, B., and F. Speizer (1993). "An Association between Air Pollution and Mortality in Six US Cities." New England Journal of Medicine, 329 (24), 1753-1759.
Donohue, J., and S. Levitt (2001). "The Impact of Legalized Abortion on Crime." Quarterly Journal of Economics, 116 (2), 379-420.
Downs, A. (1957). "An Economic Theory of Political Action in a Democracy." Journal of Political Economy, 65, 135-150.
Duranton, G., and H. Overman (2005). "Testing for Localization Using Micro-geographic Data." Review of Economic Studies, 72 (4), 1077-1106.
Duranton, G., and D. Puga (2001). "Nursery Cities: Urban Diversity, Process Innovation, and the Life Cycle of Products." American Economic Review, 91 (5), 1454-1477.
Duranton, G., and D. Puga (2004). "Micro-foundations of Urban Agglomeration Economies." In: Handbook of Regional and Urban Economics, vol. 4, J. Henderson and J. Thisse (eds.), 2063-2117. Amsterdam: Elsevier.
Duranton, G., and M. Turner (2011). "The Fundamental Law of Road Congestion: Evidence from US Cities." American Economic Review, 101 (6), 2616-2652.
Duranton, G., and M. Turner (2012). "Urban Growth and Transportation." Review of Economic Studies, 79 (4), 1407-1440.
Eeckhout, J., Pinheiro, R., and K. Schmidheiny (2014). "Spatial Sorting." Journal of Political Economy, 122 (3), 554-620.
Ellison, G., and E. Glaeser (1997). "Geographic Concentration in US Manufacturing Industries: A Dartboard Approach." Journal of Political Economy, 105 (5), 889-927.
Ellison, G., Glaeser, E., and W. Kerr (2010). "What Causes Industry Agglomeration? Evidence from Coagglomeration Patterns." American Economic Review, 100 (3), 1195-1213.
Engberg, J., Epple, D., Imbrogno, J., Sieg, H., and R. Zimmer (2014). "Evaluating Education Programs That Have Lotteried Admission and Selective Attrition." Journal of Labor Economics, 32 (1), 27-63.
Engberg, J., Gill, B., Zamarro, G., and R. Zimmer (2012). "Closing Schools in a Shrinking District: Do Student Outcomes Depend on Which Schools Are Closed?" Journal of Urban Economics, 71, 189-203.
Enikolopov, R. (2014). "Politicians, Bureaucrats and Targeted Redistribution." Journal of Public Economics, 120 (1), 74-83.
Epple, D. (1987). "Hedonic Prices and Implicit Markets: Estimating Demand and Supply Functions for Differentiated Products." Journal of Political Economy, 95 (1), 59-80.
Epple, D., Filimon, R., and T. Romer (1984). "Equilibrium among Local Jurisdictions: Toward an Integrated Treatment of Voting and Residential Choice." Journal of Public Economics, 24, 281-308.
Epple, D., Gordon, B., and H. Sieg (2010). "A New Approach to Estimating the Production Function for Housing." American Economic Review, 100 (3), 905-924.
Epple, D., Jha, A., and H. Sieg (2018). "The Superintendent's Dilemma: Managing School District Capacity as Parents Vote with Their Feet." Quantitative Economics, 9 (1), 483-520.
Epple, D., and G. Platt (1998). "Equilibrium and Local Redistribution in an Urban Economy When Households Differ by Preferences and Income." Journal of Urban Economics, 43 (1), 23-51.
Epple, D., Quintero, L., and H. Sieg (2020). "A New Approach to Estimating Hedonic Equilibrium Models for Metropolitan Housing Markets." Journal of Political Economy, 128 (3), 948-83.
Epple, D., and R. Romano (1994). "Public Provision of Private Goods." Journal of Political Economy, 104 (1), 57-84.
Epple, D., and R. Romano (1998). "Competition between Private and Public Schools, Vouchers, and Peer-Group Effects." American Economic Review, 88 (1), 33-62.
Epple, D., and R. Romano (2004). "Ends against the Middle: Determining Public Service Provision When There Are Private Alternatives." Journal of Public Economics, 62 (3), 297-325.
Epple, D., Romano, R., and H. Sieg (2012). "The Intergenerational Conflict over the Provision of Local Education." Journal of Public Economics, 96 (3-4), 255-268.
Epple, D., Romano, R., and M. Urquiola (2017). "School Vouchers: A Survey of the Economics Literature." Journal of Economic Literature, 55 (2), 441-492.
Epple, D., and T. Romer (1991). "Mobility and Redistribution." Journal of Political Economy, 99, 828-858.
Epple, D., Romer, T., and H. Sieg (2001). "Inter-jurisdictional Sorting and Majority Rule: An Empirical Analysis." Econometrica, 69, 1437-1465.
Epple, D., and H. Sieg (1999). "Estimating Equilibrium Models of Local Jurisdictions." Journal of Political Economy, 107, 645-681.
Evans, W., and E. Owens (2007). "COPS and Crime." Journal of Public Economics, 91 (1-2), 181-201.
Fan, F., Hines, J., and J. Horowitz (2016). "Are PILOTs Property Taxes for Nonprofits?" Journal of Urban Economics, 94, 109-123.
Farrell, M., Rich, S., Turner, L., Seith, D., and D. Bloom (2008). "Welfare Time Limits: An Update on State Policies, Implementation, and Effects on Families." MDRC Report.
Fernandez, R., and R. Rogerson (1998). "Public Education and Income Distribution: A Dynamic Quantitative Evaluation of Education-Finance Reform." American Economic Review, 88 (4), 813-833.
Ferraz, C., and F. Finan (2011). "Electoral Accountability and Corruption: Evidence from the Audits of Local Governments." American Economic Review, 101 (4), 1274-1311.
Ferreira, F. (2010). "You Can Take It With You: Proposition 13 Tax Benefits, Residential Mobility, and Willingness to Pay for Housing Amenities." Journal of Public Economics, 94, 661-673.
Ferreira, F., and J. Gyourko (2009). "Do Political Parties Matter? Evidence from US Cities." Quarterly Journal of Economics, 124, 399-422.
Fisher, R. (1935). Design of Experiments. New York: Hafner.
Florida, R. (2002). The Rise of the Creative Class: And How It's Transforming Work, Leisure, Community and Everyday Life. New York: Basic Books.
Fox, L., Wimer, C., Garfinkel, I., Kaushal, N., and J. Waldfogel (2015). "Waging War on Poverty: Poverty Trends Using a Historical Supplemental Poverty Measure." Journal of Policy Analysis and Management, 34 (3), 567-592.
Freeman, S., Grogger, J., and J. Sonstelie (1996). "The Spatial Concentration of Crime." Journal of Urban Economics, 40 (2), 216-231.
Friedman, M. (1955). "The Role of Government in Education." In: Economics and the Public Interest, R. Solo (ed.), 123-144. New Brunswick, NJ: Rutgers University Press. Fryer, R. (2014). "Injecting Charter School Best Practices into Traditional Public Schools." Quarterly Journal of Economics, 129 (3), 1355-1407. Fujita, M. (1988). "A Monopolistic Competition Model of Spatial Agglomeration: Differentiated Product Approach." Regional Science and Urban Economics, 18 (1), 87-124. Fujita, M., and P. Krugman (1995). "When Is the Economy Mono-centric? Von Thunen and Chamberlin Unified." Regional Science and Urban Economics, 25 (4), 505-528. Fujita, M., Krugman, P., and A. J. Venables (1999). The Spatial Economy: Cities, Regions and International Trade. Cambridge, MA: MIT Press. Fujita, M., and J. Thisse (2002). Economics of Agglomeration. Cities, Industrial Location, and Regional Growth. Cambridge, UK: Cambridge University Press. Gais, T., and S. Urahn (2011). "States' Revenue Estimating: Cracks in the Crystal Ball." Pew Charitable Trust and Rockefeller Institute. Research Report. Garcia, J., Heckman, J., Leaf, D., and M. Prados (2016). "The Life-Cycle Benefits of an Influential Early Childhood Program." NBER Working Paper 22993. Garcia-Jimeno, C. (2016). "The Political Economy of Moral Conflict: An Empirical Study of Learning and Law Enforcement under Prohibition." Econometrica, 84 (2), 511-570. Garreau, J. (1991). Edge City: Life on the New Frontier. New York: Anchor Books. George, H. (1881). Progress and Poverty (reprint 1932). London: Henry George Foundation. Geyer, J. (2017). "Housing Demand and Neighborhood Choice with Housing Vouchers." Journal of Urban Economics, 99 (C), 48-61. Geyer, J., and H. Sieg (2013). "Estimating a Model of Excess Demand for Public Housing." Quantitative Economics, 4 (3), 483-513. Glaeser, E. (2008). Cities, Agglomeration and Spatial Equilibrium. Oxford, UK: Oxford University Press. Glaeser, E. (2011). Triumph of the City.
New York: Penguin Press. Glaeser, E., and J. Gyourko (2005). "Urban Decline and Durable Housing." Journal of Political Economy, 113 (2), 345-375. Glaeser, E., and M. Kahn (2010). "The Greenness of Cities: Carbon Dioxide Emissions and Urban Development." Journal of Urban Economics, 67 (3), 404-418. Glaeser, E., Kallal, H., Scheinkman, J., and A. Shleifer (1992). "Growth in Cities." Journal of Political Economy, 100 (6), 1126-1152. Glaeser, E., and J. Kohlhase (2004). "Cities, Regions and the Decline of Transport Costs." In: Fifty Years of Regional Science. Advances in Spatial Science, R. Florax and D. Plane (eds.), 197-228. Berlin, Heidelberg: Springer. Glaeser, E., Kolko, J., and A. Saiz (2001). "Consumer City." Journal of Economic Geography, 1 (1), 27-50. Glaeser, E., and E. Luttmer (2003). "The Misallocation of Housing under Rent Control." American Economic Review, 93 (4), 1027-1046. Glaeser, E., and D. Mare (2001). "Cities and Skills." Journal of Labor Economics, 19 (2), 316-342. Glaeser, E., Scheinkman, J., and B. Sacerdote (1996). "Crime and Social Interactions." Quarterly Journal of Economics, 111, 507-548. Goldstein, G., and M. Pauly (1981). "Tiebout Bias and the Demand for Local Public Goods." Journal of Public Economics, 38 (3), 319-343. Green, R. (1993). "A Simple Model of the Taxable and Tax-Exempt Yield Curves." Review of Financial Studies, 6, 233-264. Green, R. (2007). "Issuers, Underwriter Syndicates, and Aftermarket Transparency." Journal of Finance, 62, 1529-1550.
Green, R., Hollifield, B., and N. Schürhoff (2007). "Dealer Intermediation and Price Behavior in the Aftermarket for New Bond Issues." Journal of Financial Economics, 86, 643-682.
Greenstone, M., Hornbeck, R., and E. Moretti (2010). "Identifying Agglomeration Spillovers: Evidence from Winners and Losers of Large Plant Openings." Journal of Political Economy, 118 (3), 536-598. Grogger, J. (1996). "Does School Quality Explain the Recent Black/White Wage Trend?" Journal of Labor Economics, 14 (2), 231-253. Grogger, J. (2002). "Behavioral Effect of TANF Time Limits." American Economic Review, 92 (2), 385-389.
Groves, T. (1973). "Incentives in Teams." Econometrica, 41 (4), 617-631. Gruber, J., and B. Koszegi (2001). "Is Addiction Rational: Theory and Evidence." Quarterly Journal of Economics, 116 (4), 1261-1304. Gylfason, T. (2001). "Natural Resources, Education, and Economic Development." European Economic Review, 45 (4-6), 847-859. Hamilton, B. (1975). "Zoning and Property Taxation in a System of Local Governments." Urban Studies, 12 (2), 205-211. Hamilton, B. (1976). "Capitalization of Intrajurisdictional Differences in Local Tax Prices." American Economic Review, 66 (5), 743-753. Harbaugh, W. (1998). "What Do Donations Buy? A Model of Philanthropy Based on Prestige and Warm Glow." Journal of Public Economics, 67, 269-284. Haughwout, A., Inman, R., Craig, S., and T. Luce (2004). "Local Revenue Hills: Evidence from Four US Cities." Review of Economics and Statistics, 86 (2), 570-585. Haughwout, A., Orr, J., and D. Bedoll (2008). "The Price of Land in the New York Metropolitan Area." Current Issues in Economics and Finance, 14 (3), 1-17. Federal Reserve Bank of New York. Heckman, J., Moon, S., Pinto, R., Savelyev, P., and A. Yavitz (2010). "The Rate of Return to the High-Scope Perry Preschool Program." Journal of Public Economics, 94 (1-2), 114-128.
Heckman, J., Stixrud, J., and S. Urzua (2006). "The Effects of Cognitive and Noncognitive Abilities on Labor Market Outcomes and Social Behavior." Journal of Labor Economics, 24 (3), 411-482.
Henderson, J. (1974). "Optimum City Size: The External Diseconomy Question." Journal of Political Economy, 82 (2), 373-388. Henderson, J. (2002). "Urbanization in Developing Countries." World Bank Research Observer, 17 (1), 89-112. Henderson, J. (2003). "Marshall's Scale Economies." Journal of Urban Economics, 53 (1), 1-28. Henderson, J., and Y. Ioannides (1983). "A Model of Housing Tenure Choice." American Economic Review, 73 (1), 98-113. Henderson, J., Kuncoro, A., and M. Turner (1995). "Industrial Development in Cities." Journal of Political Economy, 103, 1067-1090. Henderson, J., Squires, T., Storeygard, A., and D. Weil (2018). "The Global Distribution of Economic Activity: Nature, History, and the Role of Trade." Quarterly Journal of Economics, 133 (1), 357-406. Herrnstein, R., and C. Murray (1996). The Bell Curve: Intelligence and Class Structure in American Life. New York: Simon and Schuster. Hindriks, J., and G. Myles (2006). Intermediate Public Economics. Cambridge, MA: MIT Press. Holmes, T. (1998). "The Effect of State Policies on the Location of Manufacturing: Evidence from State Borders." Journal of Political Economy, 106 (4), 667-705. Holmes, T. (1999). "Localization of Industry and Vertical Disintegration." Review of Economics and Statistics, 81 (2), 314-325. Holmes, T. (2011). "Diffusion of Wal-Mart and Economies of Density." Econometrica, 79 (1), 253-301.
Holmes, T., and H. Sieg (2015). "Structural Estimation in Urban Economics." In: Handbook of Regional and Urban Economics, vol. 5A, G. Duranton, V. Henderson, and W. Strange (eds.), 69-114. Amsterdam: Elsevier. Holmes, T., and J. Stevens (2002). "Geographic Concentration and Establishment Scale."
Review of Economics and Statistics, 84 (4), 682-690. Hoorens, S. (2017). "The Future of Cannabis in the Netherlands." RAND Research Report. Hotelling, H. (1929). "Stability in Competition." Economic Journal, 39, 41-57. Huck, S., Rasul, I., and A. Shephard (2015). "Comparing Charitable Fundraising Schemes: Evidence from a Natural Field Experiment and a Structural Model." American Economic
Journal: Economic Policy, 7 (2), 326-369. Hungerman, D. (2005). "Are Church and State Substitutes: Evidence from the 1996 Welfare Reform." Journal of Public Economics, 89, 2245-2267. Inman, R. (1978). "Testing Political Economy's As If Proposition: Is The Median Income Voter Really Decisive?" Public Choice, 33 (4), 45-65. Inman, R. (1995). "How to Have a Fiscal Crisis: Lessons from Philadelphia." American
Economic Review, 85 (2), 378-383. Inman, R. (2008). Lecture notes in Urban Fiscal Policy. University of Pennsylvania. Inman, R., and D. Rubinfeld (1997). "Rethinking Federalism." Journal of Economic Perspectives, 11 (4), 43-64. INRIX (2019). "Congestion Costs Each American 97 Hours, $1,348 a Year." Press Release. Ioannides, Y., and L. Loury (2004). "Job Information Networks, Neighborhood Effects, and Inequality." Journal of Economic Literature, 42 (4), 1056-1093. Irving, S., and T. Loveless (2015). "The Dynamics of Economic Well-Being: Participation in Government Programs, 2009-2012: Who Gets Assistance?" Report by US Census Bureau, Washington, DC. Jacobs, J. (1969). The Economy of Cities. New York: Vintage Books. Kahn, M. (2007). Green Cities: Urban Growth and the Environment. Washington, DC: Brookings Institution Press. Kahn, M. (2010). Climatopolis. New York: Basic Books. Kennan, J., and J. Walker (2011). "The Effect of Expected Income on Individual Migration Decisions." Econometrica, 79 (1), 211-251. Kingma, B. (1989). "An Accurate Measurement of the Crowd-Out Effect, Income Effects, and the Price Effect to Charitable Contributions." Journal of Political Economy, 97, 1197-1207.
Kline, P., and E. Moretti (2014). "Local Economic Development, Agglomeration Economies and the Big Push: 100 Years of Evidence from the Tennessee Valley Authority."
Quarterly Journal of Economics, 129 (1), 275-331. Kling, J., Liebman, J., and L. Katz (2007). "Experimental Analysis of Neighborhood Effects."
Econometrica, 75 (1), 83-119. Knight, B. (2002). "Endogenous Federal Grants and Crowd-Out of State Government Spending: Theory and Evidence from the Federal Highway Aid Program." American
Economic Review, 92, 71-92. Krueger, A. (1974). "The Political Economy of the Rent Seeking Society." American Economic
Review, 64, 291-303. Krueger, A. (2003). "Economic Considerations and Class Size." Economic Journal, 113, 34-63. Krugman, P. (1991). "Increasing Returns and Economic Geography." Journal of Political
Economy, 99 (3), 483-499. Laibson, D. (1997). "Golden Eggs and Hyperbolic Discounting." Quarterly Journal of
Economics, 112 (2), 443-477. Lane, P., and A. Tornell (1996). "Power, Growth and the Voracity Effect." Journal of Economic
Growth, 1 (2), 213-241. Levin, J., and S. Tadelis (2010). "Contracting for Government Services: Theory and Evidence from US Cities." Journal of Industrial Economics, 3, 507-542.
Levitt, S. (1996). "Disentangling the Role of Voter Preferences, Party Affiliation, and Senator Ideology." American Economic Review, 86 (3), 425-441. Levitt, S. (1997). "Using Electoral Cycles in Police Hiring to Estimate the Effects of Police on Crime." American Economic Review, 87 (3), 279-290. Levitt, S., and S. Venkatesh (2000). "An Economic Analysis of a Drug-Selling Gang's Finances." Quarterly Journal of Economics, 115 (3), 755-789. Lindahl, E. (1919). Die Gerechtigkeit der Besteuerung: Eine Analyse der Steuerprinzipien auf Grenznutzentheorie. Lund, Sweden: Gleerup. Lloyd, B., Norris, D., and T. Vicino (2007). "The Mayor in American Local Government." In: Heads of the Local State: Mayors, Provosts and Burgomasters since 1800, John Garrard (ed.), 191-206. Burlington, VT: Ashgate Publishing. Lochner, L., and E. Moretti (2004). "The Effect of Education on Crime: Evidence from Prison Inmates, Arrests, and Self-Reports." American Economic Review, 94 (1), 155-189. Lott, J., and L. Kenny (1999). "Did Women's Suffrage Change the Size and Scope of Government?" Journal of Political Economy, 107 (6), 1163-1198. Lucas, R. (1976). "Econometric Policy Evaluation: A Critique." In: The Phillips Curve and Labor Markets, K. Brunner and A. Meltzer (eds.), Carnegie-Rochester Conference Series on Public Policy, vol. 1, 19-46. New York: American Elsevier. Lucas, R., and E. Rossi-Hansberg (2002). "On the Internal Structure of Cities." Econometrica, 70 (4), 1445-1476. Ludwig, J., and D. Phillips (2008). "The Long-Term Effects of Head Start on Low-Income Children." Annals of the New York Academy of Sciences, 40, 1-12. Mai, C., and R. Subramanian (2017). "The Price of Prisons: Examining State Spending Trends, 2010-2015." Vera Institute of Justice. Marshall, A. (1920). Principles of Economics, 8th ed. London: Macmillan. Mastro, R. (2013). "On the Voters' Terms: Amending New York City's Charter to Protect Voter-Imposed Term Limits."
New York Law School Law Review 139 (2013-2014), 58, 139-162. McCarty, N. (2010). "Measuring Legislative Preferences." Working Paper, Princeton University. McCrary, J. (2002). "Using Electoral Cycles in Police Hiring to Estimate the Effects of Police on Crime: Reply." American Economic Review, 92, 1236-1244. McDonald, B. (2015). "A Dirty Approach to Efficient Revenue Forecasting." Journal of Public and Nonprofit Affairs, 1 (1), 3-17. McFadden, D. (1974). "Conditional Logit Analysis of Qualitative Choice Behavior." In: Frontiers in Econometrics, P. Zarembka (ed.), 105-142. New York: Academic Press. McFadden, D. (1978). "Modeling the Choice of Residential Location." In: Spatial Interaction Theory and Planning Models, A. Karlqvist, F. Snickars, and J. Weibull (eds.). Amsterdam: Elsevier North-Holland. Merlo, A., and C. Wilson (1995). "A Stochastic Model of Sequential Bargaining with Complete Information." Econometrica, 63 (2), 371-399. Michalopoulos, C., Robins, P., and D. Card (2005). "When Financial Work Incentives Pay for Themselves: Evidence from a Randomized Social Experiment for Welfare Recipients." Journal of Public Economics, 125 (1-2), 113-139. Mieszkowski, P. (1972). "The Property Tax: An Excise Tax or a Profit Tax?" Journal of Public Economics, 1 (1), 73-96. Mikesell, J., and J. Ross (2014). "State Revenue Forecasts and Political Acceptance: The Value of Consensus Forecasting in the Budget Process." Public Administration Review, 74, 188-203. Miller, C., Katz, L., Azurdia, G., Isen, A., and C. Schultz (2017). "Expanding the Earned Income Tax Credit for Workers without Dependent Children: Interim Findings from the Paycheck Plus Demonstration in New York City." MDRC Research Report.
Miller, G. (2008). "Women's Suffrage, Political Responsiveness, and Child Survival in American History." Quarterly Journal of Economics, 123 (3), 1287-1327. Mills, E. (1967). "An Aggregative Model of Resource Allocation in a Metropolitan Area." American Economic Review, 57, 197-210. Moretti, E. (2004a). "Human Capital Externalities in Cities." In: Handbook of Regional and Urban Economics, vol. 4, J. Henderson and J. Thisse (eds.), 2243-2291. Amsterdam: Elsevier. Moretti, E. (2004b). "Workers' Education, Spillovers, and Productivity: Evidence from Plant-Level Production Functions." American Economic Review, 94 (3), 656-690. Moretti, E. (2011). "Local Labor Markets." In: Handbook of Labor Economics, vol. 4, O. Ashenfelter and D. Card (eds.), 1237-1313. Amsterdam: Elsevier. Moretti, E. (2012). The New Geography of Jobs. Boston: Houghton Mifflin Harcourt. Murray, S., Evans, W., and R. Schwab (1998). "Education-Finance Reform and the Distribution of Educational Resources." American Economic Review, 88 (4), 789-812. Muth, R. (1969). Cities and Housing. Chicago: University of Chicago Press. Neal, D., and W. Johnson (1996). "The Role of Premarket Factors in Black-White Wage Differences." Journal of Political Economy, 104 (5), 869-895. Nechyba, T. (1997). "Local Property and State Income Taxes: The Role of Interjurisdictional Competition and Collusion." Journal of Political Economy, 105 (2), 351-384. Nechyba, T. (2000). "Mobility, Targeting and Private School Vouchers." American Economic Review, 90, 130-146. Needham, B. (2000). "Land Taxation, Development Charges and the Effects on Land Use." Journal of Property Research, 17 (3), 241-257. Neyman, J. (1923). "On the Application of Probability Theory to Agricultural Experiments: Essay on Principles." Translated in Statistical Science, 5, 465-480. Niskanen, W. (1968). "Nonmarket Decision Making: The Peculiar Economics of Bureaucracy." American Economic Review, 58 (2), 293-305. Novy-Marx, R., and J. Rauh (2011).
"Public Pension Promises: How Big Are They and What Are They Worth?" Journal of Finance, 66 (4), 1211-1249. Oates, W. (1969). "The Effects of Property Taxation and Local Spending on Property Values: An Empirical Study of Tax Capitalization and the Tiebout Hypothesis." Journal of Political Economy, 77, 957-971. Oates, W. (1972). Fiscal Federalism. New York: Harcourt, Brace and Jovanovich. Oates, W. (1999). "An Essay on Fiscal Federalism." Journal of Economic Literature, 37 (3), 1120-1149. O'Donoghue, T., and M. Rabin (1999). "Doing It Now or Later." American Economic Review, 89 (1), 103-124. Olsen, E. (2003). "Housing Programs for Low-Income Households." In: Means-Tested Transfer Programs in the United States, R. Moffitt (ed.). Chicago: University of Chicago Press. Olson, M. (1982). The Rise and Decline of Nations: Economic Growth, Stagflation, and Social Rigidities. New Haven, CT: Yale University Press. Parilla, J., and M. Muro (2017). "Understanding US Productivity Trends from the Bottom Up." Brookings Institution Research Report. Patrick, C. (2016). "Identifying the Economic Development Effects of Million Dollar Facilities." Economic Inquiry, 54 (4), 1737-1762. Persson, T., and G. Tabellini (2000). Political Economy: Explaining Economic Policy. Cambridge, MA: MIT Press. Poole, K., and H. Rosenthal (1985). "A Spatial Model for Legislative Roll Call Analysis." American Journal of Political Science, 29 (2), 357-384. Poole, K., and H. Rosenthal (1991). "Patterns of Congressional Voting." American Journal of Political Science, 35, 228-278.
Poterba, J. (1984). "Tax Subsidies to Owner-Occupied Housing: An Asset-Market Approach." Quarterly Journal of Economics, 99, 729-752. Rausch, D. (2009). "When a Popular Idea Meets Congress: The History of the Term Limit Debate in Congress." Politics, Bureaucracy & Justice, 1 (1), 34-43. Redding, S., and E. Rossi-Hansberg (2017). "Quantitative Spatial Economics." Annual Review of Economics, 34, 1-25. Redding, S., and D. Sturm (2008). "The Costs of Remoteness: Evidence from German Division and Reunification." American Economic Review, 98 (5), 1766-1797. Reinikka, R., and J. Svensson (2004). "Local Capture: Evidence from a Central Government Transfer Program in Uganda." Quarterly Journal of Economics, 119 (2), 679-705. Roback, J. (1982). "Wages, Rents, and the Quality of Life." Journal of Political Economy, 90 (6), 1257-1278. Romer, T., and H. Rosenthal (1982). "Median Voters or Budget Maximizers: Evidence from School Expenditure Referenda." Economic Inquiry, 20 (4), 556-578. Rosen, K. (1982). "The Impact of Proposition 13 on House Prices in Northern California: A Test of the Interjurisdictional Capitalization Hypothesis." Journal of Political Economy, 90 (1), 191-200. Rosen, S. (1974). "Hedonic Prices and Implicit Markets: Product Differentiation in Pure Competition." Journal of Political Economy, 82, 34-55. Rosen, S. (1979). "Wages-Based Indexes of Urban Quality of Life." In: Current Issues in Urban Economics, P. Mieszkowski and M. Straszheim (eds.). Baltimore, MD: Johns Hopkins University Press. Rosenthal, S., and W. Strange (2001). "The Determinants of Agglomeration." Journal of Urban Economics, 50 (2), 191-229. Rosenthal, S., and W. Strange (2003). "Geography, Industrial Organization, and Agglomeration." Review of Economics and Statistics, 85 (2), 377-393. Rosenthal, S., and W. Strange (2008). "The Attenuation of Human Capital Spillovers." Journal of Urban Economics, 64, 373-389. Rossi-Hansberg, E. (2004). "Optimal Urban Land Use and Zoning."
Review of Economic Dynamics, 7 (1), 69-106. Rossi-Hansberg, E., Sarte, P., and R. Owens (2010). "Housing Externalities." Journal of Political Economy, 118 (3), 485-535. Rossi-Hansberg, E., and M. Wright (2007). "Establishment Size Dynamics in the Aggregate Economy." American Economic Review, 97 (5), 1639-1666. Rouse, C. (1998). "Private School Vouchers and Student Achievement: An Evaluation of the Milwaukee Parental Choice Program." Quarterly Journal of Economics, 113 (2), 553-602. Rubin, D. (1974). "Estimating Causal Effects of Treatment in Randomized and Nonrandomized Studies." Journal of Educational Psychology, 66, 688-701. Rubinstein, A. (1982). "Perfect Equilibrium in a Bargaining Model." Econometrica, 50 (1), 97-109. Samuelson, P. (1954). "The Pure Theory of Public Expenditure." Review of Economics and Statistics, 36 (4), 387-389. Schochet, P., Burghardt, J., and S. McConnell (2008). "Does Job Corps Work? Impact Findings from the National Job Corps Study." American Economic Review, 98 (5), 1864-1886. Schultz, P. (2012). "The Market for New Issues of Municipal Bonds: The Roles of Transparency and Limited Access to Retail Investors." Journal of Financial Economics, 106, 492-512. Sethi, R., and R. Somanathan (2004). "Inequality and Segregation." Journal of Political Economy, 112 (6), 1296-1321. Sieg, H. (2000). "Estimating a Bargaining Model with Asymmetric Information: Evidence from Medical Malpractice Disputes." Journal of Political Economy, 108 (5), 1006-1021.
Sieg, H., Smith, V., Banzhaf, S., and R. Walsh (2002). "Interjurisdictional Housing Prices in Locational Equilibrium." Journal of Urban Economics, 52, 131-153. Sieg, H., Smith, V., Banzhaf, S., and R. Walsh (2004). "Estimating the General Equilibrium Benefits of Large Changes in Spatially Delineated Public Goods." International Economic Review, 45 (4), 1047-1077. Sieg, H., and Y. Wang (2013). "The Impact of Unions on Municipal Elections and Urban Fiscal Policies." Journal of Monetary Economics, 60 (5), 554-567. Sieg, H., and C. Yoon (2017). "Estimating Dynamic Games of Electoral Competition to Evaluate Term Limits in US Gubernatorial Elections." American Economic Review, 107 (7), 1824-1857. Sieg, H., and C. Yoon (2020). "Waiting for Affordable Housing in New York City." Quantitative Economics, 11 (1), 277-313. Sieg, H., and J. Zhang (2012). "The Effectiveness of Private Benefits in Fundraising." International Economic Review, 53 (2), 349-374. Silverstein, J. (2014). "House Price Indexes: Methodology and Revisions." Special Report, Research Department, Federal Reserve Bank of Philadelphia. Smith, V., Sieg, H., Banzhaf, S., and R. Walsh (2004). "General Equilibrium Benefits for Environmental Improvements: Projected Ozone Reductions for the Los Angeles Air Basin." Journal of Environmental Economics and Management, 47, 559-584. Spence, M. (1973). "Job Market Signaling." Quarterly Journal of Economics, 87 (3), 355-374. Stahl, I. (1972). Bargaining Theory. Stockholm: Economics Research Institute, Stockholm School of Economics. Strumpf, K. (2002). "Does Government Decentralization Increase Policy Innovation?" Journal of Public Economic Theory, 4, 207-241. Strumpf, K., and F. Oberholzer-Gee (2002). "Endogenous Policy Decentralization: Testing the Central Tenet of Economic Federalism." Journal of Political Economy, 110, 1-36. Summers, A., and L. Jakubowski (1997). "The Fiscal Burden of Unreimbursed Poverty Expenditures." Greater Philadelphia Regional Review, 10-12.
Thaler, R., and H. Shefrin (1981). "An Economic Theory of Self-Control." Journal of Political Economy, 89 (2), 392-406. Tiebout, C. (1956). "A Pure Theory of Local Expenditures." Journal of Political Economy, 64 (5), 416-424. Topa, G. (2001). "Social Interactions, Local Spillovers and Unemployment." Review of Economic Studies, 67, 261-296. United Nations (2012). World Urbanization Prospects, the 2011 Revisions. New York: UN Department of Economic and Social Affairs, Population Division. Urban Institute (2015). "State and Local Expenditures." State and Local Finance Initiative, Research Report, Washington, DC. Van der Ploeg, F. (2011). "Natural Resources: Curse or Blessing?" Journal of Economic Literature, 49 (2), 366-420. Vickrey, W. (1961). "Counter-speculation, Auctions, and Competitive Sealed Tenders." Journal of Finance, 16 (1), 8-37. Wachter, J. (2013). "A 10-Year Perspective of the Merger of Louisville and Jefferson County, KY: Louisville Metro Vaults from 65th to 18th Largest City in the Nation." Abel Foundation Research Report. Walker, J. (1971). "Innovation in State Politics." In: Politics in the American States, H. Jacob and K. Vines (eds.). Boston: Little, Brown and Company. Wallace, N. (2016). "What is the True Cost of Living in New York City?" Smart Asset's Studies, www.smartasset.com. Weitzman, M. (1974). "Prices vs. Quantities." Review of Economic Studies, 41 (4), 477-491.
Wong, J. (1995). "Local Government Revenue Forecasting: Using Regression and Econometric Revenue Forecasting in a Medium-Sized City." Journal of Public Budgeting, Accounting & Financial Management, 7 (3), 315-335. Yelowitz, A. (2001). "Public Housing and Labor Supply." Working Paper, University of Kentucky. Yoon, C. (2017). "Estimating a Dynamic Spatial Equilibrium Model to Evaluate the Welfare Implications of Regional Adjustment Processes: The Decline of the Rust Belt." International Economic Review, 58 (2), 473-497.
Index
accountability, 12, 134, 184 achievement, 75, 317 achievement gap, 299 addictive goods, 355 affordable housing, 424 agglomeration externality, 7, 18, 23, 232, 414, 450 all-pay auction, 179 altruism, 327 amenity, 508 Amsterdam, 358 arbitration, 195 Asheville, NC, 119 Athens, OH, 116 Atlanta, 3 balanced budget requirement, 250 Baltimore, 205, 267 banking crisis, 466 bargaining in good faith, 194 bargaining power, 199 benefit tax, 219 benefit-cost analysis, 261 benefits, 192 Berlin, 77 block grant, 169 bond rating, 181 Boston, 154, 168, 228, 337 Brazil, 182 building code, 363, 433 bureaucracy, 45, 177 business taxation, 232 California, 431 call option, 464 Canada, 41, 306 cap and trade, 371 capital budget, 257, 269 capital taxation, 221 capitalization, 149, 209, 369 central business district, 233, 406, 414 centralization, 41, 44-45, 391 charter school, 336 Chicago, 1, 207, 241, 254, 344 China, 10, 12, 56, 384
city budget, 249 city council, 8, 109, 124 city manager, 109 city ordinance, 113 clean air, 362 Clean Air Act, 367 clean water, 362 climate change, 374 Coase Theorem, 365 cognitive skills, 300 collective bargaining, 192 Commerce Clause, 42-43 commuting cost, 19-20, 403 comparative advantage, 23, 43 compensation, 192 congestion, 64, 67, 414 congestion pricing, 417 consolidation, 152 constant returns to scale, 28 contingent valuation, 73 corruption, 179-86, 390 cost accounting, 260 cost of crime, 347 cost of living, 7 creative class, 454 credit constraints, 323 credit rating, 281 crime, 163 crime rate, 344 crowd-out, 171 decentralization, 10, 12, 41, 48, 144, 391 default, 465 defined benefit plan, 205 density, 17 depreciation, 459 deregulation, 34 deterrence, 351 Detroit, 275, 309, 452 discounting, 196, 262 discrete choice, 505 discretionary spending, 46 dispersion force, 414 down payment, 455 Dublin, 10
early childhood education, 332 earnings gap, 299 economic development, 1, 232, 273 economies of scale, 20, 44 economies of scope, 20 education spending, 52, 269, 317 educational standards, 334 efficient level of public goods, 65, 93, 141, 152 EITC, 302 endogenous amenities, 450 environmental justice movement, 373 expenditures, 250, 266 expenses, 254 experimentation, 10, 48, 53 externality, 364 fairness, 168, 222 federal government, 9, 41, 271 federal government spending, 46 federal income tax policy, 99 federal income taxation, 452 fee, 9, 252, 270 field experiment, 30, 226 fiscal capacity, 392 fiscal competition, 9, 12, 144, 163, 243 fiscal crisis, 200, 267, 290 fiscal federalism, 43 fiscal policy, 266 fiscal spillover, 163 Flint, MI, 372 flood insurance, 375 forecasting, 250, 267 foreclosure, 466 formal knowledge, 22 Frankfurt, 10, 158 free rider problem, 94 general expenditures, 53, 268 general-purpose bonds, 281 geography, 1 Germany, 41, 384 global warming, 373 government-sponsored enterprise, 468 Great Recession, 457-58, 466, 469 green city, 362 Greenwich, CT, 251 Guangzhou, 10, 56 Head Start, 333 hedonic model, 423 hedonic regression, 424, 428, 456 heterogeneous preferences, 48, 144, 423
Hong Kong, 10, 56 house value, 423, 455 household sorting, 403, 508 housing demand, 146, 508 housing price, 420 housing price index, 456 housing supply, 431 housing voucher, 302, 424, 435 Houston, 363 human capital, 6-7, 299, 321 ideology, 134 imitation, 55 increasing returns to scale, 21 inequality, 6, 163 infrastructure, 23, 78, 289, 394, 403 initiative, 114 innovation, 22, 55 institutional design, 384 instrumental variables, 76, 415 intergovernmental transfers, 163, 254, 270, 393 Karachi, 383 knowledge spillover, 21, 23 Korea, 384 labor costs, 192, 270 labor productivity, 3 land price, 403, 430 land tax, 228, 239 land use, 403 law enforcement, 344 Lindahl mechanism, 69 lobbying, 45 local government, 41 local labor markets, 443 local public good, 63, 64, 90, 144 London, 10, 18, 36, 158 Los Angeles, 1, 369, 429 Louisville, 152 Madrid, 10 magnet school, 336 majority rule, 124, 155 mandatory spending, 46 matching grant, 169 matching in labor markets, 7 median voter theorem, 124, 127 mediation, 195 Medicaid, 302 metropolitan area, 1 Mexico City, 11
Miami, 460 migration, 394 Milan, 10 Milwaukee, 155 minimum wage, 351 Minneapolis, 267 mobility, 144, 508 monitoring, 58 monocentric city, 405 mortgage insurance, 469 mortgage-backed securities, 467 Moving to Opportunity, 310-11, 436 municipal bankruptcy, 274, 290 municipal bonds, 9, 280 municipal charter, 109 municipal debt, 280 municipal election, 199 municipal employment, 9 Nairobi, 11 national debt, 46 National Flood Insurance Act, 47 natural hazards, 396 natural locational advantage, 18 natural resources, 391 New Delhi, 11, 383 New Orleans, 375 New York City, 1, 4, 19, 34, 121, 158, 242, 283, 337, 375, 412, 417, 430, 434, 435 noncognitive skills, 301 nonexcludable, 64 nonlinear pricing, 424 nonpartisan election, 118 nonrival, 64 operating budget, 251, 269 optimal class size, 74, 334 optimal gang size, 352 organized crime, 352 own revenues, 53, 270 owner-occupied housing, 458 Paris, 10, 18 pension funding, 205 people-based policy, 309 Philadelphia, 171, 208, 223, 228, 329 Pigouvian taxation, 366 PILOT, 227 Pittsburgh, 99, 277, 329, 330 place-based policy, 309 police effectiveness, 351 political machine, 111 pollution, 364
polycentric city, 418 population, 4 Port Huron, MI, 477 poverty, 293 poverty line, 295 poverty rate, 296 poverty trap, 295 preference revelation, 71 present discounted value, 262 primary mortgage market, 467 private contributions, 92 procurement auction, 185, 261 product differentiation, 423 product variety, 21 production function, 235 profit tax, 238 progressive taxation, 169 prohibition, 357 property crime, 344 property tax assessment, 223 property tax revenue, 217 property taxation, 145, 217, 252, 270, 459 Proposition 13, 149 proximity, 17 public housing, 302, 424, 434 public sector corporation, 8 Puerto Rico, 185 quality of education, 327 random utility model, 507 ratchet effect, 372 real estate, 8, 455 recall, 115 redistribution, 44, 169, 273 referendum, 115 regulation, 362, 375, 431 regulatory policy, 34 rent stabilization, 424 rent-seeking behavior, 177, 390 repeat-sales index, 456 returns to human capital, 327 revenues, 250, 252, 267 right-to-work states, 194 Rio de Janeiro, 11 ruling elites, 386 Sao Paulo, 383 sales tax, 239 Samuelson condition, 66, 68, 128 San Francisco, 207, 344 school choice, 336 school voucher, 339-40
screening, 312 Seattle, 38 secondary mortgage market, 467 segregation, 155 selection, 76, 457 Self-Sufficiency Project, 306 Seoul, 374 Shanghai, 10, 56, 158 sharing rule, 69 Shenzhen, 10, 56 Silicon Valley, 21 Singapore, 10, 56, 417, 436 single-peaked preferences, 125 skill complementarity, 6 SNAP, 302 social conflict, 386 social learning, 48, 357 social services, 295 soda tax, 229 special economic zones, 56, 394 special interests, 45, 137 spillovers, 44 stable matching, 427 Stamford, 242 standardization, 44 state government, 41, 271 stratification, 154 strike, 196 tacit knowledge, 22 TANF, 302, 306-7 tax compliance, 223 tax exemption, 281 tax increment financing, 241 taxes, 9, 252, 270, 302, 392
teacher compensation, 338 Tennessee, 334 term limits, 120, 135, 183, 189 Tokyo, 158 trade, 28 transaction cost, 366 transportation, 403, 505 transportation cost, 19-20, 403 unanimity rule, 70 underprovision of local public goods, 93, 141 unfunded liabilities, 208 union rights, 194 union-security agreement, 194 urbanization, 1 US Supreme Court, 42 user charge, 217, 252, 270 user cost of housing, 459 valence, 134 Vickrey-Clarke-Groves mechanism, 72 violent crime, 344 voluntary provision of public goods, 12, 90 wage tax, 238 Washington, DC, 338 welfare system, 397 work requirements, 308 work rules, 204 yardstick competition, 58 zoning, 219, 363, 431 Zurich, 10