Human Capital and Regional Development in Europe: A Long-Run Comparative View (Frontiers in Economic History) 3030908577, 9783030908577


English Pages 160 [152] Year 2022


Table of contents :
Acknowledgements
About the Book
Contents
List of Abbreviations
List of Figures
List of Tables
Chapter 1: Introduction
References
Chapter 2: Regional Human Capital Inequality in Europe
2.1 Introduction
2.2 Human Capital Formation in Europe in the Long Run
2.3 Methodology and Data
2.3.1 Measure of Human Capital
2.3.2 Indicators of Inequality
2.4 Results
2.4.1 Evolution of Human Capital in the European Regions: 1850-2010
2.4.2 Intranational Inequality
2.5 Conclusion
Appendix
References
Chapter 3: Spatial Clustering of Numeracy and Literacy
3.1 Introduction
3.2 Evolution of Basic Education in the European Regions
3.3 Data
3.4 Exploratory Spatial Data Analysis
3.5 Results
3.5.1 Global Spatial Autocorrelation
3.5.2 Moran Scatter Plots
3.5.3 Moran Significance Maps
3.5.4 Robustness Checks
3.6 Conclusion
Appendix
References
Chapter 4: Human Capital and Market Access in the European Regions
4.1 Introduction
4.2 Related Literature
4.2.1 Regional Human Capital Formation in Europe: Today and in the Past
4.2.2 Economic Geography and Market Access in Europe
4.3 Theoretical Model
4.4 Data and Methodology
4.5 Results
4.6 Conclusions
Appendix
References
Chapter 5: The Long-Run Impact of Human Capital on Innovation and Economic Growth in the Regions of Europe
5.1 Introduction
5.2 Literature
5.3 Methodology and Data
5.4 Results
5.4.1 Relationship Between Patents Per Capita and GDP Per Capita Today
5.4.2 Explaining Regional Patents Per Capita
5.4.3 Explaining Regional Economic Development
5.5 Conclusion
Appendix
References
Chapter 6: Lessons from Human Capital Evolution over the Last 200 Years
6.1 Introduction
6.2 Human Capital: Theoretical Origins and Quantitative Measurement
6.3 Regional Education Formation in Europe: Nineteenth Century to Today
6.4 Factors for Regional Human Capital Formation
6.5 Human Capital, Economic Growth, and Their Policy Implications
6.6 Conclusion
References
Chapter 7: Conclusion and Future Directions for Research
References

Frontiers in Economic History

Claude Diebolt • Ralph Hippe

Human Capital and Regional Development in Europe
A Long-Run Comparative View

Frontiers in Economic History

Series Editors
Claude Diebolt, Faculty of Economics, BETA, CNRS, University of Strasbourg, Strasbourg, France
Michael Haupert, University of Wisconsin–La Crosse, La Crosse, WI, USA

Economic historians have contributed to the development of economics in a variety of ways, combining theory with quantitative methods, constructing new databases, promoting interdisciplinary approaches to historical topics, and using history as a lens to examine the long-term development of the economy. Frontiers in Economic History publishes manuscripts that push the frontiers of research in economic history in order to better explain past economic experiences and to understand how, why and when economic change occurs. Books in this series will highlight the value of economic history in shedding light on the ways in which economic factors influence growth as well as social and political developments. This series aims to establish a new standard of quality in the field while offering a global discussion forum toward a unified approach in the social sciences.

More information about this series at https://link.springer.com/bookseries/16567


Claude Diebolt
BETA/CNRS, Faculty of Economics
University of Strasbourg
Strasbourg, France

Ralph Hippe
Department for VET and Skills
Cedefop
Thessaloniki, Greece

ISSN 2662-9771
ISSN 2662-978X (electronic)
Frontiers in Economic History
ISBN 978-3-030-90857-7
ISBN 978-3-030-90858-4 (eBook)
https://doi.org/10.1007/978-3-030-90858-4

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Disclaimer: The views expressed are purely those of the writers and may not in any circumstances be regarded as stating an official position of Cedefop.

Acknowledgements

This book brings together 10 years of our scientific collaboration and publications in academic journals on the complex links between human capital and regional development in Europe from a cliometric perspective. Our special thanks go to all those colleagues whose interaction has benefitted and influenced our research agenda over the decade, in particular at the University of Strasbourg, the Humboldt University of Berlin, the University of Tübingen, the London School of Economics and Political Science, the European Commission’s Joint Research Centre and Cedefop.


About the Book

Human capital is of utmost importance for the future of our knowledge economies and societies. However, it is unequally distributed across Europe, contributing to marked spatial patterns of economic prosperity within and across countries. In many cases, these patterns have a long history. Understanding them better requires going back in time, to when mass schooling was starting to become a reality across Europe. Taking a long-run perspective over more than 150 years, this book shows the development and the distribution of human capital in the regions of Europe and its connections with the economy. It provides insights into recent research findings in this area, including theoretical advances and the use of new empirical data.


List of Abbreviations

A                        Level of technology
ABCC                     A linearly transformed Whipple’s index (named after A’Hearn, Baten, Crayen and Clark)
Benelux                  Belgium, Netherlands and Luxembourg
CV                       Coefficient of variation
COVID19                  Coronavirus disease 2019
d                        Distance
ECEC                     Early childhood education and care
E(I)                     Expected value of Moran’s I
EFP                      European Fertility Project
EFTA                     European Free Trade Association
EMP                      European marriage pattern
EPO                      European Patent Office
ESDA                     Exploratory spatial data analysis
EU                       European Union
GDP                      Gross domestic product
GIS                      Geographic information system
H                        Human capital
HC                       Human capital
HH                       High-high cluster
HL                       High-low cluster
I                        Moran’s I
IEA                      International Association for the Evaluation of Educational Achievement
ILO                      International Labour Organization
IR                       Industrial Revolution
IRS                      Increasing returns to scale
ISCED                    International Standard Classification of Education
IV                       Instrumental variable
K                        Capital
k                        Number of neighbours
L                        Labour
LDC                      Least developed country
LH                       Low-high cluster
LISA                     Local indicator of spatial association
LL                       Low-low cluster
MA                       Market access
max                      Maximum
min                      Minimum
μ                        Mean value
n (or: N, obs.)          Number of observations
NEG                      New Economic Geography
NUTS                     Nomenclature of Territorial Units for Statistics
OECD                     Organisation for Economic Co-operation and Development
OLS                      Ordinary least squares
PIAAC                    Programme for the International Assessment of Adult Competencies
PIRLS                    Progress in International Reading Literacy Study
PISA                     Programme for International Student Assessment
PPS                      Purchasing Power Standard
R&D                      Research and development
RDI                      Research, Development and Innovation
rw                       Reading and writing ability
S0                       Scaling factor
SA                       Supplier access
σ (or: sd, Std. Dev.)    Standard deviation
t                        Time period
TIMSS                    Third International Mathematics and Science Study; Trends in International Mathematics and Science Study
UN                       United Nations
UNESCO                   United Nations Educational, Scientific and Cultural Organization
w                        Element of the spatial weight matrix
WI                       Whipple’s index
Wz                       Spatially lagged vector
Y                        Output
yrs.                     Years
z                        Vector

Country Abbreviations

AL      Albania
AM      Armenia
AT      Austria
AZ      Azerbaijan
BA      Bosnia-Herzegovina
BE      Belgium
BG      Bulgaria
BO      Bolivia
BR      Brazil
BY      Belarus
CH      Switzerland
CL      Chile
CO      Colombia
CY      Cyprus
CZ      Czech Republic
DE      Germany
DK      Denmark
EC      Ecuador
EE      Estonia
ES      Spain
FI      Finland
FR      France
GE      Georgia
GR      Greece
HR      Croatia
HU      Hungary
IE      Republic of Ireland
IN      India
IS      Iceland
IT      Italy
KE      Kenya
LI      Liechtenstein
LT      Lithuania
LU      Luxembourg
LV      Latvia
MD      Moldova
ME      Montenegro
MK      Northern Macedonia
MT      Malta
MX      Mexico
NL      Netherlands
NO      Norway
OT      Ottoman Empire
PA      Panama
PL      Poland
PT      Portugal
RO      Romania
RU      Russia
SE      Sweden
SI      Slovenia
SK      Slovakia
SR      Serbia
TR      Turkey
TZ      Tanzania
UA      Ukraine
UK      United Kingdom
US      United States
USSR    Union of Soviet Socialist Republics
YU      Yugoslavia

List of Figures

Fig. 2.1  Literacy, c. 1900 (NUTS2)
Fig. 2.2  Literacy, c. 1960 (NUTS2)
Fig. 2.3  Growth in regional human capital
Fig. 2.4  Regional inequality and level of human capital
Fig. 3.1  Literacy (in %) in the European regions, ca. 1930
Fig. 3.2  Moran scatter plot for ABCC in Europe, ca. 1850 (k = 10)
Fig. 3.3  Moran scatter plot for literacy in Europe, ca. 1930 (k = 10)
Fig. 3.4  Moran significance map for ABCC in Europe, ca. 1850 (5% pseudo-significance level, k = 10)
Fig. 3.5  Moran significance map for literacy in Europe, ca. 1930 (5% pseudo-significance level, k = 10)
Fig. 4.1  Location and size of European cities, 1850
Fig. 4.2  Location and size of European agglomerations, 1950
Fig. 4.3  Population potential in Europe in 1850
Fig. 4.4  ABCC and market access, 1850
Fig. 4.5  Population potential in Europe in 1950
Fig. 4.6  Literacy and market access, ca. 1930
Fig. 5.1  Illiteracy in 1930
Fig. 5.2  Non-agricultural employment, 1930 and GDP/c, 2008
Fig. 5.3  Regional per capita GDP and patent applications, 2008
Fig. 5.4  Non-agricultural employment share and agricultural productivity
Fig. 6.1  Average PISA 2012 score and literacy rates in 1900 for Italy and Spain
Fig. 6.2  Regional relationship between higher educational attainment and GDP/c in Europe, 2008

List of Tables

Table 2.1  Databases on international evolution of human capital in the longer term
Table 2.2  Descriptive statistics for the unweighted human capital indicators
Table 2.3  Convergence in regional human capital
Table 2.4  CVs at different points in time
Table 2.5  Negative differences between European and national Ginis and CVs
Table 3.1  Descriptive statistics for human capital proxies
Table 3.2  Moran’s I statistic for regional human capital proxies, 1850 and 1930
Table 3.3  Percentage of observations in each quadrant of Moran’s scatter plot (k = 10)
Table 3.4  Percentage of observations in Moran’s significance map (k = 10)
Table 3.5  Robustness analysis for 1850 and 1930 (k = 10 to k = 15)
Table 3.6  Robustness analysis for 1850 and 1930 (k = 10 to k = 20)
Table 4.1  Descriptive statistics for ABCC and market access, ca. 1850
Table 4.2  Market access and ABCC, ca. 1850
Table 4.3  Market access and literacy, ca. 1930
Table 5.1  Descriptive statistics
Table 5.2  Regional patent applications per capita in 2008
Table 5.3  Horse race between literacy and other variables
Table 5.4  Including agricultural employment or agricultural productivity
Table 5.5  Other points in time (1850, 1900, 1960) and patents
Table 5.6  Regional GDP per capita in 2008
Table 5.7  Horse race between literacy and other variables
Table 5.8  Including agricultural employment or agricultural productivity
Table 5.9  Other points in time (1850, 1900, 1960) and GDP per capita
Table 6.1  Laws on compulsory schooling in selected European countries

Chapter 1

Introduction

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 C. Diebolt, R. Hippe, Human Capital and Regional Development in Europe, Frontiers in Economic History, https://doi.org/10.1007/978-3-030-90858-4_1

Human capital is a crucial factor for understanding regional development. As Gennaioli et al. point out in their worldwide study, there is a “paramount importance of human capital in accounting for regional differences in development” (Gennaioli et al., 2013, p. 105). However, while the relevance of human capital is nowadays acknowledged, this has not always been the case. Although the concept of human capital goes back centuries (as far as Adam Smith), early (exogenous) economic growth models (such as Solow, 1956) did not consider human capital at all. Only with the appearance of endogenous growth models in the late 1980s and early 1990s were human capital and knowledge integrated (Lucas, 1988; Romer, 1986, 1990). While some econometric studies first disputed the relevance of human capital (e.g. Benhabib & Spiegel, 1994; Pritchett, 2001), the evidence in favour of attributing a central role to it has grown substantially over time (Demeulemeester & Diebolt, 2011; Hanushek & Woessmann, 2015). The human capital hypothesis is one among several competing (but not necessarily mutually exclusive) explanations, other popular ones being geography (Diamond, 1997; Engerman & Sokoloff, 2000) and institutions (Acemoglu et al., 2005; North, 1981). For the long run, the human capital hypothesis has gained traction through the more recent unified growth models (Galor, 2005, 2011).

While the regional dimension was not a major interest of most economists for a long time, new economic geography (NEG) marked a turnaround in favour of considering regional and subnational aspects of development. The regional dimension has particular advantages, as it allows factors at the national level that may bias analyses, such as institutions, to be excluded. Differences between countries also often hide even wider inequalities within them. With more research done in this area, many studies have also used human capital in their regional models and have shown the importance of human capital for innovation and economic growth (Cuaresma et al., 2012; Ljungberg & Nilsson, 2009; Rodríguez-Pose & Crescenzi, 2008; Sterlacchini, 2008).

Still, data availability is quite limited in both space and time. While educational attainment data are available today, there are no indicators that measure skills at the regional level for the whole of Europe. In addition, historically available data have mostly considered the national level or considered the regional level within one or very few countries. For these reasons, this book presents new evidence on the long-run evolution of human capital at the regional level, taking a European approach. It uses these new regional human capital data to gather new insights into its role for regional development.

The following chapter is the starting point, presenting the newly gathered regional database on human capital that covers the time period between 1850 and 2010. It includes numeracy (for the earliest period around 1850), literacy (between 1900 and 1960) and educational attainment (for the most recent period up to 2010). A particular focus is on regional inequality, with the evolution of these inequalities measured over time. Chapter 3 then explores spatial clustering patterns of these data in 1850 and 1930. Spatial econometric techniques such as Moran’s I are used to uncover a set of high/high and low/low clusters in the centre/north and periphery of Europe, which appear at both points in time. Subsequently, Chap. 4 takes another spatial modelling approach, considering a NEG model and applying it also to the data in 1850 and 1930. The particular relevance of market access and of being located remotely from the European core regions is analysed, as the literature suggests that remoteness may go hand in hand with economic backwardness. Chapter 5 then draws a direct connection between the historical human capital indicators on the one hand and today’s innovation and economic growth on the other. Innovation is proxied by patent data, while economic growth is measured by standard GDP per capita. Different elements of the previous analyses and others from the field are put together in Chap. 6. It reviews some of the key elements that have been gathered throughout the chapters, emphasising the role of historical human capital as well as current policies.

Finally, Chap. 7 draws conclusions from the evidence gathered in the previous chapters and provides directions for future research.
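To make the spatial clustering idea behind Chapter 3 more concrete, the following is a minimal sketch of the textbook global Moran's I, written with the symbols from the List of Abbreviations (n, S0, w, z, Wz, E(I), k). It assumes row-standardised k-nearest-neighbour weights built from region centroids; the function names and the toy coordinates are hypothetical, and the book's actual specification may differ.

```python
import numpy as np


def knn_weights(coords, k):
    """Row-standardised weight matrix: each region's k nearest neighbours get weight 1/k."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))  # pairwise Euclidean distances d
    np.fill_diagonal(dist, np.inf)           # a region is never its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros((n, n))
    w[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0 / k
    return w


def morans_i(values, w):
    """Global Moran's I = (n / S0) * (z' W z) / (z' z), with z the deviations from the mean."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()
    s0 = w.sum()  # scaling factor S0: sum of all weights
    return (x.size / s0) * (z @ w @ z) / (z @ z)


def expected_i(n):
    """Expected value E(I) under the null hypothesis of no spatial autocorrelation."""
    return -1.0 / (n - 1)


def moran_quadrants(values, w):
    """Moran scatter plot quadrant per region: HH, LL, HL or LH (ties at zero count as high)."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()
    wz = w @ z  # spatially lagged vector Wz
    return [("H" if zi >= 0 else "L") + ("H" if li >= 0 else "L")
            for zi, li in zip(z, wz)]


# Toy example (assumed data): two tight groups of regions with low and high values.
coords = [[0, 0], [1, 0], [2.1, 0], [5, 0], [6, 0], [7.1, 0]]
literacy = [1, 1, 1, 5, 5, 5]
w = knn_weights(coords, k=1)
print(morans_i(literacy, w), expected_i(len(literacy)))
print(moran_quadrants(literacy, w))
```

A value of I well above E(I) points to positive spatial autocorrelation, that is, regions with similar values lying next to each other, which is the pattern behind the high/high and low/low clusters mentioned above.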

References

Acemoglu, D., Johnson, S., & Robinson, J. A. (2005). Institutions as the fundamental cause of long-run growth. In P. Aghion & S. Durlauf (Eds.), Handbook of economic growth (pp. 385–472). Elsevier.
Benhabib, J., & Spiegel, M. M. (1994). The role of human capital in economic development: Evidence from aggregate cross-country and regional U.S. data. Journal of Monetary Economics, 34(2), 143–173.
Cuaresma, J. C., Doppelhofer, G., & Feldkircher, M. (2012). The determinants of economic growth in European regions. Regional Studies, 48(1), 44–67. https://doi.org/10.1080/00343404.2012.678824
Demeulemeester, J.-L., & Diebolt, C. (2011). Education and growth: What links for which policy? Historical Social Research, 36(4), 323–346.
Diamond, J. (1997). Guns, germs and steel: The fates of human societies. W. W. Norton.
Engerman, S., & Sokoloff, K. L. (2000). Institutions, factor endowments, and paths of development in the new world. Journal of Economic Perspectives, 14, 217–232.
Galor, O. (2005). From stagnation to growth: Unified growth theory. In P. Aghion & S. N. Durlauf (Eds.), Handbook of economic growth (Vol. 1A, pp. 171–293). North Holland.
Galor, O. (2011). Unified growth theory. Princeton University Press.
Gennaioli, N., La Porta, R., Lopez-de-Silanes, F., & Shleifer, A. (2013). Human capital and regional development. Quarterly Journal of Economics, 128, 105–164.
Hanushek, E. A., & Woessmann, L. (2015). The knowledge capital of nations: Education and the economics of growth. MIT Press.
Ljungberg, J., & Nilsson, A. (2009). Human capital and economic growth: Sweden, 1870–2000. Cliometrica, 3, 71–95.
Lucas, R. (1988). On the mechanics of economic development. Journal of Monetary Economics, 22(1), 3–42.
North, D. C. (1981). Structure and change in economic history. W. W. Norton.
Pritchett, L. (2001). Where has all the education gone? World Bank Economic Review, 15, 367–391.
Rodríguez-Pose, A., & Crescenzi, R. (2008). Research and development, spillovers, innovation systems, and the genesis of regional growth in Europe. Regional Studies, 42(1), 51–67.
Romer, P. M. (1986). Increasing returns and long-run growth. Journal of Political Economy, 94(5), 1002–1037.
Romer, P. M. (1990). Endogenous technological change. Journal of Political Economy, 98(5), S71–S102.
Solow, R. M. (1956). A contribution to the theory of economic growth. Quarterly Journal of Economics, 70(1), 65–94.
Sterlacchini, A. (2008). R&D, higher education and regional growth: Uneven linkages among European regions. Research Policy, 37(6), 1096–1107.

Chapter 2

Regional Human Capital Inequality in Europe

2.1

Introduction

Human capital has obtained considerable attention from both researchers and public policy-makers recently and in the more distant past. Human capital is often assumed to positively affect a variety of socioeconomic factors such as economic development (Galor, 2005a, 2005b, 2012; Lucas, 1988; Romer, 1990), democracy and human rights (Beach, 2009; McMahon, 1999; Sen, 1999). Nevertheless, many authors concentrate either on the recent development and significance of human capital or on its evolution in history. For example, Gennaioli et al. emphasise in a recent seminal contribution the “paramount importance of human capital in accounting for regional differences in development” in the world today (Gennaioli et al., 2013, p. 105).1 If human capital has such a striking explanatory power for regional inequalities, then a better understanding of the historical origins of current regional human capital differences appears to be fundamental to comprehend economic development. Are regional inequalities in human capital, then, a product of the modern times, or are they a heritage from a distant past? This is a very important issue, as it gives a better understanding of how large regional inequalities can be avoided, which may ultimately threaten the integrity of a country. In addition, it provides an idea about the degree to which educational policy decisions can have an impact of the human capital endowment of a region and places educational policies into a larger context. The fact that the European Union has witnessed a rise in regional inequalities in recent years makes this issue still more

This chapter was first published in slightly modified form as Diebolt, C., and Hippe, R. Regional human capital inequality in Europe, 1850–2010, Région et Développement, 2017, 45: 5–30. 1 The same is true when considering the city level. For example, Simon and Nardinelli (2002) find that high levels of initial human capital are the drivers for faster city growth in the United States between 1900 and 1990.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 C. Diebolt, R. Hippe, Human Capital and Regional Development in Europe, Frontiers in Economic History, https://doi.org/10.1007/978-3-030-90858-4_2

5

6

2 Regional Human Capital Inequality in Europe

urgent. New insights in this area are also relevant for developing countries, which can take lessons from the historical European experience to foster their regional development. There are already some studies that establish a link between historical and more recent human capital data. However, these studies typically take as the basic unit of analysis either the country level to make international comparisons (e.g. Barro & Lee, 2001; Morrisson & Murtin, 2009) or the regional level to focus on one country (e.g. Felice, 2012). In fact, studies that focus on the regional level in Europe and take a long-term approach, bringing together current data with distant historical ones, are rare. When it comes to regional human capital, the lack of information is even more striking. To fill this gap in the literature, this paper explores for the first time the long-term evolution of human capital in Europe at the regional level. More specifically, we show the regional patterns in various education indicators at different points in time. These patterns give a better impression on how regional inequalities in education have changed over time. In addition, we investigate whether regional education measures have converged over time or whether there was a pattern of increasing divergence. To this end, we construct a new and large database on human capital between 1850 and 2010 from a multitude of sources. More specifically, we employ different indicators for subsequent periods of time. In particular, we use three proxies: numeracy, literacy and educational attainment. Numeracy is measured by the ABCC index (see Hippe & Baten, 2012) and measures very basic calculating skills. Literacy is defined as the ability to read and write, and to have a similar measure for basic levels of educational attainment, we consider the share of individuals who do not have a low level of education (0–2), i.e. 
who can broadly be considered not to be “early school leavers”.2 These proxies have some important characteristics in common, which make them particularly appropriate for the study of their respective time periods. In this way, we can trace the regional evolution of human capital across the years 1850, 1900, 1930, 1960, 2000 and 2010. In particular, we consider the evolving inequality in human capital, employing several standard measures, such as the coefficient of variation (CV), the Gini coefficient and the Theil index. The use of different indicators at different points in time does not allow a direct comparison of inequality over time, so we focus on the common patterns at each point. For a better comparison, we also adapt historical regional boundaries to current NUTS 2 regions. In total, we have between 160 and 340 NUTS 2 regions in our database for different years. The results show that intranational inequalities in human capital have been important at different points in time. Regional differences are in many cases quite persistent and are in a number of cases higher than international ones. Convergence takes place in literacy between 1900 and 1960 and in educational attainment between 2000 and 2010. The inequality measures highlight important variations in inequality between countries throughout time. These findings underline the limitations of cross-country analyses and the need for further human capital research at the regional level in Europe.

The paper is structured as follows: first, we highlight some of the most important contributions in the human capital literature that make long-term comparisons or trace the long-term evolution of human capital in Europe. The second part explains the basic underlying methodology and portrays the different data sources that have been used in this study. Subsequently, the results on the regional evolution of human capital and on the regional inequalities are highlighted. Finally, a conclusion sums up the paper.

2 However, note that the definition is not exactly the same as the one currently used by the European Commission (in particular concerning the included age group). For further information, see Sect. 2.3.

2.2 Human Capital Formation in Europe in the Long Run

Human capital has been emphasised as a crucial factor in improving the lives of individuals (e.g. Vincent, 2000). For example, contrary to the first exogenous growth models, endogenous growth models have stressed the important role of human capital (Lucas, 1988; Romer, 1990). Human capital enters some of these models as a separate factor in the production function. It is understood as one form of capital (alongside physical capital and other forms), which allows the concept to be used for growth accounting.3 Furthermore, a long-run view on economic growth has been proposed by unified growth theory (e.g. Galor, 2005a, 2005b), which highlights that human capital is essential for the creation of long-run growth. However, human capital is a theoretical concept that cannot easily be measured empirically. This is especially true in the long run. Nevertheless, the literature has put forward different proxies for the long-run formation of human capital in Europe. In particular, it is possible to trace its evolution in Europe by the use of numeracy, literacy and book production.

3 For some principles of human capital theory, see Becker (1962) and Schultz (1962).

A’Hearn et al. (2009) provide information on numeracy (i.e. the ABCC index) for a number of European countries between 1350 and 1850. They show that there is a general tendency of increasing numeracy values over time. Let us first focus on western and northern Europe. Around 1450, the Netherlands were already more advanced in numeracy than (the more developed) northern Italy. The split between the north and the south of Italy is apparent in their data because the southern part of Italy had very low numeracy levels both around 1450 and 1500. Data for southern Italy are lacking for the centuries afterwards, but we know from Hippe and Baten (2012) and Felice (2012) that important differences were still visible at the beginning of the nineteenth century, which became progressively less pronounced until around the middle of the twentieth century (see also Gagliardi & Percoco, 2011). According to Felice (2012), there was a renewed (but small) tendency of divergence in the decades after 1960. The differences in other western and northern European countries between 1600 and 1850 are less striking. The United Kingdom was a numeracy leader in 1700, but other countries such as Denmark, the Netherlands, Belgium, France and Norway soon reached higher numeracy levels. Belgium appears to have advanced quite rapidly, as its catch-up phase between 1700 and 1800 was relatively short.

On the other hand, A’Hearn et al. (2009) also show the evolution of many central and eastern European countries. These countries generally had lower numeracy values than their western and northern European counterparts. In Germany, there was a divide similar to that in Italy in 1700, but here the criterion is not geographical but religious: the Protestant regions were more advanced in numeracy than the Catholic ones, giving further support to theories underlining the positive influence of Protestantism on human capital, such as Weber (1958) and Becker and Woessmann (2009). Central European German-speaking countries (Germany, Austria, Switzerland) had higher numeracy than countries to their east in 1700. Nevertheless, they progressed throughout the time period, whereas Protestant Germany fell back between 1750 and 1800, with Switzerland becoming the numeracy leader ahead of Austria and Poland. This description of numeracy has until now been limited to the country level (with some exceptions), but different projects are underway that will also highlight more regional differences in numeracy within these countries in the near future (see also Juif & Baten, 2012). These new projects will considerably improve our understanding of the formation of human capital in the European regions in the very long run.

Second, research on literacy has allowed further insights into European human capital formation.
For example, Houston (2001) portrays the evolution of regional male literacy in western Europe from before 1700 until 1970. He defines a threshold (i.e. at least 50% of males between 20 and 50 years have to be literate) and divides the European regions and countries into different categories. These categories show when a region surpassed this threshold level. Similar to the evolution in numeracy, Houston indicates that Germanic countries (Germany, Switzerland, the Netherlands, Sweden) were the leaders in literacy, having surpassed the threshold already before 1700. This might also be due to the fact that those countries were at least in part Protestant countries. The southeast of England and the larger Edinburgh area in Scotland were similarly quick. Moreover, literacy spread gradually to neighbouring regions in Belgium, France and northern Italy as well as to the other regions in Great Britain (except Wales) and Iceland by 1790. Geographical proximity appears to have been a decisive factor in the diffusion of knowledge in general and of literacy in particular. The pattern is still visible but less striking for the regions surpassing the threshold by 1850. In almost all French regions, the majority of men became literate during this time period. Exceptions are the Celtic region of Bretagne, some central regions and Corsica. It is possible that the language barrier that separated some of these regions from their French-speaking neighbours played a role here. The same suggestion can be made for the late progress in Wales and Ireland. The strongholds of Gaelic-speaking regions in Ireland’s western part only surpassed the threshold by 1900. However, one has to keep in mind that literacy is defined by reading and writing a particular language (often the official and not the regional language), and this might have biased the results here. In Spain, the northern-central regions were the literacy leaders. Most other Spanish regions surpassed the threshold only by 1900 (except in the south, where it took even more time). The same class includes the northwestern regions in Portugal, the remaining regions in Ireland and France as well as the northern to central Italian regions. Finally, the last European regions became majority (male) literate by 1970; some southern and northern Portuguese regions were the last ones to reach the threshold.

More generally, there are different databases available for the period from the second half of the nineteenth century until today. These are in part international databases, including European databases mostly at the national level or sometimes referring to large regions constituting those countries. Some of the most well-known are provided by Banks (1971), Flora (1983), Benavot and Riddle (1988), Barro and Lee (2001), Mitchell (2003), De La Fuente and Domenech (2006), Cohen and Soto (2007) and Morrisson and Murtin (2009) (see an overview in Table 2.1). One has to add that these databases are not always independent of one another. In particular, the most recent ones are in part constructed from earlier databases and apply different measurement and correction methods. As can easily be seen, the most popular proxies for the last decades have been educational attainment and years of schooling. The discussions and the intention to improve these databases underline once more the need for more and better data (e.g. De La Fuente & Domenech, 2006; Krueger & Lindahl, 2001).

Table 2.1 Databases on international evolution of human capital in the longer term

| Authors | Time period | Examples of human capital proxies |
|---|---|---|
| Banks (1971) | 1860–1966 | Primary, secondary and tertiary school enrolment; literacy; number of books |
| Barro and Lee (2001) | 1960–2000 | Educational attainment, years of schooling |
| Benavot and Riddle (1988) | 1870–1940 | Primary enrolment rates |
| Cohen and Soto (2007) | 1960–2010 | Educational attainment, years of schooling, enrolment rates |
| De La Fuente and Domenech (2006) | 1960–1995 | Educational attainment, years of schooling |
| Flora (1983) | 1810–1977 | Enrolment rates, number of pupils, number of teachers |
| Lindert (2004) | 1830–2000 | Enrolment rates, years of schooling, teachers |
| Mitchell (2003) | 1830–1919 | Primary, secondary and tertiary education (number of pupils and teachers) |
| Morrisson and Murtin (2009) | 1870–2010 | Years of schooling |


Nevertheless, even given these ongoing improvements at the international country level, the regional level is still not adequately represented in Europe. Comparable regional data are available only for the last 10–15 years, especially from the European Statistical Office Eurostat. If one wants to go back further in time, data collection becomes much more difficult. This is true in particular if one is interested not only in western but also in eastern Europe. This is surprising given the striking relevance of human capital for the economy and the importance of regional differences in human capital. Accordingly, Cipolla believes that the use of national averages “in a number of cases [. . .] conceal internal differences that are as interesting and significant as international variations” (Cipolla, 1969, pp. 15–16). Therefore, how has human capital evolved in the European regions? Have regional inequalities really been important? And if so, have they been as important as international ones (or even more important)? This paper makes a first step to fill this gap in the literature.

2.3 Methodology and Data

2.3.1 Measure of Human Capital

For the comparison of regional human capital in the long run, it is important that the employed variables follow some basic common principles. Clearly, this is a difficult task as, theoretically, there are many different possibilities to measure human capital. However, in practice, their number is substantially reduced by limited data availability in different European countries. For example, the use of income-based approaches (e.g. using the skill premium) is not possible at the European regional level. Therefore, we use an education-based approach to human capital. More specifically, we use three education variables that may each be considered the representative variable of their respective time periods.

The first variable is numeracy, calculated by the use of the age heaping method. It is still a rather recent method but has become a very dynamic research field, as evidenced by the number of publications over the last years (e.g. A’Hearn et al., 2009; Crayen & Baten, 2010; Hippe & Baten, 2012; Manzel & Baten, 2009). Numeracy is a particularly appropriate proxy if the aim is to measure human capital in early periods of human development. This proxy can be employed until the twentieth century in many European countries and in some of them even later on. Because regional data for other human capital variables are much more restricted and less broad in geographical coverage, it may be expected that this method will attract further contributions in the future. The details of this method have already been discussed in other publications (see e.g. Hippe, 2012b; Hippe & Baten, 2012). Thus, we do not go into more detail in this paper. Let us only mention that there are characteristic heaping patterns on ages in historical censuses and even in the censuses of some currently less developed countries. In particular, a part of the individuals did not report their exact age but rounded it to an age ending in 0 or 5. The most important reason for this was that they did not know and were not able to calculate their age. It can be shown that one can measure numerical capacities by taking advantage of this rounding pattern. In practice, the ABCC index is defined as

$$\mathrm{ABCC}_{jt} = 125 - 125 \times \left( \sum_{i=5}^{14} n_{5i,jt} \Big/ \sum_{i=23}^{72} n_{ijt} \right), \tag{2.1}$$

where n is the number of observations at age i (in years) in region j at point in time t, so that the numerator covers the ages 25, 30, ..., 70. Numeracy is used to measure human capital around 1850.

Second, the next variable is literacy. Literacy is measured by the ability to “read and write”. This proxy has been used for a long time and is still used in many international publications today (see e.g. UNESCO, 2005). In short, it measures the reading and writing abilities of individuals as stated to census takers or as filled out in census forms. We take one common definition for literacy that was used in the earlier decades of the twentieth century and that is similarly used until today4:

$$\mathrm{Literacy}_{jt} = \sum_{i=10}^{N} rw_{ijt} \Big/ \sum_{i=10}^{N} n_{ijt}, \tag{2.2}$$

where rw is the number of individuals who are able to “read and write” and N is the highest age in years. Unfortunately, some countries did not collect information on the literacy of the total population, as is the case e.g. for the Scandinavian countries at the turn and in the first decades of the twentieth century. Hence, they cannot be included in this study.

4 For more details, see Appendix.

Finally, literacy as the measure of education of the population is progressively replaced by the level of educational attainment during the twentieth century in most countries. Therefore, the third and last variable is educational attainment. It is one of the standard measures used in today’s official publications and has been widely used in the literature on human capital today and in the recent past (and in part beyond) (e.g. Breinlich, 2006; Lopez-Rodriguez et al., 2005; Redding & Schott, 2003; Rodriguez-Pose & Tselios, 2011). It measures the share of individuals who have surpassed a certain educational threshold level, in particular primary, secondary or tertiary education. Clearly, education systems vary considerably throughout Europe and have been subject to changes throughout history. This makes it more difficult to compare educational attainment. Still, it is possible to obtain a level of sufficient standardisation that allows human capital values to be computed. This is common practice in international publications in general, comparing different countries in the world, and in publications on Europe and the European Union in particular. Eurostat provides standardised measures for all of its members. A further advantage of taking regions as the unit of analysis as compared to countries is that the regions within a country are bound to the same educational system and have to adhere to the same ruling principles. Therefore, within-country comparisons are generally not biased by differences in the education systems. We measure educational attainment in the following way:

$$\mathrm{Notlowedu}_{jt} = 1 - \left[ \sum_{i=15}^{N} le_{ijt} \Big/ \sum_{i=15}^{N} n_{ijt} \right], \tag{2.3}$$

where le is the number of individuals who have achieved pre-primary, primary and lower secondary education as highest level of education and n is the number of all individuals.5 We have opted for this definition because both numeracy and literacy indicators are proxies of rather basic human capital. In contrast, taking e.g. the share of individuals with tertiary education would clearly be an indicator of more advanced human capital. This measure would be less revealing on the abilities of the overall population. For this reason, it appears more appropriate to choose a proxy for the attainment of rather low education. This proxy captures the basic attainment of the entire population.
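The three measures in Eqs. 2.1 to 2.3 can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the input layout (a mapping from single year of age to a head count for one region and one census) are assumptions made for illustration only.

```python
from typing import Mapping


def abcc_index(age_counts: Mapping[int, float]) -> float:
    """ABCC index (Eq. 2.1) from counts by single year of age (hypothetical input layout)."""
    # Numerator: people reporting an age that is a multiple of five (25, 30, ..., 70).
    heaped = sum(age_counts.get(5 * i, 0) for i in range(5, 15))
    # Denominator: everyone aged 23 to 72.
    total = sum(age_counts.get(age, 0) for age in range(23, 73))
    if total == 0:
        raise ValueError("no observations in the 23-72 age range")
    return 125 - 125 * heaped / total


def share_above_age(flagged: Mapping[int, float],
                    population: Mapping[int, float],
                    min_age: int) -> float:
    """Share of people aged min_age or older who carry a given attribute."""
    numerator = sum(n for age, n in flagged.items() if age >= min_age)
    denominator = sum(n for age, n in population.items() if age >= min_age)
    if denominator == 0:
        raise ValueError("no observations at or above the age threshold")
    return numerator / denominator


def literacy_rate(read_write: Mapping[int, float],
                  population: Mapping[int, float]) -> float:
    """Literacy (Eq. 2.2): share able to read and write among those aged 10 and over."""
    return share_above_age(read_write, population, min_age=10)


def not_low_education(low_edu: Mapping[int, float],
                      population: Mapping[int, float]) -> float:
    """Eq. 2.3: one minus the share with at most lower secondary education, ages 15 and over."""
    return 1 - share_above_age(low_edu, population, min_age=15)
```

With no heaping, exactly one fifth of the reported ages between 23 and 72 are multiples of five and the index equals 100; if every reported age is a multiple of five, it equals 0. The sketch applies the formula as written and does not truncate values above 100, which can arise when fewer than one fifth of ages fall on multiples of five.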

5 This definition is based on the availability of data by Eurostat, referring to ISCED levels 0–2. Note that the Eurostat data are derived from the EU Labour Force Survey and refer to the “economically active population”, including both employed and unemployed individuals and following the principles set up by the ILO (see e.g. ILO, 1982). In contrast, the census data for Russia in 2010 refer to the overall population. Alternatively, it is also possible to take the age range of 25–64 years for the countries provided by Eurostat. Because the two age ranges have a correlation of 99.5%, the results do not change materially when using the alternative age range. We prefer the definition of ages above 15 years because it allows us to include the data on Russia in 2010 and enables easier comparison with literacy.

2.3.2 Indicators of Inequality

Apart from providing some descriptive evidence on regional inequalities, we also use more analytical tools. In particular, we use weighted indexes of dispersion to analyse regional inequalities. More specifically, we use the coefficient of variation (CV) as our regional inequality measure. The CV is defined as

$$\mathrm{CV} = \frac{\sigma}{\mu} \times 100, \tag{2.4}$$

where σ is the standard deviation of regional human capital values and μ represents the average population-weighted human capital value. This measure is especially appropriate for our study because it is dimensionless and thus enables an easy comparison between the countries of our dataset, which are characterised by very different mean values. As an alternative standard measure of inequality, we
construct population-weighted Gini coefficients.6 This standard measure allows us to countercheck our CV results. For long-term comparisons, although it is clear that each of our human capital variables does not measure exactly the same attributes of human capital, it is important that the variables have some common characteristics to improve their comparability. There are several features that are common or at least similar to all three variables that underline the meaningful use of these variables in the present study. First, all variables are in some sense representative of the contemporaneous period at which they were collected. More specifically, the problem that arises is that often only one educational variable is provided in a census. In early censuses, no direct educational variable (such as literacy) is provided in many cases, which is why the use of the age heaping method to calculate numeracy values is appropriate. Later on, literacy emerges as a standard educational variable. When literacy levels are relatively high (and thus numeracy levels even more so), censuses often begin to report educational attainment, with illiteracy as the category for the lowest educational attainment (i.e. no educational attainment at all). Therefore, educational attainment has become today a standard variable for education. However, the simultaneous use of literacy as a part of the educational attainment variable does not allow a clear comparison between both “independent” variables. Such a pattern can be found for a number of countries such as Italy and Russia. In consequence, it appears reasonable to employ the most basic indicator of human capital (i.e. numeracy) at the beginning of the period, then a more advanced one (literacy) and finally a current standard educational variable (educational attainment). Clearly, the use of one variable throughout the period would improve comparability. 
However, neither numeracy nor literacy can be used after the beginning or the middle of the twentieth century, and no other variable is available at the regional level for the whole of Europe throughout time. Our use of several proxies is, therefore, currently the best available way to show regional inequalities over a longer time period. Second, the human capital variables are well correlated with each other. For example, the literature has shown that there is a close correspondence between numeracy and literacy levels in censuses from the United States (A’Hearn et al., 2009), developing countries (Hippe, 2012a) and Europe (Hippe, 2012a; Hippe & Baten, 2012). In these studies, it was possible to derive numeracy and literacy values from the same census and for the same set of regions.7 Moreover, Crayen and Baten (2010) have shown the correlation between historical numeracy and schooling.

6 See Jenkins (1999/2010) for details of calculation.

7 See in particular Hippe (2012a) for a detailed discussion and analysis of this relationship between numeracy and literacy.

Third, all educational variables have mostly been calculated (or are in part directly taken) from official census publications or official surveys. These official documents are generally intended to provide public policy-makers and the general public with indications on the state of the population in different contexts. They may
be better suited than other rather unofficial documents. For this reason, the common underlying document types underline the methodology common to all variables. Fourth, some of the variables may be argued to be more or less direct output measures. This is clear for the case of literacy where the ability to “read and write” is measured. This is also true for numeracy, although the output has to be computed from an age distribution. In view of educational attainment, this is also the case for tertiary education. Even though one may consider secondary education not to be a direct output measure, it is still a relevant variable characterised by distinctive regional inequalities. In particular, early school leavers play a significant role here. They are those individuals “who leave education and training with only lower secondary education or less, and who are no longer in education and training” (Council of the European Union, 2011, C 191/01). The Council of the European Union confirms that this issue is important. It stresses that the reduction in the number of these early school leavers is essential to achieve some of the key objectives of the Europe 2020 strategy (Council of the European Union, 2011). Still in 2009, the early school leavers constituted a share of 14.4%.8 This arguably still (too) high share illustrates that the successful accomplishment of primary education and parts of secondary education cannot be taken for granted for the whole population until today. In consequence, this has even more so been the case in the past. Fifth, all numeracy, literacy and educational attainment measures are considering an important part of the total population. In other words, they are not restricted to some particular social groups in society such as military recruits or married couples, which are commonly used to approximate literacy by using signature rates. Furthermore, they normally consider both sexes (not only males) in contrast to military recruitment data. 
Thus, all variables are a measure of the basic human capital of the overall population and are not subject to biases that may arise when only a part of the population is considered.

8 Therefore, the goal of the Europe 2020 strategy is to reduce this share to 10% by 2020 (Council of the European Union, 2011).

Sixth, and directly connected to this, one should consider a similar part of the population with regard to age. All variables are commonly defined by the use of a certain age threshold, i.e. they only take into account the individuals above a certain age. Numeracy takes account of the great majority of the individuals that constitute the population of the other two variables. In consequence, all proxies measure the share of individuals of similar ages. Finally, the seventh and maybe the most important common characteristic for the actual measurement of human capital is the definition of the variable at stake. All three variables are defined by an identical value range, which goes from 0 (or 0%) to 100 (or 100%). The reason for this is that every variable considers the share of individuals who have some form of education. The rest of the individuals do not have this attribute. In other words, every variable is derived from a binary indicator. In consequence, all variables are subject to the same advantages and disadvantages
inherent to share measures (for consequences, see also Hippe, 2012b). This common measurement framework makes it particularly appropriate to consider numeracy, literacy and educational attainment for estimating regional human capital in Europe from 1850 until today. Given this number of common features, human capital data have been collected from a variety of sources. First, the data on numeracy have been taken from the large database by Hippe and Baten (2012). Second, literacy data have been added for 1900 and 1960 from the census publications of the different European countries under study (see Appendix for more details). Moreover, the data referring to 1930 have been taken from Kirk (1946). Finally, educational attainment data for 2000 and 2010 have been collected from Eurostat (2011). They have been supplemented by census data from Russia in 2010.9 Indeed, there are a number of limitations and problems which are related to the use of different datasets. On the one hand, the datasets have to be merged. However, the regional coverage of each database is different. In consequence, only a lower number of regions can be studied when several datasets are considered together. On the other hand, there may be differences in the definition of the variables. For example, the literacy definitions are not always homogeneous across countries and throughout time. Although the differences in the historical data are mostly not large (see Appendix), the interpretation of the results should always be considered within this context. The more recent data mostly stem from Eurostat, so maximum comparability of the data has already been ensured. Finally, the consideration of a long time scale necessitates the use of a common framework for the classification of regions. Otherwise, the results would lack comparability throughout time. For this reason, we have used the NUTS classification. 
The NUTS classification is the official European classification which has been developed by the European Union. Clearly, the historical regions of Europe do not always correspond to the current ones. Spain and France are good examples where this is largely the case. Eastern Europe is a quite different matter as empires have broken up and important border shifts occurred during the last 150 years. To this end, the historical regions are adapted as best as possible to fit the current regions. More specifically, we standardise our approach by using NUTS 2 regions. NUTS 2 regions are for example the régions in France or the Regierungsbezirke in Germany. In this way, we construct a database with 160–340 NUTS 2 regions depending on the considered year. Note that the NUTS 2 level can be a fairly large aggregation level, in particular for smaller countries (e.g. Slovenia has only two NUTS 2 regions). Thus, the low number of regions could also lead to a high sensitivity in the inequality measures. Given these limitations, the results always need to be taken with some caution.
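The population-weighted dispersion measures named in this chapter (CV, Gini coefficient, Theil index) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the standard deviation in the CV is taken as population-weighted around the weighted mean, and the Gini coefficient uses the weighted mean-absolute-difference formulation, which may differ in detail from the computation in Jenkins (1999/2010).

```python
import math
from typing import List, Sequence


def _shares(population: Sequence[float], n_values: int) -> List[float]:
    """Turn regional population counts into weights that sum to one."""
    if len(population) != n_values:
        raise ValueError("values and population must have the same length")
    total = sum(population)
    if total <= 0:
        raise ValueError("total population must be positive")
    return [p / total for p in population]


def weighted_mean(values: Sequence[float], population: Sequence[float]) -> float:
    w = _shares(population, len(values))
    return sum(wi * xi for wi, xi in zip(w, values))


def weighted_cv(values: Sequence[float], population: Sequence[float]) -> float:
    """Coefficient of variation in percent (Eq. 2.4), assuming a weighted standard deviation."""
    w = _shares(population, len(values))
    mu = weighted_mean(values, population)
    variance = sum(wi * (xi - mu) ** 2 for wi, xi in zip(w, values))
    return math.sqrt(variance) / mu * 100


def weighted_gini(values: Sequence[float], population: Sequence[float]) -> float:
    """Gini coefficient as half the weighted mean absolute difference over the mean."""
    w = _shares(population, len(values))
    mu = weighted_mean(values, population)
    mean_abs_diff = sum(
        wi * wj * abs(xi - xj)
        for wi, xi in zip(w, values)
        for wj, xj in zip(w, values)
    )
    return mean_abs_diff / (2 * mu)


def weighted_theil(values: Sequence[float], population: Sequence[float]) -> float:
    """Theil T index; regions with a value of zero contribute nothing (limit of x ln x)."""
    w = _shares(population, len(values))
    mu = weighted_mean(values, population)
    return sum(wi * (xi / mu) * math.log(xi / mu)
               for wi, xi in zip(w, values) if xi > 0)


# Hypothetical example: literacy rates and populations of three regions in one country.
literacy = [0.45, 0.60, 0.90]
inhabitants = [200_000, 500_000, 300_000]
print(weighted_cv(literacy, inhabitants),
      weighted_gini(literacy, inhabitants),
      weighted_theil(literacy, inhabitants))
```

All three functions return zero when every region has the same value, and all are sensitive to the number of regions, which is the caveat raised above for countries with very few NUTS 2 regions.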

9 The Russian census data have been classified according to similar classes as specified in the Eurostat data, i.e. ISCED levels 0–2.


Table 2.2 Descriptive statistics for the unweighted human capital indicators

| Indicator | Year | Obs. | Mean | SD | Min | Max |
|---|---|---|---|---|---|---|
| ABCC | 1850 | 304 | 91.96 | 11.64 | 26.38 | 100.00 |
| Literacy | 1900 | 239 | 0.57 | 0.30 | 0.05 | 1.00 |
| Literacy | 1930 | 228 | 0.73 | 0.21 | 0.17 | 1.00 |
| Literacy | 1960 | 169 | 0.82 | 0.11 | 0.59 | 0.99 |
| Educational attainment | 2000 | 254 | 0.70 | 0.16 | 0.14 | 0.95 |
| Educational attainment | 2010 | 339 | 0.77 | 0.12 | 0.26 | 0.97 |

2.4 Results

2.4.1 Evolution of Human Capital in the European Regions: 1850–2010

A first intuition on the overall evolution may be derived from descriptive statistics for all human capital variables, as depicted in Table 2.2. Before 2010, we have the highest number of regions in 1850, taking advantage of the large database by Hippe and Baten (2012). The number of observations is decreasing during the twentieth century until the most recent data for 2000 and 2010. Interestingly, although the indicators and the time periods are very different, the descriptive statistics show that not only the number of regions but also the standard deviation and minimum and maximum values are very similar in 1850 and 2010. Overall, we may summarise that there is sufficient variation for all variables at all points in time to obtain relevant and pertinent results. The definition of literacy between 1900 and 1960 is (almost) identical, allowing us to make some general comments on its evolution. Note, however, that not always the same countries are included (in particular the data for 1960 refer only to peripheral European countries). Nevertheless, as one would have hypothesised, literacy is progressing over the time period 1900–1960, and regional inequalities decrease as literacy approaches its maximum level. Moreover, one may indicate that the number of observations and the distribution of our measure of educational attainment in 2000 are relatively similar to those of literacy in 1930. Data on more regions become available in 2010, but the minimum and mean values show that educational attainment is progressing until the present. Let us now turn to the more detailed analysis of the data. Due to the different points in time covered in this paper, we limit our analysis of each point in time to the most important aspects and discuss the most noticeable changes. 
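As a rough arithmetic check on the statement that dispersion in literacy falls between 1900 and 1960, the ratio of standard deviation to mean can be computed directly from the figures in Table 2.2. The sketch below is only illustrative: it uses the rounded, unweighted table values, so the results are not the population-weighted CVs of Eq. 2.4, and ratios for different indicators should not be compared with each other.

```python
# (indicator, year, mean, standard deviation) as reported in Table 2.2.
table_2_2 = [
    ("ABCC", 1850, 91.96, 11.64),
    ("Literacy", 1900, 0.57, 0.30),
    ("Literacy", 1930, 0.73, 0.21),
    ("Literacy", 1960, 0.82, 0.11),
    ("Educational attainment", 2000, 0.70, 0.16),
    ("Educational attainment", 2010, 0.77, 0.12),
]


def cv_percent(mean: float, sd: float) -> float:
    """Unweighted coefficient of variation in percent."""
    return sd / mean * 100


cv_by_year = {year: cv_percent(mean, sd) for _, year, mean, sd in table_2_2}

for indicator, year, _, _ in table_2_2:
    print(f"{indicator:<24}{year}  CV = {cv_by_year[year]:5.1f}%")
```

Within each indicator, the ratio declines from one benchmark year to the next, in line with the description in the text.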
To begin with the data on the ABCC around 1850, Hippe and Baten (2012) note that the most important regional differences exist within Bulgaria, Serbia, Spain and Russia (see also the standard deviation data in the Appendix). There is also an important northsouth difference in Italy and France, to a (much) lesser degree, the reverse can be said of Norway. Spain appears to be characterised by a core-periphery pattern. The case in Russia is more complicated: the Caucasus region and the Belarusian regions


Fig. 2.1 Literacy, c. 1900 (NUTS2). Note: Data for historical Germany, Denmark, Finland, Luxembourg, the Netherlands, Norway and Sweden are not available. For mapping purposes, their literacy rates have been estimated to be above 90%

are the least numerate, while Estonia in particular has the highest numeracy values. The latter may be attributed to its historical and cultural ties to the most advanced countries in literacy (and numeracy) in Scandinavia. In the next step, we consider literacy in 1900. Because literacy had not yet attained its maximum value in many countries, there is more regional variation than in the ABCC. Figure 2.1 shows regional literacy rates in 1900 (see also the data in the Appendix). To our knowledge, it is the first time that such a map has been produced for the turn of the century. Thus, it constitutes in itself a contribution to the existing literature. It shows that the regions with the highest literacy rates were located in central and northern Europe. In addition, the Scandinavian countries, the Netherlands, Germany, Luxembourg and Switzerland can be assumed to have literacy rates above 90%, placing them in the highest literacy category. These are the countries which have historically been the leaders in literacy (see Houston, 2001). This argument is also similar to Kirk’s (1946) reasoning for 1930. Therefore, we argue that this choice is well justified.10 Furthermore, the map gives an overall impression of a core-periphery model, with central Europe at its core. Still, there are a number of regions that perform better or worse than one would expect based solely on their geographical location. For example, the Kosovo region appears to have had comparatively

10 But, of course, further data would provide more information about regional variation even within those advanced countries.


Fig. 2.2 Literacy, c. 1960 (NUTS2). Note: Only national data available for Belarus and Ukraine, data unavailable for Polish Opolskie region

low literacy levels. On the other hand, regions such as Santander in northern Spain appear to outperform other regions. Capital regions, such as Madrid in Spain, Attica in Greece and St. Petersburg in Russia, tend to have higher literacy. The importance of national administrations and the relevance of literacy in an urbanised environment are probably important drivers of this pattern. Note also that Cisleithania (i.e. the Austrian part of Austria-Hungary) has a very wide range in literacy at that time (from about 0.2 to almost 1). Regional inequalities are also persistent in a number of other countries between 1850 and 1900. This is the case, for example, for Spain, Hungary and Italy. Less pronounced but still important variation is apparent for the regions belonging to the Ottoman Empire, Portugal and the Russian Empire. Advancing to 1930 (not shown), many results of the previous points in time are confirmed, underlining the persistence of human capital inequalities. In 1960, the literacy scale already indicates that regional variation has considerably decreased (Fig. 2.2). For this reason, note that the number of considered countries is limited to less developed regions in western, eastern and southeastern Europe. The highest regional variation can be observed in Spain, Italy and Serbia. Portugal has the lowest average literacy rates in Western Europe, even lower than most regions in the East. The northern and larger urban regions of St. Petersburg, Moscow and Murmansk are positive outliers in Russia. The patterns are in many ways similar to the ones found in 1930 and 1900. Although the Austro-Hungarian Empire dissolved after the First World War, its geographical limits are still visible more than 40 years later. Similarly, the historical regional differences in Yugoslavia are still perceptible. Path dependency is clearly one important feature that emerges from this analysis. In general, we also see that the numeracy and literacy patterns broadly reflect the previous information available in the literature (such as Houston, 2001) but provide broader and in part more detailed information, in particular also highlighting disparities in Eastern Europe. Taking educational attainment in 2000 (not shown), we can now include many more countries. Still, Portugal is at the lower bound of educational attainment. This highlights the historical continuity of Portugal's low human capital performance in a European comparison. Its capital, Lisbon, has significantly higher levels of educational attainment. Nevertheless, Lisbon is still at the very bottom when compared with other countries (except Malta, which has even lower educational attainment). The highest average values come from the Czech Republic and Slovakia, which constituted one country until 1992. Other former communist countries, such as Poland and Hungary, also have high educational attainment levels. Given our indicator, within-country regional inequalities are relatively small when compared to literacy data in 1900 or 1930. For example, Italy’s regional educational disparities are relatively small. Still, the highest inequalities are found in Spain. Finally, the current situation in 2010 (not shown) is similar to that in 2000, although some countries have been added. Russian disparities are relatively small, but historical literacy leaders are once more significant positive outliers (Moscow, St. Petersburg, Murmansk). In general, one can state that educational attainment has been rising during the first decade of the twenty-first century. In particular, the heritage of educational policies in former communist countries in central Europe is clear, given the high values for educational attainment.
Nevertheless, regional inequalities are striking until the present, underlining the importance of the regional level for academic research and policy-makers. After these first comparisons in the long run, we should take a more specific look at the changes from one point of time to another. Given the fact that our three human capital variables are not directly comparable, we emphasise the changes in literacy between 1900 and 1960 and in educational attainment between 2000 and 2010. In Fig. 2.3, we consider the initial literacy rates in 1900, 1930 and 2000 and compare them with the subsequent growth over the next years. We define the annual growth rate as

$$\Delta HC_{j,t-T} = \frac{HC_{j,t} / HC_{j,T}}{t - T}, \qquad (2.5)$$

where HC is the human capital variable, ΔHC is the annual growth rate in human capital, j is a region, t is the later point in time, and T is the initial point in time. Note that the dataset is significantly reduced in 1960, resulting in fewer observations for comparisons involving 1960. Panel (a) at the top of the figure shows this relationship for literacy in 1900 and growth in literacy between 1900 and 1930. We find a clear convergence pattern in which the regions with the highest literacy rates (and thus closest to the maximum literacy limit) grow the least and those with low literacy rates grow much more strongly.
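The growth measure can be illustrated with a minimal Python sketch. It assumes that Eq. (2.5) is read as the ratio of later to initial human capital divided by the number of years between the two observations; the function name `annual_growth_rate` and the three regions with their literacy values are hypothetical and serve only as an illustration.

```python
def annual_growth_rate(hc_initial, hc_later, year_initial, year_later):
    """Growth measure under the assumed reading of Eq. (2.5):
    (HC_t / HC_T) / (t - T), with T the initial and t the later year."""
    if hc_initial <= 0:
        raise ValueError("initial human capital must be positive")
    if year_later <= year_initial:
        raise ValueError("the later year must follow the initial year")
    return (hc_later / hc_initial) / (year_later - year_initial)


# Hypothetical literacy rates (shares between 0 and 1) in 1900 and 1930.
regions = {
    "low_start": (0.10, 0.40),
    "mid_start": (0.50, 0.80),
    "high_start": (0.95, 0.98),
}

rates = {
    name: annual_growth_rate(initial, later, 1900, 1930)
    for name, (initial, later) in regions.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.4f}")
```

Under this reading, a region whose literacy does not change at all still scores 1/(t − T), about 0.033 over a 30-year interval, so the measure is best compared across regions rather than interpreted as a percentage change.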

Fig. 2.3 Growth in regional human capital. Panels: (a) annual growth rate, 1900–1930, against literacy, 1900; (b) annual growth rate, 1930–1960, against literacy, 1930; (c) annual growth rate, 2000–2010, against educational attainment, 2000


Table 2.3 Convergence in regional human capital

| Variables | (1) ΔHC 1900–1930 | (2) ΔHC 1930–1960 | (3) ΔHC 1900–1930 | (4) ΔHC 1930–1960 | (5) ΔHC 2000–2010 |
|---|---|---|---|---|---|
| ln(HC 1900) | −0.0296*** (0.000) | | −0.0281*** (0.000) | | |
| ln(HC 1930) | | −0.0323*** (0.000) | | −0.0322*** (0.000) | |
| ln(HC 2000) | | | | | −0.0429*** (0.000) |
| Constant | 0.0319*** (0.000) | 0.0298*** (0.000) | 0.0326*** (0.000) | 0.0299*** (0.000) | 0.0943*** (0.000) |
| Regions | All | All | Reduced | Reduced | All |
| Observations | 178 | 141 | 118 | 118 | 254 |
| R-squared | 0.73 | 0.66 | 0.60 | 0.62 | 0.70 |

Note: p-values in parentheses; *** p < 0.01, ** p < 0.05, * p < 0.1

Still, there are a number of regions that underperform in this respect, generating noise at the lower end of the distribution. These regions come particularly from the Russian Empire, Yugoslavia and Portugal. Among other factors, this low performance may stem from the effects of the Russian Revolution, the First World War and the construction and formation of new regions and countries. But positive outliers can also be found. For example, the highest growth is found for Albania and the second highest for Armenia in the top-left corner; both are, together with Azerbaijan, at the very bottom of the literacy scale. The most important negative outlier in the bottom-left corner is the Russian region of Dagestan, an ethnically mixed region located north of Azerbaijan with an important share of Muslims. One could hypothesise that the efforts of the Russian central authorities to increase literacy were not very effective because they had only limited power in this region. The same plot 30 years later (but with fewer observations) does not appear to show as many important outliers from the general pattern (b). Clearly, many regions now have higher literacy rates, so one might also expect to find important regional differences in growth rates, particularly at the middle values of the literacy scale. In particular, we find that the regions with higher growth rates than others at literacy levels between 40 and 80% came from Bulgaria and Romania. Finally, we find a similar pattern again for current educational attainment between 2000 and 2010 (c). The lowest attainment values with the highest growth refer to Portuguese regions. We can also use econometric tools to check whether we find convergence. Similar to Zhang and Li (2002), we regress the average annual human capital growth rate on the natural log of human capital in the initial year (see Table 2.3). In columns one and two, we include all regions.
In all cases, the coefficient is negative and significant at the 1% level. Convergence is evident. Interestingly, the negative coefficient is smaller in absolute terms in column one than in column two, indicating that convergence is faster in the period 1930–1960 than in the period 1900–1930. However, we have to take into account that the number of observations is considerably reduced in 1960. Therefore, we check our previous results in columns three and four by only including those regions that are available at all relevant points in time (i.e. 1900, 1930 and 1960). The basic finding remains. We can also control for country effects by including dummies for today’s countries. In that case, the negative coefficient for 1900–1930 becomes smaller and the one for 1930–1960 larger, indicating that the finding of an increasing speed of convergence in the latter period remains robust (not shown). Thus, the findings suggest that β-convergence (i.e. convergence in average literacy levels across regions) is taking place and that convergence is faster in the second period than in the first. In column five, we report the coefficient for the period 2000–2010. However, as the human capital variable is different, we cannot compare this result with those obtained for literacy. Still, convergence appears to be taking place more rapidly for this variable in more recent years.
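The convergence test can be sketched in a few lines of Python. This is an illustrative ordinary least squares fit of the growth rate on the log of initial human capital, not the estimation routine used for Table 2.3; the function `beta_convergence`, the regional values and the growth definition (ratio of later to initial literacy divided by 30 years) are assumptions made for the example.

```python
import numpy as np


def beta_convergence(hc_initial, growth):
    """OLS of growth on ln(initial human capital).

    Returns the constant, the slope and R-squared. A negative slope
    indicates that regions with low initial levels grow faster.
    """
    x = np.log(np.asarray(hc_initial, dtype=float))
    y = np.asarray(growth, dtype=float)
    design = np.column_stack([np.ones_like(x), x])  # constant + ln(HC)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = float(np.sum((y - fitted) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return {
        "constant": float(coef[0]),
        "slope": float(coef[1]),
        "r_squared": 1.0 - ss_res / ss_tot,
    }


# Hypothetical regional literacy rates in 1900 and 1930.
hc_1900 = np.array([0.10, 0.25, 0.40, 0.60, 0.80, 0.95])
hc_1930 = np.array([0.35, 0.55, 0.65, 0.80, 0.92, 0.98])

# Assumed growth definition: ratio of later to initial level per year.
growth = (hc_1930 / hc_1900) / 30

result = beta_convergence(hc_1900, growth)
print(result)
```

Restricting the sample to regions observed at all points in time, as in columns three and four, would amount to filtering the input arrays before calling the function; country dummies would add further columns to the design matrix.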

2.4.2 Intranational Inequality

The evolution of inequality within a country is the next crucial aspect to take into account. The complete results when using the CV are shown in Table 2.4. As an alternative standard measure of inequality, it is also possible to construct population-weighted Gini coefficients.11 The results are quite similar to those obtained for the CV (not shown). Therefore, the use of the Gini coefficient validates the results obtained by using the CV. In consequence, for simplicity, we will refer only to the CV in the following. It is important to emphasise that our methodology, that is, the use of different human capital proxies, does not allow a direct comparison of CVs and Ginis over time. Such a comparison is neither intended nor desired here. Instead, our focus is on the ranking of countries with regard to regional inequalities. It is clear that Serbia was marked by very high regional differences. These differences may result from the fact that Serbia was created from regions that formerly belonged to countries with very different levels of development, e.g. Austria-Hungary and the Ottoman Empire, and that we consider Kosovo to be still part of Serbia. Although regional inequalities decrease until 1960, Serbia was not able to overcome these differences. Furthermore, Portugal appears to have been characterised by important regional differences throughout history even though this country is rather small and homogeneous in cultural terms. This finding underlines once more that regional variation can be striking even for small countries. Other countries with high CVs are in particular Italy, Spain, Greece and Russia. Still, these CVs have different causes in each case. For example, Italy shows striking north-south differences over the

11 See Jenkins (1999/2010) for details of calculation.


Table 2.4 CVs at different points in time Country AT BE BG BY CH CZ DE DK ES FI FR GR HR HU IE IT NL NO PL PT RO RU SE SI SK SR UA UK

CV 1850 0.323 1.411 12.175 3.999 0.781 0.218 0.295 0.542 7.412 2.230 1.299 1.635 4.234 3.153 0.544 0.718 4.571 4.659 2.932 6.689

1900 4.529 6.481 12.961 15.565

1930 1.190 2.241 7.031 5.734

1.119

0.406

32.949

19.180 3.310 2.703 13.333 25.744 3.499

7.078

19.448

6.654

14.337 14.667 11.125 23.920

1.264 7.109 3.423 6.752

9.471 27.285 19.853 7.992 5.209 38.138

28.613 31.880 27.712 30.643

1960

2000 3.089 5.971

3.037

1.145

1.663 2.516 4.454 4.929 14.409 0.909 4.314 13.993 5.571 3.097 2.137 7.806 3.226 2.450 2.128 21.818 10.363 5.227 2.685 2.926 1.585

7.498

4.142

3.528 4.306

4.320 4.020 1.684

17.021 1.017 6.252 20.836 3.586 5.594 4.542 4.135 1.971 3.888 31.568 12.952 4.321

0.518 2.095 18.641 5.592 4.096

4.801 11.324 66.775 24.735 1.804

0.437 3.782 37.854 10.880

2010 2.920 4.571 6.223

11.277

centuries. More specifically, the north has relatively high levels of human capital, while the south lags behind. The core-periphery pattern in Spain has led to similarly important regional disparities. The Greater Athens region in Greece is not only the capital region of the country but thereby also its administrative, economic and educational centre, giving it an advantage over other Greek regions. Finally, although we are only considering the European part of Russia, this is still a huge area, comprising different ethnic and cultural groups, so regions do not develop their human capital at an identical pace over time. We also further investigate the relationship between the regional inequality within one country and the national average of human capital. Presumably, approaching the maximum level of human capital will lead to a decrease in regional inequality. In Fig. 2.4, we show that this expectation is largely borne out. The years 1850,

Fig. 2.4 Regional inequality and level of human capital. Panels: 1850, 1900, 1930, 1960, 2000 and 2010; each panel plots the CV of the human capital indicator (vertical axis) against the human capital indicator (horizontal axis), with points labelled by country abbreviation

Table 2.5 Negative differences between European and national Ginis and CVs

| Country | Year | Gini difference | CV difference |
|---|---|---|---|
| BG | 1850 | 0.016 | 1.189 |
| ES | 2010 | 0.007 | 0.243 |
| GR | 2000 | 0.017 | 2.229 |
| GR | 2010 | 0.004 | |
| HR | 1930 | 0.001 | 2.354 |
| PT | 2000 | 0.038 | 12.960 |
| PT | 2010 | 0.031 | 7.652 |
| RU | 1930 | 0.005 | 0.530 |
| SR | 1850 | 0.035 | 7.655 |
| SR | 1900 | 0.051 | 18.620 |
| SR | 1930 | 0.042 | 14.464 |

1960, 2000 and 2010 appear to show this tendency relatively clearly. However, the year 1900 shows a different pattern. In fact, there is a tendency for inequality to increase until literacy reaches about 50% and then to decrease at higher levels of literacy. The only country that does not follow this pattern is Serbia. Serbia is again a clear outlier, for the reasons noted above. Accordingly, apart from Serbia, Italy has the highest CV in this sample, with a literacy level that is also close to 50%. This pattern is only observable in this case because 1900 is the only year for which we have a relevant number of national averages of a human capital indicator below 50%. All other years are concentrated at levels higher than 50%. Indeed, this potential inverted U-shaped form of regional inequality in human capital appears to indicate a Kuznets curve of regional human capital inequality. A “Kuznets curve of human capital inequality” has recently been suggested by Morrisson and Murtin (2013). Their methodology is quite different from ours as they do not consider regional inequality but inequality in the distribution of education (as measured by different levels of schooling), taking the national level and calculating the returns to education. Therefore, this tentative evidence could suggest that we also find a Kuznets curve for regional inequality when using our methodology. Finally, we want to address the question of whether intranational inequality may sometimes be higher than international inequality. To this end, we calculate the average CV and Gini at the European level and compare them to the national CVs and Ginis. The results are shown in Table 2.5. Apart from Greece in 2010, where a negative difference exists only for Ginis, the countries that have higher intranational inequality than international inequality are the same when using Ginis and CVs. These countries are Bulgaria, Spain, Greece, Croatia, Portugal, Russia and Serbia. Here, we only consider the whole sample of European countries.
Comparing regional inequalities to smaller groups of countries (e.g. a number of neighbouring countries), we would certainly find even more countries where regional inequalities are higher than international ones. Still, these examples emphasise that regional inequalities can be quite significant and in some cases they may even be larger than between-country differences.
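The comparison of national and European inequality can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's procedure: the CV is computed as 100 times the (unweighted) standard deviation divided by the mean, the Gini uses a standard population-weighted mean-absolute-difference formula, and the European-level values are computed across national averages. The three countries, their regional values and populations are hypothetical.

```python
import numpy as np


def coefficient_of_variation(values):
    """CV in percent: 100 * population standard deviation / mean."""
    v = np.asarray(values, dtype=float)
    return float(100.0 * v.std(ddof=0) / v.mean())


def weighted_gini(values, weights):
    """Population-weighted Gini: sum_i sum_j w_i w_j |x_i - x_j| / (2 * mean)."""
    v = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise weights to population shares
    mean = float(np.sum(w * v))
    abs_diff = np.abs(v[:, None] - v[None, :])
    return float(np.sum(w[:, None] * w[None, :] * abs_diff) / (2.0 * mean))


# Hypothetical regional literacy rates and populations (in millions).
regions = {
    "A": {"hc": [0.30, 0.55, 0.90], "pop": [2.0, 1.0, 1.0]},
    "B": {"hc": [0.85, 0.90, 0.95], "pop": [1.0, 1.0, 1.0]},
    "C": {"hc": [0.60, 0.65, 0.70], "pop": [1.0, 2.0, 1.0]},
}

national_mean = {c: float(np.average(d["hc"], weights=d["pop"])) for c, d in regions.items()}
national_pop = {c: float(sum(d["pop"])) for c, d in regions.items()}
national_cv = {c: coefficient_of_variation(d["hc"]) for c, d in regions.items()}
national_gini = {c: weighted_gini(d["hc"], d["pop"]) for c, d in regions.items()}

# Assumed European-level inequality: dispersion across national averages.
european_cv = coefficient_of_variation(list(national_mean.values()))
european_gini = weighted_gini(list(national_mean.values()), list(national_pop.values()))

# A negative difference means inequality within the country exceeds
# inequality between countries.
negative_cv = [c for c in regions if european_cv - national_cv[c] < 0]
negative_gini = [c for c in regions if european_gini - national_gini[c] < 0]
print(negative_cv, negative_gini)
```

In this made-up sample only country A, with its wide spread between regions, shows a negative difference on both measures, which mirrors the kind of case listed in Table 2.5.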


These results suggest that the study of regional data can offer important insights that are lost in pure cross-country analyses. In a longer-run perspective, regional inequalities appear to be a natural phenomenon, which, however, can be tackled (at least to some extent) if appropriate policy decisions are taken.

2.5 Conclusion

This paper has traced the long-run evolution of human capital in the European regions between 1850 and 2010. Human capital is an important factor that has to be considered in a variety of settings because it affects economic and social developments. The role of human capital has particularly been stressed by parts of the economic growth literature in recent decades, from endogenous growth theories (Lucas, 1988; Romer, 1990) to the recent contributions of unified growth theory (e.g. Galor, 2005a). We have constructed a new and large database from a variety of sources. The data from different points in time have allowed us to construct a dataset that covers at least an important share of all European countries in 1850, 1900, 1930, 1960, 2000 and 2010. To our knowledge, this is the first time that it has been possible to show and analyse the regional evolution of human capital over such a long time period for such a large number of countries on the European continent. To this end, three different proxies have been used: numeracy, literacy and educational attainment. More specifically, numeracy is measured by the ABCC index, literacy by the share of individuals able to “read and write” and educational attainment by the share of employed individuals having attained an education level above lower secondary education. We show that this choice is not arbitrary but that the inherent characteristics of these variables make them appropriate for the purposes of this study. For a general overview of the evolution of human capital in the long run, we have presented some of the literature that has contributed evidence on human capital in a longer historical perspective. More precisely, we have outlined the evolution of numeracy, literacy and book production. From this point of departure, we have traced the evolution of human capital in the European regions between 1850 and 2010.
The most striking result is that regional differences have been significant in many European countries in the past and the present. The persistence of regional human capital differences is striking in many cases. Literacy (1900 to 1960) and educational attainment (2000 to 2010) rates have been shown to converge over time. In addition, we measure regional inequality by the coefficient of variation and the Gini coefficient. Both show that some countries have been much more unequal than others. The high regional differences within countries, which can sometimes be higher than intercountry differences, underline that cross-country analyses miss an important part of the human capital story. For this reason, more research needs to be done at the regional level in Europe. This paper has taken a further step in this direction. The new evidence is important for


both academics and policy-makers because it contributes to the understanding of long-term developments which are important for the present and the future. In particular, it further emphasises the need to better understand the territorial dimension of human capital. At this moment, regional information on human capital is mostly only available in the form of different levels of educational attainment. However, in this study, we were able to use historical output measures in the form of numeracy and literacy, and it would be more than appropriate to have similar measures for today. In other words, large-scale surveys such as PISA, TIMSS, etc. could be used to a greater degree for regional studies. So far, regional data are mostly limited to a few countries, and most studies in this area only consider one or two countries. Further data collection at the regional level would increase the knowledge upon which policy-makers can effectively take the appropriate policy decisions. In addition, the existence of long-term educational structures indicates that policy reforms have to be designed in a way that they take into account a range of “deep” institutional, cultural and economic structures that may be related to educational outcomes. This means that structural changes may be needed, which may not always be an easy task – but the rewards can be significant, in terms of both educational improvements and, potentially, long-run growth (see also Hanushek & Woessmann, 2016).

Appendix

Country abbreviations

| Abbreviation | Country |
|---|---|
| AL | Albania |
| AM | Armenia |
| AT | Austria |
| AZ | Azerbaijan |
| BA | Bosnia-Herzegovina |
| BE | Belgium |
| BG | Bulgaria |
| BY | Belarus |
| CH | Switzerland |
| CZ | Czech Republic |
| DE | Germany |
| DK | Denmark |
| EE | Estonia |
| ES | Spain |
| FI | Finland |
| FR | France |
| GE | Georgia |
| GR | Greece |
| HR | Croatia |
| HU | Hungary |
| IE | Republic of Ireland |
| IS | Iceland |
| IT | Italy |
| LT | Lithuania |
| LU | Luxembourg |
| LV | Latvia |
| MD | Moldova |
| ME | Montenegro |
| MK | Northern Macedonia |
| NL | Netherlands |
| NO | Norway |
| PL | Poland |
| PT | Portugal |
| RO | Romania |
| RU | Russia |
| SE | Sweden |
| SI | Slovenia |
| SK | Slovakia |
| SR | Serbia |
| UA | Ukraine |
| UK | United Kingdom |

• Data for 1850
Hippe, R. and J. Baten (2012). Regional inequality in human capital formation in Europe, 1790–1880, Scandinavian Economic History Review, 60 (3): 254–289.

• Data for 1900

| Country | Census year | Source |
|---|---|---|
| Albania | 1918 | Preliminary dataset “Albanische Volkszählung von 1918”, entstanden an der Karl-Franzens-Universität Graz unter Mitarbeit von Helmut Eberhart, Karl Kaser, Siegfried Gruber, Gentiana Kera, Enriketa Papa-Pandelejmoni und finanziert durch Mittel des Österreichischen Fonds zur Förderung der wissenschaftlichen Forschung (FWF). Special thanks to Siegfried Gruber for providing the data. |
| Austria (Cisleithania) | 1900 | K. K. Statistische Central-Commission (1903). Oesterreichische Statistik. Ergebnisse der Volkszählung vom 31. December 1900, 2. Band, 2. Heft, Wien, Kaiserlich-Königliche Hof- und Staatsdruckerei. |
| Bosnia-Herzegovina | 1910 | Mayer, M. (1995). Elementarbildung in Jugoslawien (1918–1941), München, R. Oldenburg Verlag. |
| Belgium | 1900 | Statistique de la Belgique (1903). Population. Recensement général. 31 décembre 1900, Bruxelles, Typographie-Lithographie A. Lesigne. |
| Bulgaria | 1900 | Principauté de la Bulgarie (1906). Résultats généraux du recensement de la population dans la principauté de Bulgarie au 31 décembre 1900, 1-ère livraison, Sophia: Imprimerie “Gabrovo”. |
| Cyprus | 1911 | Mavrogordato, A. (1912). Cyprus. Report and general abstracts of the census of 1911, London: Waterlow & Sons Ltd. |
| France | 1906 | Statistique Générale de la France (1908). Résultats statistiques du recensement général de la population effectué le 6 mars 1906, Paris. |
| Greece | 1907 | Royaume de Grèce (1909). Résultats statistiques du recensement général de la population effectué le 27 octobre 1907, Tome I, Athènes: Imprimerie nationale. |
| Hungary (Transleithania) | 1900 | Magyar Statisztikai Közlemények (1907). A magyar szent korona országainak 1900. Evi. Népszámlálása. Harmadik rész. A népesség részletes leirása. Budapest: Pesti Könyvnyomda-Részvénytársaság. |
| Ireland | 1901 | Census of Ireland, 1901 (1902). Part II. General Report, Dublin: Brown & Nolan Ltd. |
| Italy | 1901 | Ministero di agricultura, industria e commercio (1907). Anuario statistico italiano 1905–1907, Fascicolo Primo, Roma: G. Bertero e C. |
| Montenegro | 1900 | MacKenzie Wallace, D. (2006). A short History of Russia and the Balkan States, Elibron Classics, Adamant Media Corporation. |
| Portugal | 1900 | Ministerio dos negocios da fazenda (1906). Censo Da Populacao Do Reino De Portugal No 1. De Dezembro De 1900, Vol. II, Lisboa: A Editora. |
| Romania | 1899 | Royaume de Roumanie (1905). Résultats définitifs du dénombrement de la population (décembre 1899), Bucarest, Eminesco. |
| Russian Empire | 1897 | издание центрального статистического комитета министерства внутренних (1899–1905). первая всеобщая. Переписъ населения, российской империи, 1897 г., с.петербург. Various volumes. |
| Serbia | 1906 | Direction de la Statistique d’Etat du Royaume de Serbie (1908). Annuaire statistique du Royaume de Serbie, Onzième Tome, Belgrade, Imprimerie de l’Etat du Royaume de Serbie. |
| Spain | 1900 | Dirección general del Instituto geográfico y estadístico (1903). Censo de la Poblacion de Espana, según el Empadronamiento hecho en la Península é Islas adyacentes en 31 de diciembre de 1900, Tomo II and Tomo III, Madrid: Imprenta de la Dirección general del Instituto geográfico y estadístico. |
| United Kingdom | 1901 | Hechter, M. (1976). U.K. County Data, 1851–1966 [computer file]. Colchester, Essex: UK Data Archive [distributor]. SN: 430, https://doi.org/10.5255/UKDA-SN-430-1. Although all efforts are made to ensure the quality of the materials, neither the original data creators, depositors or copyright holders, the funders of the Data Collections, nor the UK Data Archive bear any responsibility for the accuracy or comprehensiveness of these materials. |

Note: Age definitions are as follows: Italy, 6+; Austria, Bosnia-Herzegovina, 7+; Spain, 8+; Albania, Belgium, Bulgaria, France, Greece, Ireland, Portugal, Russian Empire, 10+; Hungary, Romania, Serbia, 11+; Cyprus, 15+; United Kingdom, unavailable (a comparison of the included data for Ireland with the source for Ireland as listed above has revealed similar overall results, so an age definition of 10+ can reasonably be assumed). Age definitions are either directly given in the publication or have been linearly estimated from available age definitions in order to be as close as possible to the standard definition of ages above 10 years. Various age definitions as calculated here may possibly not significantly affect the final results. For example, the percentage change of using a 5+ instead of a 10+ definition is below 1% in Ireland

• Data for 1930
Kirk, D. (1946). Europe’s population in the interwar years. Princeton: Princeton University Press.

• Data for 1960

| Country | Census year | Source |
|---|---|---|
| Bulgaria | 1956 | централно статистическо управление при министерския свет (1960). преброяване на населението в народна република българия на 1. XII. 1956 година, общи ресултати, книга II, софия: държавно издателство “наука и изкуство”. |
| Greece | 1961 | Royaume de Grèce (1968). Résultats du recensement de la population et des habitations effectué le 19 mars 1961, Vol. III, Athènes: Office nationale de Statistique. |
| Hungary | 1960 | Központi Statisztikai Hivatal (1962). 1960. Évi népszámlálás, Budapest, Allami Nyomda. |
| Italy | 1961 | Istat (2012). Serie Storiche, Tavola 7.1.1, online, last accessed 3 August 2012, http://seriestoriche.istat.it/fileadmin/allegati/Istruzione/tavole/Tavola_7.1.1.xls. |
| Poland | 1960 | Glówny Urzad Statystyczny (1960). Biuletyn statystyczny. Spis powszechny z dnia 6 grudnia 1960 r., Ludnosc. Gospodarstwa domowe, Wyniki ostateczne, Seria “L”, various issues, Warszawa. |
| Portugal | 1960 | Instituto nacional de Estatística (1960). X Recenseamento Geral Da Populacao, Tomo III, Lisboa: Sociedade Tipográfica. |
| Romania | 1956 | Republica Populara Romîna (1961). Recensamîntul Populatiei din 21 Februarie 1956, Rezultate Generale, Bucuresti, Direcţia Centrala de Statistica. |
| Spain | 1960 | Instituto nacional de Estadística (1969). Censo de la Poblacion y de la Viviendas de Espana, según la Inscripción realizada el 31 de diciembre de 1960, Tomo III, Madrid: I.N.E. Artes graficas. |
| USSR | 1959 | Demoscope (2012). Всесоюзная перепись населения 1959 года. Таблица 7. Распределение населения по возрасту и уровню образования. РГАЭ. Ф.1562 Оп. 336 Д.1591–1594, online, last accessed 8 August 2016, http://demoscope.ru/weekly/ssp/rus_edu_59.php. Demoscope (2012). Всесоюзная перепись населения 1959 года. Таблица 2,5. Распределение всего населения и состоящих в браке по полу и возрасту. РГАЭ. Ф.1562 Оп. 336 Д.1535–1548. Российская Государственная библиотека, отдел “литературы ограниченного пользования”, online, last accessed 8 August 2016, http://demoscope.ru/weekly/ssp/sng_mar_59_r.php. |
| Yugoslavia | 1961 | Statisticni urad Republike Slovenije (2012). Popis prebivalstva 1961, Prebivalstvo, staro 10 let ali več, po spolu, starosti in pismenosti, online, last accessed 8 August 2016, http://www.stat.si/publikacije/popisi/1961/1961_2_40.pdf. |

Note: Age definitions are as follows: Italy, 6+; Hungary, Poland, Portugal, Spain, 7+; Bulgaria, Romania, 8+; Greece, USSR, Yugoslavia, 10+. Age definitions are either directly given in the publication or have been linearly estimated from surrounding available age definitions to be as close as possible to the standard definition of ages above 10 years

• Data for 2000 and 2010
All countries except Russia: Eurostat (2011). Economically active population by sex, age and highest level of education attained, at NUTS levels 1 and 2 (1000), online, last accessed 22 June 2016, lfst_r_lfp2acedu.
Russia: всероссийская перепись населения о россии языком цифр (2012). Население по уровню образования по субъектам Российской Федерации, online, last accessed 12 October 2016, http://www.perepis-2010.ru/results_of_the_census/tab8.xls.

References

A’Hearn, B., Crayen, D., & Baten, J. (2009). Quantifying quantitative literacy: Age heaping and the history of human capital. Journal of Economic History, 68(3), 783–808.
Banks, A. S. (1971). Cross-polity time-series data. MIT Press.
Barro, R. J., & Lee, J.-W. (2001). International data on educational attainment: Updates and implications. Oxford Economic Papers, 53(3), 541–563.
Beach, M. J. (2009). A critique of human capital formation in the U.S. and the economic returns to sub-baccalaureate credentials. Educational Studies: A Journal of the American Educational Studies, 45(1), 24–38.
Becker, G. S. (1962). Investment in human capital: A theoretical analysis. Journal of Political Economy, 70, 9–49.

Becker, S. O., & Woessmann, L. (2009). Was Weber wrong? A human capital history of protestant economic history. Quarterly Journal of Economics, 124(2), 531–596.
Benavot, A., & Riddle, P. (1988). The expansion of primary education, 1870-1940: Trends and issues. Sociology of Education, 61(3), 191–210.
Breinlich, H. (2006). The spatial income structure in the European Union: what role for economic geography? Journal of Economic Geography, 6, 593–617.
Cipolla, C. M. (1969). Literacy and development in the West. Penguin Books.
Cohen, D., & Soto, M. (2007). Growth and human capital: Good data, good results. Journal of Economic Growth, 12, 51–76.
Council of the European Union. (2011). Council recommendation of 28 June 2011 on policies to reduce early school leaving, 2011/C 191/01. Retrieved August 15, 2014, from http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:C:2011:191:0001:0006:EN:PDF
Crayen, D., & Baten, J. (2010). Global trends in numeracy 1820-1949 and its implications for long-run growth. Explorations in Economic History, 47, 82–99.
De La Fuente, A., & Domenech, R. (2006). Human capital in growth regressions: How much difference does data quality make? Journal of the European Economics Association, 4, 1–36.
Eurostat. (2011). Economically active population by sex, age and highest level of education attained, at NUTS levels 1 and 2 (1000), data online. Retrieved June 22, 2012, from lfst_r_lfp2acedu.
Felice, E. (2012). Regional convergence in Italy, 1891-2011: Testing human and social capital. Cliometrica, 6(3), 267–306.
Flora, P. (1983). State, economy, and society in Western Europe 1815-1975, Vol. I. Campus Verlag.
Gagliardi, L., & Percoco, M. (2011). Regional disparities in Italy over the long run: the role of human capital and trade policy. Région et Développement, 33, 81–105.
Galor, O. (2005a). From stagnation to growth: Unified growth theory. In P. Aghion & S. N. Durlauf (Eds.), Handbook of economic growth (Vol. 1A, pp. 171–293). North Holland.
Galor, O. (2005b). The demographic transition and the emergence of sustained economic growth. Journal of the European Economic Association, 3(2–3), 494–504.
Galor, O. (2012). The demographic transition: causes and consequences. Cliometrica, 6, 1–28.
Gennaioli, N., La Porta, R., Lopez-de-Silanes, F., & Shleifer, A. (2013). Human capital and regional development. Quarterly Journal of Economics, 128, 105–164.
Hanushek, E. A., & Woessmann, L. (2016). Knowledge capital, growth, and the East Asian miracle. Science, 351(6271), 344–345.
Hippe, R. (2012a). How to measure human capital? The relationship between numeracy and literacy. Economies et Societes, 45(8), 1527–1554.
Hippe, R. (2012b). Spatial clustering of human capital in the European regions. Economies et Societes, 46(7), 1077–1104.
Hippe, R., & Baten, J. (2012). Regional inequality in human capital formation in Europe, 1790–1880. Scandinavian Economic History Review, 60(3), 254–289.
Houston, R. (2001). Literacy. In E. Stearns (Ed.), Encyclopedia of European social history (Vol. 5, pp. 391–406).
ILO. (1982). Resolution concerning statistics of the economically active population, employment, unemployment and underemployment. Adopted by the Thirteenth International Conference of Labour Statisticians, online. Retrieved August 15, 2014, from http://www.ilo.org/wcmsp5/groups/public/%2D%2D-dgreports/%2D%2D-stat/documents/normativeinstrument/wcms_087481.pdf
Jenkins, S. P. (1999/2010). INEQDECO: Stata module to calculate inequality indices with decomposition by subgroup. Statistical Software Components S366002, Boston College Department of Economics, revised 19 Apr 2001, online. Retrieved August 15, 2014, from http://ideas.repec.org/c/boc/bocode/s366007.html
Juif, D., & Baten, J. (2012). On the human capital of Inca Indios before and after the Spanish conquest. Was there a “Pre-Colonial Legacy”? Explorations in Economic History, 50(1), 227–241.

Kirk, D. (1946). Europe’s population in the interwar years. Princeton University Press.
Krueger, A. B., & Lindahl, M. (2001). Education for growth: Why and for whom? Journal of Economic Literature, 39, 1101–1136.
Lindert, P. (2004). Growing public. Cambridge University Press.
Lopez-Rodriguez, J., Faina, J. A., & Lopez-Rodriguez, J. (2005). New economic geography and educational attainment levels in the European Union. International Business & Economics Research Journal, 4(8), 63–74.
Lucas, R. (1988). On the mechanics of economic development. Journal of Monetary Economics, 22(1), 3–42.
Manzel, K., & Baten, J. (2009). Gender equality and inequality in numeracy: The case of Latin America and the Caribbean, 1880-1949. Revista de Historia Económica, 27(1), 37–73.
McMahon, W. W. (1999). Education and development: Measuring the social benefits. Oxford University Press.
Mitchell, B. R. (2003). International historical statistics: Europe 1750-1993. M. Stockton Press.
Morrisson, C., & Murtin, F. (2009). The century of education. Journal of Human Capital, 3(1), 1–42.
Morrisson, C., & Murtin, F. (2013). The Kuznets curve of human capital inequality: 1870–2010. The Journal of Economic Inequality, 11(3), 283–301.
Redding, S., & Schott, P. (2003). Distance, skill deepening and development: Will peripheral countries ever get rich? Journal of Development Economics, 72(2), 515–541.
Rodriguez-Pose, A., & Tselios, V. (2011). Mapping the European regional educational distribution. European Urban and Regional Studies, 18(4), 358–374.
Romer, P. (1990). Endogenous technological change. Journal of Political Economy, 99(5), 71–102.
Schultz, T. W. (1962). Reflections on investment in man. Journal of Political Economy, 70, 1–8.
Sen, A. (1999). Development as freedom. Anchor Books.
Simon, C. J., & Nardinelli, C. (2002). Human capital and the rise of American cities, 1900–1990. Regional Science and Urban Economics, 32(1), 59–96.
UNESCO. (2005). EFA global monitoring report 2006. Graphoprint.
Vincent, D. (2000). The rise of mass literacy. Polity Press.
Weber, M. (1958). The protestant ethic and the spirit of capitalism. Charles Scribner’s Sons.
Zhang, J., & Li, T. (2002). International inequality and convergence in educational attainment, 1960–1990. Review of Development Economics, 6(3), 383–392.

Chapter 3: Spatial Clustering of Numeracy and Literacy

3.1 Introduction

Human capital is an important factor for economic growth. In particular, unified growth theory underlines that human capital plays a major role in long-run economic development from stagnation to growth (e.g. Galor, 2012; Galor & Moav, 2002; Galor & Weil, 2000; Galor et al., 2009). Still, many studies compare human capital levels only at the national level. However, considerable differences may exist within nations (e.g. Canals et al., 2003; Diebolt et al., 2005). Hippe and Baten (2012a) show that regional inequality in human capital was high (but decreasing) in many European countries in the nineteenth century. Thus, taking a closer comparative look at the differences within countries at the European level may considerably advance our knowledge of the importance of human capital. Economic theories such as new economic geography (NEG; e.g. Fujita et al., 1999) have further highlighted the appropriateness of the region as the basic unit of analysis. Regional data also allow us to evaluate the diffusion of knowledge from one region to another. Accordingly, knowledge spillovers at the local level have attracted the interest of many researchers (e.g. Anselin et al., 2000; Del Barrio-Castro & García-Quevedo, 2005; Jaffe et al., 1993).

For these reasons, the aim of this paper is to analyse explicitly the spatial distribution of human capital in Europe in the long run. What role did geographical proximity play in human capital formation? To this end, we take the new and large database used by Hippe and Baten (2012a) for regional numeracy in the nineteenth century and add supplementary data on human capital in the twentieth century. In particular, we employ regional literacy data from censuses for the 1930s (collected by Kirk, 1946).

This chapter was first published in slightly modified form as Hippe, R. Spatial clustering of human capital in the European regions, Economies et sociétés, AF, 2013, 46 (7): 1077–1104.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 C. Diebolt, R. Hippe, Human Capital and Regional Development in Europe, Frontiers in Economic History, https://doi.org/10.1007/978-3-030-90858-4_3

The spatial distribution of these data is investigated using the methods of exploratory spatial data analysis (ESDA). These rather recent methods have increasingly been employed to investigate the role of space and spatial effects on economic and social variables. For example, they have been applied to the spatial distribution of GDP and patents as well as to convergence processes at the regional level (Baumont et al., 2003; Diebolt & Pellier, 2009; Le Gallo & Ertur, 2003a; López-Bazo et al., 1999). Mapping the data alone is not sufficient to detect clusters because human perception tends to find patterns even in random data (Messner et al., 1999). We would thus run the risk of drawing spurious conclusions. Furthermore, maps alone do not make it possible to evaluate the significance of these clusters. In contrast, ESDA allows a clear identification of spatial clusters and the verification of their significance. Therefore, we can evaluate the evolving spatial heterogeneity of basic human capital in the European regions by employing these methods. The specific methods used in this study are Moran’s I, the Moran scatter plot and the Moran significance map. We also check the robustness of the applied methodology.

The results highlight the spatial clustering of regional numeracy and literacy throughout our period. The overall pattern in 1850, consisting of a clustering of high and low values, can still be observed in the 1930s. Hence, geographical proximity to central European countries appears to be an important explanatory factor for regional human capital distribution in Europe. However, we also identify several distinct spatial regimes that do not follow this pattern.

The paper is structured as follows. First, we describe the general evolution of education in the nineteenth century and the first decades of the twentieth century. Second, we present the data and the methodology. We employ the age heaping method to proxy basic numeracy in 1850 and use literacy data in 1930. Our data are disaggregated at the regional level. For this reason, we have attributed the historical boundaries to a constant set of current administrative units. These are the NUTS categories as used by the European Union.¹ This strategy allows us to improve the comparability of our data over time. Third, we introduce ESDA as a tool for evaluating the spatial heterogeneity of human capital inequality in Europe. The fourth section presents the results of the application of these methods. Finally, the last section concludes.

3.2 Evolution of Basic Education in the European Regions

Prior to the nineteenth century, education was not perceived to be important for broad segments of the population. This is also highlighted by the fact that more than 90% of the population of all continents were not able to write in 1750 (Cipolla, 1969). Therefore, education had been a privilege of the upper classes for a very long time. Even in the nineteenth century, this can be illustrated by the case of Serbia: “[e]ducation in Servia is strong at the top and weak at the bottom” (Report of the International Commission, 1914, p. 270). Nevertheless, this attitude changed radically as education came to be seen as crucial not only for securing the authority of the state (Green, 1990; Vincent, 2000) but also for nation building and national integration and for encouraging economic development (Mishkova, 1994).

In the decades between 1870 and the First World War, education played an increasing role in public discussion. However, it was not only western European countries that perceived the general education of the people as an important stake for development. In eastern Europe, important efforts were also made to close the gap with the more advanced Western societies. At the beginning of the period, this region had very low enrolment rates, which improved during the following years (Benavot & Riddle, 1988). Bulgaria is a particularly appropriate example. After the creation of the Bulgarian Principality in 1878, primary education was made compulsory by the Bulgarian constitution in 1879 (Mishkova, 1994). Subsequently, several laws reiterated the endeavour to significantly improve the basic education of the people. Moreover, the state actively supported the construction of new schools, and more pupils were able to acquire basic education. Finally, this policy bore fruit: international observers judged that the educational situation in Bulgaria was better than in other Balkan states in the 1910s (Report of the International Commission, 1914).²

¹ NUTS stands for nomenclature of territorial units for statistics. Eurostat distinguishes between different NUTS levels, beginning with NUTS 0 (the country level) and going down to NUTS 3 (approximately the county level).

What about the regional differences in basic human capital in the European countries at that time? A general improvement of basic education throughout the nineteenth century can be seen in the development of regional numeracy (Hippe & Baten, 2012a).
Scandinavia, the United Kingdom and central Europe had already attained very high numeracy levels in the first decades of the nineteenth century, whereas southern and eastern countries and regions such as Spain, the Balkans and the Russian Empire lagged behind. Moreover, regional patterns are observable. For example, Spain is characterised by a core-periphery pattern where peripheral regions, in particular in the west (Galicia) and in the south (Andalusia), have lower levels of basic numeracy than the more central regions. In contrast, Italy is divided into a northern part with high numeracy and a southern one with rather low values. However, the authors considered not only the geographical aspect of the regional distribution of human capital but also its evolution over time and the inequalities that existed in different European countries. They show that, in general, the inequalities (as measured by the coefficient of variation) diminished in these countries during the nineteenth century.

² The same was still true 30 years later (Kirk, 1946).

The main tendencies of this study are corroborated by research using other human capital indicators. For example, the data used by Cinnirella and Hornung (2011) indicate that enrolment rates of 6–14 year olds were already high by contemporary standards in Prussia in 1816 (on average 60.3%). These rates increased during the century. In 1849, they already reached 80.2%. The lowest rates were mostly found in the Poznan provinces, which were also those provinces with the lowest ABCCs in Hippe and Baten’s (2012a) study. Regional variation was even less important at the end of the nineteenth century, as demonstrated by average enrolment rates of 93.5% in 1886 and 94.4% in 1896.

Moreover, Felice (2012) assembled human capital data on larger Italian regions in the nineteenth and twentieth centuries. His human capital index is a combination of literacy and enrolment rates. He finds that there were large differences in 1871, which decreased during the following decades. North-western regions had the highest values, followed by those from the centre/northeast. Finally, the south and the Italian islands were characterised by lower human capital.

For Spain, Núnez (1992) shows that literacy rates follow a core-periphery pattern similar to the one indicated by numeracy data. Literacy rates increased so that almost the whole population in the northern regions was literate in the 1930s. However, other regions, particularly in the south, lagged behind. The correspondence between the overall regional variation in numeracy and literacy has also been demonstrated by Hippe (2012) for a set of historical European regions and some of today’s developing countries in Africa, Asia and Latin America.

Moreover, Kirk (1946) considers literacy in the whole of Europe during the first decades of the twentieth century. He shows that important regional variation in literacy remained around 1930. In fact, the overall patterns are similar to those detected by Hippe and Baten (2012a) for the nineteenth century. The most advanced countries are located in central, northern and western Europe (see Fig. 3.1). There are some elements that may explain the differences between the countries and the regions within the countries. In particular, Kirk names language, the dominant religion and political components as explanatory factors.
Language might play a role when the language used by parts of the population differs from the official language of the state. This may lead to a disadvantage for children who go to school and have to learn a new language. Second, religion might be considered as a factor due to the importance attributed to reading the scriptures. Protestant regions are generally more advanced in literacy than others, a result that has once more been highlighted by Becker and Woessmann (2009). Third, history counts because the former political boundaries that had been modified by the First World War are still visible in the 1930s. Kirk cites the example of Alsace-Lorraine, the most literate region in France owing to its heritage of the German public school system. Moreover, the borders of the former Austro-Hungarian Empire are still apparent. Fourth, government policies contributed to the extent to which illiteracy was fought in the first decades of the nineteenth century. In particular, Latin countries did not sufficiently succeed in putting educational policies into place. Portugal is just one obvious case. In contrast, countries in eastern Europe were better able to introduce general education more thoroughly as a means to boost their development in the years before 1930.

Fig. 3.1 Literacy (in %) in the European regions, ca. 1930. Note: No literacy estimate was available for Luxembourg, but its illiteracy rate should be between 0 and 5%, as in the neighbouring regions. Source: Own calculations based on data by Kirk (1946)

Nevertheless, apart from these factors, Kirk states quite clearly that space plays the predominant role in the explanation of the spatial distribution of literacy: “the most important element in the degree of literacy was geographical proximity to, and cultural intercourse with, the more literate regions of Northwestern Europe” (Kirk, 1946, p. 187). These examples demonstrate once more the importance of not only analysing national human capital formation but also looking at what happens inside countries. However, Kirk makes this hypothetical statement without any (spatial) econometric analysis. In contrast, we are able to test this hypothesis about the crucial impact of space on regional human capital distribution in Europe.

3.3 Data

The data used in this paper are taken from several sources. First, the database created by Hippe and Baten (2012a) has been used to estimate the regional distribution of numeracy in Europe around 1850. Regional numeracy values are calculated using the age heaping method (e.g. A’Hearn et al., 2009; Crayen & Baten, 2010b; Hippe & Baten, 2012b; Stolz et al., 2013). Age heaping describes the fact that in most historical censuses (and other sources), the actual age distribution deviates from the expected one.³ More precisely, far more individuals reported ages ending in 0 or 5 than could actually have been the case. These people were not aware of their exact age and therefore rounded (or heaped) it. This age heaping phenomenon can be used to calculate an index that proxies numeracy.⁴ We use the ABCC index in this study because it has the same value range as literacy data, i.e. it varies from 0 to 100. The ABCC is calculated in the following way:

$$\mathrm{ABCC} = 125 - 125 \times \left( \sum_{i=5}^{14} n_{5i} \Big/ \sum_{i=23}^{72} n_i \right), \tag{3.1}$$

where i stands for the years of age and n stands for the number of observations.⁵ Following the standard definition in the literature, we take the ages between 23 and 72 to proxy for basic numerical skills around 1850.⁶

Second, Kirk (1946) collected an impressive database on Europe in the interwar period. We use the data on regional literacy from his dataset. The definition of literacy used by Kirk (1946) is

$$\mathrm{Literacy} = \left( \sum_{i=10}^{N} rw_i \Big/ \sum_{i=10}^{N} n_i \right) \times 100, \tag{3.2}$$

where rw corresponds to the number of individuals in a region able to read and write, n to the total number of individuals in that region, and i to the years of age. As can be derived from the formula, individuals are included in the data from the age of 10 years.

Two questions might arise. First, why do we use both numeracy and literacy data? Second, can we use both proxies to evaluate the evolution of human capital? To answer the first question, we have to deal with the fact that literacy data are not available for many countries as early as 1850. In consequence, we have to rely on another proxy for this point in time. This proxy is the ABCC. On the other hand, the ABCC has already reached or is close to its maximum level in 1930 for a large range of countries, even more so than is the case for literacy. For this reason, the ABCC cannot be used throughout our time period either. As to the second question, we can use numeracy and literacy together because a significant positive correlation between these two indicators has been found in several studies (A’Hearn et al., 2009; Crayen & Baten, 2010a; Hippe, 2012; Hippe & Baten, 2012a). Moreover, both numeracy and literacy are output measures, measuring the actual performance of an individual. This makes them more comparable than an output measure combined with an input proxy, such as school enrolment. School enrolment only measures the proportion of pupils enrolled at school without giving us information on the actual knowledge obtained in school. This knowledge might evidently vary considerably (see also Vincent, 2000).⁷ Nevertheless, we are aware of the potential measurement bias from using different proxies. Still, we believe that this strategy allows us to pursue the goal of this paper, which is to highlight the spatial distribution of basic human capital in our period.

³ This pattern is still observable in the censuses of a variety of developing countries in the second half of the twentieth century (see Hippe, 2012).
⁴ Other reasons leading to age heaping can also be imagined. However, Crayen and Baten (2010a) and Hippe (2012) show that education is clearly the most important factor.
⁵ This formula is the result of a linear transformation of the Whipple index into the ABCC index.
⁶ Note that we adjust for the first birth decade (23–32 years), as proposed by Crayen and Baten (2010a). In addition, because not all the data are available for this point in time, we take earlier or later ABCC values and estimate the values for this point in time by linear extrapolation.

After discussing the human capital proxies employed in this study, we have to define the notion of a “region”. To make the data more comparable, we use a standard administrative definition. This means that we have converted historical administrative borders into today’s NUTS regions (see also Hippe & Baten, 2012a). Obviously, internal and external borders have changed throughout our period. This is very much the case for eastern European countries such as Poland, Hungary and Russia due to wars and revolutions. In contrast, countries such as France and Spain have retained almost identical administrative divisions during the last 150 years. The latter countries clearly allow a much easier comparison between historical and current data. Where administrative units have changed, we adapted our methodology to this as well as possible. However, individual cases have to be interpreted with caution when borders have changed substantially.
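As an illustration, the two human capital proxies of Eqs. (3.1) and (3.2) can be sketched in Python as follows. This is a minimal sketch and not the procedure used to build the database: the function names and the toy age distributions are hypothetical, and the adjustment for the youngest birth decade (23–32 years) proposed by Crayen and Baten (2010a) is deliberately left out.

```python
def abcc_index(age_counts):
    """Return the ABCC index of Eq. (3.1) for a dict mapping age -> head count."""
    # Denominator: all persons aged 23 to 72.
    total = sum(age_counts.get(age, 0) for age in range(23, 73))
    if total == 0:
        raise ValueError("no observations between ages 23 and 72")
    # Numerator: ages 5*i for i = 5, ..., 14, i.e. 25, 30, ..., 70.
    heaped = sum(age_counts.get(5 * i, 0) for i in range(5, 15))
    return 125.0 - 125.0 * heaped / total


def literacy_rate(literate_counts, age_counts, min_age=10):
    """Return literacy in % as in Eq. (3.2): literate persons aged min_age or
    older divided by all persons aged min_age or older."""
    literate = sum(c for age, c in literate_counts.items() if age >= min_age)
    total = sum(c for age, c in age_counts.items() if age >= min_age)
    if total == 0:
        raise ValueError("no observations at or above the age threshold")
    return 100.0 * literate / total


# Hypothetical age distributions (illustrative only, not census data).
no_heaping = {age: 10 for age in range(23, 73)}            # every age equally frequent
full_heaping = {age: (50 if age % 5 == 0 else 0) for age in range(23, 73)}

print(abcc_index(no_heaping))    # upper bound of the index: no heaping
print(abcc_index(full_heaping))  # lower bound: everyone reports a multiple of 5

# Hypothetical literacy counts by age; the age-5 group falls below the threshold.
population = {5: 40, 10: 60, 20: 100}
literate = {5: 0, 10: 30, 20: 50}
print(literacy_rate(literate, population))
```

With one fifth of all reported ages falling on multiples of five, the index reaches its maximum of 100; if every reported age is a multiple of five, it falls to 0, which is the value range shared with the literacy rate.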
Although this strategy has the aforementioned drawbacks, we proceed in this way because it has the very important advantage of allowing us to look at more or less the same regional structure throughout our period. Nevertheless, we had to adapt some regions to the requirements of the different censuses.⁸

How aggregated are our data? We have data available on all NUTS levels. For example, we could use the lowest NUTS level, NUTS 3, in Spain, France, Austria or Slovakia. In principle, we would prefer to always analyse the data at the highest level of regional disaggregation to use all the information available, i.e. to maximise the number of regions and to avoid possible biases involved in using more aggregated regional units. However, we use NUTS 2 in general and, in some particular cases, NUTS 1,⁹ to standardise the regional classification. It is important for the following ESDA methods to operate on similar levels because our estimations would otherwise be potentially prone to biases related to different sizes of our unit of analysis. For this reason, we merged NUTS 3 regions into NUTS 2 regions and weighted the human capital values according to the total population in the region. This strategy allows us to construct a dataset of human capital proxies in the European regions.

⁷ Apart from the fact that there is (to our knowledge) no database available yet for school enrolment or other indicators, such as years of schooling, at the regional level for that time.
⁸ This is particularly true for Russia. In fact, the provinces of the USSR, particularly of the Ukrainian SSR, as listed in Kirk (1946), are bigger in 1930 than the ones in 1897. Moreover, the current administrative structure does not have an equivalent to NUTS 2 (only NUTS 0 or NUTS 1 and NUTS 3). In consequence, we had to merge several smaller regions into bigger regions, which would constitute NUTS 2 regions. These bigger regions are formed in accordance with historical predecessors.
⁹ These are the greater Paris and London regions.

However, according to Kirk (1946), some countries no longer report any literacy rates in the 1930s. This has already been the case since the beginning of the century (UNESCO, 1953). These are generally countries with very high literacy rates, such as the Scandinavian countries, Germany or the United Kingdom.¹⁰ In consequence, Kirk estimated that these countries had illiteracy rates between 0 and 5%. Because there is no regional variation in these estimates, we would automatically (and artificially) create positive spatial autocorrelation if we left these countries in the data. This would clearly violate the fundamental concepts of the analysis strategy employed. In consequence, we excluded the countries concerned from northern and central Europe. Note that our results are, therefore, limited to this specific setup and could be altered if all European regions could be included. This always has to be taken into account in the later analysis. Nevertheless, we can still work with a large database for western, southern and eastern Europe and are able to evaluate spatial clustering for these large parts of Europe.

Table 3.1 Descriptive statistics for human capital proxies

Year  Proxy                obs.  mean   sd     min    max
1850  ABCC                 189   87.23  13.70  26.38  100.00
1930  Literacy (>10 yrs.)  194   69.96  20.75  17.00  99.51

Descriptive statistics of the data can be found in Table 3.1. There are around 190 regions in our dataset for each point in time. Values range between 26 and 100 for the ABCC in 1850 and between 17 and 100 for literacy in 1930.
The regions belong to the following countries (within current borders): Albania, Armenia, Austria, Azerbaijan, Belarus, Bosnia-Herzegovina, Bulgaria, Croatia, Czech Republic, Estonia, France, Georgia, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Macedonia, Moldova, Montenegro, Poland, Portugal, Romania, Russia, Serbia, Slovakia, Slovenia, Spain and Ukraine.¹¹
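The population-weighted merging of NUTS 3 regions into NUTS 2 regions described in this section can be sketched as follows. The sketch only illustrates the weighting step; the function name, the record layout and the region codes are hypothetical, and every region is assumed to have a positive population.

```python
from collections import defaultdict


def aggregate_to_nuts2(records):
    """Population-weighted mean of a human capital proxy per NUTS 2 region.

    records: iterable of (nuts2_code, population, value) tuples, one tuple per
    NUTS 3 region (hypothetical layout; populations must be positive).
    """
    weighted_sum = defaultdict(float)
    population = defaultdict(float)
    for nuts2_code, pop, value in records:
        weighted_sum[nuts2_code] += pop * value
        population[nuts2_code] += pop
    return {code: weighted_sum[code] / population[code] for code in weighted_sum}


# Hypothetical NUTS 3 records: two regions in "XX11", one region in "XX12".
nuts3_records = [
    ("XX11", 100_000, 80.0),
    ("XX11", 300_000, 90.0),
    ("XX12", 50_000, 60.0),
]
nuts2_values = aggregate_to_nuts2(nuts3_records)
print(nuts2_values)
```

The larger NUTS 3 region dominates the merged value, so the result for "XX11" lies closer to 90 than to 80, which is the intended effect of weighting by total population.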

3.4 Exploratory Spatial Data Analysis

Geographic location has an important influence on the growth of the economy (e.g. Azomahou et al., 2009; Fujita et al., 1999) but also on knowledge flows (Paci & Usai, 2009). Maps and other means of visualisation may help to identify specific spatial patterns in the data. However, they are not sufficient because their interpretation is subjective. This is also why it is possible to perceive spatial patterns when the data are clearly random (Messner et al., 1999). Moreover, the significance of potential spatial clusters cannot be evaluated without further tools of analysis. Therefore, the use of appropriate methods is important to identify significant spatial autocorrelation.

¹⁰ More specifically, UNESCO (1953) mentions Denmark, Germany, the Netherlands, Norway, Sweden, Switzerland and the United Kingdom.
¹¹ We have ABCC data neither for some regions that were not Greek at the date of the underlying census nor for the principalities of Romania in 1850.

What is spatial autocorrelation? Spatial autocorrelation is present when similar values coincide with similar locations (Anselin, 2001). There are two types of spatial autocorrelation: positive and negative. Positive spatial autocorrelation means that there is a clustering of high (or low) values of a given random variable in space, whereas negative spatial autocorrelation refers to a clustering of dissimilar values. In the latter case, a region is surrounded by neighbours with significantly higher or lower values. Particular spatial patterns may arise due to spatial heterogeneity, resulting in clusters of regions with high educational levels (the core regions) and clusters of regions with low educational levels (the periphery regions).

In this paper, we investigate the spatial heterogeneity of human capital among the European regions. By comparing several points in time, we are able to gain an insight into the persistence of spatial inequality in human capital during our period. To this end, we identify global and local spatial autocorrelation by using ESDA.¹² ESDA is a “set of techniques aimed at describing and visualizing spatial distributions, at identifying atypical localizations or spatial outliers, at detecting patterns of spatial association, clusters or hot spots, and at suggesting spatial regimes or other forms of spatial heterogeneity” (Le Gallo & Ertur, 2003a, p. 177; see also Anselin, 1998a, 1998b; Bailey & Gatrell, 1995; Haining, 1990).
These techniques allow the computation of, firstly, global and, secondly, local spatial autocorrelation. Moran’s I statistic, a standard measure of global spatial autocorrelation, is used in this paper to detect global spatial autocorrelation. It is defined in the following way (Cliff & Ord, 1981; Le Gallo & Ertur, 2003b):

$$I_t = \frac{n}{S_0} \cdot \frac{\sum_i \sum_j w_{ij} \, (x_{i,t} - \mu_t)(x_{j,t} - \mu_t)}{\sum_i (x_{i,t} - \mu_t)^2}, \tag{3.3}$$

where t is the year of observation (here t = 1850, 1930), n is the number of NUTS regions, x is an observation, μ is the mean value of the observations, and $S_0$ represents a standardisation factor, which is equal to the total sum of all the elements w of the spatial weight matrix.13 The elements of the spatial weight matrix which lie on the diagonal, $w_{ii}$, are equal to zero, and the other elements, $w_{ij}$, denote in each case the spatial connection of a region i to a region j. It is possible to rewrite Moran’s I statistic in matrix form by defining a vector $z_t$ of the human capital observations for a given year, expressed in deviations from $\mu_t$. Denoting W as the spatial weight matrix, this gives us

$$I_t = \frac{n}{S_0} \cdot \frac{z_t' W z_t}{z_t' z_t}. \tag{3.4}$$

12 The following presentation of ESDA is based on Le Gallo and Ertur (2003a, 2003b) and Dall’erba (2005).

13 That is, $S_0 = \sum_i \sum_j w_{ij}$.

Note that the vector $Wz_t$ is also called the spatially lagged vector; it contains the spatially weighted averages of the neighbours’ human capital values. Therefore, Moran’s I indicates the level of linear dependence between $z_t$ and $Wz_t$. The outside influence which possibly affects each region can be normalised by row-standardising the spatial weight matrix, so that each individual row sums to 1. Consequently, the scaling factor $S_0$ is now equal to n, simplifying Moran’s I statistic to

$$I_t = \frac{z_t' W z_t}{z_t' z_t}. \tag{3.5}$$
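As a purely illustrative sketch (not part of the original analysis), the general form of Moran’s I with the scaling factor n/S0 and the simplified form for a row-standardised weight matrix can be compared on a toy example. The four-region layout, the values in `x_toy` and the function names are assumptions made for this example only.

```python
def deviations(x):
    """Return the vector z of deviations from the mean."""
    mu = sum(x) / len(x)
    return [xi - mu for xi in x]


def morans_i(x, w):
    """Global Moran's I in its general form, with scaling factor n / S0."""
    n = len(x)
    z = deviations(x)
    s0 = sum(sum(row) for row in w)  # sum of all spatial weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


def morans_i_row_standardised(x, w):
    """Simplified form z'Wz / z'z, valid when every row of w sums to 1."""
    n = len(x)
    z = deviations(x)
    lag = [sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]  # Wz
    return sum(zi * li for zi, li in zip(z, lag)) / sum(zi * zi for zi in z)


def row_standardise(w):
    """Divide each row by its sum (every region is assumed to have a neighbour)."""
    return [[wij / sum(row) for wij in row] for row in w]


# Hypothetical toy layout: four regions on a line, neighbours are adjacent regions.
w_binary = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
x_toy = [1.0, 2.0, 3.0, 4.0]  # made-up values that rise smoothly along the line

w_std = row_standardise(w_binary)
i_general = morans_i(x_toy, w_std)
i_simple = morans_i_row_standardised(x_toy, w_std)
print(i_general, i_simple)
```

For this layout the two forms agree (both equal 0.4 up to floating-point rounding), because the weights of the row-standardised matrix sum to n.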

Moreover, the expected value E(I) of Moran’s I is equal to

$$E(I) = -\frac{1}{n-1}. \tag{3.6}$$

Thus, when Moran’s I is larger than E(I), there is positive spatial autocorrelation in the data; when it is smaller, there is negative spatial autocorrelation. Clearly, there are different ways of constructing the spatial weight matrix. However, the inherent characteristics of our regional European data make it particularly appropriate to use a spatial weight matrix derived from the k-nearest neighbours, based on the great circle distance $d_{ij}$ between the centroids of regions. This type of matrix has been used in earlier publications referring explicitly to the NUTS 2 classification at the European level (e.g. Dall’erba, 2005; Le Gallo & Ertur, 2003a) and in other domains (e.g. Pace & Barry, 1997; Pinkse & Slade, 1998). In accordance with Le Gallo and Ertur (2003a), the spatial weight matrix takes the following form:

$$
\begin{cases}
w^*_{ij}(k) = 0 & \text{if } i = j \\
w^*_{ij}(k) = 1 & \text{if } d_{ij} \le d_i(k) \quad \text{and} \quad w_{ij}(k) = \dfrac{w^*_{ij}(k)}{\sum_j w^*_{ij}(k)} \\
w^*_{ij}(k) = 0 & \text{if } d_{ij} > d_i(k),
\end{cases}
\tag{3.7}
$$

$d_i(k)$ being the critical cutoff distance for i, i.e. it “is the kth order smallest distance between regions i and j such that each region i has exactly k neighbors” (Dall’erba, 2005). Row standardisation allows us to account for relative rather than absolute distance. The resulting row-standardised matrix is W, with elements $w_{ij}(k)$. As Le Gallo and Ertur (2003a) and Dall’erba (2005) have done, we use the great circle distance with a minimum of k equal to 10 in order to allow the connection of islands such as Sicily, Corsica or the Greek Islands to the mainland. Otherwise we would have zero values for some rows and columns. Because our later empirical results depend on the chosen spatial weight matrix, we check the robustness of our results by increasing k to 15 and 20. Increasing k raises the share of international connections (Dall’erba, 2005).

On the other hand, the global Moran’s I statistic does not allow a closer examination of outliers and regional spatial clustering. For example, it is possible that high values of human capital are concentrated in some particular clusters and low values in others. Moreover, outlying regions that deviate significantly from their surrounding neighbours deserve attention. Therefore, we use the Moran scatter plot (Anselin, 1996) and the local indicators of spatial association (LISA) (Anselin, 1995) to analyse the contributions of individual regions and clusters of regions to the overall pattern of global spatial autocorrelation. First, the Moran scatter plot allows us to study spatial instability at the local level. It is divided into four quadrants by the zero values on each axis. The horizontal axis depicts the human capital values in units of standard deviations (vector $z_t$).
The vertical axis reflects the standardised spatially weighted average of the human capital values ($Wz_t$).14 The quadrants represent the four different types of spatial association between a region and its surrounding neighbours. In the case of this study,

• Quadrant I (high-high (HH); upper right) shows the regions in the dataset which have human capital values above the mean, the average of their neighbours’ human capital also being above the mean.
• Quadrant II (low-high (LH); upper left) shows the regions which have human capital values below the mean, the average of their neighbours’ human capital being above the mean.
• Quadrant III (low-low (LL); lower left) shows the regions which have human capital values below the mean, the average of their neighbours’ human capital being below the mean.
• Quadrant IV (high-low (HL); lower right) shows the regions which have human capital values above the mean, the average of their neighbours’ human capital being below the mean.

These quadrants can be classified into two categories: first, HH and LL indicate positive spatial autocorrelation, i.e. a region is surrounded by regions with similar values. The reverse case is given in the second category of negative spatial autocorrelation, i.e. the quadrants HL and LH. Spatial outliers can be easily identified by the use of these scatter plots.

Nevertheless, Moran scatter plots do not allow us to judge whether the detected local spatial clusters are significant or not. For this reason, we use a LISA. According to Anselin, a LISA has to fulfil two criteria: first, “the LISA for each observation gives an indication of the extent of significant spatial clustering of similar values around that observation”, and second, “the sum of LISAs for all observations is proportional to a global indicator of spatial association” (Anselin, 1995, p. 94). Thus, we employ a local type of Moran’s I statistic (Anselin, 1995):

$$I_{i,t} = \frac{(x_{i,t} - \mu_t)}{m_0} \sum_j w_{ij}\,(x_{j,t} - \mu_t) \quad \text{with} \quad m_0 = \frac{\sum_i (x_{i,t} - \mu_t)^2}{n}. \tag{3.8}$$

The interpretation of this LISA is similar to that of the global Moran’s I statistic: there is positive local spatial autocorrelation when $I_{i,t}$ is positive and negative spatial autocorrelation when it is negative. Putting the information obtained by the Moran scatter plot and the LISA together gives us the Moran significance map (Anselin & Bao, 1997). This map highlights, by means of different colours, the regions that are characterised by significant (positive or negative) spatial autocorrelation.

14 That is, the average in human capital of the regions surrounding a region.

Table 3.2 Moran’s I statistic for regional human capital proxies, 1850 and 1930

| Year | Proxy | Moran’s I (k = 10) | sd (k = 10) | Moran’s I (k = 15) | sd (k = 15) | Moran’s I (k = 20) | sd (k = 20) |
|---|---|---|---|---|---|---|---|
| 1850 | ABCC | 0.6031 | 0.0291 | 0.5471 | 0.0234 | 0.4930 | 0.0202 |
| 1930 | Literacy (>10 yrs.) | 0.7140 | 0.0295 | 0.6634 | 0.0238 | 0.6214 | 0.0205 |

Note: The expected value for Moran’s I statistics in 1850 and in 1930 is −0.005. All statistics are significant at p = 0.0001. The number of random permutations is 10,000
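The k-nearest-neighbour weight matrix described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used for the results: the haversine formula on a spherical Earth with a radius of 6371 km, the approximate city coordinates standing in for region centroids, the choice k = 2 and the function names `great_circle_km` and `knn_weights` are all introduced here for the example.

```python
import math


def great_circle_km(p, q, radius_km=6371.0):
    """Haversine distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def knn_weights(centroids, k):
    """Row-standardised weights: each region's k nearest neighbours get weight 1/k."""
    n = len(centroids)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        by_distance = sorted(
            (great_circle_km(centroids[i], centroids[j]), j)
            for j in range(n) if j != i
        )
        for _, j in by_distance[:k]:  # regions within the cutoff distance d_i(k)
            w[i][j] = 1.0 / k
    return w


# Approximate coordinates used as stand-ins for region centroids (illustrative only).
toy_centroids = {
    "Paris": (48.86, 2.35),
    "Brussels": (50.85, 4.35),
    "Luxembourg": (49.61, 6.13),
    "Ajaccio": (41.93, 8.74),
    "Palermo": (38.12, 13.36),
}
names = list(toy_centroids)
w_knn = knn_weights(list(toy_centroids.values()), k=2)
for name, row in zip(names, w_knn):
    print(name, [names[j] for j, wij in enumerate(row) if wij > 0])
```

Because every row has exactly k non-zero entries, island regions are never left without neighbours, which is the motivation given above for this type of matrix. Note that a k-nearest-neighbour matrix need not be symmetric.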

3.5 Results

We highlight first the results for global spatial autocorrelation before taking a closer look at local spatial autocorrelation (Moran scatter plot, Moran significance map).

3.5.1 Global Spatial Autocorrelation

Table 3.2 shows Moran’s I statistic for the proxies of human capital at the different points in time.15 We employ a permutation approach using 10,000 permutations as proposed by the literature (Anselin, 1995). The table shows positive spatial autocorrelation in our data: Moran’s I is always well above its expected value and significant at the level p = 0.0001. Moran’s I is generally lower for the ABCC in 1850 than for literacy in 1930. More specifically, for k = 10, Moran’s I of the ABCC is 0.6031 in 1850, whereas Moran’s I of literacy is 0.7140 in 1930. The statistics therefore imply that regional human capital is clustered in space in both 1850 and 1930. This means that we do not find a random distribution of human capital but that higher levels of human capital cluster with higher ones and vice versa. Thus, we find a significant clustering of European regions by using this global indicator. When we use more nearest neighbours (15, 20), the sign and significance of the global spatial autocorrelation do not change. The expected value in both cases (ABCC in 1850 and literacy in 1930) is −0.005. As the number of neighbours k increases, Moran’s I decreases. Globally, these results confirm the appropriateness of our spatial weight matrix.
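The permutation approach mentioned above can be sketched in a few lines. The example below is a hedged illustration only: the ring of 20 hypothetical regions, the made-up values, the reduced number of 999 permutations and the one-sided pseudo p-value (count + 1) / (permutations + 1) are assumptions of this sketch, and the exact routine used by GeoDa is not reproduced here.

```python
import random


def morans_i(x, w):
    """Global Moran's I for a row-standardised weight matrix w."""
    n = len(x)
    mu = sum(x) / n
    z = [xi - mu for xi in x]
    cross = sum(z[i] * sum(w[i][j] * z[j] for j in range(n)) for i in range(n))
    return cross / sum(zi * zi for zi in z)


def permutation_test(x, w, n_perm=999, seed=0):
    """Return the observed Moran's I and a one-sided pseudo p-value."""
    rng = random.Random(seed)
    observed = morans_i(x, w)
    shuffled = list(x)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign the observed values to regions at random
        if morans_i(shuffled, w) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Hypothetical ring of 20 regions; each region's neighbours are the two adjacent ones.
n_regions = 20
w_ring = [[0.0] * n_regions for _ in range(n_regions)]
for i in range(n_regions):
    w_ring[i][(i - 1) % n_regions] = 0.5
    w_ring[i][(i + 1) % n_regions] = 0.5

x_clustered = [10.0] * 10 + [1.0] * 10  # made-up values: one high block, one low block
i_obs, p_pseudo = permutation_test(x_clustered, w_ring)
print(i_obs, p_pseudo)
```

Under this convention the smallest attainable pseudo p-value is 1 / (permutations + 1). With 10,000 permutations this is roughly 0.0001, which would be consistent with the significance level reported in Table 3.2.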

3.5.2 Moran Scatter Plots

In this section, we investigate whether there are regions that do not fit into the overall spatial pattern indicated by Moran’s I statistics. In other words, we want to detect spatial outliers or atypical locations. For this purpose, we first use Moran scatter plots and then, in the next section, Moran significance maps. The results for the Moran scatter plots can be seen in Figs. 3.2 and 3.3. Note that we have used k = 10 in this case. Most of the observations are located in quadrants I (HH) and III (LL), i.e. these regions are positively spatially autocorrelated. As can be seen in Table 3.3, about 85% (HH = 55.0%, LL = 30.2%) of all regions are associated with similar values in 1850 and 88% (HH = 39.7%, LL = 47.9%) in 1930. The share of positive spatial autocorrelation is therefore quite similar, but the distribution between HH and LL clusters is different.

Moreover, a number of regions show an association with dissimilar values, i.e. LH or HL. More regions have higher basic human capital values than their neighbours than the other way round. In fact, in 1850, 9.5% of all regions are in the HL quadrant and only 5.3% in the LH quadrant. HL regions are predominantly northern Russian regions (including Moscow), Estonia, Latvia and a range of regions located at similar latitudes, from Serbia (Vojvodina) through Romania (Sud-Muntenia) and some Ukrainian regions to Russia (Rostov). A third group is constituted by Puglia and Calabria in southern Italy. On the other hand, three clusters can be discerned in the LH quadrant. One is located in the western and southern peripheral regions of Iberia (Portugal’s Centro, Spain’s Galicia, Andalucía, Murcia and Comunidad Valenciana). A second cluster comprises eastern Polish and western Ukrainian regions. Finally, another LH region is the Border, Midland and Western region in west/northwest Ireland. The area was the hardest hit by the Irish famine of the 1840s. This might explain its underperformance in human capital.

In 1930, we have a similar pattern with 7.2% in the HL quadrant and, almost identically to 1850, 5.2% in the LH quadrant. The HL regions are all located in eastern and southeast Europe, except for Asturias in Spain. In particular, these are Moscow and some regions to its northeast, the St. Petersburg region, Estonia, Latvia, Crimea in Ukraine, Polish Podkarpackie, Romanian Vest and Bulgaria’s Severen tsentralen and Yugozapaden (including the capital, Sofia). The case of Bulgaria might confirm our introductory comments on this country, stating that it was particularly successful in elevating the overall human capital of its population during the decades following its independence. The picture is much more diverse for the LH regions, which come from different corners of Europe. More specifically, LH regions are located in Spain (Aragon, Castilla-La Mancha, Comunidad Valenciana), Italy (Sardegna), Croatia (Jadranska Hrvatska), Poland (Swietokrzyskie) and neighbouring Ukraine (Zakarpattia), Romania (Nord-Vest) and Russia (Kareliya).

Some regions are negatively spatially autocorrelated and are located far from the mean value of basic human capital. In 1850, these regions are from the Balkan and Caucasus regions and have very low values. The same observation can be made for the 1930s, when southern Balkan countries (Albania and Macedonia) and eastern Caucasus regions have still not caught up with other regions. Because the Moran scatter plot does not give any indication of the significance of the obtained results, we use the Moran significance maps in the next step.

15 All calculations in the following sections were performed by using GeoDa.

Fig. 3.2 Moran scatter plot for ABCC in Europe, ca. 1850 (k = 10)

Fig. 3.3 Moran scatter plot for literacy in Europe, ca. 1930 (k = 10)

Table 3.3 Percentage of observations in each quadrant of Moran’s scatter plot (k = 10)

| Year | Proxy | HH | LL | HL | LH |
|---|---|---|---|---|---|
| 1850 | ABCC | 55.0% | 30.2% | 9.5% | 5.3% |
| 1930 | Literacy (>10 yrs.) | 39.7% | 47.9% | 7.2% | 5.2% |

Table 3.4 Percentage of observations in Moran’s significance map (k = 10)

| Year | Proxy | Not sig. | HH | LL | HL | LH |
|---|---|---|---|---|---|---|
| 1850 | ABCC | 45.5% | 37.0% | 15.3% | 1.1% | 1.1% |
| 1930 | Literacy (>10 yrs.) | 32.9% | 34.0% | 30.4% | 1.5% | 0.1% |
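The quadrant shares discussed in this section follow from a simple classification by the signs of the deviation z and its spatial lag Wz. The sketch below illustrates this on made-up data; the eight regions on a line, their values, the convention of treating a value exactly at the mean as "high" and the function names are assumptions of the example. Dividing z by the standard deviation, as on the scatter plot axes, would not change the signs and is therefore omitted.

```python
def line_weights(n):
    """Row-standardised contiguity weights for n regions arranged on a line."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in neighbours:
            w[i][j] = 1.0 / len(neighbours)
    return w


def moran_quadrants(x, w):
    """Label each region HH, LH, LL or HL from the signs of z and its spatial lag."""
    n = len(x)
    mu = sum(x) / n
    z = [xi - mu for xi in x]
    lag = [sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]
    # First letter: own value relative to the mean; second letter: neighbours' average.
    return [("H" if zi >= 0 else "L") + ("H" if li >= 0 else "L")
            for zi, li in zip(z, lag)]


def quadrant_shares(labels):
    """Percentage of regions in each quadrant."""
    return {q: 100.0 * labels.count(q) / len(labels) for q in ("HH", "LL", "HL", "LH")}


# Hypothetical values: a high block, a low block and two outliers at the end.
x_toy = [9.0, 9.0, 9.0, 2.0, 1.0, 1.0, 9.0, 1.0]
labels = moran_quadrants(x_toy, line_weights(len(x_toy)))
shares = quadrant_shares(labels)
print(labels)
print(shares)
```

In this toy layout the last two regions are the spatial outliers: a high value surrounded by low neighbours (HL) and a low value next to a high neighbour (LH).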

3.5.3 Moran Significance Maps

LISA is helpful to extend our understanding of the results previously obtained. Therefore, we use Moran significance maps. The statistical results of these maps are summarised in Table 3.4. First, most of the regions which are significant (also called hot spots) are found in either the HH or LL quadrants. This shows a tendency of positive spatial autocorrelation in space. In each year, the largest share of significant regions is in the HH cluster. Second, few observations with significant values fall into the other two (HL and LH) quadrants. In fact, outliers in the HL and LH clusters are almost non-existent (0.1–1.5%).

After these first impressions, we can have a direct look at the Moran significance maps. Note that in the maps (Figs. 3.4 and 3.5), the regions with significant values are coloured according to the underlying spatial regime.16 In both 1850 and 1930, regions in central Europe clearly tend to be in the HH quadrant. The regions concerned are primarily from France, northern Italy, Austria, (at that time) western Poland, the Czech Republic, Slovenia and Hungary. Cataluña in Spain is also associated with this cluster in both maps. On the other hand, a first LL cluster is located in the southern Balkans (primarily Serbia, Macedonia, Albania and Greece). In 1850, two other ones are visible in Belarus/western Russia and the greater Caucasus regions. The same can be said for 1930, except that many more Russian (and Ukrainian) regions are now also significantly in this LL cluster. In this case, the southern regions of Portugal, Spain and Italy are also concerned. Whereas other clusters appear to be in line with the idea of geographical proximity to north-western European countries as an explanatory factor, the Belarusian case shows that there have to be additional factors. In fact, one would expect only the easternmost regions in Russia to be characterised by LL clusters because they are the most distant regions. However, this is not true for today’s Belarus. An alternative explanatory factor is land inequality, which has been shown to significantly affect numeracy formation in the Russian Empire (Hippe & Baten, 2012b).

In addition, some regions do not fit the general pattern of positive spatial autocorrelation. In 1850, for example, this is the case for today’s Ukrainian region Zakarpattia Oblast, a historically multiethnic region. This region had lower ABCC values than its surrounding regions. Cultural and language barriers might have contributed to this relatively low result. In general, the atypical regions are mostly at the outskirts of the neighbouring cluster. In 1930, among others, this is true for Crimea in Ukraine and the greater region around Bulgaria’s capital Sofia (today Yugozapaden). In the latter case, Bulgaria’s success in the fight against illiteracy might explain this positive spatial outlier. Still, negative spatial autocorrelation is rare in the dataset.

16 Dark red, HH; light red, HL; dark blue, LL; light blue, LH

Fig. 3.4 Moran significance map for ABCC in Europe, ca. 1850 (5% pseudo-significance level, k = 10)

Fig. 3.5 Moran significance map for literacy in Europe, ca. 1930 (5% pseudo-significance level, k = 10)
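How a significance map can be derived from the local Moran statistic is sketched below. This is an illustration under explicit assumptions rather than the procedure behind Figs. 3.4 and 3.5: the conditional permutation scheme (the region's own value is held fixed while the remaining values are shuffled), the folded pseudo p-value, the 499 permutations, the ring of 32 hypothetical regions and the function name `local_moran_map` are choices made for this example.

```python
import random


def local_moran_map(x, w, n_perm=499, alpha=0.05, seed=0):
    """Local Moran's I, pseudo p-value and map label for each region.

    Assumes a row-standardised weight matrix w with a zero diagonal.
    """
    rng = random.Random(seed)
    n = len(x)
    mu = sum(x) / n
    z = [xi - mu for xi in x]
    m0 = sum(zi * zi for zi in z) / n
    results = []
    for i in range(n):
        neighbours = [(j, w[i][j]) for j in range(n) if w[i][j] > 0]
        lag = sum(wij * z[j] for j, wij in neighbours)
        observed = z[i] / m0 * lag
        # Conditional permutation: region i keeps its value, all others are shuffled.
        others = [z[j] for j in range(n) if j != i]
        slots = [(j if j < i else j - 1, wij) for j, wij in neighbours]
        larger = 0
        for _ in range(n_perm):
            rng.shuffle(others)
            simulated = z[i] / m0 * sum(wij * others[pos] for pos, wij in slots)
            if simulated >= observed:
                larger += 1
        p_value = (min(larger, n_perm - larger) + 1) / (n_perm + 1)
        quadrant = ("H" if z[i] >= 0 else "L") + ("H" if lag >= 0 else "L")
        label = quadrant if p_value < alpha else "Not sig."
        results.append((observed, p_value, label))
    return results


# Hypothetical ring of 32 regions; neighbours are the two nearest regions on each side.
n_regions, reach = 32, 2
w_ring = [[0.0] * n_regions for _ in range(n_regions)]
for i in range(n_regions):
    for step in range(1, reach + 1):
        w_ring[i][(i + step) % n_regions] = 1.0 / (2 * reach)
        w_ring[i][(i - step) % n_regions] = 1.0 / (2 * reach)

x_toy = [5.0] * 8 + [1.0] * 24  # made-up values: a small high cluster in a low field
lisa = local_moran_map(x_toy, w_ring)
print(lisa[3], lisa[20])
```

In this toy example a region in the middle of the small high cluster is labelled HH, whereas a region deep inside the large low area is not significant: low neighbours are common there even under random reshuffling. The mean of the local statistics equals the global Moran’s I for a row-standardised matrix, in line with the second LISA criterion.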

3.5.4 Robustness Checks

We checked the robustness of our results by increasing the number of k-nearest neighbours to 15 and 20. If the choice of k does not have a significant impact on the results, most observations would have to lie on the diagonal. That is, they would not change from one category to another. The results in our case are summarised in Tables 3.5 and 3.6. Evidently, the use of alternatives for k does not significantly change our results. In fact, most of the observations lie on the diagonal. Due to the larger inclusion of neighbouring regions, the first line for k = 15 and k = 20 indicates that some regions that formerly were not significant now become significant. As can be seen in the case of k = 15, these regions become positively autocorrelated in the majority of cases (HH = 4.2%, LL = 5.5%), whereas moves to negative autocorrelation occur mostly in the LH category (LH = 2.1%, HL = 0.5%). The conclusions are similar when increasing k to 20. Therefore, these results confirm the appropriateness of the choice of the spatial weight matrix. For this reason, we can infer from these checks that our results are robust to different specifications.

Table 3.5 Robustness analysis for 1850 and 1930 (k = 10 to k = 15)

| k = 10 / k = 15 | Not sig. | HH | LL | HL | LH |
|---|---|---|---|---|---|
| Not sig. | 87.7% | 4.2% | 5.5% | 0.5% | 2.1% |
| HH | 0.3% | 99.7% | 0.0% | 0.0% | 0.0% |
| LL | 2.1% | 0.0% | 97.9% | 0.0% | 0.0% |
| HL | 0.0% | 0.0% | 0.0% | 100.0% | 0.0% |
| LH | 0.0% | 0.0% | 0.0% | 0.0% | 100.0% |

Table 3.6 Robustness analysis for 1850 and 1930 (k = 10 to k = 20)

| k = 10 / k = 20 | Not sig. | HH | LL | HL | LH |
|---|---|---|---|---|---|
| Not sig. | 84.6% | 4.7% | 7.0% | 0.8% | 2.9% |
| HH | 0.5% | 99.5% | 0.0% | 0.0% | 0.0% |
| LL | 2.6% | 0.0% | 97.4% | 0.0% | 0.0% |
| HL | 0.0% | 0.0% | 0.0% | 100.0% | 0.0% |
| LH | 0.0% | 0.0% | 0.0% | 0.0% | 100.0% |
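A cross-tabulation of this kind can be sketched as follows: each region's category under the baseline k is compared with its category under the alternative k, and each row is expressed in percent of the baseline category. The ten hypothetical labels and the function name `transition_table` are assumptions of this example and do not correspond to the actual data.

```python
from collections import Counter

CATEGORIES = ["Not sig.", "HH", "LL", "HL", "LH"]


def transition_table(base_labels, alt_labels):
    """Row percentages: where the regions of each baseline category end up."""
    counts = Counter(zip(base_labels, alt_labels))
    table = {}
    for row in CATEGORIES:
        row_total = sum(counts[(row, col)] for col in CATEGORIES)
        table[row] = {
            col: 100.0 * counts[(row, col)] / row_total if row_total else 0.0
            for col in CATEGORIES
        }
    return table


# Made-up significance-map labels for ten regions under two choices of k.
labels_k10 = ["Not sig."] * 4 + ["HH"] * 3 + ["LL"] * 2 + ["HL"]
labels_k15 = ["Not sig."] * 3 + ["HH"] + ["HH"] * 3 + ["LL"] * 2 + ["HL"]

robustness = transition_table(labels_k10, labels_k15)
for row in CATEGORIES:
    print(row, [round(robustness[row][col], 1) for col in CATEGORIES])
```

Regions that keep their category appear on the diagonal, so a table dominated by its diagonal indicates that the classification is not sensitive to the choice of k.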

3.6 Conclusion

This paper has analysed the spatial clustering of the regional distribution of human capital in Europe in 1850 and 1930. We have used the databases by Hippe and Baten (2012a) and Kirk (1946) to construct evidence on the basis of two cross-sections. To this end, we have employed two proxies for human capital. First, we have used the ABCC method to proxy basic numeracy levels around 1850. Then, we have added literacy levels for the 1930s. Regions were defined according to the NUTS classification of the European Union. This fixed classification allowed a maximum of comparability of the evolution of regional disparities in basic human capital over time. By using ESDA methods (Moran’s I, Moran scatter plot, Moran significance map), we investigated the spatial heterogeneity among European regions in human capital. We had to exclude northern and central European regions due to data constraints, so that the focus is on western, southern and eastern European regions. Therefore, our findings apply to this specific regional setup.

The results show that Europe was characterised by an important core-periphery pattern, as evidenced by large HH and LL clusters. The HH cluster is located in central Europe, whereas several LL regimes exist in eastern Europe and in 1930 also in south-western Europe. Nevertheless, some regions were able to perform better than their neighbours or had significantly lower basic human capital. These outliers showing negative spatial autocorrelation are rare in the sample.

These results imply that there were clear spatial clusters in Europe in the past. These spatial clusters persisted over longer time periods. In consequence, the influence of the surrounding regions has been an important determinant of a region’s own human capital. In particular, geographic proximity to the most advanced countries in north-western Europe appears to be an important explanatory factor for the regional distribution of human capital in Europe. Nevertheless, the existence of clusters and outliers which do not fit this overall pattern shows that there are other determinants of human capital. The effect of land inequality, for example, explains parts of the regional numeracy distribution which cannot be explained by geographical proximity (Hippe & Baten, 2012b). These results help to better understand and explain the regional distribution of human capital and indicate the road to take in future research and policy on this topic. Space is important for regional human capital formation, but the right educational, social and economic policies may allow a region or a country to escape from its spatially given destiny.

Appendix

ABCC data: Hippe and Baten (2012a)
Literacy data: Kirk (1946)

References

A’Hearn, B., Crayen, D., & Baten, J. (2009). Quantifying quantitative literacy: Age heaping and the history of human capital. Journal of Economic History, 69(3), 783–808.
Anselin, L. (1995). Local indicators of spatial association-LISA. Geographical Analysis, 27, 93–115.
Anselin, L. (1996). The Moran scatterplot as an ESDA tool to assess local instability in spatial association. In M. Fischer, H. J. Scholten, & D. Unwin (Eds.), Spatial analytical perspectives on GIS (pp. 111–126). Taylor & Francis.
Anselin, L. (1998a). Interactive techniques and exploratory spatial data analysis. In P. A. Longley, M. F. Goodchild, D. J. Maguire, & D. W. Rhind (Eds.), Geographical information systems: Principles, techniques, management and applications (pp. 251–264). Wiley.
Anselin, L. (1998b). Exploratory spatial data analysis in a geocomputational environment. In P. A. Longley, S. M. Brooks, R. McDonnell, & B. Macmillan (Eds.), Geocomputation, a primer (pp. 77–94). Wiley.
Anselin, L. (2001). Spatial econometrics. In B. Baltagi (Ed.), Companion to econometrics (pp. 310–330). Basil Blackwell.
Anselin, L., & Bao, S. (1997). Exploratory spatial data analysis linking SpaceStat and ArcView. In M. Fischer & A. Getis (Eds.), Recent developments in spatial analysis (pp. 35–59). Springer.
Anselin, L., Varga, A., & Acs, Z. J. (2000). Geographical spillovers and university research: A spatial econometric perspective. Growth and Change, 31, 501–515.
Azomahou, T., Diebolt, C., & Mishra, T. (2009). Spatial persistence of demographic shocks and economic growth. Journal of Macroeconomics, 31, 98–127.
Bailey, T., & Gatrell, A. C. (1995). Interactive spatial data analysis. Longman.
Baumont, C., Ertur, C., & Le Gallo, J. (2003). Spatial convergence clubs and the European regional growth process, 1980–1995. In B. Fingleton (Ed.), European regional growth (pp. 131–158). Springer.
Becker, S. O., & Woessmann, L. (2009). Was Weber wrong? A human capital theory of Protestant economic history. Quarterly Journal of Economics, 124(2), 531–596.
Benavot, A., & Riddle, P. (1988). The expansion of primary education, 1870–1940: Trends and issues. Sociology of Education, 61(3), 191–210.
Canals, V., Diebolt, C., & Jaoul, M. (2003). Convergence et disparités régionales du poids de l’enseignement supérieur en France: 1964–2000. Revue d’Economie Régionale et Urbaine, 4, 649–669.
Cinnirella, F., & Hornung, E. (2011). Landownership concentration and the expansion of education. CESifo working papers No. 3603.
Cipolla, C. M. (1969). Literacy and development in the West. Penguin Books.
Cliff, A., & Ord, J. (1981). Spatial processes: Models and applications. Pion.
Crayen, D., & Baten, J. (2010a). Global trends in numeracy 1820–1949 and its implications for long-run growth. Explorations in Economic History, 47, 82–99.
Crayen, D., & Baten, J. (2010b). New evidence and new methods to measure human capital inequality before and during the industrial revolution: France and the US in the seventeenth to nineteenth centuries. Economic History Review, 63(2), 452–478.
Dall’erba, S. (2005). Distribution of regional income and regional funds in Europe 1989–1999: An exploratory spatial data analysis. Annals of Regional Science, 39, 121–148.
Del Barrio-Castro, T., & García-Quevedo, J. (2005). Effects of university research on the geography of innovation. Regional Studies, 39(9), 1217–1229.
Diebolt, C., Jaoul, M., & San Martino, G. (2005). Le mythe de Ferry : une analyse cliométrique. Revue d’économie politique, 115(4), 471–497.
Diebolt, C., & Pellier, K. (2009). La convergence des activités innovantes en Europe : les enseignements de l’économétrie spatiale appliquée à l’histoire du temps présent. Economies et Sociétés, 43(5), 805–831.
Felice, E. (2012). Regional convergence in Italy, 1891–2011: Testing human and social capital. Cliometrica, 6(3), 267–306.
Fujita, M., Krugman, P. R., & Venables, A. (1999). The spatial economy: Cities, regions, and international trade. MIT Press.
Galor, O. (2012). The demographic transition: Causes and consequences. Cliometrica, 6, 1–28.
Galor, O., & Moav, O. (2002). Natural selection and the origin of economic growth. Quarterly Journal of Economics, 117, 1133–1192.
Galor, O., Moav, O., & Vollrath, D. (2009). Inequality in landownership, the emergence of human-capital promoting institutions, and the great divergence. Review of Economic Studies, 76, 143–179.
Galor, O., & Weil, D. N. (2000). Population, technology and growth: From the Malthusian regime to the demographic transition. American Economic Review, 90(4), 806–828.
Green, A. (1990). Education and state formation: The rise of education systems in England, France and the USA. Palgrave.
Haining, R. (1990). Spatial data analysis in the social and environmental sciences. Cambridge University Press.
Hippe, R. (2012). How to measure human capital? The relationship between numeracy and literacy. Economies et Sociétés, 45(8), 1527–1554.
Hippe, R., & Baten, J. (2012a). Regional inequality in human capital formation in Europe, 1790–1880. Scandinavian Economic History Review, 60(3), 254–289.
Hippe, R., & Baten, J. (2012b). ‘Keep them ignorant.’ Did inequality in land distribution delay regional numeracy formation? University of Tuebingen Working Papers.
Jaffe, A. B., Trajtenberg, M., & Henderson, R. (1993). Geographic localization of knowledge spillovers as evidenced by patent citations. Quarterly Journal of Economics, 108, 577–598.
Kirk, D. (1946). Europe’s population in the interwar years. Princeton University Press.
Le Gallo, J., & Ertur, C. (2003a). Exploratory spatial data analysis of the distribution of regional per capita GDP in Europe, 1980–1995. Papers in Regional Science, 82, 175–201.
Le Gallo, J., & Ertur, C. (2003b). An exploratory spatial data analysis of European regional disparities, 1980–1995. In B. Fingleton (Ed.), European regional growth (pp. 55–98). Springer.
López-Bazo, E., Vayá, E., Mora, A., & Suriñach, J. (1999). Regional economic dynamics and convergence in the European union. Annals of Regional Science, 33, 343–370.
Messner, S. F., Anselin, L., Baller, R. D., Hawkins, D. F., Deane, G., & Tolnay, S. E. (1999). The spatial patterning of county homicide rates: An application of exploratory spatial data analysis. Journal of Quantitative Criminology, 15, 423–450.
Mishkova, D. (1994). Literacy and nation-building in Bulgaria, 1878–1912. East European Quarterly, 29(1), 63–93.
Núñez, C. E. (1992). La fuente de la riqueza: educación y desarrollo económico en la España contemporánea. Alianza.
Pace, R. K., & Barry, R. (1997). Quick computation of spatial autoregressive estimators. Geographical Analysis, 29, 232–246.
Paci, R., & Usai, S. (2009). Knowledge flows across European regions. Annals of Regional Science, 43, 669–690.
Pinkse, J., & Slade, E. (1998). Contracting in space: An application of spatial statistics to discrete-choice models. Journal of Econometrics, 85, 125–154.
Report of the International Commission. (1914). Report of the International Commission to inquire into the causes and conduct of the Balkan Wars, Publications No. 4. Carnegie Endowment for International Peace.
Stolz, Y., Baten, J., & Botelho, T. (2013). Growth effects of nineteenth-century mass migrations: “Fome zero” for Brazil? European Review of Economic History, 17(1), 95–121.
UNESCO. (1953). Progress of literacy in various countries. Firmin Didot et Cie.
Vincent, D. (2000). The rise of mass literacy. Polity Press.

Chapter 4: Human Capital and Market Access in the European Regions

4.1 Introduction

Human capital is generally perceived to be a key factor for today’s knowledge-driven economies. This is particularly true for Europe and the European Union. For this reason, the Council of the European Union highlights that “[e]ducation and training have made a substantial contribution towards achieving the long-term goals of the Lisbon strategy for growth and jobs” (Council of the European Union 2009, C 119/2). Still, the EU is facing important challenges in its regional policy. Although the EU has aimed to decrease economic and social inequalities over the last decades, important differences remain between and within countries. The current economic crisis has further undermined previous convergence tendencies. Similarly, education is not equally distributed in space. Thus, how can one explain these differences? One possible explanation advanced by theory, and in particular by models from New Economic Geography (NEG), is that consumer markets play an important role in the distribution of economic development. These rather recently developed NEG models have already been tested empirically for the last decades (e.g. Breinlich, 2006; Faíña & López-Rodríguez, 2006), and the tests have confirmed their predictions. A particular case including human capital formation is presented by Redding and Schott (2003). The authors develop a theoretical NEG model showing that remoteness from large consumer markets gives individuals disincentives to increase their human capital. For this reason, this “penalty of remoteness” explains worldwide inequalities in human capital accumulation. Subsequent empirical studies have also confirmed the predictions of the model for the European regions for recent years (e.g. López-Rodríguez et al., 2007).

This chapter was first published in slightly modified form as Diebolt, C., and Hippe, R. Remoteness equals backwardness? Human capital and market access in the European regions: insights from the long run (with Claude Diebolt), Education Economics, 2018, 26 (3): 285–304. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 C. Diebolt, R. Hippe, Human Capital and Regional Development in Europe, Frontiers in Economic History, https://doi.org/10.1007/978-3-030-90858-4_4


Nevertheless, to our knowledge, there has not yet been any empirical evidence for the long-term evolution of market access and human capital at the EU or European regional level. This empirical evidence, however, appears particularly important to understand the changes that have shaped today’s European regions in the long run. It may considerably extend the recent analyses for the short term, which may capture only part of a much larger long-term process. For these reasons, this paper explores for the first time the importance of market access for the spatial distribution of human capital in the European regions in the past. We combine and adapt several databases to create a new unique dataset. More specifically, we use two different human capital indicators at different points in time to test the robustness of our analysis. First, we employ regional numeracy estimates for 1850. The age heaping method enables us to estimate numeracy (e.g. A’Hearn et al., 2009; Hippe, 2012a; Hippe & Baten, 2012). Second, we use literacy as an alternative human capital proxy. Literacy is certainly the most widely used indicator for human capital in the past. Therefore, we can check the overall numeracy results by using regional literacy outcomes in 1930. Both indicators also allow us to model the proposed theory better than alternative historical education variables. In addition, as has been proposed by the recent literature, we exploit data on the distribution and size of cities in Europe to model historical market access. The results show that market access has a significant negative influence in OLS, Tobit and IV regressions. In the latter case, we use distance to Luxembourg and area size of European countries as instrumental variables. In sum, the “penalty of remoteness” hypothesis theoretically advanced by Redding and Schott (2003) is confirmed by our historical data. 
This result implies that the “penalty of remoteness” is not a current trend but has existed for long time spans, the present being only a very special case of a larger phenomenon. The paper is structured as follows. First, we consider the literature on human capital formation in the European regions in the past and the main contributions of NEG. Then, we briefly present the underlying theoretical NEG framework, which has been originally proposed by Redding and Schott (2003). Subsequently, the data and the econometric specifications are discussed. In the fourth section, we show the results and their policy implications. The final section concludes.

4.2 Related Literature

4.2.1 Regional Human Capital Formation in Europe: Today and in the Past

Human capital formation in the European regions has attracted the attention of many researchers (e.g. Badinger & Tondl, 2003; Breinlich, 2006; Faíña & López-Rodríguez, 2006; Sterlacchini, 2008). For example, Rodríguez-Pose and Tselios (2011) use ESDA to test the spatial distribution of educational attainment in western
Europe between 1995 and 2000. They find that educational attainment is strongly correlated with inequality and that regions tend to cluster in space. Proximity plays an important role for educational attainment even today. Moreover, there are noticeable differences between the north and the south of western Europe and between urban and rural communities. However, as the authors state, “[t]he geography of education, especially at [the] subnational level, is a huge black box” (Rodríguez-Pose & Tselios, 2011, p. 358). If this is still true today, one can only imagine what the situation was in the past. New evidence on the regional distribution of human capital in Europe in the past has recently been provided by Hippe and Baten (2012). They use the age heaping method to calculate numeracy estimates, i.e. whether individuals are able to count or calculate (e.g. A’Hearn et al., 2009; Crayen & Baten, 2010). They show that regional numeracy values steadily improved almost everywhere in Europe during the nineteenth century. Leaders in numeracy were countries in Scandinavia, central Europe and the United Kingdom. Many of the regions in these parts of Europe already had very high numeracy values at the beginning of the nineteenth century. In contrast, Southern and Eastern Europe lagged considerably behind. They needed many more decades to attain similar levels, in some cases not until the first or even the second half of the twentieth century. In addition, regional differences in numeracy were quite striking in most of these countries. For example, a core-periphery pattern characterised Spain. The regions north of Madrid had the highest numeracy levels, while those at the southern periphery (Andalusia) and in the north-western periphery (Galicia) followed with a large gap. In contrast, a north-south gap is visible in Italy. The lowest numeracy values were calculated for regions in the Balkans and the Caucasus.
Still, most countries and regions were able to improve the numeracy levels of their population during the nineteenth century. In consequence, regional inequalities in numeracy diminished over that period. Numeracy is one way to measure human capital in the past, but there are also other proxies. However, one has to note that no other measure is available on such a scale for that time period. Nevertheless, one can focus on individual countries to check the validity of the numeracy data. In fact, the most important tendencies in numeracy can also be detected when employing other indicators. For instance, Cinnirella and Hornung (2016) use data on Prussian counties in the nineteenth century. Their data show that the counties of Poznan province had the lowest enrolment rates (6–14-year-olds), confirming the lower numeracy levels in the study by Hippe and Baten (2012). Another country example is provided by Felice’s (2012) recent study on the regions of Italy during the nineteenth and the twentieth century. In contrast to the abovementioned studies, he uses a specifically constructed human capital measure which takes into account both enrolment rates and literacy. He shows that regional differences in human capital peaked around 1871 but diminished during the next decades. Northern regions had a distinct lead over other regions, followed by central and ultimately southern and island regions. Furthermore, Núñez’ (1992) Spanish literacy data underline a core-periphery pattern similar to the one highlighted by numeracy. Finally, the correspondence of


regional numeracy and literacy data has been emphasised by Hippe (2012b), using data for a number of European countries around the beginning of the nineteenth century and adding more recent data for current developing countries in Latin America, Asia and Africa during the second half of the twentieth century.

4.2.2 Economic Geography and Market Access in Europe

Economic geography has become an important field in economics over the last years. Economic geography models make it possible to understand why economic activity and individuals cluster in space (e.g. Krugman, 1991). In other words, they help to clarify the reasons for the existence of urban agglomerations, e.g. Tokyo and Mexico City, and areas with concentrated activity, such as the Manufacturing Belt in the United States and the Blue Banana in Europe. Accordingly, the regional distribution of GDP per capita in Europe is quite unequal. The spectacular growth of urban agglomerations, particularly in developing countries, further shows that economic geography is an important factor for the distribution of the population in the past, today and probably in the future. Given these facts, it is not astonishing that policy-makers are faced with the question of how to deal with these inequalities. Economic geography and in particular NEG have gained attention due to the process of European integration and its consequences for regional inequalities (Fujita et al., 1999). Although integration within the EU has led to increasing convergence across member states, inequality within member states has tended to increase (Faiña et al., 2016). Thus, the developments of NEG have substantially improved the understanding of the concentration of economic activities and of market potential in geographical space. More specifically, they capture the effect of the “pecuniary economies”, which are transmitted through the price system. They have had a revolutionary impact in integrating spatial structures into the standard economic models considering demand and supply factors. Nevertheless, they only provide a partial explanation of agglomeration patterns. They are less suited to capturing the spillover effects from technology and knowledge (and others), which are crucial to comprehending the agglomeration into cities and clusters. These spillovers, however, are the domain of urban economics.
In consequence, the specific characteristics of the investigated problem determine which explanation is appropriate in each individual case. For example, the “NEG stresses the role of spatial linkages” (Brakman et al., 2009, p. 777) that exist across large spatial entities, while urban economics is more concerned with the local production conditions, emphasizing the importance of knowledge (and other) spillovers (Combes et al., 2005; Duranton, 2011; Fingleton, 2011). Moreover, Head and Mayer (2004) have found a crucial asymmetry in NEG models’ predictive power concerning spatial evolution and income inequalities. As Faiña et al. (2016, p. 349) note, “NEG predictions on spatial dynamics (through computer simulations) point to instability and breaking points [in tendencies], which do not fit with the stability of spatial patterns and urban hierarchy exhibited by


empirical data. On the contrary, whenever a centre-periphery spatial structure exists, inequalities in salaries and income between central and peripheral areas fit well with available evidence and can be explained by market access differentiation (the so-called nominal wage equation)”. Yet this relationship cannot be considered to be deterministic, because there are clear exceptions to this rule. For example, the “Scandinavian countries have been able to overcome to a large extent the main handicaps of peripheral areas” (Faiña et al., 2016, p. 349). In fact, NEG offers much more than possible explanations for the distribution and concentration of economic activity in space – it offers a framework of analysis suitable for application to the study and evaluation of many different situations with different problems, which require specific combinations, and thus tailor-made policy measures. Additionally, there are further theories which take a nondeterministic approach to policies on regional development and consider the economic expectations of different actors. For example, Redding’s (1996) model combines the two strands in which endogenous growth theory has been split (see Faiña et al., 2016). The first strand focuses on RDI and human capital, while the other sees the innovation process à la Schumpeter as being the most important driver behind growth. His model concentrates on the mutually reinforcing interplay of investments in human capital and in RDI. Furthermore, it considers the interactions of skill development at the individual level and quality-enhancing RDI investments by firms. These interactions lead to various equilibria, which have already been shown in the empirical literature: either high human capital and high RDI investment outcomes or low ones. Depending on the equilibrium, this will then lead to high or low growth rates. The expectations and strategic behaviour of agents are the major driving force behind the selection of the equilibrium. 
Thus, governments can take an important role in regional development by promoting policies that guide these expectations in the right direction so that the economy may end up in the high growth equilibrium and not get stuck in low growth. In particular, these policies can consist of specific subsidies that raise the expected return on individual human capital and firm-level RDI investments. In consequence, a region can boost sustainable growth by increasing both mutually reinforcing investment types at the same time (see Faiña et al., 2016). Furthermore, the concept of market access or market potential plays a very important role in many economic geography models (e.g. Crozet, 2004; Hanson, 2005; Niebuhr, 2006; Redding & Venables, 2004). Having good access to large markets is deemed to be a fundamental economic advantage of a region. The notion of market access follows the idea of applying physical laws to human and economic movements. More specifically, human interactions are considered to follow gravitational principles, which, in the earliest predecessors of the concept, could explain lower demand in remote areas. The concept itself goes back to Harris (1954). Technically speaking, market potential is constructed using volume measures (e.g. number of inhabitants, number of transactions) that are weighted by the inverse of distance (see Faiña et al., 2016). Later applications include Clark et al. (1969) and, in particular, Keeble et al. (1982) (see also Niebuhr, 2006). The latter authors show that market potential is lowest in peripheral regions. The highest market


potential, in contrast, was found in north-western Europe, including West Germany and the Benelux countries. In this way, mapping market potentials makes it possible to condense and visualise a huge amount of spatial information, which supports regional planning (Faiña et al., 2016). However, studies à la Harris also attracted criticism for not taking enough account of supply/demand structures (Faiña et al., 2016). They also lacked solid theoretical foundations. These were only later provided by NEG, so newer studies are able to test the implications of these theoretical models. Initially, these were country studies (e.g. Brakman et al., 2004; Mion, 2004; Ottaviano & Pinelli, 2006; Roos, 2001), which generally emphasise the importance of market access. More recently, new studies also take a European approach (e.g. Head & Mayer, 2006; Niebuhr, 2006) and generally confirm the hypotheses set up by NEG. Market access may also have effects on the accumulation of human capital. Human capital is clearly an important economic factor which may enable higher growth rates and lead to convergence or divergence processes. However, the incentive for individuals to invest in their human capital is not independent of their geographic location. In particular, higher market access may encourage human capital accumulation. This hypothesis has been validated by a range of publications focussing on the worldwide national level (e.g. Redding & Schott, 2003) and later on the European regional level. In particular, the latter include the work by López-Rodríguez and co-authors (e.g. Faíña & López-Rodríguez, 2006; López-Rodríguez et al., 2005, 2007). The results of these papers clearly indicate that human capital levels (as approximated by educational attainment levels) decrease when moving from NUTS 2 regions with high market access to those with low market access in the year 2000. Another factor influencing human capital accumulation is labour mobility.
Yet, labour mobility has been quite low in Europe, even during the most recent years (e.g. Barslund et al., 2015), and even more so in comparison to the United States. It may be higher for skilled labour, which would make it a potential factor driving agglomeration in human capital, as labour may move to the more advanced regions with higher human capital. But even at the level of tertiary education, these flows are rather limited. For example, learning mobility in higher education in the EU is on average still low, although it has increased during the last years (see Flisi et al., 2015). Certainly, one can expect labour mobility to have been much lower still if we go back in time, particularly when considering time periods before the Second World War. In sum, market access has been shown to be a crucial factor influencing human capital formation in the present time. However, has this always been the case? Is it a more general pattern that has persisted until the present time? This paper contributes to answering this question by analysing econometrically the importance of market access in the long run.

4.3 Theoretical Model

The proposed NEG model was originally developed by Fujita et al. (1999). This model has two sectors, i.e. agriculture and manufacturing. However, the model does not take into account human capital accumulation. This factor was only added by Redding and Schott (2003). Their model focuses on the interaction between human capital and input-output linkages, taking account of transport costs and assuming increasing returns to scale (IRS). One of their main results is that countries that are remotely located from main markets face higher trade costs and a lower skill premium than other countries if one assumes that manufactures are relatively more skill intensive than agricultural goods. In this way, a remote location has the same consequences as a reduction in the relative price level of manufactures. Due to the assumption that the required skills in the manufacturing sector are higher than in agriculture, skilled workers face a fall in their relative wages. Thus, the incentive for an unskilled worker to invest in human capital and become skilled is decreased. Because the main contribution of this paper is empirical, we only briefly present some foundations and results of Redding and Schott’s (2003) model.1 We adapt the model to the context of this paper by explicitly considering regions (instead of countries as in the original model). First, we consider the preferences and the endowments that have to be modelled. Accordingly, Europe is constituted by $i \in \{1, \ldots, R\}$ regions. Every region is characterised by an endowment of $L_i$ consumers. Every consumer has a single unit of labour. The supply of this unit of labour is inelastic, i.e. there is no disutility. Consumer preferences are identical for all $L_i$. Consumption is restricted to two types of goods: first, the production of the agricultural sector is limited to one homogeneous good. Second, the manufacturing sector produces a range of differentiated manufactures.
The preferences follow a standard utility function in Cobb-Douglas form. Let us now define the production technologies involved in the two sectors. In the first sector, the produced agricultural good is homogeneous. Production is set within the framework of perfect competition and is characterised by constant returns to scale. In the second sector, the production of the differentiated manufactured goods is characterised by IRS and uses a combination of the two types of labour (skilled, unskilled) and of the intermediate inputs of manufactured goods. In the next step, we introduce endogenous investment in human capital into the model. It is assumed that a conversion from an unskilled to a skilled worker is possible. Denoting an individual as z, this conversion incurs a fixed cost of education Ωi(z) units in terms of unskilled labour. The underlying idea is that real resources are consumed to become skilled, which results in the fact that education cost is a proportion of the wage of unskilled labour. Moreover, the quantity of unskilled

1 We follow López-Rodríguez et al. (2007) and limit ourselves to the supply side of the model. For a complete presentation, see e.g. López-Rodríguez et al. (2005).


labour that is needed to become skilled is dependent on two factors. In particular, $\Omega_i(z) = h_i / a(z)$, where $h_i$ denotes the overall environment provided by institutions and government policies that have repercussions on the education cost and $a(z)$ denotes the individual’s personal ability. This ability is subject to human biology. Thus, an individual $z$ will only take the decision to invest in human capital if

$$w_i^S - w_i^U \geq \frac{h_i}{a(z)}\, w_i^U, \tag{4.1}$$

i.e. if education costs are lower than (or equal to) the difference between the wages of a skilled ($w_i^S$) and an unskilled ($w_i^U$) worker. The equation defines an implicit critical value for $a$ above which all individuals choose to invest in human capital. This value $a_i$, giving the supply of skills in equilibrium, is

$$a_i = \frac{h_i}{w_i^S / w_i^U - 1}. \tag{4.2}$$
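Conditions (4.1) and (4.2) can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function names and all wage and cost figures are hypothetical and chosen for illustration, not taken from Redding and Schott (2003) or from our data.

```python
def critical_ability(h, w_skilled, w_unskilled):
    """Critical ability from Eq. (4.2): a_i = h_i / (w_S / w_U - 1)."""
    premium = w_skilled / w_unskilled - 1.0
    if premium <= 0:
        raise ValueError("the skilled wage must exceed the unskilled wage")
    return h / premium


def invests(a, h, w_skilled, w_unskilled):
    """Investment condition (4.1): w_S - w_U >= (h / a) * w_U."""
    return w_skilled - w_unskilled >= (h / a) * w_unskilled


# Hypothetical regions with the same education environment h = 0.5.
# The "remote" region has a lower skilled and a higher unskilled wage.
central = critical_ability(h=0.5, w_skilled=2.0, w_unskilled=1.0)   # 0.5
remote = critical_ability(h=0.5, w_skilled=1.5, w_unskilled=1.25)   # about 2.5

# An individual with ability a = 1.0 invests in the central region only.
print(invests(1.0, 0.5, 2.0, 1.0), invests(1.0, 0.5, 1.5, 1.25))
```

With these assumed numbers, the lower skill premium raises the critical ability from 0.5 to roughly 2.5, so fewer individuals find it worthwhile to become skilled.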

An individual with ability $a_i$ prefers neither to become skilled nor to remain unskilled but is indifferent between the two options. Therefore, this equation is the “skill indifference condition”. Only an individual with an ability above $a_i$ will choose to get further education. After defining the producer equilibrium and profit maximisation, we can obtain the zero profit conditions. Joining them with the skill indifference condition, we obtain the equilibrium relationship that exists between the geographical location of a region and endogenous investments in human capital. In particular, the equilibrium equations show that if the equilibrium market access of $i$, $MA_i$, decreases, if the manufacturing sector is assumed to be skill intensive relative to the agricultural sector and if the region is incompletely specialised, then the economy moves to a new equilibrium with lower skilled wages but higher unskilled wages. This implies that the critical ability level $a_i$ increases. This change induces a lower supply of skilled labour and a higher supply of unskilled labour. More specifically, the decrease of $MA_i$ has led to a smaller size of the skill-intensive manufacturing sector. The reduction in size means that there are now more skilled workers in the market than there is demand for them in agriculture. Therefore, the wages of skilled workers decrease, whereby their relative wages in comparison to the ones of unskilled workers fall. In this way, remoteness leads to smaller incentives to invest in human capital. This means that the model predicts a positive relationship between market access and human capital investment. While this basic two-sector model has so far been used to explain the current economic geography of Europe, it appears even more appropriate for the past. Clearly, whether to switch from agriculture to manufacturing is not a policy question for most European countries anymore.
The service sector has become much more relevant, both in terms of GDP and employment, than agriculture in many European countries. In contrast, the simple structure of the model may even more closely


mirror the development during the European industrialisation process. Most European countries only began to industrialise during the nineteenth or even the twentieth century. For example, Broadberry’s (2009) data show that agriculture still accounted for 50% of employment in West Europe in 1870. The share increases to 57% in South Europe and even 70% in East Europe. Even without strong assumptions, it is evident that these shares would be even higher for 1850, illustrating the crucial weight of agriculture in Europe at that time. Ongoing industrialisation in West Europe increased the share of industry while decreasing agricultural employment to 32% in 1929. Nevertheless, a third of overall employment is still a relevant share for agriculture, particularly if compared to only 5% of employment in West Europe in 1992 (and certainly less today). However, industrialisation occurred later and more slowly in other, peripheral parts of Europe. Therefore, the same share only dropped to 53% in South Europe and to 66% in East Europe in 1929. For this reason, the model’s two-sector structure and the associated switch from unskilled to skilled workers fit the historical period under study (i.e. 1850 and 1930). The higher market access in the core industrialising European countries would imply that skilled workers are rarer in the periphery. Can we find this theoretical result also in the data?

4.4 Data and Methodology

We test the theoretical model using different datasets. In particular, we use regional numeracy and literacy as our human capital proxies. First, we employ numeracy as a proxy for regional human capital in Europe in 1850. Numeracy is derived from the age heaping method. Age heaping as a method for calculating basic human capital values has been established by the recent literature (e.g. A’Hearn et al., 2009; Crayen & Baten, 2010; Hippe, 2012b; Hippe & Baten, 2012). In particular, we use the ABCC index to measure numerical abilities. In fact, it measures the share of individuals who are able to calculate. More specifically, historical census data, and in part even data for today’s LDCs, show a clear pattern of rounding. Many people were not able to calculate their age. Therefore, they guessed their age to fulfil the census requirements set up by the state. Because the human body serves as a first aid for calculations (e.g. five fingers on one hand, ten fingers in total), they rounded their ages to numbers ending in 0 and 5 (see also Harper, 2008). This specific rounding behaviour can be used to calculate numeracy estimates. In particular, an index can be constructed which measures the deviation of the actually observed age statements from the distribution that would be expected. This is the so-called Whipple index. However, this index is not very intuitive, as it takes values from 100 (the highest numeracy score) to 500 (the lowest score). For this reason, A’Hearn et al. (2009) have proposed a linear transformation of the index, with values ranging from 0 (lowest) to 100 (highest numeracy), i.e. the ABCC index. It has been shown that this rough proxy of numeracy is well correlated with other standard human capital

Table 4.1 Descriptive statistics for ABCC and market access, ca. 1850

| Variable | Obs. | Mean | SD | Min | Max |
|---|---|---|---|---|---|
| ABCC | 298 | 90.79 | 12.29 | 26.38 | 100.00 |
| Market access | 298 | 4472.78 | 2199.80 | 995.13 | 17,165.47 |
| Distance | 298 | 14.16 | 11.37 | 0.61 | 51.50 |

proxies such as literacy (A’Hearn et al., 2009; Hippe, 2012b) and primary school enrolment (Crayen & Baten, 2010). The underlying formula of the ABCC index is

$$\mathrm{ABCC}_{it} = 125 - 125 \times \left( \sum_{j=5}^{14} n_{5j,it} \Big/ \sum_{j=23}^{72} n_{j,it} \right), \tag{4.3}$$

where i denotes a region, j the age in years, n the number of individuals and t the time period. The formula illustrates that one calculates the share of age observations ending in “0” and “5” in relation to all observations. One takes into account all ages between 23 and 72, the standard in the numeracy literature. For example, if 30% of all reported ages in this range end in “0” or “5”, the index is 125 − 125 × 0.30 = 87.5; without any heaping, the share is 20% and the index equals 100. The ABCC with its limits of 0 and 100 is comparable to other share indexes, in particular literacy. The human capital data have been taken from the new and large database provided by Hippe and Baten (2012). These data are based on original historical census data. The advantage of this measurement method is that it always takes into account the entire population and not, as other historical proxies of human capital (e.g. signature rates), only parts of it. For this reason, it is representative of the whole population and is not prone to the biases that naturally reside in more partial indicators. In this way, we are able to measure the regional distribution of basic numeracy from Portugal to Russia. In total, there are 298 regions in our dataset (see Table 4.1 for descriptive statistics).2 Second, literacy is our alternative human capital proxy and is available for 1930. Literacy data are not available for a range of European regions for 1850, which is why the ABCC index is the more suitable indicator for that period. However, literacy became a standard human capital indicator during the second half of the nineteenth century and was used in many countries throughout the first half of the twentieth century. Many public debates focused on the eradication of illiteracy in a number of European countries. Still, this aim was sometimes not achieved until the second half of the twentieth century, so it is a valuable indicator for 1930. Therefore, it can be used as the representative human capital proxy for 1930. Literacy is defined as

2 Given the variable “Distance to Luxembourg”, we have excluded Luxembourg in all our regressions and do not list it here.

$$\mathrm{Literacy}_{it} = \sum_{j=10}^{N} rw_{j,it} \Big/ \sum_{j=10}^{N} n_{j,it}, \tag{4.4}$$

where rw denotes the number of individuals who are able to read and write and N is the total number of years. In other words, literacy is the share of individuals (10+ years) who are able to read and write in a region at a given point in time. Data stem originally from Kirk (1946) and have been adapted for the purposes of this paper. Both numeracy and literacy have the advantage that they are share variables. In fact, the proposed NEG model divides individuals into skilled and unskilled workers. A given level of numeracy is nothing more than the share of numerate individuals in the population; a level of literacy is the share of literate individuals. We can make the simple but straightforward assumption that an unskilled worker is innumerate (in 1850) or illiterate (in 1930). Similarly, it is reasonable to assume that an innumerate person can decide to become numerate and an illiterate one to become literate. The endogenous investment assumption in the model, from unskilled to skilled workers, can thus be illustrated by our indicators. These parallels show the correspondence between our empirical specification and the underlying theoretical model. In addition, we have been able to collect regional data for these human capital proxies in the past, whereas data with a similar degree of regional precision are not available for other proxies for most of Europe. Moreover, the data on urbanisation are provided by two different sources. For 1850, we use the data provided by Bairoch et al. (1988). It is, alongside a similar database by De Vries (1984), the standard database on urbanisation in the long run. In fact, the data trace the cities of Europe back from 1850 to the year 800. For a general geographical illustration of the data for 1850, see Fig. 4.1. London is by far the largest European city, followed by Paris. Cities are quite dense in most of Europe, except Scandinavia and Eastern Europe, where Russia’s capital, St. Petersburg, is the most important city.
Cities are included if they meet a minimum population threshold between 800 and 1800. This threshold is 5000 inhabitants. In total, there are 2201 cities in our database. We excluded two observations because they were geographical outliers, so we have used the remaining 2199 cities for our calculations.3 Because the Bairoch et al. database does not cover later points in time, we had to use another database for our literacy regressions. In fact, Europe-wide data for literacy are only available from around 1900 onwards, but the earliest data on cities (or, in this case, agglomerations) after 1850 are only available for 1950. Therefore, we use literacy data from 1930 and take data on European agglomerations in 1950 as the best approximation of market access in 1930.4 These agglomeration data have been assembled by Moriconi-Ebrard (1994) and are compatible with the Bairoch et al. database.5 It is a worldwide database with a threshold of 10,000 inhabitants in 1990, so the entire database includes up to 26,000 agglomerations worldwide. A graphical illustration of these data for Europe shows their general resemblance to those in 1850 (see Fig. 4.2). While London is still the most populous European city, there have been increases in population elsewhere. For example, Russia’s new capital, Moscow, is now significantly larger than St. Petersburg, and other capitals and agglomerations such as Athens show an increased importance in the European urban landscape. However, the overall picture that we get from 1950 is quite similar to the 1850 data.

Fig. 4.1 Location and size of European cities, 1850. Source: Own graphical presentation of data provided by Bairoch et al. (1988). Size of cities is shown in 1000 inhabitants

Fig. 4.2 Location and size of European agglomerations, 1950. Source: Own graphical presentation of data provided by Moriconi-Ebrard (1994). Size of agglomerations is shown in 1000 inhabitants

3 These outliers are Ponta Delgada, which is on the Azores Islands and far off the European continent, and Oral, which is not located within the limits of today’s definition of Europe.

4 We are very well aware of the fact that the Second World War affected important portions of regional populations, which may have a biasing effect on our estimates. However, authors such as Martí-Henneberg (2005) show that population concentrations are highly correlated at the regional level between 1870 and 2000, which suggests that data from 1950 are still a good approximation for 1930.

5 Moriconi-Ebrard’s (1994) database includes agglomeration data from 1950 to 1990.

6 Clearly, it would be preferable to use an even closer theory-based measure, including regional price and interregional trade flow data. Yet, as López-Rodríguez et al. (2007) already emphasise, this measure is not available for today. Unsurprisingly, it is not available for the past either, so we have to rely on our alternative but fairly good proxy estimates.

Furthermore, market access has been calculated with population data in the recent literature (e.g. López-Rodríguez et al., 2005).6 Population potential also appears to be the best available proxy in historical European applications.7 It is a standard way of representing changes in the pattern in which cities are distributed in space. It makes it possible to identify the relative location of a city within a greater network of other cities. Two factors are essential in the evaluation process: first, the size of the population of cities and, second, the distance of a city to the other cities in the network. In practice, one adds to the population size of a city the population sizes of the other cities, each time divided by their distances to the original city. This is done for every city in the data. In this way, a potential value is assigned to each city. To be more precise, the mathematical formula, which is a development of the classic concept of population potentials first introduced by Harris, is (see López-Rodríguez et al., 2005) as follows:

$$MA_i = Po_i + \frac{Po_1}{D_{i,1}} + \ldots + \frac{Po_n}{D_{i,n}} = Po_i + \sum_{j \neq i,\, j=1}^{N} \frac{Po_j}{D_{i,j}}, \tag{4.5}$$

where MAi stands for the market potential at i, Poi is the population of i, and Di,j is the distance that exists between i and j, each i and j representing individual nodes.
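The calculation in Eq. (4.5) can be illustrated with a minimal Python sketch. It is not the procedure used for the estimates in this chapter: the function name, the toy cities and the use of planar Euclidean distance are assumptions made purely for illustration, since the text does not specify how distances were measured.

```python
from math import hypot


def population_potential(cities):
    """Return MA_i = Po_i + sum over j != i of Po_j / D_ij for every city.

    `cities` maps a name to (population, x, y). Coordinates are assumed to be
    planar, so D_ij is the Euclidean distance (an illustrative assumption).
    Two cities at the same location would cause a division by zero.
    """
    potentials = {}
    for name_i, (pop_i, x_i, y_i) in cities.items():
        total = float(pop_i)  # the city's own population enters undivided
        for name_j, (pop_j, x_j, y_j) in cities.items():
            if name_j == name_i:
                continue  # the sum runs over j != i only
            distance = hypot(x_i - x_j, y_i - y_j)
            total += pop_j / distance
        potentials[name_i] = total
    return potentials


# Hypothetical toy data, not taken from the Bairoch or Moriconi-Ebrard databases.
toy_cities = {
    "A": (1000, 0.0, 0.0),
    "B": (500, 3.0, 4.0),    # 5 distance units from A
    "C": (200, 0.0, 10.0),   # 10 distance units from A
}
toy_potentials = population_potential(toy_cities)
```

In this toy case, city A obtains 1000 + 500/5 + 200/10 = 1120: its own population dominates, while the other cities contribute in proportion to their size and in inverse proportion to their distance.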

7 Other economic measures, such as regional GDP data, are not yet available for an important part of the European regions, in particular in Eastern Europe.


For the econometric specification of the relationship between investment in human capital and market access, we first test a standard OLS regression model as used in the literature. The basic framework is the following:

$$\ln(HC_i) = \beta_0 + \beta_1 \ln(MA_i) + \varepsilon_i \qquad (4.6)$$

where HC is the respective human capital indicator (i.e. numeracy or literacy; in logarithmic terms), MA is market access (in logarithmic terms), i is a region, and ε is the unexplained residual. The basic OLS framework is later complemented by Tobit and instrumental variable regressions. In addition, note that “region” stands for a NUTS region in our case. NUTS is the official nomenclature of territorial units for statistics, which has been developed by the European Union. It comprises all countries of the EU, EFTA and candidate countries of the EU. For countries outside this area, e.g. Russia, we used the current administrative division. This allows us to make our data comparable to current data and other research. Given that market access and distance involve point data (cities and the central point of each region, respectively), the NUTS level can be attributed without any further difficulties. The case is different for human capital data, which were available only for the historical regions. In this case, we developed the correspondence of these historical regions to current regions as best as possible. Because we often have more detailed data than needed for this study (e.g. the départements in France, the provincias in Spain or the Bezirkshauptmannschaften in Cisleithania, i.e. the Austrian part of Austria-Hungary before its dissolution), the possible biases are substantially reduced because we can easily aggregate our data from province or county level. As a standard, we use NUTS 2 as the basic unit of analysis, which is also the standard unit in most other contributions in our area.8 In this way, we are able to create a unique dataset for the European regions in 1850 and 1930.
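As a rough illustration of the log-log specification in Eq. (4.6) with country dummies, the slope can be computed by demeaning both logarithms within each country, which by the Frisch-Waugh-Lovell theorem gives the same coefficient as OLS with a full set of country dummies. The sketch below is not the estimation code behind the results in this chapter; the function name and the toy data are hypothetical.

```python
from collections import defaultdict
from math import log


def loglog_slope_with_country_dummies(rows):
    """Estimate beta_1 in ln(HC) = beta_0 + beta_1 * ln(MA) + country dummies.

    `rows` is an iterable of (country, hc, ma) tuples with hc > 0 and ma > 0.
    Demeaning within country is equivalent to including country dummies.
    """
    groups = defaultdict(list)
    for country, hc, ma in rows:
        groups[country].append((log(ma), log(hc)))

    sxy = 0.0  # sum of within-country cross products
    sxx = 0.0  # sum of within-country squared deviations of ln(MA)
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx


# Hypothetical toy data: two countries with different levels but the same
# elasticity of 0.08, constructed so that the estimator should recover it.
toy_rows = [
    (country, level * ma ** 0.08, ma)
    for country, level, market_access in (
        ("X", 50.0, (1000.0, 2000.0, 4000.0)),
        ("Y", 60.0, (1500.0, 3000.0)),
    )
    for ma in market_access
]
toy_beta = loglog_slope_with_country_dummies(toy_rows)
```

Because both variables enter in logarithms, the slope is an elasticity: a value of 0.08 means that a 1% increase in market access is associated with an increase of about 0.08% in the human capital indicator.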

4.5 Results

We first present our results for numeracy in 1850 and subsequently check their overall robustness with the literacy data in 1930. The calculated population potential values for 1850 are illustrated in Fig. 4.3. In the following, we refer to countries and regions in their current boundaries. It is apparent that the highest population potentials are found in the areas of Paris, London and Manchester and the wider locus up to Belgium and the western parts of Germany. In the current literature, this area is often called the Golden Triangle and has also been identified by market access studies such as López-Rodríguez et al. (2007) for 2000. The similarity of our

8 An exception is Greater London, where we had to use the NUTS 1 level due to unavailability of more disaggregated data.


Fig. 4.3 Population potential in Europe in 1850. Note: Graphical representation using natural breaks (Jenks) with 32 classes. Values decrease from the highest to the lowest value in the following broad order of colours: white, pink, blue, green, yellow, orange and red. Source: Own calculations, city data provided by Bairoch et al. (1988)

historical market access estimates to their current estimates further shows the validity of our approach and of potentially existing long-term spatial configurations. Given the size of the aforementioned cities, in particular of Paris and London, the shape of the triangle is not surprising because these were the two most populated cities of Europe in 1850. Still, the figure highlights that they were not isolated from other population hubs but were the centre of a greater accumulation of population in western Europe. This can be explained by the long-term geographic shift of economic importance from northern Italy to this area, as has also been postulated by Braudel (1979). This is also in line with the concept of a “blue banana”, which has been put forward by Brunet (2002): a concentration of population and economic activity stretching from northern Italy along the course of the Rhine River to the United Kingdom and even Ireland. In general, the farther one moves from the centre in western Europe, the lower the population potential values. Moving away from the centre, the next-highest estimated values are located in the regions of the United Kingdom, France, Germany, Switzerland, Italy, parts of Austria and Spain. Polish regions are already in

Fig. 4.4 ABCC and market access, 1850. Scatter plot of the ABCC (vertical axis, 20 to 100) against ln(Market Access) (horizontal axis, 7 to 10); each point is labelled with its region code.

the next level. Nevertheless, there are some outliers to the overall rule. Large cities create their own high local population potential, which explains the different shading in the areas of e.g. Madrid, Hamburg, Berlin, Prague, Vienna, St. Petersburg and Moscow. In the next step, we investigate the relationship between market access and human capital. To this end, we plot market access against the ABCC (Fig. 4.4). Unfortunately, the ABCC has already achieved its maximum level of 100 in several countries. This is why there are a number of regions that are limited by the upper bound. Nevertheless, there is a clear relationship between market access and the ABCC. To test this relationship econometrically, we estimate different regression models, always including country dummies. As OLS is the most basic and standard estimation method, we begin with OLS regressions. Subsequently, we will also test alternative models that account for the bounded scale of the dependent variable (i.e. Tobit models) and for endogeneity (i.e. instrumental variable models). The results of the baseline OLS regressions are shown in Table 4.2. Market access has a significant positive effect on numeracy at the 5% level (p-value = 0.018,


Table 4.2 Market access and ABCC, ca. 1850

Dependent variable: ln(ABCC)

                   (1)        (2)        (3)        (4)        (5)
ln(MA)             0.08**                0.10***               0.09***
                   (0.018)               (0.006)               (0.000)
ln(Dist. to Lux.)             -0.03***              -0.06***
                              (0.002)               (0.001)
Constant           3.66***    4.42***    3.67***    4.69***    3.78***
                   (0.000)    (0.000)    (0.000)    (0.000)    (0.000)
Estimation         OLS        OLS        Tobit      Tobit      IV
Country dummies    YES        YES        YES        YES        YES
Observations       289        289        289        289        289
R-squared          0.69       0.69                             0.69

Note: ***, **, * indicate significance at the 1, 5 and 10% level. Robust p-values in parentheses

column one).9 A 1% increase in market access increases numeracy by 0.08%, a sizeable effect. To compare our results for market access with distance to Luxembourg as proposed in the literature, we also computed this distance (in natural logarithm) and show the results in column two (and also in the subsequent steps).10 Distance to Luxembourg is negatively and significantly correlated with the ABCC at the 1% level. However, we have seen in the scatter plot that there are a number of regions that have already achieved the upper bound of 100 ABCC points in 1850. This upper limit may bias our results because some of these regions would have had higher numeracy values if the limit did not exist. For this reason, we take this fact explicitly into account by running the same regressions with the Tobit model. The Tobit model accounts for upper or lower bounds in its estimations. The lower bound is not important in our case, but the upper limit is. In total, there are 39 regions which are right-censored by the model.11 The results when using the alternative Tobit model are shown in columns three and four. The coefficient of market access increases in the new specification (to 0.10, column three), which is also true for distance to Luxembourg (column four). Overall, the Tobit model confirms the robustness of our former results. Nevertheless, it is still possible that our results are biased by endogeneity. In fact, one can imagine that market access is correlated with alternative variables that may have a significant influence on numeracy. Thus, to be able to identify whether there

9 Note that we have opted for the presentation of the results with the logarithmic form of the ABCC. We have also done all regressions without this transformation and obtained the same results (only the value of the coefficients changed, which is a logical consequence of the transformation).
10 Similar to López-Rodríguez et al. (2007), we also explore the possibility of outliers that would have an effect on our results by computing Cook’s distance. According to Cook’s distance, there are no outliers (i.e. with a value >1) in any of the regressions.
11 Because we use the logarithmic form of the ABCC here, the upper limit (corresponding to 100) is approximately 4.6052.


is causality between market access and numeracy, we also perform instrumental variable regressions. In the given case, an instrumental variable has to be a determinant of market access but also has to be exogenous to numeracy. Moreover, the variable should not be prone to influences of another underlying variable, which may drive its values and affect both market access and numeracy. Thus, in line with Redding and Venables (2004), Breinlich (2006) and López-Rodríguez et al. (2007), we take the distance from Luxembourg as our first instrumental variable. This variable captures the advantages conferred by being close to the centre of Europe. Second, as proposed by the same authors, we use the (area) “size of a region’s home country” (López-Rodríguez et al., 2007, p. 223), capturing the advantages that are created by big national markets for the market access of a region.12 The use of a similar strategy as previous authors also enables us to put their results for today into a larger historical context, which is the aim of this paper. The results of our IV models are shown in column five. The IV estimates for (logarithmic) market access are once again highly significant at the 1% level. The sign of the coefficient does not change, and the level of the coefficient lies between the ones in our other specifications. In other words, the coefficient of (logarithmic) market access was 0.08 in the OLS, 0.10 in the Tobit and now 0.09 in our IV models. In sum, the IV results confirm once more the importance of market access for numeracy. However, one may wonder if our results are robust to the use of other human capital variables and other time periods in the past. Therefore, we use literacy in 1930 as our alternative indicator of human capital and check whether it confirms our results. However, given the use of another dependent variable (i.e. another human capital proxy), the consideration of a later time period (i.e. 80 years later than our numeracy estimates) and another dataset for the calculation of market access (although compatible with the dataset for 1850), we clearly would not expect to obtain the same results, including the same level of coefficients. In particular, the scatter plot (Fig. 4.6) shows that literacy rates are much more dispersed than numeracy rates. For this reason, we expect higher coefficients in our 1930 regressions. Nevertheless, we expect to come to the same broad conclusions using this alternative specification. To achieve a maximum of comparability with our earlier results, we take the same approach as for numeracy in 1850. First, we find that the results of the population potential calculations appear to be quite similar around 1950 (see Fig. 4.5). The “core” of population potential is still located within the Golden Triangle, i.e. the industrial areas of England, Paris, Belgium and western Germany. The Iberian Peninsula, Scandinavia and eastern Europe still make up the periphery. Some differences emerge, however. For example, there appeared to be only two significant

12 Borders and countries in ca. 1850 are considered. Because we are interested in the domestic market and trade advantages, we consider Germany as being constituted by those countries that had joined the Zollverein (German Customs Union). Data on country sizes (in geographical square miles) come from Annuaire Statistique et Historique Belge (1857).


Fig. 4.5 Population potential in Europe in 1950. Note: Graphical representation using natural breaks (Jenks) with 32 classes. Values decrease from the highest to the lowest value in the following broad order of colours: white, pink, blue, green, yellow, orange and red. Source: Own calculations, data on agglomerations provided by Moriconi-Ebrard (1994)

city centres with a high population potential in eastern Europe, that is, St. Petersburg and Moscow. Now, these two cities are joined by Donetsk. In contrast, the cities in Spain and Portugal are no longer such prominent outliers. The higher fertility rates and increasing urbanisation in eastern Europe over the previous century may explain these changes. Still, the overall pattern is quite robust to these rather minor changes. In addition, it appears to be still more closely related to current market access estimations by López-Rodríguez et al. (2007). Similar to their data for the year 2000, Bucharest in Romania is now a positive outlier (accompanied by Sofia in Bulgaria). Major population potential levels now extend as far as Wrocław in Poland and Milan in Italy, again quite similar to the recent data for 2000. These findings lend additional validity to our estimations, show their robustness and may indicate the long-term nature of regional market access levels. Next, plotting market access and literacy shows that their positive correlation is also quite clear (see Fig. 4.6). Note that there are no literacy data for several developed countries in 1930, such as the Scandinavian countries, Germany or the United Kingdom. Kirk (1946) estimates that these countries had literacy rates between 95 and 100%. In the following, we exclude the regions from these countries (as has been done in Fig. 4.6). Alternatively, we can also adopt the hypothesis that these regions had a literacy rate of 100%. In any case, there are no apparent outliers.

Fig. 4.6 Literacy and market access, ca. 1930. Scatter plot of literacy (vertical axis, 0.2 to 1) against ln(Market access) (horizontal axis, 8.5 to 11); each point is labelled with its region code.

Table 4.3 Market access and literacy, ca. 1930

Dependent variable: ln(Literacy)

                       (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
ln(MA)                 0.58***               0.61***    0.32***               0.57***               0.29***
                       (0.000)               (0.000)    (0.000)               (0.000)               (0.000)
ln(Dist. to Lux.)                 -0.25***                         -0.10***              -0.26***
                                  (0.000)                          (0.000)               (0.000)
Constant               -5.78***   0.69***    -6.73***   -3.03***   0.20***    -6.30***   0.20***    -3.62***
                       (0.000)    (0.000)    (0.000)    (0.000)    (0.001)    (0.000)    (0.000)    (0.000)
Estimation             OLS        OLS        IV         OLS        OLS        Tobit      Tobit      IV
Obs. excluded          Estimated  Estimated  Estimated  None       None       None       None       None
Country-fixed effects  YES        YES        YES        YES        YES        YES        YES        YES
Observations           199        198        194        324        316        324        316        308
R-squared              0.69       0.67       0.70       0.78       0.77                             0.78

Note: ***, **, * indicate significance at the 1, 5 and 10 percent level. Robust p-values in parentheses.


The relationship between literacy and market access is even closer than for numeracy. The corresponding regression results are shown in Table 4.3 (always including country dummies). This time, we propose two different specifications. First, we exclude the developed countries without any official literacy data (columns one to three). Column one shows that log market access is again positive and highly significant at the 1% level. That is, a 1% increase in market access increases literacy by 0.58%. This is substantially more than the 0.08% that we obtained for numeracy in 1850. As noted above, this higher level corresponds to our expectations. All coefficients are higher than in 1850 because literacy rates are more dispersed than numeracy rates. A similar reasoning applies to the larger negative and significant coefficient for log distance to Luxembourg (column two). As we have excluded all estimated literacy data for developed countries, the remaining countries do not reach the upper bound of 100% literacy. For this reason, we do not need to perform Tobit regressions. Even if we perform them, we get the same results (not shown). Therefore, we proceed with the IV estimation, using the same strategy as in our numeracy regressions (column three). The coefficient of log market access remains highly significant at the 1% level and largely stable, increasing only slightly from 0.58 to 0.61. This is the same tendency we have already observed in our numeracy sample. Second, we include the developed countries with their estimated literacy rates (columns four to eight). Most of these countries have estimated literacy rates of 95–100%.13 Thus, we assume that these countries had literacy rates of 100%.
As a number of countries have now reached the upper limit, we perform Tobit analyses in addition to OLS models.14 Although we prefer to exclude countries that have the same estimated literacy rate for each region, as we did in our previous specification, this alternative strategy may allow us to show the effects of including all of Europe. We start again with OLS models (columns four and five). While the significance levels are identical, the coefficient of log market access (column four) decreases substantially, from 0.58 to 0.32, due to the higher number of observations (with a high level of literacy). Log distance to Luxembourg (column five) shows a similar move downwards. Moving to Tobit models, the coefficient of log market access increases by more than 50%, from 0.32 (column four) to 0.57 (column six). The same holds for log distance to Luxembourg (column seven). In consequence, taking account of the upper limit raises the coefficient to the same level as when the countries with estimated high literacy rates were excluded (see the initial models in columns one to three). Finally, we perform an IV regression (column eight). The coefficient in the IV regression (0.29) is similar to the one in the OLS model that includes the estimated regions (0.32, column four) and highly significant.
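To make the role of the upper bound in the Tobit models more concrete, the following Python sketch writes down the log-likelihood of a Tobit model that is right-censored at ln(100). It is a simplified illustration, not the estimation routine behind Tables 4.2 and 4.3: it has a single regressor, omits country dummies, and all function names, parameter values and toy observations are hypothetical. In practice, the parameters would be chosen to maximise this function.

```python
from math import erfc, exp, log, pi, sqrt

UPPER = log(100.0)  # upper bound of the logged indicator, about 4.6052


def norm_pdf(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def norm_sf(z):
    """Standard normal survival function 1 - Phi(z), stable for large z."""
    return 0.5 * erfc(z / sqrt(2.0))


def tobit_loglik(beta0, beta1, sigma, data, upper=UPPER):
    """Log-likelihood of a Tobit model right-censored at `upper`.

    `data` holds (ln_ma, ln_hc) pairs. Observations with ln_hc >= upper are
    treated as censored: only the probability that the latent value lies at
    or above the bound enters the likelihood.
    """
    total = 0.0
    for ln_ma, ln_hc in data:
        mu = beta0 + beta1 * ln_ma
        if ln_hc >= upper:
            total += log(norm_sf((upper - mu) / sigma))
        else:
            total += log(norm_pdf((ln_hc - mu) / sigma) / sigma)
    return total


# Hypothetical toy observations; the last region sits at the upper bound.
toy_data = [(8.0, 4.2), (9.0, 4.4), (10.0, UPPER)]
ll_positive_slope = tobit_loglik(3.6, 0.1, 0.2, toy_data)
ll_negative_slope = tobit_loglik(3.6, -0.1, 0.2, toy_data)
```

OLS treats a censored region as if its observed value of 100 were its true level, which tends to flatten the estimated slope. The Tobit likelihood only uses the information that the latent value is at least 100, which is consistent with the larger market access coefficients reported for the Tobit columns.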

13 See Kirk (1946) for more information. These countries are Denmark, Germany, Ireland, the Netherlands, Norway, Sweden, Switzerland, the United Kingdom and parts of Austria.

14


In sum, market access is a highly significant determinant of literacy in every model. The coefficients are higher, as expected, in our literacy regressions for 1930 than for the ABCC in 1850, and they show the robustness of our results over time. All in all, our results show that space matters in education. We find a core-periphery pattern in Europe similar to the literature that analyses the EU today (see Sect. 4.2). Market access has a significant influence on human capital, confirming the “penalty of remoteness” hypothesis. Moreover, because we are referring to the rather distant past with our data, the current regional distribution of human capital and economic development appears to be rather stable in the longer run. This has important implications for regional policy. Remoteness equals backwardness: this statement is certainly an exaggeration, but remoteness definitely makes backwardness more likely and makes it more difficult to escape from it. However, remoteness does not imply that remote areas are necessarily trapped in a vicious circle caused by geographic location. While geographic location and market access play an important role in explaining regional human capital differences, our results also show that, for example, remote countries (in market access terms) such as Finland and Estonia have outperformed their expected human capital level. In consequence, other factors may lead to better human capital performance, such as strong institutions and cultural values directed towards education. In this context, transport costs also matter. In fact, the concentration and agglomeration of firms and labour depend on the level of transport costs (Combes et al., 2008, see also Hippe, 2013). One can distinguish three development stages, assuming the existence of two regions A and B. First, both regions are in autarky as transport costs are at a high level.
In consequence, trading and transferring capital or labour between the regions is still impossible at this stage. Decreasing transport costs allow the regions to begin trading, and real incomes rise in both regions. However, once trade costs fall below a certain threshold, regions begin to diverge and regional inequalities rise. More specifically, while real incomes in A increase with the formation of an agglomeration, they decrease in B, which loses consumers. On the other hand, if transport costs fall even further to another threshold, region A again becomes less attractive for capital and labour, which turn towards B. In the end, both regions find themselves at a higher level of real incomes, and regional inequalities have again disappeared (Combes et al., 2008, see also Hippe, 2013). Thus, policies need to take into account the level of transport costs, as they are an important factor for regional inequalities. Public policy can lower transport costs. As Faiña et al. (2016, p. 353) point out, “[i]n the first development stages of lagging regions, solid reasons exist to support adequate endowments to transport infrastructure”, although these investments are not enough to remove obstacles to sustainable growth paths but need to be coupled with other effective economic and social development strategies. In consequence, while public policy cannot influence the geographic location of a region, it can act upon these other factors. Nevertheless, policy needs to acknowledge the role of remoteness and adapt to the needs engendered by it. In other words, it has to develop policies which are appropriate, given the relative location and market access level of a region, to ensure that all regions are


able to grow and that regional inequalities are not exacerbated. However, cohesion policy faces a number of risks, as has been outlined by Farole et al. (2011): first, markets and their functioning may be distorted by allowing inappropriate investments to be made at the regional level. Second, public subsidies may crowd out private investment. Third, they may protect regions from the markets but in doing so create an environment less friendly towards adaptation to evolving conditions. Fourth, they may involuntarily make these regions dependent on public subsidies, so they cannot survive without external support and cannot themselves generate the environment necessary for economic growth. Last, the elites that may (partly) be responsible for the lower development level of a region (in the historical period under study, these may be e.g. local landlords) (see also Baten & Hippe, 2018; Galor et al., 2009) may have the power to use cohesion policy investments for their own rent extraction, thus promoting inefficient and growth-limiting local institutions. For these reasons, it is important to consider the particular context of a region, the given institutional framework and the relevant stakeholders (see also Hippe, 2020). Therefore, as Farole et al. (2011) and Hippe (2013) emphasise, a “one size fits all” strategy may not work. In particular, Farole et al. (2011) provide some useful criteria on how to advance policies by differentiating various kinds of regions (from core-metropolitan regions to peripheral sparsely populated ones) and the goals that cohesion interventions would like to achieve.
As for the latter, the authors outline the aim to (1) generate growth in the EU as such, (2) diffuse growth from the core to the periphery, (3) encourage growth in particular peripheral areas with potential, (4) support regions with less potential and, finally, (5) provide assistance to regions with least potential while acknowledging that the latter may be in conflict with the aim to promote growth in the entire EU. In some circumstances, it may be best for cohesion policy to focus on national institutions, because the specific policies are effective on a national level (e.g. the legal framework). Yet in others, the involvement of regional institutions may be more appropriate, in particular when it comes to implementing policies. But again, special care needs to be taken not to involuntarily support and promote inefficient structures in underdeveloped regions by placing responsibility in the wrong hands (see also e.g. Farole et al., 2011; Hanushek & Woessmann, 2015). A better understanding of the regional success stories coming from countries like South Korea may provide a useful broader perspective on this issue, showing that change in the right direction is possible (Farole et al., 2011) if the appropriate regionally tailored policy strategy is applied. Whatever strategy is chosen, it should be appropriately defined, assessed, conditioned on certain criteria and rigorously monitored, following a specified number of guidelines and mechanisms (Farole et al., 2011), to avoid any of the potential negative effects of external policy interventions.

4.6 Conclusions

This paper has analysed the importance of market access to explain the spatial distribution of human capital levels in the European regions in the long run. The central focus of the paper is whether remoteness was connected to backwardness in the past, as has been postulated by Redding and Schott (2003) and tested by e.g. López-Rodríguez et al. (2007) for the European regions in the present. In particular, we construct a new combined dataset using two different indicators of human capital, numeracy and literacy, to check the robustness of our results. More specifically, we employ, first, the age heaping method in order to approximate numeracy values for 1850. Second, literacy is a standard human capital proxy in Europe for the end of the nineteenth and the first half of the twentieth century. Therefore, we use literacy in 1930 as our alternative specification. These two binary indicators also partly mirror the assumptions of the underlying theoretical model. Moreover, data on European cities have been used to proxy for market access. To this end, the standard concept of population potential has been employed to generate average market access estimations for the European regions. The results show that market access is highest in the regions of the Golden Triangle, i.e. England, northern France, Belgium and western Germany. This is an area where there is a concentration of population and economic prosperity (see e.g. López-Rodríguez et al., 2007). In general, the farther one moves away from this centre, the lower the level of market access. Thus, market access is lowest in the peripheral areas of Europe, for example, in parts of Iberia and Eastern Europe. Therefore, we find a core-periphery pattern of education also in the past. Moreover, OLS, Tobit and IV regressions of numeracy on market access highlight that numeracy is significantly higher in regions with higher market access.
Thus, while the literature has mainly used educational attainment for the current period, our numeracy and literacy estimates show that the “penalty of remoteness” hypothesis is not only valid for today but that its importance can be traced back even to the middle of the nineteenth century. This underlines once more that this penalty has existed for a long time in Europe. Thus, it may continue to exist in the future if the right policy decisions are not taken. These future policies should take into account the specificities of the region under consideration and should ensure, through a variety of measures such as monitoring, that policies do indeed support the aims set out by regional policy.

Appendix

Regional data in 1850 include the following countries (in current borders): Albania, Armenia, Austria, Azerbaijan, Bosnia-Herzegovina, Belgium, Bulgaria, Belarus, Switzerland, Czech Republic, Germany, Denmark, Estonia, Spain, France, Georgia, Greece, Croatia, Hungary, Ireland, Italy, Lithuania, Luxembourg, Latvia, Moldova, Northern Macedonia, Netherlands, Norway, Poland, Portugal, Romania, Russia, Slovenia, Slovakia, Ukraine, United Kingdom, Serbia, Montenegro

Regional data in 1930, without estimated observations, include the following countries (in current borders): Albania, Armenia, Austria, Azerbaijan, Bosnia-Herzegovina, Belgium, Bulgaria, Belarus, Czech Republic, Estonia, Spain, France, Greece, Croatia, Hungary, Italy, Lithuania, Latvia, Moldova, Northern Macedonia, Poland, Portugal, Romania, Russia, Slovakia, Ukraine, Serbia, Montenegro

Regional data in 1930, with estimated observations, include the following countries (in current borders): Albania, Armenia, Austria, Azerbaijan, Bosnia-Herzegovina, Belgium, Bulgaria, Belarus, Switzerland, Czech Republic, Germany, Denmark, Estonia, Spain, France, Greece, Croatia, Hungary, Ireland, Italy, Lithuania, Latvia, Moldova, Northern Macedonia, Netherlands, Norway, Poland, Portugal, Romania, Russia, Sweden, Slovakia, Ukraine, United Kingdom, Serbia, Montenegro

References

A’Hearn, B., Crayen, D., & Baten, J. (2009). Quantifying quantitative literacy: Age heaping and the history of human capital. Journal of Economic History, 68(3), 783–808.
Annuaire Statistique et Historique Belge. (1857). Quatrième année. Aug. Schnée et Cie./A.D. Marcus.
Badinger, H., & Tondl, G. (2003). Trade, human capital and innovation: The engines of European regional growth in the 1990s. In B. Fingleton (Ed.), European regional growth (pp. 215–239). Springer.
Bairoch, P., Batou, J., & Chèvre, P. (1988). La population des villes européennes : 800-1850 : banque de données et analyse sommaire des résultats. Droz.
Barslund, M., Busse, M., & Schwarzwälder, J. (2015). Labour mobility in Europe: An untapped resource? CEPS Policy Briefs (327).
Baten, J., & Hippe, R. (2018). Geography, land inequality and regional numeracy in Europe in historical perspective. Journal of Economic Growth, 23(1), 79–109.
Brakman, S., Garretsen, H., & Schramm, M. (2004). The spatial distribution of wages: Estimating the Helpman–Hanson model for Germany. Journal of Regional Science, 44, 437–466.
Brakman, S., Garretsen, H., & Van Marrewijk, C. (2009). Economic geography within and between European nations: The role of market potential and density across space and time. Journal of Regional Science, 49, 777–800.
Braudel, F. (1979). Civilisation matérielle, économie et capitalisme, XVe-XVIIIe siècle. Le temps du monde. Armand Colin.
Breinlich, H. (2006). The spatial income structure in the European Union: What role for economic geography? Journal of Economic Geography, 6, 593–617.
Broadberry, S. (2009). Agriculture and structural change: Lessons from the UK experience in an international context. In P. Lains & V. Pinilla (Eds.), Agriculture and economic development in Europe since 1870. Routledge.
Brunet, R. (2002). Lignes de force de l’espace européen. Mappe Monde, 66(2), 14–19.
Cinnirella, F., & Hornung, E. (2016). Landownership concentration and the expansion of education. Journal of Development Economics, 121, 135–152.
Clark, C., Wilson, F., & Bradley, J. (1969). Industrial location and economic potential in Western Europe. Regional Studies, 3, 197–212.


Combes, P. P., Duranton, G., & Gobillon, L. (2008). Le rôle des marchés locaux du travail dans la concentration spatiale des activités économiques. Revue de l’OFCE 2008/1, 104, 141–177.
Combes, P. P., Duranton, G., & Overman, H. G. (2005). Agglomeration and the adjustment of the spatial economy. Papers in Regional Science, 84, 311–349.
Crayen, D., & Baten, J. (2010). Global trends in numeracy 1820-1949 and its implications for long-run growth. Explorations in Economic History, 47, 82–99.
Crozet, M. (2004). Do migrants follow market potentials? An estimation of a new economic geography model. Journal of Economic Geography, 4, 439–458.
De Vries, J. (1984). European urbanization, 1500-1800. Taylor and Francis.
Duranton, G. (2011). California dreamin’: The feeble case for cluster policies. Review of Economic Analysis, 3(1), 3–45.
Eurostat. (2009). Eurostat regional yearbook 2009. Publications Office of the European Union, Online. Retrieved August 15, 2014, from http://epp.eurostat.ec.europa.eu/cache/ITY_OFFPUB/KS-HA-09-001-01/EN/KS-HA-09-001-01-EN.PDF
Faíña, J. A., & López-Rodríguez, J. (2006). Market access and human capital accumulation: The European Union case. Applied Economics Letters, 13, 563–567.
Faíña, J. A., López-Rodríguez, J., & Montes-Solla, P. (2016). Cohesion policy and transportation, Chapter 21. In S. Piattoni & L. Polverari (Eds.), Handbook on cohesion policy in the EU. Edward Elgar Publishing, Elgar Online. https://doi.org/10.4337/9781784715670
Farole, T., Rodríguez-Pose, A., & Storper, M. (2011). Cohesion policy in the European Union: Growth, geography, institutions. JCMS: Journal of Common Market Studies, 49(5), 1089–1111.
Felice, E. (2012). Regional convergence in Italy, 1891–2011: Testing human and social capital. Cliometrica, 6(3), 267–306.
Fingleton, B. (2011). The empirical performance of the NEG with reference to small areas. Journal of Economic Geography, 11(2), 267–279.
Flisi, S., Dinis da Costa, P., & Soto-Calvo, E. (2015). Learning mobility. European Commission Joint Research Centre, EUR 27695. https://doi.org/10.2760/590538
Fujita, M., Krugman, P., & Venables, A. (1999). The spatial economy: Cities, regions and international trade. MIT Press.
Galor, O., Moav, O., & Vollrath, D. (2009). Inequality in landownership, the emergence of human-capital promoting institutions, and the great divergence. Review of Economic Studies, 76, 143–179.
Hanson, G. H. (2005). Market potential, increasing returns and geographic concentration. Journal of International Economics, 67, 1–24.
Hanushek, E. A., & Woessmann, L. (2015). The knowledge capital of nations, CESifo book series. MIT Press.
Harper, D. A. (2008). A bioeconomic study of numeracy and economic calculation. Journal of Bioeconomics, 10, 101–126.
Harris, C. (1954). The market as a factor in the localization of industry in the United States. Annals of the Association of American Geographers, 44, 315–348.
Head, K., & Mayer, T. (2004). The empirics of agglomeration and trade. In J. V. Henderson & J.-F. Thisse (Eds.), Handbook of regional and urban economics (Vol. 4, pp. 2609–2669). Elsevier.
Head, K., & Mayer, T. (2006). Regional wage and employment responses to market potential in the EU. Regional Science and Urban Economics, 36, 573–594.
Hippe, R. (2012a). Spatial clustering of human capital in the European regions. Économies et Sociétés, 46(7), 1077–1104.
Hippe, R. (2012b). How to measure human capital? The relationship between numeracy and literacy. Économies et Sociétés, AF, 45(8), 1527–1554.
Hippe, R. (2013). Human capital formation in Europe at the regional level – Implications for economic growth. PhD thesis, University of Strasbourg, BETA/CNRS & University of Tuebingen.
Hippe, R. (2020). Human capital in European regions since the French revolution: Lessons for economic and education policies. Revue d’Économie Politique, 130(1), 27–50.


Hippe, R., & Baten, J. (2012). Regional inequality in human capital formation in Europe, 1790–1880. Scandinavian Economic History Review, 60(3), 254–289.
Keeble, D., Owens, P. L., & Thompson, C. (1982). Regional accessibility and economic potential in the European Community. Regional Studies, 16, 419–432.
Kirk, D. (1946). Europe’s population in the interwar years. Princeton University Press.
Krugman, P. (1991). Increasing returns and economic geography. Journal of Political Economy, 99(3), 483–499.
López-Rodríguez, J., Faíña, J. A., & López-Rodríguez, J. (2005). New economic geography and educational attainment levels in the European Union. International Business & Economics Research Journal, 4(8), 63–74.
López-Rodríguez, J., Faíña, J. A., & López-Rodríguez, J. (2007). Human capital accumulation and geography: Empirical evidence from the European Union. Regional Studies, 41(2), 217–234.
Martí-Henneberg, J. (2005). Empirical evidence of regional population concentration in Europe, 1870-2000. Population, Space and Place, 11, 269–281.
Mion, G. (2004). Spatial externalities and empirical analysis: The case of Italy. Journal of Urban Economics, 56, 97–118.
Moriconi-Ebrard, F. (1994). Geopolis, pour comparer les villes du monde. Economica-Anthropos, Coll. Villes.
Niebuhr, A. (2006). Market access and regional disparities. Annals of Regional Science, 40, 313–334.
Núñez, C. E. (1992). La fuente de la riqueza. Educación y desarrollo económico en la España contemporánea. Alianza.
Ottaviano, G. I. P., & Pinelli, D. (2006). Market potential and productivity: Evidence from Finnish regions. Regional Science and Urban Economics, 36, 636–657.
Redding, S. (1996). The low-skill, low-quality trap. Economic Journal, 106, 458–470.
Redding, S., & Schott, P. (2003). Distance, skill deepening and development: Will peripheral countries ever get rich? Journal of Development Economics, 72(2), 515–541.
Redding, S., & Venables, A. (2004). Economic geography and international inequality. Journal of International Economics, 62, 53–82.
Rodríguez-Pose, A., & Tselios, V. (2011). Mapping the European regional educational distribution. European Urban and Regional Studies, 18(4), 358–374.
Roos, M. (2001). Wages and market potential in Germany. Jahrbuch für Regionalwissenschaft, 21, 171–195.
Sterlacchini, A. (2008). R&D, higher education and regional growth: Uneven linkages among European regions. Research Policy, 37(6), 1096–1107.

Chapter 5

The Long-Run Impact of Human Capital on Innovation and Economic Growth in the Regions of Europe

5.1 Introduction

Economic development is one of the predominant research areas in economics. Many theories have been developed to better understand the causes and consequences of economic development and growth. For example, some of the most important fundamental factors for long-run growth are the quality of institutions (e.g. Acemoglu et al., 2005; North, 1981) and geography and naturally given geographical conditions (e.g. Diamond, 1997; Engerman & Sokoloff, 2000). Proximate causes of growth include income inequality (e.g. Alesina & Rodrik, 1994; Persson & Tabellini, 1994), land inequality (e.g. Galor et al., 2009) and human capital accumulation (Galor & Moav, 2002; Glaeser et al., 2004). For instance, an increase in human capital may induce a rise in the number of innovative entrepreneurs and products, thus indirectly spurring economic development through the channel of innovation. In fact, the crucial role of innovation for economic development and growth has been underlined by a large literature (e.g. Lucas, 1988; Romer, 1986; Solow, 1956). Nevertheless, the long-run implications of human capital for innovation and economic development need further research because this issue has been touched upon in only a few contexts (e.g. Baten & van Zanden, 2008). Therefore, the question remains whether preexisting human capital is important for long-run development. Thus far, most studies in this area take only a national perspective by focusing on countries. However, regional differences in human capital may be at least as important as national ones (e.g. Cipolla, 1969). Using regions makes it possible to overcome the inherent problems of cross-country analyses and may explain why some regions are richer than others. In particular, human capital may play a crucial

This chapter was first published in slightly modified form as Diebolt, C., and Hippe, R. The long-run impact of human capital on innovation and economic development in the regions of Europe, Applied Economics, 2019, 51 (5): 542–563. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 C. Diebolt, R. Hippe, Human Capital and Regional Development in Europe, Frontiers in Economic History, https://doi.org/10.1007/978-3-030-90858-4_5


role in regional development. In fact, in their recent seminal paper, Gennaioli et al. show the “paramount importance of human capital in accounting for regional differences in development” (Gennaioli et al., 2013, p. 105). But is the effect of human capital also persistent? Their analysis is limited to current data and cannot evaluate any longer-run influence of human capital on regional outcomes. We aim to assess this aspect in this paper. Therefore, we analyse the long-run impact of human capital on innovation and economic development at the regional level in Europe. To our knowledge, this is the first paper that takes this long-run regional approach at the European scale, contributing a new spatiotemporal dimension to the existing literature. Combining a range of databases for the first time, we employ a new and large dataset in our analysis. First, this dataset includes data on human capital levels between 1850 and 2010 for many European regions and countries. Second, it comprises relevant current data on innovation and economic development. More specifically, we measure current innovation by patents per million inhabitants and the level of economic development by GDP per capita. Finally, we add historical socioeconomic control variables that stem from a number of different sources. These historical control variables include the share of agricultural employment, population density, infant mortality, fertility and marital status. We also include dummy variables for former communist countries in Eastern Europe and control for capital regions. Regions are coded according to the European Union’s NUTS classification at every point in time. In other words, we adapted the historical European regions to the current NUTS system in order to compare the historical data directly with the current data. In total, we have up to 265 NUTS 2 (or corresponding) regions in our database at a point in time.
In this way, we are able to analyse the relationship between human capital, innovation and economic development in a regional and long-run perspective. More specifically, using standard OLS regression models, we regress current regional innovation and economic prosperity measures on a range of historical variables at different points in time. Our baseline specification considers historical explanatory variables in 1930, the year in which we have the maximum number of variables. The results show that historical human capital is a significant determinant of today’s regional levels of innovation and economic development in Europe. In particular, literacy has a significant influence on current patent applications per capita and GDP per capita. We employ a number of specifications to check the robustness of our results. Among others, supplementary results for 1850 (using age heaping-based numeracy), 1900 and 1960 (using literacy) confirm our findings. Therefore, our results suggest that historical human capital has important persisting effects on economic development. The paper is structured as follows. First, we present the relevant literature on the relationship between human capital, innovation and economic development in Europe. Then, we discuss the employed methodology, the underlying data and our econometric strategy. Finally, we show the current relationship between human capital, innovation and economic development and analyse the long-run relationship


between historical human capital, current innovation and economic development. The last section concludes.

5.2 Literature

Human capital may directly affect economic development and growth or indirectly, in particular through the generation of technology. According to Acemoglu and Autor (2012), there are several channels through which human capital may affect technological progress. Firstly, they stress that the individuals with the highest talents may contribute to technological progress by the use of their human capital if they have the necessary access to educational facilities. These individuals have probably the most important impact on technological progress. Secondly, the workforce in more general terms may affect technology, first, due to the externalities derived from human capital and, second, because human capital alters and increases the incentives to invest more in technological progress. For example, it is possible that a technology is only sufficiently profitable if there are enough workers who have the necessary skills. Finally, technological progress may be influenced by the workforce’s mix of skills and human capital. In general, the importance of human capital was already considered in early works by Smith and Marshall (see Demeulemeester & Diebolt, 2011; Hippe, 2014). However, it took much longer for human capital to emerge as a key factor for economic growth. In fact, the most important contributions were developed from the middle of the twentieth century onwards. In particular, Becker (e.g. see Becker, 1964) is widely acknowledged as a founder of human capital theory, stressing that human capital increases the productivity of workers. Similarly, Arrow (1962) highlights the effect of experience on technical change. In addition, Nelson and Phelps (1966) emphasise that human capital is also important for implementing and adopting new technologies. Later on, Schultz (1975) argues that workers are better able to cope with changes in the economic structure and handle new technologies if they have more human capital. 
New theoretical advances emerged around the beginning of the 1990s. An extension of the original Solow growth model (i.e. the human capital-augmented Solow model) was presented by Mankiw et al. (1992). It explicitly includes human capital as a factor in the Cobb-Douglas production function. Another class of growth models, the endogenous growth models, was initiated by Romer (1986) and Lucas (1988). The former focuses on technological change and the latter on human capital accumulation. The aim is to endogenise within the model the various factors which may lead to economic growth. Overall, these models consider human capital to be an important driver of economic growth. They have also stimulated further research, generating another branch of Schumpeterian growth models (Aghion & Howitt, 1992, 1998, 2006) that formalise the idea of creative destruction through innovation.
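For orientation, the production function of the human capital-augmented Solow model can be written as below. The notation is the usual textbook one for Mankiw et al. (1992) and is not taken from this chapter: Y is output, K physical capital, H human capital, L labour, A labour-augmenting technology, and α and β are the output elasticities of physical and human capital.

```latex
\[
  Y(t) = K(t)^{\alpha}\, H(t)^{\beta}\, \bigl(A(t)\,L(t)\bigr)^{1-\alpha-\beta},
  \qquad \alpha > 0,\ \beta > 0,\ \alpha + \beta < 1 .
\]
```

Setting β = 0 removes human capital and gives back the original Solow specification, which is why the model is described as an extension of it.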


Finally, the newest contributions in the area of human capital theory and economic growth are the unified growth models (e.g. Galor, 2005, 2012; Galor & Moav, 2002; Galor & Weil, 2000). Their aim is to explain economic development in the (very) long run. In these models, human capital is assigned a crucial role in generating economic growth. All in all, these different theories show that human capital is an important driver of economic development and growth. Still, there has been some controversy about this issue over the last decades. In fact, Demeulemeester and Diebolt (2011) refer to several alternating waves of optimism and scepticism about the relevance of human capital for growth since the Second World War. The contributions by authors such as Solow (1956), Mincer (1958), Schultz (1961) and Becker (1964) led to the consensus in the 1950s and 1960s that education makes an important contribution to economic growth. In contrast, the 1970s were marked more by scepticism in a time of economic downturn. The important new theoretical contributions made around 1990 (Lucas, 1988; Romer, 1990) reinvigorated the case for human capital. These optimistic ideas were supported by different empirical studies (e.g. Barro, 1991; Barro & Lee, 1993; Mankiw et al., 1992), but more critical voices also appeared, such as Benhabib and Spiegel (1994) and Pritchett (2001). Measurement error may account for some of these critical results (Krueger & Lindahl, 2001). Thus, Sianesi and Van Reenen conclude in their 2003 literature survey that “as a whole we feel confident that there are important effects of education on growth” (Sianesi & Van Reenen, 2003, p. 197). In addition, more recent studies, e.g. by De La Fuente and Doménech (2006), Cohen and Soto (2007), Goldin and Katz (2008) and Ciccone and Papaioannou (2009), show the crucial impact of human capital on growth.
The key contribution of cognitive skills (including numeracy and literacy skills, whose historical counterparts we will use in our later analysis) is further highlighted by Hanushek and Woessmann (e.g. 2011, 2012, 2015, 2016). The authors have put a particular focus on two points. First, they argue that it is not the quantity of education that matters but its quality. Most of the previous work in the human capital literature, beginning with Mincer, used the length of education, measured by average years of schooling or the attainment of specific educational levels, as the indicator of human capital. This approach was then also adopted by international organisations, such as the UN and UNESCO, whose Millennium Goals focused on this quantity of education. The appropriate policy would then be to increase the number of years at school or to have more university graduates. However, Hanushek and Woessmann (2015) argue that the appropriate measure of human capital is not the length of studies but what is learnt at school or university. That is, what matters are specific skills. Many studies in the past were not able to measure skills because such data were not available. In consequence, instead of educational attainment measures, it appears better to use the results of (international) achievement tests. The number of these tests has grown; they are often administered by the OECD or the IEA, the most famous being PISA, PIAAC, TIMSS and PIRLS. According to Hanushek and Woessmann (2016), economic growth rates


are much more closely related to these achievement scores than to the traditional attainment data. More specifically, adding achievement scores to a growth model explains a much larger share of the variation in growth rates than educational attainment does (i.e. rising from 33% to 73%; Hanushek & Kimko, 2000). Therefore, Hanushek and Woessmann recommend from both theoretical and empirical perspectives the use of achievement data, which they call the “knowledge capital of nations” (Hanushek & Woessmann, 2015). Second, this “knowledge capital” consists of cognitive skills, which are assumed to be measured adequately by international test scores in mathematics and science. From a policy perspective, schools are the most important driver of cognitive skills, but other factors may also play a relevant role in their development. While other authors show the relevance of noncognitive skills, Hanushek and Woessmann argue that cognitive skills are what matters most. In particular, measures of noncognitive skills are often not available, or there is no consensus on them (Hanushek & Woessmann, 2008). In addition, concentrating on cognitive skills has the advantage, among others, that they are closely related to schooling and, in turn, to later labour market outcomes. On the other hand, test scores certainly do not measure all the relevant skills which may affect later labour market success. In addition, there are different possible sources of measurement error (Hanushek & Woessmann, 2008). Therefore, while test scores are not perfect proxies for knowledge capital, the argument holds that increasing this knowledge capital is important for increasing growth. Various articles by Hanushek and co-authors also show that causation goes from education to growth and not vice versa (Hanushek & Kimko, 2000; Hanushek & Woessmann, 2012).
In particular, these authors consider instrumental variable estimation, intertemporal analyses using growth rates and difference-in-differences methods, which all confirm the suggested direction of causation. In consequence, attaining higher levels of educational achievement can potentially translate into spectacular increases in GDP per capita in the future (Hanushek & Woessmann, 2011). There is also a large literature on the impact of human capital and innovation on economic development and growth in the European regions (e.g. Cuaresma et al., 2012; Fagerberg et al., 1997; Rodríguez-Pose & Crescenzi, 2008; Sterlacchini, 2008). For example, Badinger and Tondl (2003) investigate whether human capital and innovation (as measured by patent applications) have a significant impact on the growth rates of gross value added per capita in 128 regions between 1993 and 2000. Both the relative patent applications and higher education variables are shown to have a significant impact. However, medium levels of education are not significant, which highlights that economic growth in Europe’s “knowledge-driven” economies is boosted by the highest form of educational attainment. Moreover, Sterlacchini (2008) finds that human capital (in the form of higher education) and a region’s knowledge base have a significant and positive impact on economic growth in 12 EU15 countries between 1995 and 2002. Cuaresma et al. (2012) use a dataset including 255 EU regions to analyse which of their 48 potential determinants significantly explain economic growth


between 1995 and 2005. Two of their most important results are that capital regions grow faster than other regions and that human capital (i.e. higher education) is a robust determinant of economic growth. Finally, Gennaioli et al. (2013) construct a database of 1569 regions from more than 100 countries to disentangle the determinants of regional development. Considering a broad range of geographical, institutional, cultural and human capital variables, they find that human capital is the single most important factor for regional development. Thus, these different studies show that human capital is a crucial determinant of economic growth and economic development in the European regions and in the world today. But what do we know about its long-term impact in the world in general and in Europe in particular? There have been some studies which shed light on the question of whether historical human capital and technology matter for today’s economies. For instance, Comin et al. (2010) take a long-run perspective and show that there is a strong relationship between technology in 1500 AD and current GDP per capita as well as technology adoption in the world. Madsen (2008, 2010) shows that the growth effects of human capital are important at the country level in OECD countries over the last 100 or so years, supporting the predictions of Schumpeterian growth models. These findings suggest that historical factors may be important for explaining current or recent economic levels. We advance this line of research by focusing on regions instead of countries in a European perspective. Using regions instead of countries considerably sharpens the picture. Countries may be composed of regions which do not share a common linguistic, ethnic or cultural identity. Regional differences may thus be very large. However, this information is lost in country comparisons. Aggregated country averages may hide the fundamental forces operating at more disaggregated levels.
For example, cross-country analyses cannot disentangle national institutional effects on economic outcomes. Therefore, we analyse whether there are persisting long-run effects of human capital on innovation and economic development, using regional historical human capital and current innovation and economic development data.

5.3 Methodology and Data

Human capital, innovation and economic development are rather large and vague ideas whose measurement has to be specified in greater detail. The human capital data used in this study come from different sources. First, we employ the new and large database created by Diebolt and Hippe (2017), which traces human capital between 1850 and 2010. From this database, we use the years 1850, 1900, 1930 and 1960 to follow the evolution of human capital. Human capital is proxied by numeracy (ABCC) in 1850 and by literacy (ability to read and write) in 1900, 1930 and 1960. Both numeracy and literacy indicators may be considered appropriate for their respective time period. Before 1900, literacy data are not available for many European countries. Even in 1900, a range of countries do not consider literacy in


their censuses. This is the case for Scandinavian countries such as Denmark or Sweden, but also for Germany, Switzerland or the Netherlands. In general, these are countries where basic reading and writing skills can be considered almost universal. They had their own specific reasons to refrain from asking this question in the census. For example, the Swiss administration considered that a sufficient literacy level had already been attained in 1860, as the corresponding 1860 census documents highlight (Statistisches Bureau, 1862). According to the census materials, military data had shown that 93% of recruits were able to read and write in the Bern region and even 100% of recruits were literate in the Solothurn region as early as the middle of the nineteenth century. Similarly, the Netherlands already had very high literacy levels if one considers recruitment data: only 15% of recruits were illiterate (not or only unsatisfactorily able to read and write) in 1857/1858 (Statistisches Bureau, 1862). These examples highlight the very high levels of literacy which existed in (probably all of) the countries where literacy was not asked about in the census at the end of the nineteenth century. For this reason, it appears more suitable to use another indicator for the earliest point in time. Numeracy as proxied by the age heaping method is the appropriate choice because, first, it is closely correlated with literacy (Hippe, 2012a). Second, numeracy is, like the later literacy data, directly derived from censuses. Third, it refers broadly to the same population (the entire population, excluding certain age groups). This allows a better comparison of both indicators. Military data on recruits would not cover the major part of the population but only a small, selected group: men in military service, of rather young age and within a narrowly defined age range. Moreover, regional data are often not available.
In consequence, numeracy is the appropriate indicator which is also available for almost all European regions around 1850. Numeracy is measured by the age heaping method which has been used in an increasing number of recent publications (A’Hearn et al., 2009; Baten & Hippe, 2018; Crayen & Baten, 2010; Diebolt et al., 2017; Hippe, 2012b; Hippe & Baten, 2012; Manzel & Baten, 2009). The method takes advantage of the fact that in historical censuses, there is a heaping phenomenon on ages particularly ending in 0 and 5. One can show that individuals were not able to calculate their own age, so they did not report their exact age but only a rounded age. The deviation from the ideal age distribution (where all ages are represented by the same share) can be employed to create an index measuring numeracy. This index has originally been the Whipple index (WI) but has recently been replaced by the ABCC index (see A’Hearn et al., 2009). This index has the same value range as literacy (0–100 percentage points or simply points), which makes comparisons much easier. Therefore, we employ the ABCC index also in this study. It is defined as


Fig. 5.1 Illiteracy in 1930. Source: Own calculations of illiteracy (in %), based on Kirk’s (1946) data

\[
  \mathrm{ABCC}_{jt} = 125 - 125 \times \left( \sum_{i=5}^{14} n_{5i,jt} \Big/ \sum_{i=23}^{72} n_{i,jt} \right), \tag{5.1}
\]
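As a minimal sketch, Eq. (5.1) can be evaluated from a table of reported ages. The function name `abcc_index`, the input format (a mapping from reported age to head count) and the two toy age distributions are assumptions made for illustration; they are not taken from the data processing behind this chapter.

```python
from typing import Mapping


def abcc_index(counts: Mapping[int, int]) -> float:
    """Return the ABCC index (0-100) for one region and census.

    `counts` maps a reported age (in years) to the number of individuals
    reporting it. Only ages 23-72 enter the calculation, as in Eq. (5.1).
    """
    # Denominator: all individuals aged 23 to 72.
    total = sum(counts.get(age, 0) for age in range(23, 73))
    if total == 0:
        raise ValueError("no individuals aged 23-72 in the data")
    # Numerator: individuals reporting ages 25, 30, ..., 70 (i.e. 5*i for i = 5..14).
    heaped = sum(counts.get(5 * i, 0) for i in range(5, 15))
    return 125.0 - 125.0 * heaped / total


# Hypothetical toy data: every age equally frequent (no heaping at all).
uniform = {age: 100 for age in range(23, 73)}
# Hypothetical toy data: everybody reports an age ending in 0 or 5.
all_rounded = {age: (500 if age % 5 == 0 else 0) for age in range(23, 73)}

print(abcc_index(uniform))      # no heaping: index at its upper bound
print(abcc_index(all_rounded))  # complete heaping: index at its lower bound
```

The two toy cases mark the ends of the scale. With a uniform age distribution one fifth of the individuals report an age that is a multiple of five, so the index is 100. With complete heaping the ratio is one, so the index is 0.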

Here, i indexes age in years (so that the numerator counts individuals reporting an age that is a multiple of five between 25 and 70), j is a region, t is the point in time (with t = 1850), and n is the number of individuals. Second, literacy was the standard education variable around the turn of the twentieth century and during the first half of the twentieth century in many European countries. Illiteracy had to be eradicated: this was a common tenor in all European countries. Success, however, differed considerably not only between these countries but also within them. Figure 5.1 illustrates this fact quite clearly. While some countries, such as the Scandinavian countries (Denmark, Iceland, Norway, Sweden) but also the United Kingdom, Ireland, the Netherlands and Germany, had created an almost completely literate society by 1930, other countries still struggled. For example, parts of the North of Italy were basically universally literate by this time, while only half of the population could read and write in the South. A similar observation can be made for the then-existing Yugoslavia, in which today’s Slovenia had a literacy rate above 90%, while in many other parts of the country only a minority of the population could read and write. Indeed, the Soviet Union also faced


huge regional educational inequalities, with the St. Petersburg and Moscow regions at the top and a number of Caucasus regions at the bottom of the literacy ladder. Thus, a completely literate population had not been achieved in many European countries in 1900, and even in 1960, illiteracy was still more or less common in many of them. This fact supports our choice of literacy as the human capital indicator for this period. After 1960, one may presume that the ability to read and write is more or less attained by the entire population, so other education variables have to be used. We define literacy as

$$
\mathrm{Literacy}_{jt} = \sum_{i=10}^{N} rw_{i,jt} \Big/ \sum_{i=10}^{N} n_{i,jt}, \tag{5.2}
$$

where rw is the number of individuals able to read and write, N is the maximum age in years, and t is the point in time (with t = 1900, 1930, 1960). The age definition (10 years and older) is the standard contemporary definition.

Furthermore, innovation is difficult to measure statistically. One standard way is to take the number of patent applications or grants (e.g. Acs et al., 2002; Diebolt & Pellier, 2009, 2012). In addition to patent applications, other variables that are used to measure innovation include investments in R&D (e.g. Cohen & Levinthal, 1989), changes in productivity (David, 1990; Von Tunzelmann, 2000), bibliometrics (Andersen, 2001) and data on (international) expositions and fairs (Moser, 2005). Patent statistics have certain drawbacks; for example, organisational changes or know-how cannot be patented, and not all patented products become innovations (Griliches, 1990). Nevertheless, patents are generally considered to be the best indicator (e.g. Andersen, 2001; Cantwell, 1989) and are most frequently employed (Diebolt & Pellier, 2009), in particular for the past. Therefore, we use patent applications per million inhabitants to the European Patent Office (EPO) as our indicator of innovation. The regional data come from Eurostat (2014). Lastly, the level of economic development is measured in a standard way by GDP per capita (in PPS) as presented by Eurostat (2014).

We use scatter plots and regression models to analyse the relationship between regional human capital, innovation and the level of economic development. For the influence of historical human capital on current innovation and economic development, we employ standard OLS regression frameworks, which are formulated in the following way:

$$
\ln(\mathrm{Patents}/c)_j = \beta_0 + \beta_1 H_j + X_j + \varepsilon_j \tag{5.4}
$$

$$
\ln(\mathrm{GDP}/c)_j = \beta_0 + \beta_1 H_j + X_j + \varepsilon_j, \tag{5.5}
$$

where ln(Patents/c) is the number of patents per million inhabitants (in logarithmic terms), ln(GDP/c) is GDP per capita (in PPS and in logarithmic terms), H is the


human capital indicator, X are other explanatory variables, j is a region, and ε are the unexplained residuals. X is composed of different variables which may have an influence on economic development. Our baseline specification considers X (and H) in 1930 because we have the maximum number of variables for this point in time. Thus, in 1930, the explanatory variables are total fertility, marital status, population density, the share of individuals not dependent on agriculture, infant mortality, a dummy for capital regions, a dummy for the newer EU regions and country dummies.

There is a large literature showing that fertility can have an important effect on growth (e.g. Barro & Becker, 1989; Becker, 1981; Becker et al., 1990; Galor, 2012; Galor & Weil, 1996, 2000; see also Hippe & Perrin, 2017). According to the quantity-quality trade-off theory, parents face a trade-off between the quantity (number) and the quality (education) of their children. Whereas the quantity of children prevailed during most of human history, parents began to prioritise child quality in the course of development. The increased investment in human capital spurred technological progress and economic growth. Ultimately, more child quality meant fewer children, which lowered fertility rates and brought about the demographic transition. Therefore, the fertility transition was an important factor in the transition from the post-Malthusian era to the modern growth regime (see also Galor & Weil, 2000). During our historical period, the demographic transition had already started in some regions, whereas it was still to begin in others. Fertility is therefore a relevant factor that we include in our analysis. We use total fertility data provided by the famous Princeton European Fertility Project, which defines total fertility as “a measure of the fertility of all women in the population” (Coale & Treadway, 1986, p. 154).
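As an illustration of how a specification such as Eq. (5.4) with country dummies might be estimated, the following minimal sketch implements OLS with heteroskedasticity-robust standard errors in plain Python. It is not the estimation code used for this chapter: the function names (`matmul`, `invert`, `ols_robust`), the HC1 variant of the robust estimator and the eight-region dataset are all assumptions made for the example, and the data are constructed so that the true coefficients are known.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear regressors?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_robust(y, regressors, countries):
    """OLS of y on a constant, the regressors and country dummies.

    Returns (coefficients, HC1 standard errors), ordered as
    [constant, regressors..., dummies for all but the first country label].
    The choice of HC1 is an assumption of this sketch.
    """
    labels = sorted(set(countries))[1:]  # first label is the reference country
    X = [[1.0] + list(row) + [float(c == lab) for lab in labels]
         for row, c in zip(regressors, countries)]
    n, k = len(X), len(X[0])
    Xt = [list(col) for col in zip(*X)]
    bread = invert(matmul(Xt, X))
    beta = [b[0] for b in matmul(bread, matmul(Xt, [[v] for v in y]))]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for yi, row in zip(y, X)]
    # "Meat" of the sandwich estimator: sum over regions of e_j^2 * x_j x_j'
    meat = [[sum(e * e * row[a] * row[b] for e, row in zip(resid, X))
             for b in range(k)] for a in range(k)]
    cov = matmul(matmul(bread, meat), bread)
    scale = n / (n - k)  # HC1 small-sample correction
    se = [math.sqrt(scale * cov[i][i]) for i in range(k)]
    return beta, se


# Hypothetical data: ln(patents) = 1 + 4 * literacy + 2 * [country B] + noise,
# where the noise is chosen to be orthogonal to all regressors.
literacy = [0.2, 0.4, 0.6, 0.8, 0.3, 0.5, 0.7, 0.9]
country = ["A", "A", "A", "A", "B", "B", "B", "B"]
log_patents = [1.9, 2.5, 3.3, 4.3, 4.3, 4.9, 5.7, 6.7]

beta, se = ols_robust(log_patents, [[h] for h in literacy], country)
print([round(b, 6) for b in beta])  # close to [1.0, 4.0, 2.0] by construction
print([round(s, 4) for s in se])
```

Robust p-values of the kind reported in the tables would follow from the ratio of each coefficient to its standard error; in applied work, an established statistics package would normally be preferred over a hand-written solver.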
Moreover, marital status comes from the same source and is “the ratio of the number of births produced by married women in [. . .] a population to the number that would be produced if all women were married” (Coale & Treadway, 1986, p. 154). In other words, this measure represents “the proportions married at each age” (Watkins, 1986, p. 315) and can thus be used as a proxy for nuptiality. There have been important nuptiality differences in Europe in the past, as has most famously been put forward by Hajnal (1965). Hajnal pointed out that western Europe was characterised in the past by a specific and unique European marriage pattern (EMP). The EMP describes the fact that there were much higher average ages at marriage in western Europe than in eastern Europe (and the rest of the world). Thus, differences in the average age at marriage may also explain differences in economic development (e.g. De Moor & van Zanden, 2010). For example, Foreman-Peck (2011) emphasises that this specific demographic pattern was an important force directly contributing to the development advantage of Western Europe by increasing innovation and productivity. Thus, we also control for nuptiality in our analysis. In addition, population density is measured (in logarithmic terms) as the number of individuals per square kilometre. More generally, total population positively affects population growth and technological change in a very long-run perspective (e.g. Kremer, 1993). Population density, as Klasen and Nestmann (2006, p. 623) point out, “generates the linkages, the infrastructure, the demand and the effective

Fig. 5.2 Non-agricultural employment, 1930 and GDP/c, 2008
[Scatter plot of NUTS 2 regions, labelled by region code. Horizontal axis: Not dep. on agr. 1930, from 0 to 1; vertical axis: ln(GDP/c 2008), from 9 to 11.5.]

market size for technological innovations”. In this way, it may foster innovations and economic development in the long run. For this reason, population density has been a significant explanatory variable in empirical growth regressions in cross-country settings (e.g. Kelley & Schmidt, 1995) and in some European countries (e.g. Ciccone, 2002). Finally, its importance for technological progress and ultimately growth has been underlined in long-run growth models (e.g. Galor & Weil, 2000). The data for 1930 stem from Kirk (1946). Population density has been derived for the other years from raster data provided by Klein Goldewijk et al. (2010) and Klein Goldewijk et al. (2011).¹

The next variable is the share of the total population which is not dependent on agriculture. This share roughly proxies the regional economic development and industrialisation in 1930. Shares of agriculture or industry have been used in different historical publications where GDP per capita estimates are not available (e.g. Becker & Woessmann, 2009; Good, 1994; Hatton & Williamson, 1994). Indeed, although we cannot show the relationship for historical GDP per capita estimates due to lack of data, Fig. 5.2 shows that there is a relationship between this historical share and current GDP per capita. Some outliers are apparent. For

1 To check whether these estimations are sufficiently reliable, we also correlated the derived data for 1930 with those calculated by Kirk (1946) in 1930. They are correlated to 91%, allowing us to use them in our subsequent analyses.


example, some regions outperform what could be expected from the historical data, such as Luxembourg (LU00, whose GDP per capita has been boosted, among other things, by financial services) and the Åland Islands (FI20; a region with a very small population whose economy is driven by international shipping). Still, the general pattern clearly holds. For instance, some Bulgarian and Romanian regions (such as BG31 Severozapaden and RO41 Sud-Vest Oltenia) were among the regions most dependent on agriculture in 1930, and they have among the lowest GDP per capita values in 2008. For this reason, we argue that we can reliably proxy for historical economic development with this variable. Given that we are interested in the correlation of historical variables with current economic development, it appears essential to control for the initial historical level of industrialisation. The data come from Kirk (1946).

In addition, infant mortality represents a variable related to health. According to Kalemli-Ozcan (2002), low mortality may promote economic growth through different channels such as population growth and education. When parents face high uncertainty about the survival of their children, they will demand a higher number of children. When the risk of child death is reduced, parents may increasingly replace child quantity by child quality. This decreases fertility and raises human capital, leading to sustained long-run economic growth. Kirk (1946) provides these data.

Moreover, the capital region dummy has been introduced because capital regions often have specific characteristics due to their administrative functions. The dummy for the newer EU regions captures the fact that these countries joined the EU later on and have had different historical and economic experiences in the past, having mostly been part of the communist bloc before the fall of the Soviet Union.
More specifically, these regions come from the newest 10 EU members (Bulgaria, Cyprus, Czech Republic, Estonia, Hungary, Malta, Poland, Romania, Slovenia and Slovakia). By the same logic, West Germany is counted among the “old” member states, while (formerly communist) East Germany is considered part of the “new” states, even though it was reunified with West Germany as early as 1990. Finally, there may be different inherent characteristics of countries (e.g. institutions) which may bias the results. Therefore, the inclusion of country dummies allows us to control for these country-fixed effects. Most variables are available for 1930, which is why we focus on this year in our analysis. A reduced number of variables are also available for 1850, 1900 and 1960. These variables are literacy, fertility, marital status, population density and our two dummy variables. Descriptive statistics on all variables are shown in Table 5.1. Depending on the point in time, our dataset covers up to more than 250 regions. The regions that are covered may be different at each point, thus reducing the number of observations in the regressions.

In addition, we need to consider the question of how a region is defined in this study. Clearly, the regions in 1930 and at other points in time are often not the same as


Table 5.1 Descriptive statistics

Variable                    obs.    mean    sd      min     max
ABCC 1850                   265     0.94    0.07    0.65    1.00
Total fertility 1870        265     0.40    0.09    0.23    0.65
Marital status 1870         265     0.54    0.12    0.28    0.81
ln(Pop. density 1850)       265     3.79    1.05    0.41    6.39
Literacy 1900               192     0.57    0.29    0.13    1.00
Total fertility 1900        192     0.39    0.12    0.20    0.68
Marital status 1900         192     0.58    0.12    0.31    0.81
ln(Pop. density 1900)       192     4.05    0.93    0.74    5.99
Literacy 1930               192     0.74    0.20    0.22    1.00
Total fertility 1930        192     0.30    0.11    0.05    0.54
Marital status 1930         192     0.58    0.09    0.32    0.80
ln(Pop. density 1930)       192     4.16    0.98    0.67    8.82
Infant mortality 1930       192     0.13    0.05    0.04    0.30
Not dep. on agr. 1930       192     0.46    0.24    0.06    0.99
Literacy 1960               146     0.83    0.11    0.59    0.99
Total fertility 1960        146     0.22    0.05    0.12    0.50
Marital status 1960         146     0.61    0.08    0.48    0.82
ln(Pop. density 1960)       146     4.10    0.96    0.71    6.64
ln(GDP/c 2008)              256     10.04   0.36    8.88    11.31
ln(Patents/c 2008)          256     3.62    1.66    1.59    6.26
Higher edu. attain. 2008    256     0.72    0.15    0.18    0.97
Capital                     256     0.08    0.27    0.00    1.00
Newer EU regions            256     0.22    0.42    0.00    1.00

today, at least for a number of European countries.² For this reason, the historical regions have been adapted to the NUTS classification of the European Union (see also related work by e.g. Diebolt & Hippe, 2017). In some countries, historical regions have changed little up to the present. For example, there has been a very stable subnational administrative organisation in France and Spain for the last 200 years. However, wars and administrative reforms have led to greater changes in other countries. Therefore, these changes are incorporated as well as possible to fit the modern equivalent. More precisely, we use NUTS 2 regions as our standard regional classification, which is also done in the relevant literature in regional economics (e.g. Badinger & Tondl, 2003; Herwartz & Niebuhr, 2011; Scherngell & Barber, 2011). This regional level corresponds, for example, to the régions in France and the Comunidades Autónomas in Spain.

Moreover, note that the availability of the data can be quite different at each time period. In particular, the Eurostat data for the current period refer only to countries of EU27, EFTA and some candidate countries. For this reason, the corresponding

2 For example, Spain and France have preserved almost the same regions and regional boundaries until today.


regressions only consider these regions, although the historical data is in part more extensive. For example, it also covers important areas in Eastern Europe such as Russia. However, using the Eurostat data allows us to use comparable regional data across European countries for the current period. On the other hand, whereas the ABCC data for 1850 consider most of the European regions in the larger sense, the literacy data for 1900 and 1930 only refer to those countries where literacy was still measured. Still, many countries can be included in this study. In contrast, literacy in 1960 is only available for a reduced number of countries (see Appendix). Therefore, the results for the data for 1960 are less comparable than for the other points in time. Still, they allow us to get some additional insights for the respective regions at the beginning of the second half of the twentieth century.
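The two human capital measures defined in Eqs. (5.1) and (5.2) can be illustrated with a short numerical sketch. The function names and the age distributions below are hypothetical and serve only to show how the formulas behave: a population without any age heaping yields an ABCC value of 100, while a population in which everybody reports an age ending in 0 or 5 yields a value of 0.

```python
def abcc_index(age_counts):
    """ABCC index (Eq. 5.1) from a mapping {age in years: number of individuals}.

    Only ages 23-72 enter the calculation; ages missing from the mapping
    are treated as having zero individuals.
    """
    total = sum(age_counts.get(age, 0) for age in range(23, 73))
    if total == 0:
        raise ValueError("no individuals aged 23-72")
    # Ages 25, 30, ..., 70 correspond to 5 * i for i = 5, ..., 14.
    heaped = sum(age_counts.get(5 * i, 0) for i in range(5, 15))
    return 125.0 - 125.0 * heaped / total


def literacy_rate(read_write_counts, age_counts, min_age=10):
    """Literacy (Eq. 5.2): share of individuals aged min_age or older
    who are able to read and write."""
    literate = sum(n for age, n in read_write_counts.items() if age >= min_age)
    population = sum(n for age, n in age_counts.items() if age >= min_age)
    if population == 0:
        raise ValueError("no individuals at or above the minimum age")
    return literate / population


# Hypothetical region without age heaping: every age is equally frequent.
no_heaping = {age: 100 for age in range(23, 73)}
# Hypothetical region with complete heaping on ages ending in 0 and 5.
full_heaping = {age: (500 if age % 5 == 0 else 0) for age in range(23, 73)}

print(abcc_index(no_heaping))    # 100.0
print(abcc_index(full_heaping))  # 0.0

# Hypothetical literacy counts; individuals below age 10 are ignored.
population = {9: 100, 10: 100, 11: 100}
can_read_write = {9: 10, 10: 80, 11: 90}
print(literacy_rate(can_read_write, population))  # 0.85
```

Note that the formula in Eq. (5.1) returns values on a 0–100 scale, and Eq. (5.2) as written returns a share between 0 and 1.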

5.4 Results

5.4.1 Relationship Between Patents Per Capita and GDP Per Capita Today

Before analysing the long-run impact of human capital on innovation and economic growth, we consider the current relationship between the latter two, which are the dependent variables in our subsequent regressions. Figure 5.3 shows their relationship for 2008. The figure highlights a general positive relationship between current GDP per capita and patent applications per million inhabitants to the EPO in Europe.³ The “new” member countries typically have a lower number of patents and a lower GDP, but they follow the basic pattern of the old member states, underlining the relevance of controlling for the new EU member states. The most important outliers are Inner London (UKI1) and Luxembourg (LU00), which had much higher GDP per capita levels than their relative number of patent applications would suggest. Both are important financial centres. In fact, most of Luxembourg’s GDP is related to finance and banking. London is the heart of the British economy, and its most important industry is again the financial sector. The strength of the financial sector thus leads to higher GDP per capita values than would otherwise be expected. On the other hand, Germany’s core industrial zones in the greater regions around Munich (Oberbayern, DE21) and Stuttgart (DE11), alongside Dutch Noord-Brabant (NL41) and Austrian Vorarlberg (AT34), have the highest numbers of patent applications per million inhabitants. Finally, the regions in the two newest member states, i.e. Bulgaria and Romania, have the lowest GDP per capita values.

3 Note that data are available for more regions in 2008 than in 2000.
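The notion of an outlier used in this subsection, a region whose GDP per capita is much higher than its patent applications would suggest, can be made concrete with a simple bivariate fit. The sketch below is only an illustration under stated assumptions: the function names, the threshold of two residual standard deviations and the eleven data points are hypothetical and are not taken from the Eurostat data. Ten regions lie exactly on a line, and one region (labelled "FIN", standing in for a financial centre) lies one log point above it.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def flag_outliers(codes, x, y, threshold=2.0):
    """Return (code, residual) pairs whose absolute residual exceeds
    `threshold` times the residual standard deviation."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sd = math.sqrt(sum(e * e for e in residuals) / (len(residuals) - 2))
    return [(code, e) for code, e in zip(codes, residuals) if abs(e) > threshold * sd]


# Hypothetical regions: ten on the line ln(GDP/c) = 9.5 + 0.3 * ln(Patents/c),
# plus one region with GDP per capita far above that line.
codes = [f"R{i}" for i in range(10)] + ["FIN"]
ln_patents = [0.5 * i for i in range(10)] + [2.25]
ln_gdp = [9.5 + 0.3 * v for v in ln_patents[:10]] + [9.5 + 0.3 * 2.25 + 1.0]

flagged = flag_outliers(codes, ln_patents, ln_gdp)
print([code for code, _ in flagged])  # ['FIN']
```

A positive residual, as for the flagged region here, corresponds to a GDP per capita above what the fitted relationship with patent applications implies.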

Fig. 5.3 Regional per capita GDP and patent applications, 2008. Note: Patent applications per capita are defined as patent applications to the EPO per million inhabitants. Source: Data provided by Eurostat (2014)
[Scatter plot of NUTS 2 regions, labelled by region code. Horizontal axis: ln(Patents/c 2008), from -2 to 6; vertical axis: ln(GDP/c 2008), from 9 to 11.5.]

5.4.2 Explaining Regional Patents Per Capita

In the next step, we use standard OLS regression models to dig deeper into the relationship between human capital and innovation on the one hand and between human capital and economic development on the other hand. More specifically, we regress current patents per capita (i.e. patent applications per million inhabitants, in 2008) on historical variables (in 1930). We use the year 2008 because it provides the highest number of observations.⁴ Note that we always include country dummies to control for country-fixed effects. We report robust p-values to avoid problems related to heteroskedasticity. All regions have the same weight, each representing a historical experience. The results are shown in Table 5.2. In each case, literacy is a significant positive explanatory variable of current patents per million inhabitants at the 1% level. In other words, when literacy increases by one percentage point, patents per capita increase by roughly 4.3–5.4%, a sizeable effect. When all variables are included (column one), population density is positively significant at the 10% level, while newer EU regions have significantly lower patent applications (1% level). This negative sign (in all cases

4 Note that we will use an alternative range of years in subsequent robustness checks.


Table 5.2 Regional patent applications per capita in 2008

Dependent variable: ln(Patents/c 2008)

                         (1)        (2)        (3)        (4)        (5)        (6)
Literacy 1930            5.42***    4.33***    4.45***    4.57***    4.66***    4.59***
                         (0.001)    (0.000)    (0.000)    (0.000)    (0.000)    (0.000)
Total fertility 1930     0.94                  0.20       0.25       0.77       1.46
                         (0.717)               (0.912)    (0.894)    (0.678)    (0.526)
Marital status 1930      1.01                             0.83       1.09       1.00
                         (0.645)                          (0.684)    (0.563)    (0.582)
ln(Pop. density 1930)    0.32*                                       0.20       0.21
                         (0.092)                                     (0.146)    (0.138)
Infant mortality 1930    1.29                                                   3.14
                         (0.802)                                                (0.531)
Not dep. on agr. 1930    -1.46
                         (0.126)
Capital                  0.21       0.49**     0.32       0.32       0.05       0.07
                         (0.513)    (0.026)    (0.161)    (0.163)    (0.861)    (0.812)
Newer EU regions         -2.19***   -2.24***   -2.21***   -2.14***   -0.72***   0.55
                         (0.003)    (0.000)    (0.000)    (0.000)    (0.007)    (0.116)
Constant                 -2.61      0.80       0.63       0.04       1.86       1.76
                         (0.297)    (0.306)    (0.587)    (0.986)    (0.314)    (0.326)
Observations             129        157        145        145        144        144
R-squared                0.87       0.84       0.85       0.85       0.85       0.86

Note: ***, **, * indicate significance at the 1, 5 and 10% level. Robust p-values in parentheses. Patents/c refers to patent applications to the EPO per million inhabitants. Country-fixed effects included

except column six) confirms the descriptive evidence shown in the figure above. When only literacy is considered, the dummy for capital regions turns significant (column two), meaning that capital regions have a higher number of patents per capita than other regions. However, the coefficient is insignificant in all other cases. These regression results show that literacy is the most significant historical explanatory variable for current patents per capita.

However, how robust is this result? We propose several robustness checks. First, we perform a horse race, including only literacy and one other explanatory variable in each regression to check whether our human capital indicator can survive the direct comparison with other potential explanatory variables (Table 5.3). These regressions confirm our previous results, indicating that literacy is the most important historical variable for explaining current patents per capita. Population density also appears to play a role, being significant (column five). Capital regions (column eight) and newer EU regions (column nine) also show significantly higher and lower patent applications, respectively.

Second, a related question concerns multicollinearity. It is possible that some variables are highly correlated, and this may cause biased estimates. In particular, fertility, marital status and infant mortality are potential candidates. We may consider this by excluding first one and then two of these variables from the regressions


Table 5.3 Horse race between literacy and other variables

Literacy 1930 Total fertility 1930 Marital status 1930 ln(Pop. density 1930) Infant mortality 1930 Not dep. on agr. 1930 Capital Newer EU regions Constant

(1)

(2)

(3)

(4)

(5) (6) ln(Patents/c 2008)

5.42*** (0.001) 0.94 (0.717) 1.01 (0.645) 0.32* (0.092) 1.29 (0.802) -1.46 (0.126) 0.21 (0.513) -2.19*** (0.003) -2.61 (0.297)

4.56*** (0.000)

4.48*** (0.000) -0.25 (0.888)

4.66*** (0.000)

4.36*** (0.000)

4.30*** (0.000)

(7)

(8)

(9)

4.21*** (0.000)

4.33*** (0.000)

4.56*** (0.000)

0.81 (0.689) 0.21** (0.025) -3.69 (0.348) 0.01 (0.986) 0.49** (0.026)

0.57 (0.453)

0.69 (0.554)

0.01 (0.994)

-2.23*** (0.000)

1.15 (0.228)

0.91 (0.345)

Observations 129 157 145 145 156 157 141 R-squared 0.87 0.83 0.85 0.85 0.84 0.83 0.85 Robust p-values in parentheses. Significance levels are as follows: *** p