English Pages 189 [194] Year 2008
Stefan Bergheim
Long-Run Growth Forecasting
Stefan Bergheim Raimundstr. 121 60320 Frankfurt Germany [email protected]
ISBN 978-3-540-77679-6
e-ISBN 978-3-540-77680-2
Library of Congress Control Number: 2008923365 © 2008 Springer-Verlag Berlin Heidelberg This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permissions for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Cover design: WMXDesign GmbH, Heidelberg, Germany Printed on acid-free paper 987654321 springer.com
To my mother
Acknowledgement
This book is based on my dissertation at WHU Otto Beisheim School of Management. I would like to thank Michael Frenkel for supervising this project and Jürgen Weigand for being co-supervisor. Special thanks go to Marco Neuhaus for the long discussions in 2004 and 2005 that inspired large parts of this study. Chiara Osbat introduced me to the non-stationary panel literature, and Magdalena Korb, Andrea Schneider, Elga Bartsch and Sarah Rupprecht were so kind to comment on drafts of the book. Seminar participants at ETH Zurich, Deutsche Bundesbank, the Max Planck Institute of Economics, the Institut der deutschen Wirtschaft Cologne and the Macroeconomic Policy Institute (IMK) in Düsseldorf also made helpful suggestions.
Contents

1 The importance of long-run growth analysis
  1.1 Frequent forecast failures
  1.2 Strong demand - but little supply
  1.3 Plan of work
    1.3.1 Choosing a sensible theoretical model
    1.3.2 Choosing the best econometric technique

2 Assessment of growth theories
  2.1 The search for a dynamic model
  2.2 The basic neoclassical model
    2.2.1 Application in cross-country analysis
  2.3 Focus on convergence
    2.3.1 Tests for conditional convergence
  2.4 Models with deeper insights
    2.4.1 Including human capital (Lucas)
    2.4.2 Modeling barriers to riches (Parente & Prescott)
  2.5 Opening the theories further
    2.5.1 Models with scale effects
    2.5.2 Evolutionary models of growth
    2.5.3 Open-system models
  2.6 General critique of the standard approach
    2.6.1 Production function cannot be estimated
    2.6.2 Aggregate production function does not exist
    2.6.3 The concept of TFP is not helpful
    2.6.4 Beyond neoclassical economics
  2.7 The augmented Kaldor model

3 The dependent variable: GDP growth
  3.1 Choosing the appropriate data source

4 Labor input
  4.1 Population growth is endogenous
  4.2 Hours worked per capita are important
  4.3 Age structure of the population

5 Physical capital
  5.1 Measuring capital accumulation
    5.1.1 Investment and changes in capital stocks
    5.1.2 Different databases - different investment ratios
    5.1.3 Capital stocks from perpetual inventory
  5.2 Main insights on capital accumulation
    5.2.1 Investment ratios are not constant
    5.2.2 Investment ratios do not differ much across countries
    5.2.3 Investment ratios are not proportional to changes in the capital stock
    5.2.4 Investment ratios are not proportional to levels of the capital stock
    5.2.5 Capital productivity does not correlate with income
    5.2.6 Capital accumulation is not exogenous
  5.3 Proper modeling of capital accumulation

6 Human capital
  6.1 Micro- and macroeconomic theory
    6.1.1 Microeconomic analysis: labor economics
    6.1.2 Macroeconomic models with different conclusions
  6.2 Measures and empirical analysis
    6.2.1 Best measure: years of education

7 Openness
  7.1 Theory: higher efficiency
    7.1.1 Extent of the market and specialization
    7.1.2 Good macro policies and more competition
    7.1.3 Additional influences of trade on income
  7.2 Measuring openness
    7.2.1 Black market premium and tariffs
    7.2.2 The openness dummy
    7.2.3 Best measure: adjusted trade share
  7.3 Empirical debate: levels versus growth

8 Spatial linkages
  8.1 Spatial economics - location matters
    8.1.1 Absolute location: latitude and climate
    8.1.2 Relative location: rich neighbors
  8.2 Constructing spatial GDP
  8.3 Sum: Spatial linkages not much help

9 Other determinants of GDP

10 The theory of forecasting
  10.1 The benefits of forecast experiments
  10.2 The characteristics of good forecasts
  10.3 Intercept correction and forecast combination

11 The evolution of growth empirics
  11.1 Still widely used: cross-section
  11.2 Weaknesses of cross-section regressions
    11.2.1 Same production function assumed
    11.2.2 Long-run growth path assumed to be constant and the same across countries
    11.2.3 Same pace of conditional convergence assumed
    11.2.4 Errors are assumed uncorrelated with the explanatory variables
    11.2.5 Right-hand side variables assumed exogenous
    11.2.6 In sum: many assumptions are violated
  11.3 The climax of cross-section
  11.4 Advantages of panel techniques
    11.4.1 Initial technology can differ across countries
    11.4.2 Dealing with endogeneity bias
    11.4.3 Addressing lagged dependent bias
    11.4.4 Modeling heterogeneous technological progress
    11.4.5 Summary of results from panel regressions
  11.5 Non-stationary panel techniques
    11.5.1 Pooled mean group technique
    11.5.2 Testing unit roots and cointegration in panels
    11.5.3 Panel unit root tests
    11.5.4 Panel cointegration tests
  11.6 A two-stage estimation method

12 Estimation results
  12.1 Correlation analysis
  12.2 Panel unit root tests
  12.3 Panel cointegration test
  12.4 The short-run forecasting models

13 Forecast competitions and 2006-2020 forecasts
  13.1 Forecast competition 2001-2005
  13.2 Forecast combination
  13.3 Forecast competition 1996-2005
  13.4 Forecasts for 2006-2020
  13.5 Other long-run forecasting models

14 Conclusion and outlook

List of figures
List of tables
References
List of variables and coefficients
Y          total GDP
y          GDP per capita (Y/L)
$\dot{y}$  change of GDP per capita, $dy/dt$
$\hat{y}$  percentage change of GDP per capita, $\dot{y}/y$
K          physical capital
k          physical capital per capita
$\hat{k}$  growth rate of per capita capital stock
a          share of capital in national income
α          elasticity of output with respect to capital input
r          real interest rate
I          investment in physical capital
κ          growth rate of investment
L          total labor supply
n          growth rate of population, $L_t = L_0 e^{nt}$
w          real wage rate
H          human capital
h          human capital per capita
$\hat{h}$  growth rate of human capital per capita
S          years of education
R          work experience
o          trade openness per capita
A          level of “technology” or of total factor productivity (TFP)
g          growth rate of technology, $A_t = A_0 e^{gt}$
E          country-specific efficiency
x          other drivers of per-capita income
s          savings rate
$s_k$      saving to accumulate physical capital
$s_h$      saving to accumulate human capital
t          time
i          index of countries
u          fraction of human capital reinvested
β          elasticity of human capital; or regression coefficient
γ          capital-output ratio K/Y
δ          depreciation rate
ε          regression error
η          coefficient in Dickey-Fuller regression
λ          rate of convergence
µ          coefficient on trade openness in technical progress function
π          coefficient on cointegration errors
ρ          rate of time preference
σ          rate of risk aversion
φ          return to human capital
ω          coefficient on human capital in technical progress function
List of abbreviations
ECB    European Central Bank
ECO    Economic Outlook database (by OECD)
GDP    Gross domestic product
IMF    International Monetary Fund
MRW    Mankiw, Romer and Weil (1992)
OECD   Organisation for Economic Co-operation and Development
PMG    Pooled mean group estimator
PPP    Purchasing power parity
PWT    Penn World Table
R&D    Research and development
RMSE   Root mean squared error
TFP    Total factor productivity
UN     United Nations
WDI    World Development Indicators (by World Bank)
WEO    World Economic Outlook (by IMF)
1 The importance of long-run growth analysis
Forecasts are usually made to help and guide decision making. Good forecasts are preconditions for good, informed decisions. These decisions may vary from a financial market bet on interest rate changes to the policy decision on how to structure a country’s pension system. Ideally, decision-makers should be as well prepared as possible for the future, which would allow them to act appropriately. To detect challenges and opportunities in a timely manner decision-makers require a good forecasting framework. Given the role governments, companies and individuals play, knowledge about the drivers and linkages that determine the future will allow these players to actually shape the future themselves.
1.1 Frequent forecast failures

Unfortunately, history is full of examples of poor predictions and therefore poor decisions. In the early 1990s, the USA was seen by many as a sclerotic economy destined for anemic economic growth with high unemployment and to be overtaken by Japan within a few years. As we know today, these predictions could not have been more wrong. Growth of US gross domestic product (GDP) averaged 3.3% per year between 1992 and 2005. Asset markets in the US surged as investors became increasingly confident that the future would be much brighter than assumed in the early 1990s. By contrast, Japan in the early 1990s was seen as a role model. In the event, a decade of economic stagnation, falling asset prices and banking sector problems followed and made many forecasters look incompetent.

Germany is another case where trend GDP growth has been overestimated significantly for the past 10 years. From about 2% in 1995 the consensus forecast for trend growth was revised down to around 1% by 2005. Actual growth over the years 2001-05 was just 0.7%. Year after year, growth expectations of investors and companies had to be revised downwards. Had investors known already in the mid-1990s just how low Germany’s growth potential was, some investment
plans would have turned out quite differently: production capacities would not have been expanded as much and investors would have avoided companies with a large exposure to German domestic demand.

Forecast errors are not confined to developed markets. The frequent crises in emerging markets over the past two decades tended to be even more severe and surprising. For example, the 1997 crisis in emerging Asia caught many investors by surprise, and they wished they had been better able to anticipate the difficulties. Even worse, after retreating from Asia during the crisis, many companies were surprised by the rapid rebound of countries like Korea and Malaysia, and wished they had had a framework to tell them to stay engaged in these countries.

This anecdotal evidence is supported by more formal analysis. The forecasts in the International Monetary Fund’s semi-annual World Economic Outlook (WEO) displayed a tendency to systematically overpredict real GDP growth, as Timmermann (2006) shows. Between 1991 and 2003, the next-year forecasts in the September WEOs for France, Germany, Italy and Japan were on average a full percentage point too high. This bias points to a significant overestimation of trend growth. Indeed, one of Timmermann’s recommendations on how to address these forecast errors is to have “more frequent reviews of estimates of potential output growth” (op. cit. p. 9). This would entail a more thorough modeling of trend growth using elements of growth theory.

The IMF is not alone in having made these systematic forecast errors. They are also visible in the European Central Bank’s (ECB) staff forecasts for the euro area and in consensus forecasts. For the year 2003 the staff forecast for GDP growth started out at 2.5% - way above the final outcome of 0.5%. Forecasts for 2002, 2004 and 2005 were also too high by between 0.4 and 2.1 percentage points. The consensus according to the ECB’s Survey of Professional Forecasters did not fare much better.
In the first quarter of 2001 the five-year-ahead GDP growth forecast was 2.7%, while growth over 2001-05 actually averaged just 1.4%.

Repeated small differences in growth rates can lead to large differences in outcomes many years down the road. A growth rate of per capita GDP of 1.5% does not look all that different from a growth rate of 2%. But over 20 years, this translates into a difference in income levels of roughly 10% - not a negligible amount.

These persistent and large forecast errors indicate that economists do not yet have the appropriate theories, data and/or statistical tools to adequately model the developments of national economies. Unfortunately, the task at hand is really enormous: Lucas (1988, p. 13) observed that “The growth rate of an entire economy is not an easy thing to move around. Economic growth, being a summary measure of the activities of an entire society, necessarily depends, in some way, on everything that goes on in a society.”

In trying to help reduce forecast errors in the future, this study makes three contributions. First, it provides an assessment of the main existing theories of economic growth and proposes an augmented Kaldor model as the most
reasonable synthesis. It is the first work to derive hypotheses of pair-wise cointegration among the key variables in growth models. Second, this study applies modern non-stationary panel estimation techniques to test these hypotheses for 40 countries for the period from 1971 to 2003. And third, it presents long-run growth forecasts for the years 2006-20. A forecast competition will show that forecasts based on the theories outlined here can outperform consensus forecasts and simple time-series models.

The time horizon is the medium to long run of 5 to 20 years. In economics the term “growth” already refers to the long-run development of an economy, the evolution of its potential or trend output. However, since the media and financial markets frequently use “growth” when referring to changes in GDP over shorter periods of a year or even a quarter (which are the combined result of trend and cyclical factors), this study uses the term “long-run growth” to avoid any uncertainty regarding the time horizon.

While the ultimate goal of this study is to derive a set of forecasts, the path to these forecasts is at least as important. A forecaster has to understand the assumptions made, the limits of theories and the datasets used, and has to cross-check the insights with real-world experience. Unfortunately, the theoretical and empirical growth literature has not yet produced a consensus on some of the most important questions: How important is the accumulation of physical capital for GDP growth? Is investment exogenous or endogenous? Should population growth be treated as exogenous? Does an increase in education lead to higher output? What is the best econometric technique to try to answer these questions?
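The compounding arithmetic behind the 1.5% versus 2% comparison earlier in this section can be checked with a few lines of Python. This is only an illustrative sketch: the function name `income_level_gap` is hypothetical, and it assumes that both economies start from the same income level and grow at constant annual rates.

```python
def income_level_gap(g_low: float, g_high: float, years: int) -> float:
    """Relative gap in income levels after compounding two annual growth rates.

    Assumes both economies start from the same income level and that the
    growth rates stay constant over the whole horizon.
    """
    return ((1.0 + g_high) / (1.0 + g_low)) ** years - 1.0


# Rates and horizon taken from the text: 1.5% versus 2% per year over 20 years.
gap_20y = income_level_gap(0.015, 0.02, 20)
print(f"Income gap after 20 years: {gap_20y:.1%}")  # roughly 10%
```

The same function can be reused for other horizons, which makes it easy to see how quickly a seemingly small error in trend growth accumulates.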
1.2 Strong demand - but little supply

Demand for substantiated long-term growth forecasts is high following the surprises and forecast errors made in the past. Growth forecasts are used in many areas in business and financial markets and by governments.

Businesses require forecasts for economic growth for their budgeting, strategic planning and for the analysis of business cases. Since many corporate investments, such as a new chemical plant, have investment horizons of 10 to 15 years they also require GDP forecasts over a similar horizon. The need for a neutral forecast is particularly strong here because individual business units have a genuine interest in presenting high forecasts, which may steer the allocation of resources to their unit. If budgets, strategic plans and selected business cases are based on wrong assumptions, losses for the whole company may ensue. If production and inventories are too high relative to actual demand in the future, then prices may need to be set below initial plans to clear inventory. Just-in-time production may ease some of these difficulties, but production capacities nevertheless have to be aligned with expected demand.
Financial markets make heavy use of long-run GDP forecasts in many ways. For example, many pricing models are based on the economy’s underlying growth trend: Government bond yields are often priced on the sum of expected real GDP growth and inflation. And these bond yields are themselves the benchmark against which other assets (equities, real estate) are priced. Fund managers try to outperform their peers by comparing the growth forecast that the market is pricing in at the moment with their own, possibly model-based, forecast for long-run growth. They would prefer to invest in markets that few others see as promising today but that will show their strength in the near future.

When assessing the risk of overheating of an economy, the current growth rate of GDP is compared to the rate of potential GDP growth. Business cycle analysis usually starts from trend growth and then adds or subtracts from it depending on the current state of policy variables and exogenous developments (e.g. oil, exchange rate). But most of these analyses use past trends as a starting point. If trend growth is on a downward trajectory, this may lead to a series of downward revisions of growth forecasts and upward revisions of inflation forecasts - as seen in Europe and especially in Germany since 2001.

Furthermore, policy-makers are interested in specific advice on how to strengthen their countries’ growth performance - or how to prepare for geopolitical changes resulting from diverging economic outcomes. A systematic analytical framework and a set of conditional forecasts for growth would make their tasks easier.

Long-run growth forecasts are also important for international organizations like the World Bank or the International Monetary Fund. A stabilization program and the associated recommendations may look quite different depending on the economy’s underlying growth potential. It turns out that the IMF’s medium-term growth projections have a tendency to err on the high side.
As Batista and Zalduendo (2004) emphasize, this “over-optimism may lead to complacency regarding the adequacy of growth-oriented structural reforms pursued by a country.”

In addition, national fiscal authorities require solid forecasts for trend GDP growth to estimate future tax revenue and pension liabilities. Wrong estimates of revenue and expenditure may lead policy-makers to cut tax rates and expand welfare spending. The result would be unexpectedly high fiscal deficits - as seen in many European countries since 2001. Around the turn of the millennium, many European governments used GDP forecasts that turned out to be too high because they were too optimistic both on the cyclical and on the trend development of GDP. This meant that budget deficits turned out much higher than expected and led to major political upheaval inside the European Union because several countries did not comply with the Stability and Growth Pact.

Likewise, central banks need a good grasp of the growth rate of potential GDP over the medium term. If a central bank overestimates the trend growth rate, it may supply too much liquidity and end up with unexpectedly high inflation. For example, in 1999 and 2000 the European Central Bank came under pressure to revise its reference value for the expansion of M3 money supply upwards because strong current GDP growth had led many financial market analysts to revise upward their forecasts of the euro area’s potential GDP growth. In the event, the ECB correctly opted for maintaining its assumption of potential GDP growth at 2 to 2.5% - possibly because it has superior capabilities in modeling the economy’s potential GDP. Similarly, central bank reaction functions such as the Taylor rule require a ‘normal’ or natural real interest rate as an important input. Often, this natural rate is derived from the economy’s long-run growth potential.

A stronger focus on forecasting in growth economics may also have positive effects on the development of economic theory. When trying to apply principles of economic theory to build a forecast model, the usefulness of these principles is put to a test. This study is partly about what economists can learn when applying the ideas of growth theory in a real-world forecasting context. The theoretical model outlined in chapter 2 and the discussion of the individual variables in chapters 4 to 9 benefited strongly from having to be useful for forecasting.

While demand is strong, there is a scarcity of substantiated long-run growth forecasts. Some models used by central banks and international organizations just extrapolate the past trend of GDP growth or of labor productivity into the future using simple statistical tools like the Hodrick-Prescott filter. Academic research into economic growth shies away from exploring the forecasting performance of growth models and focuses mostly on explaining the past. The Handbook of Economic Growth published in 2005 does not include a chapter on forecasting in its two volumes with a total of 1998 pages.
Private institutions either make ad-hoc assumptions on future growth or generate models that may not deliver what they claim to do. The “Growth Competitiveness Index” (GCI) developed by the World Economic Forum (WEF) claims to “evaluate the potential for the world’s economies to attain sustained economic growth over the medium and long term”¹ and receives a lot of media attention every autumn, partly because it is one of just a few models in the face of the strong demand. Unfortunately, there is a slight negative correlation between the 2001 GCI ranking and actual GDP growth over 2001-05. This forecasting weakness has some tradition: In the early 1990s, the WEF produced a joint ranking with the Institute for Management Development (IMD). The 1993 version saw Japan and Germany ranked at numbers 2 and 5 - just ahead of a decade of very weak growth. By contrast, Finland and Korea ranked 25th and 28th in 1993 - just ahead of a decade of very strong growth. Therefore, neither the GCI nor the IMD index seems to be a good predictor of economies’ future growth prospects, and neither fills the gap between high demand and low supply of long-run growth models.

¹ See Blanke et al. (2003), p. 3.
International organizations such as the IMF and the World Bank maintain a set of models to forecast long-run GDP growth for a large number of countries. However, these models and their forecasts are not usually available to the public. Exceptions are working papers, for example those by Batista and Zalduendo (2004) and Ianchovichina and Kacker (2005). Wherever possible, I will compare my models and insights with those from these forecasters. Recently, private-sector institutions such as banks and consultancies added their own models. Chapter 13 will discuss these contributions as well.
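The remark in this section that central bank reaction functions depend on the economy's long-run growth potential can be made concrete with a small sketch. It uses the standard Taylor (1993) form with weights of 0.5 on the inflation gap and the output gap. Setting the natural real interest rate equal to potential GDP growth is an illustrative assumption only (the text says the natural rate is "often derived from" growth potential, not that the mapping is one-for-one), and the function name and input values are hypothetical.

```python
def taylor_rule_rate(inflation: float, inflation_target: float, output_gap: float,
                     natural_real_rate: float,
                     weight_inflation: float = 0.5, weight_gap: float = 0.5) -> float:
    """Nominal policy rate (in percent) prescribed by a Taylor-type rule.

    All inputs are in percent. The default weights follow Taylor (1993).
    """
    return (natural_real_rate + inflation
            + weight_inflation * (inflation - inflation_target)
            + weight_gap * output_gap)


# Illustrative assumption: the natural real rate is set equal to potential growth.
# Inflation is on target and the output gap is closed in both cases.
rate_potential_2_0 = taylor_rule_rate(2.0, 2.0, 0.0, natural_real_rate=2.0)
rate_potential_2_5 = taylor_rule_rate(2.0, 2.0, 0.0, natural_real_rate=2.5)
print(rate_potential_2_0, rate_potential_2_5)
```

Under these assumptions, an estimate of potential growth that differs by half a percentage point moves the prescribed policy rate by the same amount, which is one way to see why errors in trend growth estimates matter for monetary policy.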
1.3 Plan of work

This study is about forecasting long-run GDP growth both per capita and in total. It will derive forecasts for average annual GDP growth for the period 2006 to 2020 for 40 economies around the world based on models with annual frequencies using the most current data available. The forecast horizon of 2020 is motivated by the average investment period of large projects initiated by companies or governments. In addition, to allow some out-of-sample testing of the model, I will use data until 1995 to derive forecasts for average annual per capita GDP growth over the period 1996 to 2005 (and data until 2000 for forecasts over 2001 to 2005).

In principle, it would be possible to calculate and evaluate estimates for each year over the forecast horizon. However, the model does not aim at explaining the business cycle. Evaluating annual observations either of growth rates or of GDP levels is likely to produce much larger absolute forecast errors than evaluating the averages. A five-year horizon should be long enough to average out business cycle disturbances. Indeed, in most countries, the span from 2000 to 2005 seems to be close to a peak-to-peak period. The model forecasts will be evaluated against a set of alternatives in chapter 13.

On the way to these forecasts I will evaluate the different theoretical models, discuss the individual drivers of economic growth and decide on the most appropriate econometric technique.

1.3.1 Choosing a sensible theoretical model

Economics carries the stamp of the “dismal science” partly because of the gloomy predictions of Thomas Malthus in 1798 that population would grow faster than food supply, dooming mankind to unending poverty and hardship. Ricardo and Marx drew similarly gloomy conclusions. By contrast, the history of the past 200 years shows that economies can create tremendous riches and move far away from poverty. The reason is substantial technological progress, which economists continue to struggle to explain.
While this shows that the models of Malthus and Ricardo were clearly misspecified, there is still no consensus on the drivers of economic growth even today.
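The argument in the plan of work that multi-year averages should be evaluated rather than individual years can be illustrated with a short sketch. All numbers below are invented for illustration and are not taken from this study: a hypothetical constant trend forecast of 1.5% is compared with five hypothetical annual growth rates that swing around the trend.

```python
def average_annual_growth(level_start: float, level_end: float, years: int) -> float:
    """Compound average annual growth rate between two GDP levels."""
    return (level_end / level_start) ** (1.0 / years) - 1.0


# Hypothetical annual growth rates in percent (trend plus business cycle).
actual_growth = [3.0, 0.5, -0.5, 1.5, 3.0]
trend_forecast = 1.5  # hypothetical constant trend forecast in percent

# Evaluation year by year: mean absolute error of the trend forecast.
annual_errors = [abs(g - trend_forecast) for g in actual_growth]
mean_abs_annual_error = sum(annual_errors) / len(annual_errors)

# Evaluation of the five-year average: build the GDP level, then average.
level = 100.0
for g in actual_growth:
    level *= 1.0 + g / 100.0
actual_average = 100.0 * average_annual_growth(100.0, level, len(actual_growth))
error_of_average = abs(actual_average - trend_forecast)

print(f"mean absolute annual error: {mean_abs_annual_error:.2f} pp")
print(f"error of five-year average: {error_of_average:.2f} pp")
```

In this made-up example the cyclical swings largely cancel over the five years, so the error of the average is much smaller than the typical annual error, even though the forecast itself is unchanged.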
A crucial challenge in model building is to distinguish between correlation and causality. Many variables are likely to be correlated with economic growth, but not all of them have a causal link. My strategy is to combine the information from historical country experiences (e.g. Landes [1999]) and careful econometric analysis to build a model that uses variables that are as exogenous and as causal for economic growth as possible. Judgments, assumptions and compromises will have to be made. This is in line with the view of Brock and Durlauf (2001), who believe that historical and qualitative studies play a crucial role in the development of credible statistical analyses.

There is no single best way to conduct empirical analysis in the social sciences. However, it is crucial to understand the advantages and disadvantages of the different approaches in order to find the most suitable framework. The approach in this study is what Colander (2000, p. 137) calls modern economics or the economics of the model, i.e. “the study of the economy and economic policies through empirically testable models.” Colander also quotes Keynes as defining the task at hand: “Economics is the science of thinking in terms of models joined by the art of choosing models which are relevant to the contemporary world.” However, there is less and less art in economics because modern applied policy models must be specified in a way that can be directly empirically tested.

Since the focus is on growth rates rather than levels of GDP, I will leave aside constant factors that affect mainly the level of economic activity such as climate, religion, a colonial past, settler mortality, being landlocked etc.² This significantly reduces the realm of possible theoretical models.

Furthermore, the focus has to be on developments that are reasonably predictable. This relegates many important developments to the sidelines.
For example, a depreciation of the exchange rate, an unusually expansionary monetary or fiscal policy, a drop in energy prices or a change of government may all lead to a significant acceleration in GDP growth for several years. Indeed, Hausmann, Pritchett and Rodrik (2005) find that events like these explain most of the accelerations of GDP growth over time. In this study they will be excluded from the analysis because they are either highly unpredictable (e.g. exchange rate moves) or will eventually be followed by a reversal of policy (e.g. short-term monetary and fiscal shocks). A change of government may have a short-run confidence-boosting effect which already anticipates measures that will have a visible impact on drivers of growth in the long run. My analysis focuses on these long-term effects of policy changes.
[2] Parts of the literature claim that these factors also permanently affect the growth rates of GDP. However, this would imply a centuries-long divergence of income levels, which does not appear to be observed outside Africa, i.e. in the countries considered here.
When assessing the theoretical growth literature, chapter 2 will gather the most useful elements from different theories. The neoclassical model contributes the importance of diminishing returns to factors. New growth models
add human capital and barriers to technology adoption. Evolutionary models emphasize the importance of complementarities. My synthesis builds on Kaldor’s technical progress function and augments it with insights from the other models. Chapter 2 also explains why the production function should be used with considerable care in growth models and why many endogenous models with their scale effects are not helpful for building a model of the real world. Chapters 4 to 9 look at some of the most important drivers of growth in more detail. Each chapter focuses on the theoretical rationale and the best available data for measuring each driver.
1.3.2 Choosing the best econometric technique
The survey of growth theories will show that no consensus is available on how best to model long-term economic growth. This implies that empirical analysis has to help with selecting an appropriate model. Solow (1987, 2001 add-on) suggests that an “alternative strategy might be to begin with unprejudiced empirical study of the determinants of the speed of technological innovation...”. In general, my approach will be rather pragmatic, in line with Romer (1994c, p. 20): “If we set our standards for what constitutes relevant evidence too high and pose our tests too narrowly, we will indeed end up with too little data. We can thereby enshrine the economic orthodoxy and make it invulnerable to challenge.”
Temple (2000, p. 202) rightly points out that “the litmus test for the cross-country growth literature will come when we find out how useful our current models are in predicting the variation in growth rates, not for existing data, but for periods beyond the usual samples.” What holds for cross-country models applies equally to time-series and panel models: Forecast performance may be an important indication of a model’s validity. On the other hand, the theory of forecasting sketched in chapter 10 shows that a sound theoretical basis is not a necessary condition for good forecasts.
With these difficulties in mind, chapter 11 evaluates the different empirical growth models and assesses their strengths and weaknesses. Cross-country regressions will be dismissed as not flexible enough for modeling the complex process of long-run economic growth - even though these models are still widely used today. Panel models are more appropriate, but initial attempts did not take into account the non-stationarity of the underlying data. I will propose a two-stage approach. The first stage analyzes the long-run linkages between the levels of the key variables (panel cointegration). The second stage models the growth rates of GDP, taking into account the information gained in the first stage. Chapter 12 will present the estimation results. Finally, chapter 13 conducts two forecast competitions and presents forecasts for GDP growth over 2006 to 2020.
2 Assessment of growth theories
Long-run economic growth is a highly complicated process that involves the decisions and the complex interaction of a large number of economic agents over a long time horizon. So far, modeling this process has remained incomplete and a major challenge for economists. Therefore, this chapter cannot possibly present the “true” model of growth. Rather, it aims at highlighting the strengths and weaknesses of different models. It also summarizes the important lessons to be learned from each model and the pitfalls to be avoided. The filters used in the chapter are usefulness for forecasting, availability of data to test the theory and, most importantly, real-world validity of the theoretical conclusions drawn. At the end, this chapter presents a model that takes most of these lessons on board but is also tractable empirically.
This chapter cannot possibly summarize the vast body of theoretical growth models. Textbooks such as Obstfeld and Rogoff (1996), Aghion and Howitt (1998), Frenkel and Hemmer (1999), Jones (2002a) or Barro and Sala-i-Martin (2004) provide comprehensive overviews of neoclassical and endogenous models with all the mathematics behind them. This chapter only gives an overview of the different approaches. The detailed treatment of individual drivers of growth is left to the subsequent chapters.
2.1 The search for a dynamic model
Robert Solow (2001, p. 383) always thought “of growth theory as the search for a dynamic model that could explain the evolution of an economy over time” or equivalently “the theory of the evolution of potential output” (p. 286). Earlier he made clear that the “art of successful theorizing is to make the inevitable simplifying assumptions in such a way that the final results are not very sensitive” (Solow [1956]). Solow (2001, p. 383) also demands that “an economic model should have some internal structure; its causal arrows should rest on some sort of behavioral mechanism.” These are the standards against which I will evaluate the theoretical literature and build my own model.
Also, I will use the same definition as Lucas (1988, p. 5): A theory is “an explicit dynamic system, something that can be put on a computer and run.” And I will try to avoid constructing a model that produces “logically possible outcomes that bear no resemblance to the outcomes produced by actual economic systems” (ibid.). Keeping the model and the resulting empirical tests reasonably simple is also a goal, avoiding “kitchen-sink” models that have become so widespread in growth empirics. This implies that I will have to abstract from some potentially relevant aspects of the real world.
A severe problem in trying to explain the path of per capita GDP is that growth theories are “open-ended”. Brock and Durlauf (2001, p. 235) define open-endedness as the validity of one causal theory of growth not implying the falsity of another: “So for example, a causal relationship between inequality and growth has no implications for whether there exists a causal relationship between trade policy and growth.”
Any macroeconomic model also has to abstract from the fact that firms and households are highly heterogeneous: some firms make good profits, while others lose money. This heterogeneity is helpful under the surface of macroeconomic models because it moves the economy forward: loss-making firms will shrink, while profitable firms will grow. However, it cannot be part of a long-run growth model for forecasting purposes.
To a large extent, macroeconomic modeling is still an art. All models agree that technology is a key factor shaping economic growth - but it remains hard to exactly define and explain changes in technology. One focus of this chapter will therefore be on modeling knowledge and the barriers to technology. This chapter will also briefly evaluate which elements of post-Keynesian models, Schumpeterian models, evolutionary models and open-system models provide meaningful insights.
2.2 The basic neoclassical model
The neoclassical Solow model (Solow [1956], [1957]) remains the workhorse of growth empirics and the benchmark against which all other models are still compared. Therefore, any search for a useful growth theory has to start there. The details of the model have been outlined competently in many places (e.g. Jones [2002a]), so a short summary should suffice. Solow’s intention was rather narrow and straightforward: He wanted to explain the growth performance of a developed economy like the US and show that decreasing returns to capital imply that accumulation of physical capital alone cannot explain the long-run growth performance of an economy. Countries cannot get rich by simply accumulating more and more physical capital. Decreasing and ultimately zero (or negative) returns will prevent this. The basic Solow model is for a closed economy with only one good Y, no government involvement, a constant returns to scale technology given exogenously, a constant and exogenous savings rate s, two factors of production
capital K and labor L, and a level of technology (or total factor productivity TFP) A. The neoclassical Cobb-Douglas production function with Hicks-neutral technological progress is:[1]
$$Y_t = A_t K_t^{\alpha} L_t^{(1-\alpha)} \tag{2.1}$$
where the subscript t denotes time and α the elasticity of output with respect to capital input. Taking logs and then differentiating with respect to time leads to:
$$\hat{Y}_t = \hat{A}_t + \alpha \hat{K}_t + (1-\alpha)\hat{L}_t \tag{2.2}$$
where the hat above a variable denotes percentage changes, while a dot above a variable (used below) will denote absolute changes. The labor force L grows at the exogenous rate n, $L_t = L_0 e^{nt}$, and technology advances at the constant and exogenous rate g, $A_t = A_0 e^{gt}$. Therefore equation 2.2 can be written as:
$$\hat{Y}_t = g + \alpha \hat{K}_t + (1-\alpha)n \tag{2.3}$$
The steady state (or balanced growth path) is defined as the situation in which output Y and the capital stock K grow at the same rate, so that the capital-output ratio K/Y remains constant. I will show later that this assumption (a Kaldor stylized fact) actually holds in reality and will use it in the theoretical model as well. In any model, capital is accumulated according to:
$$\dot{K}_t = K_t - K_{t-1} = I_t - \delta K_{t-1} \tag{2.4}$$
where $I_t$ is gross investment in period t and δ the rate of depreciation of the capital stock. When using Harrod-type technological progress ($Y_t = K_t^{\alpha}(A_t L_t)^{(1-\alpha)}$), expressing all variables in intensive form (i.e. per unit of effective workers $A_t L_t$), and combining the above equations with the insight that capital per effective worker does not change in the steady state, we can derive the level of per capita GDP $y_t$ in the steady state (lower case variables will in general denote per capita amounts: $y_t = Y_t / L_t$) as:[2]
$$y_t = A_t \left( \frac{s}{n+g+\delta} \right)^{\frac{\alpha}{1-\alpha}} \tag{2.5}$$
The level of GDP depends positively on the level of technology $A_t$ and the savings rate, but negatively on population growth, the rate of technological progress and the depreciation rate.
[1] Below, technological progress is often assumed to be of the Harrod-neutral, labor-augmenting type $Y_t = K_t^{\alpha}(A_t L_t)^{(1-\alpha)}$.
[2] See for example Jones (2002a), p. 40.
Importantly, the growth rates of GDP per capita and of the capital stock per capita in the steady state are independent of the savings rate and depend only on the pace of technological progress g:
$$\hat{y}_t = \hat{k}_t = \hat{A}_t = g \tag{2.6}$$
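The implications of equations 2.5 and 2.6 can be checked numerically. The following Python sketch simulates a discrete-time version of the model with Harrod-neutral progress for two savings rates. All parameter values (α = 0.3, δ = 0.05, n = 0.01, g = 0.02, s = 0.15 and 0.30) and the function names are illustrative assumptions, not estimates from this study.

```python
import math


def simulate_solow(s, alpha=0.3, delta=0.05, n=0.01, g=0.02, periods=500):
    """Return the path of per capita output y_t = Y_t / L_t.

    All default parameter values are illustrative assumptions.
    """
    K, A, L = 1.0, 1.0, 1.0
    y_path = []
    for _ in range(periods):
        Y = K ** alpha * (A * L) ** (1 - alpha)  # Harrod-neutral production function
        y_path.append(Y / L)
        K = K + s * Y - delta * K                # accumulation as in equation 2.4, with I = sY
        A *= math.exp(g)                         # technology grows at the rate g
        L *= math.exp(n)                         # labor force grows at the rate n
    return y_path


def last_growth_rate(path):
    """Log growth rate between the final two periods of a path."""
    return math.log(path[-1] / path[-2])


low_saver = simulate_solow(s=0.15)
high_saver = simulate_solow(s=0.30)

print("long-run growth, s = 0.15:", round(last_growth_rate(low_saver), 4))
print("long-run growth, s = 0.30:", round(last_growth_rate(high_saver), 4))
print("simulated income ratio:", round(high_saver[-1] / low_saver[-1], 4))
print("ratio implied by equation 2.5:", round((0.30 / 0.15) ** (0.3 / 0.7), 4))
```

Under these assumptions both economies should end up growing at the rate g per capita, while the higher savings rate only raises the level of income by the factor implied by equation 2.5.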
In econometric terminology, the three variables y, k and A (and h in an augmented model, see chapter 6) are all driven by a single shock g. Among the three integrated variables y, k and h, there should therefore be two cointegrating relationships as will be explained in detail below. Decreasing returns make the model mathematically tractable, while technological progress prevents the marginal product of capital from falling too much and thus keeps capital accumulation going. Unfortunately, the model leaves the main driver of economic growth, g, unexplained.
The Solow model has received a wide following because it generates a stable equilibrium - in sharp contrast to the knife-edged models of Harrod and Domar proposed in the 1930s and 40s. Since real-world economies tend to move in a rather smooth way over longer periods of time (exception: the Great Depression), a theoretical model of long-run growth should be able to lead to stable equilibria. In the Solow model, this stability is possible because of the assumption that investment is always equal to savings, i.e. investment is the accommodating variable to guarantee an equilibrium. There is no role for entrepreneurs, animal spirits or expectations about the future to affect investment and growth.
Solow made clear that sustained economic growth depends on technological progress - even if he left it to future researchers to model that progress. Or as Frankel (2003, p. 189) puts it, “all the action was not in capital accumulation, but rather in the residual.” Technological progress is needed for continuing accumulation of physical capital. This insight has to carry through into any growth model and I will make use of it as well. Capital reacts endogenously to the advance in technology and therefore to the change in capital’s productivity.
2.2.1 Application in cross-country analysis
Although intended for a narrow question, Solow’s model was also used for many purposes that it was not intended for.
So any criticism linked to the Solow model is mostly a criticism of the (mis-)use of the model and not of the contribution of the Nobel laureate himself. One of these uses is cross-country applications (prominently in Mankiw, Romer and Weil [1992]). Countries with high savings rates and low rates of population growth are seen to have a higher steady state level of GDP per capita. However, Solow himself (2001, p. 283) has “been skeptical from the beginning about the interpretation of cross-country growth regressions”, partly because of problems of omitted variables and reverse causality. He also pointed out that he never used his model for developing economies because the underlying mechanisms in those countries
are quite different. Nevertheless, this is exactly what much of the empirical growth literature has done over the past 15 years. Researchers in the cross-section literature make the assumptions that g is constant and identical across countries (i.e. complete knowledge spillovers), that the depreciation rate δ is constant and identical across countries, that the coefficient α in the production function is identical across countries, and - crucially - that the initial level of technology $A_0$ is equal to a common constant $B_0$ in all countries plus a random, country-specific shock ε. Applying natural logarithms to equation 2.5 and assuming that g + δ = 2% + 3% = 5% as in Mankiw, Romer and Weil (1992) leads to the basic empirical equation:
$$\ln y_i = \ln B_0 + \frac{\alpha}{1-\alpha} \ln s_i - \frac{\alpha}{1-\alpha} \ln(n_i + 0.05) + \epsilon_i \tag{2.7}$$
The level of per capita income of country i depends only on that country’s savings rate si , its population growth rate ni and a random error. Differences in the level of technology or productivity are not explained but relegated to the random error. This approach is still the workhorse of empirical growth analyses today. However, the cross-country differences in savings and population growth rates are not nearly large enough to explain the variation in income per capita. For example, GDP per capita in the USA is more than 15 times the level in Thailand - but the US investment ratio is lower than that of Thailand and population growth is higher. Prescott (1998) is very explicit, saying that “the neoclassical model is not a theory of development.” Therefore, awareness of the cross-section models is crucial for any assessment of the literature, but this approach will not be used in my empirical model. In addition, I will show in chapter 11 that most of the assumptions used to derive equation 2.7 do not hold in reality.
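A back-of-the-envelope calculation illustrates why savings and population growth alone cannot account for large income gaps in equation 2.7. The sketch below assumes a capital share of α = 1/3 (a common textbook value, not a result of this study) and uses hypothetical savings and population growth rates; the function names are likewise only illustrative.

```python
def income_ratio(s1, n1, s2, n2, alpha=1 / 3, g_plus_delta=0.05):
    """Predicted y1 / y2 from equation 2.7 with a common ln B0 and zero errors.

    alpha = 1/3 is an assumed capital share; g + delta = 0.05 as in the text.
    """
    elasticity = alpha / (1 - alpha)
    savings_term = (s1 / s2) ** elasticity
    population_term = ((n2 + g_plus_delta) / (n1 + g_plus_delta)) ** elasticity
    return savings_term * population_term


def required_savings_ratio(target_income_ratio, alpha=1 / 3):
    """Savings-rate ratio s1 / s2 needed for a given income ratio, equal n assumed."""
    return target_income_ratio ** ((1 - alpha) / alpha)


# Hypothetical example: doubling the savings rate at equal population growth
print(round(income_ratio(s1=0.30, n1=0.01, s2=0.15, n2=0.01), 3))

# Savings-rate ratio needed to generate a 15-fold income gap
print(round(required_savings_ratio(15), 1))
```

With α = 1/3 the elasticity of income with respect to the savings rate is 0.5, so a 15-fold income gap would require a savings rate roughly 225 times higher, which is far outside anything observed.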
2.3 Focus on convergence
The practical use of the basic neoclassical model is severely limited by the fact that long-run growth g is assumed to be exogenous and identical across countries. As a result, all economies would grow at the same rate in the long run. Since growth rates differ significantly in reality, this may not be the ideal starting point for long-run growth forecasts.[3]
[3] The assumption that all countries grow at the same rate will, however, enter as one of the contenders in the forecast competition in chapter 13.
With the main driver of long-term growth outside the standard neoclassical model, the empirical growth literature quickly focused on the gap between where an economy should be in the long run and where it is today. Differences in observed growth rates across countries would then stem from different distances to their own steady states. This is the notion of conditional convergence where economies approach their own country-specific steady state path over
time. The level of that path is determined by savings rates and population growth rates as shown in equation 2.5 above. Figure 2.1 illustrates conditional convergence: the steady state levels of income advance at the same constant rate in both economies, leading to the parallel dotted lines. However, economy 1 starts out well below its steady state path, while economy 2 is slightly above its steady state. Assuming that 10% of a gap between actual and steady state GDP closes each period, the trajectory of actual GDP in economy 1 is much steeper than that of economy 2: Economy 1 will post stronger growth rates as it converges to its own steady state despite the common pace of technological progress.
[Figure: ln GDP plotted against time, showing the steady state path and the convergence trajectory of economy 1 and of economy 2]
Fig. 2.1. Two economies converging to parallel steady-state paths
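The mechanics behind Figure 2.1 can be reproduced with a few lines of Python. This is only a sketch: the starting values for the two economies and the common growth rate of 2% are hypothetical and are not taken from the figure, while the 10% gap closure per period follows the text.

```python
def convergence_paths(ln_y0, ln_ystar0, g=0.02, lam=0.10, periods=40):
    """Return (actual, steady_state) paths of log income.

    The steady state path grows at the rate g; a fraction lam of the gap
    between steady state and actual log income closes each period.
    """
    actual, steady = [ln_y0], [ln_ystar0]
    for _ in range(periods):
        gap = steady[-1] - actual[-1]
        actual.append(actual[-1] + g + lam * gap)  # trend growth plus catch-up
        steady.append(steady[-1] + g)
    return actual, steady


# Economy 1 starts well below its path, economy 2 slightly above (hypothetical values)
actual1, steady1 = convergence_paths(ln_y0=1.80, ln_ystar0=2.20)
actual2, steady2 = convergence_paths(ln_y0=2.45, ln_ystar0=2.40)

first_growth1 = actual1[1] - actual1[0]
first_growth2 = actual2[1] - actual2[0]
print("initial growth, economy 1:", round(first_growth1, 4))
print("initial growth, economy 2:", round(first_growth2, 4))
print("remaining gap, economy 1:", round(steady1[-1] - actual1[-1], 4))
```

In this setup the gap shrinks by the factor (1 − 0.10) each period, so economy 1 initially grows much faster than economy 2 although both share the same steady state growth rate.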
The drivers behind this convergence process are differences in the marginal products of capital in a production function with diminishing marginal products. Below the steady state an economy has a low capital-labor ratio and a high marginal product of capital. This leads to high capital accumulation, which then dampens the marginal product of capital on the way to equilibrium. The capital-labor ratio and output per capita both rise as they converge from below to their steady state paths. If income per capita closes the gap between its steady state level $y^*$ and last year’s level at the rate of λ, the resulting growth rate of GDP for g = 0 is:
$$\hat{y}_t = \ln y_t - \ln y_{t-1} = \lambda(\ln y^* - \ln y_{t-1}) \tag{2.8}$$
Replacing the steady state income level with its determinants βx leads to a simple convergence regression like
$$\hat{y}_t = -\lambda \ln y_{t-1} + \lambda \beta x \tag{2.9}$$
This specification is used for example by Mankiw, Romer and Weil (1992) or in forecasting by Ianchovichina and Kacker (2005). A significant coefficient on $\ln y_{t-1}$ is interpreted as evidence for conditional convergence. However, interpreting the coefficients on the exogenous variables x may turn hazardous, as they are often erroneously interpreted as measuring the unconditional impact of the variables on the growth rate of GDP. Furthermore, this specification is sometimes interpreted as cointegration between $\ln y_{t-1}$, s and n, but the literature hardly ever formally tests for either non-stationarity or cointegration.
2.3.1 Tests for conditional convergence
Given that convergence to a country’s steady state income level is the only thing that differentiates between countries’ actual growth experiences in the neoclassical framework, it is understandable that a heated debate developed over how quickly that convergence proceeds. If it is very fast, say 20% of the gap closes every year, then the idea of conditional convergence is of minor use for long-run growth analysis. On the other hand, neoclassical authors claim that a 2% convergence rate is like an “iron law.” Mankiw, Romer and Weil (1992, p. 423) even derive this 2% from the supposed parameters of the economy. However, the empirical result of a 2% convergence rate stems from the problematic cross-section regressions. As I will explain in detail in chapter 11, more appropriate techniques find convergence speeds of 10% (Caselli, Esquivel and Lefort [1996]) or even 30% (Lee, Pesaran and Smith [1997]). Therefore, as Caselli et al. put it, “economies spend most of their time in the neighborhood of the steady state.” These high speeds of adjustment make convergence an important notion for business cycle analysis (the output gap closes), but not for growth forecasts over a 10 to 15-year horizon. In addition, conditional convergence assumes a common steady state growth rate. However, Lee, Pesaran and Smith (1997, p.
375) point out that the whole idea of conditional convergence has little economic meaning because this common growth rate does not exist (the parallel dotted lines in figure 2.1). They also find that the rate of technological progress has been higher in OECD countries over the period 1960 to 1989 with a smaller dispersion as compared to the world as a whole. In general, convergence studies also suffer from sample selection bias (only successful countries provide reliable data on GDP growth) and measurement error (poor countries have a larger informal economy). Even more, Wacziarg (2002, p. 910) thinks that “With thirty or so years of data, it is simply impossible to distinguish steady-state growth from transitional growth” given estimates of a half-life of transition of around 32 years.
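The practical meaning of the different convergence speeds becomes clearer when they are translated into half-lives. The sketch below assumes the simple discrete-time rule that a constant fraction λ of the remaining gap closes each year; the function name is only illustrative, and the studies cited above may use different definitions.

```python
import math


def half_life(lam):
    """Years until half of the initial gap is closed when a fraction lam closes per year."""
    return math.log(0.5) / math.log(1.0 - lam)


for lam in (0.02, 0.10, 0.30):
    print(f"convergence rate {lam:.0%}: half-life of about {half_life(lam):.1f} years")
```

Under this rule a 2% rate corresponds to a half-life of roughly 34 years, which is of the same order as the estimate of around 32 years cited by Wacziarg, whereas rates of 10% and 30% imply half-lives of under 7 and about 2 years respectively.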
A different issue is whether countries’ GDP per capita converges to that of the world leader. This so-called “absolute convergence” would be extremely helpful for growth forecasting: the distance to the leader (“distance to frontier”) that would explain growth is reasonably easy to calculate. However, the prediction following from absolute convergence is that all countries should achieve the same level of income per capita over time - possibly by now. This is clearly not the case. Rather, Quah (1993) finds a tendency towards a two-camp world, divided between haves and have-nots. He sees no sign of absolute convergence. Likewise, Pritchett (1997) sees “divergence, big time.” Similarly, Easterly and Levine (2001) analyze economies over the period 1980 to 1992 and conclude that divergence, not convergence, is the big story. The theoretical reasoning uses the idea of a “poverty trap”, i.e. a stable steady state with low levels of per capita income possibly resulting from inferior production technology. Therefore, the simple prediction of absolute convergence is not supported empirically. Growth does not come about automatically but is a complicated process and requires hard work.
However, there may still be some value in looking at the distance to the leader. The leaders may set examples that followers may decide to apply at home. For example, Hausmann, Pritchett and Rodrik (2005) show that growth accelerations are more likely in poor countries than in rich countries. Likewise, Parente and Prescott (2000) think that any useful theory of economic development has to be able to explain the catching-up of countries like Japan and South Korea in the second half of the 20th century. Still, any catching-up is likely to be associated with policy decisions and changes in the determinants of economic growth.
As this brief review illustrates, convergence to a country-specific growth path was the focus of a large body of literature and subject to a heated debate that is still ongoing. However, as argued above, the main task of growth theory and empirics should be to explain the long-run path of an economy and not a short-run convergence to that path. The real issue is not how quickly an economy approaches its long-run growth path, but the trajectory of the long-run path itself. The main focus of this study is on that long-run path. Nevertheless, the idea of convergence will feature prominently in my empirical model. It is intimately related to the idea of cointegration. If two variables tend to move together over time, any deviation from a linear relationship between the two tends to get smaller over time. The error is being corrected, and the two variables converge to the common path.
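The link between convergence and error correction can be illustrated with a stylized simulation. In the sketch below, x follows a random walk and y corrects a fraction λ of last period's deviation from x. This is a generic textbook error-correction mechanism with hypothetical parameter values, not the empirical model estimated later in this study.

```python
import random


def simulate_error_correction(lam=0.2, periods=200, x_sd=0.05, y_sd=0.01, seed=0):
    """Return the path of the gap x_t - y_t.

    x is a random walk (non-stationary); y closes a fraction lam of last
    period's gap. All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    x, y = 0.0, -1.0                     # y starts one unit below x
    gaps = [x - y]
    for _ in range(periods):
        x_next = x + rng.gauss(0.0, x_sd)                  # random-walk driver
        y_next = y + lam * (x - y) + rng.gauss(0.0, y_sd)  # error correction
        x, y = x_next, y_next
        gaps.append(x - y)
    return gaps


gaps = simulate_error_correction()
average_late_gap = sum(abs(gap) for gap in gaps[-100:]) / 100
print("initial gap:", gaps[0])
print("average absolute gap over the last 100 periods:", round(average_late_gap, 3))
```

Although x itself wanders without bound, the deviation between the two series stays small once the initial gap has been corrected, which is the defining feature of a cointegrated pair.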
2.4 Models with deeper insights
A forecasting model would be highly unsatisfactory if growth was either predetermined outside the model or only stemmed from the convergence to the steady state. A useful model has to go significantly further, but has to remain tractable mathematically.
A very simple way of expanding the Solow model adds a third factor of production, human capital (see also chapter 6 on human capital). However, this so-called “augmented Solow model” does not add any significant further insights. The long-run growth rate of all per capita variables is still determined outside the model by g.
2.4.1 Including human capital (Lucas)
A big step forward in growth theory came with Lucas (1988) adding a second sector to the economy that produces new human capital H by using a fraction u of the existing human capital in a constant returns technology. The model maintains perfect competition, but produces a more complicated formula for long-term growth. It effectively assumes that human capital and knowledge are one and the same.[4] The law of motion for human capital in per capita terms is:
$$\dot{h}_t = h_t - h_{t-1} = B(1-u)h_t - \delta h_{t-1} \tag{2.10}$$
where $\dot{h}_t$ denotes the absolute change of human capital per capita. In Lucas’ model with intertemporally optimizing households but without spillovers of human capital, this leads to a long-run growth rate of per capita GDP of:[5]
$$\hat{y}_t = \frac{1}{\sigma}(B - \delta - \rho) \tag{2.11}$$
This growth rate depends on the underlying characteristics of the economy: The rate of risk aversion σ, the technology of teaching B, the rate of depreciation of (human) capital δ and the rate of time preference ρ determine the share of human capital devoted to teaching u. This model provides much richer insights than the Solow model because there are now several parameters that policy may be able to influence. However, the long-run growth rate is just as constant and exogenous as it is in the Solow model. And all key variables grow at the same pace per capita in the steady state:
$$\hat{y}_t = \hat{k}_t = \hat{h}_t = \frac{1}{\sigma}(B - \delta - \rho) \tag{2.12}$$
This result stems from the assumption of a constant returns production function for human capital, thereby avoiding the reason why Solow had to use g. Still, moving from a variable that drops like manna from heaven to a variable that has more to do with economic policy is very important and I will use much of this thinking below.
[4] See Aghion and Howitt (2005b), p. 2.
[5] See for example Frenkel and Hemmer (1999), p. 209ff.
Lucas’ assumption of exactly constant returns to human capital in the production function for human capital is a highly restrictive one - small deviations will make the model look starkly different. In the words of Solow (1994,
p. 51), “this version of the endogenous-growth model is very un-robust.” Also, the Lucas model has no role for active individuals like entrepreneurs and does not include explicit research and development (R&D) activity.
2.4.2 Modeling barriers to riches (Parente & Prescott)
Another useful model that has its roots in neoclassical thinking but tries to capture more aspects of reality is the “barriers to riches” approach by Parente and Prescott (e.g. 1994, 2000 and 2005). They propose to model the barriers preventing a country from using the best globally available technology as a way of explaining differences in income levels as well as some growth miracles of the 20th century. The theory of relative efficiencies (or of differences in total factor productivity TFP) presented in Parente and Prescott (2000) decomposes a country’s technology into a pure knowledge component At common to all countries and an efficiency component Ei specific to each country i, which measures the degree to which that country exploits the usable knowledge that is generally available. This Ei can also be interpreted as a measure for the quality of a country’s institutions. The production function as outlined in Parente and Prescott (2005) therefore extends the standard neoclassical production function:
$$Y_{it} = E_i A_t K_{it}^{\alpha} L_{it}^{(1-\alpha)} \tag{2.13}$$
This function gives the “maximum output that can be produced given not only the technology constraints but also the constraints on the use of technologies arising from policies” (Parente and Prescott [2000], p. 82). It is straightforward to allow Ei to vary over time as well. The efficiency component could, for example, capture work rules that require firms to use a minimum number of workers per machine or per plant. Or it could capture the cost of switching to a new technology in the form of regulations, bribes or severance packages to redundant workers. The difficulty with the Parente/Prescott approach is that efficiency is difficult to measure. However, it is impossible to combine free trade and barriers to technology at the same time: If rules prevent domestic firms from using the best available technology, then foreign firms would be able to capture the market if they were allowed to enter. Opening an economy to free trade is only sensible if barriers to technology are reduced at the same time or earlier. In principle, a country can reduce its domestic barriers without opening to free trade. But then it will not be able to make use of all the technology that is incorporated in imports or conveyed in the exchange with foreigners. The conclusion is that low barriers (good institutions) and free trade tend to go hand in hand. The Parente and Prescott model suggests that the potential for rapid growth is larger the further a country is behind the technological leader,
i.e. the lower its Ei. This is consistent with the observation that growth miracles are more likely to happen among poor than among rich countries. It also implies that careful modeling of any changes in barriers is particularly important when trying to model GDP growth in emerging markets, which tend not to engage much in R&D. Since I want to model the growth process of a large number of countries with extremely different income levels, this idea is very appealing.
2.5 Opening the theories further The Lucas model is one example of how to circumvent the inability of the Solow model to explain the evolution of long-run growth by relaxing the assumption of diminishing returns. This strand of endogenous growth theory tries to explain growth endogenously inside the model, so that it does not depend on the exogenous rate of technological progress g. The other strand of the endogenous growth theory focuses on the creation of ideas and their spillovers, trying to avoid the Solow model’s feature that accumulation of capital alone cannot make a society rich. Technological progress is modeled as a purposeful economic activity, as research and development: “Long-run growth is driven primarily by the accumulation of knowledge by forward-looking, profit-maximizing agents.”6 This strand of the literature tries “to uncover the private and public sector choices that cause the rate of growth of the residual to vary across countries” (Romer [1994c], p. 3). It drops the neoclassical assumption that the same technology is available in each country and recognizes that “technological advance comes from things that people do” (ibid, p. 12). Solow (1994, p. 51) applauds the attempt to “model the endogenous component of technological change as an integral part of the theory of economic growth.” Another term often used is “new growth theory” to distinguish it from the “old” neoclassical models that treated technology as given. According to Romer (1994a) new growth theory is mainly about the creation of new goods. He criticizes general equilibrium theory for operating within a fixed set of goods: “It is an inevitable fact of life that economies will forever operate on the boundary of goods space, that only a small subset of all possible goods will ever be introduced” (p. 19). Therefore, any useful model of economic growth will have to try to model the creation of new goods, services and ideas. 
6 Romer (1986), p. 1003.
Societies with institutions that allow ideas to be generated or applied more quickly than elsewhere tend to grow faster or to be richer. Raising the pace of knowledge acquisition is likely to raise the speed of GDP growth. This is an important insight that I will use when including human capital and trade openness in the forecasting model. Or as Solow (2001, p. 287) put it: “The good thing about the fox that Paul Romer started chasing more than 15 years
ago is that it leads us to focus on the analysis of the economic incentives to create new technology.” Modeling the creation of new ideas and knowledge is in line with the thinking of Joseph Schumpeter, who saw capitalism as an engine of progressive change. In his view the evolution of an economy is linked to individuals doing new things or doing things differently. This would be compatible with the conditions for a growth model outlined in chapter 1. According to Schumpeter, any economy is full of abrupt changes as new ideas are implemented in the quest for monopolistic profits: new firms drive out old firms and improve technology - the famous “creative destruction”. It is possible for innovations to be clustered in time partly because one innovation may disrupt the business model of other firms, which are then forced to be innovative themselves. Radical innovations may be introduced and then diffuse into the general economy with a series of incremental innovations. As a result, economic growth is not a balanced process where all sectors of an economy expand in the same way every year. Firms go bankrupt, people lose their jobs and have to find new things to do. In general, endogenous growth models offer some valuable insights into the evolution of growth and knowledge in leading developed countries. However, they do not help to explain the huge cross-country differences in incomes. As Parente (2001) makes clear, poor countries do not engage in R&D because they don’t have to. They can grow quickly simply by adopting technologies available elsewhere. This is how the Japanese in the 1970s, the Koreans in the 1980s and the Chinese in the 1990s achieved much of their stellar growth rates. The modeling of purposeful R&D activity cannot help explain these success stories. Together with the paucity of reliable data, this is enough reason to not include R&D in my model, which is supposed to capture the performance of nearly 40 highly heterogeneous countries. 
2.5.1 Models with scale effects
The 1990s saw a revival of Schumpeter’s ideas with many contributions to the endogenous growth theory. Some explicitly used “creative destruction” in the title like Aghion and Howitt (1992). However, all models with spillovers lead to so-called “scale effects” as Jones (1999) outlines in a detailed and structured way. The growth rate of per capita GDP depends positively either on the size of the population (or of human capital, see Romer [1990], Grossman and Helpman [1990], Young [1991] and Aghion and Howitt [1992]) or on the growth rate of the population (see Jones [1995b, 2002b, 2005], Segerstrom [1998], Howitt [2000] and Aghion and Howitt [2005a]). However, Parente (2001) highlights that “the prediction known as scale effect is not born out by the data”. Likewise, Bottazzi and Peri (2007) see “a clear rejection of the existence of a strong scale effect.” Large countries or countries with fast population growth do not necessarily show faster growth of GDP per capita than smaller countries. Similarly, Rudd (2000) finds little
evidence of an external effect of human capital and concludes that “human capital spillovers of the form postulated in the new growth literature are unlikely to matter much in practice.” Likewise, Acemoglu and Angrist (2000) are unable to find external returns to education that differ significantly from zero.7 In sum, the evidence is not strong enough to make models based on spillovers useful vehicles for long-run growth analysis. Parente (2001) concludes that “endogenous growth theory is not a reasonable theory of economic development.” Classifying the different models remains hazardous. Partly for good reasons, some researchers classify the models just sketched as “new neoclassical” or as Schumpeterian. The focus on knowledge is in line with the observation that rich countries tend to be those with the highest level of knowledge. The feature that knowledge is nonrivalrous and leads to increasing returns is on the one hand important to drive the endogenous models. However, this feature makes it difficult to create realistic models that do not show explosive growth paths or scale effects. Furthermore, R&D expenditures clearly are an “inadequate measure of the resources devoted to increasing productivity” (Solow 1994, p. 53) because there are many cases where the improvement of a product or the cost reduction stems from the input from workers, management or customers. The smarter these workers, managers and customers are, the higher the quality of their proposals and the higher is the level of productivity. Certainly, the activities of entrepreneurs are very important in changing the economic and social landscape. But any analysis of the role of entrepreneurship in growth such as in Beaugrand (2004) quickly turns to the environment within which the entrepreneur operates: Will he be able to hire enough qualified workers? Will he be able to reap the rewards from his activities? 
7 The empirical support that Jones (2005) derives from Alcalá and Ciccone (2004) is highly problematic (see chapter 7).
2.5.2 Evolutionary models of growth
The growth models sketched in the previous section borrow heavily from Schumpeter, who tried to combine the dynamic thinking of Marx with the historical school and with neoclassical microeconomic foundations. However, even before the contributions of Romer and others, evolutionary economists (“neo-Schumpeterians”) started to model the evolution of economies, their qualitative transformation in historical time. The benchmark book by Nelson and Winter (1982) summarizes their earlier contributions that focus on capitalism as an engine of change. These models explicitly think of growth as a process that takes place in historical time.8
8 Outlines can be found in Verspagen (2000, 2002), Fagerberg (2002) and Andersen (2004). Fagerberg highlights the difficulties of being able to publish in mainstream US journals and concludes that “evolutionary modelling does not appear to have been accepted as a welcome addition to the discipline by hardcore mainstream economics” (p. 37).
Nelson and Winter (2002, p. 31) highlight that their focus is on “how individual skills, organizational routines, advanced technologies and modern institutions come into being” in a “trial-and-error cumulative learning partly by individuals, partly by organizations, partly by society as a whole.” Unfortunately, this makes formal modeling rather difficult. The evolutionary literature usually does not link its insights to models like those of Lucas or Parente and Prescott.9 This section attempts to sketch this link and to see to what extent different theoretical models satisfy requirements from evolutionary theory.
9 Beaugrand (2004) notes that the profession keeps cultivating its differences: “articles in the Journal of Evolutionary Economics rarely quote papers from Aghion and Howitt, Lucas, Romer or Alwyn Young.”
Neoclassical models (including endogenous models) use representative agents with complete foresight who pick the best technology from a freely accessible set of technologies depending on prevailing prices and who always keep the economies in equilibrium. But - to borrow from Keynes - these are not the characteristics of the economic society in which we actually live, as a look at real-world companies and markets shows. Evolutionary models try to move closer to the real world by using bounded rationality, heterogeneity of agents, economic selection, and disequilibrium notions to model how economies evolve, i.e. their continuous change from a lower or simpler state to a higher or more complex state. Firms act under bounded rationality and use rules of thumb rather than fully rational profit maximization. They are constrained by the limits of what they know and by old habits and routines. Usually, they develop and adopt new routines when the old ones no longer work. In evolutionary economics, these routines are the equivalent to genes in biology. It is possible that companies hang on to old routines for a very long time despite new possibilities becoming available. Output would stay below a maximum possible value until companies for some reason decide that things have to change. They adopt new routines and output rises back to the production possibility frontier, possibly after temporary disruptions during the adjustment process have depressed it further.
Unfortunately, the insight that the process of growth should be analyzed at the company or at least the sectoral level is something I cannot use explicitly in my macroeconomic model. However, the pace of acquiring new routines probably depends on the pace at which human capital grows and on the pace at which a country opens to routines from abroad. Nelson (2005, p. 11) stresses that the essence of human cultural evolution is that “knowledge is accumulating in the heads of human beings.”
Evolutionary models use biological notions such as natural selection and genetic mutation. The counterpart to natural selection in biology is economic selection among the heterogeneous actors: firms with better products, strategies and routines will grow and survive. The more pressure firms experience to improve their technology, the more productive they will be - a link to the Parente and Prescott framework. Selection improves the average fitness of the firms but has a negative aspect as well in line with Schumpeter’s creative destruction: firms with old technologies will shrink or disappear. The evidence presented by Lentz and Mortensen (2005) strongly supports the view that worker reallocation from shrinking to growing firms is an important source of productivity growth - an aspect that macroeconomic models cannot capture. While economic selection reduces variety, innovation is crucial to generate new varieties. Evolutionary theory uses a very broad concept of innovation, which also includes imitation and learning, i.e. any measure that improves on a firm’s old ways of doing things in order to avoid destruction. However, formal models usually have to resort to an exogenous rate of innovation, which makes them look similar to the simple neoclassical model outlined earlier.
Krugman (1996) sees economics and evolutionary biology as sister fields because they are both about the interaction of self-interested individuals. The main difference is that economics tends to assume intelligent behavior, while biology assumes myopic (i.e. lack of foresight or discernment) behavior. In practice, however, evolutionary biology is also mostly about maximization and equilibrium - just as neoclassical economics - and less about myopia and dynamics as it tries to come up with comprehensible models.
Evolutionary models lead to conclusions that differ significantly from neoclassical models. Nelson (1981, page 42 in reprint) argues “it may be fruitful to consider the several sources of growth as being like the inputs into a cake.
All are needed.” He goes on to conclude that “where complementarity is important, it makes little sense to try to divide up the credit for growth, treating the factors as if they were not complements.” In other words, growth accounting would offer few useful insights. However, this close linkage among the factors of growth leads to the question of what ultimately determines their evolution. Nelson also refers to the “general features of the economic environment and of political and social institutions that support all three sources and the growth they promote”, which points to the same research agenda as the neoclassical growth theory has turned to in recent years: what ultimately explains the exogenous rate of technological progress that shapes all the proximate drivers of economic growth? One useful insight from evolutionary theory is that we cannot really distinguish between economic and non-economic factors when trying to explain economic growth. There are strong interactions between technology, science, culture, politics, institutions and the economy that should be taken into account. For example, at the moment we see technological progress and the rapid integration of the global economy exerting significant pressure on institutions in many European countries. But will they reshape their labor and product markets in order to take full advantage of these changes? In the words of Paul Romer (1994b), “a government must create an environment that fosters change - and progress - in the techniques we use.” This includes allowing entry
and exit of firms. Change tends to be disruptive and societies in many developed countries have institutions that try to shield their citizens from these disruptions. Ideally, a model of long-run growth should therefore try to take into account the evolution of change-inhibiting institutions (cf. Parente and Prescott). Or at least it has to model some of the symptoms of sclerosis such as a decline in labor usage in economies that have developed highly protective institutions.
Over the past decades, evolutionary economics made use of methods developed in biometrics from the 1930s to start an evolutionary econometrics or evometrics. Andersen (2004) gives an overview of the development of evometrics and illustrates a microeconomic example using selection effects and innovation effects at the firm level. When evaluating their models, evolutionary economists make heavy use of data on patents and R&D. However, this captures only the last step in their models. A deeper question would be what allows and causes these new patents and the new R&D spending? What allows new firms to develop, new technologies to be introduced? So far, evolutionary economics models only some of the issues crucial to explain long-run economic growth.
2.5.3 Open-system models
Evolutionary models point in the same direction as the “open-system” approach advocated by Chick and Dow (2005). In their definition, a system is a network, a structure with connections, within which agents act, mostly in ways which reproduce and reinforce the system, but sometimes in ways which lead the system to evolve. This is similar to the notions of routines and innovations just highlighted. An open system is a system that interacts with the outside world. The economic system is “embedded in and connected with politics, philosophy, history, values, all the elements of social life” (Chick 2003, p. 3). Closure and openness are a matter of degree and the researcher has to choose where and how to close his system.
The argument for closed systems rests on internal consistency being the only test of rigor available in economics - often at the expense of external validity or realisticness. This becomes an especially big issue when dealing with the time dimension. Economists often abolish time, for example by modeling convergence to an asymptote, which is the end-point of the analysis, often equated with a long-run result as in the neoclassical model. By contrast, open systems try to take history and initial conditions into account in the form of changing networks and institutions, conventions, social systems and behavior. This is easier said than done in practice. In most models one has to decide which variables are determined outside the model and have an impact on the system under investigation, i. e. where to close the system. In forecasting, exogenous variables are especially helpful. So keeping the necessity
of an open system approach in mind is important in trying to build as realistic models as possible. Kibritcioglu and Dibooglu (2001) also propagate an interdisciplinary approach to long-run growth analysis, taking the insights from other disciplines like demographics, history etc. into account. This is consistent with the calls from so-called broadband economists such as Fullbrook (2004), who suggest economics (a) use a broader concept of human behavior, (b) recognize culture as embedding economic activities, (c) consider history as economic reality, (d) develop a new theory of knowledge, (e) ground studies empirically, (f) expand the set of methods and (g) conduct an interdisciplinary dialogue. Going even further, Paul Romer highlighted back in 1994 the possible chaotic behavior of economies. “Any number of arbitrarily small perturbations along the way could have made the world as we know it turn out very differently” (Romer 1994a, p. 9). The next modeling step would be chaos theory, which models the complex interaction of independent agents with spontaneous self-organization, positive feedbacks and complexity. However, this is clearly too ambitious for this study. The conclusion from this overview of different theoretical approaches is that economic growth is a highly complex process: many heterogeneous agents interact in an evolving system of science, culture and politics which is largely inherited from the past. One has to be conscious of any assumptions made to create a workable model and in defining the boundaries of the model. The following chapters on the different elements in my empirical model try to show this awareness.
2.6 General critique of the standard approach
So far, this chapter has shown how difficult it is to construct a formal dynamic theoretical model that is able to endogenously explain economic growth and lead to reasonable predictions about the real world. However, the situation gets even more complicated because of the widespread use of some vehicles that are supposed to increase our understanding of the growth process: the production function and total factor productivity. The following sections will show that they can produce misleading conclusions and should therefore be avoided.
2.6.1 Production function cannot be estimated
As shown above, most theories of economic growth assume or use an aggregate production function and make statements about its coefficients. For example, the simple Solow model would imply a marginal product of capital of one third. The augmented Solow model in Mankiw, Romer and Weil (1992, p. 417) posits that the marginal product of human capital “β is between one third and one
half.” Tests for endogenous growth models tend to focus on whether α+β > 1 in the augmented model. Empirical papers estimate an equation like 2.7 to test whether the coefficients of the production function really have the postulated size. The general finding is a high $R^2$ and coefficients that match theoretical priors. However, these tests are meaningless and the high $R^2$ is not at all surprising. The fact that the coefficient on labor equals labor’s share in income should not be taken as support for any theory, as was outlined in Felipe and McCombie (2005) on which the following sections are based. The simple reason is that these tests basically only estimate the national income identity, which states that the value of output (value added) always has to equal the value of inputs, i.e. the wage sum plus the income from capital. This point was made earlier several times, for example by Herbert Simon in his 1978 Nobel Prize lecture. The income identity is:
$$Y_t \equiv w_t L_t + r_t K_t, \qquad (2.14)$$
where $w_t$ is the real wage and $r_t$ is the real interest rate. Totally differentiating the identity and assuming that factor shares are constant over time, i.e. $a_t = a = r_t K_t / Y_t$, which is a stylized fact in macroeconomics, we arrive at the accounting identity in growth form:
$$\hat{Y}_t \equiv a\hat{r}_t + a\hat{K}_t + (1-a)\hat{w}_t + (1-a)\hat{L}_t \qquad (2.15)$$
Now define $\hat{b}_t = a\hat{r}_t + (1-a)\hat{w}_t$ to get
$$\hat{Y}_t \equiv \hat{b}_t + a\hat{K}_t + (1-a)\hat{L}_t \qquad (2.16)$$
which is equivalent to the standard equation used in Solow accounting exercises with $\hat{b}_t$ called “the residual”. Integrating equation 2.16 and taking antilogarithms leads to
$$Y_t \equiv B_0\, r_t^{a}\, w_t^{(1-a)}\, K_t^{a}\, L_t^{(1-a)} \equiv B_t\, K_t^{a}\, L_t^{(1-a)}, \qquad (2.17)$$
where $B_t = B_0\, r_t^{a}\, w_t^{(1-a)}$ (see Felipe and McCombie [2005], p. 373). $B_t$ is called the “dual” measure of productivity, which can be differentiated to lead to $g_t = \hat{B}_t = (1-a)\hat{w}_t + a\hat{r}_t$. Equation 2.17 looks like a standard Cobb-Douglas production function, although it is simply a transformation of the income identity using the assumption of constant factor shares. If we use the same definition for capital accumulation as in the Solow model outlined above ($\hat{K}_t = sY_t/K_t - \delta$) and the steady-state assumption that $\hat{K}_t = \hat{Y}_t$, we can derive an equation similar to 2.7:
$$\ln y = b_0 + \ln w_t + \frac{a}{1-a}\ln r_t + \frac{a}{1-a}\ln s - \frac{a}{1-a}\ln\!\left(n + \delta + \frac{(1-a)\hat{w}_t + a\hat{r}_t}{1-a}\right) \qquad (2.18)$$
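The step from the identity 2.14 to the Cobb-Douglas-like form 2.17 can be checked numerically. The following sketch builds output and capital purely from the accounting identity with a constant capital share and then evaluates the right-hand side of 2.17. The function name, the input values and the explicit constant $B_0 = a^{-a}(1-a)^{-(1-a)}$ are assumptions made for illustration; the text leaves $B_0$ unspecified, and this value is simply the one that makes the levels match.

```python
import math


def identity_as_cobb_douglas(wage, rental, labor, capital_share):
    """Return (output from the identity, output implied by the form of eq. 2.17).

    No production function is assumed anywhere: output and capital are
    constructed only from Y = w*L + r*K and a constant capital share a.
    """
    a = capital_share
    # Labor income is (1 - a) of output, so output follows from the wage bill.
    output = wage * labor / (1.0 - a)
    # Capital income is a share a of output, so the capital value follows.
    capital = a * output / rental
    # Assumed constant of integration (hypothetical choice, see lead-in).
    b0 = a ** (-a) * (1.0 - a) ** (-(1.0 - a))
    # "Dual" productivity term B_t = B0 * r^a * w^(1-a).
    b_t = b0 * rental ** a * wage ** (1.0 - a)
    implied = b_t * capital ** a * labor ** (1.0 - a)
    return output, implied


if __name__ == "__main__":
    y, y_implied = identity_as_cobb_douglas(
        wage=20.0, rental=0.08, labor=150.0, capital_share=0.3
    )
    print(y, y_implied, math.isclose(y, y_implied, rel_tol=1e-9))
```

Whatever wage, interest rate and labor input are fed in, the two numbers coincide, which suggests that the Cobb-Douglas shape carries no information beyond the identity and the constant share.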
Since by assumption a = constant, K/Y = constant, and $r = aY/K$ = constant, it follows that $\hat{r}_t = 0$: the real interest rate does not change over time. Equation 2.18 looks like a modified version of the Mankiw, Romer and Weil model, although no assumption has been made here on the degree of returns to scale, on whether markets are competitive and on whether factors are being paid their marginal products. All that has been done is to use the income accounting identity plus the assumptions of constant factor shares and of a constant capital-output ratio. If these two assumptions are correct (since Kaldor [1957] both seem to be widely accepted), then the data must necessarily always give a near perfect fit to the model. Equations such as 2.18 are therefore not helpful when trying to test hypotheses about the Solow model or any other model with a production function.
Augmenting the standard Solow model leads to even better estimated fits for the identity if the additional variables are good proxies for $B_t$, which is driven by the real wage rate. Most likely, a measure of human capital is correlated with the real wage $w_t$ across economies. So it should not come as a surprise that the human capital-augmented Solow model improves the fit. Likewise, using fixed effects panel techniques also should improve the fit. But neither approach changes the basic fact that an identity is estimated and that no insights can be gained about the coefficients of the production function or the degree of returns to scale.
The underlying problem is the nature of the data used. The idea of a production function refers to real quantities of factor inputs such as the number of machines and the number of hours worked. However, estimates of production functions today do not and cannot use physical quantities of capital. Instead they have to rely on values which aggregate different types of capital using the prices in the base period.
However, these prices stem from the rate of profits, which is in practice derived as the residual from the national accounts identity 2.14. To illustrate this point, Felipe and McCombie (2006) simulate an economy consisting of 10 firms with constant returns to scale Cobb-Douglas production functions and output elasticities of labor of 0.25 and of capital of 0.75 with a random error to avoid multicollinearity. The firms set prices by a markup on unit labor costs worth 33%. Wages and the return on capital are the same in all firms, consistent with the assumption of perfect competition in the factor markets. Because of the mark-up, the labor share in income is 0.75, while the capital share is 0.25. When using value data for the capital stock, the regression of output on labor and capital inputs generates a near perfect fit and a coefficient on the aggregate capital stock of 0.25 - identical to the factor share but very different from the true capital elasticity of 0.75. The reason is that factors are not being paid their marginal product in the simulated economy as a result of the mark-ups. Given the focus of macroeconomic models on monopolistically competitive economies (New Keynesian theory) it seems reasonable to assume that at least one of the neoclassical assumptions is violated: factors are not being paid their marginal products.
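The logic of this first exercise can be illustrated with a stylized sketch. It is not a replication of Felipe and McCombie (2006): the number of firms, the distributions, the wage, the profit rate and the way the random error enters (a small variation of the labor share around 0.75, equivalent to a mark-up that varies slightly around 33%) are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_value_regression(n_firms=500, seed=0):
    """Regress log value added on log capital value and log labor.

    Physical output follows a Cobb-Douglas function with a capital
    elasticity of 0.75, but prices are set as a mark-up on unit labor
    costs, so the value data only reflect the income identity.
    """
    rng = np.random.default_rng(seed)
    true_elasticity_capital, true_elasticity_labor = 0.75, 0.25

    # Physical inputs and physical output (hypothetical distributions).
    capital_physical = rng.lognormal(mean=0.0, sigma=1.0, size=n_firms)
    labor = rng.lognormal(mean=0.0, sigma=1.0, size=n_firms)
    output_physical = (capital_physical ** true_elasticity_capital
                       * labor ** true_elasticity_labor)

    # Common wage and profit rate across firms (assumed values).
    wage, profit_rate = 1.0, 0.10

    # Mark-up pricing on unit labor costs: the labor share is 1/mark-up.
    # A small random variation avoids perfect multicollinearity.
    labor_share = rng.uniform(0.73, 0.77, size=n_firms)
    price = wage * labor / (labor_share * output_physical)
    value_added = price * output_physical

    # Capital is measured in value terms, backed out from profits.
    capital_value = (1.0 - labor_share) * value_added / profit_rate

    regressors = np.column_stack(
        [np.ones(n_firms), np.log(capital_value), np.log(labor)]
    )
    target = np.log(value_added)
    coefficients, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    residuals = target - regressors @ coefficients
    r_squared = 1.0 - residuals.var() / target.var()
    return coefficients, r_squared


if __name__ == "__main__":
    coef, r2 = simulate_value_regression()
    print("coefficient on capital value:", round(coef[1], 3))
    print("coefficient on labor:", round(coef[2], 3))
    print("R squared:", round(r2, 4))
```

Under these assumptions the estimated coefficients should land close to the factor shares (about 0.25 on capital and 0.75 on labor) with a near perfect fit, even though the physical capital elasticity built into the simulated firms is 0.75.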
In a second exercise Felipe and McCombie simulate a model with increasing returns to scale for all firms and a mark-up of 33%. The estimated production function now has a higher intercept, but the same coefficients of 0.25 on capital and 0.75 on labor. Had this been a test for the null hypothesis of increasing returns (α + β > 1), the hypothesis would have been erroneously rejected. The conclusion must be that it is impossible to estimate the coefficients of an aggregate production function using standard macroeconomic data. Theories of growth cannot be tested in this way because they cannot be refuted. As a consequence, I will try not to use the notion of output elasticities below and will not try to estimate or interpret them. In any case, the quality of a forecasting model does not depend on estimated elasticities being equal to the factor shares.
2.6.2 Aggregate production function does not exist
All students of economics use an aggregate production function from their first term onwards. The production function takes on different forms and purposes during the course of the studies. However, it does not have a sound theoretical foundation as was shown by Fisher (1969), retraced in Cohen and Harcourt (2003) and surveyed by Felipe and Fisher (2003). The first problem of writing Y = f (K, L, X) is the use of aggregates for output Y, capital K, labor L and other inputs X. The second problem lies in aggregating the coefficients of production functions of different firms. At least since Wicksell it is well known that capital goods cannot be measured and aggregated in physical units because of their heterogeneity: how does one add up an airplane and a printing machine? Therefore valuation measures must be used. The value of a capital good can be the cost of its production or the value of the output that it will produce in the future.
Both approaches require an interest rate (discount rate), but that interest rate is usually determined by using the amount of capital in relation to output. The circularity is clear. Kaldor (1975, p. 348) noted that the difficulty of isolating or measuring the change in the quantity of capital “makes it impossible to attribute to capital a marginal productivity of its own.” The problem is not as severe in aggregating labor input (common physical unit: hours of work) or human capital (common physical unit: years of education). However, extreme care has to be taken if one wants to express these in value terms as well (e.g. “the value of human capital in Spain is EUR 40,000 billion”). Even if capital were physically homogeneous, there would still be the second aggregation problem. The aggregation of firm production functions has been discussed since the 1940s. The survey by Felipe and Fisher (2003) lists several extremely restrictive conditions for aggregation to be valid: for example, all micro production functions have to be identical except for the coefficient of capital efficiency. Felipe and Fisher also emphasize that the usual aggregation cannot even be interpreted as an approximation of the true relationships. The production function cannot even be used as a parable as was suggested by the
Massachusetts side of the Cambridge capital controversy, because that would require the same time pattern and capital-labor ratios for all goods: machines would have to be produced in the same way as consumption goods.
2.6.3 The concept of TFP is not helpful
Simple time differentials of the production function are still widely used as so-called Solow accounting exercises. In levels (with all variables in logarithms) this is:
$$A_t = Y_t - \alpha K_t - (1-\alpha) L_t \qquad (2.19)$$
Many studies use the “Solow residual” or total factor productivity (TFP, $A_t$) as a measure of technology and try to explain it. However, the construction of TFP is tautological: countries with high income per capita will have high TFP (and high levels of physical capital per capita). There is not much to be learned here. Indeed, using the reasoning from equations 2.15 to 2.17, $A_t$ is by definition nothing but a weighted average of the wage rate and the interest rate. Barro (1999) also highlights this fact. According to Felipe and McCombie (2007) what is usually labeled TFP growth is nothing but a “measure of distributional changes”.
As was shown above, long-run growth of per capita GDP in the neoclassical model depends only on the progress of technology. The capital stock only adjusts to keep the capital-output ratio and the real return on capital constant. With constant population $\hat{L}_t = 0$ (which is a relevant approximation for many OECD countries), it follows that $\hat{Y}_t = \hat{K}_t = \hat{A}_t$. However, growth accounting would attribute a third (equal to the share of capital in national income) of output growth to the changes in the capital stock even if those are induced by the change in A. In other words, with a constant population all of the output gain should be attributed to TFP. Attributing parts of it to capital does not improve our understanding of the growth process.
Growth accounting assumes that factors are paid their marginal product, in addition to assuming a production function. Felipe and McCombie (2006) illustrate that the rate of technological progress as measured by TFP need not have any relation with the true underlying rate of technological progress. Their simulation uses a rate of technical progress at the firm level of 5% per annum for each firm. However, evaluating input changes at the factor shares leads to an estimated rate of TFP growth of only 1.5%.
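The attribution problem with a constant population can be made concrete with a small piece of arithmetic. The sketch below assumes a steady state in which output and capital grow at the rate of technological progress plus population growth, and then applies the usual growth accounting weights. The function name, the 2% rate of progress and the capital share of one third are illustrative assumptions, not figures from the studies cited.

```python
def growth_accounting(tech_growth, capital_share=1.0 / 3.0, labor_growth=0.0):
    """Split steady-state output growth the way growth accounting would.

    Assumes a constant capital-output ratio, so that capital grows at the
    same rate as output (tech_growth + labor_growth).
    """
    output_growth = tech_growth + labor_growth
    capital_growth = output_growth  # constant K/Y in the steady state
    capital_contribution = capital_share * capital_growth
    labor_contribution = (1.0 - capital_share) * labor_growth
    measured_tfp_growth = (output_growth
                           - capital_contribution
                           - labor_contribution)
    return {
        "output_growth": output_growth,
        "capital_contribution": capital_contribution,
        "labor_contribution": labor_contribution,
        "measured_tfp_growth": measured_tfp_growth,
    }


if __name__ == "__main__":
    result = growth_accounting(tech_growth=0.02)
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

With these inputs, one third of output growth is booked as a contribution of capital and only two thirds as TFP, although in this setup all growth originates from technological progress and capital merely follows.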
The conclusion is that growth accounting exercises may lead to misleading estimates of actual technological change. Factor shares depend on the relative bargaining power of labor and capital - they need not have any link to technology. Therefore, ideas of growth accounting will not be used in this study and literature based on it will not receive a prominent role below.
2.6.4 Beyond neoclassical economics
Given all these difficulties, it seems surprising that the aggregate production function and growth accounting exercises are still widely used in growth
analysis. The simple reason seems to be their good fit to the data. Although coming from a different angle, Romer (1994c, p. 10) is also aware that “if you are committed to the neoclassical mode, the kind of data [used] cannot be used to make you recant.” However, as pointed out above, this good regression fit stems from the constancy of the factor shares in income over time and - in cross-country analysis - their similarity across countries. This fit provides no information about the existence let alone the shape of an aggregate production function. The crucial discussion about an aggregate production function is not whether it exhibits constant or increasing returns to scale. The question rather is whether it exists at all. With the available data it is impossible to assess the coefficients of a production function. No dataset can refute the null hypothesis that the elasticities equal the factor shares and that there are constant returns to scale. This holds in particular for tests of Ramsey’s AK model, which is nothing but a test for whether output and capital grow at the same rates - which they do. If the aggregate production function does not exist, then there is no point in speaking of a marginal product of aggregate capital or of aggregate labor. I will therefore not try to estimate coefficients or make statements about the degree of returns to scale. This is very different from what the bulk of the empirical growth literature and prominent examples such as Mankiw, Romer and Weil (1992) or Barro and Sala-i-Martin (2004) do. In sum, the neoclassical model is not the appropriate starting point for an analysis of long-run growth. As Felipe and McCombie (2005, p. 24) put it “this neoclassical framework does not, in our opinion, help answer the central question of why some countries are richer than others.” Similarly, the unrealistic predictions of some endogenous models with their scale effects make them of limited use for forecasting real world economies. 
The model presented in the next section will abide by the requirements derived above. It will also make use of the empirical insights outlined in later chapters, thereby following Solow (1987, 2001 add-on), who suggests (as mentioned earlier) that an “alternative strategy might be to begin with unprejudiced empirical study of the speed of technological innovation” and to take insights from other branches of economics and social sciences into account.
2.7 The augmented Kaldor model

From the discussion above it should be clear that a theoretical growth model should not make use of an aggregate production function of a specific shape or use marginal productivities in the reasoning. However, some formal link between inputs and output is necessary to help structure the analysis. This takes us to Cambridge (England) economists such as Joan Robinson or Nicholas Kaldor, who have always criticized the use of aggregate production functions. Kaldor (1957, p. 591) argued that the “purpose of a theory of economic growth is to show the nature of the non-economic variables which ultimately
determine the rate at which the general level of production of an economy is growing.” Robert Solow would probably agree with this statement. In Kaldor’s model, economic growth depends on the readiness of an economy to absorb technical change, combined with the willingness to invest capital (p. 599). He used the stylized “Kaldor facts” of a constant labor share in income, a constant capital-output ratio and a constant rate of profit (real interest rate) over time. Just like Solow, Kaldor modeled a single economy.10 It is a model of the long run, i.e. output “is limited by available resources, not by effective demand” (1957, p. 593). This is also in line with Solow, who thinks of “growth theory as precisely the theory of the evolution of potential output. So it is mostly concerned with the supply side of the macroeconomy” (2001, p. 286). Both assume full use of the economy’s labor and capital input. Models deviating from the full employment assumption have been developed by Post-Keynesians such as Robinson and Kalecki and recently reviewed by Stockhammer (1999). I will not follow them except for acknowledging the development of labor usage over time. Likewise, I will not make use of models that focus on the demand side of the economy or on balance of payment constraints such as in Thirlwall (2002) because these constraints are unlikely to be binding in the long run for the countries in the sample of this study. Kaldor thought it was impossible to distinguish between a rise in output induced by new technology and one induced by additional capital because a higher capital stock per worker must inevitably have been preceded by the introduction of a superior technology. In other words, “Solow accounting” exercises would make no sense to Kaldor. Kaldor treated long-run GDP growth as exogenous. Using per-capita variables, his “Technical Progress Function” is:

$$\hat{y}_t = c + a\,\hat{k}_t \tag{2.20}$$
10 When moving to cross-country comparisons, one has to carefully analyze which coefficients are assumed to be identical across the economies (e.g. labor shares, capital-output ratios or profit rates).

Equation 2.20 is a compromise solution to the challenges outlined in the previous sections. It has the advantage of allowing some structured modeling, but the disadvantage of assuming that aggregate capital exists. And it links inputs to outputs, but in a less restrictive way than a Cobb-Douglas production function because there is no link between the coefficients and a marginal factor product. Without a link between output and input and without assuming that a capital stock exists, macroeconomic modeling would be impossible. The simple technical progress function of equation 2.20 can easily be augmented to include human capital $h$ and a variable that measures openness $o$ (capturing the $E_i$ in the Parente/Prescott model from equation 2.13). My augmented technical progress function is:

$$\hat{y}_t = a\,\hat{k}_t + b\,\dot{h}_t + (1 - a - b)\,\dot{o}_t \tag{2.21}$$
Output grows at the weighted average rate of the percentage change of physical capital and the absolute changes of human capital and trade openness. Openness is assumed to be exogenous and $\dot{h}_t$ may either be exogenous or it may be determined as in the simple Lucas model in a human capital producing sector. A similar equation appears in Bernanke and Gürkaynak (2001) and in Bottazzi and Peri (2007), though without any of the labels used here. The version in Bottazzi and Peri (2007) is $\hat{y}_t = a\hat{k}_t + b\hat{A}_t$, where $A_t$ is the stock of total available scientific and technological ideas. Equation 2.21 includes the most important drivers of growth in line with the models sketched earlier in this chapter: Human capital (discussed in more detail in chapter 6) captures knowledge in a broad sense, so equation 2.21 does not use R&D separately. Trade openness (discussed in detail in chapter 7) and human capital are closely linked to the quality of institutions, so no additional series is included for institutions - which are difficult to measure anyway. For emerging markets, openness is also a measure of how intensely an economy is making use of knowledge used or generated abroad. One aspect of human capital is that it measures the development of new knowledge in one country, whereas openness partly measures the diffusion of existing knowledge across countries.

Having established this augmented technical progress function, one can now - just as in the Solow model - invoke the stylized Kaldor-fact that physical capital and output grow at the same rate in the long run ($\hat{k} = \hat{y}$) to ensure a constant capital-output ratio. Similarly, the absolute change of human capital times a constant is equal to the growth rate of GDP per capita as I will show in the empirical part in chapter 12: $\omega\dot{h} = \hat{y}$. Inserting this into 2.21 gives

$$\hat{y}_t = a\,\hat{y}_t + (b/\omega)\,\hat{y}_t + (1 - a - b)\,\dot{o}_t \tag{2.22}$$

or

$$\hat{y}_t = \hat{k}_t = \omega\,\dot{h}_t = \mu\,\dot{o}_t , \tag{2.23}$$

where $\mu = \frac{1 - a - b}{1 - a - b/\omega}$. This close link between the growth (percentage or absolute) of the main variables in this model mirrors the results from other models summarized in equations 2.6 and 2.12 and in the evolutionary economics literature. Physical capital plays a passive role in the Kaldor model. It adjusts to the intensity of innovation driven by openness and human capital, which influences the profitability of new investments. This is in line with Prescott (1998, p. 525) arguing that “the reason that capital per worker is high in rich countries is that total factor productivities are high in rich countries.” A stronger development of openness and human capital raises the productivity of physical capital (the profit rate) and therefore the rate of capital accumulation. The capital stock per capita obeys the following rule:
$$k_t = \alpha\,y_{t-1} + \beta\,\frac{p_{t-1}}{k_{t-1}}\,y_{t-1} \tag{2.24}$$
It is equal to a coefficient α times output in the previous period plus a coefficient β times profits p in relation to the capital stock in the previous period, scaled by output in the previous period. Reformulation of this equation leads to the rule for investment per capita which depends on the change in GDP and the change in the profit ratio:

$$i_t = k_{t+1} - k_t = (y_t - y_{t-1})\left(\alpha + \beta\,\frac{p_{t-1}}{k_{t-1}}\right) + \beta\,y_t\left(\frac{p_t}{k_t} - \frac{p_{t-1}}{k_{t-1}}\right) \tag{2.25}$$

If GDP and profits both do not change then there will not be any addition to the capital stock. In principle, this model could be extended to allow human capital to react to higher GDP just as physical capital does. This possible endogeneity of human capital was highlighted by Bils and Klenow (2000). However, my empirical analysis in chapter 12 shows that GDP reacts to slow-moving human capital. This is in line with the fact that building human capital is a lengthy process that is determined by decisions of society and policy-makers several years earlier. Equilibrium in the Kaldor model is ensured by assuming that $s_P - s_L > \beta(Y/K)$, i.e. the savings rate out of profit income has to exceed the savings rate out of labor income by an amount larger than the sensitivity of the capital stock to the profit rate times the output-capital ratio.

As is the case in most economic papers, my theoretical model is to a large extent motivated by the empirical results such as the absence of scale effects, the significant link between GDP and human capital and openness, and the endogeneity of physical capital. These results will be described in detail in chapter 12. Joan Robinson (1961, p. 360) highlighted that this is common practice even if the formal layout of papers is the other way around: “An author starts from some doctrine which he wishes to defend [...] and sets out finding the least unplausible-looking assumptions that lead to the conclusions that he requires.” This chapter was about finding that “least unplausible” model - and one that matches the empirical results.
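The step from the capital rule 2.24 to the investment rule 2.25 can be checked numerically. The following is a minimal sketch, not the author's code: the function names and all parameter values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def capital_stock(y_prev, p_prev, k_prev, alpha, beta):
    """Equation 2.24: k_t = alpha*y_{t-1} + beta*(p_{t-1}/k_{t-1})*y_{t-1}."""
    return alpha * y_prev + beta * (p_prev / k_prev) * y_prev


def investment(y_t, y_prev, p_t, p_prev, k_t, k_prev, alpha, beta):
    """Equation 2.25: i_t = k_{t+1} - k_t, split into two terms."""
    # Term driven by the change in GDP
    output_term = (y_t - y_prev) * (alpha + beta * p_prev / k_prev)
    # Term driven by the change in the profit ratio
    profit_term = beta * y_t * (p_t / k_t - p_prev / k_prev)
    return output_term + profit_term


# Hypothetical values (assumptions for illustration only)
alpha, beta = 2.5, 5.0
y0, p0, k0 = 100.0, 30.0, 300.0   # period t-1
y1, p1 = 103.0, 31.5              # period t

k1 = capital_stock(y0, p0, k0, alpha, beta)   # k_t
k2 = capital_stock(y1, p1, k1, alpha, beta)   # k_{t+1}
i1 = investment(y1, y0, p1, p0, k1, k0, alpha, beta)

# Both routes to investment agree up to floating-point error
print(round(k2 - k1, 6), round(i1, 6))
```

With unchanged GDP and an unchanged profit ratio both terms are zero, which matches the statement in the text that there is then no addition to the capital stock.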
I will show in chapter 12 that the evolution of output, physical capital, human capital and openness indeed follows equation 2.23, which is the same as saying that there are three cointegrating relationships in the paths of these four variables - they are pair-wise cointegrated. The evidence for physical capital reacting to output and to the deviation of the capital-output ratio as in 2.24 is more mixed. In addition, the levels of the grand ratios such as the capital-output ratio turn out to be quite different across countries. The following seven chapters will prepare the ground for the empirical analysis thereafter by presenting datasets on GDP and its main determinants. Each chapter on the determinants includes an overview of the theoretical and empirical literature and concludes with an assessment of how to best model this variable.
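The long-run relationship in equation 2.23 can be illustrated with a short sketch. The parameter values for a, b and ω below are hypothetical, as is the assumed annual change in openness; the code only verifies that the growth rates implied by 2.23 are consistent with the augmented technical progress function 2.21.

```python
def mu(a, b, omega):
    """mu = (1 - a - b) / (1 - a - b/omega), obtained by solving 2.22 for output growth."""
    return (1 - a - b) / (1 - a - b / omega)


def output_growth(k_hat, h_dot, o_dot, a, b):
    """Equation 2.21: y_hat = a*k_hat + b*h_dot + (1 - a - b)*o_dot."""
    return a * k_hat + b * h_dot + (1 - a - b) * o_dot


# Hypothetical parameters, for illustration only
a, b, omega = 0.3, 0.4, 2.0
o_dot = 0.01                       # assumed absolute change in openness

y_hat = mu(a, b, omega) * o_dot    # equation 2.23
k_hat = y_hat                      # constant capital-output ratio
h_dot = y_hat / omega              # omega * h_dot = y_hat

# Feeding these rates back into 2.21 reproduces y_hat
check = output_growth(k_hat, h_dot, o_dot, a, b)
print(y_hat, check)
```

Under these assumed values μ equals 0.6, so output and capital would grow at 0.6 times the annual change in openness.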
3 The dependent variable: GDP growth
The purpose of this study is to analyze and forecast the growth rate of gross domestic product (GDP), i. e. of the value of all final goods and services produced in an economy. This is the most widely used measure of an economy’s success. The growth rates of total GDP are in focus when international organizations or the media discuss an economy’s performance. By contrast, GDP per capita is more useful when comparing the economic situation of individuals. It is often labeled (labor-)productivity and will be the main focus of this study. Table 3.2 on page 40 shows that most studies in the area of growth empirics focus on GDP per capita. Some studies focus on GDP per capita of the working age population, usually referring to people aged 15 to 64. However, it is not clear why people automatically become unable to work on their 65th birthday. A more meaningful measure of labor productivity is GDP per worker or per hour worked, which also takes into account that working hours differ significantly across countries. While I will not investigate GDP per hour worked, labor input as measured by overall hours will be an important variable to help explain GDP per capita. It is well known that GDP is not a measure of wellbeing. GDP tends to rise after natural disasters because the rebuilding activities enter GDP, while the value destroyed by, say, a hurricane is not accounted for. Likewise, GDP only measures market production. However, many countries have a significant amount of home and non-market production. In continental European countries, for example, high taxes on labor contribute to a lot of home production while non-market production might be particularly important in emerging markets. If the share of these activities varies over time, then economic fundamentals such as human capital may only explain parts of measured GDP growth. 
For example, strong growth rates in many Asian economies may stem partly from non-market activities becoming market activities and therefore being included in official GDP estimates - which also may help explain why some countries’ incomes have converged to the world leader.
3.1 Choosing the appropriate data source

The dataset for GDP has to satisfy several requirements besides being calculated according to a common standard. Firstly, it has to provide a history at least back to 1970 to allow panel estimation. Secondly, it has to provide reasonably current data, with a lag of at most two years. Thirdly, since I want to compare my forecasts with others, the growth rates have to be equal to those published and forecast by the IMF or the OECD. And fourthly, GDP data have to be available for around 40 rich and poor countries around the world. Choosing a dataset is not a task of minor importance: Hanousek et al. (2004) show that the results of studies on growth determinants are sensitive to the choice of database.

Four datasets satisfy at least some of the conditions outlined above: the IMF’s International Financial Statistics (IFS), the OECD’s Economic Outlook database (ECO), the Penn World Table (PWT) and the World Bank’s World Development Indicators (WDI). The IMF’s IFS provide history for GDP only from 1980 as index levels and their series were full of errors when I first retrieved the data.1 For these reasons, IFS data will not be used here. The OECD’s ECO database is not too helpful for the purpose of this study because it only covers OECD countries.

The most widely used database for academic work (see table 3.2 at the end of this chapter) is the Penn World Table, available as release 6.1 (PWT 6.1) in early 2006. Its major advantage is the easy and free accessibility.2 It provides, among others, nominal and real GDP data per capita for 168 countries from 1950 to 2000. This database was first made available in the early 1980s and became very popular in the early 1990s with the release of version 5 (Summers and Heston [1991]). The PWT uses domestic currency expenditures for consumption, investment and government spending and transforms each component into “international prices” using purchasing power parity exchange rates.
These purchasing power parity (PPP) components are then added up to total GDP. Real GDP per capita is available with 1996 as the base year (rgdpl) and as a chain-weighted series (rgdpch). The main disadvantages for my purpose are that all series end in the year 2000 and that the annual growth rates do not equal those reported by international organizations or the media. Heston and Summers (1996, p. 24) point out that many researchers used the PWT growth rates “unaware that the rates they obtained are not the same as the rates implicit in the countries’ own national accounts.” In addition, the PWT showed some bugs. For example, chain-weighted real GDP in Spain jumped by 10.3% in 1995 only to fall back by 4.8% the next year. None of the other databases showed these changes.

1 In cooperation with the database agent Global Insight, I asked the IMF to upload corrected data for Belgium, Finland, and Greece. But some large differences between the IFS database and the IMF’s own semi-annual World Economic Outlook remained.
2 See Heston et al. (2002). Version 6.2 was released in late 2006. Data are available on pwt.econ.upenn.edu

The most reliable and comprehensive database appears to be the World Bank’s World Development Indicators (WDI). The WDI provides long histories on GDP at constant prices in local currency units and in international dollars for a large number of countries. The database is updated once a year in the spring. The only small downside is that the most recent data available stem from two years prior to the release year. The WDI is the database used in the empirical analysis in this study.

The main focus of this study is on growth rates of GDP. However, for policy analysis cross-country comparisons of levels of GDP in a common currency are necessary as well. The WDI, OECD ECO and PWT databases all provide GDP in PPP US dollars, converting local currency values using purchasing power parity exchange rates (in the case of the PWT and now also the WDI called “international dollars”). All rely on information from the United Nations’ International Comparison Project (ICP).

Table 3.1. The weights in world GDP of the 40 countries considered

| 21 rich countries | % | 19 emerging markets | % | Not included | % |
|---|---|---|---|---|---|
| United States | 20.10 | China | 15.41 | Russia | 2.58 |
| Japan | 6.40 | India | 5.95 | Iran | 0.91 |
| Germany | 4.13 | Brazil | 2.58 | Poland | 0.81 |
| UK | 3.00 | Mexico | 1.76 | Pakistan | 0.66 |
| France | 3.00 | Korea | 1.63 | Saudi Arabia | 0.58 |
| Italy | 2.73 | Indonesia | 1.60 | Ukraine | 0.55 |
| Canada | 1.81 | Taiwan | 1.03 | Bangladesh | 0.50 |
| Spain | 1.78 | South Africa | 0.93 | Vietnam | 0.41 |
| Australia | 1.03 | Turkey | 0.93 | Algeria | 0.39 |
| Netherlands | 0.82 | Thailand | 0.89 | Hong Kong | 0.38 |
| Belgium | 0.53 | Argentina | 0.87 | Romania | 0.31 |
| Austria | 0.45 | Philippines | 0.68 | Czech Rep. | 0.31 |
| Sweden | 0.44 | Colombia | 0.55 | Hungary | 0.28 |
| Greece | 0.41 | Egypt | 0.50 | Peru | 0.27 |
| Switzerland | 0.39 | Malaysia | 0.48 | Venezuela | 0.27 |
| Portugal | 0.33 | Chile | 0.32 | Morocco | 0.22 |
| Norway | 0.32 | Nigeria | 0.28 | UAE | 0.21 |
| Denmark | 0.31 | Israel | 0.26 | Kazakhstan | 0.21 |
| Ireland | 0.28 | Singapore | 0.20 | | |
| Finland | 0.27 | | | | |
| New Zealand | 0.17 | | | | |
| Sum 21: | 48.70 | Sum 19: | 36.9 | | |
| | | Sum 40: | 85.6 | | |

Table shows PPP weights in world GDP in 2005 in %. Source: IMF WEO April 2006
The GDP data used here are in 2000 international dollars. The starting point is the level of GDP from the WDI database for the year 2000 for each country. Earlier and later years are calculated using the growth rates from the series in local currency units. This procedure maintains the “headline” growth rates of GDP but makes the levels roughly comparable across countries. The levels of overall GDP are then divided by population levels to get GDP per capita. Figures 3.2 and 3.3 at the end of this chapter plot the trajectories for GDP per capita. My country sample covers 85.6% of world GDP in purchasing power parities in 2005. Table 3.1 lists the countries included with their respective shares in world GDP according to the IMF’s April 2006 World Economic Outlook using PPP exchange rates. The most important countries not included in my sample are transition countries (e.g. Russia, Poland, Ukraine and Vietnam) because not enough time-series history is available and because they have been through a major structural break. Oil exporters like Iran and Saudi Arabia are just as impossible to model based on growth theory. And countries such as Pakistan or Bangladesh lack some of the data needed for a fundamental model (e.g. years of education).
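The construction of the level series described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually coded for this study: the function name `chain_levels` is hypothetical and all numbers are made up.

```python
def chain_levels(base_year, base_level, growth):
    """Build a level series from one base-year level and annual growth rates.

    growth maps a year to the growth rate of that year over the previous
    year (0.03 means 3%). Years are assumed to be consecutive.
    """
    levels = {base_year: base_level}
    # Forward from the base year: level_t = level_{t-1} * (1 + g_t)
    year = base_year + 1
    while year in growth:
        levels[year] = levels[year - 1] * (1.0 + growth[year])
        year += 1
    # Backward from the base year: level_{t-1} = level_t / (1 + g_t)
    year = base_year
    while year in growth:
        levels[year - 1] = levels[year] / (1.0 + growth[year])
        year -= 1
    return dict(sorted(levels.items()))


# Hypothetical inputs: growth rates from a local-currency series and a
# year-2000 level in international dollars (not actual WDI data)
growth_lcu = {1999: 0.02, 2000: 0.03, 2001: 0.01, 2002: 0.02}
gdp = chain_levels(2000, 1000.0, growth_lcu)

# Dividing by (hypothetical) population levels gives GDP per capita
population = {1998: 9.8, 1999: 9.9, 2000: 10.0, 2001: 10.1, 2002: 10.2}
gdp_per_capita = {year: gdp[year] / population[year] for year in gdp}
```

By construction the resulting series reproduces the local-currency growth rates in every year while its level is anchored in the base year, which is the property the text calls maintaining the “headline” growth rates.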
[Figure 3.1: bar chart of percentage differences to the level in the USA in 2000, for PPP GDP per capita and PPP GDP per hour worked. Countries shown: Greece, Portugal, Japan, New Zealand, Spain, Australia, Switzerland, Canada, Sweden, UK, Finland, Denmark, Italy, USA, Austria, Germany, Netherlands, Ireland, France, Belgium, Norway.]

Fig. 3.1. Large differences in distance to frontier
The different databases provide slightly different pictures of how rich the countries are relative to the USA as measured by GDP per capita. On average, the PWT income levels are higher than the WDI levels in the poor countries and lower in the rich countries. The biggest difference appears for Norway
which the PWT sees 19% below the USA in 2000, while the WDI sees it 3% above the US level. By contrast, Chile’s GDP per capita was 72.9% below the US level in 2000 according to WDI, while it was 70.2% below in the PWT. Dowrick (2002) argues that the “true” ratio may lie somewhere in between the two. Since data have to be as up to date as possible for forecasting purposes, the WDI database is the preferred choice, but one has to keep in mind that comparing income levels is a hazardous task.

The literature on the distance to frontier models economic growth depending on how far a country’s productivity is away from the technology leader - usually the USA. However, these studies tend to use GDP per capita or total factor productivity per capita. A good example is Vandenbussche et al. (2006), who calculate total factor productivity as output per adult minus capital per adult multiplied by capital’s share in income. No account is taken of differences in working hours. However, these can make a significant difference as figure 3.1 shows. While French GDP per capita was 26% below the US level in 2000, French GDP per hour worked was almost 10% above the US level according to the WDI data. Similar conclusions hold for other continental European economies with their low working hours per capita.
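The French example can be turned into a rough estimate of relative working hours, because GDP per capita equals GDP per hour worked times hours worked per capita. The sketch below uses only the rounded gaps quoted in the text; the function name is hypothetical, and treating “almost 10%” as exactly 10% is an approximation.

```python
def relative_hours_per_capita(gap_per_capita, gap_per_hour):
    """Hours per capita relative to the benchmark country.

    From GDP/population = (GDP/hour) * (hours/population), the ratio of
    hours per capita is (1 + gap_per_capita) / (1 + gap_per_hour), with
    gaps expressed as fractions (-0.26 means 26% below the benchmark).
    """
    return (1.0 + gap_per_capita) / (1.0 + gap_per_hour)


# France versus the USA in 2000, using the rounded gaps from the text
france = relative_hours_per_capita(-0.26, 0.10)
print(f"French hours per capita relative to the USA: {france:.2f}")
```

Under these rounded inputs French hours worked per capita come out at roughly two thirds of the US level, which illustrates why the choice between per capita and per hour measures matters for distance-to-frontier comparisons.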
Table 3.2. Databases and sample sizes

| Author(s) | Year | Specification | Measure | n | Sample length | Source |
|---|---|---|---|---|---|---|
| Barro | 1991 | Growth rate of | GDP per capita | 98 | 1960-85 | PWT 4 |
| DeLong & Summers | 1991 | Growth rate of | GDP per worker | 25 | 1960-85 | PWT 5 |
| Levine & Renelt | 1992 | Growth rate of | GDP per capita | 98 | 1960-89 | PWT 5 |
| Mankiw, Romer & Weil | 1992 | Level & growth of | GDP per capita 15-64 | 98 | 1985, 1960-85 | PWT 5 |
| Loayza | 1994 | Level of | GDP per worker | 98 | 1960-85 | PWT 5 |
| Islam | 1995 | Level of | GDP per capita | 96 | 1960-85 | PWT 5 |
| Caselli, Esquivel & Lefort | 1996 | Growth rate of | GDP per capita | 97 | 1960-85 | PWT 5 |
| Klenow & Rodriguez-Clare | 1997 | Level & growth of | GDP per worker | 98 | 1960-85 | PWT 5 |
| Lee, Pesaran & Smith | 1997 | Level of | GDP per capita | 102 | 1960-89 | PWT 5.6 |
| Sala-i-Martin | 1997 | Growth rate of | GDP per capita | n/a | 1960-92 | PWT 5 |
| Edwards | 1998 | Growth rate of | TFP | 93 | 1960-90 | Nehru & Dhare. |
| Evans | 1998 | Level of | GDP per capita | 54 | 1960-89 | PWT 5 |
| Frankel & Romer | 1999 | Level of | GDP per capita | 98 | 1985 | PWT |
| Hall & Jones | 1999 | Level of | GDP per worker | 127 | 1960-88 | PWT 5.6 |
| Judson & Owen | 1999 | Growth rate of | GDP per capita | 64 | 1950-88 | PWT 5 |
| Acemoglu, Johnson & Robinson | 2001 | Level of | GDP per capita | 69 | 1995 | WDI |
| Bassanini, Scarpetta & Hemmings | 2001 | Growth rate of | GDP per capita 15-64 | 21 | 1971-98 | ECO |
| Easterly & Levine | 2001 | Growth rate of | GDP per capita | 73 | 1960-95 | PWT 5.6 |
| Fernandez, Ley & Steel | 2001 | Growth rate of | GDP per capita | 72 | 1960-92 | PWT 5 |
| Krueger & Lindahl | 2001 | Growth rate of | GDP per capita | 110 | 1960-90 | PWT 5.6 |
| Sarno | 2001 | Level of | GDP per capita | 7 | 1950-92 | PWT 5 |
| Dowrick & Rogers | 2002 | Growth rate of | GDP per worker | 51 | 1970-90 | PWT 5.6 |
| Bosworth & Collins | 2003 | Growth rate of | GDP per worker | 84 | 1960-00 | WDI & OECD |
| Alcala & Ciccone | 2004 | Level of | GDP per worker | 138 | 1985 | PWT 5.6 |
| Barro & Sala-i-Martin | 2004 | Growth rate of | GDP per capita | 86 | 1965-95 | PWT 6.1 |
| Batista & Zalduendo | 2004 | Growth rate of | GDP per capita | 89 | 1961-00 | IMF WEO |
| Bond, Leblebicioglu & Schiantarelli | 2004 | Level & growth of | GDP per worker | 98 | 1960-98 | PWT 6.0 |
| Doppelhofer, Miller & Sala-i-Martin | 2004 | Growth rate of | GDP per capita | 98 | 1960-92 | PWT 5 |
| Hauk & Wacziarg | 2004 | Growth rate of | GDP per capita | 69 | 1960-00 | PWT 6.1 |
| Rodrik, Subramanian & Trebbi | 2004 | Level of | GDP per capita | 140 | 1995 | PWT 6.0 |
| Hausmann, Pritchett & Rodrik | 2005 | Growth rate of | GDP per capita | n/a | 1950-99 | PWT 6.1 & WDI |
| Ianchovichina & Kacker | 2005 | Growth rate of | GDP per capita | 70 | 1966-99 | PWT 6.1 & WDI |
| This study | | Level & growth of | GDP per capita | 40 | 1971-03 | WDI |

Note: Table is not comprehensive. Some papers may use several measures and sample sizes.
[Figure 3.2: line chart of GDP per capita in PPP USD of 2000, annual data from 1971. Labeled series: Norway, Ireland, USA, Switzerland, Portugal.]

Fig. 3.2. GDP per capita in 21 rich countries
[Figure 3.3: line chart of GDP per capita in PPP USD of 2000, annual data from 1971. Labeled series: Singapore, Korea, Israel, Taiwan, Argentina.]

Fig. 3.3. GDP per capita in 19 emerging markets
4 Labor input
As mentioned in chapter 2, many empirical growth models use the growth rate of the population to explain the level of GDP per capita. However, while there is a negative correlation between these two variables, the causality runs primarily from income to population growth as this chapter will argue. Therefore, the rate of population growth (or the fertility rate) will not appear in my model of per capita GDP. However, population growth is crucial for calculating forecasts for overall GDP growth from the forecasts for per capita growth. Conversely, a variable that is usually not included in empirical growth models receives a significant role in my framework: hours worked. The usual assumption in the growth literature is that utilization rates of labor supply are constant. However, this is not the case in most countries. Furthermore, the age structure of the population may be important for the evolution of GDP: a large share of experienced workers in the overall population is likely to be positive for average income as outlined in the work of Malmberg and Lindh (2004a and 2004b).
4.1 Population growth is endogenous

The presumed negative impact of population growth on income levels that is still used in many growth models has deep roots in classical and neoclassical analysis. Classical economists like Malthus and Ricardo relied on biology to model population growth. If there is more food, animals and humans will have more offspring. But with a fixed supply of land and slow technological progress, the larger population will then depress the average consumption possibilities and eventually birth rates. In their time in the late 18th century this might have been an appropriate description of economic reality, but it is no longer relevant in most economies in the 20th and 21st centuries. In the neoclassical model pioneered by Solow (1956, 1957), population growth is assumed to be an exogenous variable with a constant growth rate of
n. Higher population growth reduces the steady state level of GDP per worker because the additional workers have to be equipped with physical capital which reduces the average capital available per worker. Therefore, steady state labor productivity is lower. The message that many empirical economists took from this reasoning is that countries with faster population growth will be poorer than countries with slower population growth. These theoretical considerations led to development policies in the 1960s and 1970s focusing on birth control as a path to higher income.

Endogenous growth models with spillovers kept the assumption that population levels or growth rates were exogenous. However, in these models a higher level of population tends to lead to higher growth rates because knowledge spillovers are more powerful if there are more people nearby. As outlined above, these scale effects do not reflect macroeconomic reality. Why are the one billion people of India so much poorer than the 5 million people of Finland? In the second generation of endogenous models the rate of population growth has a positive impact on the rate of per-capita GDP growth (e.g. Jones [1995b, 2002] or Segerstrom [1998]). Again, as outlined in chapter 2 this also does not reflect reality.

Keynesian models also tend to see a positive impact of population growth on GDP as more people bring a stimulus to demand, investment and income growth. High immigration into Germany around the time of reunification appears to be an example of this link. However, there is unlikely to be a positive impact on per capita income. I side with Becker, Glaeser and Murphy (1999, p. 146) who conclude that “in the modern view, the growth in per capita income during the past 150 years has little to do with population” and with Easterly (2001, p. 91) “that there is no evidence one way or the other that population growth affects per capita growth.” Likewise, Temple (1999, p. 142) concludes that “the popular belief that population growth is economically harmful is not yet well supported by statistical evidence.”

In general, population growth does not affect per capita income growth - nor is it fixed or exogenous. Over the past 40 years, average population growth in my sample of 21 rich countries has fallen from 1% per annum to 0.5%, while it fell from 2.5% to 1.1% in the 19 emerging markets. Fertility rates and population growth depend on the level of income. Richer societies tend to have lower birth rates. Partly responsible are the better health systems in rich societies which lead to lower child mortality rates. Easterly (2002, p. 95) calls “development, the best contraceptive.” In addition, the opportunity cost of having children goes up (foregone earnings and costs of educating children) as income rises. Barro (1991, p. 422) already pointed out that “people shift from saving in the form of children to saving in the form of physical and human capital.” This is in line with the model of Galor and Weil (2000), who argue that technological progress raises the return on human capital and induces a substitution from the quantity of children to the quality. Furthermore, large social security systems in rich
countries imply less dependence on own children for old-age provision, which may lower birth rates as well. Because fewer children die and those surviving are better educated, fewer have to be born to support their parents in old age. This introduces a link from high income to low birth rates. And it explains why cross-section regressions find a significant relationship between birth rates and income. Net migration rates are becoming an ever more important source of variation in population growth rates. The decision to migrate is partly driven by migration policies, but also by the relative attractiveness of countries according to gravity models. Countries with higher levels and growth rates of income tend to attract immigration. This was not part of neoclassical models, which focus mainly on closed economies. My conclusion is that population growth should not be treated as an exogenous variable despite the great regression fit both in cross-section and in panel models. When trying to forecast GDP growth, a regression of GDP on population growth will probably have high explanatory power. But population forecasts use the development of GDP as an explanatory variable. This circular reference should be avoided. Mankiw, Romer and Weil (1992, p. 411) relegate the issue of endogenous population growth to a footnote, admitting that ”if s and n are endogenous and influenced by the level of income, then estimates [...] are potentially inconsistent”. They recommend the use of appropriate instruments, but acknowledge that this is a formidable task. Admittedly, there are cases where population growth is exogenous. The prime example is China, where the government instituted a one-child policy in the late 1970s, which reduced population growth far below the level indicated by income and education. However, this is probably the exception rather than the rule. 
Population data are still necessary to calculate per capita levels of the other variables and to calculate forecasts for total GDP from those for per capita GDP. For total mid-year population levels until 2005, I use the data from the Groningen Growth and Development Centre.1 The forecasts for annual percentage changes until 2020 stem from the UN Population Division’s 2004 revision. I interpolated the levels published for five-year intervals using a spline and appended the resulting annual changes to the Groningen data to calculate population levels. In some cases this leads to jumps in growth rates between 2004 and 2005 (Switzerland, South Africa, Singapore and Israel), as the UN forecasts are based on sometimes outdated assumptions. This serves as a warning that the UN forecasts - just as any other forecasts - may not always materialize.

1 http://www.ggdc.net/dseries/totecon.html
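The data handling described above can be sketched in a few steps. Note the simplification: the text uses a spline, whereas this sketch swaps in log-linear interpolation (constant growth within each five-year interval) to stay short and free of external libraries. All function names and numbers are hypothetical and do not come from the UN or Groningen data.

```python
import math


def interpolate_annual(levels_5y):
    """Log-linear interpolation of levels published at five-year intervals.

    Simplified stand-in for the spline mentioned in the text.
    """
    years = sorted(levels_5y)
    annual = {}
    for start, end in zip(years, years[1:]):
        # Constant log growth per year within the interval
        step = math.log(levels_5y[end] / levels_5y[start]) / (end - start)
        for offset in range(end - start):
            annual[start + offset] = levels_5y[start] * math.exp(step * offset)
    annual[years[-1]] = levels_5y[years[-1]]
    return annual


def append_growth(history, forecast_levels):
    """Extend the historical series with forecast growth rates, not levels."""
    extended = dict(history)
    last = max(history)
    for year in sorted(y for y in forecast_levels if y > last):
        growth_factor = forecast_levels[year] / forecast_levels[year - 1]
        extended[year] = extended[year - 1] * growth_factor
    return extended


def total_gdp_growth(per_capita_growth, population_growth):
    """Total GDP growth implied by per capita growth and population growth."""
    return (1.0 + per_capita_growth) * (1.0 + population_growth) - 1.0


# Hypothetical numbers in millions of people
un_levels = {2005: 100.0, 2010: 105.0, 2015: 109.0}
history = {2003: 98.2, 2004: 98.6, 2005: 99.0}

un_annual = interpolate_annual(un_levels)
population = append_growth(history, un_annual)
```

Appending growth rates instead of levels keeps the historical level intact even where the two sources disagree about it, at the cost of possible jumps in growth rates at the splice point, as the text notes for some countries.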
4.2 Hours worked per capita are important

While population growth - one of the most frequently used variables in growth empirics - should therefore play no role in models for per capita GDP growth, another less frequently used variable should play an important role: hours worked per year and per capita of the total population. The use of per capita data (instead of per worker) is the natural choice for a study that focuses on per capita GDP (rather than per employee), a point made in detail by Ramey and Francis (2006). This variable captures the significant variation over time and across countries in the overall usage of the available labor resources in market activities. The variation may stem from differences in the hours worked per worker, from different unemployment rates (which determine the ratio of workers to the labor force) and/or from different labor force participation rates:

$$\frac{\text{hours}}{\text{population}} = \frac{\text{hours}}{\text{worker}} \cdot \frac{\text{worker}}{\text{labor force}} \cdot \frac{\text{labor force}}{\text{population}} \tag{4.1}$$
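To make the decomposition in equation 4.1 concrete, a minimal sketch follows. The function name and the input values are hypothetical and not taken from the book's data set; the employment rate is understood as workers divided by the labor force, i.e. one minus the unemployment rate.

```python
def hours_per_capita(hours_per_worker, employment_rate, participation_rate):
    """Annual hours worked per head of the total population (equation 4.1).

    employment_rate    = workers / labor force (one minus the unemployment rate)
    participation_rate = labor force / total population
    """
    return hours_per_worker * employment_rate * participation_rate


# Hypothetical inputs, not taken from the book's data set.
country_a = hours_per_capita(1800.0, 0.95, 0.50)
country_b = hours_per_capita(1500.0, 0.90, 0.45)
```

In this invented example the two countries differ in all three components, so the gap in hours per capita is wider than the gap in hours per worker alone.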
As was shown in the previous chapter, the high levels of GDP per capita in the USA and in Japan owe a lot to the high level of annual hours worked per capita there. Similarly, the slow growth rate of GDP in Germany over the past decades stems partly from the shrinking labor input per capita.² Figure 4.1 below shows the trajectories for the 21 rich countries.

² On the other hand, there is more household production in many European economies, which does not affect GDP but should affect overall welfare.

There should be a positive link between hours worked per capita and GDP per capita, but a negative link to GDP per hour worked: the least productive people are usually the first to exit employment, so measured hourly productivity of those who remain employed will be higher. Likewise, starting from a 50-hour workweek for example, fewer hours worked per employee could mean that productivity goes up because workers are well-rested after a long weekend, or are not tired if the working day ends after six hours.

The long-run trend of hours worked per year is largely exogenous to per capita GDP, being determined by taxes, labor market institutions and preferences. In the debate about differences in income and growth between Europe and the US, Olivier Blanchard (2006) and Robert Gordon (2004) highlight the role of both institutions (regulation, taxes etc.) and preferences for low and falling hours worked in Europe. Prescott (2004) sees tax rates as explaining why Americans work 50% more in the market economy than many Europeans. Furthermore, Ramey and Francis (2006) argue that the increase in hours spent in school can explain a large part of the decline in hours worked, so that there need not be an increase in leisure.

In emerging markets, where hours worked are still high today, working hours will have to be an important element in long-run forecasts of GDP over time. Some countries will decide to use the additional income and productivity to extend their leisure time, others will decide to continue to work as long as they are doing today. The resulting GDP growth rates are likely to differ significantly. In short, any forecasting model has to include a variable for labor input, but one that depends not simply on the level of income but on society’s decisions.

As mentioned above, the empirical growth literature tends to omit this important element in explaining differences in levels and growth rates of GDP across countries. Hall and Jones (1999, p. 8) recognize the omission: “We do not have data on hours worked for most countries, so we use the number of workers instead to measure labor input.” Bassanini, Scarpetta and Hemmings (2001, p. 53) also are aware of the omission - if only in the data appendix: “However, [...] employment rates have changed significantly over time in most Member Countries and in particular in Continental Europe where significant declines were recorded in the 1980s and 1990s. Under these conditions, a specification in GDP per person employed is likely to yield different results from that in GDP per capita.” Bassanini, Scarpetta and Visco (2000) comprehensively consider hours worked.

Limited availability of data is the standard explanation why hours worked are not considered in large cross-country studies. There are no series for hours worked in the Penn World Table or in the World Bank’s WDI. For smaller samples, the OECD publishes data in its Economic Outlook database. The Groningen Growth and Development Centre publishes hours actually worked per person employed per year for 40 countries. These data combine a variety of sources and are quite up to date and therefore useful for forecasting purposes. From the data on hours worked per employee combined with the employment and population data, hours worked per year per capita of the overall population for the 21 rich countries are derived.
Unfortunately, lack of data prevents similar calculations for the emerging markets sample. In the rich country group, hours worked per year have tended downward since the 1960s, although this trend seems to have slowed or even stopped since the 1980s. From 860 hours worked per year and per capita in the early 1960s on average in the 21 countries, the number has fallen to around 750 in the early 1980s and has stayed there since - with ups and downs driven by the business cycle (see figure 4.1). However, different countries show vastly different trajectories. Hours worked per capita kept falling in Germany throughout the sample period and are now almost 20% below the rich-country average. By contrast, hours worked in Spain have trended upward since the mid-1980s and are now above the rich-country average. These country-specific differences should be useful in explaining past GDP growth - and for making better informed forecasts.
Fig. 4.1. Annual hours worked per capita in 21 rich countries [line chart, 1971 to 2001; vertical axis from 500 to 1000 hours; labeled series: Japan, CH, France, Spain]
4.3 Age structure of the population

As outlined above, the growth rate and the level of the population probably do not have a significant impact on the level of per capita GDP or on per capita GDP growth. Demographic research suggests another variable that may have a significant influence on per capita income levels: the age structure of the population. Malmberg and Lindh (2004b, p. 2) argue that a consensus has emerged “that population age structure, and not population size, is what matters for the level of per capita income.” This would address the issue that simply using overall population data equates a 5-year old to a 40-year old and to a 70-year old even though their respective productivities are likely to differ significantly.

Demographers speak of the “demographic transition” when mortality rates fall first and birth rates decline with some delay. The “demographic window” is the period when the birth rate has already fallen - taking the youth dependency ratio down - and the baby boomers are in their prime earnings phase. This leads to a high ratio of workers in the total population (see equation 4.1), which should be positive for total hours worked and for per capita income levels (but not necessarily for output per hour). In addition, savings rates might be higher when youth dependency ratios are low.

The big advantage of considering the age structure for a forecasting model is that its shape in the year 2020 is relatively easy to forecast in 2006: most of the population of 2020 has already been born and fertility and mortality rates do not vary that much over a decade. In addition, history and forecasts
are easily accessible on the website of the UN Population Division. In my sample of 21 rich countries, the share of “prime-age” people, i.e. those aged between 30 and 49, has risen from 25% in 1975 to almost 30% in 2005 using the unweighted average. In China, the share even reached 33% in 2001 - but is set to fall from there as figure 4.2 shows.

Fig. 4.2. Share of 30 to 49-year age group [line chart, 1960 to 2025; vertical axis from 0.20 to 0.34; labeled series: China, USA, Japan, Germany, India]
In the long run some of the endogeneity problems outlined above persist. Birth rates, and therefore also the age structure, ultimately depend on income levels. A more serious issue is the fact that the characteristics, health, lifestyle and work-environment of, say, a 50-year old keep changing over time. A 50-year old in an advanced economy today is likely to be healthier and more productive than a 50-year old three decades ago.

Malmberg and Lindh (2004a, 2004b) use the relationship between the age structure and GDP to derive forecasts for GDP growth into 2050 for 111 countries using data from PWT 6.1 and from the UN Population Division. They regress the natural logarithm of GDP per capita on different age shares and a linear as well as a quadratic time trend and find the largest positive effect on GDP coming from the 30 to 49-year age share. Their out-of-sample forecasts produce smaller mean errors, but larger mean absolute errors than the naïve alternative models. From their forecasts until 2050 they conclude that because of recent decreases in fertility rates, today’s poor countries will start to catch up with developed economies in which the growth process will stagnate because of the growth of the elderly population.

My empirical analysis in chapter 12 finds only weak support for these conclusions. First of all, the shares of “prime age adults” and of 50 to 64-year olds seem to be stationary variables in my sample of 21 rich countries.
Therefore, they cannot by themselves explain the path of non-stationary GDP per capita. Second, only 13 of the 40 short-run models outlined in chapter 12 find a significant role for the age structure in explaining the growth rate of GDP - Sweden is not one of them in contrast to Lindh (2004). The panel estimations do not find a significant relationship. As a result, I will keep the age structure in the overall model, but focus attention on other variables such as human capital and openness as will be seen in the following chapters.
5 Physical capital
One of the most obvious and most widely investigated determinants of economic growth is the accumulation of physical capital. The general idea is that consumption today is postponed to allow even higher consumption in the future: investment today leads to a higher capital stock and therefore more output tomorrow. Despite this simple and intuitive reasoning, a heated debate on the link between capital accumulation and economic growth is ongoing in the literature. For example, Easterly and Levine (2001) argue that factor accumulation is not the main driver of output growth. Their results are challenged by Bond et al. (2004) who argue that investment has a positive long-run effect on both the level and the growth rate of output. Krueger and Lindahl (2001, p. 1125) find ”an enormous effect” of the growth rate of capital per worker on GDP growth in their cross-section estimations but they suspect some endogeneity bias. This chapter will build on the augmented Kaldor model outlined in chapter 2 and it will argue that the accumulation of physical capital reacts to variables like labor input, human capital, trade openness and institutional settings. Output growth is unlikely to happen without accumulation of physical capital, but accumulation is not the main or ultimate reason for output growth. Nevertheless, empirical studies should find that rapid output growth goes hand in hand with rapid growth of the physical capital stock. Output and the capital stock are cointegrated as Kaldor (1957) found (although the term ”cointegration” was not coined back then). This view reconciles the opposing views in the empirical literature mentioned above. In addition, this chapter will show that many of the results of earlier empirical studies which find a significant link between the level of the investment ratio and the level of income are sensitive to the use of investment valued at international prices instead of domestic prices. 
The conclusion is that there is neither a theoretical nor an empirical reason why investment ratios should correlate with output levels across countries - in contrast to the Solow model as usually applied in cross-country analyses.
Cross-checking the neoclassical assumptions outlined in chapter 2 with reality reveals that none of them hold. The data reveal five important features:

1. The investment ratio (gross investment as a share of GDP) in any given country is not constant, but can vary over time.
2. Over long time spans, investment ratios barely differ across countries and therefore cannot explain the large variation in income levels.
3. Investment ratios are not proportional to the change in the capital stock.
4. Investment ratios are not proportional to the level of the capital stock either.
5. Capital intensities do not vary systematically with income levels.

This chapter will also show that there are significant differences in investment ratios depending on databases and on whether real or nominal ratios are used.
5.1 Measuring capital accumulation

Physical capital is difficult to measure. As outlined in section 2.6.2 above, value measures rely on the rate of interest, which is itself derived from the national accounts income identity. This leads to a circularity problem. Kaldor (1957) avoided this issue by using the weight of steel embodied in capital equipment to aggregate the different types of machines with the same physical unit. Unfortunately, this is no longer feasible today. In order to make progress towards a long-run forecasting model, value measures of physical capital are unavoidable.

5.1.1 Investment and changes in capital stocks

The basics of the accumulation of physical capital were already highlighted in chapter 2. However, the simple relationship between investment and physical capital presents some pitfalls that are worth discussing in more detail. For example, if we expand equation 2.4 to model different cross-sections i, we get:

$$\dot{K}_{it} = K_{it} - K_{i,t-1} = I_{it} - \delta K_{i,t-1}, \tag{5.1}$$

where $I_{it}$ is gross investment in country i in period t and a common depreciation rate $\delta$ is assumed. Dividing both sides by $K_{i,t-1}$ and expanding by total GDP $Y_{it}$ gives:

$$\frac{\dot{K}_{it}}{K_{i,t-1}} = \frac{Y_{it}}{K_{i,t-1}} \frac{I_{it}}{Y_{it}} - \delta \tag{5.2}$$

Invoking the assumption that the Kaldor ratio holds, i.e. that $K_{it}/Y_{it} = K_i/Y_i = \gamma_i$ is constant over time, leads to the conclusion that the percentage change in the capital stock is proportional to the investment ratio:

$$\hat{K}_{it} = \frac{1}{\gamma_i} \frac{I_{it}}{Y_{it}} - \delta \tag{5.3}$$
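As a rough numerical illustration of equation 5.3, the sketch below evaluates the implied growth rate of the capital stock for two different capital-output ratios. The function name is hypothetical. The inputs (an investment ratio of 23%, capital-output ratios of 2.9 and 2.0, a depreciation rate of 6%) are figures that appear elsewhere in this chapter and are combined here only as an example, not as a replication of the book's calculations.

```python
def capital_growth_rate(investment_ratio, capital_output_ratio, depreciation_rate):
    """Equation 5.3: K_hat = (1 / gamma) * (I / Y) - delta."""
    return investment_ratio / capital_output_ratio - depreciation_rate


same_ratio = 0.23  # identical investment ratio in both cases
delta = 0.06       # common depreciation rate

growth_high_gamma = capital_growth_rate(same_ratio, 2.9, delta)  # roughly 1.9% per year
growth_low_gamma = capital_growth_rate(same_ratio, 2.0, delta)   # roughly 5.5% per year
```

With the same investment ratio, the economy with the lower capital-output ratio accumulates capital almost three times as fast in this example, so the proportionality in equation 5.3 only carries over to cross-country comparisons if the capital-output ratio is the same everywhere.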
The big step in cross-country analysis - that is unfortunately usually not talked about - is to also assume that $\gamma_i$ is the same for all countries: $\gamma_i = \gamma_j = \gamma$. In that case, countries with relatively high investment ratios would see proportionally higher growth rates of physical capital. Most empirical cross-section growth papers simply assume that this relationship holds, but they do not test it. I will show later in this chapter that capital-output ratios are not identical across countries. To do so requires databases for investment ratios and capital stocks, which are the subject of the next sections.

5.1.2 Different databases - different investment ratios

To improve the understanding of the link between physical capital and output, several databases for investment and capital stocks are used in this study. This allows three main goals of the study to be achieved: cross-check theory and reality; evaluate findings of previous empirical research; and decide which route is most promising for the purpose of forecasting.

Most empirical growth analyses have used and continue to use the investment ratios supplied by the Penn World Table, which evaluates all output components at international prices.¹ The relative price between investment and consumption goods is exactly the same in every country in a given year and equals the weighted world average relative price. Since in reality, investment goods are expensive relative to domestic services in poor countries, the PWT shows much lower investment ratios in low-income countries than the investment ratios based on local prices.² Summers and Heston (1991, pp. 337-339) clearly made that point when releasing the first versions of the PWT, but few researchers seemed to take note.

¹ See page 36 above for a general description of the PWT and table 5.1 on page 65 for an overview of the measures and databases used in growth studies.
² Dowrick (2002) documents a negative correlation between the relative price of investment to consumption and the level of real GDP per capita.

However, the international ratios of investment and consumption goods prices are not the relative prices that steer resource allocation inside each individual country. Nuxoll (1994, p. 1434) prefers using domestic prices since these “characterize the trade-offs faced by the decision-making agents.” Knowles (2001) also argues that it is preferable to use investment ratios (and government consumption shares) at local relative prices. Bosworth and Collins (2003, p. 12) also think that “capital input should be valued in the prices of the country in which it is used”.

The World Bank’s WDI and the OECD’s ECO database use local relative prices, but these databases are hardly used in growth empirics. As table 5.1 on page 65 shows, the exceptions include Bosworth and Collins (2003) and the OECD’s own growth project as in Bassanini, Scarpetta and Hemmings (2001).

The WDI provide nominal and real fixed investment spending in local currency units for a large number of countries, dating back to 1970 or even further. With the help of real and nominal GDP data, nominal and real investment ratios for all 40 countries were constructed. As outlined above, the data released in spring 2005 run until 2003. In 2003, nominal investment ratios ranged from below 15% of GDP in Colombia and Argentina to 29% in South Korea and even 40% in China. The OECD’s ECO database only includes data for the OECD economies, which makes it less useful for the purpose of the present broad-based investigation. For the available countries the ECO investment ratios are basically identical to the WDI ratios; therefore I will only analyze the latter here.

The recent spread of chain-weighted GDP statistics has added a further complicating factor: the components of real GDP no longer add up to overall GDP. As a result, the simple ratio between real investment and real GDP may have no meaning. Therefore, the use of nominal investment ratios is indispensable when investment is added on the right-hand side of growth regressions. Moreover, even in countries that do not use chain-weighted GDP, the time paths of nominal and real investment ratios differ significantly. Real ratios maintain the relative prices of the base year throughout the sample period. However, prices of investment goods tend to rise at a significantly lower rate than the overall GDP deflator. As a result, the unweighted rich-country average investment ratio in 1980 was 3.5 percentage points higher using nominal values than using real values, while the two ratios were equal in the base year 2000 as figure 5.1 shows. Consequently, the time path of the nominal ratio shows a slight decline while the real ratio has been roughly constant since the late 1970s. Empirical studies might come to different conclusions depending on the measure used.

5.1.3 Capital stocks from perpetual inventory

Unfortunately, getting data on capital stocks is not a straightforward task. The OECD’s ECO database provides data for the business sectors in OECD countries. However, some are gross capital stocks, while others are net of scrapping, so the levels are not comparable. Furthermore, the business sector may not capture the evolution of the overall economy and the country sample is too small for my purpose. Capital stock series in the national accounts are calculated using the perpetual inventory method in equation 5.1 above. The rate of depreciation should be interpreted as a scrapping rate: a machine may have been fully depreciated, but if it is not yet scrapped it may produce just as much as when it was first set up. I assume a scrapping rate equal to the depreciation rate.
Fig. 5.1. Nominal and real investment shares in 21 rich countries [line chart, 1961 to 2001; vertical axis in % of GDP from 15 to 29; series: WDI nominal, WDI real (base=2000)]
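The gap between the two series in figure 5.1 follows from relative prices. A minimal sketch, assuming fixed-base (not chain-weighted) data in which nominal values equal real values times an implicit deflator: the nominal ratio is then the real ratio multiplied by the price of investment goods relative to the GDP deflator. The function name and all numbers are hypothetical, chosen only so that the gap before the base year has a plausible size.

```python
def nominal_investment_ratio(real_ratio, investment_deflator, gdp_deflator):
    """Nominal I/Y implied by a real I/Y and the two implicit deflators.

    Both deflators are indexed to 1.0 in the base year, so the nominal and
    real ratios coincide there. Holds for fixed-base (not chain-weighted) data.
    """
    return real_ratio * investment_deflator / gdp_deflator


# Hypothetical numbers: a constant real ratio of 21% and an investment
# deflator that rises more slowly than the GDP deflator, so the relative
# price of investment goods was higher before the base year.
base_year = nominal_investment_ratio(0.21, 1.0, 1.0)
earlier_year = nominal_investment_ratio(0.21, 0.70, 0.60)
```

In this example the real ratio is flat by construction, while the nominal ratio falls from 24.5% in the earlier year to 21% in the base year, purely because investment goods became relatively cheaper.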
A rough estimate for the capital stock in the 40 countries in my sample can be constructed by perpetual inventory calculations following Hall and Jones (1999). The capital stock in the base year is set equal to real gross investment in the base year divided by the sum of the average growth rate of investment and the depreciation rate, so $K_{1971} = I_{1971}/(\kappa + \delta)$. For the growth rate of investment $\kappa$, I use the grand mean of 2.4% for the rich countries and 5% for the emerging markets.³ The depreciation rate $\delta$ is set at 6 percent for all countries in all years, the same rate as in Hall and Jones (1999) or in Vandenbussche et al. (2006). Other authors use slightly higher rates of 7% (Easterly and Levine [2001]) or lower rates of 5% (Prescott [1998], Yao and Lyhagen [2001] or Chen and Dahlman [2004]) or even 3% (Mankiw, Romer and Weil [1992], Klenow and Rodriguez-Clare [1997] and Sarno [2001]). An experiment with 5% did not show a major difference in the paths of the capital stock, while 3% clearly appears too low.

³ An alternative calculation using country-specific growth rates for the first 10 years leads to implausible levels and paths of the capital stock.

Easterly and Levine (2001) use a slightly different approach. For the initial capital stock they calculate country-specific steady-state values of the capital-output ratio $\gamma$ from the steady-state investment ratio $I/Y$, the growth rate of GDP $g$ and the depreciation rate $\delta$: $\gamma = [I/Y]/[g + \delta]$. They then multiply this $\gamma$ by the average GDP level in the first three years of the sample. Their use of long-run averages gives less weight to the initial investment ratio than the approach by Hall and Jones, but any guess of the initial capital stock ceases to be important after roughly a decade anyway.

Bond et al. (2004) suggest using the backward sum of the investment shares as a measure of the capital stock and find that output and the backward sum of investment cointegrate in a panel when allowing for country-specific deterministic trends. However, their calculation is equivalent to the perpetual inventory method without depreciation and with a lower initial level. The backward sum will just rise more quickly. I will not explore this route.

The capital stock data based on real investment from the WDI are expressed in international dollars of 2000 just as the GDP series are. In per capita terms, capital stocks in 2003 ranged from USD 3,000 in Nigeria and USD 4,700 in India to USD 78,000 in Norway and USD 81,000 in Japan as indicated in figures 5.8 and 5.9. These large differences are of the same order of magnitude as the differences in per capita GDP. Over the 10 years between 1993 and 2003, growth rates of the per capita capital stock ranged from 0.5% per annum in Finland and Argentina to 6.5% in Korea and Taiwan. Mainland China stands out with a growth rate of 10.9%.
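The perpetual inventory calculation described in this section can be sketched as follows. This is a minimal illustration, not the code behind the book's capital stock series: the function names are hypothetical, the investment figures are invented, and the timing convention follows equation 5.1 (investment in period t adds to the stock in period t).

```python
def perpetual_inventory(investment, growth_rate, depreciation_rate):
    """Capital stock series built from a real gross investment series.

    investment[0] belongs to the base year. The base-year stock is
    I_0 / (kappa + delta); later stocks follow equation 5.1:
    K_t = (1 - delta) * K_{t-1} + I_t.
    """
    stocks = [investment[0] / (growth_rate + depreciation_rate)]
    for gross_investment in investment[1:]:
        stocks.append((1.0 - depreciation_rate) * stocks[-1] + gross_investment)
    return stocks


def steady_state_capital_output_ratio(investment_ratio, gdp_growth, depreciation_rate):
    """Steady-state gamma = (I/Y) / (g + delta), as in the alternative approach."""
    return investment_ratio / (gdp_growth + depreciation_rate)


# Hypothetical investment series (arbitrary units), kappa = 2.4%, delta = 6%.
capital = perpetual_inventory([8.4, 9.0, 9.5], 0.024, 0.06)

# Hypothetical steady-state inputs: I/Y = 23%, g = 3%, delta = 6%.
gamma = steady_state_capital_output_ratio(0.23, 0.03, 0.06)
```

The two functions differ only in how the starting point is pinned down: the first uses base-year investment directly, the second yields a capital-output ratio that still has to be multiplied by an initial GDP level.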
5.2 Main insights on capital accumulation

With the datasets for investment ratios and capital stocks at hand, it is possible to test the explicit and implicit assumptions used for capital accumulation in the literature.

5.2.1 Investment ratios are not constant

As outlined above, the simple neoclassical model has a rather narrow focus. It analyzes an economy with a constant savings or investment ratio. However, this assumption does not hold in reality. Investment ratios tend to vary significantly over time even in rich countries. For example, the nominal ratio in Finland went from almost 30% in the early 1960s to 25% in the late 1960s, up to more than 30% again in the mid-1970s and down to below 20% in the mid-1990s using the WDI data. Incidentally, the investment ratio in the country that originated most of the growth models tended to be stable over time, see figure 5.2. In the neoclassical framework this could be interpreted as reflecting a shift in the steady state of the economy including the adjustment of the capital stock to that new steady state. But with several of these shifts happening, this would render cross-section models - with their reliance on long-run averages - useless.

5.2.2 Investment ratios do not differ much across countries

As outlined above, a crucial conclusion from neoclassical models is that countries with high investment ratios have high income levels per capita because
Fig. 5.2. Nominal investment shares over time [line chart, 1960 to 2000; vertical axis in % of GDP from 10 to 40; series: South Korea, Finland, USA; source: WDI]
they have a high capital stock per capita - all else equal. However, the long-run averages of the nominal investment ratios from the WDI vary only slightly across countries. During 1981-2000, Malaysia, South Korea and Thailand had investment ratios of more than 30%, while Argentina, the USA and the UK had rates of below 20%. On average, countries with low per capita income levels had higher investment ratios over that period. The correlation coefficient for the 40 countries is -0.2, as illustrated in figure 5.3. This already indicates that differences in investment ratios cannot possibly account for the large differences observed in income levels. The average investment ratio over the period 1970-2002 was 23% in my sample of 40 countries with a standard deviation of 2.7 percentage points. As shown above, per capita GDP levels in 2000 in international prices were 90% or more below the US level in many countries. Nevertheless, many papers draw conclusions such as “a 1 percentage point increase in the investment share brings about an increase in steady-state GDP per capita of about 1.5%.”

Similarly, the investment share in a given country cannot grow without bounds and therefore cannot explain long-term income growth. The theoretical upper bound is 100%, while the upper bound in practice is around 40%. Bond et al. (2004) make the point in their pre-testing section arguing that therefore the investment share “should have neither a deterministic nor a stochastic trend in the long run.” The conclusion is that investment ratios by themselves cannot be a driver of income or growth differences: “Differences in the fraction of product invested do not account for differences in international per capita incomes” (Prescott [1998]). This is in contrast to DeLong and Summers (1991), who argue that there is a robust, causal link from equipment investment to economic growth, which does not proxy for some other determinants of growth.

Why then is the investment ratio still one of the most robust correlates of income in cross-section studies? The reason is the widespread use of data from the PWT which value investment goods at international prices as table 5.1 on page 65 shows. Nominal ratios from this database show a much wider variation with a range from 9.6% to 35.4% over the 1981-2000 period. The correlation coefficient with per capita GDP jumps from -0.2 according to the WDI data to +0.48 according to the PWT data. Bosworth and Collins (2003, p. 11) point out that “investment shares for low income countries are much smaller when measured in international prices than when measured in national prices”, and that the “conversion to international prices introduces a strong positive correlation between the investment rate and the level of income per capita” (p. 12). Figure 5.4 illustrates how the gap between the World Development Indicators (WDI) and Penn World Table (PWT) investment ratios varies systematically with the level of GDP per capita.

It seems that the widespread use of international prices drives the results in cross-section estimates - a fact that is usually ignored when the results are interpreted. Poor countries with low prices of non-tradables are attributed low investment ratios by those constructing the data - and empirical estimates recover this relationship. Barro (1991, p. 425) alluded to the role of “variations across countries in the ratio of the investment deflator to the GDP deflator”, but he did not elaborate on this point. Knowles (2001) shows that the results
Fig. 5.3. Investment ratios and the level of GDP p.c. [scatter plot; horizontal axis: nominal investment in % of GDP (WDI), avg 1994-2003, from 15 to 40; vertical axis: per capita GDP in '000 international $ in 2003 (WDI), from 0 to 40; labeled points: USA, Japan, Korea, China, Egypt]
of Barro and Lee (1994) are sensitive to the use of international or local price levels.

But why do time-series or panel estimates (like the pooled mean group technique) also tend to find a link between the development of GDP and the development of the investment ratio over time? It seems that these techniques pick up co-movements of investment and output at the business cycle frequency. As the other correlates of income that usually enter on the right-hand side tend to move only sluggishly over time, a cyclical acceleration of GDP is attributed mainly to the coincident and also cyclical acceleration of investment. This could be examined by plotting the (probably stationary) time-series residuals of a regression without the investment ratio against this investment ratio.

In sum, three conclusions arise from the analysis of investment ratios: (1) The cross-country variation in nominal investment ratios from the WDI database is not nearly high enough to explain the large variation in income levels across countries. (2) The use of PWT investment data at international prices seems to be behind the finding of a robust empirical relationship in cross-section estimates. (3) Time-series estimates seem to pick up business cycle co-movements of GDP and investment. My overall conclusion is that investment ratios should not be used to model income levels.
Fig. 5.4. Predictable differences in investment shares
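The direction of the gap shown in figure 5.4 can be reproduced in a stylized two-good economy. The sketch below is an assumption-laden illustration, not the PWT methodology, whose aggregation is more involved: the function name, quantities and prices are all hypothetical, and "international" prices are simply set so that both goods cost the same.

```python
def investment_share(investment_qty, consumption_qty, investment_price, consumption_price):
    """Investment as a share of total expenditure in a two-good economy."""
    investment_value = investment_price * investment_qty
    total_value = investment_value + consumption_price * consumption_qty
    return investment_value / total_value


# Hypothetical low-income economy: investment goods cost three times as much
# as consumption goods at home, but the same at "international" prices.
quantities = (10.0, 100.0)
share_local_prices = investment_share(*quantities, 3.0, 1.0)          # 30 / 130
share_international_prices = investment_share(*quantities, 1.0, 1.0)  # 10 / 110
```

The physical quantities are identical in both calculations. Only the valuation changes, and the investment share drops from about 23% at local prices to about 9% at the common price set.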
5.2.3 Investment ratios are not proportional to changes in the capital stock

As highlighted above, standard cross-section analyses assume that capital-output ratios $\gamma$ are identical across countries so that there is a linear relationship between investment ratios and changes in the capital stock. In reality, countries with high investment ratios do not necessarily show large changes in the capital stock. One reason is that countries with a high capital stock need higher absolute investment to replace depreciated capital.
Fig. 5.5. Investment ratios and changes in the capital stock
Therefore, there can be strong capital accumulation at low gross investment ratios (if the starting capital stock is low) and weak capital accumulation at high investment ratios (if the starting capital stock is high). For example, Switzerland, Austria, India and Chile all had nominal investment ratios according to the WDI numbers near 23% during the decade from 1994 to 2003 (see figure 5.5). However, the average annual changes in their capital stocks over the same period ranged from 1.8% in Switzerland and 2.5% in Austria to 6.5% in India and 7.6% in Chile. The main reason for these differences is that the capital-output ratio over that period was 2.9 in Switzerland but just 2.0 in India according to the perpetual inventory data introduced above.

Bosworth and Collins (2003, p. 10) are among the few to question the proportional relationship between changes in the capital stock and investment ratios. They see very little correlation between the change in the capital stock and the mean investment ratio in their sample for two reasons: the capital-output ratios are neither constant over time nor the same across countries. While my analysis disagrees with the first reason, it supports the view of significant differences across countries in the capital-output ratios. In my sample there is no fixed relationship between changes in the capital stock and the investment ratio, but the correlation for the 1994-2003 period is 0.8, driven mostly by China and the Asian Tigers (see figure 5.5). Changes in the capital stock are what really matters for output growth, not a high investment ratio per se. Therefore, focusing on the investment ratio alone does not provide a complete picture.

5.2.4 Investment ratios are not proportional to levels of the capital stock

Given that the investment data are for gross investment, one might suspect that countries with a large per capita capital stock tend to have high investment ratios. This is not the case: the correlation between nominal investment ratios and the per-capita capital stocks across the 40 countries is slightly negative using averages over 1994-2003 as figure 5.6 shows.
Fig. 5.6. No link between investment ratios and capital stocks
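The arithmetic behind the Switzerland-India comparison above can be sketched with the perpetual inventory identity: the net growth rate of the capital stock equals the investment ratio divided by the capital-output ratio, minus depreciation. The following Python lines are only an illustration under stated assumptions: the function name is made up here, and the 5% depreciation rate is a hypothetical value, not one taken from this study.

```python
def capital_growth(investment_ratio, capital_output_ratio, depreciation_rate):
    """Net growth rate of the capital stock: (I/Y) / (K/Y) - delta."""
    return investment_ratio / capital_output_ratio - depreciation_rate


# Hypothetical depreciation rate, chosen for illustration only.
DELTA = 0.05

# Same 23% investment ratio, capital-output ratios as quoted in the text.
for name, k_over_y in [("Switzerland", 2.9), ("India", 2.0)]:
    growth = capital_growth(0.23, k_over_y, DELTA)
    print(f"{name}: {100 * growth:.1f}% per year")
```

Under these assumptions the same investment ratio produces markedly faster capital accumulation in the country with the lower capital-output ratio. The sketch will not reproduce the reported growth rates exactly, since those rest on the actual investment series and depreciation assumptions behind the capital stock data.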
5.2.5 Capital productivity does not correlate with income

Easterly and Levine (2001, footnote 9) claim that the capital-output ratio “systematically varies with income per capita. Capital-output ratios are systematically larger in richer countries”. Similarly, Klenow and Rodriguez-Clare
(1997, p. 89) conclude that richer countries tend to have higher K/Y and higher H/Y than poorer countries. The data used here do not support these claims. Klenow and Rodriguez-Clare’s results are mostly driven by the use of investment data at international prices from the PWT. As described in section 5.1.2 and illustrated in figure 5.4, PWT data systematically push down the investment ratio in poor countries, so they also push down any derived capital stock proxies. Using the capital stocks calculated from WDI investment data there is no correlation between income and capital productivity (see figure 5.7). Japan, Malaysia and Switzerland had some of the highest capital-output ratios in 2003, while the USA and the UK had among the lowest. The result of no correlation does not change when GDP per hour worked is used instead of GDP per capita.

Fig. 5.7. No link between GDP per capita and capital-output ratios (vertical axis: GDP per capita in 2003 in international $ of 2000; horizontal axis: capital-output ratio in 2003; labeled points include CH and JP)
This is further support for Kaldor, who already noted in 1975 (p. 356) that the dramatic increase of the capital-labor ratio in the course of progress happened “without corresponding changes in the capital-output ratio”. His example compares the United States with India, where the ratio of “the capital-labor ratio is of the order of 30:1, while the capital-output ratio is around 1:1.” However, figure 5.7 also shows that the capital-output ratio is not roughly identical across countries, as Kaldor claims and Felipe and McCombie (2007) assume. This indicates that returns on capital may not be equal across countries, in line with the standard conclusion from the international finance literature.
5.2.6 Capital accumulation is not exogenous

With the investment ratio neither constant nor a good candidate to explain changes in the capital stock or the capital productivity of a country, this leaves us with the question of what really drives capital accumulation. Capital accumulation is driven by the growth rates of GDP and by the returns on capital as outlined by Kaldor (1957) and here on page 31 and supported by the empirical results in chapter 12. According to the Kaldor model, physical capital reacts to any deviation from the long-run relationship between GDP and the capital stock. Investment picks up if profits earned on physical capital rise above their long-term average because of a productivity shock. In the Kaldor model the development of the capital stock depends on the productivity of capital, which in turn depends on the development of technology and on the input of other factors.

Results of Granger causality tests are in line with this linkage between capital and GDP. Blomström, Lipsey and Zejan (1996) find that growth Granger-causes investment, but that investment does not Granger-cause growth. However, this does not rule out that in the short run and even in the long run investment has a positive impact on GDP (not least because it is a part of GDP). Moreover, the possibly very long lags between investment and GDP growth as well as the variability of capacity utilization over the business cycle may make it difficult to disentangle the individual effects.
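The adjustment mechanism described in this subsection, where capital reacts to deviations from its long-run relationship with GDP, can be illustrated with a stylized error-correction sketch. This is not the empirical model of chapter 12: the function name, the adjustment speed of 0.2 and the drift of 2% are hypothetical values chosen only to show the mechanics in logs.

```python
def simulate_gap(shocks, adjustment, drift=0.02):
    """Return the path of the gap between log GDP and log capital.

    Each period capital grows at the common drift plus a fraction
    `adjustment` of last period's gap; GDP grows at the drift plus a shock.
    """
    log_gdp = 0.0
    log_capital = 0.0
    gaps = []
    for shock in shocks:
        gap = log_gdp - log_capital            # deviation from the long-run relation
        log_capital += drift + adjustment * gap
        log_gdp += drift + shock               # e.g. a productivity shock
        gaps.append(log_gdp - log_capital)
    return gaps


# One 10% productivity shock in the first period, none afterwards.
shocks = [0.10] + [0.0] * 49
with_adjustment = simulate_gap(shocks, adjustment=0.2)
without_adjustment = simulate_gap(shocks, adjustment=0.0)
print(with_adjustment[0], with_adjustment[-1])
print(without_adjustment[0], without_adjustment[-1])
```

With a positive adjustment coefficient the gap opened by the shock closes gradually as capital catches up with GDP. With a coefficient of zero the gap persists, which is what exogenous capital accumulation would imply.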
5.3 Proper modeling of capital accumulation

Accumulation of physical capital is an important driver of (labor) productivity and GDP. However, the standard ways of modeling this in the empirical growth literature do not appear to be appropriate for reasons outlined in this chapter. The focus should shift to the development of the capital stock - in spite of the aggregation issues highlighted earlier. Because of the (erroneously) assumed correspondence between investment ratios and changes in the capital stock, there was no need to use capital stock data in cross-section analyses. The new panel methods and their use of cointegration analysis usually run into a key problem with investment ratios: they are stationary. Therefore, growth econometrics should move towards using the capital stock instead of investment ratios to explain income levels.
Fig. 5.8. Capital stock per capita in 21 rich countries (in PPP USD of 2000, 1971-2001; labeled series include Japan, NO, CH, USA, Greece and Portugal)

Fig. 5.9. Capital stock per capita in 19 emerging markets (in PPP USD of 2000, 1971-2001; labeled series include Singapore, Israel, Korea and Argentina)
Table 5.1. Overview of the empirical literature on physical capital

Year | Authors | Method | Variable used | Source | Impact
1991 | DeLong & Summers | Cross-section | Real electrical and non-el. inv (% GDP) | PWT 5 | Robust and causal
1991 | Barro | Cross-section | Real investment (% real GDP) | PWT 5 | Negative cor. with initial GDP
1992 | Levine & Renelt | Cross-section | Real investment (% real GDP) | World Bank | Positive and robust correlation
1992 | MRW | Cross-section | Real investment (% real GDP) | PWT 5 | Positive on level
1994 | Loayza | Panel Pi | Real investment (% real GDP) | PWT 5 | Significant positive relation
1996 | Caselli, Esquivel & Lefort | Panel GMM | Real investment (% real GDP) | PWT 5 | Positive on level
1997 | Sala-i-Martin | Cross-section | Real electrical and non-el. inv (% GDP) | PWT 5 | Significant effect
1997 | Klenow & Rodriguez-Clare | Cross-section | Physical capital stock (PPI) | PWT 5 | Positive correlation
1999 | Hall & Jones | Cross-section | Physical capital stock (PPI) | PWT 5.6 & own | Depends on institutions
2001 | Easterly & Levine | Panel GMM | no investment included | |
2001 | Fernandez, Ley & Steel | Cross-section | Real electrical and non-el. inv (% GDP) | PWT 5 | Robust relationship
2001 | Bassanini et al. | PMG | Real non-resid inv. (% real GDP) | OECD | Significant positive
2001 | Bernanke & Guerkaynak | Cross-section | Real investment (% real GDP) | PWT 5.6 | Positive cor. with growth rate
2001 | Sarno | Cointegration | Real investment (% real GDP) | PWT 5.6 | Positive in cointegration vector
2002 | Dowrick & Rogers | Panel GMM | Real investment (% real GDP) | PWT 5.6 | Significant positive relation
2003 | Bosworth & Collins | Cross-section | Nominal investment (% of nominal GDP) | OECD & WDI | No significant impact on growth
2004 | Doppelhofer et al. | Cross-section | no investment included | |
2004 | Barro & Sala-i-Martin | Cross-section | Real investment (% real GDP) | PWT 6.1 | Significant positive relation
2004 | Bond et al. | Pool | Real investment (% real GDP) | PWT 6.0 | Large, significant effect
2004 | Hauck & Wacziarg | CS, GMM & LSDV | Real investment (% real GDP) | PWT 6.1 | Significant positive relation
2004 | Hendry & Krolzig | Cross-section PCGets | Real investment (% real GDP) | PWT 1 | Significant positive relation
2004 | Hoover & Perez | Cross-section | Real investment (% real GDP) | PWT 1 | Significant positive relation
2004 | Batista & Zalduendo | Cross-section & time | Real investment (% real GDP) | IMF WEO | No effect
2005 | Ianchovichina & Kacker | Cross-section | no investment included | |
 | This study | 2-step panel | Capital stock per capita | WDI & own | Significant cointegration

Note: Table is not comprehensive. Some papers may use several measures and sample sizes.
6 Human capital
The previous two chapters have argued that - in contrast to mainstream neoclassical models - population growth and the accumulation of physical capital are largely endogenous to economic growth. This chapter turns to an important driver of income that is exogenous at least over a 10 to 15-year horizon: human capital. Investment in education usually precedes the pay-out period by many years. Individuals tend to first go to school and then earn higher income. As a result, an increase in a country’s level of human capital occurs mostly because young people entering the labor force are better educated than old people leaving the labor force. Preferences and educational policies determine how many children complete a certain level of education. History shows that these decisions do not depend much on current income levels. Support for the view that education is exogenous comes from Stevens and Weale (2004), who note that the ”spread of formal school seems to have preceded the beginning of modern economic growth.” Today’s decision to raise a country’s level of human capital will have its biggest impact in 15 years or even later when the better-educated young enter the labor force. Any resulting rise in GDP will be spread out over many years. While potentially frustrating for policy-makers, this slow and gradual movement of human capital can be very informative for a long-run forecasting model. Adam Smith (1776) saw two genuine roles for the government beyond defense: to support institutions ”facilitating the commerce of the society” and institutions providing education. He argued that the state should provide access to general education for the broad population and that attendance should possibly be mandatory: ”For a small expence the public can facilitate, can encourage, and can even impose upon almost the whole body of the people, the necessity of acquiring those most essential parts of education” (p. 843). 
This was also Smith’s answer to the basic problem in economics: the social question. Equal educational opportunities are a more sustainable vehicle towards social peace than transfer payments from rich to poor.
In a similar spirit, Landes (1999, p. 180) traces the decline of the Spanish empire in the 16th century to the catholic church stifling education by, for example, banning the import of foreign books and evicting Jews, who had been so important for the intellectual life. He sees part of the economic success of protestant societies stemming from their emphasis on instruction and literacy partly by expecting people to read the Bible themselves. Human capital can be defined as an individual’s general level of skills or more broadly as the knowledge, skills and competencies embodied in an individual that facilitate the creation of personal, social and economic wellbeing. This human capital can be acquired in many different ways at home, in school or at work. It can rise ahead of formal school education (parents) and continue to increase afterwards (experience, further education etc.). The role of experience was highlighted most prominently by Arrow (1962, p. 156), who stated that “technical change in general can be ascribed to experience.” Higher human capital allows an individual to perform higher value-added tasks more efficiently and more quickly. He or she can receive, decode and understand information more quickly and can also apply more new ideas and be more innovative: “educated people make good innovators.”1 In short, higher human capital leads to more output per hour worked. Similar to additional physical capital (machines), additional human capital raises the productivity of raw labor. In addition to the direct effects of human capital on income, there are several indirect effects both on income and on measures of a nation’s wellbeing. For example, human capital correlates strongly with other factors that are often seen as explaining the level of GDP. 
Countries with a high level of education or human capital tend to also score highly in economic freedom indices such as the Fraser Institute’s Economic Freedom Index or the Heritage Foundation’s Index of Economic Freedom. Expenditures on research and development also tend to be higher in countries with a high level of human capital. There can be no research without human capital and not much innovation without research. On top of the productivity-enhancing effects, higher human capital also leads to additional positive consequences such as improved health, higher life expectancy, social peace, less crime etc. These benefits are probably quite large, but they are extremely difficult to quantify.

From a growth perspective, human capital has some similarities to physical capital. Both types of capital have to be produced by sacrificing some of today’s output of final goods. Both are investments in future output. Somebody enrolled full-time in school today cannot contribute to today’s GDP. Another similarity to physical capital is that most of the benefits from a person’s higher human capital accrue to that person and to the firm where he works. However, there might be additional effects on co-workers in the same firm and even in other firms. The ideas developed by one person (because of his high education level) can help somebody else create more ideas as well, or another person can apply the same idea in his firm. These spillovers and feedback effects of non-rivalrous ideas make human capital so special and can lead to market failure with too little investment in human capital because the individual can only reap part of the returns on his own investment.

1 Nelson and Phelps (1966), p. 70.

As is the case particularly in the openness literature, the empirical human capital literature does not always clearly distinguish between effects of the level of human capital and of its growth rate. Convergence regressions with initial income and the level of education on the right-hand side may not uncover the unconditional effect of the level of education on GDP growth. The interpretation of these regressions is not always clear. Therefore, table 6.1 on page 79 reviewing the empirical literature often avoids the attribute of level or growth. In addition, the measurement error in the Barro-Lee data makes results derived from that dataset highly questionable, especially when using first differences. It is not surprising that the survey by Sianesi and Van Reenen (2003) concludes that the empirical literature is still largely divided on whether education affects the long-run level or the growth rate of the economy. Section 6.2.1 will show that proper handling of the data makes this confusion disappear.
6.1 Micro- and macroeconomic theory

The link between human capital and income should be obvious. Rich countries today are rich because they have accumulated know-how for many generations and passed it on to the next. The usefulness of the wheel is a simple example. If we were to forget this and other advances, income levels would plunge. As Dowrick (2004) puts it, all “economic activities depend on institutions that encourage the preservation, transmission and development of knowledge.”

The theoretical models of the impact of human capital on economic activity have become increasingly sophisticated over the past centuries, especially during the last decades. As early as 1890, Alfred Marshall noted that “the most valuable of all capital is that invested in human beings.” And Benjamin Franklin was aware that “investment in education pays the best interest.” Gary Becker refined and deepened these insights and coined the term “human capital” with the title of a book published in 1964. Education is an investment.

As outlined in chapter 2, Robert Lucas modeled the link between human capital and economic activity by splitting the economy into two sectors. The education sector produces new human capital with the help of existing human capital (teachers), while the final goods sector uses both human and physical capital as inputs. In this model, a rise in human capital leads to a rise in national income, while a high level of human capital explains a high level of income. Economic policy that raises the rate of growth of human capital will lead to higher rates of GDP growth.
An alternative view is that a high level of human capital allows for a high growth rate of GDP. Paul Romer stirred up a lot of attention in the late 1980s with his model of knowledge spillovers, where the stock of knowledge determines the growth rate of GDP. However, there is no clear-cut empirical evidence for his thesis. Germany (unless it gets a lot wrong in other areas) should, for example, post one of the highest rates of income growth given its high level of human capital.

Some researchers claim that individuals only go to school to signal to their potential future employers how motivated and capable they are. In this view, they do not really learn much in school that would increase their productivity and innovative capacity. This point of view is highly controversial on theoretical and empirical grounds - and I will not use it below.

6.1.1 Microeconomic analysis: labor economics

Using insights from other branches of economics can be very helpful for growth economists. This is particularly the case in the analysis of human capital and income, as labor economists have long analyzed the impact of education on an individual’s earnings. The classic contribution by Mincer (1974) assumes that the only opportunity cost of attending school is foregone wages and that the increase in wages generated by this extra schooling is constant over time. In this case, the natural logarithm of an individual’s wage or income $Y_i$ is linear in the years of schooling he completed $S_i$ and his work experience $R_i$, plus a constant $c_i$ and a random error $\epsilon_i$:

$$\ln Y_i = c_i + \phi S_i + \beta R_i + \epsilon_i \qquad (6.1)$$
The coefficient on $S_i$ is interpreted as the rate of return on investment in education. Microeconomic studies tend to find that an additional year of education raises an individual’s wage by between 5% and 10% depending on country or study subject - a useful benchmark for macroeconomic studies. If macroeconomic returns exceed microeconomic returns this may be indicative of positive externalities. The survey by Psacharopoulos and Patrinos (2004) reports that an additional year of education leads to a 10% higher wage on average in a large sample of countries. In these micro studies the economy-wide state of technology and the capital stock are identical for all individuals. If individuals with education earn more than those without it, the same should be true for whole economies. If we replace wages with national income in equation 6.1 and move all variables from individual levels to countrywide measures, we quickly arrive at a macroeconomic function. A simple least squares regression of the natural logarithm of per capita GDP on the level of the years of education leads to an $R^2$ of 0.64 for the 21 rich countries analyzed in this study and of 0.80 for all 40 countries.
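A regression of this log-linear type can be sketched in a few lines of Python. The data below are made-up pairs of years of education and GDP per capita, included purely to show the mechanics; they are not the data of this study, and the helper name `ols` is an assumption of this sketch.

```python
import math


def ols(x, y):
    """Simple least squares of y on x; returns (intercept, slope, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical country data, for illustration only.
education_years = [6.0, 8.0, 10.0, 12.0, 13.5]
gdp_per_capita = [4000.0, 9000.0, 15000.0, 26000.0, 33000.0]

# Income enters in logs, education in levels, as in equation 6.1.
log_gdp = [math.log(value) for value in gdp_per_capita]
intercept, slope, r_squared = ols(education_years, log_gdp)
print(f"slope = {slope:.3f}, R^2 = {r_squared:.2f}")
```

In this specification the slope is read as the approximate proportional change in income associated with one additional year of education. Such a bivariate fit shows correlation only and says nothing about the direction of causality.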
The lesson that growth economics has taken from these micro studies is that there is a significant positive link between education and income - and that the most appropriate measure for human capital should be the years of schooling. The experience dimension tends to get neglected because on a macroeconomic level the average work experience of the overall labor force does not change much (except through an increase in life expectancy that increases the working life as well). In addition, a natural starting point for macroeconomic analysis should be a log-linear relationship, where income enters in log form, but schooling in levels as in equation 6.1. For example, Hall and Jones (1999) use the Mincerian equation to derive the level of human capital $H = e^{\phi S} L$ where $\phi$ is between 13.4% and 6.8% depending on the level of education. Similarly, Bosworth and Collins (2003, p. 7) use $H = (1.07)^S$ as measure of human capital, assuming a 7% return on each year of education, while Klenow and Rodriguez-Clare (2005) use 8.5%. I will take account of this generally accepted specification by using the log-linear form in my empirical model. The augmented technical progress function in the Kaldor model in equation 2.21 already made use of this log-linear relationship.

6.1.2 Macroeconomic models with different conclusions

As indicated above, macroeconomic models developed over the past 20 years come to two different conclusions on the effect of human capital on income. One camp sees the growth rate of human capital affecting the growth rate of GDP per capita - and the level explaining the level. The other camp sees the level of human capital affecting the growth rate of GDP per capita. My empirical results below support the first camp.

The first camp consists primarily of two models. The easiest way to introduce human capital into a model of long-run economic growth is to simply add it as another input factor into a production function.
There are two types of capital, the standard physical capital $K$ plus human capital $H$. Both are accumulated by investing a constant, exogenous fraction of output in each of them with the same technology. With constant returns to scale and labor-augmenting technological progress, the production function 2.1 becomes:

$$Y_t = K_t^{\alpha} H_t^{\beta} (A_t L_t)^{1-\alpha-\beta} \qquad (6.2)$$
This so-called augmented Solow-model was introduced by Mankiw, Romer and Weil (1992) and used for example by Bassanini and Scarpetta (2001) and Cohen and Soto (2001). While this model allows a richer empirical modeling, the theoretical insights are the same as in the standard Solow model. In the
long run, GDP and each type of capital grow at the same exogenous rate g. The measure of capital is just defined more broadly.2

A much richer theoretical model is the Lucas-Uzawa model (Lucas [1988]) briefly alluded to in chapter 2, which uses a more complicated production function for human capital in a two-sector economy. A constant fraction of the labor force’s time is spent accumulating human capital, while the remainder is devoted to producing final goods by combining physical and human capital. Ignoring the possibility of externalities in Lucas’ production function, the result (derived in detail in Frenkel and Hemmer (1999, p. 206ff) or in chapter 10 of Aghion and Howitt [1998]) is that GDP, human capital and physical capital all grow at the same constant rate. This rate is determined by the rate of time preference, the depreciation rate, the productivity of the education sector and the relative risk aversion. There is no need for a manna-from-heaven rate of technological progress g. Therefore, this model is one of endogenous growth even if the actual growth rate is again determined by a set of (fixed) parameters. As shown in chapter 2, the testable hypotheses for empirical work are the same as in the augmented Solow model: GDP, human capital and physical capital are all driven by the same process, so they should be pair-wise cointegrated.

The model of Azariadis and Drazen (1990) complements this analysis and points to the use of error correction models. They argue that a high ratio of human capital relative to income per head is a prerequisite for strong GDP growth. This fits well into my empirical framework outlined in chapter 12, where a deviation of GDP below the level indicated by human capital would point to stronger GDP growth in the future.

When taking the Lucas-Uzawa model literally, one of the difficulties is the assumption that a constant fraction of time is spent in education.
Another assumption is that individuals have infinite lives, which raises the question what a fraction of infinity is. Under constant life expectancy, this setup would preclude a rise in a measure of human capital such as the average years of education over time. In addition, in reality we observe significant rises in life expectancy, which increase the incentive to invest in education (see Cohen and Soto [2002]). While life expectancy in advanced countries increases by around 8 years every 20 years, education has increased by “just” 2 years over the past 20 years on the OECD average. Still, in percentage terms the rise in education was roughly twice as strong as the rise in life expectancy. Nevertheless, the Lucas-Uzawa model was a major step forward and I will use its conclusions below. In contrast to the exogenous rate of the augmented Solow-model, the Lucas-Uzawa model provides clearer policy recommendations and allows more flexibility, for example by changing the rate of time preference. The fraction of time devoted to schooling (and other parameters assumed exogenous in the Lucas-Uzawa model) may change over time.

2 Taking this broadening to the extreme leads to the AK model, where a broad measure of capital is the only input into production and where there are no diminishing returns to capital anymore.

The second camp of theoretical models on human capital argues that the stock of human capital determines the growth rate of GDP. This camp draws on Nelson and Phelps (1966) and was advocated most prominently by Romer (1986, 1990) and Aghion and Howitt (1992, 1998). Lucas (1988) also allows for the possibility of human capital externalities. Here, the existing stock of an economy’s disembodied knowledge (or human capital) determines the pace at which new ideas or knowledge are generated. The production of knowledge can be captured in a function such as:

$$\dot{A}_t = \theta L_{A,t} A_t^{\nu} \qquad (6.3)$$
The change in technology depends on the number of workers in the knowledge-producing sector $L_A$ and on the level of existing knowledge $A$. The underlying assumption is that there is an infinite universe of ideas waiting to be discovered. The larger the set of known ideas, the more neighboring ideas there are and the easier it becomes to make additional discoveries (“standing on giants’ shoulders”). However, there are also good reasons to think that it gets harder and harder to make new discoveries the more there have been made already. Jones (1995b) argues that researchers may also be stepping on each other’s toes or may just reinvent ideas (“fishing out”). One can reconcile these two views by arguing that standing on the shoulder of giants allows researchers today to discover ever more complicated ideas and products, but that the pace of progress need not accelerate over time. This interpretation seems to be most in line with observed developments. In this case, faster GDP growth could be a result of a faster increase in human capital or in the share of people doing research.

Models focusing on spillovers and externalities were repeatedly criticized for generating unrealistic scale effects (see chapter 2) in the sense of Romer (1990, p. S99), who concludes that “an economy with a larger total stock of human capital will experience faster growth.”

Both camps of models generate testable hypotheses. According to the first camp, time series for the level of GDP and human capital should be cointegrated. According to the second camp they should not be cointegrated: a deviation of GDP above any scaled level of human capital will not be corrected by GDP coming down in the future. That is, GDP can continue to grow strongly even if human capital does not grow anymore. The second camp would expect the level of human capital and the growth rate of GDP to be cointegrated. However, given the stationarity of GDP growth, this cannot be the case.
My estimates in chapter 12 will show a highly significant cointegration between the level of GDP per capita and the level of human capital per capita, rejecting models of the second camp.
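The role of the exponent in the knowledge production function 6.3 can be made concrete with a small simulation. The sketch below uses a simple discrete-time approximation of the continuous-time equation; the function name and all parameter values are hypothetical and serve only to contrast the two cases discussed above.

```python
def knowledge_growth_rates(theta, researchers, nu, periods, initial_knowledge=1.0):
    """Discrete-time approximation of equation 6.3 with a constant L_A.

    Each period knowledge rises by theta * L_A * A**nu; the function
    returns the per-period growth rate of A.
    """
    knowledge = initial_knowledge
    rates = []
    for _ in range(periods):
        increment = theta * researchers * knowledge ** nu
        rates.append(increment / knowledge)
        knowledge += increment
    return rates


# nu = 1: growth stays constant and scales with the number of researchers.
constant = knowledge_growth_rates(theta=0.001, researchers=20, nu=1.0, periods=50)
doubled = knowledge_growth_rates(theta=0.001, researchers=40, nu=1.0, periods=50)

# nu < 1: with a fixed number of researchers, growth slows as knowledge rises.
slowing = knowledge_growth_rates(theta=0.001, researchers=20, nu=0.5, periods=50)

print(constant[0], constant[-1])
print(doubled[0], doubled[-1])
print(slowing[0], slowing[-1])
```

With an exponent of one, doubling the research workforce doubles the growth rate of knowledge, which is the scale effect criticized in the literature. With an exponent below one, sustained growth in this sketch requires a rising number of researchers rather than a high level.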
6.2 Measures and empirical analysis

While the importance of human capital for a nation’s income level is theoretically well established, the statistical basis for analyzing that link quantitatively is rather weak. It is standard practice for national statistical agencies to publish an estimate of their country’s stock of physical capital, but the same is not (yet) the case for human capital.

A measure frequently used in the past is enrollment in secondary education, which compares the number of people going to secondary school in a given year with the number of people in the typical age group. Similar calculations can be made for primary or tertiary education. Secondary enrollment is the measure used in the early empirical growth literature in the cross-section world (e.g. Barro [1991], Mankiw, Romer and Weil [1992] and Levine and Renelt [1992]) and in the first applications with panel data (e.g. Loayza [1994] and Caselli, Esquivel and Lefort [1996]; see table 6.1 at the end of this chapter). It was still used heavily even in the early 2000s, for example by Doppelhofer, Miller and Sala-i-Martin (2004) and Hauck and Wacziarg (2004). Sometimes enrollment is used to measure the stock of human capital, sometimes to measure the flow. Barro (1991, p. 420) already noted that: “One problem with the previous result is that the 1960 school-enrollment rates could be proxying for the flow of investment in human capital, rather than for the initial level.”

However, enrollment rates are not a useful measure of either the stock or the flow of human capital. The same 70% secondary enrollment rate in two countries may go hand in hand with different paths of the stock of human capital. For example, in a country with a low stock, say 5 years of education, a 70% secondary enrollment rate will lead to a significant rise in the workforce’s average education level over time.
On the other hand, in a country with a high stock of human capital, say 12 years of education, the same 70% secondary enrollment rate may not even be enough to maintain the initial level of human capital. In other words, enrollment data alone tell us nothing about either the level or the growth rate of human capital. Either we need the information about the initial stock and combine the two measures to get a sense for the future path of human capital. Or we have to make the extremely strong assumption that all countries are on the same steady-state path, where the enrollment ratios are just high enough to keep human capital constant. However, this assumption has nothing to do with reality, where the paths of human capital are very different across countries. For the forecasting model I am interested in exactly these differences in the paths of human capital across countries. In sum, enrollment rates are not useful measures of human capital. A better measure is attainment rates, which measure what share of the population has completed a certain level of education. However, there will be at least three ratios (primary, secondary and tertiary) for each country, which
makes each individual measure less useful for empirical analysis. A combined measure is the average attainment level.

6.2.1 Best measure: years of education

Over the past decade a more useful measure of human capital has gained an increasing following: the average years of education of the working age population as a proxy for the average educational attainment level. These data are often referred to as “average years of schooling”, but that could be misleading if it suggests that university level education is not included. Therefore, I will use the term “education” throughout this study even if the original source uses “schooling.” One advantage of this measure is that it is the same as the one used in the microeconomic studies outlined above. Unfortunately, the measurement error in macroeconomic data appears to be large, especially in early datasets.

The path-breaking work by Barro and Lee published in 1993 made a comprehensive dataset of average years of education available for the first time. It covers 129 countries at five-year intervals between 1960 and 1985 for the population aged 25 years and above. For 40% of the country-year combinations they used survey- or census-based data as reported by UNESCO on the type of education that individuals have completed. The remaining 60% of the observations are derived using the perpetual inventory method by using enrollment data. Barro and Lee (1996) extend this dataset to 1995 and later in Barro and Lee (2001) to 2000 and to measures for the age group 15 and over as well as 25 and over. These datasets have been used in the majority of empirical studies of human capital over the past 13 years as table 6.1 on page 79 shows.

However, de la Fuente and Doménech (2000) detect significant measurement error in the Barro-Lee data for 21 developed countries. Portela et al. (2005) also emphasize the measurement error in the Barro-Lee data that stems from the calculation of non-census observations using enrollment data.
They suggest a systematic way of correcting these errors using the time elapsed since the last census. De la Fuente and Doménech (2000) incorporate a greater amount of national information to avoid implausible breaks in the data. Their time series for the population aged 25 years and older appear much smoother and more plausible than the Barro-Lee series. De la Fuente and Doménech (2000) also add information on education levels that were attended but not completed. In de la Fuente and Doménech (2002) they argue that their dataset has the highest signal-to-noise ratio of all available sets.

De la Fuente and Doménech (2000) show that their dataset leads to significant positive coefficients on human capital both in a levels specification and in a first-difference specification, while the Barro-Lee dataset leads to lower or (in first differences) insignificant coefficients. They argue that the finding of insignificant or sometimes even negative coefficients in Benhabib and Spiegel (1994) or in Pritchett (2001) stems mostly from the use of Barro-Lee data which include significant measurement error. Bassanini and Scarpetta (2001)
6 Human capital
use the de la Fuente and Doménech dataset and the pooled mean group panel technique to find that an additional year of schooling raises output by around 6% in the long run. For the emerging markets Cohen and Soto (2001) conduct similar adjustments for the population aged 15 to 64 at the beginning of each decade from 1960 to 2000 plus a projection to 2010 for 95 countries. The two datasets differ in the age groups covered (25-64 vs. 15-64) and in the frequency of observations. Cohen and Soto (2001) find a significant positive effect of human capital accumulation on growth. The return on human capital is estimated at 8% for each additional year of education. Since this is similar to microeconomic return measures, they conclude that there are no externalities to human capital.

I will use two different datasets on the average years of education of the population aged 25 to 64. For the 21 rich countries I use the dataset from 1971 to 1998 published in the annex table of Bassanini and Scarpetta (2001). This was built on the work of de la Fuente and Doménech (2000), who in turn used the Barro-Lee data as a starting point. I extend the dataset through 2003 using the OECD’s Education at a Glance of 2005. There may still be considerable measurement error: according to the 2004 and 2005 issues of Education at a Glance, the average years of education jumped by 1.2 years in the USA between 2002 and 2003. The jump was even larger in New Zealand, at two years. For the 19 emerging markets I use the dataset from Cohen and Soto (2001) with observations every 10 years between 1960 and 2010. Intermediate data were interpolated by spline. For Taiwan, I have to use the Barro-Lee data, which I extrapolate.

The datasets are plotted in figures 6.1 and 6.2 at the end of this chapter. In the rich countries, the years of education in 2003 varied between 8.2 in Portugal and 13.8 in Norway and the USA, with a mean of 12.2.
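The spline interpolation of ten-year observations mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which spline was used, so a natural cubic spline is assumed here; the function name is illustrative; and the observation values are hypothetical placeholders, not figures from the Cohen and Soto dataset.

```python
import bisect

def natural_cubic_spline(xs, ys):
    """Return a function interpolating (xs, ys) with a natural cubic spline.

    Assumption: the spline type is not stated in the text; 'natural'
    (zero second derivative at both end points) is chosen for illustration.
    """
    n = len(xs) - 1                      # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Tridiagonal system for the second derivatives m[0..n]; the first and
    # last rows impose m[0] = m[n] = 0.
    lower = [0.0] * (n + 1)
    diag = [1.0] * (n + 1)
    upper = [0.0] * (n + 1)
    rhs = [0.0] * (n + 1)
    for i in range(1, n):
        lower[i] = h[i - 1]
        diag[i] = 2.0 * (h[i - 1] + h[i])
        upper[i] = h[i]
        rhs[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])

    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n + 1):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    m = [0.0] * (n + 1)
    m[n] = rhs[n] / diag[n]
    for i in range(n - 1, -1, -1):
        m[i] = (rhs[i] - upper[i] * m[i + 1]) / diag[i]

    def evaluate(x):
        # Locate the interval [xs[i], xs[i+1]] that contains x.
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate

# Hypothetical decadal observations (NOT taken from Cohen and Soto).
decades = [1960, 1970, 1980, 1990, 2000, 2010]
years_of_education = [3.0, 3.8, 4.9, 6.1, 7.0, 7.8]

spline = natural_cubic_spline(decades, years_of_education)
annual = {year: spline(year) for year in range(1960, 2011)}
```

The interpolated series passes exactly through the decadal observations and fills the intermediate years smoothly; a different boundary condition would change the values near 1960 and 2010 slightly.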
Over the 10 years to 2003 human capital rose by 2.8 years in Ireland and Spain but just 0.1 years in Switzerland. The average 10-year change was 1.2 years. In the emerging markets, the years of education in 2003 varied between 4 in Nigeria and 12.7 in South Korea with a mean of 8.2. Over the 10 years to 2003 human capital rose by between 0.2 years in Israel and 2.6 years in Singapore. The average 10-year change was one year. Current measures of years of education do not include learning-on-the-job or lifelong learning. Since especially the second element is becoming increasingly important over time in advanced economies, in the future the data are likely to include these aspects of education as well. For my purpose it is comforting to note that lifelong learning is highly correlated with the highest formal degree attained. Another interesting question is whether there is a ceiling or optimum for the number of years spent in formal education. While this may be a valid concern over the very long run, there are no signs of a leveling off in the data yet (with the exceptions of Germany and Switzerland) and there is a lot of
room in advanced economies before, say, 50% of the working population have a university degree and most others a high school degree. Therefore, I do not consider a ceiling as binding over my forecast horizon until 2020.

Years of education are a rather crude and simple measure of the stock of education. Alternatively, one could also measure the value of an economy’s human capital using cost-based or income-based approaches. Le, Gibson and Oxley (2003) survey these methods and highlight that on all measures the stock of human capital greatly exceeds that of physical capital. The Centre for the Study of Living Standards publishes time series on cost-based values of human capital.³ However, countries with high salaries of teachers need not necessarily produce higher quality workers. In addition, the income-based measure requires as an input something that growth empirics tries to identify: the rate of return on human capital. Therefore, the focus of this study has to be on a stock-based measure as outlined above.

Another problem is that the data on average years of education treat a school year in Japan just as one in Egypt and they treat a year of primary school just as one in university. In particular, the quality of education may differ significantly across countries as the OECD’s PISA tests repeatedly show. Hanushek and Kimko (2000) present a dataset on labor-force quality that combines achievement tests for different cohorts. Their cross-section estimates show a significant explanatory power beyond schooling quantity. However, these data are not available in long enough time series. The same is true for the International Adult Literacy Survey (see Coulombe et al. [2004]). In addition, for a given point in time there is a high correlation between the level and the quality of education, suggesting that omitting the quality dimension may not be too detrimental.

Furthermore, the average human capital data do not distinguish between different types of education.
Vandenbussche et al. (2006) illustrate why skilled human capital is more important the closer an economy is to the technological frontier. However, since all economies consist of highly heterogeneous actors, there should be room for skilled and unskilled workers in every economy. What really matters for average income is the average level of education.

In sum, the best available measures of human capital - with all the remaining deficiencies - are those from de la Fuente and Doménech (2002) updated with the OECD’s annual reports Education at a Glance for the OECD countries, and Cohen and Soto (2002) for the emerging markets. The most appropriate empirical specification is a log-linear relationship between the logarithm of the level of GDP and the level of years of education. The most appropriate estimation method uses non-stationary panel techniques. So far this combination has not been used in the literature. The most widely used data are those from Barro and Lee (see table 6.1). The log-linear specification seems to have gained a wider following in recent years as the fourth column of table 6.1 shows.

³ www.csls.ca
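The log-linear specification described above can be written out as a short worked equation. The notation is an assumption made for illustration and does not appear in the text: $Y_{it}$ is GDP of country $i$ in year $t$, $E_{it}$ the average years of education, $\alpha_i$ a country-specific constant and $\beta$ the coefficient on education.

```latex
\ln Y_{it} = \alpha_i + \beta\, E_{it} + \varepsilon_{it}
\qquad\Longrightarrow\qquad
\frac{Y_{it}\big|_{E+1}}{Y_{it}\big|_{E}} = e^{\beta}
```

With $\beta = 0.06$, one additional year of education raises the level of output by $e^{0.06} - 1 \approx 6.2\%$, which matches the order of magnitude of the long-run effect of around 6% reported by Bassanini and Scarpetta (2001).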
Fig. 6.1. Average years of education in 21 rich countries (vertical axis: years, 4 to 14; horizontal axis: 1971 to 2001; labeled series: Germany, Spain, Portugal)

Fig. 6.2. Average years of education in 19 emerging markets (vertical axis: years, 0 to 14; horizontal axis: 1971 to 2001; labeled series: Korea, India, Nigeria)
Table 6.1. Overview of the empirical literature on human capital

Year | Author(s) | Measure | Source | Method | Impact
1991 | Barro | Secondary enrollment 1960 | United Nations | Cross-section | Positively related to growth
1992 | Levine & Renelt | Secondary enrollment | Barro (1991) | Cross-section | Fragile
1992 | Mankiw, Romer & Weil | Secondary enrollment | UNESCO | Cross-section | Strong positive
1994 | Loayza | Secondary enrollment | MRW (1992) | Panel Pi | Not shown, in fixed effects
1995 | Islam | Years of education | Barro-Lee (1993) | Panel LSDV | No significant link
1996 | Caselli, Esquivel & Lefort | Secondary enrollment | Barro-Lee (1993) | Panel GMM | Negative relation to GDP
1997 | Klenow & Rodriguez-Clare | Primary and sec. enrol. | Barro-Lee (1993) | Cross-section | Small positive
1997 | Sala-i-Martin | Primary enrollment 1960 | Barro-Lee (1993) | Cross-section | Positively related to growth
1999 | Hall & Jones | Years of education | Barro-Lee (1993) | Cross-section | Small explanatory power
2000 | Bils & Klenow | Human capital stock | Own data | Cross-section | GDP causes human capital
2000 | Hanushek & Kimko | School quality | Own data | Cross-section | Sign. positive on top of quantity
2000 | de la Fuente & Domenech | Years of education | Own data | Cross-section | Significant positive
2001 | Fernandez, Ley & Steel | Primary enrollment 1960 | Barro-Lee (1993) | Cross-section | Weak positive
2001 | Bassanini & Scarpetta | Years of education | DD (2000) | PMG | Significant positive
2001 | Bernanke & Gurkaynak | Secondary enrollment | WDI | Cross-section | Significant positive
2001 | Bassanini et al. | Years of education | DD (2000) | PMG | Significant positive
2001 | Easterly & Levine | Years of education | Barro-Lee (1996) | Panel GMM | Significant positive effect on growth
2001 | Krueger & Lindahl | Years of education | Barro-Lee (1993) | Cross-section | Pos. impact of change on growth
2001 | Pritchett | Years of education | Barro-Lee (1993) | Cross-section | No significant link
2001 | Cohen & Soto | Years of education 15-64 | Own data | Cross-section | Significant positive
2001 | Sarno | Secondary enrollment | UNESCO | Cointegration | Positive in cointegration vector
2002 | Soto | Years of education | CS (2001) | System GMM | Significant positive
2002 | Dowrick & Rogers | Years of secondary schooling | Barro-Lee (1996) | Panel GMM | Negative relation to growth
2003 | Bosworth & Collins | Years of education | BL (2001) & CS (2001) | Cross-section | Limited positive impact
2004 | Doppelhofer et al. | Higher education enr. 1961 | Barro-Lee (1993) | Cross-section | No robust relation with growth
2004 | Doppelhofer et al. | Primary enrollment 1960 | Barro-Lee (1993) | Cross-section | Robustly, pos. related to growth
2004 | Hauck & Wacziarg | Secondary enrollment | Barro-Lee (2000) | CS, GMM & LSDV | Significant positive impact
2004 | Chen & Dahlman | Years of education | BL (2001) & CS (2001) | Cross-section | Positive impact of level on growth
2004 | Batista & Zalduendo | Years of education | Barro-Lee (2000) | Cross-section & t | Significant positive impact
2004 | Barro & Sala-i-Martin | Years of education, males | Barro-Lee (2001) | Cross-section | Significant relation with growth
2005 | Ianchovichina & Kacker | Secondary enrollment | WDI | Cross-section | Small positive
2005 | Benhabib & Spiegel | Years of education | Barro-Lee (1993) | Cross-section & panel | Positive impact of level on growth
2005 | Portela et al. | Years of education | corrected BL (2001) | Cross-section & panel | Growth and level are significant
 | This study | Years of education | DD, CS & EaG | 2-stage nonst. panel | Significant positive

Spec. (fourth column of the printed table; entries in printed order, with blank cells not marked, so they are not assigned to rows here): ln, linear, linear, linear, ln, ln, linear, linear, ln, ln, ln, ln, linear, linear, linear, ln, linear, ln, linear, ln, ln, ln, ln, ln, ln

“Spec.” indicates the transformation of the education measure. Some studies may use more than one measure and/or more than one method. MRW (1992) refers to Mankiw, Romer and Weil (1992), DD (2000) is Domenech and de la Fuente (2000), CS (2001) is Cohen and Soto (2001), and Barro-Lee (2000) is the working-paper version of Barro-Lee (2001).
7 Openness
Not long ago, Rodriguez and Rodrik (2001, p. 39) concluded that “the challenge of identifying the connections between trade policy and economic growth is one that still remains before us.” In the early 2000s, papers on growth and openness included terms like “once again” or “revisiting” in their titles.¹ Three aspects appear to be behind this challenge:

1. Trade policy is highly correlated with other growth-relevant policy measures such as the rule of law. Disentangling the effect of trade policy is therefore a difficult empirical undertaking. I will avoid these difficulties by interpreting trade openness as an overall policy variable that captures a broad array of growth-relevant policies.
2. There is no consensus on how to best measure openness and trade policy. Measures such as tariffs, black market exchange rate premia and trade shares have been proposed. Some authors argue for adjusting trade shares for gravity variables and price differences, while others use the raw shares.
3. The literature is not always explicit about whether it is looking for an effect of openness on the level or on the growth rate of GDP. Theory gives conflicting priors, and cross-section convergence regressions make interpretation of some of the results rather hazardous. Trying to find a link between the level of openness and the growth rate of GDP is likely to lead to inconclusive results as I will show later.²

I will argue that the level of trade openness is a good explanatory variable for the level of income per capita - and that the change in openness helps explain the change in income.

¹ For example, Lee, Ricci and Rigobon (2004) and Dollar and Kraay (2003).
² Therefore I am not concerned by findings like those of Vamvakidis (2002) that “the positive correlation between free trade and growth after 1970 is an exception”, or Lee, Ricci and Rigobon (2004) that the effect of openness on growth is small. They seem to try to uncover a link between openness levels and GDP growth rates.
Trade policy can be defined as any measure that policymakers take to promote the exchange of goods and services with foreigners. This could include cuts in tariffs, but also governments promoting the construction of new harbors. By contrast, the trade share measures the actual exchange of goods and services with foreigners as the average of the shares of exports and imports in GDP. Trade openness (or openness for short) measures how exposed the average firm or individual of a country is to the exchange of goods and services with foreigners. Both the trade share and trade openness may be affected by policy as well as non-policy changes. For example, a decline in transportation costs (container shipping, fiber optic wires etc.) could lead to a rise in trade without any policy changes. A country’s geographic proximity to other large and rich economies would by itself also imply a higher level of openness.³

While Rodriguez and Rodrik (2001) focus on policies, the focus here is on outcomes: anything that promotes the cross-border exchange of goods, services, technology and ideas will have a positive impact on incomes because the existing amount of resources is used more efficiently. This raises the returns on physical (and human) capital, promoting more accumulation in the second round.

Case studies are important for providing background for the analysis of the empirical link between openness and income. For example, Ben-David (1993) analyzes episodes of trade liberalization and concludes that more liberal trade systems tend to go hand in hand with higher income levels. Hausmann, Pritchett and Rodrik (2005) find that growth accelerations tend to coincide with a significant rise in the trade share.

Two of the most striking case studies on how detrimental the closing of an economy can be come from Asia. Until about 1250 China was one of the richest countries on earth.
But the centralization of power during the Ming period was accompanied by massive regulation and in the mid-16th century the government even banned ships with more than two masts, massively restricting foreign trade. The result was that income levels fell back to those of the 10th century and stayed there for centuries. Likewise, Korea was called the “Hermit kingdom” in the 19th century after it closed itself to foreigners. The experiment ended in bankruptcy and foreigners forced their way back in.

The failure of import substitution strategies in many former colonies in the 1960s and the success of the Asian openers in the 1970s add further anecdotal evidence. Unfortunately, economists have to accept some of the blame for countries following the wrong strategies. Raul Prebisch, the Secretary General of the United Nations Economic Commission for Latin America and later the founding secretary general of UNCTAD, advocated infant industry arguments for all of manufacturing in developing countries.⁴ Today, the pro-openness consensus is overwhelming among economists - even if they are aware that some groups will be hurt by free trade.

³ But chapter 8 will show that growth does not simply spill over across borders.
⁴ Quoted from Baldwin (2003).
7.1 Theory: higher efficiency

From theory there is no clear conclusion on whether changes of openness have any effect on GDP at all and whether the effect will be on the level or on the growth rate. This disagreement partly stems from the later development of endogenous growth theories, which emphasize the effect on levels. Macroeconomic data do not make it possible to identify which of the many potential transmission channels is the most important. But this is not necessary for a model of long-run GDP growth which is interested in the overall effect. What is necessary, however, is some sound theoretical basis for the model.

Opening would be negative for growth according to infant industry models and models of optimal protection. Likewise, opening may have a negative effect on the terms of trade or on the dynamic comparative advantage. However, all these models advocate protection at most as a short-term strategy.⁵ Other models see a positive effect either on the growth rate or the level of GDP, using a wide range of arguments.

In the simple neoclassical growth model openness has no effect on long-run growth, which is assumed to be exogenously given anyway. However, there may be an effect of the level of openness on the steady-state level of GDP. In a dynamic interpretation, if changes in openness make the steady-state GDP level move over time, then there would also be an effect of changes in openness on changes in GDP.

A presumed positive link between openness and GDP is derived from Ricardian trade theory, where the movement from autarky to free trade (or any smaller movement in between) leads to an increase in welfare in spite of constant production possibilities. Under free trade each country produces more of the product that was relatively cheaper (using real exchange rates) in that country before trade. However, GDP is not a measure of welfare. It only captures the real value of final production in an economy.
Therefore, welfare gains of trade cannot be uncovered by regressing GDP on trade. The reason for these complications is that usually Laspeyres quantity indices are used to calculate time series of real GDP. However, as Neuhaus (2006) points out, in the neoclassical trade model an increase in specialization driven by differences in relative prices will lead to a decline in GDP using Laspeyres indices. Each country will produce more of the good that is becoming relatively more expensive. Valuing this additional production at the lower prices of the base period (and the reduced production of the other good at the high prices of the base period) leads to a decline in reported real GDP. The use of chain-linked indices - as is now common in Europe - reduces the bias, while using the Fisher ideal index pushes it close to zero. In practice, this effect is spread out over many years as an economy opens gradually. Still, as Neuhaus (2006) highlights, these are important theoretical and statistical reasons why the empirical literature had such difficulties uncovering a significant positive relationship between openness and GDP.

⁵ Vamvakidis (2002, p. 60) summarizes the main theories and references.
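The index-number effect described above can be illustrated with a small two-good example. It is a sketch under stated assumptions rather than the calculation in Neuhaus (2006): the production possibility frontier is assumed to be a quarter circle, producers are assumed to choose the output mix that maximizes the value of production at current prices, and all prices are hypothetical.

```python
import math

def supply(p_a, p_b, frontier=10.0):
    """Output mix that maximizes the value of production at prices (p_a, p_b)
    on the assumed frontier q_a**2 + q_b**2 = frontier**2."""
    norm = math.hypot(p_a, p_b)
    return frontier * p_a / norm, frontier * p_b / norm

def value(prices, quantities):
    """Value of a bundle of quantities at the given prices."""
    return sum(p * q for p, q in zip(prices, quantities))

p0 = (1.0, 1.0)   # base-period (pre-opening) prices, hypothetical
p1 = (2.0, 1.0)   # after opening, good A has become relatively more expensive
q0 = supply(*p0)  # output before opening
q1 = supply(*p1)  # output after opening: more of A, less of B

laspeyres = value(p0, q1) / value(p0, q0)   # quantities valued at base-period prices
paasche = value(p1, q1) / value(p1, q0)     # quantities valued at current prices
fisher = math.sqrt(laspeyres * paasche)     # geometric mean of the two indices
```

In this setup the Laspeyres quantity index comes out below one (about 0.95) although the economy stays on the same frontier, the Paasche index comes out above one, and the Fisher index equals one. The exact offsetting is a property of the symmetric frontier assumed here; in general the Fisher index only brings the bias close to zero.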
Since we cannot expect to uncover welfare or growth effects stemming from specialization due to comparative advantage when using standard GDP data, other factors must be behind any positive correlation between openness and income.

7.1.1 Extent of the market and specialization

In the Ricardian example sketched above, the production possibility frontier remains unchanged. However, trade can help push that frontier outwards as specialization increases an individual’s productivity, promotes learning by doing and raises the return on physical capital. As Adam Smith (1776, p. 3) highlighted, “The greatest improvement in the productive powers of labour [...] seem to have been the effects of the division of labour.” And the division of labor - or the ability to specialize - is limited by the extent of the market (ibid p. 19).

Clearly, this extent of the market has kept increasing over the last 500 years, helped by a significant fall in transportation costs (railroad, airplanes, fiber-optic cables etc.). In the Middle Ages, tailors and merchants in the city of Nuremberg outsourced spinning and then weaving to nearby Fürth, where fewer restrictions applied (Landes [1999], p. 43). In the 17th and 18th centuries, outsourcing and specialization were least restricted in England, which contributed to England becoming the country of the Industrial Revolution. Specialization continues today as China and India become more integrated into the world economy.

Endogenous growth models of the Grossman-Helpman type can be seen as modeling specialization through the use of varieties of intermediate inputs. In addition, endogenous growth models with their focus on scale effects see beneficial effects of openness both on growth and on the level of GDP.⁶ A larger market implies a higher payoff of innovations, more R&D effort, a larger variety of capital goods, a larger accessible stock of knowledge, more technological spillovers etc.
This appears to be an area where models with scale effects have a point. The “size” of an economy rises with trade and so do efficiency and income. However, this reasoning does not necessarily imply that the growth rate of GDP is permanently higher when there is a higher level of openness.

7.1.2 Good macro policies and more competition

The literature on openness and that on institutions are closely connected. Strong institutions go hand in hand with openness; they promote openness and vice versa. An open trade regime coincides with other growth-positive policies such as prudent monetary and fiscal policy or a solid infrastructure. This makes it empirically difficult to uncover partial effects.

Marx and Engels were among the first analysts of the links between openness and institutions. In their Communist Manifesto they noted that “The

⁶ Very explicit in Rivera-Batiz and Romer (1991).
bourgeoisie, by the rapid improvement of all instruments of production, by the immensely facilitated means of communication, draw all, even the most barbarian, nations into civilization. [...] It compels all nations, on pain of extinction, to adopt the bourgeois mode of production; it compels them to introduce what it calls civilization into their midst, i.e. to become bourgeois themselves” (quoted from Sachs and Warner [1995], p. 5). Similarly, Thurow (2003) emphasizes that the biggest danger of globalization is to be left behind. Today we can interpret “becoming bourgeois” as adopting institutions that allow the efficient use of the factors of production. This includes openness to trade and to ideas from abroad, sound domestic political and financial institutions and so on.

As indicated in chapter 2 on growth theory, the Parente-Prescott model of barriers to technology adoption has a close link to openness. Barriers to domestic production can only prevail if the more productive competition from abroad is kept out. Crucially, both domestic technology barriers and foreign trade barriers require policy decisions. Parente and Prescott use a definition of technology similar to that of Lucas (1988, p. 15), who sees technology as “something that is common to all countries, [...] something whose determinants are outside the bound of our current inquiry.” This does not differ much from Solow (2001, p. 287) who thinks that “the nontechnological sources of differences in TFP may be more important than the technological ones. Indeed, they may control the technological ones, especially in developing countries.” In sum, barriers to technology adoption are indeed important for understanding international differences in income levels.

It is next to impossible to distinguish what came first: good institutions or openness. Both may be determined by third factors such as geography or history.
Given the lack of long time series on measures of institutional quality, I will focus only on one of the two: openness. Bassanini, Scarpetta and Hemmings (2001) explicitly use trade openness as a policy variable. However, in this case great prudence is required when interpreting any regression coefficients or when deriving policy conclusions: a change in openness likely requires a change in institutions for the full effect on GDP to materialize. Dollar and Kraay (2003) address this point with a variety of econometric techniques to conclude that “the cross-sectional evidence is not very informative about the relative importance of trade and institutions in the long run, although existing evidence on their individual effects suggests that both are important.” The conclusion from panel analysis is likely to be the same. Frankel and Romer (1999) try to disentangle the impact of trade openness from the impact of domestic institutions. They find that countries that are large, far away from other countries and landlocked trade relatively little with other countries. This should not come as a surprise. But the fact that these geographic characteristics should impact income only through the openness channel can be used to identify the partial effect of trade on income. Frankel and Romer (1999, p. 394) conclude that trade indeed raises income. A one percentage point rise in the trade share raises income per person by at least
one-half percent. However, Irwin and Terviö (2002) show that these results are not robust to inclusion of the distance from the equator. They also discuss the fact that high incomes in partner countries lead to higher bilateral trade.

7.1.3 Additional influences of trade on income

In addition to these two important links between trade and income, importing ideas and technology may also boost GDP at home. Poor countries do not need to create new ideas to improve their standard of living. They only have to apply existing ideas to the production of goods and services (Parente and Prescott [2000], p. 5). For example, imports may introduce new machines that cannot be produced at home. In the model of Easterly et al. (1994) trade restrictions raise the price of imported intermediate goods, which depresses the rate of GDP growth. Societies that generate and tolerate new ideas are likely to have higher income than countries stuck in their own past. However, the adoption of technologies developed elsewhere still requires some physical and human capital in the adopting country. The tolerance and openness towards new ideas has both a domestic and an international dimension. These two are probably highly correlated: countries with the right institutions and incentives to innovate at home are also those open to learning from abroad.

In a very different framework, post-Keynesian models of balance of payments constraints in the growth process lead to Thirlwall’s Law: the rate of a country’s GDP growth equals the ratio of its rate of export growth to its income elasticity of demand for imports. This theory may have been relevant in the 1960s and 1970s when many countries could not afford to purchase imports because of a lack of foreign currency. However, these constraints have become less binding on average over the past decades.
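Thirlwall’s Law, as stated above, can be written as a one-line formula. The symbols are assumptions introduced for illustration: $y_B$ is the growth rate of GDP consistent with balance of payments equilibrium, $x$ the growth rate of exports and $\pi$ the income elasticity of demand for imports.

```latex
y_B = \frac{x}{\pi}
```

As a purely hypothetical example, export growth of $x = 6\%$ per year combined with an import elasticity of $\pi = 1.5$ would imply a balance-of-payments-constrained growth rate of $y_B = 6\% / 1.5 = 4\%$ per year.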
7.2 Measuring openness

Several measures of openness have been used in the empirical growth literature. There is an even larger disagreement here than on how to best measure human capital. This section reviews these measures and critically evaluates them. For inclusion in my forecasting model, the same criteria as for the other variables have to be satisfied as much as possible: (1) time series at an annual frequency have to be available, (2) the series have to measure the theoretical concept (here: actual openness), (3) the measure should be reasonably easy to forecast and (4) there has to be an empirical link to per capita GDP.

For my purpose, the best available measure is the time series of the trade share, i.e. the average share of exports and imports in GDP. If cross-country comparisons of levels are needed, it is best to adjust this measure for differences in relative population size. Before presenting the most appropriate measure in detail, it is worth briefly discussing other measures used in recent empirical analyses.
7.2.1 Black market premium and tariffs

Initial empirical studies often used data on the difference between the official exchange rate of a country’s currency and the exchange rate in the black market. This black market premium is supposed to be a measure of trade distortions. However, very few countries nowadays restrict their foreign exchange markets, so black market premia are mostly zero.

Several cross-section studies used the average tariff rates on imported manufactures as reported by UNCTAD. This is calculated by dividing total import duties by the volume of imports. A problem with this variable is that it may be a poor measure of actual trade barriers for a number of reasons. For example, a country may have zero tariff rates, but all potential trading partners have extremely high tariff rates. As a result, there would be hardly any efficiency-enhancing trade despite the country’s low tariff rate. On the other hand, if tariff rates remain unchanged but falling transport costs allow more trade, then tariffs would also be an inappropriate measure for openness. As a result, I focus on outcomes, which are the result of a large number of factors.

Trade can be - and has been - restricted in many other ways than through tariffs. Therefore, attention turned to non-tariff barriers. The Heritage Foundation’s Index of Economic Freedom includes a sub-component measuring the distortions in international trade. This appears to be a useful index, but the time series only start in 1995 which is not long enough for my purpose.

7.2.2 The openness dummy

To date, one of the most widely used approaches to measuring openness (see table 7.1 on page 94 with the overview of empirical studies) is that of Sachs and Warner (1995). They constructed a binary variable by assessing five characteristics in 113 countries for 1970 to 1989.
A country is defined as open only if none of the five characteristics applies: non-tariff barriers on more than 40 percent of trade, average tariff rates of 40 percent or more, a black market exchange rate 20 percent or more below the official exchange rate, a socialist economic system, and a state monopoly on major exports.

Wacziarg and Horn Welch (2003) present updated values of the Sachs-Warner dummy for 2000. However, they conclude for example that China remains a closed economy, which appears to have little to do with what has been happening there over the past 20 years. Wacziarg and Horn Welch see “a distinct possibility that the classification of countries as ‘open’ or ‘closed’ is too crude to provide much information for growth” (p. 12). Rodriguez and Rodrik (2001) criticize the Sachs-Warner dummy for depending mostly on the state monopoly of exports and the black market premium. They argue that sample selection bias makes the state monopoly variable indistinguishable from a sub-Saharan-Africa dummy.

Moreover, openness is always a matter of degree. No country is completely open, as the debate about the European services directive illustrates. The
Sachs-Warner dummy does not allow for different degrees of openness. In many cases, the literature has used a compromise approach by assuming that a country is more open today the further back it was first assessed as open: the number or the fraction of years open is used (which is equivalent to using the first year of openness).7 But even using the number of years open is not helpful in my context, because the annual change in openness would be the same for all countries once they are deemed open. Therefore, even the number of years open cannot explain different growth experiences across the countries deemed open throughout my sample period - which includes all advanced countries.

7.2.3 Best measure: adjusted trade share

In principle, a country can have done all the right things to promote trade but still not trade at all. This might be the case if the country is infinitely far away from potential trading partners. The world is an example of an effectively closed economy because of a lack of trading partners. On earth, New Zealand is an example of a country relatively far away from potential trading partners. So even if New Zealand abolishes all tariff and non-tariff barriers, its actual trade with others might still be relatively small. However, actual trade is what matters when a country tries to reap the benefits from free trade outlined in section 7.1. True openness can therefore only be measured using actual trade data and not with barriers.8 This also holds for changes in openness.

Therefore, the ratio of exports plus imports to GDP has increasingly become the variable of choice in empirical growth analysis (see table 7.1 on page 94) and I will use it as well. Trade shares are the sum of nominal exports and imports divided by two times GDP. In 2003 these shares ranged from 11% in Japan and 12% in the USA to 104% in Malaysia and 200% in Singapore.
Over the 20 years from 1983, the unweighted average trade share rose by 3.4 points in the 21 rich countries and by 14.1 points in the 19 emerging markets. Countries opening most quickly during that period include Malaysia, Thailand, Ireland and Austria.

The literature often uses ratios of real variables to construct trade shares. However, as is the case for investment shares (see page 55) the relative price of traded and non-traded goods is likely to change over time: the development of the real trade share will differ from that of the nominal trade share. In fact, real trade shares rose much faster than nominal shares in the past as prices of traded goods rose less than prices of non-traded goods.

7 The number is used for example in Fernandez, Ley and Steel (2001), the fraction is used in Hall and Jones (1999), see also table 7.1.
8 A mixture of the two is used in the Trade Openness Index presented by Gwartney et al. (2001).

Companies in small countries have few compatriots to trade with, so they trade more across borders. International trade that arises for this reason does not differ much in quality from domestic trade. Since my empirical analysis
will use panel methods to explore the impact of changes in trade on changes in GDP, the raw trade shares would be sufficient as input; the fixed effects would pick up the effect of country sizes. However, to allow cross-country comparisons of openness levels and to derive policy conclusions, it is helpful to adjust the trade shares for the size of the country.

It is not easy to deal correctly with the nexus of country size and magnitude of the trade share. For example, Alcalá and Ciccone (2004) explain the level of productivity using both the trade share and the population size on the right-hand side. They conclude that “an increase in real openness taking a country from the thirtieth percentile to the median value raises productivity by 80 percent” (p. 614). However, they ignore that this country is highly likely to also move down the size scale, which in their reasoning would lower productivity. They themselves report a correlation coefficient between lnrealopen and lnpopulation of -0.6, but do not consider that collinearity in their regression for productivity. As a corollary they also find that larger countries grow faster, a finding quoted by Jones (2005) in support of models with scale effects. However, this finding stems from the use of both the trade share and the country size in the same regression.

To avoid these traps, I will first correct for size - call the result “trade openness” - and then proceed with estimating. My size adjustment stems from a simple regression of the average nominal trade share in 1995-2000 on the natural logarithm of the population level over the same period for 39 countries (Singapore excluded) as shown in table ??. The R² is 0.25. The trade openness levels are then simply the deviations from this regression line. Countries below the line get negative values (e.g. Japan -16%), while countries above the line get positive numbers (e.g. Belgium +34.2%).
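The size adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's computation: the function names are hypothetical, the input numbers are made up, and measuring population in thousands is an assumption.

```python
import math

def trade_share(exports, imports, gdp):
    """Nominal trade share in percent: (exports + imports) / (2 * GDP)."""
    return 100.0 * (exports + imports) / (2.0 * gdp)

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def trade_openness(shares, populations):
    """Deviation of each trade share from the line fitted on ln(population)."""
    ln_pop = [math.log(p) for p in populations]
    intercept, slope = ols_line(ln_pop, shares)
    return [s - (intercept + slope * lp) for s, lp in zip(shares, ln_pop)]

# Hypothetical inputs for four countries (NOT the book's data).
shares = [80.0, 35.0, 20.0, 12.0]                 # trade shares in percent
populations = [4_000, 40_000, 120_000, 280_000]   # assumed to be in thousands
openness = trade_openness(shares, populations)
```

Because the regression line is estimated once on period averages, a later change in a country's raw trade share carries over one-for-one into its trade openness value; only the level is shifted.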
Importantly, the absolute changes of openness over time are not affected by these simple intercept shifts. Figures 7.2 and 7.3 at the end of this chapter show the trajectories of the trade openness measures in the two samples. They illustrate the upward tendency of openness and the particularly strong development in, for example, Ireland and Thailand.

Another important question for the empirical estimates is whether absolute or relative changes of trade openness affect GDP. Should we use a log-log or a log-linear specification? Most often, the log-log form is chosen (e.g. Bassanini, Scarpetta and Hemmings [2001]). This specification implies that a 10 percentage point rise in the trade share has a smaller impact on GDP if it starts at 70% than if it starts at 10%. However, my empirical analysis and country examples indicate that the log-linear specification is more appropriate. The example of Ireland shows that a rise in the trade share may be very important even if the starting level is already high. Also, even large countries are able to raise their trade share by more than ten percentage points within just 10 years, as the case of Germany shows.

Finally, there is a case for another adjustment to the trade share in order to account for differences in relative price levels. Prices for exports and imports tend to be similar across all countries. Therefore, a country such as Japan with a high domestic price level would - all else being equal - show a lower trade share than a country with a low domestic price level. Alcalá and Ciccone (2004) were the first to promote an adjustment for this effect and they provide the theoretical background (p. 615ff). However, Rodrik, Subramanian and Trebbi (2004) criticize their approach for introducing a spurious correlation between openness and income, because a rise in productivity in the tradables sector can produce a rise in real openness (and income) regardless of the origin of the productivity increase. In addition, dividing the trade share by the PPP conversion factors for the year 2000 from the World Development Indicators generates a PPP trade share (in the Alcalá and Ciccone terminology, a “real” trade share) that is significantly lower in poor countries such as India and China. The same 10 percentage point rise in the raw nominal trade share can then no longer have the same effect on GDP growth. Therefore, I do not adjust for differences in price levels.

Fig. 7.1. Small countries with large trade shares
(Scatter plot of the nominal trade share, 1995-00, against the ln population level, avg. 1995-2005; labeled countries include Malaysia, Ireland, Belgium, China, NZ and Argentina.)
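The difference between the log-log and the log-linear specification discussed in this section can be made concrete with a small numerical sketch. The coefficients below are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math

def effect_log_linear(coef, start, rise):
    """Change in ln(GDP) when ln(GDP) = a + coef * share.

    The starting level does not matter; only the rise in points does.
    """
    return coef * rise

def effect_log_log(coef, start, rise):
    """Change in ln(GDP) when ln(GDP) = a + coef * ln(share)."""
    return coef * math.log((start + rise) / start)

# Hypothetical coefficients, for illustration only.
B_LINEAR = 0.01   # effect per percentage point of trade share
C_LOG = 0.5       # elasticity with respect to the trade share

low_start_linear = effect_log_linear(B_LINEAR, start=10.0, rise=10.0)
high_start_linear = effect_log_linear(B_LINEAR, start=70.0, rise=10.0)
low_start_log = effect_log_log(C_LOG, start=10.0, rise=10.0)
high_start_log = effect_log_log(C_LOG, start=70.0, rise=10.0)
```

In the log-log case the same 10 point rise counts for much less at a starting level of 70% than at 10%, whereas the log-linear form treats both rises identically, which is the property the Irish example speaks to.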
7.3 Empirical debate: levels versus growth

The literature on the linkage between openness and income is marred by many difficulties. Some are unavoidable, some are home-made. The unavoidable difficulties include the Laspeyres-measurement of GDP mentioned above and the problem of finding an appropriate and exogenous measure of openness. The home-made difficulties include a sometimes misleading handling of levels and differences of openness and income and the accompanying interpretation of the empirical results.
A considerable part of the literature is either not explicit about what link it is examining or is concerned with the link between the level of openness and the growth rate of GDP. Rodriguez and Rodrik’s (2001, p. 1) main question is stated as “do countries with lower barriers to international trade experience faster growth?” and this is what they seem to look for. Given my reasoning, it is not surprising that they find no robust link and “view the search for such a relationship as futile” (p. 39). Even more misleading is the fact that some papers effectively use models of levels on levels when deriving their claims that openness does or does not influence GDP growth.

Much of the confusion stems from the use of cross-section convergence regressions described in detail in chapter 2. The starting point is a production function, where the level of GDP is explained by the level of labor input, of capital and of openness. Subtracting initial income from both sides produces the convergence regression with the growth rate of GDP on the left hand side and the levels of labor, capital, openness and - crucially - initial GDP on the right hand side. Now, a positive coefficient on openness is interpreted as showing a link between the level of openness and the growth rate of GDP. However, all that has been done is to show that openness helps explain the level of steady state GDP, which - together with the gap to initial GDP - explains the growth rate of GDP.

For example, Dollar and Kraay (2003) conclude that “countries that trade more grow faster” although they start from a regression of the level of GDP on the level of the trade share. When they move to explaining GDP growth, they claim to analyze “the effect of changes in trade on changes in growth” (p. 155, their italics) without mentioning again that they added initial GDP on the right hand side.
Similar confusions and conclusions can be found in Lee, Ricci and Rigobon (2004), who go straight to regressing GDP growth on the level of openness, or in Chang, Kaltani and Loayza (2005) who regress changes in growth on changes in openness. Similarly, Edwards’ (1998) finding that ”more open countries will tend to experience faster productivity growth than more protectionist countries” should not be interpreted as the level of openness having a direct link to the growth rate of GDP. Rather, growth is higher for the same level of initial GDP. Therefore, the more appropriate interpretation in a convergence regression like the one used by Edwards would be that the level of openness has an impact on the level of GDP. Because of these confusions and misconceptions an extensive literature review appears inappropriate in this chapter. Some papers are clear and helpful, however. For example, Fendel and Frenkel (1999) regress the percentage changes of GDP on the percentage changes in trade shares and find a significant relationship. Yao and Lyhagen (2001) use the pooled mean group estimator to find a positive long-run relationship among per capita GDP, per capita capital and the trade share in 28 Chinese provinces in the period 1978 to 1997 with a significant positive coefficient on the trade share. However, they do not check whether there is more than one cointegrating relationship among these three variables.
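The levels-versus-growth point about convergence regressions can be written out in a short sketch. The notation is assumed for illustration and is not taken from the book: y denotes log GDP per capita, y* its steady-state level, open the openness measure, x the remaining level regressors (labor, capital) and λ an adjustment speed between zero and one.

```latex
\begin{align}
y_i^{*} &= \alpha + \beta\,\mathrm{open}_i + \gamma' x_i \\
y_{i,T} - y_{i,0} &= \lambda\,\bigl(y_i^{*} - y_{i,0}\bigr)
  = \lambda\alpha + \lambda\beta\,\mathrm{open}_i + \lambda\gamma' x_i - \lambda\,y_{i,0}
\end{align}
```

Under these assumptions, the coefficient on openness in the growth regression is λβ: it measures the effect of the level of openness on the level of steady-state GDP, scaled by the adjustment speed. Growth is higher only for a given initial GDP, so the estimate does not show a permanent effect of the level of openness on the growth rate.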
Another helpful contribution is Brunner (2003), who separately tests the effect of trade on income levels and on income growth for 125 economies over 1960 to 1992 using time series methods. He starts from the premise that for the level of trade to have permanent effects on GDP growth, it is necessary for both GDP growth and trade to be integrated of at least order one. He rejects non-stationarity of GDP growth and concludes that the level of trade openness cannot possibly impact the growth rate of GDP. This is in line with my empirical approach although Brunner does not use the powerful panel techniques. Chapter 12 will show that there is a cointegration relationship between trade openness and per capita GDP in both the 21 rich countries and the 19 emerging markets. In 10 of the 40 countries I find that GDP adjusts to deviations of openness and GDP from the cointegration relationship. While not as statistically significant and economically important as in the case of human capital, this nevertheless indicates that trade openness includes a significant amount of information that can be exploited for long-run growth forecasts.
Fig. 7.2. Trade openness in 21 rich countries
(Line chart of the deviation from the regression line, 1971 to 2001; labeled series include Belgium, Netherlands, Ireland and Australia.)

Fig. 7.3. Trade openness in 19 emerging markets
(Line chart of the deviation from the regression line, 1971 to 2001; labeled series include Singapore, Thailand, Nigeria and Argentina.)
Table 7.1. Overview of the empirical literature on openness

Author(s) | Year | Method | Variable | Source | Impact
Sala-i-Martin | 1997 | Cross-section | Years open (number) | Sachs & Warner (1995) | Robust positive relationship
Edwards | 1998 | Cross-section | Black market premium | Barro & Lee (1994) | Significant positive effect
Edwards | 1998 | Cross-section | Distortions in intern. trade | Heritage Foundation | Significant positive effect
Edwards | 1998 | Cross-section | Import tariff | UNCTAD | Significant positive effect
Edwards | 1998 | Cross-section | Openness dummy | Sachs & Warner (1995) | Significant positive effect
Edwards | 1998 | Cross-section | Outward orientation | World Development Report | Significant positive effect
Frankel & Romer | 1999 | Cross-section | Trade as % of GDP | IMF Direction of Trade | Significant positive effect
Hall & Jones | 1999 | Cross-section | Years open (fraction) | Sachs & Warner (1995) | Significant positive effect
Easterly & Levine | 2001 | Panel GMM | Trade as % of GDP | As in Beck et al. | Sig. positive effect on growth
Bassanini et al. | 2001 | PMG | Trade exposure adjusted (ln) | OECD & own calculations | Significant positive effect
Fernandez, Ley & Steel | 2001 | Cross-section | Years open (number) | Sachs & Warner (1995) | Modest role
Brunner | 2003 | Panel | Trade as % of GDP | IMF Direction of Trade | Significant and large effect
Dollar & Kraay | 2003 | Panel GMM | Trade as % of GDP at PPP | PWT 5.6 | Significant positive effect
Bosworth & Collins | 2003 | Cross section | Trade instrument | Frankel & Rose (2002) | Sign. positive effect on growth
Bosworth & Collins | 2003 | Cross section | Years open | Sachs & Warner (1995) | No additional effect
Barro & Sala-i-Martin | 2004 | Cross-section | Level of trade as % of GDP | n/a | No significant relationship
Chen & Dahlman | 2004 | Cross-section | Openness dummy | Sachs & Warner (1995) | Significant positive effect
Yao & Lyhagen | 2004 | PMG | Trade as % of GDP (real) | China Statistics Yearbook | Significant positive effect
Alcala & Ciccone | 2004 | Cross-section | Trade as % of GDP at PPP | PWT 5.6 | Sign. and robust positive effect
Doppelhofer et al. | 2004 | Cross-section | Years open (number) | Sachs & Warner (1995) | Strongly and robustly related
Hendry & Krolzig | 2004 | Cross-section PCGets | Years open (number) | Sachs & Warner (1995) | Significant positive effect
Hoover & Perez | 2004 | Cross-section | Years open (number) | Sachs & Warner (1995) | Significant positive effect
Batista & Zalduendo | 2004 | Cross-section plus time | Trade share adjusted | Own calculations | Significant positive effect
Rodrik et al. | 2004 | Cross-section | Trade as % of GDP (nominal) | PWT 6.0 | No effect on top of institutions
Ianchovichina & Kacker | 2005 | Cross-section | Trade share adjusted | Own calculations | Significant positive effect
This study | | 2-stage nonst. panel | Trade share adjusted | WDI | Significant positive link

Note: Table is not comprehensive. Some papers may use several measures and sample sizes.
8 Spatial linkages
“Everything is related to everything, but near things are more related to each other.” This first law of geography formulated by Waldo Tobler (1970) is frequently ignored in empirical growth analyses at the country level. Most growth models abstract from geographic distance by effectively treating countries as randomly distributed in space or as isolated islands. The aim of this chapter is to explore the importance of geographic distance for the growth of national economies. If you have a rich neighbor, are you more likely also to be rich? Similarly, if your neighbors grow quickly, are you more likely also to experience strong growth?

The lack of attention to distance in growth empirics is surprising, since distance features prominently in the gravity models used in the analysis of trade and migration. If a country has reached a high income level relative to its neighbors, this could potentially induce it to outsource activities to low-cost neighbors, thereby driving up income there.

Recently, theoretical growth models have made use of spillovers to explain economic growth:1 A country’s GDP growth rate depends on the growth and income levels of other countries. Similarly, the literature on the “distance to frontier” focuses on technological distance to the world leader and the diffusion of technology from the leader to the followers. As outlined in chapter 3 on GDP, the further a country is below the level of the world leader or the world average, the more it is supposed to benefit from spillovers. However, this literature is usually not concerned with the position of the leader in geographic space and with the distance that the spillover effects have to travel. An exception is Keller (2002), who shows that the benefits of spillovers decline with distance in 14 OECD countries and that the amount of spillovers is halved after about 1200 kilometers.

1 Klenow and Rodriguez-Clare (2005) provide a helpful taxonomy of different models.
Spatial econometrics is a subfield of econometrics that deals with the treatment of spatial interactions and spatial structure.2 Spatial dependence refers to observations in one country depending on the observations in other countries. Another issue is spatial heterogeneity which refers to parameters differing depending on the spatial characteristics of a country. The simplest example is a different constant (fixed effect) that is correlated with spatial aspects, although slope coefficients may also differ. The focus in this chapter will be on spatial dependence. More generally, spatial autocorrelation simply examines the coincidence of value similarity with locational similarity. The most widely used measure is Moran’s I statistic (introduced in Moran [1950]), which relates deviations from the mean of own values to the values of neighbors.3 Moran scatterplots are the graphical counterpart that will be used below. To my knowledge, spatial aspects have not been explored in a panel framework in the growth literature. Ramirez and Loboguerrero (2005) use a pooled cross-section approach for 98 countries to conclude that ”spatial relationships across countries are quite relevant.” They argue that a country’s economic growth is affected by the performance of its neighbors and that ignoring spatial interrelations can result in model misspecification. Similarly, Moreno and Trehan (1997) find that ”a country’s growth rate is closely related to that of nearby countries, and show that this correlation reflects more than the existence of common shocks.” Several studies look at regional economic growth, for example across US states (e.g. Lim [2003]) or European regions, and tend to conclude that there are significant cross-regional externalities. For example, Vaya et al. (2004) find for 17 Spanish regions that “a 10 percent increase in the level of total factor productivity of the neighbors raises the level of technology in one region by almost 3 percent.” Cannon et al. 
(2000) find for 142 European regions that if any particular region is experiencing a high long-term growth rate it is more likely that the regions closer to it will also experience a high long-run growth rate.

However, my analysis in section 8.2 in this chapter and more comprehensively in chapter 12 does not support these conclusions. Good economic performance is mostly home-made and does not simply spill over from across the border.4 I attribute the differences in conclusion to the differences in country samples, the ways of constructing spatial GDP and to the different estimation techniques used.

2 Anselin (2001) provides a comprehensive overview.
3 See Lim (2003) for a brief overview.
4 Certainly, active intervention by foreigners through colonialism or war would affect activity.
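Moran’s I, mentioned above as the most widely used measure of spatial autocorrelation, can be sketched as follows. This is a minimal illustration of the standard textbook formula, not code from this study; the function name and the toy inputs are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix.

    I = (n / S0) * sum_ij w_ij * z_i * z_j / sum_i z_i**2,
    where z_i are deviations from the mean and S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)

# Toy example: four units on a line, each linked to its direct neighbors.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_i = morans_i(toy_values, toy_weights)
```

A positive value indicates that similar values tend to sit next to each other. With a row-normalized weight matrix, S0 equals n and the statistic reduces to the slope of the line in a Moran scatterplot.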
8.1 Spatial economics - location matters

The literature on economic geography distinguishes between absolute and relative location.5 Absolute location refers to the location at a particular point in space, for example at a certain latitude or in a certain climate zone or continent. Relative location refers to the location nearer or further away from other countries.

8.1.1 Absolute location: latitude and climate

The most obvious reason to focus on absolute location is the fact that agricultural productivity is lower in warmer climates, because winter frost is essential for rich soils and for keeping diseases in check. Landes (1999) argues that the favorable climate was one of the reasons why the Industrial Revolution happened in Europe and not elsewhere. Plenty of rain and forests allowed independent, small economic units to prosper and compete. Technological progress was faster than in dry, hot regions. In addition, severe climate may lead to lower labor input (hours) because recovery periods have to be longer. Acemoglu, Johnson and Robinson (2001) argue that heavy incidence of malaria kept colonizers from building strong institutions in some colonies with lasting effects on income levels today.

Authors using absolute latitude or the distance from the equator include Sachs and Warner (1995), Lee, Pesaran and Smith (1997), Hall and Jones (1999) and Doppelhofer, Miller and Sala-i-Martin (2004). Continent dummies were used for example in Barro (1991). All this may be helpful in explaining differences in income levels across countries. But since absolute location is fixed, it will not be of much help in an attempt to explain and forecast GDP growth rates. It will be captured in the fixed effects in the panel analysis.

8.1.2 Relative location: rich neighbors

A potentially more important influence on a country’s economic performance may be its relative location to others and the performance of these neighbors.
For example, if there is political instability or even a civil war in a large neighboring country, then negative spillovers may depress activity also in the country of interest. Trade may slow, exchange of ideas may fade, military spending may rise. A country far away from the unstable country will suffer far less than an immediate neighbor.

Likewise, successful neighbors may provide a good example to emulate. This would imply that strong growth in neighboring countries would be followed by stronger growth at home once these growth-positive policies are implemented. However, the empirical difficulty is that the other drivers of growth such as human capital and trade openness would also accelerate and help to explain the final outcome of stronger GDP growth. In this case, successful neighbors might be the ultimate reason for growth-positive reforms, but it would be difficult to disentangle this empirically.

DeLong and Summers (1991) investigate whether the distance between capital cities has any explanatory power for the residuals from their cross-section estimates. They expected the product of the residuals from two countries to be high when the countries were close together, but could not find signs of this spatial correlation.

5 Abreu, de Groot and Florax (2004) provide a useful survey and show that the literature on spatial econometrics and growth only took off around the year 2002.
8.2 Constructing spatial GDP

In order to capture the economic characteristics of a country’s neighbors, I construct time series of “spatial GDP per capita”. To this end, a country’s neighbors in my sample have to be weighted: important countries have to receive a higher weight than less important countries. The most often used and easy to implement weighting matrix uses the physical distances between the countries or their main cities. This is the approach that will be followed here, in line with Moreno and Trehan (1997). Another possibility is to use trade weights (as in Arora and Vamvakidis [2005]), or whether regions or countries have a common border (as in Lim [2003] and Ramirez and Loboguerrero [2005]). However, the openness variable already captures trade linkages. Also, having a common border is far too narrow a measure of spatial relationships.

I use distance data from the Centre d’Etudes Prospectives et d’Informations Internationales (CEPII) as described in Mayer and Zignago (2006). They provide a matrix of bilateral distances in kilometers for 225 countries.6 For my sample of 40 countries covering 86% of world GDP, I use their measure distw. It provides the distance between two countries based on bilateral distances between the biggest cities of those two countries, where the inter-city distances are weighted by the share of the city in the country’s overall population. For example, the distance between Spain and Germany is given as 1,627 kilometers, while the distance between Spain and Argentina is 10,080 kilometers.

Since nearby countries have to get a higher weight than more distant countries, these distances have to be inverted first. And since the weights of all neighbors have to sum to one for each country, these inverted distances have to be standardized (“row normalized”). The resulting matrix shows that Germany has a weight of 4.6% in Spain’s spatial GDP, while Argentina has a weight of 0.74%.
The ratio between these two shares equals the inverse of the ratio of the two distances mentioned earlier. An alternative specification would be to use a quadratic instead of a linear function. However, this would give a very small role to more distant countries. In the case illustrated above, Germany would have 38 times the weight of Argentina in Spain’s spatial GDP if a quadratic function were used - in the linear case the ratio was 6.2.7

Not surprisingly, the resulting trajectories of spatial GDPs per capita look similar to the trajectories of the underlying series for GDP per capita - only much smoother as country-specific business cycles tend to get averaged out. Figures 8.3 and 8.4 at the end of this chapter plot the series.

The so-called Moran scatterplot in figure 8.1 shows that there is a small positive correlation between the levels of spatial GDP and own GDP in my sample of 40 countries. However, this is probably not indicative of a causal relationship. Rather, it is driven by the rich European countries being surrounded by other rich (European) countries. The USA, Australia, Singapore and Japan show that it is possible to have high levels of per capita GDP despite having relatively poor neighbors. Egypt, Turkey and Mexico are examples of countries with relatively rich neighbors (at least in my 40 country sample) that nevertheless do not have a high level of GDP per capita themselves.

6 http://www.cepii.fr/anglaisgraph/bdd/distances.htm
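The weighting scheme of this section, inverted distances that are then row normalized, can be sketched as follows. The two distances are the ones quoted in the text; the function names are hypothetical, and restricting Spain to two partners is a simplification, so the resulting weights are not the 4.6% and 0.74% obtained with all 39 partner countries. Only the ratio between the two weights carries over.

```python
def inverse_distance_weights(distances, power=1.0):
    """Row-normalized inverse-distance weights for one home country.

    distances: dict mapping partner country -> distance in kilometers.
    power=1.0 gives the linear case, power=2.0 the quadratic alternative.
    """
    inverted = {country: km ** -power for country, km in distances.items()}
    total = sum(inverted.values())
    return {country: value / total for country, value in inverted.items()}

def spatial_gdp(weights, gdp_per_capita):
    """Weighted average of the partners' GDP per capita."""
    return sum(w * gdp_per_capita[country] for country, w in weights.items())

# Distances from Spain as quoted in the text (CEPII measure distw).
spain = {"Germany": 1627.0, "Argentina": 10080.0}

linear = inverse_distance_weights(spain)
quadratic = inverse_distance_weights(spain, power=2.0)

ratio_linear = linear["Germany"] / linear["Argentina"]
ratio_quadratic = quadratic["Germany"] / quadratic["Argentina"]
```

The linear ratio equals 10,080 / 1,627, or about 6.2, and the quadratic ratio is its square, about 38, which matches the figures given above and shows how quickly the quadratic form discounts distant countries.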
Fig. 8.1. Moran plot for the levels of GDP per capita in 2000-2003
Turning from levels to growth rates of spatial GDP per capita, there are some - albeit small - differences across countries over longer horizons. For example, over 1994 to 2003, Japan’s spatial GDP rose relatively strongly at 2.92% per annum, almost 0.6 percentage points above the average. This stems from the fact that Korea and China (with weights of 16.6% and 8.0% in Japan’s spatial GDP) posted particularly strong growth over that period. Despite this strong growth of neighboring economies, Japan’s own GDP per capita rose by just 1.1% per annum. At the low end of the rankings of spatial GDP growth, Chile had to face just 1.9% weighted growth in its neighboring countries. Here, the weak growth performance of Argentina and Brazil (with weights of 20.0% and 7.2% in Chile’s spatial GDP) left its mark.

7 Since I do not use all 225 countries in the weighting, poor countries such as Nigeria and Egypt, whose actual neighbors are not in my sample, will receive a relatively high level of spatial GDP.

Fig. 8.2. Moran plot for the growth rates of GDP p.c. 1994-2003
(Scatter plot of own GDP growth p.c., deviation from average in percentage points, against spatial GDP growth p.c., deviation from average; labeled countries include Ireland, China, Korea, India, Taiwan, Chile, Japan, Indonesia, Switzerland and Argentina.)
The Moran scatterplot in figure 8.2 showing the deviations of growth rates from averages illustrates that there was no visible link between own country growth rates in 1994 to 2003 and neighbors’ growth: Japan in the bottom right quadrant had below-average own growth despite strong performance of its neighbors. China in the top right quadrant clearly outperformed its neighbors, as did Ireland and Chile in the top left quadrant. Brazil, Italy and Switzerland all faced a combination of weak growth at home and nearby (bottom left quadrant). The absolute size of a neighboring economy does not play a role in my measure of spatial GDP. However, in the gravity models of migration and trade, the weight of the attracting unit is an important element. A large neighbor should be more important than a small neighbor. It is probably more important for the German economy if French GDP grows by 3% than if Luxembourg’s GDP grows by 3%. In my framework, these size effects are already captured by the trade openness variable. For Germany’s trade share it clearly makes a difference whether all neighbors are the size of Luxembourg or the size of France. In addition, when the home country is itself very large, whatever happens in neighboring countries is unlikely to matter that much. In principle, it is possible to take this into consideration in spatial analysis: Moreno and Trehan (1997) scale their initial weighting matrix with the
corresponding ratios of foreign to domestic output to also capture effects of country size. Again, this effect is already mostly captured in my openness variable - as is the effect of absolute distance, for example that New Zealand is much further away in absolute terms from other countries than is Germany. All else equal, if a country has large, rich and close neighbors, its trade openness measure will be higher than that of a remote country with small, poor neighbors. There is no correlation between the levels of spatial GDP and the levels of trade openness in the 40 countries in 2000-03.
8.3 Sum: Spatial linkages not much help

While spatial linkages are theoretically appealing at first sight, I find little evidence of a major impact of neighbors’ GDP growth on home country growth. The analysis of the Moran scatterplots in this chapter will be supported by the panel analysis in chapter 12. While there is - not surprisingly given the way the data are constructed - a long-run panel cointegration relationship between own GDP and spatial GDP, the errors from this relationship enter significantly only in 9 out of 40 models for short-run GDP growth. Likewise, lagged spatial GDP growth enters only 3 of 40 short-run models. Therefore my results do not support those derived from cross-section regressions for large and heterogeneous samples. Simple correlation as measured by Moran’s I does not necessarily indicate causality. I conclude that growth is mostly driven by policies and decisions taken at home - which may themselves be positively correlated across space as the Asian Tigers show. Growth does not simply spill over from next door; it requires the right decisions at home.
Fig. 8.3. Spatial GDP p.c. of 21 rich countries (in PPP USD of 2000, 1971-2001; labeled series: USA, Japan, Australia)
Fig. 8.4. Spatial GDP p.c. of 19 emerging markets (in PPP USD of 2000, 1971-2001; labeled series: Turkey, Taiwan, Singapore)
9 Other determinants of GDP
Long-run economic growth is a highly complex phenomenon. Therefore, the measures discussed so far cannot provide a comprehensive picture of all the relevant factors. However, expanding the list of variables beyond those presented generates a variety of problems. Either data are not available or the theoretical link is not clear or the empirical link is not strong enough. One very important candidate for help with modeling the progress of technology is spending on research and development (R&D). However, the ratio of R&D spending to GDP shows no correlation with the growth rates of GDP: countries with high rates of R&D spending do not grow systematically faster, a point made clearly by Klenow and Rodriguez-Clare (2005). While this lack of correlation does not imply that there is no partial effect of R&D, it suggests the need for very detailed analysis. On the other hand, countries with high R&D shares tend to have high levels of income. Given my focus on growth rates of GDP, this would suggest the use of changes in R&D shares to explain growth rates of GDP. However, R&D shares barely changed over the past decades, so they may not be too helpful in explaining long-term GDP growth. Using stock measures appears more promising and would be in line with the other variables suggested in the previous chapters. Unfortunately, constructing stocks of R&D is not an easy undertaking. Bottazzi and Peri (2007) construct domestic and global stocks of knowledge using patent applications from 15 rich countries. They find a cointegrating relationship between the domestic stock of knowledge, domestic R&D resources and the global stock of knowledge. Two difficulties prevent the inclusion of this relationship in the present study. First, their knowledge data are available for only 15 countries, while the focus here is on 40 countries. 
And second, forecasting both R&D resources and the stock of global knowledge is a tremendous task on its own but would be necessary for a forecasting model. Furthermore, the correlation between the stock of R&D and the stock of human capital is very high, so that it is not clear whether there will be sufficient additional information given the high cost of constructing the data. Still, a measure of the R&D stock could be the focus of future research.
A second candidate to consider would be the structure of a country’s financial system. A financial system that channels resources to the most promising activities and allows risk sharing, with the accompanying decline in financing costs, would clearly be beneficial for GDP. The empirical literature indicates that episodes of financial market liberalization are followed by stronger GDP growth (Bekaert et al. [2005]) and that better developed financial markets ease firms’ external financing constraints (Levine [2005]). However, the literature struggles with how to best measure the quality of a financial system. Using, for example, equity market capitalization may not be a promising route as the high level of equity markets in Japan in the early 1990s illustrates. In addition, forecasting financial system quality is even more challenging. Therefore, these questions are not explored here although they might be the subject of future research. Another candidate is the political constitution of countries. Sometimes it is argued that democratic governments may not only have a positive effect on non-material well-being, but also on GDP. However, the evidence is far from convincing as the example of poor but democratic India illustrates. Tavares and Wacziarg (2001) even find a moderately negative overall effect of democracy on growth. A compromise would be to assume that there is no major effect of the political setup on economic activity either way. In addition, a large body of literature (e. g. Rodrik, Subramanian and Trebbi [2004], Acemoglu, Johnson and Robinson [2001] and North [2005]) argues that high quality institutions are crucial for explaining the levels of GDP per capita. However, measuring these institutions - and for my purpose the changes in their quality - is a far from trivial task. Most empirical papers from the cross-section area use just one observation on quality for each country, but time series would be necessary for a forecasting model. 
In addition, the assessments of institutions usually stem from investors or observers and may be subject to endogeneity bias. Rich countries may get a high rating partly because they are rich. In sum, it would be extremely helpful to have time series of objective measures of institutions, but these simply are not available. Overall, as the empirical growth literature has shown repeatedly, new questions and variables keep coming up all the time. I close my system at the borders marked in the preceding chapters but acknowledge that there are other elements that also deserve attention.
10 The theory of forecasting
As we know, there are known knowns. There are things we know we know. We also know there are known unknowns. That is to say we know there are some things we do not know. But there are also unknown unknowns, the ones we don’t know we don’t know.1
This assessment by the U.S. Secretary of Defense was not warmly received by all listeners back then, but it summarizes the difficulties of forecasting quite well. Academics would put it slightly differently: “because of the things we don’t know we don’t know, the future is largely unpredictable. But some developments can be anticipated, or at least imagined, on the basis of existing knowledge.”2 Before building a forecasting model it is necessary to review some of the basics of forecasting theory. This should help focus attention on the most relevant issues and avoid some of the traps in forecasting. However, since forecasting is only one of the purposes of my analysis, I will not follow all the recommendations in that literature. Forecasting often is a mixture of science and art, i. e. judgments on how to deal with a host of issues have to be made. The basics of forecasting are outlined in Diebold (2004), while Clements and Hendry (1999) is a more advanced exposition (summarized in Clements and Hendry [2003]). The following sections will draw heavily on these two sources. Unfortunately, the general conclusion is that forecasting is extremely difficult and that confidence bands tend to be very wide. This is particularly the case in macroeconomics where endogeneity problems abound.
1 U.S. Secretary of Defense Donald Rumsfeld at a Department of Defense news briefing, Feb. 12, 2002.
2 Singer (1997). Singer is a biochemist.
10.1 The benefits of forecast experiments
The importance of long-run growth forecasts for many different constituencies was emphasized in the introduction in chapter 1. Despite its practical importance, forecasting is not a key element of economic curricula and of econometric textbooks. Few economics departments offer courses in forecasting.3 The growth literature tends to avoid the topic of forecasting altogether. The textbooks of Jones (2002a), Barro and Sala-i-Martin (2004) and Hemmer and Lorenz (2004) do not mention it. The 1998-page Handbook of Economic Growth published in 2005 does not include a chapter on forecasting. It is not clear why a whole strand of economics avoids the forecasting issue even though the demand for growth forecasts is so strong. This study tries to fill the gap. As argued in the introduction, growth theory may benefit from forecast exercises because more focused questions may be asked and the practical relevance of the different growth theories has to be shown. Unfortunately, the beneficial effects do not necessarily go both ways: Diebold (1998) concludes that there is no support for the belief that a greater reliance on economic theory will help forecasting. Successful models are those that adapt quickly to shifts in the economy - not necessarily models that are based on sound economic theory.
10.2 The characteristics of good forecasts
The link between theory and forecasting is also important when communicating forecasts. Communication is easier if the underlying model is coherent, i.e. logically integrated, consistent and intelligible. Logical coherence requires that forecasts satisfy economic identities, e.g. that the budget deficit equals expenditures minus revenues. Economic coherence requires that forecasts make economic sense by obeying economic rules and regularities in the data. A business cycle model where unemployment does not fall when GDP grows above trend is not coherent in this sense. At times, the requirement for coherence may conflict with other requirements, as the most coherent model need not necessarily be the most precise one. Coherence is important for my model as well, which should also be useful for policy analysis. I side with Don (2000, p. 11) that we “should put logical and economic coherence above satisfying some statistical criteria.”
3 A basic econometrics textbook such as Greene (1993) has only six pages dealing with forecasting out of more than 700 in total. The panel textbooks by Arellano (2003), Baltagi (2001) and Hsiao (2003) all do not mention forecasting. At least the benchmark book on time series econometrics by Hamilton (1994) has a full chapter on forecasting.
Another standard requirement for forecasts is stability.4 This could imply that forecasts must not change significantly between one forecasting exercise and the next - otherwise the change requires a detailed explanation. The requirement of stability may also mean that forecasts must not completely contradict the current consensus view. For example, a model forecasting 2% annual GDP growth in China in the next 5 years has next to no credibility because the forecast is so far away from the consensus following several years of GDP growth of around 9%. Therefore, I will compare the forecasts for 2006-20 derived at the end of this study with the countries’ past growth experience and with the forecasts in Bergheim (2005). When evaluating the quality of forecasts, it is important to keep the loss function of the users of these forecasts in mind. Usually, a future loss (or gain) depends on a decision today and on the state of the world in the future. Deviations in either direction may be equally relevant: overestimating growth may be just as costly as underestimating it. The loss may rise linearly with the difference between the forecast and the actual outcome. Or it may rise with the square of this difference: larger errors carry a more-than-proportional weight.5 In the first case, the “mean absolute error” is the criterion to evaluate different forecasting models. In the second case the “mean squared error” would be used. I will mostly focus on the second, assuming a loss function where larger absolute errors matter more. An optimal forecast should be unbiased, i. e. the average error should be zero. This implies for example that additional information should not be able to improve the forecast: the “unforecastability principle” (Diebold [2004], p. 295). In the case of a bias, regressing the forecast error on a constant would help improve the forecast - a clear violation of the unforecastability principle.
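As a minimal sketch of how the two loss functions can rank models differently, the following Python snippet computes the mean absolute error, the mean squared error and a simple bias statistic. The growth numbers and the names `model_a` and `model_b` are hypothetical and chosen only for illustration; the bias statistic is the t-ratio of the mean forecast error, which is what a regression of the errors on a constant delivers.

```python
import numpy as np


def forecast_errors(forecast, actual):
    """Forecast minus actual outcome, as a float array."""
    return np.asarray(forecast, dtype=float) - np.asarray(actual, dtype=float)


def mean_absolute_error(forecast, actual):
    """Criterion for a loss that rises linearly with the error."""
    return float(np.mean(np.abs(forecast_errors(forecast, actual))))


def mean_squared_error(forecast, actual):
    """Criterion for a loss that rises with the square of the error."""
    return float(np.mean(forecast_errors(forecast, actual) ** 2))


def bias_t_statistic(forecast, actual):
    """t-ratio of the mean error (regression of the errors on a constant)."""
    errors = forecast_errors(forecast, actual)
    standard_error = errors.std(ddof=1) / np.sqrt(errors.size)
    return float(errors.mean() / standard_error)


# Hypothetical annual growth rates in percent (illustration only).
actual = [2.0, 2.0, 2.0, 2.0]
model_a = [3.0, 1.0, 3.0, 1.0]  # four moderate misses of one point each
model_b = [2.0, 2.0, 2.0, 5.0]  # three exact hits and one large miss

mae_a = mean_absolute_error(model_a, actual)
mae_b = mean_absolute_error(model_b, actual)
mse_a = mean_squared_error(model_a, actual)
mse_b = mean_squared_error(model_b, actual)
bias_a = bias_t_statistic(model_a, actual)
bias_b = bias_t_statistic(model_b, actual)
```

With these illustrative numbers, the mean absolute error favors model B while the mean squared error favors model A, because the squared loss penalizes the single large miss more heavily.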
Since I will only produce one forecast per country, the bias test will have to be along the cross-country dimension. Forecasts tend to assume that the model is correctly specified and that it remains valid over the forecast horizon. Unfortunately, reality is likely to be different.6 New, unpredictable events will happen in line with Rumsfeld’s categorization quoted above. Therefore, a model that may have captured past events quite well may not be able to capture the events that will drive the future. All econometric models are mis-specified and all economies experience significant, unanticipated shifts. The future rarely is like the past. As is now well known, great in-sample fit is no guarantee for good forecasts. On the contrary, simple, parsimonious models tend to have the best out-of-sample forecasting performance. However, the reverse also holds: the best forecasting models often do not capture the past development of the variable very well. The general rule in forecasting is called KISS: Keep it sophisticatedly simple! The “parsimony principle” applies: simple models are preferable to complex models (Diebold [2004], p. 45). As Clements and Hendry (1999) point out, the general advantage of simple models stems from their higher flexibility in adjusting to breaks. Clements and Hendry identify shifts in the coefficients of deterministic terms as the most important source of systematic forecast errors. Some simple models such as exponentially-weighted moving averages are very flexible in capturing such shifts, while other simple models such as a linear deterministic trend are not flexible. Moreover, causally-relevant variables cannot be relied upon to produce the best forecasts when the model is mis-specified for a non-constant mechanism (Clements and Hendry [2001, p. 29]). In fact, forecast failure can be an indicator of important structural changes in the economy. Likewise, shifts in the equilibrium of a cointegration relationship may lead to serious forecast errors: forecasts may be for a decline in the variable of interest exactly at a time when it rises strongly. So it is certainly necessary to analyze the sources of a model’s forecasting difficulties. Do they stem from the cointegration errors, which may be particularly large at the most recent observation? The trade-off for my long-run GDP model is clear. On the one hand, I want it to incorporate economic theory to be able to help policy decisions. So the model has to be reasonably comprehensive. On the other hand, poor forecasting performance relative to very simple models may damage its credibility. To get a sense for this trade-off, the model’s out-of-sample forecasts will be evaluated against very simple time-series models in chapter 13. A further severe problem for forecasts is that policy can influence them. Decisions taken today may affect the path of the economy and therefore the outcome of the variable to be forecast. Therefore, forecasts must be made conditional on a certain policy decision.
4 See Don (2000), p. 10.
5 In principle, truncated or asymmetric loss functions are also possible. More details on loss functions can be found in Diebold (2004), p. 32ff.
6 More details can be found in Clements and Hendry (2003).
Forecasts may also be made conditional on many other factors. For example, business cycle forecasts may assume unchanged interest rates, exchange rates and oil prices. If these turn out different, then the forecast will be off even if the forecasting model itself is excellent. Therefore, it is important to clarify the assumptions made and to present alternative scenarios using different assumptions on important but uncertain variables. The forecasts presented in this study are conditional on the forecasts for the exogenous variables materializing. Weak forecasting performance of a model tells us nothing about the policy implications of the model. Forecasts may be wrong because exogenous variables turned out different from what was expected. The policy links may still be valid. Furthermore, my set of forecasts assumes that the patterns characterizing the data in the past will continue to hold in the future, i.e. the regression coefficients are assumed to stay stable. The Lucas critique has made clear that this may not necessarily be the case in practice. However, as Frankel (2003) points out, “if we can never use past experience to predict consequences of some innovation in policy, then we might as well give up and go home.”
10.3 Intercept correction and forecast combination
An important practical issue is how to link the forecast and the actual data series. Figure 10.1 illustrates the choices. The simple forecast model is a deterministic trend with a slope equivalent to 2.0% growth per year. However, the last actual observation for the year 2000 lies well above the fitted line. Assuming that the error between actual GDP and the fitted level disappears over the forecast horizon, actual GDP would grow by just 1.6% per annum over the five-year horizon. This can be interpreted as a closing of the output gap by 2005. In my fundamental model, the error-correction components will capture the adjustment to any deviations from long-run equilibrium.
Fig. 10.1. Intercept correction for a simple trend model (ln GDP against year, 1981-2005; shows the actual series, the last actual observation, the fitted trend lnY = a + 0.02*t, the forecasts growing at 1.6% p.a., and the intercept-corrected forecast)
Clements and Hendry (2003) show that intercept correction is a widely used and successful strategy to improve a model’s forecasting performance. These corrections set the error at the last observation equal to zero. The chart illustrates this upward shift in the trend line. The 2% p. a. GDP forecast from the trend line is now attached to the last observation. Therefore, the average annual growth forecast with intercept correction would be 2.0% instead of the 1.6% without intercept correction. Because of the importance attributed to intercept correction in the forecasting literature, I will also employ corrected forecasts in the forecast competitions. Clements and Hendry recommend always using intercept correction when deterministic shifts are suspected. Again, judgment enters the construction of forecasts. My fundamental models produce annual forecasts for GDP growth (rather than levels, as in the trend model depicted in figure 10.1). Therefore, an intercept correction is neither necessary nor possible here.
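The arithmetic behind the 1.6% and 2.0% figures can be retraced with a short Python sketch. The fitted level of ln GDP in 2000 and the size of the gap (0.02 log points) are assumptions chosen so that the numbers match the text; they are not read off the underlying data, and growth rates are approximated by log differences.

```python
def average_log_growth(log_start, log_end, years):
    """Average annual growth, approximated by the log difference per year."""
    return (log_end - log_start) / years


slope = 0.02          # trend growth of 2.0% per year, as in the text
horizon = 5           # forecast horizon 2000 to 2005
fitted_last = 10.00   # hypothetical fitted ln GDP in 2000
gap = 0.02            # hypothetical: last actual lies 0.02 log points above trend
actual_last = fitted_last + gap

# Without correction the forecast returns to the fitted trend line by 2005.
trend_forecast = fitted_last + slope * horizon

# Intercept correction shifts the trend line by the last observed error,
# so that the error at the last observation is zero.
corrected_forecast = trend_forecast + (actual_last - fitted_last)

growth_without = average_log_growth(actual_last, trend_forecast, horizon)
growth_with = average_log_growth(actual_last, corrected_forecast, horizon)

print(f"without correction: {growth_without:.1%} p.a.")
print(f"with correction:    {growth_with:.1%} p.a.")
```

Under these assumptions a gap of 2% that closes over five years lowers average growth by 0.4 percentage points per year, which is the difference between the two forecasts.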
In general, it is possible to present point forecasts, interval forecasts or density forecasts (Diebold [2004], p. 38ff). In this study I will mostly use point forecasts, but will keep in mind that these are surrounded by considerable uncertainty. Interval forecasts give a range in which the future outcome is expected to lie with, say, a 60% probability. However, it is not straightforward to produce these confidence intervals. Merely using the standard errors of the underlying equation is usually not enough because uncertainty about the forecasts of the explanatory variables will add to the overall uncertainty. Many examples in macroeconomic forecasting show actual outcomes well outside the calculated confidence bands. The third option, density forecasts, is even more demanding and rarely used in practice. One exception is the Bank of England with its ‘fan charts’ on inflation which show a widening probability density as the forecast horizon expands. It turns out that these densities are also quite wide, which may hurt the forecast’s public credibility. The uncertainty stems to a large extent from uncertainty about the path of the exogenous variables. The lesson to be learned for my question is that one should try to base the forecasts on exogenous variables that are relatively easy to predict. The trade-offs are clear: a slow-moving variable may be easy to forecast, but it may not matter all that much for the model. Another variable may be very important for the model, but extremely difficult to predict, thereby introducing considerable uncertainty into the forecast of the variable of interest. This trade-off already influenced the selection of the variables presented in the previous chapters. For the forecast competitions and to derive the forecasts over 2006-20, I need to generate forecasts for the exogenous variables. 
I will do this by using either forecasts from other institutions (UN for population) or simple rules such as “average growth will equal the average of the past 10 years” or “average growth will equal the weighted average of past growth rates in that country and past growth in other countries”. Another salient insight from the forecasting literature is that combining (or pooling) forecasts from different models often leads to a forecast with lower errors and variances, similar to combining different assets in a financial portfolio. This is particularly often the case when individual forecasts are based on different information sets. For example, combining the sales forecasts from an econometric model with the judgmental forecast from the sales force stands a good chance of doing better than each of the two forecasts separately. Usually, regression analysis is used to determine whether there is any value in combining forecasts and what the weights should be. In some cases it might make sense to let the weights vary with the forecast horizon: one model might be better at explaining shorter horizons, while another model might explain the long run. In the context of forecasting GDP growth, one could weight the numbers from a business-cycle model and those from a long-run growth model to derive annual GDP forecasts. One of the reasons to do out-of-sample forecasting exercises is to find out which models perform well and which models
may be combined to produce the best overall result. This will be the subject of chapter 13. Current best practice in macroeconomic forecasting appears to be multivariate vector error-correction models. However, I have too few observations to apply this technique, and I also have strong views on which variables are reasonably exogenous. Therefore, I do not use a vector autoregressive model (VAR). Rather, I will use panel data, which are not discussed in the forecasting book of Clements and Hendry (1999). The advantage of the use of panel data is that we can learn about a country’s behavior by observing the behavior of other countries in addition to that of the country in question. Of course, this is only helpful if the countries behave similarly - or if differences in behavior are properly taken care of.
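The regression approach to forecast combination mentioned earlier in this section can be sketched as follows. This is a minimal illustration on simulated data, not the procedure used in chapter 13: the two forecast series, their noise levels and the helper names `combination_weights` and `combine` are all hypothetical. The weights come from an ordinary least squares regression of the outcome on a constant and the individual forecasts.

```python
import numpy as np


def design_matrix(forecasts):
    """Stack a constant and the individual forecasts column by column."""
    columns = [np.asarray(f, dtype=float) for f in forecasts]
    return np.column_stack([np.ones(columns[0].size)] + columns)


def combination_weights(actual, forecasts):
    """OLS weights from regressing the outcome on a constant and the forecasts."""
    coefficients, *_ = np.linalg.lstsq(
        design_matrix(forecasts), np.asarray(actual, dtype=float), rcond=None
    )
    return coefficients


def combine(coefficients, forecasts):
    """Combined forecast implied by the estimated weights."""
    return design_matrix(forecasts) @ coefficients


def mse(forecast, actual):
    return float(np.mean((np.asarray(forecast) - np.asarray(actual)) ** 2))


# Simulated data: two forecasts with independent errors (illustration only).
rng = np.random.default_rng(0)
actual = 2.0 + rng.normal(scale=1.0, size=200)
forecast_1 = actual + rng.normal(scale=0.8, size=200)
forecast_2 = actual + rng.normal(scale=0.8, size=200)

weights = combination_weights(actual, [forecast_1, forecast_2])
combined = combine(weights, [forecast_1, forecast_2])

mse_1 = mse(forecast_1, actual)
mse_2 = mse(forecast_2, actual)
mse_combined = mse(combined, actual)
```

Within the estimation sample the combined forecast cannot do worse than either individual forecast, since each of them is a special case of the regression. Whether the gain survives out of sample is an empirical question, which is why the weights should be judged in out-of-sample exercises.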
11 The evolution of growth empirics
The ultimate goal of this study is to create a sensible and robust forecasting model for long-run GDP growth. This model has to be built with two main components: variables (appropriately measured and transformed) and econometric techniques. The preceding chapters outlined the different theories, the candidate variables and how best to measure them. The chapter on forecasting sketched some of the theoretical requirements for a successful forecasting model. Finally, the most appropriate available econometric technique has to be selected. This is not a straightforward task because a wide range of methods is used in growth empirics and new methods are becoming available almost every month. No consensus has yet emerged on the best technique. This chapter therefore assesses the strengths and weaknesses of different econometric methods for long-run growth analysis. It also explains why I use a two-stage panel method that combines cointegration analysis and a stationary model. The range of empirical techniques extends from the still popular cross-section methods to the recently developed panel cointegration techniques.1 Growth empirics took off with Barro (1991) and Mankiw, Romer and Weil (1992), which used long-run averages of GDP growth and its determinants in a cross-country framework. The initial reaction came from Levine and Renelt (1992), who challenged the robustness of Barro’s results. A whole industry of cross-section empirics developed in the 1990s and is still alive today, with Fernandez, Ley and Steel (2001), Doppelhofer, Miller and Sala-i-Martin (2004), and Hauk and Wacziarg (2004) as recent examples. In very important contributions Islam (1995) and Caselli, Esquivel and Lefort (1996) were among the first to use panel methods to deal with heterogeneity issues. Lee, Pesaran and Smith (1997) showed that heterogeneity also applies to long-run growth rates and convergence speeds. The pooled mean group estimator of Pesaran, Shin and Smith (1999) makes it possible to deal with these heterogeneities, but does not yet test for cointegration. The work of Pedroni (2000 and 2001), Breitung (2000 and 2005) and others on panel unit roots and panel cointegration provides a toolbox to model economies’ long-run dynamic properties and exploit panel information. My setup combines their work with stationary panel techniques and time-series models to allow enough parameter heterogeneity.
1 Chapter 2 showed that the growth accounting technique, which is also still popular, adds little value. The argument of Barro and Sala-i-Martin (2004, p. 459) combined with constant capital-output ratios implies that all growth in per capita income stems from changes in TFP in the simple models. Capital adjusts to productivity gains, but provides no independent contribution.
11.1 Still widely used: cross-section
The overviews of the empirical literature in the chapters on the individual drivers of economic growth highlighted the still widespread use of cross-section (since the section usually is a country: cross-country) estimation techniques over the past 15 years. It was the best available technique when growth empirics took off in the early 1990s, given the scarcity of time series back then. Table 11.1 lists some of the most relevant contributions in chronological order and shows that even in 2004/05 many authors used cross-section techniques. However, this approach has many weaknesses - outlined below - and I will not use it to build my forecasting model. Fortunately, the focus on these weaknesses has helped foster the development of more appropriate techniques over the past 15 years or so. The newly developed panel techniques also imply higher data requirements: time series have to be available. Techniques and datasets developed in parallel. In order to clarify assumptions as well as strengths and weaknesses of cross-section methods and to track the evolution of empirical methods, it is useful to sketch the basic estimation framework in some detail. The equation to be estimated is generally derived from a simple neoclassical production function, often augmented with human capital. Labor input and technology in country i are assumed to grow at exogenously determined rates of $n_i$ and $g_i$ as outlined earlier on page 10 in the chapter on growth theory. A constant fraction $s_k$ of output is saved to accumulate physical capital and a constant fraction $s_h$ is saved to accumulate human capital. Using the steady-state characteristic that physical and human capital per efficiency unit of labor no longer change, one can derive an equation in levels as in Mankiw, Romer and Weil (1992) and similar to equation 2.7 shown in the chapter on growth theory:
$$
\ln y_i = A_0 + g t + \frac{\alpha}{1-\alpha-\beta} \ln s_{k,i} + \frac{\beta}{1-\alpha-\beta} \ln s_{h,i} - \frac{\alpha+\beta}{1-\alpha-\beta} \ln(n_i + g + \delta) + \epsilon_i \tag{11.1}
$$

where the index $i$ runs across countries, $t$ is a time trend, $\epsilon_i$ is the regression error and the other coefficients are as described in chapter 2.

Table 11.1. Empirical techniques used in the literature

Year  Author(s)                            Method used
1991  Barro                                Cross-section
1991  DeLong & Summers                     Cross-section
1992  Levine & Renelt                      Cross-section
1992  Mankiw, Romer & Weil                 Cross-section
1994  Loayza                               Panel π
1995  Islam                                Panel LSDV
1996  Caselli, Esquivel & Lefort           Panel GMM
1997  Klenow & Rodriguez-Clare             Cross-section
1997  Lee, Pesaran & Smith                 Panel
1997  Sala-i-Martin                        Cross-section
1998  Edwards                              Cross-section
1999  Frankel & Romer                      Cross-section
1999  Hall & Jones                         Cross-section
2000  Bils & Klenow                        Cross-section
2000  de la Fuente & Doménech              Cross-section
2000  Hanushek & Kimko                     Cross-section
2001  Bassanini, Scarpetta & Hemmings      Pooled Mean Group
2001  Bernanke & Gürkaynak                 Cross-section
2001  Cohen & Soto                         Cross-section
2001  Easterly & Levine                    Panel GMM
2001  Fernandez, Ley & Steel               Cross-section
2001  Krüger & Lindahl                     Cross-section
2001  Pritchett                            Cross-section
2001  Sarno                                Cointegration analysis
2002  Dowrick & Rogers                     Panel GMM
2002  Soto                                 System GMM
2003  Bosworth & Collins                   Cross-section
2003  Brunner                              Panel
2003  Dollar & Kraay                       Panel GMM
2004  Alcalá & Ciccone                     Cross-section
2004  Barro & Sala-i-Martin                Cross-section
2004  Batista & Zalduendo                  Cross-section plus time
2004  Bond, Leblebicioglu & Schiantarelli  Pool
2004  Chen & Dahlman                       Cross-section
2004  Doppelhofer, Miller & Sala-i-Martin  Cross-section
2004  Hauk & Wacziarg                      CS, GMM & LSDV
2004  Hendry & Krolzig                     Cross-section PCGets
2004  Hoover & Perez                       Cross-section
2004  Portela et al.                       Cross-section & panel
2004  Rodrik, Subramanian & Trebbi         Cross-section
2004  Yao & Lyhagen                        Pooled Mean Group
2005  Benhabib & Spiegel                   Cross-section & panel
2005  Ianchovichina & Kacker               Cross-section
      This study                           2-stage method

To estimate this in the cross-country dimension, MRW make a host of strong assumptions. For
example, the elasticities α and β in the production function are assumed to be identical across countries despite the extremely heterogeneous sample of 98 countries. MRW also assume that the growth rate of exogenous technological progress g is the same 2% across countries and that the depreciation rate δ is identical at 3% across countries.2 Population growth and the investment ratios enter in their country-specific averages over the years 1960 to 1985. The number of observations is equal to the number of countries. Adding more countries increases the degrees of freedom but also the heterogeneity of the sample. Unfortunately, this setup works least well in the sub-sample where these assumptions are most likely to hold: the OECD sample. As outlined earlier in this study, one of the reasons for these difficulties may be that capital-output ratios are not the same across the OECD countries. The move to convergence regressions in the empirical literature was already highlighted in chapter 2. Here, the time dimension enters through the comparison of averages of growth drivers with the starting level of income or productivity. A classic cross-section convergence contribution comes from Barro (1991): In a sample of 98 countries he regressed the growth rate of real GDP per capita over the period 1960 to 1985 on 1960 school enrollment and the 1960 level of GDP per capita. Over the past 15 years, the literature has mushroomed, as ever more regressors were added to the model either to introduce a new theory of economic growth and/or to improve the fit of the regressions. Additional so-called “state variables” were introduced to better model A0 (e.g. life expectancy in 1960 and the fertility rate), while so-called “control variables” were to capture other determinants of steady-state income such as average trade openness over 1960-85. As outlined earlier, one has to be careful when interpreting the results from convergence regressions. 
With the growth rate of GDP (or TFP) on the left-hand side and initial GDP on the right-hand side, the control variables capture the level of steady-state GDP. Therefore, a positive coefficient on a right-hand side variable should not be interpreted as the level of this variable having an unconditional positive impact on the rate of long-run GDP growth. In these models, GDP growth depends on the difference between initial and steady-state GDP levels.
2 Chapter 5 showed that this 3% is at the low end of the range of depreciation rates used in the literature.

11.2 Weaknesses of cross-section regressions

In principle, it is hard to see how an inherently dynamic process such as economic growth can be modeled by abstracting from the dynamics through the use of long-run time averages and by focusing only on cross-sections. This may have been necessary in the early days of growth empirics because of the
scarcity of time series, but it does not seem appropriate anymore. The other weakness of cross-section methods is that they assume all parameters to be homogeneous across countries: the production function, the rate of technological progress and the rate of convergence. In addition, exogeneity of the explanatory variables is assumed and the errors are assumed to be uncorrelated with the regressors. All five assumptions are likely to be false, as the following sections will explain in detail.

11.2.1 Same production function assumed

As outlined above, the basic cross-country growth estimates assume identical production functions in all economies: in the long run the same amount of input will generate the same amount of output in all economies. Technology is a public good to which all countries have access. Leaving aside the general problems with the aggregate production function as outlined in chapter 2, this appears to be a highly restrictive assumption. A slightly less restrictive assumption would see the elasticities of substitution between the production factors identical, but at least allow the initial level of technology A0 to differ across countries. If these differences in technology exist, then cross-section regressions will suffer from omitted variable bias if the explanatory variables are correlated with A0 in equation 11.1. The standard cross-section estimator will not be consistent. Caselli, Esquivel and Lefort (1996) show that the speed of convergence will be biased downward when individual effects are ignored. Technically, accounting for these individual effects requires a fixed effects panel model. Islam (1995) was the first to apply this approach to growth studies. He indeed finds significant differences in A0 across countries. However, Solow (2001) thinks that this is not enough heterogeneity: “It is certainly unwise to assume that all economies are equally efficient at reallocating inputs across industries” so that capital intensity may adjust to changing relative factor prices.
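The omitted variable bias described in this subsection can be made explicit in a stylized single-regressor case. The notation is an illustrative assumption and not the book's equation 11.1: x_i stands for one explanatory variable, b for its true coefficient, and e_i for an error that is uncorrelated with x_i. The unobserved technology term ln A_{0i} is left out of the regression.

```latex
% Illustrative single-regressor case; the symbols x_i, b and e_i are assumptions.
y_i = a + b\,x_i + \ln A_{0i} + e_i
\quad\Longrightarrow\quad
\operatorname{plim}\,\hat{b}_{\mathrm{OLS}}
  = b + \frac{\operatorname{Cov}(x_i,\ \ln A_{0i})}{\operatorname{Var}(x_i)}
```

The second term vanishes only if the regressor is uncorrelated with initial technology. If ln A_{0i} is constant over time, a country-specific intercept absorbs it, which is the logic behind the fixed effects approach.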
11.2.2 Long-run growth path assumed to be constant and the same across countries

Another weakness of the cross-section method is that technological progress is assumed to be constant and identical across countries (gi = g) and is thereby pushed to the sidelines. The term g is absorbed in the constant term. However, this abstracts from a potentially very interesting aspect of the growth process: technologies may develop differently across countries over time. The assumption that technology improves at the same rate in all countries in the sample is not tested - and cannot be tested - with cross-section methods. However, recent research based on panel techniques shows that this assumption does not hold. Lee, Pesaran and Smith (1997) find that steady-state growth rates differ significantly across countries and that ignoring these differences significantly biases the estimates of the speed of convergence downward.
Likewise, Evans (1998) uses cointegration tests on the natural logarithm of GDP per capita for country pairs and finds widely differing trend growth rates for a sample of 54 countries over 1950 to 1990. The results from the sub-sample of 27 relatively poorly educated populations lead him to conclude that technical knowledge may not be ultimately accessible to countries with relatively small stocks of per-capita human capital. In a similar spirit, Quah (1993) argues that long-run growth rates are neither stable in one country nor the same across countries.

11.2.3 Same pace of conditional convergence assumed

Cross-section models assume that all countries converge to their steady state at the same pace: λi = λ for all i. This assumption is not (and cannot be) tested with cross-section methods. In reality, countries are heterogeneous and their flexibility to react to shocks differs significantly. Indeed, most panel models find heterogeneous adjustment speeds across countries (for example: Bassanini, Scarpetta and Hemmings [2001]). The model used below supports this view of heterogeneity.

11.2.4 Errors are assumed uncorrelated with the explanatory variables

The identifying assumption in Mankiw, Romer and Weil (1992) is that the error term in their regression is not correlated with the regressors such as the investment ratio and population growth. However, Islam (1995, p. 1134) argues that “it is not entirely convincing that savings and fertility behavior will not be affected by all that is included in A(0)”, i.e. in the country-specific features of an economy which MRW absorb in the error term. Ordinary least squares estimation is only valid if the regressors are independent of the error term. Fixed-effects panel estimates can tackle this problem, as proposed by Islam (1995).

11.2.5 Right-hand side variables assumed exogenous

Among the most robust variables in cross-section studies are the population growth rate and the investment ratio.
From the discussion in chapter 4 on labor input it should have become clear that population growth is endogenous: higher per capita income leads to lower birth rates, not the other way around. Also, the discussion of physical capital in chapter 5 indicated that investment seems to react to the returns on physical capital (defined as GDP per unit of capital, so they include the left-hand side variable), which depend on many other factors such as the input of human capital, the country’s trade openness and its institutions. This is in line with Caselli, Esquivel and Lefort (1996) who argue that the rate of investment in physical capital is determined simultaneously with the rate of growth.
11.2.6 In sum: many assumptions are violated

In sum, several studies show that the assumptions used in the cross-section literature are violated. Also, cross-section methods have to use highly heterogeneous countries to get enough degrees of freedom. Combined with the assumptions outlined above, this is problematic. Zambia is unlikely to have the same preferences or technology as the United States. This view was emphasized many years ago, but it has not prevented the cross-section industry from flourishing. For example, DeLong and Summers (1991) were “skeptical of what can be learned by combining in one regression very poor countries, which appear to have productivity levels less than Britain before the industrial revolution, with technologically-sophisticated developed countries.” They nevertheless include Peru and Costa Rica, which had GDP per worker around 65% below the US level in 1980, in their cross-section estimation. In spite of all these contributions, Barro and Sala-i-Martin (2004) is probably by far the most influential textbook on growth theory and empirics, even though it does not mention the weaknesses of cross-section methods or the alternative methods that are becoming available.
11.3 The climax of cross-section

Despite these fundamental shortcomings as well as Robert Solow’s 1994 revelation that he does “not find this a confidence-inspiring project” (Solow [1994], p. 51) and Romer (2001, p. 226) referring to the cross-section approach as a “dead end”3, the cross-section industry reached a climax around the turn of the millennium. These studies ignore the weaknesses of cross-section techniques outlined above, the invalidity of assumptions made and the problems with data especially for investment and human capital. It is nevertheless important to be aware of the developments in this area and of the conclusions drawn in order to understand the current state of the empirical growth literature and to appreciate the differences from the panel work promoted in this study.

3 He also says that looking “only at the cross-country regression evidence, there is nothing that would raise questions about the initial assumption of identical technologies in all countries.” (ibid)

Sala-i-Martin (1997) uses millions of cross-section regressions to test the robustness of the variables used in empirical growth studies and to show that Levine and Renelt (1992) were too pessimistic in their conclusion that hardly any variable robustly explains growth. Fernandez, Ley and Steel (2001) use a Bayesian model averaging strategy on 2 million regressions and come to similar conclusions to Sala-i-Martin: there are robust correlates with GDP growth in cross-section regressions. Sala-i-Martin refined the analysis with coauthors using a Bayesian averaging of classical estimates (BACE) approach, which combines an averaging of estimates across models (a Bayesian concept)
with the classical approach of OLS estimation. Doppelhofer, Miller and Sala-i-Martin (2004) find 11 variables robustly partially correlated with long-term growth. Their conclusions are based on approximately 21 million randomly drawn cross-section regressions on a dataset of 32 variables for 98 countries over the period 1960 to 1992. No reference is made to the weaknesses and disadvantages of cross-section regressions.

Brock and Durlauf (2001) criticize the robustness investigations of Levine and Renelt (1992) and Sala-i-Martin (1997) as being vulnerable to the likely collinearity of regressors: any coefficient is highly unstable when alternative collinear regressors are added to the model. Hoover and Perez (2004) introduce another layer to the analysis by comparing the robustness tests used by Levine and Renelt (1992) and those used by Sala-i-Martin (1997) with a general-to-specific approach. Based on Monte Carlo experiments they show that the Levine and Renelt approach is too stringent and rejects a true relationship too often (small size, but low power), while the Sala-i-Martin approach is not stringent enough and accepts a false relationship too often (high power, but large size). They advocate the general-to-specific approach which is shown to have size and power near their nominal levels. Florax, de Groot and Heijungs (2002) use a “response surface meta-analysis” to show that a limited number of variables are robustly related to growth. Hendry and Krolzig (2004) promote a quick and easy way to reach the same conclusions as Fernandez, Ley and Steel (2001) by also using a general-to-specific procedure. They contrast their approach with the millions of regressions that Sala-i-Martin and his coauthors require.

While these certainly are further methodological contributions, they still do not address the crucial issues in growth empirics. For example, Hoover and Perez (2004) use a cross-section approach for some 100 highly heterogeneous countries.
Forecasting exercises at the IMF and at the World Bank published in 2004 and 2005 also used cross-section methods.4 These ever more sophisticated cross-section exercises did not address the really important methodological issues outlined earlier in this chapter.

4 Batista and Zalduendo (2004), Ianchovichina and Kacker (2005).

Solow (2001, p. 282) repeatedly said that he has “been skeptical from the beginning about the interpretation of cross-country growth regressions.” “It is probably not a good idea to set cross-country regressions the task of explaining the rate of growth ... It is time paths that need to be modeled and studied” (p. 287). In addition, he is especially wary of applying his own theoretical framework to developing economies, something that is common in the cross-section models. Cross-section analysis focuses too narrowly on the issue of convergence although the real focus should be the evolution of output over time. Islam (1998, p. 326) thinks that the work of Lee, Pesaran and Smith (1997) “robs the concept of convergence of any real meaning, as far as the cross-country dimension of this concept is concerned”. We are back at the original Solow
model that explicitly focuses on the development within one country. Capturing differences in countries’ growth paths should be the prime task of growth empirics.
11.4 Advantages of panel techniques

Estimating cross-section regressions does not allow the analysis of the complex dynamic adjustment processes or of the heterogeneity of growth rates across countries. An alternative would be to analyze countries individually with time series methods. However, some of the relevant data are only available at annual frequencies starting in the 1960s or 1970s. Given the large number of potentially relevant regressors, this leads to a low number of degrees of freedom. Trying to deal with the short sample problem by using the mean group method, where equations are estimated country by country and then averages of the coefficients are used, is unlikely to lead to meaningful country estimates. Another approach would be to abstract from any regressors and use simple filters for GDP to estimate a rate of potential growth. However, filters like the Hodrick-Prescott version suffer from the well-known end-point problems. My goal is to avoid that vulnerability by focusing on growth determinants. However, Hodrick-Prescott filters will be used as one of the benchmarks for comparison with forecasts from fundamental models.

Only panel models can capture the evolution of economies over time and still provide enough degrees of freedom to include a reasonable number of regressors. In general, a panel or longitudinal dataset follows several cross-sections over time (see for example Hsiao [2003], p. 1 and Wooldridge [2003], p. 408): there is a cross-sectional dimension N and a time-series dimension T. In my context, data for GDP and investment for the same country would be observed over several consecutive points in time. By contrast, a pool is a dataset with observations on different units at different points in time drawn randomly from a large population, such as prices of houses in different cities in different years.5 This is consistent with Hsiao (2003, p.
16) who refers to a pool as a panel where all coefficients (constants and slopes) are the same across the units (because one cannot determine whether the same unit has been observed several times).6

5 Often “to pool” refers to adding additional units to a panel. This may explain why Durlauf, Johnson and Temple (2005) refer to the panel work of Islam (1995) as a pool analysis.

6 In contrast to the textbook definitions, EViews for its own historical reasons distinguishes between a pool that has few cross-sections and a panel which has many cross-sections. EViews allows unit-specific slope coefficients only in “pools”, which contradicts Hsiao’s definition.

In growth empirics, studies such as that of Barro and Sala-i-Martin (2004), who split samples into smaller periods and treat the observations on the USA in 1970-80 just like those on Japan in 1970-80 and like those on the USA in 1980-90, are pools. The estimate does not
take account of the fact that one country is observed several times. Pooling increases the degrees of freedom, but it does not capture the time dimension of the data. The panel equivalent of the MRW equation is:

$$\ln y_{it} = a_{0i} + g_i t + a_1 \ln s_{k,it} + a_2 \ln s_{h,it} - a_3 \ln(n_{it} + g + \delta) + \epsilon_{it} \qquad (11.2)$$

where the index i runs across countries and the a0i are the fixed effects.

The first and most important advantage of panel approaches over cross-section methods is that they make it possible to control for unobserved (omitted) individual-specific characteristics that may be correlated with explanatory variables but constant over time. Figure 11.1 illustrates how cross-section estimates using long-run country averages may find a positive slope while fixed effects estimates would find no relationship between X and Y. The large dots are averages of the annual observations (small dots) for each country and the thin regression line through these large (cross-section) dots has a positive slope. By contrast, using all observations for each country and a country-specific constant (fixed effect) in a panel model will result - in this stylized example - in a slope of zero.

[Figure 11.1: scatter plot of Y against X showing the observations for country 1 and country 2. Annotations: “Fixed effects estimation: slope = 0” and “Cross-section estimation using time averages: positive slope”.]

Fig. 11.1. Difference between cross-section and panel
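The stylized situation in Figure 11.1 can be reproduced with a minimal sketch. The data are made up for illustration and are not the values behind the figure: Y is flat within each country, but the country with higher X also has a higher level of Y. The helper `ols_slope` and the dictionary layout are hypothetical names. The fixed effects slope is obtained with the "within" transformation, that is, by subtracting country means before running least squares.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    return num / den


# Hypothetical annual observations for two countries.
data = {
    "country 1": {"x": [22.0, 23.0, 24.0, 25.0, 26.0],
                  "y": [11.3, 11.5, 11.4, 11.5, 11.3]},
    "country 2": {"x": [18.0, 19.0, 20.0, 21.0, 22.0],
                  "y": [10.3, 10.5, 10.4, 10.5, 10.3]},
}

# Cross-section estimate: one observation per country (the time averages).
x_means = [mean(obs["x"]) for obs in data.values()]
y_means = [mean(obs["y"]) for obs in data.values()]
cross_section_slope = ols_slope(x_means, y_means)

# Fixed effects (within) estimate: remove each country's own mean first,
# so that only the variation over time inside a country identifies the slope.
x_within, y_within = [], []
for obs in data.values():
    x_bar, y_bar = mean(obs["x"]), mean(obs["y"])
    x_within += [xi - x_bar for xi in obs["x"]]
    y_within += [yi - y_bar for yi in obs["y"]]
fixed_effects_slope = ols_slope(x_within, y_within)

print(f"cross-section slope: {cross_section_slope:.3f}")
print(f"fixed effects slope: {fixed_effects_slope:.3f}")
```

With these made-up numbers the regression through the two country averages has a slope of 0.25, while the within estimate is zero up to rounding. The difference comes entirely from the country-specific level of Y, which the cross-section regression attributes to X.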
The second advantage is that panels offer more data points and more degrees of freedom than cross-section or time-series datasets, which leads to more efficient estimates. More interesting hypotheses can be tested. Temple (1999) already argued that “panel data studies will increasingly be the best way forward for many questions, especially as longer spans of data become available.” And, one might add, as better estimation techniques become available. Therefore, panel techniques are ideal for my purpose, which is to model
the development of the trend in (steady-state) GDP plus any convergence to that steady state. For example, some of the slope coefficients can be allowed to be country-specific. As I will outline below, the panel literature has relaxed the assumptions of identical production functions (Islam [1995], Evans [1998]), identical convergence speeds (Lee, Pesaran and Smith [1997]) and identical pace of technological progress. Endogeneity issues still need to be addressed by a careful selection of the regressors.

However, one still has to avoid combining too heterogeneous data. Ignoring individual-specific characteristics that exist among the different cross sections can lead to parameter heterogeneity and biased coefficients. In this case, estimates of the relevant parameters may turn out inconsistent or meaningless (Hsiao [2003], p. 8). Therefore, panel models which allow enough flexibility and permit some of the coefficients to be heterogeneous across countries appear preferable. This implies that countries with very different characteristics should only be included in the panel if the fixed effects can model that heterogeneity.

In order to abstract from business cycle variation and to deal with the fact that many variables are measured less frequently than annually, many studies used medium-term averages, for example over 5 years. This leaves at most eight time observations given that datasets usually do not start before 1960 or extend beyond 2000.

11.4.1 Initial technology can differ across countries

The seminal paper by Islam (1995) was one of the first to advocate the use of panel methods in growth empirics. He estimated a dynamic panel model which makes it possible to capture the country-specific levels of technology, using a least squares dummy variable (LSDV) “within” estimator and the minimum distance (MD) estimator on five data points with variables in five-year averages over 1960-85 for 96 countries.
The LSDV estimate uses a full set of country-specific intercepts (or dummies) to capture the heterogeneity in technology. With this approach, omitted variables that are constant over time will not bias the estimates. The slope parameters are homogeneous and identified by using only the within-country variation in the data. Islam finds a wide range of fixed effects and thus a wide range of efficiency with which countries are transforming their capital and labor resources into output: technology does differ across countries. Loayza (1994) uses a similar approach and draws similar conclusions to Islam.

11.4.2 Dealing with endogeneity bias

Unlike Islam, Caselli, Esquivel and Lefort (1996, below: CEL) do not focus on differences in technology levels (individual effects), so they eliminate the country-specific effect by first-differencing the data. They also
eliminate time effects by calculating deviations from period means. Instead, their focus is on the endogeneity bias that is present both in Mankiw, Romer and Weil (1992) and in Islam (1995) and on the correlation of lagged income with the error term, which was also not addressed by Islam. CEL propose a Generalized Method of Moments (GMM) estimator following Arellano and Bond (1991), using lagged levels as instruments for the changes. They indeed find that treating the country-specific effects properly raises the estimated speed of convergence. They also find strong evidence of endogeneity. GMM dynamic panel estimators are also used by Easterly and Levine (2001) and by Dowrick and Rogers (2002). However, this technique is rather problematic because the relationship between the variables and their instruments is likely to be weak. Indeed, the GMM approach was criticized by Bond, Hoeffler and Temple (2001) because lagged levels are weak instruments for subsequent first differences.

11.4.3 Addressing lagged dependent bias

Judson and Owen (1999) also use panel methods to identify country-specific effects. Their focus is on the well-known bias of the LSDV method when estimating the coefficient of a lagged dependent variable. They suggest the use of a corrected LSDV estimator for panels with a small time dimension (t