English Pages 152 [153] Year 2023
Contributions to Economics
Jerzy Witold Wiśniewski
Forecasting from Multi-equation Econometric Micromodels
Contributions to Economics
The series Contributions to Economics provides an outlet for innovative research in all areas of economics. Books published in the series are primarily monographs and multiple author works that present new research results on a clearly defined topic, but contributed volumes and conference proceedings are also considered. All books are published in print and ebook and disseminated and promoted globally. The series and the volumes published in it are indexed by Scopus and ISI (selected volumes).
Jerzy Witold Wiśniewski Econometrics and Statistics Nicolaus Copernicus University, Toruń Toruń, Poland
ISSN 1431-1933 ISSN 2197-7178 (electronic) Contributions to Economics ISBN 978-3-031-27491-6 ISBN 978-3-031-27492-3 (eBook) https://doi.org/10.1007/978-3-031-27492-3 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Introduction
The existing economics literature is dominated by works presenting forecasting solutions based on econometric models in the form of a single stochastic equation. Forecasting from multi-equation models has very rarely been the focus of the econometric literature. Significant interest in econometric macromodels emerged in the twentieth century. The multi-equation models known in the literature are mostly systems of interdependent equations, mainly describing the national economies of various countries. Most commonly, macromodels are based on annual time series, which are characterized by a “smooth” run. Econometric macromodels based on quarterly data are an exception. The description accuracy of each equation is usually high in such cases, since empirical equations with values of the coefficient of determination $R^2$ above 0.95, often reaching 0.99, dominate. Possible discrepancies between the forecasts from reduced-form equations and the forecasts from a given model's structural-form equations are not notable in such cases. That is why forecast builders have not attempted to extensively develop forecast-building procedures for each of the three classes of multi-equation models: simple models, recursive models, and systems of interdependent equations. When it comes to systems of interdependent equations, the use of the reduced form for construction of econometric forecasts has predominated. The aim of this work is to verify the author's forecasting solutions for prediction from econometric micromodels of a simple, recursive, and interdependent-equation nature. All forecasting solutions are based on structural-form empirical equations, which are mainly intended for econometric micromodels. Forecasting from a multi-equation model is performed for each equation individually, just as in single-equation models. 
Due to the endogenous variable lags, however, appropriate ordering is required, which implies the need for so-called sequential forecasting in each class of multi-equation models. In recursive models, as well as in systems of interdependent equations, so-called chain prediction, inherent in recursive models, is additionally required. Systems of interdependent equations necessitate the use of iterative forecasting, which involves a chain-like procedure. This yields a proposal of a forecasting procedure for prediction from a system of interdependent equations, which can be defined as reduced-recursive (Wiśniewski 2016a, pp. 43–45, 2017, 2021). It can also be referred to as helical (or snail-like) forecasting. Under the circumstances indicated above, the procedure contributes to the theory of econometric forecast building. The econometric forecasting procedures proposed are illustrated with empirical examples based on real economic data, mostly derived from business practice. The procedure of forecast building from systems of interdependent equations will be presented for two categories of econometric models: models with a feedback effect and a model with closed-loop links between interdependent variables. The forecasts obtained via this technique will be compared with the results derived from the reduced-form equations of the given econometric model. An attempt will also be made to generalize the rules for applying the reduced-recursive (helical, iterative) procedure, against the background of the hitherto suggested method of forecast building from reduced-form equations of systems of interdependent equations.
References
Wiśniewski JW (2016a) Microeconometrics in business management. Wiley, New York
Wiśniewski JW (2017) Predykcja z układu równań współzależnych. Econometrics 1(55):9–20
Wiśniewski JW (2021) Forecasting in small business management. Risks 9(4):69. https://doi.org/10.3390/risks9040069
Contents
1 Single-Equation Econometric Model
  1.1 The Essence of Econometric Models
  1.2 Econometric Model Specification
  1.3 Econometric Model Parameter Estimation
  1.4 Econometric Model Verification
  1.5 Multiplicative Econometric Models
  1.6 Bound Endogenous Variables
  References
2 Multi-Equation Econometric Models
  2.1 Multi-Equation Model Classification
  2.2 Reduced Form of a Model
  2.3 Model Identification
  2.4 Multi-Equation Model Parameter Estimation
3 Econometric Forecasts
  3.1 The Concept of an Econometric Forecast
  3.2 Conditions for Econometric Forecast Estimation
  3.3 Forecasts from Single-Equation Models
  3.4 Analysis of Econometric Forecast Accuracy
  3.5 Forecast Estimation from Multi-Equation Models
  References
4 Forecasting from Simple Econometric Micromodels
  4.1 Forecasts from an Enterprise Cost Micromodel
  4.2 Forecasting of Worker Efficacy from a System of Two Simple Equations
  4.3 Forecasting of Sales Representative Efficacy from a System of Two Simple Equations
  References
5 Forecasting from Recursive Econometric Micromodels
  5.1 Forecasts from an Econometric Model of a Medium-Sized Enterprise
  5.2 Econometric Forecasts of a Sports Equipment Selling Enterprise’s Costs
  5.3 Forecasts from a Recursive Model of China’s Payment Card Market
  References
6 Forecasting from an Econometric Micromodel in the Form of a System of Interdependent Equations
  6.1 The Specifics of Forecasting an Enterprise as an Economic System
  6.2 Iterative Forecasting from a Closed-Loop Econometric Micromodel, Assuming a System Inertia
  6.3 Iterative Forecasting when Interfering with the System Using Control Variables
  6.4 Liquidity and Debt Collection Efficiency Forecasting, Using the Forecasts of Loop Variables
  References
Conclusion
Chapter 1
Single-Equation Econometric Model
1.1 The Essence of Econometric Models
An econometric model in the form of a single stochastic equation is a primary tool in econometrics. The subject of its description is a dependent variable $Y$ with observations $y_t$, where $t$ is the number of the statistical observation ($t = 1, \ldots, n$) and $n$ is the sample size. The dependent variable is economic in nature and represents a specific economic category.¹ Explanatory variables $X_1, \ldots, X_j, \ldots, X_k$ essentially represent the factors causing dispersion of the dependent variable $Y$. Each of the explanatory variables is also assigned statistical observations: $x_{t1}$ for the variable $X_1$, ..., $x_{tj}$ for the variable $X_j$, ..., and $x_{tk}$ for the variable $X_k$. The most general form of a single stochastic equation model can be written as:

$$y_t = f(x_{t1}, \ldots, x_{tj}, \ldots, x_{tk}, \eta_t), \tag{1.1}$$

where another variable additionally occurs: the random component $\eta_t$. The random component gives the model its stochastic character and results from:
• the random nature of economic phenomena and processes,
• conscious and deliberate omission of less important and statistically insignificant factors,
• inaccuracies in the observation and measurement of economic phenomena and processes,
• lack of full precision when determining the equation’s analytical form,
• rounding errors during numerical calculations when applying model parameter estimation procedures.

1 The nature of the category represented by the dependent variable assigns the model to a specific discipline. A dependent variable representing a demographic category, for instance, makes the model demometric; if the dependent variable is sociological in character, the model is sociometric; when the dependent variable represents a psychological category, the model is psychometric.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 J. W. Wiśniewski, Forecasting from Multi-equation Econometric Micromodels, Contributions to Economics, https://doi.org/10.1007/978-3-031-27492-3_1

The most commonly used analytical form of the model is linear:

$$y_t = \alpha_0 + \alpha_1 x_{t1} + \ldots + \alpha_j x_{tj} + \ldots + \alpha_k x_{tk} + \eta_t. \tag{1.2}$$

The model can take a shorter form:

$$y_t = \sum_{j=0}^{k} \alpha_j x_{tj} + \eta_t, \tag{1.3}$$

where $x_{t0} = 1$ for every $t$.
Structural parameters ($\alpha_1, \ldots, \alpha_j, \ldots, \alpha_k$) occur in Eqs. (1.2) and (1.3) as measures of each explanatory variable’s impact on the dependent variable.² Parameter $\alpha_0$ is called the model’s constant term and often cannot be interpreted economically. Of all the nonlinear analytical forms of econometric models, those referred to in the literature as multiplicative are used most commonly. The first variant of such a model is a power model:

$$y_t = \alpha_0 x_{t1}^{\alpha_1} \cdots x_{tj}^{\alpha_j} \cdots x_{tk}^{\alpha_k} e^{\eta_t}, \tag{1.4}$$

a concise form of which can be written as:

$$y_t = \alpha_0 \prod_{j=1}^{k} x_{tj}^{\alpha_j} e^{\eta_t}. \tag{1.5}$$

The second variant of a multiplicative model is exponential:

$$y_t = \alpha_0 \alpha_1^{x_{t1}} \cdots \alpha_j^{x_{tj}} \cdots \alpha_k^{x_{tk}} e^{\eta_t}. \tag{1.6}$$

Its shorter form can be written as:

$$y_t = \prod_{j=0}^{k} \alpha_j^{x_{tj}} e^{\eta_t}, \tag{1.7}$$

where $x_{t0} = 1$ for every $t$.

2 Structural parameter $\alpha_j$ ($j = 1, \ldots, k$) indicates that an increase in the value of observation $x_{tj}$ by one unit, assuming immutability of the other explanatory variables (the ceteris paribus principle), changes the size of $y_t$ by $\alpha_j$ units.
Mixed-type multiplicative models, i.e., power-exponential models, may also be used.³ An example of a power-exponential model can be written as follows:

$$y_t = \alpha_0 x_{t1}^{\alpha_1} x_{t2}^{\alpha_2} \alpha_3^{x_{t3}} \alpha_4^{x_{t4}} e^{\eta_t}. \tag{1.8}$$
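The analytical forms (1.2), (1.4), and (1.6) can be compared numerically. The sketch below is only an illustration: the function names, parameter values, and observations are hypothetical, and the random component is set aside ($\eta_t = 0$) so that only the systematic part of each model is evaluated. It also shows the ceteris paribus reading of a linear model's structural parameter given in footnote 2.

```python
import math

def linear(alpha, x):
    """Systematic part of model (1.2): alpha[0] + sum_j alpha[j] * x[j-1]."""
    return alpha[0] + sum(a * v for a, v in zip(alpha[1:], x))

def power(alpha, x):
    """Systematic part of model (1.4): alpha[0] * prod_j x[j-1] ** alpha[j]."""
    return alpha[0] * math.prod(v ** a for a, v in zip(alpha[1:], x))

def exponential(alpha, x):
    """Systematic part of model (1.6): alpha[0] * prod_j alpha[j] ** x[j-1]."""
    return alpha[0] * math.prod(a ** v for a, v in zip(alpha[1:], x))

# Hypothetical parameter values and one observation of two explanatory variables.
alpha = [2.0, 0.5, 1.5]
x = [4.0, 3.0]
x_shifted = [5.0, 3.0]  # x_t1 raised by one unit, x_t2 unchanged (ceteris paribus)

# In the linear model the change in y_t equals alpha_1.
linear_effect = linear(alpha, x_shifted) - linear(alpha, x)
power_value = power(alpha, x)
exponential_value = exponential(alpha, x)
```

With these assumed values, `linear_effect` reproduces $\alpha_1$, whereas in the multiplicative forms the same one-unit change rescales $y_t$ rather than shifting it by a constant amount.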
Construction of an econometric model entails five consecutive steps:
(a) model specification,
(b) model identification,
(c) parameter estimation,
(d) model verification,
(e) model exploitation.
During model specification, the purpose and scope of the study are defined. The model’s set of variables, i.e., the dependent variable and the explanatory variables, is also determined at this stage. The method of variable measurement is then indicated. Further, the statistical material needed for the study ought to be collected, in the form of time series or as cross-sectional data.⁴ Finally, a model hypothesis must be formulated as an appropriate analytical form of an equation or equations.⁵ Such specification results in a hypothetical (theoretical) econometric model.

Model identification is required when models consist of multiple stochastic equations. The issue of mathematical correctness of the model construction needs to be resolved, as discussed in Chap. 2.

Estimation of the model parameters begins with the selection of an estimator suitable for the hypothetical econometric model.⁶ The estimator is used for numerical calculations. Using the statistical information available, estimates of the model’s structural parameters and stochastic structure parameters are obtained.

Model verification involves validation of the empirical model’s statistical quality and its economic logic. To test the model’s statistical quality, specialized goodness-of-fit measures⁷ as well as a variety of statistical tests need to be used.

Model exploitation entails application of the model for its intended purpose, in accordance with the objective of its construction. This can serve as support in economic decision making. Another type of model application is simulation. The most common use of time-series-based models entails forecast estimation of economic phenomena and processes.

3 Explanatory variables of discrete nature (discrete variables) should enter the model only in exponential form, since it is difficult to assign an economic interpretation to their structural parameters in the power form.
4 The statistical data collected must be of appropriate quality, i.e., it should be comparable, with no gaps in the statistical series, both in the interior and on the edges thereof. Any minor data deficiencies can be filled using statistical techniques (interpolation, extrapolation). The statistical material should be free of any statistical bias.
5 For models consisting of multiple equations.
6 The estimator ought to be selected so as to have the necessary statistical properties, i.e., consistency, unbiasedness, efficiency, and sufficiency. Details are presented in Sect. 1.3.
7 Global and specific goodness-of-fit measures.
1.2 Econometric Model Specification
From an economic perspective, econometric model specification is the key to its proper construction. Any inadequacy in specification can result in numerous model flaws. The primary task in model specification is to define the economic system and its components. The system components are represented by the dependent variable and the explanatory variables. Many of the system components that are expressed by the dependent variable can be defined and measured in various ways. The dependent variable must be specified so that it is equivalent⁸ to the economic object or its feature. The economic category of output (production), for instance, can be represented by a number of variables, e.g., the value of finished production at cost, the value of finished production at selling prices, net sales revenue, gross sales revenue,⁹ or cash inflows from sales of goods and services. Depending on the purpose of the study, output can be represented in the model by the variable adequate for the given case. Consideration of the factors possibly affecting a given dependent variable (by either stimulating or inhibiting it) leads to the specification of the model’s potential explanatory variables. Their impact on the dependent variable will be subject to verification by the empirical model. Once the dependent variable and the explanatory variables are defined, the statistical data necessary for the analysis should be collected. The number of statistical observations for each of the potential model variables is of significance here: it must visibly exceed the number of explanatory variables. It is desirable that the condition of a so-called large statistical sample is met. A lack of necessary statistical information, its poor quality, or a significant number of gaps in the statistical material can prevent the modeling investigation. Marginal gaps in statistical data can be filled through statistical techniques (interpolation, extrapolation).
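Filling marginal gaps by interpolation can be sketched as follows. The example assumes simple linear interpolation between the neighbouring observed values, which is only one of many possible techniques; the function name, the series, and the number of explanatory variables are hypothetical. Gaps at the edges of the series would require extrapolation and are deliberately left untouched here.

```python
def fill_interior_gaps(series):
    """Fill None entries lying between two observed values by linear interpolation."""
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

# Hypothetical series with two missing interior observations.
sales = [100.0, None, 110.0, 120.0, None, 140.0]
sales_filled = fill_interior_gaps(sales)

# The sample size should visibly exceed the number of explanatory variables.
n_observations = len(sales_filled)
k_explanatory = 2  # hypothetical number of explanatory variables
enough_data = n_observations > k_explanatory + 1
```

The final check is only the minimal arithmetic requirement; the text's recommendation of a large statistical sample is considerably stronger.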
The resultant accumulation of proper statistical data with defined potential model variables completes the phase of variable specification, which enables progression to the phase of equation specification. Specification of an equation (or equations) consists in determining the number of equations and selecting a proper analytical form for each model equation. Although econometrics provides a large arsenal of possible analytical forms, linear equations of type (1.2) and multiplicative equations of types (1.4) and (1.6) are the ones used most commonly. The choice of the analytical form of the equation(s) completes the phase of model specification. Such specification results in a hypothetical (theoretical) econometric model.

8 Cf. Wiśniewski (1986), subchapter 1.5. An economic variable which, from the research perspective, best reflects the economic category constituting the subject of an empirical verification is called an equivalent variable. Cf. Wiśniewski (2013), subchapter 1.3.
9 Including the amount of tax on goods and services.
1.3 Econometric Model Parameter Estimation
Estimation of a given model’s structural parameters and stochastic structure parameters requires prior specification of a theoretical model and collection of the necessary statistical data on each of its variables. First, an estimator ought to be selected, i.e., a function estimating the parameters of the model, characterized by the following properties:

(a) unbiasedness: let $\hat{\theta}$ be the estimator of parameter $\theta$ that is based on a set of observations $\{y_i\}$, expressed as $\hat{\theta} = h(y_1, y_2, \ldots, y_n)$. If the equality $E(\hat{\theta}) = \theta$ holds, then $\hat{\theta}$ can be called an unbiased estimator of parameter $\theta$. If $E(\hat{\theta}) < \theta$, then the estimator has a negative bias; when $E(\hat{\theta}) > \theta$, the estimator has a positive bias;

(b) consistency: if estimator $\hat{\theta} = h(y_1, y_2, \ldots, y_n)$ converges in probability to $\theta$ as $n \to \infty$, then $\hat{\theta}$ is a consistent estimator of parameter $\theta$; $\hat{\theta}$ converges to $\theta$ in probability when, for every $\varepsilon > 0$:

$$\lim_{n \to \infty} P\left(|\hat{\theta} - \theta| < \varepsilon\right) = 1. \tag{1.9}$$

It is worth noting that a consistent estimator is, as a rule, asymptotically unbiased, i.e., its bias vanishes for sufficiently large $n$, whereas the converse is not always true, since an unbiased estimator does not need to be consistent;

(c) efficiency: let $\hat{\theta}_j = h_j(y_1, y_2, \ldots, y_n)$, $j = 1, 2$, be two estimators of parameter $\theta$ that are based on an observation set $\{y_i\}$. The efficiency of estimator $\hat{\theta}_2$ with respect to estimator $\hat{\theta}_1$ can then be defined as a quotient:

$$\lambda = \frac{E\left[(\theta - \hat{\theta}_1)^2\right]}{E\left[(\theta - \hat{\theta}_2)^2\right]} = \frac{H_1}{H_2}. \tag{1.10}$$

Since we have not restricted ourselves, in this case, to the class of unbiased estimators, $H_1$ and $H_2$ do not need to be error variances. If we do restrict ourselves to unbiased estimators $\hat{\theta}_1$ and $\hat{\theta}_2$, however, then $H_1$ and $H_2$ are such variances. Estimator $\hat{\theta}_1$ can be called an efficient estimator of parameter $\theta$ if $H_2 \geq H_1$ holds for all other unbiased $\hat{\theta}_2$, i.e., no other unbiased estimator has a variance smaller than $E\left[(\theta - \hat{\theta}_1)^2\right]$. A characteristic of variability alternative to the standard deviation is worth mentioning here: estimator precision ($\kappa$), defined as $\kappa = 1/\sigma$, where $\sigma$ is the standard deviation. An estimator of smaller variance, i.e., of smaller standard deviation, has higher precision.¹⁰ As such, it can be said that an estimator of higher efficiency is a more precise estimator;

(d) sufficiency: an estimator is sufficient if it contains all the information that the set of observations carries about the parameter under estimation. Suppose $y_1, y_2, \ldots, y_n$ is a sequence of observations in a sample drawn from a population characterized by a density function $f(y, \theta)$. If $\hat{\theta} = h(y_1, y_2, \ldots, y_n)$ is an estimator for which the conditional distribution of the sample $(y_1, y_2, \ldots, y_n)$, given $\hat{\theta}$, does not depend on $\theta$, then $\hat{\theta}$ is a sufficient estimator.

One general estimation method, characterized by numerous variants, is the ordinary least squares method (OLS), developed by Carl F. Gauss. The method involves selecting the estimator $\hat{\theta} = h(y_1, y_2, \ldots, y_n)$ for which the sum of the squared differences between observations $y_i$ and the corresponding values of function $f(y_i \mid \hat{\theta})$ is minimal:

$$S = \sum_{i=1}^{n} \left[y_i - f(y_i \mid \hat{\theta})\right]^2 = \min. \tag{1.11}$$
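Criterion (1.11) can be illustrated with a one-parameter case. The sketch below assumes, purely for illustration, a hypothetical model in which the fitted value for observation $i$ is $\theta x_i$; the data and the names `sum_of_squares`, `xs`, and `ys` are invented here. For this particular model, setting the derivative of $S$ with respect to $\theta$ to zero gives a closed-form minimizer, and moving away from it in either direction increases $S$.

```python
def sum_of_squares(theta, xs, ys):
    """Criterion S from (1.11) for the illustrative model f_i = theta * x_i."""
    return sum((y - theta * x) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for this illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# For this model, dS/dtheta = 0 gives theta_hat = sum(x*y) / sum(x*x).
theta_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
s_min = sum_of_squares(theta_hat, xs, ys)

# Any other value of theta yields a larger sum of squared differences.
s_lower = sum_of_squares(theta_hat - 0.1, xs, ys)
s_higher = sum_of_squares(theta_hat + 0.1, xs, ys)
```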
The ordinary least squares method is widely used in practice. Its use requires, however, that the OLS estimator have the essential statistical properties. Consider a linear model:

$$Y = X\alpha + \eta, \tag{1.12}$$

where:

$$
X = \begin{bmatrix}
1 & x_{11} & \ldots & x_{1j} & \ldots & x_{1k} \\
1 & x_{21} & \ldots & x_{2j} & \ldots & x_{2k} \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
1 & x_{t1} & \ldots & x_{tj} & \ldots & x_{tk} \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
1 & x_{n1} & \ldots & x_{nj} & \ldots & x_{nk}
\end{bmatrix}, \quad
Y = \begin{bmatrix} y_1 \\ y_2 \\ \ldots \\ y_t \\ \ldots \\ y_n \end{bmatrix}, \quad
\eta = \begin{bmatrix} \eta_1 \\ \eta_2 \\ \ldots \\ \eta_t \\ \ldots \\ \eta_n \end{bmatrix}, \quad
\alpha = \begin{bmatrix} \alpha_0 \\ \alpha_1 \\ \ldots \\ \alpha_j \\ \ldots \\ \alpha_k \end{bmatrix},
$$

and a model:

$$Y = X\hat{\alpha} + u, \tag{1.13}$$

where:

$$
\hat{\alpha} = \begin{bmatrix} \hat{\alpha}_0 \\ \hat{\alpha}_1 \\ \ldots \\ \hat{\alpha}_j \\ \ldots \\ \hat{\alpha}_k \end{bmatrix}, \quad
u = \begin{bmatrix} u_1 \\ u_2 \\ \ldots \\ u_t \\ \ldots \\ u_n \end{bmatrix}.
$$

In model (1.12), $X$ is the matrix of observations of the explanatory variables, $Y$ is the vector of observations of the dependent variable, $\eta$ is the vector of random components, $\alpha$ is the vector of structural parameters, $n$ is the number of statistical observations, and $k$ is the number of explanatory variables. Hypothetical model (1.12) has been assigned an empirical model (1.13), in which two new vectors occur: $\hat{\alpha}$, the estimator of vector $\alpha$, and $u$, the residual vector. Given estimator $\hat{\alpha}$, theoretical values of the dependent variable can be determined:

$$\hat{Y} = X\hat{\alpha}, \tag{1.14}$$

where a vector of the dependent variable’s theoretical values (calculated based on model (1.14)) emerges:

$$
\hat{Y} = \begin{bmatrix} \hat{y}_1 \\ \hat{y}_2 \\ \ldots \\ \hat{y}_t \\ \ldots \\ \hat{y}_n \end{bmatrix}.
$$

10 The term estimator precision is more appropriate, in terms of estimator properties, than the phrase estimator efficiency. Efficiency is usually associated with effectiveness, whereas this property refers to the estimates’ accuracy.

The conditions for OLS applicability can be defined as follows:
1. The econometric model must be linear in form, i.e., it must mirror Eq. (1.2). If a nonlinear model can be transformed to a linear form, OLS is acceptable. For instance, power model (1.4) and exponential model (1.6) can be transformed into a linear form by taking logarithms of both sides of the equations.
2. The mathematical expectation of the random component should be equal to zero:

$$E(\eta_t) = 0. \tag{1.15}$$

3. The random component variance should be constant and finite:

$$\sigma_1^2 = \ldots = \sigma_t^2 = \ldots = \sigma_n^2 = \sigma^2 < \infty. \tag{1.16}$$

4. The rank of the explanatory variable observation matrix $X$ is equal to the number of the model’s structural parameters ($k + 1$):

$$\operatorname{rank}(X) = k + 1 < n. \tag{1.17}$$

This means that the number $n$ of statistical observations is greater than the number of the model’s structural parameters, that is, the model has a positive number of degrees of freedom. What is more, none of the explanatory variables is a linear combination of the other variables of this type.
5. The explanatory variables should not be correlated with the random component, which can be written as:

$$E(X^T \eta) = 0. \tag{1.18}$$

6. The random component should be free of autocorrelation:

$$
E(\eta\eta^T) = \sigma^2 I = \begin{bmatrix}
\sigma_1^2 & \ldots & 0 & \ldots & 0 \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
0 & \ldots & \sigma_t^2 & \ldots & 0 \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
0 & \ldots & 0 & \ldots & \sigma_n^2
\end{bmatrix}, \tag{1.19}
$$

where $E(\eta\eta^T)$ is the variance and covariance matrix of the random components, in which, by condition (1.16), all diagonal elements are equal to $\sigma^2$. The zero elements outside the main diagonal imply that the random component covariances, for different pairs thereof, are equal to zero:

$$\operatorname{cov}(\eta_t, \eta_{t'}) = 0, \quad t, t' = 1, \ldots, n; \; t \neq t'. \tag{1.20}$$
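The logarithmic transformation mentioned in condition 1 can be written out explicitly. The following is a sketch under the assumption that all values appearing under a logarithm ($y_t$, $x_{tj}$ in the power model, and the parameters $\alpha_0$, $\alpha_j$ where relevant) are positive:

```latex
\begin{align*}
\ln y_t &= \ln\alpha_0 + \sum_{j=1}^{k} \alpha_j \ln x_{tj} + \eta_t
  && \text{(power model (1.4))} \\
\ln y_t &= \ln\alpha_0 + \sum_{j=1}^{k} (\ln\alpha_j)\, x_{tj} + \eta_t
  && \text{(exponential model (1.6))}
\end{align*}
```

Both equations are linear in the transformed parameters ($\ln\alpha_0$ and $\alpha_j$ in the first case, $\ln\alpha_0$ and $\ln\alpha_j$ in the second), so each takes the form (1.2) with $\ln y_t$ as the dependent variable.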
The second group of econometric model parameters consists of stochastic structure parameters, which describe the distribution of the random component $\eta$. This distribution is usually assumed to be normal [$N(0, \sigma^2)$]. The assumption of normality of the distribution of the random component $\eta$, whose expected value is zero, can be interpreted as follows:
(a) the positive and negative random deviations compensate one another;
(b) the number of positive random deviations is close to the number of negative deviations;
(c) most random deviations can be expected not to differ much from zero, whereas more than 99.7% of all random deviations should fall within ± three standard deviations.

The random component standard deviation ($\sigma$) provides information on how much, in plus or in minus, the observations of the dependent variable ($y_t$) deviate on average from function $E(Y) = X\hat{\alpha}$. The lower the value of $\sigma$, therefore, the smaller the dependent variable’s random component. Using the criterion written as (1.11), the OLS estimator of vector $\alpha$ can be given as follows:

$$\hat{\alpha} = (X^T X)^{-1} X^T Y. \tag{1.21}$$
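Formula (1.21) can be sketched in plain Python as follows. This is a minimal illustration rather than the author's procedure: the helper names (`transpose`, `matmul`, `solve`, `ols`) and the data are invented here, and the normal equations $X^T X \hat{\alpha} = X^T Y$ are solved by Gauss-Jordan elimination instead of forming the inverse explicitly, which yields the same estimate whenever rank condition (1.17) holds.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, b):
    """Solve the square system a @ z = b by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular: rank condition (1.17) is violated")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(x_rows, y):
    """Return alpha_hat solving (X'X) alpha_hat = X'y; a column of ones is prepended."""
    X = [[1.0] + list(row) for row in x_rows]
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(p * q for p, q in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)

# Hypothetical data generated exactly by y = 1 + 2*x1 + 3*x2 (no random component).
x_rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in x_rows]
alpha_hat = ols(x_rows, y)
```

Because the assumed data contain no random component, `alpha_hat` recovers the generating parameters up to floating-point error; with real data the estimates would differ from the unknown $\alpha$ by sampling error.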
In Eq. (1.21), $X^T$ is the transpose of the explanatory variable observation matrix $X$. The random component variance ($\sigma^2$) needs to be estimated as well. It can be demonstrated that the residual variance ($S_u^2$) is an unbiased estimator of the model’s random component variance; it can be calculated using the following formula:

$$S_u^2 = \frac{1}{n-k-1} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2 = \frac{1}{n-k-1} \sum_{t=1}^{n} u_t^2, \tag{1.22}$$

where $\hat{y}_t$ denotes the theoretical dependent variable values, calculated from the empirical model, while $u_t$ denotes the model residuals. Alternatively, Eq. (1.22) can be written in a matrix form:

$$S_u^2 = \frac{1}{n-k-1} u^T u, \tag{1.23}$$
where $u$ is the residual vector defined in connection with Eq. (1.13). An alternative method of model parameter estimation, a generalization of OLS, is Aitken’s method, also called the generalized least squares method.¹¹ The Aitken estimator ($\tilde{\alpha}$) takes the following form:

$$\tilde{\alpha} = (X^T \Omega^{-1} X)^{-1} X^T \Omega^{-1} Y, \tag{1.24}$$

in which a weight matrix $\Omega$ emerges:

$$
\Omega = \begin{bmatrix}
\omega_1 & \ldots & 0 & \ldots & 0 \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
0 & \ldots & \omega_t & \ldots & 0 \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
0 & \ldots & 0 & \ldots & \omega_n
\end{bmatrix}, \tag{1.25}
$$

where $\omega_1, \ldots, \omega_t, \ldots, \omega_n$ are the weights reflecting the variability of the random component’s variance across observations. Note that when $\omega_1 = \ldots = \omega_t = \ldots = \omega_n = 1$, matrix $\Omega = I$, that is, it becomes an identity matrix of order $n$. As such, the Aitken estimator becomes equivalent to the OLS estimator. In Aitken’s method, the random component variance estimator is given by the formula:

$$S_u^2 = \frac{1}{n-k-1} u^T \Omega^{-1} u. \tag{1.26}$$

11 Aitken’s method is recommended when the random component variances for different statistical observations are not equal, that is, equality (1.16) does not hold.

1.4 Econometric Model Verification
Statistical verification of a model involves the use of several measures primarily characterizing the random component’s role in the model. The first of those measures is the residual variance discussed in the previous section. It has no economic interpretation. The square root of the residual variance,

$$S_u = \sqrt{S_u^2}, \tag{1.27}$$

is called the residual standard error. $S_u$ is expressed in the same measurement units as the dependent variable $y_t$. It tells by how much, on average over the $n$ statistical observations, the theoretical values of the dependent variable $\hat{y}_t$, calculated based on the empirical model, differ from the actual (observed) values of the variable ($y_t$). The second general measure of a model’s goodness of fit provides information on the relative role of the random component. Convergence coefficient $\varphi^2$ is calculated using the following formula:

$$\varphi^2 = \frac{\sum_{t=1}^{n} u_t^2}{\sum_{t=1}^{n} (y_t - \bar{y})^2}, \tag{1.28}$$
where $\bar{y} = \frac{1}{n} \sum_{t=1}^{n} y_t$ denotes the arithmetic mean of the dependent variable observations. The coefficient measures the relative share of the model’s random deviations $\left(\sum_{t=1}^{n} u_t^2\right)$ in the total dependent variable volatility $\left(\sum_{t=1}^{n} (y_t - \bar{y})^2\right)$. The smaller the share of random deviations in the total dependent variable volatility, the better the model. The convergence coefficient is a normalized number satisfying the condition $0 \leq \varphi^2 \leq 1$. By this criterion, the closer $\varphi^2$ is to 0, the better the empirical model. Expression $100\varphi^2$ [%] indicates what percentage of the total volatility of variable $y_t$ is random. An alternative measure of a model’s goodness of fit is the square of the multiple correlation coefficient, also called the coefficient of determination, represented by $R^2$. This measure tells us how much of the total dependent variable volatility is caused by the explanatory variables included in the empirical model. The coefficient of determination is calculated as follows:

$$R^2 = 1 - \varphi^2. \tag{1.29}$$
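The goodness-of-fit measures defined in Eqs. (1.22) and (1.27) to (1.29) can be computed together. The sketch below is illustrative only: the function name `fit_measures` and the numbers are hypothetical, and the theoretical values `y_hat` are simply assumed rather than obtained from an actual estimation.

```python
import math

def fit_measures(y, y_hat, k):
    """Return (Su2, Su, phi2, R2) following Eqs. (1.22) and (1.27)-(1.29)."""
    n = len(y)
    residuals = [obs - fit for obs, fit in zip(y, y_hat)]
    rss = sum(u * u for u in residuals)            # sum of squared residuals
    y_bar = sum(y) / n                             # arithmetic mean of y
    tss = sum((obs - y_bar) ** 2 for obs in y)     # total volatility of y
    su2 = rss / (n - k - 1)                        # residual variance (1.22)
    su = math.sqrt(su2)                            # residual standard error (1.27)
    phi2 = rss / tss                               # convergence coefficient (1.28)
    r2 = 1.0 - phi2                                # coefficient of determination (1.29)
    return su2, su, phi2, r2

# Hypothetical observed values and assumed theoretical values for a model with k = 1.
y = [2.0, 4.0, 6.0, 8.0, 10.0]
y_hat = [2.5, 3.5, 6.0, 8.5, 9.5]
su2, su, phi2, r2 = fit_measures(y, y_hat, k=1)
```

For these assumed numbers the random deviations account for 2.5% of the total volatility of the dependent variable, so the remaining 97.5% would be attributed to the explanatory variable.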
Expression 100 R2 communicates what percentage of the total dependent variable volatility results from the impact of the set of the explanatory variables occurring in the empirical model. As such, the better the model, the closer the R2 coefficient is to unity. Another issue to be addressed here entails examination of the random component’s autocorrelation.12 Absence of such correlation means that we are dealing with a so-called pure random component. Occurrence of random component autocorrelation means that the random component forms an autoregressive process: ηt = f ηt - 1 , ηt - 2 , . . . , ηt - nþ1 , εt ,
ð1:30Þ
where εt is the pure random component. Random component autocorrelation coefficients ρ1, ρ2, . . ., ρn–1 with values other than zero indicate occurrence of such autocorrelation. The tool for testing first-order random component autocorrelation13 is the Durbin-Watson test. It verifies the null hypothesis, in which ρ1 is assumed to be equal to zero, which can be written as H0: ρ1 = 0. An alternative hypothesis states
12 Random component autocorrelation is a model specification error, as it can result from: (1) the omission of an important, statistically significant explanatory variable from the empirical model, which results in positive autocorrelation; (2) a defective analytical form of the empirical model, resulting in positive random component autocorrelation; (3) an excess of statistically insignificant variables in the empirical model, resulting in negative random component autocorrelation.
13 It can be demonstrated that if first-order autocorrelation does not occur in the model, there is no autocorrelation of higher order either. The occurrence of first-order autocorrelation, however, indicates a model specification error, which necessitates model re-specification. Re-specification should be continued until the empirical model is free of first-order random component autocorrelation.
that $\rho_1$ is positive, i.e., $H_1: \rho_1 > 0$. The null hypothesis can be verified via the DW statistic, calculated using the following formula:
$$DW = \frac{\sum_{t=2}^{n} (u_t - u_{t-1})^2}{\sum_{t=1}^{n} u_t^2}, \tag{1.31}$$
where $u_t$ denotes the residuals in cycle $t$ ($t = 1, \ldots, n$), while $u_{t-1}$ denotes the residuals lagged by one cycle. For large samples, the DW statistic falls within the range $0 \le DW \le 2$ when $\rho_1$ is positive. As such, if $DW > 2$, the alternative hypothesis should be modified and negative random component autocorrelation should be assumed, i.e., $H_1: \rho_1 < 0$, in which case an adjusted Durbin-Watson statistic should be calculated, using the following formula:
$$DW^* = 4 - DW. \tag{1.32}$$
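The two formulas can be sketched in a few lines. The function names and the residual series below are hypothetical; the residuals alternate in sign on purpose, so that the adjusted statistic is needed.

```python
def durbin_watson(residuals):
    """DW statistic of Eq. (1.31) for a sequence of residuals u_1, ..., u_n."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))     # squared differences, t = 2..n
    den = sum(u ** 2 for u in residuals)              # squared residuals, t = 1..n
    return num / den


def adjusted_dw(dw):
    """Return DW* = 4 - DW when DW > 2 (Eq. 1.32), otherwise DW unchanged."""
    return 4.0 - dw if dw > 2.0 else dw


# Hypothetical residuals with alternating signs (negative autocorrelation)
u = [1.0, -1.0, 1.0, -1.0]
dw = durbin_watson(u)
dw_star = adjusted_dw(dw)
print(dw, dw_star)
```

The value returned by `adjusted_dw` is the one that would be compared with the tabulated critical values.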
The calculated DW (or DW*) value is compared with the critical test values, the lower $d_l$ and the upper $d_u$, taken from the Durbin-Watson tables¹⁴ for the corresponding significance level $\gamma$. When, drawing on the statistical verification tools outlined above, the empirical model is considered acceptable, the statistical significance of its explanatory variables should be tested next. The empirical econometric model:
$$y_t = a_0 + a_1 x_{t1} + \ldots + a_j x_{tj} + \ldots + a_k x_{tk} + u_t, \tag{1.33}$$
in which $a_j$ ($j = 0, 1, \ldots, k$) denotes the structural parameter estimates and $u_t$ denotes the residuals, can take an alternative form:
$$\hat{y}_t = a_0 + a_1 x_{t1} + \ldots + a_j x_{tj} + \ldots + a_k x_{tk}, \tag{1.34}$$
where $\hat{y}_t$ is the theoretical value of the dependent variable in cycle $t$ ($t = 1, \ldots, n$). Equations (1.33) and (1.34) differ by the residuals ($y_t - \hat{y}_t = u_t$). Each structural parameter estimate ($a_j$) is characterized by a corresponding mean estimation error $S_{a_j}$ ($j = 0, 1, \ldots, k$), which is the square root of the variance of the j-th structural parameter estimate
14 If DW (DW*) > $d_u$, there are no grounds for rejecting the null hypothesis ($H_0$), which means that the absence of random component autocorrelation can be inferred, with a risk of a Type I error (at significance level $\gamma$). If DW (DW*) < $d_l$, hypothesis $H_0$ is rejected in favor of the alternative one, by which first-order random component autocorrelation can be inferred. When $d_l \le$ DW (DW*) $\le d_u$, the test does not determine whether autocorrelation occurs or not. The DW statistic has then fallen into the region of test insensitivity, which makes it necessary to apply another test of random component autocorrelation, e.g., Student's t-test of the autocorrelation coefficient.
$S_{a_j}^2$, providing information about the accuracy of that estimation. It is thus necessary to determine the variances of the model's structural parameter estimates. The variance and covariance matrix of the structural parameter estimates, $D^2(a)$, should be determined using the following formula:
$$D^2(a) = S_u^2 (X^T X)^{-1}, \tag{1.35}$$
where $S_u^2$ is the residual variance given by Eq. (1.22), and $(X^T X)^{-1}$ is the inverse of the so-called Hessian matrix found in Eq. (1.21). The diagonal elements of matrix $D^2(a)$ are the variances of the corresponding structural parameter estimates:
$$S_{a_j}^2 = \operatorname{diag} D^2(a). \tag{1.36}$$
Given the mean errors of the structural parameter estimates, the empirical model can be written as follows:
$$y_t = \underset{(S_{a_0})}{a_0} + \underset{(S_{a_1})}{a_1} x_{t1} + \ldots + \underset{(S_{a_j})}{a_j} x_{tj} + \ldots + \underset{(S_{a_k})}{a_k} x_{tk} + u_t, \tag{1.37}$$
where the mean estimation errors are given under the structural parameter estimates, in parentheses. Model (1.37) makes it possible to test the statistical significance of the explanatory variables. The null hypothesis $H_0: \alpha_j = 0$ ($j = 1, \ldots, k$) is posed, i.e., the assumption that the j-th structural parameter is equal to zero. In economic terms, this hypothesis assumes the insignificance of the model's j-th explanatory variable. The alternative hypothesis $H_1: \alpha_j \ne 0$ assumes that the j-th structural parameter differs from zero, which implies the statistical significance of the j-th explanatory variable. The null hypothesis is tested via the empirical Student's t-statistic given by the following formula:¹⁵
$$t_j = \frac{|a_j|}{S_{a_j}}, \quad (j = 1, \ldots, k), \tag{1.38}$$
in which the numerator contains the absolute value of the j-th structural parameter estimate, and the denominator contains its mean estimation error. Drawing on the critical value tables of the Student's t-distribution, the critical value $t_{\gamma; n-k-1}$ should be obtained. A reasonable significance level¹⁶ $\gamma$ is then
15 The alternative hypothesis implies that the statistical test is a test with the so-called two-sided critical region.
16 Most commonly, a significance level of $\gamma = 0.01$ or $\gamma = 0.05$ is selected, which implies acceptance of a 1% or 5% risk of Type I error.
selected. The table is read at $n - k - 1$ degrees of freedom. By comparing the empirical statistic $t_j$ ($j = 1, \ldots, k$) with the critical value $t_{\gamma; n-k-1}$, the significance of the j-th variable can be inferred. If the inequality $t_j \le t_{\gamma; n-k-1}$ holds, there are no grounds for rejecting the null hypothesis, which essentially necessitates removal of this variable from the empirical model and re-estimation of its parameters, followed by verification of the respecified model. When $t_j > t_{\gamma; n-k-1}$, the null hypothesis is rejected in favor of the alternative hypothesis, and a statistically significant impact of the j-th explanatory variable on the dependent variable is inferred. Statistically insignificant variables are removed from the model. It is advisable to eliminate only one statistically insignificant variable in a given iteration, i.e., the one for which the $t_j$ statistic is the lowest. This procedure of repeated estimation and model verification is continued until all variables in the empirical model are statistically significant at a reasonable level of significance. This results in an acceptable empirical econometric model.¹⁷ Oftentimes, the empirical model is written as:
$$y_t = \underset{(t_0)}{a_0} + \underset{(t_1)}{a_1} x_{t1} + \ldots + \underset{(t_j)}{a_j} x_{tj} + \ldots + \underset{(t_k)}{a_k} x_{tk} + u_t, \tag{1.39}$$
where the empirical Student's t-statistics are given under the structural parameter estimates, in parentheses. With such a model in hand, its economic assessment is performed, which consists in determining the consistency of the modeling results with economic theory and the logic of economic practice.
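The verification steps of Eqs. (1.35), (1.36) and (1.38) can be sketched with NumPy as follows. This is only an illustration: the function name and the data are hypothetical, and the residual variance is assumed to be the sum of squared residuals divided by $n - k - 1$, since Eq. (1.22) is not repeated here.

```python
import numpy as np


def ols_with_t_stats(X, y):
    """Return OLS estimates, their mean estimation errors and t-statistics.

    X is an (n, k+1) matrix whose first column consists of ones; y has length n.
    """
    n, p = X.shape                       # p = k + 1 estimated parameters
    xtx_inv = np.linalg.inv(X.T @ X)
    a = xtx_inv @ X.T @ y                # OLS estimates
    u = y - X @ a                        # residuals
    s2_u = (u @ u) / (n - p)             # residual variance (assumed n-k-1 d.o.f.)
    cov = s2_u * xtx_inv                 # D^2(a), Eq. (1.35)
    se = np.sqrt(np.diag(cov))           # mean estimation errors S_{a_j}
    t = np.abs(a) / se                   # empirical t-statistics, Eq. (1.38)
    return a, se, t


# Hypothetical sample with one explanatory variable
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
X = np.column_stack([np.ones_like(x), x])
a, se, t = ols_with_t_stats(X, y)
print(a, se, t)
```

Each element of `t` would then be compared with the critical value read from the Student's t tables at $n - k - 1$ degrees of freedom, and the variable with the lowest statistic below that value would be dropped first.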
1.5 Multiplicative Econometric Models
Multiplicative models are, right after linear models, among the models most frequently used in economic research. Both groups of multiplicative models, power and exponential, can be transformed to a linear form. Consider the power model (1.4):
$$y_t = \alpha_0 x_{t1}^{\alpha_1} \ldots x_{tj}^{\alpha_j} \ldots x_{tk}^{\alpha_k} e^{\eta_t}.$$
Taking logarithms of both sides of the above equation, the following model is obtained:
$$\ln y_t = \ln \alpha_0 + \alpha_1 \ln x_{t1} + \ldots + \alpha_j \ln x_{tj} + \ldots + \alpha_k \ln x_{tk} + \eta_t. \tag{1.40}$$
By substituting $y_t^* = \ln y_t$, $\alpha_0^* = \ln \alpha_0$, $x_{tj}^* = \ln x_{tj}$ for $j = 1, \ldots, k$, model (1.40) can be written as:
17 Provided that all previous measures of the model's goodness of fit are at a satisfactory level.
$$y_t^* = \alpha_0^* + \alpha_1 x_{t1}^* + \ldots + \alpha_j x_{tj}^* + \ldots + \alpha_k x_{tk}^* + \eta_t. \tag{1.41}$$
Eq. (1.41) is linear in its parameters. The structural parameters $\alpha_0^*, \alpha_1, \ldots, \alpha_j, \ldots, \alpha_k$ can thus be estimated using the ordinary least squares method. An analogous transformation can be performed on the exponential model (1.6):
$$y_t = \alpha_0 \alpha_1^{x_{t1}} \ldots \alpha_j^{x_{tj}} \ldots \alpha_k^{x_{tk}} e^{\eta_t}.$$
Taking logarithms of both sides of the above equation, the following transformed form of the equation is obtained:
$$\ln y_t = \ln \alpha_0 + x_{t1} \ln \alpha_1 + \ldots + x_{tj} \ln \alpha_j + \ldots + x_{tk} \ln \alpha_k + \eta_t. \tag{1.42}$$
Substituting successively $y_t^* = \ln y_t$ and $\alpha_j^* = \ln \alpha_j$ ($j = 0, 1, \ldots, k$), model (1.42) can be written in a linear form:
$$y_t^* = \alpha_0^* + \alpha_1^* x_{t1} + \ldots + \alpha_j^* x_{tj} + \ldots + \alpha_k^* x_{tk} + \eta_t. \tag{1.43}$$
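The parameters of Eq. (1.41) can be estimated using OLS. Applying Eq. (1.21), the following matrices of observations on the model's variables are obtained:
$$X^* = \begin{bmatrix} 1 & \ln x_{11} & \ldots & \ln x_{1j} & \ldots & \ln x_{1k} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 1 & \ln x_{t1} & \ldots & \ln x_{tj} & \ldots & \ln x_{tk} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 1 & \ln x_{n1} & \ldots & \ln x_{nj} & \ldots & \ln x_{nk} \end{bmatrix}, \quad Y^* = \begin{bmatrix} \ln y_1 \\ \ldots \\ \ln y_t \\ \ldots \\ \ln y_n \end{bmatrix}.$$
Formula (1.21) takes the following form:
$$\hat{\alpha}^* = (X^{*T} X^*)^{-1} X^{*T} Y^*, \tag{1.44}$$
where:
$$\hat{\alpha}^* = \begin{bmatrix} \hat{\alpha}_0^* \\ \hat{\alpha}_1 \\ \ldots \\ \hat{\alpha}_j \\ \ldots \\ \hat{\alpha}_k \end{bmatrix}.$$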
The parameters of a type (1.43) equation can also be estimated using OLS, in which case the estimator is as follows:
$$\hat{\alpha}^* = (X^T X)^{-1} X^T Y^*, \tag{1.45}$$
where $Y^*$ is identical to the case of (1.41), and vector $\hat{\alpha}^*$ takes the following form:
$$\hat{\alpha}^* = \begin{bmatrix} \hat{\alpha}_0^* \\ \hat{\alpha}_1^* \\ \ldots \\ \hat{\alpha}_j^* \\ \ldots \\ \hat{\alpha}_k^* \end{bmatrix}.$$
Suppose that the parameters of the following power-exponential model have been estimated:
$$y_t = \alpha_0 x_{t1}^{\alpha_1} x_{t2}^{\alpha_2} \alpha_3^{x_{t3}} \alpha_4^{x_{t4}} e^{\eta_t}.$$
A vector of structural parameter estimates is then obtained:
$$a^* = \begin{bmatrix} 0.944 \\ 0.657 \\ 0.446 \\ 0.057 \\ -0.034 \end{bmatrix}.$$
The structural parameter estimates are therefore known: $a_0^* = 0.944$, $a_1 = 0.657$, $a_2 = 0.446$, $a_3^* = 0.057$, $a_4^* = -0.034$. Note that the parameter estimates $a_0^*$, $a_3^*$, $a_4^*$ are given as logarithms. Accordingly, the following calculations should be performed: $a_0 = \exp(a_0^*) = \exp(0.944) = 2.570$, $a_3 = \exp(a_3^*) = \exp(0.057) = 1.059$ and $a_4 = \exp(a_4^*) = \exp(-0.034) = 0.967$. The power-exponential empirical model then takes the following form:
$$\hat{y}_t = 2.570 \, x_{t1}^{0.657} \, x_{t2}^{0.446} \cdot 1.059^{x_{t3}} \cdot 0.967^{x_{t4}}. \tag{1.46}$$
All structural parameter estimates in the power model (except for the a0 estimate) are thus obtained directly, whereas in the exponential model, additional calculations are required to obtain estimates aj ( j = 0,1,. . ., k).
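The linearization and the back-transformation of the logarithmic estimates can be sketched as follows. The data are generated artificially and without noise from hypothetical parameter values (chosen to mirror the figures in the example above), so that OLS on the transformed model recovers them; none of this reproduces the author's data.

```python
import numpy as np

# Hypothetical parameters, used only to generate noise-free demonstration data.
a0, a1, a2, a3, a4 = 2.570, 0.657, 0.446, 1.059, 0.967

rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=(30, 4))          # positive regressors
y = a0 * x[:, 0] ** a1 * x[:, 1] ** a2 * a3 ** x[:, 2] * a4 ** x[:, 3]

# Linearized design matrix: logarithms for the power part (x1, x2),
# untransformed values for the exponential part (x3, x4).
X_star = np.column_stack([
    np.ones(len(y)), np.log(x[:, 0]), np.log(x[:, 1]), x[:, 2], x[:, 3]
])
Y_star = np.log(y)

# OLS on the transformed model (normal equations).
a_star = np.linalg.solve(X_star.T @ X_star, X_star.T @ Y_star)

# Only the estimates obtained as logarithms need to be back-transformed.
estimates = np.array([
    np.exp(a_star[0]), a_star[1], a_star[2], np.exp(a_star[3]), np.exp(a_star[4])
])
print(np.round(estimates, 3))
```

The power-part exponents come out of the regression directly, while the constant and the bases of the exponential part require `np.exp`, which is the distinction drawn in the paragraph above.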
1.6 Bound Endogenous Variables
An econometric model's dependent variable should be characterized by a relatively large region of variability. It should not be bounded either, meaning that it has neither a lower nor an upper limit. In practice, however, dependent variables with observations $y_t^{(o)}$ that are bounded, even on both sides, sometimes emerge. Their specificity lies in having both a lower and an upper bound:
$$y_{\min} \le y_t^{(o)} \le y_{\max}, \tag{1.47}$$
where $y_{\min}$ is the lowest possible observation value for the variable under consideration, while $y_{\max}$ is the highest possible observation value for this dependent variable. Suppose the bound variable $y_t^{(o)}$ is described by a linear model:
$$y_t^{(o)} = \alpha_0 + \alpha_1 X_t + \eta_t.$$
Figure 1.1 shows the linear econometric model for a bound dependent variable. The consequences of a possible extrapolation beyond the range of statistical observation can be noted.

Fig. 1.1 Linear model of bound dependent variable (vertical axis: $y_t^{(o)}$, with the bounds $y_{\min}$ and $y_{\max}$ marked; horizontal axis: $x_t$)

Fig. 1.2 Basic transformation of dependent variable, $y_t^{(p)} = \frac{y_t^{(o)} - y_{\min}}{y_{\max} - y_t^{(o)}}$ (vertical axis: $y_t^{(p)}$; horizontal axis: $y_t^{(o)}$)

An attempt to perform such extrapolation can result in the extrapolant
values falling outside the bounded variable's area of variability, which defies logic. For instance, the bound variable can be a structure index satisfying the inequality $0 \le y_t^{(o)} \le 100$. An attempt to extrapolate a variable in the form of a structure index can lead to extrapolants of less than 0% or more than 100%. The solution may then be to use one of several possible transformations of the bound dependent variable, the first of which is the basic transformation of the bound variable, given by the formula:
$$y_t^{(p)} = \frac{y_t^{(o)} - y_{\min}}{y_{\max} - y_t^{(o)}}, \tag{1.48}$$
where the designations are identical to those in Eq. (1.47), and $y_t^{(p)}$ denotes the basic transformation of the two-sided bound variable $y_t^{(o)}$. The basic transformation converts the bound dependent variable into a variable taking values within the interval $0 \le y_t^{(p)} < \infty$. A variable in the form of $y_t^{(p)}$ is unbounded (free) within the nonnegative values. It still has a lower bound of 0, however; hence, it exhibits the characteristics of many economic variables taking nonnegative values. Fig. 1.2 shows the basic transformation of variable $y_t^{(o)}$, with the minimum value of the bound variable equal to 0, i.e., $y_{\min} = 0$. This does not limit the generality of the idea presented in the figure. Another important transformation of a variable with two-sided bounds is the logit transformation, the concept of which is presented in Fig. 1.3, given by the formula:
$$y_t^{(l)} = \ln y_t^{(p)} = \ln \frac{y_t^{(o)} - y_{\min}}{y_{\max} - y_t^{(o)}}. \tag{1.49}$$

Fig. 1.3 Logit transformation of two-sided bound dependent variable (vertical axis: $y_t^{(l)}$; horizontal axis: $y_t^{(o)}$, with $y_{\min}$, $\frac{y_{\min} + y_{\max}}{2}$ and $y_{\max}$ marked)
The logit transformation of the bound variable is thus the logarithm of the basic transformation. It converts the two-sided bound variable into a free variable. Note that the logit form of the variable satisfies the inequality $-\infty < y_t^{(l)} < +\infty$. As such, application of the linear models:
$$y_t^{(p)} = \sum_{j=0}^{K} \alpha_j X_{tj} + \eta_t \tag{1.50}$$
or
$$y_t^{(l)} = \sum_{j=0}^{K} \alpha_j X_{tj} + \eta_t \tag{1.51}$$
eliminates the risk associated with extrapolating the dependent variable outside the range of statistical observations.¹⁸
18 An extensive discussion of econometric model building with transformations of bound dependent variables can be found in Wiśniewski (1986).
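The two transformations, and the way a forecast made on the logit scale maps back into the admissible range, can be sketched as follows. The function names are hypothetical, and the inverse formula is obtained here by solving Eq. (1.49) for $y_t^{(o)}$; it is not stated in the text.

```python
import math


def basic_transform(y_o, y_min, y_max):
    """Basic transformation of Eq. (1.48); requires y_min <= y_o < y_max."""
    return (y_o - y_min) / (y_max - y_o)


def logit_transform(y_o, y_min, y_max):
    """Logit transformation of Eq. (1.49); requires y_min < y_o < y_max."""
    return math.log(basic_transform(y_o, y_min, y_max))


def inverse_logit(y_l, y_min, y_max):
    """Solve Eq. (1.49) for y_o: y_o = (y_min + y_max * e^y_l) / (1 + e^y_l)."""
    e = math.exp(y_l)
    return (y_min + y_max * e) / (1.0 + e)


# Structure index bounded by 0 and 100 percent
print(basic_transform(50.0, 0.0, 100.0))   # midpoint of the range
print(logit_transform(50.0, 0.0, 100.0))
print(inverse_logit(20.0, 0.0, 100.0))     # large logit forecast stays below 100
```

However far a linear model of the logit is extrapolated, the back-transformed value remains strictly between the two bounds, which is the property the transformation is meant to secure.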
Parameter estimation for a model with a bound dependent variable can be carried out via the ordinary least squares method, using the procedure given by formulas (1.21), (1.22), and (1.23). As Goldberger suggests,¹⁹ the Aitken estimator, given by formula (1.24), is a more precise estimator in such a case. This raises the question of how to determine the components of the matrix $\Omega$ given by formula (1.25). A two-step procedure is required in this case. First, OLS should be used to estimate the parameters of a model with a dummy endogenous variable. After calculating the theoretical values from a type (1.34) empirical equation, a weight can be determined for each observation, calculated as follows:
$$w_t = \hat{y}_t (1 - \hat{y}_t), \quad (t = 1, \ldots, n). \tag{1.52}$$
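Accordingly, the empirical matrix $\hat{\Omega}$ can be constructed, taking the following form:
$$\hat{\Omega} = \begin{bmatrix} w_1 & \ldots & 0 & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & \ldots & w_t & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & \ldots & 0 & \ldots & w_n \end{bmatrix}. \tag{1.53}$$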
In practice, negative values of the weights $w_t$ can occur; therefore, a better variant is to use the absolute values of the weights calculated via formula (1.52). This way, matrix $\hat{\Omega}$ will take the following form:
$$\hat{\Omega} = \begin{bmatrix} |w_1| & \ldots & 0 & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & \ldots & |w_t| & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & \ldots & 0 & \ldots & |w_n| \end{bmatrix}. \tag{1.54}$$
The Aitken estimator for the dummy dependent variable will thus take the form of:
$$\hat{\alpha} = (X^T \hat{\Omega}^{-1} X)^{-1} X^T \hat{\Omega}^{-1} Y. \tag{1.55}$$
Matrix $\hat{\Omega}^{-1}$ will be structured as:
$$\hat{\Omega}^{-1} = \begin{bmatrix} \frac{1}{w_1} & \ldots & 0 & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & \ldots & \frac{1}{w_t} & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & \ldots & 0 & \ldots & \frac{1}{w_n} \end{bmatrix},$$
or
$$\hat{\Omega}^{-1} = \begin{bmatrix} \frac{1}{\hat{y}_1 (1 - \hat{y}_1)} & \ldots & 0 & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & \ldots & \frac{1}{\hat{y}_t (1 - \hat{y}_t)} & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & \ldots & 0 & \ldots & \frac{1}{\hat{y}_n (1 - \hat{y}_n)} \end{bmatrix}. \tag{1.56}$$
Estimator (1.55) yields more efficient (precise) parameter estimates for a model with a dummy dependent variable, compared to the OLS estimator.
19 Cf. Goldberger (1972: 321).
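The two-step procedure can be sketched with NumPy as below. The function name and the data are hypothetical; the sketch uses the absolute values of the weights, as in (1.54), and assumes that no fitted value equals exactly 0 or 1, since a zero weight could not be inverted.

```python
import numpy as np


def aitken_dummy_dependent(X, y):
    """Two-step Aitken estimation for a dummy (0/1) dependent variable."""
    # Step 1: OLS estimates and theoretical values
    a_ols = np.linalg.solve(X.T @ X, X.T @ y)
    y_hat = X @ a_ols
    # Step 2: weights of Eq. (1.52), taken in absolute value as in Eq. (1.54)
    w = np.abs(y_hat * (1.0 - y_hat))
    omega_inv = np.diag(1.0 / w)                     # inverse of the diagonal matrix
    a_gls = np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ y)  # Eq. (1.55)
    return a_ols, w, a_gls


# Hypothetical data: a dummy dependent variable and one explanatory variable
x = np.arange(1.0, 9.0)
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])
a_ols, w, a_gls = aitken_dummy_dependent(X, y)
print(a_ols, a_gls)
```

For large samples it would be more economical to weight the rows of `X` and `y` directly instead of building the full diagonal matrix; the explicit matrix is kept here only to mirror the formulas.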
References
Goldberger AS (1972) Teoria ekonometrii. PWN, Warszawa
Wiśniewski JW (1986) Ekonometryczne badanie zjawisk jakościowych. Studium metodologiczne. UMK, Toruń
Wiśniewski JW (2013) Forecasting staffing decisions. Econometrics 1(39):22–29
Chapter 2
Multi-Equation Econometric Models
2.1 Multi-Equation Model Classification
A multi-equation model is a system consisting of multiple (at least two) equations describing a selected economic system, or a part thereof, called a subsystem. It features $G$ endogenous variables $Y_1, \ldots, Y_g, \ldots, Y_G$ with statistical observations $y_{1t}, \ldots, y_{gt}, \ldots, y_{Gt}$. Each endogenous variable acts as the dependent variable in one of the equations; it can also act as an explanatory variable. Exogenous variables $X_1, \ldots, X_j, \ldots, X_k$, with observations $x_{t1}, \ldots, x_{tj}, \ldots, x_{tk}$, occur in such models as well. The exogenous variables act as explanatory variables only. Unlagged endogenous variables shall be called the model's interdependent variables. A separate group, that of the predetermined variables $Z_1, \ldots, Z_j, \ldots, Z_K$ (with observations $z_{t1}, \ldots, z_{tj}, \ldots, z_{tK}$), consists of the exogenous variables and the lagged endogenous variables, which act in the model equations as explanatory variables. A system of $G$ structural form equations in a multi-equation model¹ can be written as follows:
1 A structural form multi-equation model reflects the full structure of the interdependence among the interdependent variables and the direct impact of the predetermined variables on each of the interdependent variables.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 J. W. Wiśniewski, Forecasting from Multi-equation Econometric Micromodels, Contributions to Economics, https://doi.org/10.1007/978-3-031-27492-3_2
$$\begin{aligned}
y_{1t} &= \alpha_{10} + \sum_{g=2}^{G} \beta_{1g} y_{gt} + \sum_{j=1}^{K} \alpha_{1j} z_{tj} + \eta_{1t}, \\
&\ldots \\
y_{gt} &= \alpha_{g0} + \sum_{\substack{g'=1 \\ g' \ne g}}^{G} \beta_{gg'} y_{g't} + \sum_{j=1}^{K} \alpha_{gj} z_{tj} + \eta_{gt}, \\
&\ldots \\
y_{Gt} &= \alpha_{G0} + \sum_{g=1}^{G-1} \beta_{Gg} y_{gt} + \sum_{j=1}^{K} \alpha_{Gj} z_{tj} + \eta_{Gt}.
\end{aligned} \tag{2.1}$$
Random components ($\eta_{1t}, \ldots, \eta_{gt}, \ldots, \eta_{Gt}$), as well as structural parameters $\beta_{gg'}$ ($g, g' = 1, \ldots, G$; $g \ne g'$), associated with the interdependent variables, appear in the above $G$ equations. Parameters $\alpha_{gj}$ ($g = 1, \ldots, G$; $j = 0, 1, \ldots, K$), associated with the predetermined variables, occur as well. In practice, it is natural for only some of the interdependent and predetermined variables to act as explanatory variables in individual equations. This means that a significant proportion of the parameters $\beta_{gg'}$ and $\alpha_{gj}$ ($g, g' = 1, \ldots, G$; $j = 0, 1, \ldots, K$) take zero values. Moreover, parameters $\beta_{gg} = 1$ indicate the g-th equation's dependent variable. The multi-equation model can also be written in matrix terms:
$$BY + AZ = \eta, \tag{2.2}$$
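where:
$$B = \begin{bmatrix} 1 & \ldots & -\beta_{1g} & \ldots & -\beta_{1G} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ -\beta_{g1} & \ldots & 1 & \ldots & -\beta_{gG} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ -\beta_{G1} & \ldots & -\beta_{Gg} & \ldots & 1 \end{bmatrix}, \quad Y = \begin{bmatrix} y_{1t} \\ \ldots \\ y_{gt} \\ \ldots \\ y_{Gt} \end{bmatrix}, \quad Z = \begin{bmatrix} z_{t0} \\ z_{t1} \\ \ldots \\ z_{tj} \\ \ldots \\ z_{tK} \end{bmatrix},$$
$$A = \begin{bmatrix} -\alpha_{10} & -\alpha_{11} & \ldots & -\alpha_{1j} & \ldots & -\alpha_{1K} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ -\alpha_{g0} & -\alpha_{g1} & \ldots & -\alpha_{gj} & \ldots & -\alpha_{gK} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ -\alpha_{G0} & -\alpha_{G1} & \ldots & -\alpha_{Gj} & \ldots & -\alpha_{GK} \end{bmatrix}, \quad \eta = \begin{bmatrix} \eta_{1t} \\ \ldots \\ \eta_{gt} \\ \ldots \\ \eta_{Gt} \end{bmatrix}.$$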
Matrix $B$ contains the structural parameters associated with the interdependent variables, and matrix $A$ contains the structural parameters associated with the predetermined variables. Vector $Y$ contains the model's interdependent variables, vector $Z$ includes the model's predetermined variables, while vector $\eta$ contains the random components of the model's structural form equations.
Given the mechanism of the dependencies among the interdependent variables, a multi-equation model can be classified as belonging to one of three categories. The manner in which a model's interdependent variables are interrelated determines the model type:
• simple models,
• recursive models,
• systems of interdependent equations.
Simple models have no direct linkages among interdependent variables. This means that none of the interdependent variables acts as an explanatory variable in any of the equations. Such a model can be exemplified by the following system of equations:
$$\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \alpha_{14} t + \alpha_{16} y_{3,t-1} + \eta_{1t}, \\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \alpha_{24} t + \alpha_{25} y_{2,t-1} + \eta_{2t}, \\
y_{3t} &= \alpha_{30} + \alpha_{33} x_{t3} + \alpha_{36} y_{3,t-1} + \eta_{3t}.
\end{aligned} \tag{2.3}$$
The interdependent variables ($y_{1t}$, $y_{2t}$, $y_{3t}$) in model (2.3) are not interrelated. None of them acts as an explanatory variable. Only the lagged endogenous variables $y_{2,t-1}$ and $y_{3,t-1}$, which belong to the group of predetermined variables, act as explanatory variables.
Recursive models are characterized by a chain (recursive) nature of the linkages among the interdependent variables. The chainlike character of these links implies unidirectionality of the dependencies, where the beginning and end of the chain can be indicated. Such a chain denoting the model's recursiveness can be exemplified as:
$$y_{1t} \rightarrow y_{2t} \rightarrow y_{3t} \rightarrow y_{4t}$$
The chain begins with variable $y_{1t}$ and ends with variable $y_{4t}$. A recursive model with the above links between interdependent variables can take the following form:
$$\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \alpha_{14} t + \alpha_{16} y_{3,t-1} + \eta_{1t}, \\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \beta_{21} y_{1t} + \alpha_{25} y_{2,t-1} + \eta_{2t}, \\
y_{3t} &= \alpha_{30} + \alpha_{34} t + \beta_{31} y_{1t} + \beta_{32} y_{2t} + \alpha_{35} y_{2,t-1} + \eta_{3t}, \\
y_{4t} &= \alpha_{40} + \alpha_{43} x_{t3} + \beta_{43} y_{3t} + \alpha_{46} y_{3,t-1} + \eta_{4t}.
\end{aligned} \tag{2.4}$$
Systems of interdependent equations are characterized by the occurrence of mutual multilateral links between the interdependent variables. There can be two types of such linkages: direct feedback, or indirect feedback, also referred to as closed-loop linkages between interdependent variables. Feedback involves contemporaneous mutual interaction of a pair of such variables. For instance, variables $y_{gt}$ and $y_{g't}$ (for $g, g' = 1, \ldots, G$; $g \ne g'$) are feedback related when:
$$y_{gt} \leftrightarrow y_{g't}$$
Indirect feedback (a closed loop) between interdependent variables, in turn, occurs when:
$$\begin{array}{ccc} y_{1t} & \rightarrow & y_{2t} \\ \uparrow & & \downarrow \\ y_{4t} & \leftarrow & y_{3t} \end{array}$$
The two types of linkages can occur in a model simultaneously, which is often the case in large multi-equation models. The occurrence of either of the mechanisms indicated is a sufficient condition for the model to form a system of interdependent equations. An exemplary model with direct feedback can take the following form:
$$\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \alpha_{14} t + \alpha_{15} y_{3,t-1} + \eta_{1t}, \\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \alpha_{24} t + \beta_{23} y_{3t} + \eta_{2t}, \\
y_{3t} &= \alpha_{30} + \alpha_{33} x_{t3} + \beta_{32} y_{2t} + \eta_{3t}.
\end{aligned} \tag{2.5}$$
Model (2.5) is characterized by a feedback mechanism between variables $y_{2t}$ and $y_{3t}$. Variable $y_{3t}$ affects variable $y_{2t}$ and acts as an explanatory variable in equation two. Moreover, variable $y_{2t}$ affects $y_{3t}$, assuming the role of an explanatory variable in equation three. As such, the direct feedback requirement is met. Equation one in model (2.5) deserves attention, as only predetermined variables occur in the set of its explanatory variables; hence, the equation is of simple-model nature. An equation in a system of interdependent equations in which the explanatory variables are exclusively predetermined variables is called a detached equation. Consider the following model:
$$\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \beta_{14} y_{3t} + \alpha_{15} y_{3,t-1} + \eta_{1t}, \\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \alpha_{24} t + \beta_{21} y_{1t} + \eta_{2t}, \\
y_{3t} &= \alpha_{30} + \alpha_{33} x_{t3} + \beta_{32} y_{2t} + \eta_{3t}.
\end{aligned} \tag{2.6}$$
The model's interdependent variables form a closed loop:
$$y_{1t} \rightarrow y_{2t} \rightarrow y_{3t} \rightarrow y_{1t}$$
As such, model (2.6) forms a system of interdependent equations, and thus can be classified within the category of the most complex econometric models.
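The three-way classification can be sketched as a search for cycles in the graph of links between interdependent variables. The function and the dictionary encoding below are hypothetical: each equation number maps to the set of interdependent variables that act as explanatory variables in that equation, and both direct feedback and a closed loop show up as a cycle.

```python
def classify_model(links):
    """Classify a multi-equation model as simple, recursive or interdependent.

    links[g] is the set of interdependent variables (equation numbers) that
    act as explanatory variables in equation g.
    """
    if all(len(explanatory) == 0 for explanatory in links.values()):
        return "simple"

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = {g: WHITE for g in links}

    def has_cycle(g):
        color[g] = GREY
        for h in links[g]:
            if color[h] == GREY or (color[h] == WHITE and has_cycle(h)):
                return True               # feedback or closed loop found
        color[g] = BLACK
        return False

    if any(color[g] == WHITE and has_cycle(g) for g in links):
        return "interdependent"
    return "recursive"


# Link structures read off models (2.3), (2.4), (2.5) and (2.6)
model_2_3 = {1: set(), 2: set(), 3: set()}
model_2_4 = {1: set(), 2: {1}, 3: {1, 2}, 4: {3}}
model_2_5 = {1: set(), 2: {3}, 3: {2}}
model_2_6 = {1: {3}, 2: {1}, 3: {2}}
for name, m in [("2.3", model_2_3), ("2.4", model_2_4),
                ("2.5", model_2_5), ("2.6", model_2_6)]:
    print(name, classify_model(m))
```

Only unlagged endogenous variables enter the dictionary; lagged ones are predetermined and do not create links of this kind.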
2.2 Reduced Form of a Model
A structural form model written in matrix terms as Eq. (2.2) can be left-multiplied by matrix $B^{-1}$, which yields the following:
$$B^{-1} B Y + B^{-1} A Z = B^{-1} \eta.$$
The matrix product $B^{-1} B = I$, where $I$ is the identity matrix of order $G$. After moving the expression $B^{-1} A Z$ to the right side of the equation, we obtain the following:
$$Y = -B^{-1} A Z + B^{-1} \eta.$$
After substituting $C = -B^{-1} A$ and $\varepsilon = B^{-1} \eta$, we arrive at the reduced form of the model:
$$Y = CZ + \varepsilon, \tag{2.7}$$
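where:²
$$C = \begin{bmatrix} c_{10} & c_{11} & \ldots & c_{1j} & \ldots & c_{1K} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ c_{g0} & c_{g1} & \ldots & c_{gj} & \ldots & c_{gK} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ c_{G0} & c_{G1} & \ldots & c_{Gj} & \ldots & c_{GK} \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_{1t} \\ \ldots \\ \varepsilon_{gt} \\ \ldots \\ \varepsilon_{Gt} \end{bmatrix}.$$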
Matrix $C$ contains the structural parameters of the model's reduced form equations, while $\varepsilon$ is the vector of the random components of the reduced form equations. The $G$ reduced form equations can be written as follows:
2 Designations $Y$ and $Z$ have been explained in connection with Eq. (2.2).
$$\begin{aligned}
y_{1t} &= c_{10} + c_{11} z_{t1} + \ldots + c_{1j} z_{tj} + \ldots + c_{1K} z_{tK} + \varepsilon_{1t}, \\
&\ldots \\
y_{gt} &= c_{g0} + c_{g1} z_{t1} + \ldots + c_{gj} z_{tj} + \ldots + c_{gK} z_{tK} + \varepsilon_{gt}, \\
&\ldots \\
y_{Gt} &= c_{G0} + c_{G1} z_{t1} + \ldots + c_{Gj} z_{tj} + \ldots + c_{GK} z_{tK} + \varepsilon_{Gt}.
\end{aligned} \tag{2.8}$$
It follows that each of the reduced form equations contains an identical set of explanatory variables, formed by all the predetermined variables of the entire multi-equation model. For instance, the system of reduced form equations for model (2.6) takes the following form:³
$$\begin{aligned}
y_{1t} &= c_{10} + c_{11} x_{t1} + c_{12} x_{t2} + c_{13} x_{t3} + c_{14} t + c_{15} y_{3,t-1} + \varepsilon_{1t}, \\
y_{2t} &= c_{20} + c_{21} x_{t1} + c_{22} x_{t2} + c_{23} x_{t3} + c_{24} t + c_{25} y_{3,t-1} + \varepsilon_{2t}, \\
y_{3t} &= c_{30} + c_{31} x_{t1} + c_{32} x_{t2} + c_{33} x_{t3} + c_{34} t + c_{35} y_{3,t-1} + \varepsilon_{3t}.
\end{aligned} \tag{2.9}$$
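The passage from the structural to the reduced form can be sketched numerically. The matrices below are hypothetical values for a two-equation model with a constant and two exogenous variables; they are not taken from the text.

```python
import numpy as np

# Hypothetical structural-form matrices of a two-equation model.
# B holds the coefficients of the interdependent variables,
# A the (negated) coefficients of the constant, x_t1 and x_t2.
B = np.array([[1.0, -0.5],
              [-0.2, 1.0]])
A = np.array([[-1.0, -2.0, 0.0],
              [-3.0, 0.0, -4.0]])

# C = -B^{-1} A, computed by solving B C = -A rather than inverting B explicitly.
C = -np.linalg.solve(B, A)
print(np.round(C, 4))
```

Although each row of `A` contains a zero, every entry of `C` is nonzero, which illustrates why each reduced form equation contains all the predetermined variables of the model.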
2.3 Model Identification
In order to use a multi-equation model, the correctness of its structure, in terms of the relationship between its reduced and structural forms, needs to be established. Consider the following equation:
$$C = -B^{-1} A, \tag{2.10}$$
linking the structural form to the reduced form. By left-multiplying both sides of Eq. (2.10) by matrix $B$, we arrive at the identification equation:
$$BC = -A. \tag{2.11}$$
A model is identifiable when, based on the components of matrix $C$, the system of linear equations can be solved for the components of matrices $B$ and $A$. This implies the need to solve a system of $G(K + 1)$ linear equations.⁴ When solving a system of $G(K + 1)$ linear equations, three possible outcomes can arise:
(a) there is only one solution to the system of equations. We can then speak of an unambiguous solution, in which case the multi-equation model is uniquely identifiable;
(b) there are multiple solutions to the system of $G(K + 1)$ equations. The system solution is then ambiguous, which means that the model is non-uniquely identifiable. Its structure is correct. Such a model is also referred to as overidentified;
(c) the system has no solution, in which case the multi-equation model is unidentifiable, meaning its construction is defective. Such a model requires reconstruction (re-specification) in a manner that makes it at least uniquely identifiable.
3 The original structural-form variable designations can be used, or new designations can be introduced by assigning a new designation $z_{tj}$ to each of the exogenous and lagged endogenous variables.
4 The $G(K + 1)$ size results from the dimensions of matrix $A$, which contains that many elements.
Two identifiability conditions must be met in an empirical study of model identifiability. They arise from the need to impose so-called zero constraints on some of the structural parameters. This means that a part of each equation's structural parameters must take zero values. In practice, thus, some interdependent and predetermined variables should not occur in the explanatory variable sets of certain structural form equations. Identifiability testing is carried out for each equation separately. The prerequisite for the identifiability of the g-th equation ($g = 1, \ldots, G$) is that the number of the model's variables which do not appear in this equation ($L_g$) is at least equal to $G - 1$:
$$L_g \ge G - 1. \tag{2.12}$$
The second condition, which is both necessary and sufficient, is that the rank of matrix $W_g$ ($g = 1, \ldots, G$) equals $G - 1$, that is:⁵
$$\operatorname{rz}(W_g) = G - 1. \tag{2.13}$$
If condition (2.13) is satisfied, the g-th equation is uniquely identifiable when $L_g = G - 1$. In contrast, if condition (2.13) is satisfied and $L_g > G - 1$, the g-th equation is non-uniquely identifiable (overidentified). The g-th equation is not identifiable when $L_g < G - 1$ or $\operatorname{rz}(W_g) < G - 1$. This means that the entire model is non-identifiable and requires reconstruction. If all the equations of the model are uniquely identifiable, the model is uniquely identifiable. A multi-equation model is non-uniquely identifiable if all its equations are identifiable and at least one of them is overidentified. Consider model (2.6):
$$\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \beta_{14} y_{3t} + \alpha_{15} y_{3,t-1} + \eta_{1t}, \\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \alpha_{24} t + \beta_{21} y_{1t} + \eta_{2t}, \\
y_{3t} &= \alpha_{30} + \alpha_{33} x_{t3} + \beta_{32} y_{2t} + \eta_{3t}.
\end{aligned}$$
5 Matrix $W_g$ is constructed from the model parameters associated with the variables which do not appear in the g-th equation ($g = 1, \ldots, G$).
First, all the model variables, i.e., $y_{1t}, y_{2t}, y_{3t}, x_{t0}, x_{t1}, x_{t2}, x_{t3}, t, y_{3,t-1}$, need to be listed. Note that the following do not occur in equation one: $y_{2t}$, $x_{t2}$, $x_{t3}$, and $t$. This means that the number of variables not appearing in the first equation is $L_1 = 4$. The prerequisite is thus met, as $L_1 = 4 > G - 1 = 2$. Analogously, equation two does not contain $y_{3t}$, $x_{t1}$, $x_{t3}$, and $y_{3,t-1}$, which means that $L_2 = 4$. As such, equation two meets the prerequisite. Variables $y_{1t}$, $x_{t1}$, $x_{t2}$, $t$ and $y_{3,t-1}$ are not present in equation three, which results in $L_3 = 5$. Accordingly, equations two and three both satisfy condition (2.12). That being the case, it is necessary to construct matrices $W_1$, $W_2$, and $W_3$, containing the coefficients of the variables which do not appear in a given equation. Matrix $W_1$ will thus contain the structural parameters of the variables $y_{2t}$, $x_{t2}$, $x_{t3}$ and $t$ from equations two and three:
$$W_1 = \begin{bmatrix} 1 & -\alpha_{22} & 0 & -\alpha_{24} \\ -\beta_{32} & 0 & -\alpha_{33} & 0 \end{bmatrix}. \tag{2.14}$$
It suffices for parameters $\alpha_{22}$, $\alpha_{24}$, $\alpha_{33}$, and $\beta_{32}$ to be different from zero for the rank of this matrix to be $\operatorname{rz}(W_1) = 2 = G - 1$. Condition (2.13) is thus satisfied, as a result of which equation one is identifiable. Since the inequality $L_1 = 4 > G - 1 = 2$ holds, equation one is non-uniquely identifiable. Analogously, matrices $W_2$ and $W_3$ are as follows:
$$W_2 = \begin{bmatrix} -\alpha_{11} & -\beta_{14} & -\alpha_{15} & 0 \\ 0 & 1 & 0 & -\alpha_{33} \end{bmatrix}$$
and
$$W_3 = \begin{bmatrix} 1 & -\alpha_{11} & 0 & 0 & -\alpha_{15} \\ -\beta_{21} & 0 & -\alpha_{22} & -\alpha_{24} & 0 \end{bmatrix}.$$
It can be demonstrated that $\operatorname{rz}(W_2) = \operatorname{rz}(W_3) = 2 = G - 1$. What is more, $L_2 = 4 > G - 1$ and $L_3 = 5 > G - 1$. As such, equations two and three are non-uniquely identifiable. Model (2.6) is therefore non-uniquely identifiable (overidentified), which implies that it is correctly constructed and that further work on it can proceed in the subsequent stages.
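The order and rank conditions for model (2.6) can be checked numerically once the structural parameters are given values. The values below are hypothetical nonzero numbers chosen purely for illustration, and the helper function is likewise an assumption of this sketch.

```python
import numpy as np

# Hypothetical nonzero parameter values for model (2.6), illustration only.
alpha11, alpha15, beta14 = 0.5, 0.3, 0.4
alpha22, alpha24, beta21 = 0.7, 0.1, 0.6
alpha33, beta32 = 0.9, 0.2

G = 3  # number of equations

# W_g: coefficients (from the other equations) of the variables absent from equation g
W = {
    1: np.array([[1.0, -alpha22, 0.0, -alpha24],
                 [-beta32, 0.0, -alpha33, 0.0]]),
    2: np.array([[-alpha11, -beta14, -alpha15, 0.0],
                 [0.0, 1.0, 0.0, -alpha33]]),
    3: np.array([[1.0, -alpha11, 0.0, 0.0, -alpha15],
                 [-beta21, 0.0, -alpha22, -alpha24, 0.0]]),
}
# L_g equals the number of excluded variables, i.e. the number of columns of W_g
L = {g: W[g].shape[1] for g in W}


def identification_status(L_g, W_g, G):
    """Apply conditions (2.12) and (2.13) to a single equation."""
    rank = np.linalg.matrix_rank(W_g)
    if L_g < G - 1 or rank < G - 1:
        return "unidentifiable"
    return "uniquely identifiable" if L_g == G - 1 else "overidentified"


status = {g: identification_status(L[g], W[g], G) for g in W}
print(L, status)
```

With other parameter values, in particular with zeros in the wrong places, the rank of a matrix $W_g$ could drop below $G - 1$, which is why the rank condition has to be checked in addition to the count.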
2.4 Multi-Equation Model Parameter Estimation
Methods of multi-equation model parameter estimation essentially can be divided into two groups. The first encompasses the methods used for estimating each of the model’s equations separately, just as in single-equation models. The second
comprises the methods used for parameter estimation of all the model's equations simultaneously, referred to as joint estimation methods.
The prevailing view in econometric literature is that parameters of simple and recursive model equations can be estimated using the ordinary least squares method. This means that each equation in models of this type can be treated in the estimation as a single-equation model. The ordinary least squares method does not yield consistent parameter estimators of structural form equations in systems of interdependent equations. The reason for the inconsistency is that the interdependent variables which act as explanatory variables in the equations are correlated with the contemporaneous random components. The condition of OLS applicability, written as Eq. (1.18), is thus not met in such a case. This necessitates a search for other methods of structural parameter estimation in models with interdependent equations. In doing so, it should be remembered that transition to the estimation stage is only possible if the model is identifiable, uniquely or non-uniquely.
Note that the reduced form of a system of interdependent equations is of simple-model nature. Hence, if there are no particularly unfavorable conditions, parameters of the model's reduced form equations can be estimated via the ordinary least squares method, in each equation separately. Let us consider a case when a system of interdependent equations is uniquely identifiable. This means that the identification equation $BC = -A$ (2.11) has a unique solution. Consider the following model:
$$\begin{aligned}
y_{1t} &= \alpha_{10} + \beta_{12} y_{2t} + \alpha_{11} x_{t1} + \eta_{1t}, \\
y_{2t} &= \alpha_{20} + \beta_{21} y_{1t} + \alpha_{22} x_{t2} + \eta_{2t}.
\end{aligned} \tag{2.15}$$
The above model is a system of interdependent equations, in which each equation is identifiable uniquely. Its reduced form can be written as: y1t = c10 þ c11 xt1 þ c12 xt2 þ ε1t ,
ð2:16Þ
y2t = c20 þ c21 xt1 þ c22 xt2 þ ε2t :
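The identification equation for the above structural and reduced form is as follows:

$$
\begin{bmatrix} 1 & -\beta_{12} \\ -\beta_{21} & 1 \end{bmatrix}
\begin{bmatrix} c_{10} & c_{11} & c_{12} \\ c_{20} & c_{21} & c_{22} \end{bmatrix}
=
\begin{bmatrix} \alpha_{10} & \alpha_{11} & 0 \\ \alpha_{20} & 0 & \alpha_{22} \end{bmatrix}.
\tag{2.17}
$$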
Suppose that by means of OLS, using the statistical data at hand, the parameters of each of the model’s reduced form equations have been estimated, yielding the following empirical equations:
$$
\begin{aligned}
y_{1t} &= 3.2 + 0.8 x_{t1} + 2.4 x_{t2} + e_{1t},\\
y_{2t} &= 0.4 - 1.6 x_{t1} + 6.2 x_{t2} + e_{2t}.
\end{aligned}
\tag{2.18}
$$
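Equations (2.18) contain residuals, denoted by $e_{1t}$ and $e_{2t}$, respectively. The empirical identification equation can be written as:

$$
\begin{bmatrix} 1 & -b_{12} \\ -b_{21} & 1 \end{bmatrix}
\begin{bmatrix} 3.2 & 0.8 & 2.4 \\ 0.4 & -1.6 & 6.2 \end{bmatrix}
=
\begin{bmatrix} a_{10} & a_{11} & 0 \\ a_{20} & 0 & a_{22} \end{bmatrix}.
\tag{2.19}
$$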
In the matrices of system (2.19), $b_{12}$ and $b_{21}$ denote the estimates of parameters $\beta_{12}$ and $\beta_{21}$, while $a_{10}$, $a_{11}$, $a_{20}$, $a_{22}$ denote the estimates of parameters $\alpha_{10}$, $\alpha_{11}$, $\alpha_{20}$, $\alpha_{22}$ in the system of structural form Eq. (2.15). Based on matrix Eq. (2.19), a system of six linear equations with six unknowns is formed:

$$
\begin{aligned}
a_{10} &= 3.2 - 0.4 b_{12}, \\
a_{11} &= 0.8 + 1.6 b_{12}, \\
0 &= 2.4 - 6.2 b_{12}, \\
a_{20} &= 0.4 - 3.2 b_{21}, \\
0 &= -1.6 - 0.8 b_{21}, \\
a_{22} &= 6.2 - 2.4 b_{21}.
\end{aligned}
\tag{2.20}
$$

Solution of the system of Eq. (2.20) yields the structural parameter estimates for the system of Eq. (2.15), which reach the following numerical values: $b_{12} = 0.387$, $b_{21} = -2.0$, $a_{10} = 3.045$, $a_{11} = 1.419$, $a_{20} = 6.8$, $a_{22} = 11.0$. The above parameter estimates of the structural form equations have been obtained via the indirect least squares method (ILS). Consequently, the empirical equations of model (2.15) can be written as follows:

$$
\begin{aligned}
\hat{y}_{1t} &= 3.045 + 0.387 y_{2t} + 1.419 x_{t1},\\
\hat{y}_{2t} &= 6.8 - 2.0 y_{1t} + 11.0 x_{t2}.
\end{aligned}
\tag{2.21}
$$
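The second ILS stage can be reproduced numerically. The sketch below is only an illustration: it takes the reduced form estimates from (2.18), uses the two zero restrictions of (2.19) to recover $b_{12}$ and $b_{21}$, and then evaluates the remaining equations of (2.20) as a matrix product. The variable names (`C_hat`, `B_hat`, `A_hat`) are assumptions made for this example, not notation from the text.

```python
import numpy as np

# OLS estimates of the reduced form (2.18); rows are equations,
# columns correspond to [constant, x_t1, x_t2].
C_hat = np.array([[3.2, 0.8, 2.4],
                  [0.4, -1.6, 6.2]])

# Zero restriction in row 1 of (2.19): 0 = 2.4 - 6.2 * b12
b12 = C_hat[0, 2] / C_hat[1, 2]
# Zero restriction in row 2 of (2.19): 0 = -1.6 - 0.8 * b21
b21 = C_hat[1, 1] / C_hat[0, 1]

# Left-hand matrix of (2.19) with the estimates inserted.
B_hat = np.array([[1.0, -b12],
                  [-b21, 1.0]])

# The product gives the right-hand side of (2.19):
# row 1 = [a10, a11, 0], row 2 = [a20, 0, a22].
A_hat = B_hat @ C_hat
a10, a11, _ = A_hat[0]
a20, _, a22 = A_hat[1]

print(f"b12={b12:.3f}, b21={b21:.1f}")
print(f"a10={a10:.3f}, a11={a11:.3f}, a20={a20:.1f}, a22={a22:.1f}")
```

Evaluating the second equation of (2.20), $a_{11} = 0.8 + 1.6\,b_{12}$, with $b_{12} \approx 0.387$ gives about 1.419; the two restricted positions of `A_hat` come out as zero up to rounding.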
Considering the above, it follows that the indirect least squares method is quite easy to apply. It proceeds in two stages: the first uses OLS to estimate the parameters of the model’s reduced form equations; the second, in turn, solves the system of the linear equations derived from the matrix identification equation. One shortcoming in this method is the lack of a variance and covariance matrix of the structural parameter estimates for structural form empirical equations. This prevents determination of the mean parameter-estimation errors for this form of model equations. In
consequence, the significance of the explanatory variables in individual empirical structural form equations cannot be tested.

When a system of interdependent equations is identifiable non-uniquely, the indirect least squares method cannot be applied. The most commonly adopted estimation procedure is then the double least squares method (2OLS), which involves twofold application of the ordinary least squares method. First, OLS is used to estimate the parameters of the model's reduced form equations. Based on the reduced form empirical equations, theoretical values of the interdependent variables are determined, which are ipso facto devoid of the random part. The interdependent explanatory variables in the structural form equations are then substituted with their theoretical values derived from the empirical reduced form equations. The parameters of such modified structural form equations can be estimated via the ordinary least squares method. Let us go through the 2OLS procedure, using the system of interdependent Eq. (2.5) as an example:

$$
\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \alpha_{14} t + \alpha_{15} y_{3,t-1} + \eta_{1t},\\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \alpha_{24} t + \beta_{23} y_{3t} + \eta_{2t},\\
y_{3t} &= \alpha_{30} + \alpha_{33} x_{t3} + \beta_{32} y_{2t} + \eta_{3t}.
\end{aligned}
$$

A system of the model's reduced form equations is written as follows:

$$
\begin{aligned}
y_{1t} &= c_{10} + c_{11} x_{t1} + c_{12} x_{t2} + c_{13} x_{t3} + c_{14} t + c_{15} y_{2,t-1} + \varepsilon_{1t},\\
y_{2t} &= c_{20} + c_{21} x_{t1} + c_{22} x_{t2} + c_{23} x_{t3} + c_{24} t + c_{25} y_{2,t-1} + \varepsilon_{2t},\\
y_{3t} &= c_{30} + c_{31} x_{t1} + c_{32} x_{t2} + c_{33} x_{t3} + c_{34} t + c_{35} y_{2,t-1} + \varepsilon_{3t}.
\end{aligned}
$$

After estimating the parameters of the above equations using OLS, empirical reduced form equations are obtained:

$$
\begin{aligned}
\hat{y}_{1t} &= \hat{c}_{10} + \hat{c}_{11} x_{t1} + \hat{c}_{12} x_{t2} + \hat{c}_{13} x_{t3} + \hat{c}_{14} t + \hat{c}_{15} y_{2,t-1},\\
\hat{y}_{2t} &= \hat{c}_{20} + \hat{c}_{21} x_{t1} + \hat{c}_{22} x_{t2} + \hat{c}_{23} x_{t3} + \hat{c}_{24} t + \hat{c}_{25} y_{2,t-1},\\
\hat{y}_{3t} &= \hat{c}_{30} + \hat{c}_{31} x_{t1} + \hat{c}_{32} x_{t2} + \hat{c}_{33} x_{t3} + \hat{c}_{34} t + \hat{c}_{35} y_{2,t-1}.
\end{aligned}
$$

In the above system of model (2.3) empirical reduced form equations, $\hat{y}_{1t}$, $\hat{y}_{2t}$, $\hat{y}_{3t}$ denote the theoretical values of the interdependent variables in individual equations, which result from the calculations carried out after applying OLS. The second step of the estimation procedure can now be performed. The actual values of the interdependent variables in the structural form equations are substituted with their theoretical values derived from the empirical reduced form equations, wherever they act as explanatory variables. Let us thus consider a new system of structural form equations:
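$$
\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \alpha_{14} t + \alpha_{15} y_{3,t-1} + \eta_{1t},\\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \alpha_{24} t + \beta_{23} \hat{y}_{3t} + \eta_{2t},\\
y_{3t} &= \alpha_{30} + \alpha_{33} x_{t3} + \beta_{32} \hat{y}_{2t} + \eta_{3t}.
\end{aligned}
\tag{2.22}
$$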
The parameters of each equation of system (2.22) can be estimated using OLS. Equation one draws attention, as none of the interdependent variables occur in the set of its explanatory variables, which means that the parameters of this detached equation in the system of interdependent equations can be estimated directly, using the ordinary least squares method. In the second equation, explanatory variable $y_{3t}$ has been substituted with its theoretical values $\hat{y}_{3t}$ calculated from the reduced form. This way, no correlation of the explanatory interdependent variable, which is additionally non-random, with the random component occurs in the second equation. As such, parameter estimation for the modified equation two via the ordinary least squares method is acceptable. An analogous swap has occurred in equation three, in which $y_{2t}$ has been substituted with $\hat{y}_{2t}$, enabling parameter estimation via OLS.

Let us try to generalize the double least squares method formula. To do so, a matrix notation of any (g-th; g = 1, ..., G) equation from the system of interdependent equations is necessary:

$$
y_g = Y_g \beta_g + Z_g \alpha_g + \eta_g,
\tag{2.23}
$$
where: $y_g$ is the vector of observations for the g-th equation's dependent variable, of n×1 dimensions; $Y_g$ is the matrix of observations for the interdependent variables which act as explanatory variables in the g-th equation, of n×F dimensions; $Z_g$ is the matrix of observations for the predetermined variables which act as explanatory variables in the g-th equation, of n×(H + 1) dimensions (a column of ones and H variables); $\beta_g$ is the vector of structural parameters at the interdependent variables which act as explanatory variables in the g-th equation, of F×1 dimensions; $\alpha_g$ is the vector of structural parameters at the predetermined variables which act as explanatory variables in the g-th equation, of (H + 1)×1 dimensions; $\eta_g$ is the g-th equation's random component, of n×1 dimensions. Individual matrices occurring in formula (2.23) are as follows:
$$
y_g = \begin{bmatrix} y_{g1} \\ \vdots \\ y_{gt} \\ \vdots \\ y_{gn} \end{bmatrix},
\quad
Y_g = \begin{bmatrix}
Y_{11} & \ldots & Y_{f1} & \ldots & Y_{F1} \\
\vdots & & \vdots & & \vdots \\
Y_{1t} & \ldots & Y_{ft} & \ldots & Y_{Ft} \\
\vdots & & \vdots & & \vdots \\
Y_{1n} & \ldots & Y_{fn} & \ldots & Y_{Fn}
\end{bmatrix},
\quad
Z_g = \begin{bmatrix}
1 & z_{11} & \ldots & z_{1h} & \ldots & z_{1H} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & z_{t1} & \ldots & z_{th} & \ldots & z_{tH} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & z_{n1} & \ldots & z_{nh} & \ldots & z_{nH}
\end{bmatrix},
$$

$$
\beta_g = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_f \\ \vdots \\ \beta_F \end{bmatrix},
\quad
\alpha_g = \begin{bmatrix} \alpha_{g0} \\ \alpha_{g1} \\ \vdots \\ \alpha_{gh} \\ \vdots \\ \alpha_{gH} \end{bmatrix},
\quad
\eta_g = \begin{bmatrix} \eta_{g1} \\ \vdots \\ \eta_{gt} \\ \vdots \\ \eta_{gn} \end{bmatrix}.
$$

In Eq. (2.23), the matrix $Y_g$ of observed values of the interdependent variables which act as explanatory variables in this equation is substituted with matrix $\hat{Y}_g$ of their theoretical values derived from the reduced form equations. As such, consider the following equation:

$$
y_g = \hat{Y}_g \beta_g + Z_g \alpha_g + \eta_g,
\tag{2.24}
$$
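which can also be written as:

$$
y_g = \begin{bmatrix} \hat{Y}_g & Z_g \end{bmatrix}
\begin{bmatrix} \beta_g \\ \alpha_g \end{bmatrix} + \eta_g,
\tag{2.25}
$$

where $\hat{Y}_g = Z\hat{c} = Z (Z^T Z)^{-1} Z^T Y_g$, and

$$
Z = \begin{bmatrix}
1 & z_{11} & \ldots & z_{1j} & \ldots & z_{1K} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & z_{t1} & \ldots & z_{tj} & \ldots & z_{tK} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & z_{n1} & \ldots & z_{nj} & \ldots & z_{nK}
\end{bmatrix}.
$$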
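The 2OLS estimator of the structural parameter vector thus takes the following form:

$$
\begin{bmatrix} \hat{\beta}_g \\ \hat{\alpha}_g \end{bmatrix}
= \left( \begin{bmatrix} \hat{Y}_g & Z_g \end{bmatrix}^T \begin{bmatrix} \hat{Y}_g & Z_g \end{bmatrix} \right)^{-1}
\begin{bmatrix} \hat{Y}_g & Z_g \end{bmatrix}^T y_g.
\tag{2.26}
$$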
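Performing appropriate matrix multiplications, we ultimately arrive at the following:

$$
\begin{bmatrix} \hat{\beta}_g \\ \hat{\alpha}_g \end{bmatrix}
= \begin{bmatrix}
\hat{Y}_g^T \hat{Y}_g & \hat{Y}_g^T Z_g \\
Z_g^T \hat{Y}_g & Z_g^T Z_g
\end{bmatrix}^{-1}
\begin{bmatrix} \hat{Y}_g^T y_g \\ Z_g^T y_g \end{bmatrix}.
\tag{2.27}
$$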
Product matrix $\hat{Y}_g^T \hat{Y}_g$ can also be written as:

$$
\hat{Y}_g^T \hat{Y}_g = Y_g^T Z (Z^T Z)^{-1} Z^T Z (Z^T Z)^{-1} Z^T Y_g = Y_g^T Z (Z^T Z)^{-1} Z^T Y_g.
\tag{2.28}
$$
Using notation (2.28) in Eq. (2.27), the final double least squares estimator formula is obtained:

$$
\begin{bmatrix} \hat{\beta}_g \\ \hat{\alpha}_g \end{bmatrix}
= \begin{bmatrix}
Y_g^T Z (Z^T Z)^{-1} Z^T Y_g & Y_g^T Z (Z^T Z)^{-1} Z^T Z_g \\
Z_g^T Z (Z^T Z)^{-1} Z^T Y_g & Z_g^T Z_g
\end{bmatrix}^{-1}
\begin{bmatrix} Y_g^T Z (Z^T Z)^{-1} Z^T y_g \\ Z_g^T y_g \end{bmatrix}
$$
Estimates of structural parameter vectors βg and αg are obtained after defining the initial statistical observation matrices: Yg, Z, Zg, and yg, which were determined at the beginning of the deliberation on the 2OLS method.
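The 2OLS procedure can be sketched in a few lines of NumPy. The example below is an illustration under stated assumptions, not the author's implementation: the two-equation system, its parameter values, the sample size and the function names (`two_stage_ls`, `two_stage_ls_closed_form`) are all hypothetical. The first function follows the two stages literally, as in (2.24) to (2.26); the second evaluates the estimator from the original matrices $Y_g$, $Z$, $Z_g$ and $y_g$ only, so the two results should coincide.

```python
import numpy as np

def two_stage_ls(y_g, Y_g, Z_g, Z):
    """2OLS estimate of [beta_g; alpha_g], computed in two OLS stages."""
    # Stage 1: theoretical values Y_hat = Z (Z'Z)^{-1} Z' Y_g.
    C_hat, *_ = np.linalg.lstsq(Z, Y_g, rcond=None)
    Y_hat = Z @ C_hat
    # Stage 2: OLS of y_g on [Y_hat, Z_g], as in (2.26).
    W = np.hstack([Y_hat, Z_g])
    coef, *_ = np.linalg.lstsq(W, y_g, rcond=None)
    return coef

def two_stage_ls_closed_form(y_g, Y_g, Z_g, Z):
    """The same estimator written with the original matrices only."""
    P = Z @ np.linalg.solve(Z.T @ Z, Z.T)  # Z (Z'Z)^{-1} Z'
    lhs = np.block([[Y_g.T @ P @ Y_g, Y_g.T @ P @ Z_g],
                    [Z_g.T @ P @ Y_g, Z_g.T @ Z_g]])
    rhs = np.concatenate([Y_g.T @ P @ y_g, Z_g.T @ y_g])
    return np.linalg.solve(lhs, rhs)

# Hypothetical data generated from a system of the form (2.15):
#   y1 = 1.0 + 0.5*y2 + 2.0*x1 + eta1
#   y2 = 2.0 - 0.4*y1 + 3.0*x2 + eta2
rng = np.random.default_rng(0)
n = 500
Z = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
B = np.array([[1.0, -0.5], [0.4, 1.0]])
A = np.array([[1.0, 2.0, 0.0], [2.0, 0.0, 3.0]])
E = rng.normal(scale=0.5, size=(n, 2))
Y = np.linalg.solve(B, (Z @ A.T + E).T).T  # solves B y_t = A z_t + eta_t

# First equation: dependent y1, interdependent regressor y2,
# predetermined regressors [constant, x1].
y_g, Y_g, Z_g = Y[:, 0], Y[:, [1]], Z[:, [0, 1]]
est = two_stage_ls(y_g, Y_g, Z_g, Z)
est_cf = two_stage_ls_closed_form(y_g, Y_g, Z_g, Z)
print(est)     # order: [b12, a10, a11]
print(est_cf)
```

With the simulated data, both functions return the same vector up to rounding, and the estimates lie close to the values used to generate the sample.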
Chapter 3
Econometric Forecasts
3.1
The Concept of an Econometric Forecast
One important area of econometric model exploitation entails construction of forecasts. Econometric forecasting means inference into the future, using an econometric model. Econometric prediction, therefore, encompasses a set of steps in the research procedure, which result in an econometric forecast. The process of forecasting and estimating the future, based on theoretical studies, analytical considerations, logical premises and practical experience, constitutes an essential basis in the rapidly developing statistical (probabilistic) theory of forecasts (Zeliaś 1997: 15).
A process of inferring the future understood in this way employs quantitative methods, especially mathematical and statistical methods as well as the concepts and analytical instruments of probability calculus. In addition to that, specifically constructed econometric models are used, which are based on the past patterns observed in the economy. The use of primarily mathematical and statistical tools of inference into the future enables forecast estimation based on a relatively objective method. Econometric forecast objectivity mainly results from the fact that the selection of the forecasting principle unambiguously determines the manner of forecast construction. Application of econometric methods prevents forecast “adjustment” depending on the subjective feelings or suggestions of the prediction participants.1 An econometric forecast—resulting from econometric prediction—is understood as only such a numerical estimate of the economic reality fragment considered, in the formulation of which information on the regularities or trends observed in the past is used. The starting point in econometric forecasting entails appropriate empirical econometric models describing economic systems or the components thereof.
1
Cf. Pawłowski (1973: 15). The voluntarism of the forecast builder and/or user is ipso facto eliminated.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 J. W. Wiśniewski, Forecasting from Multi-equation Econometric Micromodels, Contributions to Economics, https://doi.org/10.1007/978-3-031-27492-3_3
Rational control of an economic system requires recognition of its future behavior and the changes taking place within it. Econometric forecasts constitute an important component of rational economic process programming. They provide reasonably objective information on the future, by which they offer an additional rationale for decision making. The economic practice in Poland, however, makes insufficient use of statistical and econometric tools in forecast estimation. Decision making regarding future solutions has been excessively dominated by personal observation and faith in the decision-makers' intuition. The rarity with which scientific forecasting methods are applied is often detrimental to the effects of decisions.

Forecasts are intended to create new rationales, by providing new information in the process of decision making. Forecasting allows the anticipated directions and dynamics of the economic system to be taken into account. By providing information on important elements of the system's behavior, it also allows for early interference, in order to exert active influence on the course of processes. In this sense, forecasts can be of a warning character, since they signal ahead of time the negative economic and social consequences of continuing the previous trends and rules of system behavior.

Most commonly, forecasting is of an exploratory (research) nature. Exploratory forecasts entail numerical estimation of the future state of an economic object or system, based on the permanent cause-and-effect relationships characterizing the changes taking place. The future state of an object or economic system is treated as a consequence of the previous state, in combination with a set of hypotheses regarding the general conditions and individual factors of economic development. Normative forecasts are also frequently used. Their specificity entails estimation of the results to be achieved in the future (especially in a longer-term perspective).
In this way, development objectives are, so to speak, formulated. The cause-and-effect relationships, however, are considered from the future to the present. The sequence of the events that should occur is thus taken into consideration, as well as the tasks to be performed to achieve the end result established in the form of a forecast.

Most generally, forecasts can be divided into quantitative and qualitative ones. Quantitative forecasting refers to the numerical value of a specific random variable, whereas qualitative forecasting tells whether a certain random event will materialize a certain number of times during the forecast period. Quantitative forecasts can be made on a point or interval basis. Point-based forecasting involves selection of a single number which, based on certain premises derived from prediction theory, can be considered the best estimate of the forecast variable during the forecast period. Interval forecasting determines the numerical range to which an appropriately high (close to unity) probability of containing the true value of the forecast variable in period T corresponds.

Possible significant qualitative changes, arising from general socioeconomic policies, need to be taken into account as well. Analysis with the use of econometric tools encounters significant difficulties, or is even impossible, in such transitional states.
Econometric forecast accuracy is primarily determined by the precision with which the empirical model describes the economic system. Estimation of forecasts based on econometric models is all the more justified, the:
(a) shorter the prediction horizon, i.e., the time interval (t0; t0 + τ), where t0 is the current period, while t0 = n often is the last of the observation periods, and τ is the length of the prediction horizon determining the point in time for which the forecasts constructed are acceptable (reasonable, sensible);
(b) longer the period on which the empirical forecasting model has been based;
(c) slower the changes (evolutionary, not revolutionary) in the forecasted variables;
(d) more autonomous in nature, i.e., less dependent on strategic decisions, the forecasted variables (cf. Zeliaś 1997: 16).
3.2
Conditions for Econometric Forecast Estimation
Econometric forecasting of an economic system or a selected component thereof is legitimate, provided that certain necessary conditions, referred to as basic econometric prediction theory assumptions, are met (Pawłowski 1973: 38–45):
1. When forecasting involves a single economic variable, the empirical model describing the formation of that variable must be known. When an economic system is forecasted, the empirical econometric model of that system, whose objects are the interdependent variables described in individual equations of that model, must be known. It is essential to know the structural parameter and stochastic structure parameter estimates.
2. The mechanism of the links between the endogenous and explanatory variables is stable throughout the entire time segment, starting from the period from which the sample forming the basis for the model parameter estimation has been derived, up to and including the forecast period. When changes in structure do occur, they can be slow and regular. Such changes can be captured in the model by varying the structural parameters of its equation(s).
3. The random component distribution is stationary both in the period from which the sample has been drawn and in the forecast period. Changes can involve the type of the distribution or a modification of its parameters. If changes in the random component distribution do occur, they should be regular enough to be detected and extrapolated to the forecast period.
4. The values of the model equations' explanatory variables in the forecast period should be known. To meet this requirement, the variables playing a fundamental role in the manifestation of the regularities under study, as well as those whose values in the forecast period are precise enough, must be introduced into the econometric model used for the prediction. The explanatory variable values predicted for the forecast period T (T = t0 + 1, t0 + 2, ..., t0 + τ) can be determined:
(a) at a planned level, which allows inference about the effects of plan implementation;
(b) by using the existing forecasts of these variables;
(c) by determining trend models and then extrapolating the trends in the values of these variables. The trend values in the forecast period T are adopted as estimates of the explanatory variable values in the forecast period;
(d) by constructing a new model, in which exogenous variables act as endogenous variables. The new empirical model is used to estimate the exogenous variable values in the forecast period, and then to estimate forecasts of the endogenous variables representing elements of the economic system. This method yields positive results with a small number of exogenous variables. It fails, however, with a larger number of these variables, as it requires abundant statistical material, which is not always available (Zeliaś 1997: 129–130).
In practice, the explanatory variable values in the forecast periods are not known. In classical prediction theory, econometric forecasts are conditional in nature, in the sense that they depend on the explanatory variables reaching certain values. This is because the explanatory variable values in the forecast period may form at a level different from the one assumed at the forecast estimation. A significant discrepancy between the forecasted variable's materialization and the forecast estimated ought to be taken into account.
5. Substantively, extrapolation of the model beyond the range of variable volatility, as observed in the statistical sample used to estimate the model's parameters, is permissible. This assumption is intended to guard against mechanistic generalization of the regularities observed in the sample. Caution is necessary when extrapolating a model, especially when the number of observations in the sample was low or the area of explanatory variable volatility was small. In such cases, a risk of selecting a flawed analytical form for the equation arises, meaning that outside the volatility area examined, the endogenous variable's dependence on the explanatory variables can take a different form.

The main assumptions of the econometric forecasting theory are usually supplemented by two praxeological postulates (cf. Pawłowski 1973: 45). The first states that a prediction should result in both an adequate forecast and an assessment of its order of accuracy, in the form of an appropriate measure. The second indicates that when a forecast can be constructed in several manners, the best one, given the criterion selected (here: the measure of the order of forecast accuracy), should be selected.
3.3
Forecasts from Single-Equation Models
Let us assume that a single-equation linear econometric model of the following form is used for forecasting:
$$
y_t = \alpha_0 + \alpha_1 x_{t1} + \ldots + \alpha_j x_{tj} + \ldots + \alpha_k x_{tk} + \eta_t,
$$

in which $\eta_t$ is the pure random component with zero expected value. Depending on the structural parameter vector estimator applied, different predictors can be obtained.2 Suppose the above model's parameters were estimated using the ordinary least squares method (OLS); the forecast will then use an OLS predictor in the form of:

$$
y_{Tp} = \hat{\alpha}_0 + \hat{\alpha}_1 x_{T1} + \ldots + \hat{\alpha}_j x_{Tj} + \ldots + \hat{\alpha}_k x_{Tk},
\tag{3.1}
$$

where $\hat{\alpha}_0, \hat{\alpha}_1, \ldots, \hat{\alpha}_j, \ldots, \hat{\alpha}_k$ are the estimates of parameters $\alpha_0, \alpha_1, \ldots, \alpha_j, \ldots, \alpha_k$ obtained via OLS. Designation T denotes the forecast period, with T = n + 1, n + 2, ..., n + τ. Using the generalized least squares (Aitken's method) estimator, for instance, an Aitken's predictor can be obtained, and so on. In matrix terms, predictor (3.1) can be written as:

$$
y_{Tp} = X_T \hat{\alpha},
\tag{3.2}
$$

where $X_T = [1 \; x_{T1} \ldots x_{Tj} \ldots x_{Tk}]$ is the vector of explanatory variable values in the forecast period T, while the transposed vector of structural parameter estimates takes the form of $\hat{\alpha}^T = [\hat{\alpha}_0 \; \hat{\alpha}_1 \ldots \hat{\alpha}_j \ldots \hat{\alpha}_k]$.

Prediction variance for predictor (3.1), and thus for (3.2), is determined by the following formula:

$$
V_T^2 = \sigma^2 \left[ 1 + X_T (X^T X)^{-1} X_T^T \right],
\tag{3.3}
$$

where $V_T^2$ denotes the prediction variance of the forecast variable in period T, $X_T$ is the vector of the explanatory variable values in forecast period T, and $\sigma^2$ is the variance of the model's random component.3 Formula (3.3) can also be written alternatively:

$$
V_T^2 = \sigma^2 + X_T D^2(\hat{\alpha}) X_T^T,
\tag{3.4}
$$

where $D^2(\hat{\alpha})$ is the variance and covariance matrix of the model's structural parameter estimates obtained via OLS. It can be easily demonstrated that the following inequality holds:

$$
V_T^2 \geq \sigma^2.
\tag{3.5}
$$

2 Here, the predictor is the empirical function constituting the tool for forecast estimation.
3 In practice, residual variance $S_u^2$ is used as the estimator of $\sigma^2$.
This means that the forecast accuracy cannot be greater than that of the empirical model used in the prediction. The square root of the prediction variance is the mean prediction error, given as follows:

$$
V_T = \sqrt{V_T^2}.
\tag{3.6}
$$
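Formulas (3.1) to (3.6) translate directly into a short computation. The following sketch is an illustration only: the data are simulated, the function name `ols_forecast` is hypothetical, and the residual variance with n − (k + 1) degrees of freedom is assumed as the estimator of σ². It evaluates the prediction variance both ways, via (3.3) and via (3.4), so that their agreement and inequality (3.5) can be checked.

```python
import numpy as np

def ols_forecast(X, y, x_T):
    """OLS point forecast with its ex ante accuracy measures.

    X   : (n, k+1) matrix of explanatory variables, first column of ones
    y   : (n,) observations of the forecast variable
    x_T : (k+1,) explanatory variable values in forecast period T
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    alpha_hat = XtX_inv @ X.T @ y              # OLS parameter estimates
    resid = y - X @ alpha_hat
    s2 = resid @ resid / (n - p)               # residual variance (assumed estimator of sigma^2)
    y_Tp = x_T @ alpha_hat                     # predictor (3.2)
    V2 = s2 * (1.0 + x_T @ XtX_inv @ x_T)      # prediction variance, formula (3.3)
    D2 = s2 * XtX_inv                          # covariance matrix of the estimates
    V2_alt = s2 + x_T @ D2 @ x_T               # prediction variance, formula (3.4)
    return {"forecast": y_Tp, "s2": s2, "V2": V2,
            "V2_alt": V2_alt, "V": np.sqrt(V2)}  # "V" is (3.6)

# Simulated sample (hypothetical numbers, for illustration only).
rng = np.random.default_rng(1)
n = 30
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n), rng.uniform(0, 5, n)])
y = X @ np.array([2.0, 1.5, -0.7]) + rng.normal(scale=0.8, size=n)
x_T = np.array([1.0, 11.0, 2.5])

res = ols_forecast(X, y, x_T)
print(res)
```

Because $X_T (X^T X)^{-1} X_T^T$ is a non-negative quadratic form, `res["V2"]` cannot fall below `res["s2"]`, which is the content of (3.5).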
The mean prediction error is expressed in the units of the forecast variable $Y_T$. It enables assessment of the forecast accuracy in period T. The forecast accuracy required is defined by the forecast user, who determines the limiting prediction error $V_G$. If the following inequality holds:

$$
V_T \leq V_G,
\tag{3.7}
$$

the forecast is acceptable, because it meets the requirement of the user's desired precision. When:

$$
V_T > V_G,
\tag{3.8}
$$

the forecast is unacceptable, as it is too inaccurate from the perspective of the user's needs. Often, forecast users find it difficult to determine the value of the limiting prediction error $V_G$ for each of the forecast variables. It is easier to determine the relative limiting prediction error $V_G^{*}$, expressed as a percentage of the forecast value. A measure of forecast accuracy is then used, which is the relative prediction error, calculated via the following formula:

$$
V_T^{*} = \frac{V_T}{y_{Tp}} \cdot 100\ (\%).
\tag{3.9}
$$

Comparison of the relative mean prediction error with the limiting relative error of prediction facilitates appropriate decision making. In the event of:

$$
V_T^{*} \leq V_G^{*},
\tag{3.10}
$$

the forecast is considered acceptable; it is sufficiently accurate for the user's needs. If, however, the following inequality holds:

$$
V_T^{*} > V_G^{*},
\tag{3.11}
$$

the forecast is unacceptable, as it is not accurate enough from the user's perspective.4

4 An unacceptable forecast is not always a worthless forecast. If its accuracy deviates slightly from the user's requirements, it can be used as a "turn signal" for the forecast variable in question. It can allow the user to prepare for the expected direction of the forecast variable formation.
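The acceptance rule built on the relative prediction error, formulas (3.9) to (3.11), amounts to a one-line comparison. The sketch below assumes a positive forecast value; the function names and the numbers in the example are hypothetical.

```python
def relative_prediction_error(v_t: float, y_tp: float) -> float:
    """Relative prediction error in percent, formula (3.9)."""
    return v_t / y_tp * 100.0

def is_acceptable(v_t: float, y_tp: float, limit_pct: float) -> bool:
    """True when the relative error does not exceed the user's limit, (3.10)."""
    return relative_prediction_error(v_t, y_tp) <= limit_pct

# Hypothetical forecast of 80 units with a mean prediction error of 4 units.
rel_err = relative_prediction_error(4.0, 80.0)   # 5.0 percent
ok_loose = is_acceptable(4.0, 80.0, limit_pct=6.0)
ok_strict = is_acceptable(4.0, 80.0, limit_pct=4.0)
print(rel_err, ok_loose, ok_strict)
```

The same forecast is accepted under a 6% limit and rejected under a 4% limit, which shows that acceptability is a property of the forecast relative to the user's requirement, not of the forecast alone.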
3.4
Analysis of Econometric Forecast Accuracy
In addition to the use of the forecast accuracy measures enabling ex ante prediction assessment, it is essential to observe and record the manifestation of the forecast variable $y_T$. Knowledge of the forecast variable manifestation enables its comparison with the forecast. This means that expired forecasts5 can be examined via application of ex post measures of forecast accuracy. The difference between the forecast variable materialization in period T ($y_T$) and the forecast $y_{Tp}$, denoted as $\omega_T$, will be called the forecast error:

$$
\omega_T = y_T - y_{Tp}.
\tag{3.12}
$$

Even a single forecast error observation $\omega_T$ can necessitate intervention into the prediction results, for a grossly inaccurate forecast can result. This happens when the forecast error significantly exceeds the mean prediction error ($|\omega_T| > V_T$). Such an instance may imply emergence of a series of inaccurate forecasts in the future, often with forecast errors of the same sign. Emergence of a series of forecast errors with the same sign means that a sequence of underestimated or overestimated forecasts has been formed. A sequence of overestimated forecasts occurs when $\omega_T < 0$ in several consecutive forecast periods, whereas a sequence of underestimated forecasts emerges when inequality $\omega_T > 0$ is noted in at least three periods. Normally, proper reaction to such events should involve an adjustment of the predictor, by changing the set of the empirical model's explanatory variables and the equation's analytical form, as well as supplementing the model information with the data resulting from the forecast variable materialization.6

Valuable information on the accuracy of the forecasts estimated against the forecast variable materialization is provided by the mean forecast error $\delta_\upsilon$, calculated using the following formula:

$$
\delta_\upsilon = \sqrt{\frac{1}{\upsilon} \sum_{T = t_0 + 1}^{t_0 + \upsilon} \left( y_T - y_{Tp} \right)^2},
\tag{3.13}
$$

where T (T = t0 + 1, ..., t0 + υ) is the forecast period number, and υ is the number of expired forecasts. The mean forecast error tells us how much (on average) the forecast variable materializations differ from the forecasts estimated earlier. Of course, measure (3.13) can be calculated only after the forecast variable
5 An expired forecast is understood as a forecast for which the realization of the forecast variable $y_T$ is known.
6 A sequence of expired forecasts should be characterized by a variety of forecast error ($\omega_T$) signs and by error moduli ($|\omega_T|$) smaller than the mean prediction errors ($V_T$).
materializations have been obtained, i.e., having a sequence of υ expired forecasts. The smaller the value of $\delta_\upsilon$, the more accurate the expired forecasts. One interesting forecast accuracy measure is the Janus coefficient (J), proposed by A. Gadd and H. Wold, and calculated using the formula below:

$$
J = \frac{\dfrac{1}{\upsilon} \sum\limits_{T = n + 1}^{n + \upsilon} \left( y_T - y_{Tp} \right)^2}{\dfrac{1}{n} \sum\limits_{t = 1}^{n} \left( y_t - \hat{y}_t \right)^2},
\tag{3.14}
$$

where $\{\hat{y}_t\}$ is the sequence of the endogenous variable theoretical values in the sample on which the empirical equation parameter estimation was based, while n is the number of observations in the statistical sample. The remaining designations are the same as in formula (3.13). The Janus coefficient is thus the quotient of the mean squared forecast error and the mean squared equation residual in the sample. The econometric model can be used as a predictor in the forecasting process as long as the J coefficient is equal to unity or only slightly exceeds the value of 1. If, however, J significantly exceeds 1, it means that the predictor should be adjusted, using the latest statistical information.
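The ex post measures (3.12) to (3.14) are straightforward to compute once the realizations are known. The sketch below is illustrative only; the function names and the small data set are hypothetical and chosen so that the results can be verified by hand.

```python
import numpy as np

def forecast_errors(y_real, y_fcst):
    """Forecast errors omega_T = y_T - y_Tp, formula (3.12)."""
    return np.asarray(y_real, dtype=float) - np.asarray(y_fcst, dtype=float)

def mean_forecast_error(y_real, y_fcst):
    """Mean forecast error delta, formula (3.13)."""
    omega = forecast_errors(y_real, y_fcst)
    return np.sqrt(np.mean(omega ** 2))

def janus(y_real, y_fcst, y_sample, y_fitted):
    """Janus coefficient, formula (3.14): mean squared forecast error
    divided by the mean squared in-sample residual."""
    omega = forecast_errors(y_real, y_fcst)
    resid = np.asarray(y_sample, dtype=float) - np.asarray(y_fitted, dtype=float)
    return np.mean(omega ** 2) / np.mean(resid ** 2)

# Hypothetical expired forecasts and in-sample fit.
y_real = [10.0, 12.0, 11.0, 13.0]      # realizations in the forecast periods
y_fcst = [9.0, 13.0, 10.0, 12.0]       # forecasts estimated earlier
y_sample = [5.0, 6.0, 7.0, 8.0]        # observations used for estimation
y_fitted = [5.5, 5.5, 7.5, 7.5]        # theoretical values from the model

omega = forecast_errors(y_real, y_fcst)       # [ 1., -1.,  1.,  1.]
delta = mean_forecast_error(y_real, y_fcst)   # 1.0
J = janus(y_real, y_fcst, y_sample, y_fitted) # 1.0 / 0.25 = 4.0
print(omega, delta, J)
```

In this made-up case J = 4, well above 1, which under the rule stated above would suggest re-estimating the predictor with the latest data.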
3.5
Forecast Estimation from Multi-Equation Models
A multi-equation model for which the basic assumptions of econometric prediction theory have been satisfied can become a predictor. The procedure for forecast estimation from a multi-equation model can ultimately be made very similar to that used to construct forecasts from single-equation models. The prediction technique is different for each class of multi-equation models. Nonetheless, similarities can be found in the manner of arriving at a forecast, regardless of the multi-equation model class.

Irrespective of the multi-equation model class, the key issue in econometric prediction is to determine the exogenous variable values in each forecast period. This requirement does not apply to lagged endogenous variables. The values of endogenous variables lagged by 1 period are known for the first of the forecast periods T = t0 + 1, as $y_{gn}$ (g = 1, ..., G) quantities, for n = t0. In subsequent forecast periods T = t0 + 2, T = t0 + 3, ..., T = t0 + τ, they are obtained by sequential reference to the endogenous variable forecasts estimated already. This procedure is called sequential prediction.

The second common feature of prediction from different classes of multi-equation models is the possibility of forecasting from each equation separately. The procedure can thus be reduced to that carried out for simultaneous prediction from G single-equation models.
Positive results of simple model verification allow for, based on the status quo principle, an extrapolation of the model beyond the statistical sample. As such, a vector forecast can be obtained, for a vector of interdependent variables:

$$
Y_{Tp} = \left[ y_{1T}^{(p)} \ldots y_{gT}^{(p)} \ldots y_{GT}^{(p)} \right],
\tag{3.15}
$$
where the forecasts for individual forecast variables are the components of vector $Y_{Tp}$. Forecasts in the form of the above vector are obtained by inserting the values of explanatory variables in the forecast period T into the empirical multi-equation model. Vector $Y_{Tp}$ is therefore formed by collating the G forecasts, which have been formed independently from each equation separately. Forecast estimation from each equation is carried out identically to forecasting from a single-equation model. Prediction from a simple model therefore entails a G-fold prediction from a single-equation model.

In a recursive model, each equation can be considered separately, just as in a single-equation model. Prediction from individual equations must be carried out in proper order, however. Such a procedure is called chain prediction. In recursive models, individual endogenous variables are numbered according to their causal ordering. In such ordering, variable $y_{it}$ is dependent on the predetermined variables and on only such interdependent variables $y_{lt}$ for which indices l and i satisfy inequality l < i. Chain prediction in such a case consists in the construction of forecasts for individual components of the forecast vector $Y_{Tp}$, carried out in a recursive manner, in accordance with the variable ordering reflected by the model. One peculiar feature of chain prediction is that, if the model shows that the forecast variable $Y_{iT}$ is dependent on any other contemporaneous variable $Y_{1T}, Y_{2T}, \ldots, Y_{lT}$, with l < i, then forecasts $y_{1T}^{(p)}, y_{2T}^{(p)}, \ldots, y_{lT}^{(p)}$, relating to those prior in the chain of interdependent variables, are used to estimate forecast $y_{iT}^{(p)}$.

If the forecast period does not immediately follow period t0, but is h > 1 time units distant from it, the chain prediction consists in h-fold repetition of the above procedure. This way, h prediction vectors are obtained for successive periods.
A {Y pT } sequence of these vectors determines the expected paths of individual forecast variables. As such, it signals the expected path of arriving at the quantities predicted for the last forecast period. It can thus be concluded that at h > 1, chain prediction for each of the forecast variables generates a sequence of forecasts for subsequent periods, which indicates a sequential prediction. The combination of chain prediction and sequential prediction results in multiple vectors of forecasts, which can be written in the form of an appropriate forecast matrix. When performing chain prediction, it is useful to determine the correlation coefficient matrix of the random components from individual model equations:
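$$
\rho =
\begin{bmatrix}
1 & \cdots & \rho_{1g} & \cdots & \rho_{1G} \\
\vdots & \ddots & \vdots & & \vdots \\
\rho_{g1} & \cdots & 1 & \cdots & \rho_{gG} \\
\vdots & & \vdots & \ddots & \vdots \\
\rho_{G1} & \cdots & \rho_{Gg} & \cdots & 1
\end{bmatrix}.
$$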
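As a hedged illustration of how such a matrix might be estimated in practice, the following Python sketch computes the correlation matrix of equation residuals with NumPy. The residual values and the function name `residual_correlation` are hypothetical and serve only as an example; Python is used here purely for illustration and is not the software used in the monograph.

```python
import numpy as np


def residual_correlation(residuals):
    """Return the G x G matrix of correlations between equation residuals.

    residuals: array of shape (n, G); column g holds the residuals
    of the g-th model equation.
    """
    residuals = np.asarray(residuals, dtype=float)
    # rowvar=False: each column (not each row) is treated as one variable.
    return np.corrcoef(residuals, rowvar=False)


# Hypothetical residuals of G = 3 equations over n = 6 periods.
u = np.array([
    [0.5, 0.4, -0.6],
    [-0.3, -0.2, 0.2],
    [0.8, 0.9, -0.7],
    [-0.6, -0.7, 0.5],
    [0.1, 0.0, -0.1],
    [-0.5, -0.4, 0.7],
])

rho_hat = residual_correlation(u)
print(np.round(rho_hat, 3))
```

In this made-up data set, the residuals of equations 1 and 2 mostly share the same sign, while those of equations 1 and 3 mostly differ in sign, so the corresponding off-diagonal estimates are strongly positive and strongly negative, respectively.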
Matrix ρ, of dimensions G × G, contains elements $\rho_{gg'}$, which are the coefficients of linear correlation between the random components of the g-th and g′-th equations, with g, g′ = 1, ..., G and g ≠ g′. In practice, the correlation coefficients $\rho_{gg'}$ are unknown, and their estimates $\hat{\rho}_{gg'}$ are obtained from the residuals of the model equations. Correlation coefficients that are very small in absolute value suggest that individual equations are independent of one another. If $\hat{\rho}_{gg'}$ is close to +1, it can be inferred that the residuals of the g-th and g′-th equations tended to take values of the same sign simultaneously. Conversely, when $\hat{\rho}_{gg'}$ is close to -1, it can be assumed that, as a rule, the signs of the residuals of equations g and g′ differed.

Consider a prediction technique for the following recursive model:

$$
\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \alpha_{14} t + \alpha_{16} y_{3,t-1} + \eta_{1t}, \\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \beta_{21} y_{1t} + \alpha_{25} y_{2,t-1} + \eta_{2t}, \\
y_{3t} &= \alpha_{30} + \alpha_{34} t + \beta_{31} y_{1t} + \beta_{32} y_{2t} + \alpha_{35} y_{2,t-1} + \eta_{3t}, \\
y_{4t} &= \alpha_{40} + \alpha_{43} x_{t3} + \beta_{43} y_{3t} + \alpha_{46} y_{3,t-1} + \eta_{4t}.
\end{aligned}
$$

The OLS predictor for the above model is written as follows:
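$$
\begin{bmatrix}
1 & 0 & 0 & 0 \\
-b_{21} & 1 & 0 & 0 \\
-b_{31} & -b_{32} & 1 & 0 \\
0 & 0 & -b_{43} & 1
\end{bmatrix}
\begin{bmatrix}
y_{1T}^{(p)} \\ y_{2T}^{(p)} \\ y_{3T}^{(p)} \\ y_{4T}^{(p)}
\end{bmatrix}
+
\begin{bmatrix}
-a_{10} & -a_{11} & 0 & 0 & -a_{14} & 0 & -a_{16} \\
-a_{20} & 0 & -a_{22} & 0 & 0 & -a_{25} & 0 \\
-a_{30} & 0 & 0 & 0 & -a_{34} & -a_{35} & 0 \\
-a_{40} & 0 & 0 & -a_{43} & 0 & 0 & -a_{46}
\end{bmatrix}
\begin{bmatrix}
x_{T0} \\ x_{T1} \\ x_{T2} \\ x_{T3} \\ T \\ y_{2,T-1} \\ y_{3,T-1}
\end{bmatrix}
=
\begin{bmatrix}
0 \\ 0 \\ 0 \\ 0
\end{bmatrix}.
\tag{3.16}
$$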
Construction of the forecasts should begin with the so-called initial equation, which in model (3.16) is the first equation. In recursive models, any equation in which the explanatory variables are exclusively predetermined variables can become an initial equation.⁷ Estimating forecast $y_{1T}^{(p)}$ requires the value of the exogenous variable $x_{T1}$ in the forecast period T to be determined. The time variable takes the value T in the forecast period. The lagged variable $y_{3,T-1}$ is given as $y_{3n}$ from the set of observations when T = n + 1; when T > n + 1, the forecast $y_{3,T-1}^{(p)}$, estimated earlier on, is used. Forecast $y_{2T}^{(p)}$ is estimated by a procedure analogous to that for equation one, except that the previously estimated forecast $y_{1T}^{(p)}$ must be used. Continuing the chain procedure, we ultimately arrive at equation four, having first made the prediction from equation three, which yields forecast $y_{3T}^{(p)}$ on the basis of forecasts $y_{1T}^{(p)}$ and $y_{2T}^{(p)}$. Forecast $y_{4T}^{(p)}$ is then obtained using forecast $y_{3T}^{(p)}$, estimated earlier on. As a result, a vector forecast of type (3.15) is obtained, in the following form:

$$
Y_T^p = \begin{bmatrix} y_{1T}^{(p)} & y_{2T}^{(p)} & y_{3T}^{(p)} & y_{4T}^{(p)} \end{bmatrix},
\tag{3.17}
$$
where each component of vector $Y_T^p$ has been calculated identically to forecasting from a single-equation model. Each time, mean prediction errors are calculated as well, using Eq. (3.3). This way, vector $V_T$, containing the individual mean prediction errors, is obtained:

$$
V_T = \begin{bmatrix} V_{1T} & V_{2T} & V_{3T} & V_{4T} \end{bmatrix}.
\tag{3.18}
$$
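The chain procedure for the recursive model, combined with sequential prediction over several periods, can be sketched in Python as follows. This is a minimal illustration only: all coefficient values, the values of the exogenous variables, and the function names `chain_forecast` and `sequential_chain_forecast` are hypothetical assumptions, not estimates from the monograph. The last lines cross-check the first forecast vector against the matrix form (3.16).

```python
import numpy as np

# Hypothetical OLS estimates for the four-equation recursive model.
a10, a11, a14, a16 = 2.0, 0.5, 0.10, 0.30
a20, a22, b21, a25 = 1.0, 0.4, 0.60, 0.20
a30, a34, b31, b32, a35 = 0.5, 0.05, 0.30, 0.40, 0.10
a40, a43, b43, a46 = 1.5, 0.7, 0.50, 0.20


def chain_forecast(x1, x2, x3, T, y2_lag, y3_lag):
    """One chain prediction step: equations are evaluated in causal order."""
    y1 = a10 + a11 * x1 + a14 * T + a16 * y3_lag             # initial equation
    y2 = a20 + a22 * x2 + b21 * y1 + a25 * y2_lag            # uses forecast y1
    y3 = a30 + a34 * T + b31 * y1 + b32 * y2 + a35 * y2_lag  # uses y1 and y2
    y4 = a40 + a43 * x3 + b43 * y3 + a46 * y3_lag            # uses forecast y3
    return np.array([y1, y2, y3, y4])


def sequential_chain_forecast(x_future, n, y2_n, y3_n):
    """Repeat the chain step for T = n+1, ..., n+h; return an h x 4 matrix."""
    forecasts = []
    y2_lag, y3_lag = y2_n, y3_n          # observed values feed period n + 1
    for step, (x1, x2, x3) in enumerate(x_future, start=1):
        y = chain_forecast(x1, x2, x3, n + step, y2_lag, y3_lag)
        forecasts.append(y)
        y2_lag, y3_lag = y[1], y[2]      # forecasts replace lagged observations
    return np.vstack(forecasts)


# Hypothetical inputs: n = 20 observations, h = 3 forecast periods.
x_future = [(10.0, 8.0, 5.0), (10.5, 8.2, 5.1), (11.0, 8.4, 5.2)]
forecast_matrix = sequential_chain_forecast(x_future, n=20, y2_n=12.0, y3_n=15.0)

# Cross-check of the first period against the matrix form B y + A x = 0.
B = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [-b21, 1.0, 0.0, 0.0],
    [-b31, -b32, 1.0, 0.0],
    [0.0, 0.0, -b43, 1.0],
])
A = -np.array([
    [a10, a11, 0.0, 0.0, a14, 0.0, a16],
    [a20, 0.0, a22, 0.0, 0.0, a25, 0.0],
    [a30, 0.0, 0.0, 0.0, a34, a35, 0.0],
    [a40, 0.0, 0.0, a43, 0.0, 0.0, a46],
])
x_T = np.array([1.0, 10.0, 8.0, 5.0, 21.0, 12.0, 15.0])
y_matrix_form = np.linalg.solve(B, -A @ x_T)
```

Each row of `forecast_matrix` is one forecast vector of type (3.17), so the rows taken together form the forecast matrix produced by combining chain and sequential prediction.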
⁷ This means that in a recursive model, there can be more than one initial (starting) equation. The starting equation has the character of a simple-model equation, just like a detached equation in a system of interdependent equations.

Prediction from a system of interdependent equations can be carried out in two ways. In the first, the model's structural form equations are used, whereas in the second, inference into the future is based on the reduced form equations. These methods cannot be used interchangeably; the applicability of each depends on the type of questions to be answered in the course of the inference into the future (cf. Pawłowski 1973: 259–265).

Structural equations can be used when the existence of causal relationships between stochastic interdependent variables is disregarded, as well as when only the effect of a one-sided dependence of these variables is to be estimated. The procedure is then similar to that used for simple equations. The values of the endogenous variables acting as explanatory variables in the equations are determined for the forecast period T by the same methods as those used for exogenous variables. Forecasting from structural form equations, which takes only one side of the multilateral linkages between interdependent variables into account, is thus characterized by inference
into the future for very short periods exclusively, since the other sides of the interdependent variables' dependencies can be disregarded in very short time intervals only. In longer intervals, endogenous variable interdependencies play an important role, and omitting them can distort the meaning and results of the forecasting investigation.

For the above reason, the second manner of inferring the future, based on reduced form model equations, is of greater practical importance. In this method, a forecast can be treated as a conditional expected value, with the predetermined variables occurring in the condition. The forecasting is carried out on the basis of each of the reduced form equations individually. The procedure here is identical to that for a simple model, since the reduced form has the nature of a simple model. If the parameters of the reduced form equations have been estimated directly, then the variances and covariances of the parameter estimates for each equation of this form are known. Prediction variances for each equation can then be determined quite easily. When the reduced form has been derived from the empirical structural form, however, this task becomes more difficult. It is worth noting that reduced form equations, each of which contains all the predetermined variables, normally include a large number of statistically insignificant explanatory variables. This usually inflates the mean prediction errors calculated from the reduced form. It is thus worthwhile to determine the mean prediction errors of forecasts obtained from the reduced form of a system of interdependent equations on the basis of the variance and covariance matrix of the structural parameter estimates derived from the structural form equations. Forecasting from a model's reduced form equations has, in a sense, optimality properties, provided that an appropriate estimation method has been used to estimate the parameters.
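To make the link between the two forms explicit, it can be restated in matrix notation. The symbols B, A, $X_T$ and C below are assumptions introduced here for illustration, patterned on the layout of predictor (3.16): B collects the coefficients of the interdependent variables, A those of the predetermined variables, and B is assumed to be non-singular.

$$
B\,Y_T^p + A\,X_T = 0
\quad\Longrightarrow\quad
Y_T^p = -B^{-1} A\, X_T = C\, X_T,
\qquad C = -B^{-1} A .
$$

Each row of C generally has non-zero entries for all predetermined variables, which is consistent with the remark that every reduced form equation contains all of them. Estimating C directly by OLS and deriving it from structural estimates need not, in general, give identical numbers.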
Prediction from the reduced form is optimal in that it yields smaller mean prediction errors than other methods using the same information resource (cf. Pawłowski 1973: 254). Consider prediction from a system of interdependent equations, based on the following model:

$$
\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \alpha_{14} t + \alpha_{15} y_{3,t-1} + \eta_{1t}, \\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \alpha_{24} t + \beta_{23} y_{3t} + \eta_{2t}, \\
y_{3t} &= \alpha_{30} + \alpha_{33} x_{t3} + \beta_{32} y_{2t} + \eta_{3t}.
\end{aligned}
$$

Forecasting from equation one of the above system, for period T = n + 1, can be carried out independently of the other equations, since it is a detached equation. Forecasts $y_{2T}^{(p)}$ and $y_{3T}^{(p)}$ should be estimated on the basis of the reduced form predictor:

$$
\begin{aligned}
y_{2T}^{(p)} &= \hat{c}_{20} + \hat{c}_{21} x_{T1} + \hat{c}_{22} x_{T2} + \hat{c}_{23} x_{T3} + \hat{c}_{24} T + \hat{c}_{25} y_{3,T-1}, \\
y_{3T}^{(p)} &= \hat{c}_{30} + \hat{c}_{31} x_{T1} + \hat{c}_{32} x_{T2} + \hat{c}_{33} x_{T3} + \hat{c}_{34} T + \hat{c}_{35} y_{3,T-1},
\end{aligned}
$$
where $\hat{c}_{gj}$ (g = 2, 3; j = 0, 1, ..., 5) denote the parameter estimates of the second and third reduced form equations, obtained using OLS. The following will act as the predictor for the first interdependent variable:

$$
y_{1T}^{(p)} = a_{10} + a_{11} x_{T1} + a_{14} T + a_{15} y_{3,T-1},
$$

where $a_{10}, a_{11}, a_{14}, a_{15}$ denote the equation's structural parameter estimates obtained via OLS. Note that it becomes essential to use sequential prediction in successive forecast periods (T = n + 2, n + 3, ..., n + τ). The lagged variable $y_{3,T-1}$, occurring in each equation of the predictor under consideration, makes it necessary to estimate the forecasts of the third interdependent variable first. This allows forecast $y_{3,T-1}^{(p)}$ to be used as an explanatory variable in each equation of the predictor $Y_T^p$ in the subsequent periods.

Forecasts from a system of interdependent equations can also be estimated partly from reduced form equations and partly from structural form equations. Let us consider the following system of equations:

$$
\begin{aligned}
y_{1t} &= \alpha_{10} + \alpha_{11} x_{t1} + \alpha_{14} t + \alpha_{16} y_{3,t-1} + \beta_{14} y_{4t} + \eta_{1t}, \\
y_{2t} &= \alpha_{20} + \alpha_{22} x_{t2} + \beta_{21} y_{1t} + \alpha_{25} y_{2,t-1} + \eta_{2t}, \\
y_{3t} &= \alpha_{30} + \alpha_{34} t + \beta_{31} y_{1t} + \beta_{32} y_{2t} + \alpha_{35} y_{2,t-1} + \eta_{3t}, \\
y_{4t} &= \alpha_{40} + \alpha_{43} x_{t3} + \beta_{43} y_{3t} + \alpha_{46} y_{3,t-1} + \eta_{4t}.
\end{aligned}
\tag{3.19}
$$

Closed-loop linkages between the interdependent variables can be noted:

$$
\begin{array}{ccc}
y_{1t} & \rightarrow & y_{2t} \\
\uparrow & & \downarrow \\
y_{4t} & \leftarrow & y_{3t}
\end{array}
\tag{3.20}
$$
which indicates a system of interdependent equations. Prediction from the above model can be carried out using a mixed technique: partly from the reduced form and partly from the structural form, using the chain prediction technique specific to recursive models. Forecast estimation with the use of a structural form predictor in the following form:

$$
\begin{aligned}
y_{1T}^{(p)} &= a_{10} + a_{11} x_{T1} + a_{14} T + a_{16} y_{3,T-1} + b_{14} y_{4T}^{(p)}, \\
y_{2T}^{(p)} &= a_{20} + a_{22} x_{T2} + b_{21} y_{1T}^{(p)} + a_{25} y_{2,T-1}, \\
y_{3T}^{(p)} &= a_{30} + a_{34} T + b_{31} y_{1T}^{(p)} + b_{32} y_{2T}^{(p)} + a_{35} y_{2,T-1}, \\
y_{4T}^{(p)} &= a_{40} + a_{43} x_{T3} + b_{43} y_{3T}^{(p)} + a_{46} y_{3,T-1},
\end{aligned}
\tag{3.21}
$$

is not immediately possible. The obstacle here is the lack of an initial equation, resulting from the closed-loop linkage (3.20). That loop can be eliminated by using a reduced form equation to estimate forecast $y_{1T}^{(p)}$:

$$
y_{1T}^{(p)} = \hat{c}_{10} + \hat{c}_{11} x_{T1} + \hat{c}_{12} x_{T2} + \hat{c}_{13} x_{T3} + \hat{c}_{14} T + \hat{c}_{15} y_{2,T-1} + \hat{c}_{16} y_{3,T-1}.
\tag{3.22}
$$
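The mixed technique can be sketched in Python as follows. The sketch is illustrative only: all coefficient values are hypothetical, and the reduced form coefficients are assumed here to be derived from the structural estimates as $C = -B^{-1}A$, whereas the text estimates them directly by OLS. Under that assumption, breaking the loop with the reduced form equation for $y_{1T}^{(p)}$ and then applying chain prediction reproduces the full reduced form solution.

```python
import numpy as np

# Hypothetical structural estimates for system (3.19).
a10, a11, a14, a16, b14 = 2.0, 0.5, 0.10, 0.30, 0.20
a20, a22, b21, a25 = 1.0, 0.4, 0.60, 0.20
a30, a34, b31, b32, a35 = 0.5, 0.05, 0.30, 0.40, 0.10
a40, a43, b43, a46 = 1.5, 0.7, 0.50, 0.20

# Structural form written as B y + A x = 0, with
# x = (1, x1, x2, x3, T, y2_lag, y3_lag).
B = np.array([
    [1.0, 0.0, 0.0, -b14],      # y1 depends on y4: this closes the loop
    [-b21, 1.0, 0.0, 0.0],
    [-b31, -b32, 1.0, 0.0],
    [0.0, 0.0, -b43, 1.0],
])
A = -np.array([
    [a10, a11, 0.0, 0.0, a14, 0.0, a16],
    [a20, 0.0, a22, 0.0, 0.0, a25, 0.0],
    [a30, 0.0, 0.0, 0.0, a34, a35, 0.0],
    [a40, 0.0, 0.0, a43, 0.0, 0.0, a46],
])

# ASSUMPTION: reduced form coefficients derived from the structural form,
# C = -B^{-1} A, instead of being estimated directly by OLS.
C = -np.linalg.solve(B, A)


def mixed_forecast(x1, x2, x3, T, y2_lag, y3_lag):
    """Reduced form for y1, then chain prediction for y2, y3 and y4."""
    x = np.array([1.0, x1, x2, x3, T, y2_lag, y3_lag])
    y1 = C[0] @ x                                            # type (3.22)
    y2 = a20 + a22 * x2 + b21 * y1 + a25 * y2_lag
    y3 = a30 + a34 * T + b31 * y1 + b32 * y2 + a35 * y2_lag
    y4 = a40 + a43 * x3 + b43 * y3 + a46 * y3_lag
    return np.array([y1, y2, y3, y4])


# Hypothetical forecast-period inputs for T = 21.
y_mixed = mixed_forecast(10.0, 8.0, 5.0, 21, 12.0, 15.0)
y_full = C @ np.array([1.0, 10.0, 8.0, 5.0, 21.0, 12.0, 15.0])
```

With directly estimated reduced form coefficients, the two results would generally differ slightly, which is one reason to treat this sketch as a consistency check rather than as the estimation procedure itself.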
Given forecast $y_{1T}^{(p)}$, chain prediction can be applied to the subsequent equations of the structural form predictor. As such, forecast $y_{2T}^{(p)}$ can be estimated from the following equation:

$$
y_{2T}^{(p)} = a_{20} + a_{22} x_{T2} + b_{21} y_{1T}^{(p)} + a_{25} y_{2,T-1}.
$$

Having forecasts $y_{1T}^{(p)}$ and $y_{2T}^{(p)}$, forecast $y_{3T}^{(p)}$ can be estimated from the equation:

$$
y_{3T}^{(p)} = a_{30} + a_{34} T + b_{31} y_{1T}^{(p)} + b_{32} y_{2T}^{(p)} + a_{35} y_{2,T-1}.
$$
Knowing forecast $y_{3T}^{(p)}$, in turn, forecast $y_{4T}^{(p)}$ can be estimated from the equation:

$$
y_{4T}^{(p)} = a_{40} + a_{43} x_{T3} + b_{43} y_{3T}^{(p)} + a_{46} y_{3,T-1}.
$$

The proposed prediction technique for successive forecast periods T must proceed sequentially, owing to the occurrence of the lagged endogenous variables $y_{2,T-1}^{(p)}$ and $y_{3,T-1}^{(p)}$. Ultimately, forecasting from a system of interdependent equations can integrate prediction from reduced form equations with sequential and chain prediction.

Recent studies show that forecasting from a system of interdependent equations can begin with any equation (cf. Wiśniewski 2019: 123–147). Forecast values from the reduced form are not required either. In the starting iteration, the values of the initial equation forecasts can simply be assumed, since performing subsequent iterations leads to forecast auto-synchronization in a model with a feedback mechanism or a model with closed-loop linkages between the interdependent variables.

In this monograph, individual model equations for quarterly or monthly data will take the following structure:

$$
y_{gt} = \sum_{g'=1}^{G} \beta_{g,g'}\, y_{g't}
+ \sum_{j=0}^{k} \alpha_{gj}\, x_{tj}
+ \sum_{i=1}^{m} \alpha_{gi}\, y_{g,t-i}
+ \sum_{l=1}^{m} \sum_{j=1}^{k} \lambda_{g,t-l,j}\, y_{g,t-l}\, x_{tj}
+ \alpha_{g,k+1}\, t
+ \sum_{h=1}^{m-1} \gamma_{gh}\, dq_h
+ \eta_{gt}
\tag{3.23}
$$

where:
$y_{gt}$: observations of the endogenous variable numbered g (g = 1, ..., G; t = 1, ..., n),
$x_{tj}$: observations of the exogenous variables,
t: time variable,
$dq_1, \ldots, dq_m$: dummy (zero-one) variables, taking the value of 1 in the period distinguished and 0 in the other periods, with m = 4 for quarterly data and m = 12 for monthly data,
$\alpha_{gj}, \beta_{g,g'}, \alpha_{gi}, \lambda_{g,t-l,j}, \gamma_{gh}$ (j = 0, 1, ..., k; i = 1, ..., m; l = 1, ..., m; h = 1, ..., m - 1): the equation's structural parameters,
$\alpha_{g,k+1}$: trend parameter,
$\eta_{gt}$: the equation's random component.

Autoregression, a linear trend, periodic variations, and interactions of lagged endogenous and exogenous variables will be taken into account in each equation.
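How the explanatory variables of a single equation of type (3.23) could be assembled from raw series is sketched below in Python. The function name `build_regressors`, the simulated data, and the column layout are hypothetical. The sketch assumes that the interaction terms are products of lagged values of the endogenous variable and current values of the exogenous variables, and that the first observation falls in the first season of the cycle. Contemporaneous values of other endogenous variables are left out for brevity.

```python
import numpy as np


def build_regressors(y, X, m):
    """Assemble the regressors of one equation of type (3.23).

    y : (n,) series of the equation's endogenous variable
    X : (n, k) exogenous variables, without the constant
    m : periods per year (4 for quarterly, 12 for monthly data)
    Returns the regressor matrix and the matching values of y.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    rows = np.arange(m, n)                 # first m observations are lost to lags
    cols = [np.ones(n - m)]                # constant term
    cols += [X[rows, j] for j in range(k)]             # exogenous variables
    lags = [y[rows - i] for i in range(1, m + 1)]      # y_{t-1}, ..., y_{t-m}
    cols += lags
    # ASSUMED form of the interaction terms: y_{t-l} * x_{tj}.
    cols += [lag * X[rows, j] for lag in lags for j in range(k)]
    cols.append(rows + 1.0)                # linear trend, t = m + 1, ..., n
    # m - 1 seasonal dummies; the last season of the cycle is the reference.
    season = rows % m
    cols += [(season == h).astype(float) for h in range(m - 1)]
    return np.column_stack(cols), y[rows]


# Hypothetical quarterly data: n = 24 observations, k = 2 exogenous variables.
rng = np.random.default_rng(0)
y = rng.normal(size=24)
X = rng.normal(size=(24, 2))
Z, target = build_regressors(y, X, m=4)
```

With m = 4 and k = 2, this layout already produces 19 columns for 20 usable observations, so with short samples some selection of variables would presumably be needed before estimation.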
References
Pawłowski Z (1973) Prognozy ekonometryczne. PWN, Warszawa
Wiśniewski JW (2019) Autosynchronizacja prognoz w mikromodelu ekonometrycznym o zamkniętym cyklu powiązań. In: Batóg B (ed) Mikroekonometria. Teoria i praktyka. DIFIN, Warszawa, pp 123–147
Zeliaś A (1997) Teoria prognozy. PWE, Warszawa
Chapter 4
Forecasting from Simple Econometric Micromodels

4.1 Forecasts from an Enterprise Cost Micromodel
Econometric cost models are not among the constructs commonly presented in the economic literature. In Poland, the first such studies were published at the end of the 1960s. The pioneer of this research was A. Barczak (1968, 1971; Barczak and Dziembała 1969). This subject matter was also taken up by Z. Pawłowski (1976, ch. 8). In the twenty-first century, more studies dealing with econometric cost analysis emerged (Wiśniewski 2010; Juszczyk et al. 2019). In this work, the purpose of building such an empirical model is to construct econometric forecasts of enterprise costs. Knowledge of cost forecasts can support enterprise management (Dobrodolac 2011) and thus improve its efficiency.

We shall consider a system of simple equations describing three groups of enterprise costs:
• cost of production (COSTPR),
• cost of sales (COSTSAL), and
• cost of management (COSTMAN).

The individual equations are autonomous in nature. None of the cost groups impacts another group of costs. In each of the empirical equations, autoregression up to and including the fourth order, a trend, seasonal variations, and the impact of sales revenue on each type of cost have been taken into account. The empirical equations of each cost group are presented in Tables 4.1, 4.4 and 4.6 as well as in Figs. 4.1, 4.3 and 4.5.

¹ The calculations performed for the purpose of this monograph were carried out primarily using the GRETL package.

The production cost (COSTPR) equation describes that variable with high accuracy.¹ The R² coefficient exceeds 0.9 (R² = 0.917). Only causal determinants occur here, which is manifested by the significant effect of
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 J. W. Wiśniewski, Forecasting from Multi-equation Econometric Micromodels, Contributions to Economics, https://doi.org/10.1007/978-3-031-27492-3_4
Table 4.1 Empirical equation of production costs (COSTPR), observations 2006:4–2019:2 (N = 51) Variable const SALES SALES_2 SALES_3 COSTPR_2 Mean dependent var. Sum squared resid. R-squared F(4, 46) Log likelihood Schwarz criterion Autocorrel. coeff. (rho1)
Coefficient 967.53 0.659361 0.217571 -0.082353 -0.314989 32,128.29 2.86e+08 0.916729 126.6041 -468.6607 956.9806 -0.213264
Std. Error t-Statistic 2033.12 0.4759 0.0303517 21.7240 0.0892961 2.4365 0.0360354 -2.2853 0.13507 -2.3320 S.D. dependent var. S.E. of resid. Adjusted R-squared Prob(F-statistic) Akaike info criterion Hannan-Quinn criterion Durbin-Watson Stat.
Prob. p 0.6364