Jau-Lian Jeng

Contemporaneous Event Studies in Corporate Finance: Methods, Critiques and Robust Alternative Approaches
Jau-Lian Jeng School of Business and Management Azusa Pacific University Los Angeles, CA, USA
ISBN 978-3-030-53808-8    ISBN 978-3-030-53809-5 (eBook)
https://doi.org/10.1007/978-3-030-53809-5
Preface
The application of event study methodology to empirical corporate finance raises questions about the robustness of the methods when they meet actual data. Skepticism about their applicability (given the potential for data manipulation), including the selection of relevant events, the choice of event periods, and the robust statistics required, all demand further analysis before empirical results can be justified. In presenting possible alternatives in my first book, Analyzing Event Statistics in Corporate Finance, the essential ambiguity of conventional methods as currently applied (especially the impact of event windows) was briefly surveyed. The alternative methods proposed there, however, require more rigorous theoretical development (in statistics) and demonstrations of empirical applicability. These concerns motivate an additional book encompassing the necessary elaborations and developments. The present book can serve as a reference on event study methodology for issues of interest in economics, finance, and management.
Given the technical material involved, readers are advised to have some background in mathematical statistics, econometrics, and/or probability theory. An understanding of diffusion processes and the asymptotic theory of mathematical statistics is certainly an advantage. Although the book is mainly concerned with applications of event study methodology in social science and business management, the technical elaborations, with mathematical reasoning in stochastic processes and econometrics, are essential for rigor in the various arguments.

Los Angeles, USA
Jau-Lian Jeng
Acknowledgments

The writing of this book would not have been feasible without the assistance of the editorial board at Palgrave Macmillan, the fruitful reviews of Dr. Hou and Dr. Phillips, and the spiritual support of my family.
Appreciation
In particular, this book was completed during my recovery from a stroke, and my wife's wholehearted support is cordially appreciated. During the editing period my mom passed away from stage 4 lung cancer, and my dad passed away later on, while the Coronavirus consumed the global economy entirely. Yet the project continued. Many thanks are given to the One who said "I am!" for granting me the strength and the healing to carry on. I hope this work will commemorate my parents for their hardship in supporting my education, and that it provides a foundation for anyone interested in event studies in corporate finance or elsewhere who may benefit from the discussions.
Introduction
The purpose of event studies in corporate finance is to identify the possible impacts of corporate finance events and to propose plausible explanations or hypotheses for their causes from a theoretical background. The initial contribution of Fama et al. (1969) provides the pioneering analysis for event studies via market efficiency, considering the adjustment of stock prices to new information. Over the years, a huge number of publications applying event study methodology can be found in journals of (say) accounting research, financial economics, and corporate finance. However, few publications and books survey and analyze the methodology while suggesting improvements over the limitations of these methods in empirical applications. In particular, with recent concerns about systemic risk, in which a single corporate event may persist strongly enough to impact the entire capital market, it is necessary to ask whether the essence or impact of some corporate events may be serious enough to produce turmoil in the capital market. (Notice that the definition of systemic risk stated here emphasizes the persistence and prevalence of a single corporate event that may affect the entire capital market or system. This differs from the classical definition of systematic risk, which is nondiversifiable from the perspective of well-diversified portfolios consisting of a large number of securities in the capital market.) For instance, the study by Acharya et al. (2016) considers a possible model and measure for systemic risk, where the difficulty of financial institutions during 2007–2009 caused a widespread externality and impact on the entire capital market. Hence, justification of corporate event studies should not be limited to assessing the existence of significant events only. Instead, a suitable analysis should also consider the persistence of impacts from events and their possible consequences. This book is to discuss (1) the conventional event study methodology in corporate finance, and (2) alternative methods that may improve the robustness of statistical inferences on the possible impacts and endurance of corporate events.

Although event studies are performed in many publications of the finance literature on corporate finance, accounting, and management issues, few discussions assess the soundness of the existing methodology in a rigorous setting. In particular, inappropriate handling of data selection (such as pre-event and post-event windows), calculation of expected (or normal) returns, and applications of empirical statistics (in favor of certain preferred hypotheses) may produce wrongful justification for corporate finance events. One reason for this skeptical perspective on conventional event studies is that, although corporate events may seem similar or repetitive in nature, the underlying circumstances over different time horizons or schemes of study may differ. Therefore, although historical evidence shows that certain hypotheses may have prevailed in the past, it is too conclusive to assert that these hypotheses will continue to hold, since the capital market may or may not interpret the given corporate finance information identically over time. As stated, some caution should be applied to the existing methodology, since the purpose of a corporate finance event study is to identify objectively whether there exist significant impacts from the events of interest. It is not to accommodate some plausible pre-identified hypotheses (or findings) for the events that may interest the readers, or simply to match the trend of the existing literature.
For instance, the contents of this book discuss the possible similarities between applications of cumulative sum (CUSUM) statistics in structural change tests and applications of cumulative abnormal returns (CARs) in event studies of corporate finance issues. In particular, it is easy to verify that applications of CARs are sensitive to the presumption of pre- and post-event windows, where manipulation can easily be introduced if certain intended hypotheses are of interest. Namely, if the pre-event window is set to accommodate the intent to prove the significance of the events of interest, and if there is a significant parameter change in the systematic component of asset returns, there is a high likelihood that the resulting CARs will show a (cross-sectional) mean within the event window that is statistically different from zero. In other words, the empirical statistics may indicate a significant impact from the events of interest when, in fact, it is a parameter change in the systematic components of asset returns that provides the cause, and the result is unrelated to the corporate events of concern. More specifically, it is easy to see that the usual CUSUM tests and cumulative abnormal returns can be related to each other, and that the results may lead to incorrect verification of the hypotheses of interest. For instance, let the specification of the normal rate of return be given by the following simplified univariate model:

$$r_{it} = \alpha_i + \beta_i x_t + \varepsilon_{it},$$

where $i = 1, 2, \ldots, n$ indexes the firms' rates of return, $t = 1, 2, \ldots, T$ is the time index over the entire time horizon of interest, and $x_t$ is the common factor (such as the market index return) for the return series. Suppose there is an unknown structural change over the time period of interest, and the study aims to verify whether there is a significant corporate event during the period by using the conventional cumulative abnormal returns according to MacKinlay (1997).
The conventional statistics based on cumulative abnormal returns may then lead to a wrongful conclusion. Specifically, let the unknown structural change happen at $t = T^*$, where $T^* < T$, $T^*$ is unknown, and $\alpha_i$ becomes $\alpha_i^*$, where $\alpha_i^* = \alpha_i + \delta_i$ and $\delta_i \neq 0$ for a sufficient number of the firms under study. Suppose the empiricist does not consider the unknown structural change at all; it is then apparent that, for $t > T^*$, the cumulative abnormal returns can be written as

$$CAR_i = \sum_{s=0}^{t} \tilde{\varepsilon}_{is} = (t - T^*)\,\delta_i + \sum_{s=0}^{t} \varepsilon_{is},$$
where $\sum_{s=0}^{t} \varepsilon_{is}$ is the genuine cumulative abnormal return. It is easy to see that if the cross-sectional average of these cumulative abnormal returns is used to form the statistic, that average may reject the null hypothesis, since

$$ACAR = \frac{1}{n}\sum_{i=1}^{n}\sum_{s=0}^{t} \tilde{\varepsilon}_{is} = \frac{1}{n}\sum_{i=1}^{n} (t - T^*)\,\delta_i + \frac{1}{n}\sum_{i=1}^{n}\sum_{s=0}^{t} \varepsilon_{is}.$$
The statistic ACAR may become significantly different from zero even though $\frac{1}{n}\sum_{i=1}^{n}\sum_{s=0}^{t} \varepsilon_{is}$ may actually converge to zero. In other words, if there exists an unknown structural change of the parameter(s) in the fitted model of normal returns, it is likely that the conventional CARs (and hence the ACARs across all sampled firms) will become statistically significant when the event period is extensive enough to encompass the time of the possible structural change in parameter(s). The statistic may become even more significant as the exceedance of time after the structural change, $t - T^*$, expands.
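To make the argument concrete, the following minimal simulation (a sketch, not from the book; the shift size, window lengths, and noise levels are illustrative assumptions) fits the market model on a pre-break estimation window and cumulates the forecast errors over an event window that straddles the unknown break date $T^*$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, T_star, est_end = 200, 120, 60, 50     # firms, horizon, break, estimation end
delta = 0.002                                 # assumed shift in alpha_i after T*
x = rng.normal(0.0, 0.01, T)                  # common factor (market index return)

beta = rng.uniform(0.5, 1.5, n)
eps = rng.normal(0.0, 0.02, (n, T))           # genuine abnormal returns, zero mean
alpha_shift = np.where(np.arange(T) > T_star, delta, 0.0)
r = 0.001 + beta[:, None] * x[None, :] + alpha_shift[None, :] + eps

# market-model fit on the estimation window only (t < est_end, before T*)
X = np.column_stack([np.ones(est_end), x[:est_end]])
coef, *_ = np.linalg.lstsq(X, r[:, :est_end].T, rcond=None)

# 'abnormal returns' over the event window, which straddles the unknown break
Xev = np.column_stack([np.ones(T - est_end), x[est_end:]])
resid = r[:, est_end:] - (Xev @ coef).T

car = resid.sum(axis=1)                       # CAR_i over the event window
acar = car.mean()
t_stat = acar / (car.std(ddof=1) / np.sqrt(n))
print(f"ACAR = {acar:.4f}, t = {t_stat:.2f}")  # spuriously significant 'event'
```

With a modest shift $\delta_i$ and no genuine event at all, the cross-sectional t-ratio on ACAR is typically far beyond conventional critical values, exactly the spurious significance described above.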
Alternatively, following the same equation as above, if there is a significant impact from the corporate events of some firms, then for a significant number of firms $\sum_{s=0}^{t} \varepsilon_{is}$ may not have zero mean. The conventional CUSUM (cumulative sum of residuals) statistics from the above regressions (even with recursive estimation) may then exceed certain pre-specified boundaries and reject the null hypothesis that there is no structural change in the parameters $\alpha_i$ or $\beta_i$. In other words, applications of CUSUM statistics may wrongfully conclude that there is a significant structural change in the regression parameters during the sampled period, while the so-called structural change is actually the consequence of idiosyncratic corporate events.

In particular, the usual structural change tests, or the so-called monitoring tests, rely on whether the test statistics exceed boundaries pre-specified according to the assumed convergence in distribution to a certain diffusion process. This approach ignores two issues: epidemic change, and the dependence of sequential test statistics arising from sequential multiple hypotheses. For instance, when there is an epidemic change, the parameters may change for a tentative short period and revert to no change afterward. The usual structural change test may simply reject the null hypothesis of no change when the statistics cross the boundaries using observations within the epidemic-change period. In other words, these tests usually assume a permanent change in the parameters, which is not often what is observed in corporate finance events. On the other hand, the test statistics applied in sequential monitoring tests are usually sequentially dependent, and the control of type-1 error should take this into account.

Furthermore, there are research articles proclaiming that the use of firm-specific information may help explain forecasts of asset returns. It is not surprising that, although such firm-specific variables assist the contemporaneous explanation of asset returns, their contribution is usually short-lived and lacks explanation from a theoretical standpoint. Moreover, applications of these additional firm-specific variables (in conditional expectations of asset returns) may actually dilute the specification of abnormal returns, which are the essential variables for event studies. More specifically, it is dubious to accept those claims, since event studies should focus on the possible impacts on the so-called abnormal returns, where abnormal returns are the additional returns in excess of the expected returns based merely on systematic (nondiversifiable) economic variables. In other words, the prerequisite for robust event studies is a clear dichotomy between the systematic and firm-specific components of asset returns.
Accordingly, a prerequisite of robust assessment of abnormal returns (for event studies) lies in the effort to obtain the systematic components of asset returns. Therefore, to obtain abnormal returns for the purpose of event studies, one needs to ensure that the (conditional) expected returns of assets are based solely on nondiversifiable systematic information, even though adding further variables (and/or time-series properties such as (G)ARCH-in-mean) is attractive for better in-sample goodness of fit. The methodology should also recognize that the more essential an event is, the longer its impacts may last and the more frequently impacts may recur upon the announced information. For instance, recent studies in earthquake modeling, namely the Epidemic Type Aftershock-Sequences (ETAS) model with a conditional intensity function (Ogata [1998] 2013), examine an earthquake's aftershocks through the time intervals, magnitudes, and frequencies of the aftershock sequence. Basically, the investigation concerns the strength of earthquakes through time, with the intensity of magnitudes specified as

$$\lambda(t \mid H_t) = \mu + \int_0^t \int_{M_0}^{\infty} \frac{K_0\, e^{\alpha (M - M_0)}}{(t - s + c)^p}\, N(ds, dM),$$

where $K_0, \alpha, c, p$ are constants, $N(ds, dM) = 1$ if the infinitesimal element $(ds, dM)$ includes an event $(t_i, M_i)$ for some $i$, and $N(ds, dM) = 0$ otherwise; $H_t$ is the past history of the earthquake. The aftershock activity depends not only on the magnitude of the main shock but also on the time intervals carried over from the quake and on the frequency with which aftershocks of given magnitudes occur during the time span. For given magnitudes, the shorter the intervals at which aftershocks occur, and the more frequent they are, the higher the intensity of the quake. A similar application can be seen in Egorova, L. and I. Klymyuk (2017), who use the Hawkes process to analyze currency crashes. Let the process describing the history of occurrences be denoted as a sequence $H_t = \{(t_i, m_i)\}_{i \in N^*}$, $N^* = \mathbb{N} \cup \{0\}$, where $t_i$ denotes the time of the $i$-th decline and $m_i$ denotes its depth.
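As a small numerical illustration (a sketch under assumed parameter values, not Ogata's implementation), the double integral above collapses to a sum over the observed past events, so the conditional intensity can be evaluated directly:

```python
import numpy as np

def etas_intensity(t, times, mags, mu=0.1, K0=0.5, alpha=1.0, M0=3.0, c=0.01, p=1.1):
    """ETAS conditional intensity at time t; parameter values are illustrative
    assumptions, not estimates."""
    times, mags = np.asarray(times), np.asarray(mags)
    past = times < t                                  # only shocks before t count
    kernel = K0 * np.exp(alpha * (mags[past] - M0)) / (t - times[past] + c) ** p
    return mu + kernel.sum()

hist_t = [1.0, 2.5, 2.7]          # times of past shocks
hist_M = [4.2, 5.1, 3.6]          # their magnitudes
for t in (2.8, 3.5, 10.0):
    print(f"lambda({t:>4}) = {etas_intensity(t, hist_t, hist_M):.3f}")
# the intensity spikes right after large shocks and decays as a power law in time
```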
The associated counting process is

$$N(t) = N_t = \sum_{i \in N^*} \delta(t_i \le t).$$

1 Popular Methods for Event Studies in Corporate Finance

Given that $\sigma_{i,1}^2 > \sigma_{i,2}^2$, it is easy to see that

$$\sigma_{i,2}^2 (X_i' X_i)^{-1} \le p(S_t = 1)\,\sigma_{i,1}^2 (X_i' X_i)^{-1} + p(S_t = 2)\,\sigma_{i,2}^2 (X_i' X_i)^{-1}. \qquad (1.24)$$

In other words, the standard error of the OLS estimates is overestimated if the return process is a two-state process. Hence, given event-induced volatility, the conventional event study methodology may suffer a loss of power when the estimation period is contaminated by unrelated events.
In other words, if there is a multi-event situation in the event period, there is a possibility that the OLS estimates are misstated and the resulting statistical inferences are unreliable.
1.2.4 Multiperiod Event Studies (Karafiath and Spencer 1991)

Karafiath and Spencer (1991) consider the event study of abnormal performance during a multiperiod event window. They demonstrate that the usual test statistics applied in event studies do not follow the standardized normal distribution, and the results are thus biased. Furthermore, the bias may increase with the length of the event window and lead to excessive rejection of the null hypothesis of no events. Let the observed returns over the estimation period of equal length $T$ be denoted $r_{it}$, where $i = 1, 2, \ldots, n$ and $t = 1, 2, \ldots, T$. For each firm there is a multiperiod event window of length $T_i$. In other words, the event windows for the firms may be of different lengths and different calendar dates. Assuming the event window follows the estimation period, it can be shown that

$$\begin{pmatrix} r_{i1} \\ r_{i2} \end{pmatrix} = \begin{pmatrix} X_{i1} & 0 \\ X_{i2} & I_{T_i} \end{pmatrix} \begin{pmatrix} \gamma_i \\ \delta_i \end{pmatrix} + \begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \end{pmatrix}, \qquad (1.25)$$

where $r_{i1}$ is a $T \times 1$ vector of observed returns for the $i$-th security in the estimation period, $r_{i2}$ is a $T_i \times 1$ vector of returns for the $i$-th security in the event period, and $X_{i1}$ and $X_{i2}$ are the $T \times 2$ and $T_i \times 2$ matrices for a constant term and market returns, respectively. In addition, $I_{T_i}$ is the identity matrix of dimension $T_i$, $\gamma_i$ is a $2 \times 1$ vector of market model parameters, and $\delta_i$ is a $T_i \times 1$ vector of abnormal returns for the $i$-th firm in the corresponding event window. The error terms $\varepsilon_{i1}$ and $\varepsilon_{i2}$ are zero-mean stochastic disturbances. The system can be written in the more compact form

$$R_i = X_i \Gamma_i + \varepsilon_i, \quad i = 1, 2, \ldots, n, \qquad (1.26)$$
where $R_i = \begin{pmatrix} r_{i1} \\ r_{i2} \end{pmatrix}$, $X_i = \begin{pmatrix} X_{i1} & 0 \\ X_{i2} & I_{T_i} \end{pmatrix}$, and $\Gamma_i = \begin{pmatrix} \gamma_i \\ \delta_i \end{pmatrix}$, accordingly.11 For the regressor $R_{mt}$ in the market model, it is assumed that it is contemporaneously uncorrelated with the disturbance, so that $E(R_{mt}\varepsilon_{it}) = 0$ for all $t = 1, \ldots, T + T_i$, $i = 1, \ldots, n$, which gives $\text{plim}_{T \to \infty} \frac{1}{T} X_i' \varepsilon_i = 0$. In addition, let $E(\varepsilon_i \varepsilon_i') = \sigma_i^2 I_{T + T_i}$, where $\{\varepsilon_{it}\}_{t = 1, \ldots, T + T_i}$ are serially uncorrelated and contemporaneously uncorrelated across firms for all $i = 1, 2, \ldots, n$.12
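A brief sketch of how the stacked system (1.25) can be estimated in practice (illustrative code, not Karafiath and Spencer's): appending one dummy column per event-window day to the market-model regressors makes the OLS coefficients on the dummies equal to the event-window abnormal returns $\delta_i$. The simulated parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
T, Ti = 100, 5                          # estimation length, event-window length
rm = rng.normal(0.0, 0.01, T + Ti)      # market returns over both periods
delta_true = np.array([0.01, 0.004, 0.0, -0.002, 0.003])   # assumed true ARs

r = 0.001 + 0.9 * rm + rng.normal(0.0, 0.02, T + Ti)
r[T:] += delta_true                     # abnormal performance in the event window

# stacked design of (1.25): market-model columns plus one dummy per event day
X = np.column_stack([np.ones(T + Ti), rm, np.zeros((T + Ti, Ti))])
X[T:, 2:] = np.eye(Ti)

theta, *_ = np.linalg.lstsq(X, r, rcond=None)
delta_hat = theta[2:]                   # OLS recovers the abnormal-return vector
print("abnormal returns:", np.round(delta_hat, 4))
print("CAR over the event window:", round(float(delta_hat.sum()), 4))
```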
1.3 Cumulative Abnormal Returns—Construction and the Statistical Test
The earlier study of Brown et al. (1985) indicates that the event itself may in fact alter the systematic risk during the period surrounding the firm-specific event. This shows that confusion may arise between the impacts of the event and the understanding of systematic risk; hence, using the residuals from asset pricing models to measure the impacts and effects of abnormal returns may obscure the event study analysis. It also confirms that an incorrect imposition of the systematic risk model may alter the subsequent findings. One difficulty from the theoretical standpoint is that, if a firm-specific event can cause the systematic risk to change, what justifies calling it "systematic"? In general, Brown et al. (1985) consider a multi-factor return model for the $j$-th security with a two-stage switching model:

$$R_{jt} = \begin{cases} X_t \gamma_{j1} + u_{jt}, & t < t_{-1} \text{ or } t > t_1, \\ X_t (\gamma_{j1} + \gamma_{j0}) + u_{jt} = X_t \gamma_{j2} + u_{jt}, & t_{-1} \le t \le t_1, \end{cases}$$

where $X_t$ is the $t$-th row of a data matrix containing $k$ explanatory variables and $\gamma_j$ is the associated $k$-vector of parameters for the $j$-th security. They also assume that

$$u_{jt} \sim N(0, \sigma_{jt}^2)\ \forall t, \qquad E(u_{jt} u_{js}) = 0\ \forall t \neq s.$$
Given the null hypothesis $H_0: \gamma_0 = 0$, we can rewrite these equations in matrix form as

$$R_1 = X_1 \gamma_1 + u_1, \quad \text{for } t \notin [t_{-1}, t_1],$$
$$R_2 = X_2 \gamma_2 + u_2, \quad \text{for } t \in [t_{-1}, t_1],$$

with $u_i \sim N(0, \sigma_i^2 I_i)$, $i = 1, 2$. Using the above equation systems, the least squares estimates for the equations can be written as $R_1 = X_1 g_1 + e_1$ and $R_2 = X_2 g_2 + e_2$, and with the above assumptions, under the null hypothesis,

$$(g_1 - g_2) \sim N\!\left(0,\ \sigma_1^2 (X_1' X_1)^{-1} + \sigma_2^2 (X_2' X_2)^{-1}\right).$$

Hence, the random variable

$$(g_1 - g_2)' \left[\sigma_1^2 (X_1' X_1)^{-1} + \sigma_2^2 (X_2' X_2)^{-1}\right]^{-1} (g_1 - g_2) \qquad (1.27)$$
will have a chi-square distribution with degrees of freedom equal to the number of parameters. In addition,

$$\frac{e_1' e_1}{\sigma_1^2} + \frac{e_2' e_2}{\sigma_2^2} \sim \chi^2_{(T_1 + T_2 - 2k)}, \qquad (1.28)$$
where $T_1$ and $T_2$ are the numbers of observations in each model and $k$ is the number of parameters estimated. Given the mutual independence of Eqs. (1.27) and (1.28),13 the statistic for testing the parametric change in the return model, while allowing for changes in the distribution of firm-specific risk during the event period, is

$$F = \left[\frac{(g_1 - g_2)' \left[\sigma_1^2 (X_1' X_1)^{-1} + \sigma_2^2 (X_2' X_2)^{-1}\right]^{-1} (g_1 - g_2)}{\dfrac{e_1' e_1}{\sigma_1^2} + \dfrac{e_2' e_2}{\sigma_2^2}}\right] \times \frac{T_1 + T_2 - 2k}{k}.$$
Given that the variances $\sigma_i^2$ are unknown, their estimates are given as

$$\hat{\sigma}_i^2 = \frac{e_i' e_i}{T_i - k}.$$

The trouble with the above analysis is that a parametric structural change of the systematic risk model is allowed in the event period, in which the firm-specific risk may alter the abnormal returns' distribution (for instance). Likewise, the test statistic (for structural change) is given by

$$\frac{1}{k}\,(g_1 - g_2)' \left[\hat{\sigma}_1^2 (X_1' X_1)^{-1} + \hat{\sigma}_2^2 (X_2' X_2)^{-1}\right]^{-1} (g_1 - g_2),$$

which is an F statistic with degrees of freedom $(k, T_1 + T_2 - 2k)$. Using a sample of five-for-four stock splits collected from the New York Stock Exchange from December 1926 to December 1979, they confirmed the parametric change of the systematic risk model. However, the methodology may actually introduce another specification error: the classification of the event periods may itself cause the parametric change in the coefficients of the systematic risk model, which is why they claim that a universal event period may lead to estimation problems. The problem here is that the parametric change of the systematic risk model can happen at any time, for whatever reason. Using the sample period classification may therefore cause the statistics to fit the purpose of the study. Instead, letting the systematic risk be estimated recursively may relieve the difficulty of identifying the precise event period.
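The two-regime test can be sketched as follows (an illustrative implementation under the stated normality assumptions; the regime split, sample sizes, and the simulated beta shift are assumptions, not empirical values):

```python
import numpy as np
from scipy import stats

def regime_fit(y, x):
    """OLS market-model fit on one regime: returns g, sigma^2-hat, and X."""
    X = np.column_stack([np.ones(len(x)), x])
    g, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ g
    return g, e @ e / (len(y) - X.shape[1]), X

rng = np.random.default_rng(2)
T1, T2, k = 120, 40, 2
rm = rng.normal(0.0, 0.01, T1 + T2)
y = 0.001 + 1.0 * rm + rng.normal(0.0, 0.02, T1 + T2)
y[T1:] += 1.0 * rm[T1:]                    # beta shifts in the second (event) regime

g1, s21, X1 = regime_fit(y[:T1], rm[:T1])
g2, s22, X2 = regime_fit(y[T1:], rm[T1:])

V = s21 * np.linalg.inv(X1.T @ X1) + s22 * np.linalg.inv(X2.T @ X2)
d = g1 - g2
F = (d @ np.linalg.solve(V, d)) / k        # approximately F(k, T1 + T2 - 2k)
print(f"F = {F:.2f}, p = {1 - stats.f.cdf(F, k, T1 + T2 - 2 * k):.4f}")
```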
In other words, the normal returns are obtained through continuous updating of the assumed market expectation, in order to eliminate possible parametric change of systematic risk (say, beta) due to various reasons and event periods. Brown et al. (1985) then apply dummy variables for the time periods in switching regressions. For example, let the switching regressions be given as

$$R_j = (X : Z_j)\begin{pmatrix} \gamma_{j1} \\ \gamma_{j0}^* \end{pmatrix} + u_j, \qquad (1.29)$$

$$R_j = (X : W_j)\begin{pmatrix} \gamma_{j1} \\ \gamma_{j0} \end{pmatrix} + u_j, \qquad (1.30)$$
where $Z_j$ and $W_j$ are matrices whose elements in the $t$-th row are equal to $[X_t, D_{jt}^*]$ and $[X_t, D_{jt}]$, respectively, with

$D_{jt}^* = 1$ when observation $t$ is in the true event period, 0 otherwise;
$D_{jt} = 1$ when observation $t$ is in the arbitrarily imposed event period, 0 otherwise.
The best linear unbiased estimators for $\gamma$ in the above equations are

$$\hat{\gamma}^* = \left[(X : Z)' \Omega^{*-1} (X : Z)\right]^{-1} (X : Z)' \Omega^{*-1} R \qquad (1.31)$$

and

$$\hat{\gamma} = \left[(X : W)' \Omega^{-1} (X : W)\right]^{-1} (X : W)' \Omega^{-1} R, \qquad (1.32)$$
where $\Omega$ and $\Omega^*$ are diagonal covariance matrices with respective elements equal to $\sigma_{u1}^{2*}$ or $\sigma_{u2}^{2*}$ and $\sigma_{u1}^{2}$ or $\sigma_{u2}^{2}$, corresponding to the event period or nonevent period observations. Then it is shown that

$$E\begin{pmatrix} \hat{\gamma}_1 \\ \hat{\gamma}_2 \end{pmatrix} = \begin{pmatrix} \gamma_1 \\ \gamma_2 \end{pmatrix} + \begin{pmatrix} (X' \Omega^{-1} M_w X)^{-1} X' \Omega^{-1} M_w \Delta \gamma_0 \\ (W' \Omega^{-1} M_x W)^{-1} W' \Omega^{-1} M_x \Delta \gamma_0 \end{pmatrix}, \qquad (1.33)$$

where $M_w = I - W(W' \Omega^{-1} W)^{-1} W' \Omega^{-1}$, $M_x = I - X(X' \Omega^{-1} X)^{-1} X' \Omega^{-1}$, and $\Delta = Z - W$ stands for the specification error in identifying the event period.
Hence, the higher the specification error, the higher the bias. Although the article identifies the problem of specification error in classifying the event period, it has several theoretical problems: (1) the error terms in the equation system all assume normal distributions; (2) if the systematic risk model is linked directly or indirectly to the firm-specific event, there exists possible dependence between the explanatory or dependent variables (in the systematic risk model) and the error terms of the equations when the residuals are used to demonstrate the effects of events.

Burnett et al. (1995) consider multiple structural changes in the return-generating process that may cause the usual residual analysis to be biased. If the parameters in the return-generating process are constant over the sample period, the residuals during the event window can be attributed to the event under study. Yet, given the considerable evidence pointing to instability of the parameters in the return process, residual analysis may lead to incorrect conclusions regarding the impact of the event. To verify this point of view, they offer a methodology that corrects for changes in the market model parameters and examine the implications for measuring the market response to a particular event, the stock split. The methodology involves a Bayesian switching regression for the usual market model for normal returns, allowing for up to three changes in the sample. Although they claim that limiting the maximum number of changes to three is for computational reasons, it is still questionable why the return series should change only as few times as assumed, given the various possible reasons. Since market expectations change with different information flows over time, the change of regimes is mostly unknown a priori. They assume that the return process follows the market model with $p$ change points $\tau_1, \tau_2, \ldots, \tau_p$, with $0 = \tau_0 < \tau_1 < \cdots < \tau_p = T$, forming $p + 1$ regimes. The market model parameters $\gamma_i = (\alpha_i, \beta_i, \sigma_i^2)$ are assumed constant within regression regime $i$, but one or more of the parameters may shift at each change point. They begin with the following switching market model:

$$R_t = \alpha_t + \beta_t R_{mt} + \varepsilon_t,$$
where $\beta_t$ equals $\beta_r$, $\alpha_t$ equals $\alpha_r$, and $\varepsilon_t \sim N(0, \sigma_r^2)$ when $t$ is in the $r$-th regime $[\tau_{r-1} + 1, \tau_{r-1} + 2, \ldots, \tau_r]$. Each regime contains $n_r = \tau_r - \tau_{r-1}$ observations. Let $\gamma = (\gamma_1, \ldots, \gamma_{p+1})$ be the regression parameters, and let $\delta = (\tau_1, \ldots, \tau_p, p) = (\tau, p)$ be the parameters of the switching process. Then, letting $\phi = (\gamma, \delta)$, Bayes' rule gives

$$h(\phi|Z) = L(\phi|Z)\, g(\phi) / f(Z), \qquad (1.34)$$
where h(φ|Z ) is the posterior probability density function for the parameter vector φ, L(φ|Z ) is the likelihood function conditional on the data Z , g(φ) is the prior pdf and f (Z ) is the marginal pdf of the data. Given the noninformative prior on γ , conditional on δ, the conditional posterior density of regression parameters can be written as h(γ |δ, Z ) = L(φ|Z )g(γ |δ)
∝
p+1
nr (2πσr2 )− 2
r=1
(1.35)
vr sr2 + (βr − br ) (X r X r )(βr − br ) ex p 2σr2
where vr = nr − 2, br = (X r X r )−1 X r yr , and sr2 =(yr − X r br ) (yr − X r br )/vr . For regime r , yr is the nr × 1 vector of observed asset return, X r is the nr × 2 matrix consisting of column of ones and a column of observed market returns, βr is the 2 × 1 vector of regression parameters and εr is the nr × 1 vector of errors such that εr ∼ N (0, σr2 I (nr )) where I (nr ) is the nr × nr identity matrix.14 Given the above system, we can get the marginal posterior pdf of δ as h(δ|Z ) = h(γ , δ|Z )dγ Hence, the marginal posterior pdf for δ can be written as h(δ|Z ) =
L(γ , δ|Z )g(γ |δ)dγ g(δ)/ f (Z )
(1.36)
Likewise, the so-called integrated likelihood function is given as

$$L^*(\tau, p, Z) = \prod_{r=1}^{p+1} \pi^{-\frac{v_r}{2}}\, \Gamma\!\left(\frac{v_r}{2}\right) |X_r' X_r|^{-\frac{1}{2}} \left[\frac{1}{2}(y_r - X_r b_r)'(y_r - X_r b_r)\right]^{-\frac{v_r}{2}}, \qquad (1.37)$$
where $\Gamma(\cdot)$ is the standard gamma function. They use a sample of stock splits from 1978 to 1989, restricting attention to splits greater than or equal to 5-for-2 in order to capture true stock splits as stated. There are 118 such events on the Center for Research in Security Prices (CRSP) Daily Stock Master file. After discarding twelve stocks with missing returns, two that were distributions to class B stock, and nine that represented trust companies, the final sample consists of 95 stock splits. For each firm in the sample, daily returns are generated beginning 170 trading days preceding the split announcement and ending 60 trading days after the ex-date. Because the number of days between the announcement of the split and the ex-date varies from firm to firm, the total number of returns in each time series differs, ranging from 246 to 357 days. Market returns are measured by the equally weighted CRSP index. The abnormal returns are calculated using both the "standard" event study estimates and the switching regression estimates, where the standard parameters are obtained using a 130-day estimation period beginning 170 days prior to the split announcement. The switching regression estimates are obtained from the Bayesian switching regressions stated above. The event period is the 21-day period surrounding the event date. Using the Bayesian switching regression, they confirm that the systematic risk model for the 95 firms is subject to parameter instability. Only 10% of the firms had stable return series with no statistically identifiable change points, while almost 30% of the return series showed three change points. Although the empirical evidence confirms the possible instability of parameters (in the systematic risk model), it depends on the assumption that the changes are limited to three regimes, without particular reasons given to explain why.15
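For intuition, the following simplified sketch evaluates the integrated likelihood (1.37) over candidate dates for a single change point ($p = 1$); the full Burnett et al. procedure handles up to three changes and a proper prior, which are omitted here, and all simulated values are assumptions:

```python
import numpy as np
from scipy.special import gammaln

def log_regime_lik(y, x):
    """Log of one regime's factor in (1.37), up to constants common across tau."""
    X = np.column_stack([np.ones(len(x)), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float((y - X @ b) @ (y - X @ b))
    v = len(y) - 2
    _, logdet = np.linalg.slogdet(X.T @ X)
    return (-0.5 * v * np.log(np.pi) + gammaln(0.5 * v)
            - 0.5 * logdet - 0.5 * v * np.log(0.5 * rss))

rng = np.random.default_rng(3)
T, tau_true = 200, 130
rm = rng.normal(0.0, 0.01, T)
y = 0.001 + 1.0 * rm + rng.normal(0.0, 0.02, T)
y[tau_true:] += 0.01                         # alpha shifts at the true change point

cands = list(range(20, T - 20))              # keep enough observations per regime
loglik = [log_regime_lik(y[:tau], rm[:tau]) + log_regime_lik(y[tau:], rm[tau:])
          for tau in cands]
tau_hat = cands[int(np.argmax(loglik))]
print("estimated change point:", tau_hat)    # the posterior mode under a flat prior
```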
Again, this points out that residual analysis, which assumes stable estimates for normal returns using the systematic risk models (in the estimation period), does not fit market expectations precisely. The average residuals and cumulative average residuals based on the standard parameter estimates and the switching regression estimates were calculated for the 21-day event period surrounding the split announcement. Although they did not find any significant differences among the residuals for the different estimation schemes, the evidence still indicates the need to incorporate the non-stationarity of the parameters estimated for normal returns.

For the development of residual analyses under normality, one may consult the excellent surveys of MacKinlay (1997) (or Corrado 2011) on event studies, where the underlying assumption (for the asymptotic distribution) is that the abnormal returns follow a normal distribution. The hypothesis test then depends on whether the statistics exceed the barriers implied by the asymptotic distribution of abnormal returns. Unfortunately, this kind of method may actually confirm the intended hypothesis (of the investigation) even when there is no story to discuss. The confirmation may originate from the assumption on the asymptotic distribution, which in turn crowds out other theories that could possibly explain the event's significance. Therefore, a construction whose asymptotic statistics merely consider whether certain barriers are crossed (such as the cumulative average passing through zero, for instance), in determining whether the event test statistics are accepted or not, is still insufficient to establish that the event(s) is significant enough to justify the hypothesis. In other words, if the market model is assumed,

$$\hat{\beta}_i = \frac{\sum_{\tau_1 + 1}^{\tau_2} (r_i - \hat{\mu}_i)(r_m - \hat{\mu}_m)}{\sum_{\tau_1 + 1}^{\tau_2} (r_m - \hat{\mu}_m)^2}, \qquad \hat{\alpha}_i = \hat{\mu}_i - \hat{\beta}_i \hat{\mu}_m, \qquad \hat{\sigma}_{\varepsilon_i}^2 = \frac{1}{L_1 - 2} \sum_{\tau_1 + 1}^{\tau_2} (r_i - \hat{\alpha}_i - \hat{\beta}_i r_m)^2,$$

with normality granted and estimation based on enough observations in both the estimation and event periods, and the forecast-error variance

$$\sigma^2(\hat{\varepsilon}_{i\tau}) = \hat{\sigma}_{\varepsilon_i}^2 + \frac{1}{L_1}\left[1 + \frac{(r_{m\tau} - \hat{\mu}_m)^2}{\hat{\sigma}_m^2}\right]$$
for any given company $i$.16 Hence, for the event study at $\tau = 0$, the coefficients are estimated over the estimation period, and the forecast errors in the event period are used to test the null hypothesis. For example, let $r_{it} = E[r_{it}|\mathcal{F}_t] + \varepsilon_{it} = \text{Normal Return} + \text{Abnormal Return}$; the specification assumes that the abnormal return follows $\varepsilon_{it} \sim N(0, \sigma^2(\varepsilon_{i\tau}))$, where the underlying variance comes from the aggregation of abnormal returns during the event period, with $\sigma^2(\varepsilon_{i\tau}) = (\tau_2 - \tau_1 + 1)\sigma_{\varepsilon}^2$ and the ad hoc event period $[\tau_1, \tau_2]$. Notice that the information set $\mathcal{F}_t$ should include only the systematic variables that explain the systematic components of asset returns. (The necessary conditions can be found in my book Empirical Asset Pricing Models, 2018.) The distribution of the cumulative (fitted) abnormal return $\sum_{\tau_1 + 1}^{\tau_2} \hat{\varepsilon}_{i\tau}$ under the null hypothesis $H_0$ follows $CAR(\tau_1, \tau_2) \sim N(0, \sigma^2(\tau_1, \tau_2))$. Obviously, this result depends on the normality assumption holding, especially over the estimation and event periods. It holds if the abnormal returns follow the i.i.d. (independent and identically distributed) assumption under a normal distribution, so that the variance approaches $\sigma^2(\tau_1, \tau_2)$ when the event period is given. The assumption is acceptable if the abnormal return indeed follows normality in distribution and the event period is long enough. Unfortunately, this assumption is difficult to satisfy in reality. On the other hand, we are assuming that the abnormal returns are jointly normally distributed with variance $\sigma^2(\tau_1, \tau_2)$ if the estimation period is long enough, so that the estimation error approaches zero as $L_1$ approaches infinity and the estimation or forecast errors approach the genuine abnormal returns. The intuition of the event study is that the aggregate abnormal return approximates the normal distribution over the given event period, so that it will exceed its mean (which is zero) if the event is significant. The validity of the empirical tests rests on the correctness of this distributional assumption. Basically, it is clear that empirical abnormal returns will not follow multivariate normality, and that the cumulative abnormal returns will not follow the assumption either. A further difficulty is that tests for event studies are similar (in spirit and in technicality) to tests for change of the coefficients in the models of normal returns.
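A compact sketch of this conventional test, in the spirit of MacKinlay (1997) (the window lengths and the injected event effect are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
L1, L2 = 250, 21                           # estimation and event window lengths
rm = rng.normal(0.0005, 0.01, L1 + L2)
ri = 0.0002 + 1.1 * rm + rng.normal(0.0, 0.02, L1 + L2)
ri[L1:] += 0.01                            # assumed event effect per event day

# market-model estimates from the estimation window only
X = np.column_stack([np.ones(L1), rm[:L1]])
(a, b), *_ = np.linalg.lstsq(X, ri[:L1], rcond=None)
s2 = np.sum((ri[:L1] - a - b * rm[:L1]) ** 2) / (L1 - 2)

ar = ri[L1:] - a - b * rm[L1:]             # event-window abnormal returns
car = ar.sum()
z = car / np.sqrt(L2 * s2)                 # ~ N(0,1) under H0; drops the O(1/L1) term
print(f"CAR = {car:.4f}, z = {z:.2f}, p = {2 * (1 - stats.norm.cdf(abs(z))):.4f}")
```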
When the mean of cumulative abnormal returns exceeds the zero posited by the null hypothesis, it is equivalent to showing that, if the boundary for the cumulative error terms of the regression model uses that mean as the bound, the test is similar to test(s) for structural change in the model. The conventional method suffers from the following limitations: (1) the estimation is confined to an arbitrarily defined estimation period with enough observations; (2) the event period is limited to the identified event, yet in reality it is usually unknown or uncertain; (3) the observations are obtained over an event period that may extend so far as to contain other events, that is, the multi-event case may occur; (4) the normality assumption may fail in the estimation period; (5) a time-invariant underlying system is assumed for the estimation period of the conditional expectation given the information; (6) the length of the estimation and/or event period is determined arbitrarily by the sampling procedures; (7) the approach may fail to capture systemic risk when cross-sectional dependence is present and is calculated through weak dependence only; (8) the discussion of systemic risk identified by cross-sectional dependence is represented only by the cross-sectional correlation, which is merely a linear representation. This actually assumes that the underlying system is time-invariant, so that the forecast errors are representative of the events or issues. Even if the underlying system is time-invariant, the forecast errors based on an ad hoc event period assume either that the period is long enough and contains no other event, or that the return distribution does not change over time. In other words, it is assumed that the normal returns, represented as the conditional expectations, hold unchanged over time, and that the abnormal returns are represented by the error terms, the observed stock returns minus the conditional expectations. In fact, recursive algorithms such as the stochastic gradient method or recursive least squares will generate better-informed results for the abnormal returns in the event period without requiring a fixed estimation period for stable estimates of normal returns. Given that the event period is undetermined (for some firms the event periods are determined privately), recursive estimation can reduce the reliance on predetermined sample sizes, because the genuine event date is determined by the samples themselves.
The updating recursive estimations will enhance the information accordingly, so that they may resemble the market adjustment as new information becomes available. Hence, the arbitrary determination of estimation and event periods becomes unnecessary.
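A minimal recursive least squares (RLS) sketch of this updating idea follows; the forgetting factor `lam` and the simulated beta drift are assumptions for illustration, and the one-step-ahead forecast errors play the role of abnormal returns:

```python
import numpy as np

def rls(y, X, lam=1.0, delta=1e4):
    """Recursive least squares with forgetting factor lam (lam = 1: plain RLS).
    Returns the final coefficients and the one-step-ahead forecast errors."""
    n, k = X.shape
    theta = np.zeros(k)                  # coefficient estimates, updated each step
    P = delta * np.eye(k)                # large initial covariance (diffuse start)
    errors = np.empty(n)
    for t in range(n):
        x = X[t]
        errors[t] = y[t] - x @ theta     # forecast error before seeing y[t]
        Px = P @ x
        gain = Px / (lam + x @ Px)
        theta = theta + gain * errors[t]
        P = (P - np.outer(gain, Px)) / lam
    return theta, errors

rng = np.random.default_rng(5)
T = 300
rm = rng.normal(0.0, 0.01, T)
beta = np.where(np.arange(T) < 150, 1.0, 1.4)    # beta drifts mid-sample
y = 0.001 + beta * rm + rng.normal(0.0, 0.02, T)

theta_hat, e = rls(y, np.column_stack([np.ones(T), rm]), lam=0.99)
print("final recursive beta estimate:", round(float(theta_hat[1]), 3))
# with lam < 1 the estimate moves toward the post-shift beta, and the forecast
# errors e serve as abnormal returns without a fixed estimation window
```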
1.4 The Other Econometric Issues in Event Study: Problems and Developments
a. Conditional Methods in Event Studies (Eckbo et al. 1990; Prabhala 1997; Nayak and Prabhala 2001).

Eckbo et al. (1990) (denoted the EMW model hereon) provide analyses for event studies using cross-sectional models with truncated regression. They show that if the event is voluntary and investors are rational, the standard OLS and GLS estimators are inconsistent. Let the manager of bidding firm $j$ have private information about the potential target of a possible merger. They assess a return $y_j$ on their stock from the possible synergy merger, $y_j = \ln(v_j^+ / v_j^-)$, where $v_j^+$ and $v_j^-$ represent the privately assessed values with and without the synergy merger, respectively. Assuming that the acquirer's assessment $y_j$ depends linearly on some publicly observable characteristics $x_j$ and a single statistic $\eta_j$, one can write $y_j = x_j \gamma + \eta_j$, where $\eta_j$ summarizes the firm's inside information and $\gamma$ is a vector of constant parameters. It is assumed that the acquiring firm's private information $\eta_j$ is normally distributed with mean 0 and variance $\omega^2$, independently across all acquiring firms $j = 1, \ldots, J$. Most of the time, event studies need two steps to estimate the parameters. The first step is to obtain the abnormal returns as the residuals from a model, usually the market model, of the conditional expectation $E(r_{it}|\mathcal{F}_t)$. The second is to run a cross-sectional regression on the independent variables $x_j$ to obtain the parameters $\gamma$. The two-step procedure can be summarized as a single time-series and cross-sectional regression:

$$r_{jt} = \alpha_j + \beta_j r_{mt} + d_{jt}\, x_j \gamma + \varepsilon_{jt},$$
for all firms $j = 1, \ldots, J$ and times $t = -T_1, \ldots, 0, \ldots, T_2$, where $r_{jt}$ and $r_{mt}$ are the rates of return on the stock of firm $j$ and the market portfolio, respectively, and $d_{jt}$ is a dummy variable equal to 1 when $t$ is in the event period and 0 elsewhere. $x_j$ is a $K$-dimensional vector of explanatory variables. The residual $\varepsilon_{jt}$ is assumed to satisfy

$$E(\varepsilon_{jt}) = 0, \qquad E(\varepsilon_{jt}^2) = \sigma^2 + d_{jt}\delta^2, \qquad E(\varepsilon_{jt}\varepsilon_{j't'}) = 0,$$
for all firms $j \neq j'$ and all times $t \neq t'$. However, this setting is not consistent with rational expectations by the participants in the capital market. For example, in synergy mergers, the managers of the acquiring firm will announce the merger if and only if they assess the firm's share of the synergy to be non-negative, that is, $0 \le y_j = x_j\gamma + \eta_j$. Outsiders will then infer the announcement-period abnormal return as

$$F(x_j) = E(y_j \mid \eta_j \ge -x_j\gamma) = x_j\gamma + E(\eta_j \mid \eta_j \ge -x_j\gamma) = x_j\gamma + \omega\,\frac{n(x_j\gamma/\omega)}{N(x_j\gamma/\omega)}, \qquad (1.38)$$

where $n$ and $N$ represent the normal density and distribution functions, respectively, because the variable is truncated below at $-\frac{x_j\gamma}{\omega}$. The expectation of $\eta_j$ becomes

$$E[\eta_j \mid \eta_j \ge -x_j\gamma] = \omega\, E\!\left[\frac{\eta_j}{\omega}\,\Big|\,\frac{\eta_j}{\omega} \ge -\left(\frac{x_j\gamma}{\omega}\right)\right].$$

This shows that the regression becomes

$$r_{jt} = \alpha_j + \beta_j r_{mt} + d_{jt}F(x_j) + \varsigma_{jt},$$
where $E(\varsigma_{jt}) = 0$, $E(\varsigma_{jt}^2) = \sigma^2 + d_{jt}\delta^2$, and $E(\varsigma_{jt}\varsigma_{j't'}) = 0$ for all firms $j \neq j'$ and all times $t \neq t'$. This shows that the error term of the original regression should be

$$\varepsilon_{jt} = \varsigma_{jt} + \omega\,\frac{n(x_j\gamma/\omega)}{N(x_j\gamma/\omega)}.$$

Consequently, the partial derivative is

$$\frac{\partial \varepsilon_{jt}}{\partial x_{ji}} = -\gamma_i\,\frac{n(z_j)}{N(z_j)}\left[z_j + \frac{n(z_j)}{N(z_j)}\right], \qquad (1.39)$$

where $z_j = x_j\gamma/\omega$. All the conventional OLS and GLS estimators of the regression coefficients in the $\varepsilon_{jt}$'s are therefore inconsistent.
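The inconsistency is easy to reproduce by simulation; the sketch below (illustrative, not the EMW code; the parameter values are assumptions) draws announcements only when $y_j \ge 0$ and shows how OLS on the announcing sample attenuates $\gamma$, while the inverse Mills ratio term reproduces the truncated mean:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
J, gamma, omega = 100_000, 0.5, 1.0
x = rng.normal(0.0, 1.0, J)
eta = rng.normal(0.0, omega, J)             # managers' private information
y = gamma * x + eta                         # assessed synergy return y_j

announce = y >= 0                           # voluntary event: only y_j >= 0 observed
xa, ya = x[announce], y[announce]

X = np.column_stack([np.ones(xa.size), xa])
coef, *_ = np.linalg.lstsq(X, ya, rcond=None)
print("OLS slope on the announcing sample:", round(float(coef[1]), 3),
      "(true gamma = 0.5)")                 # attenuated: truncation biases OLS

# the rational-expectations mean E(y | announce, x) = x*gamma + omega*n(z)/N(z)
z = gamma * xa / omega
implied = gamma * xa + omega * stats.norm.pdf(z) / stats.norm.cdf(z)
print("implied vs sample mean of announced y:",
      round(float(implied.mean()), 3), round(float(ya.mean()), 3))
```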
Prabhala (1997): Conditional methods for event studies. Prabhala (1997) begins by explaining that an event study can fall under three scenarios: (1) the information arrival is known prior to the event; (2) the information arrival is not known prior to the event; and (3) the information arrival is partially known. In examining the conventional method, Prabhala (1997) suggests that the empirical method should follow these steps: (1) estimate for each firm the unexpected information that the event reveals; (2) compute the cross-sectional correlation between the information and the abnormal return, and test for its significance. A nonzero correlation would indicate that the abnormal return is systematically related to the information revealed by the event. The difficulty is that the information arrival is usually unknown to the public; it is the action (to act or not to act) that is observed. Furthermore, Prabhala (1997) does not solve for the optimal decision on the event period to observe, and the results are still based on normality. In fact, Prabhala (1997) introduces several assumptions in order to model the informational content when actions are observed. Three possibilities or scenarios are assumed for different situations.

Assumption 1 Markets know, prior to the event, that the event-related information $\tau_i$ has arrived at firm $i$ (but not its exact content).
Assumption 2 Markets do not know, prior to the event, that the event-related information $\tau_i$ has arrived at firm $i$.

Assumption 3 Markets assess a probability $p \in (0, 1)$ that information $\tau_i$ has arrived at firm $i$.

Under these different assumptions, Assumption 1 implies that the information arrival is common knowledge prior to the event, Assumption 2 states that markets do not know of the information arrival prior to the event, and Assumption 3 permits markets to make probabilistic assessments about the information arrival. Let $E_{-1}(\tau_i) = \theta' x_i = \sum_{j=1}^{n} \theta_j x_{ij}$ be the expectation of $\tau_i$. The private information is given as $\psi_i = \tau_i - E_{-1}(\tau_i)$, where $E_{-1}(\psi_i) = 0$ without loss of generality. The action of the firm is given by

$$E \iff \tau_i \ge 0 \iff \psi_i + \theta' x_i \ge 0,$$
$$NE \iff \tau_i < 0 \iff \psi_i + \theta' x_i < 0.$$

Prabhala (1997) introduces additional assumptions to derive the model for the firm's action. These assumptions make the analysis easier to handle, especially under normality. Although Prabhala (1997) admits that the distributional assumption may be violated in empirical data, this happens only for certain data sets.

Assumption 4 Risk Neutrality: Investors are risk-neutral toward the event risk. (This assumption makes the conditional expectation the essential incentive in using the firm's private information.)

Assumption 5 Linearity: Conditional information is a linear signal of the expected stock return. That is, $E(r_i \mid \psi_i) = \pi \psi_i$, where $r_i$ stands for the stock return and $\psi_i$ for the conditional information.
Therefore, $\pi$ will be a significant coefficient if

$$E(\varepsilon_i \mid E) = \pi E(\psi_i \mid E) = \pi E(\psi_i \mid \theta' x_i + \psi_i \ge 0),$$

and

$$E(\varepsilon_i \mid NE) = \pi E(\psi_i \mid NE) = \pi E(\psi_i \mid \theta' x_i + \psi_i < 0),$$

where $\varepsilon_i$ is the event-date abnormal return for firm $i$. Now, assuming that $\psi_i$ follows the normal distribution $N(0, \sigma^2)$, the above equations can be rewritten as

$$E(\varepsilon_i \mid E) = \pi\sigma\,\frac{n(\theta' x_i/\sigma)}{N(\theta' x_i/\sigma)} = \pi\sigma\,\lambda_E(\theta' x_i/\sigma), \qquad (1.40)$$

and

$$E(\varepsilon_i \mid NE) = \pi\sigma\,\frac{-n(\theta' x_i/\sigma)}{1 - N(\theta' x_i/\sigma)} = \pi\sigma\,\lambda_{NE}(\theta' x_i/\sigma), \qquad (1.41)$$

where $n(\cdot)$ and $N(\cdot)$ denote the density and distribution functions under normality, and $\lambda_C$ denotes the updated expectation of private information $\psi_i$ given the firm's choice $C \in \{E, NE\}$. Notably, given normality, the $\lambda(\cdot)$'s are identical across all firms, even though the private information is neither observable nor available to the markets and may differ from firm to firm; the differences are explained by the values of the explanatory variables $x_i$ and $\theta$. So, the work of Prabhala (1997) indicates that if the private information arrives at the firm prior to the event, it is possible to consider the information effect when a significant $\pi$ is shown. Instead, if the tests are for the significance of the coefficients $\theta_j$, $j = 1, 2, \ldots, k$, on the set of explanatory variables $x_j$, they explain the cross-section of announcement effects. On the other hand, if the information arrival is not known to the market prior to the event, then a significant $\pi$ yields

$$E(\varepsilon_i \mid E) = \pi E(\tau_i \mid E) = \pi E(\tau_i \mid \tau_i \ge 0) = \pi\left[\theta' x_i + \sigma\lambda_E(\theta' x_i/\sigma)\right]. \qquad (1.42)$$
When the information arrival is partially known, suppose markets assess a probability $p$ that information $\tau_i$ has arrived at firm $i$; the stock-price reaction will be $E_{-1}(\varepsilon_i) = p\pi(\theta' x_i)$. In fact, Prabhala (1997) derives the conditional expectation

$$E(\varepsilon_i \mid E) = \pi\left[(1 - p)\,\theta' x_i + \sigma\lambda_E\!\left(\frac{\theta' x_i}{\sigma}\right)\right], \qquad (1.43)$$

and the usual models of Eckbo et al. (1990) (the so-called EMW model) and Acharya (1988, 1993) are the special cases $p = 0$ and $p = 1$, respectively. Prabhala (1997) applies the Acharya (1988, 1993) model to explain why the conventional method is still valid under this framework. The usual methodology that tests the coefficient $\pi = 0$ is a correct test for the existence of an information effect, and the traditional cross-sectional regression yields coefficients proportional to the true cross-sectional parameters $\theta$. The issue, however, is that the coefficient $\pi$ is obtained through the linearity assumption on the conditional expectation, namely Assumption 5. Given the setting, Prabhala (1997) shows that

$$E(\varepsilon \mid E) = \beta_0 + \beta' x = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k,$$

for the estimate over firms announcing event $E$. Although the system is misspecified, it can be shown that the coefficient vector $\beta$ of the linear regression is proportional to the true parameter $\theta$. In addition, Prabhala (1997) presents the following proposition to demonstrate the result.

Proposition: Suppose the event $E$ occurs if and only if $\theta_0 + \sum_{j=1}^{k}\theta_j x_j + \psi > 0$, the information $\psi$ and abnormal return $\varepsilon$ are bivariate normal with correlation $\pi$ and marginal distributions $N(0, 1)$, and the regressors $(x_1, \ldots, x_k)$ are multivariate normal, independent of $\psi$. Then the coefficients $(\beta_1, \ldots, \beta_k)$ in the given linear model are

$$\beta_j = -\theta_j\pi\,\frac{(1 - R^2)(1 - t)}{t + (1 - R^2)(1 - t)} = -\theta_j\pi\mu, \qquad (1.44)$$
where

1. $t = \text{var}(\tau \mid E)/\text{var}(\tau)$, $\tau = \theta' x + \psi$;
2. $R^2$ = the coefficient of determination in the population regression of $\tau$ on $(1, x_1, \ldots, x_k)$;
3. $\mu = \frac{(1 - R^2)(1 - t)}{t + (1 - R^2)(1 - t)}$.

Nayak and Prabhala (2001) apply the conditional method to disentangle the dividend information in splits. Since dividend information is usually mixed with stock split information, and both differential expectations and informational substitution can be at work in a split announcement, simply comparing announcement effects for dividend-paying and non-dividend-paying firms does not allow one to distinguish between them. Non-dividend-paying firms are likely to be younger and more volatile than dividend-paying firms, and thus tend to be less likely to announce a split. Therefore, the market will be more surprised when non-dividend-paying firms announce splits. This is called the "differential expectations" hypothesis. On the other hand, the decision to split is also conditioned on future cash flow. Dividend-paying firms tend to have established dividend payment programs. Therefore, a unit of unexpected information in the split will have less effect for a dividend-paying firm than for a non-dividend-paying firm. This is called the "informational substitute" hypothesis. Since both can explain the announcement effects of dividend-paying and non-dividend-paying firms, using the difference in announcement effects between the two kinds of firms does not certify the hypotheses. Hence, they propose a conditional announcement method to analyze the effects. Let $SPL_i$ denote the net benefit from announcing the split, where $SPL_i > 0$. Let $X_{si}$ be the vector of variables in the pre-announcement information set, where $SPL_i = \theta_s' X_{si} + \psi_{si}$, $\psi_{si}$ is firm $i$'s private information possibly not known to the market, and $E(\psi_{si}) = 0$. Hence, the firm will announce the split if $SPL_i = \theta_s' X_{si} + \psi_{si} > 0$.
Based on this setting, it can be shown that

$$E(AR_i \mid S) = \gamma_s + \beta_s E(\psi_{si} \mid \theta_s' X_{si} + \psi_{si} > 0), \qquad (1.45)$$

where $AR_i$ stands for the announcement effect associated with the split $S$ for firm $i$. If the informational differential is correct, we should expect a significant difference in $\gamma_s$ between dividend-paying and non-dividend-paying stocks. If neither the expectations difference nor the informational differential explains the result, we would expect $\gamma_{s,nd} - \gamma_{s,d}$ to be positive and statistically significant (given that the literature shows non-dividend-paying firms tend to have a larger announcement effect). That is, the difference in intercepts should capture most of the announcement effect of the split. As for the details of dividend-paying stocks, since firms are likely to announce near-term dividend increases around the split dates, the question becomes: what part is due to the other, non-dividend factors? Hence, Nayak and Prabhala (2001) propose the conditional approach $DIV_i = \theta_d' X_{di} + \psi_{di}$, where $\theta_d' X_{di}$ is the part of $DIV_i$ known ex ante to the market and $\psi_{di}$ is the part of $DIV_i$ known to the firm but not to the market, with $E(\psi_{di}) = 0$. Nayak and Prabhala (2001) specify the dividend decision as $C \in \{I, U, D\}$ for the decision to increase, leave unchanged, or decrease dividends, respectively, and the decision process as an ordered probit model:

$$I \iff DIV_i > \mu_I,$$
$$U \iff \mu_D \le DIV_i \le \mu_I,$$
$$D \iff DIV_i < \mu_D,$$

where the dividend decision reveals the private information of firm $i$ and makes the market revise its expectation of $\psi_{di}$. As such, Nayak and Prabhala (2001) show that

$$E(AR_{sdi} \mid C, S) = \gamma_{sd} + \beta_d E(\psi_{di} \mid C, S) + \beta_s E(\psi_{si} \mid C, S), \qquad (1.46)$$

where $S$ denotes the split decision and $C$ the dividend announcement.
For the further decomposition of the dividend and non-dividend related components in splits, they show that the private information can be separated as $\psi_{si} = \rho_{sd}\psi_{di} + \psi_{s-d,i}$, where $\rho_{sd}\psi_{di}$ represents the dividend information implied in $\psi_{si}$ and $\psi_{s-d,i}$ represents the pure split information that is orthogonal to the dividends. Thus, the total announcement effect $AR_{si}$ is the sum of these two parts:

$$E(AR_{si} \mid \psi_{si}) = E(AR_{s-d,i}) + E(AR_{di}) = \alpha_{s-d}\, E(\psi_{s-d,i} \mid \psi_{si}) + \alpha_d\, E(\psi_{di} \mid \psi_{si}). \qquad (1.47)$$

That is, the expectation conditional on the split information $\psi_{si}$ provides the relative effect of the "pure" split and dividend-related components. Given that $\psi_{si} = \rho_{sd}\psi_{di} + \psi_{s-d,i}$, it can be written as

$$E(AR_{si} \mid \psi_{si}) = \alpha_{s-d}\, E(\psi_{s-d,i} \mid \psi_{si} = \rho_{sd}\psi_{di} + \psi_{s-d,i}) + \alpha_d\, E(\psi_{di} \mid \psi_{si} = \rho_{sd}\psi_{di} + \psi_{s-d,i}). \qquad (1.48)$$

Hence, given that $\psi_{si}$ and $\psi_{di}$ are standard normal, it is shown that

$$E(AR_{si} \mid \psi_{si}) = (1 - \rho_{sd}^2)\,\alpha_{s-d}\,\psi_{si} + \rho_{sd}\,\alpha_d\,\psi_{si},$$

where the first part is the "pure" split effect, and the second part is the dividend-related component of the split valuation effects. In other words, the announcement effect can be written as

$$AR_{sdi} = \alpha_d\psi_{di} + \alpha_{s-d}\psi_{s-d,i} = (\alpha_d - \rho_{sd}\alpha_{s-d})\,\psi_{di} + \alpha_{s-d}\,\psi_{si}. \qquad (1.49)$$
Given this, when a firm makes a dividend announcement $C \in \{D, U, I\}$ and a split decision $S$, the joint announcement effect can be shown as

$$E(AR_{sdi} \mid C, S) = (\alpha_d - \rho_{sd}\alpha_{s-d})\, E(\psi_{di} \mid C, S) + \alpha_{s-d}\, E(\psi_{si} \mid C, S). \qquad (1.50)$$

Comparing with the earlier equations, we find that $\alpha_{s-d} = \beta_s$ and $\alpha_d - \rho_{sd}\alpha_{s-d} = \beta_d$.

b. Event-Induced Variance Increase (Boehmer et al. 1991; Brown and Warner 1985; Harrington and Shride 2007).

The earlier studies of possible variance increases can be traced back to Brown and Warner (1985), who examine daily stock returns with possible non-normality in the distribution, to which the Central Limit Theorem is usually applied. Furthermore, due to non-synchronous trading, investigations of the differences in estimation across firms and of the variance are also provided. Two hundred and fifty samples of 50 securities are randomly collected from CRSP from July 2, 1962 to December 31, 1979, each stock being assigned a hypothetical event date. With the event day denoted as day "0", the entire period runs from day $-244$ to day 5. The first 239 days ($-244$ through $-6$) form the estimation period. For any security to be included, it must have at least 30 observations in the entire 250-day period and no missing data in the last 20 days. Let $R_{i,t}$ stand for the rate of return of security $i$ at time $t$. For every security, denote $A_{i,t}$ as the excess return of security $i$ on day $t$. The excess return is expressed as $A_{i,t} = R_{i,t} - E_{t-1}R_{i,t}$, where $E_{t-1}R_{i,t}$ is the conditional expected rate of return based on past information; the mean adjustment, the market adjustment (such as the market index return), and the model adjustment (such as the market model) are all taken into account. Brown and Warner (1985) find that, for individual returns, the distribution of the excess-return statistics is highly non-normal, and that the specification of $E_{t-1}R_{i,t}$ across the various methods seems to have little effect on the excess returns used in event studies.
In particular, they also investigate the cross-sectional dependence among the security returns. The finding shows that accounting for cross-sectional dependence is not always preferred in empirical studies, since the improvement was not influential in their samples. Besides pointing out that the possible estimation methods for normal returns have little effect on the statistics, the results do show a possible time-series effect on excess returns and a clustering effect among the firms. The evidence also shows that the variance increases in the event period for each security. This may lead to too many rejections of the null hypothesis of no significant event by the conventional methods. One question regarding this finding is that the evidence considers only an increase in the variance of the excess returns during the event period. If the event is significant, the variance of excess returns may change to be higher or lower, depending on whether the event is expected or not. Hence, although the variance of the excess return changes during the event period, it can either increase or decrease depending on whether the event is favorable or unfavorable.

Boehmer et al. (1991) consider the event-induced variance in the event period and show that the conventional method rejects the null hypothesis of zero-mean abnormal (or excess) returns too frequently. They also use 250 samples of 50 securities from the CRSP data set from July 1962 to December 1987. All securities and event dates are sampled with replacement. The estimation period is ($-249$ through $-11$), with no missing returns in the 30 days surrounding the event date ($-19$ through $+10$). Each of the 250 portfolios is sampled independently of the others, and for each, the estimated normal returns are calculated with the market model method (using the CRSP equally weighted index). Simulating event-induced variance into the statistics, Boehmer et al. (1991) compare the results across the conventional test, the standardized-residual test of Patell (1976), the sign test, the cross-sectional test, the method of moments estimation with industry classification (using one-digit SIC codes), and the standardized cross-sectional test; when event-induced variance is present, the rejection rate for the null hypothesis is too high. Hence, event-induced variance (increase) may affect the conclusions of statistical inference.
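A sketch of the standardized cross-sectional statistic in the spirit of Boehmer et al. (1991) follows (the simulated variances are assumptions): each firm's event-day abnormal return is standardized by its estimation-period forecast standard error, and the cross-sectional dispersion of the standardized values, rather than the estimation-period variance, enters the denominator, which is what makes the test robust to event-induced variance.

```python
import numpy as np
from scipy import stats

def bmp_test(ar_event, s_forecast):
    """ar_event: event-day abnormal return per firm; s_forecast: its
    estimation-period forecast standard error. Returns (z, two-sided p)."""
    sr = ar_event / s_forecast                     # standardized abnormal returns
    z = sr.mean() / (sr.std(ddof=1) / np.sqrt(sr.size))
    return z, 2 * (1 - stats.norm.cdf(abs(z)))

rng = np.random.default_rng(7)
n = 200
s = rng.uniform(0.01, 0.03, n)                     # firm-level forecast std errors
ar = rng.normal(0.0, 2.0 * s)                      # H0 true, but variance doubled
z, p = bmp_test(ar, s)
print(f"BMP-style z = {z:.2f}, p = {p:.3f}")       # correctly sized despite inflation
```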
As pointed out earlier, the issue is that events are assumed always to induce a variance increase in the event period for the securities under study. It is indeed expected that the variance of excess returns may change in the event period; yet the simulations simply assume that the variance of excess returns increases. Not all events increase the variance when they occur: the direction depends on whether the event is advantageous or negatively perceived. In addition, the impact of events can fall both on the excess returns and on the associated parameters. Hence, the simulation methods of Brown and Warner (1985) and Boehmer et al. (1991) answer only part of the story. Furthermore, the results also depend on the event period selected for the simulations.

c. Robust Methods in Event Studies (Sorokina et al. 2013).

Sorokina et al. (2013) discuss robust methods for handling outliers in event study statistics. The weighted-regression (M-estimation) approach of Huber (1973) and the extended MM-estimation of Yohai (1987) are applied to daily security returns. As mentioned earlier, the study points out that assessments in event studies depend on the sampling frequency of returns, the length of the estimation period, and the event period over which the impacts of events are considered. Instead of using statistics on cumulative abnormal returns, Sorokina et al. apply an extended market model in which various dummy variables enter the regressions of security returns across 10 countries. They apply the robust methods of Huber (1973) and Yohai (1987) to preserve the information in the stock returns rather than subjectively trimming problematic observations. The period of study is the 2007–2009 financial crisis surrounding the Dodd–Frank Act. The analysis begins with the regression model
$$R_i = \alpha_i + \alpha_i' D + \alpha_{0i} D_0 + \beta_i R_m + \beta_i' D R_m + \beta_{0i} D_0 R_m + \delta_i R_{rf} + \lambda_i R_{fx} + \gamma_i D_e + \epsilon_i,$$
where
$R_i$: daily return of the tested index (obtained from Yahoo! Finance);
$\alpha_i$: constant term;
$\alpha_i'$: difference in the index alpha before/after the tested legislation is introduced;
$\alpha_{0i}$: difference in the index alpha before/after the tested legislation is enacted;
$D$: legislation-introduction dummy (0 before, 1 after);
$D_0$: legislation-enactment dummy (0 before, 1 after);
$\beta_i$: beta risk;
$R_m$: market return (international equity index obtained from Bloomberg);
$\beta_i'$: change in the country index beta risk after the tested legislation is introduced;
$\beta_{0i}$: change in the country index beta risk after the tested legislation is enacted;
$\delta_i$: risk-free rate coefficient;
$R_{rf}$: risk-free rate return (6-month LIBOR from the Mortgage-X website);
$\lambda_i$: forex market return coefficient;
$R_{fx}$: forex market return (MCI index from the Federal Reserve);
$\gamma_i$: coefficient of cumulative abnormal returns;
$D_e$: event-period dummy (1 during the event period, 0 otherwise);
$\epsilon_i$: error term.

They then use 120 days before the beginning of the first event window and 120 days after the last event window to estimate the model. The event window is the $(-1,+1)$-day period, which they base on MacKinlay's (1997) suggestion; unfortunately, this amounts to an ad hoc rule for determining the size of the event window. To assess how outliers affect the usual OLS estimation, they flag as outliers the observations whose Cook's (1977) distance exceeds the cutoff $4/(n - k - 1)$, where $n$ is the number of observations and $k$ the number of independent variables. For robustness, they also apply the M-estimator of Huber (1973) and the MM-estimator. The M-estimator utilizes the median of the sample and mitigates the influence of outliers by assigning them
a weight through an iterative algorithm that repeats until the result is sufficiently improved. The MM-estimation is suggested by Yohai (1987), combining M-estimation with the high-breakdown S-estimation developed by Rousseeuw and Yohai (1984). For their empirical findings, they examine whether the systematic risk estimated with the various methods alters the results. With the data collected, they identify a substantial number of outliers within the event window, which particularly affect the OLS estimates. Furthermore, using simulation, they find that the percentage of correct event recognition increases dramatically for changes in the risk effect. Based on the results from OLS and the robust regression methods, they conclude that their findings extend beyond the specific empirical sample, and that robust M- and MM-estimators improve inferences from the event study model. However, their findings have several weaknesses: (1) the event window is determined subjectively, with sampling schemes developed by convention; (2) the tests comparing systematic risk only verify that systematic risk may change over the sample for various reasons, in particular reasons unrelated to the event itself; and (3) the identification of the event effect is still based on residual analyses that assume normality of the distribution.
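As an illustration of the robust machinery just described, the sketch below implements Huber-type M-estimation by iteratively reweighted least squares, together with the Cook's-distance screening rule based on the $4/(n - k - 1)$ cutoff. It is a minimal sketch under stated assumptions: the tuning constant 1.345 is a conventional choice, and the simulated market-model data are hypothetical, not the study's sample.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each observation of an OLS fit."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T       # hat matrix
    h = np.diag(H)
    e = y - H @ y                              # OLS residuals
    s2 = e @ e / (n - p)
    return e**2 / (p * s2) * h / (1 - h)**2

def huber_irls(X, y, k=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimation via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]           # OLS start
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745  # MAD scale
        u = r / max(s, 1e-12)
        w = np.where(np.abs(u) <= k, 1.0, k / np.abs(u))  # Huber weights
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

rng = np.random.default_rng(1)
n = 250
rm = rng.normal(0, 0.01, n)                               # market return
ri = 0.0002 + 1.1 * rm + rng.normal(0, 0.01, n)           # security return
ri[-1] += 0.08                                            # one gross outlier
X = np.column_stack([np.ones(n), rm])

cutoff = 4 / (n - 1 - 1)        # k = 1 regressor beside the constant
print("flagged outliers:", int(np.sum(cooks_distance(X, ri) > cutoff)))
print("robust (alpha, beta):", huber_irls(X, ri))
```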
d. Misspecification (Marks and Musumeci 2017).

Marks and Musumeci (2017) show that concerns about the underlying distribution of stock returns, in particular high skewness and kurtosis, may make convergence toward the Central Limit Theorem (CLT) slower than the theoretical benchmark suggests. Stock returns may not even have a finite variance, according to Mandelbrot (1963) and Fama et al. (1969). The slower rate of weak convergence means that conventional statistics can lead to biased inferences when the sample size is not large enough. The results show that previous research using the Patell (1976) test may reject a true null hypothesis more often than the significance level. Marks and Musumeci (2017) use daily returns from 1926 to 2015, with a 120-day estimation period ending two days prior to the event. This leaves them with 68,934,304 observations of $E(R_{i,E})$ across 24,021 securities. They use the CRSP equally weighted index in the market model to obtain the benchmark returns
$$E(R_{i,E}) = \hat\alpha_i + \hat\beta_i R_{M,E},$$
where $i$ denotes the firm, $E$ the event day, $M$ the CRSP equally weighted index, and $\hat\alpha_i$, $\hat\beta_i$ are the ordinary least squares (OLS) estimates from the 120-day estimation period, which is required to contain at least 100 observations.

They then construct the simulated data by adding to the actual return $R_{i,E}$ a variable $\epsilon_{i,E}$ with mean $\bar\epsilon$ (0 or 0.25%) and variance equal to $\theta$ (0 or 1) times the variance of the market-model residuals during the estimation period. (This is how the event-induced variance increase is obtained.) They add this simulated event effect to the actual return on the event day and subtract the benchmark expected return to get the simulated abnormal return
$$AR_{i,E} = [R_{i,E} + \epsilon_{i,E}] - [\hat\alpha_i + \hat\beta_i R_{M,E}],$$
and then construct the standardized abnormal return, following Patell (1976),
$$SAR_{i,E} = \frac{AR_{i,E}}{\sigma_i\sqrt{1 + \dfrac{1}{T} + \dfrac{(R_{M,E} - \bar R_M)^2}{\sum_t (R_{M,t} - \bar R_M)^2}}},$$
where $\sigma_i$ is the standard deviation of the estimation-period residuals and the remaining terms account for the fact that the event day is an out-of-sample prediction.

Marks and Musumeci (2017) consider different scenarios for the simulated data. The first case has no mean effect and no variance increase (that is, $\bar\epsilon = 0$, $\theta = 0$): the Patell test tends to over-reject the null hypothesis even as the sample size increases, while the BMP test performs better. The second case has a mean change and no variance increase (that is, $\bar\epsilon = 0.25\%$, $\theta = 0$): although the Patell test is more powerful in this case, winsorizing (a procedure that reduces the impact of outliers) indicates that the Patell test's advantage seems to be due to outliers. Hence, the conventional method may misguide research toward favoring the alternative hypothesis and over-rejecting the null. They then simulate data with a variance increase ($\bar\epsilon = 0$ and $\bar\epsilon = 0.25\%$; $\theta = 1$): with a zero mean effect, the conventional Patell test is severely misspecified regardless of sample size.
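For concreteness, here is a small Python sketch of the Patell standardization written above: a market-model fit over the estimation period, followed by the out-of-sample forecast-error correction. The sample length, returns, and event-day values are simulated assumptions, not data from the paper.

```python
import numpy as np

def patell_sar(ri, rm, ri_event, rm_event):
    """Patell-style standardized abnormal return for one security."""
    T = len(ri)
    X = np.column_stack([np.ones(T), rm])
    a, b = np.linalg.lstsq(X, ri, rcond=None)[0]   # market-model OLS
    resid = ri - (a + b * rm)
    s2 = resid @ resid / (T - 2)                   # residual variance
    rbar = rm.mean()
    # Forecast-error correction for the out-of-sample event day
    corr = 1 + 1 / T + (rm_event - rbar) ** 2 / np.sum((rm - rbar) ** 2)
    ar = ri_event - (a + b * rm_event)             # event-day abnormal return
    return ar / np.sqrt(s2 * corr)

rng = np.random.default_rng(2)
T = 120
rm = rng.normal(0, 0.01, T)
ri = 1.05 * rm + rng.normal(0, 0.012, T)
print(f"SAR: {patell_sar(ri, rm, ri_event=0.03, rm_event=0.004):.3f}")
```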
When the variance is increased, the Patell test seems more powerful, but this is due to misspecification: the null hypothesis is rejected more often than the significance level whether it is true or false. Marks and Musumeci (2017) then apply Bayes' theorem to show that the conventional method commits more rejection errors when the samples are misspecified. Using the definition of conditional probability,
$$p(\text{null false}\mid\text{rejected}) = \frac{p(\text{rejected}\mid\text{null false})\,p(\text{null false})}{p(\text{rejected}\mid\text{null false})\,p(\text{null false}) + p(\text{rejected}\mid\text{null true})\,p(\text{null true})}.$$
Without any event-induced increase in variance and with a sample of 500 firms, Patell's $p(\text{null false}\mid\text{rejected}) = 70.34\%$ while BMP's is $76.86\%$; the BMP test is therefore more reliable in rejecting the null when it is false. By the same token, when Marks and Musumeci (2017) allow for an event-induced variance increase, Patell's $p(\text{null false}\mid\text{rejected}) = 47.90\%$ and BMP's is $68.58\%$. That is, the BMP test yields a $0.6858/0.4790 - 1 = 43.17\%$ increase in the probability that the null is false when the test rejects it.
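The Bayes computation above reduces to a one-line function. The inputs below (rejection probabilities under a false and a true null, and the prior probability that the null is false) are illustrative stand-ins, not the figures behind the percentages quoted from Marks and Musumeci (2017).

```python
def p_false_given_reject(power, size, p_false):
    """Posterior probability that the null is false given a rejection:
    power   = p(reject | null false),
    size    = p(reject | null true),
    p_false = prior p(null false)."""
    p_true = 1.0 - p_false
    return power * p_false / (power * p_false + size * p_true)

# Hypothetical inputs for illustration only.
print(f"{p_false_given_reject(power=0.60, size=0.05, p_false=0.165):.4f}")
```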
The results indicate that: (1) even without an event-induced increase in variance, the conventional event study method is biased toward rejecting the null hypothesis in the large samples selected; (2) with an ad hoc determination of the event day, the conventional test is usually less powerful than robust statistical inferences that allow for event-induced variance increases; (3) sample selection is essential for the empirical results, which indicates that differing empirical findings may be due to the different samples collected; and (4) the model selected for the normal returns (or conditional expectations) strongly influences the specification of the abnormal returns determined later on.

e. Sample Selection (Ahern 2009).

Ahern (2009) considers the sample-selection issue in event studies: when certain characteristics are used to group the sample, short-run event studies may produce erroneous statistical results. In fact, according to Ahern (2009), the results suggest that standard event study methods produce statistical biases in grouped samples. Differing from Brown and Warner (1985), who applied simulated event studies to random samples, the samples here are drawn non-randomly: they are constructed from the highest and lowest deciles of market equity, prior returns, book-to-market, and earnings-to-price ratios. For the prediction models, Ahern (2009) uses the characteristic-based benchmark model, the market model, the Fama–French three-factor model, and the Carhart four-factor model to construct the conditional expectations (or normal returns) for the simulated samples. Instead of using only the 1963–1979 data of Brown and Warner, Ahern (2009) uses daily returns over roughly 40 years, from 1965 to 2003, with 31 years including NASDAQ. The different samples show that the most significant biases arise for small firms and firms with low prior returns, whose portfolios tend to produce false-positive abnormal returns. On the other hand, the characteristic-based benchmark model tends to have the least bias of all the models. Ahern (2009) shows that selecting a sample on underlying firm characteristics for an event study may lead to biased predictions if a non-robust method based on market average returns is used.
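The following stylized simulation illustrates Ahern's (2009) point, under assumptions of my own rather than his design: small firms carry a return premium that a naive market-adjusted benchmark misses, so samples drawn from the small-firm decile over-reject even when there is no event effect at all.

```python
import numpy as np

rng = np.random.default_rng(3)

n_firms = 5000
size = rng.lognormal(0, 1, n_firms)                      # hypothetical market equity
small = size < np.quantile(size, 0.10)                   # smallest decile
r_event = 0.002 * small + rng.normal(0, 0.01, n_firms)   # premium, no event effect
ar = r_event - r_event.mean()                            # naive benchmark adjustment

def reject_rate(pool, n_sample=200, n_trials=2000, crit=1.96):
    """Share of random draws whose t-statistic rejects a zero mean."""
    hits = 0
    for _ in range(n_trials):
        s = rng.choice(pool, n_sample, replace=False)
        t = s.mean() / (s.std(ddof=1) / np.sqrt(n_sample))
        hits += abs(t) > crit
    return hits / n_trials

print("random sample rejection rate:    ", reject_rate(ar))
print("small-firm decile rejection rate:", reject_rate(ar[small]))
```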
Notes

1. This also shows that, even with knowledge of a possible announcement of corporate issues, the identification of event dates can still be misleading. Therefore, an arbitrary classification of the event period can easily misguide the statistical inferences on the hypotheses of interest.
2. In other words, this demonstrates that an incorrectly classified or identified event period may introduce biases into the estimation of both systematic and unsystematic risk.
3. The notation "≈" indicates that the equation is approximated by the expression that follows.
4. Notice that the setting presumes the event date to be day 0; yet the actual event date is random and can be anywhere between $-c$ and $c$.
5. The notation $r_{it}\,|\,\theta_t = 0$ abbreviates $\mathrm{Prob}[r_{it}\,|\,\theta_t = 0]$; that is, it is the conditional probability function of the return process when there is no event at day $t$. Likewise, $r_{it}\,|\,\theta_t = 1$ is the conditional probability function of the return process when there is an event at day $t$.
6. In other words, the setting $\delta_i^2 \ne 1$ states that the event-induced volatility can be either larger or smaller than $\sigma_i^2$.
7. Notice that the null hypothesis states that both conditions $A = 0$ and $\delta^2 = 1$ hold when no significant event exists in the event period. Therefore, if the null hypothesis is not true, either $A \ne 0$ or $\delta^2 \ne 1$, which covers a mean change, a change in volatility, or both.
8. The dimension of the vector $x$ is $(2c+1)\times 1$, since the event period starts at $-c$ and ends at $+c$ with midpoint $t = 0$.
9. This condition assumes that the variance of the error term within each period (namely, the estimation and event periods) is constant within that period.
10. The issue, however, does not explain why the impact of events should always increase volatility. In other words, the setting always assumes that volatility rises during the event period; in reality, not all events increase information uncertainty and hence cause volatility hikes.
11. Notice that the system of equations does not assume any time-varying coefficients for the systematic risk; hence the coefficients of the estimation period are assumed fixed.
12. Although these assumptions are somewhat strong for firms in certain industries, Karafiath and Spencer (1991) state that their empirical results from statistical tests seem to confirm these theoretical settings.
13. The claim here is rather stringent. Zero sample correlation does not necessarily imply statistical independence. If statistical independence is assumed, it implies that the parametric change in the systematic risk model is not influenced by the presumed firm-specific event; that is, under this assumption the two effects are statistically unrelated, so the structural change in the parameters (of the systematic risk models) is not influenced by the firm-specific events.
14. It can be seen that the system here does not take (conditional or unconditional) heteroscedasticity into account. Although the variances take the regime change into account, they only allow changes across the identified regimes.
15. Again, this shows that the determination of the event period is a subjective matter: the event period is chosen arbitrarily by the samplers.
16. $L_1 - 2$ is the degrees-of-freedom correction.
References

Acharya, S. 1988. A Generalized Economic Model and Tests of a Signaling Hypothesis with Two Discrete Signals. Journal of Finance 43 (2): 413–429.
Acharya, S. 1993. Value of Latent Information: Alternative Event Study Methods. Journal of Finance 48 (1): 363–385.
Ahern, K.R. 2009. Sample Selection and Event Study Estimation. Journal of Empirical Finance 16: 466–482.
Aktas, N., E. de Bodt, and J.-G. Cousin. 2007. Event Studies with a Contaminated Estimation Period. Journal of Corporate Finance 13: 129–145.
Ball, C.A., and W.N. Torous. 1988. Investigating Security-Price Performance in the Presence of Event-Date Uncertainty. Journal of Financial Economics 22 (1): 123–153.
Berkman, H., and C. Truong. 2009. Event Day 0? After-Hours Earnings Announcements. Journal of Accounting Research 47: 71–103.
Boehmer, E., J. Musumeci, and A.B. Poulsen. 1991. Event-Study Methodology under Conditions of Event-Induced Variance. Journal of Financial Economics 30: 253–272.
Brown, S.J., and J.B. Warner. 1985. Using Daily Stock Returns: The Case of Event Studies. Journal of Financial Economics 14: 3–31.
Brown, K.C., L.J. Lockwood, and S.L. Lummer. 1985. An Examination of Event Dependency and Structural Change in Security Pricing Models. Journal of Financial and Quantitative Analysis 20 (3): 315–333.
Burnett, J.E., C. Carroll, and P. Thistle. 1995. Implications of Multiple Structural Changes in Event Studies. Quarterly Review of Economics and Finance 35: 467–481.
Cable, J., and K. Holland. 1999. Modelling Normal Returns in Event Studies: A Model-Selection Approach and Pilot Study. The European Journal of Finance 5 (4): 331–341.
Collins, D.W., and W.T. Dent. 1984. A Comparison of Alternative Testing Methodologies Used in Capital Market Research. Journal of Accounting Research 22: 48–84.
Corrado, C.J. 2011. Event Studies: A Methodology Review. Accounting and Finance 51 (1): 207–234.
Eckbo, B.E. 2007. Handbook of Corporate Finance, pp. 3–36. Elsevier.
Eckbo, B.E., V. Maksimovic, and J. Williams. 1990. Consistent Estimation of Cross-Sectional Models in Event Studies. Review of Financial Studies 3: 343–365.
Fama, E., L. Fisher, M.C. Jensen, and R. Roll. 1969. The Adjustment of Stock Prices to New Information. International Economic Review 10 (1): 1–21.
Harrington, S., and D.G. Shrider. 2007. All Events Induce Variance: Analyzing Abnormal Returns When Effects Vary across Firms. Journal of Financial and Quantitative Analysis 42: 229–256.
Huber, P.J. 1973. Robust Regression: Asymptotics, Conjectures and Monte Carlo. Annals of Statistics 1: 799–821.
Karafiath, I., and D. Spencer. 1991. Statistical Inference in Multi-period Event Studies. Review of Quantitative Finance and Accounting 1: 353–371.
Klein, A., and J. Rosenfeld. 1987. The Influence of Market Conditions on Event-Study Residuals. Journal of Financial and Quantitative Analysis 22: 345–351.
Krivin, D., R. Patton, E. Rose, and D. Tabak. 2003. Determination of the Appropriate Event Window Length in Individual Stock Event Studies. SSRN Electronic Journal, November.
Lee, S.H., and O. Varela. 1997. An Investigation of Event Study Methodologies with Clustered Events and Event Day Uncertainty. Review of Quantitative Finance and Accounting 8: 211–228.
MacKinlay, A.C. 1997. Event Studies in Economics and Finance. Journal of Economic Literature 35: 13–39.
Mandelbrot, B. 1963. The Variation of Certain Speculative Prices. Journal of Business 36 (4): 394–419.
Marks, J.M., and J. Musumeci. 2017. Misspecification in Event Studies. Journal of Corporate Finance 45: 333–341.
McWilliams, A., and D.S. Siegel. 1997. Event Studies in Management Research: Theoretical and Empirical Issues. The Academy of Management Journal 40 (3): 626–657.
Nayak, S., and N.R. Prabhala. 2001. Disentangling the Dividend Information in Splits: A Decomposition Using Conditional Event-Study Methods. Review of Financial Studies 14: 1083–1116.
Patell, J. 1976. Corporate Forecasts of Earnings per Share and Stock Price Behavior: Empirical Test. Journal of Accounting Research 14 (2): 246–276.
Prabhala, N.R. 1997. Conditional Methods in Event Studies and an Equilibrium Justification for Standard Event-Study Procedures. Review of Financial Studies 10: 1–38.
Rousseeuw, P.J., and V.J. Yohai. 1984. Robust Regression by Means of S-Estimators. In Robust and Nonlinear Time Series Analysis, Lecture Notes in Statistics 26, ed. J. Franke, W. Härdle, and D. Martin, 256–272. New York: Springer.
Sorokina, N., D.E. Booth, and J.H. Thornton Jr. 2013. Robust Methods in Event Studies: Empirical Evidence and Theoretical Implications. Journal of Data Science 11: 575–606.
Yohai, V.J. 1987. High Breakdown-Point and High Efficiency Robust Estimates for Regression. Annals of Statistics 15 (2): 642–656.
Part II Alternative Approach for the Contemporaneous Event Studies
2 Assessments of Normal Returns
2.1 Introduction
The conventional approach to fitting the specification of normal returns is either to apply time series models or to apply the market model with the market index return as an explanatory variable. However, with newly available systematic information, the specification of the normal returns should adjust accordingly. More specifically, even if the same market model is applied, the parameters over an arbitrary pre-event period may change over time. Adaptive filtering should be used to control for possible systematic changes in the normal returns that are not due to the corporate (finance) events of interest. Confusion from using ad hoc methods to obtain the normal returns may lead to inconclusive or faulty conclusions in event studies when abnormal returns are subsequently used. Since the normal returns contain explanatory variables that are systematic, it is natural to require that the normal returns also exhibit cross-sectional dependence or persistence even when they have only weak time-series dependence. An earlier study by Cable and Holland (1999) emphasizes the model selection approach to specify the normal returns for event studies. Unfortunately, the model selection approach presumes the existence of a correct model
for the underlying series of interest. In particular, the model selection approach depends on the candidate set of models under consideration and ignores the possibly time-changing nature of market expectations. For these reasons concerning the correct specification of normal returns, this chapter considers various methods for updating the normal returns throughout the periods of interest. In particular, if the systematic information is already absorbed into the normal returns, where the parameters are adaptively updated, and the cumulative abnormal returns are insignificantly different from zero, this simply implies that the presumed corporate event issues are not essential enough to cause significant impacts. The reason is simple: the dichotomy of normal and abnormal returns must be clear-cut so that assessments of corporate events are not mixed with the possible systematic components of normal returns. Otherwise, (corporate) event studies can be manipulated to seek significance for presumed corporate (finance) events by minimizing the systematic adjustments to new information even while the market is adapting toward efficiency. In addition, since the normal returns are usually systematic, cross-sectional long dependence should prevail in these asset returns (see Jeng's [2018] Empirical Asset Pricing Models for more details). However, if the asset pricing models applied to estimate normal returns are subject to time-varying (or randomly varying) parameters, adaptive or recursive estimation is needed to update the information. Hence, the arbitrary sample separation into event and estimation periods is not needed.1 In that case, various adaptive/recursive filters from control theory for dynamic filtering can be applied. Nevertheless, caution is warranted: asset pricing models with time-varying parameters (such as betas) require certain assumptions on the dynamics of the varying schemes to verify the systematic requirement for the explanatory variables. Although the intent of the study is to consider the possible misspecification of the conditional capital asset pricing model, Ghysels (1998) shows that conditional capital asset pricing models may not necessarily outperform constant-beta models, since the pricing errors may be more serious when the beta risk is misspecified. Specifically, one may assume that the parameters (such as betas in the factor pricing models) follow
either a random-walk or a Markov transition model. However, if the dynamics of the parameter transition are incorrectly specified, time-varying coefficient models may not necessarily perform better than constant-coefficient models for the normal returns. This leads to the discussion of online recursive estimation of the normal returns. Recursive estimation is usually more convenient and more up to date with current information, whether the parameters (such as betas) are time- or randomly varying or not. Specifically, even when the asset pricing models are treated as constant-coefficient models, recursive estimation does not require the subjective determination of an estimation period versus an event period as in the conventional approach. Notice that the conventional approach simply assumes that the parameters of the asset pricing models for normal returns remain stable over both the estimation and event periods. Hence, discussions such as Aktas et al. (2007) present the contamination problems that arise when unrelated events occur within the predetermined estimation period, which may bias the estimates of the parameters of interest when no further continuous modification or adjustment is applied to the normal returns in the event period. Instead, the estimation of normal returns should be processed continuously over the time horizon, applying currently available information. This, in turn, yields more robust abnormal returns for event study analysis. If the parameters are randomly varying, and if the dynamics of their transition are specified correctly, recursive filtering schemes can be applied to obtain optimal control of the normal returns in the presumed asset pricing models with possibly mixing error terms. In particular, if one applies the abnormal returns to analyze the impacts of events, one needs to verify that the explanatory variables (or factors) are indeed systematic before applying adaptive/recursive estimation to the asset pricing models, whether these have time-varying parameters or not. In other words, identification or model selection for the systematic components of asset returns should precede the estimation schemes. Therefore, even if the "filtering" of automatic control theory can be applied, the intent is not to filter out all possible information or signals, including the time series dynamics of asset returns (or excess returns), from
the return series to obtain optimal control or prediction. Instead, the purpose is to filter out the systematic/non-diversifiable components from the data and to obtain the genuine abnormal returns for event studies. In other words, using recursive/adaptive estimation to filter the asset returns (or excess returns) filters out the systematic components while allowing the abnormal returns to contain unspecified serial dependence. The advantage of "filtering" is that the estimation of normal returns is processed continuously through the time horizon and is not confined by the ad hoc determination of estimation and event periods. In the following, we assume that the explanatory variables used for the normal return (or the so-called conditional expectation) are all systematic already.

Proposition 2.1.1 If the boundary for the structural change is given as a function of the time trend and the event is persistent so that the event period is long enough, the test based on cumulative sums of abnormal returns is similar to the test of structural change in the regression model.

Proof If (say) the event is significant in the event period, let $z_t = \frac{e_t}{\sigma_t} = e_t^*$ be the standardized abnormal return, where $\sigma_t$ may account for the (conditional) heteroscedasticity of the data. Then, when the event is significant, $z_t = a(t) > c_\alpha$, where $c_\alpha$ is the critical value that confirms the positive event (for a negative event, the proof can be set up in the opposite direction) and the event is persistent in the event window. Now it is easy to see that
$$\sum_{i=0}^{t_o} z_i = Z_{t_o},\qquad Z_{t_o} = \sum_{i=0}^{t_o} a(i) > c_\alpha\, t_o\,\frac{N(t_o)}{t_o} = c_\alpha\, t_o\, G(t_o) \approx F(t_o) + \epsilon_{t_o},$$
where $\epsilon_{t_o} \to 0$ as $t_o \to \infty$ by the persistence of the event, $\frac{N(t_o)}{t_o}$ and $G(\cdot)$ are the empirical distribution functions for the event, $N(t_o)$ is the number of significant events in the event period, $c_\alpha$ is the critical value that indicates the significance of the hypothesis, and $F(t)$ is the approximating function at $t_o$ serving as the boundary function of the cumulative sum test for some $t$'s, given the assumption that the events are persistent. Hence, if the events are persistent, there is a significant structural change in the cumulative sum of abnormal returns relative to the boundary function $F(t)$ whenever the conventional event study test is significant. On the other hand, if the cumulative sum of abnormal returns is greater (or less, depending on whether the event is positive or not) than the boundary function, say
$$\sum_{i=0}^{t_o} e_i^* > F(t) + \epsilon_{t_o},$$
it implies that at least one of the abnormal returns is large enough to exceed zero. Suppose this event point is at $t_\dagger$ and the event is persistent. It is easy to show that at least one abnormal return satisfies $e_{t_\dagger}^* > c_\alpha$ for $c_\alpha \approx (F(t_\dagger) + \epsilon_{t_\dagger})/t_\dagger$ from the construction above. This shows that at least one abnormal return is statistically significantly different from zero; that is, an abnormal return $e_t^*$ is nonzero, implying that the event is significant in the conventional t-test of event studies.

Furthermore, if the arbitrary scheme for determining the event or estimation period is not entirely suitable (unless one is really lucky) for the event study, the statistical inferences will not be robust enough to confirm any model for the events. The question is: since the event itself is meant to indicate an impact on stock returns, why not let the data speak for themselves? That is, let the observations generate the best conditional expectations or normal returns based on current information. Given that the conditional expectations adapt to information, it is preferable for the abnormal returns (the error terms) to modify themselves and for the system to adjust its expectations as new information becomes available, rather than relying on forecast errors under a given distributional assumption or otherwise, particularly when the periods are determined arbitrarily by subjective decisions or ad hoc methods.
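A small simulation may help visualize the proposition's logic: cumulative sums of standardized abnormal returns crossing a boundary function signal a persistent event. The boundary $3\sqrt{t}$ and the shift size below are illustrative choices of mine, not values prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(4)

T = 250
z = rng.normal(0, 1, T)        # standardized abnormal returns
z[120:] += 0.5                 # persistent positive event from t = 120 on

csum = np.cumsum(z)
t = np.arange(1, T + 1)
boundary = 3.0 * np.sqrt(t)    # illustrative boundary function F(t)

crossings = np.nonzero(csum > boundary)[0]
print("first boundary crossing at t =",
      int(crossings[0]) + 1 if crossings.size else None)
```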
Proposition 2.1.2 Assume that the presumed asset pricing models for asset returns are specified as
$$r_{it} = E[r_t\,|\,\mathcal F_t] + \epsilon_{it} = \sum_{j=1}^{k}\beta(t)_{ij}\,\varphi_{jt} + \epsilon_{it}, \qquad (2.1)$$
where $r_{it}$ represents the return of asset $i$ at time $t$, $i = 1,2,\ldots,n$, $t = 1,2,\ldots$; $\{\varphi_{jt}\}_{j=1,\ldots,k}$ are $k$ explanatory variables or factors representing the systematic components associated with each asset return; $\{\beta(t)_{ij}\}_{j=1,\ldots,k}$ represent the $k$ systematic risks for each asset return, $i = 1,2,\ldots,n$; and $\{\epsilon_{it}\}_{i=1,\ldots,n}$ represent the abnormal returns or idiosyncratic risk. Suppose that $E[r_t\,|\,\mathcal F_t]$ is the conditional expectation given the available information $\mathcal F_t$, where $\mathcal F_t$ is the sub-σ-field generated by $\{r_{it}, \varphi_t\}$, and that a recursive algorithm (such as stochastic approximation or least mean-squared errors, for example) is applied. Then the recursive algorithm generates a lower fitted error for the abnormal return in the $L_2$ sense than the ad hoc estimation-and-event-period scheme. Specifically, let $T^*$ be the number of observations that exceeds the actual event date; it can be shown that2
$$\sum_{t=0}^{T^*}\hat e_t^2 \le \sum_{t=0}^{T^*} e_t^{\dagger 2},$$
whether the system is time-invariant or time-varying, where $e_t^\dagger$ is the abnormal return generated by the ad hoc event-period method using the given estimation period; $\hat e_t = r_t - \varphi_t^T\hat\theta_t$, $\hat\theta_t = \hat\theta(r_t,\varphi_t) = \hat\theta(\mathcal F_t)$, $E(\hat e_t\hat\theta_t) = 0$; and $e_t^\dagger = r_t - \varphi_t^T\theta^\dagger$, $\theta^\dagger = \theta^\dagger(\mathcal F_{t^-})$, $E(e_t^{\dagger T}\theta^\dagger) = 0$, assuming all past information is used efficiently, where $\mathcal F_{t^-} \subseteq \mathcal F_t$ and $\theta^\dagger$ is the coefficient estimate of the conditional expectation based on the ad hoc determination and the information up to the event period $\mathcal F_{t^-}$.
Proof It is easy to see that
$$e_t^\dagger = r_t - \varphi_t^T\theta_t^\dagger = (r_t - \varphi_t^T\hat\theta_t) + \varphi_t^T(\hat\theta_t - \theta_t^\dagger) = \hat e_t + \varphi_t^T(\hat\theta_t - \theta_t^\dagger),$$
and let $T^*$ be the number of observations that exceeds (or is equal to) the actual event date. Hence, using the orthogonality conditions above so that the cross term vanishes in expectation,
$$\sum_{t=0}^{T^*} e_t^{\dagger 2} = \sum_{t=0}^{T^*}\hat e_t^2 + \sum_{t=0}^{T^*}\big(\varphi_t^T(\hat\theta_t - \theta_t^\dagger)\big)^2 = \sum_{t=0}^{T^*}\hat e_t^2 + \sum_{t=0}^{T^*} E\Big(E\big(\big(\varphi_t^T(\hat\theta_t(\mathcal F_t) - \theta_t^\dagger(\mathcal F_{t^-}))\big)^2\,\big|\,\mathcal F_t\big)\Big),$$
by the iterated rule of expectations. Therefore, $\sum_{t=0}^{T^*}\hat e_t^2 \le \sum_{t=0}^{T^*} e_t^{\dagger 2}$, since $E\big((\varphi_t^T(\hat\theta_t(\mathcal F_t) - \theta_t^\dagger(\mathcal F_{t^-})))^2\,\big|\,\mathcal F_t\big) \ge 0$ when $\mathcal F_{t^-} \subseteq \mathcal F_t$.

Therefore, whether the system is time-varying or not, the ad hoc event-period method will not necessarily produce an informative solution for the abnormal returns in event studies. In other words, given the loss of information under the ad hoc method, it is better to use recursive methods to generate the abnormal returns for the event study: they do not require the determination of the event period (or the event day) and yield the abnormal returns in almost all cases. Hence, the arbitrary choice of event period naturally ends up with a higher estimate of the sum of squared errors whether the system is time-varying or not. In other words, the estimates of variances (or other moments) under the identified events could be mistaken, and the resulting tests may not indicate the event correctly. This also implies that event studies formed with an ad hoc choice of event period can bias the final results.
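The proposition's message can be checked numerically. The sketch below, an illustration under assumptions of my own rather than the book's procedure, compares the post-event sum of squared residuals from a beta frozen at the end of a pre-event estimation window against a recursively updated beta (scalar recursive least squares with a forgetting factor of 0.98, an arbitrary illustrative choice) when the true beta drifts.

```python
import numpy as np

rng = np.random.default_rng(5)

T, T_est, lam = 500, 250, 0.98
beta = 1.0 + np.cumsum(rng.normal(0, 0.02, T))   # slowly drifting beta
x = rng.normal(0, 0.01, T)                       # market return
y = beta * x + rng.normal(0, 0.005, T)

# Ad hoc scheme: OLS beta on the estimation window, frozen afterwards.
b_fix = (x[:T_est] @ y[:T_est]) / (x[:T_est] @ x[:T_est])
e_fix = y[T_est:] - b_fix * x[T_est:]

# Recursive scheme: update a scalar RLS estimate at every t.
P, b, e_rec = 1e4, 0.0, []
for t in range(T):
    if t >= T_est:
        e_rec.append(y[t] - b * x[t])            # residual before updating
    k = P * x[t] / (lam + x[t] * P * x[t])       # gain
    b += k * (y[t] - b * x[t])
    P = (P - k * x[t] * P) / lam

print("frozen-window SSE:", float(np.sum(e_fix**2)))
print("recursive SSE:    ", float(np.sum(np.square(e_rec))))
```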
2.2 Estimation of the Normal Returns: Static or Recursive?
For the discussion of estimating normal returns for the firms of interest, it is assumed that the presumed asset pricing models already succeed in identifying the systematic explanatory variables/factors for asset returns. The task, then, is to correctly estimate and update the information for the specification of normal returns. Given that many empirical studies in corporate finance confirm that the parameters of interest (in the normal return models) are time-varying, our emphasis is on the time-varying case. Since most model specifications are, at best, approximations of the genuine data-generating mechanism for normal returns, the time-varying coefficient approach may assist in tracking the normal returns over time without the ad hoc setting of an estimation period versus an event period. Assume that the presumed asset pricing models for asset returns are specified as a time-varying coefficient model,
$$r_{it} = \sum_{j=1}^{q}\beta(t)_{ij}\,\varphi_{jt} + \epsilon_{it}, \qquad (2.2)$$
where $r_{it}$ represents the return of asset $i$ at time $t$, $i = 1,2,\ldots,n$, $t = 1,2,\ldots$; $\{\varphi_{jt}\}_{j=1,\ldots,q}$ are $q$ explanatory variables or factors representing the systematic components associated with each asset return; $\{\beta(t)_{ij}\}_{j=1,\ldots,q}$ represent the $q$ systematic risks for each asset return, which may be time-changing, $i = 1,2,\ldots,n$; and the error terms $\{\epsilon_{it}\}_{i=1,\ldots,n}$ represent the abnormal returns or idiosyncratic risk.3 Notice that this setting assumes the systematic factors are already identified. In essence, one should use model selection tests or similar tools to ensure that these factors are indeed systematic and non-diversifiable so as to facilitate the modeling of the normal returns. In particular, even in cross-sectional regressions for event studies, one needs to obtain a correct approximation of the systematic components first, before trying to identify the significance of events in
using the pre-specified dummy variable. In other words, for the so-called abnormal returns, instead of using arbitrary determination schemes that decide event incidence by event period, the system allows the parameters of the normal returns to evolve through time. The information underlying the conditional expectation of stock returns is therefore updated accordingly, since markets update information rapidly. In the following, we set the dependent variable $y_k = r_{ik}$, the rate of return of a particular asset $i$ at time $k$. Chen and Guo (1991) introduce stability conditions, the so-called conditional richness condition, for the tracking algorithms; the tracking errors remain finite when the condition holds. In our application, although the models used may not be as complicated as full adaptive control, the system applied to the conditional expectation of normal returns should be kept current with the latest information rather than relying on the ad hoc determination of estimation and event periods. The essence is that the tracking algorithms are asymptotically stable if the system is linear and the conditional richness conditions hold. Consider the time-varying system as a linear model for all $k \ge 0$; the subindex $k$ now plays the role of the time index $t$, and the coefficient is time-varying following a random-walk model. Since the explanatory variables have already passed model selection (similar to the market model with market index returns and a time-varying coefficient as a changing "beta," for instance), the analysis of time-varying systems in adaptive control is applicable to our case. Let the model be described as
$$y_k = \varphi_k^T\theta_k + \nu_k, \qquad \Delta_k = \theta_k - \theta_{k-1},$$
where $y_k$ represents the dependent variable (such as the stock return in the model for normal returns) and $\nu_k$ is the error term. The conditional richness condition is defined as follows: for an adapted sequence $\{\varphi_k, \mathcal F_k\}$ (that is, $\varphi_k$ is $\mathcal F_k$-measurable for every $k$, where $\{\mathcal F_k\}$ is a family of nondecreasing σ-algebras), there exists an integer $h > 0$ such that for all $m \ge 0$,
$$E\left[\sum_{k=m+1}^{m+h}\frac{\varphi_k\varphi_k^T}{1 + \|\varphi_k\|^2}\,\Bigg|\,\mathcal F_m\right] \ge \frac{1}{\alpha_m}\, I, \qquad (2.3)$$
where $\{\alpha_m, \mathcal F_m\}$ is an adapted nonnegative sequence satisfying
$$\alpha_{m+1} \le a\alpha_m + \eta_{m+1},\ \forall m \ge 0, \qquad E\alpha_o^{1+\delta} \le M_o < \infty,$$
with $\{\eta_m, \mathcal F_m\}$ an adapted nonnegative sequence satisfying
$$\sup_{m\ge0} E\big[\eta_{m+1}^{1+\delta}\,\big|\,\mathcal F_m\big] \le M$$
almost surely, where $a \in [0,1)$, $0 < \delta < \infty$, and $0 \le M < \infty$ are constants. Intuitively, the idea is to ensure that the explanatory variables are informative enough over time: conditional on past information, the (normalized) conditional covariances of the explanatory variables are bounded from below. This indicates that the dependent variable is trackable, using the given past information and models, through the information contained in the explanatory variables. Basically, the market will not sit idle in updating its information while rumors and related information circulate, even when the event is near. Hence, recursively modifying the conditional expectation (based on past information) seems more responsive in specifying the normal returns. According to Chen and Guo (1991), many dependence structures satisfy the conditional richness condition; for instance, the usual φ-mixing conditions in the following example are included. Let $\{\varphi_k\}$ be an r-dimensional φ-mixing process such that there exists a deterministic
sequence $\{\phi(h), h \ge 0\}$ where (i) $\phi(h) \to 0$ as $h \to \infty$; (ii)
$$\sup_{A \in \mathcal F_{s+h}^{\infty},\, B \in \mathcal F_0^{s}}\big|P(A|B) - P(A)\big| \le \phi(h), \quad \forall s \ge 0,\ \forall h \ge 0,$$
for any nonnegative integers $s \ge 0$ and $h \ge 0$, with $\mathcal F_0^{s} = \sigma\{\varphi_k, 0 \le k \le s\}$ and $\mathcal F_{s+h}^{\infty} = \sigma\{\varphi_k, s+h \le k < \infty\}$ the separated σ-fields. Suppose that
$$\inf_k \lambda_{\min}\big(E\varphi_k\varphi_k^T\big) > 0, \qquad \sup_k E\|\varphi_k\|^4 < \infty;$$
then the conditional richness condition holds with $\mathcal F_m = \mathcal F_0^{m}$.

Theorem 2.2.1 (Chen and Guo (1991)): For any $\mathcal F_{m+h}^{\infty}$-measurable scalar function $x_{m+h}$ with $|x_{m+h}| \le 1$,
$$\big|E[x_{m+h}\,|\,\mathcal F_m] - Ex_{m+h}\big| \le 2\phi(h),$$
and for any random vector $\varphi$,
$$\lambda_{\min}\left(E\frac{\varphi\varphi^T}{1 + \|\varphi\|^2}\right) \ge \frac{\big[\lambda_{\min}(E\varphi\varphi^T)\big]^2}{E\big[\|\varphi\|^2 + \|\varphi\|^4\big]}.$$

Proof Let $x_{m+h}$ first be a simple indicator function, $x_{m+h} = I_A$ with $A \in \mathcal F_{m+h}^{\infty}$. Suppose there is a $B \in \mathcal F_m$ with $P(B) > 0$ such that $|E[x_{m+h}|\mathcal F_m] - Ex_{m+h}| > \phi(h)$ for $\omega \in B$. Using this and mixing property (ii) gives
$$\phi(h) < \left|\frac{1}{P(B)}\int_B\big(E[x_{m+h}|\mathcal F_m] - P(A)\big)\,dP\right| = \left|\frac{1}{P(B)}\big[P(AB) - P(B)P(A)\big]\right| \le \phi(h),$$
which is a contradiction according to the mixing condition. Hence, for each indicator function $x_{m+h}$, the stated inequality holds.
Now, for the general case, let $x_{m+h}$ be approximated by a step function
$$x_{m+h} = \sum_i a_i I_{A_i},$$
where $\{A_i\}$ is a finite decomposition of the sample space into disjoint elements of $\mathcal F_{m+h}^{\infty}$. Since $|x_{m+h}| \le 1$ implies $|a_i| \le 1$,
$$\big|E[x_{m+h}|\mathcal F_m] - Ex_{m+h}\big| = \Big|\sum_i a_i\big[P(A_i|\mathcal F_m) - P(A_i)\big]\Big| \le \sum_i\big|P(A_i|\mathcal F_m) - P(A_i)\big|.$$
Now denote the following σ-fields
$$C^+ = \bigcup\{A_i : P(A_i|\mathcal F_m) - P(A_i) > 0\}, \qquad C^- = \bigcup\{A_i : P(A_i|\mathcal F_m) - P(A_i) \le 0\},$$
where $C^+$ and $C^-$ belong to $\mathcal F_{m+h}^{\infty}$. Then
$$\sum_i\big|P(A_i|\mathcal F_m) - P(A_i)\big| = \big[P(C^+|\mathcal F_m) - P(C^+)\big] + \big[P(C^-) - P(C^-|\mathcal F_m)\big] \le 2\phi(h),$$
which implies the first inequality. Now let $x$ be the eigenvector associated with $\lambda_{\min}\big(E\frac{\varphi\varphi^T}{1+\|\varphi\|^2}\big)$; then, using $|x^T\varphi| \le \|\varphi\|$ and the Schwartz inequality, we see that
$$\big[\lambda_{\min}(E\varphi\varphi^T)\big]^2 \le \left(E\left[\frac{|x^T\varphi|}{\sqrt{1+\|\varphi\|^2}}\cdot|x^T\varphi|\sqrt{1+\|\varphi\|^2}\right]\right)^2 \le E\left[\frac{x^T\varphi\varphi^T x}{1+\|\varphi\|^2}\right]E\big[\|\varphi\|^2(1+\|\varphi\|^2)\big].$$
Therefore, for any $m \ge 0$,
$$\left\|E\left[\frac{\varphi_{m+h}\varphi_{m+h}^T}{1 + \|\varphi_{m+h}\|^2}\,\Bigg|\,\mathcal F_m\right] - E\frac{\varphi_{m+h}\varphi_{m+h}^T}{1 + \|\varphi_{m+h}\|^2}\right\| \le 2r\phi(h).$$
Hence, since $\phi(h) \to 0$ as $h \to \infty$, there exists a constant $\alpha > 0$ such that for all $m \ge 0$ and large $h$,
$$E\left[\frac{\varphi_{m+h}\varphi_{m+h}^T}{1 + \|\varphi_{m+h}\|^2}\,\Bigg|\,\mathcal F_m\right] \ge \alpha I,$$
which shows that the conditional richness condition (2.3) holds.

In our applications, we can take explanatory variables such as the market index return in the market model, impose the conditional richness condition above, and obtain the conditional expectation based on past and current information while allowing the "beta" to be time-varying. This scheme thereby accommodates possibly time-varying systems for the parameters of concern in the systematic risk models. Hence, the normal returns stay current with the latest information without requiring the concurrent determination of the event period and the estimation period. Given this, there is no need to separate the estimation period and the event period in event studies. In fact, since we can track the system as it evolves, the so-called abnormal returns are calculated online with the recursive algorithms used; that is, the abnormal returns are obtained simultaneously with the system. Hence, there is no loss of information, and the ad hoc decision on the estimation and event periods is no longer meaningful for the analysis. In other words, the abnormal returns are approximated by more precise residuals when concurrent information is applied, and the conclusions of event studies are more trustworthy for the statistical inferences applied thereafter.
Now, for the tracking algorithm, we can start with the Kalman filter of the form
$$\hat\theta_{k+1} = \hat\theta_k + \frac{P_k\varphi_k}{R + \varphi_k^T P_k\varphi_k}\big(y_k - \varphi_k^T\hat\theta_k\big),$$
$$P_{k+1} = P_k - \frac{P_k\varphi_k\varphi_k^T P_k}{R + \varphi_k^T P_k\varphi_k} + Q,$$
where $P_o \ge 0$, $R > 0$, $Q > 0$, and $\hat\theta_o$ are deterministic. ($R$ and $Q$ are the a priori estimates for the variances of $\nu_k$ and $\Delta_k$, respectively.) Chen and Guo (1991) establish the following theorems for the tracking errors of the tracking algorithm (due to the tedious technicality, the proofs are not shown in detail).

Theorem 2.2.2 (Chen and Guo (1991)): Suppose that $\{\nu_k, \Delta_k\}$ is a stochastic sequence such that for some $p > 0$ and $\beta > 1$,
$$\sigma_p \triangleq \sup_{k\ge0} E\big\{Z_k^p\,[\log(e + Z_k)]^{\beta + 3p/2}\big\} < \infty,$$
and
$$E\big\{\|\tilde\theta_o\|^p\,[\log(e + \|\tilde\theta_o\|)]^{p/2}\big\} < \infty,$$
where $Z_k = \|\nu_k\| + \|\Delta_{k+1}\|$ and $\tilde\theta_o = \theta_o - \hat\theta_o$. Then, under the conditional richness condition (2.3), the estimation error $\{\theta_k - \hat\theta_k\}$ is $L_p$-stable and
$$\limsup_{k\to\infty} E\|\theta_k - \hat\theta_k\|^p \le A\big[\sigma_p\log^{1+3p/2}(e + \sigma_p^{-1})\big], \qquad (2.4)$$
where $A$ is a constant. This shows that the estimation errors are finite and tracking is feasible with bounded errors. Therefore, the algorithm allows us to approximate the dependent variable as closely as possible, and the conditional expectations based on concurrent information are obtained without the arbitrary determination of estimation and event periods. The next theorem shows that if the conditional richness condition is satisfied, the moments of $\|P_k\|$, $k = 0,1,\ldots$, exist in a small neighborhood of the origin and are uniformly bounded in $k$.
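A compact Python sketch of the Kalman recursion displayed above, tracking a random-walk beta in a market model, may make the algorithm tangible. The state dynamics and the a priori variances R and Q below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

T = 600
alpha, beta = 0.0002, 1.0
theta_true = np.zeros((T, 2))
for t in range(T):
    beta += rng.normal(0, 0.01)                 # random-walk beta
    theta_true[t] = (alpha, beta)

rm = rng.normal(0, 0.01, T)                      # market return
Phi = np.column_stack([np.ones(T), rm])          # regressor phi_k = (1, r_m)
y = np.einsum('ij,ij->i', Phi, theta_true) + rng.normal(0, 0.005, T)

R, Q = 0.005**2, np.diag([0.0, 0.01**2])         # a priori noise variances
theta, P = np.zeros(2), np.eye(2)
est = np.zeros((T, 2))
for k in range(T):
    phi = Phi[k]
    denom = R + phi @ P @ phi
    gain = P @ phi / denom                       # Kalman gain
    theta = theta + gain * (y[k] - phi @ theta)
    P = P - np.outer(P @ phi, phi @ P) / denom + Q
    est[k] = theta

err = np.abs(est[-100:, 1] - theta_true[-100:, 1])
print("mean |beta tracking error| over the last 100 steps:", err.mean())
```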
Theorem 2.2.3 (Chen and Guo (1991)): For $\{P_k\}$ recursively generated as above, if the conditional richness condition (2.3) holds, there exists a constant $\epsilon^* > 0$ such that for any $\epsilon \in [0, \epsilon^*)$,
$$\sup_{k>0} E\exp\{\epsilon\|P_k\|\} \le C, \qquad \limsup_{k\to\infty}\frac{1}{k}\sum_{i=0}^{k}\exp\{\epsilon\|P_i\|\} \le C'$$
almost surely, where $C$ and $C'$ are constants. In addition, Chen and Guo (1991) show that for $\{P_k\}$ generated by the above algorithm, if the conditional richness condition is likewise satisfied, then
(i) $\sup_{k\ge0} E\|P_k\|^m < \infty$ for all $m > 0$;
(ii) $\limsup_{k\to\infty}\frac{1}{k}\sum_{i=0}^{k}\|P_i\|^m \le c < \infty$ a.s. for all $m > 0$;
(iii) $\|P_k\| = O(\log k)$ a.s. as $k \to \infty$.

Hence, based on these results, Chen and Guo (1991) establish the following theorems for the estimation errors, particularly their convergence. This indicates that, with various recursive algorithms, it is feasible to track the parameters of interest when the time-varying coefficient model is given as before. The time-varying systematic risk model can thus be tracked with the available information, whereas the conventional method, which sets arbitrary estimation and event periods while assuming the coefficients of the systematic risk models are unchanged, is very subjective.

Theorem 2.2.4 (Chen and Guo (1991)): Consider the time-varying model as above. Suppose that $\{\nu_k, \Delta_k\}$ is a stochastic sequence satisfying, for some $p > 0$ and $\beta > 1$,
$$\sigma_p \triangleq \sup_{k\ge0} E\big\{Z_k^p\,[\log(e + Z_k)]^{\beta + \frac{3p}{2}}\big\} < \infty,$$
and
$$E\big\{\|\tilde\theta_o\|^p\,\big[\log(e + \|\tilde\theta_o\|)\big]^{p/2}\big\} < \infty,$$
where $Z_k = \|\nu_k\| + \|\Delta_{k+1}\|$, $\tilde\theta_o = \theta_o - \hat\theta_o$, and $\nu_k$, $\Delta_k$, $\theta_o$, $\hat\theta_o$ are given by the time-varying system and the recursive algorithm. Then, under the conditional richness condition, the estimation error $\{\theta_k - \hat\theta_k, k \ge 0\}$ is $L_p$-stable and
$$\limsup_{k\to\infty} E\|\theta_k - \hat\theta_k\|^p \le A\big[\sigma_p\log^{1+\frac{3p}{2}}(e + \sigma_p^{-1})\big], \qquad (2.5)$$
where $A$ is a constant depending only on $h$, $a$, $M$, $M_o$, and $\delta$. In addition, if $\nu_k \equiv 0$ and $\Delta_k \equiv 0$ (that is, $\theta_k \equiv \theta_o$), then as $k \to \infty$, $E\|\theta_k - \hat\theta_k\|^p \to 0$, and $E\|\theta_k - \hat\theta_k\|^q \to 0$ exponentially fast for any $q \in (0, p)$.

Theorem 2.2.5 (Chen and Guo (1991)): Consider the above time-varying system, where $\{\nu_k, \Delta_k\}$ is a stochastic sequence such that for some $p > 0$,
$$\epsilon_p \triangleq \limsup_{k\to\infty}\frac{1}{k}\sum_{i=0}^{k-1}\big\{\|\nu_i\|^p + \|\Delta_{i+1}\|^p\big\} < \infty \quad a.s.$$
Then, under the conditional richness condition, $\{\hat\theta_k - \theta_k, k > 0\}$ is $L_q$-stable in the time-average sense for any $q \in (0, p)$ and
$$\limsup_{k\to\infty}\frac{1}{k}\sum_{i=0}^{k}\|\hat\theta_i - \theta_i\|^q \le B(\epsilon_p)^{q/p}, \qquad (2.6)$$
where $B$ is a constant depending on $q$, $h$, $a$, $M$, $M_o$, and $\delta$, but independent of the sample path. Furthermore, if $\nu_k \equiv 0$ and $\theta_k \equiv \theta_o$, then $\hat\theta_k \to \theta_o$ a.s. exponentially fast.

We can now analyze the least mean squares (LMS)-like algorithms
$$\hat\theta_{k+1} = \hat\theta_k + \mu\varphi_k\big(y_k - \varphi_k^T\hat\theta_k\big), \quad k \ge 0,$$
where $\mu \in (0,1)$ is a positive constant called the step size. In particular, we introduce the following conditional richness condition for the LMS-like algorithms.

Conditional richness condition for LMS-like algorithms: For the LMS algorithms with $\mu_n \in \mathcal F_n$, the regressor $\{\varphi_n, \mathcal F_n\}$ is said to satisfy the conditional richness condition if
$$\mu_m\|\varphi_m\|^2 \le 1, \qquad E\left[\sum_{k=m+1}^{m+h}\mu_k\varphi_k\varphi_k^T\,\Bigg|\,\mathcal F_m\right] \ge \frac{1}{\alpha_m}\, I \quad a.s.\ \forall m \ge 0,$$
where $h$ is a positive constant and $\{\alpha_m, \mathcal F_m\}$ is a nonnegative sequence satisfying $\alpha_m \ge 1$ and
$$\alpha_{m+1} \le a\alpha_m + \eta_{m+1},\ \forall m \ge 0, \qquad E\alpha_o^{1+\delta} < \infty,$$
where $a \in (0,1)$ and $\{\eta_m, \mathcal F_m\}$ is a nonnegative sequence such that
$$\sup_{m\ge0} E\big[\eta_{m+1}^{1+\delta}\,\big|\,\mathcal F_m\big] \le M \quad a.s.,$$
with $\delta > 0$ and $M < \infty$ being constants.
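For concreteness, the following sketch runs an LMS-like recursion with the normalized gain $\mu\varphi_k/(1 + \|\varphi_k\|^2)$, one simple way to satisfy the requirement $\mu_m\|\varphi_m\|^2 \le 1$ above; the drift and noise levels are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

T, mu = 2000, 0.5
beta, theta = 1.0, np.zeros(2)
errs = []
for k in range(T):
    beta += rng.normal(0, 0.002)                 # slow parameter drift
    phi = np.array([1.0, rng.normal(0, 1.0)])    # regressor (1, factor)
    y = 0.0002 + beta * phi[1] + rng.normal(0, 0.05)
    pred_err = y - phi @ theta
    theta = theta + mu * phi / (1.0 + phi @ phi) * pred_err  # normalized LMS step
    errs.append(theta[1] - beta)

print("RMS beta tracking error (last 500 steps):",
      float(np.sqrt(np.mean(np.square(errs[-500:])))))
```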
Theorem 2.2.6 (Chen and Guo (1991)): Consider the time-varying model as above. Suppose the conditional richness condition is satisfied and that for some $\alpha > 0$,
$$\sigma_\alpha \triangleq \sup_{n\ge0} E\big(|\nu_n|^\alpha + \|\Delta_n\|^\alpha\big) < \infty, \qquad E\|\tilde\theta_o\|^\alpha < \infty.$$
Then the tracking error $\tilde\theta_k = \theta_k - \hat\theta_k$ of the LMS algorithm has the property
$$\limsup_{n\to\infty} E\|\tilde\theta_n\|^\beta \le c\,(\sigma_\alpha)^{\beta/\alpha}, \quad \forall\beta \in (0,\alpha),$$
where $c$ is a positive constant. Moreover, if $\nu_n \equiv 0$ and $\Delta_n \equiv 0$, then $E\|\tilde\theta_{n+1}\|^\beta \to 0$ exponentially fast for all $\beta \in (0,\alpha)$, and $\tilde\theta_{n+1} \to 0$ a.s. exponentially fast as $n \to \infty$.

Hence, if the system is time-varying, the tracking errors are finite; and if the system is time-invariant, the tracking converges exponentially fast. In other words, in forming the conditional expectation of the normal returns, we can use either system to track the possibly time-changing pattern of the normal returns and obtain the abnormal returns from up-to-date information without the ad hoc determination of an event period. Hence, there is no need to separate the estimation and event periods.

Guo (1994) explains the stability of recursive stochastic tracking algorithms, including the Kalman filter, recursive least squares, and least mean squares, and shows that various dependence conditions on the system are feasible. The idea is that the algorithms, including their tracking errors, rely on the random linear equation $x_{n+1} = (I - A_n)x_n$; hence the stability of the algorithms depends on conditions on the random matrices $\{A_n\}$. For example, substituting the time-varying equations above for the similar time-varying system and using the definitions $\Delta_k = \theta_k - \theta_{k-1}$ and $\tilde\theta_k = \theta_k - \hat\theta_k$, we have
$$\tilde\theta_{k+1} = (I - L_k\varphi_k^T)\tilde\theta_k - L_k\nu_k + \Delta_{k+1}, \quad k \ge 0.$$
This equation follows the general form of the linear equation
$$x_{k+1} = (I - A_k)x_k + \xi_{k+1}, \quad k \ge 0,$$
where $\{A_k\}$ is a sequence of (say) $d\times d$ random matrices, $\{\xi_{k+1}\}$ represents the disturbances, and $L_k$ is the adaptation gain of the particular tracking algorithm. For the Kalman filtering algorithm we have
$$L_k = \frac{P_k\varphi_k}{R + \varphi_k^T P_k\varphi_k}, \qquad P_{k+1} = P_k - \frac{P_k\varphi_k\varphi_k^T P_k}{R + \varphi_k^T P_k\varphi_k} + Q, \qquad (2.7)$$
where $P_o \ge 0$, $R > 0$, $Q > 0$, and $\hat\theta_o$ are deterministic and can be arbitrarily chosen. For least mean squares,
$$L_k = \mu\,\frac{\varphi_k}{1 + \|\varphi_k\|^2}. \qquad (2.8)$$
For recursive least squares,
$$L_k = \frac{P_k\varphi_k}{\alpha + \varphi_k^T P_k\varphi_k}, \qquad P_{k+1} = \frac{1}{\alpha}\left(P_k - \frac{P_k\varphi_k\varphi_k^T P_k}{\alpha + \varphi_k^T P_k\varphi_k}\right), \qquad (2.9)$$
where $P_o > 0$ and $\alpha \in (0,1)$ is a forgetting factor.

Guo (1994) introduces a similar excitation condition for the above recursive algorithms (i.e., $\varphi_k$ is $\mathcal F_k$-measurable for all $k$, where $\{\mathcal F_k\}$ is a
sequence of nondecreasing σ-algebras) and there exists an integer $h > 0$ such that $\{\lambda_k\} \in S^o(\lambda)$ for some $\lambda \in (0,1)$, where $\lambda_k$ is defined by
$$\lambda_k \triangleq \lambda_{\min}\left(E\left[\frac{1}{1+h}\sum_{i=kh+1}^{(k+1)h}\frac{\varphi_i\varphi_i^T}{1 + \|\varphi_i\|^2}\,\Bigg|\,\mathcal F_{kh}\right]\right), \qquad (2.10)$$
and
$$S^o(\lambda) = \Big\{a : a_k \in [0,1],\ E\prod_{j=i+1}^{k}(1 - a_j) \le M\lambda^{k-i},\ \forall k \ge i,\ \forall i \ge 0\Big\}$$
for some $M > 0$; $\lambda$ is the parameter reflecting the stability margin. With this definition of the stability of the tracking system, Guo (1994) shows the following proposition for φ-mixing explanatory variables. This shows that many dependence situations for the explanatory variables can be accommodated in the time-varying system, so the tracking algorithms can be applied to the normal returns in our case. In other words, if the excitation condition (the conditional richness condition) is satisfied and the underlying system for the normal returns is time-varying, the (moments of the) tracking errors of the recursive algorithms are finitely bounded. Hence, the normal returns of the underlying system can be obtained with the most up-to-date information, without the artificial truncation of the event period in studies of corporate events, even when the underlying system is time-varying.

Proposition 2.2.7 (Guo (1994)): Let $\{\varphi_k\}$ be a φ-mixing process. Then the necessary and sufficient condition for the above condition to be satisfied is that there exists an integer $h > 0$ such that
$$\inf_{k\ge0}\lambda_{\min}\left(E\left[\sum_{i=kh+1}^{(k+1)h}\frac{\varphi_i\varphi_i^T}{1 + \|\varphi_i\|^2}\right]\right) > 0.$$
For dependence structures other than φ-mixing, Guo (1994) shows that the same excitation condition will hold; for brevity, the elaboration of those cases is omitted.

Tracking error bounds:
Lemma 2.2.8 (Guo (1994)): Let $\{c_{nk}, n \ge k \ge 0\}$, $\{d_{nk}, n \ge k \ge 0\}$, and $\{\xi_k, k \ge 0\}$ be three nonnegative random processes satisfying:
(i) for some $M > 0$, $c_{nk} \in [0,1]$, $Ec_{nk} \le M\lambda^{n-k}$, $n \ge k \ge 0$, $\lambda \in [0,1]$;
(ii) there exist constants $\epsilon > 0$ and $\alpha > 0$ such that $\sup_{n\ge k>0} E\exp\big(\epsilon d_{nk}^{1/\alpha}\big) < \infty$;
(iii) $\sigma_p \triangleq \sup_k\|\xi_k\log^\beta(e + \xi_k)\|_{L_p} < \infty$ for some $p \ge 1$ and $\beta > 0$.
Then, for all $n \ge 0$,
$$\sum_{k=0}^{n}\|c_{nk}d_{nk}\xi_k\|_{L_p} \le c\,\sigma_p f(\sigma_p^{-1}),$$
where $c$ is a constant independent of $\sigma_p$ and
$$f(\sigma_p^{-1}) = \begin{cases}\log^{1+(\beta/2)}(e + \sigma_p^{-1}), & \text{if } \beta > 2\max(1,\alpha),\\[2pt] \log^\beta(e + \sigma_p^{-1}), & \text{if } \{c_{nk}\}\ \text{is deterministic},\ \beta = \alpha,\\[2pt] \log(e + \sigma_p^{-1}), & \text{if } \{d_{nk}\}\ \text{is deterministic},\ \beta > 1.\end{cases}$$

Guo (1994) also establishes several lemmas before proving the finiteness of the tracking errors of the Kalman filter and the other recursive algorithms. The work indicates that the time-varying parameters of hypothesized models for normal returns can be tracked over time if the information is updated accordingly. Hence, there is no need to arbitrarily determine the estimation and event periods: tracking with time-varying parameters can estimate the normal returns even when the parameters alternate only moderately or periodically over time. This gives a more precise estimate of the normal returns (and hence the abnormal returns) conditional on current information. The following arguments were developed for estimation in information science, where optimal control is the concern; they are equally applicable to our optimal search for the normal returns in corporate finance.
Lemma 2.2.9 (Guo (1994)): For $\{P_k\}$ generated by the Kalman filter algorithm, if the excitation condition (2.10) holds, then there exists a constant $\epsilon^* > 0$ such that for any $\epsilon \in [0, \epsilon^*]$,
$$\sup_{k\ge0} E\exp(\epsilon\|P_k\|) < \infty.$$

Lemma 2.2.10 (Guo (1994)): Let $\{P_k\}$ be generated by the Kalman filter algorithm. Then, under the excitation condition (2.10), for any $\mu \in (0,1)$ there exists a constant $\lambda \in (0,1)$ such that $\{\mu/(1 + \|Q^{-1}\|\cdot\|P_k\|)\} \in S^o(\lambda)$.

Guo (1994) then introduces the following theorem for the recursive Lyapunov equation for random matrices,
$$P_{k+1} = (I - A_k)P_k(I - A_k)^T + Q_k, \qquad P_o > 0,\ k \ge 0.$$

Theorem 2.2.11 (Guo (1994)): Let $\{A_k\}$ be a sequence of $d\times d$ random matrices and $\{Q_k\}$ a sequence of positive definite random matrices. Then, for $\{P_k\}$ recursively determined by the above equation, we have for all $n > m$,
$$\Big\|\prod_{k=m}^{n-1}(I - A_k)\Big\|^2 \le \left[\prod_{k=m}^{n-1}\left(1 - \frac{1}{1 + \|Q_k^{-1}P_{k+1}\|}\right)\right]\|P_n\|\cdot\|P_m^{-1}\|.$$
Hence, if $\{P_k\}$ satisfies the following two conditions:
(i) $\Big\{\dfrac{1}{1 + \|Q_k^{-1}P_{k+1}\|}\Big\} \in S^o(\lambda)$, $\lambda \in [0,1)$;
(ii) $\sup_{n\ge m\ge o}\big\|\,\|P_n\|\cdot\|P_m^{-1}\|\,\big\|_{L_p} < \infty$, $p \ge 1$;
then $\{A_k\} \in S_p(\lambda^{1/2p})$, where $S_p(\lambda)$ is the stability condition for a sequence of random matrices $A = \{A_k\}$ with a parameter $\lambda \in (0,1)$ of exciting order $p$ ($p \ge 1$):
$$S_p(\lambda) = \left\{A : \Big\|\prod_{j=i+1}^{k}(I - A_j)\Big\|_{L_p} \le M\lambda^{k-i},\ \forall k > i,\ \forall i \ge 0,\ M > 0\right\}.$$
Then Guo (1994) shows the following theorem for the Kalman filter algorithm for the recursive estimation of the time-varying system. The theorem shows the finiteness of the tracking errors: even though the system is time-varying, the recursive tracking algorithms can achieve a good fit of the system.

Theorem 2.2.12 (Guo (1994)): Consider the time-varying system as above and the Kalman filter algorithm. Suppose the excitation condition (2.10) is satisfied and that for some $p \ge 1$ and $\beta > 2$,
$$\sigma_p \triangleq \sup_k\|\xi_k\log^\beta(e + \xi_k)\|_{L_p} < \infty \qquad (2.11)$$
and
$$\|\tilde\theta_o\|_{L_{2p}} < \infty, \qquad (2.12)$$
where $\xi_k = |\nu_k| + \|\Delta_{k+1}\|$, $\tilde\theta_o = \theta_o - \hat\theta_o$, and $\nu_k$ and $\Delta_{k+1}$ are defined as above. Then the tracking error $\{\theta_k - \hat\theta_k\}$ is $L_p$-stable and
$$\limsup_{k\to\infty}\|\theta_k - \hat\theta_k\|_{L_p} \le c\big[\sigma_p\log^{1+\beta/2}(e + \sigma_p^{-1})\big], \qquad (2.13)$$
where $c$ is a finite constant depending only on $\{\varphi_k\}$, $R$, $Q$, and $p$; its precise value may be found in the proof.

Proof From the system, we can rewrite the recursion as
$$P_{k+1} = (I - L_k\varphi_k^T)P_k(I - L_k\varphi_k^T)^T + Q_k,$$
where Q̄_k = R L_k L_k^T + Q, so that Q̄_k ≥ Q and P_{k+1} ≥ Q. By applying Theorem 2.2.11 we have, for all n > m,

|| ∏_{k=m}^{n−1} (I − L_k ϕ_k^T) || ≤ [ ∏_{k=m}^{n−1} ( 1 − 1/(1 + ||Q^{−1}||·||P_{k+1}||) ) ]^{1/2} ||P_n||^{1/2} ||Q^{−1}||^{1/2}.

Note that ||L_k|| ≤ ||P_k||^{1/2} (2√R)^{−1}, so that

||θ̃_{k+1}||_{L_p} ≤ || ∏_{i=0}^{k} (I − L_i ϕ_i^T) θ̃_o ||_{L_p} + Σ_{i=0}^{k} || ∏_{j=i+1}^{k} [ 1 − 1/(2(1 + ||Q^{−1}||·||P_{j+1}||)) ] ( 1 + ||P_{k+1}||^{1/2}||P_i||^{1/2}/(2√R) ) ξ_i ||_{L_p}.

Now, by Lemma 2.2.9 and the Schwarz inequality, we have

sup_{k≥i} E exp( ε ||P_{k+1}||^{1/2}||P_i||^{1/2} ) ≤ sup_{k≥i} [E exp(ε||P_{k+1}||)]^{1/2} [E exp(ε||P_i||)]^{1/2} < ∞.
We can now investigate the tracking errors of the least mean squares algorithm and show that the time-varying parameters of the regression model (such as the time-varying beta of the asset pricing model) can be obtained without arbitrarily determining the time horizons for event periods. In that case, the updating of the parameter estimates is concurrent. Hence, the determination of the period of the event incidence is replaced by updating the information online, not by the subjective determination of sampling schemes.

Theorem 2.2.12 (Guo (1994)): Consider the time-varying model described above and the least mean squares algorithm as defined. Suppose the excitation condition (2.10) holds and that, for some p ≥ 1 and β > 1, the conditions (2.11) and (2.12) hold. Then {θ_k − θ̂_k} is L_p-stable, and

limsup_{k→∞} ||θ_k − θ̂_k||_{L_p} ≤ c [σ_p log(e + σ_p^{−1})],  (2.14)

where σ_p is defined by equation (2.11) and c is a constant.
Proof: Let c_{ki} = ∏_{j=i+1}^{k} ( I − μ ϕ_j ϕ_j^T/(1 + ||ϕ_j||²) ). Then it can be shown that {c_{ki}} satisfies the conditions in Proposition 2.2.7. Noting that ||L_k|| ≤ μ, we have

||θ̃_{k+1}||_{L_p} ≤ ||c_{k,−1} θ̃_o||_{L_p} + Σ_{i=0}^{k} ||c_{ki} ξ_i||_{L_p},
and by the result of Lemma 2.2.8 we get the desired result.

Guo (1994) then introduces another lemma for the recursive least squares algorithm. In particular, some moment conditions are given for the recursion with the forgetting factor in the recursive least squares.

Lemma 2.2.13 (Guo (1994)): Let {P_k} be generated as above for the recursive least squares algorithm with forgetting factor α ∈ (0, 1). If the excitation condition (2.10) is satisfied, then for any p ≥ 1

sup_{k≥0} E||P_k||^p < ∞,  (2.15)

provided that α satisfies λ^{[16hd(2h−1)p]^{−1}} < α < 1, where λ and h are given by the excitation condition (2.10) and d is the dimension of {ϕ_k}.

Theorem 2.2.14 (Guo (1994)): Consider the time-varying system as above with the forgetting factor as given. Suppose the following conditions are satisfied:

(i) λ_m ∈ S^o(λ) for some λ ∈ (0, 1) and some integer h > 0, where λ_m is defined by the excitation condition;
(ii) for some p ≥ 1, sup_k ( ||ν_k||_{L_{3p}} + ||ω_k||_{L_{3p}} ) ≤ σ_{3p};
(iii) sup_k ||ϕ_k||_{L_{6p}} < ∞;
(iv) the forgetting factor α satisfies λ^{[48hd(2h−1)p]^{−1}} < α < 1, where d is the dimension of {ϕ_k}.

Then there exists a constant c such that

limsup_{k→∞} ||θ_k − θ̂_k||_{L_p} < c σ_{3p}.  (2.16)
Proof: By the matrix inverse formula, it follows that

P_{k+1}^{−1} = α P_k^{−1} + ϕ_k ϕ_k^T.

Multiplying by P_k^{−1} and using the recursive least squares algorithm, we get [I − L_k ϕ_k^T] = α P_{k+1} P_k^{−1} and

∏_{j=i+1}^{k} (I − L_j ϕ_j^T) = α^{k−i} P_{k+1} P_{i+1}^{−1}.

Multiplying by ϕ_k on both sides of the P_k recursion in the recursive least squares algorithm, we have P_{k+1}^{−1} L_k = ϕ_k. Hence, by the recursion of the system and the above,

||θ_{k+1} − θ̂_{k+1}||_{L_p} ≤ α^k ||P_{k+1} P_o^{−1} θ̃_o||_{L_p} + Σ_{i=0}^{k} α^{k−i} ( ||P_{k+1} ϕ_i ν_i||_{L_p} + ||P_{k+1} P_{i+1}^{−1} ω_{i+1}||_{L_p} ).

Now, by the Hölder inequality, assumptions (i)–(iv), and Lemma 2.2.13, or from the above matrix inversion formula and assumption (iii),

||P_{k+1}^{−1}||_{L_{3p}} ≤ α ||P_k^{−1}||_{L_{3p}} + ||ϕ_k||²_{L_{6p}}, ∀k ≥ 0,

we have sup_i ||P_{i+1}^{−1}||_{L_{3p}} < ∞.

Guo and Ljung (1995) introduce a further extension of the analysis for a similar time-varying system. The variance-covariance matrix of the tracking error can be approximated closely if suitable conditions are given. Let the time-varying coefficient system be given as follows:
y_k = ϕ_k^T θ_k + ν_k, k ≥ 0,
where
θ_k = θ_{k−1} + γ ω_k,
where γ is a scaling constant and ω_k is the noise. The adaptive tracking and estimation algorithms for the unknown parameters θ_k usually have the form θ̂_k = θ̂_k(y^k, ϕ^k, θ̂^{k−1}), where the superscript denotes the time history, such as y^k = (y_o, y_1, . . . , y_k). In our application, the time-varying system is the regression of interest for the normal return (or conditional expectation). Denote the tracking error as θ̃_k = θ_k − θ̂_k, with covariance matrix Π_k = E[θ̃_k θ̃_k^T]. The general tracking algorithm can be written as

θ̂_{k+1} = θ̂_k + μ L_k (y_k − ϕ_k^T θ̂_k), μ ∈ (0, 1),

where the gain matrix may be chosen in different ways. For the Least Mean Squares case, as in many adaptive signal processing applications, we choose L_k = ϕ_k. For the Recursive Least Squares we choose L_k = P_k ϕ_k, with

P_k = (1/(1 − μ)) [ P_{k−1} − μ P_{k−1} ϕ_k ϕ_k^T P_{k−1} / (1 − μ + μ ϕ_k^T P_{k−1} ϕ_k) ], P_o > 0.
This gives an estimate θ̂_k that minimizes

Σ_{t=1}^{k} (1 − μ)^{k−t} (y_t − ϕ_t^T θ)²,

where (1 − μ) is the forgetting factor, playing the role of α in the earlier Lemma 2.2.13 and Theorem 2.2.14. For the Kalman Filter Based Algorithm, we have

L_k = P_{k−1} ϕ_k / (R + μ ϕ_k^T P_{k−1} ϕ_k),
P_k = P_{k−1} − μ P_{k−1} ϕ_k ϕ_k^T P_{k−1} / (R + μ ϕ_k^T P_{k−1} ϕ_k) + μQ,

where R > 0 and Q > 0; that is, R is a positive number and Q is a positive definite matrix. This choice is optimal in the posterior mean squares sense if both ν_k and ω_k are Gaussian white noises with covariances R and Q, respectively, and μ is chosen as γ. Accordingly, for instance, if a scalar system is estimated with the least mean squares algorithm θ̂_{k+1} = θ̂_k + μϕ_k(y_k − ϕ_k θ̂_k), then the tracking error θ̃_k = θ_k − θ̂_k evolves as

θ̃_{k+1} = (1 − μϕ_k²) θ̃_k − μϕ_k ν_k + γ ω_{k+1}.
Squaring and taking expectations (with ϕ_k, ν_k, and ω_{k+1} mutually independent and independent of θ̃_k) gives

Π_{k+1} = (1 − 2μR_ϕ + μ²R_4) Π_k + μ² R_ϕ R_ν + γ² Q_ω,

where R_ϕ = E[ϕ_k²], R_4 = E[ϕ_k⁴], R_ν is the variance of ν_k, and Q_ω that of ω_k. This gives a linear time-invariant difference equation for Π_k. In particular, if |1 − 2μR_ϕ + μ²R_4| < 1,
the solution (to the difference equation) will converge to

Π* = [ 1/(1 − μR_4/(2R_ϕ)) ] Π̄, Π̄ = (1/(2R_ϕ)) ( μ R_ϕ R_ν + (γ²/μ) Q_ω ).

In short, this gives

|Π* − Π̄| ≤ σ(μ) Π̄, σ(μ) = [ R_4/(2R_ϕ) / (1 − μR_4/(2R_ϕ)) ] μ.
The derivation above shows an example of the system dynamics under the recursive tracking algorithms. Guo and Ljung (1995) impose assumptions amounting to the following:

1. the regressors {ϕ_k} span the regressor space (to ensure that the whole parameter θ can be estimated);
2. the dependence between ϕ_k and (ϕ_i, ν_{i−1}, ω_i) decays to zero as the time distance (k − i) tends to infinity;
3. the measurement error ν_k and the parameter drift ω_k are of white noise character.

In other words, setting S_t = E[ϕ_t ϕ_t^T], we assume that there exist constants h > 0 and δ > 0 such that for all k

Σ_{t=k+1}^{k+h} S_t ≥ δ I.

Let G_k = σ{ϕ_k} and F_k = σ{ϕ_i, ν_{i−1}, ω_i, i ≤ k}. Assume that {ϕ_k} is weakly dependent (φ-mixing) in the sense that there is a function φ(m) with φ(m) → 0 as m → ∞ such that for all k and m

sup { |P(A|B) − P(A)| : A ∈ G_{k+m}, B ∈ F_k } ≤ φ(m).

Also assume that there is a constant c_ϕ > 0 such that ||ϕ_k|| ≤ c_ϕ almost surely for all k.
Let F_k be defined as before, and assume that

E[ν_k | F_k] = 0, E[ν_k² | F_k] = R_ν(k),
E[ω_{k+1} | F_k] = E[ω_{k+1} ν_k | F_k] = 0, E[ω_k ω_k^T] = Q_ω(k),
sup_k { E[|ν_k|^r | F_k] + E||ω_k||^r } ≤ M, r > 2, M > 0.

Define

Π_{k+1} = (I − μR_k S_k) Π_k (I − μR_k S_k)^T + μ² R_ν(k) R_k S_k R_k + γ² Q_ω(k + 1),  (2.17)
where S_k = E[ϕ_k ϕ_k^T] and R_k is defined as follows. Consider the following tracking algorithms in the general adaptation form

θ̂_{k+1} = θ̂_k + μ L_k (y_k − ϕ_k^T θ̂_k), μ ∈ (0, 1).

Least Mean Squares (LMS):

L_k = ϕ_k.

Recursive Least Squares (RLS):

L_k = P_k ϕ_k,
P_k = (1/(1 − μ)) [ P_{k−1} − μ P_{k−1} ϕ_k ϕ_k^T P_{k−1} / (1 − μ + μ ϕ_k^T P_{k−1} ϕ_k) ], P_o > 0,

where (1 − μ) is the “forgetting factor.”
Kalman Filter:

L_k = P_{k−1} ϕ_k / (R + μ ϕ_k^T P_{k−1} ϕ_k),
P_k = P_{k−1} − μ P_{k−1} ϕ_k ϕ_k^T P_{k−1} / (R + μ ϕ_k^T P_{k−1} ϕ_k) + μQ,

(R > 0, Q > 0) where R is a positive number and Q is a positive definite matrix. Guo and Ljung (1995) then prove the following theorem for these recursive algorithms.

Theorem 2.2.15 (Guo and Ljung (1995)): Consider any of the recursive algorithms stated above and assume the three conditions given are satisfied. Then ∀μ ∈ (0, μ*), ∀k ≥ 1,

|| E[θ̃_k θ̃_k^T] − Π_k || ≤ c σ(μ) ( μ + γ²/μ + (1 − αμ)^k ),  (2.18)

where σ(μ) → 0 (as μ → 0) is defined by σ(μ) ≜ min_{m≥1} { √μ · m + φ(m) }, with φ(m) as defined in (2.3) above, and α ∈ (0, 1), μ* ∈ (0, 1), c > 0 are constants depending on the properties of {ϕ_k, ν_k, ω_k}, and

Π_{k+1} = (I − μR_k S_k) Π_k (I − μR_k S_k)^T + μ² R_ν(k) R_k S_k R_k + γ² Q_ω(k + 1),

with S_k = E[ϕ_k ϕ_k^T] and R_k as defined for the various recursive algorithms.

LMS case (Least Mean Squares): R_k = I.
Recursive Least Squares case (RLS): R_k = R_{k−1} − μR_{k−1} S_k R_{k−1} + μR_{k−1}, (R_o = P_o).

Kalman filter case: R_k = R_{k−1} − μR_{k−1} S_k R_{k−1} + μQ/R, (R_o = P_o/R).

For the applications of time-varying parameters in event studies, Brockett et al. (1999) consider a time-varying parameter ("beta") in the market model for the normal returns. Their work includes a GARCH(1,1) model for the conditional variance and a portmanteau test on the cumulative sums of standardized one-step-ahead forecast errors. One thing they identified throughout their study is the similarity between event study methodology and structural change tests (in econometrics). Although they reexamined the classical event study methodology and claimed that the classical method may lead to incorrect statistical inferences, they did not devise different statistics for the event studies as we do in the next chapter. They did, nevertheless, show that the time-varying coefficient model affects the statistical power of tests in classical event study methodology, concerning the width of the event window and the exact date of the event. The time-varying coefficient model used is described as

r_t = α + β_t r_{mt} + e_t, e_t | Ψ_{t−1} ∼ (0, h_t),
β_t − β̄ = φ(β_{t−1} − β̄) + a_t, a_t ∼ (0, σ_a²),
h_t = α_o + α_1 h_{t−1} + α_2 e²_{t−1},

where Ψ_t is the information up to time t and the time-varying beta follows a partial adjustment model, differing from the earlier setting. Brockett et al. (1999) then used maximum likelihood estimation and the Kalman filter as the recursive algorithm to estimate the model and to obtain predictions for the time series data. Notice that, in addition to using the time-varying system to forecast the abnormal returns, they use autoregressive conditional heteroskedasticity (ARCH) for the variance updating mechanism. These take into account the possible variance change due to the event of interest.
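To see the three gain choices side by side, the following Python sketch (a minimal reading of the general adaptation algorithm above; the data-generating process and all tuning values are hypothetical, and the GARCH variance updating of Brockett et al. (1999) is deliberately omitted for brevity) tracks a slowly varying market-model beta with each gain.

```python
import numpy as np

def track(y, phi, mu=0.1, gain="kf", R=1.0, Q=1.0, P0=1.0):
    """General adaptation algorithm from the text (scalar regressor):
       theta_hat_{k+1} = theta_hat_k + mu*L_k*(y_k - phi_k*theta_hat_k),
       with L_k chosen as LMS, RLS, or the Kalman-filter based gain."""
    theta_hat, P = 0.0, P0
    path = np.empty(len(y))
    for k in range(len(y)):
        if gain == "lms":
            L = phi[k]
        elif gain == "rls":            # P updated with forgetting factor (1 - mu)
            P = (P - mu * P * phi[k]**2 * P / (1 - mu + mu * phi[k]**2 * P)) / (1 - mu)
            L = P * phi[k]
        else:                          # Kalman-filter based gain uses P_{k-1}
            L = P * phi[k] / (R + mu * phi[k]**2 * P)
            P = P - mu * P**2 * phi[k]**2 / (R + mu * phi[k]**2 * P) + mu * Q
        theta_hat += mu * L * (y[k] - phi[k] * theta_hat)
        path[k] = theta_hat
    return path

# Hypothetical illustration: standardized market returns, slowly varying beta.
rng = np.random.default_rng(1)
n = 2000
rm = rng.standard_normal(n)
beta = 1.0 + 0.3 * np.sin(np.linspace(0, 6 * np.pi, n))
r = beta * rm + 0.5 * rng.standard_normal(n)
for g in ("lms", "rls", "kf"):
    est = track(r, rm, gain=g)
    print(g, "mean abs tracking error:", round(np.abs(est - beta)[200:].mean(), 3))
```

All three updates use only past information, so the normal-return estimate (and hence the abnormal return) is produced online, in the spirit of the Brockett et al. (1999) one-step-ahead prediction errors.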
The one-step-ahead prediction errors for the time-varying system are used to construct the event study test. They even consider the event study as similar to the test for structural change of coefficients in a structural model, where the uncertainty of the event date is handled by the CUSUM test (for prediction errors) to see whether a turning point on the plotted graph may reveal the possible date. This, in turn, confirms our claim in Proposition 2.1.1 that the conventional event study test is similar to statistical inference on structural change. Hence, identification of the actual event date is handled by the CUSUM graph of prediction errors, which indicates the possible event date and evades the ad hoc methods that locate the event date through data manipulation. In particular, they consider model selection for the conditional expectation of the regression model in normal returns. The possible variance change over the event period is also considered with the GARCH(1,1) model. Using the autocorrelation function (ACF), they confirm the above setting for normal returns. The portmanteau test of these prediction errors indicates that the conventional event study may actually overstate the possible explanation of the events. In fact, given their empirical data, they find results opposite to the existing studies on the same event issue. For the details, let the event date be denoted as n, where the sums run over the event period; the test is based on

z_t = e_t/σ_t, Z_t = Σ_{i=0}^{t} z_i, T = max_{t≤n} |Z_t|,

and, as n → ∞,

P( T/√n ≤ t ) → (4/π) Σ_{k=0}^{∞} (−1)^k (2k + 1)^{−1} exp( −(2k + 1)²π²/(8t²) ).
So we reject the null hypothesis of no significant event when the observed probability from the data is less than α, where α is the significance level. That is to say, we reject the null hypothesis when the probability of the observed data (for the statistic constructed)
is less than the significance level. In other words, we reject the null since the chance that the observed data behave like the null hypothesis is rather slim. The boundary function is

y = ±[d + c(t − 1)]

for t in (1, n) as the sample points, where d = a√(n − 1), c = 2a/√(n − 1), and a can be solved from the equation

Q(3a) + exp(−4a²)(1 − Q(a)) = (1/2)α,

where

Q(z) = (1/√(2π)) ∫_z^∞ exp(−u²/2) du.
Brockett et al. (1999) use the above boundary function in the CUSUM plot to determine the possible turning point for the event date, thereby resolving the event date uncertainty. They then applied the methodology to the passage of Proposition 103 for automobile insurance. Although many studies using the conventional methods conclude that the passage of the said Proposition affected insurance firms negatively, their results show that for the major portion (85%) of the publicly traded insurance firms in their sample there is no significant negative effect due to the passage of Proposition 103. This demonstrates the deficiency of the conventional method in model selection, variance calculation, and event date uncertainty when identifying and confirming events in event studies.
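As a concrete illustration of the test just described, the following Python sketch (our own hedged reading; the standardized prediction errors are simulated, not from Brockett et al. (1999)) solves the boundary equation for a with a root finder, builds the boundary y = d + c(t − 1), and flags where the CUSUM path first crosses it.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def cusum_boundary(n, alpha=0.05):
    """Solve Q(3a) + exp(-4a^2)(1 - Q(a)) = alpha/2 for a, where Q = norm.sf
    is the standard normal upper-tail probability, then build d + c*(t-1)."""
    Q = norm.sf
    f = lambda a: Q(3 * a) + np.exp(-4 * a**2) * (1 - Q(a)) - alpha / 2
    a = brentq(f, 0.01, 5.0)
    d = a * np.sqrt(n - 1)
    c = 2 * a / np.sqrt(n - 1)
    t = np.arange(1, n + 1)
    return d + c * (t - 1)

# Simulated standardized one-step-ahead prediction errors z_t = e_t / sigma_t,
# with a hypothetical mean shift midway to mimic an event date.
rng = np.random.default_rng(2)
n = 250
z = rng.standard_normal(n)
z[125:] += 0.5                     # hypothetical post-event drift
Z = np.cumsum(z)                   # CUSUM path
bound = cusum_boundary(n, alpha=0.05)
crossings = np.where(np.abs(Z) > bound)[0]
print("first boundary crossing at t =", crossings[0] + 1 if crossings.size else None)
```

The first crossing of the boundary on the plotted CUSUM path plays the role of the turning point that indicates the possible event date.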
Notes 1. This is because the transition of time-varying parameters must be specified so that the adaptive/recursive filters can be devised to update the information for the purpose of control.
2. Notice that the notation T stands for the matrix transpose, and T* stands for the period whose number of observations exceeds the actual event date.
3. The time subindex is not specified here for a particular time frame, to allow the continuum and expansion of the time horizon for event studies.
References

Aktas, N., E. de Bodt, and J.-G. Cousin. 2007. Event Studies with a Contaminated Estimation Period. Journal of Corporate Finance 13 (1): 129–145.
Brockett, P.L., H.-M. Chen, and J.R. Garven. 1999. A New Stochastically Flexible Event Methodology with Application to Proposition 103. Insurance: Mathematics and Economics 25: 197–217.
Cable, J., and K. Holland. 1999. Modelling Normal Returns in Event Studies: A Model-Selection Approach and Pilot Study. The European Journal of Finance 5: 331–341.
Chen, H.-F., and L. Guo. 1991. Identification and Stochastic Adaptive Control. Basel: Birkhäuser.
Ghysels, E. 1998. On Stable Factor Structure in the Pricing of Risk: Do Time-Varying Betas Help or Hurt? Journal of Finance 53: 549–573.
Guo, L. 1994. Stability of Recursive Stochastic Tracking Algorithms. SIAM Journal on Control and Optimization 32 (5): 1195–1225.
Guo, L., and L. Ljung. 1995. Performance Analysis of General Tracking Algorithms. IEEE Transactions on Automatic Control 40 (8): 1388–1402.
Jeng, J.-L. 2018. Empirical Asset Pricing Models: Data, Empirical Verification, and Model Search. Basingstoke: Palgrave Macmillan.
3 Occupation Time Statistics—The Intensity of Events
3.1
Endurance of Impacts from Events: The Occupation Time Statistics
To develop the new test, we consider the time span (or sample path) of the stochastic processes. Intuitively, if an event is essential, its impacts should persist for a while through time. The more significant the event, the longer (or the deeper) its impacts may persist. In other words, for a stochastic process that exhibits a certain property, the impacts (and their extent) should continue thereafter, rather than merely marking changes or the timing of (say) the parameters. Hence, following the Takács (1998) study of occupation times of reflected Brownian motion, we call the underlying statistics occupation time statistics for the event studies, since we are investigating the relative frequency with which the abnormal returns exceed a certain threshold. Later on, the test can also be extended to the sequential detection of the event's importance and consequence. This alternative method is attractive for the following three reasons. First, the entire resolution period and beyond is used, without the arbitrary choice of estimation and event periods, since recursive algorithms are applied to update the current information for the normal returns.
The impact of the event, then, is determined by the data themselves, and the adjustments of the markets due to the announcement (or not) of the firm are left to the markets. This utilizes the full information set available to the firm and to the market as the information unfolds from the target or issues. Second, there is no need to know the action taken by the firm in advance. For example, in the merger and acquisition case, we do not ask or require knowing whether the implementation of the merger is successful; rather, we focus on the contemporaneous market's expectation of these benefits, which is of more value to anyone wanting to implement a trading strategy based on merger activity. Moreover, this method does not require identifying a closely matched set of control firms. Third and finally, the method allows us to apply (for instance) the theoretical results of Takács (1998) and analyze the duration of the impact from the new information. Thus, we redefine a significant event as follows: an event is significant when the duration of the impact from the new information is significantly larger or lower than the occupation time of a reflected Brownian motion. Notice that the conventional event study tests based on cumulative abnormal returns are concerned only with the change of location statistics for the asymptotic distributions. That is, they ask whether the mean of the cumulative abnormal return will differ from zero if the underlying event is essential. Accordingly, event-study tests based on (standardized) cumulative abnormal returns for the location statistics or for the change in scales are special cases of tests for changes in means (or distributions) if the event period is long enough. However, tests of structural changes or differences in parameters or distributions do not necessarily capture the essentiality of the event(s). Changes in volatility, parameters, or distributions often occur in some sub-intervals of the time horizon for financial securities without any claim of a specific event. Therefore, an indication of change (in distributions or in parameters) may not suffice to justify the identification of an essential event. In particular, the application of cumulative abnormal returns (denoted as CAR hereafter) is similar to tests for parameter changes. For instance, testing for a non-zero mean of CAR is, in fact, similar to a test for changing means or a shift in location. Unfortunately, a change of parameter is not sufficient to depict an essential event in mergers and acquisitions, or other corporate finance issues. Given that the financial data
is highly volatile and possibly has time-changing distributions, its parameters may differ (across time) even though there is no essential event at all. The alternative test strategy considers the occupation time (or sojourn time) of the absolute values of cumulative abnormal returns instead. Intuitively speaking, if an event is essential, its impact will not simply be a split-second burst or spike on the sample path of asset returns. Instead, the fluctuations (including the spike) will last a while as the market works to digest the new information (or shock) and adjusts itself before it settles. In other words, the market needs a little more time to resolve the new information from the event, if the event or announcement really is to everyone's surprise. The more essential the information is, the longer the time the market needs to react. Alternatively, situations occur where the cumulative abnormal returns may be more volatile before and after the event than close around the event. More specifically, an essential event may not necessarily lead to an increase of variance within the neighborhood of the event time in contrast with the other time periods. In such an instance, a narrow event study might conclude that no essential event took place, and the event study would be wrong. Instead, it is possible that the announcement simply resolves the uncertainty for the market. In other words, an essential corporate event does not necessarily increase the fluctuations of the markets at announcement. Instead, it may dampen the fluctuations. Hence, any method that aims to identify essential corporate events should account for both the "volatile jittering" and the "taming down" of the market when the new information hits. Our statistic, therefore, considers the relative frequency with which the statistics for cumulative abnormal returns exceed some threshold over the entire sampled period. If the impact causes "volatile jittering," the relative frequency will increase significantly. Conversely, if the event impact "tames down" the market, the relative frequency will decrease significantly. This statistic converges (for example) in distribution to the occupation time of a reflected Brownian motion, provided that there is no essential event under the null hypothesis and the invariance principle for the cumulative abnormal returns holds when the sample period for each firm's abnormal returns is sufficiently large.
Given the formulas for the distribution and moments provided by Takács (1998), we can then apply the Banach-valued Central Limit Theorem for the cross-sectional average of the occupation time functionals across all firms in Chapter 6 to demonstrate its applicability in corporate finance. Under the null hypothesis, this test statistic converges to a normally distributed random variable, where the mean and variance of the occupation time of reflected Brownian motion are given by the formulas of Takács (1998). The work on empirical processes and their Central Limit Theorems can be found in Andrews (1991) and many others for the weak convergence results. Given the similarity between our statistics and empirical processes, the asymptotic properties of the latter are applicable.
3.2
Introduction
The intuition for this chapter is that, if the events are significant and essential, it is conceivable that the impacts from the events will be both significantly strong and long-lasting. Specifically, unless the events are only contemporaneously perceived and resolved quickly, the time period of the impacts may extend through the time horizon. Hence, the assessment of an event's significance is not confined to the magnitude of its impacts only. The conventional methods of event studies (or event tests for structural change) are mostly attempts to assess the magnitude of impacts. These methods tend to consider whether the impacts are strong enough to knock through some barriers or borderlines in order to identify the significance of impacts. However, these methods ignore that some essential events do not just hit the barriers or borderlines; their effects may continue (over time) after the initial impacts. In other words, the essentiality of the events does not simply rely on the strength of the initial impact; it also depends on how long it may prevail. Indeed, an essential event will usually create shocks to the original system. However, digestion and processing of information is not necessarily imminent. Unless the capital market is so efficient in absorbing the corporate event information that the correction path is extremely short, the influence will usually take a while to dissipate. For a major surprise or event that is significant enough to study, the market will normally need time to absorb the news. Hence, the endurance of impacts or intensity from
events can be applied as an alternative statistic to consider the essentiality of events in corporate finance when the efficiency of the capital market is concerned, whether it is the announcement or the information effect.

Definitions of Brownian Sheet, Bi-variate Brownian Bridge, Kiefer Process, and Brownian Pillow

Brownian Sheet: A Brownian sheet is a two-parameter Gaussian process {W(s, t), s, t ≥ 0} such that

1. W((s_1, s_2] × (t_1, t_2]) ∼ N(0, (s_2 − s_1)(t_2 − t_1)) for all (s_1, s_2] × (t_1, t_2] in R², and W(s, t) ∼ N(0, st);
2. W(0, t) = W(s, 0) = 0;
3. W(s, t) has independent increments;
4. the sample path of W(s, t) is continuous in s, t with probability one;

and the covariance function is given as

E[W(s, t)W(s′, t′)] = min(s, s′) × min(t, t′), s, s′, t, t′ ≥ 0.

Bi-variate Brownian Bridge: A bi-variate Brownian bridge is a mean-zero Gaussian process, the tied-down Brownian sheet defined as

B(t_1, t_2) = W(t_1, t_2)|_{W(1,1)=0} = W(t_1, t_2) − t_1 t_2 W(1, 1),

where t_1, t_2 ∈ I = [0, 1] and W(t_1, t_2) is a Brownian sheet, whose covariance function is given as

E{B(s_1, s_2)B(t_1, t_2)} = (t_1 ∧ s_1)(t_2 ∧ s_2) − t_1 t_2 s_1 s_2, t_1, t_2, s_1, s_2 ∈ I = [0, 1],

where the symbol "∧" denotes the minimum. A tied-down Brownian bridge is, for 0 ≤ s, t ≤ 1,

B*(s, t) = W(s, t) − sW(1, t) − tW(s, 1) + stW(1, 1),

with covariance function

E{B*(s_1, s_2)B*(t_1, t_2)} = (s_1 ∧ t_1 − s_1 t_1)(s_2 ∧ t_2 − s_2 t_2).

Kiefer Process: A Kiefer process K(s, t), 0 ≤ s, 0 ≤ t ≤ 1, is also a mean-zero Gaussian process, which can be written as the difference between two Brownian sheets:

K(s, t) = W(s, t) − tW(s, 1)
with covariance function

E[K(s, t)K(s′, t′)] = (t ∧ t′)(s ∧ s′ − ss′).

Brownian Pillow: A Brownian pillow B̂(s, t) is a so-called tied-down Brownian sheet, 0 ≤ s, t ≤ 1, such that

B̂(s, t) = K(s, t) − sK(1, t) = (W(s, t) − tW(s, 1)) − s(W(1, t) − tW(1, 1))
= W(s, t) − tW(s, 1) − sW(1, t) + stW(1, 1),

where K(s, t) is a Kiefer process, with covariance function

E[B̂(s_1, s_2)B̂(t_1, t_2)] = ((s_1 ∧ t_1) − s_1 t_1)((s_2 ∧ t_2) − s_2 t_2).

One can easily see that the so-called tied-down bi-variate Brownian bridge is the Brownian pillow.
3.3
Asymptotic Properties of Occupation Time (or Sojourn) for Diffusion Processes
Let {ξ(t), t ≥ 0} be a standard Brownian motion with P(ξ(t) ≤ x) = Φ(x/√t) for t > 0, where

Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−u²/2} du

is the standard normal distribution function, with normal density function

ϕ(x) = (1/√(2π)) e^{−x²/2}.
Define the function

τ(α) = lim_{ε→0} (1/ε) measure{t : α ≤ ξ(t) < α + ε, 0 ≤ t ≤ 1}

for any real α as the local time of ξ(t) at level α, and

ω(α) = ∫_0^1 δ(ξ(t) > α) dt

as the occupation time of {ξ(t), t ≥ 0} spent in the set (α, ∞) in the interval (0, 1). For the reflected Brownian motion {|ξ(t)|, t ≥ 0}, we define

ω*(α) = ∫_0^1 δ(|ξ(t)| > α) dt

as the occupation time of {|ξ(t)|, t ≥ 0} spent in the set (α, ∞) in the interval (0, 1). The work of Takács (1998) gives explicit formulas for the moments and distribution functions of τ(α), ω(α), and ω*(α). Define

E[τ(α)]^r = m_r(α),  (3.1)
E[ω(α)]^r = M_r(α),  (3.2)
E[ω*(α)]^r = M_r*(α),  (3.3)

and the explicit formulas are

M_r(α) = m_{2r}(α)/(2^r r!),  (3.4)

M_r*(α) = ((r − 1)!/2^{r−1}) Σ_{k=1}^{r} m_{2r}((2k − 1)α) / ((r − k)!(r + k − 1)!)  (3.5)
for r = 1, 2, . . . and α > 0. This gives the distribution function P{ω*(α) ≤ x} = G_α(x), with

G_α(x) = 2F_α(x) − 1 + 2 Σ_{k=2}^{∞} Σ_{j=2}^{k} [ (−1)^j j! / (k + j − 1)! ] C(k−2, j−2) (d^{k−1}/dx^{k−1}) { [1 − F_{(2j−1)α}(x)] x^{k−1} },  (3.6)

where C(n, k) denotes the binomial coefficient,
0 ≤ x < 1, α > 0, and G_α(1) = 1. In the formula, F_α(x) = P{ω(α) ≤ x} is given by

F_α(x) = 1 − (1/π) ∫_0^{1−x} e^{−α²/(2u)} / √(u(1 − u)) du  (3.7)

for 0 ≤ x < 1, α ≥ 0, and F_α(0) = 2Φ(α) − 1 for α ≥ 0. Also,

P(τ(α) ≤ x) = 2Φ(α + x) − 1, 0 ≤ x < 1, α ≥ 0,  (3.8)

where

m_r(α) = 2r ∫_0^∞ x^{r−1} [1 − Φ(α + x)] dx  (3.9)

if α ≥ 0 and r ≥ 1.
More explicitly, from Takács (1998) we have

m_r(α) = 2(−1)^r { a_r(α)[1 − Φ(α)] − b_r(α)ϕ(α) }  (3.10)

for r = 1, 2, . . ., where

a_r(α) = r! Σ_{j=0}^{[r/2]} α^{r−2j} / (2^j j!(r − 2j)!)  (3.11)

and

b_r(α) = Σ_{j=0}^{[(r−1)/2]} C(r−1−j, j) (j!/2^j) α^{r−1−2j} Σ_{ν=0}^{j} C(r, ν)  (3.12)

for r ≥ 1. For more discussion of the occupation time of Brownian motion with drift, one can consult Pechtl (1999) for details.
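To make these formulas operational, the following Python sketch (our own illustration, not from Takács (1998)) evaluates m_r(α) by numerical integration of (3.9), obtains the occupation-time moments M_r*(α) from (3.5), and checks the first moment against a Monte Carlo simulation of reflected Brownian motion.

```python
import numpy as np
from math import factorial
from scipy.integrate import quad
from scipy.stats import norm

def m(r, alpha):
    """m_r(alpha) = 2r * int_0^inf x^{r-1} [1 - Phi(alpha + x)] dx, eq. (3.9)."""
    val, _ = quad(lambda x: x**(r - 1) * norm.sf(alpha + x), 0, np.inf)
    return 2 * r * val

def M_star(r, alpha):
    """M_r*(alpha) from eq. (3.5)."""
    s = sum(m(2 * r, (2 * k - 1) * alpha) / (factorial(r - k) * factorial(r + k - 1))
            for k in range(1, r + 1))
    return factorial(r - 1) / 2**(r - 1) * s

# Monte Carlo check of E[omega*(alpha)]: fraction of time |xi(t)| > alpha on [0,1].
alpha, nsteps, npaths = 0.5, 2000, 4000
rng = np.random.default_rng(3)
paths = np.cumsum(rng.standard_normal((npaths, nsteps)) / np.sqrt(nsteps), axis=1)
occ = (np.abs(paths) > alpha).mean(axis=1)        # occupation time per path
print("Monte Carlo E[omega*] :", occ.mean())
print("Takacs M_1*(alpha)    :", M_star(1, alpha))
```

The two numbers should agree up to discretization and simulation error, which gives a practical check on (3.5) and (3.9) before using the moments in inference.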
3.4
Empirical Processes and the Asymptotic Properties
The Definition of Empirical Process

The reason to introduce the empirical process here is that it is similar in definition to the occupation time statistic. Basically, the two are similar because they are based on opposite inequalities with respect to the same threshold. The major difference is that the occupation time statistics are based on the cumulative sums of abnormal returns. Although the applications differ, the arguments are applicable to the asymptotics of the occupation time statistic, since the number of occurrences of the event of interest is counted relative to the sample size. The empirical process usually concerns the empirical distribution function of the observed data; it is called "empirical" because it is based on the actual observations. The definition can be given as follows. Since the asymptotics of the empirical distribution function are set in function spaces, we need several concepts for the function spaces in which the empirical distribution function (or empirical process) may lie, so that weak convergence can be established. In the following, we introduce some concepts on the function space for the empirical processes.

Empirical process: Let a sample of random elements x_1, . . . , x_n be defined on the measurable space (X, A). The quantity

(1/n) Σ_{i=1}^{n} δ(X_i ∈ C)

is called the empirical measure, where C is the constraint set the elements must satisfy and δ(·) is the indicator function; the empirical process describing the intensity of the process is defined as

√n ( (1/n) Σ_{i=1}^{n} δ(X_i ∈ C) − F(·) ),

where F(·) is the distribution function of the x_i ∈ (X, A). In our application, the random element is the cumulative sum of abnormal returns, and the constraint is the threshold h that the cumulative sum must (at least) exceed. Hence, the empirical process rests on a similar
argument, emphasizing the intensity of events or the relative impact of the event shock. Therefore, this application simply assumes that the cumulative sums of abnormal returns fall in a measurable space where the empirical measure of these sums counts how often the cumulative sums exceed the threshold; this measures the intensity of the cumulative sums and determines the intensity by the relative counting measure. Hence, since the counting measure is used, the empirical process is simply the opposite side of the same process, and all the asymptotic properties of empirical processes can be applied to the intensity in the event studies.

Example 3.4.1 (Empirical Distribution Function): Let the random elements X_1, . . . , X_n be i.i.d. elements of R^d with identical distribution F(t). The empirical distribution function is identified as the empirical measure

t −→ (1/n) Σ_{i=1}^{n} δ(X_i ≤ t),

and the empirical process is defined as

√n ( (1/n) Σ_{i=1}^{n} δ(X_i ≤ t) − F(t) ),  (3.13)

where F(·) is the distribution function of X.

Example 3.4.2 (Empirical process indexed by sets): Let C be a collection of measurable sets in the sample space (X, A). The empirical measure defined by

C −→ (1/n) Σ_{i=1}^{n} δ(X_i ∈ C)

is the empirical distribution when a certain property C of the measurable sets is given, and the empirical process is defined as √n( (1/n) Σ_{i=1}^{n} δ(X_i ∈ C) − F(·) ), where F(·) is the distribution function of X_i ∈ C.
Since we are interested in the intensity of the cumulative sums of abnormal returns, the process we want to apply is one defined in a certain function space. If the cumulative sums of abnormal returns lie in a proper space, the empirical process defined by the sets is defined accordingly. In other words, if the cumulative sums are defined on a measurable set, the empirical process defined on those sets is also well-defined. In fact, we can rewrite the intensity for a given threshold h as

(1/n) Σ_{u=1}^{n} δ( (1/(√n σ_ε)) |Σ_{i=1}^{u} ε_i| > h ) = 1 − (1/n) Σ_{u=1}^{n} δ( (1/(√n σ_ε)) |Σ_{i=1}^{u} ε_i| ≤ h )
= 1 − ∫_0^t δ(|Y_u| ≤ h) du,

where the expression ∫_0^t δ(|Y_u| ≤ h) du is the occupation time of the diffusion process |Y_u| if the cumulative sums of abnormal returns converge. This is the reason why we name the underlying statistic for the intensity of events the occupation time statistic. Therefore, instead of testing the magnitude of the cumulative sums, we are testing the essentiality of events by their intensity. Since the cumulative sums are functionals of the abnormal returns, the empirical process is defined on a function space with a proper norm. We let the occupation time function play the role of describing the intensity of the events. Hence, the essentiality of the events does not rest on the cumulative sums of abnormal returns themselves; instead, the occupation time describes the impact of the events. The more serious the event, the more intense the impact, and the longer it will last. Therefore, the investigation of the events has shifted from the cumulative sums of abnormal returns to their intensity or functional. If the event is fully expected by the markets, the impact of the decision cannot last long, and the impact will be short-lived. Notice that the definition of the occupation time statistic is determined by the functional of abnormal returns, which in turn depends on the weak convergence or functional central limit theorem, where the weak convergence of the (properly normalized) cumulative sums of abnormal returns is granted. This differs from the conventional statistics and their robust versions, which depend heavily on the magnitude of the cumulative sums of abnormal returns and the central limit theorem, where the slow convergence rate is often criticized due to the skewness and kurtosis of individual stock returns. Most of the time, they consider whether the average of the cumulative sums of abnormal returns exceeds zero and reject the null hypothesis of no event under normality in the asymptotic sense. The following arguments concern the function space of such functionals as the cumulative sums of abnormal returns. Notice that, since the arguments are for functionals, we need the definition of the function space so that proper asymptotic equicontinuity can be established and weak convergence is granted.
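As an illustration of how this intensity would be computed from data, the following minimal Python sketch (our own hedged example; the threshold and inputs are hypothetical) evaluates the relative frequency with which the normalized cumulative sums of abnormal returns exceed a threshold h.

```python
import numpy as np

def occupation_time_statistic(abnormal_returns, h):
    """Relative frequency with which the normalized |CAR| exceeds threshold h:
       (1/n) * sum_u 1{ |sum_{i<=u} eps_i| / (sqrt(n)*sigma_eps) > h }."""
    eps = np.asarray(abnormal_returns, dtype=float)
    n = len(eps)
    sigma = eps.std(ddof=1)
    car = np.cumsum(eps) / (np.sqrt(n) * sigma)   # normalized cumulative sums
    return np.mean(np.abs(car) > h)

# Under no event (i.i.d. abnormal returns), the statistic mimics the occupation
# time of a reflected Brownian motion above h; a persistent impact shifts it.
rng = np.random.default_rng(4)
quiet = rng.standard_normal(500)                  # no event
jolted = quiet.copy()
jolted[250:] += 0.15                              # hypothetical persistent impact
for name, x in [("no event", quiet), ("event", jolted)]:
    print(name, occupation_time_statistic(x, h=0.5))
```

Note that the statistic responds to how long the path stays beyond the threshold, not merely to the size of any single spike.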
of abnormal returns will exceed zero and reject the null hypothesis of no event under normality in the asymptotic sense. Given that the following arguments are for the function space of the functionals such as the cumulative sums of abnormal returns. Notice that since the arguments are for the functionals, we need to have the definition of the function space to have proper asymptotic equicontinuity established and the weak convergence is granted. Some Technicalities of Empirical Process for the Cumulative Sums of Abnormal Returns We begin with the introduction of asymptotic technicalities that the empirical processes which discusses the function spaces (for example, for the cumulative sums of abnormal returns is defined), the convergence results, the asymptotic distributions and moments. Due to the function space, we need to define certain metrics (or norms) for the space, and their covering numbers. This is to guarantee that the complexity of the function space and there is no “hole” in the function space. The main interest of empirical processes is on the empirical distribution defined or indexed by some function spaces. The asymptotics are to find the possible properties that given the functionals what are the possible large sample convergences and moments with asymptotic distributions are available. Since the cumulative sums of abnormal returns can be denoted as a particular form for functional of the stock returns, it is easy to see that the asymptotics of empirical processes defined in general, will conform to the applications of cumulative sums of abnormal returns. The Orlicz norm Let ψ be a non-decreasing, convex function with ψ(0) = 0 and X be a random variable. Then the Orlicz norm of X is defined as ||X ||ψ = in f {C > o : Eψ(
|X | ) ≤ 1}. C
The norm is useful that the general function as x → x p for p ≥ 1, the Orlicz norm is simply the usual L p −norm such as ||X || p = (E|X | p )1/ p .
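For intuition, the following small Python sketch (our own illustration) computes a sample analogue of the Orlicz norm by bisection on C. With ψ(x) = x^p it should reproduce the empirical L_p norm; ψ_2(x) = exp(x²) − 1, a common choice for sub-Gaussian tails, is also shown.

```python
import numpy as np

def orlicz_norm(x, psi, lo=1e-6, hi=1e6, tol=1e-8):
    """Sample Orlicz norm: inf{C > 0 : mean(psi(|x|/C)) <= 1}, by bisection;
    mean(psi(|x|/C)) is decreasing in C, so bisection applies."""
    x = np.abs(np.asarray(x, dtype=float))
    with np.errstate(over="ignore"):   # overflow -> inf, i.e., constraint violated
        while hi - lo > tol * hi:
            mid = 0.5 * (lo + hi)
            if np.mean(psi(x / mid)) <= 1.0:
                hi = mid               # C = mid already satisfies the constraint
            else:
                lo = mid
    return hi

rng = np.random.default_rng(5)
x = rng.standard_normal(100_000)
p = 2
print("Orlicz with psi=x^p:", orlicz_norm(x, lambda t: t**p))
print("empirical L_p norm :", (np.mean(np.abs(x)**p))**(1 / p))
print("Orlicz psi_2 norm  :", orlicz_norm(x, lambda t: np.expm1(t**2)))
```

The first two printed values should coincide, confirming that the Orlicz norm generalizes the L_p norms.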
Covering numbers: Covering numbers are defined as follows for empirical processes. They give us the definitions for the properties of the function spaces of empirical processes in which the event studies can be formulated.

Definition 3.4.3 (Covering numbers): The covering number N(ε, F, ||·||) is the minimal number of balls {g : ||g − f|| < ε} of radius ε needed to cover the set F. The centers of the balls need not belong to F, but they should have finite norms. The entropy (without bracketing) is the logarithm of the covering number.

Definition 3.4.4 (Bracketing numbers): Given two functions l and u, the bracket [l, u] is the set of all functions f with l ≤ f ≤ u. An ε-bracket [l, u] is a bracket with ||u − l|| < ε. The bracketing number N_{[]}(ε, F, ||·||) is the minimal number of ε-brackets needed to cover F. The entropy with bracketing is the logarithm of the bracketing number.

Theorem 3.4.5 (van der Vaart and Wellner (1996)): Let (T, d) be an arbitrary semi-metric space, where the covering number N(ε, T, d) is the minimal number of balls of radius ε needed to cover T. There exists a maximal inequality to control a process on the space filled with the cumulative sums of abnormal returns: constructing a series of nested sets T_o ⊂ T_1 ⊂ T_2 ⊂ · · · ⊂ T such that every T_j is a maximal set of points with d(s, t) > η2^{−j} for every s, t ∈ T_j, the chaining argument bounds, in the Orlicz norm ||·||_ψ, the maximal increment max_{s,t∈T_{k+1}, d(s,t)≤η} |X_s − X_t| of a process whose increments are controlled by d, in terms of an entropy integral of the covering numbers.

Theorem 3.4.6 (van der Vaart and Wellner (1996)): Let F be a class of measurable functions with measurable envelope F satisfying the uniform entropy condition

∫_0^∞ sup_Q √( log N( ε||F||_{Q,2}, F, L_2(Q) ) ) dε < ∞,

where the supremum is taken over the class Q of finitely discrete probability measures on (X, A) with ||F||²_{Q,2} = ∫ F² dQ > 0. If P*F² < ∞, then F is P-Donsker.
Glivenko–Cantelli Theorems: There are several different ways to state the Glivenko–Cantelli theorems. We start with the most primitive one, based on the empirical distribution function. Let the collected random sample be S_n = (X_1, . . . , X_n); we can define the empirical distribution as

F̂_n(x) = (1/n) #(X_i ∈ S_n : X_i ≤ x) = (1/n) Σ_{i=1}^{n} δ(X_i ≤ x).

The Glivenko–Cantelli theorem simply shows that if there is an identical distribution function F on S_n, then

sup_x |F̂_n(x) − F(x)| → 0 a.s.  (3.14)
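A quick numerical illustration of (3.14) (our own sketch): the sup-distance between the empirical and the true distribution function shrinks as the sample grows.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
for n in (100, 1_000, 10_000, 100_000):
    x = np.sort(rng.standard_normal(n))
    ecdf = np.arange(1, n + 1) / n
    # sup|F_n - F| is attained at the jump points; check both sides of each jump.
    d = np.maximum(np.abs(ecdf - norm.cdf(x)),
                   np.abs(ecdf - 1 / n - norm.cdf(x))).max()
    print(n, "sup|F_n - F| =", round(d, 4))
```

The printed distances decay roughly at the 1/√n rate, consistent with the empirical process scaling used throughout this section.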
A more general setting can be found in van der Vaart and Wellner (1996), as follows. The first theorem is somewhat limited by the bracketing number, while the second covers more general cases.

Theorem 3.4.7 (van der Vaart and Wellner (1996)): Let F be a class of measurable functions such that for every ε > 0, N_{[]}(ε, F, L_1(P)) < ∞. Then F is Glivenko–Cantelli. That is to say, if the class F consists of measurable functions that are dense enough such that, for any given ε > 0, we can
find finitely many ε-brackets that cover F, and there exists a probability distribution function for these samples altogether, then convergence of the empirical distribution is granted by the Glivenko–Cantelli theorem. In our applications, since we are interested in the cumulative sums of abnormal returns, this is equivalent to saying that the increments of the cumulative sums of abnormal returns cannot be indefinite, so that they can no longer "explode" indefinitely. However, since we are interested in the functional of the cumulative sums of abnormal returns, the so-called event study here is similar to the empirical processes of the functionals of abnormal returns. Hence, the empirical process in the event study depends on the relative frequency of the cumulative sums of abnormal returns for the event of interest. The Glivenko–Cantelli theorem simply shows that if the complexity of the function space is not too explosive, the empirical processes of the functional of interest will likely converge weakly.

Theorem 3.4.8 (van der Vaart and Wellner (1996)): Let F be a P-measurable class of measurable functions with envelope F such that P*F < ∞ (P* is the outer measure). Let F_M be the class of functions f 1{F < M}, where f ranges over F. If log N(ε, F_M, L_1(P_n)) = o_P*(n) for every ε and M > 0, then ||P_n − P||_F → 0 both almost surely and in mean, and F is Glivenko–Cantelli.

The second theorem is more general than the first. Besides, it works directly with the function space F. Notice that, since our statistics are based on the cumulative sums of abnormal returns, even in applications of the empirical distribution the arguments are on the distribution of the cumulative sums of abnormal returns. Hence, in the applications, the central limit theorems based on various dependence structures can be applied here even though the original return data are heteroskedastic or mixing. In the following, we first show Andrews's (1991) generalization of the empirical process central limit theorem to heterogeneous and dependent observations, indexed by classes of smooth functions, particularly for the statistical inferences needed in applications. Further generalizations to dependent observations, such as Dedecker and Prieur (2007), are shown afterwards.
The following definitions are from Andrews (1991). This notion of empirical process is more general than the above definition, as it allows dependent and not identically distributed observations. Let {X_ni : 1 ≤ i ≤ n, n ≥ 1} be a triangular array of X-valued random vectors defined on a probability space (Ω, B, P), where X ⊂ R^k. Let m(·, ·) be a real function defined on X × T, where T is an index set that is a metric space with metric ρ as defined later. Assume m(x, τ) is Borel measurable in x for each τ ∈ T. Define the empirical process ν_n(·) by

ν_n(τ) = (1/√n) Σ_{i=1}^{n} ( m(X_ni, τ) − E m(X_ni, τ) ).

Let L^∞(T) denote the space of bounded real functions on T, and endow L^∞(T) with the uniform metric d.

Definition 3.4.9: If {ν_n(·) : n ≥ 1} are (not necessarily Borel measurable) maps from Ω into the metric space (L^∞(T), d), and if ν(·) is an L^∞(T)-valued Borel measurable random variable (not necessarily defined on (Ω, B, P)), then ν_n(·) converges weakly to ν(·) if E* f(ν_n(·)) → E f(ν(·)) as n → ∞ for all bounded continuous real functions f on L^∞(T), where E* denotes the outer expectation.

Stochastic Equicontinuity: We need the following stochastic equicontinuity for the empirical process, which concerns the sample functions of the stochastic process, so that we can guarantee the consistency (particularly, the weak convergence) or asymptotic normality indexed by a suitable (metric) space. In other words, we can apply the theoretical developments for the in-sample occupation time statistics to the event studies. Let ||·|| be the Euclidean norm in the following arguments. The stochastic process {ν_n(·) : n ≥ 1} is stochastically equicontinuous if ∀ε > 0 and η > 0, ∃δ > 0 such that

limsup_{n→∞} P( sup_{τ_1,τ_2∈T, ρ(τ_1,τ_2)<δ} |ν_n(τ_1) − ν_n(τ_2)| > η ) < ε.  (3.15)
Andrews (1991) had the following extension for the empirical processes of this study. The only extension is to introduce the outer measure, which is more general than the original setting. According to Andrews (1994), the idea of stochastic equicontinuity is a probabilistic and asymptotic generalization of the uniform continuity of a function. By Andrews (1994) it is equivalent to the stochastic continuity conditions: (i)

sup_{ρ(τ_1,τ_2)≤δ} |ν_n(τ_1) − ν_n(τ_2)| →_p 0,

where →_p stands for convergence in probability; or (ii) for sequences τ̂_1n, τ̂_2n such that ρ(τ̂_1n, τ̂_2n) →_p 0, ν_n(τ̂_1n) − ν_n(τ̂_2n) →_p 0. It can be shown, with a simple example of a linear parametric model, that the relevant probability bound holds when δ is sufficiently small and (1/√n) Σ_l (W_l − E W_l) = O_p(1). The idea can be generalized to the outer probability, which is more general than the above setting. In the following definition the empirical process is treated with the inclusion of outer probability, where the outer probability measure is defined as the infimum over all probabilities that cover the original set.

Definition 3.4.10 (Andrews (1991)): {ν_n(·) : n ≥ 1} is stochastically equicontinuous if ∀ε > 0 and η > 0, ∃δ > 0 such that

limsup_{n→∞} P*( sup_{ρ(τ,γ)<δ} |ν_n(τ) − ν_n(γ)| > η ) < ε,  (3.16)

where P* denotes P-outer probability.
In our applications, ν_n(·) can be defined as the empirical process of the cumulative sums of abnormal returns through the function m(X_ni, τ) as used above. Andrews (1991) forms the following assumption to establish the stochastic equicontinuity for the m(·, τ) function.

Assumption A:
(i) (Series expansion) For some sequence {h_j(·), j ≥ 1} of real or complex Borel measurable functions on X, m(·, τ) has a pointwise convergent series expansion for each τ ∈ T: m(x, τ) = Σ_{j=1}^{∞} c_j(τ)h_j(x), ∀x ∈ X, where for each τ ∈ T, {c_j(τ) : j ≥ 1} is a sequence of (real or complex) constants.
(ii) (Smoothness) Σ_{j=1}^{∞} |c_j(τ)| E|h_j(X_ni)| < ∞, ∀i ≤ n, n ≥ 1, τ ∈ T.
(iii) (Smoothness/weak dependence trade-off) sup_{τ∈T} Σ_{j=J}^{∞} |c_j(τ)|²/a_j → 0 as J → ∞ for some summable sequence of positive real constants {a_j} for which Σ_{j=1}^{∞} a_j γ_j < ∞, where γ_j = Σ_{s=−∞}^{∞} γ_j(s) and γ_j(s) = sup_{i≤n−|s|, n≥1} |Cov(h_j(X_ni), h_j(X_{ni+|s|}))|.

Now define the metric

ρ(τ, γ) = ( Σ_{j=1}^{∞} |c_j(τ) − c_j(γ)|² )^{1/2}, ∀τ, γ ∈ T.

Andrews (1991) has the following theorem.

Theorem 3.4.11 (Andrews (1991)): For the ρ(·, ·) defined above, Assumption A implies that {ν_n(·)} is stochastically equicontinuous and (T, ρ) is totally bounded.

Proof: By assumptions (i) and (ii), m(X_ni, τ) can be centered as Σ_{j=1}^{∞} c_j(τ)(h_j(X_ni) − E h_j(X_ni)). Hence, by a Chebyshev-type bound,

limsup_{n→∞} P*( sup_{ρ(τ,γ)<δ} |ν_n(τ) − ν_n(γ)| > η ) ≤ η^{−2} limsup_{n→∞} E* sup_{ρ(τ,γ)<δ} |ν_n(τ) − ν_n(γ)|²,

and the right-hand side is controlled through Σ_{j} |c_j(τ) − c_j(γ)|² and Assumption A(iii); the remaining details are in Andrews (1991).

Lemma 3.4.16 (Andrews (1991)): Let {Z_ni} be a near-epoch dependent triangular array of size −ξ on a strong mixing double array {V_ni}, and let g(·) satisfy |g(z) − g(z̃)| ≤ C||z − z̃|| for some C > 0. Then {g(Z_ni) : i ≤ n, n ≥ 1} is a near-epoch dependent array of real or complex random variables of size −ξ on the same strong mixing double array {V_ni}. In addition, the near-epoch dependence numbers {η_g(s)} of {g(Z_ni)} satisfy η_g(s) ≤ Cη(s) for all s ≥ 0, where {η(s)} are the near-epoch dependence numbers of {Z_ni}.

Lemma 3.4.17 (Andrews (1991)): Let {Z_ni} be a near-epoch dependent triangular array on {V_ni} (with near-epoch dependence numbers {η(s) : s ≥ 0}), where {V_ni} is a strong mixing double array (with mixing numbers {α(s) : s ≥ 1}). Then for all r > 2,

|Cov(Z_ni, Z_{ni−s})| ≤ ||Z_{ni−s}||_2 η(a) + 6 ||Z_ni||_2 ||Z_{ni−s}||_r α(b)^{(r−2)/(2r)}

for all 0 ≤ s < i and n ≥ 1, where a and b are positive integers for which a + b ≤ s.

Proof of Theorem 3.4.15 (Andrews (1991)): Assumptions A1(i) and A(i) are equivalent. By Assumption A1(ii), Assumption A(ii) holds if Σ_{j=1}^{∞} |c_j(τ)| < ∞ ∀τ ∈ T. This holds under Assumption A1(iii) because

Σ_{j=1}^{∞} |c_j(τ)| ≤ ( Σ_{j=1}^{∞} |c_j(τ)|²/a_j )^{1/2} ( Σ_{j=1}^{∞} a_j )^{1/2} < ∞.

Next, we show that Assumption A1 implies A(iii). By Assumption A1(ii) and (iv) and Lemma 3.4.16, {h_j(X_ni)} is a near-epoch dependent triangular array on {V_ni} with near-epoch dependence numbers {η_{h_j}(s)} that satisfy η_{h_j}(s) ≤ B_j η(s) ∀s ≥ 0, where {η(s)} are the near-epoch dependence numbers of {X_ni}. Thus, Assumption A1(ii) and Lemma 3.4.17 with Z_ni = h_j(X_ni) and a = b = [s/2] give

γ_j(s) ≤ sup_{i≤n, n≥1, ν≥1} ||h_ν(X_ni)||_2 ( B_j η([s/2]) + 6 sup_{i≤n, n≥1, ν≥1} ||h_ν(X_ni)||_r α([s/2])^{(r−2)/(2r)} ).
Hence,

γ_j = Σ_{s=−∞}^{∞} γ_j(s) ≤ D_1 B_j Σ_{s=−∞}^{∞} η([s/2]) + D_2 Σ_{s=−∞}^{∞} α([s/2])^{(r−2)/(2r)}

∀j ≥ 1, for some finite constants D_1 and D_2, using the fact that {η(s)} and {α(s)} are of size −1 and −2r/(r − 2), respectively. Thus, Σ_{j=1}^{∞} a_j γ_j < ∞ is implied by Σ_{j=1}^{∞} a_j B_j < ∞ and Σ_{j=1}^{∞} a_j < ∞. This result and Assumption A1(ii) imply Assumption A(iii).

Assumption B (Fidi convergence): For some r > 2,
(i) S(τ, γ) = lim_{n→∞} Cov(ν_n(τ), ν_n(γ)) exists ∀τ, γ ∈ T;
(ii) sup_{i≤n, n≥1} E||X_ni||^r < ∞;
(iii) {X_ni : i ≤ n, n ≥ 1} is a near-epoch dependent triangular array of size −1 on {V_ni}, where {V_ni : i = 0, ±1, . . . ; n ≥ 1} is some strong mixing double array of size −2r/(r − 2); and
(iv) m(·, τ) ∈ Lip(1, CX) ∀τ ∈ T for some C < ∞.

Theorem 3.4.18 (Andrews (1991)): Under Assumption B, for each finite subset (τ_1, . . . , τ_ν) of T, (ν_n(τ_1), . . . , ν_n(τ_ν))′ converges in distribution to a N(0, S_ν) random variable, where S_ν is a ν × ν covariance matrix with (s, t)th element S(τ_s, τ_t).

Proof of Theorem 3.4.18: The proof is based on the following proposition of Wooldridge (1986) and the above Lemmas 3.4.16 and 3.4.17.

Proposition 3.4.19: Let {Z_ni : i ≤ n, n ≥ 1} be a triangular array of real-valued random variables that satisfies
(i) lim_{n→∞} Var( (1/√n) Σ_{i=1}^{n} Z_ni ) = σ²,
(ii) sup_{i≤n, n≥1} E|Z_ni|^r < ∞ for some r > 2, and
(iii) {Z_ni} is near-epoch dependent of size −1 on {V_ni}, where {V_ni} is a strong mixing double array of random variables of size −2r/(r − 2).
Then (1/√n) Σ_{i=1}^{n} (Z_ni − E Z_ni) converges to N(0, σ²) as n → ∞.
Let τ = (τ_1, . . . , τ_ν) ∈ T^ν and consider m(·, τ_1), . . . , m(·, τ_ν). It suffices to show that conditions (i), (ii), and (iii) of the above proposition hold with Z_ni = λ′m(X_ni, τ) and σ² = λ′S_νλ, ∀τ ∈ T^ν, ∀λ ∈ R^ν with ||λ|| = 1, and ∀ν ≥ 1.

Central Limit Theorem for empirical processes: There are many contributions on the central limit theorem in the statistical literature of empirical processes, particularly for processes indexed by function spaces; for example, Hoffmann-Jørgensen and Pisier (1976), Araujo and Giné (1980), Dehling (1983), Ermakov and Ostrovskii (1986), Ledoux and Talagrand (1991), Andrews (1991), and Dedecker and Prieur (2007). Hence, for the applications of the occupation time statistics, we can simply apply the central limit theorem for empirical processes and obtain similar asymptotic properties. We state a few generalizations of the empirical central limit theorem in the following. Define U(T) = {y ∈ L^∞(T) : y is uniformly continuous with respect to ρ on T}. Using the above definitions and the lemmas given, we have the following result for the empirical process {ν_n(·) : n ≥ 1}.

Theorem 3.4.20 (Andrews (1991)): For ν_n(·) and ρ(·, ·) defined above, Assumption A or A1 plus Assumption B imply that ν_n(·) converges weakly to ν(·), where ν(·) is a mean-zero Gaussian process with covariance function S(·, ·) whose sample paths lie in U(T) with probability one.

Proof: The above Theorems 3.4.11, 3.4.15, and 3.4.18, together with Pollard [Theorem 10.2], give the result.

This shows that, for dependent observations with a near-epoch dependence condition and heterogeneous distributions, the empirical process converges weakly to a normally distributed random variable with mean zero and a suitable covariance asymptotically. That is to say, asymptotic normality is available if the observations follow the given conditions, which include many classes of time-series dependence or models (without identical distributions). Although the dependence condition is rather general, the result holds only if the observations follow the given dependence condition. In fact, there are many cases that do not satisfy the mixing condition. Hence, we introduce the contribution of Dedecker and Prieur (2007) for dependent and stationary cases. The dependence condition for
observations is more flexible than the usual mixing conditions. Their work proves the central limit theorem for the empirical processes of d-dimensional distribution functions for a class of stationary sequences. Due to the high-level complexity of the technicality, the proofs are left for the interested readers.

Definition 3.4.21 (Dedecker and Prieur (2007)): Let (Ω, A, P) be a probability space, let M be a sub-σ-algebra of A, and let d be a given positive integer. Let X = (X_1, . . . , X_k) be a random variable with values in R^{kd}. Let P_X be the distribution of X and let P_{X|M} be the conditional distribution of X given M. For 1 ≤ i ≤ k and t in R^d, let g_{t,i}(x) = 1_{x≤t} − P(X_i ≤ t), where x ≤ t means that x^{(j)} ≤ t^{(j)} for 1 ≤ j ≤ d. Define the random variable

b(M, X_1, . . . , X_k) = sup_{(t_1,...,t_k)∈R^{kd}} | ∫ ∏_{i=1}^{k} g_{t_i,i}(x_i) P_{X|M}(dx) − ∫ ∏_{i=1}^{k} g_{t_i,i}(x_i) P_X(dx) |,

with P_{X|M}(dx) = P_{X|M}(dx_1, . . . , dx_k) and P_X(dx) = P_X(dx_1, . . . , dx_k). For any p in [1, ∞], define the coefficient β_p(M, X_1, . . . , X_k) = ||b(M, X_1, . . . , X_k)||_p, where b(·, ·) is characterized in the following proposition. For p = 1 or ∞,

β_1(M, X_1, . . . , X_k) = β(M, X_1, . . . , X_k) and β_∞(M, X_1, . . . , X_k) = φ(M, X_1, . . . , X_k).

Let Λ_1(R^{kd}) be the space of functions f satisfying

|f(x_1, . . . , x_{kd}) − f(y_1, . . . , y_{kd})| ≤ Σ_{i=1}^{kd} |x_i − y_i|.

Let p ≥ 1 and assume that X_i^{(j)} belongs to L^p(P) for any 1 ≤ j ≤ d and 1 ≤ i ≤ k. Define the coefficient

τ_p(M, X_1, . . . , X_k) = || sup { |∫ f(x) P_{X|M}(dx) − ∫ f(x) P_X(dx)| : f ∈ Λ_1(R^{kd}) } ||_p.
Proposition 3.4.22 (Dedecker and Prieur (2007)): Let BV_1 be the space of left-continuous functions f whose bounded variation norm is smaller than 1; that is, df is a signed measure such that ||df|| = sup{ |df(g)| : ||g||_∞ ≤ 1 } ≤ 1. Let X = (X_1, . . . , X_k) be a random variable with values in R^k. If f is a function in BV_1, let f^{(i)}(x) = f(x) − E(f(X_i)). Keeping the same notation as in Definition 3.4.21, we have the equality

b(M, X_1, . . . , X_k) = sup_{f_1,...,f_k∈BV_1} | ∫ ∏_{i=1}^{k} f_i^{(i)}(x_i) P_{X|M}(dx) − ∫ ∏_{i=1}^{k} f_i^{(i)}(x_i) P_X(dx) |.

Dedecker and Prieur (2007) then define the following coefficients for the dependence conditions of the observations. The conditions allow weaker dependence for the generalization of empirical processes.

Definition 3.4.23 (Dedecker and Prieur (2007)): Let (Ω, A, P) be a probability space. Let (X_i)_{i≥0} be a sequence of random variables with values in R^d, and let (M_i)_{i≥0} be a sequence of σ-algebras of A. For any p ≥ 1, k ∈ N ∪ {∞} and n ≥ 1, define

β_{k,p}(n) = max_{1≤l≤k} sup_{i+n≤j_1<···<j_l} β_p(M_i, X_{j_1}, . . . , X_{j_l}).

Among the sufficient conditions considered for the empirical central limit theorem is (ii): there exists ε > 0 such that Σ_{k=1}^{+∞} k β_{2,d+ε}(k) < +∞.

Theorem 3.4.24 (Dedecker and Prieur (2007)): If either condition (i) or (ii) holds, then {√n(F_n(t) − F(t)), t ∈ R^d} converges weakly in ℓ^∞(R^d) to a tight Gaussian process with covariance function

Γ(t, s) = Σ_{k∈Z} Cov(1_{X_0≤t}, 1_{X_k≤s}).  (3.18)
In other words, to apply the asymptotic results for empirical processes to the event studies, we can treat the cumulative sums of abnormal returns as the functionals on which the statistics are defined. The weak convergence results for empirical processes imply that we can apply the central limit theorem when the statistics are obtained from the data. For independent observations, without using mixing or near-epoch dependence, we show the following result provided by van der Vaart and Wellner (1996) for a very general class of observations. The case with heterogeneous distributions is covered by the following arguments: even though the data are not identically distributed, the weak convergence of the empirical processes remains valid.

Theorem 3.4.25 (van der Vaart and Wellner (1996)): For each n, let Z_{n1}, Z_{n2}, . . . , Z_{nm_n} be independent stochastic processes indexed by a totally bounded semi-metric space (F, ρ). Assume that the sums Σ_{i=1}^{m_n} c_i Z_ni are measurable as indicated, that

Σ_{i=1}^{m_n} E* ||Z_ni||²_F 1{ ||Z_ni||_F > η } → 0 for every η > 0,

and that

sup_{ρ(f,g)<δ_n} Σ_{i=1}^{m_n} E( Z_ni(f) − Z_ni(g) )² → 0 for every δ_n ↓ 0.

Under these conditions (together with the remaining regularity conditions in van der Vaart and Wellner (1996)), the process Σ_{i=1}^{m_n}(Z_ni − E Z_ni) converges weakly to a tight Gaussian process once its finite-dimensional distributions converge.

Bai (1994) applies related weak convergence results to the sequential empirical process of estimated residuals in ARMA models. Let K_n(s, x) and K̂_n(s, x) denote the sequential empirical processes based on the true innovations ε_t and the estimated residuals ε̂_t, respectively, for an ARMA(p, q) model with autoregressive coefficients ρ_1, . . . , ρ_p and moving average coefficients θ_1, . . . , θ_q.

Theorem 3.4.26 (Bai (1994)): Assume that the ε_t are i.i.d. with distribution function F admitting a bounded density f with f > 0 a.e., and that √n(ρ̂_i − ρ_i) = O_p(1) and √n(θ̂_j − θ_j) = O_p(1), i = 1, . . . , p, j = 1, . . . , q. Then

sup_{s∈[0,1], x∈R} |K̂_n(s, x) − K_n(s, x)| = o_p(1).

With the stronger condition of consistency of the estimates, Bai (1994) shows the following theorem.

Theorem 3.4.27 (Bai (1994)): Assume the following conditions are satisfied:
1. the ε_t are i.i.d. with distribution function F belonging to the domain of attraction of a stable law with index α, 0 < α < 2;
2. the distribution function F admits a bounded derivative f, f > 0 a.e.;
3. n^γ(ρ̂_i − ρ_i) = o_p(1) and n^γ(θ̂_j − θ_j) = o_p(1), i = 1, . . . , p, j = 1, . . . , q, where γ = (1/2)I(α > 1) + (1/α − 1/4)I(α < 1);
then

sup_{s∈[0,1], x∈R} |K̂_n(s, x) − K_n(s, x)| = o_p(1).
For the applications of Bai’s (1994) work in sequential empirical process, we let Z 1 , Z 2 , ...Z [nr] , Z [nr]+1 , . . . , Z n be random variables. Suppose that the first [nτ ] are i.i.d with distribution function F1 and n − [nτ ] are i.i.d. with distribution function F2 , where τ ∈ [0, 1] unknown. Let F[ns] and Fn−[ns] be the empirical distribution functions constructed from the first [ns] and the last n − [ns] observations, respectively. Consider the process
√ [ns] [ns] (x) Tn (s, x) = n 1− F[ns] (x) − Fn−[ns] n n
(3.22)
and the test statistic Mn = sups∈[0,1],x∈R |Tn (s, x)|. One rejects the null hypothesis H0 when Mn is too large. The above theorem 3.4.22. provides further the test for the distributional change of residuals. Since the residuals are not observable the test is using the estimated residuals instead. Define Tˆn (s, x) =
√ [ns] [ns] ˆ n 1− F[ns] (x) − Fˆn−[ns] n n
(3.23)
are estimated empirical distribution functions where Fˆ[ns] and Fˆn−[ns] based on the residuals. Define Mˆ n correspondingly as before.
Theorem 3.4.28 (Bai (1994)): It implies that Tn (., .) and Tˆn (., .) converge weakly under the null hypothesis to a Gaussian process B(., F(.)) with zero mean and the covariance function as E B(s, u)B(t, v) = (s ∧ t − st)(u ∧ v − uv), where F denotes F1 = F2 . Accordingly, we can have the test as d Mˆ n → sup0≤s≤1 sup0≤t≤1 |B(s, t)| where the critical values are tabulated by Picard (1985). The distribution function of Kiefer process however, is not available in the analytical form. Inoue (2001) has a similar approach for testing the structural change in time series observations. Particularly, the settings are for the distributional
3 Occupation Time Statistics—The Intensity . . .
121
change in the data when the change point and the distribution of the population is unknown a priori. Let {X ni : i ≤ n, n = 1, 2, . . .} be a triangular array of p-dimensional random variables defined on a complete probability space (, A, P). The study is to test that there is no change of the distribution F such that P(xni ≤ t) = F(t) for all t ∈ R p and i = 1, 2, . . . n and n ≥ 1. Since there is no knowledge of the distributional form or the change point, the test is to consider the differences of the empirical distributions m n 1 1 | δ(X n,i ≤ t) − δ(X ni ≤ t)| m n−m i=1
i=m+1
where the m denotes the possible change date. Inoue (2001) hence, considers the following two sequential statistics: one is the weighted Kolmogorov–Smirnov statistics T1 = sup1≤m d(2 + γ ) the following moment constraints hold E[εt |F t−1 ] =&0, with F t−1 =' σ (ε j : j ≤ t − 1) ∀t ∈ Z, 2+γ
E[ε12 ] = 1, E |ε1 | Q 2 < ∞ , the following integrals exists 2+γ Q 2+γ |σ (u)| Q 2 d FX (u) < ∞, |m(n)| 2 d FX (u) < ∞, and the mixing coefficient α(.) such that ∞
γ
t Q−2 α(t) 2+γ < ∞,
t=1
a test statistics for H0 can be shown as 1 βn (s, z) = √ n
( [ns] i=1
[ns] Yi δ(X i ≤ z) − Yi δ(X i ≤ z) n n
i=1
)
142
J.-L. Jeng
for s ∈ [0, 1] and z ∈ R d . Thereby, in the i.i.d. setting it can be shown that βn (s, z) = αn (s, z) − sαn (1, z) + o p (1), for s ∈ [0, 1] and z ∈ R d , where [ns] 1 (Yi δ(X i ≤ z) − E[Yi δ(X i ≤ z)]). αn (s, z) = √ n i=1
Under H0, (Yt , X t ) ∈ R × R d , t ∈ Z is strictly stationary and condition (A1) in Theorem 3.4.46 holds, let F = (y, x) → yσ (x ≤ z) : z ∈ R d hold that N[] (ε, F , L 2 (O)) = O(ε−2d ) where X 1 ∼ P. The integral condition in (A2) holds for all Q > d(2 + γ ). Let the moment conditions for ε1 , m, σ hold so that condition (A3) of Corollary 3.4.46 is satisfied, then the above Corollary can be applicable to the sequential empirical process as [ns] 1 (ϕ(Yi , X i ) − E[ϕ(Yi , X i )]), s ∈ [0, 1], ϕ ∈ F , n ∈ N. √ n i=1
With these works given (particularly for the in-sample statistics), we can modify the framework toward to monitoring tests in order to pursue the online detection of the impacts of the events. Therefore, this shows that the sequential change point detection can be done with the in-sample empirical distributions where the Gaussian process could be used to consider. However, the scheme is for the in-sample retrospective statistics, not for the monitoring (online) detection of the change point in the regression models or else. Nevertheless, the concepts of sequential detection and the weak convergence conditions can be modified to meet the requirement of the online monitoring approach.
3 Occupation Time Statistics—The Intensity . . .
3.5
143
Occupation Time Statistics and Similarity with Empirical Process
The Similarity of Occupation Time Statistics and Empirical Process The definitions of occupation time statistics and empirical process are similar to each other except for the range of the variable of interest (i.e. One is for the variable is less than the threshold while the other is for the variable is greater than the threshold). The advantage is that the definition of caption time is on the asymptotic distribution of cumulative sum of abnormal returns, instead of the distribution of each individual returns. Therefore, the occupation time statistics are in fact, the opposite of the empirical distribution function by the definitions. Yet, since the asymptotic arguments for the empirical processes follow, the same properties apply to the statistics of our interest, particularly for the empirical processes indexed by the functions defined in some function spaces. As described in many areas that the individual stock return is famous with skewness and kurtosis which make make many statistics suffer from the slow rate of weak convergence in central limit theorem. The current work is based on the cumulative sum of abnormal returns instead. Hence, the asymptotic argument here is about the cumulative sum and not the distribution of the individual stock returns. The convergent in distribution is provided on the cumulative sum of abnormal returns. Hence, with appropriate normalization, the sum will converge weakly to an appropriate distribution (say, of Kiefer process) even though the individual stock returns may suffer from the changing skewness and kurtosis for different distributions over time. It is not difficult to find out that the usual identification of the impacts of event which depend on the stock returns should not be limited on the magnitude of the impacts alone. Instead, if the event is essential and significant, the impacts should last through some period of time which is similar to the severity (or intensity) of the impacts to earthquake. For example, the conventional statistics apply the cumulative sum of abnormal returns that are similar to the structural change tests in the literature are only to consider the impacts through the changes of magnitude (over the parameters of interest). It may for instance, identify the change on the distribution of parameters, however. Usually, these findings may identify
144
J.-L. Jeng
the change of stock returns due to the events alone. Yet, this does not guarantee the impacts of the events are strong enough to last for several day, for instance. Especially, the usual findings focus on the identification of events on the detection of the change in the sequential analysis. It fails to recognize that these changes alone are not enough to prove that the events are significant for the market. The flow of information in the market (or hetroskedasticity) may alternate the underlying stock returns anyway. In fact, similar to the earthquake studies, the intensity and the duration of the shocks are essential to consider the impacts of the events. To identify the events on the stock returns therefore, we need an additional method to carefully point out the impacts of the events or to detect the consequence.
References Andrews, D.W.K. 1991. An Empirical Process Central LimitTheorem for Dependent Non-identically Distributed Random Variables. Journal of Multivariate Analysis 38: 187–203. Andrews, D.W.K. 1994. Empirical Process Methods in Econometrics. Handbook of Econometrics 4 (37): 2248–2296. Araujo, A., and E. Giné. 1980. The Central Limit Theorem for Real and Banach Valued Random Variables. New York: Wiley. Bai, J. 1994. Weak Convergence of the Sequential Empirical Processes of Residuals in ARMA Models. Annals of Statistics 22: 2051–2061. Berkes, I., S. Hörmann, and J. Schauer. 2009. Asymptotic Results for the Empirical Process of Stationary Sequences. Stochastic Processes and Their Applications 119: 1298–1324. Dedecker, J., and C. Prieur. 2007. An Empirical Central Limit Theorem for Dependent Sequences. Stochastic Processes and Their Applications 117: 121– 142. Dehling, H. 1983. Limit Theorems for Sums of Weakly Dependent Banach Space Valued Random Variables. Probability Theory and Related Fields 63: 393–432. Ermakov, S.V., and E.I. Ostrovoskii. 1986. The Central Limit Theorem for Weakly Dependent Banach-Valued Random Variables. Theory of Probability and Its Applications 30: 391–394. Hoffmann-Jorgensen, J., and G. Pisier. 1976. The Law of Large Numbers and the Central Limit Theorem in Banach Spaces. Annals of Probability 4: 587–599.
3 Occupation Time Statistics—The Intensity . . .
145
Inoue, A. 2001. Testing for Distributional Change in Time Series. Econometric Theory 17 (1): 156–187. Ledoux, M., and M. Talagrand. 1991. Probability in Banach Spaces. Strasbourg: Springer-Verlag. Mohr, M. 2019. A Weak Convergence Result for Sequential Empirical Processes Under Weak Dependence. Stochastics 92: 140–164. Pechtl, A. 1999. Distributions of Occupation Times of Brownian Motion with Drift. Journal of Applied Mathematics and Decision Sciences 3 (1): 41–62. Picard, D. 1985. Testing and Estimating Change-Points in Time Series. Advanced Applied in Probability 17: 841–867. Takács, L. 1998. Sojourn Times for the Brownian Motion. Journal of Applied Mathematics and Stochastic Analysis 11: 231–246. van der Vaart, V.D., and J.A. Wellner. 1996. Weak Convergence and Empirical Processes. New York: Springer.
4 Monitoring Tests and the Time of Duration
4.1
Introduction
In this chapter, we begin with the introduction for the discussions of monitoring tests and their limitations. In particular, in introducing the concept of time duration, the definition of stochastic drawndown (and drawup) for the underlying stochastic processes (due to its running nature), is given according to the sample path of the processes. It is not only the magnitude of the change is of concern. Instead, the duration (or sample path) for the change and its likelihood are investigated. The statistical inferences hence, no longer focus on the change (of the parameter) alone. The interest of study focuses on the time span and the likelihood that a certain change may happen. In terms of event studies in corporate finance, this corresponds to the study of intensity of the events and the essentiality of the events of interest. Although the idea of the occupation time is discussed for the stochastic drawdown or drawup which may emphasize the time span the incidents happen, the application is to identify the market crash (for the portfolios) and its probability of the associated extents. In other words, even the time span of the stochastic process is of interest, it does not consider the associated time span for the other events of interest except the © The Author(s) 2020 J.-L. Jeng, Contemporaneous Event Studies in Corporate Finance, https://doi.org/10.1007/978-3-030-53809-5_4
147
148
J.-L. Jeng
extremes. However, it covers the discussions of associated time span for the (present value) of portfolio and the running extremes upon the values. Hence, it alternates the statistical inferences in identifying the changes (of parameters) in magnitude to the time-wise assessment for the extremes of the stochastic processes. Instead, the studies are on emphasizing on the time frame of the underlying stochastic processes to pass through certain extremes (from their own histories) and the sample path to reach the thresholds. Intuitively, it is very similar to the intensity we said about the event incidence.
4.2
Essential Corporate Finance Events and Intensity
For instance, in the work of Aue et al. (2006), the monitoring scheme to detect the change point is when the underlying statistic once exceeds the boundary function given where the change is considered permanent. We now show the framework of the detection scheme as in Aue et al. (2006) for elaboration. Consider the following regression model of interest. Let yi = xiT βi + i ,
1 ≤ i < ∞,
(4.1)
where xi is a p × 1 dimensional random or deterministic vector of the form xiT = (1, x2,1 , . . . , x p,i ) βi is a p × 1 m dimensional parameter vector and {i } is an error sequence. the first assumption states that there is no change of parameters in the first m observations 1 ≤ i ≤ m, (4.2) βi = β0 We wish to test no change in the regression parameter null hypothesis H0 : βi = β0 ,
i = m + 1, m + 2, . . . ,
against the alternative hypothesis such as
(4.3)
4 Monitoring Tests and the Time . . .
149
H :a there exists a k ≥ 1 such that βi = β0 when m < i < m + k , but βi = βk i = m + k , m + k + 1, . . . with β0 = βk . The parameters β0 , βk and k are assumed unknown. The monitoring procedures use the detector function (m, k) and a boundary function g(m, k) which together define the stopping time τ (m) = in f {k ≥ 1 : |(m, k)| ≥ |g(m, k)|}
(4.4)
which must satisfy
lim m→∞ P[τ (m) < ∞] = α, under H0 , lim m→∞ P[τ (m) < ∞] = 1, under Ha . The probability α ∈ (0, 1) controls the false alarm rate which ensures the probability of a false alarm is asymptotically bounded by α, and the change point is detected with probability approaching one. Furthermore, assume that Ei = 0, Ei j = 0 (i = j) and Ei2 ≤ C with some C > 0, ∞ Cov(02 , i2 ) > 0. Ei4 0 such that sup1≤k b} , for any a, b > 0. Theorem 4.3.1 (Zhang (2018)): We have P(τ D÷ (a) < τU (a) ∧ T ) =
e−2δa + 2δa − 1 e−2δa + e2δa − 2
156
J.-L. Jeng
−
∞ 2n 2 π 2
Cn2
n=1
e
2C n 2a 2
−e
T(
(An + Bn T ),
where
n −δa
An = 1 − (−1) e
n −δa
Bn = 1 − (−1) e
4δ 2 a 2 1− + (−1)n δae−δa , Cn
n2π 2σ 2 a2
,
Cn = n 2 π 2 + δ 2 a 2 ,
(4.19) μ δ= 2 σ (4.20)
Proposition 4.3.2 (Zhang (2018)): For t > 0 and −a ≤ x < 0,we have P(τ D÷ (a) ∈ dt, τU÷ (a) > t, X t ∈ d x) = g(t, x)dtd x, where
2 nπ x 2 δx− σ C2n t 2 π 2 σ 2 t − 2a 2 sin nπ x 2a − (n g(t, x) = σ δ ∞ nπ e 2 nπaxcos n=1 a a
a
with δ and Cn , n ∈ N defined as above.
Proposition 4.3.3 (Zhang (2018)): In the case of an infinite time-horizon, we have e−2δa + 2δa − 1 . P(τ D÷ (a) < τU÷ (a)) = −2δa e + e2δa − 2 Proof of Theorem 4.3.1: We use the Proposition 4.3.2 to obtain P(τ D÷ (a) < τU÷ (a) ∧ T )
= P(τ D÷ (a) < τU÷ (a)) − P(T ≤ τ D÷ (a) < τU÷ (a)) ∞ 0 = P(τ D÷ (a) < τU÷ (a)) − P(τ D÷ (a) ∈ dt, τU÷ (a) > t, X t ∈ d x) = P(τ D÷ (a) < τU÷ (a)) −
T
∞
−a
dt T
0 −a
g(t, x)d x (4.21)
4 Monitoring Tests and the Time . . .
157
Apply the results of the above propositions yields the results. In addition, the following corollary connect the relation with the range process of the Brownian process. Corollary 4.3.4 Let R = X t − X t , be the range process of X t . Then, P(RT < a) =
∞ 4n 2 π 2 n=1
Cn2
e
2C n 2a 2
−e
T
( A˜ n + B˜ n T ),
where 4δ 2 a 2 n ˜ An = (1 − (−1) cosh(δa)) 1 − − (−1)n δa · sinh(δa) Cn (4.22) 2π 2σ 2 n B˜ n = 1 − (−1)n cosh(δa) (4.23) a2
Proof. Define the first passage time of range process R by τ R÷ (a) = in f {t ≥ 0 : Rt > a)} it is easy to see that τ R÷ (a) = τ D÷ (a) ∧ τU÷ (a). Therefore, we have P(RT < a) = P(τ R÷ (a) ≥ T ) = 1 − P(τ R÷ (a) < T )
= 1 − P(τ D÷ (a) < τU÷ (a) ∧ T ) − P(τU÷ (a) < τ D÷ (a) ∧ T ) ˜ ÷ (a) < τ ÷ (a) ∧ T ) = 1 − P(τ ÷ (a) < τ ÷ (a) ∧ T ) − P(τ D
U
D
U
where P˜ is a probability measure under which the law of X is the same as that of −X under the original measure P.2
158
J.-L. Jeng
It appears that the statistics of stochastic drawdown and drawup although explain the time-wise properties of the stochastic process depend heavily on the assumption of the characteristics of the process. The above work however, states the importance in using the statistics of time span to describe the underlying nature of stochastic process—especially, it can be applied to explain the possible market crash and its speed of crash for the portfolio of interest. However, the technicalities depend heavily on the assumption of the characteristics of the underlying stochastic processes. With the stochastic processes assumed, the properties and mathematical derivations can be done in detail.
Notes 1. We will, however, pick up the monitoring scheme of analyses in Chapter 5 for the sequential detection on the occupation time statistics instead The difference is that we don’t have to specify that characteristics of the stochastic process for the statistics used. 2. It indicates that the time span of the incidence is of concern, when one is discussing the pattern of stochastic process. That is, instead of looking at the patterns of the stochastic process and its magnitude, the time horizon of the process is of importance However, the current discussion is useful only if the characteristics of the stochastic process are shown. In other words, it is given only if the stochastic process is assumed in the issue. The question is, we usually do not the details for the stochastic process that (say) governs the stock returns.
References Aue, A., L. Horváth, M. Huškova, and O. Kokoszka. 2006. Change-Point Monitoring in Linear Models. Econometrics Journal 9 (3): 373–403. Clark, T., and M. McCraken. 2005. Evaluating Direct Multistep Forecasts. Econometric Reviews 24 (4): 369–404.
4 Monitoring Tests and the Time . . .
159
Horvath, L., M. Huskova, P. Kokoszka, and J. Steinebach. 2004. Monitoring Changes in Linear Models. Journal of Statistical Planning and Inference 126 (1): 225–251. Zhang, H. 2018. Stochastic Drawdowns. London: World Scientific.
5 Sequential Monitoring for Corporate Events in Using Occupation Time Statistics
5.1
Introduction
As many advocated to develop empirical tests for event studies in corporate finance, the theme is usually on the robust version on tests for structural differences either between for-event-periods and off-event periods (of some preselected firms) or on the event-related firms versus noneventrelated firms. In most occasions, the tests (parametric or not) are based on various forms of cumulative abnormal returns. Unfortunately, as stated in Jeng (2015), these tests are rarely different from the tests for structural change(s) in financial time series. Given that financial time series are notorious for the time-varying parameters (even when no obvious corporate event is incurred), these tests are easily bound for over-rejection in showing significance of presumed events of interest. For that matter, an alternative methodology based on the occupation time of stochastic processes is provided in Jeng (2015) and Jeng et al. (2016). In the current paper, this methodology is extended to provide a diagnostic test for these occupation time statistics. In particular, the diagnostic test can be applied to tell the difference between the so-called event-related firms versus nonevent-related firms over the arbitrary time horizon that encompasses © The Author(s) 2020 J.-L. Jeng, Contemporaneous Event Studies in Corporate Finance, https://doi.org/10.1007/978-3-030-53809-5_5
161
162
J.-L. Jeng
the presumed corporate finance events. More specifically, instead of using the cumulative abnormal returns for the preselected event-related firms to test the impact of events, the diagnostic test is simply to investigate if there is a significant difference between the occupation time statistics of event-related firms versus nonevent-related firms. Intuitively, if the (interested) corporate finance event is relevant, there should be a significant difference between these two samples. In other words, this is similar to the development for a two-sample test for the function of abnormal returns. Specifically, the approach is similar to that of empirical processes where empirical distribution functions are of use. The difference, however, is that the emphasis is on the occupation time statistics where the absolute value of (normalized) cumulative abnormal returns are exceeding some thresholds within the unit interval. In addition, as many research articles are attempting, an online sequential detection of monitoring scheme is usually considered as an alternative early warning device to many issues in economics and finance. However, most of these developed methods focus merely on the boundary-crossing detection. Those devices however, do not consider the intensity of such boundary-crossings and may easily include false alarms. To accommodate these issues, a sequential monitoring test is developed to detect the possible impact of events of interest when additional observations of financial time series are available recursively.1
5.2
Sequential Monitoring Tests and Analyses
There are two different approaches to analyze the data sequentially—one is the retrospective way in using the in-sample statistics to consider the possible structural change (and its change point) and the online monitoring approach to investigate the change with sequential observations. In the following, we emphasize the second approach to analyze the change when new observations are available. We now introduce the work of Huškova and Chochola (2010) as an introduction for the sequential monitoring for change in distribution.
5 Sequential Monitoring for Corporate Events . . .
163
Let X 1 , . . . , X n , . . . be the observations arriving sequentially and let X i have a continuous distribution function Fi , i = 1, 2, . . . and the first m observations have the same distribution F0 , that is F1 = . . . = Fm = F0 , where F0 is unknown. X 1 , . . . , X m are called training data. The null hypothesis is specified as H0 : Fi = F0
∀i > m,
and the alternative hypothesis is given as H A : there exists k ≥ 0 such that Fi = F0 ,1 ≤ i ≤ m + k , Fi = F 0 , m + k < i < ∞, F0 = F 0 The testing procedue is described by the stopping rule as (5.1) τm,N = in f 1 ≤ k ≤ N : |Q(m, k)| ≥ cqγ (k/m) where in f ∅ := ∞ and either N = ∞ or N = N (m) with lim m→∞ N (m) /m = ∞, Q(m, k) is the detector depending on k = 1, 2, . . . qγ (t), t ∈ (0, ∞) is a boundary function with γ ∈ [0, 1/2) (a tuning parameter) and c is a suitably chosen positive constant. Furthermore, under H0 for α ∈ (0, 1) lim m→∞ PH0 (τm,N < ∞) = α,
(5.2)
lim m→∞ PH A (τm,N < ∞) = 1,
(5.3)
and under H A ,
Thus. under H0 , the test has the asymptotic level α and under H A , the test if consistent. In particular, let the detector Q(m, k) and the boundary function qγ (.) and constant c be given as under H0 |Q(m, k)| lim m→∞ P max1≤k c = α, qγ (k/m)
(5.4)
164
J.-L. Jeng
m+k where Q(m, k) = σˆ 1√m i=m+1 ( Fˆm (X i ) − 21 ), k = 1, 2, . . . , Fˆm is m an empirical distribution function based on X 1 , . . . , X m and σˆ m is a suitable standardization based on X 1 , . . . , X m . They set the boundary function as 1 qγ (t) = (1 + t)/(t/(1 + t))γ , t ∈ (0, ∞), 0 ≤ γ < . 2
(5.5)
They have given two sets of assumptions (H 1 ){X i }i are independent identically distributed (i.i.d) random variables {X i } has continuous distribution function F0 , (H2 ) {X i }i is a strictly stationary α-mixing sequence with {α(i)} such that for all δ > 0 P (|X 1 − X 1+i | ≤ δ) ≤ D1 δ, α(i) ≤ D2 i −(1+η)3 ,
i = 1, 2, . . .
i = 1, 2, . . .
(5.6) (5.7)
for some positive constants η, D1 ,D2 , and X i has a continuous distribution function F0 , the coefficient α(i) is defined as α(i) = sup A,B |P(A ∩ B) − P(A)P(B))|, where sup is taken over A ∈ σ (X j , j ≤ n) and A ∈ σ (X j , j ≥ n + i). Theorem 5.2.1 (Huškova and Chochola (2010)) (I) Let the sequence {X i }i 1 fulfill the assumption (H1 ) and put σˆ m2 = 12 . Then,
|W (t)| = P sup0≤t 0} m+[mt] 1 ˆ 1 Vm (t) = √ ( Fm (X i ) − ) 2 m i=m+1
is the same as {Z m (t), t > 0} where ⎛ ⎞ m+[mt] m 1 1 1 k (F0 (X i ) − ) − (F0 (X j ) − )⎠ , Z m (t) = √ ⎝ 2 m 2 m i=m+1
j=1
166
J.-L. Jeng
m+[mt] 1 √1 i=m+1 (F0 (X i ) − 2 ) converges to a m m 1 2 √1 j=1 (F0 (X j ) − 2 ) converges to N (0, σ ), m
as m → ∞, the process
Gaussian process and where ∞ 1 2 σ = cov{F0 (X 1 ), F0 (X j+1 )} +2 12 j=1
In the independent case, σ 2 = 1/12. While in the dependent ones we used the estimator of σ 2 as σˆ m2
ˆ = R(0) +2
m
ω(k/ m ) Rˆ m (k),
(5.11)
k=1
1 ˆ 1 1 Rˆ m (k) = ( Fm (X i ) − )( Fˆm (X i+k ) − ) n 2 2 n−k
(5.12)
i=1
where ω(.) is a weight function. Usual choice is either
1 1 + 2(1 − t) 0 E
P max1≤k≤N √
≤ q −2
m(1 +
N
E
≤q
D
m
k k γ m )( m+k )
m
≥ q|X 1 , . . . , X m
m
j=1 h(X j , X i )
m 3 (1 +
k=1
−2
|J1 (m, k)|
k 2 k 2γ m ) ( m+k )
−2+2γ −2γ
k
2
+
k=1
N
k
−2
k=m+1
= q −2 O(m −1 ) for some D > 0. This relation holds true for any integer N . Hence, the limit behavior of lim 1≤k≤N |Q(m,k)| is the same as max1≤k≤N √|J2 (m,k)| k k . qγ ( m )
mqγ ( m )
When {X i }i is α-mixing, it implies that {φ(X i )}i is also α-mixing for any
168
J.-L. Jeng
measurable function φ with the same mixing coefficient. Therefore, for a positive constant D such that for h(., .) we have 2
|E (h(X i1 , X i2 ) h(X i3 , X i4 )| ≤ D (α(i)) 3−ξ , for any ξ > 0, where i = min i (2) − i (1) , i (4) − i (3) with i (1) ≤ i (2) ≤ i (3) ≤ i (4) such that E J1 (m, k)2 ≤ Dmk for some D > 0. And hence, by Theorem B.4 in Kirch (2006), we get P max1≤k≤N
|J1 (m, k)| (1 +
k k γ m )( (m+k) )
≥q
≤ q −2 O(m −1 (log(N ))2 ).
Theorem 5.2.2 (Huškova and Chochola (2010)) Let {X i }i satisfy either (A1 ) or (A2 ), let k < N η for some 0 ≤ η < 1, let (5.5) be satisfied. Then as m → ∞, |Q(m, k)| p sup1≤k 2. Then, as m → ∞, σˆ m2 − σ 2 = o p (1). We can also survey the work of Li et al. (2015) for monitoring distributional changes in autoregressive AR(p) models based on the weighted empirical process of residuals. Let {xt , t = p + 1, . . .} be an AR(p) process defined by the equation (5.15) xt = β X t−1 + t where X t−1 = xt−1 , . . . , xt− p and β = β1 , . . . , β p is an unknown regression parameter. The errors t are independent, each having a corresponding distribution function Ft with mean-zero and finite variance. For some T < ∞, we have the following hypothesis H0 : Ft = F0 , t = 1, . . . , T, . . . , [T κ],
(5.16)
H1 : Ft = F0 , T < t < T + k ; Ft = F 0 , T + k ≤ t ≤ [T κ] (5.17) where the distribution functions F0 = F 0 , and the time of change k is assumed unknown, κ is some fixed number greater than 1 and [.] denotes the integer part. Since the errors are unobserved, the residuals are calculated ˆt = xt − βˆT X t−1 , and using these residuals to test the null hypothesis H0 against the alternative H1 . The test is when the detector St is calculated sequentially at each time
170
J.-L. Jeng
point, the decision will check the detector St in exceeding an appropriate critical value as cα . The stopping time is given as in f {T < t < [T κ] :} τ (T ) = [T κ], St ≤ cα
St > cα T < t < [T κ]
(5.18)
which they control value of α such that lim T →∞ PH0 (τ (T ) < ∞) = α,
(5.19)
lim T →∞ PH1 (τ (T ) < ∞) = 1,
(5.20)
the probability α ∈ (0, 1) control the false alarm rate. Equation (5.19) is to ensure the probability of false alarm is asymptotically bounded by α, and Eq. (5.20) is to that a change is detected with probability approaching one. Assumption 5.2.4 { t , t = 1, 2, . . .} are i.i.d. random variables with common distribution F0 with zero mean, positive variance and E| t4 | < ∞, F0 admits a density function f, f > 0. Both f (x) and x f (x) are assumed to be uniformly continuous on the real line. Furthermore, there exists a finite number L such that x| f (x)| < L and | f (x)| < L for all x. Assumption 5.2.5 The initial values x 1 , x2 . . . , x p are independent of p+1 , . . . , T , let β p = 0, and the roots of the polynomial z p − β1 z p−1 − · · · − β p are less than one in the absolute value. 1 Assumption 5.2.6 Y Y 2 βˆT − β = O p (1), as T → ∞, where Y = (x1 , x2 , . . . , xt ) . Assumption 5.2.7 The regressors satisfy lim T →∞ max1≤t≤[T κ]
1 T 1/2
|xt | = o p (1),
5 Sequential Monitoring for Corporate Events . . .
171
s] 2 1 [T s] 2 lim T →∞ T1 [T t=1 x t = lim T →∞ T E t=1 x t = l(s), uniformly s ∈ [1, κ], where l(s) is positive for s > 0. Assumption 5.2.8 For every fixed s1 , there exists a sequence of positive number z T = O p (1) such that [T s] 1 |xt | ≤ (s − s1 ) z T T t=[T s1 ]
almost surely for all s ≥ s1 , and the tail probability of z T satisfies, for some ρ > 0, P(z T > C) ≤ M/C 2(1+ρ) , where C > 0 and M > 0. Assumption 5.2.9 There exist γ > 1, α > 1 and K < ∞ such that for all 0 ≤ s ≤ s ≤ 1, and for all T 1 E(xt2 )γ ≤ K (s − s ), T i h for threshold h (or h(t)) where the as (say) in f m : | m t=1 it 0≤m≤1
5 Sequential Monitoring for Corporate Events . . .
179
test is to perform early detection of parameter changes when sequential observations are available. However, the essential issue with event studies is not solely for the structural change. It is on how long (or how often) these changes may last (or occur). In other words, it is the intensity of the impacts from the event(s) that should be of concern. On the other hand, if (say) the CUSUM test is applied to consider the abrupt change in “beta,” it is difficult to associate the change with the particular corporate finance events of interest unless specific theoretical foundation is provided.
The setting The following assumptions are for the test based on cumulative abnormal returns and provide the asymptotic functional where explicit formulas for the asymptotic distribution and moments are available. Assumption 5.3.3 is to establish the statistic based on cumulative abnormal returns under the null that there is no “essential” event in the data and consider the relative frequency of occurrence when cumulative abnormal returns exceeding some thresholds. Assumption 5.3.3 Suppose under the null hypothesis that no impacts from event(s) are significant, and let the cumulative sums of abnormal returns { it, }t=1,...T for each firm i follow the invariance principle such that as T → ∞ [λT ] 1 d it −→ Yλ , (5.26) √ T σ i t=1 where 0 ≤ λ ≤ 1, and σ 2i represents the long run variance of { it, }t=1,...T , Yλ is a nondegenerated zero-mean diffusion process defined on interval d
[0, 1], the notation −→ stands for the convergence in distribution.6 Many examples for the invariance principle of the error terms in regression have been applied in the related literature. For instance, under the null hypothesis that the drift and “beta” are not time-varying in the linear regression of Eq. (5.1), Sen (1982) provides the invariance principle for recursive residuals from the linear regression even though the error
180
J.-L. Jeng
terms { it }i=1,2,...,t=1,2,..., do not follow normal distribution. In the current context, Assumption 5.3.3 allows both serial dependence and heteroskedasticity given possible (dynamic) model misspecification. There is no need to consider if the statistics are obtained through the estimation period, event window, or post-event period since the invariance principle holds for the entire period of interest. The study is not based on the (asymptotic) normality assumption of stock returns or abnormal returns. Nor does the test verify the normality in asymptotic distribution of abnormal returns at all. On the other hand, Assumption 5.3.3 is for the error terms from the time-varying coefficient model of Eq. (5.1) when the online recursive estimation will be applied to track and update the normal (or expected) returns sequentially. These error terms are different from the abnormal returns obtained from the conventional (say) market model (or else) estimated based on the presumed estimation periods prior to the assumed event window. More technically, the abnormal returns applied in conventional CARs tests are out-of-sample forecast errors of fitted regressions when using data from estimation period. No recursive updating of information is considered in the conventional approach. In addition, the asymptotic distribution (including normality) in the conventional CARs tests are obtained when number of observations in estimation period prior to event window grows sufficiently large. In Eq. (5.3), the assumption does not assume that the estimation period must grow sufficiently large prior to the event date for the weak convergence to hold asymptotically. Instead, it assumes that the entire sample period T grows sufficiently large. Although the conventional CARs tests may apply asymptotic normality in the cumulative sums of abnormal returns across the event window, the event window is assumed to have only finite number of dates or observations. The asymptotic arguments for normality in CARs tests are based on the large number of observations within the estimation period. Hence, Assumption 5.3.3 applies even to the occasions where no knowledge is feasible for the separation of estimation period and event window. Furthermore, since the online recursive estimation is applied, the estimation on normal (or expected) returns is updated with new observations. Therefore, there is no need to assume that the length of estimation period (prior to event window) must be sufficiently long. Nor is there the need to
5 Sequential Monitoring for Corporate Events . . .
181
consider subjectively the number of days prior to and after the event dates to determine the event window even when the precise event date is known a priori. However, an issue with applying structural change tests on event studies is that these parametric or distributional changes resolve gradually after certain time periods even if the event(s) are essential or significant. Thus, even with possible changes in parameters or distributions, the discussions on the essentiality of the event(s) should focus on “how long the impact may last.” An event that is significant must have some occurrences of statistics (such as the frequencies of cumulative sums of abnormal returns that cross certain thresholds) persist over some time horizons. In other words, similar to the assessments of earthquakes, the magnitudes of changes, or fluctuations of parameters provide only tentative estimates or partial information to the severity of the event. The measurement such as frequencies or duration of the impact (in time horizon) is a more valid method to describe the severity or essentiality of the event(s). This study provides an alternative method that allows for such a measurement. In addition, the method also encompasses the occurrence of permanent changes as special cases. In the following, a general definition for occupation time of a nondegenerated real-valued diffusion process is provided. Furthermore, Assumption 5.3.3 serves only as an approximation for the weak convergence for the cumulative sums of abnormal returns. Given that, the following occupation time statistics are not to test whether these cumulative sums will converge to Brownian motion or not. Instead, the test statistics is to consider whether the hitting frequency or duration of impacts (for a certain level of threshold) is distinct from each other with sequential observations. Although the cumulative sums of abnormal returns are similar to the statistics applied in the conventional CARs tests, the test statistics are not based on the asymptotic distribution of the CARs (or so-called cumulative abnormal returns). In addition, the definition of occupation time for the diffusion process in the following may include various stochastic processes such as Lévy process where the jumps or discontinuities on the sample path may happen. In other words, the occupation time statistics can be extended to some jump processes in financial time series. For instance, in Fatalov (2009), the
182
J.-L. Jeng
occupation time and its asymptotics are extended to the L p —functional of the Ornstein-Uhlenbeck process. Fitzsimmons and Getoor (1995) show the distribution for occupation times of Lévy bridges and excursions.These extensions can be applied to the following occupation time statistics when Assumption 5.3.3 is modified to different invariance principles for weak convergence to various diffusion processes of interest. Definition 5.3.4 Following Assumption 5.3.3, for a nondegenerated real-valued zero-mean diffusion process with stationary increments Yλ , λ ∈ [u, v] and for a threshold h, h > 0, the occupation time (denoted as ζ¨ (h)) for the process Yλ is defined as ζ¨ (h) =
u
v
δ(Yλ > h)dλ,
(5.27)
where δ(y) is an indicator function for y to lie in a set A such that δ(y) = 1 if y ∈ A, A ⊂ R, A = {y|y > h, y ∈ R + }, and δ(y) = 0, otherwise.7 Notice that Definition 5.3.4 considers various nondegenerated realvalued diffusion processes that have the stationary increments. In other words, the processes such as Brownian meander, Brownian excursion, reflected Brownian motion, Brownian bridge, Lévy bridges, and excursions are all included. Based on the above assumptions and Definition 5.3.4, we develop an alternative test statistic using the absolute values of the cumulative abnormal returns from the event study. The advantage is that it does not require testing the differences in parameters or distributions across pre-event and post-event periods. Hence, it avoids arbitrary choices of event window or pre-event and post-event periods for statistical verification.8 More specifically, if the invariance principle holds so that the cumulative partial sums of idiosyncratic risks converge to Brownian motion in Assumption 5.3.3, and according to Pötscher (2004), the occupation time from the cumulative abnormal returns will converge in distribution to the occupation time of reflected Brownian motion. This result has been shown in Jeng (2015), and Jeng et al. (2016). Notice that the test statistic does not assume that there is no parameter change before the assumed event window. Nor does it assume that there is no change in
5 Sequential Monitoring for Corporate Events . . .
183
parameters under null hypothesis in conventional monitoring tests for structural change. The following theorem shows the convergence in probability of occupation time statistics based on regression residuals. Proposition 5.3.5 Different severity of impacts implies different occupation time. Theorem 5.3.6 Given Assumptions 5.3.1 and 5.3.3, and under the null hypothesis such that there isno essential impact from the new information disclosed in the event(s), let ˆit i=1,2,··· be the residuals of fitted regressions for asset returns at time t, t = 1, 2, ..., T (based on the recursive tracking algorithms stated in Chen and Guo [1991], Guo [1994], for instance), and −1 j T ˆit2 + 2 Tj=1 k( q )γˆi ( j) be the heteroskedasticity and let σ˜ 2i = T1 t=1 autocorrelation consistent (HAC) estimate for the asymptotic variance σ 2i T − j for asset i, i = 1, 2, · · · , n, where γˆi ( j) = T1 t=1 ˆit ˆi,t+ j , and k(.) is the Bartlett-kernel function with bandwidth q, q → ∞, Tq → 0, T is sufficiently large, let the empirical occupation time statistic ψˆ i,T for all ω ∈ , and for some 0 < h ≤ τ, τ is a finite positive real number, h ∈ R + be defined as9 T m 1 1 ψˆ i,T (ω, h) = δ √ ˆit > h , (5.28) T T σ˜ i m=1
t=1
where δ(y) is an indicator function for y to lie in a set A such that δ(y) = 1 if y ∈ A, A ⊂ R, A = {y|y > h, y ∈ R + }, and δ(y) = 0, otherwise.10 Then, for any h, 0 < h ≤ τ, √
1 T σ i
m t=1
ˆit = √
m
1 T σ i
ψˆ i,T (ω, h) = ψi,T (ω, h) + o p (1), m > h . it t=1
it + o p (1),
(5.29)
t=1
ψi,T (ω, h) =
1 T
T
m=1 δ
√1 T σ˜ i
184
J.-L. Jeng
Proof of Theorem 5.3.6: Notice that ˆit = rit − θˆit X t = (θit − θˆit ) X t + it , where θit = (αit , β1t , . . . , βkt ) , X t = (1, φ1t , . . . , φkt ) , θˆit is the recursive estimate for θit , X t and { it }i=1,2,··· ,n, X t and {θit }i=1,2,··· ,n are mutually orthogonal over time, respectively. In addition, let the asymptotic optimality properties of recursive algorithms show that the tracking errors E||(θit − θˆit )|| p < < ∞, p > 1. Suppose that E||X t ||q < < ∞, q > 1, where 1p + q1 = r1 , r > 2. Notice that m m 1 ˆ ˆit ≤ √ (θit − θit ) X t + √ it . √ T σ i t=1 T σ i t=1 T σ i t=1 (5.30) m 1 ˆ √ The component t=1 (θit − θit ) X t is not negligible unless some T σ i higher-order moment conditions are provided. Given the optimality of recursive algorithms in tracking for time-varying parameters, these conditions can be obtained such as in Chen and Guo (1991), Guo (1994). Notice that, even though the above statistic is denoted as empirical occupation time statistic, it is actually a statistic of normalized counting measure that considers the hitting frequency of the impacts from events when represented by the exceedance of cumulative abnormal returns. The cumulative abnormal returns are similar to the conventional approach in event studies of corporate finance. Yet, this cumulative sum of returns is not solely across event windows. Instead, the cumulative sums are obtained for the entire sample period without any a prior determination of estimation period, event windows and post-event period. In other words, the test statistic for the event(s) focuses on the frequency of the occurrence when the absolute values of cumulative abnormal returns may cross some thresholds. Intuitively, the higher the frequency, the more intensive the impacts are. In the following Corollary 5.3.6 it shows that under Assumption 5.3.3 where the invariance principle for weak convergence toward Brownian motion is applied and assuming the events have no essential impact, this statistic will converge weakly to the occupation time of reflected Brownian motion asymptotically where explicit formulas for its moments and distribution
1
m
1
5 Sequential Monitoring for Corporate Events . . .
185
are available in Takacs (1998). That is the reason why it is denoted as the occupation time statistic, so to speak. Corollary 5.3.7 : If the invariance principle holds for cumulative sums [λT ] d of { it }t=1,...T such that √ 1 t=1 it −→ B(λ), as T → ∞, where T σ i
B(λ), 0 ≤ λ ≤ 1, is a standard Brownian motion then, 1 d ψi,T (ω, h) −→ δ(B(λ) > h)dλ,
(5.31)
0
Proof of Corollary 5.3.7: Given Assumptions 5.3.1, 5.3.2, and according to Pötscher (2004), we ! " m 1 only need to verify that the function δ √ t=1 it > h is locally T σ i integrable. Pötscher (2004) defines that a real-valued function f (x) is K locally integrable if and only if for any 0 < K < ∞, −K | f (x)|dx < ∞. Since the indicator function δ(x) is bounded with values as zero or one, we have sup |δ(x > h)| ≤ 1. Hence, for any level of threshold h, x∈R +
0 < K < ∞,
K
sup |δ(x > h)|dx
−K x∈R + K
≤
−K
dx
(5.32)
≤ 2K < ∞. ! " m Hence, the function as δ √ 1 > h is also locally integrable. t=1 it T σ i " ! m So is δ √ 1 t=1 it > h . Following Theorem 5.2.1 in Pötscher T σ i (2004), we consider that the occupation time statistics ψi,T (ω, h) will 1 converge weakly to 0 δ(B(λ) > h)dλ and are identically distributed for all firm i’s under the null. In the following, it is necessary to verify several properties of occupation time statistic so that the applications of similar asymptotic arguments on
186
J.-L. Jeng
empirical processes can be applied. In particular, the functional of occupation time statistics resembles the empirical reliability function which follows the properties of empirical distribution function. More specifically, these properties also ensure that the extension of occupation time statistics to sequential detection test can be obtained. Notice that the sample path of normalized cumulative sum √ 1 T σ˜ i m t=1 ˆit can be considered as a Cadlag function which is an element of D[0, 1] space that consists of all left-limit right-continuous functions defined on the interval [0, 1]. Hence, the goodness-of-test statistic in Eq. (5.10) is similar to a function that maps an element in D[0, 1] back into the real line. Given that the function space D[0, 1] can be considered as a Polish space11 if a suitable metric (or distance function) is devised, the weak convergence of the statistic can be obtained in using the asymptotic arguments on the complete separable Banach space. However, even though ψˆ i,T (ω, h) is similar to an empirical distribution function, the domain of ψˆ i,T (ω, h) is the function space D[0, 1]. More specifically, Eq. (5.10) is equivalent to 1 minus a generalized empirical process defined by functions in the space of Cadlag functions, D[0, 1]. Although the measure λ is not necessarily the probability measure, ζ¨ (h) can be considered as a functional of δ(Yλ > h), λ ∈ [0, 1] defined under a hypothetical uniform distribution on [0, 1]. Although similar to the concept of empirical process (or empirical distribution function), this generalized empirical process is not for estimating the possible empirical distribution of abnormal returns. Instead, the empirical process is defined on the cumulative sums of abnormal returns where the distribution function for the individual abnormal return could be heteroscedastic and unknown. In other words, the intensity of events (in terms of durations for cumulative abnormal returns) can be approximated by the empirical process defined on the function space D[0, 1].
5 Sequential Monitoring for Corporate Events . . .
5.4
187
Sequential Analysis for Monitoring of Change in Occupation Time Statistic
Given that the corporate finance events can be considered as either permanent (or one-shot) or contemporaneous (such as epidemic change in parameters), the sequential detection of change in occupation time statistic extends the conventional structural change tests to the change of intensity of impacts. We are now about to define the essentiality of the events and the sequential detection functional as occupation time statistics in the following. First, a corporate finance event is considered essential if the event’s impacts are not instantaneous. The impacts may linger around for a while since the “shot” to the market is not fully expected. Instead, the longer the impacts may last, the more severe the event is. The intensity of the event hence, determines the extent of the event may preside. We hereby define the functional for intensity and use it to define the sequential detection problem. The following sequential monitoring test depends on the simple intuition that if the occurring event is significant enough then the frequency or occupation time of the (say) cumulative sums of abnormal returns will change accordingly. That is, the absorbing time that reflects the reaction to the event will not be the same. Hence, the stronger the influence of the event, the bigger is the change of occupation time involved. Definition 5.4.1 A corporate finance event at (possibly unknown) time horizon T ∗ is considered “significant” or “essential” if and only if there exists some threshold h, h > 0, 0 < h ≤ τ , and for time horizon, where T is the current number of observations of stock returns such that |ψi,T ∗ (ω, h) − ψi,T (ω, h)| is significantly different from zero.12 Definition 5.4.2 For a random element S¨m in a Polish space formed by left-limit and right-continuous functions with Skorokhod topology on domain [u, v] and u, v > 0, u < v, k is the additional observations of abnormal returns, the sequential test (after the time horizon T for training samples) with occupation time statistic for 0 < h ≤ τ, (where τ is a finite
188
J.-L. Jeng
real number as a tuning parameter) is defined as d¨k = max sup
Tk
|ψˆ i,T +k (ω, h) − ψˆ i,T (ω, h)| ⎛ ⎞ k T 1 Tk ⎝ 1 ¨ | = max sup δ Sp > h − δ S¨m > h |⎠ 3 k≥1 0 0 of the cumulative sum of abnormal returns where T is the sample size of the cumulative sums of abnormal returns such that as T −→ ∞,0 < s ≤ 1, 0 < h ≤ τ ≤ 1, will converge in distribution to a two-parameter (negative)
5 Sequential Monitoring for Corporate Events . . .
191
Gaussian process −K (s, h) √
⎛
⎞ T∗ 1 d T⎝ δ S¨m > h − ζ¨ (h)⎠ −→ −K (s, h), T
(5.36)
m=1
1 where ζ¨ (h) = 0 δ(Yλ > h)dλ is the genuine occupation time of the data and 0 < h ≤ τ ≤ 1, 0 < s ≤ 1, K (s, h) is a two-parameter Gaussian (Kiefer) process where T ∗ = [sT ] is the largest integer that is less than or equal to sT. Proof Theorem 5.4.3 is not exceptional since the occupation time statistic is identical to 1 minus the empirical distribution of S¨m , by definition. The same logic also implies that 1 − ζ¨ (h), which is similar to the distribution 1 function of Yλ as 1 − ζ¨ (h) = 0 δ(Yλ ≤ h)dλ. Hence, ⎛ ⎞ 1 T∗ T∗ 1 ¨ 1 ¨ ⎝ ¨ δ Sm ≤ h ⎠ − 1 − δ Sm > h − ζ (h) = 1 − δ(Yλ ≤ h)dλ T T 0 m=1 m=1 ⎛ ⎞ ∗ 1 T 1 δ S¨m ≤ h − = −⎝ δ(Yλ ≤ h)dλ⎠ . T 0 m=1
Hence, by Bai (1994, 1996), Ling (1998), Berkes et al. (2009), as the sample size T → ∞ for any s , the statistic as in Eq. (5.36) converges in distribution to a (negative) two-parameter (Gaussian) Kiefer process. In fact, it is interesting to discover that the test statistic in Eq. (5.36) coincides with the sequential Kolmogorov-Smirnov-type test on detecting the difference of empirical distributions (or in our case, for the cumulative sums of idiosyncratic risk over the time horizons). Hence, loosely speaking, the test statistic is similar to sequential monitoring the distribution functions of cumulative sums of abnormal returns. Given the above theoretical results in Chapter 3, it is easy to apply the s asymptotics of sequential empirical processes that if we assume h)dλ as the identical functional for a given h, the functional 0 δ(Yλ ≤ 1 T ∗ ¨m > h − ζ¨ (h) will weakly converge to the −K (s, h). δ S m=1 T
192
J.-L. Jeng
Differing from the usual applications of empirical process, this empirical process is defined on the cumulative sums of idiosyncratic risk. Specifically, it is similar to the so-called tail process. It is not entirely the same as the Kolmogorov-Smirnov statistic defined on a particular variable of interest. However, the empirical process for the cumulative sum process can still be well-defined if the cumulative sums are considered as functional on a proper function space (such as D(0, 1)) with suitable metric and topology. In addition, since the test statistic is formed with two parameters (namely, the time index s and the range for the threshold h), the weak convergence of the test statistic is toward a two-parameter Kiefer process. Theorem 5.4.4 Given the definition of sequential test statistic in Definition 5.4.2 and Theorem 5.4.3, when under the null hypothesis that ζ¨ (h) is identical for when the unknown event is considered as not significant at all, the test statistic d¨h can be shown that under the null hypothesis and as T −→ ∞ and for a given h sufficiently high, 0 < h ≤ τ ≤ 1, τ is a finite real number as a tuning parameter for the range of thresholds applied where k → ∞ as T → ∞, k is the additional observations of abnormal returns and 0 < s ≤ 1, s = [ Tk ], [.] represents the integer part of the number, define the sequential occupation time-based detection as ⎛
⎞ k T 1 Tk ⎝ 1 d¨k = max sup | δ S¨ p > h − δ S¨m > h |⎠ 3 k≥1 0 h − ( 3 ) δ Sm > h | | = 3 T 2 k p=1 T2 m=1 ⎞ ⎛ T k 1 ⎝ ¨ k = 1 | δ Sp > h ⎠ − ( 3 ) δ S¨m > h | T2 T2 p=1 m=1 ⎞ ⎛ k T 1 ⎝ ¨ k 1 ¨ ⎠ ¨ ¨ = 1 | δ S p > h − ζ (h) − ( ) ( 1 ) δ Sm > h − ζ (h)| + o p (1) T T2 T 2 m=1 p=1 ⎞ ⎛ T ([s]) T 1 ⎝ ¨ k 1 ¨ ¨ S ≈ 1 | δ S p > h − ζ¨ (h)⎠ − ( ) δ > h − ζ (h)| + o p (1) m 1 T T2 T 2 m=1 m=1 ⎛ ⎞ T ([s]) T √ √ 1 1 ¨ ≈ T⎝ | δ S¨ p > h − ζ¨ (h)⎠ − s T δ Sm > h − ζ¨ (h)| + o p (1) T T m=1 m=1 ⎛ ⎞ T ([s]) T √ √ 1 1 ¨ ¨ ¨ ¨ ⎝ ⎠ ≈ T | 1 − δ S p ≤ h − F(h) − s T 1 − δ Sm ≤ h − F(h)| T T m=1
m=1
+ o p (1) d
−→ | K (s, h) − s K (1, h)| = | K¨ (s, h)|, F(h) = 1 − ζ¨ (h) = 1 −
1 0
δ(Yλ ≤ h)dλ, (5.39)
d by Theorem 5.4.3 above, where [s] = Hence, d¨k → sup sup | K¨ (s, h)| as T → ∞. Given the results of Bai (1994),
[ Tk ].
0 h, x ∈ R + }, and δ(x) = 0, otherwise. This is to set up the definition of occupation time for the diffusion process. Given this, the detection of event is to deal with the change of the occupation time which is similar to the empirical processes
208
J.-L. Jeng
asymptotically. Based on the above assumptions and Definition 6.3, we develop an alternative test statistic using the absolute values of the cumulative abnormal returns from the event study. The advantage of our test is that it does not require testing the differences in parameters or distributions across pre-event and post-event periods. Hence, it avoids the arbitrary choices of event window or pre- and post-event periods for statistical verification. Furthermore, based on the formula for asymptotic distribution and moments for the occupation time of reflected Brownian motion provided by Takács (1998), a statistical test using the occupation times statistics across the entire sample period can be devised (See Jeng 2015). In other words, following Assumptions 6.1 and 6.2, and under the null hypothesis that the new (corporate finance) information has no essential impact on the capital market, the underlying occupation time statistics defined in the following will converge to the occupation time of reflected Brownian motion in distribution asymptotically. Hence, the test that examines the essentiality of the corporate events can be formed as the test for significant difference between the occupation time statistics defined in the following Theorem 6.7 and the occupation time of reflected Brownian motion. Although the current setting allows the weak convergence toward Brownian Motion, and the occupation time defined depends on the value of threshold h assumes, the detection test based on the same setting is developed in Chapter 5, where the threshold h is considered as a choice variable. Two dimensional setting is considered in Chapter 5, where the optimality is determined. Assuming finite first and second moments of the test statistics ψi,T (ω, h) as those of occupation time of reflected Brownian motion under the null hypothesis, and using the Central Limit Theorem (for Banach-valued random variables) on L 2 space, the new test is to see if the relative frequencies for exceedance of absolute values of CAR’s (for any given level of threshold) after normalization are significant with a standard Normal distribution. More specifically, if under the null hypothesis that there is no significant impact from the corporate event(s), the cross-sectional average of occupation time from these abnormal returns when standardized using the mean and variance provided by Takács’ formula, should converge to a standardized normal random variable as the number of firms grows
6 Real-Time Applications of Monitoring . . .
209
sufficiently large. The details technicality is discussed in Chapter 5 where the Brownian Motion or other diffusion process is assumed when weak convergence for the cumulative sum of the abnormal returns (See Pötscher 2004). In other words, although the above statistic in Eq. (6.5) is denoted as occupation time statistic, it is actually a statistic of counting measure for the event(s) by counting the frequency of the occurrence when the absolute values of cumulative abnormal returns may exceed some thresholds. Intuitively, if an event is essential, then its impacts are noticeable and therefore the frequency for such occurrence of level crossing will either be significantly higher or lower statistically. The reason that it is denoted as occupation time statistic is that the statistic will converge to the occupation time of reflected Brownian motion under the null hypothesis when there is no significant impact from event(s) and Assumption 6.2 holds. Hence, under the null, and when Assumption 6.2 holds, this statistic will converge to the occupation time of a reflected Brownian motion where explicit formulas for its moments and distribution are available. Therefore, for any given level of threshold, the occupation times statistic for the entire sample period can apply these formulas to obtain their moments (or distribution) under the null hypothesis and Assumption 6.2. The formulas of distribution and moments for the occupation time of a reflected Brownian motion versus different thresholds are provided by Takács (1998). As a result, we consider that the occupation time statistics 1 ψi,T (ω, h) will converge weakly to 0 δ(|W (z)| > h)dz and are identically distributed for all firm i’s under the null. Hence, we may apply the Central Limit Theorem for (Banach-valued random variables) using these moments under the null hypothesis if the statistics ψi,T (ω, h) belong to a suitable function space. In the following, since the occupational time is a random functional of real-valued variable, the conventional Central Limit Theorem can not be applied. Hence, with additional assumptions for the functional space of the occupation time statistics, the asymptotic properties for the crosssectional average of the occupation time statistics can be obtained where the Central Limit Theorem (for Banach-valued random variables) can be applied. Specifically, the cross-sectional average of the occupation time of reflected Brownian motion can be shown to follow asymptotic normality
210
J.-L. Jeng
The following assumption confines the occupation time statistics to a Banach space of square-integrable functionals, so that the occupation time statistics are well-defined and integrable.

Assumption 6.4 Let the occupation time statistics of abnormal returns for each firm $i$, $i = 1, 2, \ldots$, namely $\psi_{i,T}(\omega, h) : (\Omega, \mathcal{F}, P) \to [0, 1]$ for any threshold $h$, $0 \le h < \infty$, belong to a type-2 separable Banach $L^2$-space of all square-integrable real-valued Borel-measurable functions $g : \Omega \to \mathbb{R}$ with respect to the probability measure $P$, equipped with the $L^2$-norm (denoted $\|\cdot\|$ hereafter) (see Note 2). A separable Banach $L^2$-space is of type 2 if and only if, for a sequence of Rademacher random variables $\{\theta_i\}_{i=1,2,\ldots}$ defined on $(\Omega, \mathcal{F}, P)$ and for all finite sequences $\{x_i\}_{i=1,2,\ldots}$ with $x_i$ in the separable Banach $L^2$-space, there exists a constant $C > 0$ such that

$$\left\| \sum_{i=1}^{n} \theta_i x_i \right\| \le C \left( \sum_{i=1}^{n} \|x_i\|^2 \right)^{1/2}. \quad (6.5)$$
Assumption 6.5 Let the occupation time functionals $\{\psi_{i,T}(\omega, h)\}_{i=1,2,\ldots}$ follow the strong law of large numbers such that, as $n \to \infty$,

$$\frac{1}{n} \sum_{i=1}^{n} \left( \psi_{i,T}(\omega, h) - E[\psi_{i,T}(\omega, h)] \right) \xrightarrow{a.s.} 0, \quad (6.6)$$
where $E[\psi_{i,T}(\omega, h)]$ denotes the Bochner integral of $\psi_{i,T}(\omega, h)$ with respect to $P$, and the notation $\xrightarrow{a.s.}$ stands for almost sure convergence.

Assumption 6.6 The de-meaned occupation time functionals $\{\xi_{i,T}(\omega, h)\}_{i=1,2,\ldots}$ satisfy the small ball criterion such that

$$\lim_{\tau \to \infty} \tau^2 \, P\left( \|\xi_{i,T}(\omega, h)\| > \tau \right) = 0, \quad (6.7)$$

and, for each $\varepsilon > 0$,

$$\alpha(\varepsilon) = \liminf_{n \to \infty} P\left( \left\| \sum_{i=1}^{n} \xi_{i,T}(\omega, h) / \sqrt{n} \right\| < \varepsilon \right) > 0, \quad (6.8)$$
where $\xi_{i,T}(\omega, h) = \psi_{i,T}(\omega, h) - E[\psi_{i,T}(\omega, h)]$. The equations in Assumption 6.6 follow the small ball criterion in Ledoux and Talagrand (1991). Eq. (6.7) specifies a tail condition so that the second-order moments of the functionals $\{\xi_{i,T}(\omega, h)\}_{i=1,2,\ldots}$ do not expand too rapidly and explode, while Eq. (6.8) ensures that $\|\sum_{i=1}^{n} \xi_{i,T}(\omega, h)/\sqrt{n}\|$ is bounded in probability. Together, these conditions ensure the weak convergence of the partial sums to a tight probability measure in the Banach space.

Given the above assumptions, we can establish the Central Limit Theorem (for Banach-valued random variables) for the occupation time statistics $\{\psi_{i,T}(\omega, h)\}_{i=1,2,\ldots}$ under the null hypothesis of no significant impact from corporate finance events such as mergers and acquisitions. Many extensions of the central limit theorem from real-valued random variables to function spaces are available in the mathematical and statistical literature, such as Hoffmann-Jorgensen and Pisier (1976), Araujo and Giné (1980), and Ledoux and Talagrand (1991); for weakly dependent (Banach-valued) random variables, Dehling (1983) and Ermakov and Ostrovoskii (1986); and, for the empirical central limit theorem, the findings of Andrews (1991), among many others (see Note 3).

Theorem 6.7 Given the results of Theorem 3.4.18 and Assumptions 6.4, 6.5, and 6.6, and if the null hypothesis holds such that no significant event exists, then for any given threshold $h \in \mathbb{R}^+$, $0 \le h < \infty$ (where the threshold is determined by the sampler), the partial-sum statistic for the occupation time functional satisfies the Banach-valued Central Limit Theorem asymptotically: for $S_n(h) = \frac{1}{n} \sum_{i=1}^{n} \left( \psi_{i,T}(\omega, h) - M_1(h) \right)$, as $n \to \infty$ and $T \to \infty$ with $T$ growing faster than $n$ (that is, the time period is not confined to an event or estimation period),

$$Z(h) = \frac{\sqrt{n}}{\sigma(h)} S_n(h) \xrightarrow{d} \gamma, \quad (6.9)$$
where $M_1(h)$ and $\sigma(h)$ are the mean and standard deviation of the occupation time $\int_0^1 \delta(|W(z)| > h)\,dz$ of the reflected Brownian motion $|W(z)|$ for any threshold $h$, $0 \le h < \infty$, and $\gamma$ is a random variable in the separable Banach space $L^2$ with a standard normal distribution $N(0, 1)$ (see Note 4).

Proof Apply the asymptotic findings of Andrews (1991) for dependent observations in Chapter 3, or the results of Ledoux and Talagrand (1991). It is easy to show that, for any fixed $n$ and as $T \to \infty$, when all $\int_0^1 \delta(|W(z)| > h)\,dz$ are identically distributed (for the cumulative sums of abnormal returns) under the null hypothesis, the finite partial sum $\sum_{i=1}^{n} \psi_{i,T}(\omega, h)$ also converges weakly such that

$$\sum_{i=1}^{n} \psi_{i,T}(\omega, h) \xrightarrow{d} \sum_{i=1}^{n} \int_0^1 \delta(|W(z)| > h)\,dz.$$
According to this setting, Assumptions 6.5 and 6.6, and the conditions of Theorem 6.7, by the asymptotic findings of Andrews (1991) for dependent observations, or by Theorem 10.13 in Ledoux and Talagrand (1991), the central limit theorem holds for $\{\xi_{i,T}(\omega, h)\}_{i=1,2,\ldots}$ such that, as $n \to \infty$,

$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \xi_{i,T}(\omega, h) \xrightarrow{d} \gamma,$$

where $\gamma$ is a random variable in the separable Banach space $L^2$ with a Gaussian measure. Notice that this is a central limit theorem in the function space, not in the usual real space. Therefore, even though the variable of interest is the occupation time functional, weak convergence can be considered for elements of the abstract space rather than for real variables, and the test statistic can be examined since its asymptotic distribution is available.

Under the null hypothesis that no significant events appear and Theorem 6.7 holds, we have $E[\psi_{i,T}(\omega, h)] \xrightarrow{p} M_1(h)$, given that $\frac{1}{n} \sum_{i=1}^{n} \int_0^1 \delta(|W(z)| > h)\,dz \xrightarrow{p} M_1(h)$ when the weak law of large numbers holds as $n \to \infty$ for $\int_0^1 \delta(|W(z)| > h)\,dz$, $i = 1, 2, \ldots, n$, according to Takács (1998). Likewise, we can calculate the second-order moment $M_2(h)$ accordingly.

Now, since all $\{\psi_{i,T}(\omega, h)\}_{i=1,2,\ldots}$ converge, as $T \to \infty$, to identically distributed occupation times of the reflected Brownian motion, denoted $\int_0^1 \delta(|W(z)| > h)\,dz$, for all $i = 1, 2, \ldots$, and assuming that $T$ grows faster than $n$ (that is, the time-series observations grow faster than the cross-sectional observations), we may apply $\sigma^2(h) = M_2(h) - M_1^2(h)$ for the normalization, where $M_2(h)$ is the second-order moment of $\int_0^1 \delta(|W(z)| > h)\,dz$. Hence, as $n \to \infty$, we have

$$\frac{1}{\sqrt{n}\,\sigma(h)} \sum_{i=1}^{n} \left( \psi_{i,T}(\omega, h) - M_1(h) \right) \xrightarrow{d} \gamma,$$

under the null hypothesis, where $\gamma$ is a random variable in the separable Banach space $L^2$ with a standardized Gaussian measure $N(0, 1)$. That is,

$$\frac{\sqrt{n}}{\sigma(h)} \left( \frac{1}{n} \sum_{i=1}^{n} \left( \psi_{i,T}(\omega, h) - M_1(h) \right) \right) = \frac{\sqrt{n}}{\sigma(h)} S_n(h) \xrightarrow{d} \gamma.$$
Notice that this sets up the asymptotic argument in the function space: the conditions specify what is needed for the occupation time functional, and the result delivers asymptotic normality. This shows that, if the cross-sectional sample is sufficiently large, the de-meaned average of the occupation time statistics converges to a standardized normal distribution. Hence, statistical inferences based on the sample average of the occupation time statistics can rely on the asymptotic normal distribution.

To implement the alternative test methodology proposed here, we study a subset of mergers. Our sample is strictly U.S.-based and covers daily stock returns from 2000 to 2006, immediately after the long wave of M&A of the 1990s, with a focus on the acquiring firm in successful all-cash tender offer mergers between two U.S. public companies.
Specifically, we rely on the Thomson WorldScope mergers and acquisitions database to identify firms for our sample. We screen the deals to keep only those M&A that are (1) tender offers, (2) between a bidder and a target that are both U.S. public companies, and (3) successfully completed. This screen returns 166 successfully completed deals. Further data restrictions, such as the availability of data on both CRSP and CompuStat, limit our sample to 125 deals.

The pattern of the distribution mirrors two phenomena: first, the intensity of M&A fluctuated during our time period, and second, the choice of payment methods also shifted throughout the period. The intensity of M&A declined dramatically in 2001, from 3,000 deals worldwide in 2000 to 1,600 in 2001, and then continued to increase through 2006 (Lipton 2006). During this period, companies' use of non-cash methods of payment increased. These two effects combine to explain the decline through time of our sample.

The targets in our sample are on average profitable firms, though a significant percentage experienced a loss in the trailing 12-month period prior to receiving the tender offer. The average (median) target had sales of over $600 million ($170 million). The acquirer offered an average premium of 40% over the stock price of the target on the day prior to the announcement, representing an offer at an average (median) price-to-EBITDA of 17 (10) times. As evidenced by the difference in the premium when measured against the price of the target one month prior to the announcement, it is important to expand the analysis of any merger and acquisition transaction beyond the narrow window of the traditional event study: the average (median) premium offered by the acquirer is a larger 52% (48%) when measured against this one-month-prior value. Since the sample covers a huge range of deals, with targets ranging from Northrop Grumman, with total assets of $14 billion, down to Pitney Bowes, as small as $4 million, the standard deviation of these sample characteristics is much larger than the sample mean or median. In fact, the sample distribution is highly skewed (5.74) and exhibits excess kurtosis (38.82) as well; hence, the distribution is far from normal.

After collecting the stock return series for all 125 acquirers in our sample, we compute the recursive residuals. To do so, we perform recursive regressions with expanding windows, in one-day increments, to estimate the daily series of abnormal returns.
That is, we run a set of regressions for each firm in our sample for each day of our study window and use the regression results to compute one-step-ahead forecast errors. By using recursive regressions with an expanding analysis window, we are able to capture the progressive change in stock dynamics as information from the merger, its announcement and resolution, propagates to the market and into the stock's behavior. We seed the recursive regressions with 60 trading observations before creating the series of abnormal returns. The abnormal returns are thus obtained without assuming separate estimation and event periods. Once we have generated the series, we compute the $\psi_{i,T}(\omega, h)$ test statistics for each of the acquirers in our sample for different levels of $h$.

After computing the $\psi_{i,T}(\omega, h)$ statistics for the entire sample across different thresholds $h$, we apply Takács' formula to obtain the moments of the occupation time of the reflected Brownian motion. With an application of the Banach-valued Central Limit Theorem (for empirical processes), we then calculate the average of the $\psi_{i,T}(\omega, h)$ statistics across all firms in the sample and obtain the final test statistic $Z(h)$, which follows a standardized normal distribution. All these statistics are provided in Table 6.1 (see Note 5). $M_1(h)$ denotes the mean, and $M_2(h)$ the second-order moment, of the occupation time of the reflected Brownian motion for threshold $h$; both are calculated using Takács' (1998) formulas. We then calculate the variance of the occupation time of the reflected Brownian motion as $\sigma^2(h) = M_2(h) - (M_1(h))^2$.

For the different threshold levels $h$, the test statistics are significant at both the 5% and 1% significance levels. In other words, the test statistics reject the null hypothesis that mergers and acquisitions are not essential events for corporations. More specifically, since the test statistics are negative, the release of corporate restructuring information actually helps to clarify uncertainty in the market. Intuitively, this implies that the average time spent by the absolute cumulative abnormal returns above a given threshold is less than the corresponding occupation time of the reflected Brownian motion. The conventional approach emphasizes only the impacts, or the changes in parameters or distributions, and pays little attention to the persistence, or time span, of those impacts.
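As a concrete sketch of the estimation pipeline described above, the following code produces recursive one-step-ahead forecast errors as the abnormal returns. It is a minimal illustration under stated assumptions: a simple market model $r_t = a + b\,r_{m,t} + e_t$, a 60-observation seed window, and hypothetical function and variable names.

```python
import numpy as np

def recursive_abnormal_returns(stock, market, seed=60):
    """Recursive residuals: one-step-ahead forecast errors from
    expanding-window OLS fits of the market model."""
    stock = np.asarray(stock, dtype=float)
    market = np.asarray(market, dtype=float)
    abnormal = []
    for t in range(seed, stock.size):
        # Re-estimate the market model on all observations before day t.
        X = np.column_stack([np.ones(t), market[:t]])
        beta, *_ = np.linalg.lstsq(X, stock[:t], rcond=None)
        forecast = beta[0] + beta[1] * market[t]   # one-step-ahead fit
        abnormal.append(stock[t] - forecast)       # recursive residual
    return np.array(abnormal)
```

The occupation time statistics $\psi_{i,T}(\omega, h)$ for each acquirer can then be computed from these series over a grid of thresholds $h$.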
Table 6.1 The test statistics Z(h)

Threshold h   Average ψ(ω, h)   M1(h)    M2(h)    σ(h)     Z(h)
0.001         0.9977            0.9984   0.9968   0.0012   −6.7840
0.005         0.9891            0.9921   0.9842   0.0059   −5.6781
0.010         0.9785            0.9841   0.9687   0.0115   −5.4491
0.015         0.9676            0.9763   0.9534   0.0169   −5.7378
0.020         0.9570            0.9685   0.9384   0.0219   −5.8602
0.025         0.9467            0.9607   0.9237   0.0267   −5.8922
0.030         0.9368            0.9530   0.9092   0.0311   −5.8359
0.035         0.9271            0.9453   0.8949   0.0352   −5.7974
0.040         0.9168            0.9377   0.8808   0.0390   −5.9953
0.045         0.9071            0.9301   0.8669   0.0424   −6.0576
0.050         0.8969            0.9226   0.8532   0.0455   −6.3100
0.055         0.8869            0.9151   0.8397   0.0481   −6.5590
0.060         0.8770            0.9076   0.8263   0.0502   −6.8075
0.065         0.8670            0.9002   0.8131   0.0519   −7.1503
0.070         0.8574            0.8928   0.8000   0.0531   −7.4479
0.075         0.8482            0.8855   0.7870   0.0537   −7.7547
0.080         0.8393            0.8782   0.7741   0.0537   −8.1046
0.085         0.8295            0.8709   0.7613   0.0529   −8.7556
0.090         0.8210            0.8637   0.7486   0.0513   −9.3149
0.095         0.8115            0.8565   0.7360   0.0486   −10.3638
0.100         0.8026            0.8494   0.7234   0.0445   −11.7496
0.105         0.7935            0.8422   0.7109   0.0386   −14.1333
0.110         0.7841            0.8352   0.6983   0.0294   −19.4286

Source: Advances in Quantitative Analysis of Finance and Accounting (2016)
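For reference, the columns of Table 6.1 combine exactly as in Theorem 6.7. The sketch below takes $M_1(h)$ and $\sigma(h)$ as given inputs, since the closed-form Takács (1998) formulas are not reproduced here, and the function name is illustrative; recall that $\sigma(h) = \sqrt{M_2(h) - M_1(h)^2}$.

```python
import numpy as np

def z_statistic(psi, m1, sigma):
    """Standardized cross-sectional statistic Z(h) of Theorem 6.7.

    psi   : occupation time statistics of the n sample firms at threshold h
    m1    : mean M1(h) of the occupation time of reflected Brownian motion
    sigma : sigma(h) = sqrt(M2(h) - M1(h)**2), from Takacs' formulas
    """
    psi = np.asarray(psi, dtype=float)
    return np.sqrt(psi.size) * (psi.mean() - m1) / sigma

# Last row of Table 6.1 (h = 0.110, n = 125 firms): an average psi of
# 0.7841 with M1 = 0.8352 and sigma = 0.0294 gives Z of about -19.43,
# matching the tabulated -19.4286 up to rounding.
z = z_statistic(np.full(125, 0.7841), m1=0.8352, sigma=0.0294)
```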
In contrast, our alternative methodology modifies the approach to consider both the magnitudes (or levels) and the time span of the impacts. Hence, with our sample and methodology, we show that newly available public information may actually resolve lingering uncertainty from noise in the market and "tame down" the market instead of introducing additional fluctuations to the abnormal returns. Another feature we uncover is that most of the test statistics decrease, becoming more negative, as the threshold level increases. This shows that, as the threshold rises, there are fewer occurrences and the time the absolute cumulative abnormal returns stay above the threshold decreases. Namely, there is no significant time period with many spikes in absolute cumulative abnormal returns.
In other words, there may be some spikes in absolute cumulative abnormal returns, yet they do not persist for a significant period of time across the entire sampled horizon. As the threshold level $h$ expands, fewer spikes are found. The test statistics are therefore also consistent with the intuition that, as the threshold expands, fewer dramatic outliers appear. That means the merger and acquisition information in our sample does not cause persistent disturbances in the market; rather, the announcements of mergers and acquisitions actually clarify the noisy information surrounding corporate restructuring in the market.

The present framework also allows for dependence among the occupation times of abnormal returns through, for instance, the empirical central limit theorem of Andrews (1991); what matters is the dependence condition among the intensities of these abnormal returns. Based on the data collected, we find that the market adjusts quickly enough that the corporate event does not have the impact usually attributed to it. This suggests that the conventional event study methodology can possibly overstate such impacts (see Note 6).

Sequential Detection of the Event of Interest

To provide the sequential monitoring test, we perform a reality check of the detectability of the test (on the abnormal returns). The procedure for possible corporate events is as follows: (1) select a firm of interest for which no prior knowledge of any particular corporate event is obtained in advance; (2) perform the online monitoring tests on the observations after the recursive estimation; (3) once the test detects a possible change in the occupation time statistics, check whether any relevant corporate event occurred. In brief, we perform the tests first, without knowing whether there is any significant event at the institution; if the test statistics signal a detection, we then double-check whether any significant corporate (finance) event appeared at the institution during that period of time. Specifically, the sequential monitoring tests are performed with the test statistics calculated online as additional observations are scanned through, and the test indicates the point of detection as soon as the statistic (or decision function) exceeds the critical values tabulated by Picard (1985): for example, 0.715, 0.772, and 0.874 for significance levels of 10, 5, and 1%, respectively. A sketch of such a monitoring loop follows.
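The sketch below is a minimal illustration of the online monitoring loop. The decision function used here, namely the largest discrepancy between the occupation time statistics of the post-sample and current-sample CAR paths over a grid of thresholds, is an illustrative stand-in rather than the chapter's exact statistic; only the Picard (1985) critical values quoted above are taken from the text, and all names are hypothetical.

```python
import numpy as np

# Picard (1985) critical values quoted above (10%, 5%, and 1% levels).
CRITICAL = {0.10: 0.715, 0.05: 0.772, 0.01: 0.874}

def occupation_time(car, h):
    """Fraction of observations with |CAR| above threshold h."""
    return np.mean(np.abs(car) > h)

def monitor(abnormal_returns, n0, level=0.05, n_grid=50):
    """Illustrative online detection loop (not the chapter's exact form).

    Scans observations beyond the first n0, comparing occupation time
    statistics of the post-sample CAR path with those of the current
    sample over thresholds from zero to ten times the sample range,
    and stops once the largest discrepancy exceeds the critical value.
    """
    ar = np.asarray(abnormal_returns, dtype=float)
    grid = np.linspace(0.0, 10.0 * np.ptp(ar[:n0]), n_grid)
    base_car = np.cumsum(ar[:n0])
    for t in range(n0 + 1, ar.size):
        new_car = np.cumsum(ar[n0:t])
        stat = max(abs(occupation_time(new_car, h) - occupation_time(base_car, h))
                   for h in grid)
        if stat > CRITICAL[level]:
            return t, stat        # detection point and decision statistic
    return None, None             # no detection over the scanned horizon
```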
A linear regression (the market model) is performed to obtain the abnormal returns (and their cumulative abnormal returns, CARs) for the stock returns, based on estimates obtained recursively from the sample. The detection statistic is computed over the additional observations of abnormal returns beyond the current sample, comparing the occupation time statistics of the post-current-sample observations with those of the current sample. (Notice that the abnormal returns are obtained after the normal returns have been estimated recursively through the sample.)

The test is applied to the daily stock returns of Exxon Mobil from June 2015 to June 2017, in order to assess the accuracy of early detection when no a priori information about the company's corporate finance issues is available before the test. With continuous online estimation of the market model, and with the threshold $h$ ranging from zero to ten times the range of abnormal returns in the sample (the suprema being taken over this range of thresholds for conservatism), the test statistics show a major change in the occupation time statistics around the 178th–179th observations of the abnormal return series. As a precaution, the range of thresholds is extended as far as possible so that the occupation time statistics are estimated as precisely as possible across different thresholds (see Note 7). Both of these statistics exceed the critical value of 0.772 for the 5% significance level according to Picard (1985). In addition, for the various thresholds above, similar results indicate detection around the same dates, namely the 178th–179th observations.

Applying the above sequential detection stopping rule thus indicates an early detection of a change in the occupation time of abnormal returns; accordingly, there should be a significant corporate event in the data. The result is then checked against the record of major corporate events in that period, to rule out a false alarm due to spikes (or other noise) in the data stream. The dates of this significant change in the occupation time statistics are around 02/25/2016 and onward.
Checking the corporate finance issues of Exxon Mobil after performing the tests, it turns out that the company had issued an additional $12 billion of bonds to the market in late February 2016, which raised concerns about the firm's solvency. The test statistics picked up the possible changes as an early detection mechanism. In addition, without using the abnormal returns, the monitoring test statistics calculated from the raw rates of return detect a change around the 221st observation, which is April 27, 2016. As a follow-up, it was confirmed afterward that the company's bonds were downgraded by S&P from an AAA to an AA+ rating in late April 2016. Although the fall in oil prices in 2015 was well known to the market, no prior knowledge of the firm was used, and the test was performed without searching for any major corporate issue in advance. In other words, this is not simply a recursive sequential check verifying the impacts of corporate events that are already known. Instead, the sequential monitoring test provides an early "warning shot" for possible corporate finance events even when no prior knowledge of the institution is available. This shows that the sequential monitoring test developed here is useful for detecting corporate issues online.

In conclusion, the occupation time statistics can be used in sequential monitoring to detect possible corporate events without prior knowledge of the events of interest. The difficulty is that the weak convergence of the test statistics does not yield an analytical form for the asymptotic distribution; hence, the critical values are obtained from simulations in the literature, such as Picard (1985). Table 6.2 reports, for comparison, the conventional CAR tests on our merger sample.
Table 6.2 Conventional CAR tests with conventional OLS estimates and with recursive least squares. The traditional CAR test estimates the market model over the period from −250 to −120 days prior to the event date. The recursive least squares CAR test is based on recursive least squares estimates of the market model over the entire sample period. Both CAR tests use the event window from −2 days prior to the event date up to +5 days after the event date

Statistics         Traditional CAR (−2, +5)   Recursive least squares CAR (−2, +5)
Average            0.70%                      0.02%
Median             −0.07%                     −0.41%
Number of firms    125                        125
Z-statistic        −0.39                      −1.16
p-value            0.698                      0.244

Source: Advances in Quantitative Analysis of Finance and Accounting (2016)
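For comparison, here is a minimal sketch of the traditional CAR computation summarized in Table 6.2, assuming the market model is estimated by OLS over event days −250 to −120 and the abnormal returns are cumulated over the (−2, +5) window. The cross-sectional Z-statistic shown is one standard form (average CAR divided by its cross-sectional standard error) and may differ from the exact standardization used in the source; the function names are hypothetical.

```python
import numpy as np

def traditional_car(stock, market, event, est=(-250, -120), window=(-2, 5)):
    """CAR over the event window from a market model fitted by OLS on
    the estimation period, both defined relative to `event`, the index
    of the event date in the return series."""
    stock = np.asarray(stock, dtype=float)
    market = np.asarray(market, dtype=float)
    fit = slice(event + est[0], event + est[1] + 1)
    X = np.column_stack([np.ones(market[fit].size), market[fit]])
    a, b = np.linalg.lstsq(X, stock[fit], rcond=None)[0]
    win = slice(event + window[0], event + window[1] + 1)
    return np.sum(stock[win] - (a + b * market[win]))  # cumulative AR

def cross_sectional_z(cars):
    """Average CAR divided by its cross-sectional standard error."""
    cars = np.asarray(cars, dtype=float)
    return cars.mean() / (cars.std(ddof=1) / np.sqrt(cars.size))
```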
Notes

1. Certainly, weak convergence to other diffusion processes, such as a Lévy process, can be assumed to allow more generality. However, given the complexity of the discussion, this extension is left for later studies.
2. Given that there are many different definitions for the types of Banach space, we apply the definition of Ledoux and Talagrand (1991) for simplicity.
3. We apply the Andrews (1991) Central Limit Theorem for dependent random variables here, since the merger-acquisition events across firms may be mutually dependent. In addition, the occupation time functionals converge to independent, identically distributed occupation times of the reflected Brownian motion under the null, according to Ledoux and Talagrand (1991).
4. In fact, this result does not depend on the assumption that the occupation time statistic converges to the occupation time of a Brownian motion. As long as the mean of $\psi_{i,T}(\omega, h)$ exists as $M_1(h)$ and Assumptions 6.2 to 6.5 hold, the above result still holds.
5. Since all the $\psi_{i,T}(\omega, h)$ statistics are quite small and lie between zero and one, we report all the statistics in Table 6.1 to four digits after the decimal point for more precision.
6. Some of the empirical results are obtained from the earlier publication of Jeng et al. (2016), "On the Occupation Times of Mergers and Acquisitions in Event Studies," Advances in Quantitative Analysis of Finance and Accounting, pp. 171–204.
7. Other threshold values, such as the maximum of the abnormal returns or the standard deviation of abnormal returns from the training sample, are also selected. Since the results are identical, those statistics are not reported here for simplicity.
References

Andrews, D.W.K. 1991. Heteroskedasticity and Autocorrelation Consistent Covariance Estimation. Econometrica 59 (3): 817–858.
Araujo, A., and E. Giné. 1980. The Central Limit Theorem for Real and Banach Valued Random Variables. New York: Wiley.
Brown, S.J., and J.B. Warner. 1985. Using Daily Returns: The Case of Event Studies. Journal of Financial Economics 14: 3–31.
Dehling, H. 1983. Limit Theorems for Sums of Weakly Dependent Banach Space Valued Random Variables. Probability Theory and Related Fields 63: 393–432.
Eberlein, E. 1986. On Strong Invariance Principles under Dependence Assumptions. Annals of Probability 14: 260–270.
Ermakov, S.V., and E.I. Ostrovoskii. 1986. The Central Limit Theorem for Weakly Dependent Banach-Valued Random Variables. Theory of Probability and Its Applications 30: 391–394.
Hoffmann-Jorgensen, J., and G. Pisier. 1976. The Law of Large Numbers and the Central Limit Theorem in Banach Spaces. Annals of Probability 4: 587–599.
Jain, P.C. 1986. Relation between Market Model Prediction Errors and Omitted Variables: A Methodological Note. Journal of Accounting Research 24: 187–193.
Jeng, J.-L. 2015. Analyzing Event Statistics in Corporate Finance. New York: Palgrave Macmillan.
Jeng, J.-L., D. Park, and M. Dewally. 2016. On the Occupation Times of Mergers and Acquisitions in Event Studies. Advances in Quantitative Analysis of Finance and Accounting 14: 171–204.
Ledoux, M., and M. Talagrand. 1991. Probability in Banach Spaces. Berlin: Springer-Verlag.
Lipton, M. 2006. Merger Waves in the 19th, 20th, and 21st Centuries. York University Working Paper.
Picard, D. 1985. Testing and Estimating Change-Points in Time Series. Advances in Applied Probability 17: 841–867.
Pötscher, B.M. 2004. Nonlinear Functions and Convergence to Brownian Motion: Beyond the Continuous Mapping Theorem. Econometric Theory 20: 1–22.
Sen, P.K. 1982. Invariance Principles for Recursive Residuals. Annals of Statistics 10: 307–312.
Takács, L. 1998. Sojourn Times for the Brownian Motion. Journal of Applied Mathematics and Stochastic Analysis 11: 231–246.
Thompson, J.E. 1988. More Methods that Make Little Difference in Event Studies. Journal of Business Finance and Accounting 15: 77–85.
Wu, W. 2007. Strong Invariance Principles for Dependent Random Variables. Annals of Probability 35: 2294–2320.
Concluding Remark
The current contents of this book cover the conventional methods for identifying the effects of corporate finance events and the procedures associated with them. We surveyed many perspectives on the various approaches used to verify the possible impacts of the events of interest. Unfortunately, the conventional approaches, although they solve many issues (econometric or otherwise), ignore the intensity of the impacts of events. This book points out several issues for the forthcoming literature to consider: (1) investigating the normal returns (and the abnormal returns) using an arbitrary determination of estimation and event periods is highly subjective and depends on the issues of interest and the samples selected; (2) obtaining the normal returns using recursive estimation seems more promising, since markets use the most current information to update their forecasts for the stocks or portfolios of interest; in particular, such approaches are available in the automatic control literature for tracking the dependent variable in regression models; (3) we define a new occupation time statistic that considers the intensity of events, evaluating their severity by the duration of their impact.
Unlike the stochastic drawdown, which considers the time frame or sample path of the variables of interest by assuming particular underlying stochastic processes, the occupation time statistic does not depend on the stochastic processes assumed; (4) we also consider the empirical processes and their asymptotic properties and show that the occupation time statistic is a transformation of empirical processes indexed by certain functions. In fact, the occupation time statistic depends on the cumulative sums of abnormal returns. By contrast, the conventional approach investigates only whether the magnitude of the cumulative sums of abnormal returns differs from zero as confirmation of the significance of events, whereas the occupation time statistic studies the duration of the consequences associated with the events; (5) a sequential monitoring scheme is also developed for the detection of unknown events that may alter the pattern of the occupation time statistics. With real-time data, we confirm that the underlying statistic is useful for the online detection of events.

However, a few issues remain for the forthcoming literature to consider: (1) some corporate events are temporary while others may be permanent. Statistical inference is easier for the latter; yet when the issues are only temporary, existing studies usually ignore this distinction, and the conventional methods tend to apply the same tools regardless. Event study methods should be robust enough for either case, and approaches such as Hawkes processes or other counting processes can be extended to analyze corporate event studies; (2) the sequential monitoring schemes for corporate issues need further development of their asymptotic properties when the underlying stochastic processes may be subject to gradual changes. In particular, the results would be more precise if an analytical distribution function for the Kiefer process were feasible; (3) the recursive estimation of normal returns, combined with model selection for systematic risk, needs further study for robust estimation as markets evolve over time. These are left for future studies.