Blockchain Technologies
Paul Moon Sub Choi Seth H. Huang Editors
Fintech with Artificial Intelligence, Big Data, and Blockchain
Blockchain Technologies

Series Editors:
Dhananjay Singh, Department of Electronics Engineering, Hankuk University of Foreign Studies, Yongin-si, Korea (Republic of)
Jong-Hoon Kim, Kent State University, Kent, OH, USA
Madhusudan Singh, Endicott College of International Studies, Woosong University, Daejeon, Korea (Republic of)
This book series aims to detail blockchain implementations in technology and in interdisciplinary fields such as medical science, applied mathematics, environmental science, business management, and computer science. It offers in-depth knowledge of blockchain technology for advanced and emerging future technologies, focusing on magnitude (scope, scale, and frequency), risk (security, reliability, trust, and accuracy), time (latency and timeliness), and the utilization and implementation details of blockchain technologies. While Bitcoin and cryptocurrency might have been the first widely known uses of blockchain technology, today it has far more applications. In fact, blockchain is revolutionizing almost every industry. Blockchain has emerged as a disruptive technology that has not only laid the foundation for all cryptocurrencies but also provides beneficial solutions in other fields of technology. The features of blockchain technology include decentralized and distributed secure ledgers that record transactions across a peer-to-peer network, creating the potential to remove unintended errors by providing transparency as well as accountability. This could affect not only the financial technology (cryptocurrency) sector but also other fields, such as: crypto-economics; enterprise blockchain; blockchain in the travel industry; embedded privacy blockchain; blockchain in Industry 4.0; blockchain in smart cities; blockchain in future technologies; blockchain for fake news detection; blockchain technology and its future applications; implications of blockchain technology; blockchain privacy; blockchain mining and use cases; blockchain network applications; blockchain smart contracts; blockchain architecture; blockchain business models; blockchain consensus; Bitcoin and cryptocurrencies; and related fields. The initiatives in which the technology is used to distribute and trace the communication start point, provide and manage privacy, and create a trustworthy environment are just a few examples of the utility of blockchain technology; they also highlight its risks, such as privacy protection. Opinion on the utility of blockchain technology is mixed: some are enthusiastic, while others believe it is merely hyped. Blockchain has also entered the sphere of humanitarian and development aid, e.g., supply chain management, digital identity, smart contracts, and more. This book series provides clear concepts and applications of blockchain technology and invites experts from research centers, academia, industry, and government to contribute to it. If you are interested in contributing to this series, please contact [email protected] OR [email protected]

More information about this series at http://www.springer.com/series/16276
Editors:
Paul Moon Sub Choi, Ewha Womans University, Seoul, South Korea
Seth H. Huang, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
ISSN 2661-8338 ISSN 2661-8346 (electronic) Blockchain Technologies ISBN 978-981-33-6136-2 ISBN 978-981-33-6137-9 (eBook) https://doi.org/10.1007/978-981-33-6137-9 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Contents

Blockchain, Cryptocurrency, and Artificial Intelligence in Finance . . . 1
Yun Joo An, Paul Moon Sub Choi, and Seth H. Huang

Alternative Data, Big Data, and Applications to Finance . . . 35
Ben Charoenwong and Alan Kwan

Application of Big Data with Fintech in Financial Services . . . 107
Joseph Bamidele Awotunde, Emmanuel Abidemi Adeniyi, Roseline Oluwaseun Ogundokun, and Femi Emmanuel Ayo

Using Machine Learning to Predict the Defaults of Credit Card Clients . . . 133
Tuan Le, Tan Pham, and Son Dao

Artificial Intelligence and Advanced Time Series Classification: Residual Attention Net for Cross-Domain Modeling . . . 153
Seth H. Huang, Lingjie Xu, and Congwei Jiang

Generating Synthetic Sequential Data for Enhanced Model Training: A Generative Adversarial Net Framework . . . 169
Seth H. Huang, Wenjing Xu, and Lingjie Xu

A Machine Learning-based Model for the Asymmetric Prediction of Accounting and Financial Information . . . 181
Minjung Park and Sangmi Chai

Artificial Intelligence-based Detection and Prediction of Corporate Earnings Management . . . 191
Sohyeon Kang and Sorah Park

Machine Learning Applications in Finance Research . . . 205
Hyeik Kim

Price-Bands: A Technical Tool for Stock Trading . . . 221
Jinwook Lee, Joonhee Lee, and András Prékopa

Informed or Biased? Some Evidence from Listed Fund Trading . . . 247
Paul Moon Sub Choi and Joung Hwa Choi

Information Divide About Mergers: Evidence from Investor Trading . . . 285
Ye Jun Kim, Hyeik Kim, Paul Moon Sub Choi, Joung Hwa Choi, and Chune Young Chung

Machine Learning and Cryptocurrency in the Financial Markets . . . 295
Haneol Cho, Kyu-Hwan Lee, and Chansoo Kim
Blockchain, Cryptocurrency, and Artificial Intelligence in Finance
Yun Joo An, Paul Moon Sub Choi, and Seth H. Huang
Abstract This chapter describes the principles of blockchain, cryptocurrency, and artificial intelligence (AI) and their applications to the financial sector. We first discuss blockchain and then cryptocurrency, the best-known application of blockchain. We ask whether a cryptocurrency is a currency or an asset and whether it can be a new safe haven asset. We summarize the controversy regarding the issuance of a central bank digital currency (CBDC) and argue that digital currencies show the potential to inject liquidity into an economy only during market stress. Additionally, most of the recognized advantages of blockchain applications relate to two concepts: decentralization and consensus. Blockchain's decentralization can be used to democratize banking services, corporate governance, and the real estate industry. Finally, we present the strengths of and concerns about using AI technologies in banking, lending platforms, and asset management, bearing in mind the most recently developed applications in these areas. This chapter contributes to the literature by incorporating both theory and practice in blockchain and by presenting a detailed review of the performance and limitations of AI techniques in finance, including recent publications relating to the COVID-19 pandemic, CBDC, and alternative data.

Keywords Blockchain · Cryptocurrency · Artificial intelligence · Fintech · Banking · Corporate governance · Lending · Investing · Bitcoin · Safe haven asset · Central bank digital currency

Y. J. An
School of Economics, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, South Korea
e-mail: [email protected]

P. M. S. Choi (B)
College of Business Administration, Ewha Womans University, 52-Ewhayeodae-gil, Seodaemun-gu, Seoul 03760, South Korea
e-mail: [email protected]

S. H. Huang
Department of Business Management, The State University of New York, 119-2 Songdo Moonhwa-ro, Yeonsu-gu, Incheon 21985, South Korea
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
P. M. S. Choi and S. H. Huang (eds.), Fintech with Artificial Intelligence, Big Data, and Blockchain, Blockchain Technologies, https://doi.org/10.1007/978-981-33-6137-9_1
1 Introduction

Fintech (financial technology) refers to the interaction between information and communication technologies and the established business of the financial industry. In this book, we examine three core concepts: blockchain, artificial intelligence (AI), and big data. This chapter surveys the theories of blockchain and AI and their applications to the financial sector. We present what blockchain is, how it works, how it is used in finance, and how it can disrupt or support the financial industry. We also summarize the performance and limitations of AI techniques in banking, lending, and asset management.

Fintech firms have been increasingly prominent in the financial sector since the Global Financial Crisis (GFC) of 2008, increasing banking competition, improving payment technologies, expanding lending institutions, altering corporate governance, and leaving their mark on real estate and supply chain management. Fintech has caused breakthroughs in these areas by showing the potential to alleviate information asymmetries between managers and customers, reduce manual work, enable the adoption of innovative techniques, and escape regulation. The opposites of these changes, namely information asymmetry, operational risk, slower adaptation to innovation, and heavy regulation, are prevalent in the conventional financial system, most evidently in banks [1].

Blockchain is a database of records where transactions are recorded and shared. It provides a decentralized peer-to-peer network in which nodes can cooperate to reach consensus. Its integrity is a particularly remarkable feature: the blocks that hold the records of transactions are linearly linked into a chain in chronological order [2]. These blocks prevent any currency from being spent twice. Additionally, blockchain technology enables the following: cryptocurrency, pioneered by Bitcoin [3], and central bank digital currency (CBDC), which has been proposed as a complement to monetary policy; discussion of Bitcoin as a new safe haven asset; greater shareholder democracy in corporate governance; novel universality in the real estate industry; and prediction of customers' future orders based on historical records.

Most of the proposed advantages of blockchains relate to decentralization and consensus, and these advantages shift demand and supply in financial services. Blockchain technology, by design, removes financial intermediaries, which lowers barriers and allows easy entry. That said, there are contrasting arguments claiming that current blockchains rely on a skewed distribution of miners, so they are not fully decentralized and thus cannot yield the proposed advantages.

The advantages of AI include the utilization of novel data and the generation of new customer profiles, which sometimes contradict those built from conventional data. For example, AI-based lending platforms can integrate large market segments because they can provide lower-priced credit to customers who have subprime ratings according to conventional criteria. In developing economies, AI techniques can enhance the penetration of the financial system because in remote areas, low-cost lending, banking, and payment services are more accessible than traditional banks. In asset management,
machine investors follow pre-specified investing strategies to enhance their performance, while human investors are costly and more vulnerable to emotions during decision processing.

Despite their strengths, blockchain and AI techniques have limitations. Cryptocurrencies are subject to pseudonymity and fraudulent identity, which reduce user trust. In particular, cryptocurrency is often the payment of choice for illegal activities. Blockchain inevitably faces fork problems because multiple equilibria can form in the games induced by proof-of-work in blockchain protocols. Moreover, some scholars argue that blockchain is not completely decentralized, so its supposed advantages cannot be fully realized. Some blockchain protocols rely on only a handful of miners, so the protocols are vulnerable to the 51% attack. Further, there is concern regarding the massive energy consumption involved in mining. Although AI techniques are attractive, the financial industry is slow to adopt them. AI is apparently not in the mainstream of current banking due to banking regulations, a scarcity of skilled IT personnel, long-built rigidity in current IT systems, and banks' innate risk aversion. Although AI can enlarge the size and improve the quality of data, big data will never mean that all entities have equal access to it. Only market leaders and innovative firms can take advantage of rare alternative data, while most other firms will fall behind.

Fintech is viewed in two contrasting ways. The idea of sustainable fintech holds that fintech firms will provide healthy competition in the financial industry; in this view, the established financial sector will improve customer service and accelerate product innovation. However, disruption will also take customers away from the conventional financial sector, and employees in banks, corporations, and insurance companies will be laid off. The literature that describes fintech as potentially disruptive calls for banks to enhance their customer relationship management. Discussing economic policy in relation to central banking, Lagarde [4] argues that even though fintech may ameliorate technological problems, it can never replace the financial system as it is constituted. This is because storytelling, earning the trust of the public, forming public expectations, and active communication among peer experts are the most critical elements in driving policy changes.

Section 2.1 introduces the idea of blockchain, how it works, and two of its most important properties in this context: consensus and decentralization. Section 2.2 presents the ways in which standard currency and asset-pricing models are applied to cryptocurrencies; this subsection also presents the debate over whether Bitcoin is a new safe haven asset. Section 3 describes applications of blockchain to the banking industry (including CBDC), corporate governance, and the real estate industry, as well as concerns regarding the application of blockchain in the financial sector. Section 4 reviews the ways in which AI techniques are altering banking, lending, and asset management, studying the performance and limitations of the implementation of AI techniques in finance, as well as the debate between the views of fintech as sustainable or disruptive. Section 5 concludes.
2 Logic of Blockchain

2.1 Introduction to Transactions in Blockchain

Blockchain is a decentralized, peer-to-peer records database where transactions are recorded and shared. It brings (or is intended to bring) distributed consensus, in which the majority of participants in a public ledger verifiably agree on transactions. We present the process of distributed consensus in blockchain in Fig. 1, following Böhme et al. [5]. Alice, a user, wants to send Bitcoins to another user, Bob. Alice must prove two points: that she owns the relevant private key and that her account holds sufficient cryptocurrency. Alice signs the order with her private key, and the order is addressed to Bob's public key; the signed transaction is then broadcast to the peer-to-peer network, where any node can check the signature against Alice's public key. All participating entities share the records of the transactions in a decentralized peer-to-peer network. Nodes in the network reach a consensus and thereby confirm that the orders are truthful. If the verification is agreed to by the nodes, the transaction is recorded in a public ledger. Unlike a traditional payment service or conventional finance services, no higher authority provides intermediation. Next, the block that records the order from Alice to Bob is inserted into a blockchain. All blocks are added linearly to the chain, which is the origin of the name. Finally, the payment from Alice to Bob is executed. Instead of referring to a third, higher authority, blockchain requires cryptographic proof that two willing parties have made an online transaction. In this way, each unit of cryptocurrency can be traced through all of its transactions to the start of its circulation. Everyone can read all of the transactions in records that are stored in a widely replicated data structure. Crosby et al. [2] argue that although cryptocurrency is highly controversial in some areas, the blockchain system itself is flawless.

Fig. 1 Process of distributed consensus in blockchain. Source Authors

Consensus is the agreement developed between nodes in a peer-to-peer network; for blockchain in particular, it is agreement on the validity of transactions and on the history of orders. Many types of consensus algorithms exist for blockchain. Public blockchains rely on many nodes, which typically must agree on a single value. Agreement on a common value between multiple nodes is called distributed consensus. How do nodes achieve consensus on the transaction between Alice and Bob? How can we ensure that the relevant blocks are linearly added? Blockchain incorporates two principal technologies: public–private key (asymmetric) cryptography and cryptographic validation of transactions. Asymmetric encryption involves an algorithm in which the encryption key (the public key) and the decryption key (the private key) are distinct. This method has the strength that its transmissions can pass through unsecured channels. However, there is also the concern that it may be slow because these keys involve the processing of large mathematical problems.

Proof-of-work is a computationally intensive mathematical puzzle that must be solved for the use of cryptocurrency. A block is inserted into a blockchain only if its proof-of-work is solved. This puzzle becomes more difficult to solve as new blocks are added to the system. A node that solves proof-of-work earlier is added to the blockchain before the next node that solves it. This chronological order prevents double spending and falsified identity. The economic significance of proof-of-work is that it ensures the scarcity of currency. The prevention of double spending is an important prerequisite for blockchain to be used to issue money. In cryptocurrency, blockchain preserves the scarcity of money by verifying the validity of transactions in the peer-to-peer network through proof-of-work and mining technologies.

Blockchain prevents double spending by what is called mining. To prevent the recording of a transaction that did not happen, each newly added block is compared to the most recently published block. In this transaction, the new block solves a mathematical puzzle that relates to previous blocks. If other users verify the solution, the peer-to-peer network agrees that the new block contains a valid transaction. Thus, the network and miners together ensure that the blockchain is chronologically ordered.

In a blockchain, transactions are arranged in a strictly linear chronological order. Accounting for the mixture of orders was a daunting task for distributed records management prior to the adoption of blockchain technology [2]. Blockchain prevents a mixture of orders by placing transactions simultaneously inside a single block. Hence, all transactions in one block can be ordered at once. The ledger arranges multiple blocks in which every previous block intersects with the beginning of the following block. Yermack [6] describes the integrity of blockchain as follows. Assume that one agent wants to change an earlier block, say block 74, while simultaneously adding the new block 91. Then, because blocks are ordered chronologically,
the hypothetical agent would have to rearrange all the blocks from 74 to 91. This process would need to be completed before the new block 92 is inserted. Doing so would be extremely expensive, if it is possible at all, so such an agent is effectively prevented from disrupting the order of chronologically linked blocks. We call the preservation of this property the integrity of blockchain (see the sketch below).

There are three main types of blockchain, distinguished by who is able to participate in the consensus: public, consortium, and private. In a public blockchain, any miner can participate in the peer-to-peer network and the public ledger; that is, the blockchain can be read by the public. In a consortium blockchain, a pre-selected set of nodes engages to produce the consensus. A private blockchain entails a consensus determined by a given organization. In consortium and private blockchains, reading can be either public or restricted. Consortium blockchain shares the scope of its participants with private blockchain, but transactions in private blockchain are irreversible.
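To make the chaining and proof-of-work mechanics above concrete, the following minimal Python sketch (our own illustration, not from the cited works; the block fields and the leading-zeros difficulty target are assumptions) mines toy blocks in which each block commits to its predecessor's hash:

```python
import hashlib
import json

def hash_block(block: dict) -> str:
    # Deterministic SHA-256 hash of the block's contents.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def mine(block: dict, difficulty: int = 4) -> str:
    # Proof-of-work: increment the nonce until the hash has
    # `difficulty` leading zeros (an illustrative target).
    while True:
        digest = hash_block(block)
        if digest.startswith("0" * difficulty):
            return digest
        block["nonce"] += 1

# Build a toy chain: each block stores the previous block's hash.
chain, prev = [], "0" * 64
for i, tx in enumerate(["Alice pays Bob 1 BTC", "Bob pays Carol 0.5 BTC"]):
    block = {"index": i, "prev_hash": prev, "tx": tx, "nonce": 0}
    prev = mine(block)
    chain.append((block, prev))

# Tampering with an early block breaks every later link: after editing
# block 0's transaction, its hash no longer matches block 1's stored
# prev_hash, so the entire suffix would have to be re-mined.
chain[0][0]["tx"] = "Alice pays Mallory 1 BTC"
assert hash_block(chain[0][0]) != chain[1][0]["prev_hash"]
```

Signature verification (Alice proving ownership of her private key) is omitted here for brevity; in Bitcoin it is performed with ECDSA over the secp256k1 curve.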
2.1.1 Blockchain Design: Decentralization
Decentralization describes how consensus is generated, distributed, and stored. Bitcoin pioneered consensus protocols for blockchains. Conventional systems, including traditional banking, are centralized. Higher authorities like central banks seek efficiency in designing and implementing monetary policies. Centralization brings order, but it also makes corruption possible, as well as political dependence, which is a particular concern for central banks in developing economies. The GFC of 2008 demonstrated the vulnerability of financial systems and the limitations of conventional monetary policy as administered by central banks. By design, blockchain eliminates the need for a centralized authority. Bashir [7] presents two ways in which decentralization can be implemented: disintermediation and contest-driven decentralization. Disintermediation is simply the absence of banks or any intermediaries between sender and receiver in the conventional financial system. Contest-driven decentralization, by contrast, involves competition between candidates seeking to perform a transaction service between buyer and seller. This method is not perfect decentralization because there necessarily is an intermediary agent; however, it prevents monopoly. Blockchain technology can yield varying levels of decentralization.

Decentralization has certain advantages. First, it prevents a single point of failure (SPOF) in a system; that is, it avoids structures in which the failure of one part brings down the entire system. Blockchain alleviates SPOF because it maintains irreversible records that are distributed among decentralized nodes. In this arrangement, the failure of a single node is unlikely to cause the peer-to-peer consensus to fail. Likewise, no single node can reverse or change any record or the order of any transactions. Second, decentralization prevents monopoly power. The literature widely accepts that blockchain lowers barriers and allows participants to enter easily. Decentralization also increases information interaction. Peer-to-peer
networks allow agents to exchange digital assets with additional information aggregation and exchange. It must be acknowledged that blockchain has achieved something that no prior technology in computer science had: it increases information interaction while preserving data privacy [8]. Barenji et al. [9] merge blockchain technology with cloud manufacturing and highlight that the decentralization possible with blockchain can enhance its flexibility, efficiency, and availability. Blockchain would allow auditing firms to exchange encrypted information while preserving clients' proprietary information. Early users of Bitcoin praise the decentralization possible with blockchain.

However, other work argues that the decentralization in blockchain systems is not perfect and has only limited benefits. Among other observations, it is noted that a handful of entities control decision making, mining, and consensus in the Bitcoin protocol, so it is not fully decentralized. Nakamoto consensus assumes that each mining node has similar computational power and thus a similar probability of extending the blockchain. Chu and Wang [10] argue that current blockchain technology is not fully decentralized because physical nodes in the blockchain have uneven computing power. If the price of a cryptocurrency grows, the mining power of a single node can become many times that of other nodes. Their findings suggest that 53% of the mining power for Bitcoin is controlled by the top four Bitcoin miners. This means that the blockchain is maintained by a small handful of entities. A small, decentralized financial system entails higher risk than a centralized financial market: it is vulnerable to adverse economic shocks because it is not controlled by regulators and is inclined toward risky investing behaviors [5].

Moreover, blockchain decentralization is vulnerable to the fork problem, which occurs when a blockchain diverges into two potential paths forward, with conflicts between the old and the new rules. A hard fork introduces rules that old nodes reject as invalid; it keeps a single chain only if all nodes adopt the new agreement simultaneously, so it can negatively affect the stability of the whole system. A soft fork tightens the verification conditions so that blocks valid under the new rules remain valid under the old ones; this can temporarily produce multiple chains, and old nodes that are unaware of the upgrade are given some time until they follow it. Biais et al. [11] indicate that fork problems are inevitable because there are several equilibria in the games induced by proof-of-work in blockchain protocols. This entails different versions of ledgers rather than a single unique ledger on which consensus is reached. The fork problem is caused by these multiple equilibria in the blockchain protocol, information delays, and software upgrades.

Böhme et al. [5] propose five types of intermediaries that prevent decentralization: currency exchanges, digital wallet services, mixers, mining pools, and payment processors. Chen et al. [12] propose the impossibility triangle to indicate that blockchain cannot achieve the three key virtues of decentralization, consensus, and scalability at the same time. Decentralization requires distribution of ownership and governance. If a blockchain is decentralized, then the network is unlikely to reach a consensus, and even when it does, consensus entails duplicate storage, queries, and recordings. Accordingly, Lee and Choi [13] suggest an algorithm and a consensus protocol that
synthesizes the conventional blockchain framework [3] and the directed acyclic graph [14]. Chu and Wang [10] argue that instead of this trilemma there is a dilemma between decentralization and scalability alone: decentralized blockchain must sacrifice scalability. If a blockchain hypothetically were to become fully decentralized, a low upper bound would appear in the platform software layer, preventing the scaling of smart contract execution. In the smart contract layer, blockchain replicates sequential programming models, which prevent smart contract execution from scaling.

Blockchain depends on incentives to encourage honesty. Here, trust is the key component of interaction between entities. Decentralization involves sharing information among agents with divergent preferences and beliefs, in which the common ledger mitigates information asymmetry. In the computer science setting, trust involves executing transactions in a fault-tolerant way. The blockchain approach requires qualified searching and matching in storage computing, verification of transcripts of computations, and randomization of public ledgers. Gandal et al. [15] provide empirical evidence of price manipulation of Bitcoin during a period that saw an unprecedented boom in the exchange rate between the U.S. dollar and Bitcoin, when Bitcoin's value spiked from $150 to $1,000 in late 2013.
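The concern about concentrated mining power can be quantified with the attacker catch-up probability from the original Bitcoin paper [3]. The sketch below is our own rendering of Nakamoto's formulas: it computes the chance that an attacker controlling a fraction q of total hash power rewrites history from z blocks behind.

```python
from math import exp, factorial

def catch_up_probability(q: float, z: int) -> float:
    # Gambler's-ruin result: probability that an attacker with share q
    # of hash power ever overtakes the honest chain from z blocks behind.
    p = 1.0 - q
    return 1.0 if q >= p else (q / p) ** z

def double_spend_probability(q: float, z: int) -> float:
    # Nakamoto's estimate of a successful double spend after the
    # recipient has waited for z confirmations.
    p = 1.0 - q
    lam = z * (q / p)  # expected attacker progress while z blocks confirm
    prob = 1.0
    for k in range(z + 1):
        poisson = exp(-lam) * lam ** k / factorial(k)
        prob -= poisson * (1.0 - (q / p) ** (z - k))
    return prob

for q in (0.10, 0.30, 0.45):
    print(q, [round(double_spend_probability(q, z), 6) for z in (1, 6, 12)])
```

With q = 0.1 the probability decays rapidly in z, but as q approaches 0.5 it tends to 1 for any z, which is why control by a coordinated majority, or even a large minority, of miners undermines the security of the ledger.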
2.2 Controversy Regarding Public Blockchain Application: Cryptocurrency

2.2.1 Cryptocurrency: A Currency or an Asset?
Nakamoto [3] marks the birth of Bitcoin, a type of cryptocurrency generated from the Bitcoin protocol and entered into a ledger in a public blockchain. By design, Bitcoin works on a peer-to-peer decentralized network to evade the intervention of financial institutions. Blockchain takes root in digital currency applications. Any digital asset can be transacted with blockchain protocols, but Bitcoin is the pioneer cryptocurrency. Figure 2 and Table 1 present the closing price history and market capitalization of cryptocurrencies, respectively. Bitcoin features the highest price and market capitalization of the five major cryptocurrencies ranked by market capitalization.

Despite the reputed perfection of blockchain technology [2], cryptocurrencies are highly controversial. The debates surrounding them boil down to the following unresolved issue: Is cryptocurrency in fact a real currency, or is it an asset? The assessment of the topic in the literature relies upon theories of pricing dynamics. Basic macroeconomics indicates that if real income or the velocity of money rises, the price of money also rises, and if the nominal interest rate increases, the price of money falls.
Fig. 2 Prices of Cryptocurrencies. Note Data are from monthly close prices in U.S. dollars from October 2013 to July 2020
Table 1 Cryptocurrencies by market capitalization

Currency     Market capitalization
Bitcoin      $217.84B
Ethereum     $45.52B
XRP          $28.90B
Chainlink    $15.67B
Stellar      $11.07B

Note Data are from https://www.coindesk.com/price/Bitcoin
These relationships are part of money demand theory, and Bitcoin follows these macroeconomic rules: its price rises in response to increases in the real interest rate and in the velocity of Bitcoin, and its price drops as the nominal interest rate increases [16]. However, other standard economic theories, such as the future cash flow model, purchasing power parity, and uncovered interest rate parity, explain only a limited amount of the variation in Bitcoin prices [17]. The soaring price of Bitcoin cannot be attributed to macroeconomic fundamentals such as gross domestic product (GDP), inflation, and unemployment. In a cryptocurrency market, the supply of the currency is fixed, or it is driven by an algorithm completely different from that which guides conventional pricing dynamics. The demand function is driven not by the macroeconomic fundamentals of an underlying economy but rather by buyers' and sellers' expectations of profits. Investment sentiment dominates the Bitcoin market, which is mostly populated by short-term noise traders. Against this background, the dominant view in the literature is that Bitcoin is a speculative bubble. Investors in the Bitcoin market are typically young and inexperienced, and they tend to make irrational trading decisions.

Yermack [6] argues that Bitcoin does not fulfill the three functions of money, namely as a store of value, a unit of account, and a common means of exchange; hence, Bitcoin is not a currency. Cachanosky [18] uses Friedman's quantitative theory of
money (MV = PY, where M is the money supply, V the velocity of money, P the GDP deflator, and Y real GDP) as an analytic framework to analyze Bitcoin pricing, finding that Bitcoin does not follow a good monetary rule, which indicates that it has serious limitations to becoming an independent currency. Corbet et al. [19] propose what they call the cryptocurrency trilemma, in which regulatory alignment, cybercriminality, and the potential for inherent bubbles cannot be alleviated simultaneously. Abadi and Brunnermeier [20] present a more general account, which they term the blockchain trilemma, in which blockchain cannot simultaneously achieve the three ideals of all database records: correctness, decentralization, and cost efficiency. If a blockchain wants to decrease its costs, then it must allow the free entry of record-keepers and information portability. In that case, however, correctness, which is driven by heavy computation and expensive proof-of-work algorithms, may become unaffordable. If a blockchain wants correct reporting in a cost-effective way, then the ledger must incentivize correct reporting, which is typically available with a centralized record-keeper and its monopoly. Therefore, just as with traditional centralized intermediaries, blockchain and cryptocurrencies are restricted from pursuing all three ideals.

In markets that are integrated, shocks to the price of Bitcoin in one market affect the price in the global market. However, if markets are segmented, as with the Kimchi Premium, which is limited to Korea, then the price of Bitcoin in such a market has only a marginal effect on the movement of the global price of Bitcoin [21].

The literature concedes that Bitcoin is not a perfect currency. However, even if cryptocurrency does not meet the requirements to be considered money, it nevertheless offers investment opportunities as an asset, given its high volatility and high returns. Against this background, the literature applies standard textbook empirical asset-pricing models such as the efficient markets hypothesis (EMH) and the Fama–French asset-pricing factors. Bartos [22] applies the EMH to the movement of Bitcoin prices and finds that, unlike many conventional assets that provide poor empirical evidence for the EMH, Bitcoin does follow the logic of the EMH. The pricing of Bitcoin reflects all known information: all investors know all public information, and no investor can outperform the market by using other information. Applying an error correction model to daily data from 2013 to 2014, Bartos shows that Bitcoin reacts immediately to public announcements of information. The price of Bitcoin is highly sensitive to events and information. Investors in countries with inadequate financial institutions or tighter capital controls tend to buy Bitcoin aggressively, thus driving up the price [21]. Bitcoin prices in these countries are highly sensitive to positive shocks, including those of news or events.

Examining the asset-like nature of cryptocurrency, the literature has used classical empirical asset-pricing methods for cryptocurrencies. Evidence is presented in the literature for the Fama–French three-factor, Carhart four-factor, and Fama–French five-factor approaches on global stock markets [23]. However, Liu et al. [24] claim that they are the first to apply these approaches to cryptocurrency. The Fama–French
three-factor and five-factor models explain very little of the returns of cryptocurrencies. This finding is not surprising because the three factors and the five factors are designed to explain the fundamental values of stocks. Cryptocurrency and stocks have different fundamentals, so the Fama–French factors necessarily have weak power to predict the expected returns of cryptocurrency.

Benchmarking the asset-pricing models of Fama–French and Carhart, Liu and Tsyvinski [25] construct cryptocurrency counterparts of the market, size, momentum, and value factors. The market and size factors do not affect expected cryptocurrency returns. The market factor does not explain zero-investment long-short strategies; these strategies measure asset returns against market returns without regard to the investment strategies held. The researchers employ the standard Fama and MacBeth cross-sectional regression.2 However, the counterpart to Carhart's fourth factor, namely the momentum factor, exhibits statistically significant power to predict the expected returns of cryptocurrency. Current returns positively and significantly predict returns 1, 3, 5, and 6 days ahead. The same holds true for Bitcoin weekly returns 1, 2, 3, and 4 weeks ahead. The momentum factor generates alphas with significant long-short strategy returns. The researchers further report that the top quintiles do not outperform the bottom quintiles from the fifth to the hundredth week.

Working from Liu and Tsyvinski's [25] approach, Nguyen et al. [26] argue that short-term momentum predicts the expected returns of cryptocurrency, but the market, size, and long-term momentum factors do not affect Jensen's alpha for cryptocurrency. The nonsignificance of Jensen's alpha indicates that long-term momentum portfolios do not generate abnormal returns. Long-term momentum does not outperform the cryptocurrency market in that investors have already been rewarded for the risks associated with market, size, and short-term momentum. The coefficient for the market factor is close to one, suggesting the possibility of a random walk.

The literature also reports determinants of cryptocurrency prices from novel variables related to the Internet. Empirical evidence suggests that there are no underlying fundamentals in Bitcoin as a financial asset; speculation, noise trading, and trend chasing evidently dominate Bitcoin pricing dynamics. Liu and Tsyvinski [25] argue that Google searches on Bitcoin, Ripple, and Ethereum measure investor attention to cryptocurrency. Kristoufek [27] indicates that search queries on Google Trends and daily views on Wikipedia exhibit strong correlations with Bitcoin returns, and further shows that greater Google search volumes cause the price of Bitcoin to increase (and vice versa). However, decreases in Bitcoin price have no statistically significant effects on search queries. Mai et al. [28] claim to be the first to incorporate social media as a predictive determinant of Bitcoin returns. Increases in Bitcoin price are positively and significantly associated with lagged social media activity, implying that social media movements can predict Bitcoin price movements. Microblogging has hourly effects, and Internet forums, where a silent majority predominates, show daily effects.

2 First, they sort the returns of individual cryptocurrencies into quintiles given the factor. Then, they track the returns of each portfolio in the following week and calculate the excess return over the risk-free rate. Next, they form a long-short strategy based on the difference between the fifth and first quintiles.
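The sorting procedure in footnote 2 is mechanical and easy to prototype. Below is a minimal pandas sketch, our own illustration rather than the cited papers' code; the DataFrame layout and names are assumptions.

```python
import pandas as pd

def quintile_long_short(returns: pd.DataFrame, signal: pd.DataFrame,
                        rf: pd.Series) -> pd.Series:
    """Weekly long-short factor return: long the top signal quintile,
    short the bottom, following the sort described in footnote 2.

    returns : weekly coin returns (rows = weeks, columns = coins)
    signal  : sorting variable (e.g., past-week return), same shape
    rf      : weekly risk-free rate, indexed like `returns`
    """
    spread = {}
    weeks = returns.index
    for i, week in enumerate(weeks[:-1]):
        # Sort this week's coins into quintiles by the signal.
        ranks = pd.qcut(signal.loc[week].dropna(), 5, labels=False)
        nxt = weeks[i + 1]
        excess = returns.loc[nxt] - rf.loc[nxt]          # next-week excess returns
        top = excess[ranks[ranks == 4].index].mean()     # fifth quintile
        bottom = excess[ranks[ranks == 0].index].mean()  # first quintile
        spread[nxt] = top - bottom                       # long-short return
    return pd.Series(spread)
```

A t-test of the resulting series against zero is then the usual check for a significant momentum alpha.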
2.2.2 Cryptocurrency: Is It the New Safe Haven Asset?
The most intriguing question about cryptocurrency is rooted in its apparent resilience during the GFC of 2008 and during region-specific crises, such as the European bailouts of 2010-2013 and the demonetization initiatives of the Indian and Venezuelan governments. The rising returns of Bitcoin during bearish stock markets have prompted (and, to a certain extent, given hope to) the idea in academic research and the finance industry that Bitcoin could be a novel safe haven asset. Figure 3 presents closing prices of Bitcoin and stock indexes, including the Morgan Stanley Capital International (MSCI) USA, MSCI Asia, MSCI Europe, and MSCI UK indexes. The nickname for Bitcoin, digital gold, alludes to the safe haven properties of gold. Figure 4 presents returns for Bitcoin-gold, Bitcoin-oil, and the Bitcoin-U.S. dollar exchange rate. Bitcoin exhibits higher return volatility than gold, oil, or the U.S. dollar. These simple plots ground a skeptical view of whether the return volatility of Bitcoin is stable enough for it to become a safe asset.
Fig. 3 Price of stock and Bitcoin. Note Data are monthly close prices of Bitcoin and MSCI stock indexes from October 2013 to July 2020, retrieved from https://www.msci.com/end-of-day-datacountry
Fig. 4 Returns for the exchange rates between Bitcoin–gold, Bitcoin–oil, and Bitcoin and the U.S. dollar. Note These are monthly returns (%) from January 2014 to July 2020. We use the gold fixing price at 10:30 a.m. in the London bullion market in U.S. dollars, West Texas intermediate (WTI) crude oil price, and trade-weighted U.S. dollar: broad, goods, and services, with data from the Federal Reserve Bank of St. Louis Economic Research (FRED)
The literature supports mixed conclusions regarding whether Bitcoin is a safe haven asset. We review the literature that examines Bitcoin as a safe haven asset, grouping studies into three categories: Bitcoin as a hedge (diversifier) against risky assets, as a safe haven asset by definition, and as a safe haven asset by its properties. Regarding Bitcoin as a hedge, we adhere to Ratner and Chiu's [29] definition of a hedge, namely an instrument that is uncorrelated with another asset on average. A strong hedge denotes an instrument that is negatively correlated with the other asset on average. Dyhrberg [30] employs the GARCH method and finds that Bitcoin serves as a strong hedge against stocks and the U.S. dollar; these properties of Bitcoin are similar to those of gold. The findings suggest that investors should include Bitcoin in their portfolios to hedge market risk. Baur et al. [31] replicate Dyhrberg's [30] research and reach contrasting results. Baur et al. [32] use an alternative model, the asymmetric GARCH model of Glosten et al. [33], which does not include exogenous variables in the variance equation. Their findings suggest that Bitcoin prices are uncorrelated with the value of the U.S. dollar. The return volatilities of Bitcoin differ from those of currency FX, equity indexes, gold futures, and gold spot. Because gold is strongly negatively related to the value of the U.S. dollar, Bitcoin does not serve as a strong hedge against U.S. dollars.
Equity indexes such as the Financial Times Stock Exchange 100 return and the MSCI World return are all negatively correlated with the change in the value of the U.S. dollar. Under the definition of hedging used here, Bitcoin can be a hedge but not a strong hedge against U.S. dollars. On the other hand, Brière et al. [34] argue that Bitcoin has remarkably low correlation with stocks, bonds, currencies, commodities, hedge funds, and real estate. If an investor incorporates Bitcoin into a well-diversified portfolio, alpha returns improve. However, this result should be interpreted with a degree of caution because the time series in their research is relatively short, covering 2010 to 2013 only.

The hedging effects of Bitcoin against stocks are associated with distinct fundamentals: Bitcoin and stocks have different underlying fundamentals. Accounting practices and macroeconomic fundamentals underlie the price movements of stocks, but they do not influence the movements of Bitcoin. Standard economic theories of currency and asset pricing do not explain variations in the Bitcoin price. The literature instead reports determinants of Bitcoin prices from unconventional variables, including Internet searches, social network activity, and microblogging.

However, hedging is insufficient for an asset to be a safe haven. Most studies of hedging are based on the statistical properties of Bitcoin returns, whereas safe haven assets are uncorrelated or negatively correlated with risky assets specifically in times of market stress. Does Bitcoin serve as a safe haven asset during financial turmoil? Research has found mixed results. We first review reports that support Bitcoin as a safe haven asset, and then we examine the literature that opposes this view.

Bouri et al. [35] argue that Bitcoin is a safe haven asset in the Asia-Pacific region, China, and Japan because its prices are negatively correlated with stock prices during economic crises. The researchers proxy economic crises with heightened implied stock volatility (VIX). Bitcoin exhibits resilience both in normal times and during financial turmoil, so the researchers conclude that Bitcoin is a safe haven asset in these regions.3

The literature investigates the effects of economic uncertainty on Bitcoin pricing. Wang et al. [36] measure the effects of the VIX and U.S. Economic Policy Uncertainty (EPU) [37] on Bitcoin pricing. The risk spillovers from VIX and EPU shocks are negligible for Bitcoin in most conditions, as shown in a multivariate quantile model and Granger causality tests. Bitcoin acts both as a hedge and as a safe haven asset under EPU shocks. Selmi et al. [38] claim to be the first to examine the hedging and safe haven properties of Bitcoin, oil prices, and gold together. They also consider different uncertainty measures, including Monetary Policy Uncertainty (MPU), financial uncertainty, and political uncertainty. Under a comprehensive measure of uncertainty, Bitcoin serves not only as a hedge against oil price movements but also as a safe haven asset in conditions of financial turmoil.
3 The researchers claim that Bitcoin is not a safe haven asset against U.S. stocks because in bearish stock markets, it is positively and statistically significantly correlated with them.
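The correlation-based definitions above (hedge, strong hedge, safe haven) translate directly into code. The following sketch is our own illustration with purely illustrative thresholds; it is not taken from the cited studies.

```python
import pandas as pd

def classify(asset: pd.Series, other: pd.Series, stress: pd.Series) -> dict:
    """Classify `asset` against `other`: a hedge is uncorrelated on
    average, a strong hedge is negatively correlated on average, and a
    safe haven is uncorrelated or negatively correlated during stress.

    asset, other : return series aligned on dates
    stress       : boolean series marking stress dates (e.g., high VIX)
    """
    full_corr = asset.corr(other)
    crisis_corr = asset[stress].corr(other[stress])
    return {
        "hedge": abs(full_corr) < 0.1,     # illustrative threshold
        "strong_hedge": full_corr < -0.1,  # illustrative threshold
        "safe_haven": crisis_corr <= 0.0,
    }
```

The GARCH studies cited above ask the same question with conditional rather than unconditional correlations, but the classification logic is the same.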
Selmi et al. [38] compare the gold, oil, and Bitcoin markets using a model that captures the interdependence between them. Bitcoin acts as a safe haven asset when the investor incorporates Bitcoin and oil into a portfolio, but not gold and oil. This study is comprehensive because it incorporates different traditional types of safe haven assets, uncertainties, and portfolios (a risk-minimizing portfolio, an equally weighted portfolio, and a hedge strategy portfolio), and it is robust to different methodologies (CoVaR and quantile-on-quantile regression).

Economic uncertainty has large negative effects on the connectedness of six cryptocurrencies, namely Bitcoin, Ripple, Stellar, Litecoin, Monero, and Dash [39]; therefore, cryptocurrencies can serve as a hedge against economic uncertainty. Larger U.S. EPU is positively associated with high demand for cryptocurrencies, which in turn is negatively associated with connectedness. The VIX, crude oil volatility (OVX), and gold market volatility (GVZ) all decrease connectedness in cryptocurrencies, again exhibiting the safe haven property of Bitcoin. Demir et al. [40] find that EPU and Bitcoin returns are negatively associated; thus, Bitcoin serves as a hedge in conditions of uncertainty. They use a Bayesian graphical structural vector autoregressive model, ordinary least squares, and quantile-on-quantile regression. Incorporating Bitcoin into a U.S. stock portfolio lowers systematic and market risk. Bitcoin hedges against global geopolitical risks, and it is negatively associated with trade policy uncertainty during periods of regime change.

Because Nakamoto [3] proposed Bitcoin just after the inception of the GFC, it is innately difficult to measure the behavior of Bitcoin prices during the GFC. Bitcoin first came to public attention after its soaring returns in late 2013, so there are few past data that can be used to test whether Bitcoin was a safe haven asset during the GFC. However, the COVID-19 pandemic crisis of 2020 is a good opportunity to supplement our understanding: this bearish market provides a timely test of the safe haven properties of Bitcoin. The literature has begun to examine the movement of Bitcoin prices during the pandemic. Bitcoin clearly appears not to be a safe haven asset because its price decreased when the S&P 500 index dropped during the COVID-19 pandemic [41, 42]. In fact, its drop was even deeper than that of the S&P 500 index, confirming that judgment. The price movement of Bitcoin was neither uncorrelated nor negatively correlated with the price of stocks during this financial turmoil. Gold and soybean commodity futures, by contrast, remained safe haven assets during this turmoil, while Bitcoin was shown not to be one during the COVID-19 pandemic [43].

An important property of a safe haven asset is high liquidity, because investors pursue safe haven assets in crises: they seek to sell risky securities and quickly buy liquid assets. Against this background, cash and government bonds attracted more demand during the GFC. Smales [44] finds that Bitcoin is illiquid compared to cash, government bonds, and gold; hence, Bitcoin does not meet the criteria to become a safe haven asset. Smales proxies liquidity with bid-ask spreads, transaction fees, and differences in trading volume. The bid-ask spreads for Bitcoin are higher than those for stocks, bonds, gold, and stock indexes. The average daily volume of Bitcoin
transactions is significantly lower than those of the comparison assets. Transaction costs are higher for Bitcoin than for other assets, and the transaction fees of Bitcoin cannot be anticipated. These findings suggest that Bitcoin is even less liquid than risky assets. The liquidity of Bitcoin is highly dependent on the exchange, with Bitfinex hosting the most liquid Bitcoin transactions. On average, stocks are more liquid than Bitcoin. Illiquidity makes Bitcoin useless as a hedge because a successful hedge instrument requires that the investor can easily buy or sell it. Because investors tend to rush to safe haven assets during crises, Bitcoin's illiquidity is an undesirable property.
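As a concrete example of the liquidity proxy just mentioned, the relative quoted bid-ask spread is straightforward to compute. The quotes below are hypothetical, made up for illustration, and are not from Smales [44].

```python
def relative_spread(bid: float, ask: float) -> float:
    """Quoted relative bid-ask spread: (ask - bid) / midpoint.
    A wider spread indicates a less liquid market."""
    mid = (bid + ask) / 2.0
    return (ask - bid) / mid

# Hypothetical quotes for illustration only.
btc_spread = relative_spread(bid=11_950.0, ask=12_050.0)  # ~0.83%
etf_spread = relative_spread(bid=335.10, ask=335.12)      # ~0.006%
print(f"BTC: {btc_spread:.4%}, equity ETF: {etf_spread:.4%}")
```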
3 Blockchain in the Financial Industry

Enhancing performance against risk is the ultimate goal for anyone working in the financial industry. Blockchain has improved performance in this respect by lowering costs and improving efficiency. The distributed character of records in blockchain reduces operational risk. Because it is not subject to interference from intermediaries, less manual work is involved between entities. Proof-of-work facilitates automation in blockchain technology.

Blockchain decreases the cost of trust. The current financial system imposes high costs on customers in exchange for ensuring trust between institutions and users. Blockchain technology can lower costs, particularly in back-office or post-trade functions. Centralized intermediaries entail concentrated risks and extract significant economic rents. Decentralized blockchain improves the administration of financial services in payments, digital identity, primary securities, derivatives processing, post-trade reporting, and trade finance. In this section, we review the literature on the application of blockchain to banking, corporate governance, and real estate.
3.1 Blockchain and the Banking Industry

First, we examine the application of blockchain to banking. Blockchain has the potential to revolutionize how payments, credit information systems, and financial transactions are handled in banking. Digital finance enhances customer experience, efficiency, cost, and safety. It has the potential to bring technological breakthroughs to the financial industry in four areas in particular: infrastructure, platform, channel, and scenario. Because the Internet is now ubiquitous in banking services, commercial banks must inevitably rely on new technological growth. Blockchain enhances economic efficiency, operational efficiency, and efficient service in the banking industry.
Blockchain disrupts the banking industry by facilitating smart contracts and automating banking ledgers. It is now quite common for tech firms and social networking companies to provide payment services, for example, Apple Pay, PayPal, Android Pay, and Kakao Pay. Internet banking now allows customers to perform transactions themselves without needing to visit a bank. Blockchain reduces monopoly power and lowers barriers to entry. Catalini and Gans [45] claim that blockchain lowers two main costs: the cost of verification and the cost of networking. The cost of verification represents the expense of verifying the counterparty, including information on past transactions and the current ownership of the cryptocurrency. The cost of networking represents the ability to bootstrap a marketplace in the absence of any centralized intermediary.

Blockchain has noteworthy impacts on alleviating information asymmetry in the banking industry. In traditional financial markets, moral hazard and adverse selection are well known to destroy market equilibria. Blockchain alleviates information asymmetry and improves welfare and consumer surplus. This surplus is earned through a shared, transparent ledger and information distributed through a consensus. Cong and He [46] build an economic mechanism for consensus generation: the quality of consensus is optimized as each contracted record-keeper optimizes its own utility. As a result, competition is enhanced and barriers to entry are lowered. Blockchain can reach its full potential only if the entire banking system adopts it; a single bank acting alone does not gain any competitive advantage by adopting it. Blockchain provides infrastructure for sharing information in a secure way, automating registration processes, and detecting fraudulent identities. The cross-chain interoperability of blockchain is important for facilitating its use as a medium of exchange in the banking industry.
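To illustrate what facilitating smart contracts means in practice, here is a deliberately simplified, purely conceptual Python sketch of contract logic that settles automatically once a condition is met. Real smart contracts run on-chain (for example, in Solidity on Ethereum); this toy class is our own illustration, not an actual deployment.

```python
from dataclasses import dataclass

@dataclass
class EscrowContract:
    """Toy escrow: funds release automatically when the agreed
    condition is met, with no intermediary bank in the loop."""
    buyer: str
    seller: str
    amount: float
    delivered: bool = False
    settled: bool = False

    def confirm_delivery(self) -> None:
        self.delivered = True

    def settle(self) -> str:
        if self.settled:
            raise RuntimeError("already settled")
        self.settled = True
        # Contract logic, not a clerk, decides where the funds go.
        recipient = self.seller if self.delivered else self.buyer
        return f"release {self.amount} to {recipient}"

contract = EscrowContract(buyer="Alice", seller="Bob", amount=100.0)
contract.confirm_delivery()
print(contract.settle())  # -> "release 100.0 to Bob"
```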
3.1.1 Central Bank Digital Currencies: Pros and Cons
The GFC of 2008 exposed the limitations of conventional monetary policy and the vulnerability of the financial system. Nakamoto [3] proposed Bitcoin during this time. Bitcoin's blockchain technology could affect the entire financial industry, including the banking sector. The outsize attention paid to blockchain in the banking context prompts us to ask what role central banks can play in the use of blockchain. Should a central bank issue digital currency?

The need for adequate liquidity has caught the attention of central banks, who are considering issuing CBDC. Most central banks in developed economies target a 2% inflation level, but this goal is not easy to achieve using conventional monetary policy. CBDC enjoys three features that can enhance monetary policy: a nominal anchor, tools and operations, and policy strategy. CBDC is a credible nominal anchor, built on publicly posted prices. It injects liquidity and facilitates the central bank's role as lender of last resort. The researchers propose an analogy to the Taylor rule, replacing the nominal interest rate with a CBDC interest rate (see the sketch at the end of this subsection). Fernandez-Villaverde et al. [47] claim that during
Fernández-Villaverde et al. [47] claim that during normal economic times, CBDC helps capital to be allocated efficiently through private financial intermediation. During crises, central banks are more stable than investment or commercial banks because central banks have rigid contracts. Williamson [48] shows that CBDC can reduce crimes related to physical money. Moreover, it allows interest payments on central bank liabilities. CBDC economizes on scarce safe collateral. If banks have market power in the deposit market, CBDC stimulates competition. Meaning et al. [49] design an economy in which they assume that CBDC is universally accessible. They develop monetary policies using CBDC that share equivalent processes with conventional monetary policies. First, the central bank injects CBDC, thus determining the interest rate paid on its balances. Next, variations in the CBDC interest rate are transmitted to the interest rates on other assets. Digital currency improves payment efficiency, although privately issued digital currencies do not achieve these efficiencies; CBDC, by contrast, can coexist with bank notes. It increases financial stability and transparency if the central bank uses it to implement a consistent monetary policy. If economists effectively analyze the data attached to CBDC, they can measure network externalities. No type of money is perfect, so there is a need for different forms of it. Cash is a rudimentary type of money: it is highly liquid, does not suffer from counterparty risk, and allows anonymity. A user of cash does not require financial intermediaries to perform transactions. CBDC is well qualified as a substitute for cash, and the two have similar features. Both are denominated in the sovereign currency, are legal tender, are easily convertible, bear no interest, carry no central bank fees, offer high liquidity and 24/7 accessibility, are anonymous, operate in a bilateral network, and are irrevocable. CBDC injects liquidity into an economy in times of crisis. CBDC preserves the advantages of cash while shedding the rudimentary features of cash payment. The literature also reviews pilot launches of CBDC in several countries. Bergara and Ponce [50] review the e-Peso launched by the Banco Central del Uruguay between November 2017 and April 2018. This test demonstrates that CBDC enhances monetary policy and lowers barriers to entry. Juks [51] finds that the preliminary launch of the e-krona does not harm the financial stability of the Swedish banking system. Brunnermeier and Niepelt [52] review the Chicago Plan for the use of CBDC and find that its pass-through funding does not harm financial stability. Following the well-known idea of the impossible trinity, Bjerg [53] proposes the following policy trilemma in relation to CBDC,4 in which the free exchange rate, monetary autonomy, and free capital flow in conventional currency are, respectively, matched to parity, monetary sovereignty, and free convertibility in CBDC. The trilemma here is that parity, monetary sovereignty, and free convertibility cannot be achieved at the same time. CBDC also affects monopoly banking: it increases financial inclusion and reduces monopoly profits. During the initial phase of CBDC implementation, a bank panic arises.

4 The impossible trinity (also known as the impossible trilemma) is a concept regarding the value of a conventional currency in international economics. The trilemma is that it is impossible to simultaneously achieve three goals: a fixed foreign exchange rate, free movement of capital, and independent monetary policy.
In this period, CBDC decreases the supply of private credit in commercial banks, raises the nominal interest rate, and lowers the reserve-deposit ratio in commercial banks. However, the researchers claim that once CBDC becomes available to all commercial banks, the increased quantity of CBDC amplifies the supply of private credit and lowers the nominal interest rate. Despite the potential advantages of CBDC, central banks do not show any resolute intention of issuing it on a full scale. Although CBDC has the potential to replace cash, current blockchain technology is not yet suitable for it because blockchain protocols cannot process large volumes of transfers or payments. Because of this, CBDC is not likely to have significant implications for central bank seigniorage. Davoodalhosseini [54] finds that welfare gains are not high where an economy has both cash and CBDC. Without additional regulations, CBDC is vulnerable to criminal activity; if disclosure restrictions are imposed, then CBDC loses the advantages of anonymity. Lagarde [4] argues that fintech can never replace conventional central banking, although she also predicts that controversies over digital currency will be resolved in time, as the questions of volatility and energy consumption relate mostly to technological questions. More important to Lagarde [4] is storytelling. Communication, suggestions, and critiques among peers stimulate the development and expression of diverse opinions. This exchange is crucial to good policymaking. Central banks should clearly present their price-level target to the public in plain English and form public expectations for future monetary policies, earning the trust of the public. She doubts that AI can communicate with the public in plain English and deliver monetary policies or target a price level.
3.2 Blockchain and Corporate Governance

We review the application of blockchain to corporate governance. Blockchain technology embedded in corporate governance bears on internal and external actors through corporate ownership rights, decision authority, and board structure. Yermack [55] presents the impact of blockchain on corporate governance. It enhances the balance of power among managers, institutional investors, small shareholders, and auditors. It lowers costs and increases liquidity, transparency, and the accuracy of record-keeping. Key features of blockchain are transparency and irreversibility. Because it enhances transparency in ownership records, it helps shareholders observe share transfers in real time with lower trading fees. Blockchain's irreversibility benefits managerial ownership because manipulation of stock compensation becomes much more difficult. Corporate voting becomes accurate and transparent. Yermack [55] notes that if blockchain became dominant in corporate governance, its vulnerability to sabotage or to the exploitation of collective action would need to be overcome. Blockchain technology alleviates the agency problem. Conventional internal and external monitoring mechanisms themselves give rise to agency costs in corporate governance. Blockchain technology reduces the agency costs that stem from contracting with agents in a
firm. It also provides a decentralized network that removes the need for internal and external monitoring, which inevitably yields agency problems. Blockchain also sustainably lowers shareholder voting costs and organizational costs in corporate governance. Annual general meetings require much less manual work. Panisi et al. [56] argue that blockchain can bring about or encourage shareholder democracy because share ownership can become transparent and trackable. Leonhard [57] claims that holders of cryptocurrencies can replace traditional shareholders where cryptocurrency holders trade through decentralized autonomous organizations (DAOs). Here, corporations would use smart contracts that could operate without government intervention thanks to the blockchain protocol that a DAO uses. Blockchain can foster universality. A public blockchain system is an optimal form for a nationwide industry. Blockchain's greatest opportunity is that it provides a fundamentally new type of database technology, which can be distributed across an entire country. Moreover, it makes online voting feasible. Shared distributed ledgers enhance transparency; the irreversibility of records improves credibility; a peer-to-peer network fosters responsibility; and smart contracts are positively associated with fairness.
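To illustrate how a hash-chained ledger yields the transparent, irreversible voting records described above, here is a toy sketch. It is a deliberately minimal, non-distributed illustration of the append-only data structure, not a description of any production voting system.

import hashlib
import json
import time

class VoteLedger:
    """Toy append-only ledger: each record embeds the hash of its
    predecessor, so altering any past vote invalidates the chain."""

    def __init__(self):
        self.chain = []

    def _hash(self, body):
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def cast_vote(self, shareholder_id, proposal, choice):
        body = {
            "prev": self.chain[-1]["hash"] if self.chain else "0" * 64,
            "shareholder": shareholder_id,  # a pseudonymous ID in practice
            "proposal": proposal,
            "choice": choice,
            "time": time.time(),
        }
        self.chain.append({**body, "hash": self._hash(body)})

    def verify(self):
        # Recompute every hash; any tampering breaks the linkage
        for i, rec in enumerate(self.chain):
            body = {k: v for k, v in rec.items() if k != "hash"}
            if rec["hash"] != self._hash(body):
                return False
            if i > 0 and rec["prev"] != self.chain[i - 1]["hash"]:
                return False
        return True

ledger = VoteLedger()
ledger.cast_vote("SH-001", "elect director A", "for")
ledger.cast_vote("SH-002", "elect director A", "against")
print(ledger.verify())  # True; flipping any stored vote makes this False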
3.2.1 Blockchain Concerns for the Financial Sector
There have been concerns regarding the implementation of blockchain technology in the financial sector. In the literature, the 51% attack is highlighted: if a single entity controls a majority of computing power, the entire blockchain can be maliciously rewritten. Such an entity can modify transaction data and halt verification and mining. The 51% attack can also lead to double spending, in which a few miners force the verification of a fraudulent transaction and insert a fraudulent block into the blockchain. If this occurs, other entities have no recourse because consensus has already been reached, with 51% of miners agreeing. Bitcoin's technical burden increases the operating costs of blockchain to a possibly fatal level because it requires continuous consumption of electricity. The Bitcoin Sustainability Report (2018) shows that the energy consumption of Bitcoin is increasing.5 The consumption for one transaction in a ledger is on par with that of a U.S. household for 13 days. The foremost virtue of the blockchain is honesty. Miners are expected to be honest and are rewarded for it. However, dishonest strategies are increasing. A game-theoretic framework shows that the current Bitcoin system is vulnerable to subversive strategies and to mining cartel attacks. Blockchain users are experiencing threats and malware. Bitcoin is not immune from questions of pseudonymity, privacy, and anonymity [58]. If pseudonymity in a public ledger leads to double spending, blockchain cannot be used for transactions of cryptocurrency.

5 https://digiconomist.net/bitcoin-sustainability-report-01-2018.
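To quantify the double-spending threat discussed above, the sketch below reproduces the attacker catch-up probability from Nakamoto's white paper [3]; the hash-power shares and the six-block confirmation depth are illustrative parameter choices.

from math import exp, factorial

def attacker_success(q, z):
    """Probability that an attacker with share q of total hash power
    ever catches up from z blocks behind (Nakamoto, 2008, Sect. 11)."""
    p = 1.0 - q
    if q >= p:
        return 1.0  # a majority attacker always succeeds eventually
    lam = z * q / p  # expected attacker progress while z honest blocks arrive
    prob = 1.0
    for k in range(z + 1):
        poisson = exp(-lam) * lam ** k / factorial(k)
        prob -= poisson * (1.0 - (q / p) ** (z - k))
    return prob

for q in (0.10, 0.30, 0.51):
    print(f"q = {q:.2f}: success probability = {attacker_success(q, z=6):.6f}")
# With q = 0.51 the probability is 1.0: this is exactly the 51% attack.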
A substantial share of cryptocurrency transactions is associated with illegal activities [59]. Illegal users use cryptocurrencies to fake their identities and trade in illegal goods or services. Criminals often repeatedly order small amounts of cryptocurrencies because large transactions are unavailable on most blockchain protocols. These illegal transactions are associated with rapidly growing speculative investment sectors in Bitcoin and the emergence of alternative cryptocurrencies that allow the concealment of identity. While sustainable ownership in conventional financial systems has previously been studied [60, 61], the recent literature studies the sustainability of cryptocurrency and blockchain technology. Blockchain has only a limited ability to preserve identity and transaction privacy [62]. Privacy of identity entails that certain information, such as real identities and past transactions, should not be discoverable. Transaction privacy entails that the contents of a transaction should be accessible only to the agents who placed the transaction orders, not to the public. However, blockchain is vulnerable to replay attacks, impersonation attacks, and Sybil attacks, which are major weaknesses in preserving identity privacy [63]. Blockchain also suffers from eclipse attacks, transaction malleability, and timejacking, showing a limited ability to preserve transaction privacy.
4 Artificial Intelligence and Big Data in Finance

There is no unified definition of AI, but there are different methods of implementing it, including machine learning (with neural networks at the forefront), random forests, and classification and regression trees. As more and more data become relevant, techniques can be expected to evolve. Newer AI techniques significantly affect companies that are developing prediction methods. Increases in the size of data are insufficient to enhance prediction; what is more important is preprocessing the data, improving computation algorithms, and minimizing errors [64]. Automated teller machines are among the earliest uses of IT in banking, beginning in the 1960s. Online payments and transfers are simply an early stage of the fintech era. AI has changed both demand and supply in the financial industry. On the demand side, geographical borders are fading. Social networks are spreading massive amounts of information, regardless of its accuracy. In this context, people often act irrationally, showing herd behavior. On the supply side, IT-based systems participate in lending, payments, and deposits. Sironi [65] claims that fintech launched a new era of banking democratization, in which investors become price-makers and banks become price-takers. Fintech lessens the information asymmetry between economic agents and managerial insiders. It is consumer friendly, providing an accessible interface, and it is mostly used on mobile phones. Will fintech reduce costs and heighten profits for existing financial firms? Will fintech start-ups instead disrupt the conventional financial industry? Fintech has a large and positive effect on the value that accrues to innovators. However,
market leaders survive only if they invest heavily in innovation and research and development.
4.1 Fintech and the Banking Industry

Because banks were historically resilient to technological disruption, at least before the GFC, they often appear passive in adopting innovations and technological breakthroughs. This lack of innovation motivates tech firms from Silicon Valley to enter the financial sector. Tech firms launch fintech platforms at relatively low cost thanks to open-source software and cloud services, such as the TensorFlow library and Google Cloud. Fintech firms can satisfy niche demand related to specific groups of customers. Banks are often less inclined to innovate and are slow to respond. A major reason for this is the existence of banking regulations. Because fintech firms are less regulated, they are showing large increases in activities such as robotic shipping, AI-based start-ups, automation, and patenting. Fintech firms are forcing banks to reconsider their own competence. Jakšič and Marinč [66] urge banks to realize that they "have no time for complacency." Traditional banks, they write, should increase their competitiveness against fintech firms by standardizing back-office functions, investing in business-to-consumer services, and controlling the risks associated with financial innovations. Fintech can bring structural change to the financial system. Philippon [1] argues that current banking regulations can never deliver deep structural change. A focus on regulations inevitably leads to controversies regarding the level playing field and leverage. The growth of finance is attributed to efficient capital allocation. Financial incomes do not grow with per-capita GDP. Costs, including asset management fees, remain high. There is a dichotomy between the top-down regulation of the current banking system and the bottom-up regulation of fintech firms. Philippon [1] argues that fintech can resolve issues that conventional banking never will, such as market entry and the creation of a level playing field, leverage controversies, and consumer protection. Regulations will be more effective if they are put into place early, when the industry is young. Because fintech start-ups are not held back by pre-existing systems, they have the chance to build the right system from the start. The ability to conduct customer relationship management is a comparative advantage of conventional banks relative to fintech firms. This strategy incorporates the brand value, loyalty, voice-to-voice consulting, and work experience of the employees [67]. Although crowdfunding provides a lending platform, it is based on big data rather than on long-term relationships. Banks should seek to earn and maintain customer trust if they are to compete against fintech firms. Stulz [68] argues that regulation is both a blessing and a curse for banks. It forms a barrier to entry, so it protects pre-existing banks from threats to their profits. Most banks must follow capital requirements even though most transactions, including
repossession and payment, do not necessarily require capital if fintech firms conduct them. The curse of regulations is that they make banks more rigid, allowing little innovation. They are also costly. Fintech firms are less regulated than banks, and they enjoy the ability to perform transactions in floating-value accounts. This float is a major source of profits for fintech firms. Moreover, these firms are funded with more abundant equity than existing banks.
4.1.1 Performance of AI Techniques in the Banking Industry
Data envelopment analysis (DEA) is a data-oriented approach that evaluates financial performance. It converts multiple inputs into multiple outputs with the use of decision-making units. DEA evaluates the efficiency of banks based on their technical and allocative efficiencies (a standard formulation is given at the end of this subsection). There is important research on the effects of AI on the banking industry using country-level data. AI enhances auditing with planning, evaluation, analysis, forming an opinion, reporting, and internal auditing in Nigeria [69]. Visual recognition enhances the identification of subjects, thus facilitating inventory checks. Vives [70] claims that residents of African countries have better access to a mobile phone than to banking, so mobile payment systems and P2P loans matter more there than the conventional banking that dominates developed financial markets. Vives emphasizes that Africa should not fall behind developed economies in learning and implementing AI technologies. Lee [71] uses Korean data on banking from questionnaires. He finds that information system risk management improves due to AI. Artificial neural networks enhance loan decisions in commercial banks in Jordan [72]. Commercial banks in Japan cannot benefit from technological innovations because the high level of non-performing loans in that country keeps them inefficient [73]. Likewise, restructuring is also blocked for current segments of regional banks in that country. Avkiran [74], using the dynamic network DEA approach, finds that in China, foreign commercial banks are more efficient than domestic ones, after considering divisional interactions and interest-bearing and non-interest-bearing operations. There are strict licensing regulations for fintech start-ups in developing economies. Saksonova and Kuzmina-Merlino [75] argue that fintech innovation has not yet begun in Latvia, as most respondents in their study do not know what fintech services are. Fintech firms should improve their marketing to raise public awareness. Using experimental studies, Lee et al. [76] find that mobile banking in Bangladesh has increased urban-to-rural remittances by 26%, increased rural consumption by 7.5%, and decreased extreme poverty among both rural households and urban migrants.
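For reference, the standard formulation promised above is the input-oriented CCR ratio model; the notation is the textbook form, not that of any particular study cited here. Bank (decision-making unit) 0, with inputs x_{i0} and outputs y_{r0}, solves

\max_{u,v} \; \theta_0 = \frac{\sum_r u_r y_{r0}}{\sum_i v_i x_{i0}}
\quad \text{subject to} \quad
\frac{\sum_r u_r y_{rj}}{\sum_i v_i x_{ij}} \le 1 \;\; \text{for every unit } j, \qquad u_r \ge 0, \; v_i \ge 0,

where u and v are output and input weights chosen most favorably for unit 0. A bank is rated technically efficient when \theta_0 = 1; the allocative variant additionally prices the inputs.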
4.1.2 Sustainable Fintech or Disruptive Fintech
The literature examines sustainable fintech, in which the pre-existing financial industry protects itself by implementing technological services itself. Liu et al.
[77] find that the banking industry is benefiting from mobile apps. The researchers report that this consumer behavior produces a net benefit to the bank of 0.07 U.S. dollars per month for the average customer. The wide use of mobile banking fosters the use of other digital services and hence yields complementary effects between the bank and the technology. High-performing banks appear to invest heavily in IT [78]. IT investments complement firm strategy in relation to both strategic advantage and strategic necessity. Autonomous data management enhances efficiency across banks' roles: the customer-based front office, the operations-based back office, trading and portfolio management, and regulatory compliance. If banks were to implement autonomous AI software widely, their profitability would soar due to reduced labor costs. Bömer and Maxin [79] suggest that fintech firms show high performance when they collaborate with banks and that fintech firms produce new products in competition with banks. Fintech start-ups assist by analyzing big data to find financial ratios; however, it is the investment bank that determines mergers and acquisitions. Oshodin et al. [80] argue that Australian banks benefit from fintech firms: they learn technological knowledge, invest in start-ups, modernize their banking platforms, and actively monitor fintech firms. The researchers emphasize that banks should seek deep engagement with customers. AI reduces operational risk. This change can be beneficial, but it can also take over the roles of humans in the banking industry. One might think that AI affects manual jobs in back offices, but Crosman [81] disagrees, finding that if AI replaces workers, it will be in front-office jobs. Financial managers can be replaced by AI-based anti-money laundering strategies. Loan officers can be laid off due to AI-based smart contracts. Tellers, customer service representatives, loan interviewers, and clerks will find their places taken by chatbots and automated technology. For bankers to survive, he argues, they should make the transition to AI. Huang and Rust [82] report that AI is progressively replacing human labor, from lower to higher intelligence. AI replaces analytic tasks in particular. Because AI cannot replace those with empathetic skills, human bankers should enhance their soft skills. Do bank customers consider fintech firms safe places to entrust their wealth? The economy is always seeking riskless returns, which makes deposits the baseline for wealth management. Boot [83] finds that people consider banks the safest place to put their money even though many competing fintech firms exist. Most bank customers expect there to be physical banks nearby, so that they can receive advisory services face-to-face. However, the literature does not agree here. Jünger and Mietzner [84] find in survey data that 31% of their respondents are willing to place most of their wealth with fintech firms. In their German sample, households with decent financial education and a preference for financial transparency are more likely to place their wealth with fintech firms. In particular, 84% of millennials would prefer to obtain banking services from a tech firm such as Google or Apple [85].
4.1.3 Limitations to Implementing AI Techniques in the Banking Industry
There are several limitations to using AI in banking. The first is regulatory: regulation always lags innovation. For example, the European Union's General Data Protection Regulation restricts automated decision-making. Practitioners in the field do not seem to think that digital finance is mainstream, particularly due to binding bank regulations. For instance, Zalan and Toufaily [86] quote a senior executive at a commercial bank who said, "There is a law requiring customer passport must be scanned before giving a loan or setting up a new bank account." Banks are reluctant to adopt fintech innovations if doing so jeopardizes existing bank activities. Moreover, skilled IT teams are scarce. Even in large tech firms, talent is not balanced across regions. Financial institutions do not have sufficient internal expertise to fulfil customers' technological demands. The skills shortage is especially pronounced in blockchain technologies, which are typically built by Python or Ruby developers. In addition, bank IT systems have been constructed over decades in a long process of add-ons. Stulz [68] claims that integrating fintech innovation into banks is nearly impossible. Only mega banks can afford fintech innovations, but even they face the challenge that their legacy data cannot be mined using AI techniques. Another issue is security. Most financial firms face strong security concerns, so they closely monitor their programmers. Hacking cannot be ruled out. AI is vulnerable to technological challenges, security snags, and misbehaving models. Lui and Lamb [87] suggest that AI can discriminate by race and gender. As Lagarde [4] argues, AI cannot clearly deliver monetary policies to the public. Both the complexity of the innovation and the limited competence of personnel reduce the usefulness of AI innovation in the financial industry, as fintech innovations are composed of three elements: interoperability among third parties, investment in related assets, and innovative software (protocols, procedures) and hardware. AI techniques can also reduce bank customers' trust. Thus, human bankers can still play an important role in training machines.
4.2 AI and Lending Platforms

Lending was once thought of as an exclusive function of traditional banks. However, lending is no longer limited to traditional banking services. Online peer-to-peer (P2P) lending platforms are widely used, including Peerform, LendingClub, Upstart, Prosper, and Funding Circle. P2P lending platforms evaluate the creditworthiness of lenders and borrowers and address information asymmetries between them. AI-based platforms facilitate lending decisions by controlling lending management, identifying low and high credit risk, and detecting fraud. AI gathers data on borrower quality in credit applications. Machine learning algorithms create credit-assessment criteria that measure counterparty risk. Large amounts of data and data-driven decision processes can alleviate information asymmetry between economic agents and managerial insiders.
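As a minimal sketch of the machine-learning credit-assessment idea described above, the following trains a logistic regression on synthetic alternative-data features; the feature names, coefficients, and data are all illustrative assumptions rather than fields from any cited platform.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5_000
X = np.column_stack([
    rng.normal(0.8, 0.2, n),   # share of on-time rent/utility payments
    rng.normal(30, 10, n),     # average monthly card transactions
    rng.integers(0, 2, n),     # has a verified mobile account (0/1)
])
# Synthetic default indicator: worse payment history raises default odds
logit = -2.0 - 3.0 * (X[:, 0] - 0.8) + 0.01 * (30 - X[:, 1]) - 0.3 * X[:, 2]
y = rng.random(n) < 1 / (1 + np.exp(-logit))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))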
Fintech platforms improve the prediction of debt default. Jagtiani and Lemieux [88] find that fintech platforms make extensive use of alternative data in internal credit ratings. The rating grades adopted by the LendingClub platform provide information on credit spreads that cannot be found in conventional measures of credit risk. Alternative data incorporate information embedded in utility or rent payments, online records of deposit and withdrawal transactions, consumer insurance claims, credit card transactions, investment choices, online footprint, and personal information including occupation, level of education, and use of a mobile phone. The rating grades in LendingClub perform well in predicting loan default. Fintech lending platforms have made credit-assessment criteria more lenient than conventional banks could. Hence, customers who would have been classified as subprime under conventional criteria can obtain lower-priced credit via fintech platforms. Leong et al. [89] study AI-based fintech companies' offers of microloans to college students in China. Fintech firms can satisfy niche demand related to specific customer profiles: in this case, college students in need of microloans. These customers' online traces produce alternative data for credit scoring, which can redefine conventional creditworthiness, previously measured through prior credit transactions. Fintech lending services can serve two billion adults who are unqualified for traditional bank loans. Therefore, fintech services can increase the financial integration of various segments. Online fintech lenders are more creditworthy than shadow banks, but like shadow banking, they increase real GDP by efficiently allocating capital [90]. Hau et al. [91] find that China's automated online credit provider, the Ant Financial platform, retrieves alternative data from Alibaba's online trading platform Taobao. The use of these data provides opportunities for borrowers hitherto excluded from conventional banks. Fintech firms push the margins of credit lines upward for vendors previously given low credit scores. These properties of fintech services can integrate large shares of the market segment, such as remote cities with less credit supply and higher state-owned enterprise employment. The resulting platform has a comparative advantage over banks in data collection, among other areas. AI potentially benefits risk management by reducing credit scoring bias, improving the monitoring of market and systematic risk, identifying fraud in the financial market, and minimizing operational risks. Because risk management relies on quantitative credit scoring, statistical techniques are widely used. Risk quantification and causal identification are central concerns in risk management, and AI-based technology alleviates both. Azar and Dolatabad [92] propose a novel model with fuzzy cognitive maps and Bayesian belief networks to quantify operational risk. In addition to risk quantification, their model identifies causal mechanisms joining features of banks and operational risk. Dirick et al. [93] use survival analysis, including the accelerated failure time model and the Cox proportional hazards model, to evaluate credit scoring via time to default.
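A hedged sketch of that survival-analysis approach follows, using the open-source lifelines package rather than the authors' own code; the column names and synthetic data are illustrative assumptions.

import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 2_000
income = rng.normal(50, 15, n)          # borrower income (thousands)
utilization = rng.uniform(0, 1, n)      # credit-line utilization
# Higher utilization and lower income shorten the synthetic time to default
hazard = np.exp(0.02 * (50 - income) + 1.5 * utilization)
duration = rng.exponential(36 / hazard)  # latent months until default
observed = duration < 48                 # defaults seen in a 48-month window
df = pd.DataFrame({
    "duration": np.minimum(duration, 48),  # censor at the observation horizon
    "default": observed.astype(int),
    "income": income,
    "utilization": utilization,
})

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="default")
cph.print_summary()  # hazard ratios for income and utilization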
4.3 AI and Investing

De Spiegeleer et al. [94] use machine learning to price vanilla, American, and exotic options. They find that AI techniques greatly accelerate the computation involved in derivative pricing, hedging, and model fitting. Although AI techniques produce some loss of accuracy, the researchers state that this loss is within a reasonable scope and is acceptable for practitioners. AI techniques yield diversification in computation, particularly excelling in eliminating storage bottlenecks, detecting noise, and dealing with heterogeneous datasets. For practitioners, handling a huge dataset is useless if it has no business value. Enterprises should embrace new strategies, simulation techniques, and methods to handle huge datasets. Enterprises should construct automated algorithms that can collect and analyze data. Because most data that practitioners use are heterogeneous and imperfect, machines should learn to tolerate them.
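To illustrate the surrogate-pricing idea in [94], the sketch below trains a fast ML model on prices generated by a reference pricer; here the closed-form Black-Scholes formula stands in for an expensive pricer, and the hyperparameters and parameter ranges are illustrative assumptions.

import numpy as np
from scipy.stats import norm
from sklearn.ensemble import GradientBoostingRegressor

def bs_call(S, K, T, r, sigma):
    """Black-Scholes European call price (the 'slow' reference pricer)."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

rng = np.random.default_rng(2)
n = 20_000
S = rng.uniform(50, 150, n); K = rng.uniform(50, 150, n)
T = rng.uniform(0.1, 2.0, n); r = rng.uniform(0.0, 0.05, n)
sigma = rng.uniform(0.1, 0.5, n)
X = np.column_stack([S, K, T, r, sigma])
y = bs_call(S, K, T, r, sigma)

# The trained surrogate returns a price in microseconds per query
model = GradientBoostingRegressor(n_estimators=300, max_depth=4).fit(X, y)
test = np.array([[100, 100, 1.0, 0.02, 0.2]])
print("surrogate:", model.predict(test)[0],
      "exact:", bs_call(100, 100, 1.0, 0.02, 0.2))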
Robo advisors are automated investment advisory processes that are often embedded in interfaces. Robo advisors provide investment guidelines to investors based on target return, estimated risk, investor age and gender, risk aversion, and macroeconomic indicators. A robo advisor processes large amounts of data automatically. It has the potential to improve service, personalize recommendations to customers' tastes, reduce compliance risk, and decrease discrepancies between expected and actual risk. A common example of the widespread use of robo advisors is retirement savings. The important question is whether robo advisors can enhance investment performance. If an investor is pursuing long-term investment, then robo advisors can help enhance performance. Most robo advisors are not day traders, and thus they do not pay attention to market noise or small movements. Uhl and Rohner [84] argue that robo advisors offer superior performance in passive investing, active asset allocation, cost-efficient implementation, and the avoidance of behavioral biases. They show that robo advisors decrease management fees. Fein [95] finds that robo advisors exceed human performance once asset status, the risk aversion of individual investors, social relationships between investors, group-level risk aversion, and the combination of the behavior of individual investors with the investors' social relations are taken into account. Robo advisors can support portfolio management for both passive and active investing. Leading asset management companies have created robo advisors for exchange-traded funds. Such an advisor can reduce management fees, improve transparency, and improve liquidity. Importantly, robo advisors help automate dynamic asset allocation. The overlapping generations model, used in the literature, implies that risk aversion is relatively low during youth and increases as an investor ages. The future incomes of young investors are derived from their future wages in their middle years, while middle-aged investors save for their retirement, expecting no labor income at that time. The financial industry calls this life plan the glide path. During this life journey of asset allocation, robo advisors resist the temptation of emotional investing. Machines do not have emotions, which is their greatest strength over humans. While human emotions hinder rational decisions, AI-based investing mechanically invests for the long term, following a pre-specified investment strategy. Some papers are skeptical of the performance of robo advisors. When they first emerged, investors and the financial industry were enthusiastic. However, the adoption of robo advisors has been slow. They have not fulfilled the high expectations of transparency or the requirement for sophisticated engagement in investment. Banks wish to provide robo advisory services to small-income earners to reduce low-yield manual work. However, the majority of average-income or below-average-income earners want physical banks nearby, so that they can receive face-to-face advising services [84]. This mismatch prevents the use of robo advisors for most bank customers. Fein [95] suggests that robo advisors are not beneficial in the pension industry, as they do not provide personal advice and do not commit to the client's best interests. Robo advisors cannot learn fiduciary investing. Szeli [96], in an experiment on the performance of a humanized algorithm, a dehumanized algorithm, a humanized human, and a dehumanized human, finds that the humanized human shows the highest loss tolerance; the algorithmic advisors are more risk averse than the humanized advisors. Kaya [97] argues that there are not enough time-series data to test whether robo advisors yield high performance. The good historical performance of short-lived robo advisors is simply due to a bullish market and a fast recovery from a bearish market. The study is skeptical of the performance of robo advisors during a recession.
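To illustrate the glide-path logic described above, here is a stylized sketch. The "110 minus age" rule is a common industry heuristic used purely as an assumption; it is not a strategy drawn from any study cited here.

def glide_path_equity_share(age, rule_constant=110):
    """Target equity share that declines linearly with investor age."""
    return max(0.0, min(1.0, (rule_constant - age) / 100))

def rebalance(portfolio_value, age):
    """Split a portfolio between equities and bonds along the glide path."""
    w = glide_path_equity_share(age)
    return {"equities": w * portfolio_value, "bonds": (1 - w) * portfolio_value}

for age in (25, 45, 65):
    print(age, rebalance(100_000, age))
# A robo advisor applies such a pre-specified rule mechanically,
# insulating the allocation from emotional trading.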
5 Conclusions

This chapter introduces the series of technological innovations developed since the GFC of 2008. Fundamental hardware and global interconnectedness gave rise to innovations in big data and AI, which in turn revolutionized what data can yield and created the potential for superhuman efficiencies. This chapter also introduces the core concepts in blockchain, as well as its ability to lower barriers to geographical asset transfers. The rise of AI techniques and blockchain technologies has had a tremendous impact on current dynamics in the business and financial industries. Taking this trend into account, academic research has focused on the theories and applications of blockchain, AI, and big data. The number of articles on fintech has increased over the last few years. In this chapter, we provide systematic analyses of published papers on blockchain, AI, and big data, bearing on both theoretical designs and applications, with a focus on banking, corporate governance, real estate, and wealth management. We present the principles and designs of blockchain. Transactions are recorded in blocks, verified in peer-to-peer networks, and executed in order. Nodes in peer-to-peer networks reach consensus by using the technologies that we call mining and
proof-of-work. Because blockchain removes financial intermediaries by design, it is decentralized [13]. This decentralization has the potential to democratize industry dynamics. Blockchain enables the democratization of banking, creating a state in which investors are price-makers and banks are price-takers. It also enables shareholder democracy, in which agency problems are resolved, and enhances universality in the real estate industry, allowing new databases to be distributed across a country. Overall, blockchain technology has the potential to enhance transparency and credibility and to ensure the irreversibility of records. However, some scholars argue that blockchain is not fully decentralized, so its suggested advantages are unlikely to come to fruition. The current practice of blockchain relies on uneven computational power among different miners. As demonstrated by the unprecedented heights of the Bitcoin price in late 2013, blockchain can be attacked to manipulate prices. The most popular application of public blockchain is cryptocurrency. We review the unresolved controversies of cryptocurrency, including whether it is a currency or an asset. The standard currency pricing models, such as future cash flows, purchasing power parity, and Friedman's quantity theory of money, explain little of the pricing dynamics of cryptocurrency. Some novel approaches seek the determinants of cryptocurrency pricing in web services such as Google searches, Wikipedia queries, and microblogging. The literature also investigates the asset-like nature of cryptocurrency. Scholars apply textbook asset-pricing approaches to the returns of cryptocurrency. While the EMH explains the reflection of public information in cryptocurrency prices sufficiently well, the Fama–French three- and five-factor models explain only a limited share of expected cryptocurrency returns. However, Carhart's fourth factor, momentum, has high predictive power for cryptocurrency returns, and short-term momentum has higher predictive power than long-term momentum. Moreover, we review the properties of cryptocurrencies in three groups: hedges, safe haven assets by definition, and safe haven assets by properties. The literature examines the effects of different types of uncertainty (EPU, VIX, MPU, OVX, and GVZ) on the returns of different cryptocurrencies (Bitcoin, Ripple, Stellar, Litecoin, Monero, and Dash). Bitcoin hedges against oil prices, the U.S. dollar, and stocks around the globe, including in the U.S., Asia, the UK, and Europe. Hence, incorporating cryptocurrency into a well-diversified portfolio enhances alpha returns. However, the literature draws mixed conclusions: very recent evidence from the COVID-19 pandemic shows that Bitcoin is not a safe haven asset or a proper hedge in this bearish stock market. Another application of the blockchain protocol is CBDC. The literature demonstrates that CBDC has the potential to enhance monetary policies. Scholars modify versions of CBDC and measure the effects of pilot-launched CBDC on real economies. However, no central bank shows a firm intention of using CBDC on a full scale. This reluctance is associated with technological defects, because blockchain cannot process large volumes of transfers, and it remains vulnerable to criminal activities. Moreover, economists find that welfare gains are not high if both cash and CBDC exist in an economy.
We review a non-exhaustive list of applications of AI in banking, lending, and asset management, with a focus on performance and limitations. The literature supports the assertion that AI technologies can yield superhuman performance. Scholars warn that not only back-office functions but also front offices will be replaced, in both banking and asset management corporations. However, the literature does not agree on this. AI techniques do not seem to be in the mainstream of current banking or investment services, possibly due to the heavy regulation of banking, bank IT systems built from decades of add-ons, and the uneven availability of IT talent around the globe. Both the size and quality of data have increased. Big data has shifted financial markets over the past decade. For example, credit rating firms now use alternative data to incorporate customers who used to be classified as subprime borrowers. Big data enables the integration of segmented markets. However, big data will not please everyone. Only market leaders that invest heavily in R&D will have access to big data, while most other entities, which cannot afford the technological innovations, will fall behind. The application of AI to the business sector depends on user adoption, technological innovation, and the reactions of politicians and regulators. Therefore, future research should study how regulations regarding AI and blockchain affect the real economy. Moreover, few articles incorporate both theoretical frameworks and applications of AI. Hence, future studies that present the theoretical development of AI would be valuable. For example, we have not found any studies that present a game theory framework of how AI alleviates information asymmetry in corporate governance and lending services. We find some research claiming that AI techniques alleviate information asymmetry via lending platforms; however, most of these studies are case based. We suggest a more detailed approach, for example, comparing techniques of acquiring and managing data across platforms. Furthermore, most studies on robo advisors focus on performance. We hope to see research that compares the strategies of robo advisors in acquiring data, handling data, and making recommendations to investors. Regarding cryptocurrency, future research should focus on the heterogeneity of cryptocurrencies, depending on the platform of cryptocurrency exchange. The effects of different types of uncertainty have already been studied, but these effects can differ across exchanges and with the degree of financial integration across regions.

Acknowledgements Special thanks are due to Dhananjay Singh (Series Editor), Loyola D'Silva, Sudhany Karthick, Karthik Raj Selvaraj, and anonymous referees. An and Choi are grateful for the research support from Yonsei University and Ewha Womans University, respectively. This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2019S1A5A2A03053680). We thank Jaysang Ahn, Ethan Jaesuh Kim, Sabin Kim, SaMin Kim, Young Jin Kim, Hyun Jun Lee, and Jaehyun Rhee for their excellent research assistance. All errors are the authors' own.
References
1. Philippon T (2019) The FinTech opportunity. In: The disruptive impact of FinTech on retirement systems
2. Crosby M, Nachiappan, Pattanayak P, Verma S, Kalyanaraman V (2016) Blockchain technology: beyond Bitcoin. Berkeley Engineering
3. Nakamoto S (2008) Bitcoin: a peer-to-peer electronic cash system
4. Lagarde C (2018) Central banking and Fintech: a brave new world. Innovations: Technology, Governance, Globalization
5. Böhme R, Christin N, Edelman B, Moore T (2015) Bitcoin: economics, technology, and governance. J Econ Perspect
6. Yermack D (2015) Is Bitcoin a real currency? An economic appraisal. In: Handbook of digital currency: Bitcoin, innovation, financial instruments, and big data
7. Bashir I (2018) Mastering blockchain: distributed ledger technology, decentralization, and smart contracts explained (2nd edn)
8. Kosba A, Miller A, Shi E, Wen Z, Papamanthou C (2016) Hawk: the blockchain model of cryptography and privacy-preserving smart contracts. In: Proceedings—2016 IEEE symposium on security and privacy, SP 2016
9. Barenji AV, Guo H, Tian Z, Li Z, Wang WM, Huang GQ (2018) Blockchain-based cloud manufacturing: decentralization. In: Advances in transdisciplinary engineering
10. Chu S, Wang S (2018) The curses of blockchain decentralization
11. Biais B, Bisière C, Bouvard M, Casamatta C (2019) Blockchains, coordination, and forks. AEA Papers and Proceedings
12. Chen L, Cong L, Xiao Y (2019) A brief introduction to blockchain economics. SSRN Electron J
13. Lee J, Choi PMS (2020) Chain of antichains: an efficient and secure distributed ledger. In: Blockchain technology for smart cities. Springer, Singapore, pp 19–58
14. NXT Forum, DAG: a generalized blockchain. https://nxtforum.org/proof-of-stake-algorithm/dag-a-generalized-blockchain/
15. Gandal N, Hamrick JT, Moore T, Oberman T (2018) Price manipulation in the Bitcoin ecosystem. J Monetary Econ
16. Kristoufek L (2015) What are the main drivers of the Bitcoin price? Evidence from wavelet coherence analysis. PLoS ONE
17. Bouoiyour J, Selmi R (2017) The Bitcoin price formation: beyond the fundamental sources
18. Cachanosky N (2019) Can Bitcoin become money? The monetary rule problem. Aus Econ Papers 58(4):365–374
19. Corbet S, Lucey B, Urquhart A, Yarovaya L (2019) Cryptocurrencies as a financial asset: a systematic analysis. Int Rev Financ Anal
20. Abadi J, Brunnermeier M (2018) Blockchain economics. National Bureau of Economic Research
21. Makarov I, Schoar A (2019) Price discovery in cryptocurrency markets. AEA Papers Proc 109:97–99
22. Bartos J, Does Bitcoin follow the hypothesis of efficient market? http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.902.3664&rep=rep1&type=pdf
23. Su S, An YJ, Choi MS (2020) What explains China's GEM market? Evidence from the five factors. J Eurasian Stud 17(1):47–70
24. Liu Y, Tsyvinski A, Wu X (2019) Common risk factors in cryptocurrency. SSRN Electron J
25. Liu Y, Tsyvinski A (2018) Risks and returns of cryptocurrency. NBER working paper
26. Nguyen H, Liu B, Parikh NY (2020) Exploring the short-term momentum effect in the cryptocurrency market. Evol Institut Econ Rev 17(2):425–443
27. Kristoufek L (2013) BitCoin meets Google Trends and Wikipedia: quantifying the relationship between phenomena of the internet era. Sci Rep
28. Mai F, Bai Q, Shan J, Wang (Shane) X, Chiang R (2015) The impacts of social media on Bitcoin performance. ICIS 2015 proceedings
29. Ratner M, Chiu CC (2013) Hedging stock sector risk with credit default swaps. Int Rev Financ Anal
30. Dyhrberg AH (2016) Hedging capabilities of bitcoin. Is it the virtual gold? Financ Res Lett
31. Baur DG, Dimpfl T, Kuck K (2018) Bitcoin, gold and the US dollar—A replication and extension. Financ Res Lett
32. Capie F, Mills TC, Wood G (2005) Gold as a hedge against the dollar. J Int Financ Mark Inst Money
33. Glosten LR, Jagannathan R, Runkle DE (1993) On the relation between the expected value and the volatility of the nominal excess return on stocks. J Financ
34. Brière M, Oosterlinck K, Szafarz A (2015) Virtual currency, tangible return: portfolio diversification with Bitcoin. J Asset Manag 16(6):365–373
35. Bouri E, Molnár P, Azzi G, Roubaud D, Hagfors LI (2017) On the hedge and safe haven properties of Bitcoin: Is it really more than a diversifier? Financ Res Lett
36. Wang GJ, Xie C, Wen D, Zhao L (2019) When Bitcoin meets economic policy uncertainty (EPU): measuring risk spillover effect from EPU to Bitcoin. Financ Res Lett
37. Baker SR, Bloom N, Davis SJ (2016) Measuring economic policy uncertainty. Q J Econ
38. Selmi R, Mensi W, Hammoudeh S, Bouoiyour J (2018) Is Bitcoin a hedge, a safe haven or a diversifier for oil price movements? A comparison with gold. Energy Econ
39. Balli F, de Bruin A, Chowdhury MIH, Naeem MS (2020) Connectedness of cryptocurrencies and prevailing uncertainties. Appl Econ Lett
40. Demir E, Gozgor G, Lau CKM, Vigne SA (2018) Does economic policy uncertainty predict the Bitcoin returns? An empirical investigation. Financ Res Lett
41. Conlon T, McGee R (2020) Safe haven or risky hazard? Bitcoin during the Covid-19 bear market. Financ Res Lett
42. Ji Q, Zhang D, Zhao Y (2020) Searching for safe-haven assets during the COVID-19 pandemic. Int Rev Financ Anal
43. Chen C, Liu L, Zhao N (2020) Fear sentiment, uncertainty, and Bitcoin price dynamics: the case of COVID-19. Emerg Mark Financ Trade
44. Smales LA (2019) Bitcoin as a safe haven: Is it even worth considering? Financ Res Lett
45. Catalini C, Gans JS (2016) Some simple economics of the blockchain. SSRN Electron J
46. Cong LW, He Z (2019) Blockchain disruption and smart contracts. Rev Financ Stud
47. Fernández-Villaverde J, Sanches DR, Schilling L, Uhlig H (2020) Central bank digital currency: central banking for all? SSRN Electron J
48. Williamson S (2019) Central bank digital currency: welfare and policy implications
49. Meaning J, Dyson B, Barker J, Clayton E (2018) Broadening narrow money: monetary policy with a central bank digital currency. SSRN Electron J
50. Bergara M, Ponce J (2018) Central bank digital currency: the Uruguayan e-Peso case. In: Do we need central bank digital currency?
51. Juks R, When a central bank digital currency meets private money: the effects of an e-krona on banks. https://www.riksbank.se/globalassets/media/rapporter/pov/engelska/2018/economic-review-3-2018.pdf#page=79
52. Brunnermeier MK, Niepelt D (2019) On the equivalence of private and public money. J Monetary Econ
53. Bjerg O (2017) Designing new money—the policy trilemma of central bank digital currency. SSRN Electron J
54. Davoodalhosseini SM (2017) Central bank digital currency and monetary policy. SSRN Electron J
55. Yermack D (2017) Corporate governance and blockchains. Rev Financ
56. Panisi F, Buckley RP, Arner DW (2019) Blockchain and public companies: a revolution in share ownership transparency, proxy-voting and corporate governance? SSRN Electron J
57. Leonhard RD (2017) Corporate governance on Ethereum's blockchain. SSRN Electron J
58. Halpin H, Piekarska M (2017) Introduction to security and privacy on the blockchain. In: Proceedings—2nd IEEE European symposium on security and privacy workshops, EuroS and PW 2017
59. Foley S, Karlsen JR, Putnins TJ (2019) Sex, drugs, and Bitcoin: how much illegal activity is financed through cryptocurrencies? Rev Financ Stud
60. Sulkowski AJ (2018) Blockchain, business supply chains, sustainability, and law: the future of governance, legal frameworks, and lawyers? SSRN Electron J
61. Choi PMS, Choi JH, Chung CY, An YJ (2020) Corporate governance and capital structure: evidence from sustainable institutional ownership. Sustainability (Switzerland) 12(10):1–8
62. Feng Q, He D, Zeadally S, Khan MK, Kumar N (2019) A survey on privacy protection in blockchain system. J Netw Comput Appl
63. Dasgupta D, Shrein JM, Gupta KD (2019) A survey of blockchain from security perspective. J Bank Financ Technol
64. Sabitha Malli S, Vijayalakshmi S, Balaji V (2018) Real time big data analytics to derive actionable intelligence in enterprise applications
65. Sironi P (2016) FinTech innovation: from robo-advisors to goal based investing and gamification
66. Jakšič M, Marinč M (2019) Relationship banking and information technology: the role of artificial intelligence and FinTech. Risk Manag
67. Kotarba M (2016) New factors inducing changes in the retail banking customer relationship management (CRM) and their exploration by the FinTech industry. Foundations of Management
68. Stulz RM (2019) FinTech, bigtech, and the future of banks. J Appl Corporate Financ
69. Ukpong EG, Udoh II, Essien IT (2019) Artificial intelligence: opportunities, issues and applications in banking, accounting, and auditing in Nigeria. Asian J Econ Bus Account
70. Vives X (2017) The impact of Fintech on banking
71. Lee S (2015) Fintech and Korea's financial investment industry. Capital Market Opin
72. Eletter SF, Yaseen SG (2010) Applying neural networks for loan decisions in the Jordanian commercial banking system. Int J Comput Sci Netw Secur 10(1):209–214
73. Barros CP, Managi S, Matousek R (2012) The technical efficiency of the Japanese banks: non-radial directional performance measurement with undesirable output. Omega
74. Avkiran NK (2015) An illustration of dynamic network DEA in commercial banking including robustness tests. Omega (United Kingdom)
75. Saksonova S, Kuzmina-Merlino I (2017) Fintech as financial innovation—the possibilities and problems of implementation. Europ Res Stud J
76. Lee JN, Morduch J, Ravindran S, Shonchoy AS, Zaman H (2017) Poverty and migration in the digital age: experimental evidence on mobile banking in Bangladesh. Working paper
77. Liu J, Li B, Chen L, Hou M, Xiang F, Wang P (2018) A data storage method based on blockchain for decentralization DNS. In: Proceedings—2018 IEEE 3rd international conference on data science in cyberspace, DSC 2018
78. Goh KH, Kauffman RJ (2013) Firm strategy and the internet in U.S. commercial banking. J Manag Inf Syst
79. Bömer M, Maxin H (2018) Why fintechs cooperate with banks—evidence from Germany. Zeitschrift für die gesamte Versicherungswissenschaft
80. Oshodin O, Molla A, Karanasios S, Ong CE (2017) Is Fintech a disruption or a new eco-system? An exploratory investigation of banks' response to fintech in Australia. In: Proceedings of the 28th Australasian conference on information systems, ACIS 2017
81. Crosman P (2018) How artificial intelligence is reshaping jobs in banking. American Banker
82. Huang MH, Rust RT (2018) Artificial intelligence in service. J Serv Res
83. Boot AWA (2017) The future of banking: from scale & scope economies to Fintech. European Economy—Banks, Regulation and the Real Sector
84. Jünger M, Mietzner M (2020) Banking goes digital: the adoption of FinTech services by German households. Financ Res Lett
85. Cheatham B, Javanmardian K, Samandari H (2019) Confronting the risks of artificial intelligence. McKinsey Q 86. Zalan T, Toufaily E (2017) The promise of fintech in emerging markets: not as disruptive. Contemp Econ 87. Lui A, Lamb GW (2018) Artificial intelligence and augmented intelligence collaboration: regaining trust and confidence in the financial sector. Inf Commun Technol Law 88. Jagtiani J, Lemieux C (2019) The roles of alternative data and machine learning in fintech lending: evidence from the LendingClub consumer platform. Financ Manag 89. Leong C, Tan B, Xiao X, Tan FTC, Sun Y (2017) Nurturing a FinTech ecosystem: the case of a youth microloan startup in China. Int J Inf Manag 90. Li WY, An YJ, Choi MS (2019) Shadow banking and financial development: Evidence from China. Korean J Financ Manag 36(4):53–85 91. Hau H, Huang Y, Shan H, Sheng Z (2019) How FinTech enters China’s credit market. AEA papers and proceedings 92. Azar A, Mostafaee Dolatabad K (2019) A method for modelling operational risk with fuzzy cognitive maps and Bayesian belief networks. Expert Syst Appl 93. Dirick L, Claeskens G, Baesens B (2017) Time to default in credit scoring using survival analysis: a benchmark study. J Oper Res Soc 94. de Spiegeleer J, Madan DB, Reyners S, Schoutens W (2018) Machine learning for quantitative finance: fast derivative pricing, hedging and fitting. Quant Financ 95. Fein ML (2015) Robo-advisors: a closer look. SSRN Electron J 96. Szeli L (2020) UX in AI: trust in algorithm-based investment decisions. Junior Manag Sci 97. Kaya O (2017) Robo-advice—a true innovation in asset management. Deutsche Bank Research
Alternative Data, Big Data, and Applications to Finance

Ben Charoenwong and Alan Kwan
Abstract Financial technology has often been touted as revolutionary for financial services. The promise of financial technology can be ascribed to a handful of key ideas: cloud computing, smart contracts on the blockchain, machine learning/AI, and finally—big and alternative data. This chapter focuses on the last concept, big and alternative data, and unpacks the meaning behind this term as well as its applications. We explore applications to various domains such as quantitative trading, macroeconomic measurement, credit scoring, corporate social responsibility, and more. Keywords Financial econometrics · Nowcasting · Forecasting · Artificial intelligence · Machine learning · Analytics
1 Overview

Financial technology has often been touted as revolutionary for financial services. The promise of financial technology can be ascribed to a handful of key ideas: cloud computing, smart contracts on the blockchain, machine learning/AI, and finally, big and alternative data. This chapter focuses on the last concept, big and alternative data, and unpacks the meaning behind this term as well as its applications. Before defining alternative data, it is useful to understand what non-alternative datasets are. Our definition of a conventional, non-alternative dataset is one created for the purpose of formal reporting, either by mandate or on a voluntary basis, meant to measure an economic concept useful to investors. Conventional data are sources of information widely used and understood by practitioners, policymakers, and researchers. While conventional data may capture a specific economic concept well,
its measurement may be improved, as the data might suffer from a slow reporting lag, a lack of granularity, or imprecision. Oftentimes, big data and alternative data are used synonymously, although the terms merit distinction. In particular, big data are simply data that are larger along one or more of four dimensions, typically called the four V's: volume, velocity, variety, and veracity. Meanwhile, the definition of "alternative data" pertains to the nature of the data: alternative data are data not typically assembled for the purpose of financial analysis. In particular, alternative data have three key properties. First, alternative data should provide information about a fundamental and value-relevant economic construct. Second, alternative data should be reliable and not easily manipulated by interested parties. Examples of alternative data include geolocation foot traffic data, credit card transactions, Web site visits, and shipping container receipts. Finally, alternative data should have novelty or rarity. When data are adopted by the mainstream, the edge provided by such data may diminish. And so, what is alternative today may not be alternative in a few years. For example, with the introduction of the iPhone in 2007, stock analysts covering Apple used various creative means to estimate the number of iPhones being sold using packaging data, a type of alternative data. Soon after, Apple itself started reporting the number of iPhones being manufactured and shipped, effectively reducing the value of the alternative data from packaging as the company directly reported the figure. Similarly, around 2009, it was probably relatively novel for quantitative traders to analyze the text of company reports or news. Now, such methodologies are par for the course. More recently, in 2018, Apple stopped reporting iPhone unit sales in its financial statements, so analysts look to alternative data again. This chapter will discuss the sources of alternative data, the different types of alternative data that can be used in various contexts, the market for alternative data, as well as the economic framework behind the use of alternative data in financial markets. We also share several tips for working with financial data, from the data acquisition process to exploratory data analysis to deployment for live use. Little statistical background is required to understand this chapter, and the discussion is primarily at a conceptual level, but technical details are provided. The key conceptual idea is that alternative data solve one of two problems faced by a user of data: forecasting the future or measuring something happening today. In the remaining sections of this chapter, we discuss the economic framework for using alternative data in the context of specific domains. While all financial sub-industries are likely to use alternative data, we focus on applications that may feature alternative data more prominently: quantitative trading, credit scoring, and macroeconomic forecasting. We then conclude with brisk coverage of how alternative data are used in other financial industries, such as real estate and insurance, and how alternative data were employed to measure various aspects of economic activity during COVID-19.
2 Where Does Alternative Data Come from?

2.1 Sources of Alternative Data

There are three sources of alternative data. First, even publicly available information can serve as alternative data. Examples include company regulatory filings or other public documents, the news, or social media. One drawback of these data sources is that all investors have access to the same information, so either the information advantage garnered from these sources is low, or it takes significant sophistication to take advantage of the information when few other industry players can. An example of this is patent filings, which are difficult to link to business databases because the United States Patent and Trademark Office tracks organization names, forcing a researcher to comb through millions of records accounting for transfers of patent ownership or changes in company ownership (for example, when Microsoft acquires LinkedIn, any IP of LinkedIn now belongs to Microsoft). Another source of information that is difficult to process is shipment data from customs and border control agencies, which involve millions of companies and hundreds of millions of shipment records. Other creative examples are Web data, such as BuiltWith, which combs through Web sites on a continuous basis collecting information about installations of company applications on Web sites.

Whereas public data provide a level playing field for all players, proprietary data are becoming increasingly common. There are two types of proprietary data. The first is collected by cooperatives. Dun and Bradstreet and Cortera operate data-sharing cooperatives, whereby suppliers report their customers' payments—how much do customers order, and how timely are their payments on previous orders? Suppliers operate on a "give-to-get" basis. Moody's operates a cooperative for data sharing among banks to pool loan outcome data. Although this model can acquire and aggregate data quickly and build positive network effects, the data that are reported to the data aggregator may also be subject to firm-specific bias, distortions, and measurement error.

Finally, data generated by business processes, typically known as exhaust data, are generated in an automated fashion as the by-product of corporate or government recordkeeping for other primary purposes. These include supermarket scanner data at the stock-keeping unit (SKU) level and banking records like credit card transactions or bank transfers. Relatedly, sensors on various devices such as satellites or mobile phones also collect data. Mobile phone data have provided features such as the movement patterns of individuals over time, which, when aggregated, provide information such as the frequency of customer visitation at a retail shop. Recently, these mobile phone data have been very important in characterizing movement and social distancing patterns during the COVID-19 pandemic.

Sometimes, companies with exhaust data sell their data. These data providers are classified as alternative because few people in the financial industry have begun applying the data, yet there is great economic value. Other times, they are companies that have exhaust data that are valuable but do not realize it. For example, publishers operating financial Web sites may see their content as valuable, but what may be
equally interesting is identifying the viewership of that content. Knowing that large, sophisticated investors are reading about stock A but not stock B could be valuable information.
2.2 Types of Alternative Data

Here, we list a number of different data categories based on classifications commonly used by alternative data providers as well as in academic research. These categories are generally, but not entirely, mutually exclusive.

1. Advertising—Data on how companies are spending advertising dollars and across which venues. Examples include Kantar Adspender.
2. B2B Commerce—Commercial transactions in the supply chain between businesses, such as Dun and Bradstreet data, or other sources of supply chain data such as truck shipments or bills of lading. At the aggregate level, this can also be referred to as trade data. At the firm level, this may also include indicators such as buyer intent sourced from digital research activity.
3. Consumer Transactions—Data on consumer transactions sourced from e-commerce Web sites, credit card transaction records, or email receipts.
4. Employment—Job postings data such as LinkUp or Burning Glass Technologies, data from crowdsourced employee surveys such as Glassdoor, data from recruitment data aggregators such as Entelo, and possibly payroll data aggregators.
5. ESG—Data on the corporate social responsibility performance of the firm, sourced from the efforts of ESG-specific data providers such as MSCI KLD ratings, Refinitiv Asset4, Reprisk, TruCost, and others.
6. Event Detection—News sites and social media sites can be scanned by algorithms designed to detect anomalous events tied to a specific company. This can be useful for high-frequency traders, medium-frequency traders, credit scoring, or general business intelligence.
7. Expert Views—Premised on the idea that intelligent, informed pundits provide value-relevant information on the Web, there are a few ways to aggregate their opinions. Primarily, social media platforms such as Twitter, Estimize, Seeking Alpha, and StockTwits give pundits a platform with large reach. However, not all contributors may have useful information.
8. Geolocation—Using readings provided by mobile phones and other sensors, this type of data records the location of an agent. Sometimes the agent is an individual, sometimes a truck or shipment. What is of interest is not the agent but the locations of transit. For example, the number of people entering Walmart stores may be correlated with Walmart's revenues.
9. Mobile App Usage—How often are users in an app? How many have downloaded the app? Such estimates could be proportional to revenues for companies that earn a significant share of their dollars from app purchases or advertisements, such as Zynga.
10. Reviews and Ratings—Consumer ratings provide information on how a product is perceived. Positive ratings can also predict future sales.
11. Satellite and Weather—Various sensors provide measurements of different kinds. Night-time lights can provide an estimate of the luminosity of the earth's surface, infrared scans can be used to estimate pollution, and photos can be used to measure whether a store's parking lot is filled or whether oil reserves are empty or full.
12. Sentiment—What is the current mood of investors in general or surrounding a specific stock? This type of data can be collated from social media Web sites, where investors openly discuss stocks.
13. Store Locations—Companies such as SafeGraph provide comprehensive data on store locations, which can be useful to investors, corporate competitors, and other constituents. These data are particularly useful when combined with geolocation data.
14. Internet of Things—The Internet of things refers to measurements from Internet-connected devices, including information on geolocation, usage statistics, and search histories.
15. Online Search and Digital Web Traffic—Platforms such as Google provide indicators such as Google Trends, which can be used to make economic inferences around trends. Web traffic data provided by publishers themselves, or by companies which aggregate from publishers, can also be used. For example, Bloomberg provides the NEWS_HEAT index, which measures interest in specific securities among investors. Many additional data providers exist in this space.
16. Pricing—Consumer or B2B transactions provide revenue estimates accruing to a firm, but profit margins may be affected by pricing and the marginal cost of products. Therefore, tracking pricing could potentially be useful. Pricing can be inferred from transaction records (for example, email receipts, which list items and prices) or by scraping company Web sites and pricing schedules.
17. Public Sector/Government Filings—Government filings are often publicly accessible and provide useful information such as company registrations, property records, health benefits paid to employees (which can be drivers of cost), balance sheet and financial information, and health and safety records. In addition, for publicly listed firms, corporate disclosures oftentimes involve key material events such as announcements of future business dealings like corporate acquisitions or buybacks. While this approach has long been used by investors, the data are considered alternative in two respects: (1) many corporate filings in many countries have not yet been harvested, and (2) the cost of implementation can be fairly high, and thus, adoption is not complete.
What datasets are most popular? As of 2018, Eagle Alpha produced the following infographic based on their own customers' interest. Interestingly, categories related to consumer retail (reviews and ratings, consumer transactions, geolocation) are most popular among investors, while B2B and satellite data are least popular. We do not believe these data lack broader application; rather, due to the scarcity of the data, academic research is limited. Thus, prospective buyers of these data have little guidance on whether such datasets pass the cost-benefit test. As we allude to throughout this chapter, we believe these data have substantial value and that these characterizations will change.
Source Eagle Alpha
2.3 The Market for Alternative Data

One large friction with alternative data is that the existence of so many datasets makes search costs high. It is difficult to know what is out there, or even where to start looking. While there is no doubt that those operating in the early days of alternative data had to pay high search costs to discover it, the world looks much different today. Veldkamp [64] argues that when information is costly to produce,
information markets naturally emerge as consolidators of information. One has to look no further than the existence of the Wall Street Journal or Bloomberg to see evidence of basic information markets. Alternative data are arguably even costlier to produce, and so intermediaries should naturally emerge to facilitate this market. An emerging ecosystem of service providers in the alternative data space helps answer the following questions:

• What datasets are out there?
• How do I use these datasets?
• How can I operationalize these datasets, and what service providers can assist my adoption of alternative data?

One example is Eagle Alpha. Founded in 2012, Eagle Alpha is an early entrant in the alternative data space and is now one of the largest players in the industry. Eagle Alpha has numerous product offerings but generally operates as a broker between buy-side clients, such as hedge funds or private equity firms, and data providers. As of June 1, 2020, they operate a database of over 1000 data owners who seek to monetize their data on the platform. They also provide consulting services to help with dataset identification and data management. Their list of providers is not public, but they offer the community freemium access to conferences, whitepapers, webinars, and other content.

Other platforms have emerged to consolidate and organize the space of data providers. Alternativedata.org is an open offering that collates and lists different datasets. On their platform, over 400 data providers list their data for purchase or acquisition. Adaptive Management is a closed offering that focuses on identifying datasets but also provides extract-transform-load services to help investors ingest data into their databases. There are likely many more not named here.

Traditional data providers have also entered the space. Examples of these data platforms include FactSet's "Open Marketplace," which at the time of this writing includes over 100 alternative datasets in addition to standardized datasets like firm fundamentals and stock prices. CapitalIQ and Refinitiv (formerly Thomson Reuters) also curate a select set of alternative data offerings. Because these platforms also provide traditional data, a key benefit they offer is integration with the existing workflows of many large institutional investors.

The above solutions only address how to find and implement alternative data. However, what data are useful? Many data brokers do not assess all possibilities with alternative data; resources are limited, and the use of alternative data may be different for each user. Data providers can reduce R&D costs through partnerships with academic researchers or third-party agencies such as ExtractAlpha, which specialize in the research and evaluation of data. However, relative to data providers or data brokers, these research resources are relatively rare, with most players focusing on aggregation and intermediation.
2.4 Characteristics of an Appealing Alternative Dataset

Beyond financial cost or legal considerations, the ideal dataset has the following key features:

1. Entity resolution for target customers—In our experience, this is the costliest aspect of working with an alternative dataset from a time perspective. Whether one means to trade stocks, score the credit risk of individuals, or monitor households, the object of interest should be mapped to a standard "identifier" with which the user can easily link records. For users at most investment firms, the application is quantitative trading. While a company can be identified by a number of symbols, in the case of stock trading the object of interest would be the International Securities Identification Number (ISIN), the Stock Exchange Daily Official List (SEDOL) code, or a platform-specific measure such as the Bloomberg ticker. We advise against using exchange tickers because they can change over time, and the same ticker may represent different securities across exchanges. Multiple identifiers in the same database are useful, particularly for linking to other types of databases. For example, in the USA, understanding the government contracts a firm receives requires tracking its DUNS number (the Dun and Bradstreet unique nine-digit identifier for businesses), and understanding its benefits packages requires its Employer Identification Number (EIN). In addition, much data today are tracked digitally, so having the related Web sites for the company can be useful. The more identifiers, the better; other useful fields include address and phone number for cross-checking across databases. As it relates to business data, another important property is "point-in-time" identification of ownership structure. Company ownership changes over time, as with LinkedIn first as a standalone company and then as a subsidiary of Microsoft. Thus, not only would LinkedIn have to be binned into Microsoft's account, so too would the subsidiaries of LinkedIn. If we receive data on a LinkedIn subsidiary without considering the implications for its "ultimate owner," this can be problematic for analysis. Ideally, LinkedIn would be mapped to a standalone company prior to its acquisition and linked to Microsoft after, but not before, its acquisition (see the sketch after this list).

2. Thoughtfully featurized/flexibly featurizable—The data provider should provide useful features (variables in the dataset) that a user would immediately find useful, but also retain flexibility for users to consider different measurements or permutations. Consider two popular providers of job posting data. On one end, LinkUp gathers job postings of major employers in the USA. They have useful features, such as the duration of the job posting, the location of the posting, and the occupation code, but also provide the job descriptions themselves for full flexibility. On the other end, Burning Glass Technologies goes further in identifying credentials or job skills required in a posting but does not provide the posting for data mining. It is more feature-rich but less flexible. The user must trade off a more feature-rich data source against another with fewer features but more flexibility. Whether one is better than the other is application-specific.
3. Meaningful economic content—Clearly, there is no point in measuring something and paying a non-trivial cost if it does not measure a powerful economic construct with some advantage over existing data. Congruent with the four V's of big data, it is important to understand whether the advantage of the data is its granularity of subject, its speed of reporting, or its ability to measure something more precisely than previously possible. For example, as with data aggregators such as NovaCredit, Dun and Bradstreet, and Moody's, the value proposition of the data provider is to aggregate data from individual firm-level contributions that would otherwise not be available to a single firm. Another approach is to measure a concept less well but in a timelier fashion. For example, one object of interest to investors is a company's earnings, which are announced on a quarterly basis. While we cannot know better than the company itself what profits it earned, we can forecast earnings early using many different signals. For example, if American consumers have been entering Walmart stores, as evidenced by credit card statements, or if mobile phone data suggest Walmart visitations have increased, we may forecast that Walmart's revenue will increase. In some cases, there are constructs that were not measurable before, such as investor attention; Bloomberg's NEWS_HEAT index provides such a measure. Job postings provide the firm-level demand for workers, whereas previously, vacancies were only available at the industry level.

4. Novelty and differentiation—This can be considered the "wow" factor, which depletes over time with mainstream adoption. In the late 2000s, it was probably relatively novel for quantitative traders to analyze the text of company reports or news. Now, such methodologies are par for the course.
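To make the point-in-time entity resolution in item 1 concrete, here is a minimal sketch in Python with pandas. The table layout, column names, and helper function are hypothetical illustrations of one possible design, not a reference to any actual vendor schema; the dates are approximate.

```python
import pandas as pd

# Hypothetical point-in-time ownership map: each row says which ultimate
# parent a subsidiary belongs to, and over which dates the link is valid.
ownership = pd.DataFrame({
    "subsidiary":     ["LinkedIn", "LinkedIn"],
    "ultimate_owner": ["LinkedIn", "Microsoft"],
    "valid_from":     pd.to_datetime(["2003-05-05", "2016-12-08"]),
    "valid_to":       pd.to_datetime(["2016-12-07", "2099-12-31"]),
})

# Hypothetical alternative-data records keyed to the subsidiary name.
records = pd.DataFrame({
    "entity": ["LinkedIn", "LinkedIn"],
    "date":   pd.to_datetime(["2015-06-30", "2018-06-30"]),
    "signal": [1.2, 0.7],
})

def resolve_owner(entity: str, date: pd.Timestamp) -> str:
    """Map an entity to its ultimate owner as of a given observation date."""
    mask = (
        (ownership["subsidiary"] == entity)
        & (ownership["valid_from"] <= date)
        & (date <= ownership["valid_to"])
    )
    match = ownership.loc[mask, "ultimate_owner"]
    return match.iloc[0] if not match.empty else entity

records["ultimate_owner"] = [
    resolve_owner(e, d) for e, d in zip(records["entity"], records["date"])
]
print(records)  # the 2015 signal maps to LinkedIn; the 2018 signal to Microsoft
```

The same date-bounded mapping idea extends to ticker changes, DUNS numbers, or EINs: every link between an identifier and an ultimate owner carries a validity window, and every lookup passes the observation date.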
3 Economic Framework

Alternative data do not mean alternative economics. Alternative data simply provide another methodology through which to capture variation in the same economic variables that may already have been measured. Such data may be beneficial for at least two reasons: (i) they present a more precise way of measuring existing economic concepts, or (ii) they present a timelier measure compared to conventional data sources. That is, whether the outcome variable of interest is a company's stock returns, a borrower's likelihood of default, or some other economic or financial variable, the point of alternative data is to take a foundational economic building block and measure it better: either along the dimension of precision or along the dimension of time. Consequently, these two benefits of alternative data serve two distinct purposes. The first, viewed through the lens of increased precision of existing economic concepts, is forecasting. The second, viewed through the lens of more timely measurement of existing economic concepts, is nowcasting. The figure below shows these two methodologies across an example with two time periods.
Nowcasting originated in the field of meteorology for predicting current weather patterns in areas where there may not be many sensors. It has been applied to financial and economic settings because many key variables are reported with a lag. For example, for publicly listed companies in the USA, annual 10-K financial statement filings are due up to 90 days after the end of the fiscal period. Nowcasting attempts to estimate current values, which are reported with a lag, based on timelier data from the same period. Forecasting, by contrast, involves the prediction of future values based on values in the current time period. This framework for forecasting already incorporates the potential lags in the reporting of economic or financial variables.
Although both applications are set up as statistical problems, they fundamentally represent different uses of the data. Forecasting pertains to whether the inclusion of alternative data improves predictions of future economic concepts. For example, changes in the prices of items at the stock-keeping unit level, acquired from store scanners, may be used to predict future inflation. Nowcasting pertains to whether alternative data predict the current, ongoing economic concept, which is typically reported with some delay. For example, the volume of credit card transactions in a given month may be used to predict a firm's current revenues.

From the traditional lens of data science, this framework of nowcasting versus forecasting also maps to the data science concepts of statistical inference and predictive analysis. On the one hand, nowcasting and forecasting can both improve statistical inference, as nowcasted and forecasted variables form additional variables for empirical tests of whether statistical relations are strong and robust. In particular, they may be used to address concerns related to reverse or simultaneous causality. On the other hand, nowcasting and forecasting can both improve predictive analyses, as nowcasted values can be used as additional explanatory variables, or the alternative data may be used directly to predict future values in a forecasting framework.
Those interested in a cursory statistical primer may read ahead. Otherwise, the remainder of this section discusses the conceptual basis for statistical modeling. It should be largely intelligible to those without formal statistical training, and the key distinction to remember is simply that alternative data are a drop-in for an economic construct implicit within traditional quantitative models.
3.1 Discretionary Versus Quantitative Trading

The application of data is generally thought to be the province of "quants," technically skilled individuals who can program computer code to execute strategies in a systematic fashion. Such individuals begin with hypothesis testing, gathering data to build a simple model to test whether the data reliably correlate with a specified outcome. Once the model is tested for statistical significance, statistically robust relationships enter more complex testing. Those that survive vetting are employed in "production" use and implemented in practice. While quants will no doubt be the key consumers of any explosion in alternative data, less technically skilled individuals will increasingly find alternative data useful. There is a surge in efforts geared toward building tools that allow non-technical users to access insights from data by abstracting away data processing through a user interface. For example, companies such as Thinknum take publicly scrapable data and provide a Web portal where users can identify specific companies and quickly benchmark unique data points about a company, such as its Glassdoor ratings or job postings, against industry peers or against its own historical trends.
3.2 Forecasting

Consider a variable of interest $Y_{t+1}$, capturing a specific economic concept of interest. The goal is to build a model using variables $X_{1,t}, X_{2,t}, \ldots, X_{p,t}$ that can predict the variable $Y_{t+1}$, where $t$ captures a time index:

$$Y_{t+k} = f\big(\{Y_{t-\tau}\}_{\tau=1}^{t-1}, \{X_{1,t-\tau}\}_{\tau=1}^{t-1}, \{X_{2,t-\tau}\}_{\tau=1}^{t-1}, \ldots, \{X_{p,t-\tau}\}_{\tau=1}^{t-1}\big).$$

In words, the equation above simply takes the stance that the outcome variable at some future date is some function of some, any, or all historical values of a set of explanatory variables, which may or may not include the outcome variable itself. In other words, forecasting is a very general problem that simply uses historical values to predict future values. In practice, we start with the simplest model possible, which considers just one lag, using a linear regression specification of the form:
$$Y_{t+1} = \alpha + \gamma Y_t + \beta_1 X_{1,t} + \beta_2 X_{2,t} + \cdots + \beta_p X_{p,t} + \varepsilon,$$

where $\varepsilon$ is some noise term with mean zero. Ordinary least squares (OLS) regressions of the form above are commonly used as a starting point to understand the relation between different variables. Although the model is basic and most statistical packages come with built-in functions for this analysis, it is powerful. In fact, the Gauss–Markov theorem states that if the true underlying model is linear and the error terms have mean zero, constant variance, and are uncorrelated, then OLS is the best linear unbiased estimator. We discuss the importance of bias versus variance in the statistical modeling subsection below. But put briefly, OLS requires few assumptions, is easily computed from the data, and is a useful starting point for statistical inference and prediction.

However, valid statistical modeling in the time series requires additional key statistical properties: the series must be stationary as well as ergodic. For simplicity, we avoid discussing the exact mathematical statistics in detail but discuss what the two requirements imply.

Stationarity. The first property, specifically defined as covariance stationarity, ensures that the statistical model is stable over time and that its statistical properties do not change. Simply put, a stationary time series has a finite and stable variance going forward, as well as a stable covariance with its own historical values. On the other hand, non-stationary time series have infinite variance. Since practically every statistical model must estimate some variance parameter of the data, this property means that the statistical model can actually estimate something that is not blowing up. Although that variance does not have to be constant and can move around, it must exist; in practice, we never observe estimated values equal to infinity. So, what does the latter issue mean? Although the notion of stationarity requires some assumption on the underlying data-generating process, in practice, statistical models estimated on non-stationary time series tend to generate vastly different estimates across different time horizons and subsets of the data. Without stationarity, time series predictive models tend to generate spurious correlations, the phenomenon whereby a statistical model shows relationships that are not actually there. This issue falls under the class of overfitting issues, which we discuss more below. In fact, with the wrong variable transformations, the probability of observing a spurious correlation gets larger the more time series data there are!

Ergodicity. The second property ensures that additional time series data, also known as sample paths, make the estimates of model parameters more precise. In practice, this means that if we continually re-estimate model parameters as more data become available through time, the statistical model gets closer to the true underlying relationship and gets better in a predictive sense. Without ergodicity, a time series statistical model effectively will not get better with a longer sample of data!

Although these two properties pertain to different characteristics of the time series data, you can see why they must go together. The first ensures that statistical model parameters actually exist and can be estimated. The second ensures that longer
time series data are actually useful to collect and process. Without these properties, different sampling periods of data can yield truly different data-generating parameters, and including additional data does not guarantee that the overall performance of the model improves. The role of the data scientist is to evaluate, using domain knowledge based on economic theory or visual representations of the data, whether specific variables are stationary and ergodic.

Thankfully, many non-stationary variables can be transformed to become stationary. For example, stock prices are non-stationary since they can theoretically increase forever and therefore have infinite variance, but stock returns are stationary. Therefore, an astute data scientist will not set up a statistical model predicting stock prices directly, but will instead predict returns (or capital gains/losses). Should the end goal be a target stock price, the forecasted return may be applied to the previous end-of-period stock price to recover a predicted price level. Ways to transform non-stationary variables into stationary ones include (i) taking the first difference (subtracting the lagged value from the current value) and (ii) considering the percentage change. These transformations may also be applied recursively: in some situations, a variable that is non-stationary may remain non-stationary after first-differencing and must be differenced again.

When applied to financial markets, financial data scientists typically consider financial outcome variables. For example, for a stock trader, the key outcome variable of interest is a stock or market index return. This variable is stationary and ergodic. So, we replace $Y$ with security $i$'s stock return:

$$r_{i,t+1} = \alpha + \beta_1 X_{1,it} + \beta_2 X_{2,it} + \cdots + \beta_p X_{p,it} + \varepsilon_{it}.$$

What types of variables do we use? In the absence of alternative data, we would typically use fundamental variables that we expect to predict stock returns. For example, the finance literature has considered hundreds of the headline normalized financial statement variables that are comparable across companies. In addition, non-headline items span over one thousand additional items on the financial statement alone. When including alternative data, the data dimensionality increases even more.
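To tie the pieces of this subsection together, the sketch below simulates a price series, applies the stationarity transform (prices to returns), and fits the one-lag forecasting regression with statsmodels. The data and the predictive signal are simulated assumptions purely for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500

# Hypothetical stationary predictor X_t; returns load on lagged X.
x = rng.normal(size=n)
ret = np.zeros(n)
ret[1:] = 0.0002 + 0.01 * x[:-1] + rng.normal(scale=0.02, size=n - 1)
price = 100 * np.cumprod(1 + ret)        # the price level is non-stationary

df = pd.DataFrame({"price": price, "x": x})
df["ret"] = df["price"].pct_change()     # stationarity transform: returns
df["ret_fwd"] = df["ret"].shift(-1)      # r_{t+1}, the forecasting target

data = df.dropna()
ols = sm.OLS(data["ret_fwd"], sm.add_constant(data["x"])).fit()
print(ols.params)                        # recovers const ~ 0.0002, x ~ 0.01
```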
3.3 Nowcasting

Compared to forecasting, nowcasting seeks to exploit differential delays in data reporting, using data generated in a timelier manner to predict outcome variables. Consider a variable of interest $Y_t$, capturing a specific economic concept of interest. The goal is to build a model using variables $X_{1,t}, X_{2,t}, \ldots, X_{p,t}$ that can predict the contemporaneous variable $Y_t$, where $t$ captures a time index:

$$Y_t = f\big(X_{1,t}, X_{2,t}, \ldots, X_{p,t}\big).$$
Using the simplest model possible, consider a linear regression specification of the form:

$$Y_t = \alpha + \beta_1 X_{1,t} + \beta_2 X_{2,t} + \cdots + \beta_p X_{p,t} + \varepsilon,$$

where $\varepsilon$ is some noise term with mean zero. The setup for nowcasting looks highly similar to forecasting; the only difference is that the right-hand-side variables are all from the same time period. So, what exactly are the differences beyond the time subscripts? Fundamentally, the different timing of the statistical problem uses different sources of data and requires different assumptions. Forecasting requires using historical values to predict future values. In contrast, nowcasting uses some current values to predict other current values. Although there is still a lead-lag dimension in time, because the explanatory variables are realized and recorded faster than the outcome variable, the data fundamentally pertain to the same period. To better see this, we discuss the concept of sources of variation to understand what statistical models use to estimate parameters and create predictions.

Sources of Variation. For statisticians and data scientists, information is contained within the variation in the data. (For those readers who have taken a mathematical statistics class, it is no coincidence that in maximum likelihood estimation, the variance of the score function is called the "information matrix"!) Variation in the data comes from only two sources, which capture all of physical existence: space and time. Space captures the different positions and characteristics of different entities or objects at the same point in time, whereas time captures how the position and characteristics of a specific entity or object change through time. Variation in space is also called the "cross-section," which you can imagine as considering different objects within the same slice of time. Therefore, cross-sectional analyses compare objects within the same time period, and time series analyses compare the same object through time.

With this vernacular, we can now discuss the exact conceptual differences between forecasting and nowcasting. The source of variation for forecasting comes from the time series. The source of variation for nowcasting comes from the cross-section. That is, nowcasting only uses contemporaneous variables and considers how a set of explanatory variables affects the outcome variable (it does not consider the outcome variable's history or the explanatory variables' histories). One may think that, for example, the sales level of a company increases through time to unbounded values due to economic growth and inflation. But nowcasting it with foot or Web traffic data will continually keep pace with such variables, causing no problems for the statistical models. However, although the main source of variation is the cross-sectional relation between sales and Web traffic data, the data used to estimate the model still come from multiple snippets of time. Therefore, if variables are non-stationary, as in this case, the variance of the estimates and the model can still blow up to infinity. Therefore, nowcasting still requires stationary
variables. In the example with sales and foot traffic data, both variables should be converted into percentage changes.

Astute readers may realize that one can simply combine forecasting and nowcasting, effectively combining variation from both space and time. Indeed, such analyses may uncover additional statistical relations that individual sources of variation (space or time) cannot identify. Such data structures are called panel data, where information is available for the same set of entities across different periods of time. In terms of the required statistical properties, including the lagged terms of additional data sources turns a nowcasting exercise into a forecasting exercise.
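As a minimal sketch of the nowcasting setup, assuming simulated quarterly sales and foot-traffic series, the code below converts both levels into percentage changes and fits the contemporaneous regression:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
q = 40  # quarters of history

# Simulate trending (non-stationary) levels of foot traffic and sales.
traffic = 1000 * np.cumprod(1 + rng.normal(0.01, 0.03, q))
sales = 50 * traffic * np.exp(rng.normal(0, 0.02, q))

df = pd.DataFrame({"traffic": traffic, "sales": sales})
growth = df.pct_change().dropna()        # stationarity transform

# Nowcasting regression: sales growth on contemporaneous traffic growth.
# In practice foot traffic is observed almost daily while sales arrive
# with a lag, so the fitted model fills in the not-yet-reported quarter.
fit = sm.OLS(growth["sales"], sm.add_constant(growth["traffic"])).fit()
print(fit.params)

latest_traffic_growth = 0.04             # hypothetical, observed in real time
nowcast = fit.params["const"] + fit.params["traffic"] * latest_traffic_growth
print(f"nowcast of current-quarter sales growth: {nowcast:.3f}")
```

With this conceptual framework in mind, we are one step closer to operationalizing alternative data. However, before putting the data to work, we must first discuss the additional complications of using large amounts of data.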
3.4 Statistical Modeling and High-Dimensional Data

Even for those with novel and economically important data, alternative data come with an additional set of issues: too many features to choose from. An important follow-up question is how to combine different data sources, or different methods of capturing a given data source. For that, we consider a class of variable selection methods that innovate upon the standard linear regression; these methodologies can be represented through an optimization process and are often referred to as "shrinkage methods." For comparison, with ordinary least squares, the objective function minimizes the sum of squared errors and estimates the conditional mean, while with quantile regressions, the objective function minimizes the sum of (asymmetrically weighted) absolute deviations and estimates a conditional quantile such as the median. Both have advantages and disadvantages based on the properties of the data and the related assumptions about the data-generating process.

Methods like OLS or quantile regressions suffer from a dimensionality problem. As the number of features increases (loosely speaking, the number of columns within a dataset where each row is an observation), these models can quickly overfit the data. Overfitting is a modeling error that occurs when a model is fitted too closely to a particular set of data points, resulting in an overly complex model that appears to fit the data well only because of artifacts of the optimization. Models that are overfitted to a specific dataset tend to perform poorly on many measures of accuracy, such as the $R^2$, mean absolute error, or root-mean-squared error, when confronted with new data, even if the new data were generated from the exact same data-generating process.

As the number of estimated parameters approaches the number of observations, the degrees of freedom, which represent the remaining variation in the data not mechanically captured by the model, decrease. As the degrees of freedom approach zero, each parameter is effectively estimated by one particular observation. In these scenarios, essentially any objective function can suffer from overfitting
issues. Therefore, additional methods were developed to mitigate overfitting and to preserve the degrees of freedom left within the data, allowing for proper tests of model fit and model quality.

To understand the importance of degrees of freedom and model complexity, consider the variance-bias trade-off. Errors in statistical models come from three sources: (i) the bias term arising from model misspecification (also known as underfitting), (ii) the sensitivity of predicted values to variations in the model inputs (also known as overfitting), and (iii) the irreducible component fundamental to the statistical problem itself. The last component can never be eliminated by any statistical model; hence, the choice among statistical techniques focuses on the first two components. To see this, consider the expected squared error of any predicted value $\hat{y} = \hat{f}(x)$ generated from an estimated model $\hat{f}$ for a given set of inputs, where the true data-generating model is $y = f(x) + \varepsilon$ with the noise term $\varepsilon$ having mean zero. The total mean squared error is then given by

$$\mathbb{E}\big[(y - \hat{y})^2\big] = \underbrace{\big(\mathbb{E}[\hat{y}] - f(x)\big)^2}_{\text{Bias}^2} + \underbrace{\operatorname{var}(\hat{y})}_{\text{Variance}} + \underbrace{\sigma^2}_{\text{Fundamental}},$$

where $\sigma^2$ is the variance of the noise term $\varepsilon$. To derive the right-hand side from the left, substitute the true value $y = f(x) + \varepsilon$, recall that the noise term is uncorrelated with the estimated function by construction, and add and subtract the expected fitted value of the model, $\mathbb{E}[\hat{y}]$.

This trade-off is related to, but distinct from, measures of statistical fit like the $R^2$, and it is a fundamental trade-off in statistical modeling. Conceptually, statistical models with high bias may miss relevant correlations in the data, while those with high variance are susceptible to noise. All statistical methods trade off these two sources of error, potentially achieving different levels of variation in forecast errors. On the one hand, models with lower bias in parameter estimation (where the bias is the expected value of the estimator relative to the true underlying value of the parameter) tend to capture true relations in the data but may suffer from overfitting, generating noisy predictions in response to small changes in the input data. On the other hand, models with more bias tend to be more simplified, but the predicted values they generate tend to be less sensitive to small variations in the input data, meaning that the model tends to have lower variance.

Overfitting is also alleged in academic publications on empirical asset pricing. Harvey et al. [38] document over 240 academic publications proposing new empirical asset pricing factors. They argue that non-statistically significant results would not be published in academic outlets, and that statistical inferences should account for the missing strategies that were back-tested but not published. They argue most research findings in the field are likely false due to multiple sampling and overfitting of the same datasets.
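The decomposition above can be checked numerically. The following sketch, using purely simulated data, repeatedly refits an underparameterized and an overparameterized polynomial on fresh samples and estimates the bias and variance terms at a single test point; the degrees and sample sizes are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    return np.sin(x)                    # true data-generating function

x0, sigma = 1.5, 0.3                    # test point and noise standard deviation
x_train = np.linspace(0, 3, 15)

def predictions(degree, n_sims=2000):
    """Refit a polynomial of `degree` on fresh noisy samples; predict at x0."""
    preds = np.empty(n_sims)
    for s in range(n_sims):
        y = f(x_train) + rng.normal(0, sigma, x_train.size)
        preds[s] = np.polyval(np.polyfit(x_train, y, degree), x0)
    return preds

for degree in (1, 9):                   # high-bias vs. high-variance model
    p = predictions(degree)
    print(f"degree {degree}: bias^2 = {(p.mean() - f(x0)) ** 2:.4f}, "
          f"variance = {p.var():.4f}")
# The total expected squared error at x0 is bias^2 + variance + sigma^2.
```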
One commonly used set of tools to address overfitting is shrinkage estimators. These are models that introduce a penalty term into the optimization objective. Although one can conceptually always overfit a statistical model no matter the penalty introduced, penalty terms and related empirical methodologies add discipline to statistical practice. When discussing credit scoring, we discuss the LASSO methodology; when discussing macroeconomic forecasting, we discuss partial least squares.
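As a preview of the LASSO idea, the sketch below uses scikit-learn on simulated data with 100 candidate features, of which only three truly matter; the L1 penalty shrinks most coefficients exactly to zero. The data and dimensions are illustrative assumptions, not any dataset discussed above.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(3)
n, p = 200, 100                        # many features relative to observations
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [1.0, -0.5, 0.8]            # only three features truly matter
y = X @ beta + rng.normal(scale=0.5, size=n)

# The LASSO adds an L1 penalty, lambda * sum(|beta_j|), to the least-squares
# objective; cross-validation picks the penalty strength lambda.
lasso = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(lasso.coef_)
print("penalty chosen:", lasso.alpha_)
print("nonzero coefficients at indices:", selected)
```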
4 Quantitative Trading

Nowhere has the use of alternative data gained more notoriety than in quantitative trading. For example, on the award-winning television show Billions, the fictional investor Bobby Axelrod often mentions novel sources of information such as "satellite data" that predict movements in a company's quarterly earnings based on the number of cars entering the parking lots of specific companies. Terms such as "quantamental" have sprouted up to describe the enhancement of fundamental analysis, traditionally executed manually by humans, with data-driven processes executed systematically by computer algorithms. However, the use of alternative data need not replace human processes: more recently, as alternative data providers have sought to reach a wider audience, discretionary investors have also started to adopt alternative data, and industry observers have noticed a distinct rise in interest from them. Alternative data are relevant for quantitative investing in various ways, from informing decisions at the asset class level—like currencies and commodities—to the security level—like predicting a firm's earnings. We first describe the historical context behind the explosion in quantitative strategies and then discuss a number of use cases for quantitative strategies. As we describe in Sect. 4.2, Generation 1 of "quantamental" investing focuses on predicting companies' revenues. Generation 2, meanwhile, focuses on expanding beyond revenue prediction.
4.1 Strategy Decay and the Impetus for Alternative Data

The impetus for applying alternative data to quantitative trading arises from the competitive nature of financial markets. The widespread use of quantitative strategies boomed in the early 1990s with the emergence of academic research documenting that simple strategies provide investors high average returns that "beat the market."1

1 To be more precise, there exists an investment strategy that provides statistically robust "alpha" relative to a benchmark regression. A typical benchmark is the market return or the Fama–French three-factor model.
For example, Jegadeesh and Titman [46] document that investors can earn sizeable returns by simply identifying stocks that performed well over the previous 11 months, skipping the most recent month, and holding those stocks for at least one month. Further findings include the value and size style investment factors [35]. The value strategy rewards investors who buy stocks with relatively weak valuation ratios. What is more, these simple strategies appear to perform in large-capitalization stocks, giving hope that investors can apply them at scale without fear of moving the market.

Through the lens of classic efficient markets theory, the momentum strategy can be interpreted in three ways. First, in the "risk premia" view, momentum embodies a risk for which investors are fairly compensated. Daniel and Moskowitz [29] document that momentum has severe crashes that can nearly wipe out equity investors. Second, from the view of "behavioral finance," momentum reflects persistent evidence of investor mistakes that one can repeatedly exploit to obtain extra returns. Indeed, many have ascribed momentum to the role played by investors who misinterpret or remain inattentive to past prices. Third, in the "mispricing" view, momentum's high returns should erode once enough investors become aware of them and deploy capital to compete away the profits.

Whichever of these philosophies one subscribes to, this line of empirical asset pricing research has spawned some of the largest investment firms and hedge funds on the street. Long-Term Capital Management was formed with key members of its founding team coming from academia, applying quantitative methods that aimed to "arbitrage" mispricings in fixed income instruments. Dimensional Fund Advisors, founded by David Booth in 1981, applied a combination of value and size investment styles in mutual fund and ETF construction. Then, in the late 1990s, Cliff Asness, one of three researchers credited with discovering momentum, started AQR, which today manages over $100 billion of capital. Abis [1] notes a more than threefold increase in mutual funds classified as following a "quantitative strategy" from the year 2000 to 2016.

These initial discoveries propelled more. Hundreds of papers documented investment strategies, each providing statistical evidence over its sample period that a simple, parsimonious trading strategy could earn returns that beat the market. John Cochrane, in his presidential address to the American Finance Association, used the term "factor zoo" to refer to the over 200 investment strategies that had been discovered in the literature and published in top journals. This total does not include the strategies that industry practitioners potentially discovered but did not disclose.

The existence of the "factor zoo" gives rise to two questions. First, were these strategies persistent phenomena (the risk premia or behavioral finance views), or fleeting mispricings? While some strategies, such as the momentum strategy, appear to have remained profitable since their discovery, there is a fair bit of evidence that such strategies were mispricings wherein investors had limited capacity to profit. McLean and Pontiff [56] replicate 96 investment strategies and find that the performance of mispricings documented in academic publications declines by about a quarter out-
of-sample and by 60% after the publication of the studies. The decline in performance is larger the better the in-sample returns, as shown in the figure below. Of course, one alternative explanation is overfitting by academic researchers. However, the decline in returns is greater among stocks that are more liquid and have lower idiosyncratic risk, which appears more consistent with investors learning about such mispricings from academic research. Jacobs and Muller (2020) note, however, that this appears to be a USA-specific phenomenon, with no evidence of post-publication decline internationally.
Source McLean and Pontiff [56]
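For concreteness, here is a minimal pandas sketch of the 12-2 momentum signal described above (the cumulative return over the prior 11 months, skipping the most recent month), computed on simulated monthly prices with hypothetical tickers:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
dates = pd.date_range("2015-01-31", periods=60, freq="M")
tickers = ["AAA", "BBB", "CCC", "DDD"]   # hypothetical tickers
rets = pd.DataFrame(rng.normal(0.01, 0.06, (60, 4)), index=dates, columns=tickers)
prices = 100 * (1 + rets).cumprod()

# 12-2 momentum: cumulative return from month t-12 to month t-1, skipping
# the most recent month to avoid short-term reversal effects.
momentum = prices.shift(1) / prices.shift(12) - 1

latest = momentum.iloc[-1].sort_values(ascending=False)
print(latest)  # long the top-ranked names, short the bottom-ranked
```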
The second question is about strategy correlations. Are there really 200 simple, mutually distinct investment strategies that investors can choose from? Or do investors crowd into the same strategy, whether knowingly or not? The financial crisis provided a test for quantitative investment strategies—and many strategies failed this test. Khandani and Lo [50], for example, document evidence that in August 2007, at the outset of the financial crisis, many quantitative strategies suffered large consecutive losses in the span of a few days. Interestingly, many of these strategies rebounded quickly after a few days’ losses. The figure below from their study shows this pattern for some major types of quantitative strategies executed by investors.
Source Khandani and Lo [50]
This pattern could be consistent with investors selling off and closing out their positions, leading to losses in the stocks typically traded by a strategy. As investors finish selling, the pressure subsides and the market recovers. Vinesh Jha of ExtractAlpha reproduces this analysis for the main equity factors.
Source Jha [47]
4.2 The Alternative Data Framework for Quantitative Trading

The fundamental goal of an investor is to predict risk-adjusted returns. A return refers to the percentage change in wealth an investor receives for investing in a security. For an equity security, it differs from the price change once dividends and splits/reverse splits are factored in. An additional consideration might be transaction costs, which pure return calculations ignore. With respect to the forecasting specification in Sect. 3.2, the simple change is to make the outcome variable the stock return:

$$r_{i,t+1} = \alpha + \beta_1 X_{1,it} + \varepsilon_{it}.$$

The basic idea is that $X$ is a variable measuring an economic idea we expect to be related to returns, and the alternative data are meant to measure that economic idea more closely. For example, suppose we believe that positive investor sentiment about a company is positively related to its future returns. Traditionally, sentiment is difficult to measure directly, so it may be inferred indirectly. Baker and Wurgler [6], for example, argue investor sentiment is reflected in the "closed-end fund discount," the discount at which closed-end funds trade relative to their net asset value: when investors want to be in the market, the discount is smaller. Now, with the advent of social media, where investor discussion around a stock can be monitored, we may have the ability to measure investor sentiment more directly. Thus, we add value by replacing a building block of an existing investment strategy with a better measurement obtained through alternative data.

Equity returns move for many reasons. Sometimes, investors trade for liquidity reasons, which may induce non-fundamental movements in prices. Some alternative datasets are available for ten years or longer, allowing one to overcome this noise over a long period of time. However, one of the challenges of predicting stock returns with alternative data is a short history. Therefore, in order to gain statistical significance, a common approach is to focus instead on days where returns are driven primarily by information about the firm, namely scheduled earnings announcements.

Earnings consist of revenues minus expenses. Ultimately, an investor takes home the earnings the company decides to distribute. Therefore, earnings are the major object of interest for investors, though a firm that reduces expenses beyond expectations or increases sales beyond expectations can also be attractive. How does one define market expectations? Generally, the views of all market participants are not known. However, many stocks are covered by analysts, usually working for "sell-side" firms such as Goldman Sachs and JP Morgan, who write reports detailing company business models. Such reports usually produce forecasts of quantities such as sales and earnings per share, which are aggregated by companies such as Refinitiv and Zacks into databases to form a consensus forecast. The consensus forecast is the average of the forecasts produced by analysts. How do we compare one firm beating expectations by 5 cents per share with another beating them by $5 per share? We can scale by the disagreement in analyst forecasts:
$$\mathrm{SUE}_{it} = \frac{\mathrm{earnings}_{it} - \mu_{it}^{\mathrm{earnings}}}{\sigma_{it}^{\mathrm{earnings}}},$$

where $\mathrm{earnings}_{it}$ are the realized earnings that the company announces, $\mu$ represents the average value forecasted by analysts, typically based on each analyst's most recent forecast, and $\sigma$ is the standard deviation of the forecasts across analysts. Thus, rather than predicting returns, we build a model to predict earnings surprises, expecting that beating earnings expectations will, on average, lead to outperformance around the days of earnings announcements:

$$\mathrm{SUE}_{i,t} = \alpha + \beta X_{i,t} + \varepsilon_{i,t}.$$

A number of studies focus on earnings announcements and show that their indicators provide statistically significant predictions of company earnings surprises. We now walk through the findings from this literature. We first start with earnings surprises. Then, we discuss cross-momentum, another common application of alternative data. Finally, we list a few other investment strategies.

Earnings surprises. By predicting whether a company's earnings will exceed expectations, we can predict company returns around announcements. The economic theory is simple: companies outperform if they have better earnings. If alternative data are available to some investors before others, such investors can take advantage, as the market will react once the company announces earnings. The focus of the first generation of earnings announcement strategies has been to target revenues. Revenues are easier to predict than earnings, as costs come from a variety of sources. This logic is not perfect. For example, if a retailer offers large discounts to attract customers, then even if revenues rise, earnings may fall if the additional quantities sold do not outweigh the average discount per unit. That said, this strategy has worked well enough that it is a staple of quantamental investment strategies.

The prevalence of revenue surprise strategies owes largely to the availability of consumer retail data. Consumers often receive excellent free services without paying a dime while exchanging their data in return. Examples include Internet search, credit cards, and the usage of apps such as GPS maps or a Web browser and ad-blocker. These free or low-cost service providers attract a large user base, thereby creating a substantial data asset. Examples abound. Internet search allows companies to build data products such as Google Trends, mapping products, or measures of services receiving abnormal interest. Mobile apps ask for GPS readings, allowing them to track the location and movement of anonymized users. Such data are aggregated by companies such as Veraset to identify stores and brands receiving more footfall. Yodlee aggregates credit card spending across major financial institutions, enabling it to track the revenues of retailers such as Target, and so on.

Consider the case of Netflix. Netflix is more or less one product: subscription services. There are a handful of subscription prices. However, we can generally guess Netflix's revenues by guessing the number of subscribers. This can be done
in a number of ways. Most straightforwardly, data on email receipts or credit card statements indicate how many users purchase Netflix. However, credit card panels are often biased, coming from a single bank or a handful of banks, while Netflix is a global player. Thus, this estimate may be buttressed by estimates of Web browser traffic from Alexa or SimilarWeb, which collate data from browser extensions. Of course, these Web traffic panels—while probably informative—may not cover viewership of Netflix on mobile phones. Therefore, one can also acquire data on app downloads or app usage.

A number of academic papers and industry writings confirm the relation between consumer activity indicators and revenue surprises. For example, Da et al. [28] study the relation between Google Trends and revenue surprises. Revenue surprises can be defined either relative to the consensus of analyst forecasts or, in the case of some academic papers, relative to prior sales growth.2 Da et al. [28] find a positive relation between search intensity and revenue surprises, shown in the figure below for the revenues of Garmin LTD and the search term "GARMIN."
Source Da et al. [28]
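Before turning to other data sources, here is a minimal computation of the SUE measure defined earlier, with hypothetical analyst forecasts:

```python
import numpy as np

# Hypothetical most-recent EPS forecasts from five analysts, and the
# realized EPS the company eventually announces.
forecasts = np.array([1.02, 0.98, 1.05, 1.00, 0.95])
actual = 1.10

mu = forecasts.mean()            # consensus forecast
sigma = forecasts.std(ddof=1)    # disagreement across analysts
sue = (actual - mu) / sigma
print(f"SUE = {sue:.2f}")        # a large positive SUE indicates an earnings beat
```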
Mobile phone data have been shown to predict company revenues as well. Froot et al. [36] measure foot traffic through destination lookups on mobile phones and show a strong relationship between a foot traffic index and announcement returns. Agarwal et al. [2] conduct a similar analysis using a nationally representative panel of credit card users from a bank in the USA. The digital revenue signal is an amalgamation of search, social, and Web traffic data, combined via machine learning; industry vendor ExtractAlpha has many white papers demonstrating the continued, strong outperformance of this signal.
where σ is the standard deviation of the prior annual sales growth salest−k −
Huang [43] scrapes Amazon ratings, finding that companies which receive increases in ratings outperform in terms of both revenues and earnings the following month.

Most of these prior studies focus on measuring the revenues accruing to retail-oriented businesses. Business-to-business or expense-oriented studies are relatively rare, although tracking B2B expenditures could be highly value-relevant. These data are relatively difficult to get, however, as companies would likely prefer not to disclose their expenses unless required by financial reporting conventions. But as mentioned before, suppliers oftentimes report into co-ops such as Dun and Bradstreet or Cortera. Hirshleifer et al. [41] examine trade credit data. They argue that when a company makes more purchases from its top trade creditor, it signals that the company has positive fundamentals, as top trade creditors would only provide companies with materials or services if they sincerely believed in the company's success. Such companies outperform. Further, Kwan [52] documents the role of digital "intent" data, a type of data used by marketers to identify B2B sales leads, in predicting corporate policies. These data map organizations to the Internet content they consume by topic. For example, is Microsoft more interested in blockchain than it used to be, or in office furniture? Kwan [52] finds that companies which read more spend more, particularly on non-tangible items such as research and development expenses or sales and general administrative expenses. These values are highly value-relevant, predicting negative earnings surprises in the future.

Finally, labor cost is a significant component of earnings. Thus, it stands to reason that real-time data tracking employment could be interesting. We speculate—and are unaware of any research on the topic—that various predictors of hiring could negatively predict earnings surprises. For example, high job posting demand could lead to greater hiring, indicating greater expenses. Of course, this hiring could be replacement hiring, in which case earnings may actually improve temporarily as workers leave until they are replaced. Job recruitment platforms such as Entelo or Human Predictions provide data for recruiters, aggregating worker profiles across various platforms. Data from Internet venues such as GitHub or LinkedIn can provide further insight into earnings.3 In addition, mapping mobile phone data to organization offices could theoretically provide a strong correlate of hiring activity. Of course, one has to be careful about filtering out visitors, focusing on those mobile devices that spend consistent time at an office location. We are excited to see any future research on this topic.

Cross-momentum. Cross-momentum captures the idea that two companies are economically connected. Examples of such economic links include supply chain relationships, common-owner relationships, or geographical similarities. If company A is economically linked to company B, news for company A is also relevant for company B. If B is a major customer of firm A, when company B does poorly, company A eventually does poorly—but there is a long delay before company A realizes the negative returns.
exists a handful of studies in this vein today, but their data are subject to massive backfilling issues. LinkedIn launched in 2007, with most databases gathering this data emerging in 2013. Many of these studies on employee turnover back-fill the history, leading to many issues related to survivorship bias. Further, it is not clear that the time an employee moves is equal to when it is reported on a Web site such as LinkedIn, because employees may do so on a lag or delay.
Formally, for two assets A and B:

r_{t+1}^A = α + β r_t^B + ε_t
In the equation above, B is the economically linked firm whose information at day, week, or month t should affect the future returns of A. A value β > 0 means that positive returns for asset B predict more positive future returns for asset A, so traders who notice a positive return for asset B will invest more in asset A; a value β < 0 implies the opposite.
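A minimal sketch of this lead-lag regression, estimated by OLS on simulated daily returns; the simulated link between A and B (and its 0.3 coefficient) is an illustrative assumption, not an empirical estimate from the text.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 500
r_B = rng.normal(0, 0.02, T)                            # returns of the linked firm B
noise = rng.normal(0, 0.02, T)
# A reacts to B with a one-period lag (illustrative data-generating process)
r_A = 0.3 * np.concatenate(([0.0], r_B[:-1])) + noise

# Regress r^A_{t+1} on r^B_t: X is B's return today, y is A's return tomorrow
X = sm.add_constant(r_B[:-1])
y = r_A[1:]
model = sm.OLS(y, X).fit()
print(model.params)  # [alpha, beta]; beta > 0 indicates cross-momentum from B to A
```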
Table 3 Input variables (excerpt)
… PPS_in = 1 if … > 0; otherwise, PPS_in = 0
… TA_{i,t} = CA_{i,t} − Cash_{i,t}
5. Corporate risk (RISK): RISK_{ih} = β_{ih}
6. Prior discretionary accruals (DA_{in−1}): DA_{in−1} = DA_{i,h−3}
7. Earnings persistence (PERS): PERS = Σ_{h=2}^{12} (UE_h − ŪE)(UE_{h−1} − ŪE) / Σ_{h=1}^{12} (UE_h − ŪE)²
8. Corporate size (SIZE): SIZE_in = ln( Σ_{p=h−1}^{h} SALES_{ipn} )
9. Firm performance (CFO): CFO_in = ( Σ_{p=h−1}^{h} OCF_{ipn} ) / ASSET_{in,h−2}
10. Financing activities (SHARVAR): SHARVAR_in = 1 when the outstanding shares increase or decrease by 10% in firm i in the latest six months; otherwise, SHARVAR_in = 0
11. HALF: to control for the effect of these periods, Mahmoudi et al. [24] used the numbers 1 and 2 to represent the two six-month periods
12. Prediction of future earnings growth (MV/BV): Market Value/Book Value
have an overall success rate of 54.25% in predicting earnings management. Subsequently, Mahmoudi et al. [24] adopted a multilayer-perceptron neural network with two hidden layers, based on the modified Jones model, which treats discretionary accruals as a proxy for earnings management. They employed 12 variables (see Table 3), derived from previous studies (Sajjadi and Habibi [25]; Tsai and Chiou [23]), as input variables in their neural networks. Their results showed that, when multilayer-perceptron neural networks with two hidden layers were used, the prediction accuracy for earnings management ranged from 57% to 81%.
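As a minimal sketch of this approach, the following code fits a multilayer perceptron with two hidden layers on 12 input variables using scikit-learn; the layer sizes, synthetic labels, and train/test split are illustrative assumptions, not the specification used by Mahmoudi et al. [24].

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 12))  # 12 firm-level input variables (synthetic)
# synthetic label: 1 = suspected earnings management, 0 = not suspected
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=1000, random_state=0),  # two hidden layers
)
clf.fit(X_train, y_train)
print(f"out-of-sample accuracy: {clf.score(X_test, y_test):.3f}")
```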
3 Conclusion

Global spending on AI technologies amounted to 50 billion USD in 2020 and is predicted to more than double over the following four years, reaching 110 billion USD in 2024 [26]. AI has applications in almost every industry, ranging from detecting and deterring
security intrusions, resolving users' technology issues, developing blockchain-based technologies, automating various industries, and anticipating future customer purchases, to reforming financial trading. For example, Qraft Technologies, a South Korea-based startup that develops novel AI-driven asset management models, recently filed to create the Qraft AI-Enhanced US Next Value exchange-traded fund (ETF). Its strategy of applying deep learning technologies to financial data is based on value investing and aims to revive the factor by estimating a company's intangible assets [27]. Another example is Visa Inc., which unveiled an advanced AI-based tool that can automatically approve or reject card transactions [28]. In this chapter, we examined the conventional models commonly used to detect corporate earnings management in the accounting literature and then reviewed various AI-based methodologies, such as supervised and unsupervised machine learning, that are used to detect or predict earnings management. Our overall review suggests that a detection model's accuracy can be enhanced by adopting AI-based methodologies rather than conventional linear regression models. Models grafted onto supervised learning account for the greater part of this work compared to unsupervised learning. Also, AI models for detecting, rather than predicting, corporate earnings management account for the majority of existing research. However, since real earnings management is more difficult to detect than accrual-based earnings management, AI-based methodologies face limitations in detecting or predicting real earnings management. Therefore, the development of an effective tool that utilizes unsupervised machine learning could facilitate the detection/prediction of corporate earnings management even when a novel type of fraud emerges in a newly developed business field (e.g., blockchain-based business). Future studies can delve further into the detection/prediction of real earnings management utilizing unsupervised machine learning. At present, since the adoption of AI-based technologies has not surpassed that of statistical linear regression models (e.g., probit/logistic regression), a duly phased introduction of AI-based methodologies is desirable.
References
1. MarketInsite (2019) For the first time, Nasdaq is using artificial intelligence to surveil U.S. stock market. Nasdaq
2. Schipper K (1989) Earnings management. Account Horiz 3(4):91
3. Etheridge HL, Sriram RS, Hsu HK (2000) A comparison of selected artificial neural networks that help auditors evaluate client financial viability. Decis Sci 31(2):531–550
4. Koskivaara E (2004) Artificial neural networks in analytical review procedures. Manag Audit J
5. Höglund H (2010) Detecting earnings management using neural networks
6. Chen F-H, Chi D-J, Wang Y-C (2015) Detecting biotechnology industry's earnings management using Bayesian network, principal component analysis, back propagation neural network, and decision tree. Econ Model 46:1–10
7. Dbouk B, Zaarour I (2017) Towards a machine learning approach for earnings manipulation detection. AJBA 10(2):215–251
8. Rahul K, Seth N, Kumar UD (2018) Spotting earnings manipulation: using machine learning for financial fraud detection. In: International conference on innovative techniques and applications of artificial intelligence. Springer, pp 343–356
9. Dechow PM, Sloan RG, Sweeney AP (1995) Detecting earnings management. Account Rev 70(2):193–225
10. Dechow PM, Skinner DJ (2000) Earnings management: reconciling the views of accounting academics, practitioners, and regulators. Account Horiz 14(2):235–250
11. Graham JR, Harvey CR, Rajgopal S (2005) The economic implications of corporate financial reporting. J Account Econ 40(1–3):3–73
12. Roychowdhury S (2006) Earnings management through real activities manipulation. J Account Econ 42(3):335–370
13. Cohen DA, Zarowin P (2010) Accrual-based and real earnings management activities around seasoned equity offerings. J Account Econ 50(1):2–19
14. Zang AY (2012) Evidence on the trade-off between real activities manipulation and accrual-based earnings management. Account Rev 87(2):675–703
15. Jones JJ (1991) Earnings management during import relief investigations. J Account Res 29(2):193–228
16. Healy PM (1985) The effect of bonus schemes on accounting decisions. J Account Econ 7(1–3):85–107
17. Beneish MD (1999) The detection of earnings manipulation. Financ Anal J 55(5):24–36
18. MacCarthy J (2017) Using Altman Z-score and Beneish M-score models to detect financial fraud and corporate failure: a case study of Enron Corporation. Int J Financ Account 6(6):159–166
19. Dechow PM, Kothari SP, Watts RL (1998) The relation between earnings and cash flows. J Account Econ 25(2):133–168
20. Cohen DA, Zarowin P (2008) Economic consequences of real and accrual-based earnings management activities. Account Rev 83:758–787
21. Detienne KB, Detienne DH, Joshi SA (2003) Neural networks as statistical tools for business researchers. Organ Res Methods 6(2):236–265
22. Rao C, Gudivada VN (2018) Computational analysis and understanding of natural languages: principles, methods and applications. Elsevier, Amsterdam
23. Tsai C-F, Chiou Y-J (2009) Earnings management prediction: a pilot study of combining neural networks and decision trees. Expert Syst Appl 36(3):7183–7191
24. Mahmoudi S, Mahmoudi S, Mahmoudi A (2017) Prediction of earnings management by use of multilayer perceptron neural networks with two hidden layers in various industries. J Entrepreneurship Bus Econ 5(1):216–236
25. Sajjadi H, Habibi HR (2008) Factors affecting earning management in companies registered in TSE. Econ Res Q 2(1):1–12
26. McCormick J (2020) World-wide AI spending expected to double in next four years. Wall Street J
27. Lee J (2020) A robot tried to fix value investing and ended up buying Amazon. Bloomberg
28. Castellanos S (2020) Visa unveils more powerful AI tool that approves or denies card transactions. Wall Street J
Machine Learning Applications in Finance Research

Hyeik Kim
Abstract Machine learning applications in finance research have surged conspicuously in recent years. This chapter overviews machine learning methods and their uses in the finance literature. Previous research finds machine learning particularly helpful in estimating the risk premium in asset pricing, assisting market participants in making financial-economic decisions, and making unstructured data accessible for analysis. The mismeasurement problem is discussed as a challenge machine learning faces. Although machine learning in finance research is still preliminary, its pool of potential predictors is far more expansive than that of conventional econometrics, allowing the flexibility to push the frontier of finance research. Keywords Machine learning · Big data · Predictions · Textual analysis
1 Overview

Today, there is an overflow of data coming from various sources. Statistical methods that make good use of these data have become a key interest for social science researchers; using only traditional econometric methods on big data seems like leaving behind useful information that could otherwise be extracted. Therefore, researchers in finance have recently been applying machine learning methodology in their empirical exercises. One of the greatest advantages that machine learning brings to social scientists is that it offers high-dimensional statistical procedures. Machine learning techniques such as decision trees, neural networks, deep learning, LASSO, and ridge enable more flexibility in studying the data and allow researchers to model more sophisticated relationships. What machine learning can offer us can be summarized in three main categories. First, it allows a model that works better out-of-sample.
Even though most research in finance concerns a causal relationship between two variables, there are areas where prediction can be helpful and can even help in identifying the economic channel.1 Second, machine learning can help in studying the behavior of market participants when they make financial-economic decisions. Machine learning can predict the best outcome by interactively considering all the variables that could potentially dominate human decisions; machine learning algorithms are free from the human biases and agency problems that often affect financial-economic decisions, so they provide benchmarks against which to measure how far the behavior of market participants deviates. Third, machine learning allows the use of unstructured data.2 The majority of data relevant to finance is unstructured, and a vast array of these data is rarely put to use. Therefore, providing a way to make use of these data can enrich the depth of finance research. In this article, I introduce how machine learning is applied in finance research. Even though machine learning itself has been around for almost three decades, its application in finance research has happened only in very recent years. Nevertheless, the frequency of machine learning applications is growing faster than ever. Therefore, it is worthwhile to examine research that has been done in this field and to study a few tools that are mainly used. Section 2 starts with how machine learning is defined in finance. Section 2.1 discusses the inference and prediction problems. Section 2.2 discusses important issues in machine learning. Section 2.3 introduces some machine learning models that are often applied in finance research. Then in Sect. 3, I introduce finance literature that applies machine learning. Section 4 concludes and discusses an econometric concern that is worth thinking about going forward.
2 Defining Machine Learning in Finance Research

Social science research has mainly been interested in questions of causality, where the focus is to find the effect of X on Y. In order to identify the pure effect of X on Y, social scientists have developed identification strategies such as randomized controlled trials (RCTs), instrumental variables, regression discontinuity designs, difference-in-differences, etc. Machine learning differs from the traditional data analysis done in social science in that it is concerned primarily with pure prediction. So machine learning applications are most useful when the research question emphasizes prediction. For example, understanding the risk premia in asset pricing is a research question that could benefit from machine learning methods, as the fundamental goal of understanding risk premia is to better predict returns. On the other hand, there could be a collaboration between traditional econometrics and machine learning that involves causal inference, as a well-developed predictive model can help in estimating causal effects.

1 For instance, in asset pricing, the goal is to understand the pricing kernel, which essentially means understanding how asset prices will move in the future.
2 Unstructured data are those that cannot simply be transformed into a spreadsheet. They can be text reports, videos, images, or audio.
This section discusses how the purposes of traditional econometrics and machine learning differ, how they can complement each other, why machine learning can be helpful in some research, and what the main issues in machine learning are.
2.1 Inference and Prediction

There are mainly two purposes for performing data analysis: inference and prediction. When we want to find the effect of X on Y, what we care about is the causality between X and Y, and in this case the purpose of analyzing the data is inference. When inference is the purpose of the data analysis, it is important to isolate the sole effect of X on Y. Therefore, traditional econometrics has developed numerous "identification" methods that help screen out endogenous factors that are correlated with X and could affect Y, such as randomized controlled trials (RCTs), instrumental variables, regression discontinuity designs, difference-in-differences, etc. On the other hand, the purpose of data analysis is prediction when we want to use X to predict Y. Machine learning is a collection of statistical tools used for prediction. According to Varian [39], machine learning specialists are primarily concerned with developing high-performance computer systems that can provide useful predictions in the presence of challenging computational constraints. Gu et al. [19] define machine learning as a diverse collection of high-dimensional models for statistical prediction, combined with "regularization" methods for model selection and mitigation of overfitting, and efficient algorithms for searching among a vast number of model specifications. As such, machine learning methods are used to make "good" predictions of Y using the observed data on X and Y.3 Machine learning provides better ways to make "good" predictions than traditional linear models. These methods are usually non-linear models that use big data, such as classification and regression trees (CART) and random forests, and penalized regressions such as LASSO, LARS, and elastic nets. As mentioned earlier, even though finance research mostly aims to uncover causal relationships, machine learning can still be very useful. First, some subfields in finance are not entirely unrelated to prediction. For example, one of the main objectives of asset pricing is to predict future movements of asset prices; empirical asset pricing has long sought a pricing kernel that explains all stocks. Although there have been numerous attempts, most are criticized, notably by non-academics, for their poor out-of-sample performance. Second, according to Varian [39], a machine learning algorithm may not allow one to conclude anything about causality, but it can assist in estimating the causal impact. When we try to examine causal inferences, oftentimes we need not only the outcomes but also a counterfactual that cannot be observed.

3 Usually, we call it a "good" prediction when the sum of squared residuals or the mean absolute value of residuals is minimized out-of-sample.
Machine learning can be helpful in this respect because it can predict the counterfactual against which the outcome is compared. It is apparent that even though inference and prediction are different, one can complement the other, and machine learning can be helpful in finance research.
2.2 Important Topics in Machine Learning

2.2.1 Overfitting
The primary goal of machine learning is to make good out-of-sample predictions. Oftentimes, a regression model that works very well in-sample does not work well out-of-sample. As shown in Fig. 1, when the degrees of freedom and the complexity of a model increase, in-sample accuracy increases; out-of-sample performance, however, decreases with the complexity of the model. This is called the overfitting problem: there is a trade-off between a model's complexity and its out-of-sample fit. The overfitting problem can be avoided if the number of observations is much greater than the number of variables in the model. Unfortunately, big data is rarely as big as we would like, and we can easily run into overfitting. The following methods, namely cross validation, shrinkage via LASSO and Ridge, and principal component analysis (PCA), are used to resolve the overfitting problem.
Fig. 1 Trade-off between complexity and accuracy for out-of-sample tests (accuracy versus model complexity, with in-sample and out-of-sample curves)
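The trade-off in Fig. 1 can be reproduced with a small polynomial-regression experiment; the sine data-generating process and the polynomial degrees below are arbitrary illustrations.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-3, 3, 60)).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(0, 0.3, 60)
x_train, y_train = x[::2], y[::2]    # in-sample
x_test, y_test = x[1::2], y[1::2]    # held-out

for degree in (1, 3, 9, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    mse_in = mean_squared_error(y_train, model.predict(x_train))
    mse_out = mean_squared_error(y_test, model.predict(x_test))
    # in-sample error keeps falling with complexity; out-of-sample error eventually rises
    print(f"degree {degree:2d}: in-sample MSE {mse_in:.3f}, out-of-sample MSE {mse_out:.3f}")
```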
2.2.2 Cross Validation
Cross validation helps measure out-of-sample performance and mitigate overfitting problems. The idea is to divide the in-sample data into a training sample and a testing sample in order to develop a model that minimizes overfitting: the model is developed on the training sample and its accuracy is tested on the testing sample. Based on how the in-sample data are divided, there are three different methods: the holdout method, K-fold cross validation, and leave-one-out cross validation. For a better understanding, suppose we are interested in predicting whether a patient has heart disease. We have a large number of predictors to use, such as weight, blocked arteries, indigestion, chest pain, and many more. We want a model that minimizes the mean squared error (MSE) out-of-sample, and we decide to use cross validation to do so. Figure 2 depicts the holdout method. Assuming that we have 1,000,000 patients in our entire sample, we can divide the sample randomly into two groups of 500,000 patients each. We can then use one group to train the models and the other group to test them. An alternative is K-fold cross validation. Figure 3 depicts tenfold cross validation, which divides the sample equally into 10 groups and assigns one group to be the testing sample. Each group then takes a turn as the testing sample; for each model, there will be 10 training and testing sessions. Finally, there is the leave-one-out cross validation (LOOCV) method, an extreme version of K-fold cross validation. LOOCV leaves one patient out and uses the other 999,999 patients as the training sample. So for each model, there will be 1,000,000 training and testing sessions.
Fig. 2 Holdout method: 1,000,000 patients split into a 500,000-patient training set and a 500,000-patient testing set
Fig. 3 Tenfold method: in each fold, a training set of 900,000 patients and a testing set of 100,000 patients
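A sketch of tenfold cross validation on a synthetic patient-style dataset using scikit-learn; the predictors, the logistic model, and the sample size are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 5))  # e.g., weight, blocked arteries, and other predictors
y = (X @ np.array([0.8, -0.5, 0.3, 0.0, 0.0]) + rng.normal(0, 1, 1000) > 0).astype(int)

cv = KFold(n_splits=10, shuffle=True, random_state=0)  # each fold takes a turn as the test set
scores = cross_val_score(LogisticRegression(), X, y, cv=cv)
print(f"mean out-of-sample accuracy across 10 folds: {scores.mean():.3f}")
```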
2.2.3 Reducing Complexity
There are three ways to minimize the complexity of a model: subset selection, shrinkage methods, and dimension reduction. Subset selection tries out all possible combinations of features and selects the specification that minimizes the out-of-sample MSE. As the number of variables increases, the number of possible combinations increases exponentially, and at some point it becomes infeasible to try them all. The shrinkage method is also known as regularization. It aims to resolve the overfitting problem by adding a penalty function to the optimization algorithm, thereby reducing the size of the coefficients. The following equation shows how the coefficients are calculated.
min_{α,β} MSE = min_{α,β} (1/n) Σ ( Y − α − Σ_{i=1}^{P} β_i x_i )² + Penalty    (1)
The penalty function gets larger when the coefficients of the model are large. The optimization problem then solves for the coefficients that minimize not only the MSE but also the penalty, so that it minimizes the in-sample MSE while restricting the coefficients from being too large. There are two mostly used penalty functions: LASSO and Ridge. LASSO is an abbreviation for least absolute shrinkage and selection operator, and it defines the penalty function as follows:

LASSO Penalty = λ Σ_{i=1}^{P} |β_i|    (2)
Ridge is called ridge because it avoids having ridges in the likelihood function. Ridge defines the penalty function as follows:

Ridge Penalty = λ Σ_{i=1}^{P} β_i²    (3)
Both penalty functions increase as the coefficients increase, and, most importantly, the coefficients need to be standardized before applying shrinkage estimators. The difference between LASSO and Ridge is that LASSO can shrink a coefficient to zero, whereas Ridge shrinks the coefficients of the model but never exactly to zero. A good way to select λ is to do K-fold cross validation and choose the value that minimizes the out-of-sample MSE. Neither LASSO nor Ridge always dominates the other, and a sensible way to choose between them is to consider the characteristics of the true model. If the true model has only a small subset of important predictors and the
rest do not affect the outcome, but you are not sure which ones are important, it is better to use LASSO. On the other hand, if you expect all the variables to have some effect on the outcome, it is better to use Ridge.
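The following sketch compares LASSO and Ridge with λ chosen by cross validation, as recommended above; the sparse data-generating process, in which only three of fifty predictors matter, is an illustrative assumption under which LASSO should shine.

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV

rng = np.random.default_rng(3)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]  # only a small subset of predictors matters
y = X @ beta + rng.normal(0, 1, n)

lasso = LassoCV(cv=5).fit(X, y)                         # lambda picked by 5-fold CV
ridge = RidgeCV(alphas=np.logspace(-3, 3, 25)).fit(X, y)

print("nonzero LASSO coefficients:", np.sum(lasso.coef_ != 0))  # LASSO zeroes out coefficients
print("nonzero Ridge coefficients:", np.sum(ridge.coef_ != 0))  # Ridge shrinks, but rarely to zero
```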
2.2.4 Dimension Reduction
An alternative method to reduce the complexity of a model is to transform the original features into a smaller set of variables. One of the most famous methods is principal component analysis (PCA), which reduces the number of variables by generating new variables that are linear combinations of the original features. Suppose there are P features in the original model; PCA decreases the number of features to m (m < P) principal components, each a linear combination of the P original features. The first principal component captures most of the variation in the data, the second captures most of the remaining variation, and so on. One stops adding principal components at the point where there is little extra variation left to capture.
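A minimal PCA sketch; the twenty correlated features driven by two latent factors are an illustrative assumption, chosen so that the first components capture nearly all the variation.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
latent = rng.normal(size=(500, 2))          # two true underlying drivers
loadings = rng.normal(size=(2, 20))
X = latent @ loadings + 0.1 * rng.normal(size=(500, 20))  # 20 observed, correlated features

pca = PCA(n_components=5).fit(X)
# the first components capture most of the variation; later ones add little
print(np.round(pca.explained_variance_ratio_, 3))
```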
2.3 Machine Learning Models

Here, I introduce a number of learning methods that are widely used in finance research nowadays.
2.3.1 Decision Tree
The decision tree predicts the outcome variable by categorizing each observation according to the flow of nodes from the root to the final leaves. Each internal node tests an input feature, and branches represent conjunctions of features that lead to a leaf. For each feature, we find the best split, the one that gives the highest accuracy; out of all the features we try, we choose the one that provides the highest accuracy. After the first branch split is complete, we iterate the process for the subgroups. The advantages of the decision tree are that it is intuitive and easy to show to non-technical people, it handles qualitative and categorical predictors well, and it handles non-linear dependencies and interactions well. The issues are that it is highly vulnerable to overfitting and sensitive to small changes in the data.
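A minimal fitted tree on synthetic data; the depth limit is an illustrative guard against the overfitting discussed above, and the printed splits show why trees are easy to explain to non-technical audiences.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)  # shallow tree stays interpretable
print(export_text(tree))  # the printed split rules are easy to show to non-technical people
```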
2.3.2 Random Forests
This is a methodology that can complement decision tree models. As mentioned above, one of the pitfalls of the decision tree is that it is unstable and likely to overfit. Each tree contains some portion of truth but a lot of noise as well. Therefore, if
we average many trees, we obtain a more stable and accurate model. Random forests average the predictions of many trees to minimize the noise in each tree. However, the predictions of trees grown on random subsamples are often highly correlated. For this reason, the random forest randomizes not only the subsample but also the features, and this has been found to increase the accuracy of the model.
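A sketch contrasting a single tree with a random forest that randomizes both the subsamples (via bootstrapping) and the features considered at each split; the synthetic regression data are an illustrative assumption.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=600, n_features=10, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)
forest = RandomForestRegressor(
    n_estimators=300,     # average many trees to reduce noise
    max_features="sqrt",  # also randomize the features considered at each split
    random_state=0,
).fit(X_tr, y_tr)

print(f"single tree R^2:   {tree.score(X_te, y_te):.3f}")
print(f"random forest R^2: {forest.score(X_te, y_te):.3f}")
```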
2.3.3 Neural Network
A neural network is constructed from millions of interconnected neurons. Data move through the neurons in one direction, with each neuron receiving information from the neurons beneath it and sending output to the neurons after it. Each incoming connection is assigned a weight. Each neuron receives different data items, combines them according to the weights, and passes the result to the next neurons; if the weighted sum is below the threshold, the neuron does not transmit the data onward. Training a neural network involves adjusting both the weights and the thresholds continually until the network consistently yields similar outputs for the same inputs.
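A NumPy sketch of the weighted-sum-and-threshold logic described above for a single layer of neurons; the sizes, the fixed random weights, and the ReLU-style thresholding are illustrative (a real network would learn the weights and thresholds during training).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)        # incoming data items
W = rng.normal(size=(4, 3))   # one weight per incoming connection, for 4 neurons
b = np.zeros(4)               # thresholds (biases)

z = W @ x + b                          # weighted sum at each neuron
activation = np.where(z > 0, z, 0.0)   # transmit only if above the threshold
print(activation)                      # values passed on to the next layer of neurons
```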
3 Applications in Finance Research

3.1 Forecasting Stock Returns

One of the ultimate goals of asset pricing is to understand the equity risk premium. Estimating the risk premium can be viewed as a prediction problem because an asset's risk premium is the conditional expectation of its future realized return. So, in a way, estimating the risk premium is estimating future returns. The traditional linear models have mostly failed in this respect because of their poor out-of-sample performance. Machine learning, on the other hand, provides methods that allow a rich set of variables and functional forms while retaining good out-of-sample performance. Owing to this advantage, there have been numerous attempts to apply various machine learning methodologies to estimate stock returns. Among the pioneers are Gu et al. [19]. They were among the first to apply machine learning to measuring the stock risk premium and introduced the procedure thoroughly by comparing the performance of a list of machine learning models: linear regression, generalized linear models with penalization, dimension reduction via principal components regression (PCR) and partial least squares (PLS), regression trees (including boosted trees and random forests), and neural networks. With 60 years of individual stock data and 900+ predictors, the paper shows that machine learning methods achieve high out-of-sample R² and demonstrate larger economic gains to investors. By comparing various machine learning methods for predicting the panel of individual US stock returns, the paper shows how machine learning methods can benefit the accuracy of stock return predictions.
Following Gu et al. [19], Chen et al. [10] estimate a stochastic discount factor using a general nonlinear asset pricing model with deep neural networks for all US equity data, based on a substantial set of macroeconomic and firm-specific information. They show that including the no-arbitrage condition leads to a better-performing SDF that can explain the expected returns of all assets. In other words, the paper essentially solves a generalized method of moments (GMM) problem using three neural network structures. First, they use a feedforward neural network to capture the functional form of the SDF based on the information set. Second, they use a recurrent Long Short-Term Memory (LSTM) network to capture the time-varying nature of the SDF under macroeconomic conditions. Third, a generative adversarial network (GAN) is used to find the optimal SDF that minimizes mispricing. The network essentially proposes the test assets or times that are most mispriced and then corrects the asset pricing model to price these assets; the process is repeated until all pricing information is taken into account. The final asset pricing model (GAN) outperforms all benchmark approaches out-of-sample in terms of Sharpe ratio, explained variation, and pricing errors. Chinco et al. [11] predict stock returns at a high frequency, one minute ahead, using the entire cross-section of lagged returns as candidate predictors. Using LASSO, they estimate a model on data from the previous 30 minutes and then apply the coefficient estimates to the most recent three minutes of data to forecast the return one minute ahead. The results show that, compared to benchmarks such as the market benchmark or an AR(3) benchmark, LASSO has better out-of-sample performance. The result is meaningful in that it shows how LASSO uses a purely statistical rule to identify candidate predictors that are too unexpected and short-lived for a researcher to anticipate by intuition. However, even though the selected predictors did not rely on researchers' intuition, the predictors identified by LASSO are often lagged returns of stocks with recent news about firm fundamentals, which are associated with economically meaningful events. Another paper that predicts stock returns with machine learning is Rossi [37]. Using boosted regression trees (BRT), the paper examines whether a representative mean-variance investor can forecast excess returns and volatility by exploiting publicly available information. Rossi finds that BRT forecasts outperform benchmark models in terms of mean squared error and accuracy. The model also generates profitable portfolio allocations for mean-variance investors, and the relationship between predictor variables and the optimal portfolio allocation turns out to be nonlinear and non-monotonic. There are also papers that apply machine learning methods to identifying true asset pricing factors. Since the recent empirical asset pricing literature has been producing a stream of factors that outperform the current, up-to-date factor models,4 there have recently been papers that question the rationale of these factor models. Since the true ingredients of the SDF are unknown, these papers try to estimate loadings on as many factors as possible to test which factors are redundant and which have high explanatory power. Machine learning methods can overcome the high-dimensionality problem by shrinking the factor loadings towards zero or by providing a way to select among a large number of predictors. Kozak et al. [27] test whether a few characteristic-sparse factors (i.e., the three, four, five, etc. factors) can explain the cross-section of stock returns. To deal with the large number of predictors, they use a Bayesian approach with a prior distribution that shrinks SDF coefficients toward zero. The results suggest that there is not enough redundancy among the large number of cross-sectional return predictors, and that to perform well the SDF needs to load on a large number of factors. Feng et al. [17] use a double-stage LASSO (DS LASSO) to test the marginal contribution of new factors relative to existing ones. Since the number of existing factors exceeds the number of observations, the standard two-pass regression of Fama and MacBeth [16] cannot be used to test the marginal effect of new factors. Instead, the paper applies the DS LASSO method: they first run a LASSO regression of returns on existing factors to exclude factors with low risk loadings, and then, at a second stage, include factors that have high correlations with the new factors. The process selects the variables to control for when testing the marginal contribution of new factors.

4 Fama and French [14] initially proposed a three-factor model; Hou et al. [23] then came up with a four-factor model; Fama and French [15] extended to five; and now we have a six-factor model from Barillas and Shanken [3].
3.2 Assisting the Financial-Economic Decision-Making Process

In a corporation, managers face various problems on which they need to make decisions. The problems managers encounter span financing decisions, investment decisions, selecting board members, hiring employees, etc. A widely accepted view among economists and practitioners is that managers should act in the interest of shareholders when deciding on these problems.5 However, since Berle and Means [5], shareholders have been aware that managers can have incentives to maximize their own interests instead of the firm's when unmonitored. Moreover, even absent agency problems, managers may have behavioral biases that push their decisions far from optimal.6

5 This idea became famous through Milton Friedman's 1970 article in the New York Times Magazine. In the article, he says that a corporate executive is the employee of the owners of a (public) company and has a direct responsibility to his employers.
6 A number of papers have found that human decisions are oftentimes biased, e.g., by race (Munnell et al. [34], Quintanar [36], Samorani et al. [38], Kleinberg et al. [26]).

Given that numerous other market participants make financial decisions in their everyday lives, a natural question is whether machine learning algorithms can improve human decisions by avoiding behavioral biases and agency problems. In fact, a number of papers examine this question outside finance. For example, Kleinberg et al. [26] show that machine learning algorithms can suggest policy implications that benefit more people with less money. They illustrate how machine learning predictions can increase Medicare beneficiaries' overall welfare by isolating futile joint replacement surgeries. Following the model predictions, the Medicare funds saved from futile surgeries can be reallocated to
patients who can actually benefit from the surgery. Moreover, Kleinberg et al. [25] find that an algorithm can correct judges' biases and help them make efficient decisions. They build an algorithm using gradient-boosted decision trees on a large data set of cases heard in New York City from 2008 to 2013 to predict the flight risk of arrested defendants if they are not jailed until trial. The results show that judges release many defendants the algorithm predicts to be at very high risk of flight, and that if judges followed the algorithm's predictions, they could jail fewer people with the same reduction in crime. As Meehl [32] suggested early on, because of behavioral biases, statistical predictions can outperform humans. Recently in finance research, there have also been studies that examine the weaknesses and strengths of human decisions compared to the decisions machines suggest. Erel et al. [13] study whether a machine learning algorithm can assist the director selection process in a firm. Selecting directors is a first-order issue in corporate governance because directors, in theory, represent shareholders' interests. Despite this role, however, directors are often connected to the incumbent CEOs who usually select them, and the selection process often involves agency problems. The paper examines whether, absent these agency problems, an algorithm can better predict director performance. They use four commonly used supervised machine learning methods: LASSO, ridge, neural networks, and gradient-boosted trees. Using a set of observable director, board, and firm features, they train each algorithm on director appointments from 2000 to 2011 and test on directors appointed between 2012 and 2014. The results indicate that algorithms outperform human selections: directors that were predicted to perform poorly indeed performed poorly. Also, human-selected directors whom the machine predicted to perform poorly are more likely to be male, have larger networks, sit on more boards, and have a finance background. Gathergood et al. [18] examine how individuals make debt payments and suggest implications for how individuals should borrow across their credit card portfolios. Using a large dataset of credit card contract terms, monthly statements, and repayments for 1.4 million individuals in the United Kingdom over two years, the paper tests several repayment models: balance-matching heuristics, N1 heuristics, and debt snowball methods. They examine the explanatory power of these methods using standard measures of goodness of fit. They set a lower benchmark in which the payment on the high-APR card is randomly drawn, and use machine learning techniques such as decision trees, random forests, and extreme gradient boosting models to provide an upper benchmark. The results show that the balance-matching repayment method explains more than half of the predictable variation, which is inconsistent with the optimal strategy of allocating repayment to the highest-interest-rate cards. The result reveals a discrepancy between human repayment decisions and optimal repayment methods, due to the tendency of humans and other species to engage in "matching behavior." Abis [1] uses a machine learning technique to classify US mutual funds into funds that rely on computer-driven models and funds that rely on human judgment, and compares each type's characteristics. The paper classifies a sub-sample of 200
prospectuses manually to extract a bag of words, trains a random forest on this sample, and identifies the types of 2,607 unique funds. The paper finds that quantitative funds have greater information-processing capacity but are less flexible than discretionary funds. Also, quantitative funds hold more stocks, specialize in stock picking, and have pro-cyclical performance, whereas discretionary funds alternate between stock picking and market timing, display counter-cyclical performance, and focus on stocks with less available information. The paper highlights issues, such as reduced flexibility, that need to be addressed as we rely more on machine learning techniques. Some papers examine venture capitalists' funding decisions. Catalini et al. [8] use data from a leading startup accelerator in the US and study how models trained to pick the applicants that maximize profit can outperform models trained to mimic human evaluators. They find that models trained to mimic human evaluators perform well out-of-sample, suggesting that human evaluators were systematically overlooking high-potential applicants; human evaluators tend to underestimate "cognitively demanding" elements compared to machines. Hu and Ma [24] find similar results: inaccurate human beliefs can lead to poor venture funding decisions. Using presentation videos of startup entrepreneurs pitching in front of VCs, the paper finds that higher scores are given to teams with more positive pitches even though those teams underperform conditional on funding. Machines can assist firms' hiring decisions as well. Hoffman et al. [22] find that, in the context of lower-skill workers, managers who hire against test recommendations end up with worse average hires. Cowgill [12] finds that job candidates recommended by a machine learning algorithm are "non-traditional": candidates who graduated from non-elite colleges, who lack job referrals, who lack prior experience, whose credentials are atypical, and who have strong non-cognitive soft skills.
3.3 Use of Unstructured Data and Machine Learning Techniques

Traditionally, most of the data used in social science research were structured. We can think of structured data as data that could be imported into an Excel file; they are, most of the time, quantitative and easy to analyze. Unfortunately, most of the data in the real world are unstructured, as most of the information we obtain comes from texts, videos, images, audio, reports, etc. Unstructured data are expected to expand even more in the future with the development of multimedia. Therefore, it is important to take advantage of machine learning techniques to process these unstructured data, as they will enable many more fruitful analyses in finance research. In finance, text data have been popular in asset pricing and corporate finance; texts from 10-Ks, financial news, social media, earnings calls, and conference
calls are used to predict asset price movements and to study the causal effects of new information. Text data also help in studying the behavior of market participants. Hansen et al. [20] study how transparency affects the deliberations of monetary policy makers on the Federal Open Market Committee (FOMC) by analyzing FOMC meeting records. The paper makes a methodological contribution by introducing latent Dirichlet allocation (LDA), a machine learning algorithm for textual analysis, to economic research. With the advantage of an interesting natural-experiment setting and textual analysis technology, they find that transparency caused policy makers to become less interactive, more quantitatively oriented, and more scripted. Text data are also incredibly helpful for measuring investor sentiment. Bollen et al. [6] analyze the text content of daily Twitter feeds to examine how collective mood is correlated with the value of the Dow Jones Industrial Average (DJIA). They find that the inclusion of public mood significantly improves the accuracy of DJIA predictions. Chen et al. [9] conduct textual analysis of articles and commentaries posted by investors on popular social media platforms and find that the views expressed in both articles and commentaries predict future stock returns and earnings surprises. In asset pricing, Boudoukh et al. [7] analyze firm-specific news to predict volatility and find that this information accounts for a large portion of overnight idiosyncratic volatility. Liu et al. [29] use salient textual information and find that market distortions previously attributed to information asymmetry dissipate once textual information is incorporated into a model of real estate prices. The results imply that it is extremely difficult to test information asymmetry using incomplete information, and that textual information can help close the gap between the information researchers have and the information market participants have in their hands. Textual analysis is also actively used in corporate finance research. Text data are especially helpful in estimating firm attributes like innovation, which were traditionally measured by proxies like patents. Bellstam et al. [4] use the LDA method on analyst reports to measure a firm's level of innovation. They find that the text-based innovation measure robustly forecasts greater firm performance and growth opportunities for up to four years, and these value implications hold just as strongly for non-patenting firms. Li et al. [28] create a culture dictionary for firms using a word embedding model and earnings call transcripts. They score corporate culture on five values and find that firms with strong corporate culture have higher operational efficiency, more corporate risk-taking, less earnings management, executive compensation designs that foster risk-taking and long-term orientation, and higher firm value. Unstructured data such as images and videos are also used in finance research. Aubry et al. [2] examine 1.1 million painting auctions from 350 auction houses to evaluate the accuracy of auction houses' estimated prices. They use neural networks to develop an artwork pricing algorithm based on both non-visual and visual information about the artwork. They find that the auction houses' pricing best predicts artwork prices; however, the machine does help to predict buy-ins, and it is particularly helpful for artwork with high price uncertainty.
Moreover, the machine corrects human experts' systematic biases in expectation formation. The result supports what Lopez de Prado [30] calls a "quantamental" approach: the best results are achieved with the combination of
human and machine learning algorithms. Obaid and Pukthuanthong [35] develop a daily market-level investor sentiment index from news photos and predict the stock market return. They use neural networks to identify many aspects of photos, such as negativity, objects, colors, and facial expressions, and find that photo pessimism predicts market return reversals and increased trading volumes. Hu and Ma [24] study how human interactions affect financial investment by analyzing presentation videos of startup entrepreneurs pitching in front of VCs. They measure the positivity of a pitch using machine learning technologies such as Face++, Microsoft Azure, and Google Cloud. The results show that higher scores are given to teams with more positive pitches, providing evidence that interaction-induced biases can be explained by inaccurate beliefs or taste-based channels. Hobson et al. [21] and Mayew and Venkatachalam [31] use audio data from analyst conference calls to identify firms' private information.
4 Discussion

In this article, I introduce how machine learning algorithms are applied in finance research. Machine learning makes three important contributions to the finance literature. First, it provides increased out-of-sample performance in pricing assets; by better predicting future stock returns, it carries implications for causal inference as well. Second, it assists in observing human biases and detecting the channels underlying those behaviors, providing a better way to study market participants' behavior. Third, machine learning algorithms can use unstructured data sources, which make up the majority of financial data, enabling more in-depth research. Despite the advantages that machine learning can bring us, some econometric concerns are worth thinking about. One of the greatest concerns is that mismeasurement can magnify human biases and errors. Mismeasurement can occur if predictors are correlated with the noise term of the outcome variable. Therefore, one of the challenges is deciding what to include and exclude as a predictor. Unfortunately, there are no set rules for this other than a researcher's discretion. Poor selection of predictors might end up magnifying the human biases or errors present in the data, because algorithms trained on biased outcomes might replicate them [33]. For example, when predicting default rates at banks, one might be interested in including race as a predictor. Race could earn a high coefficient in predicting default; however, this reflects mismeasurement, as the noise term inside race is correlated with the noise term inside the default rate. Mismeasurement can distort rank orderings and can magnify discrimination against some minorities. Therefore, going forward, it is important not only to develop computational skills but also to select sound criteria on which to base predictions; algorithms are not perfect, but they are a start, and one has to start somewhere. In finance, machine learning application is still preliminary. However, since the majority of financial data are now automated and will keep growing, machine learning techniques for handling big data will allow the flexibility to push the frontier of finance research.
References
1. Abis S (2017) Man vs. machine: quantitative and discretionary equity management. Unpublished working paper, Columbia Business School
2. Aubry M, Kräussl R, Manso G, Spaenjers C (2019) Machine learning, human experts, and the valuation of real assets. HEC Paris Research Paper No. FIN-2019-1332
3. Barillas F, Shanken J (2018) Comparing asset pricing models. J Fin 73(2):715–754
4. Bellstam G, Bhagat S, Cookson JA (2019) A text-based analysis of corporate innovation. Available at SSRN 2803232
5. Berle AA, Means GC (1932) The modern corporation and private property. New Brunswick, NJ
6. Bollen J, Mao H, Zeng X (2011) Twitter mood predicts the stock market. J Comput Sci 2(1):1–8
7. Boudoukh J, Feldman R, Kogan S, Richardson M (2019) Information, trading, and volatility: evidence from firm-specific news. Rev Fin Stud 32(3):992–1033
8. Catalini C, Foster C, Nanda R (2018) Machine intelligence vs. human judgement in new venture finance. Tech. rep., Mimeo
9. Chen H, De P, Hu YJ, Hwang BH (2014) Wisdom of crowds: the value of stock opinions transmitted through social media. Rev Fin Stud 27(5):1367–1403
10. Chen L, Pelger M, Zhu J (2020) Internet appendix for deep learning in asset pricing. Available at SSRN 3600206
11. Chinco A, Clark-Joseph AD, Ye M (2019) Sparse signals in the cross-section of returns. J Fin 74(1):449–492
12. Cowgill B (2018) Bias and productivity in humans and algorithms: theory and evidence from resume screening. Columbia Business School, Columbia University
13. Erel I, Stern LH, Tan C, Weisbach MS (2018) Selecting directors using machine learning. Tech. rep., National Bureau of Economic Research
14. Fama EF, French KR (1993) Common risk factors in the returns on stocks and bonds. J Fin Econ 33(1):3–56
15. Fama EF, French KR (2015) A five-factor asset pricing model. J Fin Econ 116(1):1–22
16. Fama EF, MacBeth JD (1973) Risk, return, and equilibrium: empirical tests. J Polit Econ 81(3):607–636
17. Feng G, Giglio S, Xiu D (2020) Taming the factor zoo: a test of new factors. J Fin 75(3):1327–1370
18. Gathergood J, Mahoney N, Stewart N, Weber J (2019) How do individuals repay their debt? The balance-matching heuristic. Am Econ Rev 109(3):844–875
19. Gu S, Kelly B, Xiu D (2020) Empirical asset pricing via machine learning. Rev Fin Stud 33(5):2223–2273
20. Hansen S, McMahon M, Prat A (2018) Transparency and deliberation within the FOMC: a computational linguistics approach. Q J Econ 133(2):801–870
21. Hobson JL, Mayew WJ, Venkatachalam M (2012) Analyzing speech to detect financial misreporting. J Account Res 50(2):349–392
22. Hoffman M, Kahn LB, Li D (2018) Discretion in hiring. Q J Econ 133(2):765–800
23. Hou K, Xue C, Zhang L (2015) Digesting anomalies: an investment approach. Rev Fin Stud 28(3):650–705
24. Hu A, Ma S (2020) Human interactions and financial investment: a video-based approach. Available at SSRN
25. Kleinberg J, Lakkaraju H, Leskovec J, Ludwig J, Mullainathan S (2018) Human decisions and machine predictions. Q J Econ 133(1):237–293
26. Kleinberg J, Ludwig J, Mullainathan S, Obermeyer Z (2015) Prediction policy problems. Am Econ Rev 105(5):491–495
27. Kozak S, Nagel S, Santosh S (2020) Shrinking the cross-section. J Fin Econ 135(2):271–292
28. Li K, Mai F, Shen R, Yan X (2019) Measuring corporate culture using machine learning. Available at SSRN 3256608
29. Liu CH, Nowak AD, Smith PS (2020) Asymmetric or incomplete information about asset values? Rev Fin Stud 33(7):2898–2936
30. Lopez de Prado M (2018) Advances in financial machine learning. Wiley
31. Mayew WJ, Venkatachalam M (2012) The power of voice: managerial affective states and future firm performance. J Fin 67(1):1–43
32. Meehl PE (1954) Clinical versus statistical prediction: a theoretical analysis and a review of the evidence. University of Minnesota Press
33. Mullainathan S, Obermeyer Z (2017) Does machine learning automate moral hazard and error? Am Econ Rev 107(5):476–480
34. Munnell AH, Tootell GM, Browne LE, McEneaney J (1996) Mortgage lending in Boston: interpreting HMDA data. Am Econ Rev 86(1):25–53
35. Obaid K, Pukthuanthong K (2019) A picture is worth a thousand words: measuring investor sentiment by combining machine learning and photos from news. Available at SSRN 3297930
36. Quintanar SM (2017) Man vs. machine: an investigation of speeding ticket disparities based on gender and race. J Appl Econ 20(1):1–28
37. Rossi AG (2018) Predicting stock market returns with machine learning. Working paper
38. Samorani M, Harris S, Blount LG, Lu H, Santoro MA (2020) Overbooked and overlooked: machine learning and racial bias in medical appointment scheduling. Available at SSRN 3467047
39. Varian HR (2014) Big data: new tricks for econometrics. J Econ Perspect 28(2):3–28
Price-Bands: A Technical Tool for Stock Trading

Jinwook Lee, Joonhee Lee, and András Prékopa

A. Prékopa: Deceased 18 September, 2016.
Abstract Given a stochastic process with known finite-dimensional distributions, we construct lower and upper bounds within which future values of the stochastic process run, at a fixed probability level. In the financial trading business, such a set of bounds is called "price-bands" or "trading-bands" and can be used as an indicator for successfully buying or short-selling shares of stock. In this chapter, we present a mathematical model for the novel construction of price-bands using a stochastic programming formulation. Numerical examples using recent US stock market intraday data are presented. Keywords Intraday data · Technical analysis · Price-band · Trading strategy · Stochastic process · Binomial moments · Robo-advisor
1 Introduction

Stock trading can be approached in a multitude of ways, and the one fact practitioners agree upon is that there is no clear and easy way to outperform the market consistently. It is intuitively appealing that a company should have an "intrinsic value," and that in the long run the stock price would converge to this value. This is the conventional wisdom of so-called fundamental analysis, and the valuation of the company amounts to estimating this unknown "intrinsic value." This process is concerned mostly with the economic climate, interest rates, products, earnings, management, etc. There are
a large number of valuation studies, and many of them are actually being used in financial practice. On the other hand, technical (or quantitative) analysis is an evaluation process of securities based entirely on charting patterns, statistical approaches, and/or mathematical formulae. The technical analysis approach is particularly suited for short-term investing. In a certain sense, this amounts to an analysis of crowd psychology and behavior as well as investor philosophy, since it is believed that short-term patterns and trends result primarily from decisions by human investors. The tacit assumption underlying technical analysis is that a future price can be predicted by quantitative analysis of the past price movement. However, the efficient-market hypothesis (e.g., see [2, 3], etc.) asserts that no one can consistently outperform the market, since the market incorporates all information instantaneously. On the other hand, behavioral economists criticize the efficiency of the market for many reasons, such as irrationality of investors and information asymmetry. For more details, we refer the reader to the literature, e.g., see [5, 9], etc. In this chapter, we suppose that stock prices are at least partially predictable based on recent market trends. Price patterns can be elusive, and the difficulty is amplified by the sheer complexity of the financial market, as well as market participants’ philosophical and psychological states. We believe that price movement discloses investors’ expectations in light of these (and many more) factors, and in this way accounts for them. There are some clear patterns in the stock market; for example, when a stock is in an up-trend with increasing volume, it is regarded as a sign of an up-market trend. Such patterns can be found by technical analysis, and this motivates the study of such methodologies. In what follows, we present a mathematical model for constructing new and reliable trading-bands (or price-bands) for price forecasting.
2 Construction of Price-Bands

In the stock trading business, many technical tools exist to guide traders through the swarm of information, e.g., trading-bands, envelopes, channels, etc. Bollinger Bands are probably the most popular and successful such model; many traders use them daily for pattern recognition, for augmenting technical trading strategies, and so on (see, e.g., [1, 4]). Bollinger Bands are, in effect, a collection of individual confidence intervals for future stock prices: prices are expected to remain within the bands with a certain probability at each future time, depending on the width of the bands. Observation of a stock price outside the bands is considered a signal for "buying" or "(short) selling." Consider the following simple example. Suppose we observe a stock's price moving below the lower band. Our expectation would then be that the price will go up, moving back into the bands, ceteris paribus. If we make our investment decision solely on the basis of price-bands, we would "buy" while the price is below the lower band and subsequently "sell" within the bands.
Table 1 Bollinger Bands

Upper band = n-day moving average + kσ
Middle band = n-day moving average
Lower band = n-day moving average − kσ

There are several ways to calculate the moving average, e.g., simple moving average, front-weighted moving average, exponential moving average, and so on. The band width is determined by the multiplier k and the standard deviation

\sigma = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{N - 1}},

where x_i is a data point, \bar{x} the average, and N the number of points. The multiplier k can be chosen depending on the time period n. Recommended (by Bollinger) width parameters are k = 1.9 if n = 10, k = 2.0 if n = 20, k = 2.1 if n = 50, etc.
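To make Table 1 concrete, the following is a minimal sketch (ours, not from the chapter) of a Bollinger Bands computation in Python with NumPy; the function name `bollinger_bands`, the window n = 20, the multiplier k = 2.0, and the synthetic price path are illustrative choices.

```python
import numpy as np

def bollinger_bands(prices, n=20, k=2.0):
    """Rolling Bollinger Bands per Table 1, using a simple moving average.

    prices: 1-D array of prices.
    Returns (lower, middle, upper) arrays aligned with prices[n-1:].
    """
    prices = np.asarray(prices, dtype=float)
    lower, middle, upper = [], [], []
    for t in range(n - 1, len(prices)):
        window = prices[t - n + 1 : t + 1]
        ma = window.mean()              # n-day moving average
        sigma = window.std(ddof=1)      # sample std with divisor N - 1
        middle.append(ma)
        lower.append(ma - k * sigma)
        upper.append(ma + k * sigma)
    return np.array(lower), np.array(middle), np.array(upper)

# A price below the lower band is read as a "buy" signal,
# a price above the upper band as a "(short) sell" signal.
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, 200))   # synthetic price path
lo, mid, up = bollinger_bands(prices, n=20, k=2.0)
tail = prices[19:]                                 # align with bands (window n = 20)
signal = np.where(tail < lo, "buy", np.where(tail > up, "sell", "hold"))
```

The signal rule in the last line is exactly the band logic described above; everything else in a real strategy (position sizing, exits) lies outside this sketch.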
Consider, on the other hand, the case that a stock price is observed above the upper band. As one might expect, the proper action would be to "short sell" above the upper band and then "buy" within the bands. For completeness, we present the Bollinger Bands formulae in Table 1. Bollinger uses the mean and standard deviation to create price-bands as in Table 1, where the mean can be thought of as a central tendency and the standard deviation as its volatility, the latter determining the width of the bands. Like Bollinger, we assume the stock price over a time period to be a Gaussian process. However, rather than using the simple mean and variance, we use the conditional mean and conditional variance. By conditioning on recent historical stock price data, we construct price-bands that are more sensitive to recent market information than Bollinger Bands.

Let the stochastic process X(s) be the stock price at time s, and let I_τ = {X(τ_1), X(τ_2), ..., X(τ_N)}, τ_1 < τ_2 < ... < τ_N, be a sequence of past stock prices at times τ_1, ..., τ_N (see Fig. 1). Then, given the information set I_τ, the probability p of the future stock prices running within [a_1, b_1], ..., [a_l, b_l] at times t_1 < ... < t_l is

p = P(a_i \le X(t_i) \le b_i, \; i = 1, \ldots, l \mid I_\tau).   (1)
If we use the past N (equally spaced) data points, say, to predict future price changes at n time points, then (1) has the following practical meaning: with probability p, we can expect the future stock price to run within the lower and upper bounds a_i, b_i, i = 1, ..., l. These bounds are paramount in trading strategy, and in this chapter we present an efficient method of computing them. Let us assume that the stochastic process {X(t), t ≥ 0} is Gaussian. Let μ = (μ_1, ..., μ_n)^T denote the expectation of the random vector X = (X(t_1), ..., X(t_n))^T. Then the covariance matrix Σ = (Σ_{ij}) is defined by
Fig. 1 Past, present, future timeline
\Sigma_{ij} = E\left[(X(t_i) - \mu_i)(X(t_j) - \mu_j)\right].   (2)
Let us recall that the p.d.f. of a Gaussian model has a closed form. Let X = (X_1, ..., X_p) be a random vector and Σ its covariance matrix. Suppose we are interested in the conditional distribution of X_A = (X_1, ..., X_k) given X_B = (X_{k+1}, ..., X_p). Let μ_A and μ_B denote the corresponding expectation vectors of X_A and X_B, respectively. We partition the covariance matrix as

\Sigma = \begin{pmatrix} \Sigma_{AA} & \Sigma_{AB} \\ \Sigma_{BA} & \Sigma_{BB} \end{pmatrix}.

Then we have

X_A \mid X_B \sim N_k\left(\mu_A + \Sigma_{AB}\Sigma_{BB}^{-1}(X_B - \mu_B), \; \Sigma_{AA} - \Sigma_{AB}\Sigma_{BB}^{-1}\Sigma_{BA}\right),

where k denotes the dimension of the conditional distribution. For the price-bands construction, let X^P(τ_j) denote the random stock price on day j, j = 1, ..., m, where the superscript P stands for "Past." We form the random vector

X_P = \left(X^P(\tau_1), \ldots, X^P(\tau_m)\right)^T.   (3)
Let e_j^P denote the expected stock price on day j, j = 1, ..., m. Then the expectation vector of (3), an m × 1 vector, can be written as

e_P = E(X_P) = \left(e_1^P, \ldots, e_m^P\right)^T.   (4)
For the future random variables, we use the superscript F for "Future." We form the future random vector

X_F = \left(X^F(t_1), \ldots, X^F(t_n)\right)^T,   (5)

and its expectation vector

e_F = E(X_F) = \left(e_1^F, \ldots, e_n^F\right)^T.   (6)
Using the notations (3), (4), (5), and (6), let us define the components of the covariance matrix of the random variables X^P(τ_j), j = 1, ..., m, and X^F(t_i), i = 1, ..., n, by

S = E[(X_F - e_F)(X_F - e_F)^T]
U = E[(X_F - e_F)(X_P - e_P)^T]   (7)
T = E[(X_P - e_P)(X_P - e_P)^T].

Then the covariance matrix C can be written as

C = \begin{pmatrix} S & U \\ U^T & T \end{pmatrix}.   (8)
In our setting, given that X_P = x_P, X_F has a normal distribution with "conditional" expectation vector

e_C = e_F + U T^{-1} (x_P - e_P),   (9)

and covariance matrix

S - U T^{-1} U^T.   (10)
The conditional probability density of X_F, given X_P = x_P, can be written as

f(x_F \mid x_P) = \sqrt{\frac{\det\left[(S - UT^{-1}U^T)^{-1}\right]}{(2\pi)^N}} \exp\left(-\tfrac{1}{2}(x_F - e_C)^T (S - UT^{-1}U^T)^{-1} (x_F - e_C)\right).   (11)
We want to predict upper and lower bounds of a stock price for the next business day, based on historical data over a certain time period. Since we are interested in trading on the next day, X_F is a random variable, not a random vector. As S is the variance of X_F, the S in the covariance matrix (8) is a number, and hence the conditional covariance S - U T^{-1} U^T is a number. Given X_P = x_P, let us denote the standard deviation of X_F by

\sigma_C = \sqrt{S - U T^{-1} U^T}.   (12)
Then, after the manner of Bollinger, we can construct our trading-bands, but with the conditional variance (12) and conditional expectation (9) as inputs (Fig. 2).

Definition 1 (Prékopa-Lee Bands) Based on n intraday data points from each of the past m days, we construct next-day trading-bands by predicting the volatility σ_C = \sqrt{S - UT^{-1}U^T} and the average price e_C = e_F + UT^{-1}(x_P - e_P) as follows:

Upper band = e_C + kσ_C
Middle band = e_C   (13)
Lower band = e_C − kσ_C,
Fig. 2 Description of Gaussian process
where k is a multiplier that can be chosen according to the investor's preference (or risk tolerance level): the larger k, the wider the bands. Together with σ_C, k determines the width of the bands, while the central tendency e_C determines the moving direction of the bands.

To construct the bands, we need the values of σ_C = \sqrt{S - UT^{-1}U^T} and e_C = e_F + UT^{-1}(x_P - e_P). Since the stock price distribution of the next business day is unknown, there is no way to find the exact values of σ_C and e_C. However, given X_P = x_P, reasonable upper and lower bounds for the volatility σ_C can be calculated. Among the terms in the expression for σ_C, the "Future" random variable X_F appears in S = E[(X_F - e_F)(X_F - e_F)^T] and U = E[(X_F - e_F)(X_P - e_P)^T]. For the next-day variance S, we use the sample variance (as in [1]):

S = \frac{\sum_{i=1}^{N} (x_i^P - \bar{x})^2}{N - 1},   (14)
where x_i^P, i = 1, ..., N, are all the given data points (n intraday data points on each of the past m days, so the total number of data points is N = mn), and \bar{x} is their mean. Over a short period of time, say up to 30 days, estimating S by (14) is known to be acceptable in practice; more importantly, the practical success of Bollinger Bands shows that (14) works well with a proper multiplier k.

The cross-covariance U represents the relationship between past and future stock prices. In the short term (here, at most ten days), a myriad of factors can materially contribute to wildly varying degrees. For example, investor sentiment can, and often does, shift dramatically as herd psychology sweeps the market. When patterns do exist, distinguishing them in an actionable fashion from historical time series data alone is virtually impossible.
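The band construction of Definition 1 can be sketched in code; the Python fragment below is ours, not the authors'. It computes e_C and σ_C from eqs. (9) and (12), with T and S estimated by (19) and (14); crucially, it proxies the unknown future vector z of eq. (20) by the last day's demeaned intraday pattern, and takes e_F to be the last close. Both proxies are exactly the kind of naive historical estimate that the next paragraph criticizes, and which Sections 3-4 replace by optimized bounds.

```python
import numpy as np

def prekopa_lee_bands(intraday, closes, k=2.0):
    """Sketch of Prekopa-Lee bands (Definition 1; eqs. (9), (12)-(14), (19), (20)).

    intraday: (m, n) array of n intraday prices on each of the past m days.
    closes:   (m,) array of observed daily prices, taken as the realization x_P.
    """
    m, n = intraday.shape
    e_P = intraday.mean(axis=1)                 # estimated daily means e_j^P
    YT = intraday.T - e_P[None, :]              # the text's Y^T, an (n, m) matrix (16)
    T = YT.T @ YT                               # T = Y Y^T, an (m, m) matrix (19)
    xbar = intraday.mean()
    S = ((intraday - xbar) ** 2).sum() / (m * n - 1)    # sample variance (14)
    # Assumption (ours): proxy the unknown future vector z by the last day's
    # demeaned pattern; Sections 3-4 bound sigma_C over all feasible z instead.
    z = YT[:, -1]
    U = (z @ YT) / (n - 1)                      # cross-covariance row vector (20)
    Tinv = np.linalg.inv(T + 1e-8 * np.eye(m))  # small ridge for numerical stability
    e_F = closes[-1]                            # assumption: next-day expectation = last close
    e_C = e_F + U @ Tinv @ (closes - e_P)       # conditional mean (9)
    sigma_C = np.sqrt(max(S - U @ Tinv @ U, 0.0))   # conditional std (12)
    return e_C - k * sigma_C, e_C, e_C + k * sigma_C  # lower, middle, upper (13)
```

The clamp to zero guards against a negative variance estimate, which can occur because U here is a proxy rather than the true cross-covariance.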
However, the cross-covariance matrix U is typically estimated from historical time series via linear regression. This approach effectively averages out the myriad factors, alluded to above, that drive short-term fluctuations; indeed, this averaging is part of the value of the approach in long-term forecasting. For the purpose of short-term technical trading, however, a more detailed approach is required. For this reason, an effective treatment of U = E[(X_F - e_F)(X_P - e_P)^T] is paramount to obtaining the upper and lower bounds of the conditional volatility σ_C. To this end, suppose we choose the cross-covariance so as to maximize or minimize the conditional variance (next-day volatility). Given n intraday data points z_i, measured at equally spaced time points, there are n! possible orderings of their values:

z_1 < z_2 < \cdots < z_{n-1} < z_n
z_1 < z_2 < \cdots < z_n < z_{n-1}   (15)
⋮
z_n < z_{n-1} < \cdots < z_2 < z_1.

As n grows, the complexity of bounding σ_C therefore grows factorially. Thus, in order to obtain the upper and lower bounds of σ_C efficiently, we propose a stochastic programming formulation.
3 Stochastic Combinatorial Optimization Problem Formulation

We want to find reasonable lower and upper bounds of the next day's stock price. The unit time period could be a day, a week, or a month, depending on the investment strategy, i.e., on the preferred horizon for realizing profit. Generally, price-bands are used for day trading, short-term investments, etc. For short-term trading, we use intraday stock price data. Suppose that, for the next day's stock trading, we use n intraday data points per day with historical data for the past m days. Let us define the "Past Data Matrix" by

Y^T = \begin{pmatrix} y_{11} & y_{21} & \cdots & y_{m1} \\ y_{12} & y_{22} & \cdots & y_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ y_{1n} & y_{2n} & \cdots & y_{mn} \end{pmatrix} = \begin{pmatrix} x_{11}^P - e_1^P & x_{21}^P - e_2^P & \cdots & x_{m1}^P - e_m^P \\ x_{12}^P - e_1^P & x_{22}^P - e_2^P & \cdots & x_{m2}^P - e_m^P \\ \vdots & \vdots & \ddots & \vdots \\ x_{1n}^P - e_1^P & x_{2n}^P - e_2^P & \cdots & x_{mn}^P - e_m^P \end{pmatrix},   (16)

where x_{ji}^P denotes the past stock price at the ith time point on day j, and e_j^P denotes the estimated average stock price on day j, j = 1, ..., m (i.e., e_j^P = \frac{1}{n}\sum_{i=1}^{n} x_{ji}^P).
Since we are interested in the "unknown" stock prices of the next business day, let us define the "Future Data Vector" by

z = (z_1, \ldots, z_n) = (X_1^F - e_F, X_2^F - e_F, \ldots, X_n^F - e_F),   (17)

where the random variables X_i^F, i = 1, ..., n, are normally distributed with mean e_F and variance S and denote the stock price at the ith time point of the next day. In (16) and (17), Y^T is an n × m matrix, and z is a 1 × n row vector. The components of the covariance matrix

C = \begin{pmatrix} S & U \\ U^T & T \end{pmatrix}   (18)
need to be represented by means of the "Past Data Matrix" and the "Future Data Vector" of (16) and (17), respectively. Using the "Past Data Matrix" Y^T, the matrix T = E[(X_P - e_P)(X_P - e_P)^T] can be estimated by

T = Y Y^T,   (19)

which is an m × m matrix. We now turn our attention to the other components of C, namely S and U, which involve the "Future" random variables X^F. As in [1], we estimate the next day's variance S by (14). The future is unpredictable, especially in short-term trading, so we cannot predict the next day's volatility from whatever patterns or market trends (if any) exist in the past couple of days. Thus the bridge between past and future must be examined closely, and the randomness must be retained for the best selection. The cross-covariance U = E[(X_F - e_F)(X_P - e_P)^T] can be written, although its value is unknown, as

U = \frac{1}{n-1} \sum_{i=1}^{n} z_i [Y^T]_i,   (20)

where z_i, i = 1, ..., n, denotes the ith component of the "Future Data Vector" z in (17), and [Y^T]_i, i = 1, ..., n, denotes the ith row of the Past Data Matrix Y^T in (16). Note that U is a 1 × m row vector. By (14), (19), and (20), the conditional variance S - U T^{-1} U^T (i.e., the variance of X_F | X_P = x_P) can be written as

\frac{\sum_{i=1}^{N}(x_i^P - \bar{x})^2}{N-1} - \left(\frac{\sum_{i=1}^{n} z_i [Y^T]_i}{n-1}\right) (YY^T)^{-1} \left(\frac{\sum_{i=1}^{n} z_i [Y^T]_i}{n-1}\right)^T,   (21)
where \bar{x} in the first term is the mean of all past data points and N = mn (m days, n intraday data points per day). In the estimated conditional variance (21), the z_i, i = 1, ..., n, i.e., the "Future Data Vector" z = (z_1, ..., z_n), are the only unknowns, and we seek them through a suitable optimization problem formulation. Since z_i = X_i^F - e_F ~ N(0, S), it is reasonable to assume that -4\sqrt{S} \le z_i \le 4\sqrt{S}, i = 1, ..., n.

We want to find the minimum and maximum values of (21), i.e., lower and upper bounds of the variance of X_F | X_P = x_P (the next day's volatility). For meaningful bounds on (21), the future data points z_i, i = 1, ..., n, and their relations to the given past data points at each time i (i.e., the rows [Y^T]_i, i = 1, ..., n) are essential. Thus, all possible cases must be examined for a reasonable selection of the "Future Data Vector" z = (z_1, ..., z_n).

Let f be the p.d.f. of X_i^F - e_F, normally distributed with mean 0 and variance S for all i = 1, ..., n. Let z^{(k)} denote the kth largest element of (z_1, ..., z_n), i.e., z^{(n)} ≤ z^{(n-1)} ≤ ... ≤ z^{(1)}. Similarly, let x_j^{(k)} denote the kth largest element (i.e., x_j^{(n)} ≤ x_j^{(n-1)} ≤ ... ≤ x_j^{(1)}) of (x_{j1}, ..., x_{jn}), the n intraday data points on day j, j = 1, ..., m. Let e_j^P and σ_j^P denote the expectation and standard deviation of the normal random variable X_j^P on day j, and let g_j denote the p.d.f. of X_j^P - e_j^P ~ N(0, (σ_j^P)^2), j = 1, ..., m. Then we define the probabilities p_j^{(i)}, at time i = 0, ..., n on day j = 1, ..., m, by

p_j^{(0)} = \int_{y_j^{(1)}}^{\infty} g_j(t)\,dt,
p_j^{(i)} = \int_{y_j^{(i+1)}}^{y_j^{(i)}} g_j(t)\,dt, \quad i = 1, \ldots, n-1,   (22)
p_j^{(n)} = \int_{-\infty}^{y_j^{(n)}} g_j(t)\,dt,

where y_j^{(n)} ≤ y_j^{(n-1)} ≤ ... ≤ y_j^{(1)} are the stock prices of day j, j = 1, ..., m, so that \sum_{i=0}^{n} p_j^{(i)} = 1. See Fig. 3 for a description of the p_j^{(i)}, i = 0, ..., n.
Fig. 3 Description of data points and their corresponding probabilities
Based on the stock price data of the past m days, we want to assign reasonable intervals to the probabilities concerning each of the next day's data points z_i, i = 1, ..., n. For this reason, using (22), we set lower and upper bounds on the probabilities concerning the ordered future data points z^{(n)} ≤ ... ≤ z^{(1)} in the following way:

l_0 = \min\{p_1^{(0)}, \ldots, p_m^{(0)}\} \le \int_{z^{(1)}}^{\infty} f(t)\,dt \le \max\{p_1^{(0)}, \ldots, p_m^{(0)}\} = u_0
l_1 = \min\{p_1^{(1)}, \ldots, p_m^{(1)}\} \le \int_{z^{(2)}}^{z^{(1)}} f(t)\,dt \le \max\{p_1^{(1)}, \ldots, p_m^{(1)}\} = u_1   (23)
⋮
l_n = \min\{p_1^{(n)}, \ldots, p_m^{(n)}\} \le \int_{-\infty}^{z^{(n)}} f(t)\,dt \le \max\{p_1^{(n)}, \ldots, p_m^{(n)}\} = u_n,

where z^{(k)} denotes the kth largest element of (z_1, ..., z_n), and f is the p.d.f. of z_i = X_i^F - e_F, normally distributed with mean 0 and variance S for all i = 1, ..., n.
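As an illustration of (22) and (23), the following Python sketch (ours, using SciPy's normal CDF) computes the per-day probabilities p_j^{(i)} from the ordered intraday observations and then the bounds l_i, u_i. Two assumptions are ours: each day's observations are demeaned by e_j^P before ordering, and the day's standard deviation σ_j^P is estimated by the day's sample standard deviation.

```python
import numpy as np
from scipy.stats import norm

def probability_bounds(intraday):
    """Compute p_j^{(i)} of eq. (22) and the bounds (l_i, u_i) of eq. (23).

    intraday: (m, n) array of n intraday prices on each of m past days.
    Returns l, u: arrays of length n + 1.
    """
    m, n = intraday.shape
    P = np.empty((m, n + 1))                     # P[j, i] = p_j^{(i)}
    for j in range(m):
        day = intraday[j]
        y = np.sort(day - day.mean())[::-1]      # demeaned, y^{(1)} >= ... >= y^{(n)}
        sigma_j = day.std(ddof=1)                # day-j volatility estimate (assumption)
        cdf = norm.cdf(y, loc=0.0, scale=sigma_j)
        P[j, 0] = 1.0 - cdf[0]                   # mass above y^{(1)}
        P[j, 1:n] = cdf[:-1] - cdf[1:]           # mass between consecutive order statistics
        P[j, n] = cdf[-1]                        # mass below y^{(n)}
    return P.min(axis=0), P.max(axis=0)          # l_i, u_i taken over the m days
```

Each row of P sums to one, matching the condition \sum_{i=0}^{n} p_j^{(i)} = 1 stated after (22).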
With condition (23), we can write the min-max problem formulation as follows:

\min(\max) \quad \frac{\sum_{i=1}^{N}(x_i^P - \mu)^2}{N-1} - \left(\frac{\sum_{i=1}^{n} z_i [Y^T]_i}{n-1}\right) (YY^T)^{-1} \left(\frac{\sum_{i=1}^{n} z_i [Y^T]_i}{n-1}\right)^T

subject to

l_0 \le \int_{z^{(1)}}^{\infty} f(t)\,dt \le u_0
l_1 \le \int_{z^{(2)}}^{z^{(1)}} f(t)\,dt \le u_1
⋮   (24)
l_n \le \int_{-\infty}^{z^{(n)}} f(t)\,dt \le u_n
-4\sqrt{S} \le z_i \le 4\sqrt{S}, \quad i = 1, \ldots, n
z^{(k)} = \text{kth largest element of } (z_1, \ldots, z_n),

where f is the p.d.f. of the z_i, i = 1, ..., n, and S = \frac{\sum_{i=1}^{N}(x_i^P - \mu)^2}{N-1} (the first term of the objective function) is the estimated variance of z_1, ..., z_n.

Formulation (24) can be called a stochastic combinatorial optimization problem on account of the last constraint, which requires counting all outcomes and making a proper selection among them. Decision making in a finite sample space is very often a counting problem. There are n! permutations of the set {z_1, ..., z_n}, i.e., n! choices of a total ordering of the intraday data points. To find the minimum and maximum of problem (24), all n! permutations should be enumerated and examined (each must simultaneously satisfy all the constraints to be feasible). In order to count all of the n! cases, among various methods we propose a set representation of the n intraday data points; in this way, all possible cases can be counted more systematically, allowing faster subsequent computation. (For very small n, the enumeration can also be carried out directly, as sketched below.)
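The brute-force sketch below is ours, for illustration only; it is exactly what the set-theoretic reformulation is designed to avoid. It takes a candidate set of future values (our own grid over ±2\sqrt{S}), checks feasibility against the probability constraints (23), which depend only on the sorted values, and then searches over all n! assignments of those values to time slots (the orderings of (15)), keeping the extreme values of the conditional variance (21).

```python
import numpy as np
from itertools import permutations
from scipy.stats import norm

def conditional_variance(z, YT, S):
    """Objective (21): S - U T^{-1} U^T, with U from (20) and T = Y Y^T from (19)."""
    n, m = YT.shape
    U = (z @ YT) / (n - 1)
    Tinv = np.linalg.inv(YT.T @ YT + 1e-8 * np.eye(m))  # ridge for stability
    return S - U @ Tinv @ U

def brute_force_bounds(YT, S, l, u, values=None):
    """Min/max of (21) over all n! orderings of a candidate value set."""
    n, m = YT.shape
    if values is None:
        values = np.sqrt(S) * np.linspace(-2.0, 2.0, n)  # candidate magnitudes (ours)
    zs = np.sort(values)[::-1]                      # z^{(1)} >= ... >= z^{(n)}
    cdf = norm.cdf(zs, scale=np.sqrt(S))            # f = N(0, S), as in the text
    p = np.concatenate(([1.0 - cdf[0]], cdf[:-1] - cdf[1:], [cdf[-1]]))
    if not (np.all(l <= p) and np.all(p <= u)):     # probability constraints (23)
        return None                                 # this candidate value set is infeasible
    objs = [conditional_variance(np.array(perm), YT, S)
            for perm in permutations(values)]       # all n! orderings of (15)
    return min(objs), max(objs)
```

Even for n = 10 this loop already visits 3,628,800 orderings, which is the factorial blow-up that motivates the formulation of the next section.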
Let us recall a few basics from set theory and combinatorics. If A is any set and (A_1, ..., A_k) is a sequence of subsets of A satisfying

(i) each A_i ≠ ∅,
(ii) A_i ∩ A_j = ∅ for all i ≠ j,
(iii) A_1 ∪ ... ∪ A_k = A,

then (A_1, ..., A_k) is called an ordered partition of A with k blocks. Let P_n denote the number of ordered partitions of an n-set. Then P_0 = 1 and, for any positive integer n,

P_n = \sum_{k=1}^{n} \binom{n}{k} P_{n-k} = \sum_{j=0}^{n-1} \binom{n}{j} P_j.   (25)
We observe that P_n is expressed in terms of binomial coefficients, which are central in combinatorics and very useful in counting problems. In general, for any sequence of finite sets, the principle of inclusion and exclusion can be used. This is the fundamental motivation for the set-theoretic approach: formulation (24) can be written in a more (mathematically) compact form by means of an ordered partition of a finite set and binomial coefficients. By representing the "Future Data Points" z = (z_1, ..., z_n) as "ordered partitions with n nonempty blocks" of the set {z_1, ..., z_n}, we can reformulate the problem as a modified binomial moment problem, which effectively accounts for all possible cases without enumerating them one by one. Recurrence (25) itself is easy to evaluate, as sketched below.
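The counts P_n are the ordered Bell (Fubini) numbers, which grow even faster than n! and quantify the cost of naive enumeration. A minimal Python sketch of recurrence (25):

```python
from math import comb

def ordered_partition_counts(n_max):
    """Number P_n of ordered partitions of an n-set, via recurrence (25)."""
    P = [1]                                   # P_0 = 1
    for n in range(1, n_max + 1):
        P.append(sum(comb(n, k) * P[n - k] for k in range(1, n + 1)))
    return P

# P_0..P_6 = [1, 1, 3, 13, 75, 541, 4683] -- the ordered Bell (Fubini) numbers
print(ordered_partition_counts(6))
```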
4 Binomial Moment Problem Formulation

Utilizing a binomial moment scheme, problem (24) can be reformulated in a systematic form. For details about the binomial moment scheme, we refer the reader to the literature, e.g., Prékopa [6-8]. For completeness, we present some basic definitions here. Let ν designate the number of events among A_1, ..., A_n that occur, and let v_i = P(ν = i), i = 0, ..., n. Then

\sum_{i=0}^{n} \binom{i}{k} v_i = S_k, \quad k = 0, 1, \ldots, n,   (26)

where, by definition, S_k = E\left[\binom{\nu}{k}\right], k = 0, ..., n. Essentially, the binomial moment problem formulation optimizes an objective function using a counting method that leverages the inclusion-exclusion principle (a numerical check of (26) is sketched after (28) below). For a suitable representation of the future data points z = (z_1, ..., z_n) as ordered partitions with n blocks, we define the sets

A_j = \{t \mid t \le z^{(j)}\}, \quad j = 1, \ldots, n,   (27)

where z^{(j)} is the jth largest among the next day's n intraday data points z_1, z_2, ..., z_n, i.e., z^{(n)} ≤ z^{(n-1)} ≤ ... ≤ z^{(1)}. Note that A_n ⊆ A_{n-1} ⊆ ... ⊆ A_1. Let us introduce the set A = A_1 = \{t \mid t \le z^{(1)}\} and the sets

B_1 = A_1 \setminus A_2
B_2 = A_2 \setminus A_3
B_3 = A_3 \setminus A_4   (28)
⋮
B_n = A_n.
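Identity (26) can be checked numerically; the sketch below is ours, with an arbitrary choice of event probabilities. It simulates events A_1, ..., A_n, estimates the distribution v_i of the number ν of occurring events, and compares \sum_i \binom{i}{k} v_i with S_k = E[\binom{ν}{k}]. Since both sides are computed from the same sample, they agree exactly, which is just (26) being an identity in the v_i.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1)
n, trials = 5, 200_000
probs = rng.uniform(0.2, 0.8, size=n)            # arbitrary event probabilities (ours)
occur = rng.random((trials, n)) < probs          # (26) holds under any dependence structure
nu = occur.sum(axis=1)                           # nu = number of events that occur

v = np.array([(nu == i).mean() for i in range(n + 1)])   # v_i = P(nu = i), estimated
for k in range(n + 1):
    lhs = sum(comb(i, k) * v[i] for i in range(n + 1))   # sum_i C(i, k) v_i
    rhs = np.mean([comb(int(x), k) for x in nu])         # S_k = E[C(nu, k)]
    print(k, round(lhs, 4), round(rhs, 4))               # equal, per eq. (26)
```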
Fig. 4 Description of data points and their corresponding probabilities
Then (B_1, B_2, ..., B_n), as constructed in (28), is a sequence of subsets of A satisfying

(i) each B_i ≠ ∅,
(ii) B_i ∩ B_j = ∅ for all i ≠ j,
(iii) B_1 ∪ ... ∪ B_n = A,

and thus (B_1, ..., B_n) is an ordered partition of A with n blocks. Let us introduce, for k = 1, ..., n, the functions

S_k(z) = \sum_{1 \le i_1 < \cdots < i_k \le n} P\left(z^{(i_1)} \ge \eta, \ldots, z^{(i_k)} \ge \eta\right)