Blockchain Gaps: From Myth to Real Life (Future of Business and Finance) [1st ed. 2021] 9813360518, 9789813360518

This book analyzes the fundamental issues faced when blockchain technology is applied to real-life applications. These c

278 53 4MB

English Pages 122 [114] Year 2021

Report DMCA / Copyright

DOWNLOAD PDF FILE

Table of contents :
Preface
Scope of Blockchain Technology is Not Limited to Finance
Like the “Internet Bubble” in the Early 2000s
Revisit the History of the Internet
Construction of this Book
Contents
Editors and Contributors
1 Fundamentals of Blockchains
1 Blockchain Types
2 Data Structure of Blockchains
3 Characteristics of Blockchains
4 Elements that Make the Operation of Blockchains Possible
5 The Power Relationship Among Network Participants
6 Issues to Overcome
2 Proof-of-Work: A Consensus Mechanism to Achieve Consistency
1 Three Requirements for Archiving Systems
2 Preventing Unwanted Consequences
3 Crypto-Puzzles in Detail
4 Smart Contracts in Ethereum
5 Fairness and Transparency in Blockchain
3 Misunderstandings and Expectations of the Blockchain Created by Bitcoin
1 A Beautiful Misunderstanding of the Blockchain
2 Is It Possible to Allow Privacy?
3 The Difficulty of Building Safe Mechanisms Using Computing Power
4 Understanding the Weaknesses and Considering Where to Apply Blockchain
4 Challenges Blockchain Technology Faces
1 Characteristics of Blockchain
2 Requirements Imply Issues
5 Are Blockchains Trustless?
1 The Essence of “Trust”
2 The Conditions for a System to Be Trusted
3 “Trust Framework” as the Mechanism to Create “Trust” for SMEs
4 There is no Mention of “Trustless” in Satoshi Nakamoto’s Paper
5 The Illusion of Being “Trustless”
6 Reject False Expectations and use Blockchain with a Realistic Perspective
References
6 The Bitcoin “Consensus” Problems
1 What Bitcoin is Intended to Accomplish
2 What Does This Term “Consensus” Refer To?
3 What is Consensus in a Distributed System?
4 Bitcoin Consensus Has Always Been Ambiguous
5 The Consensus Problem and the Byzantine Generals’ Problem
6 Byzantine Faults Largely Ignored
7 P2P Systems and Byzantine Faults
8 Bitcoin and the Byzantine Generals’ Problem
9 Humans Consent is a Must for the System’s Administrative Policy
10 We Need to Clearly Specify Who is Agreeing With Whom About What
References
7 The Myth of “Blockchain is Scalable” and Real Challenges
8 Unexpected Pitfalls of Bitcoin
References
9 The Biggest Problem of Blockchains: Key Management
1 Forms of Blockchain Key Management
2 Issues Regarding Key Management by Software
3 Issues Regarding Key Management by a Device
4 Issues Concerning Online Wallets, Crypto-Asset Exchanges, and Other Services
5 Main Methods of Key Management in Blockchains
10 The Cryptographic Technology of Bitcoin Will Eventually Be Broken
1 Long-Term Signature Technology, a Possible Cure for Hash Function Migration
References
11 How We Can Secure Blockchain-Based Systems
1 Lessons Learned from the DAO Attack
2 Another Aspect: Availability
3 Five Layers Ensuring a Secure System Based on Cryptographic Technology
4 Assumptions Between Layers Produce Pitfalls
5 Utilize Knowledge from Existing Systems and Standards
References
12 The Current State of the Global Movement
1 The Maturity of Blockchain Technology
2 Facilitating Collaboration Between Engineers and Academia
3 Academic Activities on the Blockchain
4 International Standardization and Issues
5 Beginning of Multi-Stakeholder Collaboration
Recommend Papers

Blockchain Gaps: From Myth to Real Life (Future of Business and Finance) [1st ed. 2021]
 9813360518, 9789813360518

  • 0 0 0
  • Like this paper and download? You can publish your own PDF file online for free in a few minutes! Sign Up
File loading please wait...
Citation preview

Future of Business and Finance

Shin’ichiro Matsuo Nat Sakimura   Editors

Blockchain Gaps From Myth to Real Life

Future of Business and Finance

The Future of Business and Finance book series features professional works aimed at defining, describing and charting the future trends in these fields. The focus is mainly on strategic directions, technological advances, challenges and solutions which may affect the way we do business tomorrow, including the future of sustainability and governance practices. Mainly written by practitioners, consultants and academic thinkers, the books are intended to spark and inform further discussions and developments.

More information about this series at http://www.springer.com/series/16360

Shin’ichiro Matsuo • Nat Sakimura Editors

Blockchain Gaps From Myth to Real Life

123

Editors Shin’ichiro Matsuo Department of Computer Science Georgetown University Washington, WA, USA

Nat Sakimura Nomura Research Institute Tokyo, Japan

ISSN 2662-2467 ISSN 2662-2475 (electronic) Future of Business and Finance ISBN 978-981-33-6051-8 ISBN 978-981-33-6052-5 (eBook) https://doi.org/10.1007/978-981-33-6052-5 © Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

Blockchain—which is one of the recent buzz words—collects huge attention as a foundation of future innovations. Many of us are expecting their potential power. After 2015, blockchain is often said as a disruptive technology like the Internet.

Scope of Blockchain Technology is Not Limited to Finance The start point of blockchain was the Bitcoin paper proposed by anonymous Satoshi Nakamoto. It was thought of as a technology to change finance. However, regulators started discussions on the regulation of cryptocurrency in 2015. It makes startups and technologists to change their scope to a non-financial application to avoid frictions to regulation. It was the primary reason the blockchain technology, which is a core technology of Bitcoin, collected much attention. Therefore, people see blockchain will be applied to a wider range of applications than “Fintech”. Research and development of blockchain are massively conducted by many companies and organizations, including the global financial industry, IT companies, and governments. On the other hand, there are many doubts about the maturity of blockchain technology. Someone says the maturity of the blockchain technology is as same as the late 1980s of Internet technology. On June 18, 2016, “The DAO Attack” had a huge negative impact on the blockchain community. The DAO (Decentralized Autonomous Organizations) was attacked, and about 50M dollars was leaked. This attack was caused by a naive vulnerability of the programming code of the DAO platform. This incident was a turning point to revisit the maturity of this technology. In 2017, many Initial Coin Offering (ICO) collected huge amounts of money though most of them did not have viable and transparent technology and project. In 2020, a similar situation is occurring for DeFi (Decentralized Finance) businesses and startups. Those situations show that there are many issues not only on technology but also on surrounding businesses.

v

vi

Preface

Like the “Internet Bubble” in the Early 2000s Blockchain technology and its ecosystem have the potential to unbundle the current centralized business structure of data, similarly as the Internet unbundled the centralized communication owned by big telecommunication companies. It could give general people the power to leverage a shared and global ledger to create innovation. However, we think the current progress of the blockchain ecosystem is too quick to make it mature enough as a social foundation. The commercialization of the Internet happened 25 years after the launch of ARPANET. Like Linked In, Social Network Services (SNS) started its services 8 years after the commercialization of the Internet. Creating innovation over a pile of mature technologies takes time. The current bustle around blockchain technology skips such essential steps. It looks like the “Internet Bubble” in the early 2000s. The Internet was said “redundant” and “not scalable” compared with Asynchronous Transfer Mode (ATM) and other advanced communication technology. However, Internet guys, who believe the potential power of the Internet to provide power to the people, solved technology problems without losing its technical merits.

Revisit the History of the Internet Unfortunately, a not small number of the current blockchain projects do not understand the core value of the technology and concentrate on short-term financial gains. We will likely experience the same problems as the Internet bubble. We need to understand the core values, challenges, and gaps to achieve the real potential usefulness of blockchain technology. It is a lesson from the Internet history that continued steady works are needed to win the era of blockchain. This book aims to provide viewpoints to produce genuinely innovative products over blockchain. As same as the Internet is redundant and was called a “stupid” network, blockchain technology is redundant and hard to scale it, though it has massive potential for permissionless innovation. If you would like to grab the fruits of global innovation over blockchain, it provides you real understandings of points of challenge.

Construction of this Book This book first explains the fundamental knowledge and applications of blockchain. Then, it describes essential technology issues and gaps against widely known understandings. This book is written in an omnibus style by academic researchers and engineers engaged in the design and operations of cryptography and its application. The contents of this book were originally written in 2016 and 2017 as a

Preface

vii

series of articles of Nikkei BP. At first glance, it seems old and obsolete contents. However, all of them are fundamental problems of blockchain technology and still not be solved in 2020. These problems happen these a few years. It is good evidence of the value of the contents. It took 2 years to be published as a book. The manuscript was deemed to be too negative against blockchain and distributed ledger technology (DLT), while the market perception of DLT was so high. It was only in January 2018 that it could be published as the high expectation against Blockchain and DLT was dissipating. A week later, the CoinCheck incident that has lost over 500 million US dollars’ worth of cryptoasset occurred as it was predicted in the book. The book became one of the best-selling blockchain books for the year. Subsequently, the book was translated into Chinese and Korean. Even in 2020, blockchain technology is rapidly growing. This version, however, is not the same as these versions, as the text now has been revised to include more up-to-date information. The revisions include: Chapter “Fundamentals of Blockchains”: Updated to reflect trends in 2019. Chapter “Are Blockchains Trustless?”: Updated the example incidents to include CoinCheck (2018), etc. Chapter “The Biggest Problem of Blockchains: Key Management”: Updated to reflect trends in 2019 and to further focus on issues in blockchain. We hope this book helps you to create new innovation and a peaceful world over the blockchain. Washington, USA Tokyo, Japan

Shin’ichiro Matsuo Nat Sakimura

Contents

Fundamentals of Blockchains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Masashi Sato

1

Proof-of-Work: A Consensus Mechanism to Achieve Consistency . . . . . Kazue Sako

9

Misunderstandings and Expectations of the Blockchain Created by Bitcoin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Masanori Kusunoki

17

Challenges Blockchain Technology Faces . . . . . . . . . . . . . . . . . . . . . . . . Shin’ichiro Matsuo

25

Are Blockchains Trustless? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nat Sakimura

31

The Bitcoin “Consensus” Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shinichi Miyazawa

39

The Myth of “Blockchain is Scalable” and Real Challenges . . . . . . . . . . Masanori Kusunoki

59

Unexpected Pitfalls of Bitcoin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kazue Sako and Ryo Furukawa

67

The Biggest Problem of Blockchains: Key Management . . . . . . . . . . . . Masashi Sato

75

The Cryptographic Technology of Bitcoin Will Eventually Be Broken . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Masashi Sato How We Can Secure Blockchain-Based Systems . . . . . . . . . . . . . . . . . . Shin’ichiro Matsuo

85 95

The Current State of the Global Movement . . . . . . . . . . . . . . . . . . . . . . 105 Shin’ichiro Matsuo

ix

Editors and Contributors

About the Editors Dr. Shin’ichiro Matsuo is a Research Scientist in Cryptography and Information Security. He is working on maturing blockchain technology from the academia side and presents research results on blockchain security. At Georgetown University, he directs the CyberSMART Research Center and leads multi-disciplinary research among technology, economy, law, and regulation. He also leads international research collaboration on blockchain and founded BASE (Blockchain Academic Synergized Environment) alliance with the University of Tokyo and Keio University. In 2019, he co-founded Blockchain Governance Initiative Network (BGIN) a multi-stakeholder discussion body like IETF, as an initial contributor. He is co-chair of BGIN. He is a co-founder of the BSafe.network, an international and neutral research test network to promote applied academic research in blockchain technologies. He is a part of many program committees on blockchain technology and information security, and a program co-chair of Scaling Bitcoin 2018 Tokyo. He serves as the leader of security standardization project of blockchain (ISO TC307). Previously, he served as the head of Japanese national body of ISO/IEC JTC1 SC27/WG2 for cryptographic techniques, a member of advisory board cryptographic technology for the Japanese government. Nat Sakimura is a well-known identity and privacy standardization architect at NAT Consulting and the Chairman of the Board of the OpenID Foundation. Besides being an author of such widely used standards as JWT (RFC7519), JWS (RFC7515), OAuth PKCE (RFC7636) and OpenID Connect that are used by over 3 billion people, he helps communities organize themselves to realize the ideas around identity and privacy. As the Chairman of the board of the OpenID Foundation, he streamlined the process, bolstered the IPR management, and greatly expanded the breadth of the foundation spanning over 10 working groups whose members include large internet services, mobile operators, financial institutions, governments, etc.

xi

xii

Editors and Contributors

He is also active in public policy space. He has been serving in various committees in the Japanese government including the Personal Data Working Group of the Ministry of Economy, Trade and Industry and the Study Group on the Platform Services of the Ministry of Internal Affairs and Communications.

Contributors Ryo Furukawa NEC Corporation, Tokyo, Japan Masanori Kusunoki Japan Digital Design, Tokyo, Japan Shin’ichiro Matsuo Georgetown University, Washington, DC, USA Shinichi Miyazawa Secom Intelligent Systems (IS) Lab, Tokyo, Japan Nat Sakimura NAT Consulting, Tokyo, Japan Kazue Sako Waseda University, Tokyo, Japan Masashi Sato Secom Co., Ltd, Tokyo, Japan

Fundamentals of Blockchains Masashi Sato

1

Blockchain Types

Many platforms referred to as “blockchains” have been developed in recent years, and there is a growing expectation that these platforms will be expanded to cover various applications across different fields and new businesses. However, while the word “blockchain” is used in numerous situations, not all people share the same definition or concept of this term. Blockchain platforms are provided by different communities and each has its own objectives and design. Below are some examples: (1) Platforms and services implemented on Bitcoin Example: Omni, Counterparty, ColoredCoins, Proof-of-existence (2) New crypto-assets derived from Bitcoin’s implementation and concept Example: Litecoin, Monacoin, DASH, Zcash (3) New platforms generated through the execution of the so-called “smart contract” codes, in addition to the transactions of cryptocurrencies Example: Ethereum, Hyperledger Fabric, NEM (4) Services implemented on the platforms of (3) Example: Everleger, CryptoKitties, REX, uPort (5) Private blockchains designed for specific purposes and participants Example: MultiChain, Hyperledger Fabric, Quorum (*By the time this document is published, several of the projects or services above may still be at the conception or demonstration test stages or may already have been terminated.)

M. Sato (&) Secom Co., Ltd, Tokyo, Japan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_1

1

2

M. Sato

This book is intended to organize the topics involved in the discussion of blockchain. To this end, it is impossible to avoid summarizing the most common concepts concerning the main topic of discussion: blockchain itself. Hence, this chapter extracts the common elements of the most typical so-called “blockchain mechanisms” to organize its concepts. It is justified to say that blockchain platforms, which take many different forms, can be classified from different perspectives. One of these perspectives is the concept of “public” and “private” blockchains. While its definition varies, a “public blockchain” can be roughly defined as a blockchain network that anyone can join and withdraw from. The best-known examples are Bitcoin and Ethereum. On the other hand, blockchain networks that place restrictions or conditions on joining are known as “private blockchains.” Some of these run a blockchain network within a closed network environment with limited external connections; others, like Hyperledger Fabric, use a platform equipped with an access control function. Because public and private blockchains operate using different environments and mechanisms, the issues involved often differ as well. However, unless otherwise noted, this book focuses entirely on public blockchains.

2

Data Structure of Blockchains

For many readers, the first example of a blockchain that comes to mind is Bitcoin. Thus, let us briefly review the concept of blockchains in Bitcoin. A Bitcoin sender creates a transaction containing the address of the receiver and the number of coins to be sent; then, they assign a digital signature to that transaction. This forms a chain of transactions that moves from the first to the next sender. Meanwhile, every transaction considered valid in the Bitcoin network is registered to a ledger (a blockchain), which is accessed by all participants. This ledger functions as the proof-of-existence for valid transactions. The chains of digital signatures associated with the transactions, as well as the chain of hash values in the ledger, protect the transactions from unauthorized rewriting and changes, thus preventing the unauthorized use of Bitcoin. The transactions that take place in the Bitcoin network are consolidated into data structures called “blocks,” which are created at certain intervals. Furthermore, for each of these transactions, a hash function is used to generate a hash value, which in turn is used to create a hash tree (Fig. 1). Then, the hash value of the root of that hash tree is stored in an area inside the block, called a “block header,” and a hash value for the block header is generated again. This is how a block is created. The block header contains the hash value of the block header created immediately before it, thus creating a chain of blocks that extends back to the beginning of the Bitcoin network (hence “blockchain”). Thus, if the data of past transactions are rewritten without authorization, they can be detected via their non-conforming hash values; bundling the hash values as chains makes them more difficult to replace illegally.

Fundamentals of Blockchains

3

Fig. 1 Hash chain of blockchain’s blocks

By maintaining the blockchain’s uniqueness in this way, it is possible to prove the existence of past transactions and maintain the consistency of the entire Bitcoin system. Bitcoin adopts proof-of-work (PoW) as a mechanism for the creation and approval of blocks. Despite differences in the mechanism, many blockchain platforms besides Bitcoin adopt similar hash trees and hash chains to prove the existence of past transactions. On the other hand, platforms such as IOTA, Ripple, and R3 Corda have not adopted a block-based chain structure like that of Bitcoin’s. These platforms are called “distributed ledgers” and are sometimes distinguished from blockchains. In platforms with distributed ledgers, verification and approval are performed for each transaction, which are then added to the ledger held by each node. Though they do not adopt a block-based chain structure, these platforms use digital signatures and hash values to ensure that all transaction history that has been registered is unaltered. This mechanism of data falsification detection and proof-of-existence is not new, and it has already been implemented in previous technologies. An example of a technology that uses a chain of digital signatures to detect data falsification and ensure continuity is known as “hysteresis signature.” Furthermore, ISO/IEC 18014-3, which describes time-stamping services that produce linked tokens, is an example of a proof-of-existence of data, using hash chains and hash trees. Similar time-stamping services are offered by companies such as Surety and Guardtime. Other technologies that use hash trees are Evidence Record Syntax of RFC 4998/RFC 6283 and Certificate Transparency (RFC 6962), although these differ from the time-stamping services mentioned above. Certificate Transparency implements hash trees in the mechanism that saves and publishes the issuance history of Transport Layer Security server certificates. Hash tree is an easy method of proving the existence of multiple pieces of data at once; as a result, it is used in a

4

M. Sato

wide range of technologies. However, the use of a secure cryptographic hash function is an indispensable condition.

3

Characteristics of Blockchains

Thus, technologies that employ mechanisms similar to blockchain existed prior to blockchain itself, which makes it inaccurate, define as a blockchain every mechanism that uses hash chains and hash trees to prove the existence of data. The two main characteristics of blockchain that differentiate it from conventional verification technologies can be summarized as follows: • Blockchain Characteristic No. 1: It offers an environment for transactions and the execution of codes. • Blockchain Characteristic No. 2: It aims for management that does not depend on third-party organizations. Characteristic No. 1 is a functional characteristic common to all platforms. Conventional digital signatures and time-stamping services are intended to ensure the authenticity and prove the existence of general data. For example, a series of documents stored as proof of an electronic contract, data pertaining to intellectual property rights, or healthcare records. The data classification also differs depending on its application. It can be human readable (such as PDF files) or machine readable (such as binary data, XML, or JSON). In contrast, blockchains are designed for the transaction of crypto-assets and smart contracts that feature executable codes; they also offer an environment that executes these processes. A blockchain is a platform equipped on each node with a function that manages a series of processes, from the generation of a transaction to its verification. Some services—such as proof-of-existence—provide similar functionalities as that of conventional time-stamping services by using the blockchain function. Blockchain Characteristic No. 2 is one of the most typical features of blockchains and can be seen in the design concept of all platforms. The conventional time-stamping services that produce linked tokens assume a trusted third-party authority. The time-stamping service generates the hash chains and hash trees, and the hash values needed to verify them are published in newspapers and official gazettes. The time-stamping service relies on the trust that no illegal action takes place during the creation of hash chains and hash trees. To build up that trust, the time-stamping service asks the user to send only the hash value of the document, to make it impossible to alter the document with ill-intention. Other security measures include implementing a tamper-resistant device to prevent falsification during the creation of the hash value-containing timestamp data, and locating the servers in robust facilities such as data centers. Conventional timestamping maintains the immutability of hash chains using this relationship of trust. Meanwhile, in blockchains, the mechanism that ensures the authenticity of the transaction and ledger

Fundamentals of Blockchains

5

through multiple nodes was designed to avoid management from a trusted third-party authority. Blockchain nodes connect to each other and create blocks, they then verify each other with the data of these blocks. This ensures that, even if a single node stops functioning or an illegal action takes place, the consistency of the entire system is maintained, provided the other nodes continue to function correctly. To keep such a system running, each participant of the blockchain network must fulfil their role and operate autonomously. For this reason, particularly on the main public blockchain platforms, it is thought that the blockchain network continues owing to a combination of the motivations of gaining crypto-assets within the platform and the mechanism maintaining the chain of blocks uniquely without divergence. The mechanism designed to ensure the consistency of the chain is called a “consensus algorithm.” The best-known examples of consensus algorithms are the PoW, which has been adopted by many platforms (including Bitcoin); the Proof-of-stake, which is set to be introduced to Ethereum; and the Proof-of-importance, used in NEM. These autonomous systems are likely to feature complex architectures and relationships between their participants. Therefore, it is necessary to consider not only just the relationships between the nodes of the blockchain network but also the software development of the platform and the applications and services built upon it.

4

Elements that Make the Operation of Blockchains Possible

For a blockchain platform to be managed and operated autonomously by its participants, without a third-party organization managing it, the policies and rules concerning the platform must be shared among and followed by all the parties involved. In this context, the parties involved in a blockchain platform are categorized as follows: 1. A community that defines the policies and rules concerning the operation of the blockchain platform 2. A community that defines the specifications of the software used in the blockchain platform 3. A community of developers that implement the software of the blockchain platform 4. A participant that operates the software of 3. (above) and acts as a node (the participant can assume more than one role, including the ones below) 4:1. The role of a client that creates and consults transactions (for example, sending and receiving crypto-assets, deploying smart contracts, requesting to execute smart contracts, etc.)

6

M. Sato

4:2. The role of the creator of a block (for example, a miner) 4:3. The role of a verifier of transactions or blocks/blockchains 4:4. The role that stores the data of blockchains (ledgers) 5. A developer of smart contracts and applications for the blockchain platform 6. A provider of services that use the blockchain platform and 5. The community of Item 1. defines the policies and rules concerning the entire governance and operation of the community; this includes the objectives and concepts of the blockchain platform, its providing range, what is required to participate in a community, and the method of making decisions in software specification development. The level of clarity and detail regarding these policies and rules are likely to vary between communities. Based on those policies and rules, the community of Item 2 makes proposals for the specifications of the software, discusses the necessities, and defines the specification changes along with the version to which those changes are to be applied. The specifications include, for example, how to create and verify a transaction or block, how to save a block, consensus algorithms, and more. Some specifications may be influenced by the objectives and concepts of the platform of Item 1, as is the case for consensus algorithms. In some cases, the community of Item 2 describes the specifications using documents, and the community of Item 3 provides a reference implementation that reflects those specifications. The community of developers of Item 3 implements the software, using the specifications of Item 2. In cases like Ethereum, different communities develop interoperable software. In the initial stages of the projects (particularly in the small ones), the communities of Items 1, 2, and 3 work almost as a single group, and software implementation often takes place only once. The participants of Item 4 execute the software provided in Item 3 and connect it to the blockchain network as a node. Because each node continues to work autonomously, the software specifications determined by the decisions of Items 1 and 2 have a large impact. For example, suppose that a part of the protocol executed in Item 4 concerns the generation of hash chains via a consensus algorithm. It is expected that this protocol adequately divides the nodes involved in the approval of the transactions and blocks, making it difficult to alter transactions and pre-approved blocks. However, if the communities of Items 1 or 2 decide to remove a specific node or to revert a blockchain to its past condition, and if this is incorporated into the software specifications, it may even overrule all agreements made on the network by consensus algorithms. This is exemplified in Ethereum’s decision following leakage of the crypto-assets held by the DAO. There are many different opinions regarding this decision, and we have no intention of drawing any conclusions here. The main point is that the various processes executed on the network of Item 4 also depend on the governance of the communities of Items 1 and 2.

Fundamentals of Blockchains

7

If we consider only the processes on the network of Item 4, is it possible to say with surety that it is decentralized and autonomously distributed? Ensuring sound governance of the community that influences the software specifications is certainly a crucial condition.

5

The Power Relationship Among Network Participants

A power relationship exists between the participants of the network of Item 4. Among the roles of Item 4, the one of block creator (4-2) is especially significant: if a person is able to control the generation of blocks, he or she can use a crypto-asset twice by reversing the past transaction history; they could even stop the other creators from producing blocks and executing transactions. For this reason, various schemes to prevent a user from taking control of block creation have been proposed across many blockchain platforms. Furthermore, the amount of data of the blockchain inevitably increases according to the number of blockchain network users. For this reason, it is believed that the process of each node owning each other’s blockchain data and verifying one another will reach its limit someday. This opens up the possibility of dividing the roles of the participants between verifiers—who hold the blockchain data and are responsible for the verification—and those who delegate the verification to the verifiers. However, this might also create a power relationship between the participants. A power relationship may also arise between participants who own large asset volumes and participants who do not; for example, between vested interest holders who have built up assets from the beginning of the blockchain operations and participants who have recently joined. Vested interest holders with large assets may bypass the network of Item 4 and have the power to influence the decision-making processes of Items 1 and 2. On a blockchain network, the developers of the smart contracts and applications of Item 5 and the service providers of Item 6 assume one of the roles of Item 4, offering a new application or service to the end users. If the implementation or management of these application developers or service providers is inadequate, it not only damages the end users of that service but also—in the worst-case scenario—interferes with the basic operations of the blockchain network.

6

Issues to Overcome

To develop the discussion in the following chapters, this chapter summarizes the most relevant concepts of blockchains as well as the composition of the parties involved in a blockchain platform.

8

M. Sato

When building an autonomous blockchain network that eliminates third-party organizations, the governance of various potential stakeholders becomes a crucial factor. This includes each participant’s motivation to maintain a solid blockchain network. Integrity is essential not only in the discussions concerning the mechanisms implemented by the software, such as the reward rules of crypto-assets and consensus algorithms, but also in the communities that discuss and form decisions about those mechanisms. Blockchains are expected to be applied to various fields, and many blockchain platforms have shown a remarkable development of functions and expansion. As blockchain networks increase in size and begin to offer more important services, they also need to become safer. To make a blockchain network a safer information base, it is necessary to consider it in its entirety; this includes the development community, the specifications of the platform software, and the network formed by the nodes running that software.

Masashi Sato works for a security company, SECOM CO., LTD as a research engineer. He researched secure systems using electronic authentication and electronic signature. He served on standardization activities in the electronic signature field. He contributed to the drafting of JIS (Japanese Industrial Standards) and ISO standards related to electronic signatures, e.g. series of ISO 14533 and ISO 17090-4. He serves as a subleader of the electronic signature working group of JNSA(Japan Network Secruity Association), and an editor of the secruity working group of CGTF (Cryptoassets Governance Task Force).

Proof-of-Work: A Consensus Mechanism to Achieve Consistency Kazue Sako

As explained in Sect. 1, a blockchain is maintained not by a single authority but by multiple nodes, where each node independently maintains the history of transactions. In this section, we discuss how each node maintains the history independently of, and yet consistently with, all other nodes.

1

Three Requirements for Archiving Systems

If we regard a blockchain as an archiving system, three roles must be defined: (1) the Data Generator, (2) the Data Manager, and (3) the Data Referrer. Furthermore, a prior rule is specified, which decides what data generated by the Data Generator is to be archived as correct history by the Data Manager. We assume that there are multiple Data Generators and Data Referrers in the system. For this system, the external requirements are well defined: (1) The data generated by legitimate Data Generators are archived according to the rule. (2) Data already archived are not to be changed. (3) Data Referrers can correctly refer to the archive of the data. Provided we can trust the Data Manager, these requirements can be met by an ordinary database. We would not need a blockchain. The Data Manager authenticates the Data Generator, verifies that the data follow the rule, and archives them. A trusted Data Manager would present the data to a Data Referrer exactly as they were archived.

K. Sako (&) Waseda University, Tokyo, Japan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_2

9

10

K. Sako

If the Data Manager is untrustworthy, the Data Referrer would question whether the data provided by Data Manager were indeed those that were generated by the Data Generators; that is, whether the data have not been changed since they were recorded. This is where a blockchain applies. By using a blockchain, we can achieve these three goals without fully trusting the Data Manager. For this purpose, Bitcoin’s blockchain employs a Digital Signature scheme and a Hash-chain. A Digital Signature scheme is a cryptographic technology through which an entity can sign for the data. A verification algorithm is used to check whether or not the data or signatures have been modified. The signature is generated using the data to be signed and a secret key that only the signer knows; it is verified using the public key of the signer. Therefore, the Data Generator would sign the data he/she has generated. The Data Manger would record the data with the signature. If the Data Manager modifies the data, the Data Referrer would detect it when verifying the signature. However, the Digital Signature scheme only ensures that the data have been generated (or acknowledged) by the signer. It does not guarantee the ordering of multiple pieces of data that have been generated by multiple signers. For example, a malicious Data Manager may present to a Data Referrer, a history consisting of (in order) Alice’s data, Bob’s data, and Chris’ data, along with their respective signatures; however, they may present it in a different order—say Alice, Chris, then Bob—to a different Data Referrer. This raises issues in the consistency of the data history. Hash-chain technology helps to achieve consistency (Fig. 1). Hash-chain technology can be illustrated as follows. We use the so-called “Hash function.” A hash function is a function that computes a fixed-length bit string from an arbitrarily long input data string. The output of a hash function is called a “hashed value.”

Fig. 1 Basic structure of a hash-chain

Proof-of-Work: A Consensus Mechanism to Achieve Consistency

11

When the Data Manager receives data from Alice, he/she computes a hashed value for Alice’s data and publishes it; then, when he/she receives the next data from Bob, he/she adds the previous hashed value of Alice’s data to Bob’s data, and computes and publishes a hashed value for the concatenated data. Similarly, upon receiving Chris’ data, the Data Manager concatenates the previously published hashed value with Chris’ data and publishes its hashed value. The hashed values are thus chained together, and this is referred to as a “hash-chain.” The hash function is designed so that if the input differs by a bit, its output is completely different. Therefore, given the published hashed values, a malicious Data Manager cannot disclose anything other than the original ordering. In the Bitcoin blockchain, hash-chains are used not on individual data but are used to ensure the ordering.

2

Preventing Unwanted Consequences

Even after implementing digital signatures and hash-chains, the Data Manager can still do something malicious: he/she can choose not to include certain data in the archive; that is, he/she can apply censorship, by acting as if no such data exist. To prevent this, blockchains feature multiple Data Managers, who are connected via a peer-to-peer (P2P) network. A Data Generator who wants to have his/her data archived can send his/her data to any number of Data Managers. Honest Data Managers relay the data among themselves via the P2P network; in this way, more—hopefully all—Data Managers receive the data. However, the P2P network may be incomplete and might contain malicious Data Managers who do not relay the data; thus, it is not guaranteed that all Data Managers receive all the data in the same ordering. To achieve a consistent archived record, the Data Managers must “agree” to manage the same data in the same order. If the number of Data Managers is limited, and the Data Managers know each other; methods of agreeing upon the data in the distributed nodes in such scenarios have been studied since the 1980s; they are known as Byzantine fault-tolerant protocols. A Byzantine fault-tolerant protocol provides a mechanism by which, even if some Data Managers are faulty or malicious, the other Data Managers can form a consensus on a consistently ordered set of data. What is exciting about Bitcoin blockchain is that they provide a means of reaching such consensuses even if the Data Managers know neither each other nor the number of Data Managers in the P2P network; this method is the so-called “PoW (Proof-of-Work).” In essence, the one who solves the crypto-puzzle the fastest decides which data and which ordering thereof are to be archived in the ledger. Each Data Manager collects the data that he/she receives and forms a block candidate.

12

K. Sako

He/she solves a crypto-puzzle using the values of the block candidate and the most recent block. If he/she is successful in solving it, he/she announces the solution of the crypto-puzzle—along with his/her block candidate—to the other Data Managers. Every other Data Manager checks that the solution is correct in relation to the most recent block he/she has with a given block candidate; if it is, he/she updates the most recent block with the candidate. He/she now tries to solve his/her own crypto-puzzle using the data he/she has and the recently updated block. This is how the blocks are ordered or “chained.” The chain of blocks constitutes the ledger. Note that a block candidate must only contain fresh data that has not appeared in previous blocks. In case where two Data Managers solved the puzzle almost the same time, there is a rule called “longer chain wins.” The length of the chain is defined by the number of blocks in the chain. It may be equal at some point of time and thus no winners may be identified. However, it will be the solver of the next puzzle who decides which chain grew longer and thus become a winning chain.

3

Crypto-Puzzles in Detail

Here, we consider how crypto-puzzles are defined. Here again, a hash function is used. The hash function outputs a completely different value for a different input, even if the difference is only one bit. It is difficult to guess the input from the output, or to find two inputs that produce the same output. Using this hash function, crypto-puzzles are defined as follows. Given the hashed value (B) of the recent block and a collection of data (D1,…, Dm) which is a block candidate, find X such that when concatenating all these data as an input to the hash function, it outputs a value smaller than a previously determined number (k). Hash functions output a seemingly randomly data string of 256 bits, which usually represents a large number. To obtain a small output, you must try different X values and be lucky enough to fall within the small target. In Bitcoin, the number (k) is designed such that this trial takes an average of 10 min, adjusting itself from the previous challenge history. Thus, it requires large amounts of computing power to find an X that meets the criteria. However, it is easy to check the correctness of the answer; one simply computes a hash function. Why do Data Managers participate in such computation-heavy crypto-puzzle races? In systems using blockchain, the incentives of Data Managers should be designed to maintain sustainability. In Bitcoin or Ethereum, Data Managers receive rewards in cryptocurrencies if they succeed in solving a crypto-puzzle. The reward they receive per block is pre-defined in the system.

Proof-of-Work: A Consensus Mechanism to Achieve Consistency

13

Furthermore, Data Managers receive a “transaction fee” (in cryptocurrency) that is set by Data Generators. It is the incentives of Data Generators to have their data recorded in the ledger, so they set the transaction fee when generating the data. The Data Managers who solve the crypto-puzzle using this data receive the set transaction fee. Data Managers only receive these rewards and transaction fees when they correctly follow the rules. Therefore, they check not only the blocks they create but also the previous block, because blocks generated on top of incorrect blocks will also be incorrect. Thus, each Data Manager checks all the other blocks, which helps to maintain the overall correctness of the process.

4

Smart Contracts in Ethereum

In Bitcoin, the Data Generators are those who wish to transfer some cryptocurrency to someone else, and the data are the transaction data. If the transaction data are stably archived in the ledger, it means that the transaction is finalized. To ensure the correctness of the transactions, some predetermined conditions need to be verified on the data before they can be recorded in the ledger. For example, those initiating a transfer must own at least the amount of coins they are transferring, and the coins they have already spent cannot be used again. These rules are tailored toward cryptocurrency transfers. In Ethereum, to incorporate various rules or “programs” besides cryptocurrencies, a layer for “smart contracts” has been built on top of the data-archiving ledger. A smart contract is a computer program that automatically enforces rules and procedures upon the data generated and archived in the ledger. In Ethereum, the ledger maintains both “program” data (which are called smart contracts) and the generated data that serve as the input to the program. Furthermore, the outcome and/or inner state of the program is archived in the ledger. With these functions, the data generated by Data Generators are processed in accordance with previously recorded and published programs, in a way that is verifiable by everyone. Here, we consider Bitcoin again. Bitcoin adopts a special data structure, to ensure that individuals do not spend more cryptocurrency than they own, and to enable easy verification of correctness. The transaction data of Bitcoin are expressed as “I am sending z bitcoins to Alice out of coins I received in transaction x.” To verify the correctness of this transaction, a Data Manager must check the following three conditions: – That the receiver of the cryptocurrency in transaction x is the sender of this transaction. – That the cryptocurrency received in transaction x has not been used before.

14

K. Sako

– That the amount of cryptocurrency to be sent to Alice is smaller than the amount received in transaction x. If these conditions are written into a program, it becomes much simpler. The data structure might resemble the phrase “I am sending z coins to Alice.” The inner state of the program maintains the current amount of coins each entity possesses. The program checks that the Data Generator indeed has z coins in his/her account, subtracts the amount z, adds that same amount to Alice’s account, and updates the inner state. The Data Manager reads the program and its current inner state from the ledger, processes the data following the program, and writes its output as a record in the ledger. Thus, the Data Manager not only writes the data generated by the Data Generator but also processes the data using the program they specify. Hence, a ledger can be used as a platform for multipurpose transactions, not simply cryptocurrencies. What is unique about smart contracts on ledgers—compared with ordinary computer programs—is that the program is now public on the ledger. One can check beforehand the procedure to be followed and can later verify that the data has indeed followed the specified procedure correctly. This means that all data are treated equally. Currently, a program on a server or in a cloud is not public, making it difficult to see which program is running in the cloud, and how the data is being processed. The administrators of those computing environment may change the program if they do not like the outcome of a certain program. Smart contracts represent an alternative to current computers, which operate in a black box manner. Meanwhile, it is left to the Data Generators to verify that the smart contract they specified—written in programming language—behaves as expected. A large incident occurred in which the smart contract for The DAO—which operates on Ethereum—suffered from a bug; this resulted the leakage of a large amount of cryptocurrency to an unintended recipient.

5

Fairness and Transparency in Blockchain

In this chapter, we discussed the basic ideas used in blockchains such as Bitcoin and Ethereum, and we introduced some of the cryptographic primitives used therein. In both blockchains, anyone can play the role of Data Managers, and therefore called permission-less blockchain. It is important to bear in mind that, depending on the purpose of the service, the blockchain design—for instance, which rules and data structures are incorporated and which individuals take what kind of role—will differ. For example, Bitcoin was designed to realize an electronic cash system without requiring a trusted third party; thus, to prevent the centralization of power, it is designed such that anyone can be a Data Manager.

Proof-of-Work: A Consensus Mechanism to Achieve Consistency

15

Yet, as the designed systems have been deployed in real applications, many issues have arisen. How can we democratically update the software when a bug is found, or when its capacity needs to be improved? What can be done when people cannot manage their own secret key? What if there are many PoW blockchains, in which computer capabilities are split between different blockchains? Alternatively, if there is an entity that we can comfortably trust to a certain extent, is it feasible to design a system without wasting our precious energy? We have taken an initial step forward to design a system that brings transparency to the computing process, with the goal of realizing a fair IT system upon which users can depend with tolerable risk. The invention of blockchain has shown the direction in which we should move, from services designed by and for the servicer to more people-centered services.

Dr. Kazue Sako is a researcher in designing secure systems using cryptographic protocols that enhance privacy and fairness, e.g. electronic voting protocols, group signature schemes, digital lottery systems and blockchain architecture. She serves as an expert in ISO/IEC JTC1 SC27 (Information Security Techniques.). She chaired several international conferences in cryptography and security including ASIACRYPT, CT-RSA, FC, PKC and ESORICS. She is a member of the Science Council of Japan, and served as the president of Japan Society for Industrial and Applied Mathematics, and a vice-president of Institute of Electronics, Information and Communication Engineers.

Misunderstandings and Expectations of the Blockchain Created by Bitcoin Masanori Kusunoki

Bitcoin, the cryptocurrency that created the blockchain concept, is also the largest and longest-running blockchain. Recently, interest in the blockchain architecture has increased, most notably from financial institutions; this is because Bitcoin has demonstrated that it can intervene in a large number of transactions and withstand the conditions of practical use. In this sense, all blockchain implementations are affected by Bitcoin to some extent. Why do blockchains aim at decentralization, adopting PoW, etc. and forming an ecosystem? To understand this, we must re-consider Bitcoin, the original blockchain. In this chapter, we consider the misunderstandings and expectations of the current blockchain, by reviewing the history and achievements of Bitcoin. What has Bitcoin achieved? Bitcoin was originally designed as a mechanism to mediate value exchanges without relying on state power arbitrages or supporting assets. Blockchain was designed to help achieve this. In modern currency, the issuer —including the central bank—records a book liability according to the outstanding balance. For electronic money, gift certificates, prepaid cards, and so on, the manager must accumulate a provision in accordance with the issued balance and the system of each country. However, Bitcoin’s mechanism operates without an issuer or administrator. As a result, it avoids the laws and regulations determined by the issuer. What did Bitcoin facilitate by functioning without an issuer? First, it realized a new mechanism that generates a large amount of currency issuance (gained by issuing coins); it covers the currency’s operating costs with this issuance gain. In Bitcoin, the amount of calculation required to issue a new coin is determined by the sum of the computational capabilities of participating miners. Thus, when the number of participants is small, the currency can be obtained efficiently at a very low cost.

M. Kusunoki (&) Japan Digital Design, Tokyo, Japan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_3

17

18

M. Kusunoki

As the value of Bitcoin increases, the market value of previously issued bitcoins also increases. The identity of the author who published Bitcoin’s initial paper— under the name Satoshi Nakamoto—remains unclear; however, it is highly probable that he has gained currency worth at least ¥ 10 billion. Bitcoin does not have a nominal issuer; thus, there is no need to record the issue amount as a debt of the issuer in the book, or to accrue a reserve according to the issue balance. Even now, a new bitcoin is generated approximately every 10 min, and the money issuance profit is divided between the miners (who provide computing resources for blockchain data processing). Thus, miners voluntarily operate an international settlement platform, which operates 24 h a day, 365 days a year with almost no transaction fees; the transaction is confirmed in about 10 min, based on the reward of currency issuance. Second, without an issuer, Bitcoin has realized a value transfer platform independent of national regulations. As a result, economic value can be readily transferred by simply creating a key pair on demand. This eliminates the need for contracts or identity verification when—for instance—opening an account. Thus, it is now possible for users to conduct illegal transactions that do not disclose their identity and are difficult to trace. For example, illegal drugs, malware development toolkits, stolen ID/password lists, credit card number transaction fees, ransomware ransom fees, and many other products and services can now be exchanged online without being intercepted. Bitcoin transactions are possible anywhere in the world, provided you are connected to the Internet. As a result, assets can now be freely withdrawn from emerging countries such as China, where foreign exchange transactions are regulated. Until 2017 (when they were prohibited by law), almost 90% of bitcoin mining operations were being carried out in China. This is because investment in bitcoin mining is one of the few ways of bypassing foreign exchange regulations and legally transferring Chinese assets overseas. Bitcoin was also used as a means of avoiding deposit taxation by the state. In March 2013, Cyprus, which fell into a financial crisis in response to the Greek debt reduction in the Eurozone, closed its deposits and taxed them by approximately 10%; however, at that time, it was unable to withdraw capital from bank deposits in Cyprus. Bitcoin was used as a means of doing so. In Cyprus, various goods and services—including university tuition—can be paid for with Bitcoin. Bitcoin is safer than bank deposits in legal currencies because it cannot be frozen on account of the country’s circumstances; thus, it has become a capital escape destination with an exchange rate independent of legal currencies. The price of bitcoin, which was 1 BTC (unit of bitcoin) = approximately US $17 in early 2013, rose to a level of over US $200 in the spring following the Cyprus crisis. After that, the Internet black market “Silk Road” was caught by the US FBI; as a result, the existence of Bitcoin became widely known to the world and its price soared, at one point exceeding US $1000. However, the bitcoin value was halved following the tightening of cryptocurrency restrictions by the People’s Bank of China. After that, an embezzlement case at the Bitcoin exchange Mt.GOX caused the value to drop to US $200. Since then,

Misunderstandings and Expectations of the Blockchain …

19

the price of bitcoin has continued to fluctuate. In mid-2016, Bitcoin surged again following economic activities such as the UK’s withdrawal from the EU and the devaluation of the Chinese yuan. As of the end of July 2016, it has risen to around US $650. The Bitcoin community widely claims that “decentralized management has been achieved;” thus, it does not consider currency issuance profits as a management entity’s debt and does not pass operating expenses on to users. Furthermore, it became the first currency with an exchange rate independent of state powers. Bitcoin has demonstrated that—unlike traditional financial services—it can facilitate anonymous international payments and mediate transactions that are antisocial and unprotected by law. Through its use to withdraw capital from euro deposits in the Cyprus crisis, as a payment method on the black market Silk Road, as a method of withdrawing funds from China, and as a ransomware ransom-payment method, Bitcoin has survived numerous cyber-attacks and been in operation for over 10 years.

1

A Beautiful Misunderstanding of the Blockchain

The success of Bitcoin has led to expectations of the technology that supports it. “Blockchain operates 24 h a day, 365 days a year with zero downtime,” “Distributed ledgers can be operated at a lower cost than conventional financial systems and realizes a high security that can withstand attacks” have become popular slogans. FinTech—that is, fusing finance and technology—startups have made it difficult to chart growth scenarios simply by providing technology to existing financial institutions. However, if Bitcoin has the potential to create disruptive services that can replace existing financial services (such as international payments), explosive growth can be expected. As a result, annual investment in FinTech startups has increased rapidly to about $12 billion in 2014, more than triple that of the previous year; it further increased to about $22 billion in 2015. Bitcoin itself still has a dangerous image; however, the blockchain, which separates its elemental technology from Bitcoin, has attracted attention in FinTech contexts. However, separating the blockchain from the Bitcoin ecosystem and its supporting achievements of “monetary issuance gain” and “independence from the state” does not always result in a truly efficient information system. One expectation of the blockchain is that it can reduce the fees for international settlements. However, the low bitcoin settlement fees are not necessarily due to the efficiency of the blockchain itself. As described above, the miner who solves the PoW is responsible for the processing required for the operation, including improving the safety of Bitcoin. In return, mining rewards are paid every 10 min to the miner who solves the problem. This mining reward was initially 50 BTC; however, it was reduced to 25 BTC in

20

M. Kusunoki

November 2012 and was recently lowered again to 12.5 BTC in July 2016. This is about 860,000 yen at time of writing (July 28). Dividing this reward amount by the number of transactions that can be confirmed in a single block (about 1,000 transactions can be recorded in a 900 Kbyte block), the calculation takes several thousand yen per transaction. This figure is high enough compared with credit card and bank settlement fees. Users perceive Bitcoin to be cheap because it does not collect operating costs from users, but instead uses the gain from issuing currency. Bitcoin’s innovation is not in the blockchain itself. It establishes an ecosystem where each stakeholder can cooperate to maintain the value of bitcoin without having to concentrate authority on specific managers. Bitcoin’s innovation is in realizing the trade.

2

Is It Possible to Allow Privacy?

Furthermore, Bitcoin has been boldly claimed in various places to achieve decentralization. This is a large obstacle to overcome when applying the blockchain—the elemental technology of Bitcoin—to other purposes. For example, in Bitcoin, an electronic signature is used to record a transaction, but anyone can freely generate the key pair used for the signature. Every user can add new transactions to the ledger and view all transactions therein once Bitcoin has begun to operate. To send bitcoin to someone, you must reveal your Bitcoin address. Revealing your Bitcoin address means not only revealing your bank account number to your partner but also revealing your bitcoin balance at that address, as well as your transaction history. The original paper by Bitcoin’s proposer Satoshi Nakamoto argues that even if all transactions are disclosed, privacy can be ensured by separating the Bitcoin address from personally identifiable information (Fig. 1).

Fig. 1 Bitcoin privacy model. Source Created based on “Bitcoin: A Peer-to-Peer Electronic Cash System-Satoshi Nakamoto 2008”

Misunderstandings and Expectations of the Blockchain …

21

However, with this method, at very least the partner who makes the transaction knows the correspondence between the Bitcoin address and the owner. In addition, each country imposes identity verification obligations for the subscribers of cryptocurrency exchange accounts, in response to recommendations from the FATF (Financial Activities Working Group). In fact, anonymization technologies—such as mixing services—are used to make Bitcoin transactions difficult to track. A mixing service makes it impossible to track the flow of funds, by mediating multiple users’ transactions through the same Bitcoin address. A dark wallet with a similar anonymization mechanism is also being developed. However, can general consumers really apply the blockchain framework with such privacy in the field of international remittances? No one has found an answer as yet.

3

The Difficulty of Building Safe Mechanisms Using Computing Power

Bitcoin has survived for over 10 years, despite receiving cyber-attacks from external attackers. However, the blockchains derived from Bitcoin are not guaranteed to enjoy the same security. The safety of Bitcoin is ensured by the fact that a specific person cannot monopolize the PoW calculation capabilities used for mining. In the framework of PoW, if a specific person controls a majority of the network’s computing power (even less under certain conditions), they can make a false blockchain real. However, in reality, the bitcoin value drops when hijacking occurs; hence, a miner who earns money with Bitcoin is unlikely to try to take over, making such fraud unlikely. The PoW mechanism—which guarantees tamper-resistance by distributing computing power—works effectively only if one cryptocurrency is available. However, for a later cryptocurrency, if a miner’s funds flow from the preceding cryptocurrency, it can be easily hijacked. In Bitcoin, a standard hash function (called SHA-2) is used for PoW; however, in later cryptocurrencies, different hash algorithms (such as SCrypt) are popular. This is intended to prevent bitcoin miners from mining with special chips. However, many of the unique algorithms have not been verified for mathematical safety, and it is impractical to develop unique algorithms for the number of systems. For a blockchain mechanism, whether it be a public chain open to anyone (like Bitcoin) or a private chain containing limited numbers of network participants, the question of whether Bitcoin’s PoW is used as an agreement protocol or another consensus protocol is developed, is an important design requirement. For one thing, it is difficult to ensure the security of a coexisting system simply by mimicking Bitcoin.

22

4

M. Kusunoki

Understanding the Weaknesses and Considering Where to Apply Blockchain

Alongside cryptocurrency applications, blockchain is thought to be suitable for performing transactions between financial institutions, making donations to politicians, managing ledgers to prove rights, and notarizing resumes and digital content. Various projects are being developed for these uses; however, besides cryptocurrencies, their applications are still in the trial and development stages. In other words, the development of such blockchain applications is being considered using the “what blockchain is not suitable for” constraint. For example, Bitcoin is expected to be used for public ledgers, proof-of-existence, donations, and so on, because it can be used without causing privacy problems through access controls. Expressed otherwise, to use it as electronic money for individuals, it is necessary to solve several problems, including user privacy and scalability (expandability) with respect to the cost and transaction volume. The capacity to realize a decentralized system through separating development, operation, and implementation may be deployed for data exchange between organizations. However, irrespective of whether blockchain is suitable for data exchange between organizations using access control and contracts, it does have specific advantages over conventional methods such as file transfer, distributed database construction, and joint system operation. A comparative study must be made. Many trials conducted by financial institutions and stock exchanges have sought to compare costs and usability between conventional methods and blockchain. In addition to the system performance, efficiency, and cost reductions expected from the blockchain, its potential application as an architecture for realizing more robust data management has been suggested. In conventional information system design, the protocol typically considers methods of designing and constructing systems based on business needs; then, it links the systems. What happens if this order is changed? First, we define the data structure shared between systems, prepare a common software for exchanging data, and structure the business system of each organization around it. However, this process varies considerably. The current information system design process requires high technical expertise to design and build robust data structures that can be easily rolled back (that is, rewound to a point in the past), as well as highly scalable application programming interfaces (APIs). However, using blockchain middleware—and other software functions allowing common data to be maintained between systems from the outset—it may be possible to reduce the man-hours required to build such a data infrastructure and increase the degree of completion.

Misunderstandings and Expectations of the Blockchain …

23

The question of whether blockchain—decoupled from the Bitcoin ecosystem— really has a utility value is still unresolved. First, it is thought that the search for application fields will develop a system model that links distributed ledgers between organizations.

Masanori Kusunoki works for Japan Digital Design, a FinTech company founded by Mitsubishi UFJ Financial Group as Chief Technology Officer. He engaged in building the systems infrastructure of Social Security and Tax Number system in Japan. He also leads the Japanese mirror committee of ISO/TC307 and OpenID Foundation Japan.

Challenges Blockchain Technology Faces Shin’ichiro Matsuo

1

Characteristics of Blockchain

Here, we review the key points of the previous chapters. When Bitcoin was invented, blockchain technology was implemented to assure a functional and secure cryptocurrency. Blockchain technology is still being independently developed for other applications besides cryptocurrency. The Bitcoin blockchain continuously manages a ledger that records transactions, by appending it and revising the records it contains. The ledger records “how much bitcoin each user owns,” and “how much bitcoin is transferred from one user to another user.” The records are replicated and maintained by a large number of network users. Even if some users try to perform malicious activities, the blockchain technology is designed to maintain the correct records. Significantly, blockchain does not require a trust anchor such as a trusted manager or organization, as previous similar technologies have done. It relies upon the Peer-to-Peer network to provide “decentralized” features. A group of Bitcoin’s users do not maintain ledger data; they only verify transactions by using the data maintained by another user. In general, such users are allowed to exist; however, to enhance the decentralized nature of the network, we require as many users as possible to maintain the entire ledger. The ledger is created/updated using a combination of cryptographic techniques, including a (cryptographic) hash function and digital signature scheme. This mechanism ensures the integrity and correct ordering of transactions on the payment transaction ledger. All data recorded into the ledger can be accessed by any user; thus, anyone can verify the integrity and correct ordering of the ledger data at any time. Thus, blockchain data are publicly verifiable.

S. Matsuo (&) Georgetown University, Washington, DC, USA e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_4

25

26

S. Matsuo

These characteristics mean that the blockchain system can be operated without a trusted anchor. Such anchors would require rigorous management and operation; hence, Bitcoin began its operations in 2009 without any central bank. When we move our focus from Bitcoin to generalized blockchain, the data stored in the blockchain will be more varied than simply payment records. This expansion facilitates other applications besides cryptocurrency. For example, applications for smart contracts relating to money and rights in a social infrastructure are anticipated.

2

Requirements Imply Issues

To implement the values of blockchain and operate it, the following requirements should be considered: – The blockchain eliminates trusted operators; instead, it uses a large number of users to replicate the ledger data, and all the users must manage it. This requires as many users as possible, to safeguard it from malicious users. – The blockchain needs to append and revise the same data in the ledger, which are managed by a large number of users. – Cryptographic techniques must be appropriately used, to ensure the integrity of the data and their order in the ledger. Unfortunately, the existing blockchain technologies do not fulfill all these requirements “perfectly.” The following are four unexpected issues that have arisen in the blockchain technology: Issue 1: The security evaluations of cryptographic protocols and systems are insufficient. In general, technologies that combine cryptography and communications are called “Cryptographic Protocols.” Transport Layer Security (TLS) - often called Secure Socket Layer (SSL)—is an example of a cryptographic protocol used to ensure authentication and encryption over the communication channel. When we design a cryptographic protocol, we first define the security requirements and then design the protocol to fulfill them. A mathematical proof must be performed, to verify that the designed protocol fulfills the requirements. For example, a fundamental entity authentication protocol such as ISO/IEC 9798 has been mathematically verified by the academic community. Several ISO/IEC standard protocols have been revised to pass the verification after several formal verification attempts identified vulnerabilities. However, in current Internet technologies, even essential protocols such as TLS lack sufficient verification results. Its latest version (TLS 1.3) was standardized by the Internet Engineering Task Force after extensive formal verification. However, this is a rare case.

Challenges Blockchain Technology Faces

27

The same applies to the current blockchain. Several early-stage studies have been conducted on the formalization of security requirements for a small subset of blockchain technologies; however, they do not consider the entire blockchain technology and all its systems. Hence, we lack sufficient verification results for blockchain. This does not imply insecurities of the blockchain technology; rather, it means that we cannot demonstrate that “blockchain technology does not contain vulnerabilities.” Risks are still involved in operating blockchain-based systems. This situation applies not only to the technology’s specifications but also its implementation. For example, the DAO attack (2016) was realized by exploiting a vulnerability of the Ethereum blockchain programming code. The case highlighted the possibility of moving tokens maliciously by exploiting vulnerabilities in the programming codes. Blockchain is expected to be employed as a social infrastructure tool, to manage contracts regarding money and property rights. In such applications, the social impacts of security incidents could increase dramatically. Therefore, the need for vulnerability handling procedures is essential to ensuring the quality of programming code and responding to attacks when they occur. Issue 2: There is no standard operational model regarding the usage of cryptography. Blockchain technologies use a digital signature scheme and hash function to ensure the integrity of the ledger data and the order of records. A digital signature scheme requests users to create a key pair, containing a signing key and a verification key; then, the user must rigorously keep the signing key secret from other users. The key pair generally has a validity period, because an adversary may guess the signing key from the publicly available verification key after long-term computations. The current Bitcoin implementation lacks a management process for the validity period of key pairs and key lifecycles, as well as for the revocation and generation of key pairs. That is, there is no Public Key Infrastructure operation process. Most other permission-less blockchain technologies do not feature specific key management and operation rules either. Users perform such operations and set the rules. Asides from key management problems, specific cryptographic techniques such as digital signature schemes and hash functions are subject to be compromised over time. In some cases, design vulnerabilities may be found. The computational power of adversaries increases every day owing to the progress of computer technology. Hence, the time required to exhaustively search for the signing key is becoming shorter. Such situations are referred to as the “compromise of cryptographic techniques.” For example, the IT industry incurred huge costs when transitioning from SHA-1 to SHA-2, after the compromise of SHA-1 was reported. Most of the bitcoin and blockchain engineers lacked the necessary experience to manage the compromise of their cryptographic techniques and to transition to the securer algorithm. No mechanisms or operations exist to deal with such transitions.

28

S. Matsuo

One long-term issue is that of the quantum computer. When the quantum computer becomes capable of solving modern public-key cryptography using factoring and discrete logarithm problems, the digital signature schemes used in blockchain technologies will become insecure. “Quantum-Safe” cryptographic techniques, lattice-based cryptography, code-based cryptography, and hash-based cryptography are now being widely studied, and in 2017 the National Institute of Standards and Technology launched a competition to develop post-quantum safe cryptography techniques, with a view toward standardization. However, this initiative for quantum-safe cryptography is still in its early stages. Thus, no major works have incorporated such new cryptographic techniques into blockchain. Blockchain applications assume long-term operations, exceeding the lifetimes of specific cryptographic techniques; hence, we must take into account the management of transitions toward long-term secure cryptographic techniques. Issue 3: Trade-off between scalability and decentralization The blockchain ledger is processed according to a specified rule of processing speed. For example, in the case of Bitcoin, 1 MB of data is appended every 10 min. This implies that the global Bitcoin network can process only 1 MB of transaction data every 10 min; thus, a strict upper bound exists on the maximum number of transactions per unit time. That is, by definition, there is a scalability restriction in the design of the technology, and the progress of computing power cannot solve it. A naive solution to this problem is to increase the size of one block by changing its specifications. However, this change increases the amount of data stored at all user nodes; thus, it requires large resources to operate such a node. As a result, only a wealthy individual or party can become a node user of the blockchain network, and this decreases the number of blockchain nodes. This contradicts the original “decentralization” philosophy of permission-less blockchain, making such permission-less blockchains less secure. The trade-off between scalability and decentralization is closely related to the design philosophy of blockchain; extensive debates continue. Reducing the number of nodes or setting multiple nodes in the same cloud computer (to achieve greater scalability) may destroy the merits of public blockchain, which eliminates centralized operators. If the decentralized nature is decreased, the resulting technology closely resembles “a multiplex database with cryptographic time-stamping performed by one/some operators.” Thus, we must consider the cost-merit balance of introducing such a semi-decentralized blockchain when we expand the blockchain’s applications. Issue 4: Integrity in revising distributed data Bitcoin and permission-less blockchains require numerous distributed users to update the local ledger data correctly. The “PoW” consensus algorithm is one of the mechanisms used to make such unified, correct updates from multiple distributed users.

Challenges Blockchain Technology Faces

29

However, the existing and on-going research results of distributed computing indicate that there is no guaranteeing that the result of the update is 100% correct; furthermore, it is possible that the results can be changed. That is, we need to design blockchain applications taking into account such (small) possibilities that a transaction is not correctly reflected in the ledger. In this chapter, we describe the requirements of blockchain technology and the issues it faces. We describe these issues in detail in the following chapters.

Dr. Shin’ichiro Matsuo is a Research Scientist in Cryptography and Information Security. He is working on maturing blockchain technology from the academia side and presents research results on blockchain security. At Georgetown University, he directs the CyberSMART Research Center and leads multi-disciplinary research among technology, economy, law, and regulation. He also leads international research collaboration on blockchain and founded BASE (Blockchain Academic Synergized Environment) alliance with the University of Tokyo and Keio University. In 2019, he co-founded Blockchain Governance Initiative Network (BGIN) a multi-stakeholder discussion body like IETF, as an initial contributor. He is co-chair of BGIN. He is a co-founder of the BSafe.network, an international and neutral research test network to promote applied academic research in blockchain technologies. He is a part of many program committees on blockchain technology and information security, and a program co-chair of Scaling Bitcoin 2018 Tokyo. He serves as the leader of security standardization project of blockchain (ISO TC307). Previously, he served as the head of Japanese national body of ISO/IEC JTC1 SC27/WG2 for cryptographic techniques, a member of advisory board cryptographic technology for the Japanese government.

Are Blockchains Trustless? Nat Sakimura

It is often touted that the innovation that the blockchain brought is its trustlessness, in that it does not require a trusted third party to achieve “true records of transactions.” However, digging a little deeper, it becomes apparent that what is meant by “trustless” in such a claim is rather vague. What does “trust” really mean in the context of an information system? Here lies an essential key for the application of blockchain to the real world. In this chapter, we look closer into the meaning of the claim, “Blockchains are trustless.”

1

The Essence of “Trust”

The essence of “trust” is that one does not verify. If you have verified it, you already “know” it and need not take such an uncertain step as to “trust.” A statement by Georg Simmel in his book “Sociology” (1908) [1] emphasizes this point: he states, “Trust, as a hypothesis, is a middle state between knowing and not knowing about people.” Likewise, ISO/IEC 25010 [2], an international standard for evaluating system and software quality, defines trust as “the degree to which a user or other stakeholder has confidence that a product or system will behave as intended.” The term “system” means “an interdependent group of items forming a unified whole” [3]. Thus, “trust” can be restated as the degree to which an entity believes that the system, as a whole, will behave as expected.

N. Sakimura (&) NAT Consulting, Tokyo, Japan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_5

31

32

N. Sakimura

For example, suppose I have temporarily paid for my friend’s lunch. In this case, I believe that he will pay me back with a high probability based on our relationship, and I do not verify his payment ability by checking his bank account balance. Thus, I am trusting him. Alternatively, let us consider the case of me saving money in a bank account. In this case, I am trusting the bank; that is, I believe with high confidence that the bank will not misappropriate the funds and will safeguard the money. I do not verify the current operational and financial status of the bank. As in these examples, when one believes that a system will behave as expected without verifying it, the system can be said to be trusted. Society relies heavily on such trusted systems, and this significantly reduces social costs. It is simply too costly for a user to “verify” a system each time it is used.

2

The Conditions for a System to Be Trusted

Well then, under what conditions can a system be trusted? Studies—such as that of Takasaki (2010) [4]—have shown that the “brand” is a dominant factor in the consumer’s formation of trust toward a system. If a person says something along the lines of “I do not think there are any bad products mixed in with the clothing that Harrods deals in,” then that person trusts the business. This is why companies devote considerable finances and effort to establishing a brand. However, if a business betrays their expectations, their brand-building efforts instantly disappear. Customers implicitly know this and use the brand as a proxy for trust. In other words, brand formation is the process of piling up stakes that will be instantly lost once it betrays the expectation. People see how meagre the benefit of betraying them is compared to the amount that the company loses, and they “trust” that “it will be fine.”

3

“Trust Framework” as the Mechanism to Create “Trust” for SMEs

Naturally, it is difficult for new market entrants and Small and Medium Enterprises (SMEs) to win trust for their brand as described above. To be fair to them, we must prepare another way they can build trust, instead of through brand establishment. To this end, the following three procedures must be carried out in some form: 1. Implement tools to produce the expected results. 2. Implement rules and operations to make this possible. 3. Implement a mechanism to remediate when failures occur. The mechanism used to achieve such a combination is called a “trust framework.”

Are Blockchains Trustless?

33

A user can also by him/herself verify that these conditions are met. However, this involves hefty costs; moreover, once verified, the user does not need to trust because he/she now knows that the system works correctly. In general, people rely on somebody else to verify these conditions. This other party could be the service provider—that is, the first party—or it could be a third party. The former is called a “self-declaration model,” whereas the latter is called the “third-party certification model.” For the self-declaration model to function, the declarer needs to be at risk of serious loss if they fail to live up to expectations. In the USA, the Federal Trade Commission (FTC) functions as the backstop, by imposing heavy fines on businesses that fail to fulfill their declarations, as described in Section 5 of the FTC Act. The brand model can also be considered as a form of the self-declaration model, in which the business loses the stakes that have been piling up during the brand-building procedure, as well as the potential sales that could have been gained if the incident had not occurred. In the third-party certification model, users trust that a third party—such as an auditor or authority—has verified the three conditions. This is also called the “trusted third-party (TTP) model.” For example, in the case of inter-bank transactions, it is typically the central bank that acts as the trusted third party. However, it remains possible to deceive the third party. Thus, penalties designed to thwart such wrongdoing are also necessary. In the case of banks, penalties imposed by the authority—such as fines and revoking of the licensing—perform this function.

4

There is no Mention of “Trustless” in Satoshi Nakamoto’s Paper

Now that we have a solid understanding of what “trust” means, let us delve into what it means to say that blockchains are “trustless.” Somehow, the term “trustless” has come into general, casual usage amongst the public; however, it does not actually appear in Satoshi Nakamoto’s paper: “Bitcoin: A Peer-to-Peer Electronic Cash System” [5]. Instead, the wording he uses is “an electronic payment system based on cryptographic proof instead of trust, allowing any two willing parties to transact directly with each other without the need for a trusted third party.” This paper makes three assumptions, as follows: 1. Valid transaction assumption: Peers consent to the transaction. 2. Peer-to-peer assumption: Transactions occur on a peer-to-peer basis. 3. Computing power distribution assumption: The total computing power of honest nodes is larger than the computing power of any cooperating group of attacker nodes.

34

N. Sakimura

From these assumptions, Satoshi builds “a solution to the double-spending problem using a peer-to-peer distributed timestamp server to generate computational proof of the chronological order of transactions.” Expressed otherwise, provided these three assumptions are fulfilled, no intervention of a trusted third party is required to address the double-spending problem. If the assumptions are not met, or if anything other than the anti-double-spending property is sought, the paper is silent on whether trust is required or not. For example, users cannot know for sure whether the combined computing power of the honest nodes in the current blockchain system outweighs that of the attacker nodes. As a result, users end up having to trust the system. Furthermore, they must trust that their peers’ software has been written correctly and is being administered properly (the situation differs if a user personally checks the software’s code and can independently prove its rigorous administration).

5

The Illusion of Being “Trustless”

Despite these realities, many people seem to expect much more of blockchains than Satoshi Nakamoto put forth. In the following, we use the Bitcoin blockchain to illustrate the problems incurred by naively believing that “blockchains are trustless.” First, we must check what is being automatically verified and thus made “trustless.” The Bitcoin blockchain only provides proof of the ordering of the transactions that are valid and agreed upon. Asking for something beyond that falls outside of what the Bitcoin blockchain provides. This idea of transactions that are valid and agreed upon is problematic. In Bitcoin’s case, it is defined endogenously within the system. However, this property vanishes as soon as something exogenous to the system—such as land titles, copyrights, or legal currency—is brought into the system by being bound to the tokens on the blockchain; in this case, another mechanism is needed to verify that the transactions are valid and agreed upon. All that blockchain provides under such circumstances is the chronological order of the transactions and the immutability of those transactions. It does not prove the truth of what is written. False information can be written and—in some cases—this falsified information can persist. In other words, to trust that what is being written into the blockchain is truthful, we need some trust mechanism to verify it. If it becomes cheaper to use a TTP to verify the transaction than some cryptographic or other automated mechanism, then the benefit of being “trustless” becomes an illusion. The second issue stems from the assumption of peer-to-peer transactions. At present, many crypto-asset transactions are handled by crypto-asset exchanges. Exchange systems operate outside the blockchain’s system. Accordingly, a crypto-asset exchange serves as a trusted third party. The Mt. Gox incident—and more recently the Coincheck incident—are textbook examples of cases in which a TTP was operated in an untrustworthy manner. In fact, these raise questions of

Are Blockchains Trustless?

35

whether it is possible to create a trustworthy third party in a Bitcoin-like ecosystem. Bitcoin was explicitly designed to render such third parties unnecessary. Thus, the system was designed without any safety considerations for when such a third party becomes involved. If people are to rely on the blockchain system, it may be necessary to re-design it with these third parties in mind. The BitPoint incident (July 2019) demonstrated that a heavily regulated crypto-asset exchange may—unlike traditional financial institutions—fail catastrophically; this evidence supports the above conjecture. The third assumption—that of distributed computing power—is not actually fulfilled by Bitcoin’s system. Two aspects of Bitcoin have been identified as problematic in relation to this assumption. The first problem arises from an overconcentration of computing power among a minority of users. For example, over 70% of the mining power for Bitcoin lies behind the Great Firewall of China, of which two companies account for the greater portion. Hypothetically, if a certain government was to gain outsized influence over Bitcoin’s operation, it could create a hyper-centralized dystopia, the polar opposite of what Satoshi Nakamoto envisioned. The next problem is the difficulty involved in confirming the legitimacy of the blockchain software. “Software” refers not only to the software installed at each node but also to the script written into each Bitcoin transaction and into Ethereum’s smart contracts. How many people write their own software code or verify the code personally? Most users trust software developers, software download sites, the smart contracts offered to users, and so on. It is difficult to consider that this framework is trustless. Moreover, in terms of the system’s stability, the software constituting the blockchain must be sufficiently distributed, meaning that a wide variety of softwares must be provided and used. If only one version of software was used, then any bug in that software would make it easy to nullify the condition that “honest nodes collectively control more CPU power than any cooperating group of attacker nodes.” In addition, if a software development team could fix that bug after the fact, then it would have a central government-like position within the system. The DAO hack that occurred in June 2016 brought this issue to light.

6

Reject False Expectations and use Blockchain with a Realistic Perspective

The preceding sections have provided an overview of what we mean by “trust,” what is actually “trustless” in a blockchain, and the current state of trust in the Bitcoin blockchain currently in operation. The results show that it is difficult to consider blockchain “trustless” in its current state. Rather, it has merely shifted the trust from existing TTPs to other institutions and software developers. Furthermore, serious doubts remain over whether the parties to whom trust has shifted are indeed trustworthy. Asides from a very limited number

36

N. Sakimura

of cases in which the assumptions envisioned in Satoshi’s paper are fulfilled, the idea that blockchains are trustless, low-cost systems is an illusion. There are other instances of unfounded trust that users have in blockchains. One such example is the notion that the ledger in blockchain is distributed, as implied by the word “distributed ledger.” The ledger in a Bitcoin-like blockchain is not at all distributed. Instead, it is hyper-centralized. There is only one ledger. In contrast, a conventional ledger system is completely distributed in that they are maintained by each entity instead of being collectively managed. In distributed ledger technology, what is distributed is the consensus and the read/write operation. The ledger itself is—while highly replicated— hyper-centralized. Indeed, this centralization is the feature that makes double-spending difficult, because the consistency between transactions can be verified. However, this is not to say that blockchain is useless. Rather, present author looks forward to the future of blockchains and simply believes that when applying blockchain to real-world problems, we must not maintain any of these illusions; instead, we should properly assess the assumptions and limitations that apply. Otherwise, as the bubble supported by the illusion breaks, blockchain’s image will be irreversibly tarnished, rendering the technology obsolete. Over the past 10 years, numerous technologies widely considered promising have been stunted in this manner. We cannot let the same thing happen to blockchain. This chapter has mostly considered Bitcoin’s blockchain. However, the conclusions herein are also very likely, applicable to other blockchains that inherit Bitcoin’s design. Additionally, the analysis methodology in this chapter can be applied to other types of blockchains. Irrespective of what kind of blockchain it is, we must examine the assumptions made and verify what is achieved by the model. Some may think it too bothersome to proceed in this careful manner; however, what we need right now is realistic and sober research and development.

References 1. Simmel, Georg. 1908. Soziologie Untersuchungen über die Formen der Vergesellschaftung. Berlin: Duncker & Humblot (first edition). 2. ISO/IEC 25010 Systems and software engineering—Systems and software Quality Requirements and Evaluation (SQuaRE)—System and software quality models. 3. Merriam-Webster. https://www.merriam-webster.com/dictionary/system. Accessed 30 Apr 2018. 4. Takasaki, Haruo, et al. 2010. “Study on the preference of the consumers on the personalization service based on personal data”, The 27th meeting of the Institute of Electronics, Information and. Communication Engineers. http://www.jsicr.jp/doc/taikai2010/A-1%20Takasaki.pdf. Accessed 10 Sept 2016. 5. Nakamoto, Satoshi. 2008. Bitcoin: A Peer-to-Peer Electronic Cash System, Bitcoin.org, Oct 31. https://bitcoin.org/bitcoin.pdf. Accessed 10 Sept 2016.

Are Blockchains Trustless?

37

Nat Sakimura is a well-known identity and privacy standardization architect at NAT Consulting and the Chairman of the Board of the OpenID Foundation. Besides being an author of such widely used standards as JWT (RFC7519), JWS (RFC7515), OAuth PKCE (RFC7636) and OpenID Connect that are used by over three billion people, he helps communities organize themselves to realize the ideas around identity and privacy. As the chairman of the board of the OpenID Foundation, he streamlined the process, bolstered the IPR management, and greatly expanded the breadth of the foundation spanning over 10 working groups whose members include large internet services, mobile operators, financial institutions, governments, etc. He is also active in public policy space. He has been serving in various committees in the Japanese government including the Personal Data Working Group of the Ministry of Economy, Trade and Industry and the Study Group on the Platform Services of the Ministry of Internal Affairs and Communications.

The Bitcoin “Consensus” Problems Shinichi Miyazawa

In discussion about blockchain, the term “consensus” is often used. For example, somebody might explain the blockchain’s function as follows: “It is a data-sharing mechanism where multiple computers form a ‘consensus’ on the validity of the data without relying on a central authority.” The contributors to this book also use the term “consensus;” in this chapter, we take a close look at what it means. “Consensus” is a word we often use in our everyday lives. For example, we might say “companies reached a consensus to collaborate.” However, if we use “consensus” in this sense, we cannot productively discuss how blockchain works. Furthermore, academic and industrial researchers involved in distributed systems have been working on the “consensus problem” for more than 30 years. Here, the “consensus problem” deals with how multiple distributed computers can form a consensus between themselves. In fact, algorithms and programs have already been designed to solve the consensus problem. However, even if we take the distributed systems definition of “consensus” in the blockchain discussion, we still often obtain multiple meanings of “consensus” and must further discuss what is meant by “consensus” in each case. This is because, when discussing blockchain technology, we use “consensus” without properly defining who is agreeing upon what. Naturally, the discussion becomes confusing when each person has different notions of these. In short, we have not formed a consensus over what “consensus” means in the context of blockchain. In the following discussion, we focus on Bitcoin (from which the blockchain originates) and examine how the technology has come to be associated with distributed systems problems. The first section discusses the consensus problem, and the second section focuses on the “Byzantine Generals’ Problem,” which is closely related to the consensus problem. S. Miyazawa (&) Secom Intelligent Systems (IS) Lab, Tokyo, Japan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_6

39

40

1

S. Miyazawa

What Bitcoin is Intended to Accomplish

To understand what is meant by “consensus” in Bitcoin discussions, let us review Bitcoin’s original aim, as described in Satoshi Nakamoto’s paper [1]. Bitcoin is a “virtual currency system.” In other words, the purpose of the system is to use Bitcoin as a currency. For example, imagine using it as a currency to buy and sell a product, such as coffee. A major fraud that such virtual currency systems would face is double-spending (see Fig. 1). In Fig. 1, X pays A $80 and B $70, using data that represent only $100. In this example, the receivers A and B cannot detect the fraud because the “$100 data” that X used to pay is digital data and can be duplicated arbitrarily. In pre-Bitcoin virtual currency systems, an authority was established to centrally manage the system, to prevent these frauds and manage money issuance. What Bitcoin attempted was to create a peer-to-peer (P2P) virtual currency system that prevents double-spending and can issue currency without a central authority. To achieve it, Bitcoin implements the idea that all computers connected to the Bitcoin network hold all records of money issuance and transactions (see Fig. 2). Expressed otherwise, the system is based on the idea that each computer holds the same data, rather than using a central authority to manage the system from a single location. Indeed, as of September 2017, each computer connected to the Bitcoin network holds approximately 130 GB of data [2]. With each computer holding the entire transaction history, double-spending can be detected and prevented. To preserve the health of the Bitcoin system, the computers connected to it must possess an identical set of transaction history data.

Fig. 1 Double-spending with a virtual currency

The Bitcoin “Consensus” Problems

41

Fig. 2 The difference in recordkeeping architectures between traditional and Bitcoin models

When discussing this mechanism, the word “consensus” is often used. Roughly speaking, it appears in statements such as “double-spending frauds conducted through the manipulation of the transaction order are prevented by the majority of computers forming a ‘consensus’ on the correct order of the transaction history.”

2

What Does This Term “Consensus” Refer To?

Consensus is used in Bitcoin without distinguishing it from replication. Confusingly, the term “consensus” in the Bitcoin discussion is not clarified as to whether it means “replication” or “consensus” in the distributed system research. Briefly speaking, “replication” is a process ensuring that data held by multiple processes (i.e., running programs) are kept identical at all times; meanwhile, “consensus” is a multi-process procedure that selects a single value within a finite time period. The following is a detailed explanation of the differences between replication and consensus in a distributed system. If multiple computers hold copies of identical data, then if some of the computers fail or cease to operate, consistent service can still be provided. Unfortunately, it is difficult to ensure that the replicated data on mutually distant computers are kept identical at all times. Thus, distributed systems research has considered how to make replicated data. In particular, a considerable quantity of research has been dedicated to models and protocols concerning “consistency,” which is the degree to which replicated data are identical. “Consensus” within a distributed system is one means of achieving this replication.

42

S. Miyazawa

Fig. 3 Replication by consensus

If we can achieve a consensus between processes, then a broadcasting mechanism can be constructed to allow all agreeing processes to receive data in the same order. (This is also called an “atomic broadcast” or “total order broadcast” [3]. One precondition is that no lost or broken data are present during the broadcast). By designing such a communication mechanism, replicated data can be made identical in all consenting processes (see Fig. 3). In this example, once the processes have agreed to write the value “1,” they write the data. The example presented here is fairly abstract; however, to avoid network asynchrony or faults when reaching a consensus, we outfit the system with a consensus mechanism that employs several modules, including a module to elect a leader (also referred to as an “omega failure detector [4].” A module that identifies broken as well as unbroken processes [i.e., by finding leaders] is also called a failure detector). Typically, the consensus problem is addressed with a fixed number of processes and a time limit within which the consensus must be reached. In practical applications, the faster the consensus, the better. On the other hand, with continuously replicating data, we cannot know what assembly of data we have until the system itself finally completes its task and terminates.

3

What is Consensus in a Distributed System?

Here, we develop a precise explanation of what is meant by consensus in a distributed system.

The Bitcoin “Consensus” Problems

43

There are many applications that employ consensuses. The replication outlined above is one method of realizing it. In the field of distributed systems, the algorithms studied thus far have been designed to ensure that distributed computers use a single value. This problem—in which we consider algorithmic solutions—is called the “consensus problem.” “All correct processes propose a value and between them must agree on some value related to the proposed ones.” Furthermore, some systems require that a consensus be reached within a certain number of steps. Figure 4 illustrates such a system. The consensus problem is difficult to solve because it requires algorithms that enable normal processes to reach a consensus even when a fault occurs within a process or the network. If the algorithms are not robust against faults and one occurs during consensus-forming, different processes will decide upon different values, thus precluding a consensus being reached between all processes. Furthermore, to overcome these faults, the volume of messages exchanged must be increased; then, the questions of how to shorten the consensus-forming time and to exchange messages more efficiently become important. In addition, we cannot define algorithms to solve the consensus problem without considering various properties of the envisaged distributed system; for instance, whether or not messages will be delivered within the prescribed time (i.e., whether we use a synchronous or asynchronous model). Thus, academic papers first clarify the properties of the distributed system in mind, as well as the models of possible faults; then, they discuss algorithms to solve the consensus problem. Many papers define the following three properties, to frame the problem more precisely. Termination: A limit on how many steps are required to reach consensus. Validity: When the proposed value is the value on which consensus is formed. Agreement: When no existing processes agree to different values. Another property is Integrity, which indicates that an agreed-upon value is not overridden.

Fig. 4 The consensus problem

44

4

S. Miyazawa

Bitcoin Consensus Has Always Been Ambiguous

As described above, consensus in a distributed system is a clearly defined term that refers to an agreement between processes. In contrast, the term “consensus,” as regarded by people discussing blockchain, can have a variety of meanings, as illustrated below. Different people use the term in different ways, confusing the discussion. • Consistency of replicated data (e.g., “consensus” as defined at bitcoin.org [5]) • All algorithms that enable replication (e.g., the “consensus mechanism” that appears in Satoshi Nakamoto’s paper [1]) • Consensus as studied in distributed systems (e.g., some have highlighted that Bitcoin’s “consensus” is not that of a distributed system [6].) • Consensus between humans (e.g., consensus to a transaction between Bitcoin users, consensus on development policy within the Bitcoin developer community) Clarifying the conversation on these points will smooth out these discrepant discussions. So, what caused this confusion over the definition of “consensus” in Bitcoin? One probable factor is that Satoshi Nakamoto, in his initial Bitcoin paper [1], uses words such as “agree” and “consensus” in an ambiguous manner, without stating what agrees to what. The paper is vague on whether humans or processes are agreeing, as well as what they are agreeing to. Here, we examine an excerpt containing the first instance of the word “agree.” To accomplish this without a trusted party, transactions must be publicly announced, and we need a system for participants to agree on a single history of the order in which they were received. The payee needs proof that at the time of each transaction, the majority of nodes agreed it was the first received. Here, the word “agree” appears twice. In the first instance, the subject is described as being “participants.” These would seem to be Bitcoin users; that is, human beings. What they agree to is that, so long as they are using Bitcoin, only “a single history of the order in which (transactions) were received” will exist. The subject of the sentence containing the second instance of “agree” is “nodes.” Nodes typically refer to computers; however, here they could be computers or processes. What they agree on is that only one blockchain history exists, and that no double-spending will occur. In Fig. 5, the contents of this passage are illustrated. Meanwhile, the final sentence of the paper uses the word “consensus” to refer to the process of reaching agreement. Any needed rules and incentives can be enforced with this consensus mechanism. It is uncertain what “consensus mechanism” refers to in this sentence, but perhaps it refers to the Bitcoin mechanism as a whole. As before, what is agreeing to what remains unclarified.

The Bitcoin “Consensus” Problems

45

Fig. 5 The use of “agree” in Nakamoto’s Bitcoin paper

As shown above, it is unclear what agrees to what in the original Bitcoin paper.

5

The Consensus Problem and the Byzantine Generals’ Problem

Here, we consider the problem of consensus by discussing the Byzantine Generals’ Problem—a key topic in Bitcoin conversations—and by referencing a paper by Michael J. Fischer [7]. To fully introduce the Byzantine Generals’ Problem, we take some time to explain it step-by-step, as follows: 1. Consensus Problem 2. Interactive Consistency Problem 3. Generals’ Problem

46

S. Miyazawa

4. Byzantine fault 5. Byzantine Generals’ Problem 1. Consensus Problem The consensus problem occurs when all normally functioning processes propose a value. These processes then agree on one of those values. 2. Interactive Consistency Problem How can we ensure that all processes agree on one value? One feasible method is to inform all processes of the proposed values. In other words, if each process receives a list of the values proposed by all other processes (i.e., an interactive consistency vector), it can use rules common to all processes (e.g., minimum value, maximum value, median, majority vote) to pick out one value. Thus, the problem of proposing algorithms that inform all processes of their peers’ proposed values is referred to as the “Interactive Consistency Problem” [20] (see Fig. 6). 3. Generals’ Problem The Generals’ Problem is a special case of The Interactive Consistency Problem, in which only one process proposes a value (see Fig. 7). Here, we use military generals to represent processes. All processes that propose or know of values are generals. The Generals’ and Interactive Consistency Problems are very similar; this is because, if an algorithm that solves The Interactive Consistency Problem can focus on only one of the processes proposing values, then it also solves The Generals’ Problem. Likewise, if an algorithm can solve The Generals’ Problem, then executing that algorithm for all processes will solve The Interactive Consistency Problem. As previously mentioned, if we can solve The Interactive Consistency Problem, then we can also solve The Consensus Problem.

Fig. 6 Interactive consistency problem

The Bitcoin “Consensus” Problems

47

Fig. 7 Generals’ problem

4. Byzantine Fault Thus far, we have explained The Consensus Problem, The Interactive Consistency Problem and The Generals’ Problem. These may appear ordinary and easy-to-solve. However, in reality it is difficult to develop an algorithm that solves them. This is because, for every problem, we must consider what happens if a process becomes faulty. We must consider algorithms that can reach a consensus between normally functioning processes even if one process experiences a fault during operations. For example, in the Interactive Consistency Problem, if various processes propose different values to each other, or if some processes seem to have halted and propose no values, then different processes will recognize different interactive consistency vectors. If each process has a different interactive consistency vector, it becomes difficult to agree on a single value. This kind of process fault is called a “Byzantine fault,” which is a reference to the reportedly traitorous generals of the Byzantine Empire. The Byzantine fault is also called an “arbitrary fault” and refers to various anticipated faults (see Fig. 8). 5. Byzantine Generals’ Problem The explanations above provide the proper setting in which to discuss the Byzantine Generals’ Problem [8], which uses Byzantine generals as an analogy for the occurrence of a Byzantine fault (see Fig. 9). The aforementioned Byzantine fault gets its name from the Byzantine Generals’ Problem. To give an analogy with a Japanese character, I hereby pose the “Sengoku daimyō problem.” Imagine that in a battle of the Sengoku (“Warring States”)

48

S. Miyazawa

Fig. 8 Interactive consistency problem with a Byzantine fault

Fig. 9 Byzantine generals’ problem

Period—when Japan was thrown into a lengthy civil war in which daimyō (feudal lords) vied for dominion over the country—a daimyō sends a message to the other daimyō, resulting in confusion among them. For example, consider the decisive battle of Sekigahara; here, Tokugawa Ieyasu and his Eastern Army finally dealt a fatal blow to supporters of the Toyotomi clan and their Western Army, thus establishing a shogunate that would last for more than two centuries. [A3] Imagine that Tokugawa sends messages to his Eastern Army daimyō giving the order to charge and attack. If all the Eastern Army daimyō charge with all their forces, then they will beat the Western Army. However, if only some of the daimyō attack, then the Western Army will repel them. Although a daimyō knows that his allies have received the same message to attack, he does not know at this time whether they will actually follow the order.

The Bitcoin “Consensus” Problems

49

Now, imagine that all the Eastern Army daimyō double-check with each other by sending messages to all their allies asking, “We’re all charging, right?” As in the earlier explanation of the Byzantine Fault, the daimyō behave inconsistently: some send messages saying they will charge, some say they will not charge, and others do not send any reply at all. We also imagine a scenario in which Tokugawa—who issued the original order— himself experiences a Byzantine fault. The problem of finding an algorithm that can decide whether or not to charge when no unanimous consensus exists among the Eastern Army daimyō (excluding those experiencing a Byzantine fault) is what we call the Byzantine Generals’ Problem.

6

Byzantine Faults Largely Ignored

Numerous researchers have studied the behavior of systems during Byzantine faults; however, in terms of practical application, most systems do not account for them. Notable exceptions are those at risk of losing lives if failure occurs, such as aircraft. For example, consensus algorithms such as those used for the US Internet search firm Google’s systems or for Hadoop [9] generally do not take Byzantine faults into consideration. Well-known examples of these consensus algorithms in the cloud computing industry include Paxos [10] and ZAB [11], which are designed with crashes in mind. These limit the number of faults that can cause a process to halt. Cloud companies such as Google are more concerned about crashes because the more computers in use, the more inevitable it becomes that some fail or crash. Meanwhile, cloud companies do not consider Byzantine faults because it is unrealistic to do so. The cause of a Byzantine fault—especially when different processes propose various different values—could be a hardware problem; however, it is generally a bug, malware, or similar device. Therefore, in an administrator-run system, developers will not use inefficient consensus algorithms that account for Byzantine faults, instead focusing their energies on debugging and improving security as a method of fault prevention. Consensus algorithms are considered as “inefficient” because a distributed system requires more processes to account for Byzantine faults, and these processes must exchange a far greater number of messages. Developers know that the algorithms that solve Byzantine faults require exponentially more messages between the processes deemed to be at risk of Byzantine faults. Although more recent research has found ways to reduce the message load, a realistic algorithm as yet to be found, and little progress has been made beyond the realm of research. Recently, a proposed solution, the Practical Byzantine Fault Tolerance (PBFT) algorithm [12] caused a stir. The word “practical” in the title suggests that previous algorithms are inefficient. However, even with PBFT, it is difficult to add many more processes.

50

7

S. Miyazawa

P2P Systems and Byzantine Faults

In the fight against Byzantine faults, developers prioritize removing bugs and preventing malware intrusions; however, we cannot completely prevent Byzantine faults occurring in these operating environments. One such environment is a P2P system such as Bitcoin’s; here, developers cannot predict which computers will be connected. Research into the Byzantine Generals’ Problem and other consensus problems has thus far only studied systems in which a fixed number of processes participate in consensus-forming. Accordingly, many of the consensus algorithms installed in systems are equipped with modules that control the membership of processes involved in the consensus. Meanwhile, the advent of P2P systems has created environments where the processes participating in the system are in a dynamic, uncontrollable state. Researchers are currently exploring whether it is possible to reach a consensus in such systems. We now discuss two research papers: the first is by John R. Douceur [13] and concerns a phenomenon called a “Sybil attack,” the second is by Aspnes et al. [14], who develop Douceur’s work. Douceur’s paper proves that, in an environment where anonymous computers are free to join or leave (such as a P2P system), attacks causing a failure of the algorithm whereby consensus is influenced by the number of user IDs (or processes)—for instance, an algorithm premised on majority votes—is a persistent possibility. Douceur calls this a “Sybil attack [A4].” Sybil is the name of the protagonist in a novel; she has 16 personalities. As the name implies, in a Sybil attack, a user creates multiple user IDs (or processes) on one computer and connects to a P2P system. The user employs their enhanced voting power to control the voting results in the system (see Fig. 10). In the crucial section of Douceur’s paper, he states that even if a P2P system is equipped with a mechanism that requires a certain amount of computation to generate a user ID (expressed as a computational puzzle instead of PoW), the one with the most computational resources can obtain control. The 2005 paper by Aspnes et al., published 3 years after Douceur’s, presents research conducted on mechanisms more similar to Bitcoin’s. Aspnes et al. argue that the preconditions Douceur gives for the Sybil attack are unfair to the honest users. In Douceur’s Sybil attack scenario, the conditions state that the hostile users generate multiple IDs (i.e., processes) on their computers, whereas the honest users only generate one ID for each of theirs. Aspnes et al. therefore came up with a consensus algorithm in which the honest users also create multiple IDs on their computers.

The Bitcoin “Consensus” Problems

51

Fig. 10 Sybil attack

Just before executing the consensus algorithm, the groups of honest and hostile computers compete on their combined computational power. From this contest, the number of IDs obtained by each group is determined according to its computational power. Then, the system uses the obtained IDs to execute a consensus algorithm, which has been pre-designed to handle a certain kind of Byzantine Fault (see Fig. 11). Thus, Aspnes et al. argue that processes in a P2P system can still form a consensus. Even if imposters are present, an algorithm that solves the Byzantine Generals’ Problem can reach a consensus formed only by the honest users. However, despite this research, control over consensus-forming still lies with those implementing greater computational resources. On this point, the result is the same as in Douceur’s research.

8

Bitcoin and the Byzantine Generals’ Problem

Then, in 2008, the Bitcoin paper appeared. However, the Bitcoin paper does not explicitly mention the Byzantine Generals’ Problem. This paper neither proposed nor emphasized the connection between Bitcoin and the Byzantine Generals’ Problem; rather, this came from a discussion about encryption that was held in a mailing list.

52

S. Miyazawa

Fig. 11 Exposing computationally-challenged Byzantine imposters

Satoshi Nakamoto first released his original paper on this mailing list. In the discussion that followed on the list, Nakamoto stated, “The PoW chain is a solution to the Byzantine Generals’ Problem.” However, in this email, Nakamoto neither mentions the processes that cause a Byzantine Fault nor does he explain the Generals’ Problem as described in this essay. Nakamoto’s explanation in the mailing list concerns the Consensus Problem in an asynchronous environment. Because this email is lengthy; hence, only a summary is presented here. The reader interested in checking the details is referred to the source in the References, listed at the end of this chapter [15]. To begin, the generals (i.e., processes) propose a value. It is beneficial if they decide upon an agreed value based on the value each one first receives; however, because messages are not sent and received simultaneously, the first message one general receives may differ from that of another. This means they cannot reach a unanimous consensus. Thus, the PoW is applied. A general receiving a value from another general will run a computation of around 10 min. If that computation produces the answer (i.e., the PoW), the general then broadcasts this PoW to the other generals.

The Bitcoin “Consensus” Problems

53

Generals receiving the PoW form it into a chain. The PoW chain is built by prioritizing longer chains. Generals use the later PoW to begin computing again. Repeating this process 12 times will result, 2 h later, in a single value combined from 12 PoWs, which each general now possesses. That this algorithm solves the Byzantine Generals’ Problem or Consensus Problem has not been proven. Furthermore, it is unclear why the Byzantine Generals’ Problem came up in this conversation. Nevertheless, Bitcoin is a P2P system in which a Byzantine Fault could occur. What would constitute a Byzantine Fault for Bitcoin? Byzantine Faults describe a variety of faults that can occur in processes. Consequently, we can also describe a fault as Byzantine if—for some reason—it causes a Bitcoin process to crash. Another fault, unique to virtual currency systems, was introduced at the beginning of this chapter; that is, when a process attempts to engage in double-spending. In other words, the fault sends a transaction that contradicts the virtual currency system’s purpose (see Fig. 12). This is a typical example of a Byzantine Fault in Bitcoin’s system. In Bitcoin, this double-spending fraud is eliminated through “mining,” which serves as the aforementioned PoW. The question then becomes, “Has Bitcoin experienced the Byzantine Generals’ Problem?” If so, has Bitcoin solved it?

Fig. 12 Example of a Byzantine fault in Bitcoin

54

S. Miyazawa

First, it is clear that Bitcoin—a P2P system that numerous unidentified nodes may join—needs to anticipate Byzantine Faults. Furthermore, this system contains two types of processes: those that propose transactions or blocks, and those that accept those proposals. As a result, the Byzantine Generals’ Problem is an integral issue for Bitcoin. However, the Bitcoin system undergoes phases in which the agreement of processes to a single value is unclear; furthermore, replicated data cannot be overridden later. Hence, conventional ideas about the Consensus Problem do not provide us with a sufficient understanding. Furthermore, Bitcoin has not clearly defined the properties of its distributed system (agreement, termination, validity, etc.) in such a way as to frame the consensus problem (although one study [16] has attempted to provide definitions). Thus, Bitcoin can be seen as attempting to address the Byzantine Generals’ Problem; however, in my view, we cannot affirmatively declare the problem solved.

9

Humans Consent is a Must for the System’s Administrative Policy

I believe fresh memories remain of the uproar that resulted from Bitcoin’s split in August 2017. The issue arose because the decision over which faction’s Bitcoin was to be adopted was made by humans, rather than an automatic computerized selection (i.e., a consensus). The final event to be considered in this chapter is the blockchain split that accompanied a program upgrade in March 2013. This was an unintentional split, rather than one that humans had intended beforehand. An important fault found in distributed systems occurs as a result of a program upgrades. The upgrade changes the composition of the data in messages and protocols, thus creating unforeseen combinations that produce faults. No person has complete control over the program used on a P2P system’s network; thus, ensuring that everyone uses the same version is especially challenging. When Bitcoin upgraded from version 0.7 to 0.8, the blockchain underwent a major split [17]. To summarize what happened: after version 0.8 was announced, versions 0.7 and 0.8 were both run simultaneously on the Bitcoin network (see Fig. 13). Version 0.8 removed 0.7’s block size limit, allowing for the generation of larger blocks. However, 0.7’s program could not receive the large blocks owing to their size. This created a blockchain split between 0.7 and 0.8 (see Fig. 14). This problem was quickly discovered. Developers and miners used a chat program to discuss the issue. In the end, the miners using version 0.8 agreed to give up on the version 0.8 blockchains they had made, even if it meant making a loss.

The Bitcoin “Consensus” Problems

55

Fig. 13 Mixture of versions 0.7 and 0.8 (on the P2P network of Bitcoin)

Fig. 14 Mixture of versions 0.7 and 0.8 (on the Bitcoin blockchain)

This case illustrates how changing the version of a program that structures a distributed system can occasionally produce a fault. It also shows the fundamental challenges involved in producing programs and systems that are completely free of bugs. In short, unexpected faults are always possible. In these situations, human judgment may need to step in and manage the problem. When a system designed with no central authority experiences a fault, human judgment becomes necessary; thus, a very important question is whose judgment to rely upon without contradicting the decentralized nature of the system. Whether engineering an intentional split or recovering from an unintended one, we cannot rely solely upon Bitcoin’s mechanisms in their current form. The decisions of developers, users, and other human beings must still exert a large influence.

56

10

S. Miyazawa

We Need to Clearly Specify Who is Agreeing With Whom About What

We have focused on the Byzantine Generals’ Problem—closely associated with the Consensus Problem—and explored its relationship with Bitcoin. Bitcoin was developed as a P2P system. We do not know what programs are connected in a P2P system; thus, these systems have been developed with an aim toward tolerating Byzantine Faults. Bitcoin-like blockchain environments are open to anyone (i.e., a public blockchain); however, consider a private blockchain equipped to better control the participating processes. Here, the need for clear-headedness in establishing whether we should consider Byzantine faults from the outset, what other kinds of faults should be anticipated, and other matters, is paramount. In this chapter, we briefly reviewed the Consensus Problem and Byzantine Generals’ Problem in distributed networks; these problems have been studied prior to Bitcoin. Then, we examined both problems in Bitcoin. The premise of our discussion has been consensus; however, we began the chapter by considering the ambiguity surrounding consensuses in discussions of blockchain, because the definition of “consensus” has a clear meaning within the distributed systems R&D field. On the other hand, people who are not distributed systems specialists can engage in these discussions while interpreting consensus as agreement between human beings. Under the current circumstances, I feel that this interpretation prevents us from coherently and productively discussing Bitcoin and blockchain technologies without running into misunderstanding. If we are to continue using words such as “consent” and “agree,” then whenever we do, we should certainly clarify who is agreeing with whom, and about what they are agreeing. Otherwise, we must consider alternative terminology. Research into providing structure for Bitcoin and blockchain is still in the investigative stages [18, 19]. Most likely, new terminology for discussing Bitcoin’s mechanisms will gradually become standardized over time, much like blockchain has.

References 1. Nakamoto, Satoshi. 2008. Bitcoin: A peer-to-peer electronic cash system. 2. Blockcain.info. Blockchain size. https://blockchain.info/charts/blocks-size. 3. Hadzilacos, Vassos, and Sam Toueg. 1994. A modular approach to fault-tolerant broadcasts and related problems.

The Bitcoin “Consensus” Problems

57

4. Deepak Chandra, Tushar, Vassos Hadzilacos, and Sam Toueg. 1996. The weakest failure detector for solving consensus. Journal of the ACM (JACM) 43(4): 685–722. 5. Bitcoin Project. Bitcoin glossary. https://bitcoin.org/en/glossary/consensus. 6. Harper, Jim. 2016. It isn’t ‘consensus’: Toward cooler protocol debates, September. http:// www.coindesk.com/isnt-consensus-toward-cooler-protocol-debates/. 7. Fischer, Michael J. 1983. The consensus problem in unreliable distributed systems (a brief survey). In International Conference on Fundamentals of Computation Theory, 127–140. Springer. 8. Lamport, Leslie, Robert Shostak, and Marshall Pease. 1982. The Byzantine generals’ problem. ACM Transactions on Programming Languages and Systems (TOPLAS) 4 (3): 382– 401. 9. Shvachko, Konstantin, Hairong Kuang, Sanjay Radia, and Robert Chansler. 2010. The hadoop distributed file system. In Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, 1–10. IEEE. 10. Lamport, Leslie. 2001. Paxos made simple. ACM Sigact News 32 (4): 18–25. 11. Paiva Junqueira, Flavio, Benjamin C. Reed, and Marco Serafini. 2011. Zab: High-performance broadcast for primary-backup systems. In Dependable Systems & Networks (DSN), 2011 IEEE/IFIP 41st International Conference on, 245–256. IEEE. 12. Castro, Miguel, Barbara Liskov, et al. 1999. Practical Byzantine fault tolerance. In OSDI 99: 173–186. 13. Douceur, John R. 2002. The sybil attack. In Peer-to-peer Systems, 251–260. Springer. 14. Aspnes, James, Collin Jackson, and Arvind Krishnamurthy. 2005. Exposing computationally-challenged Byzantine impostors. Department of Computer Science, Yale University, New Haven, CT, Tech. Rep. 15. Satoshi Nakamoto. 2008. Re: Bitcoin p 2p e-cash paper, Nov. http://www.mail-archive.com/ cryptography%40metzdowd.com/msg09997.html. 16. Eyal, Ittay, Adem Efe Gencer, Emin Gün Sirer, and Robbert Van Renesse. 2016. Bitcoin-ng: A scalable blockchain protocol. In NSDI, 45–59. 17. Andresen, Gavin. 2013. March 2013 chain fork post-mortem. https://github.com/bitcoin/bips/ blob/master/bip-0050.mediawiki. 18. Pass, Rafael, Lior Seeman, and Abhi Shelat. 2016. Analysis of the blockchain protocol in asynchronous networks. Cryptology ePrint Archive, Report 2016/454. http://eprint.iacr.org/ 2016/454. 19. Garay, Juan, Aggelos Kiayias, and Nikos Leonardos. 2015. The bitcoin backbone protocol: Analysis and applications. In Advances in Cryptology-EUROCRYPT 2015, 281–310. Springer. 20. Pease, Marshall, Robert Shostak, and Leslie Lamport. 1980. Reaching agreement in the presence of faults. Journal of the ACM (JACM) 27 (2): 228–234.

Shinichi Miyazawa works for a security company, SECOM CO., LTD as a research engineer. He was a member of the development team of the very large distributed system for security services for five years. After that, he researched and developed systems using digital signature for five years. His current research interests are distributed systems, local differential privacy and operation for secure SoC. He was the Chief Examiner of the System Working Group to the Editorial Committee of IPSJ Magazine in 2017.

The Myth of “Blockchain is Scalable” and Real Challenges Masanori Kusunoki

“Blockchain changes the global payment system.” Regardless of the validity of this statement, its basis is that “a blockchain is a database that can be scaled up at a low cost,” and “can be scaled to a large scale and at a lower cost than conventional server-centric systems.” Believing this, a “myth” appears and disappears. However, the current blockchain reality is very different. Taking Bitcoin—the largest blockchain currently in operation—as an example, the amount of transaction data it can process is about 1 MB every 10 min. The amount of data per transaction depends on the nature of the transaction, which on average result in seven transactions per second. This processing capability cannot handle the global payment infrastructure. In addition, the IT resources required to process the transaction cost 6.25 BTC (approximately $400,000 at the conversion rate at the time of translation) every 10 min, even if only the mining costs are considered. The unit price is tens of dollars. This is comparable to or slightly more expensive than bank transfer fees, thus it is incorrect to say “it can be transferred at low cost.” In the Bitcoin ecosystem, users are unaware of the cost because they do not pay mining rewards and use money generated from the issuance of money (*1). This is the current state of Bitcoin. * A bitcoin’s operating costs should include not only mining fees but also the operating costs of the full-node servers that hold all Bitcoin transactions. However, because these are the lunch boxes for the participants, it is difficult to calculate the cost of the entire system.

M. Kusunoki (&) Japan Digital Design, Tokyo, Japan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_7

59

60

M. Kusunoki

How far can we improve the performance of such powerless and incomplete blockchains with technical ingenuity? This question is the “blockchain scalability problem.” In this section, we clarify the challenges faced by current blockchain technologies, while considering the various efforts made to solve the scalability problem, using Bitcoin as an example. Two completely different scalability issues. The “blockchain scalability problem” actually refers to two completely different technical problems. One is the problem of how to increase the data-processing capacities of Bitcoin. As a result of the recently increased transaction volume of Bitcoin, the upper limit of the block size generatable per unit time—that is, every 10 min—has been used up in less than 10 min. If this happens, the transaction data are not easily stored in the block, and the confirmation of the transaction is greatly delayed. The only solution to this problem is to increase the processing capacity of Bitcoin. The other issue is whether the blockchain technology itself can increase its processing capacity; for this, it is compared against conventional information systems, as well as Bitcoin blockchains. First, in the current situation (where even the definition of the word blockchain has not been established), it is difficult to make a precise comparison. The technology that calls itself blockchain promises data consistency at different levels. The mechanism that ensures consistency is collectively referred to as the “consensus algorithm” in the blockchain, though it is unclear what is being called a consensus. For example, if a blockchain is operated in a single data center with a wide bandwidth and small delay, and the requirement for data synchronization is relaxed, a reasonably high transaction processing performance can be achieved. However, we have not yet reached a level where we can compare its performance in a real environment. Internally, many blockchains use a NoSQL-based database (DB) backend and implement various functions (such as blockchains) as application/application programming interface (API) layers. Bitcoin uses “LevelDB,” which is published by Google. It seems that such database performances can be compared with those of existing information systems, such as distributed databases. However, Bitcoin does not aim for efficiency as a distributed database. When comparing a blockchain and distributed database, we must attend to the difference between the two goals and the characteristics. Bitcoin is pioneering in the sense that even if multiple mutually untrusting entities are interconnected in the system, the consistency of the distributed ledger is maintained; this is a feature that conventional distributed databases do not possess. On the other hand, the Bitcoin mechanism only maintains the total issuance

The Myth of “Blockchain is Scalable” and Real Challenges

61

according to the algorithm, and it does not guarantee the general data consistency required for a distributed database. Moreover, in the case of blockchain, past transactions are increasingly more probable; however, transactions are truncated in the process, allowing data between nodes to be momentarily inconsistent. Similarly, many other blockchain implementations are also attempting to provide a mechanism for preventing falsification and ensuring data integrity in higher layers, for environments in which multiple entities are interconnected. In the world of relational database management systems, benchmark tests have been developed by organizations such as tpc.org, along with an environment in which different products can be compared on common ground. This is because international standards—such as the SQL language—have been established, and similar workloads can be executed on different products. The environment surrounding the blockchain has not yet sufficiently matured. A standardization movement is also occurring in the blockchain. In the International Organization for Standardization (ISO), the “ISO TC307 Blockchain and Distributed Ledger Technologies” was established, and it is anticipated that standardization will continue for common APIs, use cases, data integrity requirements, and so on. Failure of Bitcoin XT and deadlock of maximum block size expansion. Here, we discuss blockchain scalability in such contexts, focusing on Bitcoin blockchains that are currently in operation and being considered for scaling at conferences such as “Scaling Bitcoin.’’ However, we will be realistic. Considering Bitcoin by transaction volume alone, it is a weak system that can process only about seven transactions per second. However, in terms of the number of nodes, the amount of money handled, the computational power mobilized for mining, and the attacks received every day, it can be said to be the largest blockchain in the world. In the original design, the maximum block size of Bitcoin was 36 MB; however, in 2010 it became the current 1 MB, as a countermeasure against spam and DDoS (Distributed Denial of Service) attacks. In 2013, the “Bitcoin Bubble”—where the price of Bitcoin soared—increased the number of transactions; furthermore, in 2015, blocks approaching the limit size were being generated every 10 min. In response to this, a full-scale review of the processing capacity of Bitcoin has been proposed. The transition of Bitcoin block size

62

M. Kusunoki

(Source blockchain.info) In June 2015, Gavin Andresen, one of the Bitcoin’s core developers, made a proposal to expand the maximum block size. His proposal was to expand the maximum block size to 8 MB on January 11, 2016, and to then double the block size approximately every two years until 2036 (BIP-0101). Andresen left the decision to a majority vote according to the miners’ processing power, similarly to how the block format was updated in 2012. Unfortunately, it did not receive the support of the miners. Many miners were mining where the network environment—for instance, China’s inland areas—was poor, and they disliked the disadvantages of increasing the block size. Since then, plans have been made to increase the maximum block size to 2 M bytes (BIP-0102); to increase the maximum block size according to technological progress (BIP-0103); to dynamically control the maximum block size based on consent (BIP-0106), until 2020; and to increase the maximum block size step by step, then increase the maximum block size by 10% when more than 60% of the transactions are at the maximum (BIP-0107). Although proposed, these have not gained sufficient support from the miners. SegWit can pack more transactions into the same block. Many of the above attempts have failed because the rule changes required to increase the maximum block size call for a software renewal (hard fork) for all users. Therefore, the proposal “Segregated Witness” (segregation of evidence: SegWit) appeared as a block expansion measure that eliminates the need for hard forks. SegWit was expected to deliver an immediate breakthrough for block issues.

The Myth of “Blockchain is Scalable” and Real Challenges

63

In the conventional Bitcoin specification, an electronic signature area (ScriptSig/ScriptPubKey) is included in the data for each transaction input/output. The size ratio of the electronic signature in the transaction data is high, which is one factor that puts pressure on the block size. In SegWit, the electronic signature area is arranged in a new field (called the “witness”) in the transaction, instead of a field for each input/output included in the transaction. With the introduction of the witness, it becomes possible to separate the digital signature areas in transactions and block processing. The transaction data (left) embedded in the block are separated into a digital signature (bottom right) and transaction contents (top right).

In SegWit, fields other than those related to the witness are targeted for transaction ID generation (TXID). This means that the existence of the witness is not considered when generating the transaction hash tree (the witness hash tree is generated separately and included in another area in the block, so that the witness itself is part of the hash chain). It remains separate. Simply separating the area of the witness’ electronic signature simply changes the data arrangement and does not affect the capacity of the entire block. However, by separating the witness from transaction hash tree generation, older versions of the software—that do not support SegWit—can receive blocks composed of transactions excluding witness and can calculate block sizes. In other words, the capacity occupied by the electronic signature—under a block size restriction of 1 Mbyte—can be released as usual and hold more transactions. Meanwhile, between new versions of SegWit-supporting software, rules regarding new blocks including witnesses can be applied. Thus, instead of a hard fork, it enables a soft fork in which the old and new software versions coexist.

64

M. Kusunoki

The advantages of introducing a witness extend beyond the block compression effect. For example, transaction malleability (an attack that modifies TXID without modifying digital signature data), which has been a problem with Bitcoin, includes a field in which instruction codes such as ScriptSig and ScriptPubKey are stored as targets for generating TXID. By introducing witness, these fields are excluded from the generation of TXID, which forms a countermeasure against transaction malleability. In addition, SegWit also reduces data-processing costs. The electronic signature need not be verified every time a transaction or block is transferred between nodes (for example, a past transaction that has already been verified). For such unnecessary items, lightweight clients can now choose to reduce processing costs by excluding witness transfer and processing. In addition, future expansion is expected, including a new electronic signature method that can generate multiple electronic signatures simultaneously in a transaction. SegWit was officially incorporated into Bitcoin-core from Bitcoin 0.13.1 (released on October 28, 2016), and the migration process proceeded according to the process defined in its background intelligent transfer service version, with timeout and delay (BIP-009). The node that mined at 95% capacity (1916 blocks) or more for about 2 weeks (2016 blocks) entered the lock-in period, at which point it announced the adoption of Segwit, and then mined at 5% capacity for about 2 weeks (2016 blocks). If there is no objection of more than %, a soft fork is actually performed and SegWit becomes effective. The block compression effect of SegWit is only about 30% (three times more per block), and it struggles to exceed 30 transactions/second when improved. Some highlight that this is insufficient to achieve the scalability required for Bitcoin. Even if there is a temporary margin in transaction capacity due to the application of Segwit, it will be necessary to either revise the maximum block size in the long run or to make the network itself hierarchical. Of these, off-chain payment technologies such as “payment channel” and “lightning network” (that make payments outside the blockchain) are attracting attention. Progress in efforts to improve scalability through tiering. Because of the increased capabilities of computers and databases, we tend to forget that paper book scalability was originally less than 1 transaction/second. Even so, financial transactions were possible because the books were distributed and managed based on credit. For example, in Japanese yen circulation, current accounts with the Bank of Japan (owned by the commercial banks) and the deposits held by the individual banks are managed in separate ledgers. Until the 1960s, when a second online system came into operation, deposits were managed separately for each branch. The current blockchain records all transactions in a single ledger; this is similar to every company or individual having a checking account with the Bank of Japan. If this can be hierarchized in the same way as payment systems in the real world, transactions can be localized in the ledger of each hierarchy. By doing this, it is possible to simplify the process of finalizing the transaction for a small settlement, and to shorten the time required for finalizing the transaction to less than 10 min.

The Myth of “Blockchain is Scalable” and Real Challenges

65

The simplest method for realizing off-chain transactions is one in which a user deposits a fixed amount of money and makes payments within the range of this amount. This technique is implemented inside server-type electronic money and bitcoin exchanges. However, with this method, there is a risk that refunds will not be possible; for instance, when an electronic money business operator or a bitcoin exchange fails. The feature of Bitcoin that operates without trust would be ruined. Thus, a payment channel was developed to conduct off-chain transactions without a trusted mediator. Two frequent trading partners deposit money that cannot be moved freely on the blockchain and conduct off-chain trading within the scope of the credit. This method allows quick off-chain trade with certain trading partners; however, it requires a deposit, as many deposits as there are trading partners. Applying this mechanism, the Lightning Network has been enhanced to allow transactions with multiple partners, within the scope of deposits reserved on the blockchain. In the Lightning Network, transactions are routed on multiple payment channels, in the same way as the Internet relays packets between routers to construct a network; this, permits off-chain payments with multiple partners. Provided you are connected on the payment channel network, you can trade with any third-party off-chain. Lightning Network is still an evolving technology; six major companies gathered in the Scaling Bitcoin conference held in Milan, Italy during October 2016 (Acinq, Amiko Pay, Bcoin/Purse.io, Bitfury, Blockstream, LightningLabs). Discussions toward standardization proceeded; however, sufficient reliability was not achieved, even at time of translation. The Lightning Network aims at a P2P distributed network facilitating payment. Similarly to the Internet, it was originally designed as a distributed system; however, if a connected entity tries to reduce the number of routes, it tends to become a star network. In the actual payment system, Zengin-Net aggregates transactions as a central counterparty, and the number of transactions is reduced through daily payments to the Bank of Japan. Seeing as the Lightning Network has been put to practical use, can we build a scheme that does not depend on a reliable mediator as originally planned? Can the overhead of routing itself be increased? Can it be easier to understand for consumers? Solutions to Bitcoin’s scalability problem are still proceeding by trial and error. Table: Proposals relevant to improving the scalability of Bitcoin BIP

Date

Title

BIP-0009 BIP-0101 BIP-0102 BIP-0103

2015/10/4 2015/6/22 2015/6/23 2015/7/21

Version bits with timeout and delay Increase maximum block size Block size increase to 2MB Block size following technological growth (continued)

66

M. Kusunoki

(continued) BIP

Date

Title

BIP-0105 BIP-0106 BIP-0107 BIP-0109 BIP-0141 BIP-0142 BIP-0143 BIP-0144 BIP-0145 BIP-0152

2015/8/21 2015/8/24 2015/9/11 2016/1/28 2015/12/21 2015/12/24 2016/1/3 2016/1/8 2016/1/30 2016/4/27

Consensus based block size retargeting algorithm Dynamically controlled Bitcoin block size max cap Dynamic limit on the block size Two million byte size limit with sigop and sighash limits Segregated witness (Consensus layer) Address format for segregated witness Transaction signature verification for version 0 witness program Segregated witness (Peer services) Getblocktemplate updates for segregated witness Compact block relay

Masanori Kusunoki works for Japan Digital Design, a FinTech company founded by Mitsubishi UFJ Financial Group as Chief Technology Officer. He engaged in building the systems infrastructure of Social Security and Tax Number system in Japan. He also leads the Japanese mirror committee of ISO/TC307 and OpenID Foundation Japan.

Unexpected Pitfalls of Bitcoin Kazue Sako and Ryo Furukawa

The Bitcoin blockchain achieves a strong tamper resistance by exploiting cryptographic technologies. However, regarding the security of Bitcoin as a cryptocurrency, it is not sufficient to evaluate only the tamper resistance of blockchain. Various perspectives should be taken into consideration; for instance, resistance to double-spending attacks and denial of service attacks. Bitcoin was the first system to use a blockchain technology and has the longest history. Since it was launched in 2009, there have been no major system terminations or data rollbacks except for minor program updates. However, some research papers have reported issues on its protocols or implementations that could result in incorrect payments, data tampering, or system failures under some circumstances. In this section, we introduce some of the research findings on blockchain security. These results teach us some of the vulnerabilities found in the history of studying Bitcoin, as well as how they have been fixed. Not all vulnerabilities have been resolved, and methods to evaluate and quantify security related aspects of blockchains are still under active research. (1) Selfish mining attack PoW mechanisms in blockchains were considered secure, provided the computing power of malicious nodes was less than 50% of the overall power. However, it was found that strategies exist whereby a malicious node can control block generation even if its computing power is less than 50%. “Selfish mining attack” is one of these strategies.

K. Sako Waseda University, Tokyo, Japan R. Furukawa (&) NEC Corporation, Tokyo, Japan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_8

67

68

K. Sako and R. Furukawa

The probability of mining nodes succeeding in block generation is considered to be proportional to their computational power. This observation seems natural because to generate a block, they must solve a crypto-puzzle. However, if a node employs “Selfish mining,” it can successfully generate a block with a higher probability then its own computing power. In particular, one research paper [1] has shown that a node with only 41% of the computational power of the whole network can—in theory—successfully generate a block with a probability exceeding 50%. The strategy lies in the other rule: that the “longer chain wins.” Although a node can solve a crypto-puzzle and generate a block candidate, it cannot be identified as a block generator unless the block resides in the longest chain. In “Selfish mining” attacks, a malicious node withholds from broadcasting the block he/she has successfully generated. While hiding that block, the node continues to mine on top of the hidden block. This strategy allows the node to mine future blocks while other mining nodes are mining old ones. If the node is lucky enough to generate the next block continuously, then it broadcast the hidden block. Now, while the other nodes work on top of the just-broadcasted block, the malicious node has the advantage of working on the block after next. In this way, the malicious node can continue to maintain its advantage of working on the longer chain (Fig. 1). The node is “selfish” because it does not broadcast and share its work results promptly and fairly with other nodes. By adopting this strategy, a node possessing 33% or more of the overall computational power can generate blocks with substantially higher probability than its ability allows. Moreover, if the node has a computing power of 41% or more, it can generate blocks with a probability exceeding 50%. So far, no effective counter measures to “Selfish mining” have been proposed. Thus, it should be recognized that a malicious node with less than 50% of the overall computing power has the capacity to control the blockchain.

Fig. 1 Selfish mining

Unexpected Pitfalls of Bitcoin

69

(2) Double-spending attack for “fast payment” in Bitcoin In Bitcoin, a payment is regarded as completed when the transaction is included in a block in the Bitcoin blockchain. By design, a new block is generated at an average rate of one every 10 min; therefore, it takes at least the same amount of time for a transaction to be completed. In addition, the block must be in the longest chain, rather than in forked short ones. Therefore, it is recommended to wait for five or more blocks to be written after the block in question, to increase the probability that the block is indeed included in the Bitcoin blockchain. To follow this advice, the user must wait roughly an hour after the transaction, to ensure that the payment was valid. However, it would be ridiculous to have customers wait for an hour to settle a payment in, say, a coffee shop. Therefore, a “fast payment” method was developed. In fast payment, a payment is considered as “completed” when the store’s node receives the transaction, verifies the signature, and confirms that the coin is unused. However, the method is apparently risky. Here, we illustrate a concrete attack that exploits a vulnerability of “fast payment,” whereby a malicious payer completes the “fast payment” without spending his/her Bitcoin [2]. The shop would give out goods to a malicious payer, without being able to claim the Bitcoin. It could be regarded as a double-spending attack because the malicious payer can claim the goods from the shop without using the coin. Here, we assume that nodes in the Bitcoin network—so-called “helper nodes”— support the attack (Fig. 2).

Fig. 2 Double-spending attack using helper nodes

70

K. Sako and R. Furukawa

When a malicious payer sends the coin (specifically, the identifier for Unspent Transaction Output called UTXO) to the shop in transaction “Tv,” he/she sends another transaction “Ta” to “helper nodes” using the same coin. The helper nodes send Ta to nodes that are not neighbors of the shop, to spread Ta on the network ahead of Tv. The shop, upon receiving the transaction Tv, verifies the signature, and confirms that the coin is unused. Following the fast payment procedure, the shop regards Tv as valid and provides whatever goods are being paid for. However, the other nodes receive Ta first, and thus discard Tv as a double-spending transaction. As a result, Tv is not diffused to the network. Finally, Ta is more likely to be included in the block than the transaction Tv received by the shop; therefore, Tv is invalidated. The serious issue with this type of attack is that most nodes cannot detect the attack, and the shop suffers from financial loss. In the current Bitcoin specification, the nodes receiving two transactions using the same coin discard the later transaction without spreading it to the other nodes. As a result, very few nodes receive both transactions, masking the attack. The existence of such a pair of transactions is strong evidence that the attacker’s account is malicious; thus, it is desirable to spread the pair for the purpose of account reputation management. In fact, the specification of Bitcoin-XT—which was developed a core developer of Bitcoin and improves the performance and scalability of Bitcoin—has been modified such that these malicious transactions are also spread, to manage the account reputation of its accounts. On the other hand, these specification modifications can be exploited by attackers; they can overload the network traffic by sending a large number of malicious transactions, resulting in a successful DoS (denial-of-service) attack. Bitcoin-XT avoids this risk by limiting the number of relayed transactions that use the same UTXO to two. At present, Bitcoin has not yet incorporated such specification changes. Therefore, the shops must understand the risks when adopting the protocol of fast payment with Bitcoin. (3) DoS attack to Bitcoin A research paper [3] has reported that it is possible to delay the transmission of valid information to a specific node, by exploiting a propagation protocol specification in Bitcoin. This attack may delay information transmission throughout the system and might cause a system shutdown or permit double spending. As shown in Fig. 3, when a node propagates a block or a transaction to another node in the Bitcoin specification, it first sends a hash value of the data in an “inventory message.” The node receiving the “inventory message” checks whether such a hash value exists in his/her repository; if it does not, it requests a data body from the source node. Abuse of this inventory message, either in block or transaction propagation, can cause information transmission delay, as depicted in Fig. 4.

Unexpected Pitfalls of Bitcoin

71

Fig. 3 Propagation of a block and transaction

Fig. 4 Mechanism for delaying a block and transaction

In the block propagation protocol, an attacker node sends an inventory message to a target node. Then, regardless of what the target node’s request is, it returns only ping messages continuously for 20 min until timeout occurs. This strategy allows the attacker node to successfully occupy the capabilities of the target node and delay the block propagation. Additional measures can be taken to attack other free connections of the target node; thus, it is possible to continuously bind and occupy the target node and its resources.

72

K. Sako and R. Furukawa

Similar attack can be performed to the transaction propagation protocol. After sending the inventory message to the target node, the attacker node returns only ping message in response to requests from the target node; this lasts for 2 min until timeout occurs, and it delays the propagation of transactions. Furthermore, the target node can be forced to make subsequent requests to the same attacker node, by controlling its queue. Thus, it is possible to control and delay the transaction’s propagation to the target node for an arbitrarily long time. This fraud delays the transmission of information and can be combined with other attacks to increase its effectiveness. For example, by combining it with selfish mining, the blockchain can be substantially controlled with much less computational power than selfish mining alone requires. Specifically, an attacker can control the blockchain with as little as 34% of the overall computing power. As a second example, because it is possible to delay the propagation of transactions infinitely, double-spending attacks can be easily implemented. Finally, the Bitcoin system can be halted by targeting not one but all nodes. It has been shown that the attacker node only needs the capacity to send 600 kilobytes of messages every 20 min to perform this attack. Several countermeasures against these attacks have been proposed by Bitcoin developers. For example, a propagation protocol that does not use “inventory messages” but sends a block header directly has been proposed. Moreover, a mechanism to filter the sender of the transaction by IP address has been proposed, so that one IP address does not occupy the whole connection. Another proposal allows a node to select the sender node at random, without using the queue. The first proposal is listed as “BIP 130” and was introduced in Bitcoin 0.12.0. (4) Expectations of security evaluation research into blockchain Many cryptographic protocols have been compromised owing to the implementation and its operation environment, though we try to at least guarantee their theoretical security. In the case of blockchain, the study of its theoretical security is still in its infancy. We do not yet understand how security properties can be defined for implementing blockchain. In the following, we introduce an interesting research paper [4] that provides an analytic view on the security of a PoW blockchain. In this paper, the authors define the “common prefix property” and “chain quality property” as desirable properties of blockchains, and they show the relationship of these properties to the probability of an attacker succeeding in PoW. Here, “common prefix property” evaluates the probability that the blockchain does not cause a fork, and “chain quality property” evaluates the probability that the block generated by an honest miner is included in blockchain. They demonstrated that, given the probability a that an honest miner succeeds in PoW and the probability b that a malicious miner succeeds in PoW, the two properties “common prefix property” and “chain quality property” can be evaluated specifically under certain conditions.

Unexpected Pitfalls of Bitcoin

73

These findings show how the respective likelihoods of honest and malicious miners successfully generating blocks affect the plausible properties of the blockchain. Interestingly, their evaluation shows that when the probability b that a malicious miner succeeds in PoW is 1/3 or higher, the “chain quality property” becomes low. This has been demonstrated by the existence of selfish mining. This may not seem surprising; however, it is a cornerstone result in the sense that we can prove these properties using mathematical models. Bitcoin’s specifications and sources are publicly available and actually used; however, if analyzed carefully, the subtle issues of communication and data management can be found and exploited in attacks that result in major security holes. We strongly hope that further research on blockchain security continues and that it leads to the design and development of an ideal protocol.

References 1. Eyal, I., and E.G. Sirer. 2014. Majority is not enough: Bitcoin mining is vulnerable. Financial Cryptography and Data Security. 2. Karame, G., E. Androulaki, M. Roeschlin, A. Gervais, and S. Capkun. 2015. Misbehavior in Bitcoin: A Study of Double-spending and Accountability. ACM Transactions on Information and System Security (TISSEC). 3. Gervais, A., H. Ritzdorf, G. Karame, and S. Capkun. 2015. Tampering with the delivery of blocks and transactions in Bitcoin. In ACM Conference on Computer and Communications Security. 4. Garay, J., A. Kiayias, and N Leonardos. 2015. The Bitcoin Backbone protocol: Analysis and applications. EUROCRYPT.

Dr. Kazue Sako is a researcher in designing secure systems using cryptographic protocols that enhance privacy and fairness, e.g. electronic voting protocols, group signature schemes, digital lottery systems and blockchain architecture. She serves as an expert in ISO/IEC JTC1 SC27 (Information Security Techniques.). She chaired several international conferences in cryptography and security including ASIACRYPT, CT-RSA, FC, PKC and ESORICS. She is a member of the Science Council of Japan, and served as the president of Japan Society for Industrial and Applied Mathematics, and a vice-president of Institute of Electronics, Information and Communication Engineers. Ryo Furukawa works for NEC Corporation as a software developer. He researched privacy preserving technology such that privacy policy and anonymization for seven years and researched blockchain architecture for two years. He currently develops middleware for IoT systems and data analytics.

The Biggest Problem of Blockchains: Key Management Masashi Sato

The mechanisms of blockchains are based on cryptographic technologies such as digital signatures and hash functions. Below are some examples in which cryptography and hash functions are applied in typical blockchains: • Digital signature for transactions [Public key cryptography] • Generation of an ID for the addresses and transactions of users and smart contracts [Hash function] • Generation of hash trees for transactions [Hash function] • Generation of hash chains for blocks [Hash function] In addition to the cases above, cryptographic technology is also used to protect the wallet that manages a user’s signature key, and to ensure the anonymity of a transaction or block. Thus, cryptography is an indispensable technology for blockchains. For blockchains to continue as social infrastructures, it is necessary to investigate the secure use of cryptographic technology from a mid-to-long-term perspective. • Points of discussion in cryptographic technology Below are some of the main points of discussion regarding the appropriate use of cryptographic technology: (1) Secure cryptographic algorithms (2) Appropriate key sizes

M. Sato (&) Secom Co., Ltd, Tokyo, Japan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_9

75

76

M. Sato

(3) Secure cryptographic protocols (4) Appropriate implementation of the software or hardware relating to Items (1)–(3) (5) Appropriate key management Items (1)–(3) above are problems in the theory and design of cryptographic technology. Secure evaluations, performed by cryptography specialists from the academic community and standardization groups, are essential. Techniques to break encryptions and forge digital signatures are being continuously studied. It is also possible that the security of modern cryptography, which is based on computational security, will be compromised by higher computational capacities—for example, quantum computers intended for use in the future. For this reason, it is necessary to choose the secure cryptographic algorithm and key size that best suit the conditions of each period. In current cryptographic technologies, the validity periods and migration deadlines are normally defined according to the cryptographic algorithm and key size. In some countries, the government publishes recommended standards for secure cryptographic algorithms and key sizes. Item (4) is a problem of implementation. Even if the most appropriate cryptographic algorithm and protocol are chosen, if the software or hardware implementation is inadequate, these vulnerabilities may result in broken encryptions and forged digital signatures. In fact, there have been several cases of vulnerability in OpenSSL; these have stemmed from a faulty implementation of the transport layer security (TLS) protocol. Furthermore, we should consider the random number generators used for generating keys. If the random numbers are biased, it may allow an attacker to illegally copy the keys of other users, using characteristics of those random numbers. Ensuring that every user and provider evaluates the implementation of cryptographic modules is almost impossible in practice. For this reason, schemes have been proposed in which a third-party organization evaluates the implementation according to the requirements of the security function. One of these is the FIPS 140-2, security criteria for cryptographic modules; the Cryptographic Module Validation Program (CMVP) evaluates a module’s conformity to that criteria. FIPS 140-2 is the security standard for the US Federal Government; however, products that conform to this standard are used internationally in private business scenarios as well. Furthermore, the Common Criteria (ISO/IEC 15408) evaluates security products that include encryption modules; this is used as an international authentication standard. Item (5) is a problem of key operation. Even when the software or hardware being used is equipped with an appropriate cryptographic algorithm, if a user or administrator handles the key carelessly, any security is lost. As an analogy, imagine building a house with sturdy doors and walls and an impeccable security system, but hiding the key to the precious house in a publicly accessible mailbox outside. Perpetrating a successful attack that exploits the vulnerabilities of a cryptographic algorithm or implementation is a costly operation, because it requires technical expertise as well as advanced technical knowledge and techniques in

The Biggest Problem of Blockchains: Key Management

77

environment construction. Occasionally, an attacker can mount a successful attack more easily by stealing the key from a user or administrator whose management practices are careless. When using cryptographic technology, users tend to focus only on the safety of the cryptographic technology itself; however, in actual operations, key management is a very important issue. So far, we have discussed the use of general cryptography. However, based on what has been covered so far, let us consider the issues regarding blockchains. This chapter focuses on the problem of key management; then, in chap. “The Cryptographic Technology of Bitcoin will Eventually be Broken”, the migration of cryptographic technology is discussed. • What is the meaning of digital signatures in blockchains? To elucidate the importance of digital signature management in blockchains, let us examine “the meaning of digital signatures.” Digital signature technologies have long been used in Public Key Infrastructure (PKI) and numerous other applications. The digital signatures used in both PKI and blockchains employ the same public key cryptographic technology. The basic mechanism of digital signatures is as follows: the data signer (sender) generates signature data using a signature key and sends them to the verifier (receiver). By verifying the signer’s signature data with the verification key, the verifier can confirm that the signature data were generated by the signer. In short, only the person who has the signature key can create signature data. But what is the role of the digital signature in blockchains? In the case of Bitcoin and Ethereum (the best-known blockchains to date), a digital signature confirms whether a transaction should be executed when it is received; that is, it is predominantly used as a means of authenticating and authorizing a transaction. By verifying the digital signature assigned to a transaction, the creator of that transaction can confirm that he/she has a signature key corresponding to its address and that the content of the transaction has not been modified in the transmission process. This function is identical to that of the general digital signatures mentioned earlier. In addition, blockchains are also used to confirm whether the address corresponding to the signature key used in the digital signature satisfies the conditions required to transfer the crypto-assets, which are described in the transaction. For example, in the case of unspent transaction output (UTXO)-type transactions (such as Bitcoin’s), the blockchain confirms whether the destination address of the previous transaction corresponds to the signature key, and whether the balance of crypto-assets at that address is sufficient. In the case of the major public blockchains, because they lack a method of proving the identity of the signature key owner, identification is based on the address associated with the signature key instead of the owner’s ID. As described above, the transfer of crypto-assets is controlled by the signature key, and the identification is made using the address associated with the signature key on the

78

M. Sato

blockchain network; therefore, the signature key used in public blockchain digital signatures is tied to the asset rather than the person. Hence, the signature keys in blockchains are the very assets that need to be protected. Meanwhile, the digital signature of PKI is used to authenticate a user, server, and device; alongside this, a non-repudiation method is used to express consent to a contract or written responsibility to a document regarding possible disputes in the future. Based on the laws concerning electronic signatures, the digital signatures of PKI are the main non-repudiation method used in Japan and the EU. A common feature of PKI digital signatures is the association of a signature key to subjects such as people, organizations, servers, and devices. Entities known as “certification authorities (CAs)” certify these associations. Therefore, in most public blockchains, the signature key has a strong relationship to the crypto-assets; however, for PKI digital signatures, the signature key is tied to entities such as people, organizations, servers, and devices. On the other hand, some private blockchains—such as Hyperledger Fabric—also adopt a CA for PKI in private blockchains. In such cases, the conventional concepts of PKI key management may apply. Another difference between the digital signatures of public blockchains and PKI is the revocation of the signature key. In PKI digital signatures, the association between the entities (people, organization, servers, and devices) and the signature key can be disabled through a revocation process. For example, in the event of a leak of the signature key file or a theft of the smart card on which the signature key is stored, the signer can prevent unauthorized use of the key by revoking it; to do so, the owner of the signature key (or administrator, in the case of a server or device) must submit a revocation request to the CA. The CA adds the information of the public key certificate associated with that signature key to a revocation list and publishes it. The verifier of the digital signature accesses the revocation list and—if a signature key registered in the list is used—rejects the digital signature as invalid. Because that signature key cannot be used after the revocation, a new one must be created. Then, to re-associate that entity with its new signature key, the CA reissues the public key certificate. This revocation function is possible because the CA manages the association between the signature key and its owner. Meanwhile, public blockchains such as Bitcoin lack a function to revoke signature keys. In part, this is because the architecture of public blockchains excludes third-party organizations; as such, there is no entity to accept a revocation request. A revocation process must somehow verify whether the person submitting the revocation request actually owns that signature key. In a system where users are free to autonomously generate and claim ownership of signature keys, it is difficult for other participants to judge whether a revocation request is legitimate. In the end, only the person who generated the signature key will know. Anyone who obtains the signature key can execute a transaction; furthermore, once the transaction is executed, it is almost impossible to overturn. If a signature key is leaked, there is no way to prevent its unauthorized use. Measures such as multi-signature have been devised to mitigate this risk. Other possible measures include blockchain mechanisms that limit the number of times a signature key can

The Biggest Problem of Blockchains: Key Management

79

be used or that specify a period during which it is available. Although such measures can mitigate the risks of the unauthorized use of signature keys, they cannot be considered a replacement for a revocation process, which immediately suspends use. Thus, how should a user proceed if they suspect their signature key has been leaked? Before the signature key has been used illegally, the user must transfer their crypto-assets to a new signature key. The absence of signature key revocations and expiration dates also affect the life cycle of signature key management. Even if the owner of a signature key believes that they have consciously suspended the use of that key, the address associated with the signature key remains valid on the blockchain network, which makes it impossible for the system to prevent other users from transferring crypto-assets to that address. Furthermore, to prevent any crypto-assets transferred to that address from being lost, the signature key must continue to be managed. The user must also be careful not to lose the signature key. The meanings and characteristics of blockchain digital signatures were summarized above, by comparing them with PKI digital signatures. PKI has been widely discussed over the years, and this can provide information on key management assuming PKI. However, while knowledge of key management in PKI is helpful, it is important to remember that keys in blockchains have different characteristics. When the signature key of a blockchain is directly tied to crypto-assets, a leaked key has a huge impact, and dealing with it afterward is a challenging task. For this reason, signature key management requires an extra level of caution.

1

Forms of Blockchain Key Management

This section summarizes the forms of signature key management in blockchains. The management forms can be classified as follows: (1) Management by software (2) Management by a device (3) Services such as online wallets and crypto-asset exchanges A further case is a paper wallet, which prints the keys on paper. (1) Examples of management by software include wallet software that is included in the reference implementation provided by the blockchain platform development community, as well as third-party wallet software. In either case, the signature key is typically protected via encryption and is stored as a file in a storage medium (e.g., a hard disk drive or solid-state drive). To use the signature key, the user inputs a password, which decrypts the encrypted signature key and makes the digital signature operational. This action is sometimes called activation. The next form is (2): management by a device. A signature key is necessary to generate and send transactions (such as for transferring crypto-assets); however, it is not used to receive transactions. Thus, the signature keys can be stored in a

80

M. Sato

dedicated device and used (from that device) only when a transaction is to be generated, thereby protecting the signature keys from threats (such as malware). Dedicated devices—such as Leger Nano, Trezor, and KeepKey, which manage the signature keys of crypto-assets such as Bitcoin and Ethereum—have been developed in recent years. These dedicated devices are called hardware wallets. Management form (3) entrusts the management of signature keys to a third-party service. The user entrusts their signature key to an online wallet service, connecting to the online wallet when the key is needed to generate a transaction. To ensure that no user uses the signature key of other users, a user authentication method restricts access to the online wallet. Using an online wallet frees the user from managing the signature key management themselves; however, they must handle the ID password and other authenticating credentials with great care. While crypto-asset exchanges predominantly deal with buying and selling between crypto-assets and legal currencies, if the user opens an account in a crypto-asset exchange, they can order the transfer of crypto-assets through that account.

2

Issues Regarding Key Management by Software

Each blockchain platform adopts a different cryptographic algorithm and hash algorithm; thus, in some cases, the algorithm is not supported by the hardware wallet, and the user must opt for management by software. However, management by software requires special care regarding the theft of the signature key and unauthorized access. For example, using a malware infection, the signature key in the terminal memory may be read during the transaction’s execution, the transaction may be altered just before the signature key is used, or the file containing the signature key may be stolen. Even with a password-encrypted signature key, if the attacker obtains that file, they can attempt to crack the password using a brute dictionary attack or obtain the password with a malware technique known as keylogging. Furthermore, a backdoor pre-installed in a wallet software will result in a critical threat. In particular, secondary software distributions from a suspicious source require extra caution. Ideally, the software implementation should be evaluated, but this is not an easy option. Requesting certification from third-party evaluators— such as CMVP or common criteria—is not realistic for all software. In open-source software, the codes should ideally be reviewed by a large group of people; however, with so many platforms, this too is limited.

The Biggest Problem of Blockchains: Key Management

3

81

Issues Regarding Key Management by a Device

Using a mechanism that generates a signature key in a secure area of a hardware wallet and prevents it leaving this area during its life cycle, it is possible to construct a solid measure against key leakage. In fact, some new products manage the keys using tamper-resistant devices and secure elements. In addition, the question of whether the design and implementation are adequate occurs in hardware wallets as well. Hardware security module (HSM) devices have long been used to manage the keys of PKI. An HSM stores the key in secure hardware and prevents it from leaving. It also features a robust security mechanism with hardware access controls and physical damage and attack resistance. Many HSMs are certified by the common criteria or CMVP (FIPS 140-2). If used adequately, HSMs can manage keys very securely; thus, they have been used by the CAs of PKI as well as other services that require strict key management. Some new HSMs support the signature keys of crypto-assets such as Bitcoin, allowing service providers—such as crypto-asset exchanges—to adopt HSMs. However, by default, HSMs are unsuitable to manage the keys of blockchains that use cryptographic algorithms not supported by HSMs. Some HSMs can execute extended codes with newly implemented cryptographic algorithms in a secure area; however, newly added codes have this certification revoked. In other words, if a module contains newly implemented algorithms, unless it receives a third-party evaluation and obtains a new certification, it is not considered certified. Certain hardware wallets designed for general users are certified by third-party organizations. Having a good understanding of the subjects and levels of certification involved in each certification criteria helps users to better understand the device. The certification subjects include semiconductor modules—such as the processors or chips that process encryption or security functions—and devices and appliances embedded with these modules and firmware. The subjects can also be software, not simply hardware. In some products, only a few modules—rather than the entire product—are certified. It is essential to know which modules from that product have been certified. For example, even if the key is properly managed with a certified security chip, if the transaction data can be altered in the I/O process before being input to that security chip, the entire device may become vulnerable. Knowing which modules have been evaluated informs the user of the extent to which the security of the device has been ensured, enabling them to devise the most appropriate handling method. Furthermore, each certification criterion has its own range of certification levels. For example, FIPS 140-2 features four certification levels with different security requirements. The higher the security level, the stricter the device’s access control requirements and the higher its resistance to physical damage or falsification. Meanwhile, common criteria feature seven evaluation assurance levels (EALs) for implementation, instead of security levels. Depending on the levels, the details of the evaluations may differ; for instance, they may require functional specification

82

M. Sato

tests, source code reviews, or formal verification. The documents required in the evaluation all vary according to the EAL. Therefore, knowing the levels of which system the evaluation subject is certified with is crucial to handling the device appropriately. The distribution processes of the devices also require attention. What happens when an attack is hiding in the distribution channel of a device? This possibility increases if the consumer acquires the device not from a regular distribution route but as a resell from irregular stores or individuals. For example, if the attacker sets their own master seed or key into a device and then resells it with a fabricated manual, they may force the unsuspecting consumer to use that key. Thus, it is essential to consider the risks involved in acquiring a device from an irregular distribution route.

4

Issues Concerning Online Wallets, Crypto-Asset Exchanges, and Other Services

When a user uses services such as online wallets or crypto-asset exchanges, the security and trust of those services are crucial. A failure in the service provider’s system may allow an attacker to leak or unlawfully use signature keys. Furthermore, the service operator might make a mistake; or, in the worst-case scenario, there could be an internal illegal action. The service provider must implement security measures across the entire system; this includes—for example—user authentication methods, service interfaces (such as protocols for sending and receiving requests), backend systems, and key management. As with existing information and financial systems, security management is vital. This includes the system’s design and implementation, its operation, threat-analysis, and risk-evaluation mechanisms as well as its monitoring functions and attack and vulnerability response mechanisms. In addition, service providers that manage keys must construct a method or system of signature key management that considers the aforementioned characteristics of the blockchain and wallet signature keys.

5

Main Methods of Key Management in Blockchains

This section introduces two popular techniques for managing blockchain keys: multi-signature and cold wallet. Multi-signature is a technique that configures multiple signature keys to execute a single transaction; it thereby mitigates the risk involved when a signature key is leaked. Its functionality is supported by the blockchain platforms and smart contracts used in Bitcoin and Ethereum. Multiple signature keys that can sign a certain transaction are prepared in advance; if the correct number of digital signatures is assigned to the transaction, it is executed as a valid transaction. Because multiple

The Biggest Problem of Blockchains: Key Management

83

signature keys are needed to execute the transaction, the leakage of a single signature key does not directly result in the theft of crypto-assets. Multi-signature can be applied both when a key is managed by the user themselves or by a service provider. Users can store multiple signature keys in separate media and multiple places under their control; otherwise, they can entrust a portion of the signature keys to a service provider that manages keys. Users of online wallets can take a defensive measure by using multi-signature to manage some of the keys, thereby preventing the wallet provider from executing a transaction without his/her consent. Multi-signature is an effective counter-measure to the unauthorized use of a leaked signature key; however, it does not protect the key itself. Even when multi-signature is used, it is important to manage each signature key securely. This requires careful consideration of each signature key’s storage location and respective access controls (as well as an appropriate division of authority, if managed by an organization). Furthermore, the higher the number of signature keys, the higher the number of items requiring management. Some blockchain platforms do not support multi-signatures. Other related methods of divided management include the use of technologies such as secret sharing and threshold signature. Because these also involve concealing multiple pieces of information, it is necessary to examine issues similar to those faced by multi-signature operations. Cold wallets make signature keys accessible only when necessary; its antithesis is a hot wallet. The definitions and requirements of cold and hot wallets vary from person to person. A hot wallet is generally considered as an operation that allows a PC (or another device) connecting to a node of the blockchain network to access the signature key. With a hot wallet, one can order the creation of a transaction, immediately assign a digital signature to it, and send it directly to the blockchain network as a valid transaction. When a user installs a general software wallet to a PC and uses the signature key created there, it is considered a hot wallet operation. One problem of hot wallet operations is the constant exposure to threats such as malware infections and unauthorized access, which present a high chance of leakage or illegal use of the signature key. In contrast, a cold wallet operation cuts off unnecessary access to the signature key, permitting it only when the user needs to create a digital signature. There are several ways to operate a cold wallet. One of them is to use an offline, digital-signature-dedicated terminal that can access the signature keys. The user can install a software wallet to a digital-signature-dedicated terminal and manage the signature keys, or they can allow the hardware wallet to connect to that terminal. The transaction data can be transferred between the online terminal connected to the nodes of the blockchain network and the digital-signature-dedicated terminal via a portable medium. When a cold wallet is used in an organization, an authorized administrator often needs to approve the assignment of a digital signature. While portable media are subject to malware infections, their risk of attacks is believed to be lower than with hot wallets. However, cold wallets are not appropriate to creating large numbers of transactions, because they are impractical in managing all signature keys. Therefore, it is necessary to distribute the allocation of crypto-assets between cold and hot

84

M. Sato

wallets according to the risks. Furthermore, a cold wallet can be combined with multi-signature and secret sharing. The multi-signature and cold wallet control measures were created based on the characteristics and functional limitations of the blockchain platforms. Key management for different blockchain platforms requires specific knowledge and technology. At present, individual users and businesses are struggling with their own methods. However, we hope that in the future, a wide set of findings and technologies will be shared and standardized in the industry, and this will stimulate further debates toward securer key management practices. To this end, the blockchain platform development community also needs to be involved in the discussion, to improve security through the design of appropriate platform architectures.

Masashi Sato works for a security company, SECOM CO., LTD as a research engineer. He researched secure systems using electronic authentication and electronic signature. He served on standardization activities in the electronic signature field. He contributed to the drafting of JIS (Japanese Industrial Standards) and ISO standards related to electronic signatures, e.g. series of ISO 14533 and ISO 17090-4. He serves as a subleader of the electronic signature working group of JNSA (Japan Network Secruity Association), and an editor of the secruity working group of CGTF (Cryptoassets Governance Task Force).

The Cryptographic Technology of Bitcoin Will Eventually Be Broken Masashi Sato

Blockchains depend on technologies from the field of cryptography, such as digital signatures, encryption, and hash functions; thus, using secure cryptographic technology is essential. However, hackers are continuously developing new ways to attack the cryptographic algorithms and protocols used by blockchain systems, which slowly diminishes the security of cryptographic technologies. Even if a blockchain’s cryptographic technology is secure at the early stages of operation, it becomes vulnerable after an extended period, creating opportunities for attackers to exploit that vulnerability. This chapter deviates from the book’s main subject; however, the subject of breaking cryptographic technologies has sometimes surfaced in recent discussions on quantum computers. Care must be taken with the various information sources, which can become conflated in such discussions about quantum computers and breaking cryptography. A Canadian company called D-Wave Systems attracted public attention when it announced the world’s first commercial quantum computer in 2010, followed by the D-Wave 2000Q (with 2048 qubits) in January 2017. Such announcements have sparked debate regarding the various ways in which quantum computers might impact society. Some of these discussions have suggested that the arrival of quantum computers such as D-Wave’s will result in the main cryptographic technologies of today becoming crackable; this topic must be calmly analyzed. Using a method called quantum annealing, D-Wave’s quantum computer can be effectively used in certain optimization problems and in machine learning; however, it is difficult to relate this to the threat of breakable cryptographic technologies. In contrast, cryptanalysis-related matters—such as prime factorization and discrete logarithm problems—belong to the field of the so-called “general-purpose quantum computers,” which use quantum gates. With a general-purpose quantum computer, M. Sato (&) Secom Co., Ltd, Tokyo, Japan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_10

85

86

M. Sato

Shor’s algorithm was discovered. This algorithm can efficiently solve small-scale prime-factorization problems. Various companies and research institutions have begun to study general-purpose quantum computers; in July 2017, IBM unveiled its 16-qubit quantum computer (IBM Q) to the public. Here, we consider not just research into general-purpose quantum computers but also research of cryptanalysis techniques using quantum computers, incorporating these trends into the discussion. Returning to the subject of blockchains, projections indicate that the signature algorithm ECDSA (key length: 256 bits) and the hash function SHA-256, which are both used in Bitcoin, will still be usable after 2030 [1]. Meanwhile, new issuance of Bitcoin is set to continue until around 2140, and it will remain possible to use them after that. If the use of Bitcoin continues, it will eventually approach the life span of our current cryptographic technologies. If blockchains—not just Bitcoin—continue to be used in the long term, attacks against digital signatures and hash functions may one day become a reality; hence, before the threat materializes, it will be necessary to switch to new signature algorithms and hash functions. Digital signatures and hash functions are used in various blockchains functions, in which each needs to be examined individually. Using the model of Bitcoin as an example, this chapter will focus on the digital signatures of transactions and the hash functions used in the hash chains of blocks and it presents the concepts relating to their methods of transition. The functions of digital signatures and hash chains differ, as shown in Table 1. Consequently, the security risks and the actions and tasks required for their migration processes also differ. Migrating the digital signature safeguards future transactions, whereas migrating the hash chain can be considered to protect the records created of the transactions Table 1 The roles of digital signatures and hash chains, as well as their threats Main roles

Examples of threats if security fails

Necessary tasks in migration

Digital signature for transactions

Confirmation of the transaction’s creator and detection of transaction tampering

Unauthorized copy of the signature key of digital signature (Illegal acquisition of crypto-assets)

Hash chain of the block

Proof-of-existence of the transaction stored in the block

Fabrication of past transactions and blocks (e.g., seizure of unused assets from the past)

• Platform software update • Migration to a new signature key by the user • Platform software update • Update to a new hash chain by the block creator

The Cryptographic Technology of Bitcoin Will Eventually Be Broken

87

and blocks. The following sections examine issues concerning the migration of the digital signature and hash chain, respectively. If the system of digital transaction signatures becomes vulnerable, the worst-case scenario is that an attacker may be able to guess the signature key without stealing its data. If this kind of attack becomes possible, it will be meaningless for the user to securely manage their signature key. Before this occurs, it will be necessary to switch to new signature algorithms or longer signature keys. The blockchain platform software, the software and hardware wallets—which manage the signature keys—of the new signature algorithms, and the signature key sizes will all need adapting. When migrating to a new signature algorithm or signature key size, it is important to consider the effects upon the scalability problems of blockchains; for example, the higher computational loads required to process signature operations and the larger transactions and blocks. Furthermore, the user must generate a new signature key using that software and transfer their crypto-assets to that new signature key. If a user forgets to migrate the signature key, or is unaware of the need to do so, their signature key is left behind and remains vulnerable. Ideally, the community that provides the blockchain and wallet’s software/hardware, or the volunteer community in the industry, should provide users with appropriate guidance to support them during the transfer. The next section focuses migrating the hash functions of hash chains. Because the migration of hash chains affects the entire network, it must be analyzed more carefully. Hash functions are used in various blockchain processes; for instance, to generate digital signatures (as well as to pre-process them), and to generate transaction IDs and addresses; they are also used in the hash trees of transactions and the hash chain of blocks. This cryptographic hash function generates a hash value of a certain length from the original data. The main characteristics of the hash function are as follows: • The same hash value is output for the same input; however, if the input data differ, a completely different hash value is output. • It is difficult to estimate the original data using a hash value. For example, the hash function SHA-256 is used in Bitcoin to generate a fixed-length hash value of 256 bits (32 bytes) from the input data. Because of these characteristics, hash values are used in various situations besides blockchains, to realize security functions and efficient processing. For example, they are used to confirm the identity of two large sets of data without comparing them, as well as the keys in data searches. Even PoW, which is used in Bitcoin, requires a large number of hash value calculations; it determines a hash value below a certain threshold value using the characteristics of hash functions.

88

M. Sato

When a hash function is used to detect data tampering, the security of the function is important. The properties of a secure hash function include preimage resistance, second preimage resistance, and collision resistance. If the safety is compromised in any one of these areas, a different threat is encountered; however, together they can facilitate estimation of the original data from the hash value, or produce the same hash value from various data. If the different data that produce the same hash value are found, the hash value’s utility in verifying data consistency becomes doubtful. If a hash function becomes vulnerable, so does the blockchain. This may allow an attacker to replace a transaction or a block without having to correct its hash tree or hash chain, respectively. Thus, the blockchain, which should maintain the immutability of the record, becomes obsolete. Therefore, it will eventually become necessary to migrate to a new hash function before the existing hash function becomes vulnerable and the threats materialize.

1

Long-Term Signature Technology, a Possible Cure for Hash Function Migration

Using the example of Bitcoin, Fig. 1 briefly illustrates the process of generating a hash chain from a transaction. It is a multistage hash function processing, and the threat of hash value collisions must be considered in each process, though this is fairly complex. To summarize the main points of the discussion, this section introduces the well-established concept of long-term signature technology and indicates the issues concerning blockchains.

Fig. 1 Generation of hash chains and hash trees in blockchain

The Cryptographic Technology of Bitcoin Will Eventually Be Broken

89

Long-term signature is a technology from the public key infrastructure (PKI) digital signature field, which is mainly standardized by the ETSI (European Telecommunications Standards Institute) [2–4]. This technology makes it possible to keep digitally signed electronic data (electronic signature format) valid for a long period of time. For example, when tax-related documents, contracts, and medical information are handled as digitally signed data, they require long-term storage, depending on the document type. While some documents’ storage periods are determined by law; others are stored in case of future disputes. There are two important aspects concerning this use of digital signatures; that is, they make it possible to prove who created the digital signature and when, and they keep that proof valid even after the digital signature has been stored for an extended period. Regarding the “who” in the first item; in PKIs, this is proved by a public key certificate issued by a certification authority; however, such proof does not exist in public blockchains. Moreover, as indicated in Fig. 2, the information concerning when a digital signature was created is provided by a third-party timestamping authority, which issues a timestamp token (this links the digital signature to time information) [5]. Therefore, despite differing from blockchains in terms of the actual processing and model, when this concept is compared to the blockchain model of Fig. 1, the “electronic data” of Fig. 2 can be regarded as corresponding to “transaction,” the “proof-of-existence (timestamp token)” to the “hash value of the correlated block,” and “time information” (the input for generating proof-of-existence) to the “hash value of the preceding block.” However, the two mechanisms differ in that—unlike the strict proof of time provided by the timestamping service—the proof-of-existence provided by the hash chain of blockchains ensures the order of the data. The other aspect of long-term signatures, “long-term validity,” is also an important factor in the hash migration of blockchains. After an extended period, if the security of the hash function used to generate a digital signature or proof-of-existence is compromised, attackers may be able to fabricate that digital signature or proof-of-existence. Therefore, if a signature algorithm or hash function used in the past becomes vulnerable at a certain point in the future, the validity

Fig. 2 Proof-of-existence in a long-term signature

90

M. Sato

extension of long-term signatures proves that the original digital signature or hash value is unaltered. In the case of PKIs, countermeasures must be established not only against the security compromise of the hash function and signature algorithm but also against the expiration of the public key certificate, which is included in long-term signatures. The basic idea of long-term signatures is to newly link the digital signature, its proof-of-existence, and the digitally signed electronic data with a more secure hash function before a threat emerges (Fig. 3). With this, provided no threat is made against the hash function used in the new link, the validity of the digital signature created in the past, as well as its proof-of-existence, is maintained. This is because if the original electronic data, the digital signature, or the timestamp token previously issued is altered, it can be detected using the hash value obtained from the new hash function. As previously mentioned, in the case of long-term signatures, the proof-of-existence for signature data is a timestamp token issued by a third-party organization; this is a single piece of data that is relatively easy to manage. However, blockchains differ in that the data that must be maintained to guarantee proof-of-existence is a sequence of hash values generated and held in each node. This difference must be considered in the analysis of a migration method. The next section describes key topics in blockchains. • Main topics concerning hash function migration in blockchains As summarized in the concept of long-term signatures, the primary objective of migration is to strengthen the correlation between a transaction, a digital signature, and a proof-of-existence—in other words, a hash chain—created in the past with a new hash function. The simplest case is where the electronic data and signature data

Fig. 3 Concept of extending the validity of signature and time-stamping in the long-term signature

The Cryptographic Technology of Bitcoin Will Eventually Be Broken

91

are entered to the more secure hash function in Fig. 3 as all transactions in the blockchain (including digital signatures) and time information as a large group. In other words, this method assigns the entire past blockchain to a new hash function and sets it as the new starting point of hash chains. However, this method is unrealistic. For example, in the case of Bitcoin, the size of the blockchain is approximately 230 gigabytes as of July 2019. Suppose that the migration of hash functions is considered for the year 2030, when the size of the blockchain is projected to reach as much as 800 gigabytes (assuming a monotonic increase). Directing each node to process hash calculations from 800 gigabytes of data so as to verify past blocks is inefficient. If the process is entrusted to a specific node or organization, the relevance of the distributed model may be questioned. Furthermore, if a temporary fork of the hash chain is considered, the chain to be used as the input must be chosen. This requires a decision-making process from the community, which goes beyond the software domain. Therefore, although the idea of transferring an entire hash chain at once is simple, it involves a wide range of problems. Instead of one-batch hash migration, a sequential migration might be considered, in which individual transactions and specific blocks are used as inputs for new hash functions. In this case, the hash values generated from past transactions and blocks are incorporated into the process of generating new blocks (by new hash functions) (Fig. 4). As new blocks are generated, past transactions and blocks are sequentially reinforced by new hash functions, until the migration of all transactions and blocks is completed. With this kind of method, the decision-making problems involved in one-batch migration might be avoided. In sequential transitions, it is necessary to estimate at what point in time the past transactions and blocks should complete the hash migration, as well as to define the units of past transactions or blocks to be included in the new block. The migration must also leave the order of past transactions and blocks unchanged. This is because, during the process of retrieving information about past blocks from the old hash chain and entering it into the new hash function, if a specific transaction or block is intentionally or accidentally removed or has its order changed, the hash chain may no longer function as a proof-of-existence of the past. For more details about this technique of sequentially migrating blocks, please refer to “Long-Term Public Blockchain: Resilience Against Compromise of Underlying Cryptography” [6]. By itself, a technical mechanism for the migration of hash functions and digital signatures does not guarantee that the actual migration will succeed. The field of PKI is already facing the problem of migration ahead of blockchains. Following the security decrease of the hash function (SHA-1) and signature key (RSA 1024 bits), which had long been used in the public key certificates, digital signatures, and timestamping services of PKIs, a migration was undertaken to the safer SHA-2 family of hash functions and to key sizes exceeding RSA 2048 bits. However, the problems of migration still continue in some areas. Below is a brief explanation concerning which parts of PKIs may face problems.

92

M. Sato

Fig. 4 Concept of sequential transition of hash functions in blockchain

First, it is necessary to define the migration approach to be taken. This can include determining which applications the previous method will remain available for and for how long it does so, considering the evaluation of government-recommended criteria. To perform migration according to the established policy, a certification authority must reissue a certificate (the key); it is also necessary to update the systems, applications, and middleware that use that certificate, as well as to change their settings. For the long-term signatures mentioned earlier, the signature data created in the past also must be updated using the long-term signature algorithm. A migration process affects many related systems and providers. In the past, stakeholders—such as government agencies, auditors, accreditation bodies, CAs, service providers, system integrators, and application venders—cooperated with each other to raise issues and discuss solutions, to ensure a smooth migration while maintaining the system. Therefore, PKI migration projects involve communication between many stakeholders. Similar problems are anticipated for blockchains. It will likely be necessary to adjust guidelines and actual tasks relating to the migration in accordance with various stakeholders; this includes the development community of blockchain

The Cryptographic Technology of Bitcoin Will Eventually Be Broken

93

platforms, the general users of blockchains, the service providers (such as crypto-asset exchanges), and service providers operating the smart contracts of blockchains. Particularly for public blockchains like Bitcoin, the difficulty of making the adjustments required to conduct a migration may represent a barrier, as indicated by the discussions on the scalability problem in Chap. 7. What kind of procedure is required to replace a signature algorithm or hash function in public blockchains such as Bitcoin? Below we consider the procedure for two major cases. The first case occurs when an attack method is revealed and an immediate transition is required. To adapt to a new signature algorithm or hash function, a so-called “hard fork”—that is, a batch update of all nodes to a new software incompatible with the old one—is unavoidable. Any nodes that are incompatible with this software update can no longer receive transactions or blocks created with the new method. However, hard fork migrations often involve risks of coexistence between new and old blockchains, as seen in actual cases involving Bitcoin and Ethereum. Some users may also fail to update the software and lose crypto-assets. The second case is the stage in which an effective attack method has not yet been found, and there is still leeway before the migration. In this case, the best option is a “soft fork” migration, which interoperates with the previous software while maintaining compatibility. Instead of discarding digital signatures and hash chains created in the past and starting over, a soft fork migration will require safeguarding the validity of past digital signatures and hash chains with a new hash function, as seen with long-term signatures. In the case of long-term signatures, because a third-party organization ensures their validity, they can be migrated relatively easily. However, in the case of blockchains, it is still not clear who bears responsibility for this assurance. This subject must be further studied, to facilitate the migration of cryptographic technologies via soft forks rather than hard forks. The migration of digital signatures and hash functions is a necessary measure designed to protect the users’ assets, and such migrations heavily impact the entire system surrounding the blockchains. Therefore, it is necessary to discuss the migration guidelines and measures very carefully, to ensure a smooth operation.

References 1. National Institute of Standards and Technology. 2016. SP 800-57 Part 1 Rev. 4, Recommendation for Key Management, Part 1: General., NIST. 2. International Organization for Standardization (ISO). 2014. ISO 14533-1:2014, Processes, data elements and documents in commerce, industry and administration—Long term signature profiles – Part 1: Long term signature profiles for CMS Advanced Electronic Signatures (CAdES), ISO.

94

M. Sato

3. European Telecommunications Standards Institute. 2016. EN 319 122-1 V1.1.1, Electronic Signatures and Infrastructures (ESI);CAdES digital signatures; Part 1: Building blocks and CAdES baseline signatures, ETSI. 4. Internet Engineering Task Force. 2007. RFC 4998 Evidence Record Syntax (ERS), IETF. 5. Internet Engineering Task Force. 2001. RFC 3161 Internet X.509 Public Key Infrastructure Time-Stamp Protocol (TSP), IETF. 6. Sato, Masashi, and Shin’ichiro Matsuo. 2017. Long-term public blockchain: Resilience against Compromise of Underlying Cryptography. In ICCCN 2017 Workshop on Privacy, Security and Trust in Blockchain Technologies.

Masashi Sato works for a security company, SECOM CO., LTD as a research engineer. He researched secure systems using electronic authentication and electronic signature. He served on standardization activities in the electronic signature field. He contributed to the drafting of JIS (Japanese Industrial Standards) and ISO standards related to electronic signatures, e.g. series of ISO 14533 and ISO 17090-4. He serves as a subleader of the electronic signature working group of JNSA (Japan Network Secruity Association), and an editor of the secruity working group of CGTF (Cryptoassets Governance Task Force).

How We Can Secure Blockchain-Based Systems Shin’ichiro Matsuo

So far, we have considered the basic components of, and some significant unsolved problems in, specific technical aspects of blockchain technology, including trust, consensus, scalability, and key management. These issues are not easily resolvable; thus, more fundamental research is required to find solutions. To make blockchain the basis of a world-changing technology, “blockchain systems,” constructed from the above elements, must be guaranteed as secure and trustworthy over the long term (and ultimately forever). Otherwise, blockchain will be unable to form a part of society’s infrastructure. This chapter will highlight issues to be considered in making the overall blockchain-based system more secure.

1

Lessons Learned from the DAO Attack

In June 2016, the Distributed Autonomous Organization—a project built on the Ethereum blockchain platform and also referred to as The DAO—experienced a severe security incident: an attacker exploited a vulnerability in the design of The DAO’s program and—for the first time—almost succeeded in stealing a large amount of Ethereum coins, in violation of the system’s intended design. This incident has highlighted the mistake of naively assuming that encryption technologies guarantee the security of blockchain. Securely constructing the basic functions—that is, those preserving the integrity of data stored in blockchain—is more difficult than blockchain believers expect. Blockchain technology is categorized under “cryptographic protocols;” that is, a combination of cryptographic operations and communications that realizes more advanced security features than simple cryptographic primitives. S. Matsuo (&) Georgetown University, Washington, DC, USA e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_11

95

96

S. Matsuo

An example of the struggle to ensure security with cryptographic technology is the Secure Socket Layer/Transport Layer Security (SSL/TLS) protocol. We use this protocol daily, for online shopping and banking. These protocols establish a secure channel by having two communicating parties authenticate each other and agree to an ephemeral encryption key made specifically for that communication session. However, even if the cryptographic algorithms used in the SSL/TLS protocols are secure, numerous reports over the past few years have indicated that attackers can decrypt the confidential communication, by exploiting mistakes in the protocol’s design and implementation [1]. These existing problems have already been fixed in the current implementation. The lessons learnt from these problems have been applied to the design of the next TLS 1.3 version, recently standardized by the Internet Engineering Task Force (IETF). These efforts to improve the SSL/TLS protocol are a good example of how we cannot guarantee the security of systems through the security of the underlying cryptographic algorithms alone [2].

2

Another Aspect: Availability

Another security aspect we must consider is availability. Along with confidentiality and integrity, availability is one of the three major elements of information security. A major debate that took place during the DAO incident concerned how to handle the lost coins. Eventually, the Ethereum community decided to initiate a hard fork. However, it took a long time to reach this conclusion, during which time Ethereum essentially underwent a system shutdown. In other words, availability was lost. Availability can be lost as a result of various other attacks. If an underlying cryptographic algorithm is compromised, all the full nodes must migrate to a new cryptographic algorithm. Executing such migrations take considerable time. Adversaries are incentivized to suspend blockchain-based systems for a while, even if they fail to steal coins. When blockchain becomes a part of our social infrastructure, an attack that causes a system shutdown could have huge destructive impacts on social activities. If we look beyond the incentives for blockchain in terms of coins and asset value, and instead regard it as a world-changing social infrastructure, we can understand that an attack would also damage trust in the system’s ability to continue operating. Therefore, we must also maintain its availability. From the perspective of system security, it is premature to assume that the present blockchain technology is unstoppable. To develop blockchain into an element of world-changing social infrastructure, we must continuously consider matters from this security perspective.

How We Can Secure Blockchain-Based Systems

3

97

Five Layers Ensuring a Secure System Based on Cryptographic Technology

When evaluating the security of a blockchain-based system, it is important to consider the blockchain’s technology layers, as well as its security layers. This framework is not limited to blockchain but applies to information system security in general. Here, we consider five security-related layers for blockchain-based systems, as illustrated in Fig. 1. (1) The Cryptography Layer This layer ensures the technical qualifications of the underlying cryptographic algorithms. For example, Bitcoin applies SHA-2 and RIPEMD-160 hash functions and an ECDSA digital signature algorithm. A hash function compresses data strings of an arbitrary length into a certain fixed length. Furthermore, “cryptographic hash functions” must fulfill three requirements. It must feature one-wayness; that is, a pre-image resistance that prevents the recovery of the original data from the hash value. It must have second pre-image resistance; this prevents other data from being found with the same hash value as the original data. Last, the function must possess collision resistance; this makes it impossible to identify a pair, because any two strings of original data that feature the same hash value can be combined.

Fig. 1 Five layers to consider in assuring system-wide security

98

S. Matsuo

In 2016, a team—including Arvind Naranayan, assistant professor at Princeton University—claimed that blockchain needs a “hiding” (pre-image confidentiality) and “puzzle-friendly” (soundness as a PoW or other cryptographic puzzle) hash function. Subsequent academic research on the properties needed by cryptographic algorithms is currently in progress [3]. Digital signature schemes guarantee two things: that the data were indeed produced by somebody possessing the secret key, and that the original message and authentication data (i.e., the digital signature data) have not been modified since they were created. The digital signature scheme requires a security property referred to as an “existential unforgeability under a chosen message attack (EU-CMA);” this means that any attempts at forging a digital signature—from a combination of pre-existing public keys, data, or digital signatures—onto the new message will fail. Methods for evaluating security in this layer have been developed by the academic cryptography community since the late 1970s, through their research into modern cryptography and its standardization. Until a research paper can align the framework of “provable security” to thereby guarantee security, and can further stand up to stringent peer review, conference presentations, and further examination, then any proposed cryptographic algorithm will not be trusted. Provable security establishes a framework that mathematically guarantees that the longer a key or output is, the exponentially more unlikely it is that the algorithm will be broken, irrespective of whether we consider the cryptographic techniques (such as hash functions) or cryptographic algorithms derived from public key cryptography (such as digital signatures). The National Institute of Standards and Technology (NIST), which sets standards in the USA, used global competitions to decide the advanced encryption standard (AES) for symmetric key cryptosystems, as well as the SHA-3 hash function standard. The competition leading to the AES lasted 4 years (1997–2010), whereas SHA-3 took 7 years to develop, with the first workshop taking place in 2005 and the first discussions on requirements conducted in 2012. The academic community coordinates on matters such as cryptographic technology requirements, evaluation methods, actual technology proposals, theoretical evaluations, fair software and hardware evaluations, and so on. Through this process, thorough academic evaluation has ensured the trustworthiness of cryptographic algorithms. (2) The Fundamental Protocol Layer This layer combines cryptographic algorithms with communications technologies, to securely realize the basic functions required for a blockchain. A blockchain must update stored data while remaining decentralized and avoiding data modification. To this end, the blockchain combines cryptographic techniques in the form of constructed blocks and hash chains, with decentralized elements in the form of peer-to-peer (P2P) protocols and distributed consensus algorithms. In the case of blockchain, this entire combination must be evaluated to verify that security is maintained.

How We Can Secure Blockchain-Based Systems

99

Two methods of evaluation are possible in this layer: the first is to mathematically prove that the probability of a successful attack is extremely small, as performed in cryptographic algorithm evaluations; the other method is to formally examine the security of the combination, assuming that the underlying cryptographic algorithm is secure. This framework was established by the International Organization for Standardization (ISO) as an international standard—ISO/IEC 29128 (Verification of Cryptographic Protocol)—by the author of this chapter, and at time of writing it is beginning to be applied to blockchain technology. Previous research findings that are highly relevant to blockchain—such as those regarding cryptographic timestamps using hash chains (published in 1990 [4, 5]) and those on hysteresis signatures (published in 2000 [6])—can be applied to evaluate the security of blockchain. On the other hand, as this book has shown, we have not yet realized the properties that satisfy blockchain’s requirements regarding both P2P protocols and distributed consensus. The security discussions around blockchain protocols that comprehensively fit these two pieces together are only just beginning. (3) The Application Protocol Layer This layer is for designing payments and other business logic. The protocols in this layer are designed for applications that use blockchain. In Bitcoin, these include business logic functions, such as those used for the mining and payment of coins. When user privacy protection is a necessary application requirement, privacy-enhancing mechanisms are also needed in this layer. Likewise, if an anti-money laundering feature is needed, these solutions should be designed on this layer. In Bitcoin, mining is incorporated into business logic, while also providing an incentive toward honesty for node administrators. In other words, mining handles the security that the fundamental protocol layer would normally oversee. Therefore, we must ensure that the Bitcoin design contains elements that do not separate security into multiple layers. This design is one reason why proposed solutions to the scalability and security problems of other layers are often rejected: they contradict the Bitcoin community’s financial incentives. Attempts toward formal verification in this layer are beginning to be made by Ethereum and other providers. (4) The Implementation Layer This layer implements the first three layers into a programming code and hardware. The goal of this fourth layer is to eliminate any vulnerabilities to bugs. One method of realizing this goal is to use a language processor that is unlikely to produce vulnerabilities. This goal applies not only to security but also privacy protection.

100

S. Matsuo

In reality, the threat of an attack—on both hardware and software—is ever-present. An adversary might infer a cryptographic key from secondary information available in side channels, which have only an indirect connection to encryption processing—for instance, power consumption or processing time. The adversary might employ a fault injection, deliberately feeding data into the hardware circuit to acquire information and infer a cryptographic key. Another possibility is a cold boot attack, in which an adversary cools a memory chip to a low temperature, thereby extending the time that data remains in the memory to long enough to permit extraction. These are all examples of attacks that developers must be wary of. The ISO 15408 standard established a security evaluation framework and evaluation method for this layer. It is said to be difficult for startup companies to enforce ISO/IEC 15408 because it contains extremely complicated steps and criteria; however, technologies such as Bitcoin and blockchain—which could become platforms of enormous financial value—must be developed according to such a framework. Therefore, blockchain developers will have to decide how they apply existing standards to their essential concepts. (5) The Operation Layer This layer determines the overall operation of a cryptographic system (including its human operation), as well as how to audit it. First, it determines the system’s security policy; this policy is incorporated into the work processes, and it must be in an implementable operation form. In addition, because operation and auditing always exist together, rules are needed for auditing. This layer invokes the question of how to follow the PDCA cycle (plan, do, check, act/adjust) and makes improvements. It also considers issues in incident handling; for example, the DAO incident resulted in a major debate over how to deal with the problems caused by vulnerability. Security in this layer is regulated in the form of the ISO/IEC 27000-series and the Information Security Management System standards. A security policy must be prescribed, which assumes that a community capable of writing it exists. However, in a decentralized community, this is difficult to achieve.

4

Assumptions Between Layers Produce Pitfalls

To ensure system-wide security, two points must be considered when adopting the five-layer model detailed above. First, each layer must eliminate any vulnerability and guarantee sufficient security within its own layer. Irrespective of the employed cryptographic technology’s security level, if an implementation contains bugs and open vulnerabilities, its encryption becomes meaningless. If a system administrator uses very weak passwords or authentication, security holes are created.

How We Can Secure Blockchain-Based Systems

101

If the operator works negligently and no auditing is carried out (as in the Mt. Gox case), major weaknesses are produced. Asides from this, the trust model in Satoshi Nakamoto’s original Bitcoin paper is restricted to Bitcoin alone; it was not intended to guarantee secure exchanges with existing fiat currencies. In other words, the question of what kind of security cryptocurrency exchanges should maintain lies beyond the scope of Nakamoto’s paper. Therefore, a cryptocurrency exchange might become a centralized organization of power, in opposition to the original vision of blockchain’s trust model; it could come to resemble Big Brother, the leader of the totalitarian state depicted in George Orwell’s novel “1984.” When considering system security and the decentralized nature of blockchain, we inevitably have to discuss what forms the nodes will take and how they will be audited. The other major point regarding system-wide security is this: because the layers are separated, the techniques used for one layer make numerous assumptions about the others. To exemplify, a crypto-technology designer assumes that everyone keeps their secret keys 100% confidential. The cryptography itself does not guarantee confidentiality, integrity, or authentication. We guarantee these properties by mathematically converting them into a problem of each user’s key management, which is more realistic. In itself, this is a major academic achievement; however, from a purely systems security perspective, we have merely discarded a great deal of the responsibility for secret key management. In fact, several businesses have been unaware of this fact when creating products or systems using encryption. Another vulnerability is found in the SSL/TLS protocol; more specifically, it concerns a document published by the IETF regarding this protocol. The document contains many undocumented assumptions, which are known to the protocol’s designers but unknown to the developers who use SSL/TLS. This ambiguous point was never clarified, and when the protocol was implemented by these developers, it was performed under a different understanding. It is not uncommon for people operating on one layer to know nothing about tacit assumptions in other layers; this leads to discrepancies that can threaten system security. In this sense, we need engineers who can manage the overall security for systems based on blockchain, and who can perform system-wide evaluations with a fuller view of the overall picture.

5

Utilize Knowledge from Existing Systems and Standards

So far, we have discussed the security layers to be considered in blockchain; however, many discussions have already taken place concerning such layers for general information systems, and numerous standards have been proposed therefrom.

102

S. Matsuo

NIST uses contests to select the standard encryptions for cryptographic algorithms. Regarding the security of cryptographic protocols, I have been leading the work group to create a framework for ISO/IEC 29128, and a standardization that adds protocol evaluations to ISO/IEC JTC1 SC27/WG2 is also being developed. Implementation security is specified in ISO/IEC 15408 and the Cryptographic Module Validation Program, and rules for information security management (including operation) are specified in the ISO/IEC 27000-series. Adapting to these frameworks will take time and effort, but they will result in mechanisms for the third-party certification and auditing of security systems. In fact, many systems operating as societal infrastructure are already being built according to these frameworks. In this sense, if we fail to consider side-channel attacks and neglect to implement third-party certification according to standard frameworks, then we cannot be confident in using blockchain as part of society’s infrastructure. For example, key management is essential to blockchain. Using a hardware wallet by itself is insufficient; security is considerably more complex to ensure. The problem with current blockchain development—including Bitcoin’s—is that the organizations currently developing it are not utilizing the knowledge already available. With Bitcoin in particular, which demands extreme decentralization, developers might continue ignoring established research findings because their ideological leanings prevent them from working with international standards that originate from an authority. However, I believe that the information security frameworks regulated by existing standards—and the corresponding wisdom therein—will play a vital role in making blockchain secure. The developer community must rethink the meaning of decentralization, so as to develop methods of securing systems without creating a “Big Brother” scenario. Stanford University hosted Blockchain Protocol Analysis and Security Engineering 2017 from January 26–27, 2017. During this international conference, discussions took place on how to secure blockchain. At time of writing, another conference is scheduled for January 24–26, 2018. Meanwhile, the ISO has established the TC 307 committee for international standardization in blockchain; it began operations in April 2017. A second international conference on blockchain—convened in Tokyo during November 2017— decided to form Working Group 2, which develop technical documentation pertaining to security, privacy, and identity. In May 2018, a project to write a technical report on the security of digital asset custodians was established, at a London meeting of the TC 307. I became the leader of these projects to write technical documents on security, and believe that these efforts will reduce the disparity between existing standards and new, decentralized mechanisms. When considering how to maintain system-wide security, it is difficult to apply general information systems methods to blockchain, and it is unlikely to become affordable to do so. Thus, applying our current knowledge to the new decentralized mechanisms will likely represent an important step in making blockchain part of a world-changing infrastructure.

How We Can Secure Blockchain-Based Systems

103

References 1. CELLOS: Cryptographic protocol Evaluation toward Long-Lived Outstanding Security Consortium. 2. Kenneth, P., T. van der Merve. Reactive and Proactive Standardization of TLS. 3. Narayanan, A., Bonneau J., Falten E., et al. 2016. Bitcoin and Cryptocurrency Technologies: a Comprehensive Introduction. Princeton University Press. 4. Haber, S., W. Scott Stornetta. How to Time-Stamp a Digital Document. 5. Buldas, A., M. Saarepera. On provably secure time-stamping schemes. 6. Matsumoto, Tsutomu, Iwamura M., Sasaki R., et al. 2000. Alibi Establishment for Electronic Signatures: How to prove that you did not make the electronic signature in question even when the base cryptosystem was collapsed Part 1. Concepts and Basic Schemes. In 8th Conference of the Computer Security Group (CSEC), 13–17, March, Hitachi Ltd., “What is a Hysteresis Signature?”.

Dr. Shin’ichiro Matsuo is a Research Scientist in Cryptography and Information Security. He is working on maturing blockchain technology from the academia side and presents research results on blockchain security. At Georgetown University, he directs the CyberSMART Research Center and leads multi-disciplinary research among technology, economy, law, and regulation. He also leads international research collaboration on blockchain and founded BASE (Blockchain Academic Synergized Environment) alliance with the University of Tokyo and Keio University. In 2019, he co-founded Blockchain Governance Initiative Network (BGIN) a multi-stakeholder discussion body like IETF, as an initial contributor. He is co-chair of BGIN. He is a co-founder of the BSafe.network, an international and neutral research test network to promote applied academic research in blockchain technologies. He is a part of many program committees on blockchain technology and information security, and a program co-chair of Scaling Bitcoin 2018 Tokyo. He serves as the leader of security standardization project of blockchain (ISO TC307). Previously, he served as the head of Japanese national body of ISO/IEC JTC1 SC27/WG2 for cryptographic techniques, a member of advisory board cryptographic technology for the Japanese government.

The Current State of the Global Movement Shin’ichiro Matsuo

The concept specific to blockchain is its capacity to process transactions and contracts by recording states of rights and values under a decentralized model without a trusted operator. Thus, blockchain’s technological philosophy and concepts aim toward similar goals as the Internet’s. The Internet enabled the centralized operation of communications, and the blockchain aims to facilitate economic and social activities without trusted operators. It can be described as “permission-less.” However, even Bitcoin—the longest-running blockchain network—cannot completely realize these goals at present. When we select a part of the Bitcoin technology— such as the “blockchain”—and apply it to broader applications, the current blockchain technologies are not sufficiently qualified to meet their technical requirements. Top-level blockchain engineers and researchers are working on solving the fundamental issues and gaps described in this book. Standardization efforts, which are essential for this technology to become a foundation of social infrastructure, have begun. This chapter reviews the current state-of-the-art technologies seeking to place blockchain on a sure, socially relevant foundation.

1

The Maturity of Blockchain Technology

From a range of perspectives, this book has described the current disparities between the technological capacities and expectations of blockchain, in terms of its original goals as a permission-less network.

S. Matsuo (&) Georgetown University, Washington, DC, USA e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Matsuo and N. Sakimura (eds.), Blockchain Gaps, Future of Business and Finance, https://doi.org/10.1007/978-981-33-6052-5_12

105

106

S. Matsuo

The crucial and interesting reason why blockchain technology falls behind its expectations is its trade-off between requirements. Requirements and specifications exist in complicated trade-off relationships; thus, it is difficult to solve problems and enhance the technology. The decentralized nature of blockchain is realized by uniformly and randomly distributing a large number of nodes. This characteristic is enhanced by greater numbers of nodes. However, we must decrease the cost of maintaining a node when increasing their numbers. Otherwise, only rich individuals can set up a node. This requires the specification of smaller block sizes. Greater numbers of nodes require more communications and more computer processes to reach a consensus. As a result, scalability and process speeds may be lost. To enhance security and privacy, we must sacrifice performance and usability, as well as increase users’ costs and responsibilities. It reduces the number of nodes, which then erodes security and decentralization. These trade-offs are inevitable when applying this technology to broader, real-world applications. A more efficient blockchain technology than Bitcoin’s may ignore certain critical technical requirements. The scalability problem is not easily solvable considering this technical background. In the case of Bitcoin, the 1 MB block size and 10 min per block production speed (at time of writing) is impressive considering its philosophy and technological trade-offs. In other words, such specifications might represent a bottleneck for different applications. We often hear of new, innovative applications and social mechanisms built around the blockchain. This situation arises from our experience of innovations in the context of the Internet, which decentralized communications and produced innovative ecosystems. The same applies for blockchain; thus, the Internet represents a good perspective from which to consider the future of permission-less innovations based on blockchain technologies. Here, we begin such considerations. However, the actual permission-less blockchain technology lacks sufficient capacity to realize such ambitious applications; the present moment is analogous to 1990 in terms of the Internet. The ambitious projects of this period sought to realize social network and online movie services with a T1 (1.5 Mbps) backbone and 2400 bps modem. Current blockchain technologies cannot record all forms of information. Enhancing scalability without sacrificing security and decentralization is essential, and will be difficult to achieve in the space of a few years. The Internet was not considered scalable in its early stage; however, multiplexing and the development of communication devices realized scalable Internet. Moore’s law is a good carrying vehicle of this. In the case of blockchain, the following breakthroughs are needed: – – – –

Large decreases in storage costs Large enhancements of global network speeds Higher synchronization levels between distributed Internet nodes New realistic consensus algorithm robust against appropriate fault/attack models for blockchain – Enhancement of off-chain protocols such as Lightning Network

The Current State of the Global Movement

107

However, it is difficult to conduct such research and development in the age of “Post-Moore’s law.” Constraint from mathematics cannot scale as fast as Moore’s law. Of course, private blockchains with a trusted third party can improve this performance. Many existing applications and business are based on a private blockchain. However, in this case, we need to re-think the “why blockchain” question. For example, cryptographic time-stamping with linked tokens, which was proposed in 1990, offers the same structure as private blockchains. Multiplexing time-stamping servers can realize most of blockchains applications by considering only crashes (not Byzantine faults); this is more efficient than blockchain.

2

Facilitating Collaboration Between Engineers and Academia

Collaborations between the engineering community and academia began in 2015 and became established in 2016. The motivation for these collaborations was the unique state of blockchain development. In the case of the Internet, between the ARPANET (1970) and the end of NSFNet (1995), the academic community—led by US universities—helped to develop the communication technology. At that time, the technological ideas were studied in academia and jointly implemented by companies and universities. Afterward, technology that was agreed upon by stakeholders was standardized. Businesses are established using such standards. This is a typical process by which to sufficiently mature technology into a piece of societal infrastructure in a requisite series of steps. However, with Bitcoin, a reference implementation was published immediately after the publication of Satoshi Nakamoto’s original Bitcoin paper. Many businesses began using this reference code. They proceeded without academic discussions and standardization, which are essential to developing technologies. As a result, blockchain-related businesses expanded alongside the unresolved issues highlighted in this book. Collaborations between academia and engineers started from discussions on the scalability issue. In 2014, the scalability issue became serious and difficult to solve technically owing to the economic incentives (such as the price of Bitcoin). Hence, multi-disciplinary discussions between cryptography, security, networks, and economics experts became essential. In September 2015, the first Scaling Bitcoin workshop was held in Montreal. That same year, the second workshop was held in Hong Kong. “Segregated Witness”—which decreases the size of on-chain transaction data—and “Lightning Network” were technologies proposed in these workshops. After that, the workshop was held in Milan (2016), Stanford (2017), and Tokyo (2018). This workshop now hosts discussion on scalability, privacy, and other relevant technical issues between Bitcoin core engineers and academic researchers.

108

3

S. Matsuo

Academic Activities on the Blockchain

Since 2016, the number of academic papers and activities relating to the blockchain has rapidly increased. The Institute of Electrical and Electronics Engineers and the Association for Computing Machinery, two major academic communities in computer science, established blockchain-focused academic conferences. Financial Cryptography—a top-level conference discussing the application of cryptography to financial operations—has held Bitcoin workshops and smart contract workshops from 2016. Publishing papers in journals incentivizes academic researchers. In 2016, the “Ledger” journal was initiated to provide such an outlet to academic researchers. Another academic activity inspired by the history of the Internet is BSafe.network, a neutral, global-scale testbed constructed by universities. Each member university sets a blockchain node, and researchers can conduct any academic research and experiment with a real, global network. The scope of research includes technology, economics, and regulations. During the Internet’s technological development, the National Science Foundation (NSF) provided funds to US Universities to establish NSFNet, helping to develop the communication technology. Many experiments were conducted by using this testbed. In 1995, the NSFNet was made public as a foundation of the commercial Internet; BSafe.network will play the same role in blockchain development. At time of writing, 32 universities from 14 countries around the world have joined this network. The crucial philosophy of BSafe.network is neutrality. In early discussions of Bitcoin scalability, economic incentives regarding Bitcoin price became more important than the technological discussion. That is, even if some technologies represent an excellent enhancement for the blockchain ecosystem, they might not be implemented due to a lack of economic incentive. Neutral discussion is essential to eliminate such noise. BSafe.network aims to provide an environment for conducting neutral academic discussions. Academic research is becoming increasingly active and will encourage healthy development of blockchain technology.

4

International Standardization and Issues

International standardization is necessary when technologies form components of social infrastructure; they ensure interoperable and high-quality technology. On the other hand, internationally standardizing blockchain technology is premature considering the current state of blockchain technology. Much technological progress is expected, and technological requirements and architectures will change. However, international standardization activities have already begun. W3C held a workshop in April 2016, to explore potential standardization items. The International Organization for Standardization initiated the new technical committee TC 307 for blockchain and distributed ledger technology. In April 2017, the first

The Current State of the Global Movement

109

meeting was held in Sydney. Currently, the TC has six working groups, discussing terminology, reference architectures, security, privacy, identity, smart contracts, and interoperability. One major concern about the standardization of immature technology is that it is sometimes influenced by the marketing activities of existing used technologies, and is thus standardized according to a business perspective. Such marketing activities may disturb the standardization of the right technology. One such bad example was ISO/IEC 9979. This was “a list of” cryptographic algorithms without technology evaluations. Any company can “register” cryptographic algorithms. As a result, the list contained insecure or unused cryptographic algorithms. Such a list became obsolete for the industry and lost reliability as an international standard. Then, the ISO/IEC SC27 replaced ISO/IEC 9979 with ISO/IEC 18033, after careful security evaluations. For blockchain, it is beneficial to standardize neutrally evaluated technologies. In the case of Web technology, it is acceptable when it works well with the designated interface. However, for technology involving cryptographic techniques, its security should be carefully evaluated.

5

Beginning of Multi-Stakeholder Collaboration

The global and permission-less nature of blockchain technology often produces conflicts among stakeholders of the blockchain ecosystem. An example is a privacy protection and anti-money laundering (AML). Privacy protection in using crypto asset and blockchain is essential for users. On the other hand, from the regulators’ view, privacy-enhancing technology makes tracing money laundering hard. The major problem is there is no common language and understating among stakeholders on technology, operation, and governance, The lack of common understandings makes lack of transparency and compliance problems. In 2019, G20 discussed the needs for multi-stakeholder dialogue for decentralized finance, then concluded the following sentence in the communique. We welcome the FSB report on decentralized financial technologies, and the possible implications for financial stability, regulation and governance, and how regulators can enhance the dialogue with a wider group of stakeholders. In March 2020, Blockchain Governance Initiative Network (BGIN) was established to facilitate multi-stakeholder collaboration. The goals of this initiative are (1) Creating an open, global and neutral platform for multi-stakeholder dialogue, (2) Developing a common language and understandings among stakeholders with diverse perspectives, and (3) Building academic anchors through continuous

110

S. Matsuo

provision of trustable documents and codes based on an open source-style approach. It is expected that BGIN plays the same role as the Internet Engineering Task Force (IETF) for the blockchain ecosystem, then; as a result, the gaps pointed out in this book will be resolved by discussions at this initiative.

Dr. Shin’ichiro Matsuo is a Research Scientist in Cryptography and Information Security. He is working on maturing blockchain technology from the academia side and presents research results on blockchain security. At Georgetown University, he directs the CyberSMART Research Center and leads multi-disciplinary research among technology, economy, law, and regulation. He also leads international research collaboration on blockchain and founded BASE (Blockchain Academic Synergized Environment) alliance with the University of Tokyo and Keio University. In 2019, he co-founded Blockchain Governance Initiative Network (BGIN) a multi-stakeholder discussion body like IETF, as an initial contributor. He is co-chair of BGIN. He is a co-founder of the BSafe.network, an international and neutral research test network to promote applied academic research in blockchain technologies. He is a part of many program committees on blockchain technology and information security, and a program co-chair of Scaling Bitcoin 2018 Tokyo. He serves as the leader of security standardization project of blockchain (ISO TC307). Previously, he served as the head of Japanese national body of ISO/IEC JTC1 SC27/WG2 for cryptographic techniques, a member of advisory board cryptographic technology for the Japanese government.