IT Crisisology: Smart Crisis Management in Software Engineering: Models, Methods, Patterns, Practices, Case Studies 9813344342, 9789813344341

This book focuses on crisis management in software development which includes forecasting, responding and adaptive engin


English Pages 181 [194] Year 2021


Table of contents :
Foreword
Acknowledgements
Introduction: Look, This Bridge is Falling Down!
References
Contents
About the Author
Acronyms
1 Historical Survey: Crises, Crises, Everywhere; Any End to This Nightmare?
1.1 Introduction: Crises, Crises Everywhere
1.2 Software Crisis and Software Engineering
1.3 The New Discipline of the IT Crisisology
1.4 Conclusion: The Crisis Is Still Here
References
2 Models and Methods of Crisis Management
2.1 Introduction: Agility in Crises
2.2 Lifecycle Models in the Crisisology Framework
2.3 Using Case Method in Crises
2.3.1 Publishers: Springer Versus IGI
2.3.2 Pizza Makers: Dodo Pizza Scales up
2.4 Agile Process and Quality Improvement Approaches
2.4.1 Crystal Methods
2.4.2 Six Sigma Strategy
2.4.3 PAEI Lifecycle Framework
2.4.4 Six Sigma and Other Quality and Process Improvement Approaches
2.5 Conclusion: Lifecycle-Based Crisis-Responsive Models and Methods
References
3 Optimization Methods in Crises: Patterns and Practices
3.1 Introduction: Balancing Tradeoffs in Crises
3.2 Tradeoff Optimization Methods
3.3 Architecture-Centric Development Method
3.4 Crisis-Resistant Smart City Management
3.5 Blockchain as Digitalization Crisis Remedy
3.6 Conclusion: Crisis-Aware Optimization
References
4 Social and Human Aspects of Crisis: The “Human Factors”
4.1 Introduction: Taming Human Factors in Crises
4.2 Resilient Knowledge Transfer in Crises
4.3 Harnessing Human Factors for Better Knowledge Transfer
4.4 Case Study: Taxi Service Optimization
4.5 Conclusion: Lessons from the IT Crises: Optimizing the Human Factors
References
Conclusion: Captain Nemo and Crisis as a New Environment
References
Annex A Springer Exhibits
Annex B IGI Global Exhibits
Annex C DodoIS as the Dodo Pizza Company Understands It
Annex D DodoIS GUI Screenshots
Annex E DodoIS as a Cloud-Based System
Annex F The Dodo Pizza’s KPI and Team
Glossary
Index

Smart Innovation, Systems and Technologies 210

Sergey V. Zykov

IT Crisisology: Smart Crisis Management in Software Engineering Models, Methods, Patterns, Practices, Case Studies


Smart Innovation, Systems and Technologies Volume 210

Series Editors Robert J. Howlett, Bournemouth University and KES International, Shoreham-by-sea, UK Lakhmi C. Jain, Faculty of Engineering and Information Technology, Centre for Artificial Intelligence, University of Technology Sydney, Sydney, NSW, Australia

The Smart Innovation, Systems and Technologies book series encompasses the topics of knowledge, intelligence, innovation and sustainability. The aim of the series is to make available a platform for the publication of books on all aspects of single and multi-disciplinary research on these themes in order to make the latest results available in a readily-accessible form. Volumes on interdisciplinary research combining two or more of these areas are particularly sought. The series covers systems and paradigms that employ knowledge and intelligence in a broad sense. Its scope is systems having embedded knowledge and intelligence, which may be applied to the solution of world problems in industry, the environment and the community. It also focusses on the knowledge-transfer methodologies and innovation strategies employed to make this happen effectively. The combination of intelligent systems tools and a broad range of applications introduces a need for a synergy of disciplines from science, technology, business and the humanities. The series will include conference proceedings, edited collections, monographs, handbooks, reference books, and other relevant types of book in areas of science and technology where smart systems and technologies can offer innovative solutions. High quality content is an essential feature for all book proposals accepted for the series. It is expected that editors of all accepted volumes will ensure that contributions are subjected to an appropriate level of reviewing process and adhere to KES quality principles. Indexed by SCOPUS, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago, DBLP.

More information about this series at http://www.springer.com/series/8767


Sergey V. Zykov Higher School of Economics National Research University Moscow, Russia

ISSN 2190-3018 ISSN 2190-3026 (electronic) Smart Innovation, Systems and Technologies ISBN 978-981-33-4434-1 ISBN 978-981-33-4435-8 (eBook) https://doi.org/10.1007/978-981-33-4435-8 © Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

To God, my teachers, and my family

Foreword

More than 50 years ago, the term “software crisis” changed our view on the development of modern software to a way of thinking inspired by systematic engineering. This approach resulted in the discipline of software engineering, now highly regarded as an essential part of our current understanding of computer science. Edsger Dijkstra, winner of the 1972 ACM Turing Award, observed: “The major cause of the software crisis is that the machines have become several orders of magnitude more powerful! To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem.” To master the challenge of the software crisis, it was necessary to develop, integrate, and practice a large number of new models, methods, patterns, technologies, and use cases. Fundamental aspects exacerbating the software crisis have been considered. They included the rapidly evolving context of software technology, agile and adaptable software and enterprise architectures, context-oriented highly distributed information systems, and the human-centric aspects, especially the customer-centric strategic drivers for development, management, but also for business and application. Newer technologies such as service computing, cloud and edge computing, big data, the Internet of things, data analytics, and artificial intelligence are just a few current emerging approaches that induce increasing software complexity and additional challenges in coping with the persisting software crisis. The current book IT Crisisology: Smart Crisis Management in Software Engineering offers a comprehensible compilation of well-assorted studies, models, methods, patterns, practices, and case studies that support the original and newly developed framework of IT Crisisology. This book also provides ideas and guidance for the way forward.
This book offers a broad spectrum of essential methodological, technological, and business management contributions from software engineering background underpinning the new framework of IT Crisisology. The topics are discussed from a holistic perspective and illuminated from several dimensions, which motivates the detailed and accurate models of the established discipline of IT Crisisology.

The presented approaches are clear and understandable, providing a focused perspective on essential compromises and optimizations of software engineering modeling and practice. Specific details of the presented theoretical framework result from a number of practice-oriented patterns and comprehensive, well-aligned real-world use cases. The material contained in the book originates from research but is also strongly based on practical experience in this area, which characterizes scientific quality and relevance. The content of the book has a significant educational value by presenting specific software engineering methods and processes that focus on agile approaches and make social and human orientation a core for transferring the framework and theories into practice. All this should be relevant both for those just starting their journey in software engineering and for established researchers, managers, decision makers, and practitioners. It provides important knowledge on the basis of which many solutions can be created innovatively when companies seek new applications for their upcoming digital products and services or adapt their business profiles to fit best the fast-evolving digital transformation.

Prof. Alfred Zimmermann
Reutlingen University
Reutlingen, Germany

Acknowledgements

I would like to thank the colleagues of mine who significantly contributed to this book. They clarified my initially vague concepts and assisted in a number of processes including translation, copyediting, diagramming, etc. These are the students who did their master/Ph.D. theses under my supervision. A few of their takeaways were transformed and included into this book as case studies on agility improvement. They are: Joseph Attakorah, Alexey Chervinskiy, John Dadzie, Alexander Davydov, Yadviga Dolgikh, Azamat Dzhonov, Lev Feofanov, Ramis Gabeydulin, Pavel Kanaev, Lyuba Khoreva, Nikita Morgun, Evgenia Murzaeva, Vassily Naumushkin, Dinara Nikolayeva, Eric Parakal, Mazari Salim, Maxim Shmakov, Indra Singh, Alexander Sobolev, and Igor Vasilev. I would like to thank the Springer Editorial Director Dr. Thomas Ditzinger, the Springer Senior Editor Mr. Aninda Bose, the Springer Senior Executive for Production Mr. Ashok Kumar, and the Springer Project Coordinators for Books Production Mr. Daniel Joseph Glarance and Mr. Gowrishankar Ayyasamy, for their continuous availability and prompt assistance. In addition, I would like to express my deep appreciation and sincere gratitude to the editors in chief of the Springer Series in Smart Innovation, Systems and Technologies, Prof. Lakhmi C. Jain, and Prof. Robert J. Howlett, for their tireless efforts in supporting my ideas.


Introduction: Look, This Bridge is Falling Down!

This book focuses on crisis management in large-scale systems. To do this in a smart way, we suggest a multifaceted approach that carefully blends models and methods, and includes patterns, principles, and practices. This approach would be incomplete if it did not address the human-related factors. Of these, managerial and technical dimensions are essential. However, depending on a particular system type, one of these dimensions may dominate. The simplest system in both dimensions is an Excel spreadsheet. A typical example of a very technically complex system is a large telecommunication switch. Enterprise systems are complex in terms of management. Of course, this does not mean that they are technically simple. However, it is their managerial complexity which is mission-critical and, therefore, it often is a root cause of system development crises. We have discussed these crises in our previous books [1, 2]; they typically result from an imbalance between available resources, business requirements, and technical constraints. We stated that “in software development … a crisis is … a disproportion between client’s expectations and the actual product behavior.” Why can this disproportion happen? Very often, this happens as a result of miscommunication between the project participants (at a minimum, including client team, developer team, and their management teams). Another crisis trigger is lack of resilient and responsive communication during a crisis due to distorted product vision and consequently wrong product development and delivery. We treat an enterprise as an “adaptive socio-technical large-scale system” [3], and it includes subsystems such as social and technical. As such, the social subsystem operates by means of messaging. Consequently, “organization is communication” [1]. Therefore, the focus of this book is related to the so-called human factors. 
To determine the complexity of the management framework, we provide a brief outline of the lifecycle models and methodologies. Our idea is to provide guidelines for flexible (i.e. agile) adjusting of the project parameters “on the move.” As such, we analyze the model-and-methodology frameworks to explain the ways of their smart tailoring and agile adjustment by means of a tradeoff optimization of the managerial dimension of complexity based on the human factors. To understand and master agile adjustment techniques and practices more deeply, we suggest a set of case studies, which we believe are a powerful approach to develop analytical and decision-making skills. Gill argues that case studies are especially suitable for conquering complexity in large-scale systems and better informing [4]. He also states that “… case studies … are likely to be the most rigorous form of research in many—if not most—settings where the fundamental test of our research value is its applicability to practice” [4]. Importantly, case studies address the higher levels of Bloom’s taxonomy such as analysis, synthesis, and evaluation [5].

This book’s focus is crisis management in developing software; this includes forecasting, responding, and adaptive engineering models, methods, patterns, and practices. Over the past decades, production in general and software engineering in particular were understood and practiced in different ways. Changeable business constraints, complex technical requirements, and the so-called human factors imposed on the software solutions caused what was articulated as “software crises.” These complex sources of trouble require a multifaceted approach to address each of their layers. Therefore, this book suggests an approach that contains models, methods, patterns, and practices to efficiently meet and match all the crisis ingredients. Software engineering was triggered by what was initially identified as a “software crisis”; however, this practically focused discipline, even after 50 years of existence, cannot be considered as a “silver bullet” for certain software products. This means that the crisis is still here and it may immediately occur in cases of careless selection or unbalanced application of the rich variety of principles and practices that the state-of-the-art software engineering currently incorporates. The pioneering 1968 NATO Conference in Software Engineering compared software development and bridge production. Both processes are very complex and require interaction of multiple teams for design, construction, and maintenance.
However, the lifecycles of material and software products were recognized as essentially different, and, therefore, digital product development undoubtedly requires a very specific framework, particularly in a crisis. This book introduces the crisis management framework as a blend, the components of which have been carefully selected from a few domains adjacent to software engineering such as material engineering, management, and psychology. The key ingredients of this crisis management framework, to name a few, are the elements of information theory, tradeoff analysis, agile development, and knowledge transfer. This book will help the stakeholders in understanding and identifying the key technology, business, and human factors that may likely result in a software production crisis, i.e., critically affect the project outcomes in terms of successful software product development and delivery. These factors are particularly important for the mission-critical and large-scale applications, typically considered very complex in managerial and technological aspects, and therefore specifically addressed by the discipline of software engineering. This book will throw light on the crisis-responsive and resilient methodologies and practices; therewith, it will focus on their evolutionary changes and the resulting benefits. To understand the social and human aspects of IT Crisisology, this book suggests the case study-based approach. A set of comparative case studies for digitally focused businesses of a different scale and scope will be considered. We illustrate the Crisisology from the perspectives of different continents, historical and cultural diversity, which can essentially affect the human factors that often are the root cause of any crisis. The businesses that we examine clearly have a number of similarities including their overall structure and ultimate goals. However, certain outcomes and business deliverables due to local varieties and business-specific dynamics might be essentially different. After discussing each of these case studies separately (from the perspectives of process, data, and software systems used), we will compare them in terms of business, technology, and human-related factors to detect and refine common patterns of achieving crisis responsiveness and resilience.

This book is organized as follows. Chapter 1 presents the historical background of software engineering as the initial discipline to manage the crises in software development. Further, it outlines the framework of the new IT Crisisology discipline, which includes ingredients such as formal models and development methods, typical patterns, and best practices. Chapter 2 provides more details on the models and methods of crisis-responsive software production; it elaborates on lifecycles and product quality, focusing on agile approaches and techniques; these improve communication and facilitate adjustable team development, which assists in crisis management for quality software production. This chapter also contains case studies regarding crisis-responsive transformations of book publishing, and pizza production and delivery businesses. Chapter 3 incorporates tradeoff-based optimization approaches, specifically architecture-centered and compromise-oriented ones. This chapter also suggests crisis-resilient patterns and practices applicable to software development, including the techniques for smart city and digital transformation management.
The final chapter addresses knowledge transfer issues, which critically depend on the so-called human factors. Therefore, harnessing these human factors results in a resonant and more efficient knowledge transfer, even in a crisis. This chapter also includes a real-world case study related to IT-intensive crisis transformation in taxi service management. This interwoven collection of business transformation stories in crises assists in evaluating the models and methods introduced previously, and in revealing helpful takeaways and dangerous pitfalls. The conclusion summarizes crisis management methods and techniques and recommends best practices for efficient software development in these critical conditions. We hope that this book will serve as a reliable compass for software developers as it will give them the necessary guidelines and produce reliable bridges for their customers to navigate safely through the rough ocean of software production in stormy times of crises.

References

1. Zykov, S. V. (2018). Managing software crisis: A smart way to enterprise agility (p. 153). Switzerland: Springer International Publishing.
2. Zykov, S. V. (2016). Crisis management for software development and knowledge transfer (p. 133). Switzerland: Springer International Publishing.
3. Gromoff, A. (2010). A study of the subject-oriented approach for automation and process modeling of a service company. S-BPM ONE 2010: 192–206.
4. Gill, G. T. (2011). Informing with the case method: A guide to case method research, writing, & facilitation (p. 562). Informing Science Press.
5. Bloom, B. S. (1956). A taxonomy of educational objectives. Addison-Wesley Longman Ltd., 2nd Edition.


About the Author

Dr. Sergey V. Zykov completed his M.Sc. and Ph.D. in Computer Science at Moscow Engineering Physics Institute in 1994 and 2000, respectively. Presently, Dr. Zykov is a Professor at National Research University, Higher School of Economics, Russia. He is a member of many professional organisations such as IWA, IIIS and KES Intl. He serves on the boards of several reputed journals, including the International Journal of Knowledge-Based and Intelligent Engineering Systems and Intelligent Decision Technologies. He has published three books with Springer, “Managing Software Crisis: A Smart Way to Enterprise Agility”, “Agile Enterprise Engineering: Smart Application of Human Factors” and “Crisis Management for Software Development and Knowledge Transfer”, as well as “Software Engineering for Enterprise System Agility” with IGI Global.


Acronyms

6σ      Six Sigma
ACDM    Architecture-Centric Development Method
AHP     Analytic Hierarchy Process
BiTA    Blockchain in Transport Alliance
CBA     Choosing by Advantages
CIO     Chief Information Officer
CSF     Critical Success Factor
DEA     Data Envelopment Analysis
DMU     Decision-Making Units
ERA     Evidential Reasoning Approach
GDP     Gross Domestic Product
GUI     Graphical User Interface
ICT     Information and Communication Technologies
IDC     International Data Corporation
IofAs   Importance of Advantages
IoT     Internet of Things
IQR     Interquartile Range
ISO     International Organization of Standardization
KM      Knowledge Management
KPI     Key Performance Indicator
KS      Knowledge Sharing
KT      Knowledge Transfer
KTT     KM Tools and Technology
M2M     Machine-to-Machine
MCDA    Multiple-Criteria Decision Analysis
MCS     Mobile Crowdsensing
OS      Operating System
PDCA    Plan–Do–Check–Adjust
PDM     Precedence Diagramming Method
PPE     Personal Protective Equipment
PSP     Personal Software Process
PtD     Prevention Through Design
SD      System Dynamics
SDLC    Software Development Lifecycle
TMCS    Top Management Commitment and Support
TSP     Team Software Process

Chapter 1

Historical Survey: Crises, Crises, Everywhere; Any End to This Nightmare?

Abstract This chapter gives an overview of the concept of software development crisis, and how this should be addressed. This includes historical surveys of crises in economics and digital product development, particularly the NATO Software Engineering Conference of 1968. Another key area of this chapter deals with the new discipline of IT Crisisology. Discussion of models and methods for crisis-responsive software development is included in this chapter, to allow further understanding of how to monitor, predict, and react to these mission-critical events which require immediate attention. The chapter concludes that after half a century of software engineering as a discipline, IT crisis is still a dramatic issue and suggests a further elaboration of the new IT Crisisology framework in the following chapters.

Keywords Software crisis · Software engineering · IT crisisology · Crisis management · Tradeoff optimization · Agile methodology

© Springer Nature Singapore Pte Ltd. 2021 S. V. Zykov, IT Crisisology: Smart Crisis Management in Software Engineering, Smart Innovation, Systems and Technologies 210, https://doi.org/10.1007/978-981-33-4435-8_1

1.1 Introduction: Crises, Crises Everywhere

This chapter begins with the definition of a crisis in economics and continues to discuss the crisis in software development. This discussion includes a brief historical survey and detects the key issues that typically trigger crises. Based on the historical lessons learned, it introduces the new discipline of IT Crisisology, identifies its structure and the key applications. The first part of this chapter presents a survey of the early days of software engineering and the crisis-related problems that gave birth to this discipline. A crisis is defined as an imbalance of project resources. This critical situation was recognized at the historic 1968 NATO Software Engineering Conference, and applying engineering practices to software development was suggested as a crisis remedy. This section also addresses the software complexity and so-called human factors, which are potential crisis triggers in digital product development. The section concludes that the issue of crisis is still uncertain (as some researchers argue that it is over, whereas others state that it still exists), and therefore, developers require a systemic approach which is able to conquer the digital product development crisis in software engineering.

Further, this chapter presents the novel framework of IT Crisisology, which is intended for the successful development of quality software products in crisis, including mission-critical and large-scale applications. This new IT Crisisology approach includes a set of models, methods, and tools for crisis monitoring, forecasting, detection, resolution, and prevention. IT Crisisology incorporates “the three pillars” of technology, business, and human-related factors. This chapter also considers a few model-based approaches to crisis management. These include the PAEI framework for company development lifecycle, agile methodologies for software development, and optimization-based approaches, such as the enhanced spiral model, agility matrix, and tradeoff methods for software-intensive digital product design.

1.2 Software Crisis and Software Engineering

Regarding the development of software systems, let us consider what a crisis is, and the NATO conference where this term originated. How can developers deal with this crisis, what are its causes, and what factors influence the anti-crisis development? Karl Marx, the founder of the theory of crises in economics, argued that a crisis is an imbalance, relating this crisis to overproduction. A crisis can be triggered by an imbalance of resources in terms of production and consumption, or an imbalance of resources in terms of profits and wages. For software engineering and the crisis of developing software systems, the human factor is quite serious. In fact, consumption, in terms of behavior of the buyer, and production, i.e., behavior of the seller as a person, have certain psychological reasons. It is not by chance that Carnegie Mellon University, the founder and producer of standards for software engineering, was established by psychologists. Psychologists continue to develop the human–machine interfaces, which include behavioral aspects of users in the system, as the human factor is quite important in software engineering. Software engineering, as a discipline, has two components: a technological and a managerial one. In terms of complexity of software products, enterprise systems are distinguished by rather high technological complexity and more significant management complexity. Therefore, communication, negotiation, and teamwork skills, also known as soft skills, are a very important asset of any software engineer and are mission-critical in any software engineering training program of the leading universities, including Carnegie Mellon. Very often, the developer and the customer have different viewpoints and ideas regarding the future software product, its features and development.
The task, nevertheless, is establishing a common understanding between the developer and the customer about this software product, even in a crisis. That is why a crisis often originates from a difference in perception. Therefore, the following questions arise:

• How can we manage a crisis?
• What actually happens in this crisis?
• How do the product vision and expectations of the developer differ from those of the customer?
• Why and how does this difference arise?
• How can the developer and the customer achieve a common vision (and is this always possible)?
• Is it possible to predict or prevent a software development crisis?

The above questions suggest that software engineering is not a purely technical subject, as the role of human factors is very important. The concept of a crisis in software development was introduced at the 1968 NATO conference, which brought together experts from Western Europe and the USA, who advised NATO on military systems. The term “crisis” was coined by Bauer and supported by Dijkstra, a significant contributor to concurrent system development. This term appeared at that conference due to the rocketing complexity of computers (such as the Burroughs B5000), which was so serious that the experts did not see a clear way to efficiently integrate software developers into this human–machine system. On the other hand, the development methods of the 1960s could not ensure adequate productivity and high quality of software development. Therefore, it was necessary to manage this crisis of hardware complexity and software inefficiency, addressing human factors in the development process. Dijkstra illustrated this situation in his Turing Award speech. In computer science, this award is regarded as the equivalent of the Nobel Prize. He said, “The main cause of the software crisis is that machines have become several orders of magnitude more powerful. Let’s face it, as long as there were no machines, there were no problems in programming at all.
When we had several weak computers, programming was not a big problem, and now that we have giant computers, programming has become an equally huge problem.” Dijkstra meant that one problem was hardware and software complexity increase, whereas another issue was personal and team software development efficiency. Today, those issues of the late 1960s software industry are still critical. These projects often inflate budgets, do not meet the deadlines; the products being developed lack functions and quality required. The projects are difficult to manage, and the code is hard to maintain. Let us try to understand why this happens and what are the ways to fix this. Rocketing software system complexity typically results in an unpredictable software development lifecycle, which is hardly manageable in terms of time and cost. Software engineering assists in addressing these uncertainties systematically, managing them in a predictable way by means of accurate lifecycle models and development methodologies. The crisis root cause, therefore, is located in the people’s minds, i.e., this crisis is largely a consequence of fundamental differences in the software product expectations of the developer and customer, rather than a result of technological issues. In this sense, an adequate and adaptive method of developing software systems is required, which includes:

(i) An adequate lifecycle model that specifies production phases, or a combination of such models
(ii) A development methodology that determines production processes, roles, and artifacts (including software process participants and their activities, deliverables for each development stage, and the milestone criteria)

As we already said, in the 1960s and before, the lifecycle was rather anarchic, and the development was commonly done by gifted individuals with unique abilities. Therefore, each system was unique as well, as it usually featured its own programming language and its own development and operating environment. Often, the systems were either hardly compatible with each other or incompatible at all. Next, there was a manufacturing stage that used simple development tools. This was followed by pipelines, i.e., toolkits also known as CASE tools, which supported teamwork and software maintenance. Regarding the question of whether there is a crisis in software engineering today, there are two opposing viewpoints: that the crisis is over, and that the crisis is still here. Probably the right answer would be that the crisis depends on the positions of the developer and the customer, and on their alignment. If we apply specific technologies and methods that mitigate this human factor influence and make the developer and customer share the same product concept, the final product will largely meet the customer’s expectations. Naturally, points of difference will still exist. However, they will be quite small. Let us consider crisis management by means of predictive, preventive, and responsive methods. To address the human factors, a set of seven principles that control the development cycle will be used, including knowledge organization, knowledge transfer, and interaction with the client.
Together with an extension of Shannon’s information theory, based on electrical circuits and analysis of their interacting elements, this approach will provide a framework for the human factor management. To manage the technology factors, we suggest a matrix that describes processes, methods, and interfaces in software systems. For a certain combination of these models, we can achieve a balanced development of software systems in a crisis. What conclusions can be drawn from the brief survey of the crisis phenomena and their origins? Nearly 50 years ago, the experts decided that a software crisis happened, and this crisis was caused by an imbalance between the complexity of the software and hardware systems, and the inefficiency of the human factor involved. This suggests that the human factor requires optimization together with the technological factors. As a result, software engineering emerged, a discipline for production of efficient and quality systems, by methods similar to material engineering, i.e., production of architectural structures, buildings, and bridges. These two examples were considered at the 1968 NATO conference, where it was concluded that blind copying of these material production methods for software engineering, unfortunately, was inefficient.

Instead, this new discipline requires a balanced combination of models, methods, and tools to optimize software development in terms of resources (i.e., time, cost, and features) and constraints (i.e., quality and complexity). This means multi-factor optimization, unique for each project, due to priority changes. Finally, software systems are often complex technologically and managerially, which requires optimizing technology and human factors.
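
The multi-factor optimization described above can be illustrated with a minimal weighted-scoring sketch. This is only one simple way to express such tradeoffs and is not a method prescribed by the IT Crisisology framework. The Option class, the function names, the factor scores, and the weights are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """A candidate project plan, rated 0.0 (worst) to 1.0 (best) per factor."""
    name: str
    scores: dict[str, float]


def weighted_score(option: Option, weights: dict[str, float]) -> float:
    """Weighted average of factor ratings; weights encode project priorities."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * option.scores[factor] for factor, w in weights.items()) / total


def best_option(options: list[Option], weights: dict[str, float]) -> Option:
    """Pick the plan with the highest weighted score."""
    return max(options, key=lambda o: weighted_score(o, weights))


# Hypothetical ratings for resources (time, cost, features)
# and constraints (quality, complexity).
fast = Option("ship fast", {"time": 0.9, "cost": 0.6, "features": 0.4,
                            "quality": 0.5, "complexity": 0.7})
solid = Option("ship solid", {"time": 0.4, "cost": 0.5, "features": 0.8,
                              "quality": 0.9, "complexity": 0.5})

# Two hypothetical priority profiles for the same pair of plans.
deadline_first = {"time": 5, "cost": 2, "features": 1, "quality": 1, "complexity": 1}
quality_first = {"time": 1, "cost": 1, "features": 2, "quality": 5, "complexity": 1}
```

With these assumed numbers, the deadline-driven profile prefers the first plan and the quality-driven profile prefers the second, which shows why the optimum is project-specific and shifts whenever priorities change.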

1.3 The New Discipline of the IT Crisisology

Crisis is an imbalance between the expectations of clients and the actual behavior of the product. It is due to rapid increases in computer power and the complexity of unhandled problems. The term “crisis” is used in computer science to describe the difficulties of writing useful and efficient computer programs within a scheduled time. Let us define crisis in software production as a situation of premature project termination, insufficient quality, or late product delivery, due to an imbalance of project resources and product constraints. Let us define IT Crisisology as a discipline that studies crises in software production, including their monitoring, forecasting, detection, mitigation, resolution, resilient and adaptive (either proactive or reactive) response, and prevention. Crises affect both the software project and the software product. In software projects, crises occur in the form of budget overruns, late delivery, and unmanageable projects, whereas in software products they manifest themselves as products that are inefficient, of low quality, essentially different from customer requirements, difficult to maintain, or even undelivered.

Three “Pillars” of IT Crisisology

The first “pillar” refers to technical requirements; these include quality attributes (e.g., portability, security, and performance) and the system-level framework, such as the integrated development environment (IDE), database management system (DBMS), and programming language (PL). The second “pillar” refers to business constraints; these take the form of project delimiters (e.g., budget and time to market) as well as the problem domain in which the crisis occurs. The third “pillar” is the human factor; this encapsulates such communication attributes as teambuilding ability, negotiation skills, and the ability to transfer knowledge to other people.
Let us refer to these three “pillars” as the technology, business, and human factors, i.e., “T”-factor, “B”-factor and “H”-factor (see Fig. 1.1). The idea is that removing any of the above “pillars” results in a “global” crisis (as it would with removing any
leg of a three-legged stool), whereas neglecting any of these results in a “local” crisis due to their imbalance.

Fig. 1.1 Three pillars of the ITC framework: technology (T), business (B), and human (H) factors

Myths About Crisis

On April 12, 1959, John F. Kennedy, then a US senator and later US President, said that a “crisis” meant an “opportunity” [1]; however, the Chinese word for crisis he mentioned is typically composed of these two characters: “danger” and “situation”. Notably, at that point in time, “Crisisology” was born as a phenomenon. As the years went by, researchers argued whether the crisis in software engineering was over or still existed. In 2008, Colburn et al. argued that there was no software engineering crisis [2]. However, in 2006, Buettner et al. had argued the opposite, stating that software engineering was indeed in a crisis [3]. All of this happened because of the fundamental differences in the lifecycles of software and material products, and due to human factor dependencies.

IT Crisisology

This book presents IT Crisisology, the discipline that systematically studies IT crisis phenomena and their management. This discipline addresses business, technical, and human factors; it applies a carefully selected blend of the software engineering models, methods, and tools that result in manageable (i.e., measured and predictable) projects and better quality products, conquering product complexity and ambiguity, and managing project uncertainty. IT product management differs from financial management and material production management in many ways, including its product type, which is up to 100% non-material, and its lifecycle (i.e., implementation and maintenance cost). The other key reason is communication ambiguity, as the developer and the client sides typically speak different languages. This is evident even from the Google search results, which report 612 M for the “Crisis Management” query, 629 M for “Financial Crisisology”, and only 1.2 K for “IT Crisisology”. In 1968, NATO organized the first conference on software engineering in Garmisch, Germany, in order to find a solution to what became known as the “crisis” at that time. Among the key delegates of this conference were Friedrich Bauer, Alan Perlis, Edsger Dijkstra, and Peter Naur. Software has been in crisis for over 50 years. A number of models were developed due to the crisis; one example is the “crisis triangle” (Fig. 1.3) based on the “Iron” (project) triangle (Fig. 1.2). Let us mention “Quality”, which deals with adjusting tradeoffs and managing the project budget, project scope, and project schedule.

Fig. 1.2 “Iron” project triangle

The “crisis triangle” consists of three zones. These include the “comfort” (i.e., green) zone, where there is little need for management since the actions proceed as planned, and this guarantees quality deliverables.
The second zone is the “tradeoff” (i.e., yellow) zone, where a number of factors may be adjusted, and immediate actions are required in order to guarantee the quality. The third zone is the “crisis” (i.e., red) zone; in this critical area, immediate actions are required to produce even a negotiable quality. In dealing with this crisis triangle, a paramount factor is the communication between the developer and the customer. Shannon’s information theory fits the “developer-and-customer” system, where the key issue is transferring messages without loss, especially in a “noisy” environment [4]. In this model, a crisis typically results from an information loss, and in case of a large-scale system, such a loss can happen due to complexity. To compensate for negative issues and conquer the complexity, feedback (either negative or positive) is mission-critical. The soft skills that result from the so-called human factors (such as communication, negotiation, conflict management, and teambuilding) are based on psychology and are highly essential for software engineers; however, these are initially rare among the technical staff (i.e., software developers). These human factors can be broken down into “seven principles” of knowledge transfer [5]:

• Prior knowledge
• Knowledge organization
• Motivation
• Practice and feedback
• Mastery
• Climate
• Metacognition

Fig. 1.3 “Crisis” triangle
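
The three zones of the “crisis” triangle (Fig. 1.3) described above can be read as a simple classification of how far a project has drifted from its plan. The following is a minimal sketch under stated assumptions: the Zone enum, the classify function, the choice of relative budget, schedule, and scope deviations as inputs, and the 10% and 30% thresholds are all hypothetical and are not taken from the ITC framework.

```python
from enum import Enum


class Zone(Enum):
    COMFORT = "green"    # actions proceed as planned
    TRADEOFF = "yellow"  # factors must be adjusted to guarantee quality
    CRISIS = "red"       # immediate action needed for even negotiable quality


def classify(budget_overrun: float, schedule_overrun: float, scope_gap: float,
             tradeoff_at: float = 0.10, crisis_at: float = 0.30) -> Zone:
    """Map the worst relative deviation from plan to a zone.

    Inputs are fractions of the plan (0.15 means 15% over budget, and so on).
    The thresholds are illustrative assumptions, to be tuned per project.
    """
    worst = max(budget_overrun, schedule_overrun, scope_gap)
    if worst >= crisis_at:
        return Zone.CRISIS
    if worst >= tradeoff_at:
        return Zone.TRADEOFF
    return Zone.COMFORT
```

Taking the worst deviation rather than an average reflects the idea that neglecting any single side of the triangle is enough to leave the comfort zone; an averaged score would be an equally defensible design choice.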

Surprisingly, the human factors result in crises more often than the technical factors. In order to manage these, metrics are necessary to determine whether the product development goals are met, how to achieve these goals, and what is required. Then, a time, priority, and due date for each deliverable are established. These ingredients, taken together, are often referred to as SMART communication [6]. Also, adaptive development practices such as teambuilding, Myers-Briggs personality tests, situational leadership models, and personal development models are considered. This difference between developers and customers inspires the need for tradeoff-based architecture methods (ACDM/ATAM) [7]. ACDM/ATAM helps avoid crises by reducing uncertainty (Fig. 1.4). ACDM defines the main stages of work on the coordination of design and rapid, adaptive development of high-quality architectural design. It helps teams to analyze, identify, and build architecture “drivers” in the early stages of software development. These architecture “drivers” help to design, evaluate and test a software development project. The results of the project evaluation contribute to the refinement of the architectural design.

Fig. 1.4 ACDM/ATAM: software development lifecycle

The ACDM method includes these three main components:

• Processes: These involve iterative development and clarification of the architectural project
• Project evaluation: This determines whether the project is ready for production and identifies problems important to architecture
• Architectural testing: This solves technical problems identified during the assessment, serves as the basis for the refinement of the project, and requires (re)assessment until the product is ready for production.

In ACDM, the main goal is software design. This is achieved by performing these tasks: identifying, analyzing, and organizing architectural drivers, architectural project management and its documenting, project evaluation, management, iterative
refinement of software design to reduce risk, assessment, planning and correction of the system/product at the architectural level.

Agile Software Development

This is a set of approaches based on iterative development, where requirements and solutions evolve through collaborative efforts of teams and their customers, or end users. It provides a rapid and flexible response to manage a crisis. The general Agile software development lifecycle includes such phases as: Plan, Requirements Analysis, Design, Development, Test and Deployment. A number of software development methodologies follow the Agile way. These include: Scrum, Extreme Programming (XP), Feature-Driven Development (FDD), Adaptive Software Development (ASD), Lean Software Development (LSD), Kanban, Dynamic Systems Development Method (DSDM), and Crystal Clear. Agile methodologies help crisis management by employing the following techniques and practices:

• Continuous improvement: Feedback is gathered from the system, the customer, and team members throughout the project. The aim is to improve future iterations using past records
• Customer involvement: Customers are informed and work closely with the team. Customers see the deliverables, share the input, and have an impact on the end product. The aim is to meet customer requirements
• Changes: It is easy to accept and set changes with a shorter planning cycle. The aim is to refine and reprioritize items
• Fast and high-quality project delivery: The product is broken down into manageable units. Team focus is on high-quality development and testing. Bugs and errors are identified and solved quickly; the product is delivered fast, with a reasonable schedule and budget
• Team interaction: Frequent and face-to-face communication results in high-quality products
• Continuous goal clarification: With a clear goal, development adapts to evolving requirements as the project progresses within scope

Lifecycle Management in the Enterprises

In general, the lifecycle management in enterprises is similar to a person’s life in a number of aspects. The organization is born in creative and entrepreneurial agony, experiencing the difficulties and joys of adolescence. It reaches maturity, and then many organizations begin to grow old and decline as young and dynamic competitors gradually replace them. Old age is usually followed by organizational death. However, unlike people’s lives, the lifecycle of an organization is not limited to a certain time period. Examples of organizations that have been in their blossoming
and maturity for decades can be observed in companies like Google, Apple, and Amazon.

Change Management and Consequences

Often, a problem or an opportunity created by a change leads to a solution that will cause an even greater change, and as a result, customers face a new reality with a new set of problems or opportunities. When the systems change, they collapse and disintegrate. However, in order to break up and disintegrate, it is not necessary to be old. Just as young people sometimes commit suicide, young systems can also disintegrate. Therefore, regardless of the age of the system, the cause of its disintegration is change, and the faster the change, the faster the disintegration, which manifests itself in what we call problems. The lifecycle of an enterprise proceeds in a number of stages, which are summarized below.

The Courtship phase precedes the emergence of an organization, which has not yet come into being and exists only as an idea. During the courtship, the emphasis is typically on the ideas and opportunities that the future promises. The potential founder of the company experiences a burst of enthusiasm and willingly tells everyone about how wonderful his or her idea is.

At the Infancy stage of development, what the employees do is more important than what they think. The question that the founders must answer, and which they ask their employees, is: “What did you do? Did you manage to sell something, produce it, or bring it to an end?” Infant companies face an amazing paradox. The higher their risks, the higher their loyalty should be in order to ensure the achievement of success. At the Courtship phase, the founders must be dreamers capable of developing a devotion to their dreams. However, as soon as a company enters the period of infancy, the risk increases, and it begins to demand persistent, result-oriented founders who are no longer dreamers.

The “Go-Go” phase is a dramatic period of abrupt transition, from love to hatred. Workers may not like their leaders, but they continue to fear and respect them. If a company falls into the founder’s trap, it means that when the founder dies, the company dies. In order for the company to be able to save the hard-earned wealth, it must move from management based on intuition and position (used in the “Go-Go” phase) to a more professional management. This process is carried out at the stage of the company’s youth. If the company does not implement such a transition, it falls into the founder’s trap or into a family trap.

The Youth phase is characterized by a number of problems, such as: (i) a conflict between partners or decision makers, (ii) a temporary loss of vision, and (iii) unsystematic delegation of authority, so that rules are established but not observed. Other problems that occur include: (i) inadequacy of goals, (ii) suspension of the founder, and (iii) rapid weakening of mutual trust and respect.

The Flourishing phase is for the organizations that have reached the stage of “Go-Go” and create new companies, i.e., the new business units that have their own products, as well as their own production and marketing capacities. These new units can exist by themselves. Like an adult tree, a company that has reached blossom
also contains seeds of new companies in its fruits. They are not just new functions. They are new profit centers. An organization in the Flourishing period is typically a group of profit centers that share some functions for the sake of economies of scale, for the benefit of coordination, or for the sake of preserving the critical mass for further creation. Blossoming is an optimal state of the lifecycle, achieving a balance between self-control and flexibility.

Symptoms of an Aging organization emerging from a state of prosperity can be seen in its financial statements. Financial statements also help to detect a disease (i.e., a crisis) when abnormal symptoms appear in them, and we can only hope that they reveal the problems before these acquire the nature of pathology. The purpose is to identify the symptoms of impairment when there is still the possibility of preventive treatment of the disease.

At the Aristocracy phase, organizations reduce expectations of growth and show little interest in conquering new markets, mastering new technologies, and expanding borders. They focus on the past achievements rather than future vision, are suspicious of the changes, reward those who do what they are told to do, and are more interested in maintaining interpersonal relationships. Additionally, they spend money on control systems and the arrangement of premises, worry about how to do rather than what and why, and require adherence to traditions and compliance with formalities.

Companies that have reached the Salem City phase have the following characteristics. People focus on who created the problems, not on what to do to solve them. Problems are personified; instead of dealing with the problems of the organization, people are involved in conflicts, and accuse and discredit each other. Universal paranoia pervades the organization. Everyone is drawn into the unseen struggle; nobody has time to deal with the needs of the client.

At the phase of Bureaucracy, companies are unable to generate enough resources on their own. They justify their existence by the simple fact that they are of interest to another organization that is ready to provide its support. The organization needs an artificial life support system that allows the delaying of Death, and the provision of such a system is ensured by a political decision.

The four managerial roles that account for the above phases are: Producer (P), Administrator (A), Entrepreneur (E), and Integrator (I).

The PAEI (Adizes) Methodology

The tools of this systematic approach support the processes of internal, external, and internal–external integration. This therapeutic intervention assists in analyzing the state of the organizations, and determining, depending on their position on the lifecycle curve, what is likely to happen (i.e., predicting crises); see Fig. 1.5. The PAEI (Adizes) methodology outlines the therapy, depending on the stage of the lifecycle, which the organization itself implements. The structure of each organization, its management style, remuneration system, planning process, objectives, and other features may be desirable or destructive depending on the organization’s position on the lifecycle curve.

Fig. 1.5 PAEI lifecycle

Optimized Spiral Model

Every lifecycle stage of software system development can be optimized. To optimize the lifecycle, i.e., to adapt it for crisis conditions, a complex methodology is required. This complex methodology is a combination of a set of models, methods, CASE tools, and practices. The methodology is process-based, and it has six stages, each of which produces certain deliverables in terms of software product components and their connectors. At the most abstract level, these are key concepts of the product and certain relationships between these concepts. Next, high-level architectural modules and interfaces follow. These lifecycle stages are shown in Fig. 1.6.

Agility Matrix

Agility is related to balancing business requirements and technology constraints. Many local crises result from an imbalance between these two aspects. Hence, agility is essential to achieve a well-balanced software solution. Agility is a remedy for crisis. When a crisis occurs, agility is vital for any kind of business organization that wants to overcome the crisis. Agility should be present in every stage of the enterprise system lifecycle. In the case of an enterprise, its agility is a concept that incorporates the ideas of “flexibility, balance, adaptability, and coordination under one umbrella.” For each of the perspectives, we identify a set of business management levels, such as strategic decision-making [8] or everyday management. After we combine these perspectives and the business management levels, we get the enterprise agility matrix (Fig. 1.7); this matrix is a tool for monitoring, predicting and preventing crises.
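
The combination of perspectives and business management levels described above can be sketched as a small lookup structure. This is an illustrative assumption rather than the actual agility matrix of Fig. 1.7: the four perspective names simply reuse the ideas quoted in the text, only the two management levels named in the text are included, and the function names, the 0.0 to 1.0 rating scale, and the 0.5 threshold are hypothetical.

```python
from itertools import product

# Assumed axes: the text does not enumerate them in this form.
PERSPECTIVES = ("flexibility", "balance", "adaptability", "coordination")
LEVELS = ("strategic", "everyday")


def empty_matrix(default=None):
    """Create one cell per (perspective, level) pair."""
    return {cell: default for cell in product(PERSPECTIVES, LEVELS)}


def weak_cells(matrix, threshold=0.5):
    """Return rated cells below the threshold, worst first.

    Ratings are assumed to lie between 0.0 (poor) and 1.0 (good);
    unrated cells (None) are skipped.
    """
    rated = [(cell, rating) for cell, rating in matrix.items()
             if rating is not None and rating < threshold]
    return [cell for cell, _ in sorted(rated, key=lambda item: item[1])]


# Usage with hypothetical ratings: two cells signal a potential local crisis.
matrix = empty_matrix(default=1.0)
matrix[("balance", "strategic")] = 0.2
matrix[("coordination", "everyday")] = 0.4
warnings = weak_cells(matrix)
```

Listing the weakest cells first is one way to use such a matrix for monitoring: the cells that fall below the threshold point to where an imbalance is most likely to turn into a crisis.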

Fig. 1.6 Optimized spiral model

1.4 Conclusion: The Crisis Is Still Here

After 50 years, the IT crisis is still a critical issue. To address such crises, a specific discipline is required, and therefore we suggest the IT Crisisology framework (ITCF) to manage this issue. However, as material and IT product lifecycles are very different, crisis management requires a combination of methodology-based optimization (e.g., Agile, Scrum, XP, OpenUP, and ICONIX) and communication. This results in the conquering of crisis, as our ITCF approach is combined with a software development discipline, knowing that factors determining efficiency and productivity in the short and long terms are delivered and integrated according to a predictable model. Our ITCF approach incorporates the three “pillars” of technology, business, and human-related factors (T-, B-, and H-factors); it also includes a set of lifecycle-based models. These models include data, process, and system-related aspects. One way of presenting and balancing the above three aspects in crisis is the agility matrix; to address the data lifecycle specifically, we recommend using the optimized spiral model. To manage technology (and knowledge) transfer, the authors suggest using the enhanced Shannon’s information model, and “soft skills” tailoring based on the “seven principles.” The PAEI (Adizes) framework and agile methods are also used to predict crises, and dynamically adjust the “iron” triangle parameters in order to optimize the lifecycle based on mission-critical tradeoffs (e.g., ACDM/ATAM and SMART methods). Systemic application of the above-mentioned models, methods, principles, and practices under the umbrella of the ITCF results in disciplined, predictable, manageable, and adjustable software product development even in cases of volatile and uncertain crisis conditions.

Fig. 1.7 Agility matrix

References

1. https://www.jfklibrary.org/learn/about-jfk/life-of-john-f-kennedy/john-f-kennedy-quotations.
2. Colburn, A., Hsieh, J., Kehrt, M., & Kimball, A. (2008). There is no software engineering crisis. http://mercury.pr.erau.edu/~siewerts/cs332/documents/Papers/There-is-no-Software-Engineering-Crisis.pdf. Last accessed 25/05/2020.
3. Buettner, M., Peng, D., Suporn, P., & Richa, P. (2006). Software Engineering is indeed in a crisis.
4. Shannon, C. http://people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf.
5. Ambrose, S. A., et al. (2010). How Learning Works: Seven Research-Based Principles for Smart Teaching. Jossey-Bass.
6. Information regarding SMART is available at https://www.citethisforme.com/topic-ideas/other/SMART%20Goals-54246756. Last accessed 25/05/2020.
7. Lattanze, A. J. (2008). Architecting Software Intensive Systems: A Practitioners Guide. Auerbach Publications.
8. Saaty, T. L. (2014). Mathematical Principles of Decision Making. RWS Publications.

Chapter 2

Models and Methods of Crisis Management

Abstract This chapter focuses on crisis-responsive digital product development methods. Keeping in mind the findings of Chap. 1, the IT Crisisology framework treats lifecycle optimization as a high-level adjustment of the software development process. However, this was a set of preliminary considerations, which did not address particular approaches in terms of models and methods. Therefore, this chapter starts with an overview of lifecycle models applicable for flexible, i.e., agile, product development in crisis. The next sections discuss tradeoff-based crisis-resistant methods of optimized software production. Keywords Lifecycle model · Enterprise lifecycle · Waterfall model · Incremental model · Spiral model · Personal software process · Team software process · Six sigma

2.1 Introduction: Agility in Crises

This chapter presents a description of the lifecycle models and their application to crisis-responsive software development. We compare several types of lifecycle models such as the Waterfall, Incremental, and Spiral models, and a few others (the survey is largely based on [1]). For each model, we identify the strong and weak aspects in terms of crisis responsiveness, i.e., agility. Accordingly, the primary objective of this chapter is to determine the applicability of each lifecycle model (and certain human factor-centered methods, such as GEMBA and Crystal) to mission-critical software development, particularly in crises. The early software development lifecycles were often structured simply. For instance, the Waterfall model is straightforward and single path, as it does not allow backtracking to previous stages, and iterations are available within each stage only. However, the software crisis revealed that the lifecycle models often differ in terms of agility, and therefore, some of these approaches should be enhanced (such as the Spiral model that we addressed in the previous chapter) or combined. This problem was addressed by agile approaches, which in fact are human centered, and consequently, more flexible in terms of crisis responsiveness [2, 3].

© Springer Nature Singapore Pte Ltd. 2021
S. V. Zykov, IT Crisisology: Smart Crisis Management in Software Engineering, Smart Innovation, Systems and Technologies 210, https://doi.org/10.1007/978-981-33-4435-8_2


Research will demonstrate that every approach, whether it is a model, process, or method, has certain tradeoffs in terms of digital product quality. Therefore, a fit-for-all solution is a dream impossible to implement in reality. Instead of inventing such a utopian universal solution, developers should wisely compromise on the mission-critical product qualities to determine and follow an optimized product development path, one which blends carefully selected features of certain models, methods, and processes. Often, an “on the go” adjustment of this initially outlined path is required to develop the digital product in a crisis-responsive and agile way.

To illustrate the ways these methods are applicable to mission-critical development of IT-intensive businesses, we present a few case studies. These include a comparative study of publishers (Springer vs. IGI Global), and the Dodo Pizza production and delivery service. Both describe rapidly growing businesses that survived scalability crises. Further, we elaborate on the lifecycles regarding the PAEI framework by Adizes, which helps to detect and mitigate human-related issues that may result in development crises. We also discuss software development processes in more detail in terms of quality, as we analyze Six Sigma and Personal/Team Software Process approaches. In conclusion, we summarize the results of this chapter focusing on model, process, and method applicability to the IT Crisisology framework of responsive and agile software product development.

2.2 Lifecycle Models in the Crisisology Framework Software Development Lifecycles Field Analysis The software development lifecycle (SDLC) is a term referring to a series of steps required to deliver a software product to a customer [4–6]. The software development process is an infinite cycle that never ends. The first release of a software application typically is not final. In most cases, there are additional features and bug fixes to be designed, developed, and deployed. Methods and practices for software development evolved over the decades since the advent of software engineering [7]. These methods vary; however, their common goal is to develop software as cheaply and efficiently as possible. Regardless of specific methods, the software development process typically runs in cycles, starting over with each iteration. Team development of any software product is extremely difficult to carry out without a plan. Therefore, each software development methodology is a project plan framework. Without a justified and detailed plan, neither the developers of the product features nor the project managers have a clear estimate of the project progress, and the client has no understanding how their requirements will be achieved in the final product. The above vision of the SDLC provides the following benefits:

• Common vocabulary for each process step
• Established communication channels between client and developer teams
• Clear responsibilities and roles for project managers, designers, developers, etc.
• Prescribed inputs and outputs for each software development step
• Exit criteria for each step of digital product development.

The SDLC models are divided into several main groups, each with its strengths and weaknesses. The diversity of SDLC models is predetermined by the variety of product types, ranging from Web application development to complex enterprise software. The most important SDLC models are: Waterfall, Incremental (Iterative), Spiral, V-Shaped, and Agile.

Waterfall is an SDLC model in which the development process resembles a water flow, moving sequentially through the phases of analysis, design, implementation, testing, deployment, and maintenance. This SDLC model executes every stage completely before the next one begins. The process is strictly documented and predefined for every phase. Typical use cases for the Waterfall SDLC model are as follows:
• Requirements document complete
• Product defined
• Technology stack predefined.

Advantages:
• Simple to use and understand
• Management simplicity thanks to its rigidity: every phase has a defined result and process review
• Development stages go one by one
• Suitable for small or mid-sized projects where requirements are clear and not equivocal
• Easy to determine the key points in the development cycle
• Easy to classify and prioritize tasks

Disadvantages:
• The software is ready only after the last stage is complete
• High risks and uncertainty
• Inadequate choice for complex and object-oriented projects
• Inappropriate for long-term projects
• The progress of each stage is hard to measure while in development
• Integration is done at the very end, with a risk of an initially wrong focus

An Incremental model does not need the full list of requirements before the project starts. The development process may start with the high-level functional requirements, which can be expanded later. The process is repetitive, so that every cycle produces a new version of an operational product. Every iteration (which lasts from two to six weeks) includes the development of a separate component of the system, and after that, this component is added to the functions developed earlier. Use cases for the Incremental model are as follows:
• Requirements of the final product are predefined
• Key requirements are predefined, and the details may be elaborated later.

Advantages:
• Fast development at the beginning of the SDLC
• Concurrent development possible
• The progress is easily measurable
• The shorter the iteration, the easier the testing and debugging stages are
• Easier risk management, as high-risk tasks are completed first
• Risks detected within any iteration can be addressed in the next iterations
• Agility and flexible requirement changes

Disadvantages:
• Requires more resources than Waterfall
• Continuous management required
• Design flaws possible, as critical requirements may be overlooked in the brief planning stage
• Suboptimal choice for small projects
• Progress management is challenging
• Risk assessment complete only at the final stages of the project
• Risk assessment requires high expenses
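The incremental idea described above, where each iteration delivers one component that is added to the functions developed earlier, can be illustrated with a minimal Python sketch. This is an illustration only: the `Release` type, the `develop_incrementally` function, and the component names are hypothetical and do not come from any particular method or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """An operational product version delivered at the end of one iteration."""
    iteration: int
    components: tuple


def develop_incrementally(backlog):
    """Yield one release per iteration; each adds one component to the earlier ones."""
    delivered = []
    for iteration, component in enumerate(backlog, start=1):
        delivered.append(component)  # the new component joins the earlier functions
        yield Release(iteration, tuple(delivered))


if __name__ == "__main__":
    # Hypothetical backlog: key requirements first, details elaborated later.
    for release in develop_incrementally(["catalog", "cart", "payments"]):
        print(release.iteration, release.components)
```

Every release in this sketch is a complete, usable version, which is what makes progress easy to measure from one iteration to the next.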

The Spiral model is an SDLC model that combines the Incremental and Waterfall models with an emphasis on risk analysis. The main challenge of the Spiral model is defining the right moment to start the next stage. Preset timeframes are recommended as a solution to this issue. Each stage is planned, and this plan is based on the data from previous projects and the developers’ experience. Typical use cases for the Spiral model include:
• Customer is uncertain about the requirements
• Major changes are expected during the development cycle
• Projects with mid- or high-level risk, where it is important to prevent these risks
• A new product that should be released in a few stages to have enough client feedback.

Advantages:
• Lifecycle is divided into small parts, and if the risk concentration is higher, the phase can be finished earlier to address the threats
• The development process is precisely documented yet scalable to changes
• Scalability allows changes and new functionality to be added even at relatively late stages
• Early working prototype: users can detect product flaws

Disadvantages:
• Can be expensive
• The risk control demands involvement of highly skilled professionals
• Can be inefficient for small projects
• Large number of the intermediate stages requires excessive documentation

The V-Shaped model is an expansion of the classic Waterfall model, and it is based on an associated testing stage for every development stage. This is a rather strict model, and the next stage starts only after the previous one is complete. This is also called the “Validation and Verification” model. Every stage has process control to make sure that the next stage is possible. Use cases for the V-Shaped model are as follows:
• Projects where accurate product testing is required
• Small and mid-sized projects where requirements are strictly predefined
• Qualified engineers, especially testers, are within easy reach.

Advantages:
• Each stage has well-defined deliverables, and is easily manageable
• Testing and verification at early stages
• Acceptable for small projects with static and clear requirements

Disadvantages:
• Low flexibility
• Suboptimal for small projects
• Relatively high risks

In the Agile methodology, after every development iteration, the customer is able to see the result and decide whether they are satisfied with it. This is a merit of agile software development. One of its disadvantages is that, in the absence of rigorously defined requirements, it is difficult to estimate the resources and development cost. The basis of such an approach consists of short, often weekly, iterations called sprints, which are also a part of Scrum. Typical use cases for the Agile methodology include:
• User requirements change dynamically
• Lower price for the changes implemented because of multiple iterations
• Unlike Waterfall, the Agile approach requires only initial planning to start the project.

Advantages:
• Corrections of functional requirements are implemented into the development process to provide competitiveness
• Project is divided into short and transparent iterations
• Risks are minimized due to the flexible process
• Fast release of the first product version

Disadvantages:
• Difficulties with measuring the final cost because of permanent changes
• The team should be highly professional and client-oriented
• New requirements may conflict with the existing architecture
• With all the corrections and changes, the project may exceed the estimated time

There are seven common steps, which are similar for the above-mentioned models and methodologies. Typically, they occur in order, but can be mixed together, such that some of the steps may occur in parallel. Outputs from the previous step become inputs to the following one. Those steps are:

1. Planning
2. Gathering requirements
3. Design and prototyping
4. Software development
5. Testing
6. Deployment
7. Maintenance.
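The way the output of one step becomes the input of the next can be illustrated with a minimal Python sketch. The step names come from the list above; the `Artifact` type, the `run_step` and `run_lifecycle` functions, and the idea of recording a step history are hypothetical and serve illustration only. The sketch covers the strictly ordered case and does not model steps running in parallel.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Step names taken from the list above; everything else is illustrative.
SDLC_STEPS = [
    "Planning",
    "Gathering requirements",
    "Design and prototyping",
    "Software development",
    "Testing",
    "Deployment",
    "Maintenance",
]


@dataclass
class Artifact:
    """Hypothetical container for what one step hands over to the next."""
    produced_by: str
    history: List[str] = field(default_factory=list)


def run_step(step: str, incoming: Optional[Artifact]) -> Artifact:
    """Consume the previous step's output and produce this step's output."""
    history = list(incoming.history) if incoming else []
    history.append(step)
    return Artifact(produced_by=step, history=history)


def run_lifecycle(steps=SDLC_STEPS) -> Artifact:
    """Run the steps in order, chaining each output to the next input."""
    artifact = None
    for step in steps:
        artifact = run_step(step, artifact)
    return artifact


if __name__ == "__main__":
    final = run_lifecycle()
    print(" -> ".join(final.history))
```

Since the development process runs in cycles, a fuller sketch would feed the output of Maintenance back into a new Planning step; that loop is omitted here for brevity.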

1. The Planning phase involves project and product management and may include:
(a) Estimation of cost
(b) Provisioning
(c) Scheduling
(d) Capacity planning
(e) Resource allocation.

The planning phase outputs include cost estimations, procurement requirements, project plans and schedules. Project managers and development staff should collaborate with operations teams to ensure that all perspectives are represented.

2. In the Gathering Requirements phase, IT teams must communicate with businesses to establish requirements for development and enhancement. The requirements are gathered from stakeholders and subject matter experts. Product managers and development teams document the business processes that need automation through software. Typical outputs are either definitive documents with a list of requirements, or backlogs.

3. In the Design and Prototyping phase, software developers and architects begin to design the software according to the requirements gathered in the previous step. The design process includes patterns for software development and application architecture. Sometimes an architecture framework is used to compose an application from already existing components, promoting standardization and reuse. Proven design patterns are used by developers to solve algorithmic problems in a consistent way. This phase may include rapid prototyping. The outputs of this phase are:
• Design documents with a list of components and patterns selected
• Code produced during rapid prototyping as a starting point in development.

4. The Software Development phase produces the software. This phase may be split into timed “sprints” or processed discretely. Regardless of the methodology, development teams work to produce operational software with as little time and effort as possible. Stakeholders should be regularly engaged to ensure that expectations are met. The output of this phase is fully functional software ready for testing.

5. The Testing phase is arguably the most important one. Without testing, it is impossible to deliver quality software. A wide variety of testing types is necessary for quality assessment; these include:
• Functional
• Integration
• Performance
• Security.

6. The Deployment phase is ideally highly automated and (in large enterprises) is almost invisible. A software product is deployed as soon as it is ready. Some companies require manual approvals at this stage. Application release automation is used as a part of continuous integration. The output of this phase is the release version of operational software.

7. Maintenance is a very important phase. Software must be under constant monitoring to ensure correct operation. Defects and bugs discovered must be reported and responded to.

Regardless of the types of methods being used, there are certain basic software development practices:
• It is hard to imagine software development without Source Control. Those few development teams who do not use it are taking a risk regarding both the source code and the development process. Source control mitigates this risk by ensuring that the deliverables are regularly assembled in a dedicated location. As such, if a user’s workspace or file server malfunctions, the central repository (often with multiple backups on other repositories) is there to save the day. Many source control systems also support continuous integration.
• The main purpose of continuous integration is to keep the software in an operating state as much of the time as possible. Prior to continuous integration, software development teams would write hundreds or even thousands of lines of code and then attempt to integrate all of these into the system, which was laborious, time-consuming, and prone to errors. Currently, when a problem occurs, the continuous integration system alerts the developers as soon as “bad” code is committed. If proper continuous integration practices are followed, no further work is done until all errors are fixed and the build is successful. This method prevents many defects and contributes to the final software product being as defect-free as possible.
• Software lifecycle management systems are large-scale and complex software development toolkits.
These systems offer an all-in-one package of bug tracking, activity management, and analytics, which assists in decision-making.

Crystal Methods Family

The previous section presented general information about software development lifecycles. Now, it is time to see how Agile, as an improvement of Waterfall, can potentially be further improved. Therefore, we introduce the Crystal Methods agile framework family, which addresses the differences in characteristics and team sizes between projects. Those methods come from years of study and interviews of teams by Cockburn [8]. His research showed that the teams he interviewed did not follow the formal methodologies, yet they delivered successful projects. The Crystal family is a way of cataloging what those teams did that made the projects successful. Crystal Methods consist of agile methodologies such as Crystal Clear, Crystal Yellow, Crystal Orange, etc. Cockburn, one of the authors of the Agile Manifesto for Software Development, introduced this set of methodologies. Crystal Methods aim to fit into many different projects. Two main factors determining the characteristics of these different projects were introduced: team size and system criticality. The “criticality” factor (i.e., crisis) is the level of potential damage the system can cause if it does not perform as designed. There are four levels of criticality in Crystal:
• Comfort (C)
• Discretionary money (D)
• Essential money (E)
• Life (L).

The maximum number of people that can be assigned to a specific project, as well as the number of team roles, depends on the size of the project. Large projects involve many different roles, whereas small projects need only a few.
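The two factors, team size and criticality, can be combined into a simple project descriptor, as in the following Python sketch. It rests on several assumptions: the four criticality levels and their letters come from the list above, while the numeric ordering (taken to mean increasing potential damage), the `Project` type, the letter-plus-team-size label such as `E20`, and the `more_demanding` helper are illustrative choices rather than definitions given in this chapter.

```python
from dataclasses import dataclass
from enum import IntEnum


class Criticality(IntEnum):
    """Crystal criticality levels; the order assumes increasing potential damage."""
    COMFORT = 1
    DISCRETIONARY_MONEY = 2
    ESSENTIAL_MONEY = 3
    LIFE = 4

    @property
    def letter(self) -> str:
        """One-letter code as given in the list above."""
        return {1: "C", 2: "D", 3: "E", 4: "L"}[self.value]


@dataclass(frozen=True)
class Project:
    """Hypothetical project descriptor built from the two Crystal factors."""
    name: str
    team_size: int
    criticality: Criticality

    def label(self) -> str:
        """Illustrative label: criticality letter followed by team size."""
        return f"{self.criticality.letter}{self.team_size}"


def more_demanding(a: Project, b: Project) -> Project:
    """Return the project with higher criticality, then the larger team."""
    return max(a, b, key=lambda p: (p.criticality, p.team_size))


if __name__ == "__main__":
    billing = Project("billing", 20, Criticality.ESSENTIAL_MONEY)
    wiki = Project("intranet wiki", 6, Criticality.COMFORT)
    print(billing.label(), wiki.label(), more_demanding(billing, wiki).name)
```

A descriptor of this kind makes it explicit that a small team working on a life-critical system and a large team working on a comfort-level system call for different degrees of process rigor.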

2.3 Using Case Method in Crises To better comprehend and manage crisis-related issues, we start a set of case studies. We firmly believe these are a principal method to master decision-making and analytical abilities, which are central in crisis management. As we already mentioned, Gill argues that case studies are particularly important as they assist in better informing and addressing complexity [9]. He says that “… case studies … are … most rigorous … where the fundamental test of our research value is its applicability to practice” [9]. This is so because the case studies typically address the top Bloom’s taxonomy levels including analysis, synthesis, and evaluation [10]. Below, we present two case studies, which address company digital transformation in a crisis. The first case study compares two very different yet successful publishers, Springer and IGI; the second one investigates an IT intensive business of a rapidly growing pizza delivery network.


2.3.1 Publishers: Springer Versus IGI

Springer: History and Achievements

The Springer company started nearly 200 years ago, in 1842. This happened in Berlin, Germany, where Julius Springer, aged 25, launched his own bookstore and publishing house. On this long path, Springer successfully passed a number of rough turns. These included the hard times of the two World Wars, and several revolutions in economics and production: from industrialization to the Industry 4.0 society. The company logo was the knight chess piece, which reflected not only the founder’s passion for this old game, but also his strategic thinking and business agility [11].

At first (from the start until the early 1900s), Springer was a typical family business. At this early stage, the world was changing rapidly due to industrialization. The changes were based on emerging R&D innovations, as the community started to globalize. The new “Julius Springer Publishing House” responded to these new vital trends in an agile way; from political caricatures and treatises, they quickly turned to natural sciences and engineering. This proactive move accelerated the change and was an essential contribution toward the present-day knowledge-based society.

To keep up with the rapid pace of society’s progress, Springer’s next move was further shifting the scope toward cutting-edge technical and engineering topics. To reach this goal, Springer:
1. contracted top authors (such as Diesel and Siemens), and
2. acquired a few important, technically focused journals (such as VDI).

The Knight’s other moves at this opening stage included expanding this focus toward medicine and natural sciences. Again, these moves were fortified by inviting the most prominent and influential authors (such as Marie Curie).

Strange as it may seem, before 1945 Springer published entirely in German. This was so because the company’s primary focus was the domestic market, even though internationalization was a clear trend the world over.
However, the Knight’s strategy was to gain domination in the homeland before starting expansion across the world. To achieve this, Springer used the proven strategy of fully utilizing the power of authors’ and editors’ expertise, including such prominent figures as the Nobel Prize winners Ehrlich and Fischer, as well as Zeppelin. This strategy soon paid off, as Springer became the #2 publisher in Germany in 1913.

To gain domination in Germany, another proven strategy was further portfolio expansion; this used a couple of established techniques: (i) focusing on medicine, natural sciences and engineering, and (ii) aggressive acquisitions. This opening strategy, however, included a novel move: interdisciplinary periodicals such as The Science of Nature, inspired by the famous British journal Nature. This rich portfolio, focused on the best quality content and using proven techniques, helped the company to survive the First World War. In fact, Springer’s portfolio grew due to war-focused topics.

After the war, Springer grew steadily as their acquisition strategy included top content and journals in mathematics and physics (such as Mathematische Annalen,


and Handbuch der Physik), authored and edited by such prominent researchers as Hilbert, Einstein, and Schrödinger, to name a few.

The hard times started after 1933, when the fascists came to power. At that time, at least fifty Jewish editors-in-chief of Springer journals had to quit, and the business was transferred to a trusted friend of the family.

In the 50 years after World War II, the third generation of Springer publishers reconstructed the venture. Many of their strategies were straightforward, as they inherited the successful techniques of their parents and grandparents. However, after conquering Germany, the new generation of Springer added new strategies aimed at globalization and an IT focus. Following these guidelines, the Knight “jumped” to Heidelberg to establish an office there and fortify their position in Germany. The idea was to separate concerns, focusing on medical and natural sciences in Heidelberg and on general management in Berlin. In 15 years, Springer regained their prewar peak production levels. As of 1988, their German staff exceeded 1,000.

Concerning the globalization trend, Springer decided to increase their presence worldwide. The reason was the shift of the research centers and standards from Europe to the USA, and from German to English as the primary language for research publications. Therefore, following this strategy, in 1964 Springer opened their first foreign office in New York City (see Annex A). In the next 30 years, they added a number of subsidiaries in Great Britain, Japan, France, China, Spain, Hungary, and India. Asia was not only a prospective market, but also a source of a relatively cheaper yet qualified workforce, and a new emerging research center. The next Knight moves, therefore, were printing Chinese and Indian publications translated into English. Of course, Springer continued the acquisitions of publishers in Germany and abroad (a few examples are Birkhäuser and Lehmann).
In the 1990s, a new crisis exploded. To some extent, this was comparable to that of World War II. The nature of this new crisis was very different though, with the root cause being e-books and digital publishing. Shifting to this new paradigm clearly required establishing new processes while rearranging priorities and resources. Whereas a number of famous publishers suffered revenue cuts and even quit their businesses, Springer reacted to this new challenge in an agile way. The reason was that the technical focus of the company made it better prepared for the digital age in terms of innovative technologies, including state-of-the-art computer-aided facilities and processes. The result was not only retaining market share but also continuing the business growth.

The other strategy inherited from the older Springer generations was the separation of concerns. Following this strategy, the Knight jumped to Chennai, India, to launch a subsidiary specifically for production and copyediting. The Chennai office shortly became an outsourcing partner, providing these services to other publishers. Springer optimized their business processes and outsourced them wherever possible, so that they became agile and responsive to the new digital challenges.

However, agility and responsiveness as such were insufficient measures for conquering the new crisis. Adapting to digitalization required huge investments


and clearly was not all smooth sailing. With low growth rates, rocketing expenses for marketing and shipping, a shaky core business due to local Asian crises, and a number of other threats, Springer raised subscription prices even for their VIP clients, including large libraries. These local risks were clear evidence of a more global crisis to come, and many of the older generation techniques were not quite applicable at that point.

The new transformation to fit the digital age started in 1996 and required nearly 15 years of continuous and intensive innovation. The Knight’s first move was establishing SpringerLink, a pioneering online platform for instant reading and purchasing of research publications. This was a response to the publicly sponsored electronic information services emerging at that time. Digitally offering a wide selection of their journal portfolio, Springer shortly became known and valued as a leading online library provider in science and technology. This first SpringerLink move accelerated and standardized dissemination of research and academic electronic publications (particularly e-journals and e-books). Eventually, this resulted in global market leadership for the company. The second move was re-optimizing the entire stack of business processes (including but not limited to acquisition, editing, production, sales, marketing, and shipping) to become more agile and client-focused, and to meet the new digital standards.

The millennium year brought a number of mergers (with Bertelsmann, Cinven & Candover, and Kluwer) in a 5-year period. Due to high-quality personnel and efficient processes, the impact on Springer’s human resources was negligible. The new name, Springer Science + Business Media, retained the famous original brand; the group ranked as the world’s #2 academic publishing house (with over 5,000 employees in 19 countries, and 1,450 journals and some 5,000 books per year by 2006).
The new group restructured, updated, and fine-tuned its processes. The principle of separating concerns made the Indian subsidiary even more focused on outsourcing. These mergers updated the Springer logo (see also Annex A); however, the Knight was still clearly recognizable.

The Knight’s recent moves were aimed at expanding their digital publishing presence. These included:
1. Enriching the e-portfolio (this equalized revenues from digital and conventional publishing)
2. The Open Choice model that combined the benefits of Open Access and traditional ways (this new approach appeared to be very efficient as the choice was the decision of the author)
3. SpringerLink for e-journals (this pioneering platform became the de facto standard in e-publishing)
4. Springer eBook collections (this started high, with 10,000 titles)
5. Springer Book Archives that digitized the older Springer publications (this included several thousand books featuring all the precious volumes of 1842)
6. Springer Open portfolio (this made the company the Open Access leader).

In the last decade, Springer has been wisely combining the conventional printed and new electronic media formats. Their strategies and techniques include:


1. Traditional separation of concerns in terms of subject areas
2. Personal and trusting relationships between authors, editors, and publishers
3. Proactive search for promising researchers, prospective research fields, and locations
4. Optimizing business for print-on-demand processes
5. Package-based sale of e-publications
6. Striving for quality by means of peer reviews from proven experts.

Even in crisis times, the above strategies and techniques, many of which were inherited from the founder, Julius Springer, helped the company to overcome hardships and challenges, and establish world leadership in educational and professional publishing in a few areas. After a major merger with Macmillan Science and Education, the group’s current name is Springer Nature. This new brand united two very old and famous research-oriented publishers. Their staff number is over 13,000, and revenues exceed EUR 1.5 bln.

IGI Global: Timeline and Results

The IGI Global company is relatively young; it was founded in 1988 and recently celebrated its 30th anniversary. Their headquarters is located in the small city of Hershey (population approx. 15,000), Pennsylvania, USA. On this relatively short journey, IGI had a few transformations. These included shifting from technical to multidisciplinary research, utilizing the power of visionaries in these fields, strictly following an ethics code and quality standards, and implementing recent technological advances [12]. The company’s slogan is: “Publisher of Peer-Reviewed, Timely, and Innovative Academic Research”.

IGI Global’s founder, Dr. Mehdi Khosrow-Pour, D.B.A., is an academician with a passion for research and innovation. His idea for kicking off this venture (the acronym IGI means “Idea Group Inc.”, which was their initial name) was that every piece of research is worth publishing, as it is innovative and contributes significantly to prior knowledge in a certain subject area. That is, not only prominent but also aspiring and prospective researchers deserve world-class publishing. Therefore, the primary areas covered by IGI Global include:
• Business and Management
• Computer Science and IT
• Education and Engineering
• Environmental, Agricultural, and Physical Sciences
• Government and Law
• Library and Information Science
• Media and Communications
• Medical, Healthcare, and Life Sciences
• Security and Forensics
• Social Sciences and Humanities.


In terms of coverage, the approach of IGI Global is much wider than that of Springer, and it looks more global. Let us see how the company evolved, and which strategies, principles, and practices they used, to become a world recognized and successful brand. The start in 1988 was small: the new company published journals and books on management, primarily focusing on technical subjects, such as information and IT resources, and their utilization. In about 10 years, they added case studies. Their first casebook, based on an already approved subject area, was titled: “Cases on IT Management in Modern Organizations”. With the millennium, the e-publishing imperative emerged. IGI’s reaction was immediate, releasing e-journals (1999), and publishing e-books based on their own brand new InfoSci® Online Database Platform (2002). The moves to embrace a global audience included launching new or rare types of media/publications, such as reference books (1999), dictionaries (2006), and encyclopedias (2005). The subject areas were predominantly the well-approved IT and management domains. Later add-ons to support IGI’s business in the new digital era included: 1. eEditorial Discovery® for manuscript submission 2. InfoSci Dictionary mobile application 3. Hybrid Open Access for all their journals. Instead of building expensive offices worldwide, IGI’s management decided to outsource the sales of the hard copies and e-books/journals to more than 40 local distributors in more than 20 countries. The e-publications are also sold through major e-stores such as Amazon.com, bn.com (Barnes and Noble), etc. Currently, the online database for IGI’s 4,500 e-books exceeds 82,000 chapters, and the database for their 175 e-journals includes some 21,000 articles. Full downloads and full text search are provided. IGI Global’s strategies and principles include: • • • • •

Close collaboration with researchers worldwide Following high ethical practices Meeting state-of-the-art standards Establishing and following agile publishing processes Being customer centric. To implement the above, they apply the following practices:

• • • •

Rigorous, high-quality peer review Transparent publishing process Fast and efficient dissemination of the research results worldwide Focus on new trends and emerging research within prospective areas. The key factor of IGI’s success is addressing diversity. They do this by means of:

• Versatile subject areas


• Different types of publications (reference/authored/edited books, encyclopedias, casebooks, and journals)
• Various e-media (mobile content, e-books, e-journals, online courses, video, etc.).

The other key factor of IGI’s success is establishing trust. They do this by means of:
• Recommendation through libraries, institutions, and colleagues
• Using approved and high-quality reviewers
• Following ethics codes, including COPE (i.e., Committee on Publication Ethics)
• Meeting high-quality publication standards
• Providing Open Access to certain kinds of publications
• Offering specific access agreements to the authors (such as private use, translations use, and non-profit education).

In their relatively short 30-year business history, the company managed to gain success: it has survived a highly competitive race, offers unique products, assists young and promising researchers, has become an international group, and addresses diversity in many aspects. The primary reasons for this are agility and customer focus.

Questions for Discussion

1. The company profiles are very different. What makes them similar except the field of business?
2. How did the two companies survive in crises? What additional steps would you recommend for them to survive?
3. Which business principles did Julius Springer set up? Which of them worked, and why?
4. Which business principles of Julius Springer did not work? Why? How did his descendants adapt these? What else would you recommend?
5. Do you think trust is important in (publishing) business? Why? How does it work?
6. If you build your own business, will you address diversity? If yes, how? If not, why?
7. How would you adapt your company to the new digital era? Are the lessons and techniques of Springer and IGI helpful? If yes, how? If not, why?
8. Which additional principles and practices of business would you recommend to a publishing company to become even more successful than Springer/IGI?
9. How would you solve the tradeoff between accessibility and profitability in publishing?
10. How would you ensure the process quality of the products for a publishing company?
11. For a publishing company, which business processes would you outsource? Why?
12. As a new venture manager (such as IGI some 30 years ago), how would you enter a new market with a key publisher (e.g., Springer) dominating already?


2.3.2 Pizza Makers: Dodo Pizza Scales up

This case study presents the scale-up story of Dodo Pizza, a famous and very rapidly growing Russian (and recently international) pizza maker and delivery service. Surprisingly, their business is heavily IT-focused, unlike that of their competitors, such as Papa John’s and Domino’s. The case study discusses resource optimization by means of digital product improvement in the crisis of extremely rapid company growth.

Unhappy New Year

It was snowing heavily; Nikita Poloznikov, the Dodo Pizza IT Director, parked his car by the house, and entered his apartment. He was back home from work, late in the evening. OK, this was natural for a top manager. The New Year approached; however, the holiday atmosphere was not yet there as the business reported a set of problems that required immediate attention. His current IT project with the Dodo Pizza Company had launched very well indeed. Of course, he remembered the day it started, April 23, 2011. The first pizzeria was located in the city of Syktyvkar, Komi Republic, in the Northern part of Russia. That very first month brought 820 orders; by the end of the year the monthly number of orders rocketed to 3,353, a very impressive growth rate exceeding 300%. This inspiring growth was based on the principles established by 30-year-old Fyodor Ovchinnikov, Dodo’s founder and CEO. Among these were:
• Best team → Best product → Best process
• Agility
• Transparency
• High-quality standard
• Fast delivery
• Always assisting clients.
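The first-year growth figure quoted in this case (820 monthly orders at launch, 3,353 by the end of the year) can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name is hypothetical and not part of any Dodo software.

```python
def growth_percent(first: float, last: float) -> float:
    """Relative growth from `first` to `last`, expressed in percent."""
    if first <= 0:
        raise ValueError("the starting value must be positive")
    return (last - first) / first * 100.0


# Figures taken from the case text: monthly orders in the first month
# and at the end of the first year.
first_month_orders = 820
year_end_orders = 3353

print(f"growth: {growth_percent(first_month_orders, year_end_orders):.1f}%")
```

The result is slightly above 300%, which agrees with the growth rate stated in the case.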

Fyodor tested every critical business aspect himself by taking a bottom-up approach. After the collapse of the Brainpower bookstore network he owned before the crisis of 2008, Fyodor decided to start another venture, a client-focused quality pizza production and express delivery. To study the competitors’ processes, detect their bottlenecks, and learn from their strengths, he spent a few months as a kitchen worker at Papa John’s and Domino’s. Dodo’s strong side was their innovative IT-focused approach, and they believed that their information system was very smart. Yes, this was true when compared to Papa John’s and Domino’s. Domino’s did not even consider IT to be an important factor; and their software was clearly outdated in terms of architecture, technology and usability. Dodo’s, as a start-up, was rocketing fast. In a year, the second pizzeria opened in the same city; three months later, the first franchisee-based Dodo’s pizzeria opened in another Russian city. The quarter ending December 2012 brought up eight production and delivery points in different locations; this presented the IT infrastructure (and particularly its scalability) with a number of serious risks.


As the business scaled up, it appeared too hard to stay client-oriented in the same way as before. The geographical diversity left little space for the old IT system (named DodoIS), which was clearly becoming outdated, somewhat slow and non-agile [13]. The current DodoIS was heavily iOS client-focused (as the pizza-making staff enjoyed high-quality, durable iOS tablets). However, as the business was scaling up due to franchising, the CIO’s initial idea changed to providing a wider range of compatible client-side end user devices (such as Android and even Microsoft-based), while maintaining the operating speed and quality of service comparable to the first Dodo’s pizzeria.

Another problem was efficient data consolidation in real time. This was mission-critical to remain transparent, as it was both Dodo’s key competitive advantage and a strong client attractor. Moreover, transparency was among the core principles of the Dodo’s business strategy; this required continuous and prompt feedback. However, as more pizzerias started, the feedback from DodoIS included aggressively negative statements. Some of these opinions (both from the employees and clients) were:
• “Dirt, pain and agony”
• “The new tracking is disappointing and annoying”
• “The update looks disastrous”.

The above feedback criticized the new version of the order tracking software. OK, any update is initially far from ideal as many improvements happen later, in its maintenance phase. However, at some point in time, the previous version of the DodoIS, which was at the heart of the business, suddenly became far from a smooth ride. This system, and particularly its order tracking module that formerly allowed the reporting of KPIs (such as revenues and number of pizzas made) in almost real time, was Nikita’s invention and a source of his pride. This tracking module allowed the instant propagation of any order (delivery, takeaway or “for here”) to the tablet PCs installed in the kitchen.

Of course, the module inventor and IT geek was not happy to hear such feedback after all the work he did in the first months (and even years) of their success. However, this time he could not rest on his laurels.

Background

The order tracking system started operating at the beginning of 2012; this was actually a very early stage of Dodo’s initial business start-up. From these early days of the venture, the order tracking system was the “heart” of the DodoIS, which in turn, was clearly the brainpower behind the business. At that time, the company owned their single, delivery-only pizzeria in Pervomayskaya St., Syktyvkar city. In 2012, the daily orders were counted in thousands with a typical value of 3,000 rubles (approx. USD 100). However, as the order number increased fivefold by mid-2016, this imposed new constraints on the performance. At this new operating speed, when the initial combustion engine of the business started to require jet performance, even small user interface flaws and minor architecture/algorithm issues would cause a serious headache.
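Why a fivefold increase in orders turns a minor flaw into a "serious headache" can be illustrated with a minimal single-queue sketch. It assumes one kitchen screen that handles orders one by one; the arrival intervals and per-order handling time below are invented for illustration and are not Dodo’s measurements.

```python
def last_order_wait(orders: int, interarrival_s: float, service_s: float) -> float:
    """Waiting time (seconds) of the last order in a first-come, first-served queue.

    Orders arrive every `interarrival_s` seconds; each one occupies the
    screen for `service_s` seconds (handling plus any interface delay).
    """
    free_at = 0.0   # moment when the screen becomes free again
    wait = 0.0
    for i in range(orders):
        arrival = i * interarrival_s
        start = max(arrival, free_at)   # wait if the screen is still busy
        wait = start - arrival
        free_at = start + service_s
    return wait


# Hypothetical numbers: 11 s per order (8 s handling + 3 s interface delay).
print(last_order_wait(60, interarrival_s=50, service_s=11))  # slow period
print(last_order_wait(60, interarrival_s=10, service_s=11))  # five times more orders
```

In this toy model the delay is invisible while orders arrive more slowly than they are handled, and it accumulates order after order once the arrival rate exceeds the handling rate.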


Performance in terms of one local pizzeria was not the only problem. The number of pizzerias to control simultaneously already exceeded one hundred; this meant that the old DodoIS had to be optimized for more efficient data tracking, collecting, and aggregating. This would provide better KPI monitoring, computing, and adjusting, and as a result, probably restore Dodo’s standing in the market competition. Another critical issue was also performance-related; clearly, this was scalability. Nikita understood that at this point the DodoIS users’ feedback had revealed the mission-critical issues, which required immediate attention. It was beneficial that this risk was identified before it became a problem. Actually, it was the DodoIS itself that generated the warning. However, the situation required a clear plan, to be developed in a very limited time. Practically, Nikita had a couple of days at most to complete it and to start its implementation. This plan should be realistic and operational for the small IT team he managed.

The Plan

As every software product can be separated into two clearly distinct parts, interface and business logic, Nikita’s plan also included these two aspects. As the overall system complexity (in terms of interacting hardware, data size, etc.) increased dramatically, by nearly two orders of magnitude, this separation idea seemed very helpful. So, Dodo’s CIO decided to separate the concerns and use the “divide-and-conquer” strategy by delegating the tasks to dedicated team members. Even reducing the complexity by half would be a very helpful step. Thereby, the CIO decided to delegate these tasks to his comrades-in-arms, Mr. Alexander Shibanov and Mr. Dmitry Pirzyaev, the lead IT engineers, with instructions to improve the two above-mentioned critical areas. Alexander and Dmitry represented the famous IT development team from the Russian city of Kolomna, in the Moscow region. They were in the DodoIS project from day one, and suggested quite a number of successful ideas and critical improvements from this point moving forward.

However, this time the case looked far more complex. This was because it was not just a routine performance tuning or GUI shape-up. At this point, the software system also needed a number of critical architectural improvements. First of all, the system should become more decentralized, client-focused and service-oriented. Secondly, it should smoothly scale up without any sharp performance decrease. Thirdly, the system should support the following global requirements:
• Loose coupling
• Services
• Performance
• Portability
• Usability.

This was quite a number of attributes, which clearly had tradeoffs. For example, better portability would typically slow down performance even further. However, the new franchising approach required this portability as a “loadbearing wall,” or a must-have business requirement. The previous iPad-focused client-side software, being efficient enough for small-sized applications, lacked versatility in terms of operating platform (i.e., OS) compatibility. This was a clear technological constraint: the new system should support not only iOS but also Android (and possibly Windows Mobile). Ideally, it should be easily portable to virtually any operating platform. Also, some of the above attributes would include subdivisions. For instance, usability would feature a user-friendly GUI as an integral part. Though the overall outline was clear, the plan would certainly require a long time to implement, as the IT staff did not directly participate in the pizza production process. That is why Nikita decided to take on the GEMBA practice, and to ask Alex and Dmitry to do the same.

“GEMBA”: Top Managers in the Kitchen

In Dodo’s, the CEO followed the same principle that he learned from extremely efficient Japanese businesses such as Toyota and Panasonic: to master the process, the manager should regularly participate as a worker himself (“GEMBA”). This practice was later established in Dodo’s (first for top-level management). However, the top IT staff did not quite follow this pattern as they thought they were too busy, and technology-oriented rather than specifically business and/or client-focused. However, this time the IT team clearly faced a local software development crisis as the KPIs, resources, business requirements, and technology constraints were unbalanced and needed a fast and efficient tradeoff analysis and improvement. Therefore, Nikita had to join the kitchen team in their daily activities to understand the critical bottlenecks and quickly outline a practically efficient and operational software improvement plan for the order tracking system. This would become a part of the bigger DodoIS software improvement plan. Nikita decided that Alex and Dmitry must also attend the kitchen “interview” after him.
Once again, this was an established top management practice at Dodo’s before making any critical decision. To make each of the team members grasp every critical issue of the production process, he decided to spend an entire business day there. As the actual business day started at 9:00, Nikita’s GEMBA began early in the morning. For this, each of the three obtained the required medical certificate and joined the kitchen team. Nikita’s duties included washing dishes and containers, cutting the forms, and rolling the dough. Meanwhile, Dodo’s CIO carefully monitored the software system operation. Surprisingly, he easily detected the major flaws and critical inconsistencies.

Issues Detected

The new orders appeared late on the kitchen monitor screens. Each time after tapping on the order, it took a while before the next order appeared, as the software reloaded the screen upon each information update. This delay lasted for a few seconds; therefore, for a single order, this looked just fine. Even for a few orders during an off-peak period, this delay was not critical. However, in the peak hours, it was not just a user’s impression that these few moments seemed to last eternally. These small delays, chained together, resulted in a local crisis, as they clearly caused a critical bottleneck. Another issue, often related to the previous one, was that the monitor screens often froze and did not respond to the taps instantly, as required and expected. This often caused nervous retapping, which resulted in more delays and occasionally led to a system collapse. As soon as the critical errors were identified, a high-level architectural system layout became clear.

The New System Requirements

The new system should be more decentralized; it should be based on standard services and protocols, and accessible from anywhere. A clear client-side solution was a web browser. However, which server infrastructure would better support that lightweight client? This was an open issue. As for the graphical interface improvements, the new GUI should consist of a minimum number of screens to complete each process faster, and each screen ideally should be:
1. Easily identifiable as a part of its specific chain, such as delivery order or special offer
2. Clearly different from any other patterns.

The new system adopted a service-oriented architecture. Moreover, it was cloud-based for better scalability and load balancing in peak hours; it had flexibility in terms of decentralization, along with more portability and instant deployment at any new location. A few months passed before the new solution was developed, tested and deployed. However, the initial internal updating and testing helped to solve a number of issues. As the issues detected were mission-critical, the new cloud-based DodoIS was quickly deployed in all the company subsidiaries.

The New System: From Update to Operation

The pilot operation revealed that the system improved significantly. However, quite a number of end users were unhappy with the new order tracking process. It is natural that users stick to their typical behavior patterns and rarely like to adopt anything new.
The users must be trained to efficiently use any new system; as a rule, this is when its performance starts paying off. The users were dissatisfied with the unfamiliar interface that decreased their performance at the early stage of the pilot operation. Many of them even asked to immediately “undo” the changes and revert to the old system. Their major complaints dealt with:
1. Malfunctions/failures due to unexpected behavior from certain hardware and software combinations and updated system requirements
2. Excessive/uncomfortable latency, due to a slow response to elementary user actions.


However, Dodo’s corporate culture prescribes treating any complaint as a generous gift, as this gives a valuable chance to improve. The company’s policy states that the more complaints they receive and respond to, the better their processes and performance become. Peer reviewing was a must-have for managers in any position; this produced an “index of happiness”, the key KPI of the company. The decisions of headquarters were open for discussion and criticism from any members of the subsidiaries and franchisees. Therefore, Nikita got quite a number of complaint “gifts” from nearly every subsidiary [14]. The CIO started “unpacking” them and analyzing their focal points and degree of criticality. The team brainstorming, and subsequent in-depth analysis, showed that the intensive mass-scale testing revealed several more design flaws. These required a redesign, as the new order tracking software still left much to be desired.

The analysis showed that the new system was initially tested on iPads only, as the headquarters and first local pizzerias used this hardware exclusively. However, some of the newer franchisee-based offices often used cheaper tablet devices with a different operating system (mostly Android by Google). These budget end user devices had a number of specific implemented features, and therefore were often more flexible when compared with the headquarters’ iOS-powered iPads by Apple. Due to a wide variety of makers, models, operating systems and firmware versions, this diversity increased by nearly two orders of magnitude. Therefore, such diversity, together with a number of new feature add-ons, required thorough and time-consuming compatibility testing that would guarantee similar system behavior in different hardware and software environments on the client’s side.
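How quickly such diversity grows can be seen by enumerating the combinations that compatibility testing would have to cover. The following is a minimal sketch; the dimension names and values are hypothetical placeholders, not the actual list of devices used by Dodo’s franchisees.

```python
from itertools import product


def config_matrix(dimensions: dict) -> list:
    """Return every combination of the given test dimensions as a dict."""
    names = list(dimensions)
    return [
        dict(zip(names, combo))
        for combo in product(*(dimensions[name] for name in names))
    ]


# Hypothetical test dimensions (assumed for illustration only).
dimensions = {
    "maker": ["maker_a", "maker_b", "maker_c", "maker_d", "maker_e"],
    "os_version": ["os_1", "os_2", "os_3", "os_4"],
    "firmware": ["fw_1", "fw_2", "fw_3", "fw_4", "fw_5"],
}

matrix = config_matrix(dimensions)
print(len(matrix), "configurations to test instead of a single iPad setup")
```

With only a handful of values per dimension, the count already reaches the order of a hundred configurations, which is consistent with the "two orders of magnitude" mentioned above. In practice, a team would probably test a prioritized subset rather than the full matrix.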
The compatibility testing proved that some of those hardware and software combinations did not behave correctly in terms of the new system requirements; this often resulted in malfunctions or failures. Alex and Dmitry also detected a few critical bugs. One was a relatively low contrast of colors that represented different menu items. With budget Android-based devices, this often resulted in difficulties and/or delays in their visual identification by the kitchen personnel. Another source of human error resulted from a number of look-alikes among the menu items in terms of color. One more drawback was a compatibility issue; the reason being that the pure black color often worked like a mirror on a budget quality monitor screen (actually, everything but the iPad failed to adequately display the data; instead, its operator saw his/her face reflection). Based on this later feedback and brainstorming sessions, the team started a system redesign and redevelopment. This did not affect the service-oriented system architecture, and the server-side Azure cloud platform by Microsoft [15]. However, the redesign and intensive testing for these multiple hardware/software configurations required 24 business days. All the critical bugs were addressed; the updates included exhaustive compatibility testing. Eventually, the users appreciated the IT team’s efforts.

Dodo Information System Pays Off

The testing in Russia (Syktyvkar office) and the USA (Oxford office, Mississippi State) revealed that the system improved significantly.


Performance doubled. The new user interface design was more attractive and easier to use. The web pages in the kitchen became more meaningful. Each page contained data on more orders than previously; its color clearly indicated the order destination (delivery or restaurant). Since then, the system started to pay for itself. The new system assisted in a number of mission-critical strategic issues, such as:
• Efficient real-time data consolidation and display (daily revenues, number of pizzas produced, etc.)
• Compatibility with a wide range of mobile clients and devices (iOS by Apple, Android by Google, various tablet PC makers, screen sizes, etc.)
• Instant deployability
• High fault tolerance in the cloud
• High scalability.

IT and AI became mission-critical success factors for the DodoIS. The idea was to make the pizza preparation process as human-oriented and employee-friendly as possible. Therefore, the MES level control system featured a human voice; this was a professional recording by Olga Smotrova. The employees treated “Olechka” as a team member and introduced “her” as a “top manager” to the freshmen. In 2014, successful pizza delivery by drone was announced as a world first in the pizza business. The new DodoIS was the key to that success [16, 17].
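One of the bugs reported in this case was the low contrast and look-alike colors of menu items on budget screens. The case does not say how the team measured contrast; the sketch below assumes the relative-luminance and contrast-ratio formulas from the WCAG 2.x accessibility guidelines as one possible automated check. The palette, the item names, and the threshold of 3.0 are hypothetical.

```python
from itertools import combinations


def _linear(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple) -> float:
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(rgb_a: tuple, rgb_b: tuple) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to about 21.0."""
    la, lb = relative_luminance(rgb_a), relative_luminance(rgb_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def low_contrast_pairs(palette: dict, threshold: float) -> list:
    """List pairs of item names whose colors are hard to tell apart."""
    return [
        (a, b)
        for a, b in combinations(sorted(palette), 2)
        if contrast_ratio(palette[a], palette[b]) < threshold
    ]


# Hypothetical palette for order types (not Dodo's actual colors).
palette = {
    "delivery": (255, 0, 0),
    "takeaway": (250, 10, 10),
    "restaurant": (255, 255, 255),
}
print(low_contrast_pairs(palette, threshold=3.0))
```

A check of this kind only covers the color values themselves; glare on a pure black background, as described in the case, depends on the screen hardware and would still need testing on real devices.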

2.4 Agile Process and Quality Improvement Approaches

2.4.1 Crystal Methods

After a brief introduction earlier in this Chap. 2, let us dive deeper into Cockburn’s Crystal Methods [8]. Cockburn stated that processes, while important, should be considered as a secondary aspect in terms of importance after the following:
• Skills
• Talents
• Community
• Communication
• Interaction.

Therefore, the central idea of Cockburn’s Crystal Methods is that the teams involved in developing software would typically have varied skill and talent sets, and these are mission-critical, whereas the process is not always the key factor. The roles defined by the Crystal Methods are: Project Sponsor, Senior Designer/Programmer, Designer/Programmers (including Business Class Designers, Programmers, Software Documenters, and Unit Testers), and Users. There are also a number of other roles such as Architect, Coordinator, Requirements Gatherer, Business Expert, Business Analyst/Designer, Project Manager, Design Mentor, Usage Expert, Lead Design Programmer, User Interface Designer, Technical Facilitator, and Technical Writer. Cockburn defined the behavior of people in teams as follows:
• “People are communicating beings, doing best face-to-face, in person, with realtime questions and answers”
• “People have trouble acting consistently over time”
• “People are highly variable, varying from day to day and place to place”
• “People generally want to be good citizens, are good at looking around, taking initiative, and doing whatever is needed to get the project to work” [8].

Like many other agile methodologies, Crystal advocates delivery of early versions of software products, with as little bureaucracy as possible, and high user involvement. It stresses the uniqueness of each project, and requires processes, practices, and policy adjustments for different projects. Instead of having a general focus on processes and tools, Crystal actually focuses on individuals and their interactions. This framework family is a direct outgrowth of one of the core values articulated in the Agile manifesto [18]. The Crystal agile framework is built upon two core beliefs:
1. Teams find their own ways of workflow improvement and optimization, self-organization and crisis management
2. The team is to determine the best ways to tackle the work, since every project is unique and always changing.

The initial purpose behind the development of Crystal was to focus on guidelines for team communication and collaboration, instead of focusing on a very specific step-by-step development strategy guide that would attempt to work across a variety of teams involved in different projects.
Therefore, the traits of Cockburn’s Crystal method were all based around the team itself:
• Human-powered: a project being tailored and flexible to the needs and the preferred work of the people involved
• Adaptive: no fixed tools but ones flexible enough to meet a team’s specific needs
• Ultra-light: involving less documentation, management, and reporting. That way, artifacts are kept light by focusing on the transparent work between the team and the client, and by practicing open communication between team members.

Crystal’s strengths include:
• Allowing teams to work the way they think is most suitable and effective
• Contributing to direct team communication, accountability, and transparency
• Adaptive approach to enable teams to respond well to changing requirements.

Crystal’s weaknesses include:
• Lack of predefined plans, which can lead to scope creep
• Lack of documentation, which can lead to confusion.


Cockburn states that the product’s development should be viewed as a game, which should stimulate everyone to interact, become creative, and produce brilliant ideas. He also says that rather than focusing on questions like “Is our model accurate?”, we should focus on questions such as “Is our product meeting the needs of our customer?”, or “Do we have our goals aligned as a team?” One of the things that Cockburn discovered was that the project properties change depending on the number of people involved and the level of project criticality. The number of communication artifacts rises rapidly with bigger teams working on large-scale projects, unlike smaller teams, which typically develop a product with little status reporting and paperwork. Therefore, the more people there are in a team and the more critical the project is, the more complex the approach becomes. That is why there is not one single Crystal Method, but different Crystal methodologies for different types of projects. To make his classification easier to understand, Cockburn categorized Crystal methods along two dimensions, size and criticality, which match such attributes of minerals as color and hardness. The explanation for such a division is, as Cockburn called it, “a set of samples that you adjust to your circumstances”. The methodology is categorized by the following colors according to the number of people involved in the project:
• Clear: less than 7 people
• Yellow: 7–20 people
• Orange: 21–40 people
• Red: 41–80 people
• Maroon: over 80 people.
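The size bands above can be expressed as a simple lookup. The sketch below uses exactly the thresholds listed in this section; the second function is an added assumption, not taken from Cockburn: it counts possible person-to-person communication channels as n(n − 1)/2, a common rule of thumb that illustrates why communication overhead rises rapidly with team size.

```python
# Upper bound of each band (inclusive), following the list in the text.
CRYSTAL_BANDS = [(6, "Clear"), (20, "Yellow"), (40, "Orange"), (80, "Red")]


def crystal_color(team_size: int) -> str:
    """Return the Crystal color for a team of the given size."""
    if team_size < 1:
        raise ValueError("team_size must be a positive integer")
    for upper_bound, color in CRYSTAL_BANDS:
        if team_size <= upper_bound:
            return color
    return "Maroon"  # over 80 people


def pairwise_channels(team_size: int) -> int:
    """Number of distinct pairs of people (assumption: n * (n - 1) / 2)."""
    return team_size * (team_size - 1) // 2


for size in (5, 15, 30, 60, 120):
    print(size, crystal_color(size), pairwise_channels(size))
```

The lookup covers only the size dimension; criticality, the second dimension of the classification, is not modeled here.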

Additionally, Crystal Sapphire and Crystal Diamond were introduced to represent such large-scale projects that involve potential risk to human life. As the projects scale up, they become harder. There are seven main properties of the Crystal method:
1. Frequent Delivery, which allows for frequently delivering tested code to the users, so the investment of the team’s energy and time into the product is not in vain
2. Reflective Improvement, which states that there is always room for improvement of a product, no matter how good or bad it currently is, and there is also room for improvement in team practices, techniques, and methods
3. Osmotic Communication allows team members to pick up valuable information without being involved in the discussion of the specific matter, since team members are co-located. Osmotic communication is a gradual absorption of ideas. It is believed by Cockburn that this kind of work atmosphere can operate with little structure
4. Personal Safety states that only by practicing open and honest communication can a healthy working atmosphere and true team culture be achieved


5. Focus helps a team prioritize and work toward the same goal. Each team member should know exactly what to work on, so attention is focused and no mid-task switching occurs
6. Easy Access to Expert Users enables a team to maintain communication and get regular feedback from end users
7. Technical Environment with automated tests, configuration management, and frequent integration. Specific tools apply to software teams where the emphasis is on continuous integration; therefore, defects are detected within minutes, and development is held until they are fixed.

The Crystal Method is distinguished from other software development methods by focusing on people and communication as its organizing principles. Unlike other Agile methodologies, Crystal focuses on adjustment of the techniques used in a project, with the main aim of strengthening the process of team communication. Crystal also allows:
• Continuous integration
• Active user involvement
• Flexible and configurable processes.

The Crystal Method has four recurring cycles of processes of varying lengths. Each cycle has its own sequencing:
1. Project Cycle: Composed of three parts, which include chartering, a series of two or more delivery cycles, and the project wrap up. This is the sequence where the core team is formed, the methodology to be used is shaped and fine-tuned, and the initial project plan is built
2. Delivery Cycle: Has four parts, namely: recalibration of the release scheme, a series of one or more iterations, presentation to end users, and a completion ritual that reflects on both the product created and the methods employed. This is where a valuable product is presented to users, and feedback about the product is obtained from them
3. Iteration Cycle: Involves three parts: iteration planning, daily and integration cycle activities, and a reflection workshop, the length and format of which may vary. It is also the time where the team may make changes to the requirements, functionality, or capabilities of its environment
4. Integration Cycle: Has no prescribed length. For some teams, the integration cycle may only take half an hour, while for others it may last several days. Some teams may do it more than once a week, while others may carry out continuous integration, or merge after every design episode.

The “Clear” Crystal does not require any particular strategies or techniques. Instead, it suggests considering a basic set of techniques that help the team to work effectively on each of the Crystal cycles. These main strategies include:
• “360° study”: At the very beginning of the project, the team needs to understand how useful and meaningful the product is and whether it can be created with available resources and technologies. For this, the project is considered in the following areas: business value, requirements, required technologies, project plan, team composition, and selected methodologies. Implementing this strategy takes from a couple of days to 1–2 weeks.
• “Early victory”: Victory strengthens and unites the team and increases the confidence of each of its members. When working on applications, the strategy of early victory is to find the first piece of working code. Its role is usually a small element of a useful system function (e.g., the ability to add an object to the system database and look at it), which is called the “walking skeleton.” Members of the team, on the basis of this small victory, study the styles of each other’s work, users get an early idea of the system, and stakeholders (including customers) see the team’s performance.
• “Incremental architecture change”: The architecture of the system should develop because technological and business requirements change with time. The larger the project, the more detrimental it is to stop development for an architectural review. The strategy aims at solving the problem: the team changes the architecture in stages without disrupting the work of the entire system. This same strategy can be used to make changes to the functionality of the deployed system.
• “Information radiator”: A display that is placed where people can see it during work or leisure. It shows the essential information without anyone having to ask unnecessary questions, and this embodies the Crystal Clear rule, “more communication, less breaks.” The information radiator should be large, easily accessible, easy to use and read, and contain up-to-date information.

Initially, Crystal was created as a reminder for teams to stay focused on work during project development. It is expandable and can be used by large or small teams to work on both simple and complex projects. It places importance on the development teams’ skills, interactions, and on communication.
This in turn encourages collaboration and the exchange of ideas. It is also beneficial for the client, since it delivers the most important component or features of the product first. The Crystal family is focused on these three characteristics as they must be present in every project: 1. Permanent production 2. Responsive communication 3. Reflective improvement. Crystal Clear uses the advantage of a small team working in close proximity to turn good communication into the more efficient osmotic communication. Experienced developers may notice that, except for this change, all the above-mentioned characteristics are applicable to each project and not just to projects with small groups. Crystal Clear is described as a set of characteristics. Most descriptions of methodologies lack an important indicator that separates a successful group from an unsuccessful one. The Crystal Clear group determines its state according to the mood of the group and the nature of the communication, as well as the degree of productivity. Assignment of names to characteristics provides the group with slogans to


2 Models and Methods of Crisis Management

Fig. 2.1 “Crystal methods” selection matrix

determine its situation, such as “For some time we did not perform a reflective improvement”, “Can we get easier access to experienced users?” The very names of the characteristics help developers identify and discuss methods for resolving the problematic situations (see Fig. 2.1).

2.4.2 Six Sigma Strategy

Potentially, any software development lifecycle can be improved with Six Sigma. Six Sigma (6σ) is a set of techniques and tools for process improvement. Six Sigma strategies seek to improve the quality of the output of a process by identifying and removing the causes of defects, and minimizing variability in processes. They use a set of quality management methods, mainly empirical and statistical, and create a special infrastructure of people within the organization who are experts in these methods. Each Six Sigma project carried out within an organization follows a defined sequence of steps and has specific value targets, for example, reduce process cycle time, reduce costs, etc. Six Sigma doctrine asserts:
• Continuous efforts to achieve stable and predictable process results (e.g., by reducing process variation) are of vital importance to success
• Processes have characteristics that can be defined, measured, analyzed, improved, and controlled
• Achieving sustained quality improvement requires commitment from the entire organization, particularly from top-level management.
Features that set Six Sigma apart from previous quality-improvement initiatives include:


• Clear focus on achieving measurable and quantifiable financial returns from any Six Sigma project
• Increased emphasis on strong and passionate management, leadership, and support
• Clear commitment to making decisions on the basis of verifiable data and statistical methods, rather than assumptions and guesswork.

Case Study: Aircraft Flight Control System

The Aircraft Flight Control System (FCS) is a high-risk flight system that controls every aspect of an airplane’s operation to ensure safer, smoother flight. It consists of the flight control surfaces, cockpit controls, and the mechanisms to control the aircraft’s direction in flight. FCS requires:
• Adequate aircraft handling properties
• Low pilot workload
• Model simulation or prototyping to analyze whether digital processing signals represent the desired implementation, to avoid any mishap during ground or flight testing
• Backup or failover plan in case of software or hardware fault
• Comprehensive testing for an extensive set of faults, and thorough ground-based testing.
The system and its inherent functional design should be free from errors. Additionally, FCS requires adherence to the highest level of quality standards. Any failure in the system can cause loss of aircraft and human lives; the probability of success should be very close to one hundred percent. However, a test that proves one hundred percent correctness is almost impossible. Thus, a tradeoff is made by deploying many reliable but redundant artifacts, a thorough design and development process, and test cases under all possible combinations of inputs. Redundant artifacts would be used as backups during any software fault. The project is high risk, safety-critical, and requires zero defect deliverables, along with continuous risk assessment. Thus, the Spiral model (see Fig. 2.2) is proposed as the software development process along with the Six Sigma business management strategy.
This gives a combination of prototyping, continuous refinement, and near-zero defects. The factors taken into consideration are as follows:

1. Stakeholders: (a) Pilots (b) Passengers (c) FAA (Federal Aviation Administration) (d) Airlines (e) Market: private and military avionics industry.

2. Technology: (a) Real-time and embedded (b) Communication between devices has to be near real-time.


Fig. 2.2 Basic version of spiral model

3. Business drivers: (a) Early generations of FCS were mechanically based; therefore, pilots had to physically steer and control the aircraft, which was limited by their physical capabilities (b) Development of digital FCS would automate the process (c) Increase in safety as the pilot can concentrate on high-level tasks rather than routine control tasks.

The Spiral model combined with the Six Sigma strategy is a good fit for the project. The project would involve interactions between software engineers, embedded system developers, Six Sigma black belt members (to aid high-quality and defect-free deliverables), testers, a change management group (risk, impact analysis, and versioning), analysts, and pilots (for live testing of the system). This project, being safety-critical, requires thorough testing, simulation, high-quality standards, zero defects, and adequate documentation. The Spiral model incorporates the above requirements with a fast iterative approach, and a team of Six Sigma competent members works on quality, risk management, cost, and estimation, synchronized with the Spiral model phases. Therefore, the Spiral model fits the project well. The Spiral model places little emphasis on documentation; this weakness is offset by combining it with the Six Sigma strategy.


Six Sigma improves the quality of process outputs by identifying and removing the causes of defects and minimizing variability in manufacturing. In a Six Sigma process, almost one hundred percent of the product is expected to be defect-free. The five phases of the Six Sigma process used here (often abbreviated DMADV) are define, measure (identify issues critical to quality and risks), analyze (create high-level design), design (simulate and optimize), and verify (set up pilot runs). This, along with the Spiral model, would provide thoroughly tested, well-documented, defect-free, high-quality deliverables. Considering that the key to developing aircraft flight control is safety, we have concluded that the Spiral process is an excellent fit for this project. The Spiral model encapsulates iterative development with prototyping, verification and validation, and a Waterfall approach in incremental order. Finally, Six Sigma provides the documentation that the Spiral model sometimes lacks, and ensures further quality control at the highest level.

2.4.3 PAEI Lifecycle Framework

The PAEI framework was proposed some 30 years ago by Adizes and applied to a number of organizations. His large-scale Russian clients included the Government, Sberbank (one of the biggest Russian banks), various oil and gas enterprises, and a number of customers in the USA and Europe. The PAEI acronym denotes four different roles or features of a manager; however, it is also applicable to institutions and companies. Let us discuss the meaning of each letter in this acronym in more detail (we gave a brief outline in Chap. 1).

P is a Producer. A person or organization at this stage of development is result-oriented and responsible for the implementation of a product or service. They generate ideas, satisfy customer needs, keep clients loyal, and attract more of them. A Producer is an individual or company that makes a certain product.

The next letter, A, is an Administrator, whose idea is to optimize current costs and resources. An administrator keeps a company from going bankrupt and introduces bureaucracy. Bureaucracy is not necessarily bad; it is a way to formalize and define the company’s processes. Similarly to the RUP lifecycle, these processes are framework-based, and they have deliverables. The processes are administered, i.e., they are formal, and have certain entry/exit criteria and metrics.

The next letter, E, is an Entrepreneur, who keeps the company competitive, adaptive, and agile, i.e., adjustable to environmental changes. Entrepreneurship is being proactive and creative. This means that the remedy is searched for proactively, before a disaster strikes, and the solutions proposed are typically innovative. An entrepreneur initiates changes and manages them. Some people admire entrepreneurial efforts, and others are afraid of them, as such people are very changeable and unpredictable. Sometimes, they are overwhelmed with new ideas.


And these ideas are rapidly changing, sometimes even before they are actually implemented. Therefore, a leader who has only E (i.e., Entrepreneur) and lacks A (i.e., Administrator) is a dangerous kind of leader. The final PAEI letter is I, which means Integrator. This ingredient keeps people united and gets them together to implement the ultimate goal or company mission. Integrators are not generators of ideas (i.e., not entrepreneurs); however, they make people communicate efficiently and create values.

Applying this PAEI framework, let us get back to the lifecycle. The Courtship is an E-based stage. This is a very early stage of missioning, visioning, and generating the idea. The other letters in the “paEi” acronym are small, meaning that certain activities probably exist in production, administration, and integration; however, E clearly dominates, the E component is the key point of this stage, and all the rest are low in activity if not totally inactive. In the latter case the acronym is 00E0, as indicated in the diagram (see Fig. 2.3). In a crisis, this situation may degrade to an Affair rather than actually get to business or the initial idea.

The next stage is Infancy, which is business operation; this means that Entrepreneurship has given way to Production. The Producer clearly understands what the company should do; however, at this stage the business is the key point, and there is little need for formal process or administration. Instead, the company needs to make a product and offer it in order to satisfy the customer’s demand. At this stage, certain innovation is possible; however, if production is the only focus, the company will not survive, as it has to remain competitive, establish improved processes, and maintain them formally, i.e., bureaucratically. Therefore, in the case of P, only P dominates in the acronym, and all the other ingredients are zeros. In this case, the company will probably not survive or succeed. It should be somewhat

Fig. 2.3 Ingredients of the PAEI in the company lifecycle


innovative, bureaucratic, and possess certain features of entrepreneurship. The next step is E (the Go-Go stage), which means that entrepreneurship increases together with production. This stage features P and E, i.e., Producer and Entrepreneur are active. This means that typically a stable cash flow is generated, and the products are built and supplied to the end users. As the opportunities are numerous at this stage, the company can develop in many different ways and adapt to the people (both to the employees and to the customers). Administrator and Integrator (“a” and “i”) are in lower case. If these are totally absent, the company may degrade to the Founder Trap, where Producer and Entrepreneur dominate, and the employees cannot adjust to follow this leader.

The next step is Adolescence, which can be described by either of two letter pairs: P and A (Producer and Administrator), or A and I (Administrator and Integrator), as the other factors do not really dominate. The focus shifts to A. Therefore, the big A appears on stage. Administration means that there is a shift from “what” to “how”, i.e., from the software architecture to detailed digital product design, and from Production (what is going to be produced) to Administration, i.e., to the process (how this is going to be done). Here, the process dominates; the company needs stability; therefore, A is activated. At this stage, the company works on the process definition and process improvement, and there is often a fight between Entrepreneurship and Administration. Entrepreneurship is innovative (i.e., it offers new ideas, which sometimes cannot be implemented instantly), while Administration is bureaucratic and has formal processes and procedures. Naturally, the two conflict. As before, the other two letters would also be present as minor ingredients.
For instance, integration may help to shift between the entrepreneurship and production that dominate. A question arises: Does innovation or process dominate? The answer is: it depends on the maturity of the company’s integration skills.

The next stage is Prime, i.e., the best stage or the top of the “hill,” before the decline happens. This means that the company has achieved a balance between formal management and flexibility. Therefore, it is agile and manageable at the same time. At this stage, the processes are established; they are manageable and not as bureaucratic as they tend to become later. The ideas are not so numerous as before, and the company both grows and ages. It becomes bigger and older at the same time. At this stage, we have either P, A, and E, or E and I in capital letters. In each case, the focus is changing to entrepreneurship and integration. The idea is that integration probably dominates at this stage. The company has fast growth and stable profits. That is the best stage for a company and, therefore, ideally, it needs tight integration to get all the processes and people to work together. However, the company also needs entrepreneurship, which is important for its future success. This ingredient gives energy to the company, because if there is little or no entrepreneurship, the Fall stage begins. The company gets older and loses vitality, i.e., the positive energy of entrepreneurship and innovation. At this stage, it is often rich and has a high income, yet it has few new ideas and little innovation. The financial reports are impressive, the cash flow is positive, and revenues are high. The company is profitable; however,


the general attitude is: “If something is not broken, then don’t fix it. Do nothing unless there is an actual threat or trouble”. This is the wrong attitude, as such companies do not act proactively and are not flexible enough; therefore, competitors threaten them at this stage. In the Fall, the company’s decline starts as soon as it begins to lose energy. The same early crisis signs manifest themselves in people; they become old as soon as they feel their energy is not as high as it was in their youth. At this Fall stage, the company’s energy is low; therefore, entrepreneurship is small, while all the other ingredients are in capital letters (P, A, and I). The integration (I) becomes strong as the company has stable profits and cash flow, and consequently, it can integrate its processes, get all its skills together, and work on the knowledge base. However, if the entrepreneurship (E) is low, the production (P) degrades. Unless the company has innovation to offer in the market, it cannot make new products.

As we go down this declining curve, the next stage is Aristocracy. At the stage of Aristocracy, flexibility (or agility) declines, which makes the enterprise climate worse. At this stage, the company is more bureaucratic as the top management dominates. As such, the focus becomes short term. Typically, the big vision evaporates; there is no long-term planning, and the goals are low risk and financially oriented. No employee is interested or innovative. Their idea is to produce a good impression, a sound financial report, and to deliver something simple and realistic. This stage means stagnation, as people lose a sense of meaning and every act is formal.

After Aristocracy, Recrimination starts, and witch-hunts begin because acute problems arise, and people are chasing other employees instead of solving these problems. The idea is to accuse someone of being guilty and punish him or her, rather than solve problems and plan for the crises that are yet to come.
Therefore, this stage typically features A (Administration), possibly accompanied by I (Integration), which means that the company has an established corporate culture of meetings and many other formal things, and a high level of bureaucracy. Specific people are appointed to dedicated roles to manage this bureaucracy and control this issue. However, Entrepreneurship (i.e., the new ideas and technologies) and Production are weak, which means that the company shifts from being result-oriented to following formal rules and rituals. At this point, the metrics are often meaningless. However, there are many rituals and traditions. The company follows its traditions; however, the investments in production and new ideas are not as intensive as earlier, and this means that the company declines.

The stages of Recrimination and Bureaucracy feature witch-hunting; their only active component is A (Administration), which dominates. Dismissals happen; many people are accused of crimes and fired. Entrepreneurs and innovators are out; integration falls, so it is either zero (which is extremely bad) or very poor. The things that may help are external support and the resources generated previously. At the stage of Bureaucracy, the company is still alive as it has financial support and certain capital available; some people are still committed, and the company is able to sustain itself due to the support generated by the previous development stages. The leader is formal; however, nothing innovative actually happens. This is the stage before Death. Death occurs when the commitment and financial resources can


no longer keep the company alive. At this stage, no one is actually interested or deeply involved in the company’s affairs. If it has some government funding, this may artificially prolong the agony before the actual Death. Death is indicated by four zeros. This means that nothing happens: no integration, no commitment, no (or very low) production, no innovation, and no external support, because no customers trust the company anymore. Afterward, a new cycle begins, i.e., a new Childhood (or Courtship) starts. A new idea and mission appear if the founder survives this Death stage; a new brand name can appear. With this totally new idea, a new company starts at the Childhood stage. This is how the PAEI framework works.

Let us revisit the dangerous stages of decline: Fall, Aristocracy, Recrimination, Bureaucracy, and Death. For these, we recommend risk mitigation strategies. As for the Fall, the company should support and stabilize entrepreneurship. To avoid the fall, it must let in new ideas and innovative-minded people, and support them. Concerning the Aristocracy, the company should discuss problems openly, as agile methodologies prescribe. Due to many formal rules, the real problems are often hidden, and the critical risks are not revealed. As such, the company should legitimize the need for changes, identify the challenges, and implement these changes. In the case of recriminations and witch-hunts, adequate HR management is required to make the company profitable again. However, the management must understand who really threatens the company. These are not necessarily the employees who are thought to be witches; instead, the real witches are probably the witch hunters themselves. For the stages of Bureaucracy and Death, the company needs entrepreneurship, production, and new ideas to keep it moving.
However, Death itself is almost impossible to cure, although it is often possible to postpone the actual Death. To do this, the company needs to increase E and P, i.e., Entrepreneurship and Production.

To conclude on the PAEI framework, this company lifecycle curve is an adequate tool to monitor and understand a company’s current state, whether it is its younger days (Courtship spinning up), Go-Go, Prime, or any other stage. The PAEI framework helps to identify the forces driving the company, i.e., which of Entrepreneurship, Production, Administration, or Integration dominates. Each stage of the PAEI lifecycle requires a different management style. For example, in the final stages, which happen before the Death, the company can manage itself more efficiently if it boosts P and E, i.e., Production and Entrepreneurship. Therefore, new ideas and new products should bring in extra energy to address this “old age” crisis. The management of the company should have all four types of roles (see Fig. 2.4):


Fig. 2.4 Managerial types of roles and PAEI lifecycle

1. Productive
2. Administrative
3. Entrepreneurial
4. Integrative.

However, it is hard to imagine, let alone meet in reality, all four roles in a single person. It is very unrealistic to be a producer, administrator, entrepreneur, and integrator all in one. For instance, an individual can be a strong producer and a gifted entrepreneur, and, to a very small extent, an integrator and an administrator. In a management board, different people can have specific dominant PAEI attributes. The PAEI framework helps to achieve a balanced combination within the board (see Fig. 2.5). This framework can be applied at any enterprise lifecycle stage, in order to assist or “roll back” to a previous stage; thereby, it is possible to postpone a company’s death or even exchange this death for one of the earlier stages of the lifecycle. Of course, even after assistance or rollback, the crisis risk still exists, and there are various strategies and risk management techniques other than the PAEI, addressed in this book and elsewhere. Adizes has produced more than 20 books on PAEI; his earlier monographs were published in the 1980s, over 30 years ago.
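The letter codes used above for the lifecycle stages (such as “paEi” or “00E0”) and the idea of a balanced board can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function names and the sample board are hypothetical and do not come from Adizes; the only convention taken from the text is that a capital letter marks a dominant role, a small letter a minor one, and a zero an absent one.

```python
# Order of the four roles in a PAEI code, as used in the text.
ROLES = ("P", "A", "E", "I")


def parse_paei(code: str) -> dict[str, str]:
    """Map each role to 'dominant', 'minor', or 'absent' for a code like 'paEi'."""
    if len(code) != len(ROLES):
        raise ValueError("a PAEI code must have exactly four characters")
    levels = {}
    for role, char in zip(ROLES, code):
        if char == role:
            levels[role] = "dominant"   # capital letter
        elif char == role.lower():
            levels[role] = "minor"      # small letter
        elif char == "0":
            levels[role] = "absent"     # zero
        else:
            raise ValueError(f"unexpected character {char!r} for role {role}")
    return levels


def missing_roles(board: dict[str, str]) -> set[str]:
    """Return the roles that are not dominant in any board member's code."""
    covered = {
        role
        for code in board.values()
        for role, level in parse_paei(code).items()
        if level == "dominant"
    }
    return set(ROLES) - covered


# Hypothetical board: names and codes are invented for illustration.
board = {"member_1": "PaEi", "member_2": "pAei"}
print(parse_paei("paEi"))
print(missing_roles(board))
```

In this invented example no board member has a dominant I, so the check suggests adding an integrator; the same check could be repeated whenever the board composition changes.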


Fig. 2.5 PAEI-optimized lifecycle

2.4.4 Six Sigma and Other Quality and Process Improvement Approaches

After discussing the PAEI framework, let us focus on crisis-resistant lifecycle optimization by means of quality and process improvement methods and techniques. Of these, one of the best known is Six Sigma. Six Sigma features a kind of “Plan-Do-Check-Adjust” (PDCA) lifecycle, which includes planning, acting, and checking the result of these actions (see also Fig. 2.6). Six Sigma was introduced in 1986 by Motorola Corporation and revised by the General Electric Company. This is a proven technique and a well-known approach. However, it was refined and revisited, as were Total Quality Management (TQM), PDCA, and some other lifecycles (such as Kaizen, Kanban and Lean approaches, started by Toyota in Japan later in the 1990s). The above-mentioned quality management approaches resulted in the ISO 9000 standard of quality and some other standards adopted by NIST (the National Institute of Standards and Technology) in the USA, and worldwide. The focus of the Six Sigma approach is process improvement, and the idea is to achieve as good quality as possible. Six Sigma assumes acceptable quality, as it allows only about four defective units to be manufactured out of one million.

Fig. 2.6 Process improvement lifecycle

As for the methods the approach involves, they


Fig. 2.7 Six sigma: normal distribution

are mostly statistical methods, and Six Sigma deals with normal distribution. The idea is that the edges of the distribution graph (i.e., the edges stretching out of Six Sigma, see Fig. 2.7) should be very thin, and this helps to reduce the percentage of defectively manufactured parts or products. The core focus of the methodology is improving process and quality, and thereby increasing customer satisfaction. The other application of the approach is waste reduction; this is close to such Japanese-oriented techniques as Kaizen and Kanban. To be precise, Six Sigma allows a maximum of 3.4 defective items per million produced. This is a sufficiently low percentage (or a high-quality level), keeping in mind that it is impossible to have no defective parts at all. This may sound dangerous; however, even the life management applications and mission-critical systems are not 100% safe. As such, developers can make better digital products by improving the process and product quality, even though statistics say that no software is 100% reliable or safe. Therefore, to make a quality and crisis-resistant software product, developers have to guarantee at least the Six Sigma quality level to ensure that critical failures, faults, or defects happen very seldom.

Failure Mode and Effects Analysis (FMEA), which is related to Six Sigma and appears in many publications with “failure modes” in the plural, was one of the first highly structured, systematic techniques for failure analysis. It was developed by reliability engineers in the 1950s to study problems resulting from military system malfunctions. Often, FMEA is the first step of a system reliability study. It involves reviewing as many components, assemblies, and subsystems as possible, in order to identify failure modes, and their causes and effects. For each component, assembly, and subsystem, the failure modes and their resulting effects on the system are recorded in a specific FMEA worksheet.
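The worksheet described above can be sketched as a small data structure. The Python example below is a minimal illustration, not the layout of any particular FMEA standard: the field names, the 1 to 10 rating scales, and the sample rows are assumptions made for the example. The risk priority number (severity × occurrence × detection) is a widely used FMEA convention that the text itself does not mention; it is included only to show how recorded failure modes might be ranked.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of a hypothetical FMEA worksheet."""
    component: str
    mode: str        # how the component can fail
    cause: str       # why the failure may occur
    effect: str      # resulting effect on the system
    severity: int    # assumed scale: 1 (negligible) to 10 (catastrophic)
    occurrence: int  # assumed scale: 1 (rare) to 10 (frequent)
    detection: int   # assumed scale: 1 (always detected) to 10 (undetectable)

    @property
    def rpn(self) -> int:
        """Risk priority number: severity * occurrence * detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_risk(rows: list[FailureMode]) -> list[FailureMode]:
    """Return worksheet rows ordered from highest to lowest RPN."""
    return sorted(rows, key=lambda row: row.rpn, reverse=True)


# Invented sample data, for illustration only.
worksheet = [
    FailureMode("sensor bus", "message loss", "connector wear",
                "stale input data", severity=7, occurrence=3, detection=4),
    FailureMode("control software", "unhandled exception", "untested input range",
                "loss of primary control channel", severity=9, occurrence=2, detection=6),
]

for row in rank_by_risk(worksheet):
    print(f"{row.rpn:4d}  {row.component}: {row.mode} -> {row.effect}")
```

Sorting by RPN is only one possible prioritization; a team may prefer to treat any row with very high severity as critical regardless of its product.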
An FMEA can be qualitative, but it can be put on a quantitative basis when mathematical failure rate models are combined with a statistical failure mode ratio database. According to Six Sigma, a yield of 99.999% (more precisely, 99.99966%; see Fig. 2.8) is the minimum threshold the system quality should meet. Motorola is known as a producer of very reliable mission-critical and


military systems. However, this does not mean these products have zero faults or that they are 100% safe. This means that their products must meet certain quality standards to be accepted by the quality assurance control. A better production lifecycle could improve the quality even further, if required by the customer. To improve the process, developers should control it at every stage of the lifecycle, starting from the resource planning. In a crisis, they should optimize the time and effort, and investigate how they can improve quality by means of better processes. Therefore, developers need to define and analyze their production process (e.g., outline its stages and/or iterations), as this often happens in software production and manufacturing.

Six Sigma as a methodology is based on carefully measured statistical data and quality-oriented deliverables; it essentially involves human factors, such as leadership and commitment. To achieve better quality, verification plans of the management should be adequately transferred down the company’s hierarchy and thoroughly implemented. Such a high-quality level requires reliable processes, step-by-step production control, clear metrics, and quantified results. The Six Sigma approach guarantees quality production in terms of the percentage of defective products. After Six Sigma implementation, Motorola reported billions of dollars in savings, and by the end of the 1990s, about 70% of the Fortune 500 organizations began using this approach. Importantly, the International Organization for Standardization (ISO) implemented Six Sigma as a process framework, which is currently a part of the ISO 9000 Standard. There are other levels of standards (e.g., national and enterprise levels); these are also focused on production process improvement and resource efficiency. Figure 2.7 illustrates a normal distribution, where a “single” Sigma level stands for one standard deviation.
For instance, a level of one Sigma would mean around 69% of defective and 31% of quality items; two Sigma would mean the opposite values (31% defective versus 69% of quality items), and at three Sigma the share of defective items drops to about 6.7%, so that visually there are almost no defective items (see also Figs. 2.8 and 2.9). With Six Sigma, even under a large shift of the normal distribution graph, to either the right or left-hand side, developers still get a very high level of quality, which is acceptable in most cases. That is why Six Sigma sounds strict; however, it typically guarantees sufficient quality even in the case of serious crisis-related fluctuations of the production environment.

The aim of the first step is to define the goal. At this stage, developers estimate the resources and budget required, and prepare a high-level project proposal. Further steps of the lifecycle define the scope (including the intended audience of the potential end users) and estimate the costs for the digital product. At this point, the key questions are: How many people would buy this new product? Is this a general purpose or a custom-made (proprietary) production? These activities are typically done by the developer’s marketing team. This marketing team estimates possible options based on historical and present data. They identify the baseline and outline a well-balanced set of the key product features and quality attributes.


Sigma level   Sigma (with 1.5σ shift)   DPMO      Percent defective   Percentage yield   Short-term Cpk   Long-term Cpk
1             -0.5                      691,462   69%                 31%                0.33             -0.17
2             0.5                       308,538   31%                 69%                0.67             0.17
3             1.5                       66,807    6.7%                93.3%              1.00             0.5
4             2.5                       6,210     0.62%               99.38%             1.33             0.83
5             3.5                       233       0.023%              99.977%            1.67             1.17
6             4.5                       3.4       0.00034%            99.99966%          2.00             1.5
7             5.5                       0.019     0.0000019%          99.9999981%        2.33             1.83

Fig. 2.8 Six sigma levels
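The figures in the table of Six Sigma levels can be reproduced from the standard normal distribution. The Python sketch below is an illustration under stated assumptions: it uses the conventional 1.5σ long-term shift shown in the table, counts only the one-sided tail beyond the nearer specification limit, and takes the Cpk values as the sigma level (with or without the shift) divided by three. The function names are hypothetical; `statistics.NormalDist` is part of the Python standard library.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1
SHIFT = 1.5                     # assumed long-term drift of the process mean, in sigmas


def dpmo_for_sigma_level(level: float, shift: float = SHIFT) -> float:
    """Defects per million opportunities for a sigma level (one-sided tail)."""
    tail_probability = 1.0 - STANDARD_NORMAL.cdf(level - shift)
    return tail_probability * 1_000_000


def sigma_level_for_dpmo(dpmo: float, shift: float = SHIFT) -> float:
    """Inverse of dpmo_for_sigma_level: sigma level for an observed DPMO."""
    if not 0 < dpmo < 1_000_000:
        raise ValueError("dpmo must lie strictly between 0 and 1,000,000")
    return STANDARD_NORMAL.inv_cdf(1.0 - dpmo / 1_000_000) + shift


def short_term_cpk(level: float) -> float:
    """Process capability index without the long-term shift."""
    return level / 3.0


def long_term_cpk(level: float, shift: float = SHIFT) -> float:
    """Process capability index after the assumed long-term shift."""
    return (level - shift) / 3.0


for level in range(1, 8):
    print(f"{level}  DPMO={dpmo_for_sigma_level(level):,.3f}  "
          f"Cpk short={short_term_cpk(level):.2f}  long={long_term_cpk(level):.2f}")
```

A team could apply the inverse function to its own defect counts, for example converting an observed number of defects per million opportunities into an approximate sigma level, bearing in mind that the 1.5σ shift is a convention rather than a measured property of the process.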

Fig. 2.9 Scope, time, cost, and quality

For instance, if the product has advanced multi-level security, will it perform adequately in terms of response time? If it is very secure, its usability may suffer. Additionally, the hardware and software environment can affect product usability. For example, a mobile phone battery would die prematurely and become non-operational if its operating system and hardware required excessive power consumption. Therefore, developers determine the lifecycle process for this new product, select the appropriate methods and tools, and identify root causes of the potential failures (e.g., using Ishikawa diagrams). The next step is production process improvement. At this point, the developers re-examine the features to detect which of them are mission-critical for the product. They also identify tradeoffs and optimize the digital production process. Complexity is another dimension, which is essential to optimize for process improvement.


The next stage deals with improving the process. At this point, the developers determine improvement priorities; they typically prototype or simulate the process to identify the best scenario. Each of the possible use cases can be tested in order to acquire feedback and adjust the solution accordingly. Adjusting the solution requires a more detailed plan and reiterating the process. Next, this new process is applied; this requires a strategy similar to using a checklist that the aircrew uses before a flight take-off. For example, they perform a crosscheck of each exit and a number of other verification steps to confirm that the flight is possible. There are a number of techniques for quality engineering. Many of these use Six Sigma and empirical checklists based on certain metrics. Detecting and fixing the critical points eventually improves businesses. Typically, these improvements use software quality metrics, such as failure and defect rates, number of errors per 1,000 lines of code, number of defects detected and removed, and other similar indicators. These metrics enable developers to predict the failures and usually guarantee an acceptable quality level of the digital product. Moreover, they help in adjusting the quality level of the product to be implemented. Although supported by a number of quality management standards, Six Sigma is a relatively informal technique and obviously, it is statistically dependent. To make it more applicable to digital product development, we recommend using certain aspects of the Personal Software Process (i.e., PSP, for the personal level) and the Team Software Process (i.e., TSP, for the team level). These two process frameworks (see also Figs. 2.9 and 2.10) can be tailored to the size and scope of a certain digital product, and are focused on assessment-based continuous monitoring and quality improvement [19, 20].
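Two of the software quality metrics mentioned above can be computed directly from project counts. The Python sketch below is a minimal illustration; the function names and the sample figures are hypothetical. Defect removal efficiency (the share of all known defects that were removed before release) is one common way to use the “number of defects detected and removed”, and it is an assumption of this example rather than a metric named in the text.

```python
def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Defect density: number of defects per 1,000 lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


def defect_removal_efficiency(removed_before_release: int,
                              found_after_release: int) -> float:
    """Share of all known defects that were removed before release (0.0 to 1.0)."""
    total = removed_before_release + found_after_release
    if total == 0:
        return 1.0  # no known defects at all
    return removed_before_release / total


# Invented sample figures, for illustration only.
print(defects_per_kloc(defects=30, lines_of_code=12_000))
print(defect_removal_efficiency(removed_before_release=95, found_after_release=5))
```

Tracked per iteration, such values give the trend that PSP- and TSP-style assessments rely on; a single measurement says little on its own.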

Fig. 2.10 PSP and TSP structure


Fig. 2.11 Quality management in TSP

To conclude, the Six Sigma methodology is applicable to large-scale projects; therefore, large enterprises can benefit significantly in terms of savings by using this approach. In large-scale projects, the salaries of the key managers (often referred to as Black Belts in Six Sigma) pay off as soon as the improved processes are implemented on a large scale. Six Sigma is a general process improvement framework; therefore, we recommend using PSP and TSP (which also use statistical data and focus on assessment-based quality improvement) to make it more applicable to software production at the individual and team levels (see also Figs. 2.11 and 2.12).

2.5 Conclusion: Lifecycle-Based Crisis-Responsive Models and Methods

Every software product begins as a concept and transitions through a series of phases until it is developed, released, and deployed. The software development lifecycle of a system or an application continues with updates and new feature implementation until it is replaced or decommissioned. In this chapter, we discussed a few lifecycle models (such as Waterfall, Incremental, and Spiral), human-centered methods (such as Agile), and process/quality improvement approaches (such as PSP/TSP and Six Sigma). Over decades of evolution, several software development methods have been introduced. Agile is rapidly overtaking Waterfall in large and small companies. New Agile frameworks are introduced constantly; some are of questionable use, while others might become the next industry standard. An Agile framework called Crystal Methods seems very interesting and promising. The Crystal approach is among the more flexible frameworks because of its independence from any single set of processes or tools, and its overall design structure surrounding the project team. It can be a viable methodology for organizations wanting to empower teams to work. Still, it is important to keep in mind that Crystal emphasizes direct team collaboration around the software product being developed and de-emphasizes reports and documentation; this could mean that other teams in the organization have less visibility of the team’s progress on the project. In contrast, the Six Sigma strategy, which is based on metrics and process standardization, was discussed in the section following Crystal. The survey and a case study demonstrated the applications of Six Sigma and other important software development and process/quality improvement approaches to crisis-resistant digital product development. Justified, tradeoff-based selection of the above-mentioned models, methods, and process frameworks results in crisis-responsive software development, which is the goal of IT Crisisology.

Fig. 2.12 Spiral model with Six Sigma


References

1. Zykov, S. V. (2016). Crisis Management for Software Development and Knowledge Transfer. Switzerland: Springer International Publishing.
2. Abrahamsson, P., Oza, N., Siponen, M. T. (2010). Agile software development methods: a comparative review. In Proceedings Agile Software Development: Current Research and Future Directions Conference (pp. 31–59).
3. Boehm, B. (2000). Spiral Development: Experience, Principles, and Refinements. In J. H. Wilfred (ed).
4. Pressman, R. (2014). Software Engineering: A Practitioner’s Approach, 8th Edition, McGraw-Hill Publication.
5. Information regarding Dodo Pizza’s history and corporate values is available at http://sila-uma.ru/2015/01/22/the-dodo-way/. Last accessed 25 May 2020.
6. Information regarding Dodo IS structure and functions is available at http://dodois.com. Last accessed 25 May 2020.
7. Information regarding The Dodo Pizza’s annual report for 2017 is available at https://dodopizzastory.com/2018/01/05/2017-results-for-dodo/. Last accessed 25 May 2020.
8. White, V. (1992). Introduction to Six Sigma with a Design Example. In Proceedings of Applied Power Electronics Conference and Exposition (pp. 23–35), Korea.
9. Cockburn, A. (2004). Crystal Clear: A Human-Powered Methodology for Small Teams. Self-published.
10. Humphrey, W. (1995). A Discipline for Software Engineering. Addison Wesley.
11. Boehm, B. (1998). A spiral model of software development and enhancement. Computer Journal of IEEE, 6(2), 61–72.
12. Information regarding Agile Manifesto is available at https://agilemanifesto.org. Last accessed 25 May 2020.
13. Humphrey, W. (1999). Introduction to the Team Software Process. Addison Wesley.
14. Rastogi, V. (2015). Software development life cycle models—comparison, consequences. International Journal of Computer Science and Information Technologies, 6(1).
15. Lewallen, R. (2005). Software Development Life Cycle Models. http://codebetter.com/blogs/raymond.lewallen/archive/2005/07/13/129114.aspx.
16. Kuhl, J. J. (2002). Project Lifecycle Models: How They Differ and When to Use them. www.businessesolutions.com.
17. Information regarding IGI Global is available at: https://www.igi-global.com/about/companyhistory/. Last accessed 25 May 2020.
18. Information regarding Springer Nature is available at: http://www.springer.com/la/about-springer/history. Last accessed 25 May 2020.
19. Information regarding Dodo IS is available at: http://www.dodofranchise.co/. Last accessed 25 May 2020.
20. Information regarding Microsoft Azure implementation at Dodo Pizza is available at www.customers.microsoft.com/…/download_dodo_pizza_case_study_pdf. Last accessed 25 May 2020.

Chapter 3

Optimization Methods in Crises: Patterns and Practices

Abstract This chapter describes the principles and practices for crisis-resistant software development. We discuss tradeoff multi-criteria optimization methods and how they improve digital product development in the event of crisis. Another section of this chapter is devoted to applying prospective technologies, such as cloud services and artificial intelligence, to a mission-critical area of smart city emergency management. Finally, we discuss blockchain as a cutting-edge technology and its possible implementations to manage crises in banking and finance.

Keywords Multi-criteria optimization · Tradeoff optimization · Cloud service · Smart city · Blockchain

3.1 Introduction: Balancing Tradeoffs in Crises

The previous chapters outlined the key ideas regarding crisis-responsive software development within the ITC framework. They also discussed the approaches to building successful lifecycles in crisis. However, these mostly answered the question “what” rather than “how”; i.e., they lacked lower-level design and implementation details. Although the case studies we presented clarified certain aspects of crisis-resilient software development, they were also rather high-level. Therefore, this chapter describes the principles and practices of different kinds of multiple-criteria tradeoff-based optimization methods aimed at crisis-responsive development of digital products. We discuss several types of these tradeoff optimization approaches, including analytic hierarchy process (AHP), choosing by advantages (CBA), data envelopment analysis (DEA), evidential reasoning approach (ERA), and architecture-centric development method (ACDM). For each type of tradeoff optimization, we investigate its benefits and shortcomings from the standpoint of its applicability to the ITC framework. Thus, the purpose of this chapter is to advise on using certain methods in the event of a software development crisis so that the resulting digital product is of high quality and its production process is efficient.

© Springer Nature Singapore Pte Ltd. 2021 S. V. Zykov, IT Crisisology: Smart Crisis Management in Software Engineering, Smart Innovation, Systems and Technologies 210, https://doi.org/10.1007/978-981-33-4435-8_3


To increase the practical value of the research, the theoretical aspects of several of the tradeoff methods analyzed are illustrated and supported by case studies. For CBA, we consider a construction project with high safety requirements to ensure worker fall protection. For DEA, we discuss a hypothetical dataset for a nuclear power plant to be designed and installed in different locations. For ERA, we use the UK National Health Service case study, which describes an IT system that stores patient safety incidents related to contact with hazardous substances. For ACDM, briefly mentioned previously, we describe the details of its application to an architectural review of the IT system to be implemented and delivered to the customer. Further, we dive deeper into the practical part of software system design and implementation by means of another case study, which deals with smart city management. This case study also focuses on crisis-resistant software development, as the software system is intended for the efficient service-oriented crisis management of a smart city. Finally, we present an overview of blockchain as a prospective technology and its possible applications to crisis-resistant software development, including large-scale trustworthy banking and digital financial products and services. As previously, the focus of this chapter is on tradeoff optimization methods and their balanced agility and utility for the ITC framework.

3.2 Tradeoff Optimization Methods

In crisis management, Project Risk Analysis is one part of the Project Risk Analysis and Management Framework, which also includes Project Risk Management. Project Risk Analysis and Management can be used on all projects, irrespective of their industry, environment, or timescale. Let us begin by examining the Risk Analysis part of the risk management processes, the need for it, and how it is done using a traditional approach. We then move on to the multiple-criteria decision analysis (MCDA) methods used in the survey, followed by using these methods to perform Risk Analysis in specific cases, with the advantages and disadvantages in each case.

Risk Analysis

A risk is an unexpected event that adversely affects the project in meeting its goals, thereby having an overall negative effect on the project. Risk Analysis should be considered a proactive rather than a reactive process. Properly undertaken, it increases the likelihood of completing a project within its cost, time, and performance objectives. Where developers have an ample amount of data, Risk Analysis can be performed in a statistical manner; however, they must understand that no two projects are the same. Generally, projects can face problems that are unique to themselves, their industry, and/or working environment. Therefore, dealing with risks in projects is often different from situations where there is sufficient data to adopt a statistical approach (such as Six Sigma that we discussed in Chap. 2). Because projects usually involve strong technical, engineering, innovative, or strategic content, a systemic approach has proven preferable to an intuitive one. Risk Analysis follows the risk identification process in the Risk Management Framework and can be clearly divided into two distinct categories.

Qualitative Risk Analysis

During this process, one prioritizes risks for further action by assessing their probability of impacting project development. This process is further subdivided into the following categories.

Risk Probability and Impact Assessment

During this stage, every particular risk that may occur is investigated and analyzed in relation to its potential effects, both positive opportunities and negative threats, on the objectives of the project (e.g., cost, schedule, quality, and performance). The level of each risk is defined by means of an interview, an investigation, or a meeting with all related stakeholders, and the identified results are documented. Developers classify risks based on their probability of occurrence, e.g.:

• High probability (80–100%)
• Medium–high probability (60–80%)
• Medium–low probability (30–60%)
• Low probability (0–30%).

Impact Risk Rating Matrix

The documented results of risk probability can be described in qualitative terms, such as very high, medium, and low. The risk rating is developed using a matrix, which represents risk scales for each of the risks. The matrix documents the risk probability scale between no possibility (0.0 rate) and certainty (1.0 rate), as well as the impact scale of the risk, reflecting the severity of its influence on the objective of the project. This matrix helps to improve the quality of the data and make the process easier to replicate several times during the project. The risk impact is usually classified according to the following criteria:

• High catastrophic (Rating A = 100)
• Medium critical (Rating B = 50)
• Low marginal (Rating C = 10).

We can use the following matrix as a guideline for the risk impact classification:

| Project objective | C Rating (10) | B Rating (50) | A Rating (100) |
| --- | --- | --- | --- |
| Cost | Cost increase 0% | Cost increase 5–10% | Cost increase over 10% |
| Schedule | Overall project schedule delay 0 days | Overall project schedule delay 1 week | Overall project schedule delay 2 weeks |
| Scope | Scope decrease barely noticeable | Minor areas of scope are affected | Major areas of scope are affected; scope reduction unacceptable to the client |
| Quality | Quality reduction barely noticeable | Quality reduction does not affect vital functionality | Quality reduction requires client approval |
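The probability bands and impact ratings above can be combined into a simple ranking, sketched below. The text gives the two scales but not a rule for combining them, so multiplying probability by the impact rating is an assumption here (a common convention), as are the treatment of band boundaries (lower bound inclusive) and the sample risks.

```python
# Impact ratings from the matrix above.
IMPACT_RATING = {"A": 100, "B": 50, "C": 10}


def probability_band(p: float) -> str:
    """Map a probability in [0, 1] to a band; lower bounds are inclusive (assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0.0 and 1.0")
    if p >= 0.8:
        return "high"
    if p >= 0.6:
        return "medium-high"
    if p >= 0.3:
        return "medium-low"
    return "low"


def risk_score(p: float, impact: str) -> float:
    """Assumed scoring rule: probability multiplied by the impact rating."""
    return p * IMPACT_RATING[impact]


# Hypothetical risk register: (name, probability, impact rating).
risks = [
    ("Key supplier fails", 0.25, "A"),
    ("Schedule slips one week", 0.70, "B"),
    ("Minor scope cut", 0.90, "C"),
]

ranked = sorted(risks, key=lambda r: risk_score(r[1], r[2]), reverse=True)
for name, p, impact in ranked:
    print(f"{name}: band={probability_band(p)}, score={risk_score(p, impact):.1f}")
```

Note that a highly probable but marginal risk can rank below an unlikely but catastrophic one, which is the reason for rating both dimensions.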

Risk Categorization

This step entails grouping risks by their common causes to determine the most exposed areas of the project, which helps in developing a risk response plan. Risks can be grouped based on their sources and on the areas of the project that are most exposed to their effects. Risk categorization can also be done using the common root causes. Using this project management technique, we can identify the work packages, project phases, project activities, and roles that can be utilized to develop an effective risk response plan. The main goal of categorizing risks is to avoid any unpleasant surprises, i.e., to predict and prevent product development crises. This provides a systemic and structured approach to identifying risks consistently. Another benefit is that it permits a more comprehensive detection of a wider range of risks. Additionally, Risk Analysis can be promoted by having meetings with experts to work on specific risk categories. Developers can categorize risks at the level of project objectives, the three risk categories being:

• Operational risks: risks that are related to the operational objectives of the project and are restricted to the products of the project
• Short-term strategic risks: risks that are related to the short-term strategic objectives of the project; more specifically, risks that pose a problem to product functionality. These are not clearly specified in the project definition, but are needed in order to achieve the required deliverables of the project
• Long-term strategic risks: risks that hinder the project from achieving its long-term objectives, i.e., risks that do not belong to either of the above-mentioned categories.

Risk Urgency Assessment

During Risk Urgency Assessment, developers review and determine the timing of actions that need to happen sooner than the other risk items. This particular type of Risk Analysis helps to identify the near-term risks. Project managers can thus identify which risks should be considered urgent and/or require their immediate attention.


Below are some of the many factors that characterize short-term risks:

• Time available: The project management team requires a minimum amount of time to implement a response to the risks, and responses can only work if they can be implemented within a reasonable time frame
• Warning signs of risk: Some risks present certain signs, which can also be considered as risk triggers. The project management team can use them to judge whether a certain risk requires an urgent response. Depending on the risk, the team may have either a short or a long time frame to respond
• Risk rating score: Risks can be rated numerically depending on the impact and probability of their occurrence. This implies that higher rated risks may occur earlier than lower rated ones.

Expert Judgment

Expert judgment refers to taking advice from people who have expertise in a specific area, discipline, industry, etc. Typically, this expertise is provided by the internal members of the project management team. However, if this is not possible, it is not uncommon for the expertise to come from an external group or person. Some of the most commonly used sources for expertise are:

• Employees within the organization
• External consultants
• Stakeholders
• Industry groups
• Suppliers.

Quantitative Risk Analysis

When developers consider the concept of Risk Analysis, they should keep in mind that the project management team must take action before and during the execution of a project in order to properly determine the approximate level of risk that may exist with regard to the unhindered execution of the project. Quantitative risk analysis means that they have to numerically quantify the combined effect of all the risks on the list of project objectives that have been determined beforehand. This can be done using the following methods.

Interviewing Stakeholders

This is a useful technique to gather information, which helps in forming low-rate and high-rate risk scenarios. In this scenario, the actors are individuals, sellers, and team members, i.e., all the stakeholders. The name of the method is derived from the fact that they are interviewed to obtain information. The interviewing method can vary, and it can be done in either a formal or an informal manner. This means that the interviewer can ask either prepared or spontaneous questions, and record the responses of the interviewee. Depending on the situation, the interviewer may be a single person, or a panel of interviewers may be used.


This method is especially useful in that developers can understand which specific features are required to create an effective product. It is also helpful in obtaining confidential information in a one-on-one interview.

Sensitivity Analysis

This method helps to understand which specific risks have the largest potential effect on the project. It is also a method of modeling the risks to the project. If developers know the impact level of all the various elements on the project, the project management can prioritize the tasks in order to reach the desired result faster. The results of a sensitivity analysis can be seen through the use of a tornado diagram. Since the analysis is a quantitative one, the difference between the risks can be easily seen. The impact of each risk is ranked by its numerical value. This allows for comparison between different risks in order to quickly determine which risks are worth taking. The project management team uses a sensitivity analysis in order to prioritize dealing with elemental risks to the project. If they know which risk affects the project objective the most, they can invest more effort into reducing that risk. Doing so can greatly reduce any unexpected delays faced during the execution of the project, leading to its smooth flow.

Expected Monetary Value Analysis

This is a statistical method that calculates the average outcome when the future includes scenarios that have either positive values (i.e., opportunities) or negative values (i.e., risks). The process can be visualized using a decision tree analysis, with a diagram representing each decision under consideration and the implications of choosing either path. In addition to quantifying the risks, this process also helps the project management team in calculating the contingency reserve. The technique known as expected monetary value analysis helps to:

• Calculate the amount required to manage all the identified risks
• Select the choice that saves money to manage the risks.

The equation for expected monetary value (EMV) is:

EMV = Probability × Impact
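The EMV equation can be applied to a small risk register, as in the sketch below. The register entries and monetary figures are hypothetical, and deriving the contingency reserve as the negated sum of all EMVs is an assumed convention rather than a rule stated in the text.

```python
def emv(probability: float, impact: float) -> float:
    """Expected monetary value: probability multiplied by monetary impact.

    Negative impacts are threats; positive impacts are opportunities.
    """
    return probability * impact


# Hypothetical register: (name, probability, monetary impact).
register = [
    ("Server outage", 0.10, -50_000),
    ("Vendor delay", 0.30, -20_000),
    ("Early delivery bonus", 0.20, 15_000),
]

for name, probability, impact in register:
    print(f"{name}: EMV = {emv(probability, impact):,.0f}")

total_emv = sum(emv(p, i) for _, p, i in register)

# Assumed convention: reserve enough to cover the net expected loss.
contingency_reserve = max(0.0, -total_emv)
print(f"Total EMV = {total_emv:,.0f}; contingency reserve = {contingency_reserve:,.0f}")
```

When comparing decision paths in a decision tree, the same function can be summed per branch and the branch with the highest total EMV selected.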

Modeling and Simulation

Simulation is a method that uses a model to translate uncertainties arising at any level of the project into their potential impact on the objectives of the project. Simulations are often performed using the Monte Carlo method. In this method, the project model is computed many times (i.e., in multiple iterations), with the input values chosen at random from probability distributions that analysts specify beforehand. For instance, these inputs include the cost of the various project elements, the duration of the project, or the scheduled activities, drawn for each iteration from the probability distribution of each variable. The result of the simulation is a probability distribution for a value known only after the project is completed, such as the total cost of the project or its completion date. The Monte Carlo method thus generates large volumes of probabilistic performance data based on the input probability distributions. The outcome of the simulation is used to estimate probabilities for the entire project. Since in most cases multiple simulations are run, analysts obtain multiple output values, which can then be plotted using a histogram or an S-curve to determine the probability for the whole project.

Cost Risk Analysis

For cost risk analysis, developers can use traditional work breakdown structures (WBS). Contrary to its name, a WBS does not break down work, but rather deliverables. A work breakdown structure can be defined as a hierarchical structure of things that the project will make, or outcomes that it will deliver. Below we consider the WBS for an aircraft system. In this diagram (Fig. 3.1), the WBS breaks down all of these complex activities into more compact and manageable constituent parts. For most projects, three levels of decomposition are sufficient; however, in the case of complex projects, it is common to have a fourth or even a fifth level.
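One way to connect the WBS to the Monte Carlo method described above is to attach a cost distribution to each deliverable and simulate the total. The sketch below is illustrative only: the deliverable names, the three-point estimates, the choice of a triangular distribution, and the fixed seed are all assumptions.

```python
import random
import statistics

# Hypothetical three-point cost estimates per WBS deliverable:
# (lowest, most likely, highest), in arbitrary currency units.
WBS_COSTS = {
    "Armament": (80, 100, 150),
    "Equipment": (40, 50, 80),
    "Weapon delivery": (400, 500, 700),
}


def simulate_total_cost(wbs, iterations=10_000, seed=42):
    """Return sorted total-cost samples, one per Monte Carlo iteration."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = []
    for _ in range(iterations):
        # random.triangular takes (low, high, mode) in that order.
        total = sum(rng.triangular(low, high, mode)
                    for low, mode, high in wbs.values())
        totals.append(total)
    return sorted(totals)


totals = simulate_total_cost(WBS_COSTS)
mean_cost = statistics.mean(totals)
p80 = totals[int(0.8 * len(totals))]  # cost not exceeded in about 80% of runs
print(f"mean = {mean_cost:.1f}, 80th percentile = {p80:.1f}")
```

Plotting the sorted samples against their rank gives the S-curve mentioned above; the gap between the mean and a high percentile is one possible basis for a cost contingency.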

Fig. 3.1 Work breakdown structure for aircraft system


Usually, a WBS denotes the major deliverable work areas of the project; its lower elements provide appropriate detail and focus for the support of project management processes. A WBS does not describe any actions to be taken; rather, every item represents an end product, such as armament, equipment, and weapon delivery. This is a fundamental feature of a WBS: it describes deliverables rather than the activities required to produce them. Therefore, a WBS can be used for breaking down the various deliverables together with the cost and risk associated with each of them.

Schedule Risk Analysis

Schedule risk analysis is a planning-based procedure aimed at improving the predictability and performance of a project; as a result, the project management team can assess the likely impact of uncertainty and of individual risks on completing the project on time. Schedule risk analysis often uses the precedence diagramming method (PDM), which is a strategy for the development of a schedule network diagram. An example of a PDM diagram is shown in Fig. 3.2. This method represents project activities; their connections, illustrated by arrows, show the dependencies. The project management team can use this precedence diagram to check whether the project objective will be completed within the estimated time frame and budget.

Multiple-Criteria Decision-Making

Having looked at Risk Analysis in terms of a more traditional approach, and seen in detail how each step can be completed, we now approach the task of Risk Analysis using multiple-criteria decision-making (MCDM) and the five specific variants of it that are chosen. However, before we discuss the variants in detail, we first elaborate on what MCDM is.

Fig. 3.2 Project precedence diagram
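A precedence diagram of this kind can be checked against a deadline with a forward pass over the activity network. The sketch below is not the diagram from Fig. 3.2: the activities, durations, and the ten-day deadline are hypothetical, and only finish-to-start dependencies are modeled.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical activities: name -> (duration in days, list of predecessors).
ACTIVITIES = {
    "A": (3, []),
    "B": (2, ["A"]),
    "C": (4, ["A"]),
    "D": (1, ["B", "C"]),
}


def earliest_finish(activities):
    """Forward pass: earliest finish time of each activity."""
    graph = {name: preds for name, (_, preds) in activities.items()}
    finish = {}
    # static_order() yields every activity after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        duration, preds = activities[name]
        start = max((finish[p] for p in preds), default=0)
        finish[name] = start + duration
    return finish


finish = earliest_finish(ACTIVITIES)
project_duration = max(finish.values())
deadline = 10  # hypothetical target, in days
print(finish, "on time" if project_duration <= deadline else "late")
```

Replacing the fixed durations with sampled ones turns the same forward pass into a schedule risk simulation.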


Multiple-criteria decision-making is a subdiscipline of operations research that explicitly evaluates multiple conflicting criteria in decision-making and chooses the alternative that best satisfies all of the existing criteria. As stated earlier, MCDM deals with structuring and solving decision problems involving multiple criteria. Typically, a unique optimal solution does not exist for such problems, and it usually falls to the decision maker to use his/her own preferences to differentiate between feasible solutions. Even the term “solving” a multiple-criteria optimization problem does not accurately convey the meaning of obtaining the final solution. It can translate to choosing the “best” from a set of available alternatives (here “best” implying “the most preferred alternative” of the decision maker). Another interpretation of the word “solving” could be choosing a small set of good alternatives, or even grouping alternative solutions into preference sets. The ambiguity of the solution to the problem exists due to the presence of more than one criterion. A unique optimal solution cannot be obtained without incorporating preference information. Instead, developers must consider a set of non-dominated solutions. A non-dominated solution has the property that it is not possible to move from it to any other solution without sacrificing in at least one criterion. Therefore, it is logical for the decision maker to choose a solution from the non-dominated set. Very often, the set of non-dominated solutions is too large to present to the decision maker before making the final choice. This means that developers need to shrink the pool of non-dominated solutions, i.e., perform a tradeoff between certain criteria. We can classify MCDM problems and methods into the following categories.

Multiple-Criteria Evaluation Problems

In these types of problems, there exist a finite number of alternatives, which are explicitly known before the solution process begins. Each alternative solution is represented by its performance in multiple criteria. The main problem is in defining the best alternative for a decision maker or finding a good set of alternatives. Developers may also “sort” or “classify” the alternatives: “sorting” refers to placing the alternatives in a set of preference-ordered classes, and “classifying” refers to assigning alternatives to non-ordered sets.

Multiple-Criteria Design Problems

In these types of problems, developers do not know the alternatives beforehand. The only way to determine an alternative is to solve a mathematical model. In this case, the number of alternatives is either infinite and uncountable (when some variables are continuous) or countable but typically very large (when all the variables are discrete). Irrespective of the problem being an evaluation problem or a design problem, the decision maker requires preference information in order to make a final decision regarding which alternative to select.


Representations

The MCDM problem can be represented either in the criterion space or in the decision space, both of which are elaborated on below.

Criterion Space Representation

Let us assume that solutions in a specific problem situation are evaluated using several criteria. Generally speaking, the more criteria are used, the better the solutions can be evaluated. This leads to the idea that among all of the possible solutions, the most valuable ones are those that perform well in all of the considered criteria. In a large majority of the cases, some solutions perform well in some criteria, whereas others perform well in different criteria. The central idea of MCDM problems is to find a tradeoff between different criteria. In mathematical terms, an MCDM problem analogous to the previously stated arguments is represented as

\[ \max\ q \quad \text{subject to} \quad q \in Q, \]

where $q$ is the vector of $k$ criterion functions, and $Q$ is the feasible set, $Q \subset \mathbb{R}^k$. If $Q$ is defined explicitly (by a set of alternatives), the resulting problem is called a multiple-criteria evaluation problem. If $Q$ is defined implicitly (by a set of constraints), the resulting problem is called a multiple-criteria design problem. The maximization of a vector is not well defined mathematically. This means that we have to find a way to determine the tradeoff between criteria (that is usually done based on the preferences of the decision maker) when a solution that performs equally well in all criteria does not exist.

Decision Space Representation

The decision space translates to the possible set of available decisions. The criteria values will be consequences of the decisions that are made. Thus, it is possible to define a corresponding problem in the decision space. This means that, in designing a product, developers decide on the design parameters (i.e., decision variables), each of which affects the performance measures (i.e., criteria) with which they evaluate the product. In mathematical terms, a multiple-criteria design problem can be represented in the decision space as follows:

\[ \max\ q = f(x) = f(x_1, \ldots, x_n) \quad \text{subject to} \quad q \in Q = \{ f(x) : x \in X,\ X \subset \mathbb{R}^n \}, \]

where $X$ is the feasible set, and $x$ is the decision variable vector of size $n$. Let us consider the following important definitions, knowing that the two definitions of non-dominance (based on the criterion space representation) and efficiency (based on the decision variable representation) are closely related to each other.


Definition 1 $q^* \in Q$ is non-dominated if there does not exist another $q \in Q$ such that $q \ge q^*$ and $q \ne q^*$. Simply stated, a solution is non-dominated so long as it is not inferior to any other available solution in all the considered criteria.

Definition 2 $x^* \in X$ is efficient if there does not exist another $x \in X$ such that $f(x) \ge f(x^*)$ and $f(x) \ne f(x^*)$. If an MCDM problem represents a decision situation well, then the most preferred solution of a decision maker has to be an efficient solution in the decision space, and its image is a non-dominated point in the criterion space.

Definition 3 $q^* \in Q$ is weakly non-dominated if there does not exist another $q \in Q$ such that $q > q^*$.

Definition 4 $x^* \in X$ is weakly efficient if there does not exist another $x \in X$ such that $f(x) > f(x^*)$.

Weakly non-dominated points include all non-dominated points and some special dominated points. The importance of these special dominated points comes from the fact that they commonly appear in practice, and special care is necessary to distinguish them from non-dominated points. Further, let us introduce the following important definitions.

Ideal point (in criterion space) represents the best (the maximum for maximization problems and the minimum for minimization problems) of each objective function and typically corresponds to an infeasible solution.

Nadir point (in criterion space) represents the worst (the minimum for maximization problems and the maximum for minimization problems) of each objective function among the points in the non-dominated set and is typically a dominated point.

The ideal point and the nadir point are useful to the decision maker to get the “feel” for the range of solutions. Having a basic understanding of what MCDM methodology is, let us apply these techniques to the context of Risk Analysis.
Traditional methods of Risk Analysis are not very useful where the project demands the fulfillment of multiple criteria and the minimization of the various risks associated with them. This is where the MCDM techniques come into play. They help not only to fulfill the multiple criteria, but also to minimize the risks associated with them. Let us consider the following MCDM techniques in the context of Risk Analysis:

• Analytic hierarchy process
• Choosing by advantages
• Data envelopment analysis
• Evidential reasoning approach
• Architecture-centric development method.

Analytic Hierarchy Process

The analytic hierarchy process (AHP) technique was introduced by Thomas Saaty in the 1980s as an effective tool for dealing with complex decision-making problems [1].


3 Optimization Methods in Crises: Patterns and Practices

It helps the decision maker to set priorities and make the best decision using them. Thus, it is possible to reduce complex decision problems to a series of pairwise comparisons. The AHP technique helps to capture both subjective and objective aspects of a decision problem. The AHP considers a set of evaluation criteria and a set of alternative options, from which analysts choose the best decision. Importantly, since some of the criteria can be contrasting, the best option is not one that optimizes each and every criterion, rather one that achieves the most suitable tradeoff when all the different criteria are considered. First, the decision maker compares all of the criteria pairwise; then, the AHP generates a weight for each criterion based on its pairwise comparison. The greater the weight, the more important is the corresponding criterion. Then, for a fixed criterion, the AHP assigns a score to each option according to the decision maker’s pairwise comparisons of the options based on that criterion. The higher the score, the better the performance of the option with respect to the considered criterion. Finally, the AHP combines the criteria weights and the option scores, thereby determining a global score for each option and a consequent ranking. The global score for a given option is a weighted sum of the scores it obtained with respect to all the criteria. The AHP is flexible and powerful because the scores, and therefore the final ranking, are obtained on the basis of the pairwise comparisons of both the criteria and the options provided by the user. The calculations made by the AHP are always influenced by the decision maker’s experience and preferences, and therefore the AHP can be considered as a tool that is able to translate the evaluations (both qualitative and quantitative) made by the decision maker into a multi-criteria ranking. 
Another advantage is that the AHP is simple: there is no need to build a complex expert system with the decision maker's knowledge embedded in it. However, a disadvantage of the AHP is that it can require a large number of evaluations; this is most evident for problems with many criteria and options. Even though every single evaluation is simple, because it only requires the decision maker to express how two options or criteria compare to each other, the load of the evaluation task may become unreasonable. In fact, the number of pairwise comparisons grows quadratically with the number of criteria and options.

The AHP is implemented in the following three steps:

1. Computing the vector of criteria weights
2. Computing the matrix of option scores
3. Ranking the options.

Let us describe each of the steps in detail, assuming that m evaluation criteria are considered, and n options are to be evaluated.

Computing the Vector of Criteria Weights

The AHP starts by creating a pairwise comparison matrix A. The matrix A is an m × m real matrix, where m is the number of evaluation criteria considered. Each entry $a_{jk}$ of the matrix A represents the importance of the jth criterion with respect to the kth criterion. If $a_{jk} > 1$, then the jth criterion is more important than the kth criterion, while if $a_{jk} < 1$, then the jth criterion is less important than the kth criterion. If two criteria have the same importance, then the entry $a_{jk}$ is 1. The entries $a_{jk}$ and $a_{kj}$ must satisfy the following constraint: $a_{jk} \cdot a_{kj} = 1$. It follows that $a_{jj} = 1$. Analysts measure the relative importance between two criteria using a numerical scale as shown in Table 3.1.

Table 3.1 Relative importance of AHP criteria

Value of a_jk    Interpretation
1                j and k are equally important
3                j is slightly more important than k
5                j is more important than k
7                j is strongly more important than k
9                j is absolutely more important than k

After building the matrix A, let us derive from A the normalized pairwise comparison matrix $A_{norm}$ by making the sum of the entries in each column equal to 1; i.e., each entry $\bar{a}_{jk}$ of the matrix $A_{norm}$ is computed as:

$$\bar{a}_{jk} = \frac{a_{jk}}{\sum_{l=1}^{m} a_{lk}}$$

Finally, the criteria weight vector w (i.e., an m-dimensional column vector) contains the average of the entries in each row of $A_{norm}$, i.e.:

$$w_j = \frac{\sum_{l=1}^{m} \bar{a}_{jl}}{m}$$
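The weight computation above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names and the three-criterion judgments are hypothetical, and only the upper triangle of A is supplied, with the rest filled in from the reciprocity constraint.

```python
from typing import Dict, List, Tuple

def build_comparison_matrix(m: int, judgments: Dict[Tuple[int, int], float]) -> List[List[float]]:
    """Fill an m x m matrix from upper-triangle judgments a_jk (j < k),
    using a_kj = 1 / a_jk and a_jj = 1."""
    A = [[1.0] * m for _ in range(m)]
    for (j, k), value in judgments.items():
        A[j][k] = float(value)
        A[k][j] = 1.0 / float(value)
    return A

def criteria_weights(A: List[List[float]]) -> List[float]:
    """Normalize each column to sum to 1, then average each row."""
    m = len(A)
    column_sums = [sum(A[l][k] for l in range(m)) for k in range(m)]
    return [sum(A[j][k] / column_sums[k] for k in range(m)) / m for j in range(m)]

# Hypothetical judgments on the scale of Table 3.1:
# criterion 0 vs 1 -> 3, criterion 0 vs 2 -> 5, criterion 1 vs 2 -> 3.
A_example = build_comparison_matrix(3, {(0, 1): 3, (0, 2): 5, (1, 2): 3})
w_example = criteria_weights(A_example)

if __name__ == "__main__":
    print([round(x, 2) for x in w_example])  # roughly [0.63, 0.26, 0.11]
```

Because every column of the normalized matrix sums to 1, the resulting weights always sum to 1 as well.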

Computing the Matrix of Option Scores

The matrix of option scores is an n × m real matrix S. Each entry $s_{ij}$ of S represents the score of the ith option with respect to the jth criterion. To derive these scores, a pairwise comparison matrix $B^{(j)}$ is first built for each of the m criteria, j = 1, ..., m. The matrix $B^{(j)}$ is an n × n real matrix, where n is the number of options evaluated. Each entry $b^{(j)}_{ih}$ of the matrix $B^{(j)}$ represents the evaluation of the ith option compared to the hth option, with respect to the jth criterion. If $b^{(j)}_{ih} > 1$, then the ith option is better than the hth option, while if $b^{(j)}_{ih} < 1$, then the ith option is worse than the hth option. If the two options are evaluated as equivalent with respect to the jth criterion, then the entry $b^{(j)}_{ih}$ is 1. The entries $b^{(j)}_{ih}$ and $b^{(j)}_{hi}$ satisfy the following constraint:

$$b^{(j)}_{ih} \cdot b^{(j)}_{hi} = 1 \quad \text{and} \quad b^{(j)}_{ii} = 1, \text{ for all } i.$$

The AHP then applies to each matrix $B^{(j)}$ the same two-step procedure described for the pairwise comparison matrix A. This procedure divides each entry by the sum of the entries in the same column, and then it averages the entries on each row, thus obtaining the score vectors $s^{(j)}$, j = 1, ..., m. The vector $s^{(j)}$ contains the scores of the evaluated options with respect to the jth criterion. Finally, the score matrix S is obtained as:

$$S = \left[ s^{(1)} \; \ldots \; s^{(m)} \right]$$
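The two-step procedure, together with the weighted-sum aggregation of option scores and criteria weights, can be sketched end to end as follows. The matrices describe a hypothetical problem with two criteria and three options, and all names are illustrative assumptions rather than part of the method's definition.

```python
from typing import List, Tuple

Matrix = List[List[float]]

def priorities(M: Matrix) -> List[float]:
    """Column-normalize a pairwise comparison matrix, then average each row."""
    n = len(M)
    column_sums = [sum(M[r][c] for r in range(n)) for c in range(n)]
    return [sum(M[r][c] / column_sums[c] for c in range(n)) / n for r in range(n)]

def ahp_rank(A: Matrix, B_list: List[Matrix]) -> Tuple[List[float], List[float], List[int]]:
    """Return criteria weights w, global scores v, and option indices
    ordered from best to worst."""
    w = priorities(A)                           # one weight per criterion
    score_columns = [priorities(B) for B in B_list]  # s^(j), one per criterion
    n = len(score_columns[0])
    v = [sum(score_columns[j][i] * w[j] for j in range(len(w))) for i in range(n)]
    ranking = sorted(range(n), key=lambda i: v[i], reverse=True)
    return w, v, ranking

# Hypothetical data: criterion 0 is judged 3 times as important as criterion 1.
A = [[1.0, 3.0],
     [1 / 3, 1.0]]
# Option comparisons under criterion 0 (option 0 is preferred) ...
B0 = [[1.0, 2.0, 4.0],
      [0.5, 1.0, 2.0],
      [0.25, 0.5, 1.0]]
# ... and under criterion 1 (option 2 is preferred).
B1 = [[1.0, 0.5, 0.25],
      [2.0, 1.0, 0.5],
      [4.0, 2.0, 1.0]]

w, v, ranking = ahp_rank(A, [B0, B1])

if __name__ == "__main__":
    print("weights:", w)
    print("global scores:", v)
    print("ranking (best first):", ranking)
```

In this example option 0 comes first because it performs best under the criterion that carries three quarters of the total weight.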

Ranking the Options

Once the weight vector w and the score matrix S have been computed, the AHP obtains a vector v of global scores by multiplying S and w, i.e.,

$$v = S \cdot w$$

The ith entry $v_i$ of v represents the global score assigned by the AHP to the ith option. As the final step, the option ranking is accomplished by ordering the global scores in decreasing order. After examining the AHP in detail, let us move on to a case study to demonstrate applying AHP in the context of Risk Analysis.

Case Study

This case study describes a construction project in which the AHP assists a contractor in evaluating the risk of the project. The AHP applies to the construction project as follows:

1. Create a hierarchy of risks
2. Develop the relative weights of the risks
3. Create and determine likelihoods of levels of risks.

Let us discuss the above steps in more detail.

Create a Hierarchy of Risks

This step begins with identifying some of the possible risks that may arise before or during the construction project. Then, the risks are arranged according to a logical hierarchy. Due to space limitation, the significant risk factors of only the three most relevant risk categories in this context have been selected for consideration in this demonstration. The risks can be categorized and subcategorized as follows:

1. Financial and Economic Risks: The larger and more complex a project is, the more critical it is to identify the risks associated with it
(a) Subcontractor Financial Default: Subcontractor financial default can lead to serious financial problems and time delays for the prime contractor. It is the responsibility of the prime contractor to thoroughly review the qualifications of the candidates for the position of a subcontractor so that they do not cause delays due to neglecting to pay their bills or failing to complete their jobs
(b) Unavailability of Funds: Unavailability of funds from the federal government and untimely payments to the subcontractors are factors that should be considered by a contractor. This can derail the project before it even gets started
(c) Inflation and Price Escalation: The risks are generally of a speculative nature. It is not possible, for example, to insure against increases in cost which occur due to inflation. The same holds for charter rates and for the costs of construction materials, which escalate due to inflation
2. Political Risks: Short of external or internal instabilities, a project usually cannot be affected by the politics of the country that it is based in. However, changes in government policies, civil disturbances, export/import restrictions, embargoes and expropriation of contractor's equipment and plant can affect the smooth operation of the project
3. Acts of God Risks: Equipment, materials used, and the resulting structure of the construction project itself can be damaged during transport to the work site
(a) Earthquakes: Seismic activity can affect the foundation and structure of the project. An earthquake can also negatively impact the construction project during its development phase and can affect the surrounding soil structure by liquefying it
(b) Water Damage and Floods: The site of the project can be prone to water damage in the form of heavy flooding. Heavy flooding can lead to severe damage in the foundation and structure of the construction project that can render it irredeemable. In addition to damage to the construction project itself, the surrounding and underlying soil may be eroded, rendering future construction projects futile.
(c) Soil Subsidence and Collapse: Areas considered vulnerable to soil subsidence and collapse include approach roads, access routes, and flood embankments, which can create an issue in transporting workers and materials to the construction site. This problem is especially critical in case the construction site is near a large water body, as the chance of soil subsidence and collapse is high. Even though the extent of any physical damage may be small, the subsequent delays that it causes can be expensive.

Thus, we can incorporate the risks in Level 2 and Level 3 of the hierarchy (Fig. 3.3), which shows the example hierarchy for a bridge construction project. Level 1 represents the construction firm's overall goal, namely: the most likely risk to the bridge project. Level 4 contains the three possible levels or intensities of the total risk of the project. More outcome levels (e.g., severe, moderate, strong, fair, weak, favorable, unfavorable etc.) could be added between the subfactor level and the risk level.

Develop the Relative Weights of the Risks

Further, let us determine the likelihood levels of the risks as well as the importance of the factors and subfactors related to those risks. In order to do this, we need to obtain the judgments from the management of the construction firm and create matrices of judgment as shown in Table 3.2.
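As a hedged check, the judgments reported in Table 3.2 (F1 vs F2 = 3, F1 vs F3 = 6, F2 vs F3 = 5) can be turned into weights in two ways. The sketch below applies the column-normalize-and-average procedure described earlier and, as an assumption added here for comparison only, the row geometric mean, a common alternative approximation of the priority vector. The text does not say which variant produced the table's relative importance column; all names are illustrative.

```python
import math
from typing import List

Matrix = List[List[float]]

# Judgment matrix for the three risk factors, as reported in Table 3.2.
TABLE_3_2: Matrix = [
    [1.0, 3.0, 6.0],
    [1 / 3, 1.0, 5.0],
    [1 / 6, 1 / 5, 1.0],
]
REPORTED = [0.635, 0.287, 0.078]  # "Relative importance" column of Table 3.2

def column_average_weights(M: Matrix) -> List[float]:
    """Procedure from the text: normalize columns, then average rows."""
    n = len(M)
    column_sums = [sum(M[r][c] for r in range(n)) for c in range(n)]
    return [sum(M[r][c] / column_sums[c] for c in range(n)) / n for r in range(n)]

def geometric_mean_weights(M: Matrix) -> List[float]:
    """Assumed alternative: normalized geometric mean of each row."""
    n = len(M)
    means = [math.prod(row) ** (1.0 / n) for row in M]
    total = sum(means)
    return [g / total for g in means]

col_avg = column_average_weights(TABLE_3_2)
geo_mean = geometric_mean_weights(TABLE_3_2)

if __name__ == "__main__":
    print("column average:", [round(x, 3) for x in col_avg])
    print("geometric mean:", [round(x, 3) for x in geo_mean])
    print("reported:      ", REPORTED)
```

Both variants rank the factors identically (F1, then F2, then F3); the column-average weights differ from the reported values by less than 0.01, while the geometric-mean weights agree with them to three decimals.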


Fig. 3.3 Risk hierarchy, bridge construction project
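The aggregation through such a hierarchy can be sketched with the per-subfactor contributions that the case study reports in Table 3.3 below. The data layout and function name are assumptions made for illustration; the numbers are copied from the table, so small differences from its totals are rounding effects.

```python
from typing import Dict, Tuple

# subfactor -> (parent factor, contribution to High, Medium, Low), from Table 3.3.
TABLE_3_3: Dict[str, Tuple[str, float, float, float]] = {
    "F11": ("F1", 0.099, 0.031, 0.020),
    "F12": ("F1", 0.041, 0.121, 0.271),
    "F13": ("F1", 0.005, 0.017, 0.030),
    "F21": ("F2", 0.008, 0.019, 0.055),
    "F22": ("F2", 0.040, 0.147, 0.018),
    "F31": ("F3", 0.005, 0.015, 0.002),
    "F32": ("F3", 0.035, 0.010, 0.004),
    "F33": ("F3", 0.002, 0.005, 0.001),
}

def aggregate(table: Dict[str, Tuple[str, float, float, float]]):
    """Sum the contributions upward: per risk level and per factor."""
    levels = {"High": 0.0, "Medium": 0.0, "Low": 0.0}
    factors: Dict[str, float] = {}
    for _subfactor, (factor, high, medium, low) in table.items():
        levels["High"] += high
        levels["Medium"] += medium
        levels["Low"] += low
        factors[factor] = factors.get(factor, 0.0) + high + medium + low
    return levels, factors

levels, factors = aggregate(TABLE_3_3)
most_likely = max(levels, key=levels.get)

if __name__ == "__main__":
    print({k: round(x, 3) for k, x in levels.items()})
    print({k: round(x, 3) for k, x in factors.items()})
    print("most likely level of total risk:", most_likely)
```

Summing the same contributions per factor recovers, up to rounding, the factor weights of Table 3.2, which is a quick consistency check on the hierarchy.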

Table 3.2 demonstrates that the financial risks are judged to be more important than political risks on the AHP verbal ratio scale. The main reason for this is that the management's concern about political risks is mitigated by the fact that in most cases governments will not impede the construction of new projects, as it directly benefits the infrastructure of the country that the projects are based in. However, comparatively, money can be a very limited resource depending on the country; therefore, a government may not necessarily allocate the financial resources required to complete the construction project. Concerning financial risks and acts of God risks, the management is uncertain which of them are more important. Conversely, political risks are typically more important than acts of God risks. A similar procedure applies to obtain judgments on the relative importance of the subfactors and the relative likelihood of the levels of risk (e.g., high, medium, and low total risk) as shown in Table 3.3.

Table 3.2 Judgment for risk factors

With respect to goal    F1     F2     F3    Relative importance
F1                      1      3      6     0.635
F2                      1/3    1      5     0.287
F3                      1/6    1/5    1     0.078

Table 3.3 demonstrates that the unavailability of funds is the most influential subfactor in determining the level of risk for this project with a relative importance of 0.433, followed by changes in local laws (0.205). The management sees inflation, and soil subsidence and collapse as the least important subfactors (with a priority of 0.007).

Table 3.3 Risk levels and their likelihood

                                         Levels of risks
Factors       Subfactors       High     Medium    Low
F1 (0.635)    F11 (0.150)      0.099    0.031     0.020
              F12 (0.433)      0.041    0.121     0.271
              F13 (0.052)      0.005    0.017     0.030
F2 (0.287)    F21 (0.082)      0.008    0.019     0.055
              F22 (0.205)      0.040    0.147     0.018
F3 (0.078)    F31 (0.022)      0.005    0.015     0.002
              F32 (0.049)      0.035    0.010     0.004
              F33 (0.007)      0.002    0.005     0.001
Likelihood of level of risk    0.235    0.364     0.401

Create and Determine Likelihood of Level of Risk

At the final step, the likelihood of high, medium, and low total risk is determined by aggregating the relative weights through the hierarchy. The results show that the project is characterized as low risk (with a likelihood of 0.401), as it appears at the bottom of Table 3.3.

Choosing by Advantages

Choosing by advantages (CBA) is an MCDM method developed by Suhr, which allows decision makers to focus on what is important: the advantages (i.e., beneficial differences) that each alternative could deliver to stakeholders, and the decisions to be taken (see Fig. 3.4). CBA focuses on advantages for the customer (i.e., end user) of the project and helps to transfer the ideas regarding client requirements to the decision makers. CBA encourages the use of correct data by basing decisions on anchoring questions, relevant facts, and the importance of differences between the advantages of alternatives. This process leverages and facilitates the workflow when translating the activity of generating design alternatives into construction operations, through a more consistent and less subjective decision-making process when deciding among alternatives.

Fig. 3.4 CBA process

CBA Principles

The key principles are:

1. Decisions must be based on the importance of the beneficial differences between alternatives
2. Decisions must be anchored to relevant facts
3. Different types of decisions call for different methods of decision-making
4. Decision makers must learn and skillfully use these methods.

Principle 3 suggests different methods for different types of decisions. These range from simple binary decisions with no resource implications, to complex ones with many alternatives, each with its own set of resource implications. It is important to consider resource requirements in a different way from other attributes, as there is a critical issue for most stakeholders about what they would do with the resource if it was not consumed by the decision currently being considered. In CBA, the beneficial differences of non-resource attributes of the alternatives are evaluated before any consideration is given to resources, so that any tradeoffs can be clearly explored.

CBA Features

CBA uses an open, transparent, and auditable decision-making process that reduces the complexity of most projects and of the client systems that commission them. The process handles both objective and subjective data within a single decision process. Considering the complexity and ever-changing nature of large original production, CBA is well suited to this scenario, as a clear audit trail for the decisions made allows stakeholders to revisit the implications of changing those decisions. When decision makers are faced with too much choice, they can become overwhelmed and may leap to conclusions without exploring all reasonable options.
CBA offers a systematic way for all stakeholders to manage the process of deciding between large numbers of alternatives without being overwhelmed and ensures that they do not artificially limit the number of alternatives considered with complex decisions. CBA identifies only the advantages of alternatives, as opposed to the traditional methods of weighing up both the advantages and disadvantages of alternatives, to avoid double-counting and omissions. The second rule is to separate cost from value. Cost is a constraint, not a factor, and thereby should be given special attention when making a decision. It should be noted that other confounding variables may also be considered as constraints, such as contractual requirements.

CBA Implementation

CBA uses facilitation so that meetings are not slowed down by discussions about processes that were already covered in the previous phase(s). In most cases, the CBA process includes the following five steps:

1. Stage setting: It begins by defining the purpose and identifying the issues, the criteria of the decision, and who should be involved in the decision-making process
2. Innovation: This next step involves identifying the alternatives and making the differences between them visible and tangible
3. Decision-making: The third step involves:
(i) Listing the advantages of each alternative
(ii) Deciding on the importance of each advantage
(iii) Choosing the alternative with the greatest advantages before considering the resource implications of the alternatives and making the final decision
4. Reconsideration: This step involves reviewing the final decision to check that it is in fact what is required. It is changed if appropriate, and if not changed, then it is committed to
5. Implementation: The final step entails doing whatever is necessary to make the final decision a reality.

These five steps are illustrated by the following diagram (Fig. 3.5).

Case Study

This case study presents a real-world application of CBA in the context of Risk Analysis. It considers the case of a construction project that requires selecting a fall protection measure to be implemented on a jobsite, in order to mitigate the risk of falling from the roof. In this particular case study, three safety solutions were identified, and CBA is used to make the complex decision as to which particular safety solution is to be selected. Specifically, the decision determines the safety solution based on a careful examination, which considers nine factors. In order to implement this, CBA uses the tabular method, which is conducted using the following steps:

Fig. 3.5 CBA process steps


1. Generating possible alternatives
2. Identifying factors and criteria
3. Summarizing the attributes of each alternative
4. Determining the advantages of each alternative
5. Deciding the importance of each advantage
6. Selecting the best alternative.

The steps are discussed in more detail below.

Generating Possible Alternatives

In this particular case of designing fall protection measures for a construction site, we only consider three alternatives for the sake of simplicity. They are described below:

(i) Installation of a temporary guardrail system that meets OSHA guardrail height requirements on the roof during the construction and maintenance operations to protect the safety of workers
(ii) Installation of six roof anchors on the building rooftop
(iii) Installation of a 100 cm roof parapet. This particular method is an example of a prevention through design (PtD) solution.

Identifying Factors and Criteria

The stakeholders conducted a review to document the factors having a potential impact on the selection of a rooftop fall protection measure and their associated risks and benefits. This was followed by a brainstorming session held with the stakeholders in order to decide on the most important factors contributing to the final decision regarding the selection of a fall protection system. As a result, nine factors were selected to be the components of the decision:

• Reliability of safety measure (Factor 1): One of the primary reasons for using the CBA process is to investigate the reliability of the proposed safety measures. The hierarchy of controls, which includes five levels, with elimination being the most effective control, and personal protective equipment (PPE) being the least preferred control, is used as a means of determining the feasibility of each safety measure
• Ease of implementation (Factor 2): Ease of implementation is a potential crisis factor when designing a safe application. As discussed previously, safety managers are typically surrounded by many considerations that eventually affect their safety selections. For instance, measures involving the concept of PtD are most likely to require the early involvement of designers, while administrative/engineering measures can be solely implemented by the contractor or safety staff
• Construction site safety (Factor 3): The construction workforce is "the most valuable resource" involved in the process of constructing a facility; therefore, protecting the safety of construction workers must be the priority of any planning effort. Any failure to guarantee worker safety may cause serious injuries or illness and lead to possible litigation that impacts the project's success. The attribute of each of the alternatives varies in its influence on construction site safety. For example, while a temporary guardrail significantly reduces the risk of falling, it does not eliminate the hazard altogether. Even with such implementation, it is still possible that a guardrail might break during a work operation, causing serious injury. Similarly, fall protection gear does not eliminate the hazard even though it may catch workers and prevent them from falling to the ground. However, a 100 cm roof parapet eliminates the hazard of falling over the sides of the roof
• Safety of maintenance personnel, and safety of end users (Factor 4, and Factor 5): The safety of maintenance personnel and end users is considered in a similar manner to the safety of construction workers, but weighted differently
• Aesthetics (Factor 6): Building aesthetics is an important element when designing a building not only for designers, but also for owners. The contractor and designer want to construct an attractive building to keep the owner satisfied. A tall parapet can improve the building aesthetics by keeping maintenance equipment unseen. Because extensive mechanical construction operations are expected to be carried out on the one-story building's roof, this factor may have a substantial impact on the selection of the decision makers, represented by participants in this case study
• Productivity of workers (Factor 7): The safety measures may potentially impact productivity, which, if negative, is considered a type of waste. Any task that generates waste would be undesirable and clearly distinguished during the decision-making process. Researchers have quantified the impact of a roof anchor system on the efficiency of workers as opposed to working on a well-protected roof. They found a 15% reduction in worker productivity due to the use of fall protection gear, as it restricts worker movement and requires greater effort to tie off. It was also determined that the temporary guardrail system can impact worker productivity negatively due to distraction. In contrast, PtD solutions improve both the quality of the final product and the productivity of construction workers
• Premaintenance/installation effort (Factor 8): Preparation needed before carrying out maintenance/installation operations may substantially affect the total task duration. For instance, the extra effort required to install a temporary guardrail is significant compared to the effort needed when working near protected roof edges
• Construction schedule (Factor 9): The construction schedules required for different designs can differ greatly depending on the complexity of the design, as well as the construction means and methods used on the site. The original design of the case study building calls for a 30 cm tall parapet, while alternative #3 involves an increase in parapet height by about 70 cm, which may affect the construction schedule. However, due to the inherent design, the stakeholders decided that none of the alternatives selected would affect the project duration.

These factors, their associated risks, and comments are summarized in Table 3.4.

Table 3.4 Alternatives and their attributes (Att: attribute; Ad: advantage; IofA: importance of advantage)

Alternatives: Alt.1: Temporary guardrail; Alt.2: Roofing anchor system; Alt.3: 100 cm (40-inch) parapet

1. Reliability of safety measure. Criterion: Elimination is the most preferred, and eng. control is the least
   Alt.1: Att: Engineering control. Ad: More reliable and a little safer. IofA: 15
   Alt.2: Att: Engineering, PPE, administrative. Ad: N/A. IofA: N/A
   Alt.3: Att: Elimination. Ad: Much more reliable and safer. IofA: see note

2. Ease of implementation. Criterion: The easier to implement, the better
   Alt.1: Att: Easy to install; only contractor involved. Ad: More known; fewer people involved; less technical. IofA: 70
   Alt.2: Att: Two parties involved in implementation; easy to install. Ad: Fewer people involved. IofA: 60
   Alt.3: Att: Three parties involved in implementation; moderate to install. Ad: N/A. IofA: N/A

3. Construction site safety. Criterion: Eliminating hazard is preferred
   Alt.1: Att: Significantly reduces falling over the side; requires installation while no barrier is present; requires admin. control. Ad: N/A. IofA: N/A
   Alt.2: Att: Prevents falling to the ground; reduces severity of injuries; requires PPE and admin. control; partial permanent control. Ad: Permanent over portion of construction phase. More structurally stable. IofA: 35
   Alt.3: Att: Prevents falling over side; permanent control. Ad: Permanent during portion of construction phase. "It's there." Fewer admin. controls needed. IofA: see note

4. Safety of maintenance personnel. Criterion: Eliminating hazard is preferred
   Alt.1: Att: Significantly reduces falling over the side; requires installation while no barrier is present; requires admin. control. Ad: N/A. IofA: N/A
   Alt.2: Att: Prevents falling to the ground; reduces severity of injuries; requires PPE and admin. control; partial permanent control. Ad: Permanent. More structurally stable. IofA: 25
   Alt.3: Att: Prevents falling over side; permanent control. Ad: Permanent. "It's there." Fewer admin. controls needed. IofA: see note

5. Safety of end users. Criterion: Enhancing end user's safety is preferred
   Alt.1: Att: Unlikely to be used by end user. Ad: N/A. IofA: N/A
   Alt.2: Att: Unlikely to be used by end user. Ad: N/A. IofA: N/A
   Alt.3: Att: Permanent protection provided. Prevents falling over side. Ad: Allowing user to conduct work safely without installing another system or using PPE. IofA: see note

6. Aesthetics. Criterion: The nicer, the better
   Alt.1: Att: No impact. Ad: N/A. IofA: N/A
   Alt.2: Att: No impact. Ad: N/A. IofA: N/A
   Alt.3: Att: Taller exterior wall. Prevents seeing equipment; nice looking from below. Ad: Nicer looking (hiding maintenance equipment). IofA: see note

7. Productivity of workers. Criterion: Higher productivity is preferred
   Alt.1: Att: Some impact on productivity due to distraction. Ad: Higher productivity. IofA: 50
   Alt.2: Att: Decrease productivity for construction and maintenance workers by 15% due to wearing fall protection gear. Ad: N/A. IofA: N/A
   Alt.3: Att: No impact. Ad: Highest productivity. IofA: see note

8. Effort needed before maintenance/installation. Criterion: Less effort is better
   Alt.1: Att: Significant extra effort required to install if working near edge. Ad: N/A. IofA: N/A
   Alt.2: Att: Some extra effort required to attach lanyard if working near edge. Ad: Less extra effort required. IofA: 13
   Alt.3: Att: No extra effort required. Ad: No extra effort required. IofA: see note

9. Construction schedule. Criterion: The faster, the better
   Alt.1: Att: No impact. Ad: N/A. IofA: N/A
   Alt.2: Att: No impact. Ad: N/A. IofA: N/A
   Alt.3: Att: No impact. Ad: N/A. IofA: N/A

Total IofAs: Alt.1: 135; Alt.2: 133; Alt.3: 285

Note: The Alt.3 IofA values are 100, 20, and 50 for factors 1, 3, and 4, and 10, 55, 35, and 15 for factors 5 to 8; the assignment of these values to individual factors within each group is not preserved in the table layout.
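The bottom row of Table 3.4 can be reproduced with a short sketch that sums the importance of advantages (IofA) of each alternative and selects the largest total, which is the choice made when cost is left aside. The dictionary layout and function names are illustrative assumptions; the values are those listed in Table 3.4, with N/A entries omitted.

```python
from typing import Dict, List

# IofA values per alternative, read from Table 3.4 (N/A entries omitted).
IOFA: Dict[str, List[int]] = {
    "Alt.1 Temporary guardrail": [15, 70, 50],
    "Alt.2 Roofing anchor system": [60, 35, 25, 13],
    "Alt.3 100 cm parapet": [100, 20, 50, 10, 55, 35, 15],
}

def total_iofa(iofa: Dict[str, List[int]]) -> Dict[str, int]:
    """Total importance of advantages for each alternative."""
    return {alternative: sum(values) for alternative, values in iofa.items()}

def greatest_value(iofa: Dict[str, List[int]]) -> str:
    """Alternative with the greatest total IofA (cost is not considered here)."""
    totals = total_iofa(iofa)
    return max(totals, key=totals.get)

totals = total_iofa(IOFA)
choice = greatest_value(IOFA)

if __name__ == "__main__":
    for alternative, total in totals.items():
        print(f"{alternative}: {total}")
    print("greatest total IofA:", choice)
```

Cost is deliberately absent from the sketch, in line with the CBA rule of separating cost from value; it enters only afterwards, as a constraint on the alternative with the greatest total.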

Summarizing the Attributes of Each Alternative

In this step, the participants summarized the attributes of each alternative in response to each of the above-mentioned nine factors using the criteria as a rule of judgment. Some of the attributes were described above.

Determining the Advantages of Each Alternative

In this step, the participants identified the advantages of each alternative, relying on the criterion and attributes for each factor. The procedure was as follows:

(i) Select the least preferred attribute for each factor
(ii) Determine the differences between the least preferred attribute and the other attributes
(iii) Decide the most preferred advantage of each factor.

Carried out this way, the determination of the advantages of the alternatives should be an objective task.

Deciding the Importance of Each Advantage

In this stage, the participants assigned a level of importance to each advantage. A scale from 1 to 100 was selected to provide the participants with flexibility in assigning different levels of importance. The advantage with paramount importance should be determined first and assigned a score of 100. The next task is to weight the rest of the advantages using the paramount advantage as a point of reference. The final step of this stage is to compute the total importance of advantages (IofAs) of each alternative (see the bottom row of Table 3.4).

Selecting the Best Alternative

In this phase, the alternative that provides the greatest value to the stakeholders/decision makers should be selected. If funds are unlimited, a cost analysis will not be needed. In the case study example, Alternative #3 (a 100 cm parapet) will be selected, as it was identified as having the greatest value (total IofAs). However, funds are seldom unlimited, and therefore cost should be considered in the decision-making process. Table 3.5 summarizes both the initial and lifecycle cost assessments of implementing each system. The cost of maintenance operations includes costs associated with a full-body harness, self-retractable lifeline, lanyards, and fall protection training program as required by OSHA. These items need to be provided regularly (assumed every 5 years) due to aging, obsolescence, and turnover, when the roof anchor system is adopted.

Data Envelopment Analysis

Data envelopment analysis (DEA), first put forward by Charnes, Cooper, and Rhodes in 1978, is a performance measurement technique, which is used for evaluating the relative efficiency of decision-making units (DMU) in organizations. DEA is suited for the type of evaluation that enables results to be compared by making allowances for factors. DEA makes it possible to identify efficient and inefficient units in a framework where results are considered in their particular context. In addition, DEA provides information that enables the comparison of each inefficient unit with its "peer group," that is, a group of efficient units that are comparable to the unit under analysis.

Analyzing DEA Efficiency

Let us consider the following dataset with four variables: two inputs (X1 and X2) and two outputs (Y1 and Y2). The dataset is relatively small, with just 12 decision-making units as shown in Table 3.6. Let us assume that only X1 and Y1 would be important in the analysis. This is followed by plotting the 12 observations on a scatter plot as shown in Fig. 3.6.


Table 3.5 Initial and lifecycle cost evaluation

Temporary guardrail
  Work description                               Cost
  Material cost of a guardrail system            $1,173
  Installation and removal: 24 work hrs          $1,205
  Fall protection equipment                      $2,048
  Delivery costs and miscellaneous               $600
  Total initial cost                             $5,026
  Lifecycle cost assessment (50 years)           $125,650

Roofing anchor system
  Work description                               Cost
  Material cost of 6 eng. roof anchors           $2,638
  Installation of 6 roof anchors/davits          $1,706
  Base plates: supply and installation           $1,082
  Miscellaneous expenses                         $6,756
  Total initial cost                             $12,182
  Lifecycle cost assessment (50 years)           $23,182

100 cm (40-inch) parapet
  Work description                               Cost
  Walls and ceiling                              $19,533
  Roofing                                        $4,475
  Exterior wall panels                           $20,020
  Extra design fees                              Included
  Total initial cost                             $44,210
  Lifecycle cost assessment (50 years)           $44,210

Table 3.6 DMU dataset

DMU    X1      X2       Y1     Y2
1      4.0     140.0    2.0    28.0
2      5.0     90.0     1.0    22.5
3      6.0     36.0     6.0    12.0
4      10.0    300.0    8.0    60.0
5      11.0    66.0     7.0    16.5
6      8.0     36.0     6.0    12.0
7      9.0     12.0     7.0    6.0
8      5.0     210.0    3.0    30.0
9      5.5     33.0     4.4    5.5
10     8.0     288.0    4.0    72.0
11     10.0    80.0     2.0    20.0
12     8.0     8.0      1.0    4.0

DMU 3 is the most efficient one, since the straight line from the origin (0, 0) through DMU 3 has a higher slope than the line through any of the remaining DMUs. This line is called the efficiency frontier, and together with the X axis it defines a cone with its vertex at the origin. This cone is called the production possibility set, since it contains all real data, and according to DEA axioms, only points inside this cone correspond to possible working conditions based on best achieved performance.
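For the single-input, single-output case, the slope comparison above can be reproduced in a few lines. The following is a minimal Python sketch, assuming only the X1 and Y1 columns of Table 3.6; the function name `radial_efficiency` is an illustrative choice and not part of any DEA library. Each DMU's efficiency is taken as its Y1/X1 slope divided by the steepest slope in the dataset.

```python
# (X1, Y1) pairs from Table 3.6, keyed by DMU number.
dmus = {
    1: (4.0, 2.0), 2: (5.0, 1.0), 3: (6.0, 6.0), 4: (10.0, 8.0),
    5: (11.0, 7.0), 6: (8.0, 6.0), 7: (9.0, 7.0), 8: (5.0, 3.0),
    9: (5.5, 4.4), 10: (8.0, 4.0), 11: (10.0, 2.0), 12: (8.0, 1.0),
}


def radial_efficiency(units):
    """Return each unit's output/input slope relative to the steepest slope."""
    slopes = {k: y / x for k, (x, y) in units.items()}
    frontier_slope = max(slopes.values())  # slope of the efficiency frontier
    return {k: s / frontier_slope for k, s in slopes.items()}


efficiency = radial_efficiency(dmus)
best = max(efficiency, key=efficiency.get)

for k in sorted(efficiency):
    print(f"DMU {k:2d}: efficiency = {efficiency[k]:.3f}")
print("Most efficient DMU:", best)
```

With these data, DMU 3 has the steepest slope and therefore an efficiency of 1, while DMU 1 scores 0.5, which agrees with the 50% figure derived for DMU 1 in the text that follows.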


Fig. 3.6 DMU dataset (plotted view)

Further, let us analyze DMU 1 in greater detail to examine how it may reach efficiency. DMU 1 has two ways of reaching efficiency:

• Increasing output until it reaches M.O (maximization of output)
• Decreasing input until it reaches m.i (minimization of input).

The actual value for efficiency is defined as the ratio between the distances:

$$\frac{d(\text{m.i} - \text{z.i})}{d(1 - \text{z.i})} = \frac{2 - 0}{4 - 0} = 50\%$$

Similarly, the actual value for inefficiency is defined as the ratio between the distances:

$$\frac{d(\text{M.O} - \text{Z.O})}{d(1 - \text{Z.O})} = \frac{4 - 0}{2 - 0} = 200\%$$

Here, z.i means the zero input point and Z.O the zero output point for DMU 1. We can conclude that in DEA, there exist two different methods of optimization related to radial measures of score: input minimization and output maximization. The artificial points M.O and m.i are termed targets or composite points for DMU 1. Point 3 is the only efficient one, and it is the peer for all remaining points. Under constant returns to scale, efficiency is the reciprocal of inefficiency; the peer set is also the same regardless of the orientation, although the targets are different, as concluded from the exposition above.

Formal Model for DEA

In DEA, the efficiency $EF_a$ of a specific DMU $a$ under analysis is defined as the ratio between a weighted sum of its $s$ outputs $y_{ra}$ and a weighted sum of its $m$ inputs $x_{ia}$, a natural extension of the concept of efficiency used in the fields of physics and engineering:

$$EF_a = \frac{\sum_{r=1}^{s} \mu_{ra}\, y_{ra}}{\sum_{i=1}^{m} v_{ia}\, x_{ia}}$$

When assessing a set of $J$ organizations, where $x_{ik}$ stands for the $i$th input of the $k$th DMU, with a similar meaning for $y_{rk}$, the weights $\mu_{rk}$ and $v_{ik}$ in the equation above are chosen for each DMU under evaluation as those that maximize its efficiency as defined by $EF_a$.

Case Study

Let us illustrate the use of DEA in Risk Analysis, as taking risks is fundamental to doing business. In order to make a profit, by necessity one must take risks. To illustrate the working of DEA in Risk Analysis, let us use a hypothetical nuclear power plant. In a decision setting (including a supply chain) where a nuclear power plant site must be selected, twelve alternatives with four criteria exist. The criteria considered include cost, expected lives lost, risk of catastrophe, and civic improvement. Expected lives lost reflect workers as well as expected local (civilian bystander) lives lost. The hierarchy of objectives is shown below. The alternatives that are available, along with measures on each criterion (including two categorical measures), are given in Table 3.7.

Table 3.7 Alternatives and criteria measures

Alternatives              Cost (EUR bln)   Expected lives lost   Risk        Civic improvement
Verona, Italy             40               60                    Very high   Low
Chemnitz, Germany         100              140                   Very low    Very high
York, UK                  60               40                    Low         High
Eindhoven, Netherlands    60               40                    Medium      Medium
Split, Croatia            70               80                    Low         Very high
Marcelle, France          70               80                    High        Medium
Barcelona, Spain          60               50                    High        High
Porto, Portugal           50               30                    Medium      Medium
Plovdiv, Bulgaria         90               130                   Very high   Very low
Valetta, Malta            80               120                   Very low    Very low
Gratz, Austria            80               70                    Medium      Low
Poznan, Poland            90               100                   Very high   Very low

To apply the DEA model, it is more straightforward if higher scores are better. Therefore, the cost (in EUR bln) and expected lives lost scores
are adjusted by subtracting them from the maximum possible score, and consistent scores on a 0–100 scale are assigned for the qualitative ratings given for risk and civic improvement, as shown in Table 3.8.

Table 3.8 Qualitative rankings

Alternatives              Cost   Expected lives lost   Risk   Civic improvement
Verona, Italy             60     40                    0      25
Chemnitz, Germany         0      0                     100    100
York, UK                  40     100                   80     80
Eindhoven, Netherlands    40     100                   50     50
Split, Croatia            30     60                    80     100
Marcelle, France          30     60                    30     50
Barcelona, Spain          40     90                    30     80
Porto, Portugal           50     110                   50     50
Plovdiv, Bulgaria         10     10                    0      0
Valetta, Malta            20     20                    100    0
Gratz, Austria            20     70                    50     25
Poznan, Poland            10     40                    0      0

It is easy to identify non-dominated solutions by inspection. In this instance, Verona, Italy, has the lowest estimated cost, so it is non-dominated. Similarly, Porto, Portugal, has the least expected lives lost. There is a tie for risk of catastrophe: Chemnitz, Germany, and Valetta, Malta, have the best ratings, with a tradeoff in that Valetta, Malta, has better cost and lives lost estimates, while Chemnitz, Germany, has a better civic improvement rating, and both are non-dominated. There is also a tie for best civic improvement (Chemnitz, Germany, and Split, Croatia), and a tradeoff in that Split, Croatia, has better cost and lives lost estimates, while Chemnitz, Germany, has a better risk of catastrophe rating, and again both are non-dominated. There is one other non-dominated solution (York, UK), which can be compared to each of the other eleven alternatives and shown to be better on at least one criterion.

As we apply the DEA model for Risk Analysis of this data, the model seeks to maximize the efficiency ratio of output attainments to inputs, considering the relative performance of each alternative. The model creates a variable for each output (output weights designated by $u_i$) and each input (input weights designated by $v_j$). Each alternative $k$ has performance coefficients for each output ($y_{ik}$) and input ($x_{jk}$). Then, the classic (CCR) DEA model is:

$$\max \ \text{Efficiency}_k = \frac{\sum_{i=1}^{2} u_i\, y_{ik}}{\sum_{j=1}^{2} v_j\, x_{jk}}$$

such that for each $l$ from 1 to 12:

$$\frac{\sum_{i=1}^{2} u_i\, y_{il}}{\sum_{j=1}^{2} v_j\, x_{jl}} \le 1, \qquad u_i, v_j \ge 0$$

The second model (BCC DEA) also includes a scale parameter $\gamma$, which is not restricted in sign:

$$\max \ \text{Efficiency}_k = \frac{\sum_{i=1}^{2} u_i\, y_{ik} + \gamma}{\sum_{j=1}^{2} v_j\, x_{jk}}$$

such that for each $l$ from 1 to 12:

$$\frac{\sum_{i=1}^{2} u_i\, y_{il} + \gamma}{\sum_{j=1}^{2} v_j\, x_{jl}} \le 1, \qquad u_i, v_j \ge 0$$

The third DEA model (Super-CCR) allows for super-efficiency: the alternative under evaluation is excluded from the constraints, so its own efficiency ratio is not restricted to 1:

$$\max \ \text{Efficiency}_k = \frac{\sum_{i=1}^{2} u_i\, y_{ik}}{\sum_{j=1}^{2} v_j\, x_{jk}}$$

such that for each $l$ from 1 to 12 with $l \ne k$:

$$\frac{\sum_{i=1}^{2} u_i\, y_{il}}{\sum_{j=1}^{2} v_j\, x_{jl}} \le 1, \qquad u_i, v_j \ge 0$$

The results of running the DEA models on the site selection model are given in Table 3.9. As Table 3.9 demonstrates, these approaches provide rankings. In the case of CCR DEA, the ranking includes some ties (for 1st and 11th place). The non-dominated Verona, Italy, alternative was ranked 10th, behind the dominated solutions of Barcelona, Spain; Eindhoven, Netherlands; Marcelle, France; and Gratz, Austria. Verona, Italy, dominates Plovdiv, Bulgaria, and Poznan, Poland, but does not dominate any other alternative. The ranking in 10th place is probably due to a smaller scale for the cost criterion, where Verona, Italy, has the best score. BCC DEA has all non-dominated solutions tied for first. The rankings for 8th–12th reflect more of an average performance on all criteria (affected by scales). The rankings provided by BCC DEA are affected by criteria scales.
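The inspection-based search for non-dominated alternatives described earlier can also be checked mechanically. The sketch below is an illustration only: the scores are transcribed from Table 3.8 as read here (higher is better on every criterion), and the helper name `dominates` is hypothetical. An alternative is treated as dominated when some other alternative is at least as good on all four criteria and strictly better on at least one.

```python
# Scores from Table 3.8: (cost, expected lives lost, risk, civic improvement).
# All four are already oriented so that a higher score is better.
scores = {
    "Verona, Italy": (60, 40, 0, 25),
    "Chemnitz, Germany": (0, 0, 100, 100),
    "York, UK": (40, 100, 80, 80),
    "Eindhoven, Netherlands": (40, 100, 50, 50),
    "Split, Croatia": (30, 60, 80, 100),
    "Marcelle, France": (30, 60, 30, 50),
    "Barcelona, Spain": (40, 90, 30, 80),
    "Porto, Portugal": (50, 110, 50, 50),
    "Plovdiv, Bulgaria": (10, 10, 0, 0),
    "Valetta, Malta": (20, 20, 100, 0),
    "Gratz, Austria": (20, 70, 50, 25),
    "Poznan, Poland": (10, 40, 0, 0),
}


def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


non_dominated = sorted(
    name
    for name, own in scores.items()
    if not any(dominates(other, own)
               for other_name, other in scores.items() if other_name != name)
)

dominated_by_verona = sorted(
    name for name, own in scores.items()
    if dominates(scores["Verona, Italy"], own)
)

print("Non-dominated:", non_dominated)
print("Dominated by Verona:", dominated_by_verona)
```

With these scores, the check returns six non-dominated alternatives (Chemnitz, Porto, Split, Valetta, Verona, and York), and Verona dominates only Plovdiv and Poznan, which agrees with the discussion in the text.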

Table 3.9 DEA results

                          CCR DEA          BCC DEA          Super-CCR
Alternative               Score    Rank    Score    Rank    Score    Rank
Verona, Italy             0.438    10      1        1       0.438    10
Chemnitz, Germany         0.750    6       1        1       0.750    6
York, UK                  1        1       1        1       1.310    1
Eindhoven, Netherlands    0.625    7       0.833    8       0.625    7
Split, Croatia            1        1       1        1       1.071    2
Marcelle, France          0.5      8       0.701    9       0.5      8
Barcelona, Spain          0.975    3       1        1       0.975    3
Porto, Portugal           0.833    5       1        1       0.833    5
Plovdiv, Bulgaria         0        11      0.450    12      0        11
Valetta, Malta            0.938    4       1        1       0.938    4
Gratz, Austria            0.468    9       0.625    10      0.468    9
Poznan, Poland            0        11      0.486    11      0        11

Super-CCR provides a nearly unique ranking (tie for 11th place).

Evidential Reasoning Approach

The evidential reasoning approach (ERA) is different from the methods based on probability distributions and statistical models. It is a tool for information aggregation in MCDM problems under uncertainty. It can provide a framework to accommodate different forms of information, handle different kinds of uncertainty in the information aggregation process, and generate an overview of the risks involved with an enterprise operating in complex environments according to different individual risk (i.e., crisis) factors.

Modeling Evidential Reasoning Approach

In an MCDM problem, suppose an alternative $A$ is described by $L$ criteria, with the $i$th criterion represented by $e_i$ ($i \in \{1, 2, \ldots, L\}$), each of which can be assessed using a set of mutually exclusive and collectively exhaustive grades $H = \{H_1, H_2, H_3, \ldots, H_N\}$. A weight $\omega_i$ is assigned to $e_i$, and the assessment of $e_i$ can be described by a belief distribution:

$$S(e_i) = \{(H_n, \beta_{n,i}),\ n = 1, 2, \ldots, N\}$$

in which $\beta_{n,i}$ represents the degree of belief to which grade $H_n$ is used to assess $e_i$, with $\beta_{n,i} \ge 0$ and $\sum_{n=1}^{N} \beta_{n,i} \le 1$. $S(e_i)$ is complete if $\sum_{n=1}^{N} \beta_{n,i} = 1$, and it is incomplete if $\sum_{n=1}^{N} \beta_{n,i} < 1$. The degrees of belief can be transformed into basic probability masses using the following equations:

$$m_{n,i} = \omega_i \beta_{n,i}$$

$$m_{H,i} = 1 - \sum_{n=1}^{N} m_{n,i} = 1 - \omega_i \sum_{n=1}^{N} \beta_{n,i}$$

$$\bar{m}_{H,i} = 1 - \omega_i$$

$$\tilde{m}_{H,i} = \omega_i \left(1 - \sum_{n=1}^{N} \beta_{n,i}\right)$$

with $m_{H,i} = \bar{m}_{H,i} + \tilde{m}_{H,i}$ and $\sum_{i=1}^{L} \omega_i = 1$. In the above four equations, $n \in \{1, 2, \ldots, N\}$ and $i \in \{1, 2, \ldots, L\}$; $m_{n,i}$ represents the basic probability mass assigned to $H_n$; $m_{H,i}$ represents the basic probability mass assigned to the whole set $H$, and it is divided into two parts, $\bar{m}_{H,i}$ and $\tilde{m}_{H,i}$, where $\bar{m}_{H,i}$ is caused by the relative importance of $e_i$ and $\tilde{m}_{H,i}$ by the incompleteness of $S(e_i)$.

The basic probability masses of the $L$ criteria are then aggregated as follows:

$$m_n = k \left[ \prod_{i=1}^{L} \left( m_{n,i} + \bar{m}_{H,i} + \tilde{m}_{H,i} \right) - \prod_{i=1}^{L} \left( \bar{m}_{H,i} + \tilde{m}_{H,i} \right) \right]$$

$$\tilde{m}_H = k \left[ \prod_{i=1}^{L} \left( \bar{m}_{H,i} + \tilde{m}_{H,i} \right) - \prod_{i=1}^{L} \bar{m}_{H,i} \right]$$

$$\bar{m}_H = k \prod_{i=1}^{L} \bar{m}_{H,i}$$

$$k = \left[ \sum_{n=1}^{N} \prod_{i=1}^{L} \left( m_{n,i} + \bar{m}_{H,i} + \tilde{m}_{H,i} \right) - (N - 1) \prod_{i=1}^{L} \left( \bar{m}_{H,i} + \tilde{m}_{H,i} \right) \right]^{-1}$$

$$\beta_n = \frac{m_n}{1 - \bar{m}_H}$$

$$\beta_H = \frac{\tilde{m}_H}{1 - \bar{m}_H}$$

In the above equations, $\beta_n$ and $\beta_H$ are the degrees to which the alternative $A$ can be described by $H_n$ and $H$, respectively, and $\beta_H$ is the degree of ignorance in the aggregated result. Therefore, the overall assessment of $A$ is represented by

$$S(A) = \{(H_n, \beta_n),\ n = 1, 2, \ldots, N\}$$

In addition, it has been proved that

$$\sum_{n=1}^{N} \beta_n + \beta_H = 1$$


For the convenience of comparison among different alternatives, a utility $U(A)$ is generated for $A$ from the above equation. When $\beta_H = 0$, $U(A)$ is calculated using the equation below, with $U(H_n)$ being the utility of grade $H_n$:

$$U(A) = \sum_{n=1}^{N} \beta_n U(H_n)$$

If $\beta_H \ne 0$, $U(A)$ is determined by an interval, with the upper and lower bounds calculated as follows:

$$U(A)_{\max} = \sum_{n=1}^{N-1} \beta_n U(H_n) + (\beta_N + \beta_H)\, U(H_N)$$

$$U(A)_{\min} = (\beta_1 + \beta_H)\, U(H_1) + \sum_{n=2}^{N} \beta_n U(H_n)$$

It is assumed that $U(H_{n-1}) < U(H_n)$ for $n = 2, 3, \ldots, N$.

The Evidential Reasoning Approach to Risk Analysis

To use the ERA in the context of Risk Analysis, each risk is assessed individually and considered as a criterion, while the risk profile of a department in an enterprise, or of the enterprise itself, is generated by aggregating the information of the relevant criteria. In addition, a set of grades needs to be defined to assess individual risk according to the product of two elements: the likelihood of the occurrence of an incident regarding the risk, and the corresponding consequence. In current practice, both elements are usually measured by a score scaled from 1 to 5, so that risk is measured by a score from 1 (when both likelihood and consequence are scored as 1) to 25 (when both likelihood and consequence are scored as 5). Based on the risk score, different grades, such as low/medium/high or green/amber/red, are proposed to describe the risk. However, in this case, the difference among individual risks with the same grade but with different scores cannot be reflected, leading to distortion and even loss of information in risk aggregation. To avoid such information distortion or loss (i.e., a crisis), the risk scores themselves are used as grades to describe individual risks. Since both incident likelihood and incident consequence can take values from 1 to 5, the risk score can be equal to 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 20, and 25. In addition, if at the time of assessment there is no risk regarding a factor (or a department within the enterprise), the risk score is 0. Therefore, the whole set of grades used to describe individual risks can be represented as:

$$H = \{H_0, H_1, H_2, H_3, H_4, H_5, H_6, H_8, H_9, H_{10}, H_{12}, H_{15}, H_{16}, H_{20}, H_{25}\}$$
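The grade set above follows from enumerating every product of a likelihood score and a consequence score. As a quick check, the following Python sketch (variable names are illustrative assumptions) rebuilds the list of attainable risk scores and prepends 0 for the "no risk" case.

```python
# Likelihood and consequence are both scored on a 1-5 scale.
likelihoods = range(1, 6)
consequences = range(1, 6)

# Every attainable risk score is a product of the two scores.
risk_scores = sorted({l * c for l in likelihoods for c in consequences})

# Grade indices: 0 stands for "no risk at the time of assessment".
grades = [0] + risk_scores

print("Attainable risk scores:", risk_scores)
print("Number of grades:", len(grades))
```

Only 14 of the 25 integers between 1 and 25 can occur as a product of two scores from 1 to 5 (7, 11, 13, and so on cannot), so the grade set has 15 elements once the zero grade is included.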


Case Study

Let us apply the ERA to a case of the National Health Service (NHS) in the UK, which records patient safety incidents caused by manual handling of, or exposure to, hazardous substances. It is observable that the overall trend is rising. In addition to the incidents related to the safety of patients, there are other risks involved in the NHS operation (such as risk in finance and risk in staffing), which are reflected in the risk registers of certain NHS organizations as parts of this system. It is evident that certain risks trigger crises more often; these are prevalent in NHS daily operations. Furthermore, as indicated by NHS staff, an important cause of incidents is an inefficient risk reduction strategy, which is potentially caused by underestimation or overestimation of the risk levels within the individual organizations. Therefore, Risk Analysis is very important for the organizations within the NHS.

This case study presents Risk Analysis and assessment in an NHS organization in the North of England using the ERA. Typically, this organization detects only around 60 risk factors per month in its daily operations, and it measures each risk by multiplying the likelihood of its occurrence by its consequence. Currently, the organization produces regular risk reports for senior managers to develop risk control strategies.

Before applying the ERA to risk assessment, let us design a framework to organize individual risk factors in the organization. Currently, the risk factors reflect the business units and their structure. Additionally, the organization includes dedicated committees, each of which focuses on particular risk factor sets. For example, Fig. 3.7 presents the risk assessment framework according to the risk factors identified in June 2010 within the organization. In Fig. 3.7, ORG, COM, BU, and RF stand for “organization,” “committee,” “business unit,” and “risk factor,” respectively. We can see that the organization has nine business units (BU-A to BU-I), and different sets of risk factors are identified in these business units (e.g., there is no risk factor in BU-A, and there are eight risk factors in BU-B, i.e., RF-B.1 to RF-B.8).

Fig. 3.7 Risk assessment framework


Table 3.10 Key risk factors of the BU-F unit

Risk factor   Likelihood   Consequence   Risk level
RF-F.1        2            3             6
RF-F.2        3            4             12
RF-F.3        3            2             6
RF-F.4        3            2             6
RF-F.5        4            3             12

In addition, there are four committees in this organization (COM-1 to COM-4), each of which focuses on a certain set of risk factors (e.g., the concerns of COM-4 include RF-I.1 and some other risk factors in BU-I, except RF-I.10). Several committees can address a risk factor simultaneously. In total, there are 59 risk factors identified in this organization as of June 2010. Currently, the information available regarding the risk factors is the score of the likelihood of occurrence and the score of the consequence corresponding to each individual factor.

Let us apply the ERA to assess risks at different organizational levels based on the information available. As an example, we consider BU-F, where five risk factors (RF-F.1 to RF-F.5) are identified, and the scores corresponding to these are summarized in Table 3.10. Therefore, the belief distributions representing the risk levels of the five risk factors are:

$S(\text{RF-F.1}) = \{(H_0, 0), (H_1, 0), (H_2, 0), (H_3, 0), (H_4, 0), (H_5, 0), (H_6, 1), (H_8, 0), (H_9, 0), (H_{10}, 0), (H_{12}, 0), (H_{15}, 0), (H_{16}, 0), (H_{20}, 0), (H_{25}, 0)\}$

$S(\text{RF-F.2}) = \{(H_0, 0), (H_1, 0), (H_2, 0), (H_3, 0), (H_4, 0), (H_5, 0), (H_6, 0), (H_8, 0), (H_9, 0), (H_{10}, 0), (H_{12}, 1), (H_{15}, 0), (H_{16}, 0), (H_{20}, 0), (H_{25}, 0)\}$

$S(\text{RF-F.3}) = \{(H_0, 0), (H_1, 0), (H_2, 0), (H_3, 0), (H_4, 0), (H_5, 0), (H_6, 1), (H_8, 0), (H_9, 0), (H_{10}, 0), (H_{12}, 0), (H_{15}, 0), (H_{16}, 0), (H_{20}, 0), (H_{25}, 0)\}$

$S(\text{RF-F.4}) = \{(H_0, 0), (H_1, 0), (H_2, 0), (H_3, 0), (H_4, 0), (H_5, 0), (H_6, 1), (H_8, 0), (H_9, 0), (H_{10}, 0), (H_{12}, 0), (H_{15}, 0), (H_{16}, 0), (H_{20}, 0), (H_{25}, 0)\}$

$S(\text{RF-F.5}) = \{(H_0, 0), (H_1, 0), (H_2, 0), (H_3, 0), (H_4, 0), (H_5, 0), (H_6, 0), (H_8, 0), (H_9, 0), (H_{10}, 0), (H_{12}, 1), (H_{15}, 0), (H_{16}, 0), (H_{20}, 0), (H_{25}, 0)\}$

In addition, among the five risk factors in BU-F, the risk scores of RF-F.2 and RF-F.5 are higher than those of the other three factors. Factors RF-F.2 and RF-F.5 require more attention, as they are more important. Thus, the weights assigned to the risk factors are proportional to their risk scores, and the weights for BU-F are as follows:

$$\omega_{\text{RF-F.1}} = \omega_{\text{RF-F.3}} = \omega_{\text{RF-F.4}} = 0.143, \qquad \omega_{\text{RF-F.2}} = \omega_{\text{RF-F.5}} = 0.286$$

Consequently, the risk profile of BU-F is:

$S(\text{BU-F}) = \{(H_0, 0), (H_1, 0), (H_2, 0), (H_3, 0), (H_4, 0), (H_5, 0), (H_6, 0.3798), (H_8, 0), (H_9, 0), (H_{10}, 0), (H_{12}, 0.6202), (H_{15}, 0), (H_{16}, 0), (H_{20}, 0), (H_{25}, 0)\}$

Further, for each grade, the score $U(H_i)$ is:

$$U(H_i) = i, \quad \text{for } i = 1, 2, \ldots, 25$$

Finally, the score of BU-F is $6 \times 0.3798 + 12 \times 0.6202 \approx 9.7$. According to NHS risk analysis policy, the relation between risk score and risk level is shown in Table 3.11. As such, the risk for BU-F is “high.” The risk level of BU-F, together with its risk profile, provides a more detailed view of its risk. Risk profiles for the other business units are generated similarly.

Table 3.11 Risk scores and risk levels

Risk level        Risk score range
Low               1–3
Moderate          4–7
High              8–14
Extremely high    15–25
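The BU-F figures can be reproduced with a short script. The following Python sketch is one possible reading of the aggregation equations given earlier for the ERA, not the implementation used in the case study; the function names `er_aggregate` and `risk_level` are assumptions made for illustration, and the handling of non-integer scores that fall between the ranges of Table 3.11 (for instance 7.5) is also an assumption.

```python
from math import prod

# Grade indices: 0 (no risk) plus every product of two scores in 1..5.
GRADES = [0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 20, 25]


def er_aggregate(beliefs, weights, grades=GRADES):
    """Aggregate belief distributions; returns (beta per grade, beta_H)."""
    # Unassigned mass per criterion, split into its two parts.
    m_bar = [1.0 - w for w in weights]
    m_tilde = [w * (1.0 - sum(b.values())) for b, w in zip(beliefs, weights)]
    m_h = [bar + til for bar, til in zip(m_bar, m_tilde)]

    prod_h = prod(m_h)
    prod_bar = prod(m_bar)
    prod_n = {
        g: prod(w * b.get(g, 0.0) + mh
                for b, w, mh in zip(beliefs, weights, m_h))
        for g in grades
    }
    # Normalization factor k.
    k = 1.0 / (sum(prod_n.values()) - (len(grades) - 1) * prod_h)

    m_n = {g: k * (prod_n[g] - prod_h) for g in grades}
    m_tilde_h = k * (prod_h - prod_bar)
    m_bar_h = k * prod_bar

    beta = {g: m / (1.0 - m_bar_h) for g, m in m_n.items()}
    beta_h = m_tilde_h / (1.0 - m_bar_h)
    return beta, beta_h


def risk_level(score):
    """Map a score to a Table 3.11 level (boundary handling is assumed)."""
    if score < 4:
        return "low"
    if score < 8:
        return "moderate"
    if score < 15:
        return "high"
    return "extremely high"


# Risk levels of the five BU-F factors (Table 3.10).
risk_scores = {"RF-F.1": 6, "RF-F.2": 12, "RF-F.3": 6, "RF-F.4": 6, "RF-F.5": 12}
total = sum(risk_scores.values())
weights = [s / total for s in risk_scores.values()]   # proportional weights
beliefs = [{s: 1.0} for s in risk_scores.values()]    # full belief in one grade

beta, beta_h = er_aggregate(beliefs, weights)
utility = sum(g * b for g, b in beta.items())         # U(H_i) = i

print({g: round(b, 4) for g, b in beta.items() if b > 0})
print(round(utility, 1), risk_level(utility))
```

Under these assumptions the script yields degrees of belief of about 0.3798 for $H_6$ and 0.6202 for $H_{12}$, a score of about 9.7, and the level “high,” in line with the values reported for BU-F.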

3.3 Architecture-Centric Development Method

In the context of software development, a software project is typically implemented using high-speed and lightweight programming methodologies. Let us define the functionality of a system as a measure of how well the system does the work it is intended to do. However, functionality is not all that matters in software development. Properties such as interoperability, modifiability, and portability often matter as much as functionality does. Importantly, these properties depend on the software architecture. However, with these traditional high-speed and lightweight programming methodologies, software architecture is often delayed until the later phases of production. Conversely, the architecture-centric development method (ACDM) determines the quality attributes by consciously selecting the appropriate architecture early in the development process. The architecture is the software development stage where its builders (i.e., client stakeholders and managers) meet, discuss, and agree on the future system properties. This is the underlying philosophy of the ACDM.

The Importance of Software Architecture

Software architecture helps capture the high-level partitioning of the system and expresses the fundamental structural organization of the system elements and the relationships between them. This organization is essential for meeting the functional and quality attribute requirements of the software project during its delivery and operation. Every software system is determined by its architecture, regardless of whether there is a representation of that architecture. However, a spontaneous software architecture is very different from a tailor-made solution. If the architecture design is unconscious, and developers proceed to detailed design and coding without an overarching blueprint, the system architecture emerges by chance. Consequently, the quality attribute properties the system possesses also emerge spontaneously. Systems built without a well-designed and documented architecture often exhibit unpredictable properties. The system should be modifiable, and should perform and interoperate with other systems as required.

Software architects define the external properties of the system elements, their topological arrangement, and their interactions in order to meet the functional and quality attribute requirements. Detailed designers focus on the internal details of these elements. As such, the architecture constrains designers and ensures that the prescribed properties are advanced in the design and ultimately implemented.

Architecture influences organizational structure, as development teams are typically assigned to build the “parts” of the system. During software system development, the work breakdown follows the architecture, together with its strengths and weaknesses. For instance, tightly coupled architectural elements suit implementation teams with high-frequency communication patterns, whereas loosely coupled architectures better suit teams with less intensive communication. Managers use architectures to structure teams and to plan, monitor, and adjust their development efforts.

The ACDM Method

Since the ITC focus is closely related to the Risk Analysis aspect of the ACDM, let us discuss primarily ACDM Stage 4, which reviews the architecture and detects the architecture-related risks (see Table 3.12).

Table 3.12 ACDM architectural review

Preconditions   Notional architecture created: key views provided in slide presentation. Slide presentations created summarizing business goals and architectural drivers
Who             Development team and stakeholders (optional per discussion below)
Activities      Review business goals and architectural drivers; analyze the architecture
Outputs         Problematic architectural decisions and tradeoffs

Role                                                    Stage 4 recommended responsibilities
Chief architect                                         Present architecture (step 3 below), respond to architectural queries (step 4 below)
Managing engineer                                       Facilitate review meeting, present introductory material (step 1 below)
Requirements engineer                                   Present the architectural drivers overview (step 2 below)
Chief scientist, support engineer, software engineer    Support review by asking probing questions during analysis (step 4) and scribing


This stage includes an architectural review, which involves the development team and system stakeholders. At this stage, the development team should plan for the architecture review meeting with the stakeholder community. The purpose of this review is to expose problematic architectural decisions and to explicitly identify tradeoffs between alternative approaches and decisions in order to minimize risks. Architectural reviews may be internal or external, and in most cases, developers plan for both. An internal review refines the notional architecture; afterward, a second review with the stakeholders follows, which precedes production. It is strongly recommended that the key stakeholders attend at least one of these reviews. Typically, the ACDM managing engineer facilitates these meetings, and the development team members attend. The review meeting includes the following four steps:

1. Introductions and expectations
2. Review of business goals and architectural drivers
3. Presentation of notional architecture
4. Architectural analysis.

The managing engineer should set the expectations by describing the meeting intent. The key points of the review are to:

• Ensure that all attendees understand the architectural drivers clearly
• Introduce the notional architecture to the attendees
• Identify problematic architectural decisions.

However, the review should not:

• Fix problems and discuss the solutions in detail
• Criticize any members (developers or stakeholders)
• Discuss process or organizational problems.

It is important that all attendees share an understanding of the review purpose, and particularly of its ultimate aim, which is to find problematic architectural issues rather than to fix them. The facilitator is responsible for stopping nonproductive discussions and moving the review forward. When the discussion gets off-track, the facilitator should record the essential issue, offer to take up this issue at another time, and move on with the agenda. Immediately after setting the initial expectations, the facilitator should formally introduce the stakeholders and developers to each other, reporting:

• Their names
• Roles (as a stakeholder or developer)
• Problematic architectural issues.


Review of Business Goals and Architectural Drivers

The development team should give a presentation of the business goals and architectural drivers, thereby testing and demonstrating their understanding of stakeholder requirements. Further, the ACDM requirements engineer presents the business goals distilled earlier (at Stage 1), thus ensuring that the system motivation is still valid and clearly understood by the developers. After reviewing the business goals, the development team reviews the high-level requirements, constraints, and the quality attributes refined at Stage 2. They also present, analyze, and confirm the notional architecture with the stakeholders.

3.4 Crisis-Resistant Smart City Management

Let us discuss a smart city emergency management system as an example of a crisis-responsive digital product.

Smart City Era as an Emerging Trend

As a way of rethinking human life, the new concept of a smart city emerged to improve the world and raise agility. With the advent of accessible and affordable technologies, particularly information and communication technologies (ICT), this concept received its name of “smart city.” This term includes the “city” concept, being an environment for daily human life, including education and the workplace. The attribute “smart” refers to a technology-centered approach; it addresses the challenging city lifestyle. The earlier smart city initiatives considered technology as their only driver and often failed because they ignored the other critically important ingredients. The principal drivers of the smart city are:

• Technology: The main factor for implementing smart cities, mostly with the use of ICT. The role of ICT is to consolidate the infrastructure, improve communication, and collect and analyze the data to deliver smart services
• Human factor: This is central because the smart city is driven by human intelligence and creativity, and in this context, the citizens not only directly consume and benefit from the solution, but also participate in its lifecycle through active collaboration and assistance
• Governance: This includes the institutions and policies that regulate and manage the smart city and its initiatives.

A successful smart city initiative should embrace all the above factors.


City in Crisis

Crisis Phenomenon

Generally, a crisis is an abnormal situation with a potentially high risk to a business or a system. Consequently, in a crisis, important decisions must be made immediately [2]. A crisis results from multiple issues, including the nature of the event and its impact on the local and foreign governments and other institutions. Therefore, a crisis is a situation of panic, fear, and danger [3]. However, other researchers point out the positive side of a crisis and define it as a beneficial situation, although its negative aspect is commonly predominant [4]. There is no universal definition of crisis; the common understanding of this phenomenon is that a crisis can be harmful to an organization or result in its destruction.

Crisis Criteria

When responding to a crisis, it is essential to investigate its key features [5]:

• A crisis is not a manufactured event; this situation happens unexpectedly
• A crisis has both positive and negative aspects; generally, it is harmful to the organization; however, certain benefits are also possible
• A crisis has a certain structure/sequence, determined by phases and processes
• Crisis origins can be diverse; these include political, economic, or social issues/disasters
• Crises require immediate attention in terms of decision-making.

The above criteria are highly dependent on an organization’s agility or its preparedness for a crisis, which often changes its balance. Another factor of a crisis is structural complexity, and this is mission-critical for a smart city.

Crisis Types

Crisis types are classified by origin, as it is often impossible to know their consequences in advance. These types are categorized by the following hierarchy (see Fig. 3.8):

Community crisis typically manifests itself as:

• Natural crisis resulting from natural disasters
• Industrial crisis resulting from socio-technical disasters
• Conflictual crisis resulting from external triggers (such as war, embargos, and terrorism)
• Non-conflictual crisis resulting from internal triggers (such as politics and religion)
• Non-industrial crisis (including economic and social crisis situations).


Fig. 3.8 Crisis classification

Non-community crisis: This does not affect the community; examples include transportation accidents of various origins.

Crisis Management Stages

Typically, a crisis management plan includes the following four stages (see Fig. 3.9) [6]:

1. Mitigation: It includes preventive and protective operations, and depends on the organization’s resilience to the crisis
2. Preparedness: An action plan is typically initiated when a crisis is recognized; the crisis affects organizations already beyond the mitigation barrier; this happens due to unexpectedness, neglect, or ignorance
3. Response: The reaction that happens after discussing a solution for the recognized crisis; the most tedious step in crisis management, which requires patience, trust, and agile, focused communication [7]
4. Recovery: It mitigates the tension caused by the crisis; applies commonly accepted and approved solutions; repairs the damage; and improves processes to support the regular operation.

Fig. 3.9 Crisis management stages

Let us outline a crisis-resilient framework for a smart city.

Citizen Centricity

This framework is citizen-focused in the sense that the citizens provide, share, and use the data [8, 9]:

• Participation: The citizens participate in data collection by using mobile and IoT devices in the smart city context (including the city knowledge base and networking communities)
• Collaboration: The smart city addresses the local governmental institutions, organizations, companies, and individual citizens; therefore, it is important to build a bridge of communication between the stakeholders and encourage their collaboration
• Transparency: Sharing the data between stakeholders and organizations, and providing citizens with access to the data, should be transparent in order to establish trust between the city inhabitants.

Smart Governance

The purpose of smart city governance is to achieve a common goal through collaborative communication and participative actions; therefore, ICT is important in providing a reliable and scalable infrastructure.

Crowdsensing

Crowdsensing is a paradigm that harnesses the power of “crowds” to sense the environment and collect data; this paradigm promotes scalability and accelerates deployment.

Architecture

Next, we extend the context of a generic smart city system architecture with a subsystem for crisis management.

Crisis Management for a Smart City

In the event of a crisis, risks to people and infrastructure must be anticipated; the available resources must be managed with maximum efficiency, the data must be shared and transferred with minimum latency, and the software must be reliable and instantly available. In a crisis, software often plays a key role, and in the context of a smart city, system architecture should respond to this critical situation; therefore, the data must


be continuously available and secure. Large volumes of data must be analyzed for crisis prediction and prevention; as such, the software solution should promote instant communication and transparency.

Software Technologies for a Smart City

Cloud-Based Systems

Cloud computing is known for its powerful and diverse services, which are independent of a physical infrastructure. In a crisis, cloud computing promotes system recovery and reliability [10]. A basic cloud system can consist of three layers [11]. The cloud layer provides the platform for a crisis system service. The intelligence layer provides the algorithms and models that implement crisis response strategies based on heterogeneous data. The interface layer captures the data from sensing infrastructures, including versatile devices and social platforms.

Deep Learning-Based Crisis Management System

Deep learning handles large-scale data; it processes and analyzes the data coming from the city environment. Deep learning can predict and recognize critical situations. For a smart environment, the following three-layer architecture (see Fig. 3.10) can be considered. First, the input layer manages the collected data to train and test the deep learning model using APIs provided by the data sources, and forwards the data to the processing layer. Secondly, the data processing layer aggregates the data for future prediction. Thirdly, the prediction layer uses the previous layers.

The crisis management system follows a pipeline topology [12]. The data is collected from multiple, heterogeneous sources, which listen and gather the notifications containing sensitive keywords. Furthermore, the data is stored using the

Fig. 3.10 Layer-based smart city software architecture


fault-tolerant big data techniques. Next, the data passes through a big data-based parser and preprocessor (used for data preparation) and is then analyzed to make it meaningful and add value. Finally, the results are aggregated, consolidated, visualized, and presented in multiple views, including storytelling techniques [13].

Mobile Ad Hoc Networks

In a crisis, the need for backup communication media increases significantly. Therefore, ad hoc wireless networks become vital, since regular networks and communication channels are often down or destroyed. This kind of network embraces (groups of) wireless mobile devices, i.e., nodes, which collaborate by forwarding the packets to each other to reach the nodes initially outside the range. Such autonomous networks are self-configurable and decentralized, which makes them independent from any other parts of the smart city infrastructure [14].

Proposed Approach

The proposed system implementation is based on the mobile crowdsensing (MCS) architecture for smart cities. This crisis-responsive solution follows the “trust” crisis resilience-based model [15], where trust is the key factor of success in crisis management. This trust factor incorporates the following aspects [16].

• Decentralization: Decentralized management can mitigate the negative isolation effects due to unexpected crisis factors. For instance, natural disasters often require an immediate local reaction, prior to the authority’s approval, in order to reduce harmful and threatening consequences
• Communication: This is a key aspect of a successful crisis-resilient framework for a smart environment with multiple and interdependent stakeholders. In a crisis, communication must be effective, instant, transparent, and open
• Collaboration: Collaboration is essential for quick and effective action based on trusting stakeholders’ expertise
• Trust: The trust factor permits a better understanding of the situation, fast communication, and efficient resource management
• Architecture: The architecture that promotes crisis resilience exploits the previously mentioned paradigms, which are citizen centricity, smart governance, and mobile crowdsensing (see Fig. 3.11).

Architecture Layers

Sensing layer: The sensing layer is based on the MCS paradigm. This sensing layer is composed of heterogeneous mobile devices of citizens located in the city environment and connected to the back-end application. Each of these mobile devices features a front-end application, which manages sensing and networking.

Computing layer: It supports the technological infrastructure, which consolidates the network and provides the following services:

• Data collection acquires both the data shared during the sensing campaign and the data from the sensing infrastructure


Fig. 3.11 System data flows

• Data storage cleans, verifies, and stores the data, implemented as a service
• Data analysis analyzes the stored data for business planning and decision-making
• Service management: this Software as a Service (SaaS) module is used for sensing, data acquisition, management, and social networking
• API communication includes the services for communication interfaces
• Monitoring: this back-end application module monitors the system; it includes three interfaces for sensing campaign management, system management, and social networking.

Architecture Plans

These are the blueprints, which include the key quality attributes to consider for the design and implementation of a reliable MCS system:

• Security and Privacy: Insufficient security or privacy can endanger or destroy the system; therefore, these are its critically important quality attributes [17, 18]
• Knowledge: The knowledgebase adds system value; this is based on the data collected by the analytics tools.
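
The computing-layer services listed above (data collection, storage with cleaning and verification, and analysis) can be read as a small processing pipeline. The following is a minimal Python sketch under stated assumptions: the Report fields, the keyword list, and all function names are hypothetical illustrations and are not taken from the system described in this section.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical keyword list; a real deployment would curate and localize it.
CRISIS_KEYWORDS = {"fire", "flood", "explosion", "collapse"}


@dataclass(frozen=True)
class Report:
    """One crowdsensed report (all fields are illustrative assumptions)."""
    sender: str
    district: str
    text: str


def keywords_in(text):
    """Return the crisis keywords found in a free-text message."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return words & CRISIS_KEYWORDS


def collect(raw_reports):
    """Data collection: keep only reports that mention a crisis keyword."""
    return [r for r in raw_reports if keywords_in(r.text)]


def clean_and_verify(reports):
    """Data storage step: drop incomplete records and exact duplicates."""
    seen, stored = set(), []
    for r in reports:
        if not r.sender or not r.district:
            continue  # an incomplete record fails verification
        if r in seen:
            continue  # duplicate submission
        seen.add(r)
        stored.append(r)
    return stored


def analyze(stored):
    """Data analysis: count keyword mentions per (district, keyword)."""
    counts = Counter()
    for r in stored:
        for kw in keywords_in(r.text):
            counts[(r.district, kw)] += 1
    return counts


def run_pipeline(raw_reports):
    """Chain the three services in the order the text lists them."""
    return analyze(clean_and_verify(collect(raw_reports)))
```

Calling run_pipeline on a list of Report objects returns a counter keyed by district and keyword, which a monitoring interface could then visualize. Keeping each service as a separate function mirrors the service-oriented layering of the text; fault tolerance, big data storage, and learning-based analysis are deliberately left out of this sketch.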


Fig. 3.12 System data flows

Crisis-Resilient System Outline

The system is based on the MCS paradigm and smart governance principles. It includes data sharing and analysis for decision-making, and feedback and communication mechanisms. The system outline is presented in Fig. 3.12. The system includes the following components:

• MCS user: A citizen with a mobile device, which is not only a data sensor but also a data transmitter. In the event of a crisis situation, the user initiates a warning, i.e., a notification containing the description of this situation and its contextual data (such as location and time), and an optional video file, depending on network availability
• Crisis notification: This is the data sent by the MCS user as soon as a crisis situation is recognized. Each notification triggers the crisis management system to recognize, analyze, and report the situation as a crisis
• Data cloud manipulation: This service collects user notifications; it validates, classifies, stores, and transfers them to the data analytics component
• Big data and analytics: Big data analysis is essential and adequate for crisis situations, when the data has to be handled carefully. Big data tools store, preprocess, classify, analyze, and report the datasets required
• Social media: These outlets are essential due to their power to gather and inform people; they act as a collaborative platform
• Networking: These are broadcasting services for monitoring crises and informing both individuals and organizations
• Government and institutions: Government and crisis-related institutions play the role of decision makers; they also coordinate the actions, logistics, and communication


• Mobile ad hoc network: This is for emergencies when the regular network fails. This network is essential to maintain communication and share information at a low cost.

Conclusion and Future Research

The idea of the approach suggested was applying a human-centered architecture, based on mobile crowdsensing, to crisis management in a smart city. This software solution must be trustworthy; it should collect and analyze the data, respond reliably and resiliently to crisis situations, and allow instant decision-making. Future research could elaborate on the communication between the diverse actors and suggest advanced frameworks, tools, and policies for security and privacy. The efficiency of crisis mitigation depends essentially on a trustworthy framework rather than the technologies used; i.e., the human factor is predominant and mission-critical for this problem domain.
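
The multi-hop forwarding that lets a mobile ad hoc network reach nodes outside direct radio range, as described for the mobile ad hoc network component in this section, can be illustrated with a short Python sketch. It is an assumption-laden toy model: node positions, the radio range, and the function names are hypothetical, and a plain breadth-first search stands in for a dedicated MANET routing protocol.

```python
from collections import deque
from math import dist


def build_links(positions, radio_range):
    """Link every pair of nodes whose distance is within radio range."""
    links = {node: set() for node in positions}
    for a, pa in positions.items():
        for b, pb in positions.items():
            if a != b and dist(pa, pb) <= radio_range:
                links[a].add(b)
    return links


def find_route(links, source, target):
    """Breadth-first search: return a fewest-hop relay path, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(links[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # target is unreachable with the current topology
```

With three devices placed in a line so that only adjacent devices are within range, find_route returns a path through the middle device; a device far from all others yields None. Because the links are rebuilt from current positions, the same code can be rerun as devices move, which loosely reflects the self-configuring nature of such networks.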

3.5 Blockchain as Digitalization Crisis Remedy

Let us discuss blockchain technology as an instance of a crisis-responsive approach to digital product development.

Digitalization

Digitalization is an integration process of information technology with established industrial and economic models, or the creation of new models using information technologies and communication algorithms (such as machine-to-machine, M2M). Digitalization involves process transformation: from robotized production of mass consumption commodities (i.e., goods and services), to new production methods and customizing consumer commodities (for instance, by innovative digital technologies such as 3D printing).

According to McKinsey, digitalization of the Russian economy will enhance the country’s GDP by 4.1–8.9 trillion rubles by 2025, which constitutes 19 to 34% of the total anticipated GDP growth. This evaluation is based not only on process automation, but also on the introduction of smart and innovative technologies and business models. Such technologies include digital platforms, robots, the Internet of things (IoT), data analysis, and machine learning [19].

A collaborative consumption-based economy (i.e., platform model) is an example of the advent of these emerging technologies and of the introduction of a new business model to the market. In this model, clients directly interact with contractors via these new and emerging services; they use mobile platforms and data exchange solutions. Typical examples of this type of project include Uber, Airbnb, and Alibaba [20]. Machine learning, as one of the artificial intelligence methods, has a broad spectrum of applications (such as gesture, voice, handwriting, and image recognition for medical and technical diagnosis).


Fig. 3.13 Key agility challenges in digitalization

Certain technologies (e.g., robots and the IoT) require a longer time frame for testing and implementation. Ericsson estimated that the number of sensors and IoT devices would exceed that of mobile phones by 2018. From 2015 to 2021, the expected compound annual growth rate of the segment is around 23%. Of nearly 28 billion connected devices expected in the world by 2021, around 16 billion will be IoT devices.

Crisis-Related Digitalization Issues

Global digitalization is a time-consuming process. The diagram in Fig. 3.13 presents the key problems, which trigger the digitalization crisis, in order of their complexity.

The main challenge that stands in the way of digitalization is adjusting people’s lifestyles in order to adapt them to these smart technologies. These new lifestyles should simplify daily activities and increase the availability of basic services.

Apart from positive effects, digitalization brings employment rate losses. According to the World Bank research, middle-class competition tends to increase significantly, which, in turn, leads to salary stagnation [20]. Future professions will include tight interaction with smart technologies and robots.

Regulation and implementation of these new technologies require a federal legal framework. Social institutions should also adapt to embrace smart and innovative technologies.

The crisis of digitalization also requires technologies based on physical and virtual networks, which implies the development of big data repositories and cloud services. However, the first step to digitalization is updating and upgrading the obsolete technologies.

Blockchain

Blockchain is a network, where every node provides common data. The data is stored in lists containing information blocks. These blocks are created during a process called mining.


The block creation mechanism is determined by the blockchain consensus protocol (e.g., proof-of-work and proof-of-stake). Blockchain data allocation promotes security and consistency. The technology overview is available in [21]. Probably, the best-known project that uses blockchain is Bitcoin. Blockchain networks can be public or private. It is possible to develop business logic, which is archived as each new block is created; this is referred to as Chaincode or smart contracts. However, to date, the blockchain smart contracts are often suboptimal for archiving certain business logic clauses.

In 2017–2018, the blockchain market attracted many investors. CoinSchedule estimated that in 2018, this technology attracted $21.4 billion from major corporations, venture capital companies, and private investors [22]. As IDC reported, the same year, global expenditure on blockchain reached $1.5 billion. IDC forecasted that the investments in blockchain technology would rise from $400 million as of 2018 to $3.5 billion in 2022.

Blockchain is a possible remedy to the digitalization crisis as it promotes system credibility. Blockchain presents an environment where data cannot be compromised, as all the network users guarantee that.

Blockchain Applications in Digitalization

Blockchain is implementable in different areas; some of its typical applications are summarized below.

Cryptocurrency and Transactions

Currently, cryptocurrency transaction recording is among the most useful applications of blockchain technology. The main advantages of cryptocurrencies are a decentralized structure and encryption-based security. Cryptocurrencies often reduce transaction expenses of the financial institutions (e.g., Visa and MasterCard) by nearly 3%; they cut down commission fees for international transactions by the same 3% and decrease currency conversion overheads by at least 1%. However, cryptocurrencies require complex state regulation. In a few countries, transactions with cryptocurrencies are officially permitted. Cryptocurrency legislation in Russia is currently being developed [21]. However, high criminal risks resulting from payment anonymity hamper the cryptocurrency progress. Nevertheless, cryptocurrency can solve the digitalization crisis and make banking services available worldwide.

Currently, there are a large number of projects in logistics and accounting, based on supply chain management. The American Blockchain in Transport Alliance (BiTA) develops an open standard for supply chain management. In the USA, Walmart and IBM launched a pilot project on supply chain management; its nodes are managed by Walmart and located in the IBM cloud.

Financials

The financial sector is interested in the technology as it accelerates financial placement and raises the level of trust between stakeholders. Large banks, such as UBS, opened blockchain research laboratories that investigate system efficiency and cost reduction.


Smart Contracts for Credibility

Second-generation blockchain systems (such as Ethereum, EOS, Waves, Tron, Hyperledger Fabric, Corda, and Exonum) implement business logic in the ecosystems. Due to distributed data (this approach promotes consistency and security), obligations are guaranteed, and blockchain provides and promotes digital trust. However, the present-day smart contracts typically require commissions, and there is little evidence of their operational and scalable applications [23].

Conclusion

Experts are confident that blockchain can transform many spheres of human activities, yet there is little evidence of its large-scale implementation. Blockchain projects often attempt to compete with already established alternative technologies and solutions in developed markets. Therefore, many blockchain projects do not meet stakeholder expectations. The Bank for International Settlements, along with a few other organizations, criticized blockchain for its high costs and mandatory proof-of-work consensus [24]. Blockchain technology is still in its development phase; it promotes digitalization and therefore potentially assists in crisis management.
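
To make the description of blocks, mining, and tamper resistance in this section more concrete, here is a minimal Python sketch of a hash-linked block list with a toy proof-of-work. It is an illustration under stated assumptions, not the design of any real blockchain platform: the block fields, the difficulty rule (a hash that starts with a given number of zeros), and all function names are hypothetical, and networking and consensus between nodes are omitted.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's parent


def block_hash(block):
    """SHA-256 over the block's canonical JSON form."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine_block(data, prev_hash, difficulty=3):
    """Toy proof-of-work: search for a nonce whose hash starts with zeros."""
    nonce = 0
    while True:
        block = {"data": data, "prev_hash": prev_hash, "nonce": nonce}
        digest = block_hash(block)
        if digest.startswith("0" * difficulty):
            return block, digest
        nonce += 1


def build_chain(records, difficulty=3):
    """Mine one block per record, linking each block to its predecessor."""
    chain, prev_hash = [], GENESIS_HASH
    for record in records:
        block, prev_hash = mine_block(record, prev_hash, difficulty)
        chain.append(block)
    return chain


def is_valid(chain, difficulty=3):
    """Check the proof-of-work and the hash links of the whole chain."""
    prev_hash = GENESIS_HASH
    for block in chain:
        digest = block_hash(block)
        if block["prev_hash"] != prev_hash:
            return False  # link to the previous block is broken
        if not digest.startswith("0" * difficulty):
            return False  # block was not mined at this difficulty
        prev_hash = digest
    return True
```

Building a chain from a few records and then editing the data of an early block makes is_valid return False, because the edited block no longer matches the hash that its successor stores. This is the sense in which stored data is hard to compromise unnoticed; in a real network, the additional guarantee comes from many nodes holding and checking their own copies.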

3.6 Conclusion: Crisis-Aware Optimization

In this chapter, we have addressed multi-criteria optimization issues of digital product development. Clearly, balancing multiple criteria requires tradeoffs, i.e., compromises. This is the way that software engineering as a discipline teaches developers to follow. To efficiently use such multi-criteria methods as a part of the ITC framework and crisis manager portfolio, we cross-examined an extensive set of such approaches. This set included tradeoff-based methods, such as AHP, CBA, DEA, ERA, and ACDM. For each of these approaches, we analyzed its applicability to crisis management in IT product development.

In order to increase the practical value of our survey, we added case studies to this comparative discussion. These case studies addressed multiple contexts; however, the focus was on crisis agility of the decisions made by the developers, while identifying, prioritizing, and balancing the tradeoffs. Depending on the situation (i.e., crisis phase, data available, client’s priorities and expectations, and other project constraints), developers should choose the most suitable method (or a wise combination of smart techniques and innovative practices) to provide an acceptable quality product in a timely fashion.

The case studies presented in this chapter provided patterns and practices for the tradeoff-based optimization of digital products as an extension of lifecycle-based approaches given previously. The final chapter will address human-related factors in more detail, to complete the ITC framework. Our previous case studies only gave quick hints on these human factors and their application to crisis-responsive management of digital production; our other books provide more detail regarding these issues and their solutions [19, 20].


References

1. Saaty, T. L. (2014). Mathematical principles of decision making. RWS Publications.
2. Gherardi, S., Turner, B. A., Pidgeon, N. E., et al. (1999). Man-made disasters 20 years later: Critical commentary. Health, Risk & Society, 1(2), 233–239.
3. Darling, J. R. (1994). Crisis management in international business: Keys to effective decision making. Leadership & Organization Development Journal, 15(8), 3–8.
4. Stranks, J. (1994). Human factors and safety. Environmental Policy and Practice, 4, 147–147.
5. Shaluf, I. M., Ahmadun, F. R., & Mat-Said, A. (2003). A review of disaster and crisis. Disaster Prevention and Management: An International Journal, 12(1), 24–32.
6. Farazmand, A. (2017). Crisis and emergency management. Crisis and Emergency Management: Theory and Practice.
7. Zykov, S. (2018). Managing software crisis: A smart way to enterprise agility. Springer Series in Smart Innovation, Systems and Technologies (Vol. 92, xxii+153pp). Switzerland: Springer International Publishing.
8. Lane, N. D., Miluzzo, E., Lu, H., et al. (2010). A survey of mobile phone sensing. IEEE Communications Magazine, 48(9), 140–150.
9. Degbelo, A., Granell, C., Trilles, S., et al. (2016). Opening up smart cities: Citizen-centric challenges and opportunities from GIScience. ISPRS International Journal of Geo-Information, 5(2), 16.
10. Kakderi, C., Komninos, N., & Tsarchopoulos, P. (2016). Smart cities and cloud computing: Lessons from the STORM CLOUDS experiment. Journal of Smart Cities, 2(1), 4–13.
11. Alazawi, Z., Alani, O., Abdljabar, M. B., et al. (2014). A smart disaster management system for future cities. In Proceedings of the 2014 ACM International Workshop on Wireless and Mobile Technologies for Smart Cities (pp. 1–10). ACM.
12. Suma, S., Mehmood, R., Albugami, N., et al. (2017). Enabling next generation logistics and planning for smarter societies. Procedia Computer Science, 109, 1122–1127.
13. Aqib, M., Mehmood, R., Alzahrani, A., et al. (2020). A smart disaster management system for future cities using deep learning, GPUs, and in-memory computing. Smart Infrastructure and Applications (pp. 159–184). Cham: Springer.
14. Jadhav, S. S., Kulkarni, A. V., & Menon, R. (2014). Mobile Ad-Hoc Network (MANET) for disaster management. In Eleventh International Conference on Wireless and Optical Communications Networks (WOCN) (pp. 1–5). IEEE.
15. Zykov, S., & Singh, A. (2020). Agile enterprise engineering: Smart application of human factors (models, methods, practices, case studies). Springer Series in Smart Innovation, Systems and Technologies (Vol. 175, xii+142pp). Switzerland: Springer International Publishing.
16. Kramer, R. M., & Tyler, T. R. (Eds.). (1995). Trust in organizations: Frontiers of theory and research. Sage Publications.
17. Vashi, S., Ram, J., Modi, J., et al. (2017). Internet of Things (IoT): A vision, architectural elements, and security issues. In 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) (pp. 492–496). IEEE.
18. Mollah, M. B., Azad, M. A. K., & Vasilakos, A. (2017). Security and privacy challenges in mobile cloud computing: Survey and way ahead. Journal of Network and Computer Applications, 84, 38–54.
19. https://documents.worldbank.org/curated/en/896971468194972881/pdf/102725-PUB-Replacement-PUBLIC.pdf. Last accessed 25/05/2020.
20. https://irsup.hse.ru/news/225704104.html. Last accessed 25/05/2020.
21. Berberov, A. B. (2017). On the way to the digitalization of Russian economy: Problems and perspective. Management of Economic Systems: Digital Scientific Journal, 7(101). https://www.tadviser.ru/images/c/c2/Digital-Russia-report.pdf.
22. https://www.tadviser.ru/index.php/%D0%A1%D1%82%D0%B0%D1%82%D1%8C%D1%8F:%D0%91%D0%BB%D0%BE%D0%BA%D1%87%D0%B5%D0%B9%D0%BD_(%D0%BC%D0%B8%D1%80%D0%BE%D0%B2%D0%BE%D0%B9_%D1%80%D1%8B%D0%BD%D0%BE%D0%BA). Last accessed 25/05/2020.
23. https://www.iksmedia.ru/news/5564141-Blokchejn-ostaetsya-na-nachalnom.html. Last accessed 25/05/2020.
24. Information regarding German Gref about unwillingness of economy for digitalization and ‘sick’ banks is available at https://www.vestifinance.ru/videos/40829. Last accessed 25/05/2020.

Chapter 4

Social and Human Aspects of Crisis: The “Human Factors”

Abstract This chapter focuses on how tradeoff optimization methods and software lifecycle management depend on human-related factors. This application is the final part of the IT Crisisology framework, which includes technology, business, and human factors. The approach is based on the models and methods presented in the previous chapters. We discuss the concept of knowledge transfer and provide a few concise case studies of large-scale and complex multi-national companies. We further evaluate the effect of knowledgebase inconsistencies on the knowledge transfer quality in the event of a crisis. The chapter concludes with a comprehensive case study based on a real dataset, dealing with a taxi management service, using tradeoff optimization to manage the driver crisis in the new digital era.

Keywords Human factors · Knowledge transfer · Tradeoff optimization · Digital service

© Springer Nature Singapore Pte Ltd. 2021
S. V. Zykov, IT Crisisology: Smart Crisis Management in Software Engineering, Smart Innovation, Systems and Technologies 210, https://doi.org/10.1007/978-981-33-4435-8_4

4.1 Introduction: Taming Human Factors in Crises

The previous chapters have analyzed the concepts of crisis and software system lifecycles and illustrated their applications to crisis-resistant digital product development within the ITC framework. However, this set of methods and models generally addressed only two “pillars” of the ITC, namely the technology and business-related factors. The third “pillar”, the human factors, was barely mentioned and largely remained in the shadow. The Adizes PAEI model and certain tradeoff-based methods (such as AHP and ACDM) involve human factors. Still, their application has been reserved till the final chapter to complete the big picture of the ITC. This was so because detailed design and later development, implementation, and maintenance/support of the IT systems, which typically happen at the later part of their lifecycle, are essentially dependent upon the human-related factors. Ignoring or neglecting this group of factors would likely destabilize the ITC “three-legged stool” and result in a crisis of digital product development.

As this taming process is seldom smooth sailing, a number of questions arise. These include: How can managers adjust these “wild” human factors to promote agility


and responsiveness rather than hamper them in a crisis situation? Which activities are required to make a client comprehend often technically complex requirements and constraints and, following the developer’s lifecycle and tradeoff ideas, actively participate in design and further elaboration of the digital product desired? In what way should the established processes and technologies of the client and the developer be adjusted to become crisis-resistant?

To answer these questions, we start this chapter with the concepts of data, information, and knowledge, which are closely related yet clearly different. Further, we discuss the principles and challenges of knowledge (and technology) transfer, as efficiently doing this is an essential requirement for successful system design, implementation, and maintenance/operation.

The next step is discussing models and practices of crisis-responsive knowledge/technology transfer when developing a digital product. For this purpose, we present a few models and apply a set of case studies. The models involve people, process, and technology aspects. The case studies include the success stories of Ernst & Young and Accenture (formerly Andersen Consulting), large-scale and, therefore, complex multi-national enterprises. We perform a SWOT analysis of knowledge transfer aspects that paves the way towards crisis-responsive development and suggest the technology transfer lifecycle as a part of the ITC framework.

Further on, we investigate the factors affecting knowledge transfer success in crises and detect the mission-critical ones. Therewith, we discuss the knowledgebase inconsistencies, their impact on client’s and developer’s organization agility, and on the knowledge transfer between these two organizations and their mission-critical individuals (such as top management, project management, end users, etc.).

Finally, we present an extensive, real data-based case study of a Yandex.Taxi service implementation and its impact on the clients (i.e., taxi drivers and their passengers), and their behavior strategies. Clearly, these strategies are multi-criteria tradeoff optimization problems. Therefore, we illustrate the overall ITC framework by identifying the critical factors and applying the ITC to an extensive real dataset using a combination of tradeoff optimization methods to harness the human factor and manage the crisis of taxi service in this new era of digitalization.

Similar to the previous chapters, the focus of this final one is investigating the influence of its subject (i.e., human factors) on the ITC framework in terms of successful digital product development in a crisis.

4.2 Resilient Knowledge Transfer in Crises

Before discussing the knowledge transfer, let us define the key concepts.

Data is a series of structured facts that contain no instructions regarding their utilization in a given context. Typically, organizations store their data in information systems. Organizations treat their data carefully, as a large amount of unclassified data likely results in critical issues [1].


Information is the data that contains certain significance and has a value for its user. This value emerges when the data becomes meaningful [2].

Knowledge refers to data and information interpreted by a person’s intelligence. It is often difficult to evaluate and share this knowledge, keeping it intact.

Knowledge transfer refers to the dissemination of knowledge and providing inputs to problem solving which “includes a variety of interactions between individuals and groups; within, between, and across groups; and from groups to the organization” [3].

Technology transfer is the mechanism by which the accumulated knowledge developed by a specific entity is transferred wholly or partially to another one, to allow the receiver to benefit from such knowledge.

The ability to transfer knowledge is vital to achieve productivity growth, leading to better life quality, lower product costs and higher company profits. However, what is the core of this knowledge? Where does it come from? How do people acquire it before sharing it? These questions bring us to the two main aspects of knowledge: explicit and tacit (see Fig. 4.1).

Tacit Knowledge

Tacit knowledge is implicit; it helps determine how organizations make decisions and influence the collective behavior of their members. Tacit knowledge refers to knowing more than we can tell, or knowing how to do something without thinking about it, e.g. riding a bicycle. This highly personal, subjective form of knowledge is usually informal and can be inferred from the statements of others. Tacit knowledge tends to be local; it is absent in manuals, books, databases, or files.

Tacit knowledge is technical or cognitive and is made up of mental models, values, beliefs, perceptions, insights, and assumptions. Technical tacit knowledge manifests itself when people master a specific body of knowledge or use skills like those gradually developed by master artisans.

Fig. 4.1 Process model for knowledge sharing/transfer in organizational context


Cognitive tacit knowledge incorporates implicit mental models and perceptions that are so ingrained they are taken for granted. Cognitive models affect how people interpret/evaluate events. People use metaphors, analogies, demonstrations, and stories to share their tacit knowledge [4]. Listeners can evaluate the content and apply tacit knowledge to their own fields. For instance, employees of Data Fusion Inc., an IT consulting company, take photos at business conferences and share these with their colleagues. The stories employees write, contain notes and descriptions, or explicit knowledge. Stories about why things happened and how information could be applied, contain tacit knowledge. Tacit knowledge, in context, is often easier to remember and discuss than explicit knowledge. Explicit Knowledge Explicit Knowledge can be articulated in formal language and easily transmitted or shared; it implies facts about material properties, technical data, and tool features. Most explicit knowledge is technical/academic data (or information) represented in terms of formal language, such as manuals and mathematical expressions. Systematic knowledge, also often referred to as “know-what”, is typically communicated and shared by means of printed, electronic, and formal media. Explicit knowledge is technical and requires a level of academic knowledge or understanding that is gained through formal education or structured study. Typically, explicit knowledge is carefully codified, stored in a database, and accessed and retrieved with high quality (i.e., reliable and fast) information systems. Once codified, explicit knowledge assets become reusable. Data sharing processes often require infrastructural investments [5]. Acquiring and using explicit knowledge assumes a predictable and relatively stable (i.e., crisis-resilient) environment. Market competition, changeable customer requirements, among other crisis factors, reduce stability. 
The following Cases #1 and #2 illustrate the use of explicit knowledge.

Case #1: Ernst & Young, with 82,000 employees worldwide, creates a global network of explicit knowledge to manage cultural differences. Its “best practice” repository is based on documenting and sharing knowledge. The company approaches business issues from multiple perspectives; wherever a problem occurs, it can readily suggest a few operational solutions. Ernst & Young treats knowledge objects as templates of core insights, suitable for multicultural environments.

Case #2: Accenture (formerly Andersen Consulting) elaborated explicit knowledge codification, storage, and reuse. Its “people-to-documents” approach extracts information from its owner and detaches it from that person.

4.2 Resilient Knowledge Transfer in Crises


Table 4.1 Differences between tacit and explicit knowledge

Nature
Explicit knowledge:
• Easily identifiable
• Relatively easy to share
• Intrinsically incomplete, lacks context, and requires interpretation
Tacit knowledge:
• Within personal knowledge
• Difficult to articulate
• Hard to share
• Can be shared only indirectly

Typical examples
Explicit knowledge:
• Information
• Know-what
• Theoretical knowledge
Tacit knowledge:
• Intuition and insight
• Practical intelligence, skills, and practice
• Know-how and heuristics
• Rules of thumb
• Mental models and beliefs

Mechanisms for generating and sharing
Explicit knowledge:
• Codification
• Documentation
• Database and search engines
• Blogs, wikis, and intranets
Tacit knowledge:
• Practice
• Personal and team reflection
• Drawing mental maps
• Apprenticeship
• Social interaction and mentoring
• Storytelling and metaphors
• Smart codification systems can make some tacit knowledge easier to share, through converting certain elements of it into explicit knowledge

Source: Koners, U., and Goffin, K. (2007). Learning from post-project reviews: A cross-case analysis. Journal of Product Innovation Management 24(3): 242–258

After removing client-sensitive aspects, meaningful multi-context information is reused. This information is then transformed into a smart solution applicable to a number of related contexts [5] (Table 4.1).

Knowledge Transfer Methods

Documentation: Recording knowledge as text, audio, or video creates documentation, which often has a longer lifespan than its authors. However, documentation cannot properly capture certain human-centered aspects. Often, documentation is ambiguous, incomplete, or not readily accessible. For instance, an employee may reinvent from scratch a process that had previously been optimized and fully documented at considerable cost.

Coaching: Coaching is a type of leadership accountable for team or individual performance. This approach is common for transferring tacit knowledge, such as interpersonal, physical, and leadership skills. For example, a five-star chef will coach other cooks in the kitchen who are involved in the day-to-day operations of a restaurant, to guide them and improve their performance.

On-the-Job Training: The process of learning a job by doing it with supervision and guidance. For example, a newly employed designer will train to use software tools, such as Adobe Illustrator, to make sketches.

Regular Training: Education aimed at transferring applicable knowledge and skills. For instance, experts in digital product development organize programming bootcamps for young programmers.

Presentations: Information sharing allows knowledge owners to disseminate it to a number of interest groups. A typical example is research conferences (e.g. the Software Engineering Conference Russia, secrus.org, and the International Conference on Code Quality, iccq.ru) that allow researchers and practitioners to share information with other attendees.

Mastery Training: An approach to education and training which ensures that a participant perfects a basic skill before moving on to advanced levels. For example, an aviation school will make sure its students have mastered flight simulations before piloting real aircraft.

Redeployment: Also known as off-site training, this means sending experts, for instance, from headquarters to subsidiaries, in order to familiarize the local employees with a new business process or software system.

Mentoring: A type of knowledge transfer where an expert guides an inexperienced person by offering skills, techniques, and experience.

Shadowing: A process similar to mentoring; it allows a trainee to follow the trainer and observe his/her daily activities in order to improve skill, knowledge, and strategy levels.

Paired Work: This is similar to the extreme programming practice in software engineering, where coding and testing are done in pairs. It provides a rare opportunity to learn from each other, through continuous observation and correction, by means of bug fixing, early testing, and other techniques.
In summary, most companies and individuals transfer knowledge by multiple methods, such as shadowing, documenting, working in pairs, webinars, podcasts, conferences, meet-ups, and invited talks. No single method suits all cases. Consequently, to avoid a crisis, it is critically important to know where, when, and how to apply a particular method or a combination of methods.

Knowledge Transfer Methods: Strengths and Weaknesses

Table 4.2 summarizes the strong and weak points of different knowledge transfer methods.

Knowledge Transfer Process

Since knowledge transfer happens between people, there is a need to account for the human-centric factors that matter. These factors include the personal traits that people exhibit and the way their cognitive systems are involved in knowledge sharing (KS) and knowledge transfer (KT). Lin and Hendriks discuss the role of these individual and organizational factors for KS; see also [6].

Table 4.2 Knowledge transfer methods: strong and weak points

Informal communication
Strengths:
• Relatively simple to conduct
• Two-way communication
• Can occur outside practical site
Weaknesses:
• Difficult to transfer tacit knowledge
• Limited to the transferable knowledge available

On-the-Job Training
Strengths:
• Can transfer both tacit and explicit knowledge
• Capable of sharing feelings and experience
Weaknesses:
• Limited to practical site
• Difficult to conduct for many people simultaneously

Seminar
Strengths:
• Can use many learning tools
Weaknesses:
• More difficult to conduct

Documentation
Strengths:
• Can be in the form of paper or data on media
• Easier to conduct
Weaknesses:
• Difficult to transfer tacit knowledge
• One-way communication

Knowledge/Technology Transfer Process

Knowledge transfer uses certain technologies; AUTM documented this process in its research on the “critical role in ushering new technologies to market” [7]. It argued that, in this era of a high-tech-driven digital economy, human relationship management is still a vital part of the transfer process. Therefore, advanced technology transfer methods should address these human factors (see also Figs. 4.1 and 4.2).

Fig. 4.2 Technology transfer process

Knowledge Transfer: Suggestions

Since no single approach suits all cases of efficient knowledge/technology transfer equally well, we suggest the following steps:

• Identify the type of knowledge (i.e., tacit or explicit) prior to KT
• Evaluate the personal traits (e.g., extrovert or introvert) of the knowledge owner/transmitter and of the receiver; this ensures appropriate selection of the KT method
• Select the primary KT method
• Evaluate the strengths and weaknesses of the primary KT method
• Complement the weak points of the primary method with a secondary method
• Complement the shortcomings of any alternative methods (if required).

Conclusion

Following the above knowledge transfer process, together with harnessing human-related factors, essentially assists in efficient and responsive crisis management, and promotes agility in digital product development.
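The step of complementing the weak points of a primary method can be sketched in a few lines of Python. This is only an illustration: the tags under "provides" and "lacks" paraphrase the strengths and weaknesses listed in Table 4.2, while the tag vocabulary, the dictionary layout, and the function name `complements` are assumptions made for this sketch. The evaluation of personal traits is not modeled.

```python
# Strengths ("provides") and weak points ("lacks") paraphrased from Table 4.2.
# The tag vocabulary itself is an assumption made for this sketch.
METHODS = {
    "informal communication": {
        "provides": {"easy to conduct", "two-way", "off-site"},
        "lacks": {"tacit"},
    },
    "on-the-job training": {
        "provides": {"tacit", "explicit"},
        "lacks": {"off-site", "many learners at once"},
    },
    "seminar": {
        "provides": {"many learning tools"},
        "lacks": {"easy to conduct"},
    },
    "documentation": {
        "provides": {"easy to conduct"},
        "lacks": {"tacit", "two-way"},
    },
}


def complements(primary):
    """Map each weak point of the primary method to the methods that cover it."""
    gaps = METHODS[primary]["lacks"]
    return {
        gap: sorted(
            name
            for name, traits in METHODS.items()
            if name != primary and gap in traits["provides"]
        )
        for gap in sorted(gaps)
    }


# Which secondary methods could compensate for the weak points of documentation?
print(complements("documentation"))
```

An empty list for a weak point signals that none of the listed methods covers it, so a method outside the table would have to be considered.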

4.3 Harnessing Human Factors for Better Knowledge Transfer

Any business involves a set of complex processes, in which people actively participate and play a key role. The role of employees is particularly important in the field of information technology, because the result of such businesses is a digital product that essentially involves knowledge. This knowledge is produced by, and then distributed among, the people. For this purpose, each company creates its own knowledgebase, a kind of information storage regarding the business subject area, regularly updated by the employees. Ideally, the knowledge should be added to this storage as soon as it is created; however, insufficient resources often delay this until the day when the employee quits and delegates his/her duties to others. Due to the human factor, in this transition, the knowledge often gets distorted, inconsistent, or lost. Employees may find some chunks of this knowledge unimportant or obsolete, or miss a certain production process aspect while updating the knowledgebase. Let us explore the role of the human factor in knowledge transfer and describe the cases where it plays a key role.

People, Process, and Technology

Knowledge management (KM) offers IT organizations a variety of strategies, techniques, and tools for their business processes in order to grow and efficiently utilize their knowledge assets. The diagram in Fig. 4.3 highlights the impact of the three elements (people, processes, and technologies) on the four KM steps [8]:

• Create or generate
• Represent or store
• Access or use
• Disseminate or transfer.


Fig. 4.3 People, process, and technology

To be successful, a KM initiative should involve all of these four components. Let us focus on the key KM element, the people, and investigate how the human factors affect knowledge sharing when interacting with the other two critical elements, processes and technologies. People are the most important factor of any KM initiative, as without their involvement, the other two elements are meaningless. Out of the multiple aspects identified, let us analyze the two factors related to people involvement:

• Organization culture
• Top management commitment and support.

Organization Culture

One of the most important KM success factors of the human factor dimension is the organization culture, which nourishes the development of KM practices. Irrespective of the technologies and resources supporting KM implementation, employee unwillingness to share knowledge is likely to put the entire KM program in jeopardy. Therefore, the initial step to KM success is a culture of mutual trust, which enables knowledge sharing and results in organizational learning. The principles of Personal Mastery, Mental Models, Shared Vision, Team Learning, and Systems Thinking form the foundation for a strong culture, which enables the ultimate success of the KM strategy. Research suggests that “to develop a knowledge management system and achieve its objectives in an organization, requires a corporate culture for changing its activities to knowledge-based, and also its staff become knowledge-based workers, which means creating a culture of knowledge management that supports knowledge sharing and value creation and encourage its use”. According to Ernst & Young, half of the experts believe that changing human behavior is a key issue in the KM implementation strategy.

Top Management Commitment and Support

Few organizational initiatives are successful without the support of top management; it is critically important in KM implementation. When the management is committed and supportive, employees become confident. A number of researchers involved in “corporate” learning noted top management commitment and support (TMCS) as a key KM success factor. Top management initiates, sponsors, and promotes KM; it provides the resources required to develop the organizational culture. Krogh et al. describe how KM construction and maintenance depend on the physical, virtual, and emotional context. With strong commitment of the executive management, an organization is able to create the values that lead to efficient KM [9].

Identifying the Key Knowledge Sharing Factors

The terms “knowledge sharing” and “knowledge management” are often used interchangeably. Knowledge sharing is defined as the activities through which knowledge (such as information, skills, or expertise) is exchanged among people, friends, families, or organizations. Wang and Noe describe knowledge sharing as the “movement” of knowledge between individuals to help and collaborate with others, to solve problems, develop ideas, achieve targets, and implement procedures [10]. In this context, knowledge transfer refers to the “movement” between units, divisions, departments, or organizations rather than individuals. Therefore, we use the term “sharing” for knowledge “movement” between employees. One of the authorities, Niedergassel, argued that the term “sharing” could represent a kind of knowledge transfer from a sender to a recipient, when the sender does not retain the knowledge after the transfer [11].
In addition to our previous contributions based on Shannon’s information theory and the Gill & Cohen informing science approach [6], let us consider the basic knowledge sharing model of Brockhoff [12]. This model consists of the knowledge source, recipient, shared object, sharing process, and sharing context. Earlier sources argued that knowledge in general, and tacit knowledge in particular (such as “hands-on” work experience), was usually not easy to share within an organization. To analyze the aspects affecting KM based on human factors, we suggest applying Shannon’s information model enhanced in [6].

Methodology and Results

The survey is based on the outcomes of a human resource review [8, 13]. The key research question is: which factors influence knowledge sharing? The next steps are to identify

(1) Meaning of knowledge sharing
(2) Key factors that affect knowledge sharing.

As a result, the review identified 35 factors that affect knowledge sharing; these include:

• Information technology
• Organization culture
• Motivation
• Leadership
• Trust
• Management system
• Organization environment
• Awareness
• Personality
• Self-efficacy
• Ability to share
• Job satisfaction
• Organization structure
• Work procedure
• Office layout
• Face-to-face interaction
• Reciprocity
• Repute
• Altruism
• Acknowledgement
• Vision and goal
• Social network
• Explicit knowledge
• Tacit knowledge
• Education level
• Training course or seminars for knowledge sharing
• Perceived reciprocal benefits
• Perceived loss of knowledge power
• Perceived reputation enhancement
• Service availability
• Strength of social ties
• Time
• Enjoyment in helping others
• KM system quality
• Transaction-based memory system.

Consequently, the five most important factors, i.e., those reported with the highest frequency, are (see Table 4.3):

• Trust
• Motivation
• Leadership
• Information technology
• Organizational culture.

Table 4.3 Frequency of key KM factors

Factor                    Frequency   Percentile
Information technology    10          50
Motivation                9           45
Trust                     9           45
Organization culture      7           35
Leadership                6           30

Relationship between KM System and Critical Success Factors

Mathew, Vittaleswar, and Rodrigues, in their “Human Factors & Knowledge Management: A System Dynamics Based Analysis”, built a relationship diagram between critical success factors (CSFs) and the KM system, and assessed the level of impact of different CSF variables on KM quality. The research used the System Dynamics (SD) model and the Vensim simulation software [8, 14]. Forrester’s SD methodology includes five interrelated stages:

• Problem identification
• System conceptualization
• Model formulation
• Simulation and validation
• Policy analysis and improvement.

The model identifies relationships between the CSFs and the effects of the variables representing the human factors on the KM system improvement [15]. The causal loop diagram (Fig. 4.4) includes three reinforcing and two balancing loops, and shows the interrelationships of the above-mentioned variables [8]. The first reinforcing loop is between the level of acceptance and organization culture favorable to KM. As the level of acceptance is increased, it increases the organization culture, which in turn increases the level of acceptance. The second reinforcing loop is between the organization culture and culture erosion congenial to KM. When the culture erosion increases, the organization culture decreases, which in turn decreases the culture erosion. The third reinforcing loop includes acceptance level, culture erosion, and organization culture. An increase in acceptance level reduces erosion of culture, which increases organization culture and that, in turn, results in acceptance level improvement. Similarly, there is a delay between acceptance level and organization culture as well as between culture erosion and organization culture. The two balancing loops in the model are between technology adoption, KM tools and technology (KTT), and between technology obsolescence and KTT. In the first loop, technology adoption ratio growth increases KTT level.


Fig. 4.4 Relationships between CSFs and KM

In the second balancing loop, the obsolescence growth rate reduces KTT, and the obsolescence rate reduces accordingly. The variables modeling human factors refer to

(i) Organization culture
(ii) Top management commitment and support.

In the causal loop diagram, the factors affecting organization culture are

(i) Top management commitment and support
(ii) KM measurement and incentives.

Case Study: Knowledge Transfer in IT Companies

This case happened in a company that develops and deploys information systems for governmental organizations, which use them for assigning tasks to employees and monitoring their progress. The project goal was immediate integration of in-house and third-party information systems, for which the only documentation available was the knowledgebase. The integration progress revealed that the third-party system had been developed under tough deadlines, and therefore, many of its developers had resigned. Consequently, the knowledgebase was incomplete, irrelevant, and inconsistent. As such, the company’s analysts had to contact the technical support staff of the third-party system. However, due to multiple concurrent projects, their replies were either late or unavailable, and the knowledgebase remained incomplete. In this situation, the developers themselves had to re-engineer the third-party system in order to integrate it, which caused a severe project delay. As a result, the integration service never operated commercially, and the pilot release was six months late, although the integration itself was completed. In this six-month period, the chief integration analyst and the developers resigned. The new team, which inherited this project, faced the same problems as the resigned employees, due to the tight deadlines and complexity, and had no time to update the knowledgebase accurately or in a timely fashion. We can conclude that the company failed to establish the knowledge transfer process, as the organizational culture was not intended for that, and the management was unsupportive. Another human factor-related issue was that when employees reported their decision to resign, the management reacted inappropriately, and the employees were not motivated to update the knowledgebase.

Conclusion

Knowledgebase inconsistencies due to human-related factors result in delays and duplicated activities, and may cause project failures. Insufficient commitment of the management and inappropriate corporate culture often result in low motivation of the developers and hamper knowledge transfer success.

4.4 Case Study: Taxi Service Optimization

The spread of taxi call services significantly changes cities and their surroundings. In recent years, the number of daily rides based on taxi platforms, such as Uber and Lyft in New York City, increased to approximately half a million. This exponential taxi platform growth expanded beyond the USA. Currently, in many countries, digital taxi platforms are taking over classical taxi companies. According to Yandex, nearly half a million orders come from taxi mobile applications, and these services tend to grow annually by 75-100% [16]. Multi-disciplinary problems in economics and computer science attract a number of research groups to study and model the taxi market services. The aim of this case study is to examine effective algorithms for pricing, market analysis, driver reputation management, and client and driver matching, their top priorities being efficiency and quality of service. However, despite the popularity of these digital platforms, there are few papers concerning agent-based optimization strategies for the drivers. Meanwhile, maximizing individual income has become a pressing issue for millions of car owners all over the world. For instance, in New York City, there are around 65,000 drivers searching for orders in the streets. Moreover, these drivers spend a lot of their time figuring out potentially promising locations; they make decisions individually or use their colleagues’ expertise. Currently, it is challenging for the drivers to find a reliable advisor (including blogs, social networks, etc.) that helps maximize their income. Therefore, they often make their own decisions based on sources of questionable reliability. In this section, we formalize strategies for maximization of the expected taxi driver income. The approach suggests a planning guide for the car drivers; it includes formal


models and describes a solution to maximize income based on a set of dynamic programming algorithms using the real data of a taxi aggregator.

Optimization Problem in Taxi Services

Recently, a number of papers have analyzed the incentives for a driver to work for a taxi service. Researchers aim to figure out the agents’ goals and ways to manage them. For example, the New York Times discussed the incentive research initiatives that Uber and Lyft use for their drivers [17, 18]. The article argued that the key instruments to “manipulate” the drivers were dynamic pricing and bonus strategies for a certain number of rides. Chen and Sheldon mentioned that Uber drivers react to surges to manage these critical periods [15]. This contrasts with previous research papers, which assumed that drivers primarily focused on a certain wage level [19] and left the line straight after achieving that amount. Castillo et al. showed that market pricing during high-demand periods typically contributes to efficient driver movement and prevents “goose chases”, i.e., periods when cars are so scarce that every car busy on a long drive provokes an even greater shortage of cars in the remote areas [20]. Banerjee et al. suggest dynamic pricing strategies for the platforms based on economic theory models [21]; they demonstrated that dynamic pricing models resistant to parameter changes are available. Oskan and Ward considered strategic matching between supply (i.e. individual drivers) and demand (i.e. requested trips) for Uber [22]. They demonstrated that passenger–driver matches based on dynamic parameters (such as arrival speed to destination and client willingness to wait for a car) might improve system performance. However, the above researchers generally performed posterior data analysis and did not focus on developing specific recommendations for the computer systems and their agents.
Optimizing Taxi Services

Many researchers applied network-based models to describe product requirements for searching supply and demand equilibrium for complex tariffs and regulations [23]. Another research group optimized the distribution of taxi drivers [24]. This problem was also addressed by Maciejewski et al. [25] with the focus on optimizing driver routes using guidelines from centralized administrators (for instance, taxi dispatchers). Jaeyoong et al. considered the related problem of maximizing vehicle occupancy and minimizing travel time on joint trips [26]. Camerer et al. found that inexperienced taxi drivers often decide on the distribution of their working time on a single-day basis rather than distributing their work and rest over several days; they also align their goal with the daily income and decline new orders upon reaching this goal [19].

City Modeling

Let us introduce the city model, which assumes that the city is divided into disjoint zones; the set of zones is designated as X. The city itself is represented as a complete weighted directed graph of


$G = (X, E)$, where $X$ is the set of nodes (zones) with $|X| = n$, $E$ is the set of edges, and the weight on edge $e(i \to j)$ represents the chance that a driver in zone $i$ gets an order to zone $j$. In addition, each edge is characterized by

• Time spent traveling from zone $i$ to zone $j$, $\tau(i, j)$
• Reward for moving from zone $i$ to zone $j$, $r(i, j)$.

Both values depend on time. In other words, the travel time between two nodes and the corresponding reward depend on $t$ and are written as $\tau_t(i, j)$ and $r_t(i, j)$, respectively.

Empirical transition matrix (F)

Each edge $e(i \to j) \in E$ is associated with the travel probability $f(i, j) \in [0, 1]$, such that

$$\sum_{j \in X} f(i, j) = 1, \quad \forall i \in X.$$

Since each row of $F$ is a probability distribution, $F$ is the transition matrix of a Markov chain, in which every entry $f(i, j)$ denotes the probability that a passenger travels from zone $i$ to zone $j$. Accordingly, $f(i, i)$ denotes the probability that a passenger travels within zone $i$. Within zone $i$, $f(i, i)$ also denotes the probability that a driver did not find any passenger in this zone in a given period of time.

Transition time matrix (T)

For each edge $e(i \to j) \in E$, let us define the travel time between zones $i$ and $j$ as $\tau(i, j) > 0$. These values form the transition time matrix $T$ with the entries $\tau(i, j)$.

Reward matrix (R)

Each edge $e(i \to j) \in E$


is related to the reward $r(i, j) > 0$, which expresses the driver’s income for traveling with a passenger from zone $i$ to zone $j$. The benefit is the driver’s share in the total order cost. The aggregator usually receives a commission fee of about 20%, not counting expenses such as fuel, vehicle care, and overheads. Since income and expenses vary depending on the distance and travel time, every entry of the reward matrix $R$ is

$$r(i, j) = \mathrm{income}(i, j) - \mathrm{expenses}(i, j).$$

Each of the three matrices ($F$, $T$, and $R$) depends on the time, i.e. their values change dynamically.

Modeling Driver Behavior

Our model assumes that the driver works for a certain total length of time. Let us denote this time budget as $B$, measured in equal time intervals. Depending on the model options, a driver can either use all $B$ intervals consecutively or spread them over $N$ time intervals, where $N \geq B$. For instance, a driver has an 8-hour daily working schedule. Let us assume that each decision takes a driver 10 minutes; consequently, he or she makes no more than six decisions per hour. For such a driver, $B = 8 \times 6 = 48$ and $N = 144$, which corresponds to the $24 \times 6$ ten-minute intervals of a day. Let us describe the other parameters that form the model.

Home zone ($i_0$)

We assume that each driver has a home location, i.e., the zone from which he or she starts working. Let us denote this home location as $i_0 \in X$. This accounts for the driver’s planning to return home.

Available solutions for a driver (A)

In modeling the driver’s behavior, we assume that at every discrete moment of time, he or she has a choice of the following three alternative decisions:

1. “Wait for a passenger” ($a_0$): Wait for a passenger in the current zone
2. “Return Home” ($a_1$): Stop working and go home
3. “Move to another location” ($a_2(j)$): Move to zone $j$.

Driver strategy ($\pi$)

Driver strategy is a sequence of decisions, which he or she makes depending on the current time and location.
The strategy $\pi$ is not predetermined, as the number of decisions made by the driver depends on the budget, decision type, complexity of strategy, and a number of other parameters. Each decision $a$, dependent on the time and location and included in $\pi$, may be expressed as a triple $(\hat{i}, \hat{t}, \hat{a})$, where:

• $\hat{a} \in A$ is the action taken,
• $\hat{i} \in X$ is the location where it was taken,
• $\hat{t} \leq N$ is the time when the decision was made.
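The ingredients of the model introduced so far can be written down as plain data structures. The following is a minimal sketch, not the implementation used in the case study: the three-zone matrix `F` contains invented numbers, the names `Decision`, `Strategy`, and `is_row_stochastic` are hypothetical, and reading N = 144 as the number of 10-minute slots in a day is an interpretation of the example above.

```python
from dataclasses import dataclass
from typing import List, Optional

SLOT_MINUTES = 10                  # one decision per 10-minute slot (from the text)
B = 8 * 60 // SLOT_MINUTES         # 8-hour working budget, in slots
N = 24 * 60 // SLOT_MINUTES        # interpretation: slots in a 24-hour day

# Hypothetical empirical transition matrix for a 3-zone city (invented values).
F = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]


def is_row_stochastic(matrix, tol=1e-9):
    """Check that every row is a probability distribution over the zones."""
    return all(
        all(0.0 <= p <= 1.0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


@dataclass(frozen=True)
class Decision:
    """One element of a strategy: where, when, and what was decided."""
    zone: int                      # location of the decision, an index into X
    time: int                      # slot of the decision, at most N
    action: str                    # "a0" (wait), "a1" (return home), "a2" (move)
    target: Optional[int] = None   # destination zone j, used only by a2(j)


Strategy = List[Decision]

# A short illustrative strategy: relocate to zone 1, wait there, then go home.
strategy: Strategy = [
    Decision(zone=0, time=0, action="a2", target=1),
    Decision(zone=1, time=2, action="a0"),
    Decision(zone=2, time=5, action="a1"),
]

print(B, N, is_row_stochastic(F))
```

Keeping the target zone inside the decision record makes the parameterized action a2(j) explicit without introducing one action name per zone.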


Let us denote the set of all possible strategies as $\Pi$.

Calculating Driver Income

Further, let us focus on calculating the potential income of a driver who, being in zone $i$ at the moment $t$, makes decision $a$. The income $E(i, t, a)$ depends on the decision made and is calculated as follows. For a driver who is in zone $i$ at the moment $t$ and makes the decision “Wait for a passenger” ($a_0$), the income is

$$E(i, t, a_0) = F_i \cdot R_i,$$

where $F_i$ and $R_i$ denote the $i$th rows of the transition matrix $F$ and of the reward matrix $R$, respectively. The decision “Return Home” ($a_1$) means the completion of the current transfer and then going home. If the driver is in zone $i$ at time $t$, the income function is

$$E(i, t, a_1) = -\mathrm{cost}(i, i_0),$$

and the driver incurs losses if there is no client on the way back. Importantly, state-of-the-art taxi services often provide “searching for orders on the way home” [27]. These allow the driver to receive orders that bring him or her as close to home as possible. However, our current model does not address them.

Driver Strategy and its Optimization

Considering the values of the matrices $F$, $T$, and $R$, together with the time budget $B$, the total expected driver income $E$ for the strategy $\pi$ is

$$E(\pi, F, T, R, B) = \sum_{(\hat{i}, \hat{t}, \hat{a}) \in \pi} E(\hat{i}, \hat{t}, \hat{a}) \tag{4.1}$$
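The income formulas and the sum in Eq. (4.1) can be illustrated with a short Python sketch. Several simplifications are assumptions of this example rather than part of the model: the matrices are static instead of time-varying, the numeric values are invented, the names `wait_income`, `home_income`, and `strategy_income` are hypothetical, and only the two decisions for which an income formula is given above (a0 and a1) are supported.

```python
def wait_income(F, R, i):
    """E(i, t, a0): dot product of the i-th rows of F and R."""
    return sum(f * r for f, r in zip(F[i], R[i]))


def home_income(cost, i, home):
    """E(i, t, a1): the driver pays the cost of driving home without a client."""
    return -cost[i][home]


def strategy_income(decisions, F, R, cost, home):
    """Eq. (4.1): sum the income over all (zone, time, action) triples."""
    total = 0.0
    for zone, _time, action in decisions:   # time is unused: static matrices
        if action == "a0":
            total += wait_income(F, R, zone)
        elif action == "a1":
            total += home_income(cost, zone, home)
        else:
            raise ValueError(f"no income formula sketched for {action!r}")
    return total


# Invented 3-zone example data.
F = [[0.50, 0.25, 0.25], [0.25, 0.50, 0.25], [0.25, 0.25, 0.50]]
R = [[4, 12, 20], [8, 4, 16], [20, 12, 8]]
cost = [[0, 3, 6], [3, 0, 3], [6, 3, 0]]

decisions = [(0, 0, "a0"), (1, 3, "a0"), (2, 6, "a1")]
print(strategy_income(decisions, F, R, cost, home=0))
```

With these numbers, waiting in zone 0 contributes 0.5·4 + 0.25·12 + 0.25·20, waiting in zone 1 contributes 0.25·8 + 0.5·4 + 0.25·16, and driving home from zone 2 subtracts cost(2, 0).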

Since the idea is to maximize the driver’s total expected income, we approach the problem in terms of optimization.

Problem 1 (MaxEarnings) Given a set of time-varying $F$, $T$, and $R$, and the driver’s time budget $B$, the goal is to find a strategy $\pi^*$ in the set of all possible strategies $\Pi$ such that

$$\pi^* = \arg\max_{\pi \in \Pi} E(\pi, F, T, R, B).$$

Analysis of the Strategies Available

Let us compare possible driver strategies. These are based on the set of decisions $A$, which are available to the driver. Additionally, we are going to solve the problem of income maximization (i.e. the MaxEarnings problem) optimally in polynomial time for different sets $A$.

The overall expected income of a driver located in zone $i$ at the moment of time $t$, and with the remaining time budget $b$, is denoted by $\Phi(i, b, t)$. Consequently, the overall income can be expressed as $\Phi(i_0, B, N)$. Let us assume that the driver is located in zone $i$, at the moment of time $t$, with $b$ remaining units of time budget, and has the choice of either relocating to zone $j$, or getting an order to transfer passengers to zone $j$. After this move, the time becomes $t' = t + \tau_t(i, j)$, and the remaining time budget becomes $b' = b - \tau_t(i, j)$. In this case, the estimated future income is $\Phi(j, b', t')$. Let $v(i, b, t)$ denote the vector of cumulative future earnings over the possible destination zones:

$$v(i, b, t) = [\Phi(j, b', t')]_{j \in X}.$$

Further, let us analyze the driver strategies and find the solution for maximizing income.

Flexible Transfer Strategy for the City

This is the most common strategy, where a driver is free to choose a work option (i.e., a transfer scenario) between the city zones. The driver’s budget is limited to $B$ time slots, which should be used within $N$ units of time. According to this strategy, a driver staying and waiting in a zone $i$ has the following set of available solutions:

$$A = \{a_0, a_1\} \cup \{a_2(j) \mid \forall j \in X,\ j \neq i\}.$$

Let us emphasize that any solution set with the transition $a_2(j)$ is limited to the solutions that do not result in $t' \geq N$ or $b' < 0$. Following this strategy, the driver chooses the option that maximizes his or her overall expected income. To solve this problem, we apply dynamic programming. There are $O(nNB)$ subproblems $\Phi(i, b, t)$, each of which has at most $O(n)$ available solutions. Therefore, solving the maximum income problem requires $O(n^2 NB)$ time.

“Simple” Strategy

In the “simple” strategy, all the driver’s transfers within the city depend only on the order patterns.
After the end of each trip, he or she would remain at the destination of the previous passenger and wait for another order. As such, the only available method for this driver can be “waiting for another passenger”, i.e., a0 . “Relocation” Strategy In the “relocation” strategy, a driver waiting in the area i can choose one of the two options: to wait for an order (i.e., a0 ) or to start relocation (i.e., a2 ). Consequently, the set of available solutions for a driver contains a set of activities n. One of these activities is waiting for a passenger (n − 1) and using a set of the rest moving options, one for each separate city district. Therefore,


4 Social and Human Aspects of Crisis: The “Human Factors”

\[ A = \{a_0\} \cup \{a_2(j) \mid \forall j \in X,\ j \neq i\} \]

The areas where relocation is ruled out by the time limits (i.e., when $t \geq N$) are excluded from the zones of this strategy.

“Flexible” Strategy

Using the “flexible” strategy, drivers control their time themselves. In practice, this happens on a heuristic basis, which considers the most profitable periods of a day. To model this, we set additional boundaries for the work time $t \geq N$, so that a driver can divide the time into finite sets of $N$ time units. This strategy allows finding an optimal schedule for the driver. At any point in time, a driver can leave his or her working zone and return to the home area. Accordingly, the set of available decisions consists of two options: waiting for a passenger (i.e., $a_0$) or going home (i.e., $a_1$). This means that $A = \{a_0, a_1\}$. As stated in [6], strategies based on a relative target income are more popular among the drivers than strategies based on the available time budget. These flexible strategies let the drivers create their own work schedules and spend less time analyzing their daily income.

For all the above strategies, the solution for maximizing income is expressed by Eq. (4.2):

\[
\Phi(i, b, t) = \max_{a \in A}
\begin{cases}
F_i \left( R_i + v(i, b, t) \right), & \text{if } a = a_0 \\
-\mathrm{cost}(i, i_0) + \Phi(i_0, b, t'), & \text{if } a = a_1 \\
\max_j \left\{ -\mathrm{cost}(i, j) + \Phi(j, b', t') \right\}, & \text{if } a = a_2
\end{cases}
\tag{4.2}
\]
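The recursion in Eq. (4.2) can be illustrated with a small memoized sketch. It is not the authors' implementation: the function name `max_earnings`, the time-independent matrices, the forward-running clock (starting at 0 and stopping at `N`), and the rule that an order overrunning the horizon or the budget earns nothing are all assumptions made for the illustration. The two-zone data is invented.

```python
from functools import lru_cache


def max_earnings(F, R, tau, cost, home, N, B, start, actions=("a0", "a1", "a2")):
    """Expected income from `start` with budget B (hypothetical helper).

    F[i][j]   : probability of a ride i -> j (F[i][i] = no order, stay in i)
    R[i][j]   : reward of a ride i -> j
    tau[i][j] : duration in time slots (assumed >= 1, so the recursion ends)
    cost[i][j]: cost of an empty relocation i -> j
    """
    n = len(F)

    @lru_cache(maxsize=None)
    def phi(i, b, t):
        if t >= N or b <= 0:            # horizon reached or budget spent
            return 0.0
        options = []
        if "a0" in actions:             # wait for an order in zone i
            value = 0.0
            for j in range(n):
                dt = tau[i][j]
                # assumption: an infeasible ride contributes nothing
                if t + dt <= N and b - dt >= 0:
                    value += F[i][j] * (R[i][j] + phi(j, b - dt, t + dt))
            options.append(value)
        if "a1" in actions:             # go home: time passes, budget is kept
            options.append(-cost[i][home] + phi(home, b, t + tau[i][home]))
        if "a2" in actions:             # relocate empty to another zone
            for j in range(n):
                dt = tau[i][j]
                if j != i and t + dt < N and b - dt >= 0:
                    options.append(-cost[i][j] + phi(j, b - dt, t + dt))
        return max(options) if options else 0.0

    return phi(start, B, 0)


# Invented two-zone city: zone 0 always yields a ride to zone 1,
# zone 1 never yields an order.
F = [[0.0, 1.0], [0.0, 1.0]]
R = [[0.0, 10.0], [0.0, 0.0]]
tau = [[1, 1], [1, 1]]
cost = [[0.0, 2.0], [2.0, 0.0]]

simple = max_earnings(F, R, tau, cost, home=1, N=4, B=4, start=0, actions=("a0",))
relocating = max_earnings(F, R, tau, cost, home=1, N=4, B=4, start=0,
                          actions=("a0", "a2"))
print(simple, relocating)  # prints 10.0 18.0
```

In this toy setting the “simple” driver gets stuck in the zone without demand, while the relocating driver pays for one empty trip back and takes a second ride. The memo table has one entry per state $(i, b, t)$, and each entry scans at most $n$ destinations.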

Income Maximization in Uncertainty

The primary source of variability in income maximization is the data of the transition matrices $F$. Samples of real rides from the historical data are often the key source for these matrices. However, these empirical transition matrices may contain considerable deviations due to an unclear data collection methodology and other reasons (such as weather changes or city events), which may complicate the income maximization problem. Let us discuss how to adjust the above methods when the data for the empirical transition matrices contains uncertainty, which is a potential trigger of a crisis. Our key assumption is that an empirical transition matrix ($F$) is generated either from a basic traffic matrix or from a matrix of requests ($C$) for rides between areas $i$ and $j$.

Matrix of Requests (C)

Each edge of this matrix, $e(i \to j) \in E$, corresponds to an integer value $c(i, j)$, which stands for the number of requests for transfers from area $i$ to area $j$. Based on these data, all trips from area $i$ can be computed.


Based on the above considerations and methods, we can model the uncertainty in $F$ and turn the income maximization problem into the problem of robust earnings.

Uncertainties in Modeling

Assuming that the real transition matrix exists, let us investigate whether the matrix $C$ was generated from a real transition matrix $P$. As before, both $P$ and $C$ are time dependent. Let us examine every individual row, $p$ and $c$, of the real transition matrix $P$ and the matrix of requests $C$, respectively. We obtain a discriminating random variable, $2\hat{I}$, distributed as $\chi^2$ with $(n - 1)$ degrees of freedom. Heuristically, $2\hat{I}$ is a measure of the “divergence” between $c$ and $p$. Therefore, for $c$ to lie within the $100(1 - \alpha)\%$ confidence interval of $p$, we need

\[
F_{\chi^2_{n-1}}\left(2\hat{I}\right) = F_{\chi^2_{n-1}}\left( 2\sum_{i=1}^{n} c(i) \log c(i) - 2n \log n - 2\sum_{i=1}^{n} c(i) \log p(i) \right) = 1 - \alpha,
\]

where $p(i)$ and $c(i)$ refer to the $i$-th element of the vectors $p$ and $c$, respectively. In the above equation, $\alpha$ expresses the admissible uncertainty and represents an upper bound on the uncertainty present in $p$. Consequently, $\alpha$ can be called an initial level of uncertainty. This yields Eq. (4.3):

\[
\sum_{i=1}^{n} c(i) \log p(i) = \frac{2(\beta_{\max} - n \log n) - F^{-1}_{\chi^2_{n-1}}(1 - \alpha)}{2}
\tag{4.3}
\]
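The step from the confidence condition to Eq. (4.3) can be spelled out by inverting the distribution function. The derivation below is a sketch under one assumption that the text does not state explicitly: $\beta_{\max}$ is taken to abbreviate $\sum_{i=1}^{n} c(i) \log c(i)$.

```latex
\begin{align*}
F_{\chi^2_{n-1}}\bigl(2\hat{I}\bigr) = 1 - \alpha
  &\iff 2\hat{I} = F^{-1}_{\chi^2_{n-1}}(1 - \alpha) \\
  &\iff 2\beta_{\max} - 2n \log n - 2\sum_{i=1}^{n} c(i) \log p(i)
        = F^{-1}_{\chi^2_{n-1}}(1 - \alpha) \\
  &\iff \sum_{i=1}^{n} c(i) \log p(i)
        = \frac{2(\beta_{\max} - n \log n) - F^{-1}_{\chi^2_{n-1}}(1 - \alpha)}{2}
\end{align*}
```

Under this reading, a smaller $\alpha$ gives a larger quantile $F^{-1}_{\chi^2_{n-1}}(1 - \alpha)$ and therefore admits rows $p$ that lie further from the observed counts $c$.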

In other words, for each vector $p$ that satisfies Eq. (4.3), $c$ lies within the confidence interval of $p$. Thus, for $C$, we define the set of acceptable matrices $\mathcal{P}_\alpha$ as the set of candidate real transition matrices such that Eq. (4.3) holds for each matrix $P$ in $\mathcal{P}_\alpha$ and each of its rows $p$.

The Problem of Sustainable Income

The approach calculates the least expected total income by selecting, among all matrices in $\mathcal{P}_\alpha$, the matrix $P$ that yields the minimal total expected income for a driver. This allows estimating the difference between the solution calculated for MaxEarnings and the lowest driver’s income within the limits of uncertainty $\alpha$.

Problem 2 (RobustEarnings)

Let us formalize the problem. Given the time-dependent matrices $C$, $T$, and $R$, the driver’s budget $B$, and the uncertainty level $\alpha$, it is required to find $\hat{\pi}$ such that

\[ \hat{\pi} = \arg\max_{\pi} \min_{P \in \mathcal{P}_\alpha} \varepsilon(\pi, P, T, R, B) \]

The problem involves a search over all possible transition matrices in $\mathcal{P}_\alpha$, which is a non-enumerable set. Therefore, this problem can be solved by increasing the


total expected financial income through the activities of type $a_0$ (i.e., waiting for a passenger) by means of the dynamic programming procedures described in the section on optimizing the driver’s strategies.

Next, let us describe the metrics of the effect of the driver’s choice of a certain strategy. This requires more details about the preparation and analysis of the data received from a Russian transportation company in Moscow. This is a case study based on real data, illustrating how drivers should act to maximize their income.

Data Collection

Quantifying driver strategies involves time-dependent matrices, which describe the possible options of relocation between city areas and the associated rewards:

• F: probabilities of moving from one area to another
• T: time for moving from one area to another
• R: rewards for the rides
• C: number of rides from one area to another.

Each matrix meets the conditions described previously. In this case study, the main data source is the actual rides of the Yandex.Taxi passengers in Moscow, Russia, in 2016. The reason for choosing this date and city was the introduction of a new algorithm of dynamic price computation, which dramatically influenced both sides of the market (i.e., drivers and passengers) and performed best in Moscow at that moment. This city is a typical example of a monocentric megacity, i.e., a big city with a distinct center, where a large number of workplaces are located, and residential areas in the remote outskirts. For a taxi service, this means that the evening rides (specifically those bringing people home after work) typically terminate in the residential areas and, therefore, leave the drivers in zones of low demand. Further, we use the space-time rate of system imbalance (i.e., a local crisis), which the taxi services often refer to as “surge”. This rate appears in the passenger mobile applications and on the taxi driver online maps as the multiplier of the prices for rides in each zone [28].

Forming C and F Matrices

Our starting point is the Yandex.Taxi dataset (collected in Moscow, 2016) of over 20 million records. The daily average was approximately 50,000 records, each including the starting point and destination, time, distance, fare, and the rate increase, if any. Each record has the identifier of the driver who served the respective order. In addition to the starting point and destination, each ride has another spatial attribute indicating the city area. This links each record to one of the 119 city areas, also known as Moscow administrative districts (see Fig. 4.5, which also shows the number of orders for each district). Each record has the time of the trip’s beginning and end, to the exact second. For the sake of simplicity, we divide every day into 144 time slots of 10 minutes each, and bind each ride to one of these.
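The binding of rides to 10-minute slots can be sketched as follows. The record layout (start time, origin area, destination area) and the function names are hypothetical; the real dataset schema is not given in the text.

```python
from collections import Counter
from datetime import datetime

SLOT_MINUTES = 10
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES  # 144 slots per day


def slot_index(ts: datetime) -> int:
    """Index (0..143) of the 10-minute slot that contains the timestamp."""
    return (ts.hour * 60 + ts.minute) // SLOT_MINUTES


def count_requests(rides):
    """Count rides per (weekday, slot, origin, destination).

    `rides` is an iterable of (start_time, origin_area, destination_area)
    tuples; this layout is an assumption made for the illustration.
    """
    counts = Counter()
    for start, origin, destination in rides:
        key = (start.weekday(), slot_index(start), origin, destination)
        counts[key] += 1
    return counts


# Two invented rides that start in the same slot on a Wednesday.
rides = [
    (datetime(2016, 10, 5, 10, 41, 7), 3, 17),
    (datetime(2016, 10, 5, 10, 48, 55), 3, 17),
]
counts = count_requests(rides)
```

Keying the counter by weekday as well as slot keeps the counts separable by day of the week, which is how the per-weekday averages of the request matrices are described.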


Fig. 4.5 Daily order distribution across the city districts

Therefore, each element $c(i, j)$ of the matrices $C^t$ gives the number of trips from zone $i$ to zone $j$, computed over a 30-minute window around the time slot. For instance, the value $c(i, j)$ for the time slot [10:40, 10:50] on Wednesday is the average number of trips from zone $i$ to zone $j$ that started between 10:30 and 11:00 on any Wednesday in the entire dataset. The diagonal elements of the matrix $C^t$ are equal to zero, because the model does not consider trips inside the same zone. By definition, filling in the empirical transition matrix $F^t$ requires evaluating its diagonal elements, which represent the unavailability of orders in the area.

The following algorithm fills the matrices from the dataset. Assuming that the ride parameters do not change significantly within one time slot, $N(\lambda)$ and $N(\mu)$ denote the number of passengers and drivers, respectively, who arrive in zone $i$ during one unit of time, according to independent Poisson distributions with rates $\lambda$ and $\mu$. Consequently, the value $K = N(\lambda) - N(\mu)$ follows the Skellam distribution (see Equation 4.4):

\[
\Pr[K = k] = e^{-(\lambda + \mu)} \left( \frac{\lambda}{\mu} \right)^{k/2} I_k\left( 2\sqrt{\lambda\mu} \right)
\tag{4.4}
\]

where $I_k(z)$ is the modified Bessel function of the first kind. Whenever $K$ is less than 0, the number of drivers exceeds that of the passengers in the area. In the simulation, we assume the scenario where a driver joins the end of the line. For this zone, a driver would get the next order based on a


FIFO scheme (First In, First Out). This means that a driver entering the zone last will get an order only after all previous drivers get their orders. Consequently, if $k \leq 0$, a driver has to wait for $(|k| + 1)$ passengers to get an order. This condition can be expressed as follows (see Equation 4.5):

\[
\Pr[N(\lambda) = |k| + 1] = \frac{\lambda^{|k|+1} e^{-\lambda}}{(|k| + 1)!}
\tag{4.5}
\]

Accordingly, the diagonal element $f^t(i, i)$ is determined as follows (see Equation 4.6):

\[
f^t(i, i) = 1 - \sum_{k \leq 0} \Pr[K = k] \times \Pr[N(\lambda) \geq |k| + 1]
\tag{4.6}
\]

For the matrix $F$ to be stochastic, every remaining element is determined as follows (see Equation 4.7):

\[
f^t(i, j) = \left( 1 - f^t(i, i) \right) \times \frac{c^t(i, j)}{\sum_{j'} c^t(i, j')}
\tag{4.7}
\]
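Equations (4.4) to (4.7) can be combined into a short sketch that fills one row of the transition matrix. This is an illustration under stated assumptions rather than the authors' code: the helper names are invented, the Skellam probability is computed by summing over the two independent Poisson counts instead of evaluating the Bessel function, and the infinite sums are truncated.

```python
import math


def poisson_pmf(rate, m):
    """Pr[N(rate) = m] for a Poisson count."""
    return math.exp(-rate) * rate ** m / math.factorial(m)


def poisson_tail(rate, m):
    """Pr[N(rate) >= m]."""
    return 1.0 - sum(poisson_pmf(rate, x) for x in range(m))


def skellam_pmf(lam, mu, k, terms=60):
    """Pr[N(lam) - N(mu) = k], summed over the driver count m (truncated)."""
    return sum(poisson_pmf(lam, m + k) * poisson_pmf(mu, m)
               for m in range(max(0, -k), terms))


def diagonal(lam, mu, k_min=-40):
    """Diagonal element as written in Eq. (4.6), truncated at k_min."""
    covered = sum(skellam_pmf(lam, mu, k) * poisson_tail(lam, abs(k) + 1)
                  for k in range(k_min, 1))
    return 1.0 - covered


def transition_row(counts_row, i, lam, mu):
    """Row i of F: Eq. (4.6) on the diagonal, Eq. (4.7) elsewhere."""
    f_ii = diagonal(lam, mu)
    total = sum(c for j, c in enumerate(counts_row) if j != i)
    if total == 0:                      # no observed rides: the driver stays
        return [1.0 if j == i else 0.0 for j in range(len(counts_row))]
    row = [(1.0 - f_ii) * c / total for c in counts_row]
    row[i] = f_ii
    return row


# Invented counts for a zone with three possible destinations.
row = transition_row([0, 30, 10], i=0, lam=2.0, mu=3.0)
```

By construction the row sums to one, and the off-diagonal entries keep the proportions of the request counts. The rates `lam` and `mu` would have to be estimated per zone and time slot; the values above are placeholders.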

As a result, the $F^t$ matrix satisfies all initial assumptions.

Forming Time-Sensitive Matrices T and R

Let us examine the Yandex.Taxi rides and concentrate on the “Economy” tariff as their flagship product. In 2016, over 90% of their rides used this tariff. Let us assume that the ride price, trip duration, and distance do not change significantly within every 10 minutes. With these assumptions, the real data on ride costs and durations form a reward matrix $r(i, j)$ and a time matrix $t(i, j)$ between zones, i.e., between the city districts. Figure 4.6 presents a typical example of these matrices. Next, we describe the results of simulating driver activities during one week in October 2016. The results of this simulation will generally hold true for any other week, except for the very specific periods of the New Year (approximately December 30 to January 5) and the May holidays (approximately May 1 to 10).

Experimental Results

Comparing the Strategies

First, let us outline the best strategy for a driver. It is intuitively clear that a taxi driver who has chosen the “flexible hours” strategy is free to select work options and relocate between the city zones. Such flexibility allows a more efficient use of the spatial and temporal differences in the demand for rides in Moscow. However, this is only a hypothesis. To test it, we need to compare the income of drivers with different strategies. The drivers using the “simple” strategy work eight hours (9 am to 5 pm). In this case, all relocations throughout the city depend only on the passenger


Fig. 4.6 Experimental data sample for an administrative district

order patterns: after each trip, a driver stays in the zone of the previous order destination and waits for the next order. Conversely, the drivers who chose the “flexible strategy” or the “flexible strategy with relocations” spend the same total of eight hours every day, but in different time periods. Comparing these strategies requires solving the MaxEarnings problem introduced earlier and modeling the behavior of 100 drivers. Every driver is randomly assigned a starting point in one of the administrative districts, and his or her strategy is fixed for the next 10 weeks. Figure 4.7 provides a chart of the actual received income. The diagram illustrates that, as the hypothesis suggested, all “clever” strategies consistently outperform the “simple” ones. In most cases, the “flexible strategy with relocations” brings the biggest income. For instance, the average income of a driver using the “simple” strategy on Sunday is 0.4 units, whereas that of a driver with the “flexible strategy with relocations” is estimated at 0.62 units, or 55% more. On average, the “flexible strategy with relocations” adds 47% per workday compared to staying at the passenger’s destination. Therefore, the above strategies clearly exploit the spatial and temporal demand changes in Moscow. The results also show that for a part-time driver in Moscow it is better to ride in the middle of the week (i.e., Wednesday to Friday) and on Sunday than on Saturday and Monday.

Spatial Dynamics for Strategies

Further, we address the benefits and issues of driver relocation. Figure 4.7 demonstrates the spatial characteristics of driver demand in Moscow. Clearly, these result in an uneven driver’s income that largely depends on the passenger destination. For instance, we should expect that the drivers who drop off passengers near the Kremlin (i.e., the city center) earn more than those who finish their rides in the southeastern outskirts.


Fig. 4.7 Driver income depending on the strategy (by day; % from the real maximum income)

Similarly, Fig. 4.7 differentiates the income by the weekdays. On the days of low demand, not only does the average income of the drivers using relocation strategies exceed that of the “simple” strategists, but the interquartile range (IQR) also gets narrower. On the days of high demand (such as Fridays), with intensive rides in all city areas, relocation and flexible strategies are equally efficient, and they significantly outperform the “simple” ones. These observations indicate that the income imbalance for the “simple” strategy is considerably higher than for any relocation-based one. Therefore, we can conclude that daily relocations keep the drivers out of the “traps” of the low-income areas and significantly raise their income. To some drivers, this seems irrational, as the relocation, being an “empty” ride by itself, implies a cost. Nevertheless, the results demonstrate that the income gained from relocation, if properly implemented, is typically higher than the associated expenses.

Temporal Dynamics of Strategies

Because demand is periodic, driver income is expected to depend significantly on the working hours and the time spent. Drivers usually work from 8 am to 5-6 pm. However, is this the best tactic to maximize income? Is there a better strategy for working time management? And what is the best time to start working in order to maximize the income? To answer these questions, we simulated 1,000 drivers using the “flexible” and “flexible relocation” strategies. The drivers were randomly assigned to the administrative districts where they started and finished their working shifts, and to their home areas.


Fig. 4.8 Active drivers (percentage of total, two strategies, evenings)

For the simulation, we solved the above MaxEarnings problem for each strategy and created a recommended plan of actions for every simulated driver. At each simulation step, the drivers performed the personalized activities recommended by the strategy in accordance with their current locations, time of day, and remaining time budget. Figure 4.8 gives the percentage of simulated drivers who relocated within the city during different parts of a day. It shows that a high percentage of the drivers who used the flexible strategy followed the standard schedule (i.e., 9 am to 6 pm). Conversely, the number of drivers using the “flexible relocation” strategy shows two distinct peaks at the morning and evening rush hours. Moreover, the majority of these drivers used their time budget for rides in the second part of the day (i.e., 3 pm until midnight). Since these two strategies differ only in the availability of relocation, we can conclude that the differences observed depend solely on the driver activities. Consequently, the relocating activities are most effective in the evening hours, as they stimulate the drivers to relocate actively within the city.

Concerning the most frequent relocation spots, we compared the relocation activities of the drivers who used:
(i) the “relocation” strategy only, and
(ii) the “flexible relocation” strategy with working time management on the line.

One typical scenario included a driver using the “relocation” strategy predominantly within the Moscow center. Another scenario involved the “flexible relocation” strategy and demonstrated a clear geographical homogeneity. The number of relocations initiated by the drivers using this strategy was significantly lower. The main reason for that was the flexible working hours, which allowed driving at the time of the


highest demand. This reduced the frequency of relocations and, consequently, the risk that the income growth would not cover their cost.

In Pursuit of Increased Rate

With dynamic price calculation, state-of-the-art aggregators feature the so-called price hikes. Their purpose is to keep the service accessible by increasing the ride prices in periods of high demand and low car availability. We argue that this limits the current demand from more price-sensitive clients and stimulates the drivers to start their work at the rush hours. Figure 4.9 presents the active rates in the Moscow districts at a given moment in time. This kind of data is easily accessible to the drivers, since they have similar maps in their mobile applications [28]. However, the drivers are uncertain whether they should relocate to a high-demand area or stay at the same location, due to the implicit impact of these rates on the actual Yandex.Taxi pricing in 2016. To understand whether the drivers should race for the “hikes”, we simulated their earnings under the following scenarios:
(i) “no surge”: no price multiplier is used
(ii) “surge”: for each ride, the zone multiplier is applied to the earnings
(iii) “surge chase”: a driver located in a zone without an active multiplier always relocates to the zone of the highest rate within 10 minutes.

For each strategy under these three scenarios, Fig. 4.10 gives the simulated driver’s income.

Fig. 4.9 Active rates in Moscow


Fig. 4.10 Income depending on strategy choice, %

Clearly, the blind “surge chase” results in a drop in income irrespective of the strategy choice. Figure 4.10 supports our hypothesis concerning the high dispersion of the “simple” strategy. Occasionally, the drivers who used the “simple” strategy with the multiplier turned on earned less than in the periods when this multiplier was off. In contrast to the predetermined strategy, the other strategies do not gain any notable benefits from the “surge chase”. In other words, a blunt application of the increased rate strategy can result in a crisis situation, when a driver’s long-term earnings turn out to be suboptimal.

Conclusion

This section examined different strategies to optimize digital service efficiency and built a theoretical basis for maximizing the agent’s (i.e., taxi driver’s) income under uncertainty (i.e., in a crisis). This systematic approach simulated the driver’s income and suggested a set of dynamic programming equations. We introduced and discussed a set of elementary driver activities that form a number of practically applicable strategies. We also suggested a simulation model for the city that included the number of rides between the zones, and the estimated time and reward for each possible transition. Experimental results based on the real taxi service data demonstrated that the flexible (i.e., agile) strategies allow efficient crisis management by optimizing the combinations of time slots and geolocation zones. These flexible strategies typically outperform the straightforward approaches that suggest waiting for a new order at the destination zone of the previous ride. The flexible strategies are based on historical data. Consequently, the results are potentially sensitive to the values of the empirical transition matrices. Future research directions include simulating this empirical disturbance and validating the results by solving the RobustEarnings problem.


4.5 Conclusion: Lessons from the IT Crises: Optimizing the Human Factors

This chapter discussed harnessing human-related factors such as communication, negotiation, and teamwork to efficiently develop digital products in crises. In crisis situations, these human factors often appear mission-critical. Therefore, we treated them as one of the three “pillars” of the novel ITC framework, along with the business and technological factors investigated in the previous chapters. To avoid these crises, software production typically requires responsive and resilient knowledge transfer between the developer and the customer. For this kind of transfer, some researchers, e.g., Gil and Cohen, also use the term “resonant” [6]. To reach this helpful “resonance”, developer-to-customer communication requires the application of a certain model (e.g., based on Shannon’s information theory or its later descendants). The idea is to transfer the knowledge about the new digital product, which is yet to be produced, between the client and the developer without any critical loss. The reasons for such loss are typically human-related factors that arise from a diverse and “noisy” intra-organizational “environment”: the organizations differ in maturity, corporate culture, language, technical literacy level, etc. Each of the communicating organizations is usually treated as both a transmitter and a receiver, so the “connection” is bidirectional. Moreover, each of these communicating sides is a complex “informing” system in itself. As such, the product knowledge should disseminate within each of these subsystems with as little distortion or loss as possible, even in an environment made “noisy” by such hampering factors as “witch-hunting”, rumors, neglect, and ignorance. These complex issues of multi-level communication essentially hinder the transfer, and therefore require adequate models and methods to make it successful, even in a crisis. Such “noise”, as a potential trigger of crisis, may be encountered either within an organization or between the communicating sides.

To manage the above-mentioned challenges, we started by analyzing the knowledge transfer and knowledge sharing phenomena in crisis environments, and suggested responsive transfer methods. To eliminate the “noise” of possible miscommunication, we classified the concepts of data, information, and knowledge and explained the difference between explicit and tacit knowledge, as well as between knowledge transfer and knowledge sharing. Further, to validate and reinforce our ITC-based approach to the human-related factors, we illustrated the application of the above-mentioned methods with a set of case studies that address knowledge transfer in large-scale and complex organizations. Examples included Ernst & Young and Accenture (formerly Andersen Consulting). The next step was analyzing human factors in knowledge transfer. We identified and listed the 35 most influential of these, selecting the top five that we consider extremely important for crisis-responsive knowledge transfer. Our final activity in this chapter was combining the formal models, optimization methods, and useful human factors in an integrated case study. This case study dealt with implementing a new digital service for taxi drivers at the multinational


Yandex corporation. The case study used a real dataset for Moscow collected in 2016. However, in many respects, this approach would hold true for New York or any other megacity. Clearly, this was a problem of multi-criteria optimization with tradeoffs among time, budget, fuel, distance, and traffic intensity. Additionally, this problem addressed the driver’s tradeoff-based decision-making strategies. Therefore, we addressed it with a formal model that included an optimizing method (the top priority being the taxi driver’s income) and suggested a few strategies (such as “simple”, “flexible”, and relocation-based). This formal model was based on the digital service produced by Yandex for their Yandex.Taxi IT platform for taxi service management. We used a large-scale simulation of 1,000 drivers and found that the “flexible” (i.e., agile) strategies are better suited to crisis situations, such as losing profit due to wasting time at a single preferred location. This final outcome verified the applicability of the ITC framework to crisis-resistant digital product design, development, and operation.

References

1. Davenport, T. H., & Prusak, L. (1998). Working knowledge: How organizations manage what they know. Boston, MA: Harvard Business School Press.
2. Chini, T. C. (2004). Effective knowledge transfer in multinational corporations. New York: Palgrave MacMillan.
3. Information for Paulin and Suneson (2012) is available at https://www.researchgate.net/publication/259272411_Knowledge_Transfer_Knowledge_Sharing_and_Knowledge_Barriers_-_Three_Blurry_Terms_in_KM. Last accessed 25/05/2020.
4. Stewart, G. B. (1991). The quest for value: A guide for senior managers. New York, NY: Harper Business.
5. Hansen, M., Nohria, N., & Tierney, T. (1999). What’s your strategy for managing knowledge? Harvard Business Review, 77(2), 106–116.
6. Zykov, S. V. (2016). Crisis management for software development and knowledge transfer. Springer series in smart innovation, systems and technologies, vol. 61 (pp. xxiii + 133). Switzerland: Springer International Publishing.
7. Information for AUTMN is available at https://ec.europa.eu/research/innovation-union/pdf/ktireport-final.pdf, https://autm.net/. Last accessed 25/05/2020.
8. Mathew, A. O., Rodrigues, L. L. R., & Vittaleswar, A. (2012). Human factors and knowledge management: A system dynamics based analysis. Journal of Knowledge Management Practice, 13(2).
9. Krogh, G. V., Ichijo, K., & Nonaka, I. (2000). Enabling knowledge creation: How to unlock the mystery of tacit knowledge and release the power of innovation. Oxford: Oxford University Press.
10. Wang, S., & Noe, R. A. (2010). Knowledge sharing: A review and directions for future research. Human Resource Management Review, 20, 115–131.
11. Niedergassel, B. (2011). Knowledge sharing in research collaborations. Gabler Verlag, 267 pp.
12. Brockhoff, K. (1999a). Forschung und Entwicklung, Planung und Kontrolle. München/Vienna: Oldenbourg.
13. Stewart, G. B. (1990). The quest for value: The EVA management guide. New York, NY: Harper Business.
14. Information regarding VenSim Simulation Software is available at https://vensim.com/. Last accessed 25/05/2020.
15. Keith, C. M., & Michael, S. (2016). Dynamic pricing in a labor market: Surge pricing and flexible work on the Uber platform. http://www.anderson.ucla.edu/facultypages/keith.chen/papers/SurgeAndFlexibleWorkWorking. Last accessed 25/05/2020.
16. Information regarding Increasing bid ratio in the city. Video instruction for taxi drivers is available at https://driver.yandex/news/povyxawi-kofficient-video/. Last accessed 25/05/2020.
17. Information for The New York Times (2017). How Uber uses psychological tricks to push its drivers’ buttons is available at https://www.nytimes.com/interactive/2017/04/02/technology/uber-driverspsychological-tricks.html. Last accessed 25/05/2020.
18. Hall, J. V., & Krueger, A. B. (2018). An analysis of the labor market for Uber’s driver-partners in the United States. Technical Report No. w22843. National Bureau of Economic Research.
19. Camerer, C., Babcock, L., Loewenstein, G., & Thaler, R. (1997). Labor supply of New York City cabdrivers: One day at a time. The Quarterly Journal of Economics, 112(2), 407–441.
20. Castillo, J. C., Knoepfle, D., & Weyl, G. (2017). Surge pricing solves the wild goose chase. In: Proceedings of the 2017 ACM Conference on Economics and Computation (pp. 241–242).
21. Banerjee, S., Johari, R., & Riquelme, C. (2015). Pricing in ridesharing platforms: A queueing-theoretic approach. https://dl.acm.org/citation.cfm?id=2764527. Abstract appeared in ACM EC-2015.
22. Ozkan, E., & Ward, A. R. (2016). Dynamic matching for real-time ridesharing. https://papers.ssrn.com/sol3/papers.cfm?abstractd=2844451. Working paper. Last accessed 25/05/2020.
23. Yang, H., Wong, S. C., & Wong, K. (2002). Demand–supply equilibrium of taxi services in a network under competition and regulation. Transportation Research Part B: Methodological, 36(9), 799–819.
24. Shi, Y., & Lian, Z. (2016). Optimization and strategic behavior in a passenger–taxi service system. European Journal of Operational Research, 249(3), 1024–1032.
25. Maciejewski, M., & Nagel, K. (2013). Simulation and dynamic optimization of taxi services in MATSim. VSP Working Paper 13–0. TU Berlin, Transport Systems Planning and Transport Telematics.
26. Jung, J., Jayakrishnan, R., & Park, J. Y. (2013). Design and modeling of real-time shared-taxi dispatch algorithms. In: Proc. Transportation Research Board 92nd Annual Meeting.
27. Set a destination and we’ll match you with riders heading the same way. https://www.uber.com/info/destination. (2018).
28. Increasing bid ratio in the city. Video instruction for taxi drivers. https://driver.yandex/news/povyxawi-kofficient-video/. Last accessed 25/05/2020.
29. Information regarding Carrier is available at https://www.carrier.com/carrier/en/worldwide/about/willis-carrier. Last accessed 25/05/2020.

Conclusion: Captain Nemo and Crisis as a New Environment

This book introduced the new ITC (i.e., IT Crisisology) framework for developing digital products in crises. The framework comprises models and methods, patterns and practices, interwoven and illustrated by a set of case studies. Let us summarize the findings. To do this, we apply a metaphor-based approach widely used in agile digital product development and in case studies.

One classic metaphor that comes to mind is Captain Nemo’s story. The plot dates back to the nineteenth century, when Jules Verne wrote the famous novel “Twenty Thousand Leagues under the Sea” (Vingt mille lieues sous les mers). This eminent author is often referred to as the “Father of Science Fiction” and is the second most translated writer of all time (behind Agatha Christie) [1].

The name of the novel’s main character, Captain Nemo (literally, “Captain Noname”, see Fig. 1), reveals his own crisis. He was born a noble Indian, Prince Dakkar. However, during his youth, many of his relatives were executed by the colonists, his hopes for freedom, wealth, and prosperity were ruined, his home country was invaded, and his future was dark and uncertain. Clearly, these events triggered a personal crisis quite close to our initial definition, i.e., an imbalance between “customer” expectations and the reality or actual “product”, that is, between “what you see” and “what you get”.

In this unfortunate situation of crisis, what did the prince do? As he could not protest in the current circumstances, he decided to change his surroundings, i.e., his environment. As he could not compete (i.e., fight) effectively on land (or even on water), he decided to move the fight underwater, i.e., to the sea depths, which at that time were completely devoid of people (Fig. 2). Of course, such a choice was an aggressive challenge and a tough decision to make. However, it was triggered by this sudden and “global” crisis, which affected a number of critical aspects of his life, including his personality, his family, and his country.

© Springer Nature Singapore Pte Ltd. 2021 S. V. Zykov, IT Crisisology: Smart Crisis Management in Software Engineering, Smart Innovation, Systems and Technologies 210, https://doi.org/10.1007/978-981-33-4435-8

Fig. 1 Captain Nemo

Fig. 2 The Nautilus submarine

Fig. 3 The motto of the Nautilus submarine

Therefore, the prince’s decision was brisk and unexpected: he decided to build a submarine, i.e., a product suited to an entirely new environment, adjustable to this crisis, and moreover resilient and responsive. This step was potentially deadly, as survival was doubtful in the new circumstances of being submerged under water. The name of his submarine was Nautilus, a seashell; and this is another metaphor (see Fig. 2). The motto of this unprecedented vessel was “Mobilis in mobili”, i.e., a mobile object in a mobile environment (see Fig. 3).

Let us elaborate more on the environment. People always try to adapt to their environment, sometimes by reshaping it. An early example is the air conditioner, invented by Carrier in 1902 [2]. At school, we admired superheroes, who used superpowers to adapt their initially human bodies to uncommon or tough environments. The best-known of these is perhaps Spider-Man, the hero of the famous movie. However, he is just one of dozens of superheroes (such as Batman and Sandman) from the comic books of the 1980s. They transformed themselves by means of super abilities, which improved their performance and entire lifestyles in the environments to which they adapted.

However, let us get back to the famous Captain Nemo and analyze his actions step by step, i.e., his crisis response plan and its implementation. First of all, this plan followed the PDCA (Plan-Do-Check-Adjust) strategy, which clearly includes analysis of the situation, incorporating environmental changes. As the next step, the prince had to accept the new reality: he was no longer a prince (except by birth and through certain material resources he still possessed). Further, he had to change his attitude (stop being afraid, angry, etc.) and start thinking constructively. This was not death, but rather a new kind of life, which required a number of mission-critical adjustments due to environmental changes.

He even had to change his name (from Prince Dakkar to Captain Nemo), his motto or “operating mode” (to “Mobilis in mobili”), and many other attributes to match these new requirements. These changes raised his “mobility”, i.e., agility, which made him crisis-responsive and more resilient. To achieve this, new paradigms were required in terms of models and methods, patterns and practices, and their wise application in a balanced (i.e., optimized) combination and a changeable (i.e., dynamically variable or multiple) context. Not only did he realize (comprehend, or understand) this, but he also planned and acted (according to the PDCA strategy).

He then decided to build a conceptually new type of battleship capable of fighting not only on the surface but also underwater, i.e., a submarine. After committing to this concept, he created a detailed plan, which included mission-critical features. These features, or requirements and constraints, became the drivers for the architecture (i.e., components and relationships), as in the ACDM/ATAM design methods. After planning the project and designing the future digital system or product, the next step is acting, i.e., implementing the deliverables, as the PDCA strategy prescribes. Following this strategy, Captain Nemo designed and built his Nautilus, an unprecedented battleship powered by electricity and capable of fighting underwater.

Typically, individuals are afraid to make mission-critical decisions like Prince Dakkar’s decision to become Captain Nemo. This is due to their nature, or the so-called human-related factors that we addressed. Their attitudes may be negative (“there is no human life under water”, “we’ll die there!”), neutral/ignorant (“this situation is hopeless, we cannot fight/compete”), uncertain (“who knows what creatures live deep in the ocean; perhaps they are dangerous!”), or unaware (“there is nothing there that can support our life or even allow us to breathe there”).

Surprisingly, religious beliefs often help to overcome these wrong attitudes, as they address the “crises” related to life and death issues. Buddhism appeared as a remedy to the crisis of suffering. In Hinduism, reincarnation theory helps to overcome the fear of death, which clearly is another form of crisis. Christianity promises eternal life after death. Moreover, people who are baptized, or initiated (i.e., ordained) as monks, are given new names, just as Captain Nemo was, since Prince Dakkar was “dead” in this crisis.

Earlier in this book, we discussed the concept of “vision”, or product idea. This is similar to what a religious belief or faith is: “Now faith is the assurance of things hoped for, the conviction of things not seen” (Hebrews 11:1). This vision, although somewhat vague, essentially helps to overcome the fear of crisis transformation before the product is elaborated or at least designed. Captain Nemo was a typical visionary, i.e., a person able to see across the border between the two environments (before and after the crisis). He was able to foresee the benefits of the new (underwater) environment and therefore to manage the transition to a new kind of life (an underwater lifestyle). As such, he could identify the critical requirements for this “new reality” (i.e., environment) and build a product for surviving in this new reality and accommodating its features (such as high pressure and no oxygen). In a crisis, this inter-environmental border is revealed to the visionaries, who recreate typical human activities for these new worlds.

Professor Pierre Aronnax, another character (and a visionary) of this famous novel by Jules Verne, was captured by Captain Nemo and so had a chance to physically explore the underwater world that he had initially known only through his research insight. Captain Nemo presented him with a nautilus seashell, the prototype of the submarine, which prior to this crisis had existed only in the Professor’s dreams.

Regarding another event, the recent Covid pandemic, which clearly was a crisis due to a dramatic change of the environment, we noticed very different attitudes, policies, and adjustments by country, location, and individual. These resulted in very

different outcomes. However, as the subject of this book is different, we would prefer that the readers elaborate on this matter themselves and come to conclusions of their own.

Let us revisit this book’s structure and summarize its outcomes (and takeaways). The introduction outlined the purpose of IT Crisisology as a framework, its focus, highlights, and outcomes, and presented the structure of the volume. Chapter 1 gave a historical background and overviewed the key concepts and ideas, such as crisis, software engineering, and more abstract things (e.g., models and principles). Chapter 2 gave an early overview of the ITC model and a set of case studies, both to provide food for thought, which was explained in the following chapters, and to illustrate the application of the ITC approach to real-world IT-intensive businesses. Chapters 3 and 4 described the ITC framework in terms of models and methodologies and their applications (interwoven as case studies) in more detail. These approaches included Crystal, Adizes, ACDM/ATAM, and Six Sigma, to name a few. The focus was on human factors and their practical application, as illustrated by business cases. The conclusion summarized the results of the book. The appendices added value to the ITC framework, as they simplified navigation (by means of an acronyms list, glossary, and index) and contained materials to illustrate the case studies.

Based on this narrative, what are the takeaways regarding crisis-responsive behavior? First of all, do not be afraid of crises, as they suggest improvement. Remember the story of Fyodor Ovchinnikov, Dodo Pizza’s CEO, who treated client complaints as valuable presents, since they contained guidelines for business improvement. However, the business and the entire lifestyle will change. Think of a crisis as a new environment offered for a new life. In this view, a crisis is a chance for a revival, a resurrection into a new reality.

To succeed, it is essential to analyze and fully utilize all three “pillars” of the ITC framework (i.e., business, technology, and human factors, see Fig. 4). Of these, harnessing human factors (such as communication, negotiation skills, and teamwork) is critically important. If neglected, the human factors significantly hamper crisis-responsive development, as shown by our case studies; these include useful hints (such as the GEMBA practice of top managers working in the pizzeria kitchen) to revisit and apply in mission-critical situations for IT-intensive businesses. The three “pillars” of ITC, however, should be carefully balanced rather than treated bluntly or carelessly. For this reason, the book included a set of handy models and methods such as PAEI, Six Sigma, ACDM/ATAM, and other approaches mostly aimed at tradeoff optimization of the digital product lifecycle. More crisis-responsive methods and tools are available in our recent books:
– “Crisis Management for Software Development and Knowledge Transfer” (Springer, 2016)
– “Managing Software Crisis: A Smart Way to Enterprise Agility” (Springer, 2018)
– “Agile Enterprise Engineering: Smart Application of Human Factors” (Springer, 2020).

Fig. 4 Three “pillars” of the ITC framework

IT Crisisology: T (technology), B (business), H (human factors)

Armed with these toolkits, managers will likely have a better chance of responding to and overcoming crises. We wish you every success in confidently and safely navigating your IT business vessel through the rough sea of crisis!

References

1. Behind Agatha Christie https://www.biography.com/writer/jules-verne. Last accessed at 25/05/2020.
2. Information regarding Carrier is available at https://www.carrier.com/carrier/en/worldwide/about/willis-carrier. Last accessed at 25/05/2020.

Annex A

Springer Exhibits

See Figs. 1, 2, 3, 4, 5, 6 and 7

Fig. 1 Julius Springer, founder and CEO (left), 1930s (approx.)

Fig. 2 Early book published by Julius Springer in late 1800s

Fig. 3 Memorial board in Berlin, Germany

Fig. 4 Title page of The Science of Nature (an interdisciplinary journal launched by Springer in 1913 to compete with the British periodical Nature)

Fig. 5 Springer offices in New York (USA, est. 1964) and Chennai (India, est. 1996)

Fig. 6 Springer logos for different projects (1996–2015)

Fig. 7 Scanning old books (“Springer book archives” project, 2010)

Annex B

IGI Global Exhibits

See Figs. 1, 2 and 3

Fig. 1 IGI Global’s founder, Dr. Mehdi Khosrow-Pour, D.B.A

Fig. 2 IGI Global’s publishing process

Fig. 3 IGI Global’s manuscript publishing process

Annex C

DodoIS as the Dodo Pizza Company Understands It

Key success ingredients of this multinational delivery-oriented pizza business are: “innovation, deep digital integration and automation” (http://www.dodofranchise.com/). The core of this business is cloud-based software, which coordinates processes so that they are streamlined, and optimizes resources. This makes the business transparent and competitive. DodoIS is a tightly integrated set of cloud-based applications that control kitchen and delivery activities, collect and process data, and report in real-time. This set includes the applications for:
– Production management
– Assembly line for pizza makers
– Delivery management (incl. POS)
– Hiring, scheduling and payroll
– Inventory management
– Marketing (incl. promotions, menu boards etc.)

Annex D

DodoIS GUI Screenshots

See Figs. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13.

Fig. 1 Production Management in the Kitchen (orders immediately appear on the kitchen tablets)

Fig. 2 Pizzamaker’s Timer (sets priorities and displays preparation time required/left)

Fig. 3 Packer’s Workplace (assists cashiers in order inspection, approval, and dispatching)

Fig. 4 Delivery POS (produces receipt and packing list, monitors order status and delivery time)

Fig. 5 Hiring/Performance Manager (monitors applicant status and employee performance)

Fig. 6 Availability/Schedule Manager (monitors available hours, scheduling shifts)

Fig. 7 Shift Manager (assigns shifts, solves scheduling conflicts)

Fig. 8 Preparation Speed Manager (displays pizzas made on time and delayed; uses game-like interface for motivation)

Fig. 9 Delivery Speed Manager (displays delivery time and delays; uses game-like interface for motivation)

Fig. 10 Inventory Manager (ingredients in stock)

Fig. 11 Consumption Manager (tracks write-offs and their reasons)

Fig. 12 a. Marketing Manager (displays and tracks promotions/special offers). b. Marketing Manager (displays and tracks promotions/special offers)

Fig. 13 DodoIS report (pizzas prepared by hours; March 7, 2013; Pervomaiskaya St., Syktyvkar; incl. promotions/special offers)

Annex E

DodoIS as a Cloud-Based System

The key advantages of the solution based on the Microsoft Azure cloud platform are:
– Seamless support of heterogeneous software (operating systems such as iOS, Android, and Windows; applications; languages and runtimes such as JavaScript, Python, .NET, PHP, Java, and Node.js; platforms, tools, and databases) and hardware (tablets, smartphones)
– Free use of open source software applications
– Uniform service design for a wide range of hardware devices.

Implementing the Azure cloud-based DodoIS simplified the application development and packaging processes so that the software can be easily deployed anywhere. Servers are now configured automatically; the applications no longer require manual multiserver installation or complex configuration procedures.

In 2016, Dodo Pizza expanded geographically: it opened restaurants in the USA and in China. System administrators easily deployed DodoIS there, providing the same level of operational stability as in Russia. This was possible thanks to the new Azure Container Service by Microsoft, which provides scalable clusters of host machines for automated deployment and management of containerized applications. In other words, there was no need to manually configure and deploy DodoIS on any remote local servers (whether located in the USA or China), and therefore the software management costs were minimized.

Dodo Pizza benefits from a strategic synergy of cutting-edge technologies and efficient offline pizza delivery. The technological part helps users to manage restaurants efficiently and in a scalable way, as it collects and monitors the key data needed to analyze productivity. Thanks to the new order tracking software system, which is a key element of DodoIS, all pizzas are prepared from equally fresh ingredients and delivered hot to the customer.

Thanks to DodoIS, the company enjoys a number of competitive advantages; these include:
– Direct propagation of the customer’s order from the website to the kitchen tablet PCs;

– Real-time video surveillance of the ordered pizza preparation for each individual customer.

The IT system reduces human factor-based risks, and thereby supports efficient and continuous operation, quickly moving the company from the local Russian market to the international food industry.
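The order flow described in this annex, in which a customer’s order passes directly from the website to the kitchen tablets, can be sketched as a simple publish/subscribe arrangement. This is a minimal illustration only, not the actual DodoIS design: the class names `OrderBus` and `KitchenTablet`, the `Order` fields, and the in-memory delivery are all hypothetical assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Order:
    """Hypothetical order record; the real DodoIS data model is not described in the book."""
    order_id: int
    items: List[str]


@dataclass
class KitchenTablet:
    """Hypothetical kitchen device that shows incoming orders."""
    name: str
    queue: List[Order] = field(default_factory=list)

    def receive(self, order: Order) -> None:
        # The order appears on the tablet as soon as it is published.
        self.queue.append(order)


class OrderBus:
    """Hypothetical in-memory channel between the website and kitchen devices."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Order], None]] = []

    def subscribe(self, handler: Callable[[Order], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, order: Order) -> None:
        # Deliver the order to every subscribed device, with no manual re-entry.
        for handler in self._subscribers:
            handler(order)


if __name__ == "__main__":
    bus = OrderBus()
    tablet = KitchenTablet("kitchen-1")
    bus.subscribe(tablet.receive)
    bus.publish(Order(1, ["margherita"]))
    print(tablet.queue)
```

In a cloud deployment the in-memory list of subscribers would presumably be replaced by a network service, but the sketch shows the property the annex emphasizes: an order entered once on the website reaches the kitchen without a human relaying it.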

Annex F

The Dodo Pizza’s KPI and Team

See Figs. 1, 2, 3, 4 and 5

Fig. 1 Growth in the number of Dodo’s stores (2011–2017; source: 2017 annual report)

Fig. 2 Dodo’s CEO at Papa John’s (Which recipe to follow? The screen at the end of the line tells that)

Fig. 3 The motivation screen of the DodoIS tracking system

Fig. 4 Dodo’s managers inspecting the motivation screen of the DodoIS tracking system

Fig. 5 Dodo’s kitchen team (source: 2017 annual report)

Glossary

Agile methodology: Agile software development is an approach to software development under which requirements and solutions evolve through the collaborative effort of self-organizing and cross-functional teams and their customers/end users. It advocates adaptive planning, evolutionary development, early delivery, and continual improvement, and it encourages rapid and flexible response to change.
Backlog: A product backlog is a list of the new features, changes to existing features, bug fixes, infrastructure changes or other activities that a team may deliver in order to achieve a specific outcome.
Burndown chart: A burndown chart is a graphical representation of work left to do versus time. The outstanding work (or backlog) is often on the vertical axis, with time along the horizontal.
Business Intelligence (BI): BI is the set of technologies used to collect and analyze data in order to support an organization’s decision-making process.
Collective code ownership: Collective code ownership abandons any notion of individual ownership of modules. The code base is owned by the entire team and anyone may make changes anywhere.
Communities of Practice (CoPs): Communities of Practice (CoPs) are organized groups of people who have a common interest in a specific technical or business domain. They collaborate regularly to share information, improve their skills, and actively work on advancing the general knowledge of the domain.
Collaborative wall (information radiator): “Information radiator” is the generic term for any of a number of handwritten, drawn, printed or electronic displays, which a team places in a highly visible location, so that all team members as well as passersby can see the latest information at a glance.
Culture: Definitions of culture generally relate to shared mindsets, feelings, meanings, and characteristics; shared socially developed environments; common ways in which innovations are utilized; and commonly experienced events.
Cross-cultural: Cross-culture in the business world refers to a company’s efforts to interact effectively with professionals from different backgrounds.

Cross-functional teams: A cross-functional team is a group of people with different functional expertise working toward a common goal.
Daily build: A daily build or nightly build is the practice of completing a software build of the latest version of a program every day.
Domain knowledge: In software engineering, domain knowledge is knowledge about the environment in which the target system operates, for example, software agents.
Explicit knowledge: Explicit knowledge (also expressive knowledge) is knowledge that can be readily articulated, codified, stored, and accessed.
Framework: A framework, or software framework, is a platform for developing software applications. It provides a foundation on which software developers can build programs for a specific platform.
Globalization: The process by which businesses or other organizations develop international influence or start operating on an international scale.
Global Software Development: Global Software Development (GSD) is “software work undertaken at geographically separated locations across national boundaries in a coordinated fashion involving real-time (synchronous) and asynchronous interaction”.
Individualism: Individualism is a feeling relating to a personal sense of accomplishment from work, having work that leaves adequate time for oneself and one’s family, and the flexibility to use one’s own working methods.
Intellectual property: Intellectual property (IP) refers to creations of the mind, such as inventions, literary and artistic works, designs, and symbols, names and images used in commerce, big data, and artificial intelligence.
Knowledge management: Knowledge management (KM) is the process of creating, sharing, using, and managing the knowledge and information of an organization.
Lightweight methodology: A lightweight methodology is a software development method that has only a few rules and practices, or only ones that are easy to follow.
Masculine: Masculine characterizes gender roles; for example, in some societies, men are expected to exhibit masculine characteristics and behaviors such as being self-confident, tough, intense, and focused on achieving objectives, while women are expected to be more humble, delicate, and concerned with the quality of life.
Modularization: The design or production of something in separate sections.
Pair programming: Pair programming is an agile software development technique in which two programmers work together at one workstation. One, the driver, writes code, while the other, the observer or navigator, reviews each line of code as it is typed in.
Planning game: A planning game is a meeting attended by both IT and business teams that is focused on choosing stories for a release or iteration.
Process: A series of actions or steps taken in order to achieve a particular end.
Refactoring: Code refactoring is the process of restructuring existing computer code (changing the factoring) without changing its external behavior. Refactoring is intended to improve non-functional attributes of the software.

Retrospective meeting: A retrospective is a meeting held at the end of an iteration in agile development.
Self-organizing team: A self-organizing team is one that does not depend on or wait for a manager to assign work. Instead, these teams find their own work and manage the associated responsibilities and timelines.
Shotgun debugging: A process of making relatively undirected changes to software in the hope that a bug will be perturbed out of existence.
Sociocultural variable: Sociocultural factors are customs, lifestyles, and values that characterize a society or group.
Software development lifecycle: The systems development lifecycle (SDLC), also referred to as the application development lifecycle, is a term used in systems engineering, information systems, and software engineering to describe a process for planning, creating, testing, and deploying an information system.
Sprint: A sprint is a set period during which specific work has to be completed and made ready for review.
Tacit knowledge: Tacit knowledge resides in the human mind; it is a personal asset and is difficult to formalize, classify, and communicate.
Throughput: Throughput is the rate of production or the rate at which something is processed.
User Story: A user story is a very high-level definition of a requirement, containing just enough information so that the developers can produce a reasonable estimate of the effort to implement it.
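The glossary definitions of a burndown chart (work left to do versus time) and of throughput (the rate at which something is processed) can be made concrete with a small calculation. The following Python sketch is illustrative only: the function names `burndown` and `throughput` and the sample sprint figures are assumptions introduced here, not taken from the book.

```python
from typing import List


def burndown(total_work: float, completed_per_day: List[float]) -> List[float]:
    """Return the work remaining at the end of each day, starting with day 0."""
    remaining = [total_work]
    for done in completed_per_day:
        # Remaining work never drops below zero, even if the team over-delivers.
        remaining.append(max(remaining[-1] - done, 0))
    return remaining


def throughput(completed_per_day: List[float]) -> float:
    """Average amount of work completed per day (the rate of production)."""
    if not completed_per_day:
        return 0.0
    return sum(completed_per_day) / len(completed_per_day)


if __name__ == "__main__":
    # Hypothetical 5-day sprint with 20 story points in the backlog.
    daily_done = [3, 5, 2, 6, 4]
    print(burndown(20, daily_done))
    print(throughput(daily_done))
```

Plotting the list returned by `burndown` against the day number gives the chart described in the glossary, with outstanding work on the vertical axis and time on the horizontal axis.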

Index

A Accenture, 114 ACDM/ATAM, 15 Activity, 46 Adaptive Software Development (ASD), 10 Adizes, 45, 50 Administrative (A), 12 Administrator, 45 Adolescence, 47 Agile, 10, 21, 23 Agile software development, 10 Agility, 118 Aging, 12 Analytic Hierarchy Process (AHP), 69, 71, 72 Andersen Consulting, 114 API, 100, 102 Architectural review, 95 Architecture Centric Development Method, 93 Aristocracy, 12, 48 Artificial intelligence, 104 AUTM, 117 Available solutions for a driver, 127

B Backlog, 22, 173 Banerjee, 125 Bessel function, 133 Best practice, 114 B-factor, 5 Bitcoin, 106 Black Belts, 56 Blockchain, 60, 105–107 Blockchain in Transport Alliance, 106

Bloom’s taxonomy, 24 Blossoming, 12 Brockhoff, 120 Bureaucracy, 12, 48 Business processes, 118 Business requirements, 13, 34, 41

C Camerer, 125 Carnegie Mellon University, 2 CASE tools, 4 Castillo, 125 Chaincode, 106 Chen, 125 Childhood, 49 Choosing by Advantages (CBA), 75–78 Cloud computing, 100 Cloud layer, 100 Cloud system, 100 Coaching, 115 Cognitive, 113 Cognitive models, 114 Cognitive system, 116 Cognitive tacit knowledge, 114 Common vision, 3 Community crisis, 97 Computing layer, 101 Conflictual crisis, 97 Continuous integration, 23 Cost Risk Analysis, 65 Courtship, 11, 46, 49 Crisis, 5, 97 Crisis management, 98 Crisis management plan, 98 Crowdsensing, 99

Crystal Clear, 10 Crystal methods, 23, 24, 37, 40 D Data, 112 Data Envelopment Analysis (DEA), 82–84, 86 Data Fusion Inc, 114 Death, 12, 48 Decision-making, 13, 23 Decision-making units, 82 Decision space, 68 Decision Tree Analysis, 64 Deep learning, 100 Degrees of belief, 89 Delivery cycle, 40 Deployment, 23 Design and Prototyping, 22 Dijkstra, E., vii Design, Development, 10 Design pattern, 22 Development lifecycle, 18 Digitalization, 104, 105 Digital product, 118 Documentation, 115 DodoIS, 32, 34, 37 Dodo Pizza, 31 Driver strategy, 127 Dynamic Systems Development Method (DSDM), 10 E Early victory, 41 Easy access to expert users, 40 Ecosystems, 107 Efficiency, 84 Efficiency frontier, 83 Efficient, 69 Empirical transition matrix, 126 Enterprise Agility Matrix, 13 Enterprise architectures, vii Entrepreneur, 45 Entrepreneurial (E), 12 Ernst & Young, 114, 120 Ethereum, 107 Evidential Reasoning Approach (ERA), 88, 91 Expected Monetary Value, 64 Expected Monetary Values Analysis, 64 Expert Judgment, 63 Explicit knowledge, 114 Extreme Programming (XP), 10, 14, 116

F Fall, 48 Feature-Driven Development (FDD), 10 Feedback, 7 First In–First Out, 134 Flexible, 136 Flexible hours strategy, 134 Flexible relocation, 136, 137 Flexible strategy with relocations, 135 Flexible transfer strategy, 129 Flourishing, 11, 12 Focus, 40 Forrestor, 122 Frequent Delivery, 39

G Gathering Requirements, 22 GEMBA, 34 General Electric Company, 51 Gill & Cohen, 120 Goose chases, 125 Gross domestic product (GDP), 104

H H-factor, 5 Hikes, 138 Home zone, 127 Human-centric factors, 116 Human factor, 96, 117–120, 124 Human-related factor, 2, 118

I IBM, 106 ICONIX, 14 IDC Company, 106 Ideal point, 69 Importance of Advantages (IofA), 82 Incremental, 19 Incremental architecture change, 41 Industrial crisis, 97 Inefficiency, 84 Infancy, 11 Information, 113 Information and communication technologies, 96 Information radiators, 41 Information theory, 7, 120 Informing, 103 Informing science, 120 Integration cycle, 40

Integrative (I), 12 Integrator, 46 Intelligence layer, 100 Interface layer, 100 Internet of Things (IoT), 104, 105 Interquartile range, 136 ISO 9000, 53 IT Crisis, 5 IT Crisisology, 5, 7 IT Crisisology Framework (ITCF), 14, 15 Iteration cycle, 40

J Jaeyoong, 125

K Kaizen, 51, 52 Kanban, 10, 51, 52 Key performance indicator, 32 Knowledge, 113, 118 Knowledgebase, 118, 123, 124 Knowledge management, 118, 120 Knowledge management system, 119 Knowledge sharing, 116, 120, 121 Knowledge sharing model, 120 Knowledge transfer, 113, 115, 116, 118, 120, 124

L Language, 167 Lean, 51 Lean Software Development (LSD), 10 Less important, 71 Let’s Go, 11 Long-term strategic risks, 62 Lyft, 125

M Machine learning, 104 Machine-to-machine, 104 Maciejewski, 125 Maintenance, 23 Markov chain, 126 Marx, K., 2 Mastery Training, 116 Matrix of option scores, 71 MaxEarnings, 131, 135, 137 Maximization of output, 84 Megapolis, 132 Mentoring, 116

Metaphors, 114 Minimization of input, 84 Mining, 105 Mitigation, 98 Mobile Crowdsensing (MCS), 101–103 More important, 71 Motorola, 53 Motorola Corporation, 51 Multiple-Criteria Decision-Making (MCDM), 67, 68

N Nadir point, 69 National Health Service, 91 Natural crisis, 97 Naur, P., 7 NIST, 51 Nodes, 101 Non-community crisis, 98 Non-conflictual crisis, 97 Non-dominated, 69 Non-dominated solution, 67 Non-industrial crisis, 97 Normal distribution, 52

O On-the-Job Training, 116 OpenUP, 14 Operational risks, 62 Optimization, 59 Organizational culture, 124 Organization culture, 119 OSHA, 82 Osmotic communication, 39 Ozkan, 125

P PAEI, 45 PAEI (Adizes), 15 PAEI (Adizes) methodology, 12 Paired work, 116 Pairwise comparison matrix, 71 Personal safety, 39 Personal Software Process, 55 Plan, 10 Plan-Do-Check-Adjust, 51 Planning, 22 Poisson distribution, 133 Precedence Diagramming Method, 66 Preparedness, 98 Presentations, 116

Prevention through Design (PtD), 78, 79 Price hikes, 138 Prime, 47 Process transformation, 104 Producer (P), 12, 45 Production possibility set, 83 Product vision, 3 Project cycle, 40 Proof-of-stake, 106 Proof-of-work, 106

Q Qualitative Risk Analysis, 61 Quantitative Risk Analysis, 63

R Recovery, 99 Recrimination, 48 Redeployment, 116 Reflective improvement, 39 Regular Training, 116 Relocation, 137 Relocation strategy, 129, 137 Requirements Analysis, 10 Response, 99 Reusable, 114 Reuse, 114, 115 Reward matrix, 126, 134 Risk, 60 Risk analysis, 20, 60, 61, 66, 77 Risk categorization, 62 Risk management, 60 Risk Urgency Assessment, 62 RobustEarnings, 139

S Salem City, 12 Same importance, 71 Schedule Risk Analysis, 66 Score matrix, 72 Score vectors, 72 SCRUM, 10, 14, 21 SDLC models, 19 Sensing layer, 101 Seven principles, 15 Shadowing, 116 Shannon, 7, 120 Shannon’s information model, 15 Sheldon, 125 Short-term strategic risks, 62 Simple, 136, 139

Simple strategy, 129, 134 Simulation, 64 Six Sigma (6σ), 42, 44, 51–53, 56 Skellam distribution, 133 SMART, 15 Smart city, 96 Smart contracts, 106 Soft skills, 15 Software architecture, 93 Software development, 22 Software Development Lifecycle (SDLC), 18 Software Lifecycle Management Systems, 23 Source Control, 23 Spiral, 20, 44 Sprints, 21 Surge chase, 138, 139 Sustainable income, 131 Systematic knowledge, 114 T Tacit knowledge, 113, 114 T-, B- and H-factors, 14 Team Software Process, 55 Teamwork, 2 Technical environment, 40 Technical tacit knowledge, 113 Technology transfer, 113 Test and Deployment, 10 Testing, 23 T-factor, 5 Three pillars, 5 360° study, 40 Time matrix, 134 Toyota, 51 Tradeoffs, 15 Transition time matrix, 126 U Uber, 125 UBS, 106 Utility, 90 V Vector of criteria weights, 70 V-Shaped, 21 W Walmart, 106 Ward, 125

Waterfall, 19, 23 Weakly efficient, 69 Weakly non-dominated, 69 Work Breakdown Structure, 65 Y Yandex.Taxi, 132, 134, 138

Youth, 11

Z Zero input point, 84 Zero output point, 84