Advanced Sciences and Technologies for Security Applications
Carl S. Young
Risk and the Theory of Security Risk Assessment
Advanced Sciences and Technologies for Security Applications

Series Editor
Anthony J. Masys, Associate Professor, Director of Global Disaster Management, Humanitarian Assistance and Homeland Security, University of South Florida, Tampa, USA

Advisory Editors
Gisela Bichler, California State University, San Bernardino, CA, USA
Thirimachos Bourlai, West Virginia University, Statler College of Engineering and Mineral Resources, Morgantown, WV, USA
Chris Johnson, University of Glasgow, Glasgow, UK
Panagiotis Karampelas, Hellenic Air Force Academy, Attica, Greece
Christian Leuprecht, Royal Military College of Canada, Kingston, ON, Canada
Edward C. Morse, University of California, Berkeley, CA, USA
David Skillicorn, Queen’s University, Kingston, ON, Canada
Yoshiki Yamagata, National Institute for Environmental Studies, Tsukuba, Ibaraki, Japan
The series Advanced Sciences and Technologies for Security Applications comprises interdisciplinary research covering the theory, foundations and domain-specific topics pertaining to security. Publications within the series are peer-reviewed monographs and edited works in the areas of:

– biological and chemical threat recognition and detection (e.g., biosensors, aerosols, forensics)
– crisis and disaster management
– terrorism
– cyber security and secure information systems (e.g., encryption, optical and photonic systems)
– traditional and non-traditional security
– energy, food and resource security
– economic security and securitization (including associated infrastructures)
– transnational crime
– human security and health security
– social, political and psychological aspects of security
– recognition and identification (e.g., optical imaging, biometrics, authentication and verification)
– smart surveillance systems
– applications of theoretical frameworks and methodologies (e.g., grounded theory, complexity, network sciences, modelling and simulation)

Together, the high-quality contributions to this series provide a cross-disciplinary overview of forefront research endeavours aiming to make the world a safer place. The editors encourage prospective authors to correspond with them in advance of submitting a manuscript. Submission of manuscripts should be made to the Editor-in-Chief or one of the Editors.
More information about this series at http://www.springer.com/series/5540
Carl S. Young
Risk and the Theory of Security Risk Assessment
Carl S. Young New York, NY, USA
ISSN 1613-5113 ISSN 2363-9466 (electronic)
Advanced Sciences and Technologies for Security Applications
ISBN 978-3-030-30599-4 ISBN 978-3-030-30600-7 (eBook)
https://doi.org/10.1007/978-3-030-30600-7

© Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To Irving Young, MD August 15, 1922 – October 3, 2016
Foreword
Organizations of all types face significant security challenges these days. Threats to both people and assets are increasing as the pressure to contain costs intensifies. In institutions of higher learning, the demand for successful and cost-effective security is coming from trustees, faculty, staff, students, and parents. Therefore, the need for rigor in assessing and managing security risk is greater than ever. However, rigor requires pedagogy grounded in theory. If security practitioners lack such a foundation, it is difficult to gauge whether their decisions are truly risk-based and the resulting security strategies are indeed cost-effective.

Carl S. Young has written a timely book that provides the theoretical basis for security risk assessments. Fundamentally, this book enables problem solving by teaching how to reason about security risk from first principles. I believe it fills a longstanding gap in the risk management literature.

Carl is an accomplished risk theorist as well as an experienced practitioner. He has tested his theories over the course of a long career that includes senior-level positions in government, industry, and consulting, and most recently at The Juilliard School as both Chief Information Officer and Chief Security Officer.

As president emeritus of one of the world’s leading performing arts conservatories, I am encouraged by this fresh and long overdue approach. This book has the potential to become a standard reference for students, professionals, and academics in the field of security risk management. It should be required reading for any individual interested in the theory of security risk as well as anyone required to translate theory into practice.

New York City, NY, USA
Joseph W. Polisi
President Emeritus, The Juilliard School
Preface
Since the events of 9/11 there has been an intense focus on security risk management. Many organizations have been professionalizing their physical and information security programs by hiring staff and implementing security technologies. The overarching objective of these enhancements is to ensure the security and safety of people, property, and information.

Given the assets at stake and the investments made in protecting those assets, it may be surprising to learn that security-related decisions are often based largely on intuition. Although intuition can be a valuable by-product of experience, it is no substitute for rigorous analysis. Perhaps even more surprising might be the persistent misconceptions about basic security risk management that pervade the industry. For example, even security professionals regularly conflate the terms “threat” and “risk.” Such distinctions might seem unimportant if not pedantic. However, the misuse of basic terminology by experts suggests that the processes used to design security strategies and/or assess their effectiveness might also be flawed.

The situation raises the question of why such confusion is so pervasive. In this author’s view, the absence of pedagogy is a contributing factor. Formal instruction on security is generally missing from academic curricula, which is the natural place to learn the conceptual foundations of any discipline. In addition, reasoning about risk often takes a back seat to methods and technology. The reality is that the foundations of risk-based thinking have not been formalized and/or effectively communicated within the security community.

A truly rigorous, i.e., risk-based, approach to security risk assessment is especially important in disciplines where relevant data are in short supply and confirming the efficacy of solutions via experiments is not practical. In these circumstances even the most basic questions can be difficult to answer.
For example, has the lack of historical threat incidents been due to the effectiveness of security controls, adversary disinterest, or just dumb luck? The answers to such questions are never obvious, but gaining the necessary insight is impossible in the absence of rigor.
That said, theory might be considered extravagant if not downright irrelevant to security professionals who must address real threats every day. Learning theory will always be a lower priority than satisfying operational requirements. Also, a professional might reasonably question why learning theory is worth the effort given that significant threat incidents are relatively rare. For example, why not just evaluate each threat scenario on a case-by-case basis and forgo the formalism? Ironically, a low number of threat incidents actually increases the requirement for rigor. One phrase is particularly apropos and succinctly captures a fundamental aspect of security risk assessment and management:

An absence of threat incidents does not imply an absence of risk.
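The maxim above can even be given quantitative teeth. The sketch below is not from the text; the function name and the monthly-observation framing are illustrative assumptions. It applies the standard statistical “rule of three”: if zero incidents are observed over n independent periods, the 95% upper confidence bound on the per-period incident probability is roughly 3/n, which may be small but is decidedly not zero.

```python
def zero_incident_upper_bound(n_periods, confidence=0.95):
    """One-sided upper confidence bound on the per-period incident
    probability p when zero incidents were observed in n_periods
    independent periods. Solves (1 - p)**n_periods = 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_periods)

# Ten years of incident-free monthly observations still leave a
# measurable upper bound on the monthly incident probability.
p_upper = zero_incident_upper_bound(120)  # ~0.0247, close to 3/120 = 0.025
```

Halving the observation window roughly doubles the bound, which is one way to see why a short incident-free history says very little about the underlying risk.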
In other risk management fields, plentiful threat incidents enable the application of standard statistical methods to calculate the probability of future incidents. Furthermore, experiments can be conducted to confirm the efficacy of risk management. Unfortunately, security problems are often constrained by a lack of data, and conducting security-related experiments is impractical. However, a risk-based approach is especially needed in such circumstances, and this constrained condition is what motivates much of the material in this book.

Importantly, the motivation for a theoretical treatment is not to fill classrooms but to enhance security operations. The objectives are to assess security risk more accurately and apply security controls more effectively, efficiently, and in proportion to the assessed risk. In the end, a risk-based assessment reduces the dependence on luck, which is the ultimate objective of all risk management.

The appropriate place to both formulate and promulgate theory is academia. Yet most universities lack even an introductory course in security risk management let alone an entire curriculum.1 It is certainly true that numerous professional certifications are available in this area. Although such courses have tactical value, they typically do not teach students how to think about assessing risk. In addition, science and engineering concepts and methods are conspicuously absent from security pedagogy. This situation exists despite the proliferation of security technologies whose performance is governed by the laws of nature. Attracting more scientists and engineers to security risk management should certainly be a priority.

Finally, the contention is that a good grasp of the theory will actually help address real-world security problems by providing a logical basis for decision-making. That logic must be based on risk and apply to any threat scenario.
It is the incontrovertible logic and general applicability of risk-based analyses that bridge the abstract and the practical, and thereby promote confidence in the effectiveness of security solutions so identified.

New York, NY, USA
Carl S. Young
1 A notable exception is the John Jay College of Criminal Justice of the City University of New York (CUNY).
Acknowledgments
Like many obsessions, writing a book is characterized by intermittent periods of pleasure and pain. In this case, the pain substantially increased when my father died in 2016. I wrote portions of this book in his hospital room, and it was difficult to regain momentum after his passing. I ultimately did return to writing partly because my father would have been distraught if I abandoned this project because of him or, more precisely, because of his absence. Although the book took longer than planned, I eventually completed it in part to honor his memory.

Certain individuals helped me to regain my bearings. Specifically, my mother, Geraldine Young, has been a constant source of love and inspiration throughout my life. Her mere presence is enough to motivate my best effort. My sisters, Diane Uniman and Nancy Young, have never faltered in their support for anything I’ve undertaken, and this book is no exception.

My lifelong friends, Fran Davis and Jim Weinstein, David and Lisa Maass, Ruth Steinberg, and Peter Rocheleau, deserve special mention. Suffice it to say, I am extremely fortunate to have had such supportive friends for many decades. Recent events merely add to my accruing indebtedness.

Some of my parents’ closest friends fall into the same category, and we have become that much closer in recent years. In particular, Bill and Vivian Seltzer have become like parents, a burden they likely neither anticipated nor welcomed at this stage in life. They and another of my family’s closest friends, Sora Landes, provided companionship, food, and much-needed support during regular trips to Philadelphia. I hope these impromptu appearances haven’t been too onerous, as they have meant the world to me.

I must acknowledge two cardiologists, who likely give new meaning to the term “heartfelt thanks.” The first is Donald C. Haas, co-director of the Comprehensive Heart Failure Program at the Abington Memorial Hospital in Abington, Pennsylvania.
Don was my father’s cardiologist and, I am proud to say, has become my friend. His expertise is exceeded only by his dedication and compassion. I will never forget
the day he drove to the hospital from New Jersey on his day off to check on my father’s condition and perhaps mine as well. The second is Erica C. Jones, formerly of the Weill Cornell Medical Center. She successfully diagnosed and treated my own health issue that arose while writing this book. Fortunately, she too is a superlative physician and risk manager who is devoted to her patients. William Osler, an icon at my father’s beloved Johns Hopkins University Medical School, clearly had physicians like these in mind when he remarked, “A good doctor treats the disease. A great doctor treats the patient.” Curious, intelligent, and motivated stakeholders are essential to developing effective security risk management strategies. I have been fortunate to work with two such individuals in my work as a consultant. Doug Maynard and McLean Pena are attorneys at Akin Gump Strauss Hauer & Feld who are both proponents and practitioners of rigorous security risk management. In addition to being a pleasure to work with, their insights have been instrumental in translating theory into practice. The Juilliard School is a world-renowned performing arts conservatory as well as a rewarding place to work. As with any institution, it has its share of idiosyncrasies that are ingrained in the culture. I am grateful to the vice president for Administration and General Counsel, Maurice Edelson, for helping me to navigate that culture and, most notably, for having faith in my abilities. I must thank Zhaodong Zheng (“Z”), a highly skilled Web technologist, who contributed a number of the figures. Her contributions are much appreciated, and I apologize if my random pleas for assistance have been onerous or in any way inconvenient. Finally, Annelies Kersbergen and Anitta Camilya of Springer deserve mention. Although we have never met in person, Annelies relieved much of the pressure by allowing me to miss (many) deadlines. 
I believe I was able to produce a worthwhile product as a result, but ultimately, that will be for others to decide. I appreciate her understanding and guidance throughout this process. Similarly, Anitta indulged my dilatory ways and undoubtedly devoted considerable time to addressing the proof corrections and other issues. Whatever this book may ultimately achieve, it will in no small way be due to her and her team’s considerable efforts.
Introduction
The objective of this text is to provide the conceptual foundation of security risk assessments. This task may appear straightforward, but it is complicated by the nature of risk itself, which might explain why theoretical treatments of security risk are relatively rare. Risk is multi-faceted, and assessments are often incomplete since only one of its three components is evaluated. Moreover, risk actually describes a relationship between the elements of a threat scenario. Therefore, issues affecting its magnitude can be subtle, and assessment results can easily be misinterpreted.

A conceptual foundation sounds abstract, but it actually provides a practical framework for problem-solving. From this framework emerges an assessment process that can be applied to any threat scenario. Importantly, the theory enables generalizations about the magnitude of risk, which in turn facilitates comparisons of diverse threat scenarios and the prioritization of security controls.

Ironically, a fundamental problem in assessing security risk is a limited number of threat incidents. The situation is exacerbated by misconceptions about probability and statistics. The suggestion that an absence of incidents is a handicap seems analogous to claiming a lack of disease inhibits the practice of medicine. Although somewhat impolitic, such a statement would be technically correct. Today’s medical patients benefit from the collective misery of their antecedents whose ailments have yielded valuable data for both researchers and clinicians.

Medicine and security risk management are similar in that they both assess the magnitude of risk albeit within dissimilar contexts. Although there are obvious differences in the two fields, the universal nature of risk ensures that the risk assessment process in each case is identical. In fact, at a high level all risk problems are equivalent. The differences and similarities of medicine and security risk management are worth exploring.
In medicine, the threats are disease or injury, and the entity affected by these threats is the human body. Human anatomy and physiology are fortunately relatively similar within any given population. This similarity is what enables medical practitioners and medications to be generally effective despite obvious differences in our respective phenotypes and genotypes.
Consider if human anatomy and physiology varied significantly from person to person. In that case, bespoke treatments would be required for every individual. There could be no standard medical references since each patient would be the subject of a unique textbook. Moreover, identifying risk factors for diseases, which is essential to identifying treatments, would be impossible since each data sample would consist of a single individual. A limited number of drugs would be available because drug manufacturing would be highly unprofitable.

Another contributor to success in assessing health risk is the ability to conduct controlled experiments. These can isolate the effect of individual risk factors and determine the effectiveness of treatments. Statistical results gleaned from experiments enable generalizations about the magnitude of risk for the entire population. Controlled human trials and animal experimentation are the sources of much of this data.

In contrast, threat scenarios are varied, and the specific effect of individual risk factors on the magnitude of risk is often difficult to ascertain. For example, there are numerous risk factors for terrorism, and their respective contributions to the likelihood of future terrorism incidents are often impossible to quantify. Quantifying the likelihood of any future threat incident is impossible in the absence of statistics. Moreover, even if threat incidents have occurred, the conditions that spawned such incidents must be stable over relevant time scales in order to extrapolate to the future. However, the reality is that there are typically few comparable threat incidents, so any probability distribution based on this small sample would likely have a large variance. The upshot is that assessments of likelihood for many threat scenarios are inherently subjective. However, subjective conclusions based on objective risk criteria are as valid as estimates based on a sample of historical threat incidents.
The difference in each case is that a qualitative estimate of risk inevitably results from the former and a quantitative estimate is possible for the latter.

Importantly, theory provides the basis for security-related decisions, and therefore it has both theoretical and practical consequences. In particular, a common frame of reference for assessing risk evolves from the theory, which is grounded in a set of core principles. These principles specify the building blocks that are common to all threat scenarios as well as the nature of the connections that link each block. The implications are profound: all security risk problems are equivalent and the general approach to security risk assessments is always the same.

It might be difficult to appreciate theory in a field that often demands decisive and immediate action. Moreover, if the theory seems too disconnected from reality it will surely and perhaps justifiably be ignored. Therefore, explicit connections to the real world are required to demonstrate relevance as well as to facilitate comprehension. In that vein, my undergraduate mathematics professor, the late Gian-Carlo Rota, urged his students to focus on examples rather than theorems. He believed it easier to extrapolate from the tangible to the abstract rather than the other way around. Life lessons gleaned from experience have since confirmed the wisdom of his insight.

This book attempts to explain theory by providing real-world examples in addition to some admittedly not-so-real ones. The latter are frequently quite
scenario-specific and therefore not particularly applicable in general. Nevertheless, they serve a purpose, which is often to demonstrate the power inherent in certain assumptions, and the applicability of various methods once such assumptions are accepted.

Concepts relating to basic probability and statistics must accompany any theoretical treatment of security risk assessment. This requirement is often driven by the need to assess the likelihood of a future incident of a specific type or possessing a particular feature should it occur. However, it is important to understand when such methods are actually applicable. Probability and statistics can provide quantitative insights that are unobtainable otherwise, but they are not applicable to every threat scenario.

Analogies with various physical phenomena are presented throughout the text. Examples such as constructive and destructive interference are useful because they provide visual representations of abstract concepts. However, it would be a mistake to interpret these analogies too literally. That said, their prevalence suggests that perhaps security and science have deeper connections, which should be explored further.

The book has three parts consisting of 12 chapters in total. Part I, “Security Risk Assessment Fundamentals,” provides the building blocks of the theory of both security risk assessment and management. These fundamentals include definitions and concepts that are required to assess and manage security risk from first principles. Part II, “Quantitative Concepts and Methods,” describes the analytical machinery that is useful in estimating the magnitude of threat scenario risk. For readers disinclined to delve into the details, Part II can be skipped without severely compromising the key theoretical concepts. However, those readers can’t be given a complete pass if an in-depth understanding of security risk is the ultimate objective.
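The small-sample caveat raised earlier can be illustrated numerically. The sketch below is an assumption-laden toy, not a method from the text: it presumes incidents arrive as a Poisson process and uses a crude normal approximation that is unreliable for small counts, which is precisely the point being made.

```python
import math

def poisson_rate_interval(incidents, years, z=1.96):
    """Approximate 95% interval for an annual incident rate estimated
    from a historical count, assuming a Poisson process (variance
    equals the mean). Deliberately crude for small counts."""
    rate = incidents / years
    se = math.sqrt(incidents) / years  # standard error of the rate estimate
    return max(0.0, rate - z * se), rate + z * se

# Four incidents in ten years: a point estimate of 0.4 per year, but
# the interval spans roughly two orders of magnitude.
lo, hi = poisson_rate_interval(4, 10)  # ~ (0.008, 0.792)
```

With so few incidents the interval is nearly as wide as the estimate itself, so a “quantitative” answer here conveys little more certainty than a qualitative one.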
Part III, “Security Risk Assessment and Management,” explores topics intended to round out the theory and demonstrate its applicability. In particular, Part III specifies a model for threat scenario complexity that is derived from elementary principles of information theory. Metrics that point to systemic security risk issues are also presented in a later chapter. Part III segues from theory to practice by presenting a risk-based security risk management process that evolves directly from the theory. Descriptions of the individual chapters are provided next.

Chapter 1 is entitled “Definitions and Basic Concepts,” and as its name suggests, it specifies the basic definitions and concepts of risk and security risk assessment. The pillars of the theory that include threat scenarios, the components of risk, and risk factors are also introduced in this chapter. The distinction between probability and potential is explained, which is a key facet of the theory and represents the predicate for many of the methods described in later chapters.

Chapter 2 is entitled “Risk Factors,” which is arguably one of the most important concepts in any field of risk management. Risk factors determine the magnitude of risk and mediate the relationship between threats and affected entities. Five types of risk factors are identified, and the relevance of features specific to each type is explained.
“Threat Scenarios” is both the title of Chap. 3 and the focus of all security risk assessments. Five threat scenario categories are identified and explained in detail. Phenomena specific to each category plus a security risk assessment taxonomy are also presented in this chapter.

Chapter 4, “Risk, In Depth,” discusses some of the most significant themes pertaining to the theory. Many of the discussions in later chapters build on the topics discussed here. Key topics in Chap. 4 include risk universality, threat scenario equivalence, uncertainty, quantifying security risk, the effect of time on risk, direct and indirect assessments of likelihood, and risk relevance.

Chapter 5, “The (Bare) Essentials of Probability and Statistics,” is the first chapter of Part II. It provides some of the concepts and methods that enable quantitative assessments of the likelihood component of risk. Although only the most basic statistical concepts are included, these are sufficient for assessing the likelihood component of risk for most threat scenarios. Ultimately, a basic familiarity with the fundamentals of probability and statistics as well as their limitations is key to rigorous assessments of likelihood. Perhaps most importantly, the content in this chapter gives an appreciation for the inherently statistical nature of the theory, and security-related examples demonstrate the relevance of specific methods.

“Identifying and/or Quantifying Risk-Relevance” is the title of Chap. 6. This chapter complements Chap. 5, with a focus on methods that are applicable to quantifying security risk and resulting metrics. These topics, which include trends, time series, and correlations, are potentially relevant to all three components of risk. The chapter introduces the calculus, which is ubiquitous in traditional science and engineering disciplines but also has relevance to security risk assessment.
More esoteric topics such as the random walk are discussed along with examples that demonstrate their potential, if narrow, applicability.

Chapter 7 is entitled “Risk Factor Measurements.” This chapter provides examples of analytic methods applied to all three components of risk. Sections of the chapter focus on specific threat scenarios; they are organized according to the risk factor categories specified in Chap. 2 and further organized according to the components of risk. Chapter 7 includes analyses of the effect of risk factor changes, which is the basis for indirect assessments of the likelihood component of risk. It also introduces fundamental concepts such as the statistical uncertainty associated with multiple risk factors, the confluence of likelihood risk factors, and risk measurements in the time and frequency domains.

Chapter 8, “Elementary Stochastic Methods and Security Risk,” focuses on probabilistic methods in estimating the likelihood component of risk. The applicability of these methods is predicated on the assumption that a threat incident behaves like a random variable. Additional details associated with probability and their applicability to threat scenarios are also discussed.

“Threat Scenario Complexity” is the subject and title of Chap. 9. Complexity affects most real-world threat scenarios, and is often a significant contributor to the likelihood component of risk. Notwithstanding its prevalence, complexity is not addressed in traditional security risk assessments, and is often excluded from
academic treatments of security risk. A model of threat scenario complexity derived from elementary information theory is presented, which leads to metrics that enable comparisons of its magnitude across diverse threat scenarios.

Chapter 10 is entitled “Systemic Security Risk.” Five threat scenario metrics are identified, which relate to the spatial distribution and temporal history of risk factors. Each metric is indicative of the overall approach to security risk management, which exposes potential systemic issues and the need for cultural change.

Chapter 11, “General Theoretical Results,” identifies, organizes, and summarizes some of the significant theoretical results. Most importantly, these results include the core principles that represent the crux of the theory. This chapter also specifies metrics and thresholds that could be incorporated into security policies and standards. The content is organized according to the threat scenario categories identified in Chap. 3.

Chapter 12, “The Theory, In Practice,” is the final chapter. It synthesizes the material developed in the preceding chapters to reveal the logic and sequencing of security risk management efforts. The fundamentals of security risk assessments are reviewed, which dovetails with the security risk management process that naturally evolves from the assessment fundamentals. Two detailed examples are presented, which help explain how the theory is applied to actual threat scenarios.

Finally, we note that the terms “security risk” and “threat scenario risk” are used interchangeably throughout the text. This is admittedly less than ideal, but it should not cause confusion since their interchange has no effect on the theory. Threat scenarios are the focus of any security risk assessment so the terms “security” and “threat scenario” are closely related if not completely interchangeable.
Security risk is part of the vernacular whereas the use of threat scenario risk is more consistent with a formal treatment. In general, we will use the latter term but recognize the two are functionally equivalent.
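Several later chapters lean on information-theoretic measures of threat scenario complexity. As background only, and not as the book’s specific complexity metric, the snippet below shows the generic Shannon entropy on which such models typically rest; the eight-configuration example is an illustrative assumption.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits: H = -sum over p of p * log2(p)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Eight equally likely scenario configurations carry 3 bits of
# uncertainty; concentrating probability on one configuration lowers it.
uniform = shannon_entropy([1 / 8] * 8)            # 3.0 bits
skewed = shannon_entropy([0.9] + [0.1 / 7] * 7)   # well under 1 bit
```

Higher entropy means more configurations an assessor must account for, which is one intuition behind treating complexity as a contributor to the likelihood component of risk.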
Contents

Part I  Security Risk Assessment Fundamentals

1  Definitions and Basic Concepts  3
   1.1  Introduction to Risk and Risk-Relevance  3
   1.2  Threat Scenarios and the Components of Risk  9
   1.3  The Risk Meter  11
   1.4  Introduction to Risk Factors  13
   1.5  Threat Incidents and Risk Factor-Related Incidents  16
   1.6  Probability v. Potential  17
   1.7  The Fundamental Expression of Security Risk  26
   1.8  Absolute, Relative and Residual Security Risk  27
   1.9  Summary  30

2  Risk Factors  31
   2.1  Introduction  31
   2.2  Definitions and Examples  32
   2.3  Apex Risk Factors  36
   2.4  Spatial Risk Factors  39
   2.5  Temporal Risk Factors  40
   2.6  Behavioral Risk Factors  42
   2.7  Complexity Risk Factors  43
   2.8  Inter-related Risk Factors  43
   2.9  Risk Factor Scale and Stability  44
   2.10  Summary  47

3  Threat Scenarios  49
   3.1  Introduction  49
   3.2  Static Threat Scenarios  51
   3.3  Dynamic Threat Scenarios  52
   3.4  Behavioral Threat Scenarios  52
   3.5  Complex Threat Scenarios  53
   3.6  Random Threat Scenarios  53
   3.7  Maximum Threat Scenario Risk  54
   3.8  General Threat Scenario Phenomena  56
   3.9  A Security Risk Assessment Taxonomy  58
   3.10  Summary  60

4  Risk, In-Depth  61
   4.1  Introduction  61
   4.2  Threat Scenario Equivalence and Risk Universality  63
   4.3  Direct and Indirect Assessments of Likelihood  69
   4.4  Sources of Uncertainty in Estimating Likelihood  71
   4.5  Time and Risk  74
   4.6  Risk-Relevance  78
   4.7  The Confluence of Likelihood Risk Factors  79
   4.8  Summary  81

Part II  Quantitative Concepts and Methods

5  The (Bare) Essentials of Probability and Statistics  85
   5.1  Introduction  85
   5.2  Probability  87
   5.3  Average, Standard Deviation, Variance and Correlation  91
   5.4  The Normal and Standard Normal Distributions  93
   5.5  The Z-Statistic  98
   5.6  Statistical Confidence and the p-value  99
   5.7  The Poisson Distribution  106
   5.8  Value-at-Risk  108
   5.9  Summary  110

6  Identifying and/or Quantifying Risk-Relevance  111
   6.1  Introduction  111
   6.2  Linearity, Non-linearity and Scale  112
   6.3  Density  120
   6.4  Trends and Time Series  121
   6.5  Histograms  123
   6.6  Derivatives and Integrals  125
   6.7  Correlation and Correlation Coefficients Revisited  127
   6.8  Exponential Growth, Decay and Half-Value  128
   6.9  Time and Frequency Domain Measurements  132
   6.10  Summary  135

7  Risk Factor Measurements  137
   7.1  Introduction  137
   7.2  Spatial Risk Factor Measurements  138
   7.3  Temporal Risk Factor Measurements  148
   7.4  Behavioral Risk Factor Measurements  152
   7.5  Multiple Risk Factors and Uncertainty in Security Risk Management  153
   7.6  Summary  155

8  Elementary Stochastic Methods and Security Risk  157
   8.1  Introduction  157
   8.2  Probability Distributions and Uncertainty  160
   8.3  Indicative Probability Calculations  163
   8.4  The Random Walk  171
   8.5  The Probability of Protection  172
   8.6  The Markov Process  175
   8.7  Time-Correlation Functions and Threat Scenario Stability  179
   8.8  The Convergence of Probability and Potential  185
   8.9  Summary  187

Part III  Security Risk Assessment and Management

9  Threat Scenario Complexity  191
   9.1  Introduction to Complexity  191
   9.2  Background  192
   9.3  Complexity Combinatorics  195
   9.4  Information Entropy  200
   9.5  Estimates of Threat Scenario Complexity  207
   9.6  Complexity Metrics  212
   9.7  Temporal Limits on Complexity  215
   9.8  Managing Threat Scenario Complexity  216
   9.9  Summary  218

10  Systemic Security Risk  221
    10.1  Introduction  221
    10.2  The Risk-Relevance of Assets and Time  222
    10.3  Spatial Distribution of Risk Factors: Concentration and Proliferation  223
          10.3.1  Concentration  223
          10.3.2  Proliferation  224
    10.4  Temporal History of Risk Factors: Persistence, Transience and Trending  224
          10.4.1  Persistence  225
          10.4.2  Transience  226
          10.4.3  Trending  227
    10.5  Summary  228

11  General Theoretical Results  231
    11.1  Introduction  231
    11.2  Core Principles  231
    11.3  Random Threat Scenario Results  234
    11.4  Static and Dynamic Threat Scenario Results  235
    11.5  Complex Threat Scenario Results  237
    11.6  Summary  239

12  The Theory, in Practice  241
    12.1  Introduction  241
    12.2  The Security Risk Management Process  242
    12.3  Applying the Theory (1): Information Security Threat Scenarios  246
    12.4  Applying the Theory (2): Password Cracking  251
    12.5  A Revised Fundamental Expression of Security Risk  257
    12.6  Testing for Encryption  260
    12.7  The Security Control/Risk Factor Ratio (C/R)  260
    12.8  Cost and Constraints in Security Risk Management  261
    12.9  Low Likelihood-High Impact Threat Scenarios  262
    12.10  Summary  264

Epilogue  267
Appendices  270
About the Author
Carl S. Young specializes in applying science to information and physical security risk management. He has held senior positions in government, the financial sector, consulting, and academia. He is the author of three previous textbooks in addition to numerous technical papers and has been an adjunct professor at the John Jay College of Criminal Justice (CUNY). Young earned undergraduate and graduate degrees in mathematics and physics from the Massachusetts Institute of Technology (MIT).
Part I
Security Risk Assessment Fundamentals
Chapter 1
Definitions and Basic Concepts
1.1 Introduction to Risk and Risk-Relevance
Remarkably, humans make many decisions with relatively little deliberation. Issues are routinely resolved with minimal effort, and include such diverse events as crossing the street, ordering from a menu, choosing a pair of socks, buying a house and selecting a spouse. It is tempting to believe that humans are hard-wired for decision-making and to speculate that this capability has evolved over the millennia via natural selection. Decisions that were necessary to stay alive would have been critical to the survival of our ancestors. To put it bluntly, the outcome of many decisions likely meant the difference between eating and being eaten. The ability to consistently make good decisions might be one reason Homo sapiens survived and other species perished. It is difficult to separate decision-making from thinking itself given the role decisions play in converting thoughts into actions. It is plausible that a key feature of human thought is a robust decision-making capability, which seems like a logical adjunct if not inherent to enhanced cognition. Biological and social scientists generally agree that the human brain and its cognitive agent, the mind, evolved via a combination of natural selection and cultural influences.1 However, it appears no one completely understands the relative contributions of culture versus biology in shaping the mind. The issue was famously discussed by Descartes (“Cogito ergo sum”) and contemplated by Plato. A passage from the preeminent biologist Edward O. Wilson eloquently expresses the complex interplay between culture and the human brain2:
1. Edward O. Wilson, Consilience: The Unity of Knowledge (Chapter 7: From Genes to Culture); Random House/Vintage Books, New York, 1998.
2. Ibid.

© Springer Nature Switzerland AG 2019
C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_1
The brain constantly searches for meaning, for connections between objects and qualities that crosscut the senses and provide information about external existence. We penetrate that world through the constraining portals of the epigenetic rules. As shown in the elementary cases of paralanguage and color vocabulary, culture has risen from the genes and forever bears their stamp. With the invention of metaphor and new meaning, it has at the same time acquired a life of its own. In order to grasp the human condition, both the genes and culture must be understood, not separately in the traditional manner of science and the humanities, but together, in recognition of the realities of human evolution.
To this author, who admittedly has zero expertise in the area of human cognition, the decision-making process seems instinctual in a way that resembles language acquisition.3,4 Humans are not taught decision-making any more than they are taught how to walk or speak a language. The process is inherent to every sentient human and is apparent at an early age. Judgment and experience modulate decision-making as humans mature. Perhaps humans developed a simple decision-making algorithm that was easily processed by a relatively primitive brain. Even a nascent decision-making capability might have marginally increased chances of survival. Alternatively, human neurological circuitry may have evolved so that our brains became capable of processing an effective decision-making capability. Whatever the explanation, a structured decision protocol might have given Homo sapiens an evolutionary edge, and thereby helped to avoid a catastrophic winnowing of the species. Thankfully, most of today’s decisions do not affect personal survival much less the future of human civilization. Bad decisions do not typically carry the same evolutionary penalty they once did. However, an ironic consequence of the intellectual ascent of humans is that the results of certain decisions could lead to the destruction of civilization. Their impact mandates intelligent, responsible and ethical decision-makers in addition to rigorous assessments of risk. The previous discussion raises the questions of what actually constitutes a decision and what the individual decision criteria are. Simply put, a decision is a choice between possible outcomes, where the relative “goodness” and “badness” of the outcomes are evaluated as part of the decision process. Every choice has at least two possible outcomes where the consequences of each outcome vary. Therefore, inherent to the decision process are the concepts of relative goodness and badness. Implicit in the word threat is the notion of a bad outcome.
But to reiterate, “bad” is a relative term, and a choice by definition involves a gradient of outcomes whose relative goodness or badness depends on one’s perspective and the decision details. Therefore, any process yielding a spectrum of possible outcomes can be considered a form of “threat” in the most general sense since some of the outcomes are relatively bad. Simply put, the objective of every decision is to optimize process outcomes, which is tantamount to reducing the likelihood, vulnerability and impact of a
3. S. Pinker, The Language Instinct: How the Mind Creates Language, Harper Collins, 2007.
4. The notion that language was instinctual was first posited by Noam Chomsky as part of the theory of generative grammar.
relatively bad outcome. We will see shortly that these criteria are the same ones used to assess risk. Therefore, we conclude that every decision is a form of risk assessment.

At a high level, the context for both risk assessments and decision-making consists of three elements: a threat, an entity affected by that threat and the environment where the threat and entity interact. The context for decisions and risk assessments is a threat scenario. As noted above, the criteria for decision-making and assessing risk are identical. We will now review those criteria in more detail.

The likelihood of a threat scenario outcome, the magnitude of the effect of a threat scenario outcome, and the significance of a threat scenario outcome are the three (universal) criteria used to both assess risk and make decisions. These criteria specify the nature of the relationship between a specific threat and a particular entity affected by that threat, which exists within the context of a given threat scenario. In the absence of such a relationship a threat is irrelevant to an entity within a defined threat scenario. The magnitude of this relationship is precisely the risk associated with that threat scenario.

Deciding when to cross the street is illustrative of the equivalence of decisions and risk assessments, and provides an introduction to the three decision-making/risk assessment criteria noted above. Pedestrians who wish to enjoy continued good health focus on estimating the likelihood of a violent encounter with an approaching vehicle as they cross the street. That estimate is based on judgments regarding the approaching vehicle’s distance and speed relative to the pedestrian’s speed. The likelihood of a threat incident, which in this particular instance is a collision between the vehicle and the pedestrian, is a criterion common to both decision-making and security risk assessments. Note that the purpose of a traffic signal is to eliminate the need for such estimates.
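The pedestrian’s likelihood judgment amounts to comparing two times: how long the vehicle needs to reach the crosswalk versus how long the crossing takes. The following is only an illustrative sketch of that comparison; the function name, the speeds, the distances and the safety margin are all invented for this example and do not come from the text.

```python
# Illustrative sketch of the pedestrian's likelihood estimate:
# compare the time the vehicle needs to reach the crosswalk with
# the time the pedestrian needs to cross. All figures are hypothetical.

def crossing_is_prudent(vehicle_distance_m, vehicle_speed_mps,
                        road_width_m, walking_speed_mps,
                        safety_margin_s=3.0):
    """Return True if the pedestrian clears the road with a margin to spare."""
    time_for_vehicle_to_arrive = vehicle_distance_m / vehicle_speed_mps
    time_to_cross = road_width_m / walking_speed_mps
    return time_for_vehicle_to_arrive >= time_to_cross + safety_margin_s

# A car 150 m away at ~13.4 m/s (30 mph); a 10 m road crossed at 1.4 m/s.
print(crossing_is_prudent(150, 13.4, 10, 1.4))  # True: the gap exceeds the crossing time
print(crossing_is_prudent(20, 13.4, 10, 1.4))   # False: wait for the vehicle to pass
```

A traffic signal, in these terms, replaces the error-prone estimate of the time gap with an enforced one.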
Humans can be overly optimistic in determining their chances of a safe crossing. They are also susceptible to influences that cloud their judgment such as the prospect of being late for a dental appointment. When evaluated in a more rational light, the choice between death and the dentist seems obvious. A slight miscalculation in crossing the street could be life-altering if not life-ending, whereas a few minutes more or less in the dentist’s chair will be relatively inconsequential.5 Given the overwhelming number of urban distractions as well as the ongoing competition between drivers and pedestrians, orderly traffic flow requires an enforceable process to regulate their interactions. Otherwise, the situation can quickly devolve into chaos as anyone who has witnessed a broken traffic signal at a busy intersection can attest. Even in New York City, where a traffic signal is viewed as more of a helpful suggestion than an absolute requirement, most New Yorkers would agree that complete autonomy in deciding when to cross the street could have disastrous consequences. Given the potential outcome resulting from an inattentive or overly
5. http://www.nyc.gov/html/dot/downloads/pdf/nycdot-pedestrian-fatalities-by-bike-motor-vehicle.pdf
aggressive driver, prudent pedestrians look both ways before stepping into the crosswalk even if a traffic signal is functioning properly. The inevitability of a bad outcome in any violent encounter with a moving vehicle obviates the need to estimate the second decision criterion: vulnerability. Vulnerability is the magnitude of loss or damage resulting from a decision or process outcome. At a high level there are two possible outcomes resulting from crossing the street. The first is the vehicle strikes a pedestrian, which will almost certainly yield physical impairment or the loss of life. The second outcome is collision avoidance, which nearly guarantees pedestrian health for at least the time spent in the crosswalk. Rarely does life offer such unambiguous choices. Therefore, an assessment of the vulnerability associated with the process of crossing the street reduces to a simple choice between two extremes, which is not much of a choice for anyone with a strong desire to live. Even the most impatient individual should be willing to sacrifice a few seconds in the interest of remaining injury-free. It is therefore a constant source of amazement to observe pedestrians regularly tempt fate by prematurely venturing into the crosswalk, sometimes preceded by strollers containing infants, for relatively little gain. The third decision criterion is impact, which is a seriously unfortunate term in this context. Impact is defined as significance-per-threat incident. The term significance has subjective overtones. Even reasonable people might disagree on the significance of a particular decision outcome or threat incident. In many threat scenarios, significance equals the monetary value of a particular outcome. For example, the significance of a theft threat scenario might be characterized by the value of stolen items. In such instances the impact risk-decision criterion can be characterized as the vulnerability-per-threat incident. 
Note that the vulnerability component of risk for a street-crossing threat scenario is always significant: injury or death awaits the careless pedestrian at a busy intersection. Therefore, the impact risk-decision criterion is also significant. Most pedestrians intuitively recognize this condition and therefore take the precaution of looking both ways before crossing. In addition, there is no way to reduce the magnitude of the impact decision criterion in any confrontation between pedestrian and moving vehicle. For example, it is impossible for a pedestrian to dilute the effect of a single collision by spreading out the injuries over multiple street crossings. Such an approach is obviously ridiculous. However, other types of threats allow for reductions in impact by reducing the loss-per-incident. For example, the financial sector blunts the effect of a dramatic downturn in one asset class by diversification. This phenomenon is colloquially referred to as “hedging one’s bet.” Unfortunately, the irrevocable outcome of a single violent encounter with a massive, fast-moving machine mandates that the same precautions be taken each time. Other threat scenarios further illustrate how the risk-decision process revolves around ensuring favorable outcomes, or conversely, avoiding adverse ones. Selecting an item from a restaurant menu might not appear to qualify as a threat scenario, but we now know that any process qualifies as a threat because of the spectrum of outcomes. That is, each menu selection represents a decision with
relatively good and bad outcomes just like any other decision or assessment of risk. In fact, ordering from a menu entails evaluating the same criteria as those used to cross the street. For example, the specter of disappointment looms over any restaurant patron confronted with a choice between lobster and steak. Disappointment is a form of loss, albeit not a material one. Ultimately, the patron’s selection is partly based on an estimate of the magnitude of disappointment resulting from choosing one option over another. Avoiding disappointment is equivalent to seeking happiness in this zero-sum decision process. In addition to estimating the magnitude of disappointment, the restaurant patron also assesses the likelihood of experiencing disappointment with respect to each option. Somehow taste, mood, hunger and other intangibles are evaluated to yield such an estimate. Previous experiences at this or similar eating establishments might also influence the decision. Of course, a meaningful quantitative estimate of the magnitude of disappointment, or any feeling for that matter, is not possible. A number could be assigned to rank disappointment that is based on some arbitrary scale such as when a doctor requests a rating of pain on a scale from one to ten. This assessment is useful as a relative guide, but it is not equivalent to a proper measurement using a calibrated instrument. Personal sensations and perceptions are subjective and therefore inherently qualitative, which is not necessarily an obstacle to decision-making. In fact, some of the most impactful decisions in life rely exclusively on sentiment, intuition and/or feelings of one kind or another. For example, most individuals do not base their choice of a spouse on quantitative metrics unless money somehow figures in the calculation. The challenge is to understand when quantitative methods are required and applicable, and to identify viable alternatives as necessary. 
Returning to the restaurant threat scenario, the disappointment resulting from choosing steak or lobster is fortunately confined to a single dining experience assuming there are no leftovers. Therefore, the impact decision criterion should not be remotely life altering. Anguishing over the choice between lobster and steak seems over-the-top precisely because the difference in the two outcomes is relatively trivial, especially in contrast with threat scenarios such as crossing the street. However, the underlying decision process is noteworthy irrespective of what is at stake (or at steak). The point is that the decision-making process is identical no matter how trivial or profound the consequences. Decisions of all types also occur in professional settings. Experts are constantly asked to assess possible threat scenario outcomes related to their particular area of expertise. For example, medical doctors assess the threat of disease, meteorologists assess the threat of storms and security professionals assess the threat of crimes. Professional certifications are sometimes required to ensure practitioners are properly trained, especially in professions where the loss associated with a single threat incident could be significant. Such certifications attest to a minimum level of competence that is affirmed by examination and/or relevant experience.
For example, no prudent individual would voluntarily fly on an airplane, undergo surgery or allow the only toilet on the premises to be fixed by anyone other than a qualified professional. A basic level of proficiency affirmed by objective criteria is required in professions where incompetence could have life-altering implications. Although airplane crashes, surgical mishaps and dysfunctional toilets might appear to have little in common, they all could result in significant damage or loss. We now know that the criteria for decision-making and assessing risk are identical, and are used to assess the likelihood, vulnerability, and impact of the spectrum of possible outcomes. The three risk assessment-decision criteria apply to any threat scenario. For example, the criteria used to determine the relevance of a disease to a community are identical to those used to assess the relevance of a terrorist group to that same community. In the former scenario, an epidemiologist is required to assess the magnitude of the three criteria associated with epidemics, and a counterterrorism expert would evaluate the same criteria for terrorism. To repeat for emphasis, notwithstanding the fact that the knowledge, skills and abilities required to assess various threat scenarios might be different, the assessment processes are identical. The brain surgeon and the plumber evaluate their respective threat scenarios in exactly the same way, which exemplifies the principle of threat scenario equivalence. Much of the theory of security risk assessment evolves from this principle. Although not generally viewed in this light, physicians are quintessential risk managers. The good ones can effectively assess so-called risk-relevance, which in this context is the unique relationship between a specific disease (threat) and a particular patient (affected entity). 
Later we will see that this assessment is partly based on statistics pertaining to other patients’ historical relationship to this disease. Medical diagnoses and treatments are exercises in assessing risk-relevance. However, medical knowledge is so vast that a single individual cannot possibly know the relevance of every symptom to every disease nor be aware of the appropriate treatment. As a result, sub-specialties have emerged, and each sub-specialist has an increasingly granular view of the medical landscape. The specialist is understandably more capable of diagnosing and treating diseases that relate to his or her sub-specialty. However, this specialization comes with a price. Difficulties sometimes arise when a disease affects multiple organ systems or when a patient suffers from multiple conditions. Such difficulties are amplified if no one is managing the patient at the enterprise level. Determining the relevance of a specific threat to a particular entity is the essence of a security risk assessment. Of course, there are many types of threats, and not every threat is relevant to every entity. The magnitude of risk is highly contextual. For this reason, threats cannot be evaluated in a vacuum. In fact, threats are mere abstractions in the absence of risk. Conversely, risk is meaningless unless the specific threat scenario elements are specified. In addition, risk-relevance could change with time or be affected by the environment in which the threat and entity interact. As noted above, the highly contextual nature of threat scenarios requires specific expertise to accurately assess risk-relevance and thereby identify appropriate remediation.
To illustrate the importance of context, consider a fire-related threat scenario. The magnitude of risk for a fire threat scenario depends on the environment where the fire is ignited. There are certainly general guidelines for fire safety, but a meaningful fire risk assessment requires an evaluation of a particular environment. The magnitude of the likelihood, vulnerability and impact criteria could vary significantly if the fire threat scenario is a forest versus a high-rise apartment building. Fire can be either deadly or lifesaving, even within the same physical setting but displaced in time. A campfire provides warmth in a forest in January. However, a flame in the same forest could destroy life and property during the summer months. The somewhat obvious conclusion is that assessing risk is impossible in the absence of context, which is represented by the threat scenario. In fact, risk is not meaningful without context since by definition it specifies the relationship between threat scenario elements.
1.2 Threat Scenarios and the Components of Risk
We first provide a high-level overview of the canonical threat scenario, which anticipates additional details provided later in this and subsequent chapters. The threat scenario is the focus of a security risk assessment and always consists of the following three elements:

• Threats
• Entities affected by threats
• The environment in which threats and entities interact

Threats are the progenitors of risk. There is no risk without the presence of a threat, and conversely, a threat is meaningless without risk, i.e., when a single component of risk is zero. We will investigate this statement more carefully because of its theoretical and practical implications. For now, we present a formal definition of a threat:

A threat to an entity is anything that results in relative harm, loss or damage to that entity. By definition, an entity is always “worse off” after experiencing a threat.
As noted in the first section, a multi-faceted feature called “risk” describes the relationship between threats and affected entities within the context of a threat scenario. Identifying the specific features of the threat scenario that affect this relationship is the crux of a security risk assessment. Figure 1.1 shows the canonical threat scenario structure. Threat incidents are what result from threat scenarios, and their number and/or distribution are determined by the risk factors. A formal definition of a threat scenario risk factor is provided in Chap. 2.
Fig. 1.1 The threat scenario
Importantly, adding, subtracting or otherwise changing one or more risk factors substantively changes the threat scenario. Threat incidents resulting from different threat scenarios must be considered dissimilar, and this dissimilarity has implications for assessing the likelihood component of risk. Each risk component and its definition are provided again below. To reiterate, their magnitude in aggregate determines the relevance of a specific threat to a particular entity in the context of a specific threat scenario.

1. Likelihood. The probability or potential for a threat incident, which is based on a probability distribution of historical threat incidents in the former case and changes to threat scenario risk factors or risk factor-related incident statistics in the latter.
2. Vulnerability. The loss experienced by a particular entity as a result of a specific threat incident.
3. Impact. The importance or significance-per-threat incident.

Let’s now discuss each of these components in more detail. The likelihood component of risk is unique among the three components. First, there are two distinct assessments of likelihood: probability and potential. The relevance of each depends on the threat scenario. It is essential to assess likelihood appropriately, which will depend on the available data, which in turn drives the type of uncertainty associated with the threat scenario in question. A quantitative estimate of the likelihood component of risk requires a probability distribution of historical threat incidents. Probability distributions are discussed in Chap. 5 but also appear throughout the text in a variety of contexts. Historical threat incidents might not have occurred, yet this condition does not imply zero risk. In such cases, it is necessary to determine the potential or tendency for a threat incident to occur.
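The three scenario elements and the three risk components can be collected into a simple data structure. The sketch below is purely illustrative — the class name, field names and numeric values are invented, not taken from the text — but it captures the stated rule that a threat is irrelevant to an entity whenever any single component of risk is zero.

```python
# Hypothetical sketch of the canonical threat scenario and its three
# components of risk. All names and values are invented for illustration.

from dataclasses import dataclass

@dataclass
class ThreatScenario:
    threat: str           # the progenitor of risk
    entity: str           # the entity affected by the threat
    environment: str      # where threat and entity interact
    likelihood: float     # probability or potential for a threat incident
    vulnerability: float  # loss to the affected entity from a threat incident
    impact: float         # significance-per-threat incident

    def is_risk_relevant(self) -> bool:
        # A threat is meaningless without risk: if any single component
        # is zero, the threat is irrelevant to the entity.
        return self.likelihood > 0 and self.vulnerability > 0 and self.impact > 0

# Anticipating the shark example of Sect. 1.3: the same threat and entity,
# but a different environment, yields a different magnitude of risk.
shark_in_ocean = ThreatScenario("great white shark", "swimmer", "open ocean",
                                likelihood=0.01, vulnerability=1.0, impact=1.0)
shark_on_land = ThreatScenario("great white shark", "hiker", "dry land",
                               likelihood=0.0, vulnerability=1.0, impact=1.0)
print(shark_in_ocean.is_risk_relevant())  # True
print(shark_on_land.is_risk_relevant())   # False
```

The point of the structure is contextuality: changing the environment (or any risk factor) produces a different threat scenario, and possibly a different answer.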
As noted in the definition, estimating potential entails measuring a change in the magnitude of a risk factor or leveraging the statistics of risk factor-related incidents. The vulnerability component of risk is the magnitude of loss or damage to an affected entity resulting from a threat incident. If a model for vulnerability can be established, it is sometimes possible to specify an absolute value for loss. Such models often relate to physical threats since physical threats frequently behave according to laws of nature. Impact is defined as the significance-per-threat incident. If significance is equivalent to the magnitude of loss, i.e., vulnerability, the impact becomes the
vulnerability-per-incident. The vulnerability and impact components are often closely related if not equivalent, recognizing that if the vulnerability component of risk is zero, the impact must also be zero. Confusion between the vulnerability and impact components of risk is common. The latter is a relative quantity and the former is absolute.

Impact is easily explained by invoking a nuclear weapon threat scenario.6 Conventional bombs are capable of inflicting the same destruction as nuclear weapons. However, in general it requires many more conventional bombs to inflict the same damage as a single nuclear weapon. There are also collateral health effects that persist after a nuclear bomb explodes. These effects are due to the radiation released during detonation. Therefore, the impact component of risk for a nuclear weapon would typically greatly exceed the impact of any conventional weapon.

Although vulnerability and impact are related, the security risk management strategy used to address each component is not necessarily the same. The example of dispersing soldiers on patrol as discussed in Chap. 7 is a strategy designed specifically to address the impact component of risk.
1.3 The Risk Meter
The mere existence of a threat does not mean it is actually threatening to a particular entity. Moreover, the magnitude of “threateningness” depends on the context. We now know that context refers to the risk-enhancing features of a specific threat scenario. For example, the threat posed by a great white shark to a human is significant if the two meet face-to-face in the ocean. The imposing features that are characteristic of all great white sharks are relevant whenever the threat (the shark) and affected entity (the human) are co-located in a specific environment (the ocean).

What particular features are relevant to the magnitude of risk for this threat scenario? The shark’s advantages in size, swimming prowess and dentition combine to make it threatening to any human as well as to nearly every other denizen of the ocean. However, what about the magnitude of risk in other contexts? For example, what if the threat scenario unfolds on dry land? In that case, a human would completely dominate the shark since the animal could neither move nor breathe. In fact, the absence of water, and salt water in particular, thoroughly negates the shark’s prodigious physical assets. Yet, if a great white shark and a human are swimming together in the ocean, the human would be completely at the shark’s mercy.

Although features of the threat scenario clearly affect the magnitude of risk, they do not necessarily affect the three components of risk equally. The vulnerability component of risk in the open ocean is always injury or death if the shark is in proximity to the swimmer. In contrast, the likelihood component of risk is affected by local environmental features.
6 A nuclear bomb is a weapon that uses either nuclear fusion or fission to generate the explosion.
For example, the murkiness of the water affects the shark’s ability to discriminate between various types of prey, the shark’s appetite at the time of an encounter depends on the availability of other prey, and the availability of a hiding place will vary with the local underwater terrain. Each of these features will affect the magnitude of the likelihood component of risk for a specific threat scenario.

It would be advantageous to actually measure the individual components of risk and thereby specify a figure for the overall magnitude of threat scenario risk. To that end, imagine it were possible to construct a “risk meter.” Such a device could measure the magnitude of risk for any threat scenario. What quantity would actually register on such a meter?

Everyone is familiar with a scale, which measures the weight of an object in the gravitational field of the earth. Weight is actually the force exerted by the earth on an object due to gravity and vice versa. Therefore, the force of gravity establishes a relationship, loosely speaking, between the earth and any object with mass. Weight, as distinguished from mass, is expressed in units of lb-force.7

By analogy, the risk associated with a threat scenario characterizes the relationship between a specific threat and a particular entity in the context of a threat scenario. By now it should be clear that this relationship is determined exclusively by the three components of risk. Therefore, a hypothesized security risk meter would register the overall magnitude resulting from the individual components of risk in some suitable system of units. This risk meter would produce a risk measurement for a particular threat-entity pair, which would vary according to the threat scenario. Analogously, the weight of a person on the moon is different from what it is on earth since the force of gravity differs in each environment. Note the mass of the individual in both environments is the same.
Unfortunately or not, a risk meter does not exist. If one did, it would be extraordinarily convenient, although it would also obviate the need for security risk assessments as well as this book. There are a number of reasons why such a meter cannot exist.

For starters, the three components of risk are fundamentally different and would require different units of measurement. The meter must measure all three components if it is to provide a complete characterization of the relationship between a specific threat and a particular entity within the context of a given threat scenario. It is unclear how the disparate components could yield a single measurement.

Perhaps most significantly, the magnitude of risk as measured by a risk meter would vary depending on where the measurement is made. Importantly, a change in context would not have a predictable effect on the magnitude of risk as it does when
7 The slug is the unit of mass in the US system of units when the pound-force is used as the unit of force, i.e., weight. One slug has a mass of 32.17 lb if the pound is taken as a unit of mass. In other words, if the pound is a unit of force, the unit of mass is the slug.
using a scale. For example, the gravitational acceleration on the earth and on the moon is known, and so a scale can be appropriately calibrated for each environment. It is unclear how a risk meter could be calibrated and thereby adjusted to environmental changes. Note this situation does not simply mean the meter registers a different value in a different environment. The nature of the measurement itself is dependent on the environment. Therefore, risk readings would be inconsistent, thereby rendering comparisons meaningless.

Although a universal risk measurement is not realistic, this does not alter the fact that the three components do collectively describe the relationship between threats and affected entities. There is no requirement to specify a single numerical value for the overall magnitude of risk in the same way we measure weight. Moreover, it is possible to describe that relationship in qualitative terms without loss of meaning or operational utility.

The three components describe the relationship between threats and entities in much the same way that physical properties such as color, shape, charge, resistivity, capacitance, weight and mass describe an object in the real world. There are many physical properties that describe physical objects. In contrast, there are only three components of risk, and these provide a complete description of its magnitude. An object confined to the physical world doesn’t exist in the absence of observable properties. Such properties are what enable detection and hence signify a physical presence. Analogously, a threat is merely an abstraction, i.e., is not threatening, if any one of the three components is zero.
1.4 Introduction to Risk Factors
Let’s review some of what we have learned thus far. Threats and entities affected by threats, i.e., “affected entities,” are linked via a quantity called “risk.” Risk always consists of three components. Although the components of risk are universal, a particular threat-entity relationship is scenario-specific. A threat scenario feature that increases the magnitude of risk is said to be risk-relevant. Risk-relevant features of a threat scenario are known as risk factors. There are multiple categories of risk factors, which are described in Chap. 2.

For example, smoking cigarettes is a behavioral risk factor affecting the likelihood component of risk for lung cancer, emphysema, cardiovascular disease and a host of other diseases. Statistical studies involving large populations of smokers have confirmed that the likelihood of developing one or more of these diseases increases with the number of cigarettes smoked. Figure 1.2 provides quantitative evidence linking the likelihood of developing lung cancer to the duration of a smoking habit.8
8 P. Vineis et al., Journal of the National Cancer Institute, 2004; 96:99.
Fig. 1.2 Cumulative risk of developing lung cancer from smoking cigarettes
In some threat scenarios the relevance of a risk factor is clear: a meteorologist would focus on a violent storm approaching a populated city rather than a high-pressure system located far out to sea. Modestly competent physicians do not screen women for prostate cancer. Villagers in a remote part of the Arctic would mostly ignore the threat of anti-Western terrorists in favor of marauding polar bears.

The identification of risk factors is a principal focus of medical research for good reason. Medical scientists attempt to prevent diseases linked to risk factors and determine the efficacy of treatments. Great progress has been made in combating many diseases, which has resulted in increased longevity and a better quality of life in countries that are able to take advantage of these advances. In defiance of logic and their self-interest, people often ignore one or more risk factors for a disease until it is too late.

Consider Joe, a middle-aged male who leads a remarkably sedentary life. Joe smokes two packs of cigarettes a day and his culinary proclivities lean toward French fries, fried onion rings, potato chips, ice cream, and greasy double cheeseburgers. His blood cholesterol and triglyceride readings are sky-high even by the most generous standards. Joe spends many of his waking hours watching sports on his big screen, high definition television while reclining on his comfortable living room couch. In the course of a single game he routinely smokes a pack of cigarettes and consumes a six-pack or two of beer, which has contributed to his prodigious belly fat. Joe’s father died of a heart attack at age 45, and Joe is roughly 50 pounds overweight at age 40.
Joe is the veritable poster child for heart disease due to the simultaneous presence of multiple risk factors. The probability that he will live a life free of significant medical issues while maintaining his current lifestyle is equivalent to the chances of winning the lottery, i.e., infinitesimally small. Indeed, some people do win the lottery. Likewise, Joe could reach one hundred years of age without significant health problems. However, this outcome is quite unlikely based on a preponderance of data derived from extensive medical studies.9 In fact, sufficient data exist to construct a probability distribution of threat incidents as a function of the various risk factors for heart disease. Joe’s prospects for a healthy future can be determined with statistical confidence because of an abundance of historical data. Critically, those results can be generalized to any person possessing one or more of the identified risk factors. The reality is that the collective misery of millions of overindulgent couch potatoes has enabled accurate estimates of the likelihood that the same fate will befall Joe. Indolence, smoking, excessive body mass, high blood pressure, a family history of heart disease and elevated blood cholesterol are now confirmed risk factors for heart attacks and strokes. Historical data coupled with the innate similarity of humans enable generalizations about the likelihood of you, me and Joe succumbing to heart disease. Now consider Joe’s friend, Jim. Jim is a rail-thin, 50-year-old male who runs 30 miles-per-week, has never smoked tobacco, drinks a glass of wine or two each week, consumes a low-dose statin each day to lower blood cholesterol and assiduously avoids fatty foods. Jim’s lipid profile is well below the levels recommended by cardiologists. Each of Jim’s parents is over 80 years old and displays no sign of heart disease. 
In stark contrast with Joe, Jim’s likelihood of experiencing a heart attack is low because he possesses none of the known risk factors. Yet despite Jim’s ultra-healthy lifestyle, excellent blood chemistry and favorable genetics, he too could suffer a heart attack. As is the case with Joe, the likelihood is inherently statistical, which means the risk is determined from a probability distribution of like entities, i.e., other humans. However, the odds of a heart-healthy future are stacked heavily in favor of Jim and against Joe. Recognize that this particular threat scenario relates only to heart disease. The likelihood component of risk for these two entities might be different for cancer threat scenarios. For example, if Jim spent 20 years working around asbestos, a risk factor for mesothelioma, or he has a genetic predisposition for other cancers, the magnitude of the likelihood component of risk for such threat scenarios might exceed Joe’s. However, the unfortunate reality is that Joe is unlikely to live long enough to be affected by cancer. In a previous text, this author used examples of “psychotic squirrels” and “sociable sharks” to describe statistical outliers.10 Few people expect to encounter squirrels that spontaneously attack humans or sharks that are particularly friendly to
9 https://www.framinghamheartstudy.org/
10 C. Young, Metrics and Methods for Security Risk Management; Syngress, Waltham, MA, 2010.
members of our species. Yet reports of a squirrel attacking humans in New York City clearly demonstrate that such incidents do occur.11 As of this writing there do not appear to be documented instances of a domesticated shark.

In general, if threat scenario risk factors are stable, the likelihood that a future threat incident, should it occur, will possess a particular characteristic can be determined from a probability distribution of historical threat incidents. Unfortunately or not, the number of similar security-related threat incidents is usually low. In addition, threat scenario risk factors can vary significantly and over relatively short time scales. These two conditions play a critical role in the assessment of the likelihood component of risk for a given threat scenario.

Risk factor stability occupies an important place in the theory of security risk assessment, and in particular the likelihood component of risk. If a risk factor is not stable over relevant time scales, threat incidents linked to that risk factor cannot be included in the same probability distribution. In the absence of a probability distribution, the probability of a future threat incident possessing a particular characteristic cannot be calculated. In such instances, other methods are required to assess the magnitude of likelihood, which is a topic we will explore in considerable depth.
1.5 Threat Incidents and Risk Factor-Related Incidents
Recall that risk factors link threats and entities affected by those threats. Threat incidents are the manifestations of threat scenarios, and they confirm the existence of likelihood risk factors except in the relatively rare case when threat incidents occur spontaneously and without provocation. A principal challenge in any security risk assessment is to identify the risk factors and their relevance within a given threat scenario. However, recognize that the converse is not necessarily true; the absence of a threat incident does not imply an absence of risk factors. Difficulties in assessing the magnitude of risk often arise precisely because there are no threat incidents yet threat scenario risk factors persist.

Moreover, incidents that relate to a risk factor are risk-relevant but do not necessarily qualify as actual threat incidents. For example, unauthorized physical access to a restricted space is a risk factor for a host of threats. Numerous incidents of unauthorized physical access increase the likelihood component of risk, yet we cannot specify a quantitative relationship. Nor can we draw the same conclusions about the magnitude of the likelihood component of risk from threat scenarios with a history of actual threat incidents relative to those where only risk factor-related incidents have occurred. Different approaches are required to assess the likelihood component of risk for threat scenarios characterized by risk factor-related incidents versus those where actual threat incidents have occurred. The difference in approach ultimately derives
11 Squirrel Attacks in Prospect Park Lead to Worry of Rabies, New York Times, July 23, 2017.
from the nature of uncertainty inherent to each of these threat scenarios. The resulting dichotomy in the assessment of likelihood is one of the most significant aspects of the theory of security risk assessment.
1.6 Probability v. Potential
Estimating the likelihood component of risk is a principal focus of security risk assessments. The answer to the seemingly simple query, “What is the likelihood of a future threat incident?” is not always so simple. There are two distinct approaches to answering that question, and the choice of approach depends on the available data.

The meaning of likelihood is itself a traditional source of confusion. Likelihood typically conjures up a prediction not unlike those gleaned from tarot cards or crystal balls. In these mystical performances, information about the past is used to enhance the credibility of a process that is entirely divorced from science. Any connection between the past and the future as determined by these devices is speculative at best, but paying customers are predisposed to believe otherwise.

Specific information about the past is required to make statements about the likelihood of future threat incidents. In particular, information about previous threat incidents must be available. Without such data there is no hope of determining the probability of a future incident. For example, a statement such as “The probability of a future terrorist incident is 24 percent” is in general meaningless, and the reason is linked to the definition of a probability. The likelihood component of risk is less a prediction of the future than it is a reflection of the past. In the absence of data about previous incidents, one can only make inferences about the tendency for future incidents to occur based on features of the threat scenario or on risk factor-related incidents.

A quantitative assessment of likelihood, i.e., a probability, is predicated on the statistics of actual threat incidents. As we will learn in Chap. 5, a probability is a fraction, which represents a subset of a distribution, and where the sum of the fractions/probabilities equals one.
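This normalization requirement can be sketched in a few lines; the incident counts below are hypothetical and serve only to illustrate how a distribution of counts becomes a distribution of probabilities:

```python
from collections import Counter

# Hypothetical counts of historical threat incidents, grouped by the
# number of risk factors present when each incident occurred.
incident_counts = Counter({0: 5, 1: 20, 2: 45, 3: 30})

total = sum(incident_counts.values())

# Each probability is the fraction of the distribution possessing a
# given characteristic; the fractions must sum to one.
probabilities = {k: n / total for k, n in incident_counts.items()}

assert abs(sum(probabilities.values()) - 1.0) < 1e-12
print(probabilities[2])  # fraction of incidents with two risk factors: 0.45
```

Any figure quoted as "the probability of a future incident" should be traceable to a fraction of such a distribution; otherwise it fails the normalization test described above.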
A distribution of probabilities must be normalized to one since the sum of the constituent parts of the distribution must equal the entire distribution. Therefore, a probability greater than one does not make sense. The normalization requirement places an unqualified burden on statements about the likelihood component of risk and specifically the probability of a future event. For example, a probability distribution of cardiac events, i.e., heart attack or stroke, can be formulated in terms of risk factors. The fractions of the distribution, i.e., the individual probabilities, relate to one risk factor, two risk factors, etc. A “prediction” about Joe-the-couch potato’s cardiac future can be made based on his risk factors, which in turn is based on the statistical evidence. Figure 1.3 illustrates
Fig. 1.3 Lifetime risk of death due to heart disease as a function of age and the number of risk factors
the cumulative risk of death as a result of a heart attack or stroke relative to the number of risk factors each study participant possesses.12 The statistics suggest the cumulative risk of death due to heart disease with two or more risk factors is about 15% by age 70.13 However, Fig. 1.3 cannot predict the magnitude of Joe-the-couch-potato’s risk with absolute certainty. Fundamentally, such predictions are generalizations about Joe based on the experience of many Joe-equivalents.

We could construct a similar probability distribution for security-related threat incidents. For example, incidents relating to thefts could be parameterized in terms of various risk factors, e.g., incidents of unauthorized access, the value of objects exposed to theft, etc. A probability distribution can be formulated that specifies the probability of a theft as a function of one or more risk factors. Similarly, if a probability distribution of historical thefts is formulated in terms of the value-per-theft, we could determine the probability that a particular theft chosen at random from the distribution would be more or less than a particular value. Therefore, if the conditions affecting thefts do not change, it is possible to “predict” the likelihood that a future theft will be above or below that value, assuming a future threat incident actually occurs. The underlying assumption in such a calculation is that past is prologue. The word “predict” is shown in quotation marks because such estimates should more appropriately be viewed as a generalization or extrapolation to the future. A
12 New England Journal of Medicine, Berry et al., Lifetime Risks of Cardiovascular Disease, 366:321–329. Copyright 2012 Massachusetts Medical Society. Reprinted with permission.
13 The cumulative risk results from continuously adding the results in a distribution to yield the sum of all previous results. For example, if the probability of experiencing a disease based on the day of exposure is 10% on day one, 20% on day two, and 30% on day three, the cumulative risk is 10% on day one, 30% on day two, and 60% on day three.
statement about the likelihood of a future threat incident is predicated on historical evidence. Threat scenario conditions must remain stable if historical results are indeed to presage future risk. It is clear why statements about terrorism risk are nonsense in the absence of a normalized probability distribution of historical threat incidents. Without such data, the spectrum of possible threat scenario outcomes is infinite, including the scenario where no incident actually occurs! In other words, it is impossible to specify a fraction of the relevant incidents from the universe of similar incidents because that universe does not exist.

One might be tempted to ask whether a lack of thefts implies that everyone can keep their doors unlocked and leave their valuables in plain sight. Of course, we know this to be a bad strategy. Recall from the Introduction that an absence of threat incidents does not imply an absence of risk. However, an absence of threat incidents makes it necessary to leverage other risk-relevant features to assess the likelihood component of risk. Specifically, an estimate of potential uses such information to enable inferences about the future. This too is a form of generalization about the future based on the past, albeit a qualitative version of same.

Fundamentally, estimates of probability versus potential reflect differences in uncertainty. The potential is driven by uncertainty in the contribution of the individual risk factors to the likelihood component of risk. We simply cannot determine the effect of a risk factor in yielding historical threat incidents because such an effect has not been realized, i.e., there have been no or too few threat incidents. In contrast, a probability distribution of historical threat incidents is a manifestation of the effect of risk factors. Therefore, a direct if still uncertain connection between a risk factor and actual threat incidents has been established.
However, this type of uncertainty is very different from the uncertainty associated with threat scenarios lacking historical threat incidents. This new type of uncertainty relates to the precision in pinpointing the exact value of a specific threat incident if chosen at random from a distribution of threat incidents. Importantly, this precision can be quantified via the distribution variance and enables generalizations about future threat incidents, should they occur. We now arrive at the following result: The uncertainty associated with a probability distribution of historical threat incidents enables quantitative estimates of the likelihood of a future threat incident assuming the risk factors that relate to those historical incidents remain stable. The magnitude of uncertainty is equal to the variance of the distribution.14
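The theft example makes this concrete. Using a small set of hypothetical theft values (real assessments would use recorded incident data), one can compute both the probability that a randomly chosen theft exceeds a given value and the variance that quantifies the distribution's dispersion:

```python
import statistics

# Hypothetical values (in dollars) of historical thefts recorded under
# stable risk factors; purely illustrative data.
theft_values = [120, 250, 250, 400, 650, 800, 950, 1200, 1500, 2100]

def prob_exceeds(values, threshold):
    """Probability that a theft chosen at random from the historical
    distribution exceeds the given value."""
    return sum(v > threshold for v in values) / len(values)

p = prob_exceeds(theft_values, 1000)

# The sample variance quantifies the dispersion of the distribution,
# i.e., the uncertainty in the value of a randomly chosen incident.
spread = statistics.variance(theft_values)

print(p)       # 0.3
print(spread)  # 400840.0
```

If the risk factors underlying these thefts remain stable, the 0.3 figure generalizes to future thefts, should they occur; if the risk factors shift, the distribution, and hence the probability, no longer applies.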
Estimates of the potential are frequently based on subjective risk factors such as human motivation. These risk factors are the legitimate domain of social science yet
14 The probability of protection method described in Chap. 8 uses a probability distribution of risk factor values to estimate the likelihood of the effectiveness of a security control. Specifically, the distribution is used to estimate the dispersion in the vulnerability component of risk relative to a security control specification.
Fig. 1.4 Distribution of domestic violence by age and gender (incident % of total)

Age:    10-14  15-17  18-19  20-24  25-29  30-34  35-39  40-44  45-49  50-54  55-59  60-64  65+
Female: 0.12%  2.89%  5.16%  17.52% 18.81% 16.48% 14.15% 12.15% 7.26%  3.01%  1.43%  0.58%  0.44%
Male:   0.02%  1.75%  3.48%  15.23% 17.95% 17.58% 14.81% 12.48% 8.58%  4.33%  2.00%  0.99%  0.82%
they are perfectly valid nonetheless. The difficulty arises in attempting to formulate a probability distribution of subjective risk factors. Although social science is distinct from physical science, distinctness does not imply inferiority or irrelevance. The social behavior of humans is ultimately linked to basic science, and effective security risk management demands awareness of human behavior. The following passage expresses this relationship.15

That does not mean, of course, that a theory capable of elucidating a theory of social behavior of human beings will have a special place outside the framework of the natural sciences. Man is the product of an evolution reflecting universal natural laws. The explanation for human behavior, complex as that explanation may be, will ultimately be found in the basic principles of science. We cannot predict, of course, what levels of complexity our understanding will reach. Theories are based on abstraction. They abstract what is regular and readily reproducible from reality and present it in an idealized form, valid under certain assumptions and boundary conditions.
Suppose the Department of Social Services is attempting to tackle domestic violence threat scenarios in a particular community. The Police Department has provided data on crime statistics in the form of a probability distribution of threat incidents for both males and females, which is shown in Fig. 1.4.16 Note the higher curve specifies incidents relating to females, and the lower curve shows incidents pertaining to males.

15 M. Eigen and R. Winkler, Laws of the Game; How the Principles of Nature Govern Chance. Princeton University Press, Alfred A. Knopf, Inc., 1981.
16 J. Kerr, C. Whyte, H. Strang, Targeting Escalation and Harm in Intimate Partner Violence: Evidence from Northern Territory Police, Australia, Cambridge Journal of Evidence-Based Policing, September 2017, Volume 1, Issue 2–3, pp. 143–159 (Springer).
Assuming conditions contributing to domestic violence remain relatively stable over risk-relevant time scales, Fig. 1.4 could be used to develop a risk management strategy. Such a strategy would likely be based on the frequency of incidents relative to domestic violence risk factors. To that end, Fig. 1.4 is a probability distribution of threat incidents parameterized in terms of age and gender. It appears that the probability of being a female victim between 25 and 29 years old is about 0.19. By contrast, the probability of being a female victim between the ages of 60 and 64 is only about 0.006. Therefore, the Department of Social Services might focus on younger women (and men) in addressing this threat scenario.

Again, the term “prediction” in this context does not imply that future incidents of a certain type, i.e., committed by a person of a certain age or gender, will necessarily occur. The implication is that any future incidents, should they occur, would conform to historical precedent. To reiterate, a probability is a generalization that a future incident, should it occur, will possess a specific characteristic based on a probability distribution of previous incidents. The accuracy of this generalization depends on the sample size. More precisely, the uncertainty in the accuracy, i.e., the precision, is inversely proportional to the square root of the sample size assuming a normal distribution of historical incidents. Furthermore, if conditions change too rapidly, generalizations about future threat incidents based on the original data will not necessarily apply.

Figure 1.2 showed the cumulative risk of developing lung cancer based on a single risk factor for that disease, i.e., cigarette smoking. One would expect the magnitude of the likelihood component of risk to increase when multiple risk factors are present. Figure 1.3 confirmed that hypothesis, at least with respect to the risk factors for cardiac incidents.
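These figures can be read off the Fig. 1.4 data directly. The sketch below normalizes the female percentages into probabilities and then illustrates the square-root scaling of precision; the standard-error helper assumes a simple binomial model for a proportion, which is an assumption for illustration rather than anything specified in the source data:

```python
import math

# Female incident percentages by age bracket, from Fig. 1.4.
female_pct = {
    "10-14": 0.12, "15-17": 2.89, "18-19": 5.16, "20-24": 17.52,
    "25-29": 18.81, "30-34": 16.48, "35-39": 14.15, "40-44": 12.15,
    "45-49": 7.26, "50-54": 3.01, "55-59": 1.43, "60-64": 0.58,
    "65+": 0.44,
}

# Convert percentages to probabilities; they should sum to one.
p = {age: pct / 100 for age, pct in female_pct.items()}
assert abs(sum(p.values()) - 1.0) < 1e-9

print(round(p["25-29"], 2))  # 0.19, as stated in the text

# Precision improves with sample size: for a proportion estimated from
# n incidents, the standard error scales as 1/sqrt(n).
def standard_error(p_hat, n):
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Quadrupling the sample size halves the uncertainty.
print(round(standard_error(0.19, 400) / standard_error(0.19, 100), 2))  # 0.5
```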
Although it would appear to be common sense, this effect follows from the fact that the magnitude of threat scenario risk is a function of the product of the likelihood risk factors. The observed exponential increase in the magnitude of the likelihood component of risk would not occur if the cumulative effect of likelihood risk factors were additive. The contemporaneous presence of likelihood risk factors is referred to as a “confluence of risk factors,” and this condition is discussed in Chap. 7. The predictive power of the data in Figs. 1.2 and 1.3 lies in the statistical confidence gained through numerous historical incidents in conjunction with the stability of the risk factors. The conclusions so derived are evidence-based, and the evidence derives from numerous threat incidents that correlate with the presence of various risk factors.

Let’s examine another threat scenario to illustrate the distinction between probability and potential. The threat is now a violent encounter with a subway train as a result of slipping on the subway platform and falling to the track below. A detailed analysis of this rather grim threat scenario is provided in Chap. 7. We exclude the even grimmer threat scenario of being pushed onto the track. As one might expect, a risk factor for slipping is the “slipperiness” of the platform surface in proximity to the platform edge. The coefficient of friction of the platform surface is a metric expressing the magnitude of slipperiness.
22
1 Definitions and Basic Concepts
Based on the physics of friction and its effect on stability, a reasonable if qualitative statement regarding the likelihood component of risk emerges as follows: A lower coefficient of friction of the platform in the vicinity of the track increases the potential for slipping and falling on to the tracks. In other words, the presumption is that the tendency to slip increases when the coefficient of friction of the platform surface decreases. Moreover, the likelihood of experiencing a threat incident only occurs if the following two risk factors are present: (1) a platform area with a lower coefficient of friction, and (2) proximity of that platform area to the platform edge. Note the assumption of increased risk due to a lower coefficient of friction lacks direct empirical evidence but could be confirmed via experiment. Neglecting the influence of other factors such as shoe type and the balance of experiment participants, numerous individuals could be observed walking on the platform. The number of slips could be recorded as a function of the coefficient of friction of the platform surface, which would lead to a probability distribution of slips relative to the platform surface slipperiness. If many individuals knowingly or unwittingly participated in this experiment, the probability that any individual will slip and fall on the subway platform could be determined. The more data we collect the greater the statistical confidence in the resulting generalization. An indicative (and idealized) probability distribution of slips as a function of the platform coefficient of friction might appear similar to Fig. 1.5. As expected, slipping and the coefficient of friction are inversely related. The lower the coefficient, i.e., the more slippery the surface, the greater the probability of slipping. Absent such an experiment, only the tendency to slip could be ascertained from measuring changes to the coefficient of friction. 
Such conclusions are based on the presumption that pedestrians tend to slip more on slippery surfaces. Our intuition is confirmed by personal experience as well as physics. One’s foot remains static on the
Fig. 1.5 Likelihood of slipping as a function of the coefficient of friction
1.6 Probability v. Potential
23
surface because it is balanced by horizontal forces exerted by a backward muscular force of the leg against the force of static friction. If the frictional force is low the muscular force becomes too powerful, which causes the leg to slip along with the attached person. To reiterate, the actual probability of slipping cannot be deduced from changes to the coefficient of friction alone. However, it is possible to make inferences about likelihood. Such inferences enable statements about relative changes in the magnitude of risk. For many threat scenarios it is simply not practical to perform the type of experiment necessary to generate a probability distribution of threat incidents. Therefore, assessments of likelihood must settle for estimates based on the number of risk factor-related incidents or a change in the magnitude of a threat scenario risk factor, e.g., the coefficient of friction. The good news is that inferences are often good enough, but care is required in interpreting conclusions based on inference alone. Anticipating a more fulsome discussion of risk factors in Chap. 2, it is instructive to provide an analytic representation of the connection between risk factors and threat incidents. The ultimate objective is to provide additional insight into the distinction between probability and potential. Specifically, we leverage a liberal interpretation of Bayes Theorem to describe conditional probabilities for threat incidents given the presence of risk factors, which reveals the uncertainty that accompanies threat scenarios characterized by risk factor-related incidents.17 This theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event. Specifically, for two events A and B, Bayes Theorem is written as follows: PðA=BÞ ¼ ½PðB=AÞ PðAÞ=PðBÞ
ð1:1Þ
In words, (1.1) states that the probability of event A given event B equals the probability of event B given event A times the probability of event A all divided by the probability of event B. Let A be an actual threat incident resulting from a threat scenario. Let B be ANY risk factor in a threat scenario. We see that P(B/A), i.e., the probability of a risk factor given a threat incident, is one. Why? If a threat incident has occurred, a threat scenario risk factor must be present by definition. If there has been a theft in a building, it means either an instance of unauthorized building access took place, a drawer containing valuables was not sufficiently secured, etc. Threat incidents are manifestations of threat scenario risk factors. Therefore, (1.1) reduces to the following expression. It specifies the conditional probability of observing an actual threat incident given a risk factor is present: PðA=BÞ ¼ PðAÞ=PðBÞ
17
Thomas Bayes, English mathematician, 1801–1861.
ð1:2Þ
24
1 Definitions and Basic Concepts
In other words, the probability of a threat incident given the presence of a risk factor simply equals the probability of a threat incident divided by the probability of a risk factor. We see that P(A/B) decreases when P(B) increases, and it increases when P(A) increases. This condition holds irrespective of whether P(B/A) equals one, which is only the case if B is assumed to be any risk factor and not a specific one. Let’s now assume B is a particular risk factor and evaluate a specific scenario. Let P(A) be the probability of a terrorist incident, and P(B) is the probability of poverty. Poverty is a risk factor for many threats other than terrorism. Therefore, P(A/B) is lower if B is, say, a malnourished citizen rather than an illegal cache of explosives. The former is unfortunately rather common and nonspecific. In contrast, the latter would be expected to highly correlate with terrorism incidents. We would expect to see terrorism incidents if caches of explosives were prevalent but less so if there were only malnourished citizens. Likewise, if P(A) increases, it implies P(A/B) increases since the likelihood of a terrorism incident has increased irrespective of P(B). Note that (1.2) should be less than or equal to one since P(A) P(B). Therefore, relation (1.2) suggests a qualitatively reasonable result. Namely, the probability of an actual threat incident is less than or equal to the potential for risk factor-related incidents. An actual threat incident is the resultant of likelihood risk factors. Risk factor-related incidents would be expected to be more likely since they merely suggest the potential for threat incidents. Expression (1.2) also suggests but doesn’t explain why threat scenarios that are characterized by threat incidents are rarer than those that lack threat incidents. 
Required thresholds or triggers for threat incidents such as those related to terrorism can be significant especially in the presence of security controls, yet risk factors for such threats abound. Moreover, (1.2) succinctly expresses the insuperable gulf between potential and probability for threat scenarios lacking a history of threat incidents. The disjunction between probability and potential is a central theme in the theory of security risk assessment and the impetus for much of the discussion in this text. Suppose we characterize the potential for threat incidents in a particular threat scenario as a continuum of values ranging from zero to some unspecified value. We now specify P(A/B) as a value ranging from zero to one, which is consistent with the definition of a probability. If P(A/B) is zero it means the probability of a threat incident is nil. A value of P(A/B) equal to one means the likelihood of a threat incident is a certainty given the presence of a risk factor. Recall the potential is a function of the number of risk factor-related incidents or a change in the magnitude of a risk factor. Therefore, the value of the potential is theoretically unlimited, and would (somehow) scale with one or both of these parameters. A plot of P(A/B) versus the potential for a threat incident reveals the inherent difficulty in assessing the likelihood component of risk. One possible plot is shown in Fig. 1.6, which reveals an exponential increase in P(A/B). Alternatively, Figs. 1.7 or 1.8 might characterize the magnitude of P(A/B) as a function of the potential.
1.6 Probability v. Potential
25
Fig. 1.6 Exponential increase in P(A/B)
Fig. 1.7 Sigmoidal increase in P(A/B)
Fig. 1.8 Increase in P(A/B) that asymptotically approaches one
The point is it is impossible to determine what the trend in P(A/B) will look like. All we really know is the trend is increasing as a function of either the number of risk factor-related incidents and/or a risk-relevant change in a risk factor. The impossibility derives from uncertainty in the effect of risk factors on the magnitude of risk. We will see that the assessment of the likelihood component of risk is ultimately determined by the type of uncertainty inherent to a given threat scenario, which in turn is driven by the presence or absence of threat incidents.
26
1.7
1 Definitions and Basic Concepts
The Fundamental Expression of Security Risk
We know that risk characterizes the relationship between threats and affected entities within the context of a threat scenario. In addition, the risk associated with any threat scenario consists of three components. We can explicitly relate the magnitude of threat scenario risk to the three components as follows: Risk ðthreat scenarioÞ / Impact Likelihood Vulnerability
ð1:3Þ
Expression (1.3) should be read as, “The magnitude of treat scenario risk is proportional to the product of the impact of a threat incident, the likelihood of a threat incident and the vulnerability to a threat incident.” Expression (1.3) is known as “The Fundamental Expression of Security Risk,” an admittedly grandiose term for a relatively simple expression. Importantly, the representation of (1.3) as a product does not imply the magnitude of risk is achieved by multiplying the individual components. The form of (1.3) merely implies that the three components collectively represent the magnitude of threat scenario risk. Therefore, in the absence of even one component the magnitude of risk is zero irrespective of the value of the other components. In other words, there is no risk associated with a particular threat scenario if any one of the components of risk is zero. A more granular depiction of (1.3) would explicitly delineate the risk factors for each component. Importantly, the risk factors for impact and vulnerability should be expressed as a sum. Their cumulative effect is often additive although it will depend on the threat scenario. In contrast, the effect of likelihood risk factors is always multiplicative, and we will soon see that their combined effect is an exponential increase in the magnitude of likelihood. As before, the three components of risk specify the magnitude of threat scenario risk in aggregate. Expression (1.4) is a more exact if lengthy expression incorporating all risk factors for each component of risk: Risk ðthreat scenarioÞ / ½Likelihood Risk Factor 1 . . . Likelihood Risk Factor N ½Vulnerability Risk Factor 1 þ . . . þ Vulnerability Risk Factor N ½Impact Risk Factor 1 þ . . . þ Impact Risk Factor N
ð1:4Þ
Strictly speaking, (1.4) is still not quite correct since it does not include the effect of security risk management, i.e., the presence of security controls. The components of risk as expressed in both (1.3) and (1.4) represent a highly idealized threat scenario since security controls are present in some form in any realistic threat scenario. Arguably, even the awareness of a threat constitutes a form of security control. A more complete form of (1.4) is presented in Chap. 12, which accounts for the presence of security controls as well as other parameters relating to the theory.
1.8 Absolute, Relative and Residual Security Risk
27
Other threat scenario conditions that affect the magnitude of risk are also missing from (1.4). Specifically, two conditions that act to maximize risk are not accounted for: the confluence of likelihood risk factors and the misalignment of security controls and risk factors. In addition, the effect of complexity, a significant risk factor for the likelihood component of risk, has been omitted. Each of these conditions is discussed in later chapters, and all are ultimately included in the revised expression noted above. Despite the fact that (1.3) and (1.4) are incomplete they are useful nonetheless. Identifying basic threat scenario features that affect the three components of risk yields important insights, and is critical to understanding the requirements for security controls. The reader should also note that (1.3) and (1.4) are not proper equations. They are written as a proportionality, which reflects the uncertainty in the magnitude of the individual components and hence their relative contribution to the overall magnitude of risk. To be more specific, any coefficients and exponents associated with each component are missing. Because of its inherent inexactness, expression (1.3) and (1.4) must be considered indicative. They provide a useful statement about the requirement for the individual components but offer no clues regarding the scaling of those components. For example, it is conceivable that threat scenario risk equals the magnitude of one component squared, another component cubed, etc. Both expressions are written as a linear product of all three components, a condition that may or may not bear any resemblance to reality. Finally, the fact that (1.3) and (1.4) are not exact does not mean assessing the magnitude of risk is impossible. It also does not imply that (1.3) and (1.4) are entirely irrelevant. Rather, it provides a high-level template from which to begin a security risk assessment. 
Its main message is that all three components of risk must be represented in any realistic threat scenario. Its incompleteness suggests the need to explore the contributions of the individual components further as well as to identify other potentially risk-relevant parameters. These methods will still only yield approximations, but they are often sufficient to gain the insight necessary to identify appropriate security controls.
1.8
Absolute, Relative and Residual Security Risk
Although risk is simply the relationship between a threat and an affected entity, the results of risk assessments are often subject to misinterpretation. What are the features of risk that contribute to varied and/or ambiguous results? First, the multi-component nature of risk is certainly a contributory factor. Second, the magnitude of risk is highly contextual, and therefore it possesses an inherent fluidity. Third, an assessment in the real world is always conducted with respect to the status of existing security controls. Therefore, an assessment must always be viewed in terms of two dimensions that are in constant juxtaposition; the
28
1 Definitions and Basic Concepts
magnitude of a component of risk versus the performance specifications of relevant security controls. It is the assessed gap between these two dimensions that determines the risk profile of an entity or organization. Lastly, and perhaps most importantly, the actual meaning of risk depends on what aspect of the relationship is being evaluated. For example, several components of risk can be measured in the same way one measures intensive and extensive properties of systems like mass, temperature and pressure. These yield absolute quantities that may or may not exceed an articulated threshold. In contrast, the likelihood component of risk is an inherently relative feature of a threat scenario. What are the implications of each component of risk in terms of their relative and absolute value? For purposes of review, we learned in this chapter that the magnitude of risk is represented as a product of three components, which is captured in the Fundamental Expression of Risk. However, we also learned that this expression does not yield an absolute number for a given threat scenario. It merely conveys the presence of all three components of risk, and does not specify their respective contributions to the overall magnitude. However, individual components can yield absolute results. For example, if we can establish a model for loss for a given threat scenario we can predict its magnitude as a function of one or more risk factors. In Chap. 7 we will discuss explosive threat scenarios. Here the magnitude of structural damage or loss can be estimated from two risk factors: distance between the explosion and the target and the explosive payload. As noted above, the likelihood component of risk is a relative quantity. Although a probability has a specific numerical value, it actually represents a comparison among peer elements within a probability distribution. 
Threat scenarios with a history of threat incidents enable quantitative assessments of likelihood, but the resulting probability represents some fraction of the total distribution of incidents. Even in the absence of a probability distribution of historical threat incidents, an estimate of likelihood can only be an evaluation of relative risk. As noted previously, such a situation demands an evaluation of risk factors or risk factor incidents in order to assess likelihood. However, implicit in the result of such evaluations is a value relative to other scenario outcomes. Residual risk is the magnitude of risk that remains following the application of security controls with respect to some ideal or tolerated threat scenario condition. Although it represents an estimate of actual threat conditions, it is still not an absolute quantity. Residual risk is always “measured” relative to some agreed standard. It is important to appreciate that the difference between the actual and idealized risk is never zero. There is always some risk associated with any threat scenario even after security controls have been implemented. It may be difficult to assess residual risk, especially for complex threat scenarios, but each component of risk always possesses a finite magnitude.
1.8 Absolute, Relative and Residual Security Risk
29
In a weak moment it is tempting to speak of “hidden variables” in characterizing residual risk. We do so with extreme caution, and caveat this analogy by pointing out that a theory of hidden variables is how Einstein explained a very specific phenomenon in quantum mechanics, i.e., spooky action at a distance. Nevertheless, a very loose analogy is possible in the sense that sometimes there are intangibles that contribute to the magnitude of security risk. As a result of the crash of several 737 Max aircraft, the Boeing aircraft company is being scrutinized for their management of residual risk. As noted above, residual risk always exists, which is equivalent to saying there is no such thing as a zero risk threat scenario or a zero-risk condition for any realistic threat scenario. Manufacturers like Boeing conduct simulations to determine the performance limits of specific airplane components. The tolerance for risk is dictated by many factors including profitability, and tradeoffs must always be made. For example, a particular part might experience a failure rate of 106, i.e., one failure per million trials. Depending on the part and its specific function in keeping the aircraft aloft, that figure might be deemed satisfactory or not. Although it might be possible to improve performance to (say) 109, it might be cost prohibitive and the company will not invest the resources. The magnitude of risk could be reduced, but the company has presumably established a standard based on its tolerance for risk. In theory, the U.S. government’s regulatory agency, the FAA, has signed off on that standard. In the case of Boeing and the 737 Max, the question is whether safety standards were sufficient to reduce residual risk to an acceptable level or whether such standards were sufficient and ignored. A further question exists around the process of inspections and sign-off by the US government. 
The magnitude of the likelihood component of risk for commercial air travel is particularly low and has been improving steadily over the years. This condition is evidence that risk management efforts are effective. In 2018 there were 0.39 fatal accidents per million flights.18 The magnitude of risk can always be decreased, but as noted above, it will never be zero even if the number of fatalities is reduced to zero. The risk factors for likelihood, vulnerability and impact will persist in spite of historical evidence or lack thereof. An airplane accident is an example of a threat scenario with low likelihood and high impact, a tricky threat scenario type that is singled out for discussion in Chap. 12. Finally, a number representing the magnitude of risk, irrespective of whether it represents an absolute or relative value, cannot be used to manage risk unless it is evaluated against some threshold that reflects the tolerance for risk. A security standard defines that threshold, and thereby determines if an organization must act on assessment results.
18
Javier Irastorza Mediavilla (Jan 2, 2019); "Aviation safety evolution (2018 update)".
30
1.9
1 Definitions and Basic Concepts
Summary
The fundamental construct of the theory of security risk assessment is the threat scenario. Threat scenarios are comprised of threats, affected entities and the environment where threats and entities interact. When viewed from this vantage in combination with the universality of risk, all threat scenarios are equivalent. The Fundamental Expression of Security Risk states that the magnitude of risk is proportional to the product of its three components, and is written as follows: Risk ðthreat scenarioÞ / Impact Likelihood Vulnerability The components of risk are universal and consist of the likelihood of a threat incident, the vulnerability or loss resulting from a threat incident and the impact of a threat incident. The magnitude of threat scenario risk determines the relevance of a specific threat to a particular entity. To put it another way, risk is what makes a specific threat threatening to a particular entity. Risk factors associated with each component are features of affected entities or the threat scenario environment that increase the magnitude of one or more components of risk. The components of risk and associated risk factors determine the relationship between threats and affected entities. Threat scenario equivalence ensures the process of security risk assessment applies to all threat scenarios. The dichotomy in assessing the likelihood component of risk is a fundamental aspect of the theory of security risk assessment. This dichotomy is characterized by calculations of probability versus estimates of potential. Probability calculations require a probability distribution of historical threat incidents in conjunction with risk factor stability. An estimate of the potential for a threat incident is an assessment of the tendency for threat incidents to occur. 
An estimate of the potential yields inferences of the magnitude of threat scenario risk, and is based on either a probability distribution of risk factor-related incidents or a change in the magnitude of a risk factor. The distinction between probability and potential motivates the discussion on direct and indirect measurements of the likelihood component of risk in Chap. 7. The consequence of a theory of security risk assessment is the emergence of a structured and risk-based security risk management process. The magnitude of risk associated with any threat scenario can be compared via objective, i.e., risk-based criteria. Finally, although the identical process is used to assess all threat scenarios, the same security risk management strategy will not necessarily be invoked, even for identical threat scenarios. The cost of security controls plus the organizational tolerance for risk often varies and can significantly influence such strategies.
Chapter 2
Risk Factors
2.1
Introduction
Risk factors are a principal focus of any security risk assessment. Threats and the entities affected by those threats are related via the components of risk, and the risk factors modulate that relationship. Therefore, a comprehensive security risk management strategy must account for, if not necessarily address, each risk factor identified during an assessment. Threats and affected entities can also evolve over time. However, unless the risk factors are affected, which they often are if one or more threat scenario elements are changed, such changes have no effect on the magnitude of risk. In other words, risk factors exclusively determine the magnitude of threat scenario risk. Organizing risk factors into categories is helpful in establishing a conceptual framework of security risk as well as in identifying appropriate security controls. Five categories of risk factors have been identified: apex, spatial, temporal, behavioral and complexity. Apex risk factors (ARF) have the maximum effect on the magnitude of threat scenario risk. ARFs do not necessarily have common features, and any risk factor can quality as an ARF. Therefore, every ARF also belongs to at least one other risk factor category. ARFs are afforded their own category because of the profound effect on the magnitude of risk and their effect on other risk factors. We now know that risk factors are only present in affected entities and the environments in which threats and affected entities interact. Risk factors do not apply to threats. The threat-entity environment of interaction is the “space” to which the adjective “spatial” refers in the context of threat scenarios. In other words, spatial risk factors are present in threat scenario environments. A temporal risk factor is one where the magnitude of its effect relates to its longevity or variability. Temporal risk factors consist of two subcategories: persistent and intermittent. 
The former enhance the magnitude of risk due to a continued © Springer Nature Switzerland AG 2019 C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_2
31
32
2 Risk Factors
presence, and the latter have the same effect but it is due to their intermittent appearance and disappearance. A behavioral risk factor is a feature or behavior of an affected entity that increases the magnitude of threat scenario risk. Behavioral risk factors can have a disproportionate effect on threat scenarios that contain numerous entities or in scenarios where affected entities can themselves affect the performance of security controls. An information technology environment is an example of a threat scenario where the effect of behavioral risk factors is particularly acute. This effect is due to the presence of numerous entities and their individual effect on the performance of security controls, e.g., password selection. There are two threat scenarios that affect threat scenario complexity: the number of risk factors and the uncertainty in risk management, where risk management is defined as the application of security controls to risk factors. A threat scenario characterized by either one of these features is categorized as complex. Complex threat scenarios are described in detail in Chap. 9.
2.2
Definitions and Examples
A formal definition of a risk factor is as follows: A risk factor is a threat scenario feature or condition that increases the magnitude of one or more components of risk.
As noted above, risk factors do not pertain to threats. Threats certainly have consistently threatening features, e.g., fire is always hot, guns always discharge highvelocity projectiles, and great white sharks always have sharp teeth. Indeed, threats may be inherently threatening, but the magnitude of threateningness with respect to a particular entity depends exclusively on the context, i.e., specific features of the affected entity and/or the threat scenario environment. Let’s return to the great white shark threat scenario to illustrate this important point. As noted in Chap. 1, a great white shark is a significant threat to nearly every animal in the ocean due to its razor-sharp teeth as well as other predatory advantages conferred through superior size and speed. However, the magnitude of risk is clearly different depending on the particular prey, i.e., the affected entity. For example, consider if its prey is a human versus a blue whale. The whale’s size effectively neutralizes the great white shark’s assets. In general, the shark is a highly effective predator except in threat scenarios involving much larger adversaries such as blue whales and killer whales.1 As noted previously, the magnitude of threat scenario risk also depends on the features or conditions of the environment where threats and affected entities interact.
1 The largest blue whale can be 30 m in length and weigh 173 metric tons (one metric ton equals 1000 kg). In contrast, the largest great white shark is 6.1 m in length and 1905 kg.
2.2 Definitions and Examples
33
In the shark attack threat scenario, the ocean is the environment, where sharks are almost always the threat and their prey are affected entities. Risk-relevant environmental conditions affecting the likelihood component of risk might include the murkiness of the water and the abundance of more attractive food sources.2 The latter condition is a risk factor since it affects the shark’s appetite and thereby increases the potential for an attack. Lastly, recall that each of these risk factors would cease to be risk-relevant in a threat scenario where the environment is anywhere but the ocean. The shark might still possess big teeth and a huge size advantage, but it would be unable to leverage these features on land. In salt water, the great white shark is an apex predator. On land, its chances of even short-term survival are nil, even against the most diminutive adversaries. The reader might reasonably question why threateningness only applies to affected entities and threat scenario environments and not the threats themselves. For example, a great white shark that that weighs 100 pounds and lacks teeth is much less threatening than one that weighs 2500 pounds and possesses the usual array of dental weaponry. The contention is that an edentulous 100-pound great white shark is actually not the same animal as the version forever immortalized in Jaws. Unless it is still a pup, a great white shark is arguably not a great white shark without big and sharp teeth in conjunction with a large body. Therefore, a threat scenario consisting of a miniature great white shark is not comparable to one containing an actual great white shark any more than if a tuna were the threat. They are simply not equivalent threat scenarios. Only the characteristics of the affected entity and/or the environment where the threat and entity interact can cause the same threat to be more or less threatening. 
Consider other examples of risk factors in various threat scenarios: • The vulnerability associated with falling from a scaffold is a function of the person’s height above the ground when the individual falls. • The likelihood associated with falling from a scaffold is a function of the stability of the scaffold and the balance of the person on the scaffold. • The likelihood of slipping on a surface is a function of the surface coefficient of friction. • The vulnerability associated with vehicle-borne explosives is a function of the explosive payload and the distance of the vehicle from the target. • The vulnerability associated with a financial asset is the total amount of that asset that is subject to loss. • The impact associated with a biological weapon is a function of the magnitude of environmental conditions that increase dispersion, e.g., wind.
2 Sharks have exceptional olfactory capabilities and can also sense minute variations in the electric field intensity generated by other animals. However, they purportedly have poor vision. Attacks on humans have sometimes been explained as cases of mistaken identity where the murkiness of the water contributes to the shark's difficulty in distinguishing humans from more nourishing meals, e.g., seals.
2 Risk Factors
• The impact associated with a chemical weapon is an inverse function of the magnitude of environmental conditions that increase dilution, e.g., wind.
• The likelihood of a theft from within a building is a function of the number of unauthorized entries to internal areas of that building.
• The likelihood of a denial-of-service attack against an organization's IT network is a function of the public profile of that organization.

A risk factor is specified for each of the aforementioned threat scenarios, which by definition increases at least one of the three components of risk. For example, in the case of the scaffold threat scenario, the magnitude of injuries sustained as a result of falling relates to the height from which the person fell. Note that the height above the ground itself has nothing to do with the likelihood of falling unless the individual becomes increasingly unstable at greater heights. Risk factors for this threat scenario include the stability of the scaffold and the balance of the individual on the scaffold, where the former is a spatial risk factor and the latter is behavioral.

Another risk factor for the scaffold threat scenario is the composition of the ground. The harder the surface, the greater the damage to the individual should a fall occur. Clearly, concrete would increase the vulnerability component of risk relative to a bed of feathers for the same height above the ground. It is possible to generate a curve specifying the relative magnitude of injuries due to falling as a function of the height above the ground for a given ground composition. Note that at some threshold height the ground material is of little consequence. At that threshold height and above, the maximum vulnerability, i.e., severe injury or death, is a near-certainty. Figure 2.1 is an indicative curve illustrating the relative vulnerability for an individual on a scaffold as a function of height above a concrete surface.
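The indicative curve just described can be sketched as a simple piecewise model. Both the linear rise and the saturation threshold below are illustrative assumptions rather than empirical values:

```python
def relative_vulnerability(height_m, threshold_m=10.0):
    """Relative vulnerability (0 to 1) for a fall onto a hard surface.

    Assumes vulnerability rises linearly with height and saturates at
    1.0 (severe injury or death) at the threshold height. The linear
    form and the 10-meter threshold are assumptions for illustration.
    """
    if height_m <= 0:
        return 0.0
    return min(height_m / threshold_m, 1.0)

# Below the threshold, vulnerability scales with height...
low = relative_vulnerability(2.5)    # 0.25
# ...and at or above it, the ground composition no longer matters.
high = relative_vulnerability(15.0)  # 1.0
```

A softer surface could be modeled by raising the threshold height, which flattens the initial slope without changing the saturating shape of the curve.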
The curve is normalized to unity, which is the maximum vulnerability, i.e., death. Note that Fig. 2.1 suggests that the magnitude of the vulnerability component of risk initially increases linearly with height. Although not unreasonable, this representation is not based on empirical evidence. The critical feature is that the vulnerability increases as a function of height up to a maximum value. For heights exceeding the maximum, whatever that actual value might be, the magnitude of vulnerability is constant. To summarize using the language of risk, height above the ground is a risk factor for the vulnerability component of risk for a scaffold threat scenario.

A recurring threat scenario is the theft of valuables from within an office. Unauthorized access to internal space is a risk factor for such threats, which is one reason to install a centrally managed access control system as a security control. Notwithstanding the fact that an actual theft may or may not have occurred in the past, instances of unauthorized building access are presumed to increase the magnitude of the likelihood component of risk for office theft threat scenarios. Simply stated, there are more opportunities for thefts to occur if more individuals without permission are allowed to enter restricted space. Although we intuitively believe this fact to be incontrovertible, it is impossible to quantify the magnitude of
Fig. 2.1 Relative vulnerability as a function of height above the ground (concrete)
the likelihood component of risk unless we assume an attempted theft in conjunction with unauthorized access is a random variable. The broader implications of such an assumption are central to the discussion in Chap. 8.

Contrast a theft threat scenario without threat incidents with one where actual thefts have occurred. How is the latter threat scenario different from the one involving incidents of unauthorized access but no actual thefts? For starters, no risk factors are specified in the threat scenario where actual thefts have occurred. The relevance of risk factors to a risk assessment depends in part on the available data and the precise question to be answered. For example, we could ask, "What is the probability that a theft exceeds $100 in value?" If we organize the data according to ranges of theft values, and the fraction each range represents relative to the total theft population is specified, we can calculate the probability that any theft selected at random exceeds a specific amount. The resulting distribution would have utility as a prognosticator of the value of future stolen items if and only if theft-related risk factors remained stable.

Although risk factors certainly exist for this threat scenario as in all threat scenarios, their effect yields a probability distribution of historical threat incidents. In other words, threat incidents and the resulting probability distribution are explicit manifestations of the threat scenario risk factors at work. However, the risk factors for theft are not germane to the probability of a specific theft value. In this case, the probability distribution of theft values facilitated a calculation of the likelihood of the loss-per-incident. Therefore, the question actually resulted in a distribution pertaining to the impact component of risk. Such a calculation might be used to justify the expense of a particular security control relative to the historical distribution of losses.
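The calculation just described can be sketched with hypothetical bins of historical theft values; the ranges and counts below are invented for illustration:

```python
# Hypothetical historical theft data, binned by dollar-value range:
# (lower bound, upper bound, number of incidents). These numbers are
# placeholders, not data from the text.
theft_bins = [
    (0, 100, 60),
    (100, 500, 30),
    (500, 1000, 10),
]

total = sum(count for _, _, count in theft_bins)

def prob_theft_exceeds(threshold):
    """Probability that a theft drawn at random exceeds `threshold`,
    assuming the threshold coincides with a bin boundary."""
    exceeding = sum(c for lo, hi, c in theft_bins if lo >= threshold)
    return exceeding / total

p = prob_theft_exceeds(100)  # 40 of 100 incidents exceed $100
```

As the text notes, the resulting distribution pertains to the impact component of risk (loss-per-incident), and it remains predictive only while the underlying risk factors stay stable.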
In summary, extrapolations from the past to the future become possible with actual threat incidents. The power of a probability distribution is in facilitating generalizations about the likelihood of a particular incident based on the historical
record. However, the validity and precision of such generalizations depend on risk factor stability and the number of historical incidents, respectively.
2.3 Apex Risk Factors
The presence of Apex Risk Factors (ARF) significantly increases the magnitude of threat scenario risk. Their individual contribution outweighs the effect of all other risk factors combined. A risk factor can also qualify as an ARF if the contribution of other risk factors depends on its presence. In summary, the presence of an ARF disproportionately increases the magnitude of threat scenario risk.

It is possible to conduct the following thought experiment to determine if a risk factor is an ARF, recognizing that conferring ARF status to a risk factor is inevitably a judgment call. Begin by constructing a series of related threat scenarios consisting of various combinations of risk factors. A risk factor qualifies as an ARF if both of the following conditions exist: (a) the magnitude of a component of risk increases in every threat scenario where that risk factor is present and (b) removing that risk factor from a threat scenario significantly reduces the magnitude of a component of risk irrespective of the presence or absence of any other risk factors.

For example, consider the threat of being mugged. Location, and in particular a location with a historically high crime rate, is a risk factor for the likelihood of an attack. The time of day might also be a risk factor for the likelihood component of risk for this threat scenario. Certainly, an uptick in crime might be expected at night. However, the likelihood of a mugging in a historically low-crime area would likely be low during the night or day. The contention is that location outweighs the contribution of time of day to the magnitude of threat scenario risk as well as the contribution of any other risk factor. Hence, a location with a historically high crime rate is an apex risk factor for a mugging threat scenario.

A second example of an ARF is illustrated in fire-related threat scenarios. Absent a spark, a fire cannot ignite.
The presence of gasoline by itself has no effect on any component of risk for the fire threat scenario. A fire must have an ignition source, and gasoline merely accelerates burning following ignition. Similarly, the presence of dry materials such as paper or wood only increases the likelihood component of risk if there is a source of ignition. Wind increases the vulnerability component of risk by increasing the rate of spreading and hence the magnitude of loss. But wind in the absence of a spark to ignite a flame has no effect on the magnitude of risk for a fire threat scenario. In this threat scenario, the spark is the ARF, and gasoline, wind and dry materials represent ancillary risk factors.

A third example of an ARF involves Air France Flight 447, which crashed in the Atlantic Ocean on June 1, 2009. An extensive investigation, which included deploying a submarine to locate the wreckage, revealed that the aircraft was in a prolonged stall due in part to the absence of accurate air speed information. The pilots failed to recognize the stall condition until it was too late to recover. Air speed on an aircraft is measured via instruments called pitot tubes, which are designed to measure fluid flow. Pilots rely on knowledge of the air speed to ensure the
plane maintains sufficient lift, which results from the pressure difference between the upper and lower surfaces of the wing. In the case of Flight 447, the pilots lacked air speed information due to ice on the pitot tubes. The plane descended thousands of feet at a steep angle of attack, thereby preventing the wings from generating lift. A simple correction would have been to point the nose of the plane down to gain airspeed. Every trained pilot is aware of this simple recovery technique. However, by the time the pilots of Flight 447 realized the plane's condition it was too late, and the plane belly flopped on the ocean surface with predictably devastating effects.

The ARF in this threat scenario was the lack of air speed information caused by pitot tube malfunction due to ice build-up.3 Other risk factors were present, such as a less experienced pilot at the controls and bad weather. However, these risk factors are common in commercial aviation. By themselves they are unlikely to result in an adverse outcome. In other words, addressing these ancillary risk factors might have helped compensate for the effect of the ARF, but would not by themselves be expected to result in the outcome experienced by Flight 447.

Applying the aforementioned test, consider the following threat scenario variations for the Air France disaster. Each scenario consists of the risk factors with and without the obstructed pitot tube risk factor. A subjective evaluation of the effect on the magnitude of the likelihood component of risk accompanies each threat scenario variation, noting only three risk factors are evaluated:

Threat Scenario #1
1. Pitot tubes obstructed
2. Relatively inexperienced pilot
3. Inclement weather
Effect on Risk: Maximum effect on the likelihood component of threat scenario risk.

Threat Scenario #2
1. Relatively inexperienced pilot
2. Inclement weather
Effect on Risk: Minimal effect on the likelihood component of threat scenario risk.

Threat Scenario #3
1. Pitot tubes obstructed
2. Inexperienced pilot
Effect on Risk: Significant effect on the likelihood component of threat scenario risk.

Threat Scenario #4
1. Pitot tubes obstructed
2. Inclement weather
Effect on Risk: Significant effect on the likelihood component of threat scenario risk.
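The thought experiment above can be sketched as a toggle test over the scenario variants. The numeric scores are subjective placeholders mirroring the qualitative assessments in the text (3 = maximum, 2 = significant, 1 = minimal effect on likelihood):

```python
# Subjective likelihood scores for each risk-factor combination.
# The scores are placeholders, not measured values.
scores = {
    frozenset({"pitot obstructed", "inexperienced pilot", "bad weather"}): 3,
    frozenset({"inexperienced pilot", "bad weather"}): 1,
    frozenset({"pitot obstructed", "inexperienced pilot"}): 2,
    frozenset({"pitot obstructed", "bad weather"}): 2,
}

def is_apex(factor, scores, significant=2):
    """A factor qualifies as an ARF if every scenario containing it
    has a significant effect on risk, and every scenario lacking it
    does not, irrespective of the other factors present."""
    with_f = [s for combo, s in scores.items() if factor in combo]
    without_f = [s for combo, s in scores.items() if factor not in combo]
    return (all(s >= significant for s in with_f)
            and all(s < significant for s in without_f))

# Only the pitot tube obstruction passes the test:
assert is_apex("pitot obstructed", scores)
assert not is_apex("bad weather", scores)
```

The test inherits the subjectivity of the scores: as the text cautions, its outcome depends on an accurate delineation of the significant risk factors.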
3 https://en.wikipedia.org/wiki/Air_France_Flight_447
The presence of obstructed pitot tubes is the risk factor in common for scenarios with a significant effect on the magnitude of risk based on reasonable if inexact assessment criteria. An obstructed pitot tube condition qualitatively elevates the magnitude of risk (likelihood component only) in three of the four threat scenarios. In the one threat scenario variation that lacks the pitot tube risk factor, i.e., Threat Scenario #2, the effect on the magnitude of the likelihood component of risk is assessed to be minimal. The effect is a maximum when all three risk factors are present, i.e., Threat Scenario #1. The conclusion from this simple and admittedly subjective exercise is that the obstruction of the aircraft's pitot tubes is an ARF for this threat scenario.

Note again that risk factors other than those presented here likely contributed to the magnitude of risk, and any test for an ARF strongly depends on an accurate delineation of the significant risk factors. Note also that the principal risk factors are only coincident in Threat Scenario #1. In general and irrespective of the presence of an ARF, coincident likelihood risk factors result in the most significant effect on the likelihood component of risk. This condition is known as "confluence," and a fuller discussion of its effect is included in Chap. 7.

The last example of an ARF occurs in "phishing" threat scenarios. Phishing, or alternatively spear phishing, utilizes social engineering to gain unauthorized access to a computer network. One scheme is to send computer users a phony but realistic-looking email enticing them to visit a malicious web site and/or enter their user name and password. Once the user's authentication credentials are revealed, the attacker inherits the access privileges of the compromised user. The ARF in this case is naïve or inattentive computer users. Indeed, there are plenty of ancillary risk factors for the likelihood component of risk in this threat scenario.
These risk factors include numerous users with administrative privileges, liberal network access privileges and computer users with a history of visiting dodgy web sites. However, inattentive or careless users who facilitate unauthorized network access by providing their login credentials are the sine qua non for a successful attack. The fact that user inattentiveness and/or carelessness is an ARF for this threat scenario explains why threat awareness is generally recognized as a priority security control.

Note that an ARF does not necessarily affect the same component of risk as the ancillary risk factors. In the case of vehicle-borne explosive threat scenarios, distance from the explosive source to the target and the explosive payload are both ancillary risk factors. The magnitude of the vulnerability component of risk scales with each of these risk factors according to generally accepted physical models. Figure 7.3 in Chap. 7 graphically illustrates the combinations of payload and distance that result in constant values of overpressure upon detonation. Building damage/loss is caused by the overpressure resulting from the explosive shock wave. The most common security control used to reduce the vulnerability component of
risk for this threat scenario is the use of bollards and other types of barriers. These enforce a physical separation between the attacking vehicle and the target facility. However, the public profile of the organization occupying a targeted building is an ARF that affects the likelihood component of risk. The public profile would be considered an ARF since it represents the motivation for the bombing. In the absence of an enhanced public profile or some other connection to a cause célèbre, there is less or possibly no incentive for that particular facility to be attacked. The fact that the ARF and ancillary risk factors can affect different components of risk could have significant implications for a security risk management strategy. Security controls to address the likelihood component might have no effect on those affecting vulnerability and vice versa.
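The payload-distance combinations that yield constant overpressure follow the conventional Hopkinson-Cranz cube-root scaling law, which is one of the generally accepted physical models referenced above. The numbers below are illustrative:

```python
def scaled_distance(distance_m, charge_kg):
    """Hopkinson-Cranz scaled distance Z = R / W**(1/3).

    Equal values of Z imply equal peak overpressure at the target,
    which is why curves of constant overpressure can be drawn in the
    payload-distance plane."""
    return distance_m / charge_kg ** (1.0 / 3.0)

# Doubling the standoff distance requires an eightfold increase in
# payload to produce the same overpressure at the target:
z1 = scaled_distance(30.0, 100.0)
z2 = scaled_distance(60.0, 800.0)
assert abs(z1 - z2) < 1e-9
```

This cube-root relationship is what makes standoff distance, enforced by bollards and barriers, such an efficient control: modest gains in distance offset large increases in payload.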
2.4 Spatial Risk Factors
Spatial risk factors are features or conditions inherent to a threat scenario environment that increase the magnitude of one or more components of risk. Such environments constitute the "space" in a threat scenario, noting that the corresponding adjective, i.e., "spatial," in this context does not necessarily refer to a physical space. A software program, a building, and an IT network component could each be the locus of spatial risk factors depending on the threat scenario.

Consider a threat scenario where an automobile is the threat and a pedestrian is the affected entity. An obstacle that blocks a pedestrian's view of traffic would be a spatial risk factor. The obstacle belongs to the environment in which the threat and affected entity interact. In contrast, being blind is a behavioral risk factor for the same threat scenario, where this risk factor is inherent to the affected entity.

Next consider a threat scenario in which a rifle is the mode of attack. An entity's position relative to the direction of the gun barrel would be a spatial risk factor since it relates to the threat scenario environment, although it is a temporary configuration.4 Specifically, there is a solid angle and range of distances from the weapon where the magnitude of the vulnerability component of risk is increased. Outside that angle and range the weapon would have no deleterious effect on this specific entity. Note also that the spatial dependency of the vulnerability component of risk could be quite different depending on the particular weapon. For example, the spatial risk factor for a mortar would be more broadly distributed. In fact, the entire point of weapons that fire exploding ammunition is to decrease the sensitivity to position with respect to the vulnerability and impact components of risk. In other words, the objective is to affect as many victims as possible in a single threat incident.
4 Note that the affected entity's position or location relative to the weapon is not an inherent characteristic of the entity. Therefore, position/location would not be considered a behavioral risk factor for this threat scenario notwithstanding the fact that this feature relates to the affected entity.
Spatial risk factors can undergo discrete transitions in magnitude. A threat scenario discussed in detail in Chap. 7 is slipping on the subway platform and subsequently being struck by an oncoming train. It will become apparent that the vulnerability component of risk is essentially infinite across the width of the tracks yet is effectively zero everywhere else.5 The application of security controls is dictated by the location of boundaries in some spatial threat scenarios. Clearly, applying security controls outside the region where a component of risk is affected would be a waste of resources. If only finite resources were available, as is often the case, indiscriminately applying security controls would detract from areas where security controls are required and might actually be effective. Finally, another example of spatial risk factors is the vulnerability in a computer application.6 Again, this example might at first seem counter-intuitive because software vulnerabilities do not conform to traditional notions of “space.” As noted previously, an actual physical space is not required to qualify as a locus for spatial risk factors. The only requirements are a presence in the environment where a threat and affected entity interact and a resulting increase in the magnitude of one or more components of risk.
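The rifle example above reduces to a simple geometric test in two dimensions: an entity's vulnerability is nonzero only inside a cone defined by an angular spread and a maximum range. The 5-degree half-angle and 500-meter range below are hypothetical values chosen for illustration:

```python
import math

def in_danger_zone(entity_xy, muzzle_xy, aim_deg,
                   half_angle_deg=5.0, max_range_m=500.0):
    """True if the entity lies within the weapon's danger cone, i.e.,
    inside both the angular spread and the effective range. The
    half-angle and range defaults are hypothetical."""
    dx = entity_xy[0] - muzzle_xy[0]
    dy = entity_xy[1] - muzzle_xy[1]
    dist = math.hypot(dx, dy)
    if dist == 0 or dist > max_range_m:
        return False
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest angular offset from the aim direction, in [0, 180].
    offset = abs((bearing - aim_deg + 180) % 360 - 180)
    return offset <= half_angle_deg

# Directly downrange at 100 m: vulnerable.
assert in_danger_zone((100, 0), (0, 0), aim_deg=0)
# Ninety degrees off-axis: outside the cone, no effect.
assert not in_danger_zone((0, 100), (0, 0), aim_deg=0)
```

A mortar could be modeled with a much larger half-angle, reflecting the text's point that exploding ammunition deliberately reduces the sensitivity of vulnerability to position.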
2.5 Temporal Risk Factors
As the name suggests, temporal risk factors are those whose effect is a function of their duration, which can be long-lived, short-lived, and/or intermittent. Temporal risk factors can be present in either affected entities or threat scenario environments. Two sub-categories of temporal risk factors exist. Although there is significant variation in each sub-category, their respective behaviors can increase the magnitude of risk albeit in different ways. The first category corresponds to risk factors that maintain a prolonged presence within a threat scenario and whose magnitudes are relatively constant. These risk factors are called persistent.
5 The width of the track is a spatial risk factor for this threat scenario but only during a critical time period. This period is defined by the time interval between a fall onto the track and the maximum time required to decelerate to zero speed after an approaching train applies the brakes. Spatial and temporal risk factors in combination affect the likelihood and vulnerability components of risk for this threat scenario.
6 In the context of information security threat scenarios these risk factors are specified as vulnerabilities in accordance with an accepted identification scheme known as Common Vulnerabilities and Exposures (CVE).
The second temporal sub-category belongs to risk factors that intermittently appear and disappear or fluctuate in magnitude. These ephemeral risk factors are referred to as transient. Fluctuating and ephemeral risk factors contribute to uncertainty in security risk management, which increases threat scenario complexity and hence the magnitude of the likelihood component of risk. The delineation between persistent and transient is not exact and will depend on the context. Both short and long duration risk factor time intervals can be risk-relevant, which will become apparent when we examine the subway-related threat scenario in more detail in Chap. 7. In that threat scenario, time, position and platform slipperiness combine to increase the magnitude of the likelihood component of risk.

Excellent examples of persistent risk factors are vulnerabilities that linger in an IT environment. These vulnerabilities are typically remedied by the application of patches. Security patching of applications hosted on premises is an ongoing housekeeping chore of IT departments and a strong motivator to migrate to the Cloud. The persistence of such vulnerabilities could have enterprise-level implications as discussed in Chap. 10.

A key reason why a rapidly changing risk factor amplifies the likelihood component of risk is the potential lag in applying security controls. The magnitude of risk is enhanced during the lag period. Any inherent uncertainty in the value of a risk factor would be exacerbated during fluctuations. If multiple fluctuating risk factors are present, the uncertainty would potentially be amplified with a resulting increase in the likelihood component of risk.

Ultimately, the importance of determining a risk factor category or type is rooted in the application of security controls. For example, consider the threat scenario of overexposure to the sun.
The time of day is a temporal risk factor, where the solar intensity at the surface of the earth fluctuates hour-by-hour, which affects the magnitude of the vulnerability component of risk. Multiple controls are available for this threat scenario: avoiding the sun, covering up with clothing and/or applying sunscreen. In order to be effective in reducing the magnitude of risk, the control is best applied when the intensity of the sun demands it. For example, applying sun block after dark would be entirely superfluous. The late morning/early afternoon is the time to lather up since this is the time when sunscreen is needed most.

Finally, the fluctuations of risk factors relative to security controls are risk-relevant. In Chap. 3 we will discuss static and dynamic threat scenarios, which are so designated based on differences in the relative time rates of change of risk factors and security controls. Static and dynamic threat scenarios play an important role in the theory of security risk assessment, and in particular with respect to the stability of complex threat scenarios.
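The persistent/transient distinction can be sketched by comparing a risk factor's fluctuation to its mean level over time. The coefficient-of-variation cutoff below is an arbitrary illustrative threshold, not a value from the text:

```python
from statistics import mean, pstdev

def classify(samples, cv_threshold=0.25):
    """Label a risk factor 'persistent' if its magnitude is relatively
    constant over time, 'transient' if it fluctuates. The 0.25
    coefficient-of-variation cutoff is an illustrative assumption."""
    m = mean(samples)
    if m == 0:
        return "transient"
    return "persistent" if pstdev(samples) / m <= cv_threshold else "transient"

# An unpatched vulnerability lingers at a near-constant level...
assert classify([1.0, 1.0, 0.95, 1.0, 0.9]) == "persistent"
# ...while hourly solar intensity swings widely over a day.
assert classify([0.0, 0.2, 0.9, 1.0, 0.6, 0.1]) == "transient"
```

In practice the delineation is contextual, as the text notes: the sampling interval and the threshold both depend on the threat scenario being assessed.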
2.6 Behavioral Risk Factors
Behavioral risk factors are features or actions of affected entities that increase the magnitude of one or more components of threat scenario risk. As a practical matter, the affected entities are often humans, but belonging to our species is not a requirement. An example of a behavioral risk factor in information security threat scenarios is the propensity for computer users to visit high-risk web sites. As noted previously, blindness is a behavioral risk factor for threat scenarios related to crossing the street. Behavioral risk factors can be particularly difficult to address due to the inherent nature of human behavior. The problem is not that humans are unpredictable; the field of psychology is based on established patterns of human behavior. The issue is that humans either ignore risk-relevant conditions and/or subvert controls to address such conditions. Perhaps even more insidious is that individuals tend to act in ways contrary to their best interests and/or the best interests of their colleagues. Behavioral risk factors disproportionately affect the magnitude of risk in threat scenarios where humans exert a significant influence on the risk profile. Consider the effect on the likelihood component of risk for an airplane crash threat scenario if each passenger could influence the operation of the plane! An IT network with many computer users is an example of such a threat scenario. Behavioral risk factors for malware threat scenarios include a propensity by computer users to visit dodgy web sites, use of low-complexity passwords and clicking on hyperlinks embedded in emails from unknown sources. Each of these actions is a behavioral risk factor for information compromise threat scenarios. System administrators have the unenviable task of addressing user requirements in the face of such behaviors. Behavioral risk factors are sometimes de-emphasized or even overlooked in security risk management strategies. 
Again, this phenomenon is common in information security threat scenarios. Often less attention is paid to risk factors associated with the individuals who use technology than to the technology itself. We note once again for emphasis that the behavior of threats is not risk-relevant. Risk factors do not apply to threats, whose threatening characteristics exist independent of the context, i.e., the particular threat scenario. However, the magnitude of their effect does depend on context as we observed in the case of the great white shark, which is harmless on land but becomes an apex predator in the ocean.

Behavioral risk factors are common in medicine. Causal relationships exist between specific diseases and the behavior of patients. For example, smoking and obesity are behavioral risk factors for multiple diseases such as heart disease, diabetes, and emphysema, just to name a few.

Behavioral risk factors can often be linked to the organizational culture. Yet despite the importance of cultural influences, security controls and/or specific control settings are typically the focus of security risk assessments rather than the root causes of risk. Ultimately, every control setting can be traced to policy, which reflects an organizational culture that either reinforces the need for security or covets convenience, two countervailing philosophies.
Security controls are designed to limit behavior. Therefore, they can be antithetical to a culture that eschews inconvenience. For example, firewall settings that control network segmentation reflect the corporate culture with respect to information access. Organizations that exercise tighter control over information will typically have more segmented IT networks, where access is determined by a legitimate “need-to-know.” The IT department merely implements such settings based on the organization’s tolerance for risk.
2.7 Complexity Risk Factors
The magnitude of threat scenario complexity is inversely proportional to the probability that a threat scenario is in a specific state, where each state consists of a unique set of managed and unmanaged risk factors. The lower the probability the greater the complexity. There are two risk factors for threat scenario complexity: the number of risk factors and the uncertainty in security risk management, i.e., the application of security controls to risk factors. Complexity is itself a risk factor for the likelihood component of risk, and complexity risk factors can exist in any threat scenario especially those with numerous risk factors. A model for threat scenario complexity is presented in Chap. 9.
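The definition above can be sketched under a simplifying assumption: if each of n risk factors is independently managed or unmanaged with equal probability, a threat scenario has 2^n equally likely states, and complexity can be scored as the negative log-probability of any one state. The uniform-probability assumption is illustrative, not from the text:

```python
import math

def state_probability(n_risk_factors):
    """Probability of one specific state, assuming each risk factor is
    independently managed or unmanaged with probability 0.5. This
    uniform assumption is a simplification for illustration."""
    return 0.5 ** n_risk_factors

def complexity_bits(n_risk_factors):
    """Complexity scored as -log2(p): the lower the probability of a
    specific state, the greater the complexity."""
    return -math.log2(state_probability(n_risk_factors))

# Adding risk factors lowers the probability of any given state and
# therefore raises complexity:
assert state_probability(3) == 0.125
assert complexity_bits(10) > complexity_bits(3)
```

Uncertainty in security risk management would make the state probabilities non-uniform, which this sketch ignores; the full model is the subject of Chap. 9.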
2.8 Inter-related Risk Factors
Inter-related risk factors are risk factors that have a symbiotic effect on each other and thereby affect the overall magnitude of threat scenario risk. Information security threat scenarios again provide instructive examples. The Open Systems Interconnection (OSI) model is commonly used to characterize IT environments. This model partitions an IT environment into abstraction layers, where each layer serves the layer above it and is served by the layer below it. Information security risk factors can exist within and across the seven layers of the OSI model. Moreover, each layer can possess unique risk factors for information compromise. But because the layers are inter-related, the risk factors within each layer are frequently related.

As noted in the discussion on behavioral risk factors, user behavior can be a significant contributor to information security risk. Individuals who demonstrate promiscuous on-line behavior increase the likelihood of information compromise for themselves and for others connected to the same network. Moreover, the contemporaneous presence of multiple likelihood risk factors, inter-related or not, has an exponential effect on the magnitude of likelihood as noted previously.

Consider the following scenario: An organization operating a global IT network is the focus of relentless media attention that depicts the organization in a
consistently bad light. The IT network contains multiple shared network drives that store highly confidential information. These drives are protected by weak passwords, and are accessible by many IT users who themselves have weak passwords and possess liberal administrative rights. These same IT users regularly visit high-risk web sites. In addition, some network drives containing confidential information are accessible from the Internet.

Such a threat scenario represents a perfect storm of information security risk due to a confluence of likelihood risk factors. Some of these risk factors are related, e.g., shared network drives that are accessible from the Internet and the public profile of the organization. So it would be especially important to address publicly accessible information assets in light of the presence of relevant risk factors. Both the likelihood and vulnerability components of risk for information compromise are enhanced by weak security controls and public access to internal network resources. A tangled web of information technology exists in most modern IT networks.

Inter-related risk factors also contribute to threat scenario uncertainty, a significant contributor to complexity, which is itself a risk factor for the likelihood component of risk. Spatial and temporal risk factors in particular can have a symbiotic relationship and thereby increase risk. We have discussed the potential increase in the magnitude of risk that can occur when risk factors are present for either long or short time intervals. A short time interval can increase the magnitude of risk if there is a narrow window of opportunity to react, especially if it is also spatially confined and therefore difficult to access or pinpoint. The subway threat scenario described previously would qualify as such a threat scenario.
On the other hand, a persistent risk factor can also increase the likelihood component of risk due to confluence or because there are increased opportunities for a threat to manifest itself.
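The confluence effect described above can be sketched by combining independent per-factor attack probabilities. Both the per-factor values and the independence assumption are illustrative:

```python
def combined_likelihood(probabilities):
    """Probability that at least one compromise path succeeds when
    the risk factors act independently: 1 - product of (1 - p_i)."""
    remaining = 1.0
    for p in probabilities:
        remaining *= (1.0 - p)
    return 1.0 - remaining

# Hypothetical per-factor contributions: weak passwords, liberal
# administrative rights, risky browsing, Internet-facing drives.
factors = [0.10, 0.15, 0.20, 0.25]

single = combined_likelihood(factors[:1])   # ~0.10
confluent = combined_likelihood(factors)    # ~0.54
```

Under this simple model the contemporaneous presence of all four factors more than doubles the largest single contribution, consistent with the "perfect storm" characterization; symbiotic (non-independent) risk factors would push the combined likelihood higher still.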
2.9 Risk Factor Scale and Stability
The scale and stability of risk factors are of great significance to the theory of security risk assessment. What is meant by scale and stability? Risk factor scale refers to how the magnitude of a risk factor grows or decays as a function of some risk-relevant parameter, e.g., time or position, across the range of values of that parameter. Risk factor stability refers to a risk factor's time rate of change, and specifically whether it is relatively stationary in time. The stability of a threat scenario directly relates to the stability of its constituent risk factors. The characteristic common to both scale and stability is change, or more precisely, the effect of change.

Assessing security risk and its natural follow-on activity, security risk management, often entails determining how risk factors "behave" within a threat scenario. Both the scale and stability of a risk factor can affect the application of security controls. The scale determines the range of relevant risk factor values, and the stability relates to how rapidly risk factors are changing. Both effects are risk-relevant since security controls must be applied in proportion to the magnitude of risk in both space and time, i.e., they must accurately address the relevant risk factor(s), which may be rapidly changing and/or assuming a range of values.

An examination of risk factor scaling is crucial to a comprehensive security risk assessment. If the assessment merely examines a few values of a multi-valued risk factor, only a limited view of the magnitude of threat scenario risk is available. For example, if the value of a risk factor scales non-linearly with a parameter such as time or position, the implications of ignoring the values at the upper end of the range could be significant.

Furthermore, risk factor scale and stability could strongly influence the perspective of the assessment. A narrow perspective offers greater opportunities to miss or misinterpret a threat scenario feature. For example, a narrow view of a dynamic threat scenario, i.e., one where the time rate of change of a risk factor exceeds the time rate of change of the relevant security control, might prejudice an assessment result depending on when and for how long the scenario is examined. If risk factor fluctuations are rapid, an assessment result could differ markedly from one time interval to the next. On the other hand, if these fluctuations are slow, examining too narrow a time interval or a limited range of values might not reveal them at all.

The stability of risk factors has particularly significant implications for the likelihood component of risk. The importance of stability derives from its effect on formulating a probability distribution of similar threat incidents. Such a distribution is a requirement for calculating the probability of a future threat incident.
We can be confident in the magnitude of cardiovascular risk for Joe, the overindulgent couch potato, because of the existence of a probability distribution consisting of other humans who have indulged in similar behavior. A prerequisite for formulating a probability distribution is similarity among the elements of that distribution. This condition does not imply that entities within the distribution population must be identical. In fact, probability distributions inherently differentiate among their constituent elements. But the constituent elements must be comparable. For example, a probability distribution of individuals with varying hair color is presented in Chap. 5. Clearly, not every individual in the distribution has the same hair color. The entire point of formulating such a distribution is to identify the frequency with which each distinct color appears in the sample population, and thereby generalize about the parent population. But every element of the distribution is a human with hair, thereby inviting comparison. Such generalizations represent the essence of statistical sampling. Importantly, if the elements in the probability distribution are threat incidents, the magnitude of the risk factor(s) that lead to or precipitate those threat incidents cannot vary significantly over risk-relevant time scales. In other words, the risk factors must be stable.
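The frequency-counting logic behind such a distribution can be sketched in a few lines of Python; the hair-color sample below is invented purely for illustration:

```python
from collections import Counter

# Hypothetical sample population (invented for illustration).
sample = ["brown", "black", "brown", "blond", "red",
          "brown", "black", "blond", "brown", "black"]

counts = Counter(sample)
n = len(sample)

# Relative frequency of each distinct color in the sample,
# used to generalize about the parent population.
distribution = {color: count / n for color, count in counts.items()}

for color, freq in sorted(distribution.items()):
    print(f"{color}: {freq:.2f}")
```

Every element is comparable (a human with hair), yet the distribution differentiates among elements by color, which is exactly the point of statistical sampling.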
2 Risk Factors
Consider the threat of terrorism. Unfortunately, there is no shortage of terrorism threat incidents. Yet despite their prevalence, a probability distribution of threat incidents is typically elusive. This situation exists principally because of threat scenario instability, i.e., risk factor fluctuation, which precludes the formulation of a probability distribution of similar threat incidents.

By comparison, statistics on diseases abound because human anatomies and physiologies are similar enough to formulate probability distributions consisting of like entities, i.e., humans. Therefore, risk factors for specific diseases can be identified based on historical data. Recall that Fig. 1.3 in Chap. 1 showed the probability of heart disease as a function of the number of risk factors present in study participants. The probability curves in that figure were presumably based on studies of large sample populations. The point is that generalizations about the likelihood of anyone developing heart disease are possible from these curves because human anatomy and physiology do not meaningfully vary from person to person and also remain relatively stable over time. Other factors that might influence the data have presumably been accounted for so that the observed effects can be confidently attributed exclusively to confirmed risk factors for cardiac disease.

Let's make the reasonable assumption that weak passwords are a risk factor for unauthorized access to computer applications. Furthermore, suppose we want to examine the magnitude of information loss assuming a particular password complexity policy is in effect, and to know the likelihood of successful account compromises as a function of password complexity. The complexity potentially changes each time a user resets his or her password. This requirement sets the limit on the time interval for which threat incidents and associated threat scenarios involving password compromises would be considered similar.
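As a rough illustration of likelihood as a function of password complexity, consider a toy model in which an attacker guesses blindly against a uniformly random password. This is not the author's model, merely a sketch of how complexity (alphabet size and length) bounds the likelihood component of risk; the guess budget is an arbitrary assumption:

```python
def compromise_probability(alphabet_size: int, length: int, guesses: float) -> float:
    """Toy estimate: probability that a uniformly random password is
    found within a given number of guesses (capped at 1)."""
    keyspace = alphabet_size ** length
    return min(1.0, guesses / keyspace)

# Hypothetical comparison: 8 lowercase letters vs. 12 mixed-case
# letters plus digits, against a budget of 10^10 guesses.
weak = compromise_probability(26, 8, 1e10)
strong = compromise_probability(62, 12, 1e10)
print(f"weak policy:   {weak:.3e}")
print(f"strong policy: {strong:.3e}")
```

The point is qualitative: each increment in required complexity shrinks the likelihood component of risk multiplicatively, which is why the complexity policy defines the relevant threat scenario.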
That limit would correspond to the duration of an individual password for any given account, assuming passwords conform to policy. For other threats the temporal limits on risk factor relevance are not so well delineated. We shall see that this limit is important in assessing security risk since it determines threat scenario longevity and hence an estimate of the time interval for threat incident similarity.

Is it possible to identify general limits on risk factor invariance and thereby specify temporal limits on threat scenarios? Such a limit would determine when threat incidents can be included in the same probability distribution. In other words, is there a metric for risk factor similarity, and hence for the similarity of threat incidents that result from those risk factors?

As a practical matter, the rate of risk factor fluctuation must not exceed the frequency of security risk assessments. Therefore, and at a minimum, the risk factors must be stable over the time interval defined by the assessment frequency. Although this metric is operationally valid, it is less than satisfying since it is subject to the variability of the security risk management process. A metric that is inherent to risk factors is preferable. The objective is to establish a general time interval for risk factor stability, and therefore be able to generalize about the time duration of threat scenario/threat incident similarity. Foreshadowing the discussion in Chap. 8, if and only if a risk factor is a variable subject to random fluctuations, the correlation-time function can be used to measure the characteristic time required for a risk factor to become statistically independent of its initially measured value. Since threat incidents result from risk factors, the correlation-time function can be used to infer risk factor/threat scenario stability, but only in that very limited context.
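The correlation-time idea can be illustrated numerically. The sketch below estimates the sample autocorrelation of a synthetic, randomly fluctuating "risk factor" series (an AR(1) process, invented purely for illustration); the lag at which the autocorrelation decays toward zero indicates the characteristic time over which the series becomes independent of its initial value:

```python
import random

def autocorrelation(series, lag):
    """Sample autocorrelation of a time series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag)) / (n - lag)
    return cov / var

random.seed(1)
# Synthetic fluctuating risk factor: an AR(1) process, so correlation
# decays with lag and the series gradually "forgets" its starting value.
x, series = 0.0, []
for _ in range(5000):
    x = 0.9 * x + random.gauss(0.0, 1.0)
    series.append(x)

print(autocorrelation(series, 1))   # strong correlation at short lag
print(autocorrelation(series, 50))  # near zero at long lag
```

The lag beyond which the estimate is indistinguishable from zero plays the role of the stability interval discussed above, subject to the same caveat: the method applies only when the risk factor genuinely behaves as a random variable.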
2.10 Summary
Risk factors increase the magnitude of one or more components of threat scenario risk. Therefore, assessing the contribution of each risk factor to the magnitude of risk is essential to conducting a comprehensive and rigorous security risk assessment.

There are five risk factor categories: apex, spatial, temporal, behavioral and complexity. Apex risk factors dominate a threat scenario based on their disproportionate contribution to the magnitude of risk or their effect on other risk factors. Although they have their own category, apex risk factors must belong to at least one other risk factor category since they do not possess identifiable characteristics other than their effect on the magnitude of risk.

Spatial risk factors are features of a threat scenario environment that increase the magnitude of risk. The distribution, i.e., concentration or proliferation, of risk factors can affect the magnitude of risk and point to systemic security risk.

Two sub-categories of temporal risk factors exist: persistent and transient. In the former, the duration affects the magnitude of risk due to increased opportunities for threat incidents or the effects of confluence. Transient risk factors affect the magnitude of risk due to either intermittent appearance and disappearance or periodic and aperiodic fluctuations in magnitude. Intermittency and fluctuations in magnitude can increase the uncertainty in applying security controls to risk factors, which contributes to complexity. In general, the magnitude of security risk is a maximum when risk factors and security controls are non-coincident and/or when likelihood risk factors are coincident.

The behavior and/or features of affected entities can have a profound effect on the magnitude of one or more components of threat scenario risk. Behavioral risk factors are sometimes the most difficult to manage, especially when humans intentionally circumvent security controls. Importantly, behavioral risk factors often result from, or are the by-product of, organizational culture. That culture is often baked into the fabric of the organization. Therefore, implementing even a seemingly small change in organizational culture might require a commitment by senior executives.

Finally, threat scenario complexity is inversely proportional to the probability that a threat scenario exists in a particular state consisting of managed and unmanaged risk factors. There are two risk factors for threat scenario complexity: the number of risk factors and the uncertainty in security risk management.
Chapter 3
Threat Scenarios
3.1 Introduction
Threat scenarios establish the context for security risk assessments. Their canonical structure and the relationships among their structural elements lead to a universal assessment process and drive the requirement for security controls. Therefore, threat scenarios are not abstractions without connection to the real world. Specifically, we know that all threat scenarios consist of three elements:

1. Threats
2. Entities affected by threats
3. The environment where a threat and affected entity interact

Although 1, 2 and 3 represent the elements of every threat scenario, their presence does not imply all threat scenarios are the same. It is obvious there are significant variations among threat scenarios, as evidenced by the breadth of threats we confront each day. Many threat scenarios such as natural disasters and crimes against humanity are clearly deleterious, and therefore conform to the definition of a threat given in Chap. 1. However, recognize that opinions can vary on what actually constitutes a threat.

Consider the hyper-popular pastime of watching television. Many people could barely live without this activity. Yet, for other folks TV is a profound annoyance or worse. Figure 3.1 shows the results of a study of the effects of watching television on grade point average (GPA). The results show these two parameters are modestly
© Springer Nature Switzerland AG 2019 C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_3
Fig. 3.1 Correlation of student GPA and hours of television viewing. High school GPA (vertical axis, 1.0–4.5) is plotted against hours of television viewing during high school (horizontal axis, 0–50), with fitted trend line y = –0.0288x + 3.4397 (R² = 0.2373)
anti-correlated.1 In other words, the more hours of television watched, the lower the student's GPA. Of course, other activities might contribute to a lower GPA given the breadth of distractions that are readily available today. In addition, a correlation between hours watched and lower GPAs does not necessarily mean television causes low GPAs. Correlations and correlation coefficients are discussed in Chap. 6.

Parents must draw their own conclusions on what qualifies as a threat scenario if their children are the affected entities. Each parental unit has presumably established its own tolerance for risk. Some parents might be willing to accept lower grades from their progeny if the latter are preoccupied with practicing a musical instrument. They might be considerably less tolerant if television is believed to be the reason for inferior academic performance. A security risk management strategy ultimately reflects the subjectivity inherent to perceptions of "loss" and "gain."

In addition to identifying actual threats, a security risk assessment must identify the spectrum of distinct threat scenarios. In other words, an assessment must explicitly delineate the three elements of each threat scenario and thereby distinguish those that are unique. This observation might seem gratuitous, but the difference between threat scenarios is not always obvious. A rigorous assessment requires a detailed analysis of the threat scenario elements and risk factors. For example, no one would confuse petty theft with assault and battery. Likewise, distinguishing between terrorism and shoplifting threat scenarios would not be difficult. Yet distinct terrorism threat scenarios are often lumped into the same category despite significant variations in their respective elements. If the risk factors are not
1 A. Hershberger, The Evils of Television; The Amount of Television Viewing and Student Performance Levels, Indiana Undergraduate Research Journal, Vol. 5, 2002.
disambiguated, there is a real danger of applying ineffective security controls. Consider the acts of terrorism committed by Charles Whitman versus those perpetrated by Osama bin Laden.

There is a simple test for threat scenario distinctness. Namely, two threat scenarios are distinct if their respective risk factors differ. Conversely, two threat scenarios are identical if they possess the same risk factors. In any case, threat scenario risk factors must always be identified, if not actually addressed, if theory and practice are to be in any way related.

Threat scenarios can be organized according to the relationship between risk factors and security controls and the type of risk factors present. The risk factors and/or their behavior determine the threat scenario category. Specifically, the relative time rate of change of risk factors and security controls is a risk-relevant threat scenario feature. If security controls lag changes to relevant risk factors, the magnitude of one or more components of risk can be affected since the risk factor is likely unaddressed during the relevant time interval. Recall the two sub-categories of temporal risk factors discussed in Chap. 2:

• Transient risk factors intermittently appear and disappear or significantly fluctuate in magnitude
• Persistent risk factors maintain a prolonged presence

Transient risk factors affect the magnitude of threat scenario risk as a result of changing behavior. In contrast, persistent risk factors have a cumulative effect on the magnitude of risk. We will see that the magnitude of the likelihood component of risk increases with the number of contemporaneous likelihood risk factors, i.e., risk factor confluence. Although transient risk factors contribute to uncertainty in the application of security controls, i.e., security risk management, persistent risk factors are more likely to result in confluence.
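The distinctness test can be expressed as a simple set comparison. The risk factor names below are hypothetical, chosen only to illustrate the mechanics:

```python
def distinct(scenario_a: set, scenario_b: set) -> bool:
    """Two threat scenarios are distinct if and only if their
    respective risk factor sets differ."""
    return scenario_a != scenario_b

# Hypothetical risk factor sets, for illustration only.
lone_actor = {"radicalization", "access to weapons", "soft target"}
organized_cell = {"radicalization", "external funding", "soft target"}

print(distinct(lone_actor, organized_cell))   # True: different risk factors
print(distinct(lone_actor, set(lone_actor)))  # False: identical risk factors

# Shared risk factors indicate where common security controls might apply.
print(sorted(lone_actor & organized_cell))
```

Representing each scenario explicitly by its risk factors makes the disambiguation step above mechanical rather than a matter of labels such as "terrorism."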
3.2 Static Threat Scenarios
A static threat scenario is characterized by the following two conditions: the time and positional rates of change of a security control are greater than or equal to the time and positional rates of change of the relevant risk factor. These conditions can be compactly written as temporal and spatial derivatives (see Chap. 6, Sect. 6.6), where C is a security control, R is the magnitude of a risk factor, x is distance and t is time:

dC/dt ≥ dR/dt (3.1)

and,

dC/dx ≥ dR/dx (3.2)
In other words, in a static threat scenario security controls are updated at least as frequently as spatial and temporal changes to the relevant risk factors. Of course, just because security controls are appropriately updated, it does not mean they are adequately addressing the magnitude of risk. Ensuring security controls are in sync with risk factor changes is a necessary but not sufficient condition for managing threat scenario risk.
3.3 Dynamic Threat Scenarios
As one might expect, dynamic threat scenarios are those where there is a lag in security control updates relative to changes in a relevant threat scenario risk factor. Specifically, dynamic threat scenarios are defined as follows: the time or positional rate of change of a security control is less than the time or positional rate of change of the relevant risk factor. Stated in the language of calculus, with the variables the same as above,

dC/dt < dR/dt (3.3)

or,

dC/dx < dR/dx (3.4)
Clearly, this condition has potential implications for the magnitude of risk during the time interval or at the location in which a security control lags the relevant risk factor. Dynamic threat scenarios can be characterized by an increase in complexity due to an increase in uncertainty in security risk management. For these reasons, identifying a dynamic threat scenario is highly risk-relevant.
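Conditions (3.1) and (3.3) suggest a simple numerical check: compare the rate at which a security control is updated with the rate at which the relevant risk factor changes. The rates below are hypothetical, and the comparison is a sketch of the classification criterion rather than a complete assessment method:

```python
def classify(control_rate: float, risk_factor_rate: float) -> str:
    """Static if the control changes at least as fast as the risk factor
    (condition 3.1); dynamic if the control lags (condition 3.3)."""
    return "static" if control_rate >= risk_factor_rate else "dynamic"

# Hypothetical rates of change (updates per month), for illustration:
# e.g., signatures updated daily vs. malware variants appearing hourly.
print(classify(control_rate=30.0, risk_factor_rate=720.0))  # dynamic
print(classify(control_rate=30.0, risk_factor_rate=12.0))   # static
```

The equality case counts as static, mirroring the "greater than or equal to" condition in (3.1), though as noted above keeping pace with a risk factor is necessary but not sufficient for managing the risk.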
3.4 Behavioral Threat Scenarios
As discussed in Chap. 2, behavioral risk factors are behaviors or features of an affected entity that increase the magnitude of threat scenario risk. A behavioral threat scenario is simply a threat scenario that contains behavioral risk factors. It qualifies as a separate category because of the unique nature of these risk factors and the requirement for specific security controls.

It seems obvious that the magnitude of risk for threat scenarios with numerous affected entities is potentially greater than for those with fewer entities. The magnitude is disproportionately increased for threat scenarios with numerous entities that can themselves affect the efficacy of security controls. A networked IT environment is one example of such a threat scenario, which is one reason such scenarios are difficult to manage.
3.5 Complex Threat Scenarios
As with behavioral threat scenarios that are defined by the specific category of risk factor present, a threat scenario is complex if it possesses complexity risk factors. In view of its prevalence and potential effect on the magnitude of the likelihood component of risk, complexity must be assessed in any comprehensive security risk assessment. The magnitude of complexity is agnostic to the types of risk factors present in a threat scenario. In other words, the magnitude of complexity is sensitive to the number of risk factors but not their particular category. As noted previously, threat scenario features that affect the magnitude of complexity are the number of risk factors and the uncertainty in security risk management. A model for complexity is presented in Chap. 9.
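Foreshadowing the Chap. 9 model, a toy version of "complexity as the inverse of a state probability" can be sketched as follows. The independence assumption and the probability p that any given risk factor is managed are my simplifications for illustration, not the book's model:

```python
def state_probability(n_managed: int, n_unmanaged: int, p_managed: float) -> float:
    """Probability of one specific configuration of managed/unmanaged
    risk factors, assuming each is independently managed with
    probability p_managed."""
    return (p_managed ** n_managed) * ((1 - p_managed) ** n_unmanaged)

def complexity(n_managed: int, n_unmanaged: int, p_managed: float) -> float:
    """Toy complexity measure: inverse of the state probability."""
    return 1.0 / state_probability(n_managed, n_unmanaged, p_managed)

# More risk factors and more uncertainty (p near 0.5) yield higher complexity.
print(complexity(2, 1, 0.9))   # few factors, confident management
print(complexity(5, 5, 0.5))   # many factors, maximal uncertainty
```

Even this crude sketch exhibits the two complexity risk factors named above: complexity grows with the number of risk factors and peaks when management uncertainty is greatest, regardless of which risk factor categories are involved.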
3.6 Random Threat Scenarios
If a threat incident is a random variable, the threat scenario from which it originates is a random threat scenario. Randomness can result from two threat scenario conditions:

1. Incoherence resulting from multiple likelihood risk factors
2. The absence of likelihood risk factors, so that threat incidents occur spontaneously and without provocation

Let's discuss each of these conditions in more detail. As its name implies, a random variable is a variable that assumes values at random, i.e., its values are not predictable. The outcomes of coin and die tosses are well-known examples of random variables. A coin toss has two possible outcomes and a die toss has six. If the coin and die are fair, their respective outcomes are equally likely and therefore not knowable a priori. The following quote helps clarify this important point2:

Rather than attempting a mathematical definition, we make use of the intuitive concept of randomness, exemplified by the shuffling of cards, the mixing of numbered chips in a bowl and similar manipulations. Such procedures are known as random processes. Their function is to exclude any form of bias, such as a conscious or even unconscious process of discriminatory selection, on the part of the individual, or the effects of gradual shifts in the measuring apparatus.
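The absence of bias in a random process can be demonstrated empirically. The sketch below simulates a fair die and confirms that each outcome occurs with roughly equal frequency, i.e., no outcome is privileged a priori:

```python
import random

random.seed(42)
N = 60000

# Simulate a fair die: outcomes are equally likely, hence not
# knowable a priori, the defining property of a random process.
counts = {face: 0 for face in range(1, 7)}
for _ in range(N):
    counts[random.randint(1, 6)] += 1

for face, count in counts.items():
    print(face, count / N)  # each frequency near 1/6
```

A risk factor would spoil this picture by skewing the frequencies toward particular outcomes, which is precisely the argument made next.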
A risk factor introduces bias to a process. In other words, it influences the outcome of that process, thereby increasing the likelihood of a particular outcome(s). For this reason, a likelihood risk factor cannot be a random variable; a likelihood risk factor that is a random variable is a contradiction in terms. However, we conveniently ignore that contradiction in applying the correlation-time function in Chap. 8.
2 J. Mandel, The Statistical Analysis of Experimental Data, Dover, New York, 1964.
However, the net effect of multiple risk factors could be incoherence, where a threat incident behaves as a random variable. In essence, the net effect of incoherent risk factors is to "randomize" the resulting threat incidents. The analyses in Chap. 8 are predicated on the assumption that a threat incident is a random variable. This highly simplifying assumption is an admittedly expedient version of reality. The results are meant to illustrate the power of stochastic reasoning and thereby transform otherwise intractable problems into straightforward mathematical exercises. We are sacrificing realism for the sake of convenience and therefore must be cautious in interpreting results so derived.

In contrast with likelihood risk factors, there is nothing to preclude a vulnerability or impact risk factor from being a random variable. Vulnerability and impact risk factors increase the magnitude of loss and importance, respectively. Loss and importance risk factors can assume any value appropriate to the scenario, random or otherwise, and still preserve their status as risk factors. Therefore, an assumption of randomness for vulnerability and impact risk factors is not a contradiction. The probability of protection method discussed in Chap. 8 is based on the premise that a vulnerability risk factor(s) is a normally distributed random variable.

Note that a threat scenario with only one likelihood risk factor cannot result in a threat incident that is a random variable. If only one likelihood risk factor is present, the threat scenario is not subject to the randomizing effects of other risk factors since there are no other risk factors. Therefore, the only way a threat incident can be a random variable in a threat scenario with only one likelihood risk factor is if that risk factor is itself a random variable, which we know is a contradiction.

The second condition for a random threat scenario is where threat incidents happen spontaneously, without provocation or stimulus.
This situation can only occur in the absence of likelihood risk factors. Such scenarios are rare but possible. For example, radioactive decay causing biological damage is one such threat scenario. In this case we define a threat incident as the emission of ionizing radiation, e.g., alpha and beta particles and gamma rays. These emissions occur spontaneously, and their arrival times obey Poisson statistics. There can be no influence on emissions, and the rate of decay depends exclusively on the type and amount of radioactive material present.
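The Poisson character of spontaneous emissions can be illustrated with a short simulation. The decay rate below is hypothetical; the sketch draws exponentially distributed inter-arrival times, the defining property of a Poisson process, and checks the Poisson signature that the mean and variance of the counts per interval are approximately equal:

```python
import random

random.seed(7)
RATE = 4.0  # hypothetical mean emissions per unit time

def emissions_in_interval(rate: float, duration: float) -> int:
    """Count spontaneous emissions in one interval by summing
    exponentially distributed inter-arrival times (a Poisson process)."""
    t, count = 0.0, 0
    while True:
        t += random.expovariate(rate)
        if t > duration:
            return count
        count += 1

counts = [emissions_in_interval(RATE, 1.0) for _ in range(10000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print(f"mean = {mean:.2f}, variance = {var:.2f}")  # Poisson: mean ~ variance
```

Nothing in the simulation "provokes" an emission; each arrival depends only on the rate, mirroring the absence of likelihood risk factors in this class of threat scenario.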
3.7 Maximum Threat Scenario Risk
Two important threat scenario conditions warrant further discussion because they result in a maximum risk condition, assuming all risk factors contribute equally to the magnitude of risk:

1. The likelihood risk factors are coincident, i.e., risk factor confluence
2. Security controls and relevant risk factors (all types) are non-coincident

We digress here to describe constructive and destructive interference. These are physical phenomena that are helpful in visualizing maximum threat scenario risk conditions. However, these analogies should not be interpreted too literally. Risk factors and security controls are not forms of physical energy. However, the cancellation and reinforcement of physical energy is effective in displaying the effect of synchronous and asynchronous threat scenario parameters.

If the magnitude of two oscillations of physical energy is changing in time, the difference in how they align is known as their phase difference or phase relationship. Moreover, a function that oscillates in time can be represented as a sine wave (or equivalently, a cosine wave, which is merely a sine wave shifted by 90 degrees). If two waves A and B are oscillating and align perfectly, i.e., the two waves are exactly coincident, they are said to be "in-phase." The waves then reinforce each other to yield a larger fluctuation. This phenomenon is known as constructive interference, and Fig. 3.2 illustrates this effect.3

Fig. 3.2 Constructive interference (two in-phase waves summing to a wave of larger amplitude)

Constructive interference is useful in representing the effect of confluence. On the other hand, if the peaks and troughs of each wave are offset from one another, the two waves are said to be "out-of-phase." The magnitude of the offset, measured in degrees or radians, indicates by how much the two oscillations differ in phase.

If likelihood risk factors are coincident, the effect is analogous to constructive interference. Their combined effect is to increase the magnitude of the likelihood component of risk. We witnessed this effect with the risk factors for heart disease. If likelihood risk factors are not coincident, i.e., do not overlap in time, each risk factor has an effect, but their individual effects are independent of each other. In contrast, a confluence of multiple likelihood risk factors results in maximum risk.

As one might guess, if two fluctuating quantities of physical energy with identical amplitudes are precisely 180 degrees out of phase, there is a subtractive effect.
Therefore, the two waves extinguish each other. In other words, if the peaks of one wave align with the troughs of the other, they cancel to yield a zero fluctuation condition. This phenomenon is known as destructive interference, which is illustrated in Fig. 3.3. It is a useful analogy in characterizing the salutary effect that results when security controls are applied in synchrony with relevant risk factors.4

If threat scenario risk factors and their corresponding security controls are not coincident, i.e., they are out of phase, the effect is also to increase the magnitude of risk. Furthermore, the magnitude of risk increases for longer periods of non-coincidence.
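The constructive and destructive cases can be verified numerically by superposing two unit sine waves. This is only the physical analogy described above, not a model of risk factors themselves:

```python
import math

def superpose(phase_shift: float, n_points: int = 1000) -> float:
    """Peak amplitude of the sum of two unit sine waves whose phases
    differ by phase_shift radians."""
    return max(abs(math.sin(t) + math.sin(t + phase_shift))
               for t in (2 * math.pi * i / n_points for i in range(n_points)))

print(f"in phase:     {superpose(0.0):.2f}")       # constructive: doubled amplitude
print(f"out of phase: {superpose(math.pi):.2f}")   # destructive: cancellation
```

In-phase waves double the peak amplitude (the analogue of risk factor confluence), while a 180-degree offset cancels the fluctuation entirely (the analogue of controls applied in synchrony with risk factors).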
3 https://www.nasa.gov/missions/science/f_interference.html
4 ibid.
Fig. 3.3 Destructive interference (two out-of-phase waves cancelling each other)

3.8 General Threat Scenario Phenomena
By now it should be clear that risk factor type and/or behavior and threat scenarios are inextricably linked. The risk factors determine the magnitude of risk associated with a particular threat scenario; there is no such thing as a threat scenario without risk factors. Therefore, it is no surprise that the risk factor types identified in Chap. 2 largely determine the organizational schema for threat scenarios. However, there are also general phenomena that describe aspects of threat scenarios that transcend a particular risk factor category. In some cases these phenomena have significant operational implications. Three such phenomena are described below.

(a) Differing Vulnerability and Likelihood Risk Factors

In many threat scenarios there are different risk factors for the vulnerability and likelihood components of risk. Consider threat scenarios relating to violent weather, disease, and car accidents. In each of these scenarios the potential loss has little or nothing to do with the likelihood of a threat incident occurrence. Therefore, security controls that address the likelihood of a threat incident would often be distinct from those designed to limit loss or damage.

Furthermore, there can be huge disparities in the magnitude of each component of risk. A threat scenario might be characterized by a significant vulnerability component of risk while the likelihood of an incident is quite low. Nuclear weapons-related threat scenarios exemplify this condition. A nuclear attack would result in complete devastation, yet the likelihood of a threat incident is generally assessed to be low. Low likelihood-high vulnerability/impact threat scenarios are common, and can be difficult to address precisely because of the large disparity in the respective components of risk. Such scenarios are examined in more detail in Chap. 12.
Assessing security risk involves estimates of the magnitude of each component of threat scenario risk, where the overall magnitude is determined by the risk factors for all three components. Security risk management is the process of applying security controls to the relevant risk factors. If the risk factors associated with each component differ, the respective security controls used to manage these risk factors will likely differ as well. If the risk factors for the vulnerability and likelihood components of risk are distinct, each must be addressed in order to manage threat scenario risk in its totality.

Again, consider the threat of nuclear weapons. Addressing the vulnerability component of risk by lining the walls of a facility in lead and/or building that facility underground will do nothing to affect the likelihood of an attack, and vice versa. Precisely because a risk factor might only affect one component of risk, the purpose of a security control must be understood. For example, increasing visible security controls might decrease the potential for future threat incidents, but would do nothing to reduce the vulnerability component of risk. The classic example is the security system sticker prominently displayed in the window of a home. Although a sticker that warns of an alarm could discourage theft via break-ins, it would not affect the magnitude of losses that result from such a theft. The valuables inside the house exclusively determine the magnitude of the vulnerability component of risk.

(b) Identical Vulnerability and Likelihood Risk Factors

A threat scenario could possess identical risk factors for both the vulnerability and likelihood components of risk. For example, the presence of valuables could be a risk factor for both components of risk in a theft threat scenario. The valuables affect the vulnerability component of risk by virtue of their inherent value. The likelihood component of risk is affected by the attractiveness of valuables to a would-be thief. Of course, the thief must be aware of or at least suspect the presence of valuables. In other words, the potential for a threat incident is increased due to knowledge (or suspicion) of the presence of valuables.

When a risk factor simultaneously affects the likelihood and vulnerability components of threat scenario risk, the absence of a security control is a risk factor for that threat scenario. For example, a firewall or its functional equivalent affects both the likelihood and vulnerability components of risk, and is considered a sine qua non for any viable information technology network.
Attackers typically probe networks for open ports followed by the identification of available services (e.g., FTP, telnet), which is facilitated by poorly configured firewalls or a badly implemented security policy. Open ports that allow external parties to access services running on the network increase the likelihood component of risk for threats of information compromise. Open ports provide sought-after avenues to internal resources and potentially lead to threat incidents facilitated by other risk factors. Therefore, the inclusion of a firewall in the design is an indispensable feature of any viable network architecture. Any information technology asset storing confidential information that is not segregated from the Internet via a firewall is not adequately protected from network intrusions and the attacks that would inevitably ensue. The magnitude of the vulnerability component of risk is proportional to the number and types of assets lacking adequate protection via a firewall along with other security controls. Because of its effect on both components of risk, a dysfunctional or non-existent firewall is a risk factor for information loss.

Other security controls in an IT environment might also fall into this indispensable category. Passwords and other forms of authentication immediately come to mind. Their contribution to security risk management is profound since a lack of authentication affects both the vulnerability and likelihood components of risk associated
with a host of attacks. Therefore, an absence of authentication pursuant to accessing an information asset would also be considered a risk factor for information compromise.

Perhaps a more pedestrian but no less pointed example of a risk factor affecting both components of risk is an electrical grounding threat scenario. In order for a toaster or any device using AC power to be usable, the device must be electrically grounded to prevent electrical shocks. An ungrounded device would significantly increase both the vulnerability and likelihood of electric shocks. Grounding is essential to the successful operation of such a device. Therefore, the absence of electrical grounding is a risk factor for device-related injuries resulting from electrical shocks.

(c) The Presence of Apex Risk Factors

We learned in Chap. 2 that the presence of an ARF significantly increases the magnitude of threat scenario risk. For this reason, ARF presence is a threat scenario condition that must be identified and prioritized as part of a comprehensive security risk assessment. As one might expect, identifying an ARF is more complicated in the presence of other risk factors.

Recall the test for the presence of an ARF. If the elimination of any single risk factor significantly affects the magnitude of threat scenario risk, or if the effect of other risk factors is significantly enhanced by the presence of a particular risk factor, it suggests that the risk factor in question is an ARF. Trivially, if only one risk factor is present it is an ARF by definition. Therefore, in such cases no test for "ARFness" is required. If a second risk factor is added to the mix, this condition warrants a test to assess the effect. The number of steps in the test increases with each risk factor. In many if not most cases, judgment will play a role in evaluating the results since there is no absolute metric for an apex risk factor.
Non-ARFs have a more proportionate effect on the magnitude of threat scenario risk. That said, a non-ARF could also affect different components more or less profoundly depending on the threat scenario. However, the individual contribution of any non-ARF will not exceed the cumulative effect of all other risk factors nor will its presence significantly increase the effect of other threat scenario risk factors.
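The elimination test for an ARF can be sketched in code. Everything below is a hypothetical illustration: the additive toy model, the weights and the 0.5 threshold are invented, consistent with the text's caveat that no absolute metric exists and judgment plays a role.

```python
# Hypothetical sketch of the elimination test for an apex risk factor (ARF).
# risk_model is any callable that scores a set of risk factors; the toy
# model below is a stand-in, not a method prescribed by the text.

def flag_apex(risk_model, factors, threshold=0.5):
    """Flag factors whose removal cuts total risk by more than `threshold`
    (as a fraction of the full score). One factor alone is an ARF by definition."""
    if len(factors) == 1:
        return list(factors)
    baseline = risk_model(factors)
    apex = []
    for f in factors:
        reduced = risk_model([g for g in factors if g != f])
        if baseline > 0 and (baseline - reduced) / baseline > threshold:
            apex.append(f)
    return apex

# Invented weights; factor "A" also amplifies every other factor's weight,
# mimicking an ARF that enhances the effect of other risk factors.
WEIGHTS = {"A": 2.0, "B": 1.0, "C": 1.0}

def toy_model(factors):
    score = sum(WEIGHTS[f] for f in factors)
    if "A" in factors:  # "A" enhances the effect of the others
        score += sum(WEIGHTS[f] for f in factors if f != "A")
    return score

print(flag_apex(toy_model, ["A", "B", "C"]))  # prints ['A']
```

Removing "A" drops the score from 6 to 2, i.e., by two thirds, so it is flagged; removing "B" or "C" drops the score by only one third each.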
3.9 A Security Risk Assessment Taxonomy
A security risk assessment taxonomy can now be constructed. This taxonomy specifies the various elements and their inter-relationships that drive the assessment methodology. If an automobile inspection protocol were to have any connection to the real world it would be structured according to a logical hierarchy of the systems and sub-systems of automobiles. In exact analogy, the taxonomy of security risk assessments must be based on threat scenarios and associated components of risk whose magnitude depends on the relevant risk factors. Since the canonical threat scenario is the focus of any real-world security risk assessment, it is only logical that they exist at the top of the taxonomy. Threat
scenarios consist of three elements, but the components of risk only affect two of these elements: affected entities and threat scenario environments. The all-important risk factors determine the magnitude of each component of risk. The three components of risk and associated risk factors determine the relationship between threats and affected entities, and threat incidents are the result of threat scenarios.

Vulnerability is the magnitude of loss that can be incurred as a result of a threat incident, and impact is the importance of such losses, which can often be expressed as loss-per-incident. These are the only representations for these components of risk. In contrast, the likelihood component of risk is dichotomous. The probability of an event is based on a probability distribution of historical threat incidents. The potential for a threat incident derives from risk factor-related incidents or changes to a risk factor and represents an inference of the likelihood component of risk.

The presence of risk factors increases the magnitude of each component of risk. The aggregate value of all the risk factors determines the overall magnitude of risk for a given threat scenario. The five risk factor types are temporal, spatial, behavioral, apex and complexity (likelihood only), and the threat scenario categories are static, dynamic, random, behavioral and complex. Static and dynamic threat scenarios result from the behavior of risk factors relative to security controls. In contrast, random, behavioral and complex threat scenarios are so categorized because of the type of risk factors present.

Figure 3.4 is a graphic depicting the taxonomy as described above.
Fig. 3.4 Taxonomy of security risk assessments
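The taxonomy depicted in Fig. 3.4 can also be sketched as a data structure. The class and field names below merely paraphrase the text; the book prescribes no schema, so treat this as one possible encoding, with the example scenario invented for illustration.

```python
# One possible encoding of the security risk assessment taxonomy.
from dataclasses import dataclass, field
from enum import Enum

class RiskFactorType(Enum):
    TEMPORAL = "temporal"
    SPATIAL = "spatial"
    BEHAVIORAL = "behavioral"
    APEX = "apex"
    COMPLEXITY = "complexity"  # affects the likelihood component only

class ThreatScenarioCategory(Enum):
    STATIC = "static"
    DYNAMIC = "dynamic"
    RANDOM = "random"
    BEHAVIORAL = "behavioral"
    COMPLEX = "complex"

@dataclass
class RiskFactor:
    name: str
    type: RiskFactorType
    components: tuple  # which of likelihood/vulnerability/impact it drives

@dataclass
class ThreatScenario:
    threat: str
    affected_entity: str
    environment: str
    category: ThreatScenarioCategory
    risk_factors: list = field(default_factory=list)

# Hypothetical theft scenario from the earlier discussion: the presence of
# valuables drives both the likelihood and vulnerability components.
theft = ThreatScenario(
    threat="theft",
    affected_entity="residence",
    environment="home interior",
    category=ThreatScenarioCategory.STATIC,
    risk_factors=[RiskFactor("valuables present", RiskFactorType.SPATIAL,
                             ("likelihood", "vulnerability"))],
)
```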
60
3.10 Summary
Table 3.1 summarizes the five threat scenario categories, their defining features and risk-relevant conditions. R is a risk factor, C is a security control, t is time, and x is position or distance.
Table 3.1 Summary of threat scenario categories and risk-relevant features and conditions

Static. Defining condition: dC/dt ≥ dR/dt and dC/dx ≥ dR/dx. Risk-relevant features: controls in sync with risk factors; persistent risk factors.

Dynamic. Defining condition: dC/dt < dR/dt or dC/dx < dR/dx. Risk-relevant features: controls lag risk factors; transient and/or inconsistent risk factors.

Random. Defining condition: threat incident is a random variable. Risk-relevant features: no likelihood risk factors, threat incidents occur spontaneously; confluence of at least two likelihood risk factors causing incoherence, which is expressed as a threat incident random variable.

Behavioral. Defining condition: behavioral risk factors are present. Risk-relevant feature: risk factors are present in or relate to an affected entity.

Complex. Defining condition: complexity risk factors are present. Risk-relevant features: number of risk factors; uncertainty in security risk management, i.e., the application of security controls to risk factors.
Chapter 4
Risk, In-Depth
4.1 Introduction
The relationship between a threat and an affected entity is completely determined by three components that are collectively known as risk. Features of the threat scenario, i.e., the risk factors, affect the magnitude of each component. This relationship is loosely analogous to the way features such as shape and color describe a physical object. For example, a ripe banana is characterized by its yellowness, distinctive smell and quasi-cylindrical shape. Arguably, a banana would simply not be a banana in the absence of certain features that differentiate it from other fruits. Similarly, a threat is simply not threatening in the absence of any one of the components of risk.

Any threat scenario feature that enhances one or more components of risk is relevant to the relationship between a threat and the entity so affected, i.e., is risk-relevant. Of course, it is quite possible that a specific threat and particular entity could be unrelated in one threat scenario and related in another. In other words, a different environment where the threat and entity interact might completely change the relationship between the same threats and entities. A threat-entity relationship is highly contextual, where the context is determined by the threat scenario details.

A diagram of the canonical threat scenario and the influences on the relationship between threats and affected entities is shown in Fig. 4.1. Importantly, at a high level all threat scenarios are equivalent and the same three component types affect every threat scenario. Threat scenario equivalence and the universality of risk are of critical importance to the theory of security risk assessment and are two core principles identified in Chap. 12. A structured and repeatable security risk assessment process is a consequence of these principles.
© Springer Nature Switzerland AG 2019 C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_4
Fig. 4.1 Threat scenarios and threat-entity relationship
There is an innate similarity between the theory of security risk assessment and any intellectual discipline. Each is grounded in a set of fundamental principles, which form the basis for reasoning and thereby enable analyses of an infinite variety of scenarios. For example, Newton's laws enable predictions of the motion of any object under the influence of inertial forces and are valid everywhere.1 Maxwell's equations describe all electromagnetic phenomena.2 These analogies are not meant to imply that the theory of security risk assessment is in any way equivalent to theories that have literally shaped our understanding of the physical world. The suggestion is that other intellectual disciplines are also based on a formalism that provides a conceptual foundation for problem solving.

The core principles identified in Chap. 12 represent the distillate of this conceptual foundation for the theory of security risk assessment. Threat scenario equivalence and risk universality are particularly crucial since they establish a common frame of reference for problem solving. For that reason these two principles are explored in more detail in the next section.

In this chapter we discuss other concepts that are not necessarily foundational but are nonetheless influential or merit more in-depth discussion due to their broad relevance. For example, diverse forms of uncertainty are what drive the various estimates of the likelihood component of risk. Uncertainty is also a factor in the magnitude of complexity, which increases the likelihood component of risk.
1. Quantum mechanics is required to analyze scenarios on an atomic scale.
2. Conversely, electric and magnetic fields that are not described by Maxwell's equations are not physically realistic.
4.2 Threat Scenario Equivalence and Risk Universality
Two principles are fundamental to the conceptual framework that governs security risk assessments: threat scenario equivalence and risk universality. Although they are central to the theory, these two principles also have practical implications. Namely, although the specific threat scenario risk factors may vary, the risk assessment process is always the same. Consider the following threat scenarios and their respective assessments of risk:

Threat Scenario #1: Baseball

Baseball is a game played by two opposing teams, where the object is to score more runs than the opposition in the allotted number of innings. There is always only one winner and one loser per game. The notion of "winning and losing" implies there is risk associated with every game. Moreover, each game consists of a sequence of mini-vignettes, i.e., threat scenarios, which involve a face-off between at least one pitcher and a minimum of three batters per inning. Baseball, or any game for that matter, is interesting because of the individual threat scenarios that ultimately determine the outcome.

Someone unfamiliar with baseball is perhaps blissfully unaware of the number of variables in any given threat scenario, e.g., the particular batter-pitcher combination, the runners on base, the score, the inning, the available bullpen personnel, the batter on deck, the available pinch hitters, the weather conditions, etc. Such details are part of the flow of a game, and they drive strategic adjustments based on situations that arise from an infinite number of possibilities in each scenario. In Rules of the Game; How the Principles of Nature Govern Chance, the authors describe this process3:

    In games of chance, in games of strategy, and in games involving both chance and strategy, the course that play takes on any given occasion will be "historically" unique because of the large number of possible choices involved.
    The sequence of moves constantly opens new directions the game can follow along branches of a decision tree. The arbitrary course that play takes at each fork of the decision tree depends on the chance roll of the dice as well as each player's ignorance of his opponent's strategy.
Chance plays a role in every game no matter how structured and how skilled the participants. If a batter hits a ball that happens to catch a breeze as its arc approaches the left field wall, and the air current carries the ball into the stands for a home run, the timing of such an occurrence can in part be attributed to luck. The more degrees of freedom that exist the more opportunities there are for luck to influence the outcome.

The rules are what make a game, a game. We again quote from Eigen and Winkler, who write of the relationship between games and life in general4:
3. Eigen, M. and Winkler, R., Rules of the Game; How the Principles of Nature Govern Chance. Alfred A. Knopf, 1981.
4. Ibid.
    Every game has its rules that set it apart from the surrounding world of reality and establish its own set of values. Anyone who wants to "play" has to follow those rules. In parlor games, rules established before the game begins determine the course the game will take and define the scale of values by which it is played. But the effects of a chance occurrence can change the constellation of the game and set it running in a totally different direction. This is how life initiated its first games, and this is how our thoughts and ideas continue to play those games.
We next consider one particular baseball threat scenario pursuant to assessing the risk-relevant facets of the game. It is the bottom of the ninth with two outs and runners on first and second base in a game between rival teams. We will assume the teams are the Yankees and Red Sox in order to make things more interesting. The Red Sox are behind by two runs and are batting. The Yankees must decide whether to have the current pitcher pitch to the batter at the plate or bring in a relief pitcher.

Fortunately, the manager of the Yankees has access to a plethora of data to assist in this decision, most likely accessible via a computer application created just for this purpose. For starters, the manager queries the computer program for information on the outcomes of every historical confrontation between this particular pitcher and batter combination. Casting a wider net in search of risk-relevant data, the Yankees manager seeks information on the effectiveness of this batter against both left and right-handed pitchers. Since this is a major league game, the manager can likely leverage statistics reflecting every relevant historical scenario.5 In the language of security risk management, the Yankees manager is assessing the likelihood component of risk for this threat scenario.

What are the specific elements of this particular threat scenario? The threat is the Red Sox batter hitting a game-winning home run by making solid contact with the ball. The affected entity is the team in the field, i.e., the Yankees, and the threat scenario environment is the ballpark, which consists of a multitude of scenario-specific details, i.e., the runners on base, the park dimensions, the composition of the field, the weather, et al.

The risk assessment process for both managers is made significantly easier by the fact that the game of baseball is highly structured. Although there are an infinite variety of threat scenarios, they can only vary in prescribed ways.
The spectrum of outcomes for any threat scenario is well defined if uncertain. The rules of any game are constraining by design, and these enable meaningful comparisons of performance. Importantly, the rules apply equally to both teams. Only the players and ballpark dimensions will change from game-to-game. As noted above, the rules define the structure for playing baseball, and any game for that matter. In the absence of rules the very notion of winning and losing in the context of a game becomes meaningless. A by-product of the game’s structure is the
5. The rejection of intuition and accepted baseball wisdom in favor of statistical reasoning is described in the book "Moneyball" by Michael Lewis.
identification of risk factors that are known a priori for every threat scenario. In addition, numerous games yield statistics consisting of the outcomes of similar threat scenarios. In this context threat incidents are strikeouts, stolen bases, home runs, triples, walks, etc. An abundance of statistics on player performance under a multitude of conditions facilitates the creation of predictive models that can be applied to any particular threat scenario. Any security professional would crave such data although it would imply there have been many historical threat incidents. It is no wonder that statistics junkies are drawn to the game of baseball. The large number of games and players competing in a highly structured environment offers numerous metrics that can be tracked and analyzed ad infinitum.6

Returning to our hypothetical confrontation, the Yankees manager's risk model helps to determine under what conditions the batter is more likely to make an out than get a hit. That is the question he or she must answer for the threat scenario noted above. The model will be used to assess the likelihood the batter will make solid contact with the ball, and in the worst case, hit a home run. A home run would result in the loss of the game, the maximum value of the impact component of risk for this particular threat scenario.

The Yankees manager and everyone else in the ballpark is keenly aware that a home run is possible if solid contact between the bat and ball occurs. The likelihood component of risk for a home run (and often a strikeout as well) increases if the batter is a so-called "power hitter," i.e., has a history of extra-base hits. So to be precise, the Yankees manager's risk management strategy will be to assess the likelihood that the batter will make solid contact with the ball. Monte Carlo simulation could be used to generate many similar scenarios and thereby yield a figure for the likelihood of various outcomes.
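The Monte Carlo approach mentioned above can be sketched as follows. The outcome categories and their probabilities are invented stand-ins for real batter-versus-pitcher statistics; a real model would be fitted to the historical data discussed earlier.

```python
# Toy Monte Carlo sketch of the at-bat. Outcome probabilities are
# hypothetical, chosen only to illustrate comparing two pitching options.
import random

# Invented per-pitcher outcome distributions against this batter.
OUTCOMES = {
    "current pitcher": {"out": 0.60, "single_or_walk": 0.30, "home_run": 0.10},
    "relief pitcher":  {"out": 0.70, "single_or_walk": 0.26, "home_run": 0.04},
}

def simulate(pitcher, trials=20_000, seed=7):
    """Estimate outcome likelihoods by sampling many similar at-bats."""
    rng = random.Random(seed)
    dist = OUTCOMES[pitcher]
    events = list(dist)
    weights = list(dist.values())
    counts = {e: 0 for e in events}
    for _ in range(trials):
        counts[rng.choices(events, weights)[0]] += 1
    return {e: counts[e] / trials for e in events}

for pitcher in OUTCOMES:
    print(pitcher, simulate(pitcher))
```

With enough trials the estimated frequencies converge to the assumed probabilities, so the manager's question reduces to which assumed distribution better reflects the data.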
The Yankees manager will decide whether to change pitchers or not in order to reduce the likelihood of bat-ball contact. The manager might also rearrange the fielders to minimize the likelihood of an extra base hit if the batter does indeed hit the ball.

The Yankees manager would also assess the vulnerability component of risk for this threat scenario. This analysis requires an assessment of the "loss" that would ensue assuming solid contact between the bat and ball. The magnitude of loss for the Yankees, the team in the field, relates to the distance and location of the ball if it is accelerated by contact with the bat. In general, the distance the ball travels and the bases advanced by the batter are positively correlated. In other words, the farther the ball travels the more bases traversed by the batter.

Can the Yankees manager actually estimate the vulnerability component of risk, i.e., how far the ball will travel and therefore the number of runs that will score if there is contact between the bat and the ball? Newton's laws can be used to estimate the distance the ball will travel and hence whether a home run will result assuming there is contact between the bat and the ball.
6. MLB Advanced Stats Glossary: https://www.cbssports.com/mlb/news/mlb-advanced-stats-glossary-a-guide-to-baseball-stats-that-go-beyond-rbi-batting-average-era/
More generally, these laws can predict the reaction of any object under the influence of inertial forces. Therefore, the equations resulting from Newton's laws can produce the sought-after solution irrespective of the particular conditions on the field. Specifically, the ball's trajectory can be determined if specific parameters are known: the initial velocity of the ball and the initial angle of the ball after being struck by the bat. The force on the ball due to gravity (but neglecting the effect of air resistance) must also be known, but this factor is anything but a secret. The simplified model used to calculate the distance the ball travels following contact with the bat is as follows:

d = v0t + ½gt²   (4.1)
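As a worked illustration of the kinematics behind Eq. (4.1), here is a no-drag flight-distance sketch. The exit speed and launch angle are invented values, and since air resistance is ignored the result overestimates how far a real batted ball travels.

```python
# Worked sketch of a no-drag flight-distance estimate in the spirit of
# Eq. (4.1). Launch values are hypothetical; drag is neglected.
import math

G = 9.81  # gravitational acceleration, m/s^2

def flight_distance(v0, angle_deg):
    """Horizontal distance for launch speed v0 (m/s) at angle_deg degrees,
    launched and landing at the same height, with no air resistance."""
    theta = math.radians(angle_deg)
    t_flight = 2 * v0 * math.sin(theta) / G   # time until the ball returns
    return v0 * math.cos(theta) * t_flight    # horizontal speed x flight time

# 45 m/s (roughly 100 mph) exit velocity at a 30-degree launch angle.
print(round(flight_distance(45.0, 30.0), 1), "meters")  # prints 178.8 meters
```

The no-drag figure is well beyond any real outfield wall, which is exactly why exit velocity and launch angle dominate the discussion of batted-ball outcomes.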
Here d is the distance, v0 is the ball's initial velocity, g is the acceleration due to the force of gravity and t is the ball's time-of-flight. This equation shows why baseball mavens have become obsessed with a ball's initial velocity following contact, noting the relevance to its trajectory has been known since the seventeenth century!

The Yankees and Red Sox managers may or may not be physicists, but they both intuitively understand Newton's laws of motion. In particular they know the situational risk associated with solid contact between the bat and the ball. Both managers are acutely aware that if the batter imparts enough energy to the ball at the correct angle, the game could be over in one swing. The batter will attempt to realize the maximum value of the impact component of risk for this threat scenario.

Finally, note that the Red Sox experience the mirror image of the Yankees threat scenario. What is good for the Yankees is, by definition, bad for the Red Sox and vice versa. This arrangement is inherent to any zero sum competition.

Threat Scenario #2: An Information Technology (IT) Environment

Many modern IT environments consist of a data center hosting physical and virtual machines that facilitate access to applications accessed by a multitude of users. An IT Department might also support tens of thousands of desktops and associated users with varying information management requirements and risk-relevant details. The objectives of information technologies and information security controls are often antithetical. The former facilitates information sharing and the latter limit access to information. Moreover, the IT users occupy a prominent place in the environment, and therefore play a key role in information security. As a result, their behavior, i.e., the behavioral risk factors, often affects the magnitude of risk. IT threat scenarios typically contain numerous risk factors. In addition, these scenarios are not static.
Their variability is due to changes in the user population, each user’s behavior and/or user-managed security controls, e.g., passwords, infrastructure changes and evolving threats. The requirement for information sharing, both internal and external to the organization, coupled with the omnipresent risk of information compromise drive the requirement for the standard security controls of authentication, authorization, electronic and physical access restriction and network segregation.
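One small, hedged illustration of auditing such network segregation controls: the sketch below checks which service ports on a host accept TCP connections, the same probe an attacker would run against a poorly configured firewall. It is intended only for systems you are authorized to test; the host and port list are placeholders.

```python
# Minimal port-audit sketch: report which common service ports answer
# on a host. Defensive use only (checking your own firewall policy).
import socket

COMMON_PORTS = {21: "FTP", 22: "SSH", 23: "telnet", 80: "HTTP", 443: "HTTPS"}

def open_ports(host, ports=COMMON_PORTS, timeout=0.5):
    """Return the subset of ports that accept a TCP connection."""
    found = {}
    for port, service in ports.items():
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                found[port] = service
    return found

# Example: audit the local machine.
# print(open_ports("127.0.0.1"))
```

Every port this audit reports that is not deliberately exposed corresponds to the likelihood-increasing avenues described earlier.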
An IT environment is not a game and therefore there aren't well-defined winners and losers. Although there is a standard structure to any IT environment per the OSI model, the number of entities, the breadth of technologies and information management requirements increase the degrees of freedom associated with any threat scenario and make the possible outcomes exponentially more varied. The "rules" associated with IT environments correspond to the specific information sharing requirements, which are modulated by security controls.

Nevertheless, the Chief Information Security Officer (CISO) identifies and assesses threat scenario risk factors and applies security controls in accordance with the organizational tolerance for risk. In precisely the same way the baseball manager assesses a given threat scenario, the CISO examines the effect of the risk factors and makes decisions based on their relative effect on the likelihood, vulnerability and impact components of risk. The CISO's job is complicated by the variety of threat scenarios and the countervailing requirements of information sharing versus information access restrictions.

What can we learn from a comparison of the baseball and IT threat scenarios? Although their respective risk-relevant details are different, the two scenarios are in fact identical when viewed at a sufficiently high level. This situation results from threat scenario equivalence and the universality of risk.7

In the case of baseball, the threat scenario changes from batter-to-batter and even from pitch-to-pitch. As discussed, the threats and risk factors are well understood due to the highly structured nature of the game. Furthermore, a baseball manager, indeed the baseball "risk manager," is able to evaluate the multiplicity of risk factors associated with each threat scenario variation and apply appropriate controls, e.g., a pitching change, player realignment, etc.
However, the manager only has so much latitude in adjusting the risk management strategy; he or she must always comply with the rules.

In a similar vein, an IT Department system administrator attempts to determine the threats, risk factors and the effectiveness of security controls. For the threat of information compromise, both the likelihood and vulnerability components of risk are enhanced by risk factors such as open network architecture, weak authentication and hidden pathways to the Internet, to name but a few.

A key question associated with any threat scenario is whether a model can be identified that will enable estimates of the magnitude of at least one component of risk. The answer to that question boils down to whether it is possible to isolate the contribution of specific risk factors to the magnitude of threat scenario risk. Unfortunately, assessing the effect of the risk factors for information security threat scenarios is not as straightforward as it is in baseball. A general model to estimate the specific contributions of the individual risk factors does not exist. Transient risk factors contribute to the variability of the environment, which drives
7. Note that something qualifies as a threat if it produces loss or damage. Therefore, in the most general sense the production of a run in a baseball game represents a threat to the team allowing the run.
uncertainty in security risk management and the effectiveness of security controls. The lack of a probability distribution of similar threat incidents, along with unstable risk factors of uncertain magnitude, precludes a calculation of the probability of a future threat incident.

Information security threat scenarios are less structured than baseball games. Although networks have common features, e.g., desktops, switches, routers, firewalls, access points, etc., their number, variety, interconnectedness and configuration et al., can vary significantly from network to network and even within the same network. Requirements for authentication and authorization to access IT devices are also variable. The "players" in an IT environment are not similar to baseball players. There are frequently many more than nine, and the rules of engagement are much less restrictive. Furthermore, and as noted above, the behavior of the computer user population contributes to the magnitude of security risk.

Importantly, the risk factors in an IT environment are also interdependent. Although the handedness of a batter relative to that of the pitcher and the ability to hit a curveball are related, their relative significance can be parsed and thereby assessed for each threat scenario. Moreover, both risk factors can be addressed by bringing in a pitcher with the requisite physical skills and a history of success in that threat scenario.

We noted that transient risk factors exist in IT environments. We will see in Chap. 9 that this condition contributes to the magnitude of threat scenario complexity. For example, the topology of large networks can change over time scales that are shorter in duration than the relevant security controls. As discussed in Chap. 2, this condition will result in a dynamic threat scenario. It also contributes to uncertainty in the application of security controls to risk factors, one of the two complexity risk factors.
Furthermore, the transparency of such changes might be limited. Ultimately, numerous sub-networks with interdependent layers undermine the ability to assess the contribution of a particular risk factor to the overall magnitude of risk. The net result is that a general model of information security risk remains elusive. Absent such a model, a risk-based standard is required to obtain consistent security risk assessment results across diverse threat scenarios. It is telling that a current search of the Internet reveals numerous examples of such standards: the SANS Institute, NIST, ISO, NSA, et al. The UK and other governments have created their own cyber security standards.8 Since IT environments do not impose rules in the same way as baseball games, a risk-based standard that reflects the organization’s tolerance for security risk is needed to determine, if not actually measure, a significant drift from the agreed limits on risk tolerance. A detection of excess drift should trigger the application or
8. https://www.bsigroup.com/en-GB/Cyber-Security/Cyber-security-for-SMEs/Standards-for-IT-and-cyber-security/
adjustment of security controls. The implicit understanding is that excessive drift represents an unacceptable level of information security risk.

The fact that a quantitative model of threat scenario risk is not achievable does not mean that a security risk management strategy cannot be effective with the judicious, i.e., risk-based, application of security controls. However, it does imply that it is impossible to quantify the reduction in risk that results from applying a specific security control.

Despite the obvious differences between a baseball game and an IT environment, the elements that define the two threat scenarios and the features that affect the relationship between these elements are identical. A baseball manager and network administrator identify and assess the risk factors that increase the magnitude of risk for their respective threat scenarios. In fact, when viewed from a sufficiently high level, the baseball manager and the CISO have similar functions, as do all risk managers. Of course, the details associated with each threat scenario preclude them from easily swapping positions.
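The idea of detecting drift from an agreed risk tolerance can be sketched as a simple scoring check. Everything here is hypothetical: the control names, the 0-1 scores, and the tolerance value are invented, and real risk-based standards (NIST, ISO, et al.) define far richer criteria than an average shortfall.

```python
# Hedged sketch of a "drift from risk tolerance" check: score the observed
# environment against a risk-based standard and flag when the average
# shortfall exceeds the agreed tolerance. All names and values are placeholders.

TOLERANCE = 0.2  # maximum acceptable fractional shortfall vs. the standard

# Required control scores (0.0-1.0) per the hypothetical standard.
STANDARD = {"authentication": 1.0, "network segregation": 1.0, "firewall policy": 1.0}

def drift(observed, standard=STANDARD):
    """Average fractional shortfall of observed control scores vs. the standard."""
    gaps = [(standard[c] - observed.get(c, 0.0)) / standard[c] for c in standard]
    return sum(gaps) / len(gaps)

def needs_adjustment(observed, tolerance=TOLERANCE):
    """True when drift exceeds tolerance, i.e., controls should be adjusted."""
    return drift(observed) > tolerance

# Hypothetical audit result for the environment.
audit = {"authentication": 0.9, "network segregation": 0.5, "firewall policy": 0.8}
print(round(drift(audit), 3), needs_adjustment(audit))  # prints 0.267 True
```

Crossing the threshold corresponds to the "excess drift" that should trigger the application or adjustment of security controls.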
4.3 Direct and Indirect Assessments of Likelihood
Security risk management professionals are often asked to opine on the likelihood of a future threat incident. For example, a director of security might be asked about the likelihood of a terrorist incident perpetrated against a company's corporate headquarters. A CISO might be asked about the likelihood of an attack on the IT network by some version of malware that has been identified in the popular press.

The term likelihood is often misapplied if not misunderstood in the context of assessing risk. Colloquial use of the term contributes to the confusion surrounding this component of risk. For example, the magnitude of likelihood is the desired quantity in the following two assessments: "What is the likelihood of a terrorism incident occurring next week?" and "What is the likelihood of heads occurring with the next coin toss?" The assessment of likelihood is different in each case because of inherent differences in the terrorism and coin toss "processes." Of course, each sentence makes perfect linguistic sense, and we intuitively understand what is meant.

Specifically, an assessment of a future terrorism outcome calls for a prediction based on an objective evaluation of mostly subjective factors. In addition, the possible outcomes are often infinite. In contrast, the possible outcomes of a coin toss are finite and are known a priori. Moreover, the outcome of a coin toss is a random variable. Therefore, although the spectrum of possible outcomes is known, it is impossible to predict the precise outcome of a given toss.

The distinction between the two processes reflects the essence of direct and indirect assessments of the likelihood component of risk. In direct assessments of security risk, a probability distribution of outcomes, i.e., threat incidents, exists. These incidents arise from one of two threat scenario conditions:
4 Risk, In-Depth
1. Multiple influences/risk factors result in a probability distribution of the number of threat incidents formulated in terms of a risk-relevant parameter.
2. There are no likelihood risk factors, and incidents occur spontaneously and without provocation. Therefore, the number of threat incidents occurring in a specified time interval is a random variable.

Suppose numerous thefts occurred in the same building. As a result, a probability distribution of actual incidents could be formulated in terms of some risk-relevant feature, e.g., monetary loss, time of day, floor, etc. These features have been identified as threat scenario risk factors. Assuming conditions remain relatively constant over relevant time scales, generalizations about the probability of future thefts can be specified in terms of the identified risk factors. Insights into threat scenario risk can be gleaned without the benefit of historical threat incidents if specific information about the threat scenario is available. Inferences about the likelihood component of risk are possible via indirect assessments. Rather than a probability, these assessments yield the potential for incident occurrence. There are two indirect assessment methods. The first method involves a subjective assessment of the tendency for actual threat incident occurrence based on historical risk factor-related incidents. For example, incidents of unauthorized entry into or within restricted space are correlated with actual threat incidents such as theft. A history of risk factor-related incidents qualitatively increases the likelihood of various illicit activities. The absence of actual threat incidents precludes a quantitative assessment of the likelihood component of risk, i.e., a probability, and therefore an indirect assessment is the only option. The second method of performing indirect assessments of the likelihood component of risk entails measuring changes to a threat scenario risk factor. In Chap.
7, a threat scenario of slipping and falling onto the subway tracks is assessed in detail. There are a number of ways to estimate the likelihood component of risk for this threat scenario. One way is to measure the slipperiness of the subway platform surface via the coefficient of friction in the vicinity of the subway track. A low coefficient of friction is known to cause slipping and is therefore a risk factor for actual slips on the platform surface. Of course indirect assessments are predicated on actually identifying at least one threat scenario risk factor. Such identifications can occur through various means. Intuition based on experience is one perfectly valid method. Note that sometimes it might be necessary to validate intuition with more objective criteria. The essential fact to remember is that incidents or conditions that lead to actual incidents, however they may be identified, are used in indirect assessments of likelihood. However, it is impossible to specify an actual probability of slipping based on the coefficient of friction alone. It is clear that decreasing the coefficient of friction increases the potential for slipping, which relates to the physics of friction. Therefore, a statement about the increase or decrease in the potential for slipping is possible based solely on a measurement of the slipperiness of the platform surface.
Table 4.1 Direct and indirect assessments of the likelihood component of risk

Direct assessments of the likelihood component of risk: probability
1. Threat incident random variable
(a) Apply Poisson statistics, assuming relevant conditions apply
(b) Calculate the number of threat incidents expected in a specified time interval
2. Probability distribution of threat incidents
(a) Formulate a probability distribution of threat incidents in terms of a risk-relevant feature or parameter
(b) Calculate the probability of the number of incidents with a specific value based on the properties of the distribution
(c) Generalize about the probability of future incidents (should they occur), assuming threat scenario risk factors remain stable

Indirect assessments of the likelihood component of risk: potential
1. Risk factor-related incidents
(a) Identify a threat scenario risk factor
(b) Establish a probability distribution of risk factor-related incidents, or make a subjective judgment regarding risk-relevance
(c) Infer the potential or tendency for an actual threat incident
2. Change in risk factor magnitude
(a) Identify a threat scenario risk factor
(b) Measure changes to the magnitude of that risk factor
(c) Infer the potential or tendency for an actual threat incident as a function of risk factor magnitude
Correlation is another method used to identify risk factors, especially if threat incidents have actually occurred. For example, the Pearson product-moment correlation coefficient is used to measure the strength of a linear relationship between two time series. Those time series could be the number of incidents and a potential risk factor. This method is also discussed in Chap. 7. As we will show in Chap. 7, an actual experiment could be conducted that would lead to a probability distribution of actual threat incidents. This distribution would enable an explicit calculation of the probability of slipping, i.e., a direct assessment of the likelihood component of risk. Although arguably somewhat cruel, this experiment might consist of changing the platform coefficient of friction and observing the number of resulting slips. For example, if there were 100 recorded slips and 10 occurred when the coefficient was less than or equal to 0.52, the probability of slipping under such platform conditions is 0.10. Table 4.1 summarizes the methods associated with direct and indirect assessments of the likelihood component of risk.
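The two calculations just described can be sketched in a few lines. The incident and risk-factor series below are invented for illustration; only the slip figures (10 of 100 recorded slips at a coefficient of friction at or below 0.52) come from the example in the text.

```python
# Hypothetical sketch: identifying a risk factor via correlation, then
# making a direct assessment from recorded incidents.
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation coefficient of two series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented monthly data: incident counts vs. a candidate risk factor.
incidents = [2, 4, 5, 7, 9, 11]
risk_factor = [1.0, 1.9, 3.1, 4.2, 4.8, 6.1]
r = pearson(incidents, risk_factor)  # near +1: strong linear relationship

# Direct assessment from the slipping "experiment" in the text:
# 10 of 100 recorded slips occurred at a coefficient of friction <= 0.52.
p_slip = 10 / 100  # 0.10
```

A coefficient near +1 would support treating the candidate feature as a risk factor; the empirical fraction is the direct probability estimate.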
4.4 Sources of Uncertainty in Estimating Likelihood
The magnitude of security risk is always an approximation. Multiple risk components coupled with the uncertainty inherent to likelihood in its various incarnations ensure there is no absolute risk measurement scale akin to those used in the physical world, e.g., temperature, pressure or intensity. In addition, the source of uncertainty will vary depending on the threat scenario.
The contribution of uncertainty in assessing likelihood can be viewed from two perspectives. From one perspective, the source of uncertainty determines whether direct or indirect assessments of likelihood are applicable. From another vantage, the type of threat incident, i.e., risk factor-related or historical threat incidents, drives the source of uncertainty in a given threat scenario. Each perspective represents a different side of the same coin, and it is difficult to differentiate cause from effect. The bottom line is that the source of uncertainty has significant theoretical and practical implications for assessments of the likelihood component of risk. If a probability distribution of threat incidents can be formulated, the uncertainty inherent to that distribution, i.e., the variance, determines the precision of likelihood assessments. Therefore, a probability can be assigned to a particular value in the distribution, which represents a quantitative estimate of the likelihood of a future incident assuming conditions remain stable. If a probability distribution of threat incidents does not exist, the uncertainty is not the same, and as a result, the likelihood component of risk is inherently unquantifiable. Namely, this uncertainty concerns the contribution of each risk factor to the magnitude of the likelihood component of risk. This type of uncertainty forces a reliance on either incidents that correlate with threat incidents or changes to a risk factor. Both enable inferences regarding the effect on the magnitude of the likelihood component of risk, as discussed later in this section. Specifically, three sources of uncertainty have been identified that affect the likelihood component of risk; each is discussed in this section. The absolute number of risk factors of all types also contributes to threat scenario uncertainty, but this condition relates specifically to the application of security controls.
The uncertainty associated with multiple risk factors is discussed in Chaps. 7 and 9, where the latter discussion specifically relates to threat scenario complexity. In threat scenarios where a probability distribution of threat incidents can be formulated, that distribution will almost always be characterized by dispersion about the mean, i.e., the variance. The variance reflects the magnitude of uncertainty in estimating the mean of the distribution. The larger the sample, the smaller the standard error of the mean, and therefore the less uncertainty exists with respect to the mean value. An example might be the number of global terrorism threat incidents relative to the countries where such incidents have occurred. A probability distribution of threat incidents can be formulated by first computing the total number of threat incidents. We would then calculate fractions of the total by country. The result would be a distribution with a mean number of incidents and a variance about the mean. If the variance were broad, i.e., a fat distribution, it would reflect big differences in the number of incidents occurring in each country. Therefore, one could make a statement about the probability of an attack occurring in a specific country relative to all other countries in the distribution. It is worth pausing a moment to reflect on this point and its implications. For example, if the distribution specifies that the probability of an attack in the United States is 0.15, this statistic is not saying there is a 15% probability of experiencing a terrorism attack in the United States. In fact, it is saying nothing about the likelihood of an attack occurring in the U.S.
It is specifying the historic probability of an attack relative to the total number experienced in other countries. What does the 0.15 probability figure really tell us? Suppose we placed a marker corresponding to each attack on a dartboard and were subsequently blindfolded. We were thereafter instructed to throw a dart at the target. The fact that 15% of all terrorist attacks occurred in the United States means there is a 15% chance of hitting a marker corresponding to one of the U.S. attacks. In other words, if we selected an incident at random from the spectrum of incidents, the probability of it being a U.S. incident is 15%. Why and when is this figure risk-relevant? How could such a distribution offer insights into a security risk management strategy? First, it is risk-relevant if and only if the conditions that contributed to the 15% figure remain in effect. The underlying premise is that the outcomes that have occurred in the past relate to the outcomes of the future. These results, or any historical results, will be meaningless if terrorism-related conditions change appreciably and thereby substantively affect a future distribution of threat incidents. Second, the dartboard distribution of incidents is strategically relevant only when considering other countries in the distribution. For example, suppose we traveled around the world and determined our itinerary based on this probability distribution of terrorism incidents. We might be more inclined to visit countries where the historical probability of a terrorism incident was below a certain figure relative to the spectrum of options. Of course, there will be an average number of threat incidents associated with our distribution of terrorism threat incidents. This distribution is likely to have a finite variance since the number of incidents by country is almost assuredly not constant. Various factors act incoherently to yield a varied number of threat incidents.
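The dartboard interpretation can be made concrete with a short sketch. The incident counts below are invented; the only constraint is that the U.S. share is 15% of the total, matching the example in the text.

```python
import statistics

# Invented incident counts by country (hypothetical data for illustration).
incidents_by_country = {"A": 30, "B": 25, "C": 20, "US": 15, "D": 10}
total = sum(incidents_by_country.values())

# Dartboard probability: the chance that an incident selected at random
# from the historical spectrum occurred in a given country.
p_random_incident_in_us = incidents_by_country["US"] / total  # 0.15

# The same distribution has a mean number of incidents per country and a
# dispersion (variance) about that mean.
counts = list(incidents_by_country.values())
mean_incidents = statistics.mean(counts)
dispersion = statistics.pvariance(counts)
```

Note that `p_random_incident_in_us` says nothing about the chance of a future attack; it is purely a relative historical frequency, as the text emphasizes.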
The variance reflects the width about the average value, which is equivalent to the uncertainty that a value chosen at random is within some distance of the mean. Note that almost any distribution of incidents would exhibit dispersion resulting from various parameters that contribute to variations in the number of incidents. We witness this effect all the time and in conjunction with many processes. A simple action such as slicing a banana results in pieces of different widths no matter how hard we try to make them uniform. The actions required in making each cut are themselves varied, which causes slice width to be a random variable. A mechanized cutting process would also produce pieces of varying width, although the distribution of widths would likely be much narrower than with a manual process. If the probability distribution of terrorism incidents by country is a Gaussian, i.e., a normal distribution, we know the values of uncertainty about the mean. Specifically, there is roughly a 68.3% probability that a specific threat incident chosen at random from the distribution will be within one standard deviation of the mean, a 95.4% probability it is within two standard deviations and a 99.7% probability of being within three standard deviations of the mean. If the distribution is of another type, the associated probabilities will differ from those of a Gaussian but will be equally well defined. We might conclude that location is a risk factor for terrorism. Furthermore, a distribution of incidents formulated in terms of the country where attacks have
occurred arguably reveals a relevant threat scenario detail. However, the country where an attack has been perpetrated is not likely to be the only terrorism risk factor or parameter of interest. A probability distribution of incidents could also be formulated in terms of the year of attack, mode of attack, etc., though it is unclear how helpful such a distribution would be in developing a counterterrorism strategy. Identifying the risk factors for a threat scenario is a critical step in any security risk assessment. The important point is to identify the source of uncertainty and understand its implications for any threat scenario. As noted at the beginning of the section, the source of uncertainty for a given threat scenario drives the likelihood estimate as well as the conclusions so derived. The second source of uncertainty in estimating likelihood results from a dearth of threat incidents. In that case, the effect of a risk factor on the magnitude of likelihood is unknown since there are few or no incidents. Therefore, we must leverage other features of the threat scenario to assess likelihood. Specifically, risk factor-related incidents or a change in a risk factor are used to determine the potential for future threat incidents and thereby conduct an indirect assessment of the likelihood component of risk. The third source of uncertainty results from threat scenarios where incident arrivals occur at random. The result is a Poisson distribution of incidents occurring in a given time interval. Certain threats relating to natural phenomena such as radioactive decay display this type of behavior, where incident arrival is a random variable. Figure 4.2 is a graphical representation of the three sources of uncertainty in estimates of the likelihood component of risk.
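For the third source of uncertainty, random incident arrivals, the Poisson distribution gives the probability of observing exactly k incidents in an interval whose historical average is λ, namely λ^k e^(−λ)/k!. A minimal sketch; the two-incidents-per-year rate is an assumption for illustration:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k random incident arrivals in an interval
    whose historical average arrival count is lam."""
    return (lam ** k) * exp(-lam) / factorial(k)

# Hypothetical rate: incidents arrive at random, averaging 2 per year.
lam = 2.0
p_zero = poisson_pmf(0, lam)  # chance of a year with no incidents
p_at_most_two = sum(poisson_pmf(k, lam) for k in range(3))
```

Such a calculation applies only when arrivals are genuinely random and the historical rate is stable over the interval of interest.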
4.5 Time and Risk
In Chap. 3 we discussed threat scenarios where the magnitude of risk is affected by the temporal characteristics of risk factors. We observed that long or short time intervals for threat interaction with affected entities, the overlap of risk factors and the phase relationship between security controls and risk factors can all have a profound effect on risk-relevance and hence on a security risk management strategy. Estimating the magnitude of the likelihood component of risk is challenging if there are only a few historical threat incidents and threat scenario conditions are not stable. Both these conditions often apply to threat scenarios, and therefore the magnitude of likelihood resists quantification. However, it is safe to say that in general the magnitude of the likelihood component of risk increases as the duration of a threat scenario risk factor increases. Insurance companies routinely incorporate time in assessing risk. They use the return period to express the frequency of low-likelihood, high-impact incidents such as natural disasters and other catastrophes. The return period refers to the return of an event of a certain magnitude.
Fig. 4.2 Sources of uncertainty in estimating the likelihood component of risk
Specifically, a return period T equals the inverse of the probability of occurrence p. In other words, T = 1/p; conversely, p = 1/T. A 10-year earthquake means there is a one-in-ten probability that an earthquake of that magnitude or greater will occur in any given year. Using historical data, one could formulate a probability distribution corresponding to any event of interest. For example, climate scientists might be interested in a 1000-year daily rainfall event. This figure means there is a one-in-a-thousand chance that rainfall will exceed a specific amount in a given year. That amount will be determined by formulating a probability distribution based on historical data. This data might look similar to Fig. 4.3.9 According to Fig. 4.3, the 1-in-1000-year rainfall amount is calculated to be 7.25 inches. The probabilities corresponding to 7.25 inches or greater constitute 0.1% of the area under the curve. Consequently, 99.9% of the area under the curve corresponds to
9 https://www.climate.gov/news-features/event-tracker/how-can-we-call-something-thousand-yearstorm-if-we-don%E2%80%99t-have-thousand
Fig. 4.3 Return period probability distribution. The 1-in-1000-year daily rainfall amount, 7.25 inches, has a 0.1% chance of occurring in a given year: 0.1% of the area under the curve lies at or above that amount, and 99.9% lies below it
rainfall amounts that are less than 7.25 inches. This formulation is consistent with intuition: more extreme events occur with lower frequency and have a lower probability of occurrence. Despite the connotation associated with the word “return” in return period, this word does not imply that the incident is expected to occur regularly over the specified time interval. In other words, a 1000-year daily rainfall amount does not imply a specific amount of rainfall will occur every 1000 years. The return period is a statistic, and is therefore based on historical rates of occurrence. It is assumed that the probability of incident occurrence does not change over time, and incidents are statistically independent. In principle, the return period could be used to compare the magnitude of risk associated with a specific threat against a particular facility relative to other facilities in close proximity. For example, assuming historical data could be used to estimate the likelihood component of risk for terrorism, a graphic such as Fig. 4.4 could be generated. The thin line with circles represents the return period for the expected loss due to terrorism for a building with security controls in a particular urban neighborhood. The thick line with square data points depicts the same calculation for typical buildings in the same neighborhood that lack security controls. An underlying model that enables estimates of loss for the specified time intervals is used to generate these curves. According to Fig. 4.4, in 1000 years one can expect property losses due to terrorism for the building with security controls to be about $325 M or greater. This figure is compared to nearly $400 M or greater for a building in the vicinity without security controls. The alternative yet equally valid explanation is that there is a 0.001 probability that a building with security controls will experience losses of $325 M or
Fig. 4.4 Return period applied to terrorism risk
greater. The same probability would be associated with losses of $400 M or greater for a building without security controls in the same geographic area. If one had confidence in the model, Fig. 4.4 might be used to justify lower insurance premiums for terrorism threat scenarios based on the presence of security controls. Estimates of the effectiveness of security controls must account for the time interval over which they remain effective. Recall dynamic threat scenarios are defined as those where the time rate of change of a security control is less than the time rate of change of the relevant risk factor. Therefore, there is a lag in the security control relative to changes in the relevant risk factor. Consider the use of passwords, which as of this writing is the predominant form of authentication in computer systems. There is often confusion or at the very least uncertainty regarding the required composition of passwords. The decision on password complexity depends on only two factors: (1) the value of the assets being protected, and (2) the estimated time required to successfully execute a brute force attack against the hashed password file. The latter depends in part on the computational resources available to the adversary of concern.10 Note that both the value of assets requiring protection and the resources of adversaries can change with time. A changing risk profile is one of many reasons to regularly evaluate a password policy and to perform internal password cracking exercises. The interplay between time and complexity in connection with password cracking is discussed in detail in Chap. 12. This relationship is a further illustration of how time is relevant to assessing security risk.
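The return period arithmetic from earlier in this section, T = 1/p and p = 1/T, reduces to a pair of one-line functions; a minimal sketch using the 10-year and 1000-year figures from the text:

```python
def return_period(p_annual):
    """Return period T = 1/p for an event whose annual probability of
    occurrence (or exceedance) is p_annual."""
    return 1.0 / p_annual

def annual_probability(t_years):
    """Annual probability p = 1/T of a T-year event."""
    return 1.0 / t_years

t_event = return_period(0.1)       # a "10-year" event
p_rain = annual_probability(1000)  # 0.001, the 1-in-1000-year figure
```

The simplicity is deceptive: the figures are only meaningful under the stated assumptions of a stable occurrence probability and statistically independent incidents.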
10 The total number of possible passwords, as determined by character diversity and password length, defines the so-called information entropy. Information entropy is equivalent to diversity, and specifically relates to the uncertainty associated with an information source. The higher the password entropy, the more computational power is required to attempt every possible password combination. An attack in which all possible passwords are attempted is known as a brute-force attack. We will encounter information entropy again in the discussion of complexity in Chap. 9.
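The relationship in footnote 10 between entropy and brute-force effort can be sketched as follows. The alphabet size, password length and guess rate are assumptions chosen for illustration, not recommendations:

```python
from math import log2

def password_entropy_bits(alphabet_size, length):
    """Information entropy (in bits) of a password drawn uniformly at
    random: length * log2(alphabet_size)."""
    return length * log2(alphabet_size)

def worst_case_crack_seconds(alphabet_size, length, guesses_per_second):
    """Time for an exhaustive brute-force attack: try every combination."""
    return (alphabet_size ** length) / guesses_per_second

# Hypothetical figures: 12 characters drawn from 94 printable symbols,
# against an assumed adversary testing 10 billion guesses per second.
bits = password_entropy_bits(94, 12)
years = worst_case_crack_seconds(94, 12, 1e10) / (3600 * 24 * 365)
```

Because the worst-case time scales exponentially with length, modest increases in password length dominate increases in character diversity, which is one reason both asset value and adversary resources belong in the password-complexity decision.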
4.6 Risk-Relevance
Physics sometimes offers interesting analogies with threat scenario risk. For example, we learned in Chap. 1 that the relationship between threats and risk is loosely analogous to the connection between gravity and the mass of an object. Gravity acts on all objects that possess a feature called mass, and it is absent otherwise. In the world of everyday experience, we are blissfully unaware of objects lacking mass. Plato wrote about the “essence” of an object in his Theory of Forms. He postulated that all objects are imbued with this innateness, which relates to their function. For example, one might encounter many types of hammers during a trip to the hardware store. Each hammer might vary in color, size and even specific use, e.g., ball-peen, claw, sledgehammer, framing. However, each of these objects has “hammerness” in common. In fact, a hammer would simply not be a hammer in the absence of a property called “hammerness,” which represents the essence of a hammer. Moreover, a hammer and a screwdriver might have the same color and/or size, but there is little danger of confusing a screwdriver with a hammer for anyone with even a modicum of experience in home repairs. Therefore, objects in the most general sense are described if not defined by their essence. From the definition of a threat provided in Chap. 1 we know that a threat causes damage or loss to an entity, and must have the potential to do so in order to be relevant to that entity. As explained in Chaps. 1 and 2, a relationship called risk specifies the relevance of a specific threat to a particular entity within the context of a threat scenario. Recall that we introduced the concept of risk-relevance in Chap. 1, which is obvious from certain examples. An encounter with a great white shark in the ocean versus meeting the same animal on land is a poignant example. An oceanic environment is highly risk-relevant in this threat scenario.
Risk-relevance is mentioned frequently in connection with the theory of security risk assessment. It is an adjective that succinctly expresses whether a relationship indeed exists between a specific threat and a particular entity. Since something is risk-relevant if it is a risk factor associated with a particular threat scenario, assessing risk-relevance is equivalent to assessing the contribution of a threat scenario risk factor to the magnitude of a component of risk. In another example, assessing the risk associated with threat scenarios involving the Zika virus would mostly be an academic exercise for individuals living in the Swiss Alps. Similarly, threat scenarios involving avalanches are probably low on the list of concerns for the residents of Miami, Florida. These examples may seem obvious, but other threat scenarios are subtler. Regarding the first example, scientists have determined that a species of mosquito, Aedes aegypti, is responsible for transmitting the Zika virus, dengue fever and yellow fever. These mosquitoes are prevalent in warm climates, and are definitely a concern for individuals located in infested areas.
With respect to avalanches, these phenomena involve the rapid movement of large amounts of snow in mountainous areas. Avalanches are risk-relevant to skiers and hikers since rapidly moving snow can quickly overwhelm individuals caught in its path. Avalanches are caused by mechanical disturbances of loosely compacted snow on elevated terrain. However, snow itself cannot exist if the temperature significantly exceeds 32 °F (0 °C). Therefore, the risk factors of mountains, loosely compacted snow and low temperatures increase the relevance of avalanches to entities located in areas with these conditions. Clearly, individuals in Miami and the Swiss Alps have a different relationship to the threats of Aedes mosquitoes and avalanches. However, a single physical property of the environment determines the relevance of both threats to their respective affected entities: temperature. This similarity is purely a coincidence. Temperature indeed affects the magnitude of the likelihood component of risk in both threat scenarios, but this risk factor is only risk-relevant at opposite ends of the temperature scale for each threat scenario. Presumably, mosquitoes and avalanches have little else in common other than the fact that individuals should assiduously avoid both of them. To reiterate, a risk-relevant feature of a threat scenario is by definition a risk factor. Identifying such features and determining the magnitude of their contribution to a threat-entity relationship represent the essence of a security risk assessment.
4.7 The Confluence of Likelihood Risk Factors
In general, the magnitude of threat scenario risk increases with the number of risk factors. This statement applies to all three components of risk. However, the effect of multiple likelihood risk factors is amplified when their peak values are coincident in time. As noted previously, this condition is known as confluence. Figure 4.5 shows the relationship between the number of likelihood risk factors and the rate of heart disease occurring in middle-aged women.11 It is evident from the data that the existence of multiple risk factors exponentially increases the likelihood of heart disease in this segment of the population. We see that if a single risk factor is present, the likelihood of heart disease doubles relative to threat scenarios where no risk factors are present. If a patient has two risk factors, the likelihood of heart disease doubles again, i.e., four times the risk when no risk factors are present. However, the presence of three risk factors increases the likelihood of heart disease by a factor of ten relative to zero risk factors. The data confirm that the risk of heart disease increases exponentially with the number of risk factors.
11 nhlbi.nih.gov; http://www.nhlbi.nih.gov/health/educational/hearttruth/downloads/html/infographicmultiplier/infographic-multiplier.htm
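The exponential pattern above can be illustrated with a short sketch. The doubling model is an assumption for illustration, not the study's own fit, and the per-factor probability of 0.3 is likewise invented; the complement rule shows how per-factor probabilities can combine without ever exceeding unity.

```python
# Toy model (assumption, not the study's fit): each additional risk
# factor roughly doubles the baseline relative risk.
def relative_risk(n_factors):
    return 2 ** n_factors

# Complement rule for hypothetical independent per-factor probabilities:
# the combined likelihood stays in [0, 1] and approaches 1 as risk
# factors accumulate.
def combined_likelihood(per_factor_probs):
    miss = 1.0
    for p in per_factor_probs:
        miss *= (1.0 - p)
    return 1.0 - miss

doubling = [relative_risk(n) for n in range(4)]  # 1, 2, 4, 8
# The observed jump to roughly 10x at three risk factors exceeds pure
# doubling, consistent with at-least-exponential growth.
bounded = combined_likelihood([0.3] * 100)  # approaches, never exceeds, 1
```

The complement rule is one simple way to formalize the multiplicative combination argued for in the text, under an independence assumption the text does not itself make.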
Fig. 4.5 The likelihood of heart disease in middle-aged women as a function of the number of risk factors
A naïve view of risk might lead one to conclude that the magnitude of the likelihood component of risk is additive with respect to the number of risk factors. However, we know that likelihood cannot exceed unity by the definition of probability, and simple addition would exceed one if enough risk factors were present. In the limit of an infinite number of risk factors the probability of a threat incident should converge to one. This condition is only possible if the effect of multiple likelihood risk factors is multiplicative. The non-linear effect of confluence underscores the importance of identifying likelihood risk factors when conducting security risk assessments. Finally, the duration of risk factors is relevant to confluence since the risk is enhanced when risk factors overlap in time in accordance with the definition of confluence. For example, the time interval used to measure threat scenario risk factors and/or risk factor duration could affect the actual detection of a confluence condition. The shorter the measurement time interval or the shorter the overlap condition, the higher the probability of not detecting confluence. For two partially contemporaneous risk factors, what is the probability of detecting a confluence condition? We can use simple geometric arguments to figure it out. Figure 4.6 illustrates the situation. First, we calculate the probability of a confluence condition. Let t1 equal the time interval corresponding to the duration of risk factor 1, and t2 equal the time interval corresponding to the duration of risk factor 2. Then t1 + t2 = L equals the total time interval without time overlap. Let M equal the time interval with time overlap. If Δt = L – M is the time overlap of t1 and t2, the probability of a confluence condition is M/L. As expected, when M = 0 the probability of the time intervals overlapping is zero, i.e., there is no overlap of the two risk factors.
At the other extreme, i.e., M = L, the probability of an overlap is 1.
Fig. 4.6 Probability of detecting risk factor confluence
Next, we must specify the probability of a measurement occurring during a confluence condition. This probability will depend on the length of the measurement time interval. Let us specify the measurement time interval as Δm. If Δm is less than or equal to Δt, the probability of the measurement occurring during confluence equals (ΔmΔt)/L. Therefore, the probability of detecting a confluence condition equals the probability of a confluence condition times the probability of detecting that condition, which equals (M/L)(ΔmΔt)/L = M(ΔmΔt)/L². If the measurement time interval Δm is greater than the overlap time interval Δt, the probability of a measurement occurring during confluence is Δm/L. Therefore, the probability of detecting a confluence condition in this case is (Δm/L)(M/L) = (ΔmM)/L². Significantly, in both cases the probability is proportional to the measurement interval.
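The detection argument can also be explored numerically. The Monte Carlo sketch below is an idealization rather than the exact geometry of Fig. 4.6: it fixes a confluence interval inside an observation window of length L and drops a measurement window of width Δm at a uniformly random position, estimating the probability that the measurement intersects the confluence. All parameter values are invented; consistent with the conclusion above, the estimate grows with the measurement interval.

```python
import random

def detection_probability(L, overlap_start, overlap_len, dm, trials=100_000):
    """Estimate the probability that a measurement window of width dm,
    starting uniformly at random in [0, L - dm], intersects the
    confluence interval [overlap_start, overlap_start + overlap_len].
    An idealized sketch, not the book's exact derivation."""
    hits = 0
    for _ in range(trials):
        m0 = random.uniform(0, L - dm)
        if m0 < overlap_start + overlap_len and m0 + dm > overlap_start:
            hits += 1
    return hits / trials

random.seed(1)
p_detect = detection_probability(L=10.0, overlap_start=4.0,
                                 overlap_len=1.0, dm=0.5)
# Analytically: windows intersecting [4, 5] start in (3.5, 5.0), a span
# of 1.5 out of 9.5 admissible starts, i.e., about 0.158. Widening dm
# raises the estimate, in line with the text's conclusion.
```

A longer measurement window or a longer overlap both raise the detection probability, which is the qualitative point of the geometric argument.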
4.8 Summary
Threat scenarios always consist of three elements: threats, affected entities and the environment in which threats and entities interact. The relationship between a specific threat and a particular affected entity is the risk associated with a given threat scenario. The magnitude of risk is determined exclusively by the three components of risk and their respective risk factors. When viewed from this vantage all threat scenarios are equivalent. Uncertainty occupies a central place in the theory of security risk assessment. There are four sources of threat scenario uncertainty: the inherent uncertainty in the resulting probability distribution of threat incidents, uncertainty in the contribution of a specific risk factor to the overall magnitude of risk, uncertainty in the arrival of threat incidents, and the uncertainty driven by complexity, which relates to multiple risk factors and the application of security controls to those risk factors.
The term "risk-relevant" is a concise way of expressing that a particular threat scenario feature or condition is a risk factor for a component of risk. A security risk assessment focuses on determining the features of a threat scenario that are risk-relevant.

Time and risk are inextricably linked in the context of a threat scenario. In general, the longer a risk-relevant feature persists, the greater the cumulative risk. The relative time rate of change between risk factors and security controls can affect uncertainty in security risk management, which increases the magnitude of threat scenario complexity and hence the likelihood component of risk.

Direct assessments of the likelihood component of risk are based on a probability distribution of historical threat incidents. The probability of a threat incident with a particular characteristic, or of a particular number of threat incidents, can only be calculated from a probability distribution of threat incidents. In addition to a probability distribution of historical threat incidents, direct assessments also require stable risk factors over relevant time scales in order to generalize effectively about the likelihood of future incidents.

In contrast, indirect assessments of the likelihood component of risk are based on risk factor-related incidents or the magnitude of a risk factor. Indirect assessments enable inferences on the magnitude of the likelihood component of risk.

Although risk quantification can lead to a more precise understanding of risk-relevance, it is not essential to developing useful insights on the magnitude of risk. Quantification in this context means determining the probability of a future threat incident (likelihood), a numerical figure for loss (vulnerability) and/or calculating the precise effect of specific threat scenario features on the significance-per-threat incident (impact).
Although quantitative risk assessments are typically required to specify the exact performance requirements for security controls, qualitative estimates can yield equally useful results, and thereby inform a security risk management strategy.
Part II
Quantitative Concepts and Methods
Chapter 5
The (Bare) Essentials of Probability and Statistics
The laws of probability, so true in general, so fallacious in particular. — Edward Gibbon
5.1 Introduction
Probability and statistics are essential to developing an in-depth understanding of the theory of security risk assessment. Probability and related concepts provide the machinery to assess the likelihood component of risk. Statistical methods yield quantifiable estimates of the magnitude of risk for many threat scenarios. Because of their conceptual and practical importance, probability and statistics are critical to linking theory with practice.

Of the three components of risk, likelihood is the most subtle and involved. The subtlety and complicated nature of likelihood arise from the requirement to generalize about the future based on the past, with or without historical evidence. Ultimately, assessing the magnitude of the likelihood component of risk represents an attempt to link the past and the future.

Assessing the likelihood of future incidents depends on the uncertainty inherent to the likelihood component of risk. However, the type of uncertainty will vary depending on the threat scenario. For example, if a statistically significant number of historical threat incidents exists, the uncertainty of the incident distribution, i.e., the variance, enables us to quantify the likelihood of a future incident type or characteristic, should it occur, with the implicit assumption that past is prologue.

If threat incidents do not exist or are limited in number, there is no probability distribution of incidents. As a result, the uncertainty associated with the likelihood component of risk in such threat scenarios is different. Specifically, the uncertainty concerns the effect of the risk factors on the overall magnitude of risk. In this case, one can only make qualitative inferences about the likelihood component of risk. Such inferences must be based either on the number of risk factor-related incidents or on a change to one or more risk factors.

© Springer Nature Switzerland AG 2019
C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_5
However, it is sometimes possible to circumvent a dearth of historical threat incidents using a simplifying yet profound assumption of randomness. More precisely, we require a threat incident to be a random variable. As a consequence, the laws of probability immediately become available, along with standard statistical methods. Such an assumption is as liberating as it is a gross approximation to reality. An assumption of randomness yields the desired probability distribution, and a spectrum of possible outcomes immediately becomes known. For example, if we assume a risk factor is either managed or unmanaged with equal probability, we know that the risk factors will assume a binomial distribution, in analogy with a coin toss. In the end we will have accomplished our mission in the interest of expedience, but not without a considerable loss of verisimilitude.

Certain aspects of the theory of security risk assessment are inherently statistical and therefore require no such leaps of faith. The magnitude of uncertainty associated with threat scenarios containing multiple risk factors that are variables with finite variance scales according to the statistical properties of variance. In general, the statistics of multiple risk factors have profound implications for the magnitude of risk. For example, in Chap. 9 threat scenarios are modeled as an ensemble of states consisting of managed and unmanaged risk factors. Statistical methods applied to this ensemble yield estimates of threat scenario complexity.

Even the vulnerability component of risk can be viewed statistically. A distribution of vulnerability risk factor values can sometimes be applied to a model for vulnerability. Such a distribution is used to estimate the effectiveness of security controls, as we will learn in Chap. 8 in the discussion of the probability of protection. Note that invoking the laws of probability is an implicit admission that there is an incomplete understanding of the process under assessment.
The power of statistics is the ability to generalize about the future based on an imperfect understanding of the past. George Boole put it this way¹:

Probability is expectation founded upon partial knowledge. A perfect acquaintance with all the circumstances affecting the occurrence of an event would change expectation into certainty, and leave neither room nor demand for a theory of probabilities.
In this chapter we introduce the essential concepts of probability and statistics. These concepts mostly relate to estimates of the magnitude of the likelihood component of risk. Perhaps more importantly, these concepts facilitate quantitative reasoning about risk. The intent is to ensure the reader is aware of the basics, and ultimately to demonstrate when and how statistical methods apply. To be clear, it is not necessary to have a working knowledge of probability and statistics to be rigorous about assessing security risk. It is enough to have a passing familiarity with the definitions and concepts, depending on the particular application.
1. George Boole, English mathematician, 1815–1864.
Ultimately, the objective is to identify the spectrum of risk factors and assess their relative effect on the magnitude of one or more components of risk.
5.2 Probability
The meaning of a probability is often misunderstood, perhaps due in part to colloquial use of the term. A probability is actually nothing more than a fraction. Specifically, if a collection of elements can be divided into subsets, a probability corresponds to the fraction of elements represented by a particular subset. The sum of all subsets equals the entire collection. Therefore, each fraction corresponds to the probability of randomly selecting that subset from among the entire collection. If there are 10 subsets in the collection labeled A through J, the probability of selecting a particular subset, e.g., "C," at random is 1/10. It should be self-evident that the sum of all the fractions, i.e., over the entire collection of subsets, must equal one.

Importantly, the subsets provide the context that enables comparisons relative to the constituent elements. In the absence of context, the very notion of a probability is meaningless. A calculation of probability derives from a probability distribution of like elements. Therefore, a calculation of probability is inherently a comparison. The following passage aptly describes the reality of estimating probability²:

Note that the probability depends very much on the nature of the ensemble, which is contemplated in defining this probability. For example, it makes no sense to speak simply of the probability that an individual seed will yield red flowers. But one can meaningfully ask for the probability that such a seed, regarded as a member of an ensemble of similar seeds derived from a specified set of plants, will yield red flowers. The probability depends crucially on the ensemble of which the seed is regarded as a member.
Thus the probability that a given seed will yield red flowers is in general different if this seed is regarded as (a) a member of a collection of similar seeds which are known to be derived from plants that had produced red flowers or as (b) a member of a collection of seeds which are known to be derived from plants that had produced pink flowers.
A probability distribution of peer elements, or of subsets of the distribution, enables generalizations about the occurrence of those elements in other contexts based on the historical evidence. Note we have used the word generalization and not prediction. A probability is not a crystal ball. It is akin to a window that reveals a narrow slice of the landscape and thereby enables generalizations about the entire landscape. The degree to which one can generalize about the broader view will depend on the width of the window relative to the variability of the topography.
2. F. Reif, Fundamentals of Statistical and Thermal Physics, McGraw-Hill, Inc., 1965.
If the window provides even a narrow view of a vast meadow or ocean, a generalization about the broader view will typically be more accurate than a similar generalization about more spatially varied scenarios. The length scale over which a view of the ocean changes is much greater than that of an urban landscape.

The field of view is analogous to sample size in distributions. The bigger the field of view, the more representative it is of the scene in its entirety. Similarly, the bigger the sample derived from a "parent" distribution, the more accurate are predictions about the parent that are based on the sample.

Importantly, a probability distribution specifies the limits on uncertainty within that distribution. The narrower the dispersion, i.e., the spread about the mean or average value, the less uncertainty there is in estimating the mean. Larger samples result in narrower dispersions, i.e., less uncertainty. The precise dependence of the dispersion on sample size in a normally distributed sample is derived in Chap. 8.

Consider a sample drawn from a parent population. We have previously stated that the probability distribution of the sample enables generalizations about the parent. In fact, such generalizations represent the essence of statistical sampling. For example, one might wish to know the fraction of individuals in a population who possess various hair colors. A probability distribution would specify how frequently individuals with red, black, brown and blonde hair appear within a sample drawn from the general population. Let's assume that in a random sampling of 1,000,000 U.S. citizens hair color is distributed as follows:

Black: 250,000
Blonde: 200,000
Brown: 500,000
Red: 50,000
Sample Population Total: 1,000,000

The hair color frequencies within a large sample population such as this one could reasonably be expected to reflect the frequencies in the parent population.³ The probabilities within the sample are calculated as follows:

Probability of red hair = p(red) = 50,000/1,000,000 = 0.05
Probability of brown hair = p(brown) = 500,000/1,000,000 = 0.50
Probability of blonde hair = p(blonde) = 200,000/1,000,000 = 0.20
Probability of black hair = p(black) = 250,000/1,000,000 = 0.25

The sample data yield a probability distribution of hair colors. In other words, the distribution specifies the frequency of occurrence of each hair color within the sample. Note that the sum of the fractions, i.e., probabilities, equals one, which is consistent with the definition of a probability. In this sample, 5% of the sample
3. The standard deviation of a sample drawn from a normally distributed parent will equal the standard deviation of the parent distribution divided by the square root of the sample size.
Fig. 5.1 Probability distribution of four hair colors in a sample population
population possess red hair, 50% possess brown hair, 20% possess blonde hair and 25% possess black hair. Figure 5.1 graphically illustrates the sample probability distribution.

If the sample distribution is sufficiently large, it will accurately reflect the distribution of hair color within the parent population from which it was chosen. However, realize that any generalizations about the parent would only be valid over time scales dictated by trends in style and fashion. If red hair suddenly became the rage immediately after the sample was taken, Fig. 5.1 might no longer accurately reflect the distribution of hair colors within the current parent population. In other words, style/fashion is a factor that biases the preference for hair color, and such biases notoriously vary with time.

How does the distribution of hair color compare to the probability distribution associated with well-known processes such as coin and die tosses? In a coin or die toss the spectrum of possible outcomes is known in advance. A coin toss is a binary process, and a single toss yields the familiar heads or tails outcome. Since there are only two outcomes per toss, the probability of heads equals the probability of tails. Clearly, each probability equals 0.5, which implies there is a 50–50 chance of obtaining either outcome.

Similarly, a die has six faces, and each face is associated with a unique number between one and six. Therefore, the probability of any single outcome is 1/6, which in decimal form is roughly 0.17. The distribution of possible process outcomes enables calculations of the likelihood of any specific outcome. Figure 5.2 illustrates the probability distribution of a die toss process. In other words, if the die is fair, the probability of a particular outcome, i.e., any outcome, is 0.17. A die is "loaded" if the probability of any of the six possible outcomes is not 0.17. Such an inequality would result from an uneven distribution of die material, intentional or otherwise.
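The hair-color probabilities above are mechanical enough to script. A minimal sketch, using the counts from the sample:

```python
# Counts from the hair-color sample above.
counts = {"black": 250_000, "blonde": 200_000, "brown": 500_000, "red": 50_000}
total = sum(counts.values())

# Each probability is simply the fraction of the sample in that subset.
probabilities = {color: n / total for color, n in counts.items()}
print(probabilities)  # {'black': 0.25, 'blonde': 0.2, 'brown': 0.5, 'red': 0.05}

# The fractions over a complete set of subsets must sum to one,
# exactly as for the six faces of a fair die (each 1/6, roughly 0.17).
assert abs(sum(probabilities.values()) - 1.0) < 1e-12
```

The same two lines of arithmetic apply to any distribution of counts, which is the sense in which a probability is "nothing more than a fraction."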
Fig. 5.2 Probability distribution of a die toss
The probability of any single die toss outcome is inversely proportional to the number of possible outcomes. For example, the probability of a die landing on a three is 0.17. There are six possible outcomes, where each outcome is unique and a three is one of those six outcomes. Since a probability represents a fraction of the total number of outcomes, a probability is expressed as a decimal between 0 and 1, or equivalently as a percentage between 0 and 100.

The notion of independence is crucial to discussions of probability. Qualitatively, two events are independent if the outcome of one event does not affect the outcome of the other. For example, any single toss of a coin does not affect the outcome of any other toss. Therefore, each coin toss is an independent event, and the joint probability associated with any two tosses is the product of the individual probabilities. In other words, if A is the probability associated with some process event, and B is the probability associated with another event within the same process, and A and B are independent, the probability of both events is given by A × B.

It makes intuitive sense that the probability of multiple independent events equals the product of the individual probabilities. For example, the probability that a single coin toss results in a heads or tails outcome is 1/2 or 0.5. The probability that two coin flips both produce heads (or both produce tails) is exponentially less:

(1/2)² = 1/2 × 1/2 = 1/4 = 0.25

If the temptation is to add rather than multiply the individual probabilities, consider the probability of producing a particular outcome in a binary process consisting of three independent events. For example, if the process were additive, the probability of tossing three heads (or three tails) would equal 1/2 + 1/2 + 1/2 = 3/2 = 1.5. The total probability exceeds one, which has no meaning. It is logical that
for increasing numbers of coin tosses, the likelihood of a specific outcome, e.g., all heads, all tails, decreases exponentially with each toss. Finally, note that the elements of a probability distribution could relate to any collection of comparable entities, e.g., the result of IQ measurements of a classroom of students, the bowling scores of the local Kiwanis Club, rates of cancer among the population of a city, etc.
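The multiplication rule for independent tosses can be verified by brute-force enumeration of the outcome space:

```python
from itertools import product

# Enumerate all equally likely outcomes of three independent fair coin tosses.
outcomes = list(product("HT", repeat=3))  # 2**3 = 8 outcomes
p_three_heads = sum(o == ("H", "H", "H") for o in outcomes) / len(outcomes)

# Multiplying the individual probabilities gives the same answer,
# which is why independent probabilities multiply rather than add.
print(p_three_heads, 0.5 ** 3)  # 0.125 0.125
```

Counting the single favorable outcome among the eight equally likely ones reproduces (1/2)³ exactly, and the total probability over all eight outcomes is one, as it must be.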
5.3 Average, Standard Deviation, Variance and Correlation
A probability distribution is characterized by two parameters: the mean and standard deviation. The mean of a probability distribution is its average value. Summing the elements in a distribution and dividing by the number of elements yields the mean, which is denoted as μ (Greek letter mu). An expression for the mean is as follows:

μ = (x_1 + … + x_N)/N    (5.1)

x_1…x_N represent the individual elements of the distribution and N is the total number of elements. For example, the mean of the collection of numbers one through ten results from adding 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10 and dividing by 10, the total number of numbers. Trivially, the mean is calculated to be 55/10 or 5.5. Notice that the number 5.5 is not actually an element in the collection.

A shorthand representation for the average of a collection of elements R_i is given by the following:

μ = (1/N) Σ_{i=1}^{N} R_i    (5.2)
Similarly, one could describe the statistics of a measurement of a variable R in terms of the so-called moments of a probability distribution P(R). This characterization represents the probability of observing R between R and R + dR, where dR is an infinitesimal fraction of R. Refer to Chap. 6 if the reader is unfamiliar with integrals or differentials.

Average = <R> = ∫ R P(R) dR    (5.3)

The mean square value of a variable R is given by,

Mean Square Value = <R²> = ∫ R² P(R) dR    (5.4)
In analogy with averages, the mean square value for discrete elements of a distribution R_i equals,

Mean Square Value = (1/N) Σ_{i=1}^{N} R_i²    (5.5)
The precision in the value of a distribution of a variable R is reflected in the variance. Specifically, the variance expresses the average squared difference from the mean. For a variable R, the variance equals the average of the variable squared minus the square of the variable average:

Variance = σ² = <R²> − <R>²    (5.6)

The standard deviation σ equals the square root of the variance:

σ = (<R²> − <R>²)^(1/2)    (5.7)
Specifying only the average of a distribution is incomplete and therefore subject to inaccurate generalizations. Distributions with few elements are characterized by significant variance, so any particular element chosen at random can be expected to lie anywhere within a large distance from the mean. Merely stating the average value of a distribution can have significant consequences. The apocryphal story about the statistician who drowned in a river that on average was only 2 ft deep expresses this concept nicely. Specifying both the mean and standard deviation is necessary in order to convey a complete statistical picture.

A statistical relationship between two variables is captured in the joint probability distribution, P(R1, R2), which describes the probability of observing R1 between R1 and R1 + dR1 and R2 between R2 and R2 + dR2. An important measurement of such a relationship is the correlation function, which is also referred to as the covariance, i.e., the variance of a bivariate distribution:

C_{R1R2} = <R1R2> − <R1><R2>    (5.8)
Qualitatively, the correlation function measures the difference between the average of the product of two variables and the product of their averages, which is precisely the form of the variance except in this case there are two variables. If no correlation exists, C_{R1R2} equals zero, which implies <R1R2> = <R1><R2>. Two independent random variables have zero correlation, but the converse is not necessarily true. Note that the two variables in question can actually be the same variable, only measured at different times. Such a measurement is known as the autocorrelation function, and is discussed in Chaps. 7 and 8.

To quantify the magnitude of correlation, a correlation coefficient r is calculated, which is actually just the normalized covariance, and is defined as follows:
r = C_{R1R2}/(σ_{R1} σ_{R2})    (5.9)
The value of r varies between −1 and 1, where 1 represents perfect correlation between the two variables and −1 is perfect anti-correlation. In other words, if two variables are correlated, an increase/decrease in one variable is accompanied by an increase/decrease in the other. Anti-correlation implies the reverse effect: an increase/decrease in one variable is accompanied by a decrease/increase in the other. Increases and decreases in one variable that result in proportionate increases and decreases in the other are indicative of a linear relationship between the two variables. Finally, and as noted above, if r = 0 there is zero correlation, which implies C_{R1R2} = 0, i.e., <R1R2> = <R1><R2>.
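The definitions in (5.6) through (5.9) translate directly into code. The two short data series below are made up purely for illustration:

```python
def mean(xs):
    return sum(xs) / len(xs)

def covariance(r1, r2):
    # C = <R1 R2> - <R1><R2>, per the correlation function definition.
    return mean([a * b for a, b in zip(r1, r2)]) - mean(r1) * mean(r2)

def std_dev(xs):
    # sigma = (<R^2> - <R>^2)^(1/2), the square root of the variance.
    return (mean([x * x for x in xs]) - mean(xs) ** 2) ** 0.5

def correlation(r1, r2):
    # r = C / (sigma1 * sigma2): the normalized covariance.
    return covariance(r1, r2) / (std_dev(r1) * std_dev(r2))

r1 = [1.0, 2.0, 3.0, 4.0]
print(round(correlation(r1, [2.0, 4.0, 6.0, 8.0]), 6))  # 1.0, perfect correlation
print(round(correlation(r1, [8.0, 6.0, 4.0, 2.0]), 6))  # -1.0, perfect anti-correlation
```

The first series is a proportionate increase (a linear relationship, hence r = 1); reversing it produces perfect anti-correlation.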
5.4 The Normal and Standard Normal Distributions
The normal distribution is arguably the most important and common probability distribution, and it therefore deserves special attention. Also known as the Gaussian distribution or bell curve because of its characteristic bell shape, the normal distribution is ubiquitous in all areas of science.⁴

A probability distribution is often specified as a density, i.e., the probability per element or per outcome. If a probability density function is continuously summed, i.e., integrated over specified limits, it yields a specific probability. In other words, the integral of the probability density function yields the fraction of the distribution that exists within the limits of integration. Note that such limits can be from minus infinity to plus infinity. A very brief introduction to integration and differentiation is provided in Chap. 6.

For a normally distributed and continuous random variable, the probability density of the variable x is given by the following expression, where μ is the mean and σ² is the variance:

f(x; μ, σ²) = (1/(2πσ²)^(1/2)) exp(−(x − μ)²/(2σ²))    (5.10)
Since the sum of the individual probabilities in a probability distribution must equal one, the probability density function must be normalized so that integrating (5.10) from minus infinity to plus infinity equals unity. The coefficient of (5.10), 1/(2πσ²)^(1/2), is the normalization constant. If the mean μ is zero and the standard deviation σ is unity, (5.10) becomes the standard normal distribution:
4. Johann Carl Friedrich Gauss, German physicist and mathematician, April 30, 1777 – February 23, 1855.
Fig. 5.3 Normal probability density functions with different means and standard deviations (one curve with mean = 52, standard deviation = 7; the other with mean = 71, standard deviation = 11; x-axis: speed in mph)
f(x) = (1/(2π)^(1/2)) exp(−x²/2)    (5.11)
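A quick numerical check that the density in (5.11) is properly normalized; the integration grid below is an arbitrary choice:

```python
import math

def standard_normal_pdf(x):
    # f(x) = (1/(2*pi)**0.5) * exp(-x**2 / 2), the standard normal density.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Numerically integrate the density over [-8, 8]; the tails beyond
# that range contribute a negligible amount of probability.
n, lo, hi = 10_000, -8.0, 8.0
h = (hi - lo) / n
area = h * sum(standard_normal_pdf(lo + i * h) for i in range(n + 1))
print(round(area, 4))  # ≈ 1.0, as required of a probability density
```

The total area under the curve comes out to one, which is the continuous analogue of the fractions in a discrete distribution summing to one.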
Figure 5.3 illustrates two normal distribution density functions with differing means and standard deviations. Note the difference in the peak values, i.e., the means, of these two symmetric distributions, and their respective widths about the mean, i.e., the standard deviation/variance.⁵

Each type of probability distribution has statistical characteristics that are unique to that distribution. For example, Poisson distributions have the same value for the mean and variance, and therefore only one parameter is required to specify the probability of an event.

The normal distribution is important for at least two reasons. First, it describes the statistics of many naturally occurring phenomena. Second, normal distributions result from repeated sampling from any parent population according to the Central Limit Theorem. In other words, although the parent distribution may not be normally distributed, the averages of samples drawn from that parent will be normally distributed. This remarkable theorem has powerful implications for many fields of science and mathematics. Therefore, we will provide a formal description as well as dwell on its implications. Laplace first proved the Central Limit Theorem in 1810.⁶ The theorem can be described qualitatively as follows:
5. http://safety.fhwa.dot.gov/speedmgt/ref_mats/fhwasa10001/
6. Pierre-Simon Laplace (1749–1827), French mathematician.
Assume a sample is drawn from a parent distribution and the average of the sample is computed. If that procedure is repeated a sufficient number of times, the distribution of the sample averages will approach a normal distribution irrespective of whether the parent distribution is normally distributed. In addition, the mean of the averages will approach the mean of the parent distribution, with a standard deviation determined by the sample size.

One of the fundamental challenges in assessing security risk is the fact that the Central Limit Theorem cannot be applied to many threat scenarios. This situation arises because a parent distribution of similar threat incidents does not exist.

As we now know, a probability corresponds to the fraction of sub-elements within a sample population of elements. Recall the sample distribution of individuals with a particular hair color. In that case, there were four probabilities corresponding to the fractions of individuals with red, blonde, black or brown hair within a sample population of one million individuals. A sufficiently large sample could be used to generalize about the distribution of hair colors within the parent population, i.e., the total population of the United States.

A similar exercise might entail weighing one million adult males in New York City, and thereby establishing a probability distribution of weights for this segment of the population. As unlikely as it might seem, there might be interest in the probability of encountering a male of a particular weight. For example, the Department of Health might be interested in understanding the prevalence of obesity among men in New York City. The range of weights in this sample distribution would likely be quite diverse in contrast with the distribution of hair color. One might imagine that the sample would range from about 100 pounds to over 400 pounds for the more corpulent denizens of the city.
A large sample would be expected to yield a normal distribution of values with a mean of about 200 lbs.⁷ The probabilities of sub-elements consisting of ranges of weights would be determined, where each range of weights is expressed as a fraction of the total sample. For example, men weighing between 100 and 125 pounds might represent 10% of the sample. Therefore, if the sample were representative of the distribution of weights across the population of New York City males, the likelihood of randomly encountering a male weighing between 100 and 125 pounds is 0.1.

One, two and three standard deviations from the mean of a normal distribution correspond to 68.3, 95.5 and 99.7% of the distribution, respectively. These percentages are characteristic of all normal distributions, which enables generalizations about the likelihood of a particular outcome chosen at random from the distribution. Figure 5.4 depicts the first three standard deviations of a normal distribution and their corresponding fractions of the distribution population.⁸
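The Central Limit Theorem described above is easy to demonstrate numerically. The sketch below uses a uniform parent distribution purely for illustration; any parent with finite variance would behave the same way:

```python
import random
import statistics

random.seed(7)

# Parent population: uniform on [0, 1] -- decidedly not normal.
# Its mean is 0.5 and its standard deviation is (1/12)**0.5, about 0.2887.
n, num_means = 50, 5_000
sample_means = [statistics.fmean(random.random() for _ in range(n))
                for _ in range(num_means)]

# The sample means cluster normally around the parent mean, with a
# dispersion close to sigma / sqrt(n), per the Central Limit Theorem.
mu_hat = statistics.fmean(sample_means)
sd_hat = statistics.stdev(sample_means)
print(round(mu_hat, 2))  # ≈ 0.5, the parent mean
print(round(sd_hat, 3))  # ≈ 0.2887 / 50**0.5, about 0.041
```

Increasing n narrows the dispersion of the averages, which is the quantitative sense in which larger samples reduce uncertainty about the mean.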
7. Wikipedia, https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule
8. U.S. Center for Disease Control and Prevention, https://www.cdc.gov/ophss/csels/dsepd/ss1978/lesson2/section7.html#ALT29
Fig. 5.4 Standard deviations of the normal distribution (68.3% of data within ±1 SD, 95.5% within ±2 SD, 99.7% within ±3 SD of the mean)
A normally distributed random variable could, in theory, possess any mean and standard deviation. It is convenient to parameterize a normally distributed random variable such that the mean of the distribution is zero and the standard deviation is one. This formulation is accomplished by establishing the Z-statistic, which is a new random variable parameterized in terms of the standard deviation. Standard tables of Z-statistic values have been calculated. Figure 5.5 shows a normally distributed random variable with a mean of 10 and standard deviation of two juxtaposed with a standard normal distribution, i.e., a mean of zero and a standard deviation of one. The Z-statistic is discussed in the next section.

Per the Central Limit Theorem, the more averages selected from a parent distribution, the more the distribution of averages will approach normality. Of course, such samples will always yield an estimate, since real-world data will never follow a normal distribution exactly. Assume a parent distribution is normally distributed with a mean μ and standard deviation σ. If a sample consisting of n points is drawn at random from that distribution, the mean of that sample will be normally distributed about μ with a standard deviation σ′ equal to σ/√n. It is clear from this discussion that the uncertainty about the mean of the sample distribution decreases with increasing sample size.

Popular participatory sporting events offer the opportunity to view the normal distribution in action. For example, marathon races routinely draw tens of thousands of runners with varying running abilities. The large number of race participants with widely varying talent would be expected to yield a normal distribution of marathon finishing times with a large standard deviation. Specifically, the expected mean finishing time for the general running population is around 4 h, where the top competitors in the field of runners would have finishing
Fig. 5.5 The standard normal distribution and a normally distributed random variable
times far less than the mean. In one sample consisting of three million marathon finishes, only 4% of the men's times were under 3 h, and only 1% of the women's finishes were under that time.⁹ The reality is that a typical marathon, or any public athletic contest, is actually a competitive event for only a tiny fraction of participants.

What would a distribution of finishing times for a group of elite long distance runners look like? The mean finishing time of Olympic-level runners would be expected to be just over 2 h, i.e., about half the time of the amateurs. Whatever the exact number, the finishing times of the best runners would undoubtedly be narrowly clustered about a much lower mean than that of the general running population. A designation of "elite runner" is reserved for those runners whose finishing time is multiple standard deviations from the mean of the general running population.

If the mean finishing time is 4 h and the standard deviation is one half-hour, the probability that the finishing time is between 3.5 h and 4.5 h for a runner selected at random is approximately 0.68. This 1-h time interval corresponds to one standard deviation from the mean of the normal distribution of finishing times.¹⁰ Importantly, if the sample is sufficiently large, it can be used to generalize about the finishing time of any marathon runner selected at random from the general running population.

A hypothesis of normality can be confirmed with statistical methods. A very simple back-of-the-envelope test uses the sample maximum and minimum to compute their Z-scores, i.e., the number of standard deviations that a sample value deviates from the sample mean, and compares them to the so-called "68–95–99.7 Rule." This rule is shorthand for the percentage of values that lies within a band around the mean in any normal distribution. In other words, and as depicted in Fig. 5.4,
9 https://marastats.com/marathon/
10 The answer is approximate because the distribution is presumably not an exact normal distribution. There are tests to determine the degree of normality.
5 The (Bare) Essentials of Probability and Statistics
Fig. 5.6 P(X ≥ x)
68.3%, 95.5% and 99.7% of the values lie within one, two and three standard deviations of the mean, respectively. If numerous points selected from the total number exceed three standard deviations from the mean, the normality of the distribution would be in question. At increasing standard deviations, there should be a decreasing number of points as a percentage of the total population. One might wish to know the probability that an element picked at random from a normal distribution will be more or less than a specific value. That is, if X is a normally distributed random variable, and a specific value of interest is x, calculating either p(X ≥ x) or p(X ≤ x) might be of interest. p(X ≥ x) can be calculated by performing a continuous summation, i.e., integrating (5.10) from x to plus infinity using appropriate values for the mean and standard deviation. Figure 5.6 illustrates the relevant area of integration, where μ represents the mean of the normally distributed variable X. Similarly, integration of a normal distribution can be performed to determine P(X ≤ x). This calculation entails integrating the variable X from minus infinity to x, the value of interest. Figure 5.7 depicts the relevant area of integration. The next section focuses on the method used to perform this very common calculation.
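The tail integrations described here need not be carried out numerically; the normal cumulative distribution can be evaluated with the error function found in most standard libraries. The following is a minimal Python sketch using the marathon figures quoted earlier (a mean of 4 h and a standard deviation of 0.5 h); the function name normal_cdf is illustrative, not from the text:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution, evaluated via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Marathon finishing times: mean 4 h, standard deviation 0.5 h (from the text).
mu, sigma = 4.0, 0.5

# Left tail: fraction of runners finishing faster than 3.5 h, i.e., P(X <= 3.5).
p_faster_than_3p5 = normal_cdf(3.5, mu, sigma)

# Fraction within one standard deviation of the mean, i.e., P(3.5 <= X <= 4.5).
p_within_one_sd = normal_cdf(4.5, mu, sigma) - normal_cdf(3.5, mu, sigma)

print(round(p_faster_than_3p5, 3))  # 0.159
print(round(p_within_one_sd, 3))    # 0.683, the "68" in the 68-95-99.7 Rule
```

Note that P(X ≥ x) is simply the complement, 1 − normal_cdf(x, mu, sigma).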
5.5 The Z-Statistic
A normal distribution can be re-parameterized in terms of a new random variable Z. If X is a normally distributed random variable with mean μ and standard deviation σ, its Z-statistic can be calculated from X by subtracting μ and dividing by the standard deviation. A new random variable Z is now defined as follows:

Z = (X − μ)/σ   (5.12)
A Z-statistic can also be created from a sample drawn from a parent population. Here <X> is the sample mean, μ is the parent population mean, σ is the standard deviation of the sample and n is the number of samples:
Fig. 5.7 P(X ≤ x)
z = (<X> − μ)/(σ/√n)   (5.13)
As discussed briefly in the previous section, suppose we were interested in calculating the number of outcomes in a distribution that were more or less than a certain value. In that case, the probabilities must be summed over the relevant fraction of the distribution, noting that the sum of all probabilities in the distribution must equal one. A cumulative distribution, i.e., the integral of the standard normal distribution, would be used for this calculation. For example, we might want to know the likelihood that any particular theft selected at random from a distribution of thefts exceeds 100 dollars in value. This problem is equivalent to asking what fraction of all thefts exceeds 100 dollars if the theft amount is a continuous, normally distributed random variable. Tables are available that contain the area under the standard normal distribution curve that corresponds to the sample z-statistic from z = 0 to some specified value. These data are used to calculate cumulative distribution values, and thereby obviate the need to perform the integration alluded to above. Estimates of likelihood using probability distributions ultimately entail counting a fraction of the total number of outcomes in a distribution, noting that sometimes such fractions are part of a continuous distribution of elements. Figure 5.8 illustrates the areal fraction of the standard normal distribution, N(0,1), that exceeds the mean x = 0 for the random variable X. In other words, the shaded area corresponds to the area under the curve for the values from x = 0 to plus infinity. That area is determined by integrating N(0,1) within those limits. We see from Fig. 5.8 that the fraction is 50%. In Chap. 7, the standard normal distribution will be used to perform other calculations of this type.
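The theft calculation can be illustrated directly from the z-statistic. The mean and standard deviation below are hypothetical assumptions for the sake of the sketch, since the text does not supply them:

```python
import math

def phi(z):
    """Standard normal CDF evaluated at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical theft distribution: mean $80, standard deviation $15.
# These figures are illustrative assumptions, not values from the text.
mean_theft, sd_theft = 80.0, 15.0

z = (100.0 - mean_theft) / sd_theft   # z-statistic for a $100 theft
p_over_100 = 1.0 - phi(z)             # fraction of thefts exceeding $100

print(round(z, 2))           # 1.33
print(round(p_over_100, 3))  # roughly 0.091, i.e., about 9% of thefts
```

Under these assumed parameters, roughly one theft in eleven would exceed 100 dollars.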
5.6 Statistical Confidence and the p-value
Nearly every description of the p-value, i.e., the probability-value, begins with a statement about how the method is misunderstood and/or misapplied. It appears this treatment is no exception but hopefully we can avoid a similar fate. A discussion of
Fig. 5.8 Integrating the standard normal distribution N(0,1)
the p-value is included in order to provide a basic familiarity with the concept since it is used in many risk management applications. The following is a working definition of the p-value: The p-value is the probability of obtaining a result at least as extreme as the one observed in the sample data, assuming the null hypothesis is true.
This definition is admittedly somewhat convoluted. An example will be provided later, which might provide clarity. First, we will discuss the links between the p-value, hypothesis testing and the normal distribution, which are essential to understanding the meaning of a p-value and its application. The p-value is ultimately a method that facilitates predictions of statistical significance. Estimates of statistical significance are inherently probabilistic. They assess the likelihood that observed results could be explained by random fluctuations. The assessment is conducted via hypothesis testing, which involves assessing the veracity of two mutually exclusive theories. Because of this mutual exclusivity, both theories cannot be simultaneously true. The two hypotheses are presented as an alternative and a null hypothesis. Hypothesis testing is fundamental to statistics, and is used to analyze the results of studies in every possible discipline. Such studies are typically concerned with making comparisons between two groups or between one group and the entire population. For example, a medical trial might evaluate the effect of a drug taken by trial participants. The participants would be divided into a group that has been administered the drug and a control group where the same drug has been withheld. As noted above, the p-value is also linked to the normal distribution. Recall in a normal distribution prescribed fractions of the distribution correspond to a number of standard deviations, i.e., deviations from the mean. Figure 5.4 specified the fractions of a normally distributed population associated with each standard deviation. Recall also that the z-statistic represents a normally distributed random variable parameterized in terms of standard deviations. Therefore, the z-statistic enables direct quotes of deviations from the distribution mean. 
A hypothesis test requires an assumption regarding a distribution for the test statistic, i.e., a measured parameter of interest. For a z-test, the normal curve is used
as an approximation to the distribution of the test statistic. According to the Central Limit Theorem, averages sampled from a parent distribution will tend towards a normal distribution as the number of samples increases. However, the resulting distribution will always be an estimate because real-world data never follow a normal distribution exactly, as noted previously in Sect. 5.4. The normal distribution enables determinations of the meaningfulness of study results via the z-statistic. The higher or lower the z-statistic (depending on the direction of the test), the less likely the result occurred by chance, i.e., the more likely the result is meaningful and not due to random fluctuations. The p-value enables quantification of the meaningfulness of a particular sample, which in this case is achieved using a normally distributed z-statistic. We now dig a bit deeper into the p-value and give an example of its use. Recall the p-value measures how compatible the sample data are with the null hypothesis. In other words, how likely is the effect observed in your sample if the null hypothesis is true? A high p-value means your data are consistent with a true null hypothesis. A low p-value means your data are unlikely assuming a true null hypothesis. In that case, your sample provides sufficient evidence to reject the null hypothesis for the parent population. Recall from the definition that a p-value is the probability of obtaining an effect at least as extreme as the one observed in the sample data, assuming the null hypothesis is true. When hypothesis testing for drugs or vaccines, researchers are looking for an effect within the group that has been given the drug relative to the control group that has been denied the drug. The null hypothesis is that the drug has no effect. For example, suppose that a vaccine is tested within a population where the study outcome yielded a p-value of 0.03.
This value specifies that if the vaccine had zero effect, i.e., if the null hypothesis is true, there is only a 3% probability of observing the effect seen in the vaccinated group due to random sampling error. The p-value only addresses one question: how likely are your data assuming the null hypothesis is true? Whether or not the result is statistically significant depends on the value that has been established for statistical significance, which is often designated as alpha. If the observed p-value is less than the chosen value of alpha, the results are statistically significant. Intellectual honesty mandates the choice of alpha be made before the experiment. If one specifies the level of significance after the results are declared, a number could be selected that proves significance no matter what the data reveal! The choice of alpha depends on the study, but a commonly used value is 0.05. This figure corresponds to a 5% chance that the results occurred at random. In fact, an alpha of 0.05 is arbitrary. R.A. Fisher, the father of modern statistics, chose 0.05 for indeterminate reasons and it persists today. Values from 0.1 to 0.001 are also commonly used. The physicists who discovered the Higgs boson used a
Fig. 5.9 p-value for a left-tailed test
value of 0.0000003 for alpha, i.e., the threshold for the probability that the discovery was explained by noise is 3 in ten million, or roughly 1 in 3.33 million.11 To reiterate, the p-value is the probability of obtaining a measurement that is more extreme in value than those observed in the sample data assuming the null hypothesis is true. Note that extremeness will depend on the problem at hand. In a right-tailed statistical test, “more extreme” refers to the fraction of the sample normal distribution that is greater than some value; in a left-tailed test it refers to the fraction that is less. The p-value for a left-tailed test corresponds to the areal fraction of the standard normal distribution that is more extreme than the observed z-statistic, i.e., the measured sample data, assuming the z-statistic follows a normal distribution. All the values to the left of z are more extreme than z. Fig. 5.9 illustrates the p-value for a left-tailed test. The distribution is the standard normal distribution N(0,1). In a right-tailed test, the values to the right of z are the ones more extreme than z. Therefore, the p-value is the probability that the values of the sample data are more extreme than what was measured, i.e., the observed z-statistic. Fig. 5.10 illustrates the p-value for a right-tailed test. Finally, Fig. 5.11 displays the p-value for a two-tailed test, which is the probability that the values of the samples are either larger OR smaller than the observed test statistic. A table or statistical software is used to transform a z-statistic into a p-value. The result will reveal the probability of a z-statistic lower than the calculated value. For example, if the z-statistic is 2, the corresponding cumulative probability is 0.9772. This result means there is only a 2.3% probability a z-score would randomly be higher than 2. Figure 5.12 shows the area of the standard normal distribution for a p-value corresponding to a z-statistic of 2.
In other words, the percentage of the distribution below a z-statistic of 2 is 97.7%.
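The table lookup described above reduces to a single expression in software. A short Python check of the z = 2 figures quoted in the text:

```python
import math

def phi(z):
    """Standard normal CDF: the 'z-table lookup' performed in software."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p = phi(2.0)                # cumulative probability below z = 2
print(round(p, 4))          # 0.9772
print(round(1.0 - p, 4))    # 0.0228, i.e., about a 2.3% chance of exceeding z = 2
```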
11 https://towardsdatascience.com/statistical-significance-hypothesis-testing-the-normal-curve-andp-values-93274fa32687
Fig. 5.10 p-value for a right-tailed test
Fig. 5.11 p-value for a two-tailed test
To recap, statistical significance requires a hypothesis and anti-hypothesis to test a theory. A normal distribution is used to approximate the distribution of test results. The p-value specifies the probability of observing a result at least as extreme as the z-statistic assuming the null hypothesis is true. The p-value is based on the z-statistic. Let’s now examine a security-related scenario. An organization might be in the market for a security widget. It is considering purchasing a particular brand that is considerably more expensive than the generic brand of widget. The brand-name manufacturer contends its widgets operate longer than the generic widget without experiencing a failure. Suppose the alternative hypothesis is that the mean number of continuous hours of operation of the more expensive brand is below the mean of the generic brand widget. The mean number of uninterrupted hours of operation for the generic widget is listed as 7.02 h. Therefore, the null hypothesis is that the mean number of continuous hours of operation of the more expensive security widget is not below the mean of the generic security widget. This assertion is in direct opposition to the alternative hypothesis and therefore both assertions cannot be true. We will measure the results using a sample
Fig. 5.12 Area of a standard normal distribution corresponding to a z-statistic of 2 [p(x < 2) = 0.97725]
number of hours for the brand name security widget and determine the statistical significance of the result using the p-value. We note this is an example of a one-sided hypothesis since only one direction is of interest. We wish to know whether the measured hours of continuous operation of our more expensive brand of security widget is a statistical anomaly. We assume alpha is 0.05, i.e., 5%, which means if the p-value is below this number the result is meaningful. In other words, there is a low probability that the results would be explained by random fluctuations. We use a left-tailed test for this exercise as shown in Fig. 5.9. The distribution about the mean is the statistic of interest. The average number of hours of continuous operation for the generic security widget is 7.02 h. The null hypothesis is that the mean number of continuous hours of operation for the more expensive brand of security widget is not below the mean of the generic security widget, i.e., the expensive security widget performance is at least as good as the cheaper brand. In striking contrast to the manufacturer’s advertised performance, the average number of continuous hours for the expensive widget is measured from a sample of 202 widgets to be 6.90 h with a standard deviation of 0.84 h. The statistical problem boils down to evaluating two alternative theories regarding the mean of a standard normal distribution, and both alternatives are shown in Fig. 5.13, where the dotted line is the hypothesized distribution of the expensive widget and the continuous line is the same for the generic variety. Using Eq. (5.13) we calculate the z-statistic for this scenario as follows: z-statistic = (6.90 − 7.02)/(0.84/√202) = −2.03. The p-value corresponding to the test statistic of −2.03 is approximately 0.02, which is below the alpha of 0.05. Therefore, we can reject the null hypothesis in
Fig. 5.13 Graphical representation of two theories regarding security widget hours of operation
favor of the alternative hypothesis. In other words, there is statistically significant evidence that the average number of continuous hours of operation for the expensive security widget is indeed less than the average of the generic widget. Presumably, significant financial resources have been saved as a result of this analysis. Alternatively, we can say there is a 2% likelihood that the observed number of continuous hours of operation of the expensive security widget is due to random variations in the data rather than a reflection of its real performance. Based on the alpha value selected in this case, a p-value of at least 5% would be required to retain the null hypothesis. However, before registering a complaint with the security widget manufacturer for false advertising, it is worth pointing out that if the level of statistical significance were chosen to be 0.01, there would be statistical confidence in the null hypothesis. This statement reflects the importance of objectivity in selecting alpha since it represents the threshold for statistical confidence. Finally, note the p-value is used to estimate statistical significance of a hypothesis and NOT the absolute veracity of that hypothesis. Therefore, a p-value that exceeds alpha does not necessarily mean the null hypothesis is true. Conversely, a p-value less than alpha does not imply the null hypothesis is not true. Rather, the measured p-value relative to alpha indicates the statistical confidence in the null or alternative hypotheses relative to a result explained by random fluctuations in the data.
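The entire widget hypothesis test can be reproduced in a few lines of Python. The sketch below uses the figures from the example (sample mean 6.90 h, null mean 7.02 h, standard deviation 0.84 h, n = 202, alpha = 0.05):

```python
import math

def phi(z):
    """Standard normal CDF evaluated at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Figures from the security widget example.
mu0 = 7.02     # null-hypothesis mean: generic widget hours of operation
xbar = 6.90    # measured sample mean for the expensive widget
sd = 0.84      # sample standard deviation (hours)
n = 202        # sample size
alpha = 0.05   # significance threshold, chosen before the test

z = (xbar - mu0) / (sd / math.sqrt(n))   # Eq. (5.13)
p_value = phi(z)                         # left-tailed test: area to the left of z

print(round(z, 2))       # -2.03
print(round(p_value, 3)) # 0.021
print("reject null" if p_value < alpha else "retain null")
```

Since the p-value (about 0.021) falls below the chosen alpha of 0.05 but above 0.01, the decision to reject hinges entirely on the alpha selected before the experiment, which is the point the text makes.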
5.7 The Poisson Distribution12
Another distribution that is sometimes applicable to threat scenarios is the Poisson distribution. This distribution is used to determine the probability that a specific number of events will occur in a fixed interval of time or space, assuming the following conditions are met.13
• The probability of an incident occurring in a time interval Δt is proportional to Δt when Δt is very small.
• The probability that more than one incident occurs in a time interval Δt is negligible when Δt is very small.
• The number of incidents that occur in one time interval is independent of how many occur in any other non-overlapping time interval.
For example, the Poisson distribution can be used to predict the expected number of radioactive particles emitted by a radioisotope in a given time interval. This prognostication might seem too good to be true since an actual prediction of future threat incidents is the Holy Grail of a security risk assessment. However, the prerequisites for applying Poisson as stated above are stringent. There is no getting around the fact that probability distributions are inherently statistical. Therefore, they must be based on a population of historical threat incidents, which in this case must be independent and appear at a constant rate. However, Poisson distributions could be applied to specific threat scenarios if the conditions noted above are indeed satisfied. An example will help to demonstrate its applicability. Suppose the objective is to determine the probability p(k) that exactly k incidents or outcomes resulting from a process will occur in a given time period, t. The rate of incident occurrence is constant, and is given by the Greek letter lambda (λ). The Poisson probability distribution density function is as follows:

p(k) = e^(−λt)(λt)^k/k!   (5.14)
One immediately observes that p(k) is strongly dependent on λ. An idealized threat scenario is as follows: A security professional knows from her meticulous records that threat incidents requiring significant security control resources occur at a rate of 1.4 per year. Each incident is independent, and the rate of incident occurrence is constant. She is preparing a long-term security strategy and wants to calculate the likelihood of experiencing exactly one incident over the next 5 years. This is equivalent to calculating p(k = 1) in (5.14).
12 Siméon Denis Poisson, French mathematician, 1781–1840.
13 Poisson and Normal Distributions, Rochester Institute of Technology, Lecture 7; https://www.cis.rit.edu/class/simg713/Lectures/Lecture713-07.pdf
Fig. 5.14 Poisson distribution of expected security incidents
In this case, t is 5 years, λ is 1.4/year and k is 1. Therefore, the exponent λt becomes 1.4 × 5 = 7. Plugging this figure and the remaining values into (5.14) yields the probability of precisely one incident occurring in a 5-year period: p(1) = e^(−7) × 7^1/1! = 0.006 = 0.6%. The security professional could then multiply this figure by the cost of remediating a single incident to compute the expectation value for this threat. She can then compare this value with similar computations for other threats, assuming their probabilities can be similarly estimated. A security strategy could be informed by such calculations as it incorporates the likelihood of threat incident occurrence and the cost of resources required to address them. Equation (5.14) can also be used to generate a probability distribution of incidents x in a 5-year time frame, again assuming a constant incident rate λ equal to 1.4 per year. This probability distribution is shown in Fig. 5.14. Equation (5.14) is also sensitive to the selected time interval. For example, if the rate of incidents remains 1.4 per year, the expected number of events in a 2-year interval is 2 × 1.4 = 2.8. If we are interested in knowing the probability that exactly one event will occur in a 2-year period, (5.14) is still applicable but now λt = 2.8. The resulting calculation yields a probability of 17% that exactly one event will occur in a 2-year time interval. Contrast this figure with the probability that exactly one event will occur in a 5-year interval, i.e., 0.6%. One might argue that such a distribution is superfluous since a straightforward multiplication shows that seven events would be expected in a 5-year interval. Indeed, an examination of Fig. 5.14 shows that seven events is the most likely threat scenario for this time interval relative to other possible scenarios. However, there is a distribution of possible threat scenarios, where each has a probability of occurrence.
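The Poisson calculations above are easily verified in software. Below is a short Python sketch of Eq. (5.14) using the incident rate of 1.4 per year from the scenario; the function name poisson_pmf is illustrative:

```python
import math

def poisson_pmf(k, lam_t):
    """P(exactly k events) when the expected event count over the interval is lam_t."""
    return math.exp(-lam_t) * lam_t**k / math.factorial(k)

rate = 1.4  # incidents per year, assumed constant (from the scenario)

p_one_in_5yr = poisson_pmf(1, rate * 5)   # lambda*t = 7
p_one_in_2yr = poisson_pmf(1, rate * 2)   # lambda*t = 2.8

print(round(p_one_in_5yr, 3))   # 0.006, i.e., 0.6%
print(round(p_one_in_2yr, 2))   # 0.17, i.e., 17%
```

Iterating poisson_pmf over k = 0, 1, 2, … for a fixed λt reproduces the full distribution of Fig. 5.14.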
This distribution provides the full spectrum of scenarios, which might be leveraged pursuant to conducting a cost-benefit analysis of required security controls. There are many probability distributions in addition to the normal and Poisson distributions. The applicability of a particular distribution depends on the threat scenario. Details associated with other distributions are beyond the scope of this
book. However, below are four basic questions that could be helpful in identifying a probability distribution that is relevant to a particular threat scenario14:
1. Does the data assume discrete or continuous values? If there are large samples of data, a continuous distribution can be assumed thereby enabling integration of the distribution density function.
2. Are the data symmetric? In other words, are positive and negative values equally likely? Data symmetry points to a normally distributed set of values.
3. Can upper and/or lower limits to the data be identified? If so, these limits would be used to normalize the distribution since by definition a probability distribution must sum to one.
4. Do extreme values of the data occur frequently or infrequently? The frequency of extreme values points to specific distributions as models of behavior. For example, financial and physical catastrophes occur infrequently but cause huge losses. The log-normal distribution is sometimes used to model such scenarios.
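The four questions above can be screened programmatically against a data sample. The sketch below is a crude illustration only; the symmetry threshold and the sample values are arbitrary assumptions, not criteria from the text:

```python
import statistics as st

def screen_sample(data):
    """Rough answers to the four distribution-selection questions for a sample."""
    mean, median = st.mean(data), st.median(data)
    sd = st.stdev(data)
    # Count observations more than three standard deviations from the mean (Q4).
    extremes = sum(1 for x in data if abs(x - mean) > 3 * sd)
    return {
        "discrete": all(float(x).is_integer() for x in data),  # Q1
        "roughly_symmetric": abs(mean - median) < 0.25 * sd,   # Q2 (crude proxy)
        "bounds": (min(data), max(data)),                      # Q3
        "extreme_fraction": extremes / len(data),              # Q4
    }

# Illustrative sample: hypothetical theft amounts in dollars.
sample = [72, 85, 78, 91, 80, 76, 88, 82, 79, 84]
print(screen_sample(sample))
```

Such a screen only suggests candidate distributions; a formal goodness-of-fit test would still be required before relying on any of them.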
5.8 Value-at-Risk
We now know that quantitative estimates of the likelihood component of risk are not possible in the absence of a probability distribution of historical threat incidents. In such instances, indirect assessments of the likelihood component of risk are required. Various disciplines use metrics to estimate risk and particularly the exposure to loss. As one might expect, the financial industry assiduously measures potential gains and losses associated with individual transactions as well as portfolios of assets. Predictions of asset performance are based on quantitative models used to guide investment strategies. Financial institutions expend significant resources developing such models, and are continuously refining them to achieve better results in light of empirical data. However, even the most sophisticated institutions struggle with modeling rare events that result in extreme losses. Such losses can be devastating as the world witnessed in 2008. It would appear from the widespread results of that global calamity that existing risk models were either overly sanguine or largely ignored. In order to minimize the effect of big economic shocks, the Basel Committee on Bank Supervision developed the Basel Accords, which provide recommendations on managing capital, operational and market risks.15 The purpose of the accords is to ensure that financial institutions hold enough capital in reserve to withstand unexpected losses. The value-at-risk or VaR is a metric used by the banking industry to determine how much capital reserve is required to offset potential losses. VaR rates the
14 http://people.stern.nyu.edu/adamodar/New_Home_Page/StatFile/statdistns.htm
15 https://www.investopedia.com/terms/b/basel_accord.asp
Fig. 5.15 Value-at-risk
probability of loss relative to a fraction of the value of the portfolio for a given time interval. A representative curve depicting VaR is shown in Fig. 5.15.16 The 99.9th percentile of loss appears in Fig. 5.15. In other words, the magnitude of portfolio losses in a particular time interval would be expected to be less than the specified amount 99.9% of the time. Note that VaR should not be confused with vulnerability. The vulnerability of a portfolio of assets is nothing more than the value of that portfolio; it represents the actual amount that could be lost. In contrast, the VaR expresses the likelihood that a specified amount or greater will be lost over some time interval. VaR has also been applied to estimating information security risk.17 However, and as noted in Chap. 3, a probability distribution that accurately depicts the likelihood component of risk in IT scenarios remains elusive. Isolating the individual contributions of identified risk factors to the magnitude of risk is particularly difficult in information security-related threat scenarios for reasons cited previously. Moreover, the very notion of “loss” is not straightforward in these contexts. Determining a meaningful probability distribution of loss in terms of a specific risk factor would be challenging for any IT environment. Therefore, the applicability of VaR to information security risk assessments would depend on how it is applied and how the results are interpreted. It cannot be viewed as a measurement of absolute risk but might be used to assess relative risk. For example, an organization might measure an indicative value for VaR at various times in the hope of revealing significant fluctuations. Even if VaR is only used in this limited capacity, two points are true in general with respect to risk models and therefore worth noting here: first, the underlying model used to generate a curve such as the one shown in Fig.
5.15 is impossible to verify without historical data, noting that extreme events are relatively rare. Second,
16 Federal Register. Risk-Based Capital Standards; Advanced Capital Adequacy Framework; Basel II. www.federalregister.gov; https://www.federalregister.gov/articles/2007/12/07/07-5729/riskbased-capital-standards-advanced-capital-adequacy-framework%2D%2D-basel-ii
17 J. Rees & J. Jaisingh, Center for Education and Research, Information Assurance and Security, Purdue University, CERIAS Tech Report, 2001–127.
identifying the source of a fluctuation would still be required. At a minimum, VaR could be used to highlight a general risk condition, which might precipitate various pre-emptive measures.
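Although the text emphasizes the limits of VaR, its basic mechanics are simple to illustrate. The sketch below computes a historical-simulation VaR, i.e., a percentile of observed losses; the loss figures are hypothetical, and this is a simplification rather than the parametric model behind a curve such as Fig. 5.15:

```python
def value_at_risk(losses, percent=99):
    """Historical-simulation VaR: the loss level that observed losses stayed
    below roughly `percent` percent of the time (integer percentile)."""
    ordered = sorted(losses)
    idx = min(len(ordered) * percent // 100, len(ordered) - 1)
    return ordered[idx]

# Hypothetical daily portfolio losses (in thousands of dollars).
losses = [5, 12, 3, 40, 7, 22, 9, 15, 60, 11,
          4, 18, 25, 8, 6, 30, 2, 10, 14, 90]

print(value_at_risk(losses, percent=95))  # 90
```

The weakness the text identifies is visible here: with only twenty observations, the extreme tail that matters most is represented by a single data point, so the estimate is impossible to verify for rare, severe events.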
5.9 Summary
The probability of a future threat incident type, should it occur, can only be calculated if a probability distribution of historical threat incidents exists and threat scenario risk factors remain stable over risk-relevant time scales. A probability specifies the relative frequency of a specific element within a collection of like elements. A probability distribution is characterized by its mean and standard deviation. The standard deviation and variance, the square of the standard deviation, reflect the uncertainty in the mean of the distribution. The normal or Gaussian distribution is the most common and well-known probability distribution. The validity of sampling and resulting generalizations derive from the Central Limit Theorem, which states that repeated measurements of the average of samples drawn from any parent distribution tend toward a normal distribution even if the parent itself is not normally distributed. The Central Limit Theorem ensures that a sample of elements drawn from a parent population can be used to describe the parent. The standard normal distribution is a normal distribution with a mean of zero and standard deviation of one. The standard normal distribution of a sample can be parameterized in terms of the z-statistic, which is a random variable re-parameterized in terms of the distribution mean and standard deviation. Tables for z-statistics are published, and these enable estimates of probabilities for various threat scenarios. The p-value determines statistical significance, and is used in conjunction with the z-statistic and hypothesis testing. The p-value is frequently used to validate the results of experimental trials and thereby determine if results are meaningful or more likely explained by statistical noise. The Poisson distribution expresses the probability that a given number of events will occur in a fixed interval of time. 
Invoking the Poisson distribution requires that events occur with a constant rate and each event occurs independently of other events. Although Poisson processes applied to threat scenarios are relatively rare, they are applicable if specific and somewhat restrictive conditions have been met. Statistical methods and metrics from other disciplines have been applied to threat scenarios. One such example is the Value-at-Risk or VaR, which measures the probability that the magnitude of loss will exceed a certain value over a specified time interval. This method is traditionally used in the banking industry and has been suggested as a metric for information security environments. General limitations associated with risk models are also relevant to VaR when applied to threat scenarios, most notably the rarity of extreme events.
Chapter 6
Identifying and/or Quantifying Risk-Relevance
6.1 Introduction
Imagine if it were possible to actually measure the contribution of each risk factor to the overall magnitude of risk for any threat scenario. For example, suppose we could design a risk meter such as the one described in Chap. 1 and plug the resulting output into an equation that could generate a meaningful risk metric for any threat scenario. Such a metric could accurately guide decisions on security risk management. Similar metrics exist in other fields of risk management. A medical work-up often includes a blood test, which presents a patient with a laundry list of results. Each test result is presumably a measurement of a likelihood risk factor for one or more diseases. Of course, an individual test result does not yield a complete picture of the likelihood component of risk for a given disease. The physician’s job is to interpret test results based on the threat scenario, which includes the disease (threat), the patient (affected entity) and environmental risk factors. Although diagnosing the problem is a necessary first step, the ultimate objective is to cure the disease and prevent its reoccurrence. The risk meter is the security risk management-equivalent of a blood test. If such an instrument did exist, it would facilitate measurements of security risk for any threat scenario. The obvious question is why doesn’t such a device exist? Threat scenario models are desired in any risk management discipline. They enable generalizations about the magnitude of risk for variants of a threat scenario and thereby facilitate risk management strategies. A model for estimating the magnitude of threat scenario risk would require knowing the contribution of each risk factor to the overall magnitude of risk. The model might be in the form of a master equation where we plug in the individual risk factor values using appropriate units and out pops a meaningful risk score.
© Springer Nature Switzerland AG 2019 C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_6
The Fundamental Expression of Security Risk appears to offer a glimmer of hope in identifying such a model. However, this expression does not actually convey the individual contribution of each component. It merely specifies that the overall magnitude of risk is represented as a product of three components.
Threat scenarios often contain multiple risk factors. These risk factors can be inter-related and produce different outcomes depending on the context. Therefore, it is difficult to correlate a particular risk factor with a specific outcome. These conditions are the reasons why a universal model of security risk remains elusive.
Compounding the problem is that risk factors can change with time. The stability of threat scenarios has important implications for assessing the likelihood component of risk. The presence of identical risk factors across threat scenarios, i.e., risk factor stability, means the resulting threat incidents are comparable. Therefore, an accurate measurement of threat scenario stability could reveal threat incident similarity and thereby identify threat incidents that could be included in the same probability distribution.
A significant issue affecting measurements of threat scenario risk is the absence of threat incidents. Threat incidents result from threat scenario risk factors. Therefore, a lack of incidents makes it impossible to quantitatively relate cause and effect. Recall this condition is the basis for indirect assessments of the likelihood component of risk.
Although a master security risk equation does not exist, risk-relevant features can still be identified, measured, analyzed and displayed using traditional quantitative methods. Some of the relevant concepts and methods are presented in this chapter.
6.2 Linearity, Non-linearity and Scale
The initial phase of a risk assessment focuses on identifying the threats and risk factors. Assessments subsequently attempt to understand the relationship between threats and affected entities, which relates to risk factor details. The concept of scale is particularly useful in characterizing the dependence of a component of risk on a risk factor. To appreciate the meaning and utility of scale requires an understanding of linearity and non-linearity. Ultimately, the scale of a relationship between two variables depicts how they are changing with respect to each other.
A function is a mathematical expression that relates two or more variables. The expressions y = x, y = x² and y = x³ are examples of relatively simple functions. Each function relates the magnitude of the dependent variable y to the magnitude of the independent variable x. A function is really a prescription that specifies the magnitude of change in a variable y resulting from a change in a variable x (or multiple variables).
Fig. 6.1 Linear function (y = x = x¹)
Recall the concept of an exponent, which the reader likely first encountered in high school algebra class. An exponent indicates the number of times a variable is multiplied with itself. For example, with respect to the function y = x¹, x is multiplied by itself one time. The expression y = x¹ is equivalent to y = x. The expression y = x² specifies that x should be multiplied by itself two times, i.e., x times x. In these expressions, the exponent is 1 and 2 respectively.
A function is “linear” if the exponent of the independent variable is one. Therefore, in the expression y = x the dependent variable y is linear in the independent variable x. In other words, y is a linear function of x. If a linear function is plotted on a graph it looks exactly as its name suggests: a straight line. Importantly, and for any linear function, a change in the value of the independent variable x will yield a proportionate change in the value of the dependent variable y. For example, if the value of x is doubled it will result in a doubling of y. Tripling the value of x will triple y, etc. Figure 6.1 depicts the linear function y = x.
In contrast, the function y = x² is an example of a non-linear function, and is depicted in Fig. 6.2 for the values x = 1 to x = 10. It is readily apparent that the form of this function is anything but a straight line. In fact, the value of the dependent variable y becomes disproportionately larger for increasing values of x. Disproportionate growth (or decay for negative exponents) is the hallmark of a non-linear function. For example, if we double the independent variable x from 2 to 4, the resulting value of the dependent variable increases from 4 to 16, i.e., a difference of 16 − 4 = 12. However, if we double the independent variable from 5 to 10, the resulting value of y jumps from 25 to 100, i.e., a difference of 100 − 25 = 75.
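The proportionate versus disproportionate growth described above can be verified with a few lines of Python (a minimal sketch; the function names are ours):

```python
# Sketch: proportionate vs. disproportionate growth for y = x and y = x^2.
def linear(x):
    return x          # y = x: doubling x doubles y

def quadratic(x):
    return x ** 2     # y = x^2: doubling x quadruples y

# Doubling the input from 2 to 4:
assert linear(4) - linear(2) == 2           # a change of 2
assert quadratic(4) - quadratic(2) == 12    # 16 - 4 = 12
# Doubling from 5 to 10 produces a much larger jump for the quadratic:
assert quadratic(10) - quadratic(5) == 75   # 100 - 25 = 75
```

The same doubling of the input yields a constant increment for the linear function but an ever-growing increment for the quadratic, which is precisely the hallmark of non-linearity noted above.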
Fig. 6.2 Nonlinear function (y = x²)
Let’s say a specific threat scenario risk factor has been identified, which could be a risk factor for any one of the three components of risk. This risk factor might be a particular feature of the threat scenario. We want to determine the relationship between this risk factor and the magnitude of the relevant component of risk. If such a relationship were identified, it could yield risk-relevant information about the magnitude of threat scenario risk under various scenario conditions, i.e., different values of the feature. Central to characterizing such a relationship is specifying a risk-relevant function over a range of feature values. This range of values reveals the scale of the relationship. A plot of the relationship might appear similar to either Fig. 6.1 or Fig. 6.2, i.e., linear or non-linear, or a very different function depending on the threat scenario.
For example, suppose the threat scenario is the malicious interception of speech. A competitor is suspected of conducting a surreptitious interception of speech-borne information from a contiguous room. The company under attack uses a conference room where sensitive discussions take place. The company attempts to reduce the magnitude of information security risk by interposing an acoustic barrier between the conference room and the contiguous room.
The decision facing the company interested in protecting its information is whether to purchase and install acoustic shielding material A or shielding material B. The former is more expensive than the latter but offers enhanced performance. Can the difference in performance be quantified, thereby assessing the relative difference in information security risk? Significant risk factors in this threat scenario are physical proximity by an adversary and a wall in common that transmits acoustic energy across the audible spectrum.
The leaked acoustic intensity, as measured in the room that is contiguous with the conference room, scales inversely with shielding material thickness t. The decision regarding which material to purchase depends on the relative rates of change in intensity as a function of material thickness at audible frequencies. The scaling relation between the acoustic intensity and material thickness reveals the relative rate of change in intensity.
Prudently, the company seeking to protect its information does not trust the advertised performance specifications of the shielding material. Therefore, company engineers have independently measured the scaling of transmitted intensity to be 1/t², i.e., t⁻², if material A is interposed between the conference room and the contiguous room. In contrast, the intensity scales as 1/t, or t⁻¹, for material B. The acoustic intensity is measured at a frequency of one kilohertz for both materials, which corresponds to a peak in the human audible spectrum.¹
The likelihood of a successful attack via this vector decreases dramatically if a relatively small amount of material A is used as an acoustic barrier. Significantly less attenuation occurs if an equivalent amount of material B is used for the same purpose, thereby demonstrating the importance of comparing material performance. The effect of each material can be easily quantified using the aforementioned scaling relationships.
It is helpful to represent the shielding performance of the two materials on the same graph, where transmitted acoustic intensity versus material thickness can be readily compared.
Fig. 6.3 Acoustic energy intensity attenuation as a function of shielding thickness (materials A and B; thickness in arbitrary units; intensity normalized to one)
We see from Fig. 6.3 that the shielding performance of material A increases at a faster rate than that of material B, where the relative intensity for both materials is normalized to one. The graph confirms that material A is clearly superior to material B. The operational questions are twofold: what material thickness is actually required for each material, and what is the relative return on investment? Recognize that although the relative attenuation of A is superior to B, material B might actually be good enough for this threat scenario.
¹ The attenuation for a given material can vary significantly with frequency.
If so, the extra money spent on material A would be a waste if all other considerations were equal. Alternatively, although more effective than material B, material A might also not be satisfactory, and its purchase would be a waste of money while potentially providing a false sense of security.
The absolute efficacy of either material is impossible to determine from Fig. 6.3. The intensity has been normalized to unity to facilitate comparisons of performance. In addition, arbitrary units were used for material thickness. In other words, the graph merely reveals the relative rate of signal attenuation and not the actual reduction in transmitted energy. The latter requires an estimate of the signal amplitude after the acoustic energy traverses each material relative to the amplitude of the ambient noise in the frequency band of interest. The ultimate objective of installing acoustic shielding, and thereby addressing a risk factor for information loss, is to ensure the signal-to-noise ratio as measured by an adversary is 1 or lower.
The cost-benefit ratio is also impossible to determine from Fig. 6.3 alone. This ratio has nothing to do with the magnitude of threat scenario risk. However, it is potentially a factor in the risk management decision since other threats may need mitigation and therefore are competing for resources. The cost-benefit ratio can be determined from the scaling relations for signal attenuation in conjunction with the cost of the material. For example, suppose the cost of acoustic shielding scales linearly with thickness. We know that the attenuation due to material B scales as t and the attenuation due to material A scales as t². Table 6.1 shows the disparity in the cost-benefit ratios associated with material A and material B. The cost-benefit ratio of material B remains constant with increasing material thickness, whereas the ratio for material A decreases with thickness, i.e., the ratio becomes more favorable with increasing thickness.

Table 6.1 Cost-benefit ratio

Relative cost  Relative thickness  Relative attenuation (A)  Relative attenuation (B)  Cost-benefit ratio (A)  Cost-benefit ratio (B)
1              1                   1                         1                         1                       1
2              2                   4                         2                         0.50                    1
3              3                   9                         3                         0.33                    1
4              4                   16                        4                         0.25                    1
5              5                   25                        5                         0.20                    1
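The entries in Table 6.1 can be reproduced with a short calculation (a sketch, assuming, per the text, that cost scales linearly with thickness while attenuation scales as t for material B and t² for material A):

```python
# Sketch of Table 6.1: cost scales linearly with thickness (an assumption
# stated in the text); attenuation scales as t^2 for material A and t for B.
def cost_benefit(thickness, exponent):
    cost = thickness                    # relative cost ~ thickness
    benefit = thickness ** exponent     # relative attenuation
    return cost / benefit

for t in range(1, 6):
    ratio_a = cost_benefit(t, 2)        # material A: attenuation ~ t^2
    ratio_b = cost_benefit(t, 1)        # material B: attenuation ~ t
    print(t, round(ratio_a, 2), ratio_b)
```

The loop prints the cost-benefit columns of the table: material B's ratio is always 1, while material A's ratio falls as 1/t.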
It is clear that material A is a better deal for thicker materials although we still have not determined if one or both materials will actually be effective. Note that the relative benefit of selecting material A would not be so favorable if the cost increased non-linearly with thickness. The axes of Fig. 6.4 are represented on a linear scale. Note that a logarithmic scale could have been used depending on the range of values being displayed. Logarithms are the inverse of exponents. They enable the display of broad ranges of values on a single graph, thereby immediately revealing features of scale over multiple decades, i.e., powers of ten.
Fig. 6.4 Cost-benefit ratio for two acoustic shielding materials
An equivalent representation of the numbers 10, 100, 1000 and 10,000 is 10¹, 10², 10³ and 10⁴ respectively. We now know that the exponent in each case is 1, 2, 3 and 4, respectively. Therefore, the value of the logarithm equals the exponent. The number 10 in this case is the base. It is clear that log(10) is 1, log(100) is 2, log(1000) is 3 and log(10,000) is 4.
Next, imagine plotting all the integer values from one to one billion on the same graph using a linear scale. In this case, each number must be specified along an axis, which would necessitate a rather big piece of paper! In contrast, a logarithmic scale could represent that same range of values in only ten integer increments from zero to nine, thereby compactly depicting the range from 10⁰ to 10⁹ or equivalently, from one to one billion. For this reason, particular attention must be paid to the scale of both axes when interpreting risk-relevant data.
Each integer of a logarithmic scale is ten times the previous integer. The familiar Richter scale relating to earthquake intensity is an example of a logarithmic scale. An earthquake of magnitude 7 is actually 10,000 times more powerful than one of magnitude 3, although the difference on a linear scale is merely 7 − 3 = 4 integers.
Figure 6.5 is a log-log plot of Y = X², where each axis is represented by the logarithm of the independent and dependent variables. The slope of any graph with logarithmic axes equals the exponent of the plotted equation. In this case, the slope of the line is 2, i.e., ΔY/ΔX = 2 on the logarithmic axes, which equals the exponent of the independent variable X.
Scaling can help to identify and/or explain risk-relevant phenomena. In particular, risk factors for the vulnerability component of risk often vary with parameters such as distance, time, density, etc. A security risk management strategy focuses on ensuring one or more of these parameters remains below an identified threshold. Such thresholds often become evident from a scaling relation.
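The claim that the slope of a log-log plot equals the exponent can be checked numerically (a minimal sketch using base-10 logarithms; the helper name is ours):

```python
import math

# Sketch: on a log-log plot of y = x^2, the slope equals the exponent.
def loglog_slope(f, x1, x2):
    # Slope between two points after taking log10 of both axes.
    return (math.log10(f(x2)) - math.log10(f(x1))) / \
           (math.log10(x2) - math.log10(x1))

slope = loglog_slope(lambda x: x ** 2, 10, 1000)
assert abs(slope - 2.0) < 1e-9   # the slope recovers the exponent 2
```

The same helper returns 3 for y = x³, 1 for y = x, and so on, which is why log-log axes make power-law scaling immediately visible.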
Fig. 6.5 Log-log plot of Y = X²
In a non-security-related but nonetheless illustrative example, consider the fact that smaller weightlifters are proportionately better at weightlifting than their bulkier teammates. A scaling argument can be used to explain this phenomenon.
Olympic weightlifting competitions consist of two lifts: the clean and jerk and the snatch. The former involves lifting the weight overhead in two separate movements, i.e., the clean followed by the jerk. The latter requires moving the weight from the floor to an overhead position in a single motion. Heavier weightlifters are typically able to lift more kilos than their more diminutive counterparts.
The actual weight that has been lifted reflects the absolute strength of the lifter, which is proportional to muscle cross-section. Muscles are approximately cylindrical in shape, and therefore can be characterized by a radius and a length. Assume that muscle cross-section is approximated by a circle of radius r. The area of a circle is given by πr². Therefore, absolute strength scales as the radius squared. However, weight or body mass increases with muscle volume, which scales as the radius cubed, or r³. Relative strength is defined as the strength-to-weight ratio. Therefore, the strength-to-weight ratio in this case scales as r²/r³ = 1/r. So bigger weightlifters “lose” relative strength at a rate inversely proportional to r. In other words, although heavier weightlifters can lift increasingly greater weight as they gain body mass, their relative strength actually decreases in the process.
Figure 6.6 graphically confirms the results of this simple scaling argument. The line with square data points shows world record lifts in the snatch, and the line with circular data points indicates the weight that would be lifted if relative strength scaled in proportion to body weight. It is apparent that world records do not scale as expected if relative strength increased in proportion to body weight.²
Fig. 6.6 Relative v. absolute strength (weightlifting world records for the snatch: lift in kg versus weightlifter body mass in kg, showing the actual lift and the linearly scaled lift)
The following passage describes the relevance of scale to cylindrical bodies such as human muscles³:
  Cylindrical bodies obey the rules of scale. In fact, consideration of the size scale was the original stimulus for the mechanical implications of the cylindrical form in large organisms. Borelli (1680) understood that bodies increase their mass with the cube of the radius, whereas the strength of a supporting cylindrical stem or leg increases with the square of its radius. Thus, if terrestrial plants and animals are to achieve great size, they cannot just get bigger without changing shape: the radius of cylindrical supporting parts must increase faster than the length.
The notion of scale is highly relevant to phenomena in the biological world. The scale of an organism’s physical dimensions plays a significant role in how that organism relates to its environment. Extremely small organisms exhibit particularly interesting behavior as they attempt to navigate aqueous environments, where their motion is governed by a low ratio of inertial to viscous forces.⁴
² D. Esker, The Paradox of Large Dinosaurs, http://www.dinosaurtheory.com/big_dinosaur.html
³ S. A. Wainwright, Axis and Circumference: The Cylindrical Shape of Plants and Animals, Harvard University Press, Cambridge, MA, 1988.
⁴ E. M. Purcell, Life at Low Reynolds Number, American Journal of Physics, 45, 3, 1977.
Although weightlifting and other physical activities illustrate how scaling arguments explain physical phenomena, the more relevant issue is how scaling applies to security risk assessments. As one would expect, the magnitude of the vulnerability component of risk in physical threat scenarios is often a function of one or more physical parameters. This condition is exemplified by explosive threats. For these threat scenarios, simple scaling arguments can be used to estimate the magnitude of vulnerability, which is a function of two parameters: (1) the distance between an explosive source and a target and (2) the explosive payload. The effect of these two risk factors on vulnerability will be investigated more thoroughly later in the chapter.
A scaling relation can be a powerful prescription when it yields insight into the relationship between risk-relevant variables. Specifically, it reveals how a change in one variable affects the magnitude of another variable over a range of values. Identifying such a relationship often leads to specifications on the required thresholds for security controls.
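To illustrate how a scaling relation yields a security-control threshold, the sketch below assumes a hazard intensity that falls off as the inverse square of distance from the source. The inverse-square form is our illustrative assumption, not the explosive scaling derived later in the chapter:

```python
# Illustrative sketch only: assume a hazard intensity falling off as the
# inverse square of distance d (an assumption for illustration; the chapter
# derives the actual explosive scaling later).
def relative_intensity(d, d0=1.0):
    # Intensity normalized to 1 at the reference distance d0.
    return (d0 / d) ** 2

def min_safe_distance(threshold, d0=1.0):
    # Smallest distance at which intensity drops to the threshold:
    # solve (d0/d)^2 = threshold for d.
    return d0 / threshold ** 0.5

d = min_safe_distance(0.01)               # intensity must fall to 1% of reference
assert relative_intensity(d) <= 0.01 + 1e-12   # d = 10 reference units
```

Inverting a scaling relation in this way converts a tolerable magnitude of a risk component into an operational threshold, here a stand-off distance.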
6.3 Density
A density is a ratio that specifies the relative amount of two things. Densities are often expressed in terms of an amount of thing one per an amount of thing two, e.g., watts-per-square meter, grams-per-cubic centimeter, pounds-per-square inch. In this way, density can be thought of as the rate of appearance or presence of one parameter relative to another. The juxtaposition of two quantities is often more physically interesting and/or meaningful than an absolute value.
For example, atmospheric pressure is a key metric that affects weather patterns. In the United States, air pressure is most often expressed in units of pounds-per-square inch. Meteorologists and climate scientists are much more interested in the local areal density of the atmosphere, i.e., atmospheric pressure, than the total weight of the atmosphere surrounding the earth.
Another example of an important density figure occurs in vision. The retina of the eye detects light intensity, i.e., power per unit area, focused by the cornea (~35 diopters) in combination with the lens (~15 diopters). Light intensity is specified in units of watts-per-square meter or watts-per-square centimeter. In detecting light, the cornea-lens system ensures optical power converges on the retina so that a threshold power density is achieved.
Density can be useful in conveying risk-relevant concepts. In Chap. 10, a metric that reflects the “concentration” of risk factors within assets is specified. Its formulation as a rate yields insight into the effectiveness of security risk management for a given threat scenario.
A rate of occurrence of a risk-relevant parameter as expressed by a density is often more risk-relevant than the same parameter expressed as an absolute quantity. For example, one thousand vulnerabilities spread across ten assets, i.e., one hundred vulnerabilities per asset, would likely be more indicative of poor security governance than the same number of vulnerabilities spread across one billion assets, i.e., one vulnerability per million assets. Specifying the absolute number of vulnerabilities is not particularly risk-relevant since the relationship of the risk factors to affected entities, i.e., assets, is missing. In the absence of a ratio or density of risk factors relative to things of value, a key piece of contextual information is missing.
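The vulnerability-density example above amounts to a one-line ratio (a trivial sketch; the counts are those from the text):

```python
# Sketch: the same absolute count of vulnerabilities yields very different
# densities depending on the number of assets over which it is spread.
def vulnerability_density(n_vulns, n_assets):
    return n_vulns / n_assets

dense = vulnerability_density(1000, 10)              # 100 vulnerabilities per asset
sparse = vulnerability_density(1000, 1_000_000_000)  # 1 per million assets
assert dense == 100
assert sparse == 1e-6
```

The absolute count (1000) is identical in both cases; only the density distinguishes poor governance from a well-managed estate.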
6.4 Trends and Time Series
Trends often tell a compelling story about security risk. For example, if the number of threat incidents-per-year is trending in a certain direction, it yields insight into the effectiveness of security risk management or the lack thereof. Of course, the trend by itself does not say anything about the root cause of threat scenario risk or the relative effectiveness of security controls. However, a precipitous rise in threat incidents does say something about changes to the magnitude of risk. Recalling the discussion in the last section, a more insightful trend might be to specify the density or ratio of threat incidents to the dollars spent on security controls by year.
Trends can also be useful in identifying risk factors. An upward trend in the number of threat incidents as a function of some parameter suggests that this parameter is correlated with a component of risk. Recall Fig. 1.2 of Chap. 1, which revealed a clear increase in the rate of lung cancer in cigarette smokers as a function of the number of years of smoking cigarettes.
A trend can provide a dramatic view of the effect of security risk management. If a precipitous drop in the number of threat incidents coincides with an upward trend in the application of security controls, it suggests that risk management efforts and threat incidents are anti-correlated, and therefore suggests the controls are indeed effective.
A trend is really a time series, i.e., a series of data points indexed in time order. The simplest way of displaying time series data is via a trend line. Such a representation clearly shows the rate of change of a parameter as a function of time since it reveals both the slope and direction of the line. Recall our discussion of scale earlier in the chapter. A trend line conveys scale since it depicts the rate of change of one variable relative to another across a range of independent variable values.
Such a variable might be a threat scenario feature, the magnitude of a security control, etc.
As noted above, two trend lines can reveal the relationship between two risk-relevant variables. A threat scenario experiencing an upward trend in the number of threat incidents in spite of increased security controls suggests those controls have not been particularly effective. Figure 6.7 depicts such a condition. Trending of threat incidents and security controls in opposite directions would suggest the opposite condition.
Fig. 6.7 Upward trends of security controls and threat incidents suggesting ineffective security controls
Figure 6.8 shows security controls increasing and threat incidents decreasing over time. These two contemporaneous trends suggest that the application of security controls has had a salutary effect.
Fig. 6.8 Upward and downward trends of security controls and threat incidents
Note this condition does not prove that security controls caused threat incidents to decrease. However, it reveals the two variables are perfectly anti-correlated, and thereby supports the contention that security controls are effective.
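The anti-correlation just described can be quantified with a Pearson correlation coefficient (a sketch using idealized trend data of our own choosing):

```python
# Sketch: Pearson correlation of two contemporaneous trends. Perfectly
# opposite linear trends (as in Fig. 6.8) give a coefficient of -1.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    return cov / (sx * sy)

controls = list(range(1, 11))        # security controls trending up
incidents = list(range(10, 0, -1))   # threat incidents trending down

r = pearson(controls, incidents)     # perfectly anti-correlated
assert abs(r + 1.0) < 1e-9
```

A coefficient near −1 supports, but does not prove, the contention that the controls are effective; causation must be argued separately.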
6.5 Histograms
It is often useful to depict threat scenario data in terms of discrete bundles, thereby revealing how a risk-relevant entity or parameter is distributed across those bundles. A histogram is closely related to a probability distribution. The former can be transformed into the latter simply by dividing each bin’s contents by the total across all bins. Histograms convey texture, revealing where a particular parameter is concentrated across the data landscape.
For example, we might be interested in knowing the number of unique ID holders, i.e., “users,” who accessed a particular space inside a facility within a particular time interval. Such information might inform a security strategy by revealing the requirement for additional security controls such as CCTV.
Figure 6.9 is a histogram showing the number of users as a function of the number of entries to a particular space within a 2-month period. The bins of the horizontal axis are grouped according to the number of entries, in intervals of two entries-per-bin. For example, the first bin consists of the number of users who accessed the space between one and three times during the recorded time interval, which in this case is approximately 550.
Fig. 6.9 Histogram of access history (2 months)
Figure 6.9 reveals that the space is used frequently, but the majority of users enter relatively infrequently. However, some users do frequent the space quite often, and this disparity explains the dispersion in the mean number of entries-per-user.
The data in Fig. 6.9 might also be used to determine the fraction of users across the spectrum of bins. The so-called Pareto line shown in Fig. 6.10 specifies the cumulative distribution of users as a function of the number of entries, i.e., the fraction of users by percentage. As one would expect, the line reveals the progressively increasing fraction of the user population with an increasing number of entries.
Fig. 6.10 Cumulative access history (2 months)
Figure 6.10 indicates that roughly 40% of all users entered the facility between one and three times. Approximately 50% entered the facility seven times or fewer, and roughly 90% entered 23 times or less in the 2-month measurement interval.
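The transformation from raw access data to a histogram and a cumulative (Pareto) fraction might be sketched as follows. The entry counts below are hypothetical, not the data behind Figs. 6.9 and 6.10, and the bin edges are a simplified choice:

```python
from collections import Counter

# Sketch: turning raw entry counts into a histogram and a cumulative
# (Pareto) fraction, in the spirit of Figs. 6.9 and 6.10.
entries_per_user = [1, 1, 2, 3, 3, 4, 6, 7, 9, 15]   # hypothetical access log

# Bin users two entries per bin (entries 1-2 -> bin 0, 3-4 -> bin 1, ...);
# the book's exact bin edges may differ.
bins = Counter((e - 1) // 2 for e in entries_per_user)

total = len(entries_per_user)
cumulative, running = {}, 0
for b in sorted(bins):
    running += bins[b]
    cumulative[b] = running / total   # fraction of users at or below this bin

assert cumulative[max(bins)] == 1.0   # the Pareto line always ends at 100%
```

Dividing each bin count by the total is exactly the histogram-to-probability-distribution transformation described above, and the running sum traces out the Pareto line.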
6.6 Derivatives and Integrals
The mathematical operations known as differentiation and integration are introduced next. These two operations are unquestionably two of the most common and useful methods in science and engineering. They are prevalent in nearly every scientific endeavor, and were first conceived by Isaac Newton and Gottfried Leibniz independently in the seventeenth century in the formulation of the calculus.⁵,⁶ The reader will intermittently encounter derivatives and integrals throughout this text since they are also useful in conveying risk-relevant concepts. At a high level, a derivative describes a rate of change and an integral is a continuous summation. Each is applied to functions of one or more variables. In the context of security risk assessments, such functions are often risk-relevant features of threat scenarios, e.g., risk factors.
The derivative is the mathematical operator used in differentiation. Differentiation has particular applicability to trends since there is often interest in knowing the rate at which a trend line is changing at a particular point of the trend. Let’s assume a variable y is a function of another variable x. We can express this condition as y = f(x). As noted in the discussion on scaling, if we plot this function, the vertical axis would consist of the values of the dependent variable y, and the horizontal axis would reflect the values of the independent variable x. The slope of the resulting line is defined as the change in the value of y divided by the change in the value of x, which is sometimes depicted as Δy/Δx. In other words, if the line represents a hill, the slope would be the rate of ascent in the vertical direction relative to the distance traversed in the horizontal direction. If each step or unit of measurement in the horizontal direction corresponds to three steps in the vertical direction, the hill has a slope of three-to-one or three.
Any function with similar properties, i.e., where the change in y relative to the change in x is three units, would have the same slope. For example, the functions y = 3x + 2, y = 3x + 10 and y = 3x − 100,000 all have a slope of 3, noting there are an infinite number of such functions.
The slope of a line is not necessarily constant, and it is often desired to know the slope of the line at a specific point. This situation is where derivatives enter the picture. A geometric interpretation of the derivative in calculus is the slope of the line tangent to a curve at a given point on that curve. The tangent of a twisty curve will change direction as the direction of the curve itself changes. Derivatives are often described as specifying the “instantaneous rate of change” of a function at a specific point on a curve.
To summarize, a derivative is equivalent to the slope of a function of one or more variables at a specific point of that function. The derivative of the function y = f(x) is
⁵ Gottfried Leibniz (1646–1716), German mathematician.
⁶ Isaac Newton (1642–1727), English mathematician and physicist.
written as f′(x) or dy/dx. In Chap. 3 derivatives were used to express the time rate of change of risk factors and security controls, i.e., dR/dt and dC/dt respectively, and thereby characterize static and dynamic threat scenarios.
The inverse of differentiation is integration. The Fundamental Theorem of Calculus is a statement of the relationship between differentiation and integration, and it can be expressed mathematically as follows, where the S-shaped symbol ∫ is the integral⁷:

  (d/dx) ∫ₐˣ f(x′) dx′ = f(x)
The process of integration adds infinitesimally small pieces of a function, dx′, to yield the area under the curve described by that function. If the function is constant in each variable, the integration becomes a straightforward addition or, equivalently, a multiplication. But what if the value of a function is not constant? In that case, finding the area under the curve requires more than simple multiplication. For example, calculating the area bounded by a rectangular function of constant length x and constant width y is accomplished by a straightforward multiplication of x times y. However, if the values of x and/or y vary, determining the area under the curve defined by the function requires a subtler approach. As noted above, Leibniz and Newton developed the approach independently in the seventeenth century. At a high level, summing the entire function involves dividing it into infinitesimal pieces, dx and dy, and adding the pieces between the limits of the function.
The so-called definite integral is a number that represents the result of a continuous summation of a function within defined limits. Figure 6.11 depicts the integration operation. In this case, the process of integration is used to determine the area under the curve I, defined by the varying function f(x) between the limits specified by the points A and B.
Fig. 6.11 The definite integral
7 mathworld.wolfram.com/integral.html
Integrals are used to explain the effect of time on security risk, the impulse of an explosive threat and the area under the curve of a probability distribution in addition to many other threat scenario examples. The diversity of these examples demonstrates the broad applicability of differentiation and integration even within security risk assessment contexts.
6.7 Correlation and Correlation Coefficients Revisited
We first encountered correlations in Chap. 5. The correlation function measures the normalized average of the product of two variables minus the product of their averages, i.e., the normalized covariance. If the two variables under comparison are independent random variables, the covariance is zero, i.e., there is zero correlation. Various forms of correlation exist, and are used to analyze time series data, and in particular to distinguish signal from noise.

(a) Autocorrelation: The autocorrelation function is a correlation function of the same variable measured at two distinct times. This method is often used as a test for randomness. It entails multiplying a signal with itself and integrating the result with respect to time. If a time series of data points is truly random, past and future points are completely unrelated; only the present data can be correlated with itself in a truly random signal. In Chap. 8 we leverage the covariance of a single variable, i.e., the autocorrelation function, to “measure” the time required for a threat incident, which is assumed to be a variable subject to random fluctuations, to become statistically independent of its original value. In other words, we calculate the time required for the average of the product of the variable with itself to become the product of the averages.

(b) Cross-Correlation: This method is similar to autocorrelation except that it involves sliding, multiplying and integrating two different time series relative to each other.

(c) Convolution: This is a method used to measure the response of a system to an input signal. The input can be expressed as a sum of scaled and shifted impulses or frequencies. The response to each impulse or frequency is calculated, and the impulse or sinusoidal responses are summed to determine the overall system response to the input signal.

Testing whether two variables have a linear relationship can have implications for a security risk assessment.
For example, a method to test whether a parameter is a risk factor is to calculate whether there is a linear relationship between the magnitude of that parameter and the number of threat incidents.
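The autocorrelation test for randomness described in (a) can be sketched in a few lines of Python (the series and lag below are our own illustrative choices). A genuinely random series loses all memory of itself at nonzero lags, while a periodic series remains correlated with itself at its period:

```python
import math
import random

def autocorrelation(series, lag):
    """Normalized sample autocorrelation of a series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

random.seed(1)
noise = [random.gauss(0, 1) for _ in range(5000)]               # "truly random"
beacon = [math.sin(2 * math.pi * t / 50) for t in range(5000)]  # period of 50

print(round(autocorrelation(noise, 50), 2))   # near 0: past and future unrelated
print(round(autocorrelation(beacon, 50), 2))  # near 1: the signal repeats itself
```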
One useful metric for calculating such an association is the Pearson Product Moment Correlation Coefficient (PPMCC), P. The formula for this coefficient with respect to two sets of data points, x and y, is written compactly as follows:

P = Cxy/(σx σy)    (6.1)
The reader will recognize (6.1) as the correlation coefficient of Chap. 5. Cxy is the covariance of the random variables x and y, and σx and σy are the standard deviations of the distributions of x and y, respectively. An example of how (6.1) might be applied to a security risk assessment is provided next. Although admittedly not entirely realistic, it is illustrative of the concept.

Suppose a purported counterterrorism expert claims that the number of terrorism attacks correlates with the day of the month. In other words, the contention is that the number of attacks increases with increasing days of the month. Furthermore, the expert is lobbying for security controls to be applied accordingly, which means adding more security risk management resources as the number of days increases. The claim is that the day of the month is a risk factor for the likelihood component of risk for terrorism threat scenarios.

The PPMCC can be used to test the expert’s theory. Table 6.2 indicates the average number of attacks per day versus the day of the month. An on-line calculator is used to arrive at P = −0.06 (rounded to two decimal places) for the two series shown in Table 6.2. The calculation indicates that the two data sets are slightly anti-correlated. There is only a very slight linear relationship between the day of the month and the number of terrorist attacks, and that relationship is opposite to the expert’s claim! The implication is that security controls should not be applied in proportion to the day of the month based on the results of this analysis. Of course, if the expert had merely plotted the results it would have obviated the need for the calculation. Figure 6.12 represents such a plot. It is clear that this relationship is anything but linear.
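The calculation is also easy to reproduce without an on-line calculator. The sketch below implements (6.1) directly and applies it to the Table 6.2 data:

```python
import math

days = list(range(1, 32))
incidents = [70, 80, 73, 35, 61, 55, 56, 84, 51, 89, 41, 31, 61, 8, 82, 80,
             85, 20, 20, 36, 85, 71, 38, 97, 92, 79, 2, 89, 32, 64, 52]

def pearson(x, y):
    """Pearson Product Moment Correlation Coefficient, P = Cxy/(sigma_x * sigma_y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

print(round(pearson(days, incidents), 2))  # -0.06: a negligible negative association
```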
6.8 Exponential Growth, Decay and Half-Value
In this section we introduce a process that describes the growth or decay of a quantity as a function of some risk-relevant parameter. Often that parameter is distance or time, since spatial and temporal characteristics of threat scenarios often affect the magnitude of risk.

Consider a process where the time rate of change of some quantity is dependent on the amount of that quantity. Furthermore, the rate of change is governed by a rate constant. We could write a simple expression for the rate of decay (or growth) of the quantity as follows, where λ is the rate constant, I is intensity and t is time:

dI/dt = −λI    (6.2)

Table 6.2 Number of terrorism incidents v. the day of the month

Day of the month  Incident number
1    70
2    80
3    73
4    35
5    61
6    55
7    56
8    84
9    51
10   89
11   41
12   31
13   61
14   8
15   82
16   80
17   85
18   20
19   20
20   36
21   85
22   71
23   38
24   97
25   92
26   79
27   2
28   89
29   32
30   64
31   52
There are many processes whose behavior is described by (6.2). Note that if the minus sign on the right hand side were eliminated, (6.2) would describe exponential growth rather than decay. The solution to (6.2) is given by the following expression, where Io is the initial intensity and “e” ≈ 2.72 is the base of the natural exponential function:
Fig. 6.12 Terrorism incidents versus days of the month

Fig. 6.13 I(t) = e^(−t)
I = Io e^(−λt)    (6.3)

In other words, the intensity decreases exponentially as time increases. Figure 6.13 is a graph of the function I = e^(−t), where Io and λ are assumed to be unity and t is represented by integer increments of arbitrary units.
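Equation (6.3) can be checked numerically (a sketch with our own step size): advancing (6.2) forward in many small time steps reproduces the closed-form exponential solution.

```python
import math

lam, Io, t_end = 1.0, 1.0, 5.0  # rate constant, initial intensity, end time
n = 50_000
dt = t_end / n

I = Io
for _ in range(n):
    I += -lam * I * dt  # Euler step: the decrease in I is proportional to I itself

exact = Io * math.exp(-lam * t_end)
print(round(I, 4), round(exact, 4))  # the stepped and closed-form values agree
```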
Fig. 6.14 I(x) = e^x
If the intensity exponentially increases with respect to distance rather than time, the governing equation is identical to (6.2) except for the absence of a minus sign on the right side, and the time t is replaced by distance x:

dI/dx = λI    (6.4)
The solution to (6.4) is given by the following expression:

I = Io e^(λx)    (6.5)
Figure 6.14 is a graph of I(x) = e^x calculated for five values of x, where again Io and λ are assumed to be one, and the distance variable x is specified using integer increments of arbitrary units. The solution to an equation describing exponential decay with respect to distance would appear identical to (6.5) except for the addition of a minus sign in the exponent of the exponential function8:

I = Io e^(−λx)    (6.6)
8 The reader is cautioned not to confuse the term exponent with the exponential function. The former is the number of times a variable is multiplied by itself and the latter is a function of the form f(x) = ab^x. A special form of exponential function is one where b in the above expression is the constant e = 2.71828. Exponential functions have the unique property that the growth rate of the function, i.e., the derivative, is proportional to the value of the function. The constant of proportionality in the expression for the derivative is the natural logarithm of the base b. The constant e is the unique base for which the constant of proportionality is one.
When assessing threat scenario risk it is sometimes useful to know the time required to reduce the intensity of some parameter by half its previous value. Successive time intervals of the same duration will each reduce the intensity by half the previous value. In other words, after one such time interval the intensity will drop by one-half, after two intervals the intensity will be reduced by ½ × ½ = ¼, etc. A convenient expression for the time corresponding to the half-value is derived next.

Recall the solution to (6.2) was given by I = Io e^(−λt). Dividing both sides by Io and taking the natural logarithm (ln) of both sides while noting the ratio of the general intensity to the initial intensity I/Io = ½, we get

ln(I/Io) = ln(1/2) = −0.693 = −λt    (6.7)

Therefore, t = 0.693/λ, where λ has units of inverse time, i.e., t^−1. If (6.6) were the governing expression instead of (6.3), λ would have units of inverse distance (x^−1). Note that (6.7) applies to any process described by (6.3) or (6.6). Therefore, knowing the decay constant λ for such a process immediately yields the time or distance required to reduce a quantity to one-half its original value. For example, if the rate of decay of a material were λ = 10 decays per-second, the half-life would occur in time t = 0.693/λ = 0.693/10 = 0.069 ~ 0.07 s. Therefore, in 10 s, the activity would experience 10/0.07 ~ 143 half-lives, and therefore would be reduced by (1/2)^143 ~ 10^−43. Finally, we note that back-of-the-envelope calculations sometimes use 0.7 to approximate 0.693 in (6.7).
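The half-life arithmetic above is easy to verify (our sketch; the exact count of half-lives comes out near 144 rather than 143 because the text rounds 0.0693 s up to 0.07 s for the back-of-the-envelope estimate):

```python
import math

lam = 10.0                     # decay constant in inverse seconds
t_half = math.log(2) / lam     # half-life: 0.693/10 ~ 0.069 s
half_lives = 10.0 / t_half     # half-lives elapsed in 10 s (~144, vs. ~143 in the text)
remaining = 0.5 ** half_lives  # fraction of the original quantity that survives

print(round(t_half, 3))              # 0.069
print(round(math.log10(remaining)))  # -43, i.e., reduced by roughly 10^-43
```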
6.9 Time and Frequency Domain Measurements
In some threat scenarios, the risk factors can be difficult to identify let alone measure. This condition is especially true if threat scenario data are “noisy,” i.e., there is an abundance of irrelevant data. The first order of business in analyzing potentially risk-relevant data is to view the information in the time domain. This is a fancy way of saying the data must be analyzed with respect to any changes that occur as a function of time. Often one immediate objective is to ascertain the behavior of a threat scenario feature and infer its effect on the magnitude of a component of risk. At other times, it may be a priority to understand the effect of a security control. However, in order to identify risk-relevant characteristics in a threat scenario it is sometimes necessary to gain a different perspective. It turns out that data that are hopelessly complicated in the time domain can appear much simpler when viewed as inverse-time, i.e., in the frequency domain.

Figure 6.15 is a graphic showing outbound IT network connections from an internal IP address as a function of time. There are a number of peaks but it is difficult to identify risk-relevant features from this data alone. For example, it would be useful to identify periodic outbound connections, which might be indicative of malware attempting to connect to its external command and control server.

Fig. 6.15 Outbound IT network connections displayed in the time domain

Recall from high school physics that the frequency of a signal is the rate at which the signal repeats itself. A sine wave consisting of a single frequency, e.g., sin(ωt), repeats itself ad infinitum at the frequency ω, and the separation between the wave peaks is constant. As noted above, frequency represents the inverse of time, i.e., 1/t or equivalently t^−1. Time and frequency measurements are complementary and are actually two sides of the same coin. This complementarity has benefits. If we view sin(ωt) in the frequency domain it will appear as a single peak at the frequency ω instead of as a continuous wave in the time domain. Peaks in the frequency domain are indicative of those frequencies where power is concentrated across the signal spectrum.

Next, imagine a periodic signal consists of two sine waves oscillating at frequencies ω and 2ω. In the time domain, the signal would appear as two oscillating functions, which might be relatively easy to disambiguate given there are still only two frequencies. Now imagine 100 such signals oscillating simultaneously at different frequencies. In this and in similar instances it might be very difficult to differentiate one frequency from another and thereby discern any useful information about the overall signal structure.
Fig. 6.16 Outbound IT network connections displayed in the frequency domain
If the same signal is represented in the frequency domain rather than the time domain, the appearance of the signal changes dramatically. Each frequency is represented as a single spike rather than as an oscillating wave. In particular, discrete frequencies appear as distinct, non-overlapping peaks. In contrast, signals on display in the time domain can appear as a ménage of overlapping oscillations. Fortunately, there is a way to transform a signal from the time domain to the frequency domain (and vice versa). If a mathematical operation known as the Fourier Transform is applied to the signal, the frequencies will appear as lone spikes, where the square of the amplitude of each spike is proportional to the signal power at that frequency.9

Let’s now apply the Fourier Transform to the data set in Fig. 6.15 in order to reveal risk-relevant information pertaining to outbound connections. Figure 6.16 displays the results. Note that whereas Fig. 6.15 appeared as a jumble of seemingly incoherent data, Fig. 6.16 clearly reveals the most prominent signals. Specifically, the concentration of signal energy at frequencies corresponding to 300 connections per day (300 day^−1) and 600 connections per day (600 day^−1) is now immediately apparent.

As noted above, a well-known modus operandi of malware is to establish a foothold in a compromised network and send outbound communications to a command and control site on the Internet. It could be difficult to identify this threat scenario risk factor in the time domain. However, this signal feature is immediately apparent in the frequency domain. Of course, the observed behavior could turn out to be an innocent communication beacon. However, periodic outbound communications from internal network resources warrant follow-up in order to rule out malicious activity.
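The transformation can be sketched with a naive discrete Fourier transform (the signal below is synthetic; the frequencies are our own illustrative choices, not the connection rates from Fig. 6.16). Two periodic components buried in a time series appear as two isolated spikes in the power spectrum:

```python
import cmath
import math

def dft_power(x):
    """Power at each discrete frequency of a real-valued series (naive DFT)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2)]

n = 240
# A synthetic "outbound connection" series: two periodic beacons superimposed.
signal = [math.sin(2 * math.pi * 10 * t / n) +
          0.5 * math.sin(2 * math.pi * 40 * t / n) for t in range(n)]

power = dft_power(signal)
peaks = sorted(range(1, len(power)), key=lambda k: power[k], reverse=True)[:2]
print(sorted(peaks))  # [10, 40]: the two hidden frequencies stand out as spikes
```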
9 Joseph Fourier (1768–1830), French mathematician.
6.10 Summary
Quantitative concepts that routinely appear in traditional scientific disciplines also have application to the theory of security risk assessment. Density is one such concept, which is expressed as a ratio of two quantities. Density can also be thought of as the rate of occurrence of one quantity relative to another. A quantity expressed as a density is often more risk-relevant than an absolute quantity.

The scale of a risk-relevant parameter is its rate of change over a range of values. Ascertaining whether a scaling relationship is linear or non-linear can be crucial in characterizing how the change in a risk factor affects the magnitude of a component of risk.

Differentiation of a function yields its slope, i.e., the rate of change, at a specific point on the line tangent to that function. Integration is the inverse of differentiation, and it is a continuous summation of a function between specified limits. Integrating a function between such limits yields the area under the curve defined by that function. Both differentiation and integration are ubiquitous in science and engineering, and likewise have applicability to security risk assessments.

Exponential growth and decay describe processes associated with many natural phenomena, and can also be applicable to threat scenarios. Such processes obey a first order differential equation, where the rate of growth or decay of a quantity is a function of the amount of that quantity. The model for compound interest is an example of such a process.

Spatial and temporal features of threat scenarios often affect the magnitude of risk, and therefore changes to a threat scenario as a function of distance or time can be risk-relevant. A time series or trend of a risk-relevant feature can reveal the behavior of that feature as a function of time. It can also expose the effect of a security control or lack thereof.
Both the rate and magnitude of changes to risk factors and security controls can be important in developing a risk management strategy. The relative time rate of change between a risk factor and relevant security control distinguishes static from dynamic threat scenarios.

A correlation calculation can assist in identifying threat scenario risk factors. For example, a relationship between the number of threat incidents and a particular threat scenario feature, as manifested by a positive or negative correlation coefficient, might identify or discount that feature as a potential risk factor. The Pearson Product Moment Correlation Coefficient is particularly useful in identifying a linear relationship between two variables.

The Fourier transform enables views of data in both the time and frequency domains. It is a technique that has many applications in traditional areas of science and engineering. It is applicable to threat scenarios in particular because of its effectiveness in identifying periodic features within complex sources of information such as the data produced by firewall logs.
Chapter 7
Risk Factor Measurements
7.1 Introduction
Risk factors were first introduced in Chap. 1 and have been a focus of attention ever since. The reason they receive such attention is that they determine the magnitude of risk in any threat scenario. An actual measurement is significant due to the contribution of any risk factor to the overall magnitude of risk coupled with the insight gained via a quantitative analysis.

Risk factor measurements are organized according to the categories identified in Chap. 2 with the exception of apex risk factors. Recall this category of risk factor exists because of its effect on the magnitude of risk rather than a unique set of features. The measurement categories are further delineated according to the components of risk.

This chapter applies the theory developed thus far to specific threat scenario examples. These assessments are not sparing in their level of detail. Threat scenarios are dissected to reveal risk-relevant parameters and their effect on the magnitude of a component of risk. The purpose of including these details is to demonstrate the diversity of quantitative assessments and the specificity that is required to identify risk-relevant effects. In that vein, the examples are highly scenario-specific, and the assessment methods are not applicable in general. In addition, some of the threat scenario features are admittedly esoteric if not downright ludicrous. The point of these analyses is to highlight the relevance of the theory as well as to reveal the type of insights gained from quantitative analyses of scenario details and the metrics that can result.

Spatial risk factor measurements are discussed first, which are followed by measurements relating to temporal and behavioral risk factor categories. Recall from Chap. 6 that spatial and temporal features can be particularly insightful since threat scenario features relating to distance and time are often risk-relevant.

© Springer Nature Switzerland AG 2019
C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_7
7.2 Spatial Risk Factor Measurements
Spatial risk factors are those that relate to or are present in the environment where threats and entities interact. In this section, we focus on examples of spatial risk factor measurements. As is the case with the other measurement categories, the discussions are organized according to the individual components of risk.

(a) Measuring likelihood

Consider the rather gruesome scenario consisting of an individual colliding with a subway train. This unfortunate individual landed in this predicament after slipping on the platform and falling onto the track. In this scenario the train is the threat, the individual who falls on the track is the affected entity, and the platform and track are the environment where the threat and entity interact.

The magnitude of the likelihood component of risk for this threat scenario is minimal at any distance greater than some critical distance from the platform edge. Only at the platform edge could a slip by an affected entity lead to a fall onto the track and a subsequent collision with an approaching train. Therefore, as one approaches the edge of the platform, the magnitude of the likelihood component of risk precipitously increases. Furthermore, distance from the platform edge is a spatial risk factor for the likelihood component of risk. In this case, distance and likelihood are anti-correlated; the shorter the distance from the platform edge, the greater the magnitude of the likelihood component of risk within some critical distance of the platform edge.

The slipperiness of the platform in the vicinity of the platform edge also increases the likelihood component of risk, and is therefore another spatial risk factor for this threat scenario. Although the distance risk factor can be readily measured via a simple tape measure, slipperiness might appear to be a more qualitative threat scenario feature.
The property of slipperiness can in fact be measured via the so-called coefficient of friction, and devices designed to measure surface slipperiness exist.1 The coefficient of friction is discussed next in some detail. The point is to illustrate how the measurement of an environmental risk factor is applicable to assessments of the likelihood component of risk.

1 https://en.wikipedia.org/wiki/Floor_slip_resistance_testing

The coefficient of friction is a dimensionless parameter, defined as the ratio of the frictional force resisting motion along a surface to the normal force pressing an object against that surface. As long as an object remains stationary, the frictional resistance exactly balances the applied force. The more slippery the surface, the less force is required to move the object along the surface. Everyone has experienced this phenomenon when attempting to negotiate icy surfaces. Therefore, the coefficient of friction and the likelihood component of risk are inversely related; the lower the coefficient of friction, the greater the likelihood that a slip will occur if all other relevant factors are equal. The coefficient of friction transitions from a static to a dynamic state once motion along the surface ensues.

Importantly, a measurement of slipperiness by itself will not yield the probability that an individual will actually slip on a given surface. Such a measurement is an indirect measurement of the likelihood component of risk, and therefore we can only infer the potential for slipping. An experiment could be constructed to measure the actual probability of slipping. If numerous slips from the platform are observed under various surface conditions, i.e., values of the coefficient of friction, the probability of slipping could be determined from the resulting probability distribution of slips. Such an experiment would constitute a direct assessment of the likelihood component of risk since historical threat incidents are used to generate a probability distribution of “threat incidents,” i.e., slips. The resulting distribution would provide an answer to the following question about future slips: “What is the probability that an individual will slip on a subway platform if the coefficient of friction of the platform is X?” The probability distribution enables generalizations about future incidents occurring in a congruent environment.

We know that spatial risk factors are inherent to a threat scenario environment. In this particular threat scenario the environment is the subway platform and track. Anyone who has travelled on the New York subway system surely knows that other risk factors for this threat scenario could exist. For example, a patch of grease or a banana peel located near the platform edge would also be a risk factor for the likelihood component of risk.

Now imagine the individual walking along the platform edge, i.e., the affected entity, is afflicted with Tabes dorsalis, a complication of late-stage syphilis.
If left untreated, this condition produces a loss of coordination/balance as well as other unpleasant symptoms. My father, a pathologist, used to characterize highly unlikely events in terms of the likelihood of a blind, tabetic pig surviving on a tightrope over Niagara Falls. Tabes dorsalis is another risk factor for this threat scenario, but it relates to the affected entity and not the environment. Risk factors that are present in or relate to affected entities are behavioral risk factors, and their measurement will be discussed later in this chapter. The confluence of the two risk factors, one inherent to the environment and the other to the affected entity, increases the likelihood component of threat scenario risk relative to a scenario where only one of these risk factors is present. Specifically, both these risk factors increase the potential for slipping and a subsequent violent encounter with a train. The distance between the entity and the platform edge is an apex risk factor in this threat scenario. All other risk factors are irrelevant unless this condition exists. The New York City Transit System recognizes its significance as evidenced by the yellow line painted on every subway platform edge coupled with frequent admonitions to remain behind the yellow line. The precise threat is falling on to the track followed by being hit by a moving train. Slipping and falling on to the track by itself would surely be embarrassing, and possibly injurious, but it would not necessarily be fatal. The magnitude of the
likelihood component of risk increases significantly if one is on the track during some critical time interval defined by the time between the fall and the arrival of the train. This critical interval is determined by the distance required for the train to completely stop after applying the brakes relative to the time of the fall. This time interval is a temporal risk factor. When combined with platform slipperiness in the vicinity of the track, the magnitude of the likelihood component of risk is significantly increased.

The spatial and temporal risk factors defined by a critical distance and critical time interval must coincide in order to substantively increase the magnitude of the likelihood component of risk. Therefore, and in loose analogy with standard probability terminology, the “joint potential” associated with these two risk factors must be estimated in assessing the magnitude of threat scenario risk.

In summary, the threat scenario risk factors enable inferences about the magnitude of the likelihood component of risk. Specifically, the confluence of platform slipperiness, distance of the entity from the platform edge, stability of the affected entity, and the time of arrival of an approaching train relative to the time of an entity’s fall would significantly increase the magnitude of likelihood for this threat scenario.

(b) Measuring Vulnerability

We next discuss assessing the vulnerability component of risk and spatial risk factors. As usual, the assessment process consists of identifying a risk factor and determining how the magnitude of a component of risk scales with risk factor values. The subway threat scenario continues to offer grim but useful lessons on the theory of security risk assessment.
As noted in the previous sub-section, the magnitude of the vulnerability component of threat scenario risk is nil unless an affected entity is actually located in the direct path of a train during a time interval defined by the fall onto the tracks and train arrival. Specifically, it rises from zero to near-infinity as the affected entity transitions from the subway platform to the track bed during this interval. Note that walking along the edge of the platform does not affect the magnitude of the vulnerability component of threat scenario risk. The magnitude of loss is the same anywhere on the platform in contrast with the likelihood component of risk. The magnitude of vulnerability does not change until the person is actually situated on the track and therefore is susceptible to a collision. When an entity is located on the track, the vulnerability component of risk is essentially infinite since death or severe injury is a near certainty as the train approaches. An indicative graph of the vulnerability component of risk as a function of distance from the subway platform would look similar to Fig. 7.1. As the affected entity gets closer to the platform edge the magnitude of the vulnerability component of risk remains at zero until the individual is actually located between the track rails. The train can have no effect on an entity unless he or she is within its path. Death or severe injury awaits anyone hit by a train even at relatively low speeds. Therefore,
Fig. 7.1 The magnitude of the vulnerability component of risk as a function of distance from the subway track
the vulnerability component of risk is essentially infinity within the width of the track, which is normalized to one in Fig. 7.1. In other words, there is a discrete transition in the magnitude of the vulnerability component of risk as the affected entity moves on or off the track.

Let’s evaluate an equally horrific threat scenario, which again involves the much-maligned great white shark. What is the magnitude of the vulnerability component of risk associated with a shark attack threat scenario in and out of the ocean? To be precise, the threat scenarios being compared are physical attacks on a human inflicted by a great white shark in and out of the ocean.

First, recognize that an entity is highly unlikely to suffer an injury from a great white shark if that entity is not physically located in the ocean. A physical presence in the ocean is an apex risk factor that affects all components of risk. If one is safely ensconced inside a boat or on dry land, the vulnerability component of risk is effectively zero, the film Jaws notwithstanding. However, if the threat scenario environment switches to the ocean, the relationship between the threat and an affected entity changes dramatically.

The change in the relationship encourages the use of a special safety control designed specifically to manage the vulnerability component of risk. Namely, a heavy gauge steel cage or the equivalent is used to prevent physical access to the affected entity by the shark. The entire point of the cage is to enable a human to be in the water with the shark while significantly reducing all three components of risk. Strictly speaking, the vulnerability of the person is the same inside or outside the cage, i.e., injury or death. It is the potential for a successful attack that has been reduced. However, since the shark is unable to access its victim, the vulnerability component of risk has effectively been reduced.
Spear guns would also reduce the likelihood of a successful attack, but would do nothing to reduce the vulnerability component of risk. The shark could do damage to
an entity as long as the shark has physical access to that entity. The presence of blood is another threat scenario risk factor, but it affects the likelihood component of risk and has no effect on vulnerability.

The vulnerability component of risk is sometimes the only component that is amenable to security controls, and is therefore the focus of a security risk management strategy. Understanding which component of risk is actually being addressed by a security control is critical to developing an effective security risk management strategy. For example, installing bollards to enforce a minimum standoff distance is strictly intended to address the vulnerability component of risk for attacks using vehicles. Such attacks could take the form of a direct collision with affected entities or the use of vehicle-borne explosives.

Physical laws are often the basis for models of threat behavior in physical threat scenarios. Such models are formulated in terms of a risk factor(s) that yields estimates of the magnitude of the vulnerability component of risk. For example, models of explosive threat scenarios enable estimates of the damage that results from the detonation of an explosive payload. Structural damage due to an explosive event results from two parameters: the overpressure incident on the structure and the explosive impulse. The former is the magnitude of the shock wave-induced pressure that exceeds the ambient air pressure. The latter is the cumulative effect of the explosive force interacting with building materials, i.e., the integral of the explosive force with respect to time.

Two risk factors for this threat scenario are the distance between the explosive source and the target, and the explosive payload, i.e., the weight of explosive material. Both risk factors appear prominently in the various models for the vulnerability component of risk for explosive threat scenarios. Figure 7.2 depicts the time history of a pressure wave caused by an explosive detonation.
Note the regions of positive and negative pressure that result as the pressure wave evolves over time.2

Fig. 7.2 Time history of an explosive pressure wave
2 Emergency War Surgery, United States Government Printing Office, 1988. http://emedicine.medscape.com/article/822587-overview
7.2 Spatial Risk Factor Measurements
Fig. 7.3 Incident overpressure as a function of explosive payload and distance from the target
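The equal-overpressure curves of Fig. 7.3 are commonly summarized by the cube-root ("Hopkinson-Cranz") scaled distance Z = R/W^(1/3): distance and payload combinations with equal Z produce approximately equal incident overpressure. A minimal sketch of this scaling; the specific payloads and distances below are illustrative and not taken from the figure:

```python
# Hopkinson-Cranz scaled distance: combinations of standoff distance R (ft)
# and payload W (lbs-TNT) with equal Z = R / W**(1/3) sit on approximately
# the same overpressure contour.

def scaled_distance(distance_ft: float, payload_lbs: float) -> float:
    """Scaled distance Z in ft/lb^(1/3)."""
    return distance_ft / payload_lbs ** (1.0 / 3.0)

# Equal-overpressure trade-off: an 8x larger payload requires only a
# 2x larger standoff distance to hold Z (and thus overpressure) constant.
z_small = scaled_distance(100.0, 250.0)    # hypothetical case
z_large = scaled_distance(200.0, 2000.0)   # 8x payload, 2x distance
assert abs(z_small - z_large) < 1e-9

# Distance dominates: at fixed payload, reducing the standoff sharply
# lowers Z, moving the scenario to a much higher overpressure contour.
print(scaled_distance(1000.0, 2000.0))  # ~79.4
print(scaled_distance(100.0, 2000.0))   # ~7.9
```

Engineering references map Z to overpressure via empirical fits; the point here is only the non-linear distance-payload trade-off visible in the figure.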
The overpressure caused by the incident shock wave is a principal cause of building structural damage. Figure 7.3 graphically illustrates the distance and explosive weight combinations that yield constant overpressures.3

We see from the shape of the curves in Fig. 7.3 that the scaling relation between distance and payload is non-linear for a given overpressure value. The non-linearity reveals the dramatic effect of distance on overpressure. For example, an incident overpressure of 0.5 psi occurs with a net explosive weight of 2000 lbs-TNT and a standoff distance of 1000 feet. The same explosive weight with a standoff distance of only 100 feet results in an incident overpressure of 10 psi.

Establishing a relationship between overpressure and the two risk factors noted above would inform a security risk management strategy for this threat scenario. Specifically, the required standoff distance for bollards, a standard security control for this threat scenario, would follow immediately from these data.

Many threat scenarios are more difficult to model. It might not be possible to identify the most critical risk factor, let alone provide a quantitative estimate of the vulnerability component of risk. Fortunately, such an estimate is not necessarily required in order to gain insight into an effective security risk management strategy. Sometimes just a trend in a risk-relevant parameter can be enlightening, especially if a trend reversal can be correlated with the implementation of a security control.

It is sometimes easy to confuse the likelihood and vulnerability components of risk depending on the threat scenario. Consider two threat scenarios where the threat is theft, the amount of money subject to the threat of theft is $100 and the money is hidden inside an office. In the first threat scenario, the office door remains locked. In the second scenario, the same door is left wide open. In both threat scenarios, the
3 Installation Force Protection Guide, United States Air Force; http://sloansg.com/wp-content/uploads/2016/05/Air-Force-Installation-Security-Handbook.pdf
7 Risk Factor Measurements
magnitude of vulnerability is the same: $100. In other words, one hundred dollars is the total value of the money that could be lost if a theft actually occurred. However, the potential for theft has clearly been reduced as a result of locking the office door.

If 1000 thefts occurred and incident reports were available with the relevant information, we could generate a probability distribution of these incidents in terms of a risk factor, e.g., time of day, ambient light level, floor of the building, etc. The resulting distribution would reveal the probability of a theft as a function of that risk factor.

Note we might also be able to generate a probability distribution for the vulnerability component of risk. Specifically, we could record the loss associated with each threat incident, and thereafter specify the fraction of thefts corresponding to various ranges of values for all items taken. For example, if 10 out of 100 total thefts were of items worth between one and two thousand dollars, we could state confidently that the probability of loss is 0.10 for items in that dollar range. We could generalize from this figure to the likelihood of future thefts in this dollar range if the risk factors for theft remain relatively constant over a prescribed time interval.

Finally, a barrier to quantifying the magnitude of the vulnerability component of risk occurs when the value of the asset is inherently difficult if not impossible to quantify. An obvious example is when human lives are the asset in question, a situation that is not uncommon in physical security threat scenarios.

(c) Measuring Impact

As noted in Chap. 1 and elsewhere, the impact component of risk is defined as the significance attached to the loss resulting from a threat incident. Impact is equivalent to the vulnerability or loss-per-threat incident if the loss corresponds to the absolute value of the resulting damage.
In such cases, the magnitude of the impact component of risk multiplied by the number of threat incidents yields the total vulnerability for that threat scenario. Simply put, threat scenario A is more impactful than threat scenario B if a threat scenario A incident results in a more significant loss than a threat scenario B incident. The key word is "significant," which will vary according to the context, and is not necessarily equal to the absolute value of the loss or damage incurred.

The terms "vulnerability" and "impact" are often used interchangeably in the security vernacular, which only increases the historical confusion. There is clearly a relationship between the two components. However, it is quite possible that two threat scenarios could experience the same loss yet the impact differs in each case.

It is sometimes convenient to express the magnitude of the impact component of risk for spatial risk factors in relative terms. This condition might be represented graphically as a series of points normalized to unity, where the latter is used to indicate maximum significance. An indicative depiction for three threat incidents is shown in Fig. 7.4.

Although games of sport are not generally associated with security risk, there are similarities between such games and security risk management strategies. This phenomenon was observed in the baseball scenario described in Chap. 4. However,
Fig. 7.4 The relative magnitude of impact for three threat incidents
like other activities consisting of two opposing sides, baseball offers useful analogies with security risk management. In particular, it illustrates the importance of context in assessing the magnitude of risk. The impact component of risk is especially noteworthy in this regard. Baseball players are keenly focused on the impact of a specific at-bat. The current inning, and in particular the ninth inning, in conjunction with the score and the runners in scoring position affect the magnitude of the impact component of risk for an individual at-bat.

Consider the following scenario: A batter is at the plate with the bases loaded and there are no outs. A single would likely score one and possibly two runs. Let's assume the batter hits a single and the runners on second and third base both score. There are now runners on first and third since the original runner on first base advances to third base. The next batter hits a double, which scores the two runners on third and first base. A total of four runs have scored as a result of these two at-bats. Therefore, the impact-per-incident might be characterized as four runs divided by two incidents, i.e., two runs/incident, for the team in the field.

Of course the team at bat is experiencing the mirror image of this threat scenario, and is therefore characterizing the events in terms of the gain-per-incident or gain-per-at-bat rather than the loss-per-incident. The inherently zero-sum nature of baseball ensures that what is good for one side is bad for the other.

Baseball seems obsessed with measuring gain and loss, and the plethora of data feeds that obsession. For example, a statistic that measures the gain-per-incident is "runs produced."4 Runs produced is equal to runs scored plus runs-batted-in minus the number of home runs. Home runs are subtracted in order to compensate

4 https://en.wikipedia.org/wiki/Runs_produced
for the batter getting credit for both one run and at least one RBI. There are also metrics that reflect the significance-per-incident, i.e., impact. These might include the number of runners left in scoring position per-at-bat and strikeouts per-at-bat.

Now consider a different scenario. The batter comes to the plate with the bases loaded and proceeds to smash a grand slam. The impact component of risk associated with this particular feat is four runs divided by one incident, or four runs/incident. As noted above, baseball players are acutely aware of the impact component of risk associated with every pitch, and are therefore particularly cautious when pitching to a batter when men are on base and the score is close.

In some scenarios any hit is a game-changer, and therefore each hit is equivalent in terms of the magnitude of the impact component of risk. As one might expect, such a condition profoundly affects the risk management strategy. For example, suppose it is the bottom of the ninth inning and there are two outs. There is a runner on third base, the game is tied, and the home team is batting. The magnitude of the impact component of risk is technically the same for a single as it is for a home run. Therefore, the team in the field will position itself to prevent any type of hit. The lesson is that the magnitude of risk is once again revealed to be highly contextual, even in baseball.

What about a more complex measurement of a spatial risk factor affecting the impact component of risk? Consider a scenario consisting of a platoon of soldiers on patrol in enemy territory. The lieutenant instructs the troops to spread out in the event of a mortar attack. The rationale behind this strategy is to minimize the number of casualties-per-incident, i.e., the vulnerability or loss resulting from a single mortar round. The lieutenant is seeking to minimize the impact component of risk.
In this threat scenario, the impact component of risk depends on the areal density of mortar fire relative to the areal density of troops. If the attack blankets the entire patrol area, the loss is total since every soldier is certain to die or be injured. However, the lieutenant assumes a mortar attack will be dispersed over the target area, and invokes a strategy of troop separation so that a single round would, at most, affect only one soldier.

In devising this strategy the lieutenant has assessed each component of risk. The magnitude of the likelihood component of risk is unknown; a round could land at any time and at any location. If the likelihood of a round landing in a specific area were known to be lower than in surrounding areas, the lieutenant would no doubt devise a defensive strategy that leverages that knowledge, i.e., position the troops where the rounds are least likely to fall. Unfortunately, that particular strategy is not an option. Therefore, the lieutenant must devise a strategy designed to minimize the impact component of risk, and in so doing, implicitly assumes the location of a falling artillery round is a random variable.

The lieutenant does have excellent knowledge of the vulnerability as a function of the distance from where a round lands. Therefore, the lieutenant's risk management strategy integrates uncertainty in the likelihood component of risk with a high degree
Fig. 7.5 Minimum soldier separation (d) and artillery round blast area (πr²)
of certainty regarding the magnitude of vulnerability as a function of the distance from a mortar round explosion. The result is a focus on the impact component of risk.

The lieutenant also realizes that the impact component of risk equals the product of the areal density of soldiers times the blast area of a single round. Moreover, the impact component of risk is a minimum if d/2 > r, where d is the soldier separation distance and r is the blast radius. In other words, the separation of the soldiers must be greater than the diameter of the circular blast area. If this condition exists, a single mortar round can affect only one soldier. Therefore, a spatial risk factor for the impact component of risk is the ratio d/r. Figure 7.5 depicts the geometry of this threat scenario.

A dynamic model of threat scenario risk is required to account for the non-linear effect on the magnitude of impact as the battle progresses. In other words, the loss-per-threat incident is not constant. A single shell will affect a disproportionate fraction of the remaining soldiers as each round takes an increasing toll over time. The impact component of risk is now a relative quantity, and can be specified as the number of soldiers lost per round L, divided by the number of remaining soldiers R, i.e., L/R. For example, assume one soldier is lost per mortar round, one round lands per second, the total duration of a barrage is 5 s and there are five soldiers before the barrage commences. Figure 7.6 graphically illustrates the relative impact as a function of time for this threat scenario.

The previous discussion suggests it might be more accurate to specify impact in terms of the fraction of soldiers wounded or killed per exploding round rather than the absolute number. Consider the scenario where two soldiers are on patrol. If a single round can affect just one soldier, the absolute magnitude of the vulnerability component of risk is precisely one soldier.
However, that round has just neutralized one-half of the soldier population. In contrast, if one soldier out of a hundred is located within the blast radius, the vulnerability component of risk is again the same, i.e., one soldier, but the fractional loss or relative impact is 1/100th of the soldier population. The magnitude of the impact component of risk can be particularly relevant if the focus is on achieving a specific objective. If the platoon’s mission is to neutralize an enemy machine gun, the impact component of risk affects the squad’s resilience, which in turn affects its ability to accomplish the mission. This particular example
Fig. 7.6 Relative impact for an artillery threat scenario
illustrates how both the vulnerability and impact components of threat scenario risk should be considered a priori with respect to the specific mission at hand.
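The lieutenant's reasoning can be captured in a few lines: the separation criterion d/2 > r limits each round to one casualty, and the relative impact of each round is the loss L divided by the soldiers remaining R. A sketch using the figures quoted above (one casualty per round, five soldiers, a 5-s barrage); the distances are hypothetical:

```python
# Separation criterion: a round of blast radius r affects at most one
# soldier if the minimum separation d satisfies d/2 > r (i.e., d > 2r).
def single_casualty_per_round(d: float, r: float) -> bool:
    return d / 2.0 > r

assert single_casualty_per_round(d=25.0, r=10.0)       # dispersed patrol
assert not single_casualty_per_round(d=15.0, r=10.0)   # too bunched

# Relative impact per round: loss L divided by the soldiers remaining R
# when the round lands (here R is taken just before each round; the text
# leaves the convention implicit, so this is one reasonable reading).
def relative_impact(initial_soldiers: int, loss_per_round: int, rounds: int):
    impacts, remaining = [], initial_soldiers
    for _ in range(rounds):
        impacts.append(loss_per_round / remaining)
        remaining -= loss_per_round
    return impacts

# Five soldiers, one casualty per round, five rounds (one per second):
# the per-round relative impact escalates from 1/5 toward 1.0, the
# non-linear growth depicted in Fig. 7.6.
print(relative_impact(5, 1, 5))
```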
7.3 Temporal Risk Factor Measurements
(a) Measuring Likelihood

The magnitude of the likelihood component of risk can be affected by time or, more precisely, a time interval of a specific duration. Such durations can be long or short, where the magnitude of risk will depend on the context. In general, the magnitude of the likelihood component of risk increases for longer time intervals. Furthermore, the magnitude of risk could be a non-linear function of time. Consider the likelihood of a terrorism threat incident occurring during the next 50 min versus the next 50 years.

Perhaps contrary to intuition, a short time interval could also be a risk factor in specific threat scenarios. A collision with a subway train continues to be an instructive if gruesome threat scenario. We previously identified the slipperiness of the subway platform in the vicinity of the track as a spatial risk factor. We also observed that the time interval between falling on the track and the arrival of a subway train is a risk factor for the likelihood component of risk. This risk factor is not spatial since it is not inherent to the threat scenario environment.

To be precise, it is the shortness of the time interval that affects the magnitude of the likelihood component of risk for this threat scenario. Specifically, if the interval between when an affected entity falls on the track and the train's arrival is too short, the likelihood component of risk is essentially a certainty since there is effectively zero likelihood of escaping injury or death. The time
interval is quantifiable. It is a function of the train speed when the conductor first applies the brake combined with the train's rate of deceleration. If the timing of the fall sufficiently precedes the train arrival, the likelihood component of risk will be significantly reduced since the affected entity presumably has time to get out of the way. The likelihood of a threat incident, i.e., collision with a subway train, is low unless the affected entity is on the track during the critical time interval as defined above.

We can estimate the time interval using Newtonian mechanics. It is assumed that the conductor first applies the brakes and therefore begins to decelerate as soon as he or she views the affected entity on the track. The distance traveled by the decelerating train is given by the following expression:

x = v₀t − (1/2)at²   (7.1)
Here x is distance, v₀ is the initial speed, i.e., the speed when the brakes are first applied, a is the acceleration due to braking and t is the elapsed time.5

Assume that v₀ is 44 feet-per-second (~30 miles-per-hour) and a is 5 ft/s². The calculation reveals that injury or death is a near certainty if the train begins to decelerate 100 feet or less from where the affected entity falls. At this rate of deceleration the train cannot stop within 100 feet at all; it requires nearly 194 feet to come to a complete stop, and it reaches the fall point roughly 2.7 s after the brakes are first applied. Therefore, there is only about a 2.7-s window to escape the track if the train begins decelerating 100 feet from the entity. The train conductor is powerless to stop the train in less time. If the magnitude of deceleration is less or the train's initial velocity is greater, the window of time will be even narrower. The likelihood component of risk is increased during this window since a tragic outcome is inevitable unless the affected entity can somehow manage to escape.

(b) Measuring Vulnerability

In the subway threat scenario we observed that the likelihood component of risk was essentially a certainty if an affected entity cannot extricate him or herself from the track during the critical time interval defined by the time of the fall relative to the location and speed of the train at the time of the fall. The subway threat scenario is illustrative of the effect of spatial and temporal risk factors on the magnitude of the likelihood and vulnerability components of risk.

We now ask what happens if the vulnerability component of risk is affected by a temporal risk factor? In the simplest case, the risk factor contributes uniformly over the relevant time interval. Therefore, the magnitude of vulnerability equals the magnitude of the risk factor multiplied by the risk-relevant time interval.
5 Acceleration is actually negative since the train is decelerating after the brakes are applied.
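A sketch of the arithmetic behind (7.1), using the assumed values v₀ = 44 ft/s and a = 5 ft/s². Setting x = 100 feet yields a quadratic in t with two roots; only the smaller root is physical, since the larger one lies beyond the time at which the train has already stopped:

```python
import math

v0 = 44.0   # initial speed, ft/s (~30 mph)
a = 5.0     # magnitude of braking deceleration, ft/s^2

# Time and distance to a complete stop under constant deceleration.
t_stop = v0 / a                               # 8.8 s
x_stop = v0 * t_stop - 0.5 * a * t_stop**2    # 193.6 ft

# Time for the braking train to cover the 100 ft to the fall point.
# Solve v0*t - 0.5*a*t^2 = 100 and take the smaller (physical) root;
# the larger root lies past t_stop and is spurious.
x = 100.0
disc = v0**2 - 2.0 * a * x
t_hit = (v0 - math.sqrt(disc)) / a

print(round(t_stop, 2), round(x_stop, 1), round(t_hit, 2))
# 8.8 193.6 2.68
```

Since the stopping distance (193.6 ft) exceeds 100 feet, the train cannot stop short of the fall point; the escape window is t_hit, a few seconds at most.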
However, the risk factor might not have a uniform effect over this time interval. The effect could be non-linear. In such cases, a simple multiplication will not suffice in determining the total loss resulting from a threat incident. For these threat scenarios, integration over the relevant time interval is required to determine the magnitude of vulnerability. Of course, a model for threat behavior as a function of the identified risk factor is required to even perform such a calculation.

Recall the integral and the process of integration were discussed in Chap. 6. Integration of a variable or variables enables a continuous summation of a function whose value is not constant over some range of values.6 We apply the integral to a hypothetical risk factor that does not behave uniformly during some time interval. The objective is to determine the magnitude of vulnerability, where it is assumed that the magnitude of threat scenario vulnerability is exclusively attributable to this risk factor. If the risk factor varies over some risk-relevant time interval T, the cumulative vulnerability V can simply be expressed as the integral of the vulnerability risk factor Rv(t) with respect to time t:

V = ∫₀ᵀ Rv(t) dt   (7.2)
An example of a security-related parameter that obeys this model is the impulse. In physics, an impulse quantifies the effect of forces acting on an object over a time interval Δt. For a constant force F, the impulse J = FΔt. The momentum p of an object is defined as the object's mass times its velocity. In qualitative terms, it expresses the tendency for an object to remain in motion.

It is easy to show that J is equivalent to the change in momentum Δp, which results from the application of a constant force. Here m is mass, v is velocity and a is acceleration. We know that acceleration equals the change in velocity per unit time, so Δv = aΔt, and F = ma from Newton's laws. Therefore, we have the following expression:

Δp = mΔv = m(aΔt) = (ma)Δt = FΔt   (7.3)
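A numerical illustration of (7.3), together with the integral form required when the force is not constant; the masses, forces and the ramp profile below are hypothetical:

```python
# Impulse-momentum check for a constant force: J = F*dt equals dp = m*dv.
m = 2.0        # mass, kg
F = 10.0       # constant force, N
dt = 3.0       # duration, s

J = F * dt                 # impulse
dv = (F / m) * dt          # a = F/m, so dv = a*dt
dp = m * dv                # change in momentum
assert J == dp             # both equal 30.0

# For a time-varying force, integrate as in (7.2): J = integral of F(t) dt.
# Here F(t) = 10*t over 0..3 s (a hypothetical ramp), evaluated numerically
# with the trapezoidal rule.
n = 100_000
h = 3.0 / n
force = lambda t: 10.0 * t
J_var = sum(0.5 * (force(i * h) + force((i + 1) * h)) * h for i in range(n))
print(round(J_var, 3))  # ~45.0; the exact integral of 10t over [0, 3] is 45
```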
In the real world forces are not necessarily constant. As a result, the impulse must be calculated using an integral such as the one shown in (7.2). This condition applies to explosive threats, where the cumulative effect of the explosive force interacting with building components is a significant factor in the resulting damage. Recall in the case of explosive detonations the vulnerability V results from the force exerted on the structure, which in turn is a function of the distance
6 Such an integration could be performed with respect to any variable. However, functions are often integrated with respect to time or position in physical threat scenarios.
from the explosive source and the explosive payload. The latter is often expressed in units of equivalent pounds of TNT. Therefore, the explosive force Fv and the risk-relevant time interval determine the magnitude of vulnerability. Specifically, the vulnerability is represented by the following expression, which is identical in form to (7.2):

V = ∫₀ᵀ Fv(t) dt   (7.4)
The risk-relevant time interval corresponds to the limits of integration, i.e., the time that the explosive force is applied to the target structure. Therefore, the cumulative effect of the force, and hence the vulnerability component of risk, increases for an increasing time interval of interaction. The mechanical response of the material interacting with the blast relative to its natural frequency of vibration will affect this interaction time.

Temporal risk factors affecting the vulnerability component of risk can have an exponential dependence on time and therefore exhibit exponential growth or decay. An example of this condition is the threat posed by radioactive material. Excessive exposure to such material can cause serious health issues. So-called "dirty bombs" use radioactive material dispersed via conventional explosives, and have been labeled as weapons of mass destruction by the United States government.

A risk factor for exposure to a radioisotope, e.g., Cobalt-60, Cesium-137, Iridium-192, etc., is the intensity of the radiation emitted through nuclear decay. The vulnerability results from the absorbed radiation and ensuing biological damage, i.e., the dose-equivalent, where deleterious biological effects are cumulative. Radioactive decay produces ionizing radiation in the form of alpha, beta and/or gamma emissions. The radiation intensity I is proportional to the quantity of remaining radioactive material. The rate of decay is governed by a constant, often denoted as λ. In accordance with the exponential processes discussed in Chap. 6, the radiation intensity is governed by the following first order differential equation:

dI/dt = −λI   (7.5)
In other words, the time rate of change of the radiation intensity is proportional to the intensity, where the constant of proportionality is λ. The minus sign on the right side of (7.5) reflects the decrease in intensity with increasing time. Note that (7.5) describes the general behavior of radioactive decay. Since the radiation intensity can be directly related to biological damage via the radiation absorption (measured in rads; 1 rad = 62.4 × 10⁶ MeV per gram), (7.5) is an appropriate if approximate model for the vulnerability component of risk for specific radioactive threat scenarios.
From Chap. 6 we know the solution to (7.5) is given by the following expression, where I₀ is the initial intensity:

I(t) = I₀e^(−λt)   (7.6)
We also know from Chap. 6 that the half-life of this process equals 0.693/λ. For the interested reader, the half-life of radioactive materials can range from yoctoseconds, i.e., 10⁻²⁴ s, to over a billion years.

Calculating the total dose is straightforward. It is just the initial dose rate times the time interval of exposure. For example, an initial dose rate of 10 mrad (i.e., 10 × 10⁻³ rad) per hour will result in 100 mrad in 10 h. However, (7.6) reveals that the intensity does not remain constant. The intensity is changing with time since the radioisotope is losing atoms as a result of the decay process. As more atoms decay, there are fewer remaining to emit radiation. After one half-life, the remaining atoms will produce half the initial particles, after two half-lives the intensity will be reduced to one quarter, etc.

If gamma radiation is the source of energy, the energy that is locally absorbed per unit mass per unit time is determined by multiplying the mass energy absorption coefficient by the flux density of radiation and the energy of the gamma photons.7 Calculation details aside, the point is that the absorbed dose is a temporal risk factor in radiological threat scenarios, where greater exposure times increase the vulnerability component of risk.
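The decay correction implied by (7.6) is easy to compute: integrating I(t) = I₀e^(−λt) over the exposure interval gives a cumulative dose of (I₀/λ)(1 − e^(−λT)), always less than the naive constant-rate estimate I₀T. A sketch with a hypothetical isotope whose half-life is 5 h:

```python
import math

half_life_h = 5.0             # hypothetical isotope half-life, hours
lam = 0.693 / half_life_h     # decay constant (0.693 ~ ln 2, as in the text)
I0 = 10.0                     # initial dose rate, mrad/h

def intensity(t_h: float) -> float:
    """Dose rate at time t per (7.6): I(t) = I0 * exp(-lambda*t)."""
    return I0 * math.exp(-lam * t_h)

# After one half-life the rate is ~half, after two ~one quarter.
print(round(intensity(5.0), 2), round(intensity(10.0), 2))  # ~5.0 ~2.5

def dose(T_h: float) -> float:
    """Cumulative dose: integral of I(t) dt = (I0/lam)(1 - e^{-lam*T})."""
    return (I0 / lam) * (1.0 - math.exp(-lam * T_h))

# The constant-rate estimate (I0 * T = 100 mrad over 10 h) overstates the
# true cumulative dose (~54 mrad) because the source decays during exposure.
print(round(dose(10.0), 1), I0 * 10.0)
```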
7.4 Behavioral Risk Factor Measurements
Behavioral risk factors are actions or features present in affected entities that increase the magnitude of one or more components of threat scenario risk. One might mistakenly assume that "behavior" refers to the behavior of threats. However, we know from previous discussions that risk factors only apply to threat scenario environments and affected entities.

IT environments in particular offer opportunities to observe behavioral risk factors in action. Interestingly, when such risk factors are present they can sometimes be quite easy to identify and measure. For example, the web browsing history of networked computer users can be monitored, and thereby uncover individuals predisposed to risk-relevant behavior. Surreptitious downloading of executable code from malicious web sites is a common attack vector for information security threat scenarios. Recognize that this attack is often facilitated by user behavior such as visiting dodgy web sites or a lack of attentiveness in entering a username and password when prompted.
7 Shapiro, J. Radiation Protection; A Guide for Scientists and Physicians. Harvard University Press; Cambridge, MA, 1990.
The magnitude of the likelihood component of risk increases with an increasing number of computer users. Moreover, in the limit of an infinite number of users, the likelihood is a certainty. A simple mathematical argument shows why this is so. If the probability of a single user being hacked is p, the probability that a given user is not hacked is 1 − p, and the probability that none of n users is hacked is (1 − p)ⁿ. Therefore, the probability that at least one user is hacked equals 1 − (1 − p)ⁿ. As the number of users increases, i.e., n gets large, (1 − p)ⁿ decays exponentially toward zero since 1 − p is less than one, and the probability that at least one user is hacked approaches one.

IT users with a propensity for using low-entropy passwords also increase the magnitude of threat scenario risk. Vulnerability to password cracking is most appropriately measured with respect to a parameter that transcends a particular system or network: time. In Chap. 12 we discuss password cracking in detail since it represents the convergence of complexity, computing power and time, where the risk is inexorably linked to the time required to conduct a brute force attack.

Behavioral risk factors can be difficult to manage notwithstanding the fact that they can be relatively easy to identify and even measure. Humans are ingenious in circumventing security controls and/or claiming ignorance of those controls, willful or otherwise. Finally, security controls focusing on behavioral risk factors can be especially difficult to implement precisely because they require changes in behavior. Such changes often result in inconvenience thereby incentivizing individuals to circumvent security controls.

Security policies that dictate the acceptable limits on behavior are essential to addressing behavioral risk factors. In a similar vein, education, training and threat awareness complement security policies as part of a comprehensive security risk management strategy.
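The argument above reduces to one line of arithmetic. Assuming each user is independently compromised with probability p (an idealization; the value of p below is illustrative), the chance that at least one of n users is compromised is 1 − (1 − p)ⁿ, which approaches certainty as n grows:

```python
# Probability that at least one of n independent users is compromised,
# given a per-user compromise probability p: 1 - (1 - p)^n.
def p_any_compromise(p: float, n: int) -> float:
    return 1.0 - (1.0 - p) ** n

p = 0.01  # assumed per-user probability; illustrative only

print(p_any_compromise(p, 1))     # ~0.01 -- a single user
print(p_any_compromise(p, 100))   # ~0.63
print(p_any_compromise(p, 1000))  # ~0.99996 -- near certainty
```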
7.5 Multiple Risk Factors and Uncertainty in Security Risk Management8
The likelihood component of risk increases exponentially when multiple likelihood risk factors in a threat scenario are coincident. This condition is known as confluence and it is discussed in the next section. Furthermore, the source of uncertainty associated with likelihood risk factors affects assessments of likelihood via a calculation of probability or an estimate of the potential as discussed in Chap. 4.

The presence of multiple risk factors of any type also contributes to threat scenario uncertainty. However, this effect is a macroscopic condition, which is manifest as uncertainty in security risk management, i.e., the application of security controls to risk factors. In other words, the presence of multiple risk factors
8 Taylor, J. An Introduction to Error Analysis; The Study of Uncertainties in Physical Measurements, Second Edition, University Science Books, Sausalito, CA, 1997.
contributes to the overall uncertainty in addressing those risk factors, and thereby enhances the likelihood component of risk for a given threat scenario.

We will discover in Chap. 9 that the likelihood component of risk is affected as a result of an increase in threat scenario complexity. Furthermore, it turns out that the magnitude of complexity is a function of the number of risk factors combined with the uncertainty in security risk management. By necessity, the proposed model for complexity allows for a subjective evaluation of this uncertainty.

It would be useful to actually estimate the uncertainty associated with high-risk threat scenarios, i.e., those containing multiple risk factors, rather than settle for a subjective evaluation. Such an estimate is theoretically possible if each risk factor is a variable with finite variance. The following discussion is a straightforward statistical result that applies to any scenario containing multiple variables of finite variance.

We are interested in the so-called propagation of uncertainty (or propagation of error). This is the effect of the uncertainties of multiple variables, i.e., errors, and specifically random errors, on the uncertainty of a function composed of these variables. In this case, the variables are risk factors whose values are derived from individual measurements. Each of these measurements has an uncertainty due to measurement limitations, perhaps instrument imprecision, which propagates due to the combination of variables in the overall function. This function reflects the overall threat scenario risk, which results from the effect of the risk factors in aggregate.

The uncertainty u can be expressed in a number of ways. It may be defined by the absolute error Δx. Uncertainties can also be defined by the relative error (Δx)/x, which can be written as a percentage.
Most often the uncertainty u of a quantity is quantified in terms of the standard deviation σ, the positive square root of the variance σ². The value of a quantity and its error/uncertainty can then be expressed as an interval x ± u. If the probability distribution of the variable is known or can be assumed, it is possible to derive confidence limits to describe the region within which the true value of the variable may be found. For example, we learned in Chap. 5 that the 68% confidence limit for a one-dimensional variable belonging to a normal distribution is approximately one standard deviation σ from the mean value x̄. This statement means that the region x̄ ± σ will encompass the true value in roughly 68% of samples.

As noted above, we assume each variable has a distribution of values with finite variance. For example, assume variable A has a mean value x̄ with uncertainty σx and variable B has a mean value ȳ with uncertainty σy. The uncertainty of the sum f = x + y is given by (σx² + σy²)^(1/2). In other words, the uncertainty in the sum of two independent variables equals the square root of the sum of the squares of their individual uncertainties. An upper bound on the uncertainty of f = x + y is given by σx + σy.

This result is true in general. For N independent variables x₁, x₂, . . ., x_N with individual uncertainties σ₁, σ₂, . . ., σ_N, the uncertainty U in the sum is the square root of the sum of the squares of the individual uncertainties:

U = (σ₁² + σ₂² + . . . + σ_N²)^(1/2)   (7.7)
If the variables are the risk factors in a threat scenario, the overall uncertainty in their mean value scales as the square root of the sum of squares of the individual risk factors. A measurement of U would constitute a macroscopic quantity relating to the general condition of uncertainty across the threat scenario, and the potential confidence in the application of security controls. Of course, the simultaneous strength and limitation of this approach is the requirement that individual risk factors are variables with finite variance. As with other analyses based on probabilistic reasoning such as those described in Chap. 8, it is imperative to consider both the realism and implications of such assumptions.
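As a quick sanity check, the root-sum-square rule of Eq. (7.7) is easy to compute directly. The function names and the three sample uncertainties below are illustrative, not taken from any particular threat scenario:

```python
import math

def combined_uncertainty(uncertainties):
    """Root-sum-square combination of independent uncertainties, per Eq. (7.7)."""
    return math.sqrt(sum(u * u for u in uncertainties))

def uncertainty_upper_bound(uncertainties):
    """Upper bound on the combined uncertainty: the simple sum."""
    return sum(uncertainties)

# Three hypothetical risk factors with individual measurement uncertainties
u = [3.0, 4.0, 12.0]
print(combined_uncertainty(u))        # 13.0 (sqrt of 9 + 16 + 144)
print(uncertainty_upper_bound(u))     # 19.0
```

Note that the root-sum-square combination is always no larger than the simple sum, consistent with the upper bound stated above.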
7.6 Summary
There are five categories of risk factors: apex, spatial, temporal, behavioral and complexity. Spatial and behavioral risk factors are present in threat scenario environments and affected entities, respectively. Apex risk factors are non-specific and warrant a distinct category exclusively because of their effect on the magnitude of risk. Complexity risk factors include the number of risk factors and uncertainty in security risk management. Complexity is itself a risk factor for the likelihood component of risk, and a discussion regarding complexity details is deferred until Chap. 9.

Measurements of risk factors are organized according to a risk factor category and a particular component of risk. Measurements of risk factors often entail quantifying the effect of a change in a particular threat scenario condition or feature, which in turn affects the magnitude of the relevant component of risk. Most if not all risk factor measurements are borrowed from traditional scientific and engineering disciplines. Measurements of spatial and temporal risk factors are particularly risk-relevant since their magnitude often scales with distance and time.

Fluctuating risk factors can be intermittently confluent, and thereby increase the magnitude of threat scenario risk during the overlapping time interval. The magnitude of risk remains elevated as long as the likelihood risk factors are coincident. The effect of confluence is multiplicative, and therefore multiple likelihood risk factors exponentially increase the likelihood component of risk during the relevant time interval.

Multiple risk factors of any type increase the likelihood component of risk by increasing the uncertainty in security risk management, i.e., the application of security controls. Specifically, assuming each risk factor is a variable with finite variance, the total uncertainty scales as the square root of the sum of the squares of the individual risk factor uncertainties.
Chapter 8
Elementary Stochastic Methods and Security Risk
8.1 Introduction
The likelihood component of risk deserves special attention. It is unique in part because of the two methods used to assess its magnitude. As discussed in Chap. 4 and noted elsewhere, direct assessments of the likelihood component of risk leverage probability distributions of threat incidents to specify the probability that a future incident, should it occur, will be of a certain type or possess a specific characteristic. In contrast, indirect assessments use the number of risk factor-related incidents and/or a change in the magnitude of a risk factor to infer the potential for a future type of threat incident. The objective of both direct and indirect assessments is the same: estimate the magnitude of the likelihood component of risk. What we really mean by this statement is that we wish to gain insight into the likelihood of a future threat incident with specific risk-relevant characteristics.

To that end, calculating the probability of a future threat incident based on historical evidence would be ideal. Unfortunately, such a calculation is often not possible because there is a paucity of direct evidence, i.e., threat incidents. In other fields of risk management such as medicine, statistical sampling is used to generalize about the likelihood of a future threat incident based on the past. In the absence of statistical evidence, workarounds are required.

Some of these workarounds require an assumption of randomness so that the laws of probability can be invoked. These laws facilitate a transition from one form of uncertainty, i.e., the uncertainty inherent to indirect assessments, to the prescribed uncertainty associated with direct assessments, i.e., the dispersion about the mean of a probability distribution. Processes involving random variables are called stochastic processes. As noted above, the advantages conferred by stochastic processes derive from the applicability of the laws of probability.
© Springer Nature Switzerland AG 2019
C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_8

However, applying stochastic processes to certain
threat scenarios can require a leap of faith if not a suspension of reality in the interest of making a problem tractable. Nevertheless, an assumption of randomness can represent a useful departure from reality absent additional risk-relevant information. Furthermore, a truly random threat scenario implies an absence of security controls, which might be considered a worst case or at least a very bad one in terms of risk management. The absence of security controls has significant operational implications, and is a condition that is arguably only slightly less severe than the intentional circumvention of security controls by an insider.

Consider the following two questions: "What is the likelihood that a 'two' appears when tossing a fair die?" and "What is the likelihood that a particular building will be attacked by terrorists in the next month?" These may seem like similar questions because they have the term "likelihood" in common, but they are actually quite different. Furthermore, the difference highlights the distinction between probability and potential.

A significant problem in assessing terrorism risk or any act of violence is that it is not known whether a particular target will actually be attacked. Such a problem does not exist in a game of dice: all participants are confident an outcome is forthcoming. Moreover, the risk factors that affect incident occurrence are varied and do not necessarily affect threat scenarios in the same way. Suppose threat actors suddenly become ill, have a change of heart and/or encounter some other unforeseen issue. The likelihood that a future terrorist incident will actually take place is open-ended. Moreover, the range of possible outcomes is not bounded, which is equivalent to saying the distribution cannot be normalized, nor can the potential influences on the threats be quantified. One option intended to overcome these issues is to model the problem as a stochastic process such as a coin or die toss.
In other words, a threat incident is assumed to be a random variable, which results from either risk factor incoherence or the absence of any likelihood risk factors. In the latter case threat incidents happen spontaneously without prompting, and threat incidents are independent.

Candidly, modeling a threat scenario as a game of chance is a significant simplification. Although every threat scenario is somehow probabilistic, assuming an outcome is a random variable demands flexibility in thinking about threat scenarios, to say the least. Specifying the definition of a random variable is helpful in appreciating why this is so, as well as in partially justifying such an assumption. A random variable is a variable whose values depend on outcomes of a random process or phenomenon.¹ Importantly, a random variable has a probability distribution, which specifies the probability of its possible values. There are many examples of random variables. They arise with surprising regularity in everyday life and often in rather mundane contexts.
1. Blitzstein, J., Hwang, J., Introduction to Probability. CRC Press, 2014.
Recall the discussion on cutting bananas in Chap. 4 in connection with sources of uncertainty. Slicing a banana is a process that yields bite-size portions to facilitate consumption. Although one might attempt to produce slices of exactly the same width, the various forces that govern slicing and stabilizing the banana essentially guarantee inconsistency in width from slice to slice. In fact, such factors "conspire" to make slice-width a random variable. Examples like this are everywhere, e.g., the length of each step we take, the cost of groceries each time we go shopping, the time required to travel to work each day, the motion of particles in suspension, etc. Random processes are ubiquitous, and fortunately there is a high tolerance if not an actual requirement for randomness in most aspects of life.

Although games involving die and coin tosses are based on a process outcome being a random variable, predictions of any individual outcome are theoretically possible. In fact, the process is completely deterministic if enough information is available since Newton's laws govern every outcome. Dice respond to inertial and frictional forces, and such forces are what determine the outcome of a toss. If these forces could be measured and/or calculated, the outcome of a toss would be entirely predictable. The philosopher Karl Popper speaks of predictability in throwing dice:²

    In order to deduce predictions one needs laws and initial conditions; if no suitable laws are available or if the initial conditions cannot be ascertained, the scientific way of predicting breaks down. In throwing dice, what we lack is, clearly, sufficient knowledge of initial conditions. With sufficiently precise measurement of initial conditions it would be possible to make predictions in this case also; but the rules for correct dicing (shaking the dice-box) are so chosen as to prevent us from measuring initial conditions.
Games such as dice and coin tosses are actually predicated on physical symmetry. Symmetry facilitates randomness in this context. Because dice and coins are roughly symmetric about their principal axes of rotation, the outcome of each toss is a random variable. If this were not the case these types of games would not exist. However, recognize it is impossible to manufacture a coin or die that is perfectly symmetric. This innate asymmetry introduces a small torque about a principal axis upon rotation. Although minor asymmetries exist throughout the die, their net effect is either averaged out or too insignificant to have a noticeable effect.

The response of a die to the forces that accompany each toss is complex and would require sophisticated methods to measure and/or calculate. If this were not the case, each outcome would be predictable, thereby removing any semblance of surprise. The lack of transparency associated with the inertial and frictional forces of each toss, coupled with the symmetry of the die, makes the outcome of a toss a random variable. The physical complexity of a seemingly simple die toss both enables and compels the use of a stochastic process in predicting process outcomes. The net result is that each outcome or "state" of a fair die is equally likely.
2. Popper, Karl, The Logic of Scientific Discovery. New York: Basic Books, 1959.
Therefore, the probability of any particular outcome is inversely proportional to the number of possible outcomes, i.e., one-sixth. What lessons can we draw from such systems or processes with respect to threat scenario randomness? We are now well aware that threat incidents are a by-product of the threat scenario risk factors. These risk factors can be numerous, multi-faceted and incoherent, i.e., not in phase. One might envision that the net effect of disparate, coincident risk factors is to cause a parameter associated with the resulting incidents, e.g., incident number or incident magnitude, to be a random variable. The overarching assumption is that the effect of threat scenario bias is negligible compared to the randomizing effects of multiple risk factors affecting the likelihood component of risk.

Suppose a terrorism threat scenario consists of six identical buildings subject to attack by an adversary. This threat scenario has six outcomes, i.e., possible states, and would be equivalent to a die toss assuming an attack is inevitable and an attack on any one of the buildings is equally likely. The latter statement is equivalent to saying the likelihood risk factors are identical for each building. If a bias toward one building exists, i.e., the likelihood risk factors are not identical, such a scenario is functionally equivalent to a game involving an asymmetric coin or die. In the absence of other risk-relevant information to bias the outcome, the likelihood that a specific building is attacked is trivially calculated to be one-sixth.

The lack of a bias in either scenario enables predictions of likelihood that are strictly numerical in accordance with the laws of probability, i.e., inversely proportional to the number of states or possible outcomes. The assumption that threat scenario outcomes are subject to the laws of probability is central to each method in this chapter.
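The equivalence between the unbiased six-building scenario and a fair die can be illustrated with a short simulation; the trial count and random seed below are arbitrary choices:

```python
import random

random.seed(0)  # fixed seed for reproducibility

N_BUILDINGS = 6        # six identical buildings; one attack assumed inevitable
TRIALS = 100_000

# With no bias, an attack on any building is equally likely: the scenario
# is formally identical to tossing a fair six-sided die.
counts = [0] * N_BUILDINGS
for _ in range(TRIALS):
    counts[random.randrange(N_BUILDINGS)] += 1

for building, c in enumerate(counts, start=1):
    print(f"building {building}: empirical likelihood {c / TRIALS:.3f}")  # each ~0.167
```

Each empirical frequency converges to 1/6 ≈ 0.167, the value obtained directly from the laws of probability.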
If a threat incident is assumed to be a random variable, the threat scenario becomes similar to a game of chance, where threat incidents are scenario outcomes and the forces that influence the outcomes are the risk factors. The assumption transforms a threat scenario with seemingly insuperable limitations that are driven by uncertainty in the effect of the risk factors into a straightforward mathematical exercise driven by the uncertainty inherent to a probability distribution. It is necessary to recognize the limits of an assumption of randomness and to use nuance in interpreting results so derived, and thereby achieve insights on risk that would otherwise be unobtainable. To that end, various stochastic processes are discussed in this chapter. Each is potentially applicable to specific threat scenarios pursuant to estimating the likelihood component of risk.
8.2 Probability Distributions and Uncertainty
As noted many times, a probability distribution is required to calculate the probability of a specific process outcome. Formulating an appropriate probability distribution is possible assuming certain threat scenario conditions apply. For example, specifying the probability that a specific number of threat incidents will occur in a given time interval is possible if the Poisson assumptions delineated in Chap. 5 are valid.
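As a sketch, the Poisson probability P(X = x) = e^(−λ) λ^x / x! can be computed directly; the rate λ = 3 is purely illustrative (it happens to reproduce the indicative values discussed for Fig. 8.1):

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean rate lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 3  # illustrative rate; no dataset is implied
for x in range(6):
    print(f"P(X = {x}) = {poisson_pmf(x, lam):.3f}")
```

With λ = 3 this yields P(X = 0) ≈ 0.05, P(X = 1) ≈ 0.15, P(X = 2) = P(X = 3) ≈ 0.22 and P(X = 4) ≈ 0.17.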
Fig. 8.1 An indicative Poisson distribution of threat incidents
Figure 8.1 illustrates a hypothetical Poisson distribution of threat incidents, where x is the number of incidents and P(X = x) is the probability of experiencing x incidents in a given time interval. It is evident from Fig. 8.1 that the probability of experiencing two or three threat incidents in the specified time interval is about 1.5 times higher than the probability of experiencing exactly one incident, i.e., a probability of 0.23 compared to 0.15. The probability of experiencing exactly four incidents is 3.4 times higher than experiencing zero incidents in the specified time interval, i.e., a probability of 0.17 compared to 0.05.

Unfortunately, security threats such as terrorism do not reliably obey Poisson statistics. For example, threat incidents are not necessarily independent, and they certainly do not occur at a fixed rate. Nevertheless, it is instructive to assume threat incidents do obey a Poisson process, and thereby demonstrate its potential in assessing likelihood.

We first discuss the implications of sample size on stochastic assessments of likelihood. We have seen that a sample drawn from a parent distribution enables estimates of likelihood with respect to the parent. The simple analysis of the distribution of hair color in Chap. 5 exemplified the statistical sampling process. However, the precision of such estimates depends on the sample size. It is instructive to quantify that dependence.

Let's assume threat incidents obey Poisson statistics. The standard deviation σ of a Poisson distribution equals the square root of the sample size N. In other words,

σ = √N   (8.1)
If we can only tolerate an uncertainty about the mean equal to 10% of the sample size, this requirement is equivalent to the following condition:

σ = N/10   (8.2)
Equating (8.1) and (8.2) we get

√N = N/10   (8.3)
Squaring both sides and solving for N yields N = 100. Therefore, 100 samples are required to limit the uncertainty to the specified level or, conversely, to establish the required level of precision. Suppose even less uncertainty can be tolerated. Specifically, what if we specify that the standard deviation must be one hundredth of the sample size? In other words,

σ = N/100   (8.4)
Equating (8.1) and (8.4) implies that N must equal 10,000 in order to reduce the relative uncertainty to one-tenth of its previous value or, conversely, to increase the precision by a factor of 10. Therefore, reducing the uncertainty to one-tenth requires increasing the sample size by a factor of 100. One can generalize from these examples: the sample size required for a Poisson distribution (or a normal distribution) scales inversely with the square of the tolerated relative uncertainty. Figure 8.2 quantifies the required precision as a function of the number of samples. We see that relaxing the precision requirement by a factor of five, e.g., from 5 to 25 percent, implies the sample size can be reduced from 400 to 16, i.e., a factor of 25 decrease. If we substitute the word "samples" with "threat incidents," the implication for the likelihood component of risk becomes clear. Namely, a distribution consisting of only a few incidents would be characterized by a significant uncertainty about the
Fig. 8.2 Required precision as a function of the number of samples in a Poisson distribution
mean, i.e., a large variance. In this case, mathematical expedience invites moral conflict; more threat incidents imply better assessment precision at the expense of reduced security.
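The sample-size arithmetic above reduces to N = 1/f², where f is the tolerated relative uncertainty (σ = fN); a minimal sketch, with the function name and percentages chosen for illustration:

```python
def required_sample_size(relative_uncertainty):
    """Samples needed so that sigma = sqrt(N) equals a fraction f of N.

    Setting sqrt(N) = f * N and solving gives N = 1 / f**2.
    """
    return round(1 / relative_uncertainty ** 2)

print(required_sample_size(0.10))  # 100   (Eqs. 8.1-8.3)
print(required_sample_size(0.01))  # 10000 (Eqs. 8.1 and 8.4)
print(required_sample_size(0.05))  # 400   (cf. Fig. 8.2)
print(required_sample_size(0.25))  # 16
```

The last two values reproduce the Fig. 8.2 observation that relaxing the precision requirement by a factor of five shrinks the required sample by a factor of 25.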
8.3 Indicative Probability Calculations
An assumption of randomness is liberating. The laws of probability are well understood, and the resulting probability distribution of threat incidents is a powerful resource in estimating the likelihood component of risk. Specifically, the variance associated with a probability distribution now becomes the source of uncertainty, which enables quantitative estimates of likelihood. But freedom can exact a price. As noted in Chap. 3, the possible explanations for threat incident randomness are that incidents just happen without rhyme or reason, or that they represent the net effect of risk factor incoherence. In the former case, the implication is there are no likelihood risk factors; in the latter, the risk factors incoherently combine to produce unpredictable results.

We next assess the likelihood component of risk for a specific threat scenario assuming a threat incident is a random variable. The point is to illustrate how invoking randomness enables straightforward estimates of likelihood. In so doing, the pitfalls inherent to such an approach will also become clear.

Consider the process of enrolling individuals in an access control system. Most large organizations control physical access to restricted internal space via card readers that open magnetic door locks following the presentation of a valid ID card. An access control server manages numerous ID card readers, where information regarding access privileges is maintained in a database of individuals affiliated with the organization. Control panels on each floor are connected to a specific set of card readers that manage physical access to designated areas. These ID card readers disengage locking mechanisms when presented with an ID card that is encoded with information regarding a particular cardholder's access privileges. Such privileges are often based on an individual's role within the organization, e.g., employee, consultant, student, etc.
Each role entitles an individual to enter specific areas, and an ID card reader is typically mounted outside the door of the restricted area. Role-based access control is a fundamental security control used by most large organizations. The nexus of security risk and role-based access privilege highlights the importance of identity and access management in the provisioning and de-provisioning of such privileges. Many security threats are facilitated by unauthorized access to restricted areas. Each individual affiliated with the organization is enrolled in the ID card access system as part of the on-boarding process. As noted above, access privileges align with the individual’s role in the organization. There are a number of reasons an
individual could be placed in an incorrect role and therefore be granted incorrect access privileges. At a high level, such reasons include intentional and unintentional enrollment errors. Critically, for this estimate of risk we assume system enrollment is a random variable.

The objective is to determine the probability of unauthorized access privileges via the improper assignment of such privileges, i.e., erroneous role assignment. A stochastic approach is used to estimate the likelihood that an individual will be assigned an incorrect role. As a result of this mistaken assignment, an individual is granted access to locations controlled by ID card readers associated with the incorrect role. We assume the following threat scenario conditions apply:

• Physical access privilege assignment is role-based.
• ID card readers linked to a centrally managed access control system enforce access privileges.
• A role is characterized by a unique sub-set of ID card readers that facilitate access to restricted areas appropriate to a given role.
• Each employee is assigned only one role.
• Role assignment is a random variable with two outcomes: correct and incorrect.

Assume there are N roles in an organization. Since enrollment is a random variable and only one role is the correct role, the probability that an individual is randomly assigned an incorrect role is given by the following:³

p(incorrect role) = (N − 1)/N   (8.5)
The numerator equals N − 1 since one role is presumably correct for a given enrollee. Conversely, the probability of assigning a correct role is p(correct) = 1/N. We note that (N − 1)/N + 1/N = 1, which must be true according to the definition of a probability. If enrollment is indeed a random variable, the probability of an error is quite high: it equals 1 − 1/N, which grows with the number of roles and has a minimum of 0.5 when there are only two roles. Note we are not describing the probability of assigning a particular incorrect role. Rather, we are specifying the probability of assigning an individual ANY incorrect role. However, we know that in real life the process is deliberate, and therefore the potential for an error should actually be quite low. We might view random role assignments as a near-worst case, where the worst case is an intentional incorrect assignment. This example highlights the all-important gulf between estimates of probability and potential.

If the process outcomes are not random, there are risk factors that can contribute to the likelihood of an enrollment error. These include the alertness of the enroller, the complexity of the enrollment process, etc. Absent a probability
3. Given that an erroneous assignment has occurred, the probability of being assigned a specific erroneous role is 1/(N − 1).
distribution of historical enrollment errors, any estimate of the likelihood component of risk is inherently qualitative. The assumption of randomness obviates the need for such a probability distribution, where the price of that assumption is a contrived threat scenario. However, we continue with the calculation for effect.

If a distinct set of ID card readers is associated with a particular role, the probability of erroneously being granted access to a restricted space controlled by a particular ID card reader equals the probability of an incorrect role assignment, i.e., (N − 1)/N, times the fraction of ID card readers associated with an incorrect role. However, there could be an overlap of authorized card readers across roles. Therefore, even if an individual has been assigned an incorrect role, some fraction of the ID card readers that belong to the erroneous role might also belong to his or her legitimate role.

Let T be the total number of card readers and M the number of legitimate, i.e., authorized, card readers within an erroneously assigned role. The probability of being granted access to an unauthorized card reader becomes

p(unauthorized card reader within an erroneously assigned role) = (T − M)/T

Therefore, the probability of being granted unauthorized access privileges equals the probability of being assigned an incorrect role times the probability of an unauthorized card reader within an erroneously assigned role:

p(unauthorized access) = [(N − 1)/N] × [(T − M)/T]

Let's put in some numbers and thereby calculate the magnitude of the likelihood component of risk for a specific threat scenario.
If N = 100 roles, T = 1000 total card readers, and M = 50 authorized card readers per role, the probability of unauthorized access to a restricted space due to a randomly occurring enrollment error equals

p(unauthorized access) = (99/100) × (950/1000) ≈ 0.94

Note that if each of the various roles facilitates access to a nearly identical set of card readers, i.e., each role has numerous card readers in common with the others and M is a high number, the probability of unauthorized access decreases significantly. For example, if T = 1000 and M = 900, the probability of unauthorized access to a restricted space by virtue of a random role assignment becomes

p(unauthorized access) = (99/100) × (100/1000) ≈ 0.10

These estimates exemplify the trade-offs that often accompany decisions on security risk management. In the first case, a highly segregated environment increased the likelihood of unauthorized physical access, a risk factor for many threat scenarios.
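Both estimates can be reproduced in a few lines; the function name is ours, and the closing at-least-one-enrollee calculation rests on the same assumption of random, independent enrollments:

```python
def p_unauthorized_access(n_roles, total_readers, authorized_readers):
    """Probability of unauthorized access under random role assignment.

    p(incorrect role) = (N - 1) / N, per Eq. (8.5)
    p(unauthorized reader | incorrect role) = (T - M) / T
    """
    p_incorrect = (n_roles - 1) / n_roles
    p_reader = (total_readers - authorized_readers) / total_readers
    return p_incorrect * p_reader

# Highly segregated environment: N = 100 roles, T = 1000 readers, M = 50
p_segregated = p_unauthorized_access(100, 1000, 50)    # ~0.94
# Liberal privileges: M = 900 readers shared across roles
p_liberal = p_unauthorized_access(100, 1000, 900)      # ~0.10

# Probability that at least one of 1000 independent enrollees
# gains unauthorized access: 1 - (1 - p)^1000
for p in (p_segregated, p_liberal):
    print(1 - (1 - p) ** 1000)                         # ~1.0 in both cases
```

The at-least-one probability is effectively 1 in both cases, illustrating how even a modest per-enrollment error probability becomes a near-certainty over many enrollments.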
In the second example, liberal physical access privileges reduced the likelihood of unauthorized access, yet such privileges are a known risk factor for many threat scenarios. Of course, the trade-off in this case derives from the admittedly far-fetched assumption that role assignment is a random variable.

Returning to the example, if enrollments occur frequently within a highly segregated organization, the probability of at least one enrollee being granted unauthorized access to a card reader in a 1000-person organization is given by the following expression, assuming each person is enrolled only once:

p(at least one enrollee with unauthorized access privileges) = 1 − (1 − 0.94)^1000 = 1 − (0.06)^1000 ≈ 1

We see that the likelihood that at least one enrollee is granted unauthorized access privileges is a near-certainty for a large organization. If we now use the figure for the probability of unauthorized access for organizations with liberal access privileges, the probability that at least one enrollee is granted unauthorized access privileges also becomes approximately 1:

1 − (1 − 0.10)^1000 = 1 − (0.90)^1000 ≈ 1

In other words, the probability of granting erroneous access to restricted space via this error mechanism is a near certainty irrespective of whether the environment is highly segregated or not. This condition exists whenever there is even a small probability of unauthorized access per enrollment and there are many enrollments.

Note that the Poisson process can also be used to develop a probability distribution associated with this error process if the assumptions specified in Chap. 5 are valid. For example, assume the programming error rate is one per month or, equivalently, twelve per year. Figure 8.3 illustrates the probability distribution for a one-year time interval.

How would Fig. 8.3 or a similar probability distribution assist in a security risk assessment and the subsequent development of a security risk management strategy?
To answer this question, it must be determined whether the Poisson process is applicable. First, threat incidents associated with the same threat scenario are not necessarily independent. Humans who commit threat incidents are driven by a multiplicity of factors that bias the threat scenario. Second, threat incidents clearly do not necessarily occur at a constant rate. Therefore, it seems unlikely that the Poisson process can, in general, be applied to threat scenarios.

The reality is that threat scenarios, which beget threat incidents, are often biased by one or more risk factors. In contrast, there are no such influences associated with a process such as radioactive decay, where incident arrivals do conform to the assumptions noted above. Hence, the Poisson process is appropriate for estimating
Fig. 8.3 Poisson distribution of access control programming errors (λ = 12)
the likelihood of a specific number of radioactive counts in a specified time interval, which in turn relates to the absorption of damaging energy. In general, the challenge in assessing the likelihood component of risk for a given threat scenario is to identify the risk factors and to assess their individual contribution to each component of risk.

How might a probability distribution of threat incidents inform an idealized security risk assessment, assuming such incidents are plentiful? Suppose a nutrition-conscious terrorist organization has an antipathy for fast food establishments. Executive management at the big fast food chains hire an expensive security consultant to determine how best to address this threat scenario. Unfortunately, there are likely to be many credible threats to fast food chains, including patrons who have experienced one of many unpleasant digestive aftereffects. Undaunted, the security consultant attempts to estimate the likelihood component of risk for this threat scenario.

The first step is to establish a probability distribution of historical threat incidents in terms of a possible risk factor. The consultant posits that terrorism threat incidents against fast food restaurants are correlated with the number of cheeseburgers served. She uses the historical data on terrorist attacks against fast food establishments coupled with cheeseburger sales to calculate a correlation coefficient between the two series. This calculation is identical to the one performed in Chap. 6. Let's say the result is a positive correlation coefficient equal to 0.6.

The operational question is what to do about these results, which presumably is the point of hiring a security consultant. Although this example is admittedly tongue-in-cheek (figuratively and literally), organizations might face security risk management issues that require similar analyses.
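A correlation coefficient of this kind is presumably the standard Pearson coefficient (the Chap. 6 calculation itself is not reproduced here); the sketch below uses entirely fabricated series for illustration only:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Entirely fabricated annual series, for illustration only:
cheeseburgers = [120, 150, 130, 170, 160, 190]   # sales (millions)
incidents     = [2, 3, 2, 4, 5, 4]               # threat incidents
print(f"r = {pearson_r(cheeseburgers, incidents):.2f}")  # a strong positive r
```

As the surrounding discussion stresses, even a strong positive r demonstrates association in the historical record, not a causal risk factor.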
Each organization attempts to identify the spectrum of threat scenario risk factors and their relative contribution to the magnitude of security risk. The objective is to leverage these results to develop an effective security risk management strategy, which means identifying and addressing as many risk factors as possible. Although the correlation coefficient noted above suggests a relationship between cheeseburger sales and terrorism-related threat incidents, the evidence is not
Fig. 8.4 Mortality from heart disease due to fast food consumption: hazard ratio (95% CI) vs. frequency of fast food intake (none, 1–3/month, 1/week, 2–3/week, >4/week), rising from 1.00 to 1.79; P trend = 0.0015
dispositive. Ideally, a controlled experiment would be conducted to isolate the effect of cheeseburger sales relative to other possible causes. For example, what about the effect of French fries and/or onion rings? Furthermore, even a high positive correlation does not imply that future attacks on fast food establishments that serve large quantities of cheeseburgers are likely. The historical record provides evidence of what happened in the past. The past is not necessarily an accurate predictor of the future unless the relevant risk factors remain stable.

Although cheeseburgers are probably not a risk factor for terrorism, they are a risk factor for other threat scenarios such as heart disease. In fact, the magnitude of the likelihood component of risk for heart attack threat scenarios has been quantified, and the data clearly point to cheeseburgers and other fast food items as culprits. Extensive heart disease studies provide statistically significant data on the fast food risk factor. Fig. 8.4 shows the Hazard Ratio (HR) and 95% confidence interval of coronary heart disease mortality relative to the frequency of fast food consumption.4 Note that confidence intervals have been specified for this sample. For example, the 95% confidence interval means there is a 95% probability that the value lies within the ranges indicated in Fig. 8.4. Based on the data, the risk of mortality due to eating fast food clearly increases with greater fast food
4 Odegard, et al., Western Style Fast Food Intake and Cardiometabolic Risk in an Eastern Country; Circulation, July 8, 2012. http://circ.ahajournals.org/content/126/2/182
8.3 Indicative Probability Calculations
[Fig. 8.5 Severity of terrorist attacks as a function of the number of terrorism incident deaths and injuries: complementary cumulative probability P(x > X) vs. severity of attack X on log-log axes, plotted separately for deaths, injuries, and total casualties]
consumption. Importantly, it is possible to generalize about the likelihood of a cardiac event for any individual contemplating a diet rich in fast food.

Different approaches to characterizing terrorism risk are possible, and the resulting probability distribution will differ accordingly. For example, one study related to terrorism parameterized terrorist incidents in terms of the severity of attacks.5 The results are shown in Fig. 8.5. The probability of severity obeys an inverse power law, i.e., a function with a negative exponent. Such functions are sometimes referred to as "scale-free" distributions, and their general form is written as follows:

f(x) = x^(−y)   (8.6)
The lesson of Fig. 8.5 is best described in words: most attacks produced few casualties, but a few attacks resulted in many casualties. Understanding common conditions associated with the more severe attacks might yield insights into the typical terrorism modus operandi. However, Fig. 8.5 will not yield much insight into the likelihood of a future terrorist attack. It reveals that severe, i.e., high-impact, attacks are relatively rare, which is not surprising. However, this revelation alone is of questionable operational value.

Other threat scenarios illustrate how a probability distribution of threat incidents can be used to quantitatively assess the likelihood and/or vulnerability components
5 A. Clauset, M. Young, Scale Invariance in Global Terrorism; Feb 3, 2005.
of risk. The technique is a straightforward calculation using the standard normal distribution introduced in Chap. 5.

Suppose the mean value of stolen items in a building is $1.75 K, and the standard deviation of the distribution of stolen item values is $0.05 K. There have been 5000 historical thefts, and the value of stolen items is assumed to be a normally distributed random variable. What fraction of all thefts in the distribution should be less than $1.80 K?

Recall the standard normal variable Z = (x − μ)/σ, where x is a specific value of the sample distribution, μ is the mean of the distribution and σ is the standard deviation. A fraction of the area of a standard normal distribution corresponds to the probability that x is greater or less than the remainder of the distribution. A table of values for Z can be used to find the fractional area of the standard normal distribution curve corresponding to the number of thefts less than $1.80 K in value as follows:

Z[(x − μ)/σ] = Z[($1.80 K − $1.75 K)/$0.05 K] = Z[1]

Therefore, the objective is to determine the fraction of thefts less than Z = 1. Z[1] is identified from a cumulative table for the standard normal distribution and its value is 0.8413. Therefore, approximately 84% of all stolen items were less than $1.80 K in value. Since there have been a total of 5000 thefts, the number of thefts of interest corresponds to 0.8413 × 5000 ≈ 4206 thefts.

This statistic might influence the choice of security controls. For example, it might not make sense to install a multi-million-dollar security system to protect items where the historical probability that any single stolen item is worth less than $1.80 K is 84%. On the other hand, one might argue that any organization that experiences 5000 thefts clearly requires more intense security risk management irrespective of the distribution of stolen merchandise values. We might be interested in how many thefts in the distribution were worth more than a particular value.
Suppose the value of interest is $1.85 K. In that case, the complementary value of Z, i.e., 1 − Z, is relevant. We calculate the area under the standard normal distribution curve that is greater than a specified standard normal variable. We apply the standard normal variable formula and perform the calculation as follows:

1 − Z[($1.85 K − $1.75 K)/$0.05 K] = 1 − Z[2] = 0.0228

Therefore, the number of thefts in the distribution where the stolen items are worth more than $1.85 K equals 5000 × 0.0228 ≈ 114 thefts.

The final statistic of interest might be the number of thefts of items valued between $1.80 K and $1.85 K. This figure is given by the total population of thefts
minus the sum of the stolen items worth less than $1.80 K plus the number of stolen items worth more than $1.85 K:

5000 − (4206 + 114) = 680 thefts
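The three theft statistics above can be reproduced with a short calculation. The following sketch uses only Python's standard library; the dollar figures and theft count are those of the example, and the standard normal CDF is computed from the error function rather than looked up in a table.

```python
from math import erf, sqrt

def phi(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, n = 1.75, 0.05, 5000   # mean/std of stolen-item values ($K); total thefts

frac_below = phi((1.80 - mu) / sigma)         # Z[1], approximately 0.8413
frac_above = 1.0 - phi((1.85 - mu) / sigma)   # 1 - Z[2], approximately 0.0228

print(round(n * frac_below))   # thefts under $1.80 K; the text's 4206 follows
                               # from the rounded table value 0.8413
print(round(n * frac_above))                     # thefts over $1.85 K, ~114
print(round(n * (1 - frac_below - frac_above)))  # thefts in between, ~680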
8.4 The Random Walk
We next present a threat scenario where the principal risk factor is a variable subject to random fluctuations. Although not supremely realistic, it does illustrate how risk factor-security control misalignment is manifest as uncertainty in security risk management. Moreover, it demonstrates how the prescribed limits of uncertainty associated with a probability distribution of threat scenario outcomes can be leveraged to monitor the likelihood component of risk.

Consider a knob on a security control that is mechanically stimulated by random noise. Discrete knob movements are "threat incidents," which are caused by noise-induced vibrations of a loose security knob, i.e., the risk factor. Each jiggle results in either a clockwise or counter-clockwise displacement from the knob's previous position. The bi-directional displacement is a random variable, and the vibration can be modelled as so-called additive white Gaussian noise.6 This scenario is an example of a random walk, alternatively referred to as the "drunkard's walk." In this case, mechanical noise is the source of the random displacements rather than too much alcohol. Some significant quantitative results of random walks are presented in Appendix 1. We now offer a mostly qualitative description of how this stochastic process might be leveraged to assess the magnitude of the likelihood component of risk.

The Central Limit Theorem discussed in Chap. 5 applies to random walks. Let's assume a random walk consists of 100 steps. If the 100-step walk were repeated one million times, the resulting displacements would approximate a Gaussian distribution of probability versus displacement distance. The most likely position after 100 steps and one million iterations is the zero displacement position. Bigger displacements from equilibrium in either the clockwise or counter-clockwise direction are less likely, and therefore appear at the tails of the distribution.
To reiterate, this threat scenario consists of mechanical noise causing perturbations in a control setting in either a clockwise or counter-clockwise direction with equal probability. If one knew a priori that an error source would produce a Gaussian distribution of security control displacements, a risk management strategy could be developed to address an errant security control or some other risk-relevant parameter based on a random walk model.
6 Additive white Gaussian noise has uniform power in the frequency domain and a normal distribution in the time domain.
Such a strategy would leverage the statistical properties of the resulting probability distribution of knob displacements to determine the uncertainty, i.e., the root mean square dispersion about the mean, of the security control setting as a function of time. The security control could be adjusted based on the probability that the knob has drifted outside the limits on tolerance. For example, security control readjustments could be based on estimates of the probability that the security control setting is one, two or three standard deviations from the mean displacement. As noted above, in this model we are assuming a risk factor, i.e., a poorly secured control knob, is a variable subject to random fluctuations. Finally, although this example illustrates how a stochastic process can be leveraged to assess security risk, remediation in this particular threat scenario is potentially much less challenging: merely tighten the knob!
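The random walk described above is easy to simulate. The sketch below (plain Python; the step and trial counts are illustrative choices, not figures from the text) repeats a 100-step symmetric walk many times and confirms the two qualitative claims: the mean displacement is near zero, and the spread grows to roughly √100 = 10 steps, consistent with the Gaussian picture.

```python
import random
random.seed(1)  # fixed seed for a reproducible illustration

def walk(steps=100):
    # Each jiggle displaces the knob one unit clockwise (+1) or
    # counter-clockwise (-1) with equal probability
    return sum(random.choice((-1, 1)) for _ in range(steps))

trials = 20_000
displacements = [walk() for _ in range(trials)]

mean = sum(displacements) / trials
rms = (sum(d * d for d in displacements) / trials) ** 0.5

print(round(mean, 2))  # near 0: zero net displacement is the most likely outcome
print(round(rms, 1))   # near sqrt(100) = 10: the spread after 100 steps
```

The root mean square displacement is what a tolerance-based readjustment policy would monitor: once the knob is likely to have drifted more than, say, two such units, the control is reset.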
8.5 The Probability of Protection7
The probability of protection is yet another example of the statistical nature of security risk. Here it is the uncertainty of risk factors that is exploited to yield estimates of security control effectiveness. In a traditional security risk assessment, the magnitude of the vulnerability component of risk is assessed directly to yield a figure for loss. Such estimates rely on a model for how vulnerability scales with changes in one or more risk factors. Such models relate a risk factor to the magnitude of vulnerability. Of course, a significant issue in assessing risk for a given threat scenario is the magnitude of the risk factor. For example, it is well known that damage or loss due to explosives is a function of the explosive payload, i.e., a risk factor for explosive threat scenarios. However, it is impossible to know a precise figure in advance of an incident. Therefore, we invoke the power of stochastic processes and assume a normal distribution of payload values, where the mean and variance of the distribution are based on scenario-specific estimates. As we might guess from the previous discussion, there are two prerequisites for using the probability of protection method: (1) a model for the vulnerability component of risk, and (2) a probability distribution of vulnerability risk factor(s) values. The probability of protection transforms a model for the vulnerability component of risk into an assessment of the likelihood of security control effectiveness. The first step in the probability of protection process is to assume a risk factor for the vulnerability component of risk is a random variable.8 An assumed probability
7 Chang, D. & Young, C. (2010). Probabilistic Estimates of the Vulnerability to Explosive Overpressures and Impulses, The Journal of Physical Security, Volume 4, Issue 2, 10–29.
8 An assumption of normality is not required. However, absent other information regarding the risk factor in question, the normal distribution is considered the most generally applicable.
distribution of risk factor values is plugged into a model for the vulnerability component of risk. The result is a new distribution of values that can be compared to security control performance specifications. As noted above, scenario-specific conditions are used to define the mean and variance of the distribution.

Consider a threat scenario consisting of a vehicle-borne explosive threat and a target building as the affected entity. The security control is a series of bollards placed circumferentially around the building perimeter at some radius to enforce a minimum standoff distance. As discussed in Chap. 7, the two physical parameters that cause damage in explosive scenarios are the explosive payload, often specified in units of lbs-TNT, and the distance between the explosive payload and the target.

Bollards are rated according to the maximum absorbed kinetic energy upon impact, which is often specified in megajoules (MJ). The U.S. government has established a bollard performance specification corresponding to zero penetration of the bollard line by a 15,000 lb. (6804 kg) vehicle moving at three speeds9:

• K4 = 30 mph (48 kph)
• K8 = 40 mph (64 kph)
• K12 = 50 mph (80 kph)

Since the kinetic energy of the vehicle relative to the bollard energy rating determines the vulnerability to bollard penetration, vehicle speed is clearly a risk factor for this threat scenario. Recall the kinetic energy of a moving object equals (1/2)mv², where m is the object's mass and v is its velocity on impact. Importantly, we see immediately that kinetic energy scales linearly with vehicle mass and quadratically with velocity. Of course, in any particular threat scenario it is impossible to know an approaching vehicle's speed a priori. However, a combination of scenario-specific features and the laws of physics can be used to specify reasonable limits on vehicular speed.
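The quadratic scaling of kinetic energy with speed is worth seeing numerically. The following sketch computes (1/2)mv² for the 6804 kg test vehicle at the three rated speeds; the resulting megajoule figures are our back-of-the-envelope illustration of the scaling, not the official certification energies for each K rating.

```python
# Kinetic energy KE = (1/2) m v^2 for the 15,000 lb (6804 kg) test vehicle
m_kg = 6804.0
ratings = {"K4": 48.0, "K8": 64.0, "K12": 80.0}  # rated impact speeds (kph)

energies = {}
for name, kph in ratings.items():
    v = kph / 3.6                              # convert kph to m/s
    energies[name] = 0.5 * m_kg * v**2 / 1e6   # joules to megajoules
    print(name, round(energies[name], 2))      # K4 0.6, K8 1.08, K12 1.68 MJ
```

Note that raising the impact speed from 30 to 50 mph, a factor of 5/3, nearly triples the energy the bollard must absorb, which is exactly the quadratic dependence highlighted in the text.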
Furthermore, we assume that vehicle impact speeds are normally distributed with a mean value of 40 mph and a standard deviation of 10 mph. We also assume a vehicle weight of 15,000 lbs. The question we wish to answer is, “What is the probability that the security control will provide adequate protection for the vehicle-borne explosive threat scenario?” In other words, against what fraction of threat scenarios will a bollard with a particular energy rating be effective? The properties of a normal distribution as discussed in Chap. 5 can be used to answer this question and thereby determine the probability of protection. To reiterate, the assumption is that the vehicle speed upon impact is normally distributed with the properties of the distribution dictated by scenario-specific features. The vehicle kinetic energy is the key parameter of interest in this threat scenario. This problem is mathematically equivalent to the probability that a specific value in a normal distribution is greater than the remainder of the distribution, i.e., a left-
9 Perimeter Security Design; https://www.fema.gov/media-library-data/20130726-1624-204900371/430_ch4.pdf
tailed distribution. This would be determined by integrating the normal distribution from minus infinity to the distribution value of interest. In this instance, the values of interest are the three kinetic energies corresponding to the three vehicular speeds: 30, 40 and 50 mph. Recall we fixed the vehicle weight at 15,000 lbs., so these speeds determine the vehicle's kinetic energy.

We first examine the probability of protection provided by K12 bollards. In general, a continuous distribution of vehicle speeds would be used to determine the probability rather than the three discrete values noted above. To calculate the fraction of scenarios parameterized in terms of a standard normal distribution we would calculate the relevant z-statistic:

Z = (X − μ)/σ = (50 − 40)/10 = 1

We therefore look up values for Z(1) to determine the probabilities of protection. In this threat scenario we conveniently simplified the problem by assuming the risk factor, i.e., vehicle speed, is a normally distributed random variable with continuous values from minus infinity to plus infinity. The mean of the distribution is 40 mph and the standard deviation is 10 mph. In general, a risk factor is limited by scenario-specific conditions, which have defined limits. Therefore, a truncated probability distribution is required to realistically characterize the spectrum of risk factor values, which must be normalized so that the distribution values sum to one.

Performing the integration of the standard normal distribution (or using an on-line calculator) with the aforementioned values for the mean and standard deviation yields a probability of 0.841. In other words, the probability that a K12 bollard will prevent vehicular penetration for this threat scenario is about 84%, i.e., about 84% of the threat scenarios are protected when using K12 bollards.
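The same table lookup can be scripted for all three bollard ratings at once. This sketch assumes, as in the text, impact speeds normally distributed with a mean of 40 mph and a standard deviation of 10 mph; the probability of protection is the probability that the impact speed falls at or below the rated speed.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 40.0, 10.0  # assumed mean and std of vehicle impact speed (mph)

p = {}
for name, rated_mph in (("K4", 30.0), ("K8", 40.0), ("K12", 50.0)):
    p[name] = phi((rated_mph - mu) / sigma)  # fraction of scenarios protected
    print(name, round(p[name], 3))           # K4 0.159, K8 0.5, K12 0.841
```

The three probabilities correspond to z-values of −1, 0 and +1 respectively, which is why the K12 figure matches the 0.8413 theft calculation earlier in the chapter.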
Conversely, approximately 16% of all realistic threat scenarios will not be protected using a bollard with this performance specification. Figure 8.6 shows the fraction of threat scenarios protected by K12 bollards, where threat scenarios defined by a distribution of vehicle speeds are distributed according to a standard normal distribution.
[Fig. 8.6 Fraction of threat scenarios protected by K12 bollards: standard normal density f(x) plotted for x from −4 to 4]
Table 8.1 Bollard return on investment

Ratio of bollard types   Relative cost   Relative protection   Cost-protection ratio
K12/K8                         2               1.68                  1.19
K12/K4                         4               5.25                  0.76
K8/K4                          2               3.13                  0.64
The utility of this technique is that it enables quantitative estimates of the effectiveness of security controls. Another benefit is that a comparison of the return on investment for security controls can be calculated. For example, performing the same calculation for bollards rated to withstand impacts of 40 mph (K8) and 30 mph (K4) yields probabilities of protection of 0.500 and 0.159, respectively.

Suppose the price of bollards is $25,000, $50,000 and $100,000 for K4, K8 and K12-rated bollards, respectively. We can compare the cost of each bollard type to its probability of protection, and thereby evaluate the relative returns on investment. For example, the ratio of the cost of K12 and K4 bollards is 4.0. The ratio of their probabilities of protection is 0.84/0.16, or 5.25. So although the price of K12 bollards is four times that of the K4 bollards, they are 5.25 times more likely to provide sufficient protection. In other words, K12 bollards protect 5.25 times the number of threat scenarios that K4 bollards do. Therefore, the K12 would appear to be a good deal in spite of the significantly higher price tag. Table 8.1 compares the cost of each bollard type to its probability of protection, and thereby reveals the relative returns on investment.

We see from Table 8.1 that both K12 and K8 bollards are a good deal relative to K4 bollards since the relative cost is less than the relative protection, i.e., a lower cost-protection ratio. However, K12 bollards are less cost-effective than K8 bollards. The best deal is obtained when choosing between K8 and K4 bollards since the cost-protection ratio is lowest (0.64). Note that a cost-benefit analysis is a useful adjunct to any security strategy. However, it becomes less relevant if a specific level of protection is required for a given threat scenario. Finally, the bollard threat scenario is a particularly simple one since only one risk factor is evaluated.
The probability of protection calculation becomes more difficult when two or more risk factors must be considered.
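The return-on-investment comparison in Table 8.1 is a few lines of arithmetic. The sketch below uses the probabilities of protection rounded as in the text (0.84, 0.50, 0.16) together with the assumed bollard prices; the computed ratios match Table 8.1 to within rounding.

```python
# Probabilities of protection (rounded as in the text) and assumed prices
p = {"K4": 0.16, "K8": 0.50, "K12": 0.84}
cost = {"K4": 25_000, "K8": 50_000, "K12": 100_000}

ratios = {}
for hi, lo in (("K12", "K8"), ("K12", "K4"), ("K8", "K4")):
    rel_cost = cost[hi] / cost[lo]
    rel_protection = p[hi] / p[lo]
    ratios[f"{hi}/{lo}"] = rel_cost / rel_protection  # lower is a better deal
    print(f"{hi}/{lo}", rel_cost, round(rel_protection, 2),
          round(ratios[f"{hi}/{lo}"], 2))
```

As in the text, the lowest cost-protection ratio (K8/K4) identifies the best marginal deal, even though K12 bollards protect the largest fraction of scenarios outright.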
8.6 The Markov Process10
Security risk assessments can sometimes be an exercise in extracting signal from noise. That is, despite a plethora of data, or precisely because of an abundance of data, identifying a particularly risk-relevant datum can be a challenge. This condition
10 Jean Gobin contributed significantly to this section.
is common in IT environments where various logs and other sources of risk-relevant data identify many potential security-related events. Network intrusion detection devices attempt to overcome this problem by identifying deviations from "normal" behavior rather than applying filters based on fixed rules.

A stochastic method that has many applications to science and engineering can be used to identify such deviations. The method is known as a Markov process, and it was named after the mathematician who developed the theory.11 At a high level, the process is based on calculations of the probability of a transition between states of a system. Specifically, data regarding the states of a security-related process or system are used to calculate transition probabilities, i.e., the probability that a specific state of the system will follow from some previous state.

As a simple example, if a system is observed to transition from state A to state B ten times, state B to state C twenty times, state C to state D thirty times and state A to state A (no transition) fifteen times, the following are the transition probabilities for this process:

p(A→B) = 10/75 ≈ 0.13
p(B→C) = 20/75 ≈ 0.27
p(C→D) = 30/75 = 0.40
p(A→A) = 15/75 = 0.20

If this data were applied to a threat scenario, transition probabilities outside an expected range for each transition state might constitute anomalous behavior and as a result trigger further investigation. This example is admittedly very general, so a Markov process is next applied to a more specific threat scenario. The overarching assumption in applying this technique to threat scenarios is that an anomalous pattern suggests suspicious behavior.

Unexpected changes in IT network communication patterns are sometimes indicative of risk-relevant behavior. Anomalous outbound network connections to countries known to engage in hacking and information compromise via malware can be examples of such behavior.
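The simple A/B/C/D example above can be reproduced by counting observed transitions and dividing by the total, which is how the text normalizes (by all 75 observations rather than per source state, as a formal Markov chain would):

```python
from collections import Counter

# Observed transitions, encoded as (from, to) pairs with the counts above
observed = ["AB"] * 10 + ["BC"] * 20 + ["CD"] * 30 + ["AA"] * 15
counts = Counter(observed)
total = sum(counts.values())  # 75 observations in all

probs = {transition: k / total for transition, k in counts.items()}
for transition, p in probs.items():
    print(transition, round(p, 2))  # AB 0.13, BC 0.27, CD 0.4, AA 0.2
```

In a production anomaly detector, each probability would be compared against an expected range learned from baseline traffic, with out-of-range transitions flagged for investigation.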
These connections are typically recorded in firewall logs. A common modus operandi of malware is to remotely control a compromised host within an IT network. The communication between the compromised host and the remote command and control device can appear as periodic outbound connections. A sophisticated attacker would not be obvious about establishing such connections, which would be hidden among thousands if not millions of legitimate connections. Intra-day communications to and from any given IT network will likely follow established patterns. For example, outbound data packets to a specific IP address or range of IP addresses might peak at 11 AM every business day or exhibit multiple peaks throughout the day. After regular business hours, the number of transmitted packets would typically decline, and the fall-off would be expected to follow a
11 Andrey Andreyevich Markov, Russian mathematician, 1856–1922.
consistent pattern. Deviations from this pattern might be indicative of an attempt by an external site to control a compromised machine inside the network. At a minimum it would represent an event worthy of further investigation. Another risk factor for malware is the destination of outbound packets. Therefore, an additional investigative criterion might include outbound connections to IP addresses physically located in high-risk countries. While some outbound communications to these countries would be typical, the pattern of activity should occur within defined ranges. Deviations from these ranges would be designated as anomalies. For example, an idealized firewall log measuring outbound activity from an internal source IP address to China, Russia, Uzbekistan etc., might reveal attempts at discreet communication, where each log record is unique with respect to both time and host activity. The data could be organized into time interval windows to facilitate the analysis. Specifically, all connections to high-risk destinations made between 4:31 PM and 5:30 PM on Tuesdays could be associated with a 5:00 PM time window. A single value corresponding to the number of high-risk outbound connections occurring within that window would be specified. An organization would accumulate multiple data points within each time window over some time interval. Each time window could be compared to other time windows. The result is a probability distribution of log activity corresponding to the fraction of outbound, high-risk connections per window. An example is shown in Fig. 8.7, which consists of two peaks corresponding to two distinct log entries and their associated probabilities.
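The time-window bucketing described above is mechanical to implement. In this sketch the log timestamps are hypothetical, and the windowing rule follows the text: connections between 4:31 PM and 5:30 PM map to the 5:00 PM window.

```python
from collections import Counter
from datetime import datetime

# Hypothetical firewall-log timestamps for outbound connections
# to high-risk destinations on a single Tuesday
timestamps = [
    datetime(2020, 1, 7, 16, 45),
    datetime(2020, 1, 7, 17, 5),
    datetime(2020, 1, 7, 17, 25),
    datetime(2020, 1, 7, 17, 40),
]

def window(ts):
    # Minutes 31-60 roll forward, so hh:31 through (hh+1):30
    # all land in the (hh+1):00 window
    hour = ts.hour + (1 if ts.minute >= 31 else 0)
    return f"{hour}:00"

counts = Counter(window(ts) for ts in timestamps)
print(counts)  # three connections in the 17:00 window, one in 18:00
```

Accumulating such per-window counts over many weeks yields exactly the kind of baseline probability distribution depicted in Fig. 8.7.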
[Fig. 8.7 Idealized baseline firewall log probability distribution of outbound connections to high-risk countries (Tuesday 5:00 PM): probability vs. number of connections from 0 to 600, with peaks at 100 connections (probability 0.75) and 500 connections (probability 0.25)]
Table 8.2 Firewall log activity

Tuesday 5:00 PM          Week 1   Week 2
Firewall log activity       107      380
During the 5:00 PM Tuesday window, an organization would expect 100 log entries with a probability of 0.75, and 500 log entries with a probability of 0.25. Therefore, a high probability associated with 1000 high-risk connections in this time window would be indicative of risk-relevant behavior. The data suggest it is worth examining the most recent logs for the same time window and measuring whether the outbound traffic volume fits within some defined interval above and below 100 and 500 log entries, e.g., within 10%.

Let's examine another scenario. Table 8.2 shows the results of 2 weeks of firewall log monitoring during the Tuesday 5:00 PM time window. If the week 1 results represent the baseline, the results from week 2 reveal a 255% increase in outbound connections. If a traffic volume exceeding 10% of the baseline is assumed to be risk-relevant, these results would be indicative of anomalous activity given the magnitude of the discrepancy. If the change in the number of outbound connections from week 1 to week 2 is deemed acceptable, e.g., within 10% of 100 incidents, the data would be included in a recalculation of the baseline. However, if the number of connections from week 1 to week 2 falls outside the agreed interval, an organization might ultimately block specific IP addresses at the firewall depending on the outcome of a follow-up investigation.

We now apply the Markov process to examine transitions in the number of outbound connections to high-risk countries for consecutive time windows. The probability of specific transitions across relevant time windows can be calculated pursuant to identifying suspicious behavior. For example, suppose two network traffic volume entries are recorded during the 4:00 PM window and another two at 5:00 PM. The calculated probabilities for traffic volumes during the 4:00 PM and 5:00 PM time windows are listed in Table 8.3. Note that the data in Table 8.3 is similar to the data in Fig.
8.7 except that the probability distribution in the former includes connections in two time windows: 4 PM and 5 PM. The relevant question becomes, "What is the probability of observing a specific number of connections at 5:00 PM on Tuesday given a specific number of connections at 4:00 PM on the same day?" Table 8.4 provides the answer.

Table 8.4 shows there is a reasonable probability of registering either 100 or 500 outbound connections at 5:00 PM following the appearance of 200 connections at 4:00 PM. However, the probability that the system registers 100 high-risk outbound connections at 5:00 PM following 1000 connections at 4:00 PM is only 0.10. The foregoing analysis reveals unusual transitional behavior. Note that each time window never actually registers a specific connection volume above the acceptable range as specified in Table 8.4. However, the 4 PM to 5 PM transitional behavior showing 1000 to 100 high-risk outbound connections might be flagged
Table 8.3 Probabilities of high-risk outbound firewall log activity for consecutive time windows

Time      Traffic volume   Probability
4:00 PM        200             0.65
4:00 PM       1000             0.35
5:00 PM        100             0.75
5:00 PM        500             0.25
Table 8.4 Transition probabilities between 4:00 PM and 5:00 PM for high-risk outbound network connections

Transition probabilities   4:00 PM = 200   4:00 PM = 1000
5:00 PM = 100                   0.90            0.10
5:00 PM = 500                   0.50            0.50
for follow-up due to the relative rarity of this connection history. Note also that the absolute number of connections is not of interest. Rather, it is the probability of a particular pattern that is risk-relevant.
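The week-over-week baseline comparison for Table 8.2 reduces to a percentage change and a threshold test. In this sketch the 10% tolerance is the assumed risk-relevance threshold from the text:

```python
baseline, observed = 107, 380  # Tuesday 5:00 PM window, weeks 1 and 2 (Table 8.2)
tolerance = 0.10               # assumed risk-relevance threshold

increase = (observed - baseline) / baseline
anomalous = abs(increase) > tolerance

print(f"{increase:.0%}")  # 255% increase over the baseline
print(anomalous)          # True: flag the window for follow-up investigation
```

If the change had fallen within the tolerance, the week 2 figure would instead be folded into a recalculated baseline, as the text describes.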
8.7 Time-Correlation Functions and Threat Scenario Stability12,13
We know from Chap. 3 that threat scenarios where a threat incident is a random variable are classified as random threat scenarios. We also now know that the similarity of two threat scenarios depends on the similarity of their respective risk factors. As discussed at the beginning of this chapter, the effect of multiple threat scenario risk factors can yield a threat incident that is a random variable.

A critical question that arises in connection with estimates of likelihood is the similarity of threat incidents as the threat scenario evolves. If threat scenarios are identical, i.e., possess identical risk factors, it implies threat incidents derived from those scenarios are similar and therefore comparable via the same probability distribution. A threat scenario that morphs over time so that it can no longer be considered the same threat scenario, i.e., its risk factors have significantly changed, cannot be so evaluated. Therefore, the time evolution of risk factors is a measure of threat scenario stability and has implications for assessments of the likelihood component of risk. In other words, understanding risk factor behavior as a function of time provides insight into the evolution of the threat scenario itself. To do so we apply the time-
12 Quantum Chemistry course notes (5.74), MIT Department of Chemistry, 2009.
13 F. Reif, op. cit.
correlation function to threat scenarios, which entails assuming a risk factor is a variable subject to random fluctuations. This dynamic model of threat scenarios is borrowed from the physical world, where the correlation-time function is used to probe physical systems. However, threat incidents are not particles, and a threat scenario is in general not a physical system in equilibrium. Therefore, this treatment is at best an approximation. In that vein, we present a highly theoretical analogy borrowed from the physical world that is intended to provide a gross estimate of the time evolution of risk factors. In so doing, we hope to gain insight into the temporal history of the threat scenario itself, and the effect of that history on estimates of the likelihood component of risk. As with a number of examples in this chapter, the discussion illustrates the potential insights derived from assumptions of randomness but care must be exercised to not interpret these results too literally. In addition, the potential applicability of this method begs the question of how to actually measure the time-correlation function. If it were possible to determine risk factor time evolution via autocorrelation, it is likely that other, perhaps simpler measurements would yield equally insightful results. The crux of the problem in assessing the likelihood component of risk is an absence of data that can definitively correlate the effect of a particular risk factor with a specific threat scenario outcome. In this model, risk factors are assumed to fluctuate about some average value. In Chap. 5, we introduced the correlation function. An important measurement of such a relationship is the correlation function for two different variables, i.e., the covariance of I1 and I2: CI1I2 ¼< I1 I2 > < I1 >< I2 >
(8.7)
In Chap. 6, we explored additional statistical measurements, to include autocorrelation, which is the covariance of the same variable measured at two different instants of time. We can potentially leverage autocorrelation to assess how rapidly a variable subject to random fluctuations becomes statistically independent of its originally measured value. This measurement computes values of autocorrelation at sequential time intervals. Specifically, we compare the value of a variable I at time t with its value at a later time t'. A time-correlation function is a time-dependent quantity I(t), which is multiplied by the same quantity measured at some later time t', and averaged over an ensemble of values. We write the time-correlation function as follows:
CII(t, t') = <I(t)I(t')>
(8.8)
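Since (8.7) and (8.8) are simple averages over samples, they are easy to sketch numerically. The following Python snippet is illustrative only — the Gaussian variables and sample size are assumptions, not taken from the text — and estimates the covariance of two independent fluctuating variables, which should be near zero:

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def covariance(i1, i2):
    # C_I1I2 = <I1 I2> - <I1><I2>, per (8.7)
    return mean([a * b for a, b in zip(i1, i2)]) - mean(i1) * mean(i2)

random.seed(1)
# Two independent fluctuating variables: their covariance should be near zero.
i1 = [random.gauss(0, 1) for _ in range(100_000)]
i2 = [random.gauss(0, 1) for _ in range(100_000)]
print(abs(covariance(i1, i2)) < 0.05)        # True

# A variable paired with itself: the covariance reduces to the variance (~1 here).
print(abs(covariance(i1, i1) - 1.0) < 0.05)  # True
```

Replacing i2 with a time-shifted copy of i1 turns this same calculation into the autocorrelation of (8.8).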
We next provide a qualitative explanation of time and ensemble averages. The magnitude of the fluctuations of a single variable can be measured as a function of time. We can measure its average value over some time interval. Next, consider an ensemble of fluctuating variables. These measurements can also yield an average if we measure the magnitude of each variable at a particular instant of time. Appendix 2 provides mathematical definitions of both time and ensemble averages. For those disinclined to tackle the details, Fig. 8.8 graphically illustrates the difference between time and ensemble averages.
8.7 Time-Correlation Functions and Threat Scenario Stability
Fig. 8.8 Time and ensemble averages
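The distinction illustrated in Fig. 8.8 can also be sketched numerically. In the Python sketch below, the mean of 5.0 and the Gaussian fluctuations are arbitrary assumptions; for a process of this kind, the time average of one realization and the ensemble average at a single instant agree:

```python
import random

random.seed(2)

def sample_path(n):
    # One realization of a risk factor fluctuating about a mean of 5.0
    return [5.0 + random.gauss(0, 1) for _ in range(n)]

# Time average: a single variable followed over many instants.
path = sample_path(50_000)
time_avg = sum(path) / len(path)

# Ensemble average: many variables, each measured at one instant.
ensemble = [sample_path(1)[0] for _ in range(50_000)]
ensemble_avg = sum(ensemble) / len(ensemble)

# For an ergodic process the two averages agree.
print(abs(time_avg - ensemble_avg) < 0.05)  # True
```

This agreement between the two averages is precisely the ergodic condition discussed in the text.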
In applying the time-correlation function to threat scenarios, we must make an important assumption: namely, that the time average and the ensemble average are identical. A stochastic process where this condition holds is called ergodic. The implication for threat scenarios is that the time history and associated statistics of a single, randomly fluctuating risk factor are representative of the entire process, i.e., an ensemble of risk factors in a threat scenario.
For example, suppose the collective password strength of user passwords in an IT environment fluctuates in time.14 If all individuals are subject to the same password complexity and change management requirements, the time average of the complexity of a single user's password would mirror the average of all users' passwords measured at any particular time t. If different segments of the user population were subject to different complexity requirements, the ergodic condition would not apply.
Let's investigate this scenario more carefully. Password strength might be measured in terms of the information entropy, a concept that is important in the model for threat scenario complexity discussed in Chap. 9. Let's say we measure the strength S(t) of the password of a standard Windows user. The time average of the complexity of the particular password selected would vary in accordance with the enforced password policy for all Windows users. However, the statistical average measured at some time t across all users, e.g., local administrators, system administrators, Windows users, would not have the same value as the time average for the selected account. The statistical average will depend on the strength of the passwords for the other account types as dictated by policy. Therefore, this process is not ergodic as defined above.15

14. See the justification for this view of passwords later in this section.
15. Although the process is not ergodic it is stationary, since S(t) will be constant over some time scale determined by the password policy.
Fig. 8.9 Probability distribution P(I) of fluctuations of I(t) about <I>
If a variable I(t) fluctuates about its mean, i.e., <I>, a probability distribution of values about the mean can be generated. Per the definition of variance and standard deviation, the uncertainty in the mean is given by the square root of the variance, i.e., the standard deviation σ. Figure 8.9 is an indicative time history of a random variable I(t), which fluctuates about <I> and yields a probability distribution of values P(I).
The objective in applying the time-correlation function to threat scenarios is to determine whether a statistical relationship exists between subsequent values of I(t). If subsequent values of I become uncorrelated over some time scale, this condition would be indicative of dissimilar threat scenarios. Moreover, if one could determine the time scale for de-correlation, it would reveal the statistical longevity of a threat scenario, i.e., the time scale over which a threat scenario measured at time t is similar to itself measured at a later time t + Δt. A characteristic time constant associated with these fluctuations would reveal the time scale of threat scenario stability.
Furthermore, the time-correlation function of a risk factor might reveal a risk-relevant (and time-dependent) feature of the underlying process affecting I(t). The underlying processes relate to the threat scenario risk factors. A correlation function equals the average of the product of two variables minus the product of their averages, i.e., the covariance <I1I2> − <I1><I2>. The normalized time-correlation function, CII(t, t'), is defined as follows:
CII(t, t') = [<I(t)I(t')> − <I(t)><I(t')>] / <I(t)I(t)>
(8.9)
We see that (8.9) is a correlation function that measures the covariance of the same variable measured at different times, and its value varies between 0 and 1. If <I(t)I(t')> − <I(t)><I(t')> = 0, the two variables are uncorrelated. If two variables are statistically independent they must be uncorrelated. However, and as noted previously, the converse statement is not true; uncorrelated variables are not necessarily statistically independent. In expression (8.9), the two variables are identical but are evaluated at subsequent times. If we compare the value of I(t) with its initial value I(0) for sufficiently short times, CII will likely be non-zero. However, at some time t, CII will equal zero due to the
effect of some fluctuating influence that results in incoherence. At that time, I(t) and I(0) are statistically independent, and therefore are uncorrelated.
In a physical system, the time-correlation function describes how long a given property of a system persists until it is averaged out by fluctuations or interactions with the environment. It describes the condition in which a statistical relationship between two variables has disappeared. We have modeled a threat scenario as an ensemble of risk factors that change with time and devolve with increasing incoherence over some time scale, with attendant implications for the threat scenario and the likelihood component of risk. It is this time scale of risk factor changes that is risk-relevant, and which we seek to identify through the time-correlation function, with the appropriate caveats noted previously.
Information about an underlying dynamic process in a threat scenario is contained in the time decay of CII(t, t'). We assume the starting time is arbitrary, so the averaging can begin at any time t without loss of generality.
CII(t, t') = [<I(0)I(0 + t)> − <I(0)><I(0 + t)>] / <I(0)I(0)>
(8.10)
The normalized value decays from the maximum value of one to some lower value between zero and one. At some time t', CII(t, t') becomes zero. If the normalized function CII(t, t') is unity, it means there is perfect correlation of I(t) with itself, which typically occurs at t = 0.
To reiterate, threat scenario fluctuations cause I(0) to become increasingly uncorrelated with future values of I(t). At some later time, the covariance <I(0)I(t)> − <I(0)><I(t)> decays to zero, which implies I(0) and I(t) are statistically independent, i.e., <I(0)I(t)> = <I(0)><I(t)>. Therefore, (8.10) specifies the time scale for I(t) to become statistically independent of its initial value I(0). The short-time value is proportional to <I(0)I(0)> = <I²>, whereas the asymptotic, long-time value is proportional to <I(0)><I(t)> = <I>².
The normalized time-correlation function characterizing a variable subject to random fluctuations is shown in Fig. 8.10. The actual rate of decay from its initially measured value I(0) to its uncorrelated value at I(t) will depend on the rate of fluctuation. The more frequent the fluctuations, the faster the decay of CII(t).
Fig. 8.10 Normalized time-correlation function for a variable subject to random fluctuations
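This decay can be made concrete with a simulation. The sketch below is illustrative: the AR(1) process with memory coefficient 0.9 is my assumption, chosen only because it produces fluctuations that gradually "forget" their initial value, and the estimator follows the normalized form of (8.10):

```python
import random

def normalized_autocorr(xs, lag):
    # [<I(0)I(0+t)> - <I(0)><I(0+t)>] / <I(0)I(0)>, per (8.10),
    # estimated by averaging over the series (stationarity assumed).
    n = len(xs) - lag
    a, b = xs[:n], xs[lag:]
    mab = sum(x * y for x, y in zip(a, b)) / n
    ma, mb = sum(a) / n, sum(b) / n
    m00 = sum(x * x for x in a) / n
    return (mab - ma * mb) / m00

random.seed(3)
# An AR(1) process: each value partly remembers its predecessor, so the
# correlation decays with lag instead of vanishing immediately.
xs = [0.0]
for _ in range(200_000):
    xs.append(0.9 * xs[-1] + random.gauss(0, 1))

print(normalized_autocorr(xs, 1) > 0.8)         # strong correlation at short lags
print(abs(normalized_autocorr(xs, 50)) < 0.05)  # near-independence at long lags
```

The lag at which the estimate falls toward zero plays the role of the de-correlation time scale discussed in the text.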
The time scale for the decay process is denoted as the correlation time tc. It characterizes the time it takes for CII to decay from its maximum normalized value of one toward zero. The time-correlation function shown in Fig. 8.10 exhibits exponential decay and is written as follows:
CII(t) = exp(−t/tc) = e^(−t/tc)
(8.11)
When t = tc, this value of t is the so-called e-fold. Integer multiples of t/tc correspond to the number of e-folds, and this representation is indicative of the rate of exponential decay. For example, if t/tc = 1, CII(t) = e^(−1) ≈ 0.37. If t/tc = 2, CII(t) = e^(−2) ≈ 0.14. It is clear that CII(t) decreases with an increasing number of e-folds.
Although the time-correlation function might seem abstract, an example will show it has at least a modicum of applicability to the real world. Consider an IT environment governed by a password policy with a complexity requirement and an expiration time. In other words, passwords must conform to minimum length and diversity requirements as well as be changed at a prescribed time interval, e.g., every 3 months, 6 months, 12 months, etc. Such a policy might be enforced or voluntary.
The concept of information entropy will be discussed in Chap. 9 in connection with threat scenario complexity. For now, suffice it to say that information entropy is a measure of diversity and hence of the inherent uncertainty associated with the possible outcomes of a process. Uncertainty associated with the process of password selection translates into immunity from guessing. In this context, the make-up of a specific password would be considered a risk factor for the threat of information compromise. Note that other threat scenario risk factors might include the existing password policy, the tolerance for inconvenience, the vulnerability of the file containing stored hash values, etc.
The password construction as determined by the user is a factor in the success of any brute force attack. Therefore, and perhaps bizarrely, since the user determines the quality of the security control, he or she is unwittingly complicit in any threat incident involving the compromise of passwords. Assume there is only one password complexity requirement for an environment.
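Returning briefly to the e-folds discussed above, the quoted values follow directly from (8.11) and can be checked in a few lines (a trivial sketch; tc = 1 is arbitrary):

```python
import math

def c_ii(t, t_c):
    # C_II(t) = exp(-t / t_c): each e-fold (t = t_c) shrinks the
    # correlation by a factor of e.
    return math.exp(-t / t_c)

print(round(c_ii(0.0, 1.0), 2))  # 1.0  -- perfect correlation at t = 0
print(round(c_ii(1.0, 1.0), 2))  # 0.37 -- one e-fold
print(round(c_ii(2.0, 1.0), 2))  # 0.14 -- two e-folds
```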
For passwords generated by a process that randomly selects a string of symbols of length L from a set of N possible symbols, the number of possible passwords can be found by raising the number of symbols to the power L, i.e., N^L. Increasing either L or N will strengthen the password's immunity to guessing, although in Chap. 12 it will be shown that L has a more profound effect than N on password strength. In other words, length matters in password strength. The strength of a randomly generated password as measured by the information entropy is given by H = log2(N^L) = L log2(N), where N is the number of possible symbols and L is the number of symbols in the password. H is measured in information bits.16 One could assign a value to the average strength of passwords at some time t, based on the entropy.
16. https://en.wikipedia.org/wiki/Password_strength
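The entropy formula H = log2(N^L) is straightforward to evaluate. In the minimal sketch below, the symbol-set sizes are illustrative assumptions (94 approximates the printable ASCII characters, excluding the space):

```python
import math

def password_entropy_bits(num_symbols, length):
    # H = log2(N^L) = L * log2(N) for randomly generated passwords
    return length * math.log2(num_symbols)

# 8 characters drawn from 94 possible symbols:
print(round(password_entropy_bits(94, 8), 1))   # 52.4

# Length matters more than symbol-set size: doubling L doubles H,
# while doubling N adds only one bit per character.
print(round(password_entropy_bits(94, 16), 1))  # 104.9
print(round(password_entropy_bits(188, 8), 1))  # 60.4
```

This illustrates the claim, elaborated in Chap. 12, that L has a more profound effect than N on password strength.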
We can consider weak password entropy to be a threat scenario risk factor. One could (and should) test password strength by conducting a password cracking exercise, and thereby measure the average password entropy at various time intervals. Specifically, the entropy of all passwords could be measured at some arbitrary start time t = 0, i.e., H(0). H could subsequently be measured over the time interval determined by mandatory password changes. Let's assume the password policy dictates that passwords are changed every six months, and that the policy applies to everyone in the user community. The relevant time-correlation function is <H(0)H(t)>.
At some arbitrarily selected initial time t = 0, all passwords are perfectly correlated with themselves. As time progresses, the time-correlation function should decrease as individuals across the enterprise change their passwords. Not all users will choose passwords of the same complexity. Therefore, the entropy as measured across the community will change with time, which will affect the time-correlation function. Moreover, password changes for the user population would likely occur at random during any six-month cycle. Furthermore, it should not matter when the six-month cycle begins, and no password should remain unchanged after six months.
Although there would not be perfect correlation of password entropy over the six-month cycle, since users are hopefully continuously changing passwords, the deviation from perfect correlation would be confined to a range in accordance with the existing password complexity requirement. Therefore, all passwords should be characterized by a minimum value of entropy, and the time-correlation function of the entropy should fluctuate within the limits defined by policy.
If this condition is not true, i.e., the entropy does not significantly change over the prescribed cycle as measured by the time-correlation function, it is evidence that the password policy is not being obeyed across the community of computer users. Note that an exponential decay in the time-correlation function would also occur if the average password entropy increased over time. The time-correlation function measures the statistical correlation of a random variable with itself but displaced in time. That correlation would be affected by any macroscopic password complexity change including those deemed salutary from a security perspective.
8.8 The Convergence of Probability and Potential
An assumption that a threat incident is a random variable can help in exploring the disjuncture between probability and potential. In fact, if the randomness condition is assumed to exist, potential and probability converge to unity as the number of threat incidents becomes extremely large. We next examine a specific threat scenario to demonstrate why and when this condition occurs. Consider a threat scenario where the threat is attempted theft from within a facility where such attempts are facilitated by unauthorized access. We intuitively
believe that the more times individuals without access privileges are afforded access to this facility, the more likely one of them will exploit that access for illicit purposes. The objective is to confirm or reject our intuition. To do so we must make several significant assumptions, as follows.
An implicit assumption is that individuals with access privileges have somehow been vetted using risk-relevant criteria, e.g., no prior criminal activity, employer references, and that such individuals possess a genuine need for such privileges. These individuals have therefore earned that privilege and, as a result of proper vetting, have been designated as trustworthy. We also assume unauthorized access to the facility is a random variable with two possible outcomes: an attempted theft or no attempted theft.
The analogous process is a coin toss, where the outcome is either a heads or a tails with equal probability. One might ask the probability of some number of heads (or tails) appearing for a specific number of tosses. Analogously, we might ask about the probability of an attempted theft resulting from some number of unauthorized building access incidents. For example, what is the probability of no attempted thefts if there have been 100 recorded instances of unauthorized access?
Let the probability of an attempted theft during an unauthorized access event be designated as p. The probability that a theft was not attempted during an unauthorized access event must equal 1 − p since this is a binary process. We assume each instance of unauthorized access is independent. If p = (1 − p) = q, the attempted theft process is identical to a fair coin toss. In that case, the probability of an attempted theft during an unauthorized access event equals the probability of no attempted theft. The spectrum of outcomes resulting from multiple unauthorized access events reflects the potential for attempted thefts versus no attempted thefts, and ultimately the likelihood of an actual theft.
However, if p > q or p < q, the process is equivalent to tossing a biased coin. Specifically, let's assume p > q, i.e., the probability of an attempted theft during an unauthorized access event is greater than the probability of no attempted theft. Furthermore, assume there are n unauthorized access events. The probability of experiencing exactly n attempted thefts or exactly n non-attempts decreases exponentially with increasing n. In both cases, the limiting value is zero. However, the probability of no attempted thefts decreases more rapidly if p > q. In a nutshell, the point of deploying physical security controls is to ensure p < q.
If there have been two unauthorized access events, the probability of exactly two attempted thefts is (p)(p), i.e., p². The probability of all other outcomes, i.e., (p)(q), (q)(p) and (q)(q), equals 1 − p². The same calculation holds for the probability of exactly two non-attempted thefts during two unauthorized access events.
As the number of unauthorized access events increases, the probability of a specific outcome, e.g., that all events result in no thefts, approaches zero. Therefore, the complementary probability approaches one. Importantly, the rates of convergence to one and zero depend on the values of p and q. Therefore, as n approaches infinity, the potential for a specific outcome, e.g., no attempted thefts or all attempted thefts, becomes an exact probability, e.g., one.
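The convergence argument can be checked numerically. The function below is a minimal sketch of the binary model: each unauthorized access event is an independent trial with an assumed per-event probability p of an attempted theft (the values of p used here are illustrative):

```python
def p_at_least_one_attempt(p, n):
    # n independent binary trials; an attempt occurs with probability p,
    # so P(at least one attempt in n events) = 1 - (1 - p)^n.
    return 1.0 - (1.0 - p) ** n

# Even a small per-event probability converges toward certainty:
for n in (10, 100, 1000):
    print(n, round(p_at_least_one_attempt(0.05, n), 4))

# Only the rate of convergence depends on p, not the limiting value:
print(p_at_least_one_attempt(0.05, 1000) > 0.999)   # True
print(p_at_least_one_attempt(0.005, 1000) > 0.99)   # True
```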
Of course, we don’t know the probability of an attempted theft by individuals who are authorized to enter restricted space. Presumably, that probability is less than the corresponding probability for individuals who lack such privileges, but the contention is impossible to prove without statistics. As stated previously, we don’t know the actual value of p. But knowing the precise figure for p is not necessary to gain insight into the trend in the likelihood component of risk. The simple model described above leads to the following result:
If a risk factor-related event resulting from a threat scenario is a random variable with two possible outcomes, the potential and probability converge as the number of risk factor-related events increases without limit. Specifically, the probability becomes one as the number of risk factor-related incidents approaches infinity. Moreover, only the rate of convergence to unity is affected by differences in the respective probabilities of the two possible outcomes.
The previous discussion confirms our intuition: the tendency for threat incidents increases with an increasing number of risk factor-related incidents. Furthermore, for a sufficiently large number of risk factor-related incidents, the actual value of p, the probability of a threat incident, becomes irrelevant. In other words, an attempted theft becomes increasingly likely if there are many instances of unauthorized access no matter how small the probability of an attempted theft.
8.9 Summary
Assuming a threat incident is a random variable enables probabilistic or “stochastic” approaches to assessing the likelihood component of security risk. By definition, the laws of probability apply to such scenarios, enabling straightforward calculations of likelihood. However, care must be exercised in interpreting results so derived, perhaps by assuming such results represent a worst-case scenario, e.g., the absence of security controls.
Standard probability distributions apply to security-related threat scenarios assuming certain prerequisites are met. For example, estimating the probability of a specific number of threat incidents in a given time interval becomes possible via a Poisson process if the conditions for Poisson apply.
The probability of protection is a stochastic method that assumes a vulnerability risk factor can be represented by a probability distribution, where the mean and variance of the distribution are dictated by scenario-specific conditions. Using this probability distribution as the input to a model of risk factor behavior produces a spectrum of possible threat scenarios. Evaluating the model results relative to a security control performance specification yields the fraction of threat scenarios protected by that control. Estimates of the security control return on investment become possible with this method.
A normal distribution of displacements results from numerous steps of a random walk process. Such a model could be applied to threat scenarios governed by random processes, and thereby estimate the likelihood component of risk using the resulting
Gaussian distribution of process outcomes. If a specific value of the distribution exceeded specified tolerance limits, e.g., two or three standard deviations from the mean, such a condition might trigger intervention. Transition probabilities for states of a threat scenario can sometimes be calculated. These transition probabilities would identify anomalous and therefore suspicious threat scenario behavior. A method that leverages transition probabilities of threat scenario states is known as a Markov process, and is a standard mathematical technique that has been applied to numerous scientific and engineering problems. If a risk factor-related event can be considered a random variable with two possible outcomes each with respective probabilities, the potential converges to a probability of one in the limit as the number of events approaches infinity. The time-correlation function of risk factors perturbed by random fluctuations could be used to measure the characteristic time scale for threat scenario stability. Fluctuating risk factors reflect the underlying stability of threat scenarios, and this stability condition could be used to identify changing threat scenarios and thereby identify similar scenarios. The importance of these identifications is in determining if threat incidents originate from similar threat scenarios. If so, they can be included in the same probability distribution and thereby compared relative to one another to estimate the likelihood component of threat scenario risk.
Part III
Security Risk Assessment and Management
Chapter 9
Threat Scenario Complexity
9.1 Introduction to Complexity
We know from Chap. 4 that contemporaneous risk factors have an exponential effect on the magnitude of the likelihood component of risk. Furthermore, we learned in Chap. 7 that the presence of multiple risk factors introduces uncertainty in security risk management. The magnitude of uncertainty can be quantified if the individual risk factors are variables with finite variance. Intuitively, this source of uncertainty seems eminently reasonable; identifying, monitoring and addressing one risk factor is less difficult than two risk factors, which is less onerous than three, etc. Security risk management, which is in part based on risk assessment results, consists of periodically applying security controls on a timescale dictated by the time rate of change of threat scenario risk factors. If the threat scenario is dynamic, the magnitude of the likelihood component of risk increases during periods when updates to a security control lag changes in the risk factor it is addressing. Such a condition might also contribute to uncertainty in security risk management. One would expect the presence of simultaneous and/or changing risk factors to enhance the magnitude of the likelihood component of risk. Various threat scenario features might also obfuscate the requirement and/or effectiveness of security controls. Furthermore, not all threat scenarios are as obvious as those involving great white sharks and speeding subway trains. Increases in the magnitude of security risk can result from subtle threat scenario features, and therefore might not be recognized as risk-relevant. The precise effect of the number of risk factors and/or their interrelationship can be difficult to pinpoint let alone quantify. Therefore, threat scenarios that possess features that obscure risk-relevance, i.e., increase the likelihood component of risk, and therefore obfuscate the precise requirement for security controls, might need to be assessed in some special way. 
Such scenarios are colloquially referred to as “complex.”

© Springer Nature Switzerland AG 2019
C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_9
Our immediate task is to be more precise about what is meant by this term, a prelude to quantifying its effect. Although a strictly informal view of complexity can be useful in describing this phenomenon at a high level, a more formal characterization is necessary to actually identify and ultimately address complex threat scenarios.
9.2 Background
We begin the discussion of threat scenario complexity by making the rather vanilla assertion that whatever its root cause, complexity results in an obfuscation of the requirements for security risk management. We start with the premise that except for the most trivial threat scenarios, there is always some uncertainty regarding the application of security controls to risk factors, and the source of that uncertainty is complexity. Furthermore, there are infinite ways a risk factor can be obscured within a threat scenario and thereby affect perception of the requirement for security controls. Therefore, it would be futile to attempt to delineate every possible threat scenario if the objective is to develop a general prescription to address complexity. Such a prescription should be based on a general model that is applicable to all threat scenarios yet yields a specific value of complexity for a given threat scenario. A prerequisite for such a model and any subsequent analyses is a proper definition of complexity. Therefore, a preliminary definition is provided immediately below: Threat scenario complexity is a risk factor for the likelihood component of risk that obscures or otherwise hinders the identification and/or management of threat scenario risk factors.
This definition is not too dissimilar from the description in this chapter’s Introduction. Although it is quite general, it will suffice for now; a more precise definition will emerge as we attempt to identify a complexity metric.
A working definition is necessary but not sufficient to quantify threat scenario complexity. As noted above, a proper model is required. To that end, we begin with a simple combinatorial model, which will reveal the challenges in measuring complexity as well as set the stage for a more sophisticated approach.
Recall that a binary process is any process with two possible outcomes. A familiar example is a coin toss, which we previously used in other contexts. The two possible outcomes of a coin toss are designated as “heads” and “tails.” If the coin is fair, the probability of a heads outcome equals the probability of a tails outcome. A heads outcome could equally be designated as “1” and a tails outcome as “0” (or vice versa) without any loss of generality. Each of the two outcomes would again have an equal probability of occurrence, i.e., 0.5. As always, the sum of the probabilities for this process must equal one. Since the coin is presumed to be fair, there is no way to know whether a heads or tails will appear until after the coin is tossed.
A state of a binary process is defined as a series of process outcomes. For example, in a coin toss process consisting of three tosses, each state of the process comprises three outcomes. Therefore, there are 2^3 = 8 possible states: HHH (111), HHT (110), HTH (101), HTT (100), THH (011), TTH (001), THT (010) and TTT (000). Similarly, there are 2^10 possible states for 10 outcomes of a binary process, 2^100 possible states for 100 outcomes, 2^1000 possible states for 1000 outcomes, etc.
Maximum uncertainty regarding the outcome of a toss occurs when heads and tails appear with equal probability, since there is nothing to suggest one outcome is favored over the other. In other words, there is no a priori knowledge of an outcome. A coin exhibiting perfect unpredictability, i.e., maximum uncertainty associated with an outcome, is a fair coin. A coin where one outcome is more likely to appear than the other is referred to as a biased coin.
Although we might take fairness for granted, it is not necessarily guaranteed that a heads or tails will occur with equal probability. It is conceivable that one outcome is favored due to an imbalance in the distribution of the coin material. A fair coin depends on symmetry such that when it is tossed there is no net torque about a principal axis of rotation. In contrast, consider what happens to a highly asymmetric object such as a book when it is rotated about its middle axis. For a biased coin, either heads or tails would be a more likely outcome depending on the nature of its asymmetry.
Suppose for a particular coin heads appears with 75% frequency, i.e., there is a 75% probability the coin will land on heads each time it is tossed. Tails must therefore appear with 25% probability. For this coin, there is a priori knowledge of the outcome of each toss. Such knowledge translates into decreased uncertainty, i.e., higher predictability, associated with the coin tossing process.
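The state counting and the effect of bias can be illustrated with a short enumeration. The probabilities below follow from the independence of tosses, and the 0.75 bias matches the example above:

```python
from itertools import product

def state_probs(p_heads):
    # Enumerate the 2^3 = 8 states of three tosses with their probabilities.
    probs = {}
    for state in product("HT", repeat=3):
        p = 1.0
        for outcome in state:
            p *= p_heads if outcome == "H" else 1.0 - p_heads
        probs["".join(state)] = p
    return probs

fair = state_probs(0.5)
biased = state_probs(0.75)

print(len(fair))                       # 8 states
print(fair["HHH"])                     # 0.125 -- every state equally likely
print(round(biased["HHH"], 4))         # 0.4219 -- a priori knowledge favors this state
print(round(biased["TTT"], 4))         # 0.0156
print(round(sum(biased.values()), 6))  # 1.0 -- probabilities sum to one
```

For the fair coin every state is equally probable; for the biased coin the distribution of states is skewed, which is the decreased uncertainty described above.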
Put another way, which outcome would represent the smart bet on average? Maximum uncertainty does not exist if the probability of a heads or tails outcome is not equal. Perhaps less obvious is the fact that a non-maximum uncertainty condition implies the process conveys less information than if the probabilities are equal. Information in this context relates to the diversity of process outcomes. Based on this view of information, predictability is actually antipodal to information conveyance.1 Hence, maximum uncertainty or conversely minimum predictability implies maximum information conveyance for a given process. As a simple example, imagine if a person’s vocabulary were limited to three words. This condition is unfortunately not so far-fetched. That person’s utterances would be quite predictable, and the probability of speaking a particular word could be guessed with relative ease. In this case, the message source for the conversational process consists of three words, and consequently the diversity of possible process outcomes is quite limited. Now consider a conversation with an individual with a vocabulary of 10,000 words. In this case, the message source is much more diverse. A conversation with
1. The italicized version of information is intended as a bit of foreshadowing. The reason for the italics will become apparent in the discussion on entropy.
the person possessing a diverse message source promises to be more stimulating, in part due to its unpredictability. However, even diverse content can still be profoundly uninteresting, but content is not relevant to this discussion. We note in passing that every language has some predictability due to patterns inherent to all languages.2
Of course, a large vocabulary is no guarantee that the vocabulary will be used effectively, much less creatively. In addition, extreme economy of words is subject to broad interpretation. The lead character, Chance the gardener (a.k.a. Chauncey Gardiner), in the Jerzy Kosinski novel “Being There” and the 1979 film of the same name exemplifies this phenomenon. But in general the speech of someone possessing a limited vocabulary is more predictable in terms of the number of possible outcomes.
The previous discussion hints at the inherently statistical view of information. In fact, such a view is crucial to quantifying threat scenario complexity. In order to develop this concept, an interim method of estimating complexity is introduced next, and we will table the discussion of uncertainty, predictability and information for the time being. To be sure, this method suffers from the same drawback that afflicts any approach requiring a threat incident to be a random variable. Nevertheless, such an approach also has benefits, as discussed in Chap. 8. In this case, a principal benefit is to motivate a more sophisticated approach that yields a model for threat scenario complexity.
The model of threat scenario complexity introduced in the next section and the method used to analyze the likelihood of unauthorized access discussed in Chap. 8 share a fundamental premise. Recall that in the physical access threat scenario we assumed errors in the assignment of role-based access privileges occurred at random, i.e., role assignment was a random variable with two possible outcomes.
This simple assumption facilitated a stochastic approach to assessing the likelihood component of risk for this threat scenario. The random variable assumption led to a quantitative estimate of the likelihood component of risk at the expense of operational verisimilitude. Although a card access system with numerous enrolees might indeed be prone to random errors in role assignment, it is not realistic to assume all role assignments occur this way. In fact, most role assignments would presumably be correct because assignments are intentional and therefore the process is biased toward correctness assuming sentient and well-intentioned individuals control the process. The model of threat scenario complexity presented in the next section assumes that security risk management, i.e., the application of security controls to risk factors, is a random variable. Therefore, assessing the probability that the system is in a particular state, i.e., a specific configuration of managed and unmanaged risk factors, is a strictly combinatorial exercise.
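This combinatorial premise can be sketched with a toy simulation; the four-risk-factor example and variable names below are hypothetical, chosen only to illustrate the random-assignment assumption:

```python
import random

random.seed(7)  # for reproducibility of this hypothetical simulation

N = 4  # risk factors, each managed (1) or unmanaged (0) at random

# One threat scenario "state": a random configuration of managed/unmanaged factors
state = [random.randint(0, 1) for _ in range(N)]

# With no bias, each of the 2**N configurations is equally likely, so the
# probability of any particular state, including the all-correct one, is:
p_state = 1 / 2 ** N
print(state, p_state)  # p_state = 0.0625
```

Because the process is assumed unbiased, assessing the probability of any configuration reduces to counting configurations, exactly as the text describes.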
2 Piantadosi, S. T., Zipf’s word frequency in natural language; A critical review and future directions. Psychon Bull Rev. 2014 Oct; 21(5): 1112–1130.
This initial approach leads to a more sophisticated method that appropriates the concept of information entropy to characterize uncertainty in security risk management. Metrics that quantify the magnitude of complexity evolve naturally from this information theoretic model. However, note that even this model ultimately relies on judgment in assessing risk unless the process governing risk is truly random. Specifically, the information theoretic model requires a subjective assessment of the quality of the security risk management process. In the absence of the Risk Meter discussed in Chap. 1, subjectivity will always play a role in assessing the magnitude of security risk.
9.3 Complexity Combinatorics
Let’s assume a threat scenario consists of an ensemble of risk factors and security controls, where the number of risk factors equals the number of security controls. Each risk factor is managed by exactly one security control. Crucially, the assignment of a security control to a risk factor is a random variable. In other words, there is no bias in the security risk management process, yet each risk factor is managed by some security control. We further assume the risk management process is binary: a risk factor is managed by either the correct or an incorrect security control, where there is only one correct security control per risk factor. We label a correctly managed risk factor with a 1, and a risk factor managed by an incorrect security control with a 0. A threat scenario state is defined as a series of risk factor-security control pairs.
Recognize that this approach is simplifying yet naïve. There is no underlying process of security risk management, since security controls are assigned to risk factors without rhyme or reason. In a more sophisticated approach, the probability of a managed versus unmanaged risk factor would somehow relate to the quality of the governing security risk management process.
It is evident that the number of risk factors will affect the number of possible threat scenario states. If there are more risk factors, there are more possible risk factor-security control combinations. How do these combinations scale with the number of risk factors? Qualitatively, increasing the number of risk factors means more incorrect states are possible. Figures 9.1, 9.2 and 9.3 together depict the spectrum of risk factor-security control states in a threat scenario consisting of three risk factors and three security controls. In Fig. 9.1, security control C1 is managing risk factor R1, security control C2 is managing risk factor R2 and security control C3 is managing risk factor R3.
The configuration depicted in Fig. 9.1 is the only correct state. Repeatedly exchanging security controls generates the full spectrum of risk factor-security control combinations, i.e., all possible states of the threat scenario. For example, if C2 is incorrectly applied to risk factor R1 and C1 is incorrectly applied to R2, the result is State #2 as shown in Fig. 9.2.
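The swap sequence in Figs. 9.1, 9.2 and 9.3 can be sketched in a few lines of Python; the list encoding of states below is illustrative, not from the text:

```python
# Hypothetical sketch: each list records the control applied to R1, R2, R3.
state1 = ["C1", "C2", "C3"]  # Fig. 9.1: the only correct state

def swap(state, a, b):
    """Return a copy of the state with controls a and b exchanged."""
    s = state.copy()
    i, j = s.index(a), s.index(b)
    s[i], s[j] = s[j], s[i]
    return s

state2 = swap(state1, "C1", "C2")  # Fig. 9.2: C2 on R1, C1 on R2
state3 = swap(state2, "C1", "C3")  # Fig. 9.3: C2 on R1, C3 on R2, C1 on R3
print(state1, state2, state3)
```

Each successive exchange of two controls produces the next state described in the text.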
Fig. 9.1 Threat scenario state #1 (C1 manages R1, C2 manages R2, C3 manages R3)
Fig. 9.2 Threat scenario state #2 (C2 applied to R1, C1 applied to R2, C3 manages R3)
Fig. 9.3 Threat scenario state #3 (C2 applied to R1, C3 applied to R2, C1 applied to R3)
The remaining threat scenario state occurs when C1 is exchanged with C3, which implies C3 is incorrectly applied to R2 and C1 is incorrectly applied to R3. This third state is shown in Fig. 9.3.
We see that there are three possible states resulting from the process of swapping three security controls among three risk factors. The astute reader will realize that these permutations correspond to the number of distinct groups of security controls taken two at a time. A general expression that reflects all possible risk factor-security control combinations would be useful. To that end, a simple formula for an arbitrary number of risk factors R and security controls C is as follows:

R!/[C!(R − C)!]   (9.1)
Expression (9.1) should be read as “R choose C,” where R! is R factorial, which is shorthand for R × (R − 1) × (R − 2) × . . . × 1. Similarly, C! is C × (C − 1) × (C − 2) × . . . × 1. In this case, only two security controls are interchanged, so (9.1) becomes “R choose 2” or R!/[(R − 2)! 2!]. The application of (9.1) to a threat scenario with three risk factors and three security controls confirms the numerical result previously determined by simply counting the number of possible states:

3!/(2! 1!) = (3 × 2)/(2 × 1) = 3

Similarly, if there are four risk factors and an equal number of security controls, and any two security controls are interchanged, the number of distinct risk factor-security control combinations is four choose two, which from (9.1) is calculated as follows:

4!/(2! 2!) = (4 × 3 × 2)/(2 × 2) = 6

Clearly, as R increases the number of possible risk factor-security control combinations, i.e., threat scenario states, also increases. Figure 9.4 shows the dependency of the number of states on the number of risk factors, assuming equal numbers of risk factors and security controls. The following equation describes this particular scaling relationship, where y is the number of states and x is the number of risk factors:

y = 0.5x^2 − 0.9x + 1.4
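These counts can be verified with Python’s built-in math.comb, which computes (9.1) directly:

```python
from math import comb  # comb(R, C) = R! / [C! (R - C)!], i.e., expression (9.1)

assert comb(3, 2) == 3  # three risk factors: three possible swapped states
assert comb(4, 2) == 6  # four risk factors: six distinct combinations

# "R choose 2" = R(R - 1)/2, so the count grows quadratically with R
for r in range(2, 15):
    print(r, comb(r, 2))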
Fig. 9.4 Threat scenario states v. risk factors (fitted curve: y = 0.5303x^2 − 0.9303x + 1.4)
We see that the number of threat scenario states increases quadratically with the number of risk factors. In other words, there is a non-linear relationship between the number of possible threat scenario states and the number of risk factors. Qualitatively, the number of states becomes disproportionately higher as the number of risk factors increases. This dependency certainly makes sense, but it is only modestly helpful in evaluating complexity in an actual threat scenario.
Of course, in addition to the random assignment of security controls, this model is contrived in other ways. For example, there is certainly no guarantee of a one-to-one correspondence between risk factors and security controls. Indeed, it is quite possible that the number of risk factors exceeds the number of security controls, especially in threat scenarios where security risk management is lax. Clearly, a more general scaling relation is required.
Let’s assume the number of risk factors remains fixed but the number of security controls can vary from zero up to the total number of risk factors. We further assume for the sake of illustration that there are ten risk factors. According to (9.1), the computation required to determine the total number of states consists of calculating 10 choose 0, 10 choose 1, etc., up to 10 choose 10. Figure 9.5 graphically illustrates the results. The maximum number of threat scenario states occurs when five security controls are applied to ten risk factors. The symmetry is indicative of the general case, i.e., the maximum number of states occurs when the number of security controls equals half the number of risk factors (assuming an even number of risk factors).
In the combinatorial model presented above, the assignment of a security control to a risk factor is a random variable. If the security risk management process were
Fig. 9.5 Number of threat scenario states (10 risk factors); x-axis: number of security controls
strictly random, we could simply use (9.1) to calculate the number of possible states. The inverse of that number is the probability that the threat scenario exists in its correct state, since there is only one correct state. This probability is inversely related to the magnitude of threat scenario uncertainty: the lower the probability, the greater the uncertainty associated with a threat scenario being in a particular state.
The underlying contention is that the magnitude of uncertainty directly relates to threat scenario complexity. The more complex the threat scenario, the greater the number of possible risk factor-security control states. Therefore, with an increasing number of states, the lower the probability that any given state is the correct one.
This relationship between uncertainty and complexity is directionally correct but not very realistic. Consider the strong possibility that a threat scenario is biased, i.e., specific security controls have been intentionally applied to specific risk factors. In such instances, which are hopefully the norm, the application of a security control to a risk factor is anything but a random variable. If the random variable assumption is not valid, it is not possible to determine the number of states via a combinatorial approach.
Clearly, complexity as defined above will vary with the number of possible states. But it will also vary with the quality of security risk management. As noted above, the very notion of security risk management implies a bias in applying security controls, inevitably yielding a reduction in uncertainty. In other words, the combinatorial model ignores the effect of security risk management. Fortunately, we can do better. The challenge in developing a useful model of threat scenario complexity is to incorporate the quality of the underlying security risk management process, and thereby account for bias in the process that governs the application of security controls to risk factors.
To address that challenge, we utilize our revised notion of information and thereby invoke a fundamental result of information theory.
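The state counts plotted in Fig. 9.5 can be reproduced with a short calculation, assuming (9.1) gives the count for each number of security controls:

```python
from math import comb  # comb(n, k) implements expression (9.1)

R = 10  # number of risk factors, as in Fig. 9.5
states_by_controls = [comb(R, k) for k in range(R + 1)]

print(states_by_controls)       # [1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1]
print(max(states_by_controls))  # 252 states, peaking at k = 5 controls
print(sum(states_by_controls))  # 1024 = 2**10 total states across all k

# Under the purely random assumption, the chance of the single correct state
# for, say, five controls is simply the inverse of the state count:
p_correct = 1 / comb(R, 5)
```

The symmetric peak at k = 5 matches the figure: the maximum number of states occurs when the number of controls equals half the number of risk factors.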
9.4 Information Entropy
In the combinatorial model of complexity, threat scenarios were modeled as an ensemble of states consisting of managed and unmanaged risk factors. The model assumed no security risk management, so security controls were randomly applied to risk factors. Although the previous approach is admittedly simplistic, the notion of an ensemble of states of risk factors is useful. This representation is analogous to physical systems modeled as an ensemble of particles in various states of energy.
A statistical view of physical systems leads to stochastic methods that are used to determine various thermodynamic parameters, which is the conceptual basis of statistical mechanics. For example, at equilibrium, the probability that a system will be in a particular state is proportional to the number of states accessible to the system. Specifically, the probability Pr of a system being in a particular state r relates to the energy of the system as follows, where β = 1/kT, k is Boltzmann’s constant, T is absolute temperature, Er is the energy of state r and C is a constant of proportionality dictated by the normalization requirement for a probability3:

Pr = C e^(−βEr)   (9.2)
An equilibrium state of a system is defined as the state where the probability of accessing the greatest number of states is highest. We see from (9.2) that higher energy states are less likely. Although stochastic models are used in both statistical mechanics and threat scenario complexity, any connection between the two disciplines ends there. Furthermore, the analogy is only meaningful when threat scenarios contain a large number of risk factors, since only in that context does a statistical approach make sense. Physical systems contain a fantastically large number of particles, i.e., many, many more than the number of risk factors in a typical threat scenario. Even then, a threat scenario is not typically governed by natural laws, noting that understanding physical phenomena is the point of statistical mechanics. However, in both cases a stochastic approach facilitates estimates of specific parameters by calculating the probability of a particular state relative to the spectrum of possible states.
Let’s assume a threat scenario consists of an unspecified number of risk factors. A successfully managed risk factor is designated as “1” and an unmanaged risk factor by “0.” Therefore, a series of 1s and 0s characterizes each threat scenario state, and each state represents one possible outcome of a security risk management process. In other words, a distinct sequence of 1s and 0s represents each unique threat scenario state. For example, assume a threat scenario consists of a thousand distinct risk factors, and each risk factor is labeled with a 1 or a 0 depending on whether it is managed by
3
F. Reif, op. cit.
a security control or not. If there is an equal probability that any given risk factor is managed or not, there are 2^1000 possible states in the entire ensemble, where each state consists of a unique sequence of 1s and 0s. Importantly, the probability of risk factor management by a security control, which is known a priori, has been specified as 0.5. Recognize this probability is a macroscopic feature of the threat scenario that reflects an average condition. In other words, this probability reflects the general uncertainty about security risk management across the threat scenario.
This homogenizing condition is indicative of another of the simultaneous advantages and disadvantages of a stochastic approach. It assumes all risk factors contribute equally to the magnitude of the likelihood component of risk, which is not necessarily true. Moreover, specifying a single figure for the probability of risk management that applies to all risk factors is clearly a simplification. However, a vanilla approach is also what makes the problem of modeling complexity tractable.
Claude Shannon was a pioneer in the field of information theory.4 Significant contributors to the theory also include R. V. L. Hartley, Harry Nyquist, Leo Szilard, Norbert Wiener, and Ludwig Boltzmann. Information theory describes the fundamental limits on sending signals over noisy communication channels.5 It was developed to address practical problems in signal transmission. It is important to recognize that the theorems of information theory apply only to signals with well-defined mathematical properties. Therefore, we must not overreach or misinterpret the results. In this model of complexity, only the most basic concepts have been appropriated, with the relatively narrow objective of quantifying uncertainty in security risk management.
The probability that risk factors are either managed or unmanaged is crucial to determining the “information” conveyed by a security risk management process. Information has a very specialized meaning that differs from the colloquial use of the term; recall this distinction was highlighted earlier in the chapter. Typically, information suggests the conveyance of meaning. A statement laden with information might communicate a fact or a question about a fact: the dog is missing, is the gas tank empty, the house is on fire, and so on. Such statements are characterized by content that is of interest to someone, such as the participants in a conversation, the readers of a book or the viewers of a television program.
However, information can also be viewed in terms of probabilities. Foreshadowing a more in-depth discussion, Shannon defined information as “the negative reciprocal value of a probability associated with a process.” The higher the probability of a process outcome, the less information that process conveys. Therefore, and as discussed in the last section in connection with language, the complete absence of diversity among process outcomes implies that such a process conveys zero information. In this instance, the probability of a particular process
4
Claude Shannon, 1916–2001, an American mathematician, electrical engineer and cryptographer.
5
C. E. Shannon, The Mathematical Theory of Communication, Bell System Technical Journal, July and October 1948.
outcome is always the same: one, i.e., perfect certainty. We can therefore quantify the notion of information as a probability, which is the essence of information entropy as communicated by Shannon.
Suppose a single binary digit, call it an information bit, constitutes a message source, and the bit value is one. Suppose further that messages derived from that message source are sent over a communication channel, and each message is four bits in length. In other words, messages that always consist of four ones, i.e., 1111, are sent via a transmitter to a remote receiver. One could arrange and rearrange the four digits ad infinitum and the message would not vary. It would always consist of four ones.
Next, consider a second message source consisting of two digits: a 1 and a 0. If each message derived from this message source also consists of four digits, e.g., 1001, 0110, etc., there are 2^4 = 16 possible sequences or states. The diversity in the outcomes of the second message source is obviously greater than that of the first. In fact, the first message source has zero diversity. Can we quantify the diversity of a message source?
Since there is zero diversity in the first message source, all transmitted messages are known a priori. Every message is identical, and therefore the probability of a message containing four ones is unity, i.e., 100%. To reiterate for emphasis, in the absence of message source diversity, a process utilizing this message source conveys zero information. Another way of characterizing zero diversity is the absence of uncertainty. What could a recipient glean from messages that always consist of a single symbol and the same number of symbols? Absolutely nothing, which is equivalent to saying no information could be conveyed and the probability of receiving any particular message is one. There is zero uncertainty in a process utilizing a source with zero diversity.
A probability of one with respect to message outcomes implies complete certainty in the transmitted message. In contrast, there are 16 possible configurations or outcomes resulting from the second message source. Therefore, if all outcomes are equally likely, the probability of a specific message is 1/16. This probability characterizes the inherent uncertainty associated with any transmission resulting from this message source. Again, diversity implies uncertainty, and uncertainty implies information conveyance.
Let’s suppose an analog signal is encoded so that it is sent as a digital transmission. What would the encoding look like if a series of 1s and 0s were used versus a series of only 1s? Assume this analog signal is a pure sinusoid; its amplitude therefore oscillates at a single frequency. Messages from source one are three bits in length and contain either a “1” or a “0” in each bit position. The transmitted signal can be represented as an oscillating voltage that varies between 0 and 1 volt (V) in amplitude. Messages from source two are also three bits in length, but “1” is the only symbol. The sine waves encoded with message source 1 and message source 2 are shown in Table 9.1.
The sine wave amplitude can assume eight values between 0 and 1 V if it is encoded with message source 1. Variations in the analog signal can be specified with a resolution of 1 V/7 ~ 0.14 V, since there are seven evenly divided intervals between 000 and 111. The probability of any particular three-bit sequence of 1s and 0s being
Table 9.1 Signal encoding using all ones versus a combination of ones and zeros

Source 1 Encoding:  111    110     100     011     001     101     010     000
Message Source 1:   1 V    0.84 V  0.70 V  0.56 V  0.42 V  0.28 V  0.14 V  0 V
Source 2 Encoding:  111    111     111     111     111     111     111     111
Message Source 2:   1 V    1 V     1 V     1 V     1 V     1 V     1 V     1 V
transmitted is 1/8 since there is a total of eight voltages. The diversity of the message source definitely enables information to be conveyed via message source 1. In contrast, a signal derived from message source 2 is always one volt in amplitude. The probability of a three-bit sequence consisting of 111 is one, i.e., there is absolute certainty in the outcome of any transmission. If analog signal voltages were encoded using message source 2, and if the voltage output were used to drive an audio speaker, the resulting tune would be a single tone, a monotonous listening experience to be sure. The lack of diversity in message source 2 precludes the conveyance of information as defined by Shannon.
The diversity of an information source, or lack thereof, is exemplified by the speech of many American youths. It seems that tomorrow’s leaders are preternaturally fond of using the word “like.” An informal analysis conducted while involuntarily confined within the same subway car has determined that up to 10% of an individual’s spoken vocabulary might consist of this single word. Repetitive use of a single word is equivalent to reducing the diversity of the message source. This reduction results in increased predictability of the speech so generated. A lack of diversity combined with a chirp-like intra-sentence frequency modulation characterizes the speech of many young Americans.
The diversity of a message source leads to the notion of information entropy, and once again we use a coin toss process to introduce and develop the concept.6 By way of review, a coin is deemed fair if there is an equal probability of a heads or tails. Quantitatively, this statement translates to a 50% chance of the coin landing on heads and a 50% chance of landing on tails. If the coin is indeed fair, we have no practical way of determining the outcome of a particular coin toss in advance.
In other words, the uncertainty associated with the coin toss process is a maximum when the probability of a heads equals the probability of a tails, i.e., each probability equals 0.5. Suppose the probability of a heads appearing on a given toss is actually 0.75, i.e., p(heads) = 0.75 instead of 0.5. Therefore, if the coin were tossed 100 times, we would expect heads to appear roughly 75 times. Since the sum of the probabilities associated with the outcomes of a single process must equal one, the probability of a tails appearing on any given toss must equal 1 − p(heads) = 0.25 = p(tails). Is there a difference in the information conveyed by the coin-tossing process if p(heads) differs from p(tails)?
6
Information entropy should not be confused with the entropy of statistical mechanics. The two concepts have a relationship, which can be informally summarized as “There is no such thing as a free lunch.” See J. R. Pierce, An Introduction to Information Theory: Symbols, Signals and Noise, Information Theory and Physics (Chap. 10), Dover, Second Edition, 1980.
We now know that if the two probabilities associated with each coin toss are the same, there is maximum uncertainty with respect to the outcome. Therefore, it is impossible to predict any particular outcome in advance. However, if heads and tails have dissimilar probabilities, we have a priori knowledge regarding the most likely outcome of a toss.
Assume the coin toss process generates a 1 if an outcome is heads and a 0 if it is tails. We can view coin tossing as a message source consisting of binary digits. In fact, any binary process can be characterized this way as long as the sum of the individual probabilities equals one. We now define a quantity called the information entropy H for a process with two outcome probabilities, p0 and p1, as follows,7 where the reader is reminded that information entropy and thermodynamic entropy are not equivalent:

H = −[p0 log2 p0 + p1 log2 p1]   (9.3)
H has units of information bits-per-outcome or in this case, information bits-persymbol, where the symbol is either a 1 or a 0. If the process is a coin toss, H would have units of information bits-per-toss. Note that log2 is the designation for the logarithm in base 2.8 In the general case, a summation sign ∑ (Greek upper case sigma) is used to characterize H in terms of the sum of the probabilities associated with the process. In a binary process, this means there are identically two probabilities. H can more generally be written as follows: H¼
N X
pi ð log 2 pi Þ information bits per‐outcome
ð9:4Þ
i¼1
Recalling log2(0.5) = −1, the information entropy of the coin toss process can be easily calculated. Namely, if p0 = p1 = 0.5, using (9.3) the information entropy becomes:

H = −[0.5(−1) + 0.5(−1)] = 1 information bit-per-toss
7
Thermodynamic entropy, often designated as S, refers to the number of states Ω available to a system, written as S = k ln(Ω), where k is Boltzmann’s constant and ln is the natural logarithm.
8
For those readers requiring a math refresher, a logarithm corresponds to an exponent. For example, the number 100 can be written as 10 × 10 = 10^2, where 10 is the base and 2 is the exponent. Therefore, the logarithm of 100 equals 2 in base 10. Analogously, the number 8 can be written as 2 × 2 × 2 = 2^3. Therefore, the logarithm of 8 equals 3 in base 2.
Therefore, one information bit is required to characterize a single process outcome, i.e., one coin toss, assuming the probabilities of each outcome are equal. Hereafter log2 will be written without the subscript since only binary processes are considered.
What happens to the information entropy when the probabilities of tossing a heads or tails are unequal, which implies there is a priori knowledge of the outcome of any given toss? For example, suppose p0 = 0.75 and p1 = 0.25. For any binary process with these outcome probabilities, the information entropy is calculated using (9.3) as follows:

H = −[0.75(−0.415) + 0.25(−2)] = 0.311 + 0.5 = 0.811 information bits-per-toss

The bias in the coin toss process is manifest as lower information entropy. In this example only 0.811 information bits-per-binary digit (or equivalently, information bits-per-toss) are required to characterize a process outcome. Recall that the information entropy of an unbiased coin toss process, i.e., p(heads) = p(tails) = 0.5, was calculated to be 1 information bit-per-toss. Therefore, the uncertainties associated with unbiased and biased processes result in differing information entropies, i.e., 1 and 0.811 respectively. Again, maximum uncertainty, i.e., maximum information entropy, occurs when all process outcomes are equally likely, and therefore there is no a priori knowledge of a particular outcome.
If the probability of a heads is one, the probability of a tails must be zero. The situation is symmetric. Furthermore, if p(heads) = 1 and p(tails) = 0, or p(tails) = 1 and p(heads) = 0, the information entropy is zero since there is complete certainty regarding the process outcome. In other words, zero information bits are required to convey a process outcome if the outcomes have zero diversity. Figure 9.6 graphically illustrates the information entropy of a binary process.
The graph confirms that when p(heads) = p(tails) = 0.5, the information entropy is a maximum, i.e., H = 1 information bit-per-toss. The parabolic shape of Fig. 9.6 is characteristic of the information entropy of any binary process. The symmetry of the curve has important operational implications. The information entropy of a process with p(heads) = 0.1 and p(tails) = 0.9 is identical to the information entropy of a process with p(heads) = 0.9 and p(tails) = 0.1. From an information theoretic perspective, each of these processes conveys the same information.
If two coins are tossed instead of just one, there are four possible states, where each state consists of a pair of outcomes:
State 1 = Heads-Heads (1–1)
State 2 = Heads-Tails (1–0)
State 3 = Tails-Heads (0–1)
State 4 = Tails-Tails (0–0)
Fig. 9.6 Information entropy of a binary process (e.g., coin toss)
Assuming both coins are fair, the probability of any single state is 1/4. Therefore, according to (9.4) the information entropy of this process equals:

H = −[1/4 log 1/4 + 1/4 log 1/4 + 1/4 log 1/4 + 1/4 log 1/4] = −[−1/2 − 1/2 − 1/2 − 1/2] = 2 information bits-per-pair of coins tossed
More generally, if there are n possible outcomes of a process, and each outcome is equally likely, the information entropy becomes:

H = −n(1/n) log(1/n) = −log(1/n) information bits-per-outcome   (9.5)

What about the information entropy of a process with more than two outcomes per event? A die is the functional equivalent of a six-sided coin, where each side of the die corresponds to a unique number ranging from one to six. Applying (9.5), the information entropy associated with the process of throwing a fair die is given by:

H(die toss) = −(6 × 1/6) log(1/6) = −log(1/6) = 2.58 information bits-per-throw

Recall that a state of a process represents a unique series in the ensemble of process outcomes. For example, if a process consists of two coin tosses, there are four possible states: HH, HT, TH and TT (note that H represents a heads in this context and not information entropy). How can information entropy be used to express the information conveyed by a given process?
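Equation (9.5) is easy to check numerically; a brief sketch assuming equally likely outcomes (the function name is ours):

```python
import math

def uniform_entropy(n):
    """Entropy (9.5) of a process with n equally likely outcomes, in bits."""
    return -sum((1 / n) * math.log2(1 / n) for _ in range(n))

print(uniform_entropy(2))            # 1.0 bit: fair coin
print(uniform_entropy(4))            # 2.0 bits: a pair of fair coin tosses
print(round(uniform_entropy(6), 2))  # 2.58 bits: a throw of a fair die
```

For n equally likely outcomes the sum collapses to log2(n), which reproduces the coin, coin-pair and die values computed above.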
We know from Fig. 9.6 that if the information entropy of a binary process is one, each state is equally likely, i.e., each outcome has a 0.5 probability. However, for other values of H, some states are more probable than others. We can quantify the number of probable states using information entropy.
For a binary process, the number of possible states equals the number of outcomes raised to an exponent corresponding to the number of process events. For example, if there are two tosses of a fair coin, there are 2^2 possible outcomes. However, for a process consisting of two tosses of an unfair coin, the number of probable states equals the number of outcomes, i.e., two, raised to an exponent consisting of the number of tosses times the information entropy. If N equals the number of probable states, O is the number of process outcomes (i.e., values of a binary message source), T is the number of process events (e.g., coin tosses) and H is the information entropy of the process, we get the following relation:

N = O^(TH)   (9.6)
In the special case of two coin tosses where p(heads) = p(tails) = 0.5, i.e., H = 1, the number of probable states becomes:

N = 2^(2 × 1) = 4

We see that the number of probable states equals the number of possible states when there is maximum uncertainty, which is equivalent to the condition H = 1. We typically assume coins are fair so that H equals 1. However, if this is not the case, i.e., each outcome of the binary process has a different probability, the process is biased. Therefore, some fraction of the possible states of a binary process becomes more likely than others. This condition will be important in the estimates of complexity discussed in the next section. Specifically, calculating the number of probable states resulting from a security risk management process enables estimates of threat scenario complexity.
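Relation (9.6) can be illustrated with the biased coin from earlier in the chapter; the variable names here are ours:

```python
import math

def probable_states(outcomes, events, entropy):
    """Number of probable states per (9.6): N = O**(T*H)."""
    return outcomes ** (events * entropy)

# Fair coin, two tosses: H = 1, so all 4 possible states are probable.
print(probable_states(2, 2, 1.0))  # 4

# Biased coin with p(heads) = 0.75: H ~ 0.811, so fewer states are probable.
H = -(0.75 * math.log2(0.75) + 0.25 * math.log2(0.25))
print(round(probable_states(2, 2, H), 2))  # ~ 3.08
```

Bias shrinks the exponent TH below T, so the probable-state count falls below the possible-state count of 2^T, exactly the effect the text describes.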
9.5 Estimates of Threat Scenario Complexity
The information entropy of a security risk management process can be used to calculate threat scenario complexity. Complexity in this context relates to the diversity of possible outcomes, which is inversely proportional to the probability that a threat scenario exists in a particular state. Therefore, the lower the probability of any particular state the greater the threat scenario complexity. This probability derives from the uncertainty in security risk management coupled with the number of risk factors. Let’s assume each of the two outcomes of a binary security risk management process has equal probability. Therefore, a risk factor has an even chance of being
managed or not managed. We know from the discussion in the last section that the information entropy associated with such a process equals one. This condition implies there is maximum uncertainty in security risk management. Let's further assume there are 1000 risk factors present in a threat scenario. This figure may seem high, but individuals and/or systems can be considered risk factors depending on the threat scenario. For example, each Windows account password selected by an individual computer user might be considered a risk factor in an information security threat scenario. There are only two security risk management process outcomes. As before, managed risk factors are labeled as "1" and unmanaged risk factors are labeled as "0." If the probability of a managed risk factor equals the probability of an unmanaged risk factor, i.e., 0.5, each state of the threat scenario ensemble consists of a unique series of 1000 managed and unmanaged risk factors, and each of these states is equally probable. The number of possible states is 2^1000. Recognize this threat scenario is identical to a coin tossing process. Each coin toss has an equal probability of landing on a heads or tails. In exact analogy, each risk factor has equal probability of being managed or unmanaged, and the assumption is the process is unbiased. Either 1000 fair coins or 1000 threat scenario risk factors yields a total of 2^1000 possible states, where each state consists of a unique combination of equally likely outcomes. Therefore, the probability of a threat scenario existing in one particular state is 1/2^1000 or equivalently, 2^(-1000). In other words, the probability that a threat scenario is in a particular state chosen at random is inversely proportional to the number of possible states. We repeat for emphasis that this condition is only true when the probabilities of a managed and unmanaged risk factor are equal, i.e., 0.5, which implies the risk management process is completely random.
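Numbers like 2^1000 overflow ordinary intuition (and fixed-width floats), so it is convenient to work in log space. A small sketch, assuming the 1000-risk-factor scenario above with H = 1:

```python
import math

M = 1000  # risk factors, each managed/unmanaged with probability 0.5 (H = 1)

# log10 of the number of possible states: 2**1000 ≈ 10**301
log10_states = M * math.log10(2)

print(f"possible states: 2**{M} ≈ 10**{log10_states:.0f}")
print(f"P(particular state) = 2**-{M} ≈ 10**-{log10_states:.0f}")
```

The probability of finding the scenario in any one pre-specified state is therefore astronomically small when risk management is completely random.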
We can calculate the information entropy for this process using (9.5):

H = -(2^1000 × 2^(-1000)) log 2^(-1000) = 1000 information bits-per-state

In other words, 1000 information bits are required to specify each threat scenario state, and each state consists of 1000 managed and unmanaged risk factors. Similarly, if the threat scenario consists of 1,000,000 risk factors that are either managed or unmanaged with equal probability, the information entropy associated with this security risk management process equals 1,000,000 information bits-per-state. In general, for a threat scenario with N risk factors, where each risk factor outcome has equal probability, the information entropy equals N information bits-per-state in accordance with (9.5). We see that the information entropy, i.e., uncertainty, associated with a binary security risk management process scales linearly with the number of risk factors. Therefore, it is clear that the uncertainty in security risk management depends on the number of risk factors. We pause here to discuss the meaning of H in this context, which is critical to understanding threat scenario complexity. The information entropy is characterizing
the uncertainty inherent to the process of security risk management, i.e., the application of security controls to risk factors. H is a macroscopic parameter. It reflects the magnitude of uncertainty in security risk management across the entire threat scenario. We use a monolithic figure for information entropy to generalize about the magnitude of uncertainty in security risk management throughout the environment. Often the figure for information entropy must be estimated based on objective if not quantifiable parameters. What if the security risk management process is biased? Such a condition will affect the likelihood that the scenario is in a particular state. Bias in security risk management is a good thing, since it means security controls are being preferentially applied to risk factors. It is easy to see that the probability of selecting a particular state from the spectrum of possible states will increase with increasing bias in security risk management, i.e., decreasing H. The information entropy is expressing this inherent bias. We will see how that bias affects the number of probable states, and therefore the magnitude of threat scenario complexity. Next we examine the more general case where the information entropy of a binary process does not equal 1, i.e., p(1) does not equal p(0). A shift in the information entropy leads to an expression for the number of probable threat scenario states as follows: For a binary security risk management process with information entropy H, where each threat scenario state consists of M risk factors, there are 2^(MH) probable states for that threat scenario. In the special case when H = 1, the number of probable states equals the number of possible states.9,10
The number of probable threat scenario states can now be directly related to the magnitude of threat scenario complexity: Threat scenario complexity is the probability that a threat scenario exists in a specific state among an ensemble of threat scenario states, where each state consists of a unique mixture of managed and unmanaged risk factors.
Using this result, if there are M = 1000 risk factors, each labeled as a one or zero, and the probability of a managed and unmanaged risk factor is equal, there are 2^(MH) = 2^(1000 × 1) = 2^1000 probable states. Notice that 2^1000 also equals the number of possible states, since when H = 1 there is maximum uncertainty in the binary security risk management process. In general, and for a constant number of risk factors, threat scenario complexity is a maximum when H equals one.
9. Pierce, John R., An Introduction to Information Theory: Symbols, Signals and Noise, Dover, Second Edition, 1980.
10. We use M to designate the number of risk factors in complexity threat scenarios only. Otherwise, the number of risk factors will be designated as R.
Furthermore, the probability of a threat scenario with H = 1 and 1000 risk factors being in a particular state is 2^(-1000), which is a very small number. What happens to the magnitude of threat scenario complexity if the number of risk factors remains constant but H decreases? For example, let's assume H = 0.811, which corresponds to a risk management process where p(managed risk factor) = 0.75 and p(unmanaged risk factor) = 0.25 or vice versa (see Fig. 9.6). The number of probable threat scenario states becomes 2^(1000 × 0.811) = 2^811. Therefore, the probability that this threat scenario exists in a particular probable state is 2^(-811). This number is much larger than 2^(-1000) but it is still a very small figure. To reiterate for emphasis, maximum threat scenario complexity for a fixed number of risk factors occurs when there is maximum uncertainty associated with the security risk management process, i.e., H = 1. In such instances, all threat scenario states are equally probable since the probability that a risk factor is managed or unmanaged is equally likely. In other words, if H = 1, the process of managing risk factors is equivalent to tossing a coin. Let's now assume M equals 2. We again assume a binary security risk management process and H equals 1. Managed and unmanaged outcomes are designated as 1 and 0 respectively. In this case, there are 2^(MH) = 2^(2 × 1) = 4 probable states, where each state consists of two managed and/or unmanaged risk factors, and each state is equally probable. Therefore, there are four possible states as follows:
• State #1 = 11
• State #2 = 10
• State #3 = 01
• State #4 = 00
Next, compare this threat scenario to one where M equals 2 but the information entropy H equals 0.9. In that case p(1) = 0.7 and p(0) = 0.3 or vice versa (refer again to Fig. 9.6). Since H is not 1 for this binary risk management process, the number of probable states does not equal the total number of possible states. We see that the number of probable states is reduced because there is less uncertainty in the risk management process. In other words, the risk management process is biased. Specifically, the number of equally probable states now becomes,

2^(MH) = 2^(2 × 0.9) = 2^1.8 ≈ 3.48

A graph of the probable states of a binary security risk management process for threat scenarios with two risk factors is shown in Fig. 9.7. As expected, its shape is similar to Fig. 9.6 since in both cases the underlying security risk management process is binary. What is the dependency of the number of probable states on the number of risk factors? Let's assume H equals 0.9 for a given threat scenario risk management process. Therefore, the probability of managed and unmanaged risk factors equals
Fig. 9.7 Probable states for threat scenarios governed by a binary risk management process and two risk factors
Fig. 9.8 The number of probable states as a function of the number of risk factors (H ¼ 0.9)
0.7 and 0.3 respectively (or vice versa). The number of probable states as a function of the number of risk factors is shown in Fig. 9.8. The y-axis is logarithmic, so the number of probable states increases dramatically with the number of risk factors. For example, if the number of risk factors increases by a factor of 5, i.e., from 20 to 100, the number of probable states increases from 10^5.4 (i.e., 262,144) to 10^27! (Note: this is an exclamatory declaration and not a factorial sign). Therefore, the probability of a threat scenario existing in a particular probable state decreases from 10^(-5.4) to 10^(-27). It is clear that the number of probable states and the probability of a specific threat scenario state are inversely related. Furthermore, the number of probable threat
scenario states scales exponentially with the number of risk factors. In the next section we will see that this formulation leads immediately to a metric for the magnitude of threat scenario complexity. Finally, note that although 10^27 is an extremely big number, it is actually quite small relative to the number of possible states that exist in a threat scenario with 100 risk factors where H equals 1, i.e., 2^100. The ratio of probable to possible states leads to another complexity metric as discussed in the next section.
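The exponential scaling above (cf. Fig. 9.8) is easy to reproduce. A minimal sketch, assuming H = 0.9 as in the example:

```python
import math

H = 0.9  # entropy of a biased binary risk management process (p ≈ 0.7 / 0.3)

# Probable states = 2**(M*H); report both the raw count and its order of magnitude.
for M in (20, 100):
    bits = M * H
    print(M, 2 ** bits, f"≈ 10**{bits * math.log10(2):.1f}")
```

For M = 20 this gives 2^18 = 262,144 ≈ 10^5.4 probable states; for M = 100 it gives 2^90 ≈ 10^27, matching the five-fold risk factor example in the text.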
9.6 Complexity Metrics
Threat scenario complexity contributes to the magnitude of the likelihood component of risk. Despite its prevalence and the magnitude of its effect, it is often overlooked in a security risk management strategy. There is not even consensus on a proper definition of threat scenario complexity. The theory developed in the last section leads to metrics that enable comparisons of threat scenario risk across disparate threat scenarios. These metrics offer a reference for comparison, recognizing they are not an absolute measurement such as the output of our fictitious risk meter. In other words, these complexity metrics are measures of relative risk. We define a complexity metric Cm as the uncertainty in identifying a particular threat scenario state if a state were selected at random from the spectrum of probable states. This metric is expressed as a probability,

Cm = 2^(-MH)
(9.6)
Cm is a minimum, i.e., threat scenario complexity is a maximum, when the information entropy H of the underlying risk management process equals 1 for a fixed number of risk factors. In that case the number of probable states equals the number of possible states and Cm = 2^(-M). Note that the value of H for a binary process will always be between zero and one, whereas the number of threat scenario risk factors is theoretically unlimited. Therefore, the relative effect of increasing the number of risk factors on the magnitude of complexity is disproportionately significant. For example, let's calculate Cm when M = 10 and H = 0.7, i.e., p(1) = 0.8 and p(0) = 0.2 or vice versa:

Cm = 2^(-(0.7)(10)) = 2^(-7) ≈ 0.008
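The metric Cm can be computed directly. A minimal Python sketch (the function name is mine, purely illustrative):

```python
def complexity_metric(M, H):
    """Cm = 2**(-M*H): probability of a particular probable threat scenario state."""
    return 2.0 ** (-M * H)

print(round(complexity_metric(10, 0.7), 4))  # 2**-7 ≈ 0.0078
print(complexity_metric(100, 0.7))           # 2**-70, on the order of 1e-21
```

Smaller values of Cm correspond to greater complexity, so scaling M up by a factor of ten drives Cm down dramatically even at constant H.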
If M now equals 100 and H again equals 0.7,

Cm = 2^(-70) ≈ 8.5 × 10^(-22)

Evidently, increasing the number of risk factors by a factor of 10 increases the magnitude of complexity by roughly nineteen orders of magnitude. This increase occurred despite the fact that the information entropy remained constant. To summarize, the magnitude of complexity is greatly dependent on the number of risk factors for a binary risk management process. Figure 9.9 is a graph of Cm for M = 2 and M = 4 as a function of H. Recall smaller values of Cm mean higher values of complexity. The magnitude of threat scenario complexity depends exclusively on the product of H and M. Therefore, these two features could offset each other if both parameters are changing in opposite directions within the same threat scenario. The condition required for complexity to remain constant following any changes to either H or M (or both) is simply that the product H × M remain constant. This implies that if complexity is to remain constant, H and M must vary inversely. As noted above, H can only vary between zero and one whereas M can assume any value. Therefore, large changes in M will typically have a disproportionate effect relative to H. However, if M is indeed large, and H varies even a small amount, the complexity will change significantly. For any threat scenario, as either the number of risk factors or the value of the information entropy increases, the ratio of probable to possible states will change. This ratio leads to the concept of relative complexity. The relative complexity, i.e., the density of probable threat scenario states, is defined as follows: Let H be the information entropy associated with a binary security risk management process for a given threat scenario, H0 is the maximum information entropy for

Fig. 9.9 Magnitude of complexity versus information entropy H, for M = 2 and M = 4
Fig. 9.10 Relative complexity for varying information entropy and number of risk factors
that process, i.e., H0 = 1, and M is the number of risk factors. The relative complexity of a threat scenario is as follows:

Relative Complexity = Cr = 2^(MH) / 2^(MH0) = 2^(M(H - 1))
(9.7)
Relative complexity enables comparisons of complexity across disparate or changing threat scenarios. A low value of relative complexity implies a small density of probable-to-possible states. Therefore, probable states are less dense, i.e., rarer, in low-Cr threat scenarios. Figure 9.10 illustrates the dramatic effect of increasing the number of risk factors on relative complexity. The density of probable states increases with increasing values of information entropy. Cr facilitates direct comparisons of complexity across threat scenarios irrespective of the particular values of H and M. For example, if M = 100 and H = 0, Cr equals 2^(-100), an infinitesimally small number (7.89 × 10^(-31)), so -log(Cr) = 100. Figure 9.10 plots relative complexity as a function of information entropy H, for three values of M. Figure 9.10 might be confusing because the plotted quantity, -log(Cr), is decreasing even though Cr itself is increasing for larger values of H. In fact, Cr increases precipitously as H increases for threat scenarios with a large number of risk factors. For example, we observe from Fig. 9.10 that -log(Cr) decreases from 100 to 90 when H changes from 0 to 0.1. The delta represents a huge increase in relative complexity since the scale is logarithmic. Furthermore, we observe that when M = 100 and H = 0, Cr is a minimum even though the number of possible states (2^100) is very high; only a single state is probable. When M = 100 and H = 1, the situation is reversed: the number of probable states equals the number of possible states, and Cr is a maximum (1). Finally, we see from Fig. 9.10 that the increase in relative complexity as a function of H is much steeper, i.e., is more sensitive to changes in entropy, for threat scenarios with a large number of risk factors.
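The -log(Cr) values read off Fig. 9.10 follow directly from (9.7). A minimal sketch (the function name is mine; logs are base 2, matching the bit-oriented convention used throughout):

```python
def neg_log2_Cr(M, H):
    """-log2 of relative complexity Cr = 2**(M*(H-1)); 0 means Cr = 1 (maximum)."""
    return -M * (H - 1.0)

print(neg_log2_Cr(100, 0.0))  # 100.0 → Cr = 2**-100 ≈ 7.9e-31 (minimum)
print(neg_log2_Cr(100, 0.1))  # 90.0  → a huge increase in Cr on this log scale
print(neg_log2_Cr(100, 1.0))  # 0.0   → Cr = 1 (probable = possible states)
```

Working with -log2(Cr) avoids underflow for large M and reproduces the 100 → 90 delta discussed above.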
Note once again that relative complexity actually increases with increasing H. This makes sense since the denominator of Cr reflects the total number of possible threat scenario states, i.e., a condition of maximum information entropy. As H approaches that maximum value, i.e., H equals 1, the number of probable states converges to the number of possible states irrespective of the number of risk factors. In general, and for small numbers of risk factors, the relative complexity does not vary significantly with changes in information entropy. However, a small change in the quality of security risk management as reflected in the information entropy has a big impact on complexity, and hence the likelihood component of risk, for threat scenarios with large numbers of risk factors.
9.7 Temporal Limits on Complexity
Assessing security risk often includes examining spatial and/or temporal characteristics of risk-relevant features of a threat scenario. In assessing threat scenario complexity, one feature of interest is the rate of change of information entropy. In other words, what is the limit on the rate of change of information entropy? This limit might be important in determining the required rate of change to relevant security controls. The magnitude of threat scenario complexity depends on the number of risk factors and/or the uncertainty in the application of security controls to risk factors. The latter is expressed in terms of the information entropy of the security risk management process. In general, the number of risk factors is relatively stable compared to other threat scenario features when measured over risk-relevant time scales. The required duration for a specific security control is determined by the interaction between threats and affected entities. Recall risk factors are present in only two of the three threat scenario elements: affected entities and the environment in which threats and those entities interact. Therefore, the interaction-time between threats and affected entities is particularly risk-relevant, and has implications to threat scenario complexity. In fact, it is obvious that threats are only risk-relevant during the time interval defined by threat-entity interactions. By extension, uncertainty in the application of security controls to risk factors is only risk-relevant during this time interval. Therefore, the information entropy associated with the security risk management process must remain stable, or at least not increase, for the duration of the threat-entity interaction time. However, additional temporal limitations apply to the information entropy associated with a security risk management process, which translates to limits on complexity. These limitations can be understood by returning to threat scenario fundamentals. 
Information entropy characterizes the uncertainty associated with the security risk management process, where we have assumed the process is stochastic. For simplicity, we have also assumed the process is binary, where risk factors are either managed by security controls or not. Therefore, the information entropy reflects the quality of security risk management for a given threat scenario. A change in the time rate of change of a risk factor potentially affects the information entropy of a security risk management process since such changes potentially beget uncertainty depending on the relative rate of change between
216
9 Threat Scenario Complexity
security controls and risk factors. Recall from Chap. 3 the temporal relationship between risk factors and security controls defines static and dynamic threat scenarios. Based on the above, we can write down a simple expression for information entropy stability, which has implications to threat scenario complexity. Here H, R, C and t are the information entropy of the security risk management process, risk factors, security controls and time respectively. Specifically, we write the following:

dH/dt = dC/dt - dR/dt
(9.8)
Expression (9.8) states that the time rate of change of information entropy equals the difference between the time rates of change of security controls and risk factors. Assuming the number of risk factors is constant over risk-relevant time scales, (9.8) specifies the condition for complexity stability. Namely, if dC/dt = dR/dt it means dH/dt = 0, i.e., H is constant. Furthermore, if the time rate of change of a security control is less than the time rate of change of the relevant risk factor, which defines a dynamic threat scenario, it means dH/dt < 0. This condition leads to alternate definitions of dynamic and static threat scenarios: The threat scenario is dynamic if the time rate of change of the information entropy is less than zero, and it is static if the time rate of change of the information entropy is greater than or equal to zero. Once again we see that the relationship between risk factors and security controls, and specifically their relative rates of change, is a risk-relevant feature of a threat scenario. The key takeaway is that a static assessment of security risk will not necessarily reveal all the risk-relevant features. Risk factors are the critical features of threat scenarios, and their dynamic properties are particularly relevant to the magnitude of complexity, and hence the likelihood component of risk.
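The classification implied by (9.8) can be sketched as a small helper. This is purely illustrative (function and parameter names are mine), assuming the rates of change are expressed in the same units:

```python
def scenario_type(dC_dt, dR_dt):
    """Classify a threat scenario via (9.8): dH/dt = dC/dt - dR/dt.
    Dynamic if dH/dt < 0 (controls lag risk factors), otherwise static."""
    dH_dt = dC_dt - dR_dt
    return "dynamic" if dH_dt < 0 else "static"

print(scenario_type(dC_dt=1.0, dR_dt=2.0))  # controls lag the risk factors: dynamic
print(scenario_type(dC_dt=2.0, dR_dt=2.0))  # controls keep pace: static
```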
9.8 Managing Threat Scenario Complexity
Although this book is definitely not a “how-to” book on security risk management, some discussion of managing complexity risk is warranted. To that end, this section provides high-level thoughts on this topic that are a consequence of the model of threat scenario complexity. The number of risk factors combined with the information entropy of the security risk management process determines threat scenario complexity. In view of this multi-factorial dependency, and assuming existing security controls are dedicated to other security risk management functions, a dedicated security control is likely warranted to address threat scenario complexity. A simple two-step process is suggested for managing threat scenario complexity. First, security resources dedicated to addressing complexity should be a priority for threat scenarios with numerous risk factors irrespective of the particular types of risk factors that are present. Second, specific tests would be used to estimate the
probability of managed versus unmanaged risk factors, and therefore the magnitude of information entropy, across the enterprise or some subset of the enterprise. Some thoughts on use of a specific test for this purpose are presented next. In the case of information security threat scenarios, which are often characterized by complexity due to a large number of risk factors, an organization might conduct a password cracking exercise pursuant to specifying a general figure for H, the information entropy of security risk management. Specifically, it might establish a threshold for the time required to conduct a successful brute force attack on some fraction of the password space. Assumptions would have to be made regarding the computational capability of an adversary. As always, an organization must also determine its tolerance for risk. Chap. 12 discusses password cracking in some detail and the implications to enterprise security risk management. Such a test would not necessarily reflect the quality of every security risk management program or process. However, it might be an indicative source of risk-relevant information. At a minimum it could represent one input to an overall information entropy figure. For example, the results of a password cracking exercise could yield a probability distribution of the number of passwords cracked in various time intervals. That probability distribution might specify the fraction of passwords cracked in 10 minutes, an hour, four hours, a day, a week, etc. If an acceptable threshold were set to be one day, the distribution would yield the fraction of passwords cracked within the threshold, which would contribute to the confidence in the overall quality of security risk management. Specifically, if 60% of the passwords were cracked in less than a day, this metric could be used to establish an overall information entropy figure, ideally in conjunction with other objective and subjective criteria. 
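A cracking-exercise result of this kind translates into an entropy estimate via the binary entropy function. A minimal sketch, using the hypothetical 60% figure as the estimated probability of an unmanaged risk factor (i.e., a crackable password):

```python
import math

def binary_entropy(p):
    """Shannon entropy of a binary source with outcome probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Hypothetical exercise result: 60% of passwords cracked within the one-day threshold.
p_unmanaged = 0.60
H = binary_entropy(p_unmanaged)
print(round(H, 2))  # ≈ 0.97: close to maximum uncertainty in risk management
```

Such a figure would ideally be combined with other objective and subjective inputs before being adopted as the scenario-wide H.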
The 60% figure by itself might admittedly be a simplistic if not overly pessimistic view of the security risk management landscape. However, one purpose of a model for complexity is to establish limits that could at least function as benchmarks for the application of security controls. Referring once again to Fig. 9.6, we see that for any binary process where p(managed) = 0.6 and p(unmanaged) = 0.4 (or vice versa), the information entropy is 0.97. A calculation of threat scenario complexity can result from this figure combined with an estimate of the number of risk factors. Since limiting the number of risk factors is presumably not feasible, the only option is to increase the probability that threat scenario risk factors are managed, thereby decreasing the security risk management information entropy.11 That recommendation might seem gratuitous, but it actually has important implications to the approach to security risk management. For example, it argues for not necessarily focusing on individual risk factors, but rather to reduce the overall
11. H could be minimized by either ensuring all risk factors were managed or ensuring all were unmanaged. Both conditions are equivalent from an information theoretic perspective. Recall it is the knowledge of the state of the risk factor in a threat scenario that affects information entropy and not the specific outcome or result. However, if one could ensure all risk factors were managed and this were known a priori, it would surely represent an ideal security risk management condition.
number, if possible. Moreover, if other processes or threat scenario features contribute to complexity, and these are already being managed by dedicated security controls, the complexity should be addressed separately.
9.9 Summary
Complexity is a risk factor for the likelihood component of risk, and it results from two threat scenario conditions: uncertainty in security risk management, i.e., the application of security controls to threat scenario risk factors, and the number of threat scenario risk factors. Information theory, and in particular information entropy, can be leveraged to construct a model for threat scenario complexity. Specifically, a binary process is assumed for security risk management, where threat scenario risk factors are either managed or not managed by security controls. Each of the two outcomes in the security risk management process has a probability. Furthermore, a threat scenario can be characterized as an ensemble of states, where each state consists of a series of managed and unmanaged risk factors. The information entropy of this binary security risk management process is a macroscopic parameter that can be used to characterize the uncertainty in the application of security controls to risk factors, and hence the uncertainty (or conversely, the predictability) that a threat scenario exists in a particular state. Information entropy is a maximum when the probabilities of the two security risk management process outcomes, i.e., managed or unmanaged risk factors, are equal. The number of probable states in a threat scenario modeled as a binary risk management process equals 2^(MH), where M is the number of risk factors and H is the information entropy of the security risk management process. If H = 1, i.e., maximum uncertainty in security risk management, the total number of possible threat scenario states is given by 2^M. In such a threat scenario, the number of probable states equals the number of possible states. The magnitude of threat scenario complexity Cm for a binary security risk management process results from calculating the probability that a threat scenario exists in a particular state.
It is therefore expressed as follows:

Cm = 2^(-MH)

A small change in the quality of security risk management has a big impact on complexity, and hence the likelihood component of risk, for threat scenarios with large numbers of risk factors.
The ratio of probable-to-possible states defines the relative complexity of a threat scenario. Relative complexity Cr is defined as follows, where H0 is the maximum information entropy, i.e., 1:

Cr = 2^(MH) / 2^(MH0) = 2^(M(H - 1))

Relative complexity is a useful metric for assessing the effect of a change in the number of risk factors, and/or changes in the quality of risk management across threat scenarios or within a changing threat scenario. A decrease in relative complexity will result from reducing the number of risk factors. Relative complexity increases with increasing information entropy, converging to one as H approaches its maximum value.
Chapter 10 Systemic Security Risk
10.1 Introduction
The principal steps in a security risk assessment are as follows:
1. Identify the three threat scenario elements.
2. Identify the threat scenario risk factors.
3. Estimate the effect of the risk factors on the components of risk.
4. Prioritize security controls pursuant to addressing the risk factors in accordance with the tolerance for risk.
5. Assess the residual risk at periodic intervals and adjust security controls accordingly.

Although the threat scenario details will vary from scenario-to-scenario, steps 1–5 are always applicable because of threat scenario equivalence and the universality of risk. However, even if appropriate security controls have been applied following the invocation of steps 1–5, a threat scenario could easily revert to its pre-assessment condition unless systemic security risk issues have been addressed. The organizational culture is often the root cause of systemic security risk. For example, security is typically juxtaposed with convenience, and the organizational culture dictates which of these is a higher priority. Unless proper security governance is enacted, tactical fixes will in general have a limited effect over the long term. Systemic issues reveal themselves via the approach to security risk management. For example, a dilatory approach to patching computers over a sustained period would be more indicative of a systemic risk management issue than a temporary lapse in attention. Ignoring published risk factors for malware et al. could have numerous underlying causes: a lack of organizational sophistication, limited IT resources, negligence, et al. Importantly, risk-relevant conditions are likely to exist elsewhere and are equally likely to continue unless the organizational culture changes.

© Springer Nature Switzerland AG 2019 C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_10
Note the deeper objective in identifying systemic issues is not the identification of specific risk factors, which is the focus of ordinary security risk assessments. Rather, the emphasis is on uncovering risk-relevant patterns in the spatial distribution and/or temporal history of risk factors. Patterns so identified are likely indicative of systemic flaws in security risk management. Furthermore, the contention is that simple metrics that either reveal unevenness in risk factor distribution or an anomalous time history presage risk, and point to inadequate security governance as a root cause. These metrics are macroscopic parameters, and are analogous to human vital signs, e.g., temperature, pulse, and blood pressure. Although not granular enough to pinpoint a specific illness, an abnormal vital sign suggests an underlying disease and warrants follow-up diagnostics. In the same vein, patterns revealed by the aforementioned metrics suggest gaps in security governance and provide cause for deeper analyses.
10.2 The Risk-Relevance of Assets and Time
Metrics that reveal systemic security risk management issues are in general based on spatio-temporal characteristics of risk factors. Recall from Chap. 6 that we introduced the concept of density, which is expressed as the ratio of two parameters. A lopsided ratio relating to the distribution of risk factors suggests a risk-relevant condition. In a similar vein, risk factors that are protracted in duration or recur intermittently are potentially risk-relevant.

The ratio of the number of risk factors to the number of security controls (R/C) is a risk-relevant parameter and is discussed in Chap. 12. Two possibilities exist if R/C > 1: there is at least one unaddressed risk factor or at least one security control is managing multiple risk factors. With respect to the former condition, the implication is obvious, and further investigation is required to determine why a threat scenario contains an unmanaged risk factor. Regarding the latter condition, and assuming all risk factors affect the magnitude of risk equally, the risk-relevance of a security control managing multiple risk factors is typically more significant than that of one managing a single risk factor. In either case, a value of R/C greater than one is potentially indicative of a risk-relevant condition.

The parameter R/C is clearly risk-relevant. However, by itself it supports few conclusions. Although a high value of R/C is a condition worth investigating and/or monitoring, the fact that risk factors are unaddressed or security controls are multiplexed might be the result of a conscious risk management decision. The resource implications of managing every risk factor could be significant. In other words, R/C has important tactical implications, but additional metrics are needed to determine if security governance is an underlying issue. Specifically, we require metrics that relate risk factors to things of value, i.e., assets. The ultimate objective of risk management is to protect assets. Therefore, the
spatial distribution and temporal history of risk factors with respect to assets is deeply risk-relevant. Moreover, it is an important input to the development of a security risk management strategy.

As noted previously, time can also affect the magnitude of threat scenario risk. Persistent risk-relevant conditions increase the opportunity for threat incidents to occur. The likelihood of loss can also be expressed in terms of the inverse of the time interval over which such incidents can occur. For example, it is clear from the return period discussion (see Fig. 4.3) that cumulative losses can be expected to be greater for longer time intervals. However, transient conditions can also affect the magnitude of risk, as we observed in the subway threat scenario. Obviously, a security risk management strategy that ignores risk-relevant temporal issues is sub-optimal. Risk factors that persist suggest systemic risk management issues resulting from neglect, a lack of sophistication, inattentiveness, etc.
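As a concrete illustration, the R/C screening described above can be reduced to a few lines of code. This is a minimal sketch with hypothetical counts; the function name and threshold logic are illustrative only:

```python
def risk_to_control_ratio(num_risk_factors: int, num_controls: int) -> float:
    """Ratio of risk factors to security controls (R/C)."""
    if num_controls == 0:
        return float("inf")  # every risk factor is unaddressed
    return num_risk_factors / num_controls

# Hypothetical threat scenario: 12 risk factors, 8 security controls.
ratio = risk_to_control_ratio(12, 8)
if ratio > 1:
    # Either at least one risk factor is unmanaged or at least one
    # control is multiplexed across several risk factors.
    print(f"R/C = {ratio:.2f}: potentially risk-relevant; investigate")
```

As the text cautions, R/C > 1 is a tactical flag rather than proof of a governance failure; the condition might reflect a conscious risk management decision.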
10.3 Spatial Distribution of Risk Factors: Concentration and Proliferation
10.3.1 Concentration

In the previous section we noted that the ratio of the number of risk factors to security controls is a risk-relevant parameter. If the number of risk factors exceeds the number of security controls, i.e., R/C > 1, either one security control is addressing multiple risk factors or at least one risk factor is unaddressed. The magnitude of R/C is indicative of a managed threat scenario condition. Security controls are applied after an assessment of risk has been completed.

The link between risk factors and things requiring protection is a more organic condition than the connection between risk factors and security controls. Identifying assets, i.e., affected entities, is one of the incipient efforts in assessing security risk and precedes the application of security controls. We therefore seek a metric revealing risk-relevant information about the drivers of risk rather than the response to such drivers.

Ultimately, there must be a relationship between security controls and things requiring protection. As noted above, we refer to these things as “assets,” where any asset is prima facie risk-relevant. A security risk assessment begins with the identification of threats and risk factors relative to the loss of or damage to assets. The application of security controls is linked to, and naturally follows from, asset and risk factor identification.

A metric that is potentially indicative of systemic risk is the ratio of the number of risk factors R to the number of assets X, as measured across a threat scenario. Note that R/X is a density, which is indicative of the concentration of risk factors within
and/or among assets. The units for such a metric might be risk factors-per-server, risk factors-per-application, risk factors-per-facility, etc. depending on the threat scenario. Moreover, assets riddled with risk factors potentially increase all three components of risk. A high concentration of risk factors within high-value assets in particular would be indicative of systemic risk. Note that since assets represent things of value, the concentration metric can also be thought of as a concentration of vulnerability. Finally, the concentration metric could also be used to measure risk dilution, i.e., an intentional decrease in the density of assets to reduce the impact component of risk.
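The concentration metric R/X can be sketched as follows. The asset names and per-asset risk factor counts are hypothetical:

```python
def concentration(risk_factor_counts: dict) -> float:
    """Concentration metric R/X: total risk factors per asset."""
    total_risk_factors = sum(risk_factor_counts.values())
    return total_risk_factors / len(risk_factor_counts)

# Hypothetical servers and their open risk factor counts.
servers = {"web-01": 9, "db-01": 2, "file-01": 1}
print(f"{concentration(servers):.1f} risk factors per server")
```

A lopsided per-asset distribution, such as web-01 above, would warrant deeper analysis, particularly if the asset in question is high-value.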
10.3.2 Proliferation

Proliferation is another condition that is potentially indicative of systemic risk and relates to the spatial distribution of risk factors. Proliferation refers to the sprawl of risk factors across a threat scenario. In this case, the metric is represented as a product rather than a ratio, i.e., R × X, with units such as server-vulnerabilities. The astute reader might point out that concentration appears to be the polar opposite of proliferation; the former reflects risk factor density and the latter connotes diffuseness. However, both conditions are risk-relevant. Whether risk factors are spread across many assets or the same number is concentrated in a single asset, it can be cause for alarm. As always, context is everything in assessing the magnitude of risk.

Moreover, multiple risk conditions might be described by the same proliferation metric. For example, if 1000 risk factors are distributed across 10 assets, the proliferation metric is specified as 10,000 risk factor-assets. However, 100 risk factors distributed across 100 assets also yields 10,000 risk factor-assets. Of course, the value of the assets in question is also highly risk-relevant.
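The degeneracy of the proliferation metric noted above, where distinct distributions yield the same value, is easy to confirm. A minimal sketch:

```python
def proliferation(num_risk_factors: int, num_assets: int) -> int:
    """Proliferation metric R x X, in units of risk factor-assets."""
    return num_risk_factors * num_assets

# Two hypothetical scenarios with the same metric value:
assert proliferation(1000, 10) == proliferation(100, 100) == 10_000
```

Interpreting the value therefore requires context, including the value of the assets involved.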
10.4 Temporal History of Risk Factors: Persistence, Transience and Trending
The temporal history of risk factors can also be risk-relevant. For example, risk factors that persist in an environment are more likely to be exploited simply because the opportunities for exploitation tend to increase with time. Note that time by itself is not a risk factor. The effect of time can definitely increase the magnitude of risk, but it must act in conjunction with other risk factors. For example, time affects the likelihood component of risk in the subway threat scenario but only when a slippery platform or other risk-relevant feature facilitates
falling onto the track. Time is an adversary to an entity who falls onto the track during a specific time interval but it can be an ally otherwise. Time can also affect the magnitude of the vulnerability component of risk. Recall the effect of the impulse in explosive threats. In explosive threat scenarios, the time of interaction of the explosive shock wave with a building structure affects the magnitude of structural damage.
10.4.1 Persistence

If risk factors persist without intervention it is often indicative of indifference or unawareness. Moreover, a dilatory approach to addressing the most risk-relevant features of a threat scenario is probably indicative of systemic security risk management issues. Conversely, efficient (and effective) security risk management processes suggest a strategic, i.e., risk-based, approach. A metric that succinctly characterizes persistence is R × Δt, where R is the number of risk factors and Δt is the average time interval that risk factors remain unaddressed. Numerous risk factors that remain unaddressed for a protracted period are indicative of a pattern of inaction in applying security controls.

As we observed with spatial distribution metrics, temporal metrics that present the same value are subject to different interpretations. For example, a metric corresponding to a single risk factor that persists for 30 days is equivalent in form to one that describes 30 risk factors persisting for only 1 day. Each is specified as 30 risk factor-days.

In addition, not all risk factors are created equal. Arguably, a risk factor such as a high-value asset exposed to the Internet would be more risk-relevant than 30 instances of inadequate Windows passwords. Of course, those passwords might be protecting file shares storing sensitive information as discussed below. It is impossible to determine the magnitude of relative risk without additional details. As always, such a determination depends on the specific threat scenario, i.e., the context.

Consider an information technology scenario in which a shared drive storing highly sensitive company information is accessible from the Internet. Such access is a significant risk factor for the threat of information compromise. Suppose in one case one drive has been exposed for 30 days and in another case 30 such drives were Internet-accessible for only 1 day.
Both threat scenarios point to systemic security risk management issues. A lengthy Internet exposure suggests a lax approach to assessing and managing risk. However, a large number of assets with such exposure, even briefly, is also indicative of systemic risk management issues. Again, understanding the context is critical to assessing security risk, but the metric is risk-relevant in either case. Numerous risk factors that persist within applications and servers until patches can be applied are a common theme in IT environments. Commercial network scanning tools identify such risk factors, which are designated as “vulnerabilities.”
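The shared-drive comparison above can be expressed with the persistence metric. A minimal sketch using the hypothetical figures from the text:

```python
def persistence(num_risk_factors: int, avg_days_unaddressed: float) -> float:
    """Persistence metric R x Δt, in units of risk factor-days."""
    return num_risk_factors * avg_days_unaddressed

one_drive_long = persistence(1, 30)     # one drive exposed for 30 days
many_drives_brief = persistence(30, 1)  # 30 drives exposed for 1 day
assert one_drive_long == many_drives_brief == 30
```

Both cases yield 30 risk factor-days yet point to different systemic issues, which is why the text stresses context.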
Vulnerabilities are ranked according to a common framework for evaluating risk, and these results drive patching efforts.1 Note that the absolute number of risk factors by itself is not necessarily relevant. However, the number of risk factors in conjunction with time is highly risk-relevant. For example, a network scanning application often reveals thousands of the aforementioned vulnerabilities. If many vulnerabilities persist and/or the number increases over time, it is at least indicative of the efficiency of the overall security risk management process. The point is that the number of risk factors in combination with the average time they linger in the environment has more risk-relevant implications than does the absolute number of vulnerabilities as measured at any instant in time.
10.4.2 Transience

Risk factors of short duration can also increase the magnitude of risk, and particularly the likelihood component. Perhaps it is counterintuitive that a risk factor abbreviated in time could increase the magnitude of threat scenario risk. The risk increases because of the uncertainty in applying security controls. Ephemeral but recurring risk factors are especially risk-relevant relative to the duration of security controls. However, the duration of the risk factor by itself is not particularly risk-relevant in this context, noting it was quite risk-relevant to persistence. It is the ratio of risk factor duration to security control duration that is measured by transience. We can therefore write the following expression for the transience risk metric, where ΔC is security control duration and dr is risk factor duration:

Transience = dr/ΔC

Importantly, we see that if the risk factor duration decreases, the security control duration must increase in proportion in order to keep the transience metric constant. Note that the risk increases for decreasing values of ΔC. Specifically, the risk associated with short duration risk factors relates to the increased likelihood of a misapplication or non-application of security controls. A short duration risk factor is more difficult to pinpoint. We would, perhaps, be more precise if we designated dr and ΔC as the uncertainty in risk factor and security control duration respectively, noting that uncertainty and risk factor duration are inversely proportional.
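The transience metric can be sketched as follows; the durations are hypothetical:

```python
def transience(dr: float, delta_c: float) -> float:
    """Transience metric: risk factor duration dr over control duration ΔC."""
    return dr / delta_c

# Per the text, the metric (and hence the risk) increases as ΔC decreases.
assert transience(2.0, 5.0) < transience(2.0, 1.0)
print(transience(2.0, 5.0))  # 0.4
```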
1 The Common Vulnerability Scoring System (CVSS) is an open industry standard for assessing the severity of information security risk factors or equivalently, vulnerabilities.
10.4.3 Trending

A change in the direction or magnitude of a risk-relevant parameter such as the number of threat incidents can relate to the effectiveness of security risk management efforts, and is therefore potentially indicative of an underlying security governance issue. A trend can convey significant information regarding the time rate of change of risk in a threat scenario. As with persistence and transience, the notion of time is inherent to trending.

The direction of a trend of a risk-relevant parameter tells a story about the threat scenario. An upward trend in threat incidents could be a sign of inattentive or even non-existent security governance. For example, in information security threat scenarios, if the number of published IT vulnerabilities steadily increased over a 6-month period it might signify dilatory patching efforts. At a minimum, such an increase indicates that the rate of adding new vulnerabilities exceeds the rate of fixing them. Note that month-to-month fluctuations in risk factors are not necessarily indicative of a security governance issue; a sustained build-up would be more significant. In particular, a sharp upward trend of risk factors that continues unabated for an extended period likely points to an underlying security risk management issue. Note also that if a security control resulted in a trend reversal it would tend to confirm the efficacy of the strategy used to address the relevant risk factor.

In addition, a trend line is a graphic representation of data that can be readily appreciated by non-security audiences. A trend line pointing up or down might obviate the need for a lengthy explanation regarding the magnitude and direction of risk or the effect of security controls. Communicating risk-relevant conditions can be a challenge but is essential in day-to-day security risk management. Therefore, metrics that assist in this endeavor are generally beneficial.
Comparing two trend lines can yield insight into the factors influencing a threat scenario, and can be particularly germane to revealing root causes of security risk. For example, a positive correlation coefficient between two time series of parameters would suggest some operational connection. Equally, a lack of correlation can yield risk-relevant insights. Recall the correlation coefficient calculated for the number of terrorism threat incidents relative to the number of days in the month as discussed in Chap. 6.

Trending metrics can assume multiple forms since numerous trends are risk-relevant. We know that changes to both risk factors and threat incidents are risk-relevant, where the latter are manifestations of the former. The expressions for some of the significant trending metrics are as follows, where ΔR′ is the change in the magnitude of a risk factor, ΔR is the change in the number of risk factors, ΔI is the change in the number of threat incidents, and t is time:

Trending Metrics = ΔR′/t, ΔR/t and ΔI/t
Fig. 10.1 Increasing trend in the number of risk factors
A feature related to trending is the slope or rate of change of a trend line. The rate of change is expressed as the change in the dependent variable divided by the change in the independent variable. If the trend is linear, the slope of the line is constant. Figure 10.1 illustrates a remarkably linear upward trend in the number of risk factors as a function of time. The independent variable is time and the dependent variable is the number of risk factors. The slope in Fig. 10.1 is one since the change in the number of risk factors corresponds to an equivalent change in time.

Finally, recall from Chap. 8 that indirect assessments of the likelihood component of risk are based on either the number of risk factor-related incidents or a change in the magnitude of a risk factor. In contrast, direct assessments are based on actual threat incident statistics. Therefore, the metrics ΔR′/t and ΔR/t are indirect assessment metrics, and ΔI/t is a metric for a direct assessment of the likelihood component of risk.
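The slope of a trend line such as the one in Fig. 10.1 can be recovered with an ordinary least-squares fit. The monthly counts below are hypothetical and chosen to mirror the unit slope of the figure:

```python
def slope(times, counts):
    """Least-squares slope of counts vs. times (rate of change)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    numerator = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator

months = [1, 2, 3, 4, 5, 6]
risk_factors = [10, 11, 12, 13, 14, 15]  # perfectly linear growth
print(slope(months, risk_factors))       # 1.0
```

A sustained positive slope suggests the rate of adding risk factors exceeds the rate of remediating them.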
10.5 Summary
Security risk assessments identify threat scenario risk factors and determine their effect on the magnitude of risk. Appropriate security controls are applied in proportion to the magnitude of the assessed risk and in accordance with an organization’s tolerance for risk. In addition to addressing threat scenario risk factors and applying near-term remediation, a strategic security risk assessment focuses on systemic security risk issues. These issues are exemplified by the spatial distribution and temporal history of risk factors. Systemic issues are indicative of the approach to security risk management and often point to the organizational culture as a root cause.
Five spatio-temporal metrics suggest patterns indicative of systemic risk management issues: (1) the concentration of risk factors relative to threat scenario assets, (2) the proliferation of risk factors relative to threat scenario assets, (3) risk factor persistence, (4) risk factor transience and (5) risk factor trending. The metrics are specified as follows, where R is the number of risk factors, ΔR is the change in the number of risk factors, ΔR′ is the change in the magnitude of a risk factor, t is continuous time, dr is risk factor duration, ΔC is security control duration, ΔI is the change in the number of threat incidents and X is the number of assets:

1. Concentration: R/X
2. Proliferation: R × X, with units of risk factor-assets
3. Persistence: R × Δt, with units of risk factor-time
4. Transience: dr/ΔC
5. Trending: ΔR/t, ΔR′/t and ΔI/t
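The five metrics above can be consolidated into a single sketch. All inputs are hypothetical counts and durations, and the function is illustrative rather than a prescribed tool:

```python
def systemic_risk_metrics(R, X, avg_dt, dr, dC, dR, dR_prime, dI, t):
    """Compute the five spatio-temporal metrics from this chapter."""
    return {
        "concentration": R / X,                      # R/X
        "proliferation": R * X,                      # R x X
        "persistence": R * avg_dt,                   # R x Δt
        "transience": dr / dC,                       # dr/ΔC
        "trending": (dR / t, dR_prime / t, dI / t),  # ΔR/t, ΔR'/t, ΔI/t
    }

m = systemic_risk_metrics(R=100, X=20, avg_dt=14, dr=2, dC=10,
                          dR=30, dR_prime=5, dI=6, t=6)
print(m["concentration"])  # 5.0 risk factors per asset
```

As emphasized throughout the chapter, identical metric values can arise from very different underlying conditions, so each value is a prompt for investigation, not a verdict.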
Chapter 11
General Theoretical Results
11.1 Introduction
Organizing the key theoretical results according to general categories is a useful means of presenting the theory. It is also helpful in translating theory into practice. We should never lose sight of the fact that the point of the theory is to promote accuracy in assessing security risk and increase the effectiveness of security risk management.

The theoretical results highlighted in this chapter cannot necessarily be proven in the mathematical sense. Some of these are observations that either follow directly or are easily deduced from the core principles specified in the next section. Moreover, some of the results have been discussed previously but are reiterated here so that they can be part of a compendium. Metrics or thresholds are also sometimes specified, and in some cases these can be incorporated into security risk assessment standards.

Perhaps most importantly, a set of core theoretical principles is articulated. These principles represent the crux of the theory, and ultimately provide the predicate for all security risk management efforts. The results are organized according to the threat scenario categories presented in Chap. 3, with the exception of the core principles. The latter represent the backbone of the theory and therefore deserve their own section. This organizational scheme was adopted so that themes common to each threat scenario category can be readily identified.
11.2 Core Principles
Twelve core principles represent the crux of the theory of security risk assessment. Every aspect of the theory derives from, or is somehow linked to, one or more core principles. As noted in the Introduction, the theory of security risk assessment evolves from a canonical threat scenario structure and the universal relationship between its elements. These features are captured in the first two principles, which establish a common frame of reference for assessing security risk and also lead to Principle Three. This frame of reference has profound theoretical and practical consequences since it facilitates generalizations about the magnitude of risk and the prioritization of security controls.

Principle One: Threat Scenario Equivalence
All threat scenarios are comprised of three elements: threats, entities affected by threats and the environment in which threats and affected entities interact.

Principle Two: The Universality of Risk
A threat scenario feature called “Risk” describes the relationship between threats and entities affected by threats within the context of a threat scenario. Risk always consists of three components: likelihood, vulnerability and impact, where the threat scenario risk factors for each component increase the magnitude of risk.

A natural consequence of Principles One and Two is Principle Three, which has profound theoretical and practical implications.

Principle Three: The Risk Assessment Process
The process of assessing risk is always the same irrespective of the threat scenario details.

Principle Four: Threat Incident Origination
Threat incidents originate from threat scenarios, and relate to or are manifestations of one or more threat scenario risk factors.
Principle Five: Sources of Uncertainty for the Likelihood Component of Risk
The uncertainty associated with the likelihood component of risk derives from two sources, which result in the two assessment methods specified in Principle Six:
• The dispersion associated with a probability distribution of historical threat incidents.
• The contribution of the individual likelihood risk factors to the magnitude of risk in the absence of threat incidents.
11.2
Core Principles
233
Principle Six: Assessing the Likelihood Component of Risk
Assessing the likelihood component of risk is based on two distinct methods, which relate to the sources of uncertainty noted in Principle Five:
• The probability of a future type of threat incident, which is based on a probability distribution of historical threat incidents.
• The potential for a threat incident, which is based on the statistics of risk factor-related incidents or a change in the magnitude of one or more risk factors.

Principle Seven: Residual Risk
Residual risk is the difference between the magnitude of the assessed and actual threat scenario risk following the application of security controls. The residual risk is never zero in any realistic threat scenario.

Principle Eight: Threat Incident Similarity
All threat scenarios are structurally equivalent (per Principle One) but are not congruent unless they have identical risk factors. Threat incidents are similar, and therefore comparable, if they are the product of congruent threat scenarios.

Principle Nine: Maximum Risk Conditions
There are two maximum risk conditions assuming all risk factors are equal in magnitude: (1) the risk factors for all three components of risk and their relevant security controls are not coincident, and (2) the likelihood risk factors are coincident, i.e., a confluence condition.

Principle Ten: The Effect of Multiple Risk Factors
The likelihood component of risk increases exponentially with an increasing number of likelihood risk factors. In contrast, the effect of multiple vulnerability and impact risk factors on the magnitude of risk is in general additive.
Principle Eleven: Threat Scenario Complexity
Two threat scenario features determine the magnitude of complexity:
• The number of risk factors
• The information entropy of the security risk management process, i.e., the uncertainty in applying security controls to risk factors

Principle Twelve: Systemic Security Risk
The spatial distribution and temporal history of risk factors suggest the presence of systemic security risk management issues. The organizational culture is often a root cause of systemic security risk.
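Principle Ten can be illustrated numerically under simplifying assumptions: suppose each of n likelihood risk factors multiplies the likelihood by a common factor, while each vulnerability or impact risk factor adds a fixed increment. The specific values below are hypothetical:

```python
def likelihood_effect(base: float, factor: float, n: int) -> float:
    """Exponential growth: n likelihood risk factors act multiplicatively."""
    return base * factor ** n

def additive_effect(base: float, increment: float, n: int) -> float:
    """Linear growth: n vulnerability/impact risk factors act additively."""
    return base + increment * n

# Five risk factors: exponential vs. additive scaling.
print(likelihood_effect(0.01, 2.0, 5))  # grows as 2**n
print(additive_effect(1.0, 0.5, 5))     # grows linearly
```

The contrast makes the operational point: adding likelihood risk factors compounds, so confluence conditions deserve disproportionate attention.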
11.3 Random Threat Scenario Results
If a threat incident is a random variable, the threat scenario from which it originates can be categorized as random. An assumption of randomness offers a mathematically convenient path to estimating the likelihood component of risk since the laws of probability now apply. However, and as noted in Chap. 8, true randomness is not so common in this context. It would be unusual to identify a threat incident that qualifies as a random variable. Most threat scenarios are biased in some way, and therefore resulting threat incidents have an underlying cause.

Nevertheless, assumptions of randomness enable simplifications and thereby overcome the statistical handicap of a small incident sample space. The results, with appropriate caveats, can be used to assess worst-case scenarios or at least confirm our intuition regarding the upper bounds of threat scenario risk. The results derived from an assumption of randomness clearly require an understanding of their limitations.

The first result associated with random threat scenarios is as follows:

(a) Random threat scenarios are those where a threat incident is a random variable.

Two threat scenario conditions can yield random threat scenarios. The first is where threat incidents occur spontaneously, i.e., without influence or prompting. The second is where the net effect of multiple risk factors is incoherence over risk-relevant time scales.

An important operational implication of a probability distribution of threat incidents is in specifying the uncertainty in the mean of that distribution. That uncertainty, i.e., the standard deviation and variance, affects the precision in estimates of probability. This condition is perhaps counterintuitive since precision and uncertainty might seem antithetical.
The following statement specifies the magnitude of uncertainty that results from numerous samples of a random variable according to the Central Limit Theorem:

(b) The uncertainty in the mean of a probability distribution of incidents can be quantified in accordance with the properties of the particular distribution. Specifically, the uncertainty, i.e., the standard deviation about the mean, of a normal distribution of N incidents is proportional to √N. Moreover, the standard deviation of the mean of a sample of N threat incidents selected from a parent distribution equals the standard deviation of the parent divided by √N.
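Result (b) can be checked numerically. The sketch below draws many samples of N hypothetical incident values from a normal parent distribution and compares the spread of the sample means with the predicted σ/√N:

```python
import random
import statistics

random.seed(7)        # reproducible illustration
parent_sigma = 4.0    # hypothetical parent standard deviation
N = 100               # incidents per sample

# Standard deviation of 5000 sample means, each over N draws.
sample_means = [
    statistics.fmean(random.gauss(50.0, parent_sigma) for _ in range(N))
    for _ in range(5000)
]
observed = statistics.stdev(sample_means)
predicted = parent_sigma / N ** 0.5  # 4.0 / 10 = 0.4
print(observed, predicted)
```

The observed spread closely matches the predicted value of 0.4, consistent with the Central Limit Theorem.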
11.4 Static and Dynamic Threat Scenario Results
Static and dynamic threat scenarios are categorized according to the relationship between the time and spatial rates of change of security controls relative to changes in risk factors.
The presence of static and dynamic risk factors is the criterion for static and dynamic threat scenarios, respectively, and these are defined as follows:

(a) A static threat scenario is characterized by the following two conditions, where C is a security control and R is a risk factor:

dC/dt ≥ dR/dt and dC/dx ≥ dR/dx

In other words, in a static threat scenario, security controls change at least as frequently as do temporal and spatial changes to the relevant risk factors.

(b) A dynamic threat scenario is defined as follows:

dC/dt < dR/dt or dC/dx < dR/dx

In a dynamic threat scenario security controls lag the relevant risk factors in time or space. This condition has implications for the magnitude of risk during the time interval or position over which a security control lags changes in the relevant risk factor.

Risk factor stability is significant to the theory of security risk assessment. The status of the risk factors relative to security controls determines the magnitude of threat scenario risk. Therefore, and simply put, if a risk factor varies “too much,” the threat scenario containing that risk factor has morphed beyond recognition. When this happens, threat incidents resulting from threat scenarios preceding such variations cannot be compared to threat incidents that occurred post-variation. This condition has important ramifications for the likelihood component of risk, and is encapsulated in the following result:

(c) Quantitative estimates of the likelihood component of risk, i.e., a calculation of probability, require that the historical risk factors remain stable over risk-relevant time scales. Furthermore, risk factor instability mandates indirect assessments of the likelihood component of threat scenario risk, i.e., an estimate of potential.

Threat scenarios can, of course, be influenced by multiple risk factors, and each risk factor can affect more than one component of risk. The presence of multiple
likelihood risk factors in particular has a multiplicative effect on the magnitude of threat scenario risk. This condition leads to the following result: (d) The presence of contemporaneous likelihood risk factors, i.e., confluence, exponentially increases the magnitude of the likelihood component of risk. Vulnerability and impact risk factors do not contribute to the overall magnitude of threat scenario risk in the same way as the likelihood risk factors. It makes no difference if these risk factors are present in parallel or in sequence although their effect in both instances is cumulative. This distinction is significant, and leads to the following result: (e) The effect of multiple risk factors for the vulnerability and impact components of risk on the overall magnitude of threat scenario risk is additive. We have noted many times that the precise scaling of the components of risk cannot be determined from The Fundamental Expression of Security Risk alone. More information is required regarding the behavior of each component of risk, and specifically their respective risk factors. The alignment of risk factors as well as the alignment of risk factors relative to the relevant security controls in a given threat scenario will affect the magnitude of threat scenario risk. The risk factor-risk factor and risk factor-security control alignments lead to the following result, which has significant operational implications: (f) Assuming the individual risk factors have equal magnitude, the overall magnitude of risk is a maximum if (1) the likelihood risk factors are coincident or (2) likelihood, vulnerability or impact risk factors and relevant security controls are not coincident. The absence of a security control is sometimes considered a threat scenario risk factor. This situation occurs frequently when performing information security risk assessments, where security controls are baked into IT environments. 
In general, the absence of a security control is not a threat scenario risk factor. Security controls by definition address risk factors. However, an exception to the rule occurs when a risk factor affects all three components of risk, and a particular security control is managing that risk factor. An example of such a threat scenario is the threat of electrical shock from an electrical appliance. A control to prevent shocks is electrical grounding, which ensures devices are at zero electrical potential with respect to ground. An electrical device is functionally worthless if it is not suitably grounded. Specifically, the absence of electrical grounding increases the likelihood, vulnerability and impact components of risk for electrical shock threat scenarios. In other words, grounding constitutes an existential feature of an electrical device since it would not be operationally viable otherwise. Therefore, the absence of grounding is a risk factor for electric shock threat scenarios. A firewall would arguably qualify as a security control whose absence would be a risk factor for information security threat scenarios. It is difficult to envision an IT
network with connectivity to the Internet that could function in the absence of a device that segregates internal and external network environments. The following statement reflects the special status reserved for security controls that simultaneously affect the magnitude of multiple components of security risk. (g) The absence of a security control is a threat scenario risk factor if and only if it simultaneously affects the impact, likelihood and vulnerability components of risk.
11.5 Complex Threat Scenario Results
The following conditions follow immediately from the discussion of threat scenario complexity in Chap. 9, and in particular the contribution of specific threat scenario features to the magnitude of complexity:
(a) The number of risk factors and uncertainty in the security risk management process drive the magnitude of threat scenario complexity.
This simple statement is the most important take-away from Chap. 9, and is key to developing a security risk management strategy that addresses complexity. Complexity increases exponentially with the product of the number of risk factors and the information entropy of the security risk management process. The latter characterizes the uncertainty in the application of security controls to risk factors. The information entropy of a binary security risk management process only varies between zero and one. In contrast, the number of potential risk factors is unlimited. The information entropy has a more profound effect on the magnitude of complexity if the threat scenario contains numerous risk factors. Therefore, determining and managing the risk factors should be the principal focus in addressing threat scenario complexity. This leads to the following general statement on the effect of the number of risk factors on the magnitude of threat scenario complexity:
(b) The total number of risk factors has a disproportionate effect on the magnitude of threat scenario complexity.
The model of complexity specified in Chap. 9 leads to other general results. The exponent containing the product of the number of risk factors (M) and the security risk management information entropy (H), i.e., M × H, determines the number of threat scenario states, and leads to the following statement:
(c) An increase in information entropy of the security risk management process must be offset by a proportionate decrease in the number of risk factors in order for the magnitude of threat scenario complexity to remain constant.
In practice, reducing the number of risk factors might not be feasible. In that case, efforts must focus on reducing the uncertainty in the security risk management process.
11 General Theoretical Results
Recall from Chap. 9 that security risk management was modeled as a binary process. Specifically, risk factors exist in either a managed or unmanaged condition, and a state of a threat scenario consists of a series of risk factors in one or the other condition. A threat scenario is modeled as an ensemble of these states. This simplification led to a characterization of security risk management in terms of the information entropy associated with the security risk management process. The following is a direct result of a binary model of threat scenario risk management:
(d) For a fixed number of risk factors, complexity is a maximum when the probability of a managed risk factor equals the probability of an unmanaged risk factor, i.e., both equal 0.5, and the information entropy H is 1.
The time rate of change of the information entropy of the security risk management process relative to the time rates of change of the risk factors and security controls has risk-relevant implications. This simple observation leads to the final result pertaining to complex threat scenarios:
(e) The time evolution of information entropy is governed by the following expression when the number of risk factors remains constant: dH/dt = dR/dt − dC/dt
The above expression states that the rate of change of the information entropy associated with a security risk management process equals the difference between the time rate of change of a risk factor and that of the relevant security control. The condition for threat scenario stability is dR/dt = dC/dt. This condition implies dH/dt = 0, which means H is constant. Note that the information entropy of security risk management in a dynamic threat scenario can never be stable since in a dynamic threat scenario dR/dt is always greater than dC/dt. Therefore, information entropy stability is another characteristic of static threat scenarios.
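The binary (Shannon) entropy underlying results (d) and (e) can be sketched in a few lines; the function itself is standard information theory rather than anything specific to this text:

```python
import math

def binary_entropy(p):
    """Shannon entropy (in bits) of a binary managed/unmanaged process,
    where p is the probability that a given risk factor is managed."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # no uncertainty at the extremes
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

# H peaks at 1 bit when managed and unmanaged states are equally likely:
print(binary_entropy(0.5))            # 1.0
print(round(binary_entropy(0.7), 3))  # 0.881
```

The maximum at p = 0.5 is exactly result (d): when a risk factor is as likely to be unmanaged as managed, uncertainty about the threat scenario state, and hence complexity, is greatest.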
A metric for threat scenario complexity evolves from the model based on the information entropy of an ensemble of managed and unmanaged states as follows:
(f) The magnitude of threat scenario complexity is inversely proportional to the number of probable threat scenario states. It can therefore be measured by calculating 2^(−MH), where M is the total number of risk factors and H is the information entropy of the security risk management process.
The relative complexity specifies the ratio of probable to possible threat scenario states. Relative complexity increases with increasing H and converges to one as the value of H approaches 1, i.e., maximum uncertainty.
11.6 Summary
The theory of security risk assessment includes twelve core principles that represent its conceptual foundation. Most, if not all, security risk management efforts ultimately relate to these core principles. Additional results can be specified and grouped according to the threat scenario categories. These results also have practical implications and can sometimes be incorporated into security-related policies and standards. The latter are required for rigorous assessments of security risk, a topic discussed in the next and final chapter.
Chapter 12
The Theory, in Practice

12.1 Introduction
Synthesizing the preceding eleven chapters yields a conceptual framework for the theory of security risk assessment. Before reviewing that framework let’s briefly revisit its benefits and the advantages of a theoretical foundation more generally. A theoretical foundation for security risk assessment establishes a common frame of reference for comparing disparate threat scenarios. Critically, a risk-based assessment process evolves from this frame of reference. A truly risk-based process enables the prioritization of security controls, and thereby more consistently results in effective and cost-effective strategies. Consistency, accuracy and efficiency are the ultimate payoffs in applying rigorous assessment methods. Of course, any single assessment of risk might yield accurate results without such rigor. Many threat scenarios are not complex, i.e., they have few risk factors, and options for security risk management are limited. Risk-based approaches become increasingly important over time and with an increasing number of risk factors. In addition, the competition for security resources mandates rigor in comparing threat scenarios so that security controls are applied to legitimate priorities. Ultimately, a rigorous assessment methodology reduces the dependence on luck. Appropriate security controls are those that address the relevant risk factors and are proportionate to the magnitude of threat scenario risk. The inherently defensive nature of security risk management, the absence of a statistically significant number of similar threat incidents and the inability to confirm the effectiveness of security controls via experiment are what drive the requirement for a rigorous assessment methodology. The Fundamental Expression of Security Risk is the conceptual starting point of any security risk assessment. However, although this expression is indeed fundamental, it turns out to be incomplete. For example, in any real threat scenario,
© Springer Nature Switzerland AG 2019 C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7_12
security controls exist to reduce the effect of risk factors. Therefore, the effects of security controls must be accounted for in any operationally useful representation of the magnitude of threat scenario risk. We have discovered other threat scenario features that affect the magnitude of threat scenario risk but were not included in the Fundamental Expression introduced in Chap. 1. Such features include threat scenario complexity, the confluence of likelihood risk factors and risk factor-security control asynchrony. This chapter presents a revised expression for threat scenario risk that is a direct outgrowth of this more expansive view. A security risk management process derived from the theory is also presented. As noted above, such a process facilitates the prioritization of security controls, the principal objective of any security risk management strategy.
12.2 The Security Risk Management Process
Security risk management is the prioritized application of security controls to the relevant threat scenario risk factors identified in an assessment. Therefore, the reader will notice a striking resemblance between the security risk management process presented next and the theory of security risk assessment described in the previous eleven chapters. This condition is no coincidence. In fact, it would be quite strange otherwise since theory and practice must be closely related if theory has anything to do with reality. Furthermore, the principles of threat scenario equivalence and the universality of risk ensure that a security risk management process is applicable to any threat scenario. The following sub-sections describe the security risk management process and its underlying logic. (a) Threat Scenarios Revisited The context for security risk assessments is a threat scenario. Every threat scenario consists of three elements: threats, entities affected by threats and the environment where threats and entities interact. Threats are the progenitors of risk, which consists of three components: likelihood, vulnerability and impact. The risk factors associated with each component characterize the relationship between threat scenario elements and drive the requirement for security controls. (b) Assessing Threat Scenario Risk Factors Identifying and assessing the contribution of threat scenario risk factors to the magnitude of threat scenario risk drive the requirements of a security risk management strategy. There is no theoretical limit to the number of risk factors within a threat scenario. However, the availability of resources usually limits the number of risk factors that can actually be addressed. Such practicalities drive the need for the prioritization of security controls.
Each significant risk factor must be accounted for if not explicitly addressed in a security risk management strategy. Note that one security control might address multiple risk factors. As discussed later in this chapter, a one-to-many security control-to-risk factor condition is risk-relevant, and would arguably represent an area of focus especially in dynamic threat scenarios. (c) Security Controls and Performance Specifications Effectively managing security risk requires that security controls be in sync with the relevant risk factors. That said, a static threat scenario is no guarantee that existing controls are effective. The performance of security controls must be evaluated relative to the risk-relevant features of the threat scenario that surface during the security risk assessment. Security control performance is captured in a device or process specification; bollards are rated according to the energy they can absorb on impact, and CCTV camera performance relates to the lens resolution and other technical parameters. As always, the risk factors dictate the requirements for security control performance. However, the reality is that the quality of security control performance is often proportional to cost, and resources available for security are limited. Therefore, the tolerance for residual risk can have significant budgetary implications, and a decision must be made on the magnitude of residual risk that can be tolerated. For example, the requirement for bollard performance should be determined by an assessment of whether vehicles are capable of achieving a minimum kinetic energy in a run-up to the facility being protected. This assessment might reveal that K12-rated bollards are the only type that can address the operational requirements. Installing K12-rated bollards might be too expensive, and either K8 or K4-rated bollards represent lower-cost alternatives. 
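As a rough illustration of the kinetic-energy reasoning above, the sketch below computes impact energies for an assumed 6,800 kg (roughly 15,000 lb) test vehicle at nominal K4/K8/K12 impact speeds of about 30, 40 and 50 mph. The vehicle mass and speed pairings are simplified assumptions about the rating scheme, not specification values:

```python
MPH_TO_MS = 0.44704  # exact mile-per-hour to meter-per-second conversion

def kinetic_energy_joules(mass_kg, speed_mph):
    """KE = 1/2 * m * v^2, with speed converted from mph to m/s."""
    v = speed_mph * MPH_TO_MS
    return 0.5 * mass_kg * v ** 2

# Assumed test vehicle: ~6,800 kg (15,000 lb) truck.
# K4, K8 and K12 are paired here with ~30, 40 and 50 mph impact speeds.
for rating, mph in (("K4", 30), ("K8", 40), ("K12", 50)):
    mj = kinetic_energy_joules(6800, mph) / 1e6
    print(rating, round(mj, 2), "MJ")  # roughly 0.6, 1.1 and 1.7 MJ
```

Because energy scales with the square of speed, the step from a K4 to a K12 requirement nearly triples the energy the bollard must absorb, which is why the run-up assessment drives both the performance requirement and the cost.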
If an organization opts for cheaper technology, it is accepting the residual risk that accompanies reduced security control performance. Acceptance of risk can be justified by a risk-based decision, which might include calculating the cost-benefit ratio associated with the various mitigation options. Recall that the probability of protection method discussed in Chap. 8 facilitates an examination of such trade-offs. The linear reasoning that links threats, the components of risk, risk factors and the security controls that address those risk factors in the context of a threat scenario is fundamental to the theory of security risk assessment. However, the actual management of security risk requires an additional resource pursuant to risk-based decision-making. That resource is discussed next.
(d) Security Policies and Standards
The somewhat abstract nature of security risk derives in part from the absence of threat incidents and the inability to perform controlled experiments to confirm the efficacy of security controls. Although linking threats to the spectrum of outcomes can be straightforward, determining the likelihood of a specific outcome is non-trivial for many threat scenarios.
The good news is that security controls are generally effective in managing their piece of the threat scenario. For example, CCTV cameras are generally deployed to monitor the comings and goings of humans. Humans move sufficiently slowly relative to the camera frame rate so that image capture is assured. Moreover, the physical dimensions of humans and the things they carry are sufficiently large to enable CCTV camera lenses to provide the resolution sufficient to identify alleged perpetrators, weapons and general items of value. However, although a particular security control might be effective in addressing one aspect of a threat scenario, that security control is typically only one facet of an overall strategy. The fact that an individual security control performs according to specification is no guarantee it is sufficient in the grand scheme of things and/or is deployed correctly. We learned that uncertainty in the contribution of individual risk factors to the magnitude of likelihood affects threat scenarios that lack a history of threat incidents. Critically, without historical incidents that enable calculations of the probability of a future scenario outcome, a security standard and/or policy must be used to establish “ground truth” against which assessment results are evaluated. Furthermore, these policies and standards must be risk-based in order to perform the required function. That way, any residual risk will by definition be at or below the established tolerance threshold. Residual risk is addressed via an adjustment of the relevant security control or the addition of new controls relative to the specification in the policy and/or security standard. A security policy and a standard are not the same thing and sometimes both are necessary. The target audience for a policy is the community of security stakeholders. These individuals require a simple and easily digestible manifesto of the rules of the road. 
In some cases, a policy document might contain prescriptive details such as password complexity requirements. In general, a policy should specify security-related themes, processes and restrictions that directly affect stakeholders in the performance of day-to-day activities. In addition, security policies should be sufficiently general to remain relatively constant over time. In contrast, the target audience of a security standard is the security professional. Such standards are often technology-based: they prescribe performance specifications for security controls based on the security risk profile of the organization implementing the standard. Standards will likely change more frequently than policies due to ephemeral risk profiles as well as their inherent specificity. For example, requiring a form of authentication prior to logging into a network resource belongs in a security policy document. Authentication of some kind will likely always be required of IT network users. However, if passwords are the security controls chosen for authentication, the specification on password complexity belongs in a security standard.
(e) The Structure of Security Policies
What should a security policy look like? Any policy should be rooted in a set of fundamental security principles. Such principles represent the philosophic bedrock of the policy, and are therefore the predicate for implementing any security control. The following are examples of security principles that might be specified in the beginning of an information security policy document:
1. All employees must protect the confidentiality and integrity of the Organization’s information at all times.
2. All employees must exercise professionalism, good judgment and discretion in managing the Organization’s information and when using the Organization’s devices.
3. All employees must comply with all Organization security policies and standards, and never attempt to subvert, circumvent or otherwise impede the Organization’s security controls.
4. All employees may only use the Organization’s information for official business purposes and only use the Organization’s devices in a secure manner.
5. No employee may ever attempt to review, use or disseminate the Organization’s information or gain access to the Organization’s devices beyond what is necessary to perform required business activities.
6. All employees may only retain the Organization’s information within approved information repositories.
7. All employees accept that the Organization’s information will be retained only for as long as necessary to facilitate the Organization’s business activities.
8. All employees must immediately report any unauthorized disclosure of the Organization’s information or the loss or potential compromise of an Organization device to the Information Security Department.
Once the security principles have been articulated, a comprehensive information security policy should specify requirements across two dimensions: activities associated with the lifecycle of physical or information assets, and the locations, devices and methods associated with those assets. Specifying the requirements across both dimensions is likely to address the breadth of potential threat scenarios, thereby obviating the need to delineate all scenarios, an exercise that would in any case be futile. For example, an information security policy should first specify requirements associated with the following activities, which describe the lifecycle of information assets:
• Information creation and reproduction
• Information storage and retention
• Information transport and transmission
• Information disposal and destruction
Specifying these requirements would be followed by requirements governing stakeholder behavior relative to various information technologies, i.e., the devices used for creating, reproducing, storing, retaining, transporting, transmitting, disposing and destroying information assets:1
• The Internet
• E-mail, texting and instant messaging
• Facsimile machines, printers, scanners, photocopy machines
• Remote network access
• Wireless access and technology
• Mobile devices
• Social media
Other elements of an information security policy would specify the roles and responsibilities of entities that perform security governance and the requirements for compliance. Ideally, such entities should include a centralized committee charged with enforcing the information security policy and granting policy exemptions.
(f) The Security Risk Management Feedback Loop
Residual risk is evaluated via a recursive process that compares security control performance specifications against the magnitude of threat scenario risk factors. This process represents an ongoing if not continuous feedback loop that periodically corrects for changing risk profiles as well as any changes in the tolerance for risk. At a minimum, threat scenario sampling must occur at least as frequently as the time rate of change of the risk factors. This requirement places a particular burden on assessing dynamic threat scenarios for obvious reasons. Figure 12.1 graphically illustrates the security risk management process, which is a natural outgrowth of the theory of security risk assessment.
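A single pass of the feedback loop just described can be caricatured in code. All names, magnitudes and the simple subtraction model below are illustrative assumptions rather than formulas from the text:

```python
def residual_risk(risk_factors, control_performance, tolerance):
    """One pass of the feedback loop: compare each risk factor's
    magnitude against the relevant control's performance, and flag
    any factor whose residual risk exceeds the tolerance threshold.

    Inputs are dicts of illustrative, dimensionless magnitudes.
    """
    flagged = {}
    for factor, magnitude in risk_factors.items():
        # A factor with no matching control is entirely unmitigated.
        mitigated = magnitude - control_performance.get(factor, 0.0)
        if mitigated > tolerance:
            flagged[factor] = mitigated
    return flagged

flagged = residual_risk(
    {"weak authentication": 0.8, "open architecture": 0.4},
    {"weak authentication": 0.7},  # no control covers open architecture
    tolerance=0.2,
)
# Only the uncovered factor exceeds the tolerance and is flagged
# for control adjustment on the next pass of the loop.
```

Repeating this evaluation at least as often as the risk factors change is exactly the sampling requirement noted above for dynamic threat scenarios.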
12.3 Applying the Theory (1): Information Security Threat Scenarios
As noted in the Introduction, examples facilitate an understanding of abstract concepts. To that end, “virtual” security risk assessments help demonstrate how the theory of security risk assessment applies to the real world. These thought experiments are a valuable and relatively painless way of assessing risk. We know from the core principles that all threat scenarios are equivalent, and therefore a standard security risk assessment methodology is generally applicable irrespective of the threat scenario details. An information security threat scenario is
1 Locations include personal devices with business-related information as well as all hardware and virtual technologies that perform one or more of the information management activities cited in the text.
Fig. 12.1 The security risk management process
the first threat scenario to which we apply the standard methodology. Information security threat scenarios are often top of mind because information assets often contain the keys to the kingdom, information technology is critical to the continued functioning of any organization and even physical security controls are dependent on the IT network. In this case we are conducting a security risk assessment of an organization’s IT environment with respect to the threat of information loss due to unauthorized external access to sensitive or confidential information. We note that our virtual security risk assessment will focus exclusively on the compromise of electronic information, and thereby ignore the risk of information compromise associated with paper documents stored on premise. The first steps in any security risk assessment are to identify and characterize the three elements of the threat scenario. One of those elements is the environment where threats and affected entities interact, and we must identify the risk factors that relate the threat(s) to affected entities. In this case, the environment is arbitrarily constrained to be a specific subnet of the IT network. Although the selected subnet might only correspond to a single office within our fictitious organization, it is assumed to be representative of the entire network. This simplification makes the assessment tractable without a significant loss of generality. The wired and wireless networks plus associated hardware and software constitute the environment within the subnet, and are presumed to be representative of the broader network. Affected entities in this case are the population of computers and servers that are connected to the Internet and each other via network switches and routers.
Depending on the organization’s mission and public profile, a number of potential adversaries exist. These include hacktivists, politically motivated criminals, financially motivated criminals, so-called “bad leavers” (i.e., individuals who leave the organization under bad terms) and disgruntled job applicants. Possible attack vectors in this threat scenario are plentiful. These include the introduction of malware via phishing, social engineering for credentials and unauthorized connections to the network via WiFi. Note that denial-of-service attacks are typically focused on business disruption rather than information compromise. Therefore, this mode of attack will be ignored notwithstanding the fact that such attacks are not infrequent.2 The next step is to identify the threat scenario risk factors. Five common risk factors significantly contribute to the magnitude of information security risk:
1. Undiscovered routes to the Internet
2. Weak authentication to access network file shares containing sensitive information3
3. An open network architecture
4. Proliferation of administrative privileges
5. Limited threat awareness by the community of computer users
In IT environments, the user-machine interplay greatly affects the information security risk profile. Therefore, hidden pathways to the Internet (network layer), application vulnerabilities (presentation layer), information storage devices (physical layer) and user account configurations/privileges (session layer), combined with user behavior, all contribute to the magnitude of the three components of information security risk. We know that the likelihood component of risk is a maximum when the threat scenario risk factors are coincident (assuming all risk factors are equivalent in their respective effects on the magnitude of risk) or when threat scenario risk factors and security controls are not coincident.
We also know that a confluence of likelihood risk factors has a multiplicative effect on the magnitude of security risk. Therefore, we are particularly attentive to such conditions in assessing the magnitude of threat scenario risk, and we prioritize the application of security controls accordingly. In that vein, network file shares are discovered to have one or more of the following risk factors:
• Terabytes of confidential information
• Plaintext data storage
• Direct accessibility from the Internet
• Weak passwords for account authentication
2 Denial-of-Service attacks represented the leading type of information security incident per the Verizon 2018 Data Breach Investigations Report, 11th Edition.
3 Network file shares are also known as Personal Data Drives, and are areas/locations in computers that are dedicated for files that users create to store and share information. Changes to file content can therefore be made centrally.
• Accessibility by computer users with a history of promiscuous Internet browsing
• Accessibility by computer users with administrative privileges who use weak passwords to authenticate to those accounts
As noted above, priority is given to addressing file shares that possess multiple risk factors. Strong authentication coupled with access privilege restrictions is a security control that addresses several of the aforementioned risk factors. IT administrators have conducted a password cracking exercise against users who possess local and system administrative privileges, since these accounts would yield maximum vulnerability and impact should their passwords be compromised by an adversary. The cracking exercise revealed that 70% of the administrative passwords could be cracked in less than 1 day using computational resources available to sophisticated hacktivists. This figure exceeds the risk tolerance threshold established by the organization as specified in its security technology standard. The IT administrators use this figure to generalize about the uncertainty in security risk management across the IT environment. That uncertainty is reflected in the information entropy (H) associated with the overall security risk management process. They estimate H to be 0.9 since this figure corresponds to a probability of 0.7 and 0.3 for managed and unmanaged risk factors respectively (or vice versa per Fig. 9.6). Based on the number of individuals with administrative account privileges, it has been determined that the number of risk factors M equals 20. Therefore, the complexity metric Cm is calculated to be 2^(−MH) = 2^(−20 × 0.90) = 2^(−18) ≈ 3.8 × 10^(−6). Recall this figure represents the probability of the threat scenario being in a specific probable state, where each state consists of a mix of managed and unmanaged risk factors. A smaller value of Cm implies higher complexity since there is greater uncertainty in the state of the system.
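The entropy estimate and the complexity metric in this example can be checked directly; `binary_entropy` below is the standard Shannon formula rather than anything specific to this text:

```python
import math

def binary_entropy(p):
    """Shannon entropy (bits) for managed probability p."""
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

# Managed/unmanaged probabilities of 0.7/0.3 give H of about 0.881,
# which the text rounds up to 0.9:
H = binary_entropy(0.7)

M = 20                  # number of risk factors
Cm = 2 ** (-M * 0.9)    # 2^(-18), approximately 3.81e-06
```

The exact entropy is about 0.881 bits, so using 0.9 slightly overstates the uncertainty; either way Cm lands near 3.8 × 10^(−6), the value quoted in the text.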
This figure suggests a medium complexity threat scenario based on the organizational standard established by the IT Department. The information security professionals at this organization believe a medium complexity rating argues for implementing additional security controls to specifically address complexity. These controls must complement existing security controls since the latter are devoted to addressing other risk factors. The root cause of complexity is likely multi-faceted, and the requirement for multiple security controls is anticipated. In response to the security risk assessment findings, the IT Department in consultation with senior executives increases the complexity requirement for passwords associated with administrative accounts. It also implements a data leakage protection solution through Office 365 with the intent of monitoring the egress of sensitive information originating from internal network devices. Moreover, a high-concentration condition is found to exist within company servers. These devices are storing confidential information in unencrypted databases, as determined by a vulnerability scan via Nessus. Some of these risk factors, i.e., vulnerabilities with high CVSS severity scores, have persisted for years, which points to systemic security risk management issues.
Although this threat scenario appears grim, there are no apex risk factors based on the criteria specified in Chap. 2. Nevertheless, the IT team conducts a more in-depth analysis to identify significant network-related risk factors such as unintended paths to the Internet and unauthenticated access between security zones. To that end, they use an application known as Red Seal to develop an accurate network topology using Layer 3 device configuration files, which also exposes internal pathways to the Internet. Red Seal and Nessus results are combined to reveal high-vulnerability assets that are accessible via the Internet and thereby subject to unauthorized data exfiltration.4 The assessment also reveals that the time rate of change of some risk factors exceeds the time rate of change of relevant security controls thus qualifying as a dynamic threat scenario. Security controls must be brought into temporal alignment with relevant risk factors, which argues for more frequent monitoring of the environment. As noted above, the spatial distribution and temporal behavior of vulnerabilities/ risk factors point to systemic security risk. Specifically, both the concentration and persistence conditions hint at ineffective security governance. This topic is raised with senior executives, where IT Department managers point out that information security is currently underfunded relative to peer organizations based on the results of a recent benchmarking exercise. The results of the security risk assessment are presented to the organization Board of Trustees, which finds the assessment conclusions to be compelling. As a result, they increase IT Department security funding by 50% and award a huge bonus to the Chief Information Security Officer for a job well done! Finally, one potentially useful exercise is the creation of theoretical attack pathways. 
These pathways originate with the initial point of attack and follow the attack sequence to the IT asset, which, if compromised, could result in information exfiltration. Specifically, theorized attack scenarios conform to the following general attack sequence:

1. Threat source
2. Attack vector
3. Compromised application
4. Network (e.g., WiFi)
5. Specific accounts and associated access privileges
6. Compromised asset
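As a sketch, the six-step sequence above can be captured as a simple data structure so that theorized attack pathways can be enumerated and compared. All field values below are invented for illustration and do not come from the text:

```python
from dataclasses import dataclass

@dataclass
class AttackPathway:
    """One theorized attack pathway following the general six-step sequence."""
    threat_source: str
    attack_vector: str
    compromised_application: str
    network: str
    account_privileges: str
    compromised_asset: str

    def describe(self) -> str:
        # Render the pathway as an ordered chain, one hop per step
        return " -> ".join([self.threat_source, self.attack_vector,
                            self.compromised_application, self.network,
                            self.account_privileges, self.compromised_asset])

# Hypothetical pathway for illustration only
path = AttackPathway("Cybercriminal", "Phishing email", "Email client",
                     "Corporate WiFi", "Administrator account", "Database server")
print(path.describe())
```

Enumerating a set of such pathways is one way to generate the hypothetical attack scenarios of the kind shown in Fig. 12.2.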
Figure 12.2 shows a set of hypothetical but nonetheless realistic information security attack scenarios for a non-profit organization, based on the assessment methodology presented herein.
4. https://www.redseal.net/
Fig. 12.2 Information security attack scenarios
12.4 Applying the Theory (2): Password Cracking
Password cracking demonstrates how time, complexity and computation converge to affect decisions on security controls. Despite their widespread exploitation, passwords remain the most common form of authentication in computer systems. There is often debate regarding requirements for password complexity, which in part reflects a lack of understanding about how passwords are actually cracked. Computer users determine the details of this critical security control according to constraints established by IT departments. A password, and more precisely the rules regarding password complexity, reflects the ongoing struggle between security and convenience. Significantly, the outcome of an individual’s struggle affects the magnitude of information security risk for the entire user community.
Password security is a function of the number and diversity of password characters coupled with the frequency of password changes, relative to an adversary's computational capabilities. As we will soon see, there is no absolute metric for password invulnerability since password resilience is linked to the length of time the password remains invulnerable. Furthermore, the period of invulnerability relates directly to an adversary's computing power. Therefore, the simple algorithm to increase password security is to make passwords more complex and/or change them more frequently.

The reality is that passwords requiring frequent changes or significant complexity are a perennial source of user frustration. Unfortunately, user frustration is directly proportional to password invulnerability; hence the tension between security and convenience noted above. Therefore, to maximize security as well as inspire confidence, the organizational standard on password complexity must be risk-based rather than based on some arbitrary requirement.

In determining the appropriate password complexity it is essential to understand the details of brute force attacks, which amount to guesswork on a grand scale. Passwords are encrypted via a so-called hashing algorithm, which is a one-way encryption function that yields a unique value for each unique combination of letters, numbers and symbols. One prerequisite of an enterprise-level attack is to access the hash values en masse. For example, if an attacker were able to access the keychain for an iOS device he or she could initiate the type of attack described herein. The attack is predicated on the attacker accessing the password file or the stored password on a particular machine, which argues for protecting such resources at all costs. Once an attacker is in possession of the password hash file, he or she uses the same hashing algorithm to create self-generated hash values.
These values are then compared to the hash values accessed on the targeted machine. A match between a self-generated and a targeted hash value reveals the password since each unique password corresponds to a unique hash value. Of course, once the password is compromised, an attacker has the same electronic access privileges as the legitimate account holder.

Password complexity represents a combination of length and character diversity. Complexity drives the computational resources required to calculate hash values in a specified time period, which directly relates to the computational capability of the adversary. Immunity from a brute force attack is guaranteed if the time between password changes is less than the time it takes to crack some fraction of the space of possible passwords, assuming the hashing algorithm itself is secure. The bottom line is that the time to crack a password is a function of both password complexity and the adversary's computational capability. It should be noted that by the time the reader reads this book, the brute force cracking times noted herein will be outdated given the increasing capability and accessibility of advanced computer central processing units (CPUs). However, the modus operandi of the attack remains the same.

To reiterate, a hashing algorithm is a function that uniquely encrypts data so that it is computationally difficult to decrypt, i.e., restore the encrypted text to plaintext. Passwords are stored as hashed values of the corresponding plaintext. For example,
the password “password” is encoded as $1$O3JMY.Tw$AdLnLjQ/5jXF9.MTp3gHv/ using the MD5 hashing algorithm.5 Although this string of characters is indeed complex, the reader is strongly encouraged not to use “password” as a password or any other easily guessed word or phrase. Since a brute force attempt to discover passwords involves computing numerous combinations of hashed values of plaintext symbols, the more diverse the source of symbols the more computational power is required to decrypt those hashed values in a given time interval. The requirement for password resilience inevitably boils down to evaluating the time required to crack a password by a particular adversary versus the time that the protected information must remain confidential.

Computational power has grown exponentially in the last 20 years, which is consistent with Moore's Law.6 Commercial technology is now available for a few thousand dollars that enables passwords to be cracked on time scales that until recently would have required government resources. It is useful to quantify this capability, and thereby estimate the magnitude of risk.7

Using a modern desktop computer, i.e., an 8-core, 2.8 GHz machine, and the SHA512 hashing algorithm, it takes about 0.0017 milliseconds or equivalently 1.7 microseconds (1.7 × 10^-6 s) to compute a single hash corresponding to one password.8 This translates to 588,235 passwords-per-second. Graphical processing units (GPUs), which are particularly well suited for password cracking because of the extreme parallelism achieved via “pipelining,” are obtainable for relatively little cost. A GPU can calculate hashes at speeds 50–100 times greater than a standard commercial computer.9,10 So-called supercomputers, which are computationally equivalent to a botnet consisting of 100,000 standard computers, are expensive but readily available to a government-sponsored entity.11 Modern supercomputers can be up to 150,000 times faster than their desktop counterparts.
Botnets can be enlisted to crack passwords, and thereby leverage the processing power associated with multiple computers in parallel.
5. http://openwall.info/wiki/john/sample-hashes
6. Moore's law predicts that the density of integrated circuits will double every 2 years.
7. A brute force attack is one where all password variations are attempted.
8. https://thycotic.force.com/support/s/article/Calculating-Password-Complexity
9. A GPU has hundreds or even thousands of cores (CPU elements) that are used to perform computations in parallel. Each core computes one 32-bit arithmetic operation per clock cycle in a “pipeline.” Pipelining is where an individual operation requires many cycles to run, but multiple operations are launched in successive waves while sharing instruction decoding (since many cores will run the same instructions simultaneously). As noted previously, some GPUs have thousands of cores. For example, a GTX 680 graphics card contains 1536 cores, and there are two of these in a GTX 690. Therefore, the GTX 690 is capable of calculating billions of SHA-1 hashes per second.
10. https://security.stackexchange.com/questions/32816/why-are-gpus-so-good-at-cracking-passwords
11. A botnet is a collection of computers linked via the Internet.
It is assumed that a password can be cracked when half the possible passwords in the password space are checked. If an individual uses an 8-character password consisting of all lowercase letters, e.g., the word “military,” the size of the character set is 26 since there are 26 letters in the English alphabet. In this case there are 26 × 26 × 26 × 26 × 26 × 26 × 26 × 26 = 26^8 possible combinations of 8-character passwords. In order to crack 8-character passwords consisting of only lowercase letters it will take the following amount of time using the desktop computer noted above:

(1.7 × 10^-6 seconds/password × 26^8 passwords)/2 ≈ 2 days

By comparison, using a supercomputer or the equivalent botnet, this same cracking exercise will require 1.8 s. The situation is actually much worse for a defender because an attacker will likely use a dictionary so that known words are identified even more quickly. The use of dictionaries is the reason for the often-repeated admonition against using common words or phrases as a password.

Next we assume the password consists of a mix of lowercase and uppercase characters. This time the password information source has 52 elements so there are 52^8 possible combinations of 8-character passwords. To crack an 8-character password consisting of upper and lower case letters using a desktop computer will require the following time:

(1.7 × 10^-6 seconds/password × 52^8 passwords)/2 ≈ 1.44 years

This same exercise using a single GPU would require about 5 days. The cracking time would be reduced to 7.6 min using a supercomputer or botnet.12 It is evident that simply using lowercase and uppercase characters is insufficient protection for 8-character passwords, especially in defending against even modestly sophisticated adversaries.

If numbers are included in the source of characters, there are 62 possible characters in the set. Therefore, there are 62^8 possible passwords.
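The arithmetic above can be reproduced with a short script. The 1.7 × 10^-6 seconds-per-hash figure is the desktop rate cited earlier; treat the outputs as order-of-magnitude estimates:

```python
# Brute-force timing estimate: time = (seconds_per_hash * R**L) / 2,
# i.e., half the password space is searched on average.
SECONDS_PER_HASH = 1.7e-6  # desktop figure cited in the text

def crack_time_days(char_set_size: int, length: int,
                    seconds_per_hash: float = SECONDS_PER_HASH) -> float:
    """Expected days to crack, assuming half the password space is searched."""
    return seconds_per_hash * char_set_size**length / 2 / 86400

print(round(crack_time_days(26, 8), 1))            # lowercase only → 2.1 days
print(round(crack_time_days(52, 8) / 365.25, 2))   # mixed case → 1.44 years
print(round(crack_time_days(62, 8) / 365.25, 2))   # plus numerals → 5.88 years
```

Dividing these figures by an adversary's speed-up factor (roughly 100 for a single GPU, 100,000 for a supercomputer or botnet) yields the other columns of Table 12.1.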
Using the previous figure for computational power we arrive at a figure of 5.88 years to crack a password of this complexity using a desktop computer. The same effort would require 31 min using a botnet or supercomputer. Table 12.1 is a summary of password cracking times relative to computational capability.

Password length is a password's most important security feature. Recall from Chap. 9 that the information entropy of an information source represents the diversity of that source. In the case of passwords, the information entropy equals log2(R^L) = L log2(R), where R is the size of the character set and L is the number of characters.13 For example, the information entropy of a character source where R = 95, i.e., the number of unique characters on a keyboard, and L = 12, equals 78.9 information bits-per-password.
12. http://techgenix.com/how-cracked-windows-password-part2/
13. https://www.pleacher.com/mp/mlessons/algebra/entropy.html
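The entropy figure above can be checked with a few lines of Python. This is a sketch; 95 is the printable-keyboard character count used in the text:

```python
import math

def password_entropy_bits(char_set_size: int, length: int) -> float:
    """Information entropy of a random password: log2(R^L) = L * log2(R)."""
    return length * math.log2(char_set_size)

# 12 characters drawn from the 95 printable keyboard characters
print(round(password_entropy_bits(95, 12), 1))  # → 78.8 (the text rounds to 78.9)
```

Note how the character-set size R enters only through a logarithm, while the length L enters as a direct multiplier, which foreshadows the length-versus-diversity comparison later in this section.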
Table 12.1 Brute force password cracking times

Password character diversity     | Password length | Desktop cracking time  | Single GPU cracking time | Botnet/Supercomputer cracking time
26 (lower or uppercase letters)  | 8               | 2 days                 | 0.02 days or 0.48 h      | 1.8 s
52 (upper and lowercase letters) | 8               | 511 days or 1.44 years | 5 days                   | 7.6 min
62 (52 plus numerals)            | 8               | 5.88 years             | 0.06 years or 21.7 days  | 31 min
80 (62 plus symbols)             | 8               | 45.2 years             | 0.45 years               | 4 h
62                               | 10              | 45,295.5 years         | 4530 years               | 83 days or < 1 day if a GPU is used for each botnet computer
80                               | 10              | 579,479.7 years        | 5794.8 years             | 3 years
Table 12.2 Common password character sources

Character types             | Character number (diversity)
Lowercase                   | 26
Lower and upper case        | 52
Alphanumeric                | 36
Alphanumeric and upper case | 62
Common ASCII characters     | 30
Diceware word list          | 7776
English dictionary words    | 171,000
Password resilience is directly related to the diversity of the character source since diversity increases the number of password possibilities that an attacker must try. The attacker's CPU is preoccupied with hashing all password possibilities to arrive at a match with a password hash on file. This situation is reminiscent of a scene in 2001: A Space Odyssey, in which the on-board computer HAL is tasked with calculating the digits of the transcendental number pi (π) in order to preoccupy its CPU, pursuant to saving the ship from the clutches of this rogue machine.

It is clear from the entropy expression that password length has a disproportionate effect on the magnitude of entropy: the entropy increases linearly with L but only logarithmically with R, so the number of possible passwords R^L grows exponentially with length. Table 12.2 lists the number of characters and hence the diversity of common character sources.14
14. Ibid.
Fig. 12.3 Password entropy for various character sources
For example, a 16-character password consisting of only numbers, i.e., 10^16 possible passwords, is roughly as difficult to crack as an 8-character password drawn from 94 possible characters, i.e., 94^8 ≈ 0.6 × 10^16 possible passwords. This example reveals the disproportionate effect of password length on complexity: doubling the password length from 8 to 16 characters has almost the same effect as increasing the base (i.e., the character set) from 10 to 94. As one might expect, a combination of length and character diversity is the ideal strategy to protect against brute force password attacks. Figure 12.3 shows the relationship between information entropy and various password character sources.15

For complex passwords, so-called rainbow tables can be used to accelerate the cracking process. A rainbow table is a lookup table consisting of password hashes for every possible password combination using a specific hashing algorithm, e.g., MD5, SHA-1, SHA-2, SHA-3. Rainbow tables can decrease the time required to crack multiple passwords unless the passwords are “salted.” Salting is the process of adding random data to each password before hashing to increase complexity. However, salting will not deter the cracking of a single password.16 For example, the MD5 hashing algorithm is not salted. Therefore, more recent hashing algorithms such as the others noted immediately above are considered more secure.17

In light of the previous discussion, it seems prudent to enforce a minimum of 10 characters for a Windows user account and 20 characters for all administrator accounts. If LanMan hashes are still in use, presumably to ensure backward compatibility, the conventional wisdom is that Windows account passwords should be a
15. https://csrc.nist.gov/
16. Note that the MD5 hash algorithm is not salted; however, salting is not effective against cracking single passwords.
17. https://technet.microsoft.com/en-us/library/security/961509.aspx
minimum of 15 characters. An enforced mixture of numbers, letters (upper and lowercase) and symbols would be appropriate for both account types.18
12.5 A Revised Fundamental Expression of Security Risk
Recall the Fundamental Expression of Security Risk introduced in Chap. 1:

Risk(threat scenario) ∝ Impact × Likelihood × Vulnerability    (12.1)
Readers have been repeatedly cautioned against interpreting (12.1) too literally. It does not actually specify the absolute magnitude of each component of risk, and therefore does not specify the risk in aggregate. Expression (12.1) merely indicates that threat scenario risk is proportional to the product of the individual components of risk, whatever their value. But we now have considerable justification to question the validity of (12.1). Perhaps most conspicuously, security controls are absent from this expression. The presence of security controls has the effect of decreasing the magnitude of threat scenario risk factors, and thereby affects the overall magnitude of risk. Clearly, (12.1) is at best incomplete and at worst inaccurate, although it generally suffices for high-level discussions. A revised expression is required if a more complete description of threat scenario risk is the objective. Specifically, security controls as well as risk-relevant spatiotemporal features of threat scenarios must be included.

By definition, security controls decrease threat scenario risk. This condition suggests (12.1) is more accurately represented as a fraction, where security controls are included in the denominator. Therefore, a first modification to (12.1) might read as follows:

Risk(threat scenario) ∝ [Impact (I) × Likelihood (L) × Vulnerability (V)] / Controls (C)    (12.2)
Notwithstanding this adjustment, note the magnitude of threat scenario risk is still not a strict equality. As before, expression (12.2) merely represents security risk as the product of its three components, divided by the security controls that act to reduce the magnitude of risk. We might say that (12.2) describes the risk associated with a managed threat scenario.

However, we know that other threat scenario features affect the magnitude of security risk. For example, the temporal relationship between risk factors and security controls is also risk-relevant. Specifically, a misalignment between risk
18. Ibid.
factors and their corresponding security controls increases the magnitude of threat scenario risk. Similarly, a confluence of likelihood risk factors increases the magnitude of security risk.

Complexity is a risk factor for the likelihood component of risk that results from uncertainty in security risk management and the presence of multiple risk factors of any type. As a practical matter, complexity always exists in any realistic threat scenario. Recall from Chap. 9 that Cm is a metric for the magnitude of complexity risk. This metric corresponds to the probability that a threat scenario exists in a particular state consisting of a unique combination of managed and unmanaged risk factors. A threat scenario can be modeled as an ensemble of such states. The lower the probability, i.e., the smaller the value of the metric Cm, the higher the magnitude of threat scenario complexity. Complexity always increases the magnitude of security risk. In theory, complexity could be absorbed into the likelihood component of risk since complexity only affects that component, but we choose to indicate its contribution separately. We express the magnitude of complexity as “C*”.

As discussed previously in this section, the effect of time must also be accounted for in a more complete expression of threat scenario risk. We know that in dynamic threat scenarios the time rate of change of a security control is less than the time rate of change of the relevant risk factor. In general, the magnitude of security risk in a dynamic threat scenario is affected by the overlap, or lack thereof, between a risk factor and the relevant security control. Recall there are two maximum risk conditions, assuming all likelihood risk factors contribute equally to the magnitude of risk: (a) security controls and relevant risk factors are misaligned, and (b) a confluence of likelihood risk factors.
We can specify a metric that captures the magnitude of risk for the first condition, which should also be added to (12.2). Specifically, if the combined duration of a security control and the relevant risk factor is t, and the time duration of their overlap is Δt, the ratio (t − Δt)/t is a parameter that affects the magnitude of security risk, and therefore should be added to (12.2) for completeness. The metric (t − Δt)/t has a maximum value of 1 when a security control and relevant risk factor do not overlap, i.e., Δt is zero. When Δt equals t, (t − Δt)/t becomes zero, implying a security control and the relevant risk factor overlap completely.

As a reminder, the time rates of change of threats are not relevant to an expression of the magnitude of threat scenario risk. The magnitude of threat scenario risk is exclusively determined by the status of the risk factors. Therefore, changes to threats by themselves have no effect on the magnitude of risk, and hence there is no need for security control adjustments unless the risk factors associated with an affected entity and/or the threat environment have also changed. For example, if an adversary, i.e., the threat, chooses a gun instead of a knife as a weapon, the magnitude of threat scenario risk has definitely changed. However, the
effect on risk results from a change in the risk factors that affect the magnitude of vulnerability and impact. If the weapon changed from one type of knife to another, there is no substantive change to the magnitude of risk because the risk factors associated with each threat scenario remain unchanged.

Per condition (b) above, a maximum risk condition occurs when there is a confluence of likelihood risk factors. Therefore, a suitably revised expression for the magnitude of threat scenario risk must reflect the possibility of a confluence condition. Let L represent the maximum value of a likelihood risk factor at some time t. Now let ΔL represent a change in the value of L at some earlier or later time t′. Note a finite value for ΔL will always result in a decrease in the magnitude of likelihood. Likewise, if ΔL is zero, the likelihood risk factor is a maximum.

The quantity L − ΔL reflects the magnitude of the likelihood component of risk for a single risk factor. However, recall the effect of multiple contemporaneous likelihood risk factors is multiplicative, and hence there is an exponential effect on the overall magnitude of risk. We must therefore incorporate temporal offsets for all likelihood risk factors present in a threat scenario, where we make the simplifying assumption that the offset is the same for all risk factors. Therefore, the likelihood risk factors must be multiplied, and the representation for the likelihood component of risk that can account for a confluence condition becomes (L − ΔL)^n, where n is the number of likelihood risk factors.

Finally, and as noted previously, a condition of confluence exponentially increases the likelihood component of risk only. The effect of multiple vulnerability and impact risk factors is additive rather than multiplicative. If we incorporate the modifications noted above, a more complete expression for the magnitude of security risk emerges:
Risk(threat scenario) ∝ [(L − ΔL)^n × (I1 + I2 + … + In) × (V1 + V2 + … + Vn) × ((t − Δt)/t) × C*] / C    (12.3)

Finally, it would be a mistake to think that (12.3) is now an absolute representation for the magnitude of threat scenario risk. Using actual numbers for any of the variables might yield an extremely small or large numerical value, which by itself would have little meaning. This expression is only meant to ensure the significant features affecting the magnitude of risk are represented and to specify their relative effect. The difference between (12.3) and the representations of the Fundamental Expression of Risk presented in Chap. 1 is that we have now accounted for additional risk-relevant features.
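Expression (12.3) can be mirrored in code to make the relative effects concrete. All inputs below are illustrative unit-free magnitudes, and the result is a relative, not absolute, risk figure:

```python
def relative_risk(likelihood: float, delta_l: float, n: int,
                  impacts: list[float], vulnerabilities: list[float],
                  t: float, delta_t: float,
                  complexity: float, controls: float) -> float:
    """Relative threat scenario risk per expression (12.3):
    (L - dL)^n * sum(I) * sum(V) * ((t - dt)/t) * C* / C."""
    misalignment = (t - delta_t) / t  # 1 = no overlap, 0 = complete overlap
    return ((likelihood - delta_l) ** n * sum(impacts) * sum(vulnerabilities)
            * misalignment * complexity / controls)

# Confluence (delta_l = 0) plus fully misaligned controls (delta_t = 0)
# maximizes the result for the given inputs
print(relative_risk(1.0, 0.0, 3, [2, 1], [1, 1], t=10, delta_t=0,
                    complexity=1.5, controls=2.0))  # → 4.5
```

Setting delta_t equal to t drives the result to zero, reflecting complete temporal overlap between the security control and the risk factor; impacts and vulnerabilities add, while likelihood risk factors compound through the exponent n.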
12.6 Testing for Encryption
An interesting application of information entropy in assessing security risk is testing for encryption.19 One might envision a data leakage protection (DLP) tool that examines files transmitted or transferred outside an IT network. In one possible scenario, a file might be transferred from a protected location, such as a password-protected network drive, to an external USB drive. Furthermore, that file might require encryption according to company policy.

Assuming a nearly perfect random encryption method that transforms English text to hexadecimal characters, the information entropy of the hexadecimal message source can be calculated using (5) in Chap. 9, where all 16 hexadecimal characters have equal probability of appearing:

H = −16 × (1/16) log2(1/16) = 4 bits-per-character

However, if the file were transferred in plain text, i.e., unencrypted, the entropy of the English text would fall within the range of 0.6–1.3 bits-per-character. A random sampling of files analyzed by a DLP tool would examine the information entropy of the output, and thereby detect anomalies that represent either errors in encryption or the intentional encryption of a file by a malicious user.
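A minimal sketch of such a DLP entropy check follows. The 4.5 bits-per-byte threshold and the sample data are assumptions for illustration, not figures from the text:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: H = -sum(p * log2(p))."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_encrypted(data: bytes, threshold: float = 4.5) -> bool:
    """Flag high-entropy content as likely encrypted (or compressed)."""
    return shannon_entropy(data) > threshold

plaintext = b"the quick brown fox jumps over the lazy dog " * 20
random_like = bytes(range(256)) * 4  # uniform byte distribution, 8 bits/byte
print(looks_encrypted(plaintext), looks_encrypted(random_like))  # → False True
```

One caveat worth noting: compressed files also exhibit near-maximal entropy, so in practice a tool like this flags candidates for review rather than proving encryption.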
12.7 The Security Control/Risk Factor Ratio (C/R)
The ratio of security controls to risk factors is risk-relevant, particularly when that ratio is less than one. This condition can be written compactly as follows, where C is the number of security controls and R is the number of risk factors:

C/R < 1    (12.4)
As discussed in Chap. 10, (12.4) implies that either at least one risk factor is not managed by a security control or at least two risk factors are managed by a single security control. If a single security control is managing multiple risk factors, and assuming all risk factors have an equal effect on the three components of risk, the loss of that security control would have a disproportionate effect on threat scenario risk. A prudent security risk management strategy would be especially cognizant of security controls that manage multiple risk factors. Mark Twain commented on threat scenarios characterized by (12.4) as follows: “If you put all your eggs in one basket you had better watch that basket!”

In contrast, threat scenarios that on average have at least one security control per risk factor present a different risk profile. Although seemingly less efficient, this situation is generally preferable since it limits the number of risk factors affected by a single security control. This condition is written compactly as follows:

C/R ≥ 1    (12.5)

19. C. Briscoe, Iona University; private communication.
Understanding the nature of this ratio provides one high-level view of the threat scenario risk profile. Although (12.4) and (12.5) provide no information regarding the quality of the security controls or the magnitude of the risk factors, the ratio does suggest a potential imbalance in the security risk management strategy. It also offers an additional criterion for prioritizing risk management efforts, where the initial focus might be on those security controls assigned to multiple risk factors or on identifying risk factor(s) lacking security controls.
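A control-to-risk-factor inventory check of this kind can be sketched as follows; the mapping structure and all names are invented for illustration:

```python
# Map each security control to the risk factors it manages (illustrative data)
controls = {
    "firewall": ["unintended internet paths", "unauthenticated zone access"],
    "dlp": ["data exfiltration"],
}
risk_factors = {"unintended internet paths", "unauthenticated zone access",
                "data exfiltration", "unencrypted databases"}

ratio = len(controls) / len(risk_factors)  # the C/R ratio of (12.4)/(12.5)
unmanaged = risk_factors - {rf for rfs in controls.values() for rf in rfs}
overloaded = [c for c, rfs in controls.items() if len(rfs) > 1]

print(f"C/R = {ratio}")                           # 2 controls / 4 factors = 0.5
print(f"Unmanaged risk factors: {unmanaged}")     # the gap to close first
print(f"Single points of failure: {overloaded}")  # controls covering > 1 factor
```

The two derived lists correspond directly to the prioritization criteria above: risk factors lacking any control, and controls whose loss would affect multiple risk factors.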
12.8 Cost and Constraints in Security Risk Management
The cost of security controls does not affect the magnitude of security risk, and therefore is not part of the theory per se. However, it would be naïve to assume the availability of resources is not an important consideration in developing a practical security risk management strategy. Therefore, this brief discussion focuses on how finite security resources relative to the cost of security controls affect the formulation of a security risk management strategy.

It is important to appreciate that resources, or the lack thereof, drive the need for the prioritization of security controls. The point of a security risk management strategy is to implement security controls where they are most required as determined by a security risk assessment. Implicit in any security risk management strategy is the fact that only finite resources are available. If there were infinite resources, an infinite number of security controls could be deployed, which would obviate the need for risk assessments (and this book!). In other words, the prioritization of security controls would be superfluous; deploying an unlimited number of controls does not constitute a proper strategy.

The cost of risk mitigation does not necessarily relate to price. The side effects of security controls also represent a form of cost, and they can have both tactical and strategic implications. The field of medicine once again offers useful lessons. Medical risk management efforts can be tricky since medicines have side effects and patients might have multiple and inter-related issues. It certainly would not be helpful to cure the disease but make the patient worse in the process.

An example of one such dilemma is in the treatment of prostate cancer. It is known that high values of the Prostate-Specific Antigen (PSA) correlate with
prostate cancer. However, prostate cancer is almost inevitable if you are male and fortunate enough to live long enough. In fact, the disease can progress rather slowly, thereby nearly guaranteeing that death will result from some other cause. The controversy surrounding the PSA test is not about the effectiveness of the test. Rather, it concerns the cost-benefit ratio of the treatments that often follow the test. These treatments can severely affect a patient's lifestyle for potentially little or no gain. The downside of treatment must be weighed against the expected rate of disease progression relative to the age of the patient. Note that if the treatments had little or no side effects, the downside of aggressive intervention would be greatly reduced.

In summary, decisions on security risk management do not depend on the magnitude of threat scenario risk alone. In any realistic threat scenario, the cost and collateral effects of security risk management efforts are also important factors. The cost of a security control can be a particularly significant issue when there is a large discrepancy between any two components of threat scenario risk. The next section discusses a common type of general threat scenario where reconciling significant differences in the magnitude of two components of risk is often an issue.
12.9 Low Likelihood-High Impact Threat Scenarios
The multi-component nature of risk contributes to difficulties in assessments. Specifically, assessing risk entails analyzing and comparing multiple components, where each component has a completely different implication for the overall magnitude of risk. Colloquial use of risk-relevant terminology adds to these difficulties.

Recall the street crossing threat scenario in Chap. 1. In observing street crossing habits, we often remark about what people will “risk” for relatively little gain. In other words, they will increase the likelihood of significant loss, i.e., injury or death, in order to gain a few minutes of time. By attempting to “beat the light” they are actually testing the limits of the likelihood component while implicitly accepting that the magnitude of the vulnerability component remains extremely high. The important point is that the vulnerability component doesn't change but the likelihood precipitously increases during the time interval the light is red or nearly red in the direction of the pedestrian. During a well-defined time interval, both components of risk are significant. This threat scenario is an example of transience as discussed in Chap. 10. This likelihood-vulnerability trade-off is a frequent theme of security risk assessments.

A threat scenario that frequently arises is one where the likelihood component of risk is low but the impact is high. As noted previously, an extreme example of such a threat scenario is the detonation of a nuclear weapon. The likelihood of a nuclear attack is presumably low but the losses incurred as a result of a nuclear explosion would be devastating.
Given the expense of effective risk management, e.g., lead-lined offices, underground facilities, the question is whether a reduction in the magnitude of the vulnerability and impact components of risk is worth the expense. In view of the discrepancy in the magnitude of the individual risk factors coupled with an exceptionally expensive and disruptive risk management strategy, it is unclear how to address this threat scenario and others where a similar juxtaposition exists.

In general, for low likelihood-high impact threat scenarios, questions on the appropriate strategy focus on the specific requirements for risk management. It is the all-or-nothing nature of these scenarios with respect to vulnerability that makes them difficult to resolve. A single nuclear bomb, active shooter, etc. is guaranteed to result in significant loss. There appears to be a continuum of values for the likelihood component of risk, or at least the perception thereof. Until some threshold of likelihood is exceeded as a result of historical incidents, a typical strategy for low likelihood-high impact threat scenarios is the absence of a strategy. At some unquantifiable threshold, the magnitude of likelihood becomes impossible to ignore, and security controls are implemented. Active shooter threat scenarios appear to have crossed that likelihood perception threshold.

In some cases, a potential solution is made possible by deconstructing the threat scenario, thereby altering the all-or-nothing nature of the outcomes. This strategy entails examining the effectiveness of the security control relative to the fraction of threat scenario outcomes it addresses. That fraction is then evaluated against the cost of the security control. The return on investment of the security control relative to the fraction of outcomes it addresses is the parameter of interest. This algorithm is admittedly not applicable to all low likelihood-high impact threat scenarios.
However, many scenarios can be organized in terms of states, where the effectiveness of a security control is evaluated relative to the fraction of states for which the control is relevant. Recall that the probability of protection method is based on this approach. Specifically, the first step is to divide the threat scenario into sub-scenarios, i.e., fractions of the total number of threat scenario states. Once these sub-scenarios are delineated, each one should be evaluated relative to the effectiveness of the required security control. The third step is to compare these fractional scenarios to the cost of the security control, noting that cost might be manifest as money, resources and/or the level of disruption. The real return on investment becomes clearer once the range of relevant sub-scenarios is specified relative to the required expense. Consider the threat of fire in an apartment building. Each apartment has at least one smoke detector as mandated by law. The concern is that a fire could trigger a smoke detector inside an apartment but no one would be around to respond. The building is considering installing a system that detects smoke conditions within apartments and common areas and communicates a signal to a central monitoring station that alerts the Fire Department. The cost of installing such a system is significant since the building is old and retrofitting is expensive. The likelihood of a fire condition occurring without a person actively starting it is low but not zero. Electrical appliances with a short circuit that are plugged into the
mains circuit could cause a fire even if they are not in use at the time. But this fraction of fire scenarios occurs infrequently. Therefore, the likelihood is low, and the vulnerability/impact of this fraction of threat scenarios is significant. The following are the relevant sub-scenarios in the overall fire threat scenario:
1. Fire occurs while cooking, smoking a cigarette, from a misplaced candle or some other use of an open flame. This is by far the most common fire-related threat scenario.
2. Fire occurs during use of a faulty appliance, or the use of multiple appliances on a single circuit overloads that circuit's maximum amperage (the circuit breaker should trip). This is arguably the second most common fire-related scenario.
3. Fire occurs spontaneously due to a faulty appliance or circuit panel AND the fire occurs when no one is home.
We note in passing that apartments are typically required to be fire-rated to between 2 and 3 h. This rating means a fire will be contained for 2–3 h before spreading. Therefore, in addition to a low likelihood, the magnitude of the vulnerability component of risk to building occupants other than the inhabitants of the apartment where the fire originated is also low. There would be ample time to identify the source of the fire and escape. It is clear from this simple analysis that the safety control is only required for sub-scenario #3, since presumably in the other sub-scenarios the apartment dweller would hear the smoke detector alarm and/or has a direct role in causing the fire. In sub-scenarios #1 and #2 an individual would presumably summon assistance and thereby alert other occupants of the building, which obviates the need for an automated communication link. We see that the deployment of this expensive safety control would mostly apply to one relatively infrequent sub-scenario. Fires in dwellings are unlikely when no one is home since most fires result from the actions of people.
Moreover, existing smoke detectors within apartments and common areas would alert building occupants to fires in those areas. Therefore, the return on investment for this safety control is quite limited since the proposed security control only addresses a small fraction of the relevant threat scenarios. Of course, the control under consideration is indeed a potentially life-saving feature. However, its utility is questionable when viewed in light of the spectrum of scenario outcomes. Furthermore, purchasing this control might be even more problematic if competing security/safety controls must be acquired using the same pool of resources.
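The limited return can be made concrete by assigning relative frequencies to the three sub-scenarios. The figures below are purely hypothetical; the text only orders the sub-scenarios from most to least common:

```python
# Illustrative relative frequencies for the three fire sub-scenarios.
# These shares are hypothetical placeholders, chosen only to respect the
# ordering given in the text (sub-scenario 1 most common, 3 least).
sub_scenario_share = {
    "open_flame_occupant_present": 0.80,  # sub-scenario 1
    "faulty_appliance_present": 0.15,     # sub-scenario 2
    "spontaneous_nobody_home": 0.05,      # sub-scenario 3
}

# The central monitoring system only adds value in sub-scenario 3, since
# in the other two an occupant hears the alarm and can summon help.
fraction_addressed = sub_scenario_share["spontaneous_nobody_home"]
print(f"Fraction of fire scenarios addressed: {fraction_addressed:.0%}")
```

Under these illustrative shares the expensive retrofit addresses only a few percent of fire scenarios, which is the sense in which its return on investment is limited.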
12.10 Summary
A structured and risk-based security risk management process naturally evolves from the theory of security risk assessment. Since all threat scenarios are equivalent, risk is a universal feature of threat scenarios and threat scenarios are the focus of
security risk assessments, the aforementioned process is applicable to any threat scenario. This process is summarized as follows:
1. Identify the three threat scenario elements.
2. Determine the relationship between threat scenario elements, i.e., the risk.
   (a) Direct or indirect assessments of likelihood risk factors
   (b) Estimates of the magnitude of the vulnerability and impact risk factors
3. Identify security controls to address the risk factors.
4. Set performance specifications for security controls according to the magnitude of the assessed risk and the organizational tolerance for risk as reflected in security policies and standards.
5. Assess the magnitude of residual risk at periodic intervals, and adjust security controls at a rate dictated by the time rate of change of risk factors and to a level dictated by the security risk profile.
Chapter 1 specified the Fundamental Expression of Security Risk. Although it describes threat scenario risk at a high level, that expression omitted risk-relevant features inherent to the theory. Specifically, it did not include the effect of security controls, the magnitude of complexity, the temporal misalignment of security controls and risk factors, or the confluence of likelihood risk factors. Therefore, a more complete expression is required that more accurately reflects the magnitude of threat scenario risk. The full expression is specified immediately below:

Risk (threat scenario) ∝ (L − ΔL)^n × I(I1 + I2 + . . . + In) × V(V1 + V2 + . . . + Vn) × [(t − Δt)/t] × C × C*
L, V and I are the likelihood, vulnerability and impact risk factors respectively. ΔL represents a time displacement/offset of likelihood risk factors, n is the number of likelihood risk factors, t is the combined duration of a security control and relevant risk factor, Δt is the duration of a security control-risk factor overlap, C is a security control and C* is complexity. The effect of the likelihood risk factors is multiplicative, whereas the effect of the vulnerability and impact risk factors is often additive. Thought experiments can be useful in assessing the magnitude of threat scenario risk. Such experiments entail identifying threat scenarios involving various attack sequences that begin with an attack vector and end with a compromised asset.
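The full expression can be evaluated numerically once magnitudes are assigned to each factor. A minimal sketch, assuming one reading of the typeset formula — likelihood terms as (L − ΔL)^n, temporal misalignment as (t − Δt)/t, and the impact and vulnerability terms summed, since their effect is additive; all input values are hypothetical:

```python
def threat_scenario_risk(L, dL, n, impacts, vulnerabilities, t, dt, C, C_star):
    """One reading of the full expression:

        Risk ∝ (L - dL)**n * (I1 + ... + In) * (V1 + ... + Vn)
               * ((t - dt) / t) * C * C_star

    The impact and vulnerability terms are summed (additive effect); the
    likelihood terms are multiplicative via the exponent n.  All
    magnitudes are on arbitrary scales, so the result is a
    proportionality rather than an absolute risk value.
    """
    return ((L - dL) ** n
            * sum(impacts)
            * sum(vulnerabilities)
            * ((t - dt) / t)
            * C * C_star)

# Hypothetical inputs on arbitrary 0-1 scales.
risk = threat_scenario_risk(L=0.5, dL=0.1, n=2,
                            impacts=[0.4, 0.3], vulnerabilities=[0.2, 0.5],
                            t=10.0, dt=4.0, C=0.8, C_star=1.5)
```

Such a sketch is useful mainly for sensitivity checks, e.g., seeing how the result scales when the control-risk factor overlap Δt grows toward t.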
Epilogue
A university education in security risk management is still an uncommon occurrence. This condition raises the question of where security professionals learn to assess security risk and thereby address security-related issues. Arguably security is not like traditional professions where more contemplative approaches to problem solving are encouraged if not required. Security professionals are expected to respond to evolving threats on a moment's notice. This need for immediacy drives a focus on tactics over strategy. Consequently, theory typically takes a backseat to the day-to-day struggles inherent to protecting people, property and information. It is not surprising that security professionals typically learn about security risk through on-the-job training. The advantage of experience is a realistic appreciation of what does and doesn't work, as well as a hands-on approach to problem solving. The potential downside is a reliance on informal, i.e., non-risk-based, methods to assess security risk. Therefore, situations at odds with experience are not necessarily addressed in the most effective and/or efficient manner. Inconsistencies in the results of security risk assessments reflect the lack of a standardized, risk-based assessment methodology, which is in part a consequence of the absence of pedagogy. However, it is certainly true that many successful security professionals never studied theory, and it is not clear they have suffered one iota as a result. This fact doesn't mean that theory is irrelevant. Rather, it suggests an opportunity to broaden current perspectives and enhance existing capabilities. Although traditional scientific disciplines allow and even encourage differences in data interpretation, such interpretations are always grounded in a conceptual foundation that everyone in the field accepts as the basis for problem solving.
The objective of this text has been to specify such a foundation for security risk assessment, as well as to identify assessment techniques and metrics that enable practical applications of the theory. A significant consequence of any conceptual foundation is the formulation of core theoretical principles. In this case, such principles define a canonical threat scenario
as well as standard assessment criteria. The upshot is a structured framework for assessing risk, and the ability to generalize about risk-relevance in a consistent and repeatable fashion. In particular, the principles of threat scenario equivalence and the universality of risk ensure that the process used to assess security risk is always the same irrespective of the threat scenario details. A very practical result of the theory is the ability to prioritize security controls across disparate threat scenarios. A complete treatment of the theory includes some non-traditional topics that might seem to stretch the limits on relevance if not reality. However, these topics actually reflect the breadth of the subject matter as much as the unconstrained imagination of the author. One theme common to all topics is that the basis for security risk assessment is inherently scientific. Although it is clearly possible to conduct an assessment without being a scientist, it is not possible to be rigorous without adopting a scientific approach. In that vein, numerous examples drawn from science and technology are woven into explanations of the theory. These are often presented to help visualize concepts, but their frequency hints at deeper connections. Furthermore, many assessment techniques were borrowed from traditional disciplines. Such appropriations support the contention that security risk management constitutes a discipline in its own right. Much of the theory of security risk assessment concerns issues specific to the likelihood component of risk. Estimates of likelihood relate to the type of uncertainty inherent to a given threat scenario, which in turn is linked to the presence or absence of threat incidents. Importantly, there are two types of uncertainty associated with the likelihood component of risk. 
Their difference is what drives estimates of probability versus potential, where the latter is used to assess likelihood in the absence of a statistically significant number of threat incidents. Overcoming the statistical handicap presented by a dearth of incidents motivates the use of stochastic processes, thereby leveraging the laws of probability. The theory of security risk assessment has applicability to scenarios other than those relating to security. Assessing risk relies on constructs and relationships that are applicable to any scenario where "loss" is a possible outcome. Arguably, every scenario consisting of a choice between outcomes fits into this category, which supports the contention that the processes used to assess risk and make decisions are identical. Three quite dissimilar books inspired some of the major themes that have helped characterize the theory. Their diversity is evidence of the breadth of topics that relate to security risk assessment and management. Perhaps the most significant contributions came from Consilience: The Unity of Knowledge. In that book, E. O. Wilson, a prominent Harvard biologist, discusses how knowledge is intrinsically unified. In his view disparate disciplines are connected via a few natural laws, which he terms consilience. Wilson's work inspired a holistic and interdisciplinary perspective that is central to the theory of security risk assessment. The second influential work was Axis and Circumference: The Cylindrical Shape of Plants and Animals, by Stephen A. Wainwright, another Harvard biologist. The
author discusses how shape, and specifically the cylindrical shape, is part of a natural pattern that improves an organism's competitive advantage. In particular, his discussion of scale in nature prompted thoughts on how to represent changes to risk factors and their effect on the magnitude of security risk. The third book is An Introduction to Information Theory: Symbols, Signals and Noise, by John R. Pierce. Information theory uses stochastic principles to quantify information transmission. Pierce shows how it is relevant to a variety of disciplines, which inspired ideas on how information theory might be applied to threat scenario complexity, a significant contributor to the likelihood component of risk. Finally, although security risk management has traditionally decoupled theory from practice, it is unclear how to evaluate the effect of this decoupling. What is clear is that assessing and managing security risk are often integral to an organization's business strategy. Simultaneously, the sophistication of adversaries and the advanced technologies at their disposal have driven the requirement for more effective and cost-effective security solutions. Therefore, a rational basis for security decisions will become increasingly necessary, both to improve assessment accuracy and to justify expensive security strategies. The overarching challenge in addressing these requirements is to demonstrate the tangible benefits of pedagogy to security practitioners, since theory is merely an academic exercise without a connection to the real world. However, implementing security measures without a rigorous means of assessing risk inevitably invites error and inefficiency.
Appendices
Appendix 1: Random Walk Mean and Variance¹

Consider a security control knob perturbed by mechanical noise. The Gaussian White Noise source causes the knob to be displaced in either the clockwise or counter-clockwise direction. The displacement of the knob after N steps is x, which is the desired statistic. Let s_i denote a positive or negative displacement, i.e., a step, of the i-th change in position. We first assume the most general case, where a probability distribution describes the i-th displacement and where displacements are not fixed distances. First, note the following:

x = s_1 + s_2 + . . . + s_N = Σ_{i=1}^{N} s_i    (1)
The mean value of x, <x>, is given by N<s>, where <s> is the mean displacement-per-perturbation of the knob. The dispersion of x (the second moment in statistical parlance), <Δx²>, is given by N<Δs²>, where <Δs²> is the dispersion of the distribution of displacements-per-perturbation. The dispersion <Δx²> equals <(x − <x>)²> by definition. In words, the dispersion equals the square of the width of the distribution of the net displacement about its mean value, <x>. The square root of the dispersion, <Δx²>^(1/2), is the root mean square (RMS) deviation from the mean. The RMS metric is a direct measurement of the width of the distribution of displacements about the mean.
¹ F. Reif, op. cit.
© Springer Nature Switzerland AG 2019 C. S. Young, Risk and the Theory of Security Risk Assessment, Advanced Sciences and Technologies for Security Applications, https://doi.org/10.1007/978-3-030-30600-7
Suppose each perturbation displaces the security control knob a fixed distance "l" in either a clockwise or counter-clockwise direction. The results of Chap. 8 can be used to calculate the dispersion of the distribution of perturbations as well as the total displacement after N perturbations. Such statistics could facilitate a risk-based monitoring strategy for threat scenarios affected by random processes. Assume the probability of a clockwise perturbation of distance l is p. The probability of a counter-clockwise perturbation of distance l is 1 − p, or q. The mean displacement-per-step, <s>, is as follows:

<s> = pl + q(−l) = 2pl − l = (2p − 1)l    (2)
After N perturbations, the mean displacement equals

<x> = (p − q)Nl    (3)
The dispersion, i.e., the variance in the distribution of displacements, is given by the following:

<(x − <x>)²> = <Δx²> = 4pqNl²    (4)
Therefore, the RMS deviation from the mean of the distribution of perturbed displacements equals

<Δx²>^(1/2) = 2l(pqN)^(1/2)    (5)
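The fixed-step results in Eqs. (2)–(5) can be checked with a short Monte Carlo simulation. A minimal sketch, where the choices of p, l and N are arbitrary:

```python
import random

def random_walk_stats(p, l, N, trials, seed=0):
    """Simulate `trials` independent N-step walks where each step is +l
    (clockwise) with probability p and -l (counter-clockwise) with
    probability q = 1 - p; return the sample mean and variance of the
    total displacement x."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        x = sum(l if rng.random() < p else -l for _ in range(N))
        totals.append(x)
    mean = sum(totals) / trials
    var = sum((x - mean) ** 2 for x in totals) / trials
    return mean, var

# Arbitrary parameters: p = 0.6, l = 1, N = 100 steps.
mean, var = random_walk_stats(p=0.6, l=1.0, N=100, trials=20_000)
# Theory predicts <x> = (p - q)Nl = 20 and <Δx²> = 4pqNl² = 96;
# the sampled values should fall close to these.
```

Agreement between the sampled and theoretical values improves as the number of trials grows, per the law of large numbers.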
Appendix 2: Time and Ensemble Averages²

Consider a variable that fluctuates randomly with time, R(t). There could be N systems in an ensemble of such variables, where each member of the ensemble is subject to random fluctuations. The ensemble average, i.e., the average of all N members of the ensemble at a specific time t₁, is given by the following integral, which some readers might recognize as the expectation value:

<R(t₁)>_e = lim_{N→∞} (1/N) Σ_{i=1}^{N} R_i(t₁) = ∫ x₁ p(x₁, t₁) dx₁    (6)

² https://www.nii.ac.jp/qis/first-quantum/forStudents/lecture/pdf/noise/chapter1.pdf
The mean square value of R, or second-order ensemble average, is written as follows:

<R²>_e = (1/N) Σ_{i=1}^{N} R_i²(t)    (7)

The variance is then <R²> − <R>².
We can also specify a time average for the i-th member of the ensemble:

<R_i(t)>_t = lim_{T→∞} (1/T) ∫_{−T/2}^{+T/2} R_i(t) dt    (8)
The time and ensemble averages are not the same thing in general. The ensemble average represents the average of the values of all N members of the ensemble at a specific time. In contrast, the time average computes the average value of one member of the ensemble over some interval in its time history. A process is ergodic if the following is true:

<R(t₁)>_e = <R_i(t)>_t    (9)
The second-order time average, or mean square, is given by

<R_i²(t)> = lim_{T→∞} (1/T) ∫_{−T/2}^{+T/2} [R_i(t)]² dt    (10)
The autocorrelation function measures the self-similarity of a variable measured at time t relative to a later time t + τ:

<R_i(t) R_i(t + τ)> = lim_{T→∞} (1/T) ∫_{−T/2}^{+T/2} R_i(t) R_i(t + τ) dt    (11)
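The distinction between ensemble and time averages, and the ergodicity condition of Eq. (9), can be illustrated numerically. A minimal sketch using independent uniform noise, a stationary and ergodic process (the ensemble size and record length are arbitrary choices):

```python
import random

def ensemble_average(signals, t):
    """Average over all ensemble members at a fixed time index t."""
    return sum(s[t] for s in signals) / len(signals)

def time_average(signal):
    """Average of one ensemble member over its full time history."""
    return sum(signal) / len(signal)

# An ensemble of N realizations of independent uniform noise on [0, 1),
# each recorded for T time steps.  For this stationary, ergodic process
# both averages approach the distribution mean of 0.5.
rng = random.Random(1)
N, T = 1000, 1000
ensemble = [[rng.random() for _ in range(T)] for _ in range(N)]

e_avg = ensemble_average(ensemble, t=0)   # sampled version of Eq. (6)
t_avg = time_average(ensemble[0])         # sampled version of Eq. (8)
```

For a non-stationary process, e.g., one with a drifting mean, the two averages would diverge and the ergodicity condition would fail.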
Appendix 3: Theory of Security Risk Assessment Summary Table

The following table is a summary of some of the key elements of the theory of security risk assessment.
Components of risk
– Likelihood: probability distribution of threat incidents; uncertainty due to distribution variance. Spontaneous threat incidents where arrival is a random variable.
– Potential: risk factor-related incidents, or a change in the magnitude of a risk factor, yields inferences about the likelihood component of risk.
– Vulnerability: loss or damage due to a threat incident.
– Impact: vulnerability-per-threat incident or per-risk factor.

Threat scenario elements
– Threats
– Entities affected by threats
– Environment where threats and entities interact
– Risk components (I, V, L) determine the threat-entity relationship.

Threat scenario categories and features
– Fundamental expression of (unmanaged) risk: Risk (threat scenario) = I × V × L
– Static: dC/dt ≥ dR/dt and dC/dx ≥ dR/dx; information entropy is stable.
– Dynamic: dC/dt < dR/dt or dC/dx < dR/dx; information entropy is unstable.
– Behavioral: contains behavioral risk factors.
– Complex: contains complexity risk factors. The complexity metric is the likelihood of a specific threat scenario state: Cm = 2^(−MH).
– Random: threat incident is a random variable.

Risk factor types
– Apex: dominant contributor to the magnitude of threat scenario risk.
– Spatial: distribution of risk factors within a threat scenario environment affects the magnitude of threat scenario risk.
– Temporal: persistent or intermittent presence affects the magnitude of threat scenario risk.
– Behavioral: behavior or features of entities affect the magnitude of threat scenario risk.
– Complexity: number of risk factors; uncertainty in risk management, i.e., the application of security controls to risk factors.

Systemic security risk metrics
– Persistence: R × Δt
– Transience: dR/ΔC
– Trending: ΔI/t, ΔR′/t and ΔR/t
– Concentration: R/X
– Proliferation: R × X

Symbols:
R = number of risk factors
M = number of risk factors in complexity threat scenarios
C = security control
I = impact
V = vulnerability
L = likelihood
H = information entropy of a security risk management process
ΔI = change in the number of threat incidents
ΔR = change in the number of risk factors
ΔR′ = change in the magnitude of a risk factor
X = number of assets
t = continuous time
Δt = average time interval risk factors remain unaddressed
ΔC = security control duration
dR = risk factor duration
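Several of the tabulated metrics are simple enough to compute directly. A minimal sketch of the complexity metric Cm = 2^(−MH) and two systemic metrics, with purely hypothetical input figures:

```python
def complexity_metric(M, H):
    """Cm = 2**(-M * H): the likelihood of a specific threat scenario
    state, where M is the number of complexity risk factors and H is the
    information entropy (in bits) of the risk management process."""
    return 2.0 ** (-M * H)

def concentration(R, X):
    """Systemic metric: risk factors per asset, R / X."""
    return R / X

def proliferation(R, X):
    """Systemic metric: risk factors times assets, R x X."""
    return R * X

# Hypothetical figures: 3 complexity risk factors, 2 bits of entropy,
# and 12 risk factors spread across 4 assets.
Cm = complexity_metric(M=3, H=2.0)      # 2**-6 = 1/64
per_asset = concentration(R=12, X=4)
spread = proliferation(R=12, X=4)
```

As the entries in the table suggest, larger M or H drives the likelihood of any specific threat scenario state toward zero, which is how complexity degrades the precision of risk management.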