English Pages 275 [272] Year 2010
Sustainable and Resilient Critical Infrastructure Systems
Kasthurirangan Gopalakrishnan and Srinivas Peeta
Sustainable and Resilient Critical Infrastructure Systems: Simulation, Modeling, and Intelligent Engineering
Dr. Kasthurirangan Gopalakrishnan Research Assistant Professor 354 Town Engineering Bldg. Dept. of Civil, Constr. & Env. Engg. Iowa State University Ames, IA 50011-3232 USA E-mail: [email protected]
Dr. Srinivas Peeta Professor of Civil Engineering Director, NEXTRANS Center Purdue University 550 Stadium Mall Drive West Lafayette, IN 47907-2051 USA E-mail: [email protected]
ISBN 978-3-642-11404-5
e-ISBN 978-3-642-11405-2
DOI 10.1007/978-3-642-11405-2
Library of Congress Control Number: 2010924761
© 2010 Springer-Verlag Berlin Heidelberg
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
Typesetting & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India
Printed on acid-free paper
987654321
springer.com
Preface
In today’s societies, multiple individually operated infrastructures, such as transportation and power grids, exist together and are connected via physical, cyber, and logical interdependencies, functioning as a tightly coupled socio-technical system of systems with complicated behavior. Continuous, reliable operation and resilience of such critical interdependent infrastructures are crucial for maintaining homeland security, economic prosperity, and quality of life. This edited book is motivated by recent advances in simulation, modeling, sensing, communications/information, and intelligent and sustainable technologies that have resulted in the development of sophisticated methodologies and instruments to design, characterize, optimize, and evaluate critical infrastructure systems, their resilience and interdependencies, and their condition and the factors that cause their deterioration. The book contains ten carefully selected chapters that cover the broad spectrum of the latest advances in the simulation, modeling, and intelligent engineering of sustainable and resilient critical infrastructure systems. Each chapter has been peer-reviewed by at least two anonymous referees to assure the highest quality. The chapter entitled “Synthesis of Modeling and Simulation Methods on Critical Infrastructure Interdependencies Research” introduces a conceptualization of current research that integrates the multiple ideas in the field of infrastructure interdependencies into a unified hierarchical structure that navigates through research advances from early papers in the 1980s to date. This research survey highlights that most of the existing interdependence modeling strategies are not competing but rather complementary approaches, which can provide a vehicle for immediate innovative studies on coupled infrastructures, such as stochastic interdependence, cascading failures across systems, and the establishment of risk mitigation principles.
The chapter entitled “Interdependencies between Energy and Transportation Systems for National Long Term Planning” proposes a modeling framework to perform long-term electric and transportation infrastructure design at a national
level, accounting for their interdependencies. The approach combines network flow modeling with a multiobjective solution method. The chapter entitled “A Framework for Assessing the Resilience of Infrastructure and Economic Systems” proposes a general framework for assessing the resilience of infrastructure and economic systems. The framework consists of three primary components: (1) a definition of resilience that is specific to infrastructure systems; (2) a quantitative model for measuring the resilience of systems to disruptive events through the evaluation of both impacts to system performance and the cost of recovery; and (3) a qualitative method for assessing the system properties that inherently determine system resilience. The chapter entitled “Regional Infrastructure Investment Allocation for Sustainability” presents a computable theory of multi-region optimal economic growth that may be used to determine tax policy and public infrastructure investment plans that facilitate financial, labor force, and environmental sustainability. The chapter entitled “A Framework for the Manifestation of Tacit Critical Infrastructure Knowledge” reports on the authors’ experience in developing techniques, tools, and algorithms for revealing and interpreting the hidden intricacies of critical infrastructure systems. The chapter entitled “Intelligent Transportation Infrastructure Technologies for Condition Assessment and Structural Health Monitoring of Highway Bridges” presents an overview of the role that intelligent transportation infrastructure technologies are increasingly assuming within bridge management, as well as conceptual strategies for the application of several monitoring approaches.
The chapter entitled “Maintenance Optimization for Heterogeneous Infrastructure Systems: Evolutionary Algorithms for Bottom-up Methods” presents a two-stage bottom-up methodology involving an Evolutionary Algorithm (EA) for maintenance optimization of heterogeneous infrastructure systems, i.e., systems composed of multiple facilities with different characteristics such as environments, materials, and deterioration processes. The chapter entitled “A Swarm Intelligence Approach for Emergency Infrastructure Inspection Scheduling” proposes a methodology that uses a framework combining Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) to schedule structure and infrastructure inspection crews following an earthquake in densely populated metropolitan areas. The chapter entitled “Optimal Highway Infrastructure Maintenance Scheduling Considering Deterministic and Stochastic Aspects of Deterioration” proposes formulations and solution algorithms for both deterministic and stochastic aspects of infrastructure deterioration. Finally, the chapter entitled “Sustainable Rehabilitation of Deteriorated Concrete Highways: Condition Assessment Using Shuffled Complex Evolution (SCE) Global Optimization Approach” introduces two approaches to characterize the structural condition of rubblized concrete pavement systems using Non-Destructive Test (NDT) deflection measurements. Researchers and practitioners engaged in developing, designing, and monitoring sustainable and resilient critical infrastructure systems and those interested in the application of multi-scale modeling methods, simulations, intelligent technologies,
and stochastic optimization techniques to the study of sustainable and resilient critical infrastructure systems will find this book very useful. This book will also serve as an excellent state-of-the-art reference material for graduate and postgraduate students in sustainable critical infrastructure systems.
January 15, 2010 Ames, Iowa
Kasthurirangan (Rangan) Gopalakrishnan
Contents
Synthesis of Modeling and Simulation Methods on Critical Infrastructure Interdependencies Research
Gesara Satumtira, Leonardo Dueñas-Osorio

Interdependencies between Energy and Transportation Systems for National Long Term Planning
Eduardo Ibáñez, Konstantina Gkritza, James McCalley, Dionysios Aliprantis, Robert Brown, Arun Somani, Lizhi Wang

A Framework for Assessing the Resilience of Infrastructure and Economic Systems
Eric D. Vugrin, Drake E. Warren, Mark A. Ehlen, R. Chris Camphouse

Regional Infrastructure Investment Allocation for Sustainability
Terry L. Friesz, Sung H. Chung, Robert D. Weaver

A Framework for the Manifestation of Tacit Critical Infrastructure Knowledge
Ebrahim Bagheri, Ali A. Ghorbani

Intelligent Transportation Infrastructure Technologies for Condition Assessment and Structural Health Monitoring of Highway Bridges
Kerop D. Janoyan, Matthew J. Whelan

Maintenance Optimization for Heterogeneous Infrastructure Systems: Evolutionary Algorithms for Bottom-Up Methods
Hwasoo Yeo, Yoonjin Yoon, Samer Madanat

A Swarm Intelligence Approach for Emergency Infrastructure Inspection Scheduling
Vagelis Plevris, Matthew G. Karlaftis, Nikos D. Lagaros

Optimal Highway Infrastructure Maintenance Scheduling Considering Deterministic and Stochastic Aspects of Deterioration
Manoj K. Jha

Sustainable Rehabilitation of Deteriorated Concrete Highways: Condition Assessment Using Shuffled Complex Evolution (SCE) Global Optimization Approach
Sunghwan Kim, Kasthurirangan Gopalakrishnan

Author Index
Synthesis of Modeling and Simulation Methods on Critical Infrastructure Interdependencies Research
Gesara Satumtira and Leonardo Dueñas-Osorio
Abstract. National security, economic prosperity, and the quality of life of today's societies depend on the continuous and reliable operation of interdependent infrastructures. Models to capture the performance and operation of these systems have been developed to support planning, maintenance, and retrofit decision making from multiple viewpoints, including those of infrastructure owners or investors, private and public users, and government entities that ensure reliability, economic vitality, and security. The study of interdependent infrastructures is challenging because the available data are of heterogeneous quality and often insufficient, and because the spatial and temporal aspects of complex supply-demand operation must be accounted for. Research and implementation studies have attempted to address interdependence modeling through various techniques, such as Agent Based simulation, Input-Output Inoperability, system reliability theory, nonlinear dynamics, and graph theory. These studies are mainly targeted at understanding infrastructure behavior and response to disruptions through single modeling techniques. However, hybrid modeling techniques, multi-scale analyses, and other realizable innovative approaches are lacking, in part because few studies have gathered existing models into a single source to provide a current state of the field, elucidate connections across existing studies, and synthesize a directive for future research. This chapter introduces a conceptualization of current research that integrates the multiple ideas in the field of infrastructure interdependencies into a unified hierarchical structure that navigates through research advances from early papers in the 1980s to date. The body of knowledge is categorized according to several attributes identified in the field, such as mathematical method, modeling objective, scale of analysis, quality and quantity of input data, targeted discipline, and end user type.
The hierarchical conceptualization approach synthesizes available data and is expected to ease the research and application process of interdependencies concepts by finding differences and commonalities in data collection, analysis techniques, and desired outputs. This research survey highlights that most of the existing interdependence modeling strategies are not competing but rather complementary approaches, which can provide a vehicle for immediate innovative studies on coupled infrastructures, such as stochastic interdependence, cascading failures across systems, and the establishment of risk mitigation principles. New linkages across existing research can facilitate implementation and dissemination of results, inform areas of data collection, enable benchmark models for validating predictions and comparing models, and point to long term, broader, and emergent unresolved research issues in infrastructure interdependence research, possibly including smart technologies, bio-inspiration, sustainability, scalability of analysis algorithms, and dimension reduction of network abstractions.

Gesara Satumtira · Leonardo Dueñas-Osorio
Dept. of Civil and Environmental Engineering, Rice University, 6100 Main Street, Houston, Texas, United States
e-mail: [email protected], [email protected]

K. Gopalakrishnan & S. Peeta (Eds.): Sustainable & Resilient Critical Infrastructure Sys., pp. 1–51. © Springer-Verlag Berlin Heidelberg 2010 springerlink.com
1 Introduction

The field of infrastructure interdependency is a new but rapidly growing area of study with contributions from multiple researchers in various engineering, mathematical, and social science disciplines, who often have a diverse set of priorities, academic and implementation preferences, and audiences or beneficiaries of research in mind. These multidisciplinary efforts focus on achieving national security, economic prosperity, and the quality of life of today's societies, while heavily depending on the continuous and reliable operation of critical interdependent infrastructures. Various studies have been conducted recently to quantify infrastructure interdependencies, understand the interaction dynamics between infrastructures, and present tools to mitigate or enhance the emergent effects resulting from interactions. However, these initial attempts to characterize interdependency between critical infrastructures revealed various theoretical and computational challenges for both researchers and policy/decision makers interested in understanding the complex nature of infrastructure behavior. For instance, infrastructure performance studies need to explicitly capture spatial, temporal, and institutional dimensions along with their associated correlations, as seen in real infrastructures, which present unique emerging modeling challenges. In addition, research methods are often limited by the amount and quality of information that infrastructure facility owners and managers are willing to share with public and private professional and academic entities, greatly reducing the generic applicability of the models and tools to be used in real-life scenarios. Moreover, few authors have attempted to synthesize the existing research and identify the similarities and differences across proposed techniques whose common goal is to quantify the effects of interactions on the performance of complex infrastructure systems.
The present chapter provides a survey of the current studies and methods in the field of infrastructure interdependencies and synthesizes the existing approaches according to different attributes of their content. For instance, the available research studies are categorized based on traits such as underlying mathematical methods utilized, modeling objectives, scales of analysis, quantity and quality of input data, targeted disciplines, and end-user types. A hierarchical structure is used to explore the literature related to each attribute and its relation with other attributes. Such a framework can serve as an information tool where users can
easily identify research and implementation studies with similar or competing goals of interest, and guide future research by identifying shortcomings and strengths of current contributions. To maintain a uniform description throughout this analysis of current literature, individual networks or systems, such as transportation and power grids, are referred to as “infrastructures”. Individual infrastructures exist together in what is called a “system of systems,” in which two or more infrastructures interact with one another and are connected via physical, cyber, and logical interdependencies. According to [1], a “system of systems” is defined as a set of multiple, but independently operational, systems that must interact effectively with one another to meet specific needs. Most of the models and studies analyzed in this chapter examine the interdependencies between infrastructures in a complex system of systems. This review first introduces the field of infrastructure interdependency research, its importance and applications, along with an overview of the existing research methods and tools. The conceptual model for the existing body of work used in this chapter is then introduced in Section 2, and the more relevant attributes of interdependent infrastructure research to date are defined. The conceptual model of existing literature utilizes a hierarchical method where the available research is broken down, starting from different sets of attributes grouped by categories, to produce multiple tree-like models, each of which can encompass the entire body of work. This approach is intended to facilitate dissemination and usability of interdependencies research, improve and focus data collection, provide reference work for validation of predictions from distinct models within and across different scales, and provide ideas for conducting future research.
As one of many examples, users may immediately benefit from this review and the critical appraisals in Section 3 by making connections across studies at the network-level and system of systems scales about the effects and mechanisms of interdependence propagation, both positive and negative. These connections can be used to support decisions at the regional-level scale, which in turn feed back into infrastructure operation and management at the network level. The research survey is presented with the additional expectation that it will assist in identifying long term future work regarding the validation of existing and evolving models, understanding interdependent failure propagation, quantifying infrastructure monitoring effects on performance, enhancing multiscale infrastructure modeling, and detecting viable interdependency mechanisms for ready implementation in practice. In addition to critical appraisals of archival literature on infrastructure interdependencies, Section 3 also has a preamble that reports factual information and the statistics that characterize publishing trends in the field, and provides a closure on realizable future work stemming from current theoretical and implementation developments. Section 4 concludes by stressing the findings on diverse and widely adopted approaches to infrastructure interdependence research, the attributes that have received more attention to date, and the possibility for innovative research that builds upon the current body of knowledge.
1.1 Brief History of Research on Infrastructure Interdependencies

Although the majority of work on the study and analysis of infrastructure interdependencies is recent, there are some early works that laid the foundation for current studies. The earliest works on infrastructure interdependencies include those by Gee et al., Nojima and Kameda, Cronin et al., and Rose et al. [2-5]. Gee et al. [2] present a regional development planning model for a system of interdependencies. The model is based on Thoss’ equations and input and output variables. A dynamic budget allocation model is also included and consists of four interacting submodels: the evaluation submodel, for understanding infrastructures from a number of factors, including population and employment, the attractiveness of an infrastructure to private supply markets, and emphasis, priorities, or preferences for different infrastructure sectors; the interaction submodel, for identifying and quantifying the different types of interdependencies, such as intra-regional and intersectoral interactions; the cost submodel, which calculates the total cost of a project; and the allocation submodel, which seeks out a specific project based on a number of time and budgetary constraints. The model was simulated as part of a sensitivity analysis and was able to reduce a group of 40 possible projects to approximately 10 based on the large budget volumes allocated to the projects from varying regions, sectors, and points in time according to preset conditions. Nojima and Kameda [3] developed a method to evaluate the risk to critical infrastructures or lifeline systems resulting from earthquake events. Focus was placed on understanding the functional dependencies between two infrastructures and quantifying such systems’ interactions. In the example provided, the effectiveness of installing back-up equipment to reduce the risks associated with disaster propagation was defined in terms of probability and calculated to be 0.17 × 10⁻².
Cronin et al. [4] examined the impact of telecommunications infrastructure investment, service availability, and pricing on rural economic development in Pennsylvania at the state and county levels. The study utilized a telecommunications infrastructure modeling system consisting of three statistical models, each focused on different aspects of the relationship between telecommunications development and economic benefit. Reduced business costs resulting from telecommunications investments in the state led to an average of over 40,000 jobs generated statewide and an average of 5,000 rural jobs from 1975 to 1991. Additionally, the study revealed that rural customers paid lower prices for local service but paid more as a percentage of their household income for these services than urban customers, and in general even small geographic regions were affected by telecommunication modernization. Lastly, Rose et al. [5] studied a similar relationship between the electricity infrastructure and regional economic infrastructures. Their methodology proposed estimates of regional economic impacts given a major disruption to electricity service caused by a catastrophic earthquake in an attempt to uncover the optimal restoration pattern for all sectors, including agriculture, mining, construction, etc. Three response scenarios were simulated: proportional reduction of electricity across sectors, optimal reallocation of electricity services, and optimal reallocation and restoration of electricity services. Shadow prices were used to
reflect the amount of growth in gross regional product (GRP) that one additional unit of electricity would provide. The highest shadow prices were input in the optimal reallocation and restoration simulation and yielded restoration of GRP to 99.97% of baseline in 2-3 weeks, compared to 98.05% in the optimal reallocation scenario and 91.96% in the proportional reduction scenario. These papers were the first to consider the effects of linkages between individual complex infrastructure systems. In the mid-1990s, terrorist bombings of the World Trade Center in New York City and the federal building in Oklahoma City brought to light the increasing importance of infrastructure interdependencies in modern societies. Recognizing the possible threats to the national infrastructures and security of the United States, President Clinton established the President's Commission on Critical Infrastructure Protection (PCCIP) in 1996. The large commission included representatives of both federal departments and private sector agencies, and comprised an advisory committee of industry leaders appointed by the President to provide the perspective of industry owners and infrastructure operators and a steering committee to oversee the commission's work. The commission was instructed to comprehensively review and recommend a national policy for protecting critical infrastructures to assure their continued operation. The final report [6] was released in October of 1997. No immediate threats to national security were mentioned, but the dangers of an increasing dependence on critical infrastructures and an increased vulnerability to physical disruptions and cyber threats were emphasized. In particular, the commission focused on threats and vulnerabilities in five sectors: Information and Communications, Banking and Finance, Energy, Physical Distribution, and Vital Human Services. In 1998, the Presidential Decision Directive (PDD) no. 63 was released [7].
This directive set a national goal that the United States shall achieve and maintain the ability to protect the nation's critical infrastructures from deliberate attacks by 2003. Several departments and institutions have since been founded and expanded to protect critical infrastructures, including the National Infrastructure Protection Center (NIPC) in 1998, the National Infrastructure Simulation and Analysis Center (NISAC) in 2001, and the U.S. Department of Homeland Security (DHS) in 2002 [7, 8]. Funding to universities, national laboratories, and private companies investigating infrastructure interdependencies has since grown as a result of the increased attention to critical infrastructure protection [9]. Two methodologies emerged in this early phase for modeling infrastructure interdependencies: Agent Based simulation and input-output analysis. In Agent Based simulations, infrastructures are modeled as complex adaptive systems (CAS) composed of agents representing different aspects in an infrastructure system. An agent is a singular piece of code with a specific physical location, function, and memory of past interactions and behaviors. A fundamental element of CAS and Agent Based modeling is the emergence of synergistic behaviors when agents are brought together to interact in a single urban infrastructure system. The different agents can be modeled at varying degrees of granularity based on the intended level of resolution modeling [10]. Agent Based simulations and CAS modeling differed from other prevailing approaches at the time, such as linear
programming, by considering intermediate states as opposed to the final state as representative of practical infrastructures [11]. Several Agent Based modeling and simulation tools have been released. The Spot Market Agent Research Tool (SMART) and Flexible Agent Simulation Toolkit (FAST) presented by North [11] provide an interactive, highly visual representation of infrastructure interactions, with agents specialized to perform diverse tasks that range from acting as system operators to acting as electricity transmission lines. Schoenwald et al. [12] developed the NISAC Agent Based Laboratory for Economics (N-ABLE) model as an extension of the Aspen Electricity Enhancement (ASPEN-EE) model to simulate the economy by including the impacts of market structures and power outages in the electrical power system on other infrastructures. The N-ABLE model also addressed limitations in existing Agent Based models by representing more realistically the contribution of telecommunication networks. A communication network was built into N-ABLE that was adapted to account for delays in the message delivery system, as opposed to the previous process of instantaneous delivery. The new network was tested on a model of the telecommunications and banking infrastructures that incorporated delays in the routers used to transfer packets from one agent in the banking infrastructure to another. The router buffer capacity was one area analyzed, and it affected the number of transactions dropped by the router. Analysis suggested that the marginal benefit of increasing the size of the router communication buffer decreases as the buffer grows. For example, the increase in the number of dropped transaction packets when the buffer size is reduced from nine to seven is greater than twice the increase in dropped packets when the buffer size is reduced from 10 to nine [12, 13]. Agent Based modeling also has the added benefit of being able to simulate imperfect, "human" behavior with behavioral algorithms.
In a system of two electric power markets, agents were used to analyze economic behavior. With no price cap, the two markets followed similar patterns to arrive at their market clearing prices (MCP). With price caps, the markets also followed patterns similar to those in the no-cap case, except when demand spikes led to increased prices, in which case one market reached a capped MCP while the other did not. When both markets have identical price caps, the MCP is significantly higher under price cap rules than without, except for times when prices spike due to high demand. These price spike events are offset by an increase in prices under price cap conditions that are about 6% to 7% higher than the MCP without the price cap [14]. Such behaviors are captured and incorporated into Agent Based simulations through the use of non-deterministic tools, such as methods based on fuzzy logic, where a person's decisions may vary with a certain degree of uncertainty [15]. For example, fuzzy logic rules were employed to study the average time different hospital users (doctors, nurses, students, and patients) waited at a train station during a blackout. Results revealed that when there is major congestion at the station, students did not wait for station times to return to normal while nurses did [15]. The second approach is based on the Input-Output (IO) economic analysis introduced by Wassily Leontief in the early 1930s, but adapted to model interacting infrastructure systems. Using the early Leontief IO analysis, Haimes and Jiang [16] developed the linear input-output inoperability model (IIM) to study the effects interdependencies could have on the inoperability of coupled networked
systems. For example, in a two-subsystem model where failure of subsystem 2 leads subsystem 1 to be 80% inoperable, and a failure of subsystem 1 renders subsystem 2 20% inoperable, the effects of functionality loss due to an external perturbation can be calculated by solving the Leontief equation adapted for infrastructures. Specifically, interdependent effects can amplify a 60% loss of performance from external perturbations in subsystem 2 to a 57% inoperability of subsystem 1 and a 71% inoperability of subsystem 2 according to the IIM approach. Santos and Haimes [17, 18] then created the demand-reduction IIM to complement the physical-based IIM. The physical-based model quantified inoperability of infrastructure systems as a function of degraded ability to deliver inputs, while the demand-reduction model analyzed inoperability that stems from reductions in demand and consumption compounded with psychological effects sparked by perturbations, such as terrorism-related events. This demand-reduction model utilized both geographic and functional decomposition methods to allow for a more accurate analysis of specific geographic regions and the breakdown of large-scale systems into smaller subsystems. Analysis using the demand-reduction inoperability IO model is capable of producing sector rankings based on the estimated economic loss caused by perturbations to a primary sector. For instance, in terms of inoperability, the most affected interdependent economic sectors in the United States from terrorism events perturbing 10% of the air transportation system were air transportation itself, travel and sightseeing agencies, aircraft engine and part manufacturing, oil and gas extraction, footwear manufacturing, printing, and petroleum refining, among others.
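The two-subsystem IIM example above can be reproduced with a few lines of code. The sketch below assumes the standard linear form of the model, q = A q + c, where q is the vector of inoperabilities, A is the interdependency matrix, and c is the external perturbation; the function names and the small Gaussian elimination helper are illustrative choices and are not taken from [16].

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] on copies of the rows.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Move the row with the largest pivot candidate into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def iim_inoperability(A, c):
    """Solve q = A q + c, i.e. (I - A) q = c, for the inoperability vector q."""
    n = len(c)
    i_minus_a = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)]
                 for i in range(n)]
    return solve_linear(i_minus_a, c)


# A[i][j]: inoperability induced in subsystem i+1 by full failure of subsystem j+1.
A = [[0.0, 0.8],   # failure of subsystem 2 makes subsystem 1 80% inoperable
     [0.2, 0.0]]   # failure of subsystem 1 makes subsystem 2 20% inoperable
c = [0.0, 0.6]     # 60% external perturbation applied to subsystem 2

q = iim_inoperability(A, c)
print([round(v, 3) for v in q])  # about 0.571 and 0.714
```

Under these assumptions the solution matches the 57% and 71% inoperabilities quoted in the text, which illustrates how feedback between the two subsystems amplifies the initial 60% perturbation.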
By actual dollar amounts, however, the inoperability of air transportation produced economic losses that ranged from tens of thousands of millions of 1997 dollars in the workforce sector down to hundreds of millions of 1997 dollars in the telecommunications sector, with air transportation itself, real estate, retail and wholesale trade, food and drinking services, travel agencies, and oil and gas extraction falling between these two among the top impacted sectors. These results capture disaster-induced psychological effects, such as the “fear factor”, confirming a significant reduction of the public’s demand for goods and services. Such reductions in demand and the associated economic repercussions can be easily compared against an infrastructure’s nominal operation level. Further expansions of the IIM approaches combined the physical and demand-reduction IO models with sector-specific cost of recovery models to evaluate risk management options by generating a Pareto-optimal frontier of solutions, which is used to identify inferior solutions for disaster scenarios. These inferior solutions lie above the tradeoff curve between investment and economic loss and require greater recovery investment without any sizeable economic payoff. Power systems are used to illustrate an almost linear Pareto-optimal tradeoff between economic loss and recovery cost for an initial loss of approximately 50% of the users of a utility company that serves more than two million customers in the Eastern United States [17]. The IIM approach has continued expanding and is now deemed a unique, flexible tool for describing interdependent effects in economic and infrastructure operations. In fact, IIM approaches offer an intuitive interpretation of interactions, which has led to their successful application in risk-based decision making processes under uncertainty [19-21]. For example, IIM tools have been used to study management of disruptions to the transportation
G. Satumtira and L. Dueñas-Osorio
sector and identify sources of risk that would affect the economy, safety, and efficient sector operations [20], and to develop risk management strategies to reduce the effects of disruptive events originating in the mining sector on oil and gas operations [21]. Additionally, a wide variety of tools have been developed, such as Hierarchical Holographic Models (HHM) and Phantom Systems Models (PSM), to identify risk scenarios in a system and help justify risk preparedness investments for resilient design of systems against emergent risks [19].
In the last five years, researchers began employing various other methods to model infrastructure interdependencies. Approaches based on graph and network theory model interconnected infrastructures through abstract graphs made of nodes and of arcs that represent links between components in an infrastructure, and take advantage of closed form expressions and numerical simulations to characterize their topology, performance, and uncertainty. For instance, Nozick et al. [22] present a model of interconnected natural gas distribution and electricity generation/distribution systems in which gas distribution hubs and electricity generators are represented as nodes. The gas distribution network is maintained by a supervisory control and data acquisition (SCADA) system. The SCADA equipment monitors volumes, pressures, temperatures, and the status of the pipeline facilities, and it can be used to remotely start and stop gas compressors to change flow volumes. In the model, the SCADA system controls the flow of gas in links between the electricity generators and gas distribution nodes. Changes in link capacity over time include random failures, which reduce link gas/electricity capacity, and repair actions. Markov and Semi-Markov processes are used to reflect uncertain capacities on links and represent transition states of links over time.
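As a rough illustration of how a Markov process can describe uncertain link capacity, the sketch below tracks the probability that a single link is in each of three capacity states and compares a baseline transition matrix with one in which the probability of entering the failed state is lower. It is not the model of Nozick et al. [22]: the three states, every transition probability, and the names `step`, `long_run`, `baseline`, and `improved` are hypothetical values chosen for illustration.

```python
def step(dist, P):
    """Advance a state distribution by one time step: dist' = dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def long_run(P, steps=500):
    """Approximate the long-run state distribution by repeated stepping."""
    dist = [1.0] + [0.0] * (len(P) - 1)  # start at full capacity
    for _ in range(steps):
        dist = step(dist, P)
    return dist


# States: 0 = full capacity, 1 = reduced capacity, 2 = failed.
# Row i holds the probabilities of moving from state i to states 0, 1, 2.
baseline = [[0.900, 0.07, 0.030],
            [0.500, 0.40, 0.100],
            [0.300, 0.00, 0.700]]

# Hypothetical reliability improvement: entries into the failed state
# (last column of rows 0 and 1) are 20% smaller than in the baseline.
improved = [[0.906, 0.07, 0.024],
            [0.520, 0.40, 0.080],
            [0.300, 0.00, 0.700]]

failed_before = long_run(baseline)[2]
failed_after = long_run(improved)[2]
print(failed_after < failed_before)  # True
```

Comparing the long-run share of time spent in the failed state before and after the change gives one simple way to express the benefit of a reliability improvement in this kind of model.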
Markov processes are also used to represent investment opportunities as changes in the link state transition matrices, for example by representing an improvement to equipment reliability as a reduced probability of entering a failure state. In an illustrative example, a strategy for multiple investments is developed, such that the order of investments ensures the reliability of gas supply from a specific supplier. Alternatively, Dueñas-Osorio et al. [23, 24] present a novel approach that uses network theory and system reliability to model the interdependent response of infrastructure systems to natural hazards and targeted attacks. In their model, they utilized spatial proximity and logical interactions to establish the density and degree of coupling between practical power and water infrastructures. Storage tanks and large pumps in the water infrastructure and gate stations and electrical substations in the power infrastructure were modeled as nodes, while water pipelines and power transmission lines were modeled as links. They found for both types of disruptions that critical components of individual infrastructures, which are often the object of priority retrofit and maintenance, are not necessarily the elements that facilitate coupling at the interface between systems, but that secondary elements are often the ones with the coupling role. Also, their interdependent fragility curves captured in a probabilistic manner the effects of power system failure on water distribution networks, which can increase direct damage of one system four to five times as a function of interdependence strength. Mitigation strategies were also shown to be effective by only focusing on a few key network elements and local topological modifications that are important for network operation, flow
traversal, and coupling across systems. Svendsen and Wolthusen [25] developed a similar graph theoretic approach that also allowed for the modeling of buffered or stored resources. Network theory and probabilistic analyses provided an alternative to Agent Based and IIM analysis methods: they describe physical components in more detail than the large-scale holistic view of IIM models and are less data intensive than agent-based simulations [26], while still capturing multiple uncertainty sources and their effects if explicitly modeled.
More recently, the mixed holistic reductionistic (MHR) approach, proposed by [27], was created to exploit the capabilities of generic holistic and reductionistic methods. In holistic modeling, infrastructures are seen as singular entities with clearly defined boundaries and functional properties. Services are provided by the infrastructure in such a way that the individual components comprising the infrastructure are irrelevant. On the other hand, reductionistic modeling emphasizes the need to fully understand the roles and behavior of individual components to truly understand the infrastructure as a whole. The boundary definitions are often lost at these multiple levels of abstraction, and different levels of analysis require one or both of the two points of view in a single model. Global analysis, which identifies the many critical infrastructures and the roles they play at the global scale, generally adopts a holistic point of view, whereas system analysis, which accounts for the interactions among components contained in an infrastructure, takes a reductionistic standpoint in modeling. However, the analysis of complex networks requires understanding of the interaction of components within the same infrastructure, as well as interactions within and across multiple infrastructures.
With this new Mixed Holistic-Reductionistic (MHR) model, relationships between infrastructures can be seen at different levels through either a top-down or a bottom-up approach. In addition to modeling interactions at the component and infrastructure level, the MHR model is able to model functional relationships between components and infrastructures at different levels of granularity. However, the MHR model is still a conceptual proposition that needs to be materialized. Evolving work on network theory modeling approaches may fill the gaps and enhance MHR techniques, since recent attempts have included specialized modeling tools that account for the sources of failures, the effect of network topology on response, and the propagation of disruptions across interdependent infrastructures. Such tools include the joint use of topological signatures and structural fragility [23, 24], vulnerability and flow analyses [28], failure propagation and crisis evolution simulations [29], and consequence curves and flexible cartography for spatial and temporal analyses [30].
2 Literature Review Model Description and Category Definitions
Although the study of interdependent infrastructure is relatively recent, its vast applicability across disciplines has resulted in a broad range of research topics. This diverse intellectual output is difficult to synthesize when identification of similarities and/or differences between studies is desirable. Thus, there is the need
to find structure in the existing literature in order to analyze and compare all present studies, as well as to categorize future studies. The conceptual structure that emerges from the existing literature is presented in Section 2.1, followed by a description of the dominant research categories to date in Section 2.2.
2.1 Hierarchical Decomposition of Existing Literature
The conceptualization structure of current research needs to reach a sufficient level of detail to capture specialized elements of published studies, but be general enough that the studies can be grouped into a manageable number of categories and attributes per category. Six common categories seen in the existing archival literature are identified, where each category is further characterized by a different number of attributes. The dominant categories and recurring themes used to classify each of the studies include the mathematical model, modeling objective, scale of analysis, quantity and quality of input data, targeted discipline, and end user type. Figures 1a and 1b present a schematic classification sample with the six identified categories and their attributes. Note that each attribute per category can be explored for all of the attributes of the immediately dependent category. Conceptual and strategic action papers are also included in the conceptualization structure of published research on infrastructure interdependencies, as they inform the rest of the research literature body. Studies with similar attributes are easily grouped using the hierarchical structure, in which relations between existing studies can be identified and potential new relations for future works can be created. More detailed conceptual descriptions of the individual categories and associated attributes are presented in Section 2.2. To aid in the classification process, an attributes matrix containing all of the analyzed published literature was developed. It is included in Appendix A for the readers’ own appraisal and selection of research papers, based not just on the frequency of categories as exemplified in Figures 1a and 1b, but on any other hierarchical criterion of interest, such as specific sequences of categories or attributes.
In the matrix, studies are input as rows in the same order as the references of this chapter while categories and attributes are listed in columns. For any given paper, the attributes to which it applies are highlighted. For example, column 1d
Fig. 1a Sample Application of the Hierarchical Decomposition to Existing Literature: Mathematical Method Utilized, Modeling Objective, Scale of Analysis
Fig. 1b Sample Application of the Hierarchical Decomposition to Existing Literature: Quality and Quantity of Input Data, Targeted Discipline, and End User Type
contains a count of the number of unclassified methods utilized by a given study, where blacked-out cells are equal to one. Altogether, the matrix is an inventory of the different studies and their attributes. Inclusion of the matrix is useful for quickly and easily identifying particularly dense or sparse areas which warrant further investigation. Through its use, the entire literature body may be viewed from any category or attribute. See Figure 1c for a snapshot of a portion of the literature matrix. To view the entire literature matrix and for a description of the column headings, see Appendix A.
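The way such a matrix exposes dense and sparse areas can be sketched in a few lines. The example below is only an illustration of the idea: the study labels, the attribute tags assigned to them, and the names `build_matrix` and `column_counts` are hypothetical, and the four columns stand in for the mathematical method attributes only. The actual matrix and its column headings are those of Appendix A.

```python
# Hypothetical attribute columns (mathematical method category only).
ATTRIBUTES = ["agent_based", "input_output", "network", "other"]

# Hypothetical studies and the attributes that apply to each one.
studies = {
    "study_A": {"agent_based"},
    "study_B": {"input_output"},
    "study_C": {"network", "other"},
    "study_D": {"network"},
}


def build_matrix(studies, attributes):
    """One row per study; a cell is 1 when the attribute applies, else 0."""
    return {name: [1 if a in tags else 0 for a in attributes]
            for name, tags in studies.items()}


def column_counts(matrix, attributes):
    """Number of studies carrying each attribute (column sums)."""
    return {a: sum(row[k] for row in matrix.values())
            for k, a in enumerate(attributes)}


matrix = build_matrix(studies, ATTRIBUTES)
counts = column_counts(matrix, ATTRIBUTES)
print(counts)
# {'agent_based': 1, 'input_output': 1, 'network': 2, 'other': 1}
```

Columns with low counts point to sparsely studied attributes, and the same row-by-column layout allows filtering on any combination of attributes.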
Fig. 1c A Portion of the Literature Attributes Matrix
2.2 Definition of Categories and Attributes
The different categories presented here are identified by examining the available literature and determining common characteristics seen in all of the pertinent published studies. The categories included are mathematical method utilized, modeling objective, scale of analysis, quantity and quality of input data, targeted discipline, and end user type. Each category is further divided into its respective attributes. For example, in the targeted discipline category, studies fall under engineering, social-technical, or economic attributes based on the characteristics and focus of the manuscripts’ content. Studies that have multiple attributes (e.g., targeted to both the engineering and economic disciplines) are classified under all that apply. Some of the categories, such as mathematical method and modeling objective, were typically the main focus of their given studies. In contrast, other categories, for instance quality and quantity of input data, revealed themselves to be common areas of difficulty stressed and discussed in many of the studies, but were not their intellectual focus.
2.2.1 Mathematical Method
Authors have used several types of mathematical modeling methods to analyze and simulate infrastructure interdependencies as part of their studies. There are four attributes in this category: Agent Based, input-output, network or graph theory, and all other emerging models. In Agent Based modeling, components in an infrastructure (such as electric transformers or generators, along with users and institutions) are represented by individual agents defined by a specific set of rules that determine the actions the agent will take in response to the actions of other agents. An infrastructure system consists of multiple agents interacting together based on Complex Adaptive Systems (CAS) approaches.
Agent Based simulations can be used to model various infrastructures and social systems, including economic activity and the energy infrastructure [31-33]. In [13], an economic activity model is presented that uses N-ABLE to simulate the economic environment, allowing agents to represent manufacturing firms, government agencies, and households that interact via markets and infrastructures. The model was used to study the economic impacts on chemical industries caused by the disruption to electric power in the Gulf Coast and rail transport nationwide. Specifically, inter- and intra-firm supply chain impacts were simulated for disruptions to electric power and transportation services that ranged from three to eight days, using industry data collected from over 3,000 firms that manufacture, transport, and repackage the chemical. A key finding was that even though the disruption to electric power lasted only one week, the chemical supply chain still took four to six weeks to return to a baseline state. The underlying constraint in this case was the availability of rail cars to ship the chemical product to and from locations. Balducelli et al. [15] use the SICIMOD simulator (Simulation of Interdependency of Critical Infrastructures and MODeling of the users' behavior) to capture the interdependencies between the energy and transportation infrastructures. A scenario was considered in which an electrical power outage occurs on parts of an entire railway path. The passengers on the train were divided into two classes, hospital
users and generic passengers, and each class had a set of fuzzy behavioral rules to define its behavior. The simulation begins under normal conditions; when the power outage occurs, the behavioral rules are applied and the users waiting for the train have the option to take alternative means of transport, go by foot, or remain waiting. The simulation was executed for multiple scenarios in which the power outage duration, time of day, and day of the week were changed in order to evaluate the level of delays users experienced and any resulting emergency problems. For a power outage of 30 minutes, users began arriving late to the hospital one hour after the blackout occurred. Among the users, students had the highest percentage of late arrivals, with up to 80% late within an hour after the outage occurred.
While the Agent Based method has proven itself to be useful for modeling reactionary and time-evolving behaviors of elements in a system, there is another method that takes advantage of known input and output resources. Inoperability input-output models (IIM) analyze how perturbations on an initially affected infrastructure propagate to other infrastructures through the exchange of input and output commodities that link them. The input-output analysis is based on Wassily Leontief’s original model used for studying equilibrium conditions in the economy. Leontief’s model is based on equation (1) as presented by [16, 34, 35]. The xi term refers to the total production output from industry i. The Leontief technical coefficient, aij, is the ratio of inputs of industry i to industry j, in terms of the production requirements of industry j. Lastly, ci refers to the portion of industry i’s final production for end user consumption. Given n industries, aij indicates the distribution of inputs from industries i = 1, 2, ..., n to the total number of inputs required by industry j.

$$x = Ax + c \;\Leftrightarrow\; \Big\{ x_i = \sum_j a_{ij} x_j + c_i \Big\} \;\; \forall i \tag{1}$$
Interpreting this equation from an infrastructure viewpoint, the terms of the equation become risks of inoperability, while the functional form of the equation remains the same. In the infrastructure model, xj is the overall risk of inoperability experienced by infrastructure j, and is a measure of both the degree of inoperability and its probability; aij is the probability of inoperability that the jth infrastructure contributes to the ith infrastructure due to their interconnectedness; and ci is the additional risk of inoperability that is inherent in the complexity of the ith infrastructure. Together, aij, xj, and ci are the factors in the total risk of inoperability of infrastructure i, xi. In a recent application, Haimes et al. [34, 35] used the inoperability input-output analysis method to model the impacts of high-altitude electromagnetic pulse (HEMP) attacks, which have the potential to cause severe damage to electrical and telecommunication systems. For example, for a 60 day recovery period and a 6% inoperability scenario of the HEMP vulnerable equipment, total economic losses sum to $6 billion for all sectors in the greater northeastern region of the United States. There are other extensions of the IIM analysis not yet discussed, including the supply-side and output-side models [36]. The supply-side IIM is used when the supply factors are more important than demand, as is the case in monopolistic markets and when there is a shortage of resources. The resulting model measures the normalized price change of output due to a value adding price perturbation. For instance, the transportation industry has to pay higher
wages due to its workers working overtime in order to maintain a nominal level of operations. A 20% increase in wages in the transportation sector results in output price increases of 32% in the Transportation system, 12% in the Power system, 18% in the Health system, and 14% in the Gas Supply system. The normalized price changes can be translated back to degraded output, in dollar terms, of $4,597, $5,132, $3,651, and $3,000, respectively. Alternatively, the output-side IIM is used to observe the impact on infrastructures’ ability to output resources due to capacity loss, facility closure, and other disruptions caused by perturbations. In many cases, the change in output can be estimated based on available capacity and resources. A potential extension of this versatile IIM approach is to expand its framework to explicitly account for the uncertainties in the inoperability matrix components as well as the uncertainty in perturbation events. This can be done by explicitly introducing random variables in the existing formulations. Initial steps in that direction are provided by Barker and Haimes [21], although they restrict randomness to only one of the demand components of the system of infrastructures.
Network science and graph theory are approaches that use nodes and links to directly model different types of infrastructure components and their coupling topology, while exploiting the visual representation of the systems for increased adoption and usability, and even building upon existing methods including the IIM approach [37-42]. For instance, in an electrical network, power generators deliver power to electrical substations, which are connected by electrical conductors to deliver electricity to users and components of other systems, such as water pumping or gas compressor stations. Generators, substations, and load points are represented by nodes while electrical wires are represented by links.
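The node-and-link representation of an electrical network can be sketched as follows. This is a minimal illustration under stated assumptions, not a model from the cited studies: the six-link toy network, the node labels, and the names `build_adjacency` and `served_loads` are hypothetical, and "service" is reduced to plain connectivity, meaning that a load point counts as served if any path links it to a working generator.

```python
from collections import deque

# Hypothetical toy network: generators (G), substations (S), load points (L).
links = [("G1", "S1"), ("G2", "S2"), ("S1", "S2"),
         ("S1", "L1"), ("S2", "L2"), ("S2", "L3")]
generators = {"G1", "G2"}
loads = {"L1", "L2", "L3"}


def build_adjacency(links, removed=frozenset()):
    """Undirected adjacency sets, skipping links that touch a removed node."""
    adj = {}
    for u, v in links:
        if u in removed or v in removed:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def served_loads(links, generators, loads, removed=frozenset()):
    """Load points still reachable from at least one working generator."""
    adj = build_adjacency(links, removed)
    seen = {g for g in generators if g not in removed}
    queue = deque(seen)
    while queue:  # breadth-first search outward from the generators
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return loads & seen


print(sorted(served_loads(links, generators, loads)))                  # ['L1', 'L2', 'L3']
print(sorted(served_loads(links, generators, loads, removed={"S2"})))  # ['L1']
```

Removing substation S2 cuts off two of the three load points, which shows how the loss of a single node translates into a loss of connectivity for the network as a whole.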
More generically, nodes are representative of units in the same infrastructure or across multiple infrastructures that consume and/or produce resources, while links represent the means by which resources travel to and from units. In [23] and [24], nodes and links of water (W) and power (P) systems are referred to as vertices vW ∈ VW or vP ∈ VP and edges eW ∈ EW or eP ∈ EP, which are contained in networks or graphs GW(VW,EW) or GP(VP,EP) representing the water and power systems, respectively. Several fundamental topological properties characterize the centrality, efficiency, and redundancy of infrastructure systems, such as the vertex degree or number of connections to a single vertex, the path length between vertices, the extent of local clustering, and the number of node-independent paths, which are redundant paths that only share their start and end nodes. Interdependencies within the network system are modeled by interdependent adjacency matrices, which capture the location, direction, and strength of coupling. These matrix terms are expressed through conditional probabilities P(Wi|Pj) = pi|j for i = {1,…,NW} and j = {1,…,NP}, where pi|j is the probability of failure of water component i given failure of power component j, and N represents the cardinality of the vertex set V for the water (W) and power (P) systems. The effect of the interdependent interface is assessed by running multiple simulations in which vertices are removed randomly, through targeted attacks, or as a consequence of natural hazards. The network system response is typically measured by the loss of global connectivity or service quality. In a similar framework using the Demon Model and mean field theory,
Carreras et al. [43] carry out a set of calculations based on ideal networks and pre-established coupling rules regarding the operational state of each node at each time step. These calculations determine the coupling and the fraction of operational components. Analysis reveals that the symmetric coupling of systems makes a system more susceptible to critical failures than non-coupled systems and can cause subsequent failures in coupled systems. Similarly, Rosato et al. [44] model the Italian high-voltage grid (HVIET) and its functional dependency on the Italian internet system (GARR) by means of flow models that go beyond connectivity concepts. The power flow on the high-voltage grid can be written as P = Bθ, where θ and P are the vectors composed of the voltage phase and the electrical power at each of the N nodes of the HVIET, and B is an N×N matrix of the reactance between nodes. The solution of the power flow model is the normal response of the network to predefined initial conditions, P0. The GARR network consists of an adjacency matrix of nodes and arcs in which data packets are generated at an Origin Node (ON) and sent to Destination Nodes (DN). At any given time, a node may perform one of two actions: send a packet of data to a node directly connected to it (but not to itself), or receive one or more data packets from nodes it is directly connected to. For each DN, two different nodes j1 and j2 are defined that are directly connected to the node. The DN will be directed towards one of these two nodes based on the probabilistic rule:
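$$P(j_1) = \frac{e^{-\beta X_{j_1}}}{e^{-\beta X_{j_1}} + e^{-\beta X_{j_2}}} \tag{2a}$$

$$P(j_2) = \frac{e^{-\beta X_{j_2}}}{e^{-\beta X_{j_1}} + e^{-\beta X_{j_2}}} \tag{2b}$$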
where Xk, with k = j1, j2, is the number of packets previously sent to node k through the different paths and β is a parameter to ensure the routing strategy is completely random. To simulate the relationship between the HVIET and GARR networks, the HVIET network was perturbed and the corresponding configuration of the GARR network examined. In the case of intermediate coupling (where the GARR node is put in the off state if and only if the power received by the node is below 75% of the normal power level), the HVIET network was able to partially recover its operability levels while the GARR network was not, despite the implemented routing strategy.
Several other mathematical modeling techniques have been used to model infrastructure interdependencies. Such models include those based on matrix partitioning, which is capable of solving large scale networks in real time [45], optimization-based network flow modeling, which combines network flow models with multi-objective solution methods [46, 47], Petri nets [48, 49], the widely used nonhomogeneous Poisson power law models [50], system dynamics and nonlinear science approaches for studying complex feedback systems [51], the multiobjective 0-1 knapsack problem method to simulate transportation planning and modeling of multiple interdependent criteria [52, 53], joint vulnerability and flow analyses [28], Mixed Holistic Reductionistic tools [27], and Monte Carlo Simulation that incorporates elements from Petri nets, flood frequency analysis,
and Markov Chain processes [54, 55]. Additional recent methods include using Bayesian networks and game theory for infrastructure interaction and performance modeling. In a study conducted using Bayesian networks, Bensi et al. [56] integrate information from various data sources in near-real time to provide up-to-date descriptions of infrastructures following an earthquake. A Bayesian network model is used to model both seismic demands and system performance, and is most powerful at assessing and updating component and system reliability. To illustrate this, five cases are presented with varying details, including location of the seismic epicenter, bridge collapse, sensor data transmission, and infrastructure status updates. The probability of failure of six different bridges is calculated to range from less than 1% to 5%. Another means for effectively modeling network designs and capturing spatial elements is game theory. Zhang and Peeta [57] use a multilayer infrastructure network spatial computable general equilibrium (MINSCGE) model that considers interdependent infrastructures as production sectors interacting with each other as well as with a modeled household in an economic environment. This game theory approach is able to capture all aspects of the interdependencies in a single model and takes advantage of realistic calibration data available through public and private entities. The proposed MINSCGE model consists of a closed spatial economy with N geographical regions and one household unit. Individual commodity networks, such as data, water, and electricity, are represented by a directed graph Gk(N,Ak), where Ak is the set of links in the infrastructure network k. Simulation of interdependencies using the MINSCGE model is capable of analyzing cascading disruptions, disaster recovery, and resource allocation for multiple regions.
These various techniques share a common goal of improving the ability of existing methods to accurately model infrastructure interdependencies through improved data acquisition, sophisticated mathematical underpinnings, multi-scale modeling, and the motivation to improve risk communication and mitigation.
2.2.2 Modeling Objective
The models presented in the literature typically have various objectives aside from modeling the interdependencies between infrastructures. The most common objectives are risk and vulnerability analysis, risk mitigation measures and infrastructure protection, failure propagation prediction, and interdependence modeling and simulation. It is often the case that a single model or study will have multiple objectives. For example, a model that can anticipate cascading failures may then present a tool to select an appropriate mitigation investment strategy. In this instance, studies in the archival literature are classified as having both investment option prioritizations for mitigation and failure propagation prediction objectives.
Risk and vulnerability analysis and assessment is conducted to identify the events that can lead to malfunction or failure of critical infrastructures and quantify the social, technical, and economic consequences of their occurrence [28, 58-63]. Vulnerabilities and their associated consequences can be more easily identified and better understood by viewing them in the context of both normal operational levels and an extreme event or direct perturbation [64]. The vulnerability of critical infrastructures, such as the hospital system, due to emerging
interdependencies is often event driven, meaning that on normal days the disruption of a hospital system’s operations will not cause the immediate breakdown of other critical infrastructures. Although the hospital system itself may have reached critical levels, it does not contribute to the vulnerability of infrastructure interdependencies. Chai et al. [64] emphasize the importance of studying infrastructure interdependencies within a certain event and context to analyze emergent behaviors. Arboleda et al. [65] present a case study that investigates the operability of a large Midwestern hospital that receives upwards of 150,000 patients every year. Operational levels of service are analyzed during normal conditions, and then, in the event of a natural disaster, operational levels are assessed to determine the effects of demand surge and disruption to water, electricity, and power supply. The disaster event, in this case an earthquake, causes disruptions to water, power, and the road system, reducing supply. The impact of the earthquake reduced the hospital’s supply of water, power, and transportation of medical supplies by 84%, 89%, and 48%, respectively. However, only the water and transportation systems can satisfy the demand, while power can only supply up to 85% of the demand. Alternatives to restore infrastructure networks are evaluated, such as the installation of new electricity generators, water intake stations, and power plants to satisfy the demand. The risk analysis techniques employed can assist hospital administrators in identifying their most critical components as well as in evaluating restoration strategies, which constitute another objective that is discussed in the next paragraph. As part of the analysis, normal operation levels of the health care facility are also assessed.
The normal operations model describes the functioning of the supportive infrastructure systems, such as the power generation, water supply, and transportation systems, as well as the hospital’s internal services, including the emergency room, intensive care unit, and surgery. In the illustrative case, the demand at all nodes under normal conditions is satisfied, and the hospital is able to provide sufficient medical care to incoming patients (fewer than 500 casualties). The normal operations model analysis can then be used to identify the component nodes and flow arcs that can be used in the recovery process following a disaster event.
In response to emergencies and for the mitigation of interdependent effects, different investment strategies have been created in the form of physical and operational changes to infrastructure systems [19, 20, 26, 47, 52, 53, 66]. The objective of many models is to identify and analyze risk management and infrastructure protection measures, and then select or prioritize the options that would best improve infrastructure performance [67]. Identifying and modeling these investment strategies is a major attribute in the modeling objective category. To model investment opportunities, Xu et al. [26] represent them as changes to the transition matrices in the Markov model used to capture variations in the capacity of links over time in network flow models. For example, improved reliability of a piece of equipment can be represented as a reduced probability of entering a failure state in the link capacity of the Markov model. Considering the SCADA link S1 to DS1, a 20% reduction in the transition probability to enter the lowest failure state and an investment of $100k on the link would result in a new transition matrix where the second column is [0.032, 0.303, 0.005, 0.005]. If an additional $150k is invested
on this link, the second column in the transition matrix would now be [0.026, 0.242, 0.004, 0.004], whereby the probability to enter the failure state is reduced. Using the Markov model, spatial and temporal connections between network links are represented in both transient and steady-state analysis. Investment prioritization can also have the added difficulty of meeting multiple and often conflicting criteria, including economic, ecological, social, health, and safety aspects. Merz et al. [68] use Multi-Attribute Value Theory (MAVT) as part of Multi-Criteria Decision Analysis (MCDA) to evaluate potential Business Continuity Planning (BCP) programs. In a model of the chemical industry, the best possible BCP is a function of the acceptance by workers and the public, the costs for personnel and resources to implement the BCP, and the effectiveness in terms of environmental impacts and production aspects. The BCP alternatives are valued based on their preference to decision makers and overall performance. The highest ranking BCP in this case study corresponds to the criteria of environmental impacts and production aspects. Through MAVT and MCDA, Merz et al. [68] identified the best BCP given specific criteria. The MAVT and MCDA methods foster effective decision making and BCP selection in the event of conflicting interests [20, 21, 66]. Since system failures are not always isolated events, they can spread very rapidly to other parts of the infrastructure network system and neighboring infrastructures. Many modeling techniques attempt to predict cascading or domino failures with the hope of containing the failure to a manageable level or preventing its propagation through individual and coupled systems. Failure propagation analysis features such techniques as consequence curves, used by Robert et al. [30, 69], who attempt to anticipate cascading effects in the telecommunication, electricity, and drinking water infrastructures.
Consequence curves are used to depict the changing status of an infrastructure due to the loss of an input that it uses. A color code is used to represent the status and the extent to which the infrastructure network has degraded. The effective use of these tools, and of modeling the impacts of removing resources used and supplied by systems, has been verified by partners at the Centre Risque and Performance, who appreciate the simplicity of the model and its ability to be applied quickly and usefully. Similarly, [70] study the cascading failures that result between the telecommunication and electricity networks. Results reveal that the telecommunication infrastructure’s ability to recover mitigates cascading failure effects. A fast-healing telecommunication system enables the electrical infrastructure to gather information quickly and take appropriate actions to reconfigure its system elements. More recently, Hernández-Fajardo and Dueñas-Osorio [71] developed an instantaneous failure propagation approach that enables tracking forward and backward (cyclic) interdependent effects across water and power infrastructure systems under earthquake hazards until the propagation of failure is locally absorbed, or the system of systems collapses. They found that most of the detrimental effects are propagated in the initial forward and backward cycles of the cascade, and that after response stabilization occurs, cascading effects alone can increase infrastructure fragility up to 25% over acyclic one-way interactions.
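The idea of tracking forward and backward interdependent effects until a failure is absorbed can be illustrated with a minimal sketch. It is not the method of [71] or of any other cited study: the component names, the dependency map, and the rule that a component fails as soon as any one of its inputs fails are all hypothetical simplifications chosen only to show how a cascade can cross from a power network into a water network and back again.

```python
from collections import deque


def propagate_failures(depends_on, initially_failed):
    """Return every component that has failed once the cascade stops.

    depends_on maps each component to the set of components it needs.
    Assumed (hypothetical) rule: a component fails as soon as any one
    of its inputs has failed.
    """
    # Invert the dependency map: which components are hit when one fails?
    dependents = {}
    for component, inputs in depends_on.items():
        for needed in inputs:
            dependents.setdefault(needed, set()).add(component)

    failed = set(initially_failed)
    queue = deque(initially_failed)
    while queue:
        current = queue.popleft()
        for affected in dependents.get(current, ()):
            if affected not in failed:
                failed.add(affected)
                queue.append(affected)
    return failed


# Hypothetical coupled power/water system (illustrative names only).
depends_on = {
    "gate_station": set(),
    "substation_A": {"gate_station"},
    "substation_B": {"gate_station", "cooling_water"},  # power needs water
    "substation_C": set(),
    "pump_1": {"substation_A"},                         # water needs power
    "pump_2": {"substation_C"},
    "cooling_water": {"pump_1"},
}

failed = propagate_failures(depends_on, ["substation_A"])
print(sorted(failed))
```

In this toy system, the loss of one substation stops a pump (forward effect), which cuts the cooling water that a second substation relies on (backward effect), while the components fed by the unaffected substation absorb nothing. Models in the literature replace the all-or-nothing rule with fragilities, flows, or probabilities.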
The final attribute covers interdependency modeling and simulation, where methods, tools, and techniques, such as the FAST and SMART models, the mixed holistic-reductionistic approach, federated Agent Based modeling, a multi-model approach, Bayesian networks, and the Critical Infrastructure Protection Decision Support System (CIPDSS) model, are presented and their prospective uses identified [11, 29, 31, 32, 51, 56, 72, 73]. In [72], the CIPDSS model developed through collaboration between Los Alamos, Sandia, and Argonne National Laboratories is presented. The capabilities of the model are showcased through simulated scenarios. The CIPDSS model was developed for government and industry leaders to determine the consequences to be expected from disruptions, understand their mechanisms, and analyze risk mitigation practices. A base case scenario for a large city of five million people operating under baseline conditions is first introduced. Then, two disruption event scenarios are applied to the base case. The first scenario involves the disruption of several roadways, and the results depict a 25 % loss in roadway capacity, resulting in increased travel times during normal rush hour periods. The second scenario extends the first to include the effects of a mass evacuation in which a third of the population is attempting to leave the city. With the addition of evacuees to normal road conditions, the gridlock experienced on the roadways persists for 14 hours. Through several other scenarios on other infrastructures, including telecommunications, the CIPDSS model has demonstrated usefulness in modeling various disruption scenarios that allow for easy comparison of impacts across infrastructure systems. For such large scale interdependency modeling efforts, Min et al.
[51] also presented a method based on combined elements of system dynamics for analyzing individual infrastructure components, integrated definition function modeling (IDEF0) [74] to identify data requirements and describe information exchange between the methods, and non-linear optimization algorithms for finding the values of control variables that optimize system performance in the system dynamics simulation. The combined framework can be used to study the entire system of physical infrastructures and economic activity.

2.2.3 Scale of Analysis

The scale of analysis describes the level of granularity at which the infrastructure interdependencies are analyzed. Scales were identified based on the various illustrative examples and case studies presented in the research body and could be grouped into three common levels: system of systems, network, and advanced network levels. The highest infrastructure abstraction level (or lowest level of granularity) is the system of systems level. A system of systems contains all the critical infrastructures, including electricity, telecommunications, etc. System of systems analysis is, therefore, the examination of interdependencies between two or more infrastructure systems, each treated as a single entity, usually without distinguishing its constitutive components [23-25, 28, 58, 61, 62, 75-77]. For example, an extreme risk analysis using the IIM and dynamic IIM approaches is conducted on a two-sector economy by Lian et al. [78]. An interdependent system of equations is used to solve for the inoperability and expected economic loss of the two industries. Given a reduction in demand of 9 % and 16 % for industries 1 and 2,
respectively, the expected inoperability and economic loss are 22 % and US$218 million for industry 1 and 19 % and US$383 million for industry 2. Similarly, Jiang and Haimes [66] considered a power generation plant and a coal plant in their sample problem of finding an optimal strategy for distributing limited electric power to minimize overall unsatisfied demand. Reducing the electric power supply to customers by $7.5 thousand increases the production of coal by $30 thousand, confirming that the managed distribution strategy is superior to the unmanaged strategy. The system of systems scale can be extended to include the social system. The term “infrastructure environment” was first introduced in [10] to describe various aspects outside of individual inputs and outputs that play a major role in shaping infrastructure operational characteristics and behavior. Such factors, including economic and business opportunities, public policy, government investment decisions, legal and regulatory concerns, public health and safety, technical and security issues, and social and political concerns, are the driving forces behind how infrastructures operate. The intermediate level (or medium level of granularity) is the network level. At this level, interactions and interdependencies between the most basic infrastructure units in a system are analyzed [23, 24, 79-81]. Lee et al. [47] present a study at the intermediate level. The deliberate attacks on the New York World Trade Center (WTC) resulted in the loss of a power substation and a resulting power outage on the nearby Canal Street. The loss of power and the extremity of the situation caused temporary disruptions to the subway stations, and resulted in the loss of 300,000 phone lines and 4.4 million data lines. Similarly, Chee-Wooi et al.
[48] present their method for evaluating the consequences of cyber attacks on SCADA systems through a case involving a communications link between a SCADA control center network and its connected substation network. A simulation is conducted where the attack is launched from outside of the substation level network and from within the network. The steady state probabilities indicative of the level of vulnerability are calculated for the computer systems under supervisory control at two substations: 1 and 22. Assuming that comparable computers are used at both locations, substation 22 was determined to be more vulnerable than substation 1 with probabilities of 0.2329 and 0.1513, respectively. The lowest abstraction (with the highest level of granularity) is the advanced network level which includes elements of the network level previously mentioned, except now the components have been refined to include supply and demand factors that evolve in time, as well as people and their decisions [12, 14, 15, 31, 32, 82-84]. For example, North [11] uses the Agent Based Spot Market Agent Research Tool (SMART) II to model the interconnections between the electric power generators, electric power consumers, and power transmission lines that coexist in the electric power infrastructure. Generator agents determine their level of production based on the potential profit to be made and have investment capital that fluctuates by profits and losses. Generators choose whether to sell power based on cost curves. Consumers also choose whether to sell power based on cost curves and buy power for their own use based on the potential to make profit using the power bought. When consumer agents reach a predetermined level of investment capital, they can acquire new customers, but can also go bankrupt when they run
out of investment capital. In this sense, bankruptcy or growth represents the loss or gain of power-based industries and people due to unfavorable or favorable economic conditions. Some initial insights drawn from preliminary simulations conducted at the time indicated that raising the natural gas-fired electric generator market share significantly increased market interdependence. In times of simultaneous failures, this increased interdependence would result in both the electric power and natural gas markets fighting for the same resource. It is important to distinguish the previous description of scale of analysis (infrastructure scale) from a more traditional use of “scale” presented by other authors. The methods produced by Peerenboom and Fisher [85] and Pederson et al. [9] both described infrastructures in terms of spatial resolution (resolution scale) where individual networks are treated as single entities or sectors that exist at different geographical scales. In this interpretation of scale, there are four resolution levels: local, regional, national, and international. For example, infrastructures at the local level include municipal water distribution utilities and emergency services. The electricity or power grid operates at the local, regional and national levels. Interstate transportation and transmission gas systems mainly operate at the national level, and the telecommunication and finance infrastructures exist at the international level. Although this definition of scale produces different subsets of collected works, the two scales (infrastructure and resolution) are usually combined. For example, infrastructure systems with a resolution at the regional level can be analyzed at the component and network level or at the system of systems level in the infrastructure scale. 
In [23, 24], the dependence of water distribution infrastructure elements on the electric power infrastructure is analyzed at the network and component level of the infrastructure scale. Local municipal water storage tanks in the resolution scale are linked at the component level of the infrastructure scale via water pipelines. Similarly, power network gate stations, which exist at the regional level in the resolution scale, are connected to other gate stations and substations via transmission lines at the component level of the infrastructure scale. For the purpose of the present study, research will be characterized by the infrastructure scale first and then by resolution scale, when pertinent.

2.2.4 Quantity and Quality of Input Data

The quantity and quality of available data present a major challenge to complex infrastructure modeling and simulation efforts. Modeling infrastructure interdependencies accurately requires a large volume of development and validation data, which are scarce and, when found, need homogenization across different formats, levels of detail, and completeness. When categorizing most of these and other studies on interdependent infrastructure analyses according to the data they manage, three types of data quantity and quality emerged: public/private sector published data, extreme event reported data, and directly elicited data. Each of the three types has varying quantities and quality of data available. Certain studies have been able to utilize publicly available data. The economic sector, for example, has been modeled using empirical matrices and algorithms based on “transactions among various industry sectors in the U.S. economy,” published by the Bureau of Economic Analysis under the U.S. Department of Commerce
[16-21, 36, 66, 75, 78, 86]. Data published by public/private sector entities is often released in large quantities at a specified interval, such as yearly, monthly, or daily. [17] and [66] both rely on national input-output accounts published by the U.S. Bureau of Economic Analysis. Such accounts are relatively high in quality and contain specific details about day-to-day operational conditions of infrastructures through trade indicators. Public/private sector data is also being used to model existing infrastructure and emerging electric power smart grids with self-healing properties. This new technology, presented by Amin [87], has the ability to minimize disruptive impacts on the system through adaptive protection and coordination methods between components of the smart grid. The smart self-healing grid is composed of substations containing breakers, transformers, etc. with built-in processors that relay information on their device parameters, status, and analog measurements to other devices. When a new device is added to the substation, the device automatically sends information on its parameters and interconnections to the substation central computer, which is updated automatically. With the added situational awareness and better communications and controls, the smart self-healing grid can operate close to or at the limit of stability while minimizing quality degradation of system performance and the impact on interdependent systems. Studies in this data type attribute also include those that construct network topologies and structures based on available configurations of system component connectivity and are presumably accurate to a certain degree of detail. Information reported by news agencies and emergency management services during and after an extreme event is less detailed and less readily available than data published by the public/private sector.
Post-disaster reports contain information on power or telecommunication outages and other infrastructure disruptions resulting from the extreme events [72, 75, 76, 88]. Nojima and Kameda [3] collected damage reports following the Loma Prieta earthquake, which contained limited, yet useful, data on the effects that each instance of damage, such as a downed power line, had on other infrastructures. Certain studies have attempted to understand the risk of disruption to interdependent infrastructure systems by analyzing past events in an empirical fashion. Coverage of terrorist attacks, natural disasters, and day-to-day utility disruptions by news networks, government agencies, and post-event expert reconnaissance missions is often used as a source of information for analyzing failure propagation and system recovery. Information on subsequent infrastructure component failures and recovery efforts was collected following the 1989 Loma Prieta earthquake [3], the 2001 World Trade Center attack [89], and power outages in the United States and Europe reported by the Federal Communications Commission between 1996 and 2003 [50]. This post-disaster information has proven to be useful for developing and integrating disaster preparedness in communities. For example, in the aftermath of the 1906 San Francisco earthquake and the resulting urban fire, the world’s largest seismically reliable water supply system was constructed and later upgraded in the 1980s. The initiative to build such an infrastructure paid off as emergency personnel, including San Francisco Fire Department units, were able to respond effectively to several fires following the 1989 Loma Prieta earthquake [90, 91]. Media reports were also used by McDaniels et al. to identify patterns in the infrastructure
failure interdependencies (IFIs) that occurred in the August 2003 northeastern North America blackout, the 1998 Quebec ice storm, and three 2004 Florida hurricanes in order to develop an analytical framework to characterize IFIs. The multiple IFIs that occurred during each event were separated into four quadrants based on their impact, in terms of duration and severity, and their extent, or the number of people affected. The largest group (320) of the 653 IFIs falls in Quadrant 4, which contains low-impact, low-extent events. These relatively minor failures generally do not require any mitigation action and usually last a short period of time. However, they could become a major disturbance over long periods of time. The final approach is to conduct interviews with involved agencies or distribute questionnaires in an effort to extract information [75, 84, 92]. Robert and Morabito [30] and Brown et al. [14] both elicit data from industry and private/public sector partners through interviews and surveys. However, Robert and Morabito [30] established a much more focused study area and were able to obtain more detailed descriptions of the critical infrastructures in the area. Their entire study area was divided into 1-square-kilometer sectors, while Brown et al. [14] analyzed the infrastructure systems of the entire state of California. In [30], information regarding the consequences of resource degradation was collected from partners who used their expertise and knowledge of their infrastructures to identify their level of functioning when faced with a degradation of resources in each sector. Brown et al. [14] used this technique to construct an assessment of infrastructure interdependence.
Questionnaires were developed based on past assessment to extract background information for help with understanding network system components, specifying the purpose of the system, identifying and ranking the potential consequences of their non-functionality, and identifying information used to make decisions. The data was analyzed to identify conditions that may lead to system failure, identify significant uncertainties of decisions or consequences and any potential sources of information to reduce these uncertainties. The results of the analysis are summarized as potential vulnerabilities, consequences, and the immediacy and likelihood the situation will result in failure or exploitation. Analysis of the data collection procedure resulted in The other approach to conduct interviews was used by Lambert and Patterson [93] who collected dependency scenarios through interviews with state and local agencies involved in disaster recovery efforts to identify dependencies and functional units of the transportation system. Approximately 500 agencies were interviewed and asked such queries as, “what is a case in which your agency has not been able to start a pre- or post-disaster activity for which you are responsible because you were waiting on transportation agency to complete a prerequisite task?” Over 200 dependency scenarios were collected and initially analyzed by their primary organizational units which are the source of a significant number of dependencies. The different organizational units are Administration, Environmental and Regulatory Affairs, Equipment, Finance, Information Management, Legal/Authorization, Materials, Operations, Personnel, and Structure. Results reveal that over 30% of dependencies collected are primarily involved with the Information Management unit, nearly 23% are involved with the Operations unit, and Environmental and Regulatory Affairs, Finance, Legal/Authorization , and Materials all account for only
4.2% of all scenarios collected. These results indicate that many of the surveyed agencies relied on the transportation agencies to deliver current and accurate road condition information, which alerts the transportation agency to review its current information management recovery plans. Other sources of data not available to the agencies that may have caused delays include the location of hazardous materials, the location of floodplains, environmental regulatory permit requirements, and actions resulting in environmental violations. Apostolakis and Lemon [94] used surveying and interviewing techniques to understand interdependencies faced by systems of systems through a small network of real systems. The authors examine the Massachusetts Institute of Technology (MIT) campus, which was viewed as a small city network of systems with 6,000 residents, 14,000 commuters, a utility plant, data network, cable television station, phone system, and police and medical personnel. Through the full cooperation of the MIT Department of Facilities, Apostolakis and Lemon [94] were able to obtain qualitative and quantitative systems operation details and accurately identify vulnerabilities faced by the MIT campus. A performance index (PI), which is a measure of what is important to the building, is determined for each building A - B, as well as for each minimal cut set (mcs), which is a set of events that guarantees interruption of services. The vulnerability and physical meaning of the mcs can then be determined using the PI data. Vulnerability is a measure of the severity of the mcs and its susceptibility to attack, ranging from red (most severe and susceptible) through orange, yellow, and blue to green (very low severity and susceptibility). In particular, mcs ev8 is an electrical service manhole that contains the main electrical service switch to Building B. This mcs is at a level red vulnerability, meaning it is severely vulnerable and at a critical location highly susceptible to attack.
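The notion of a minimal cut set can be made concrete with a small brute-force sketch. It is not the procedure used in [94]: the network, the component names, and the exhaustive enumeration are hypothetical and only practical for very small systems. A set of components is reported when removing all of them disconnects the building from its supply and no smaller reported set is contained in it.

```python
from itertools import combinations


def is_connected(edges, removed, source, target):
    """Depth-first search from source to target, skipping removed nodes."""
    if source in removed or target in removed:
        return False
    adjacency = {}
    for a, b in edges:
        if a in removed or b in removed:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return False


def minimal_cut_sets(edges, candidates, source, target):
    """Enumerate minimal sets of candidates whose loss interrupts service."""
    found = []
    for size in range(1, len(candidates) + 1):
        for combo in combinations(sorted(candidates), size):
            removed = set(combo)
            # A superset of a smaller cut set is a cut set but not minimal.
            if any(cut <= removed for cut in found):
                continue
            if not is_connected(edges, removed, source, target):
                found.append(frozenset(combo))
    return found


# Hypothetical electrical supply to one building (illustrative names only):
# two redundant feeders meet at a single service manhole.
edges = [
    ("utility_plant", "feeder_1"),
    ("utility_plant", "feeder_2"),
    ("feeder_1", "manhole"),
    ("feeder_2", "manhole"),
    ("manhole", "building_B"),
]
cuts = minimal_cut_sets(
    edges, {"feeder_1", "feeder_2", "manhole"}, "utility_plant", "building_B"
)
print([sorted(cut) for cut in cuts])
```

In this toy layout the manhole alone forms a cut set of size one, whereas both feeders must be lost together to interrupt service, which illustrates why a single-element cut set at an accessible location would tend to rank as highly vulnerable.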
In terms of risk management, the red category has the highest priority for countermeasure actions, such as welding the manhole cover to reduce susceptibility to a level yellow. Installing an additional electrical service would reduce the PI of the manhole from 0.07274 to 0.04234, as electrical services account for 63% of the affected building’s overall PI of 0.11508. Despite the many techniques potentially available, acquiring data in general remains a challenge, as surveying, reporting, and communicating impacts during a disruptive event is difficult. Moreover, infrastructure managers and service providers are often unwilling to reveal information about their vulnerabilities because of the threat to company security.

2.2.5 Targeted Discipline

Researchers of infrastructure interdependencies have recognized the importance of interdependency studies to the engineering, social-technical, and economic disciplines. For the engineering focus, many studies look to improve infrastructure network system design and reliability by analyzing the physical linkages and components. Such studies highlight operational changes or infrastructure investments that lead to increased reliability [23, 24, 26, 28, 29]. Ozog et al. [95] examine how infrastructure interdependence between the power infrastructure and other infrastructures may affect the power restoration process. A case study of the effect of lost communication infrastructure components on power restoration revealed a reasonable correlation between the number of communication points
removed and the restoration time. However, this correlation is not linear and no simple correlation exists between the number of bus connections and restoration time. Similarly, Ouyang et al. [28] modeled the power and gas infrastructures and analyzed their functional interdependencies. Results of the simulation may be used to suggest strategies for robust design and growth of infrastructures and provide managers with the information to prioritize scarce resources to mitigate adverse effects. Social-technical studies are targeted at institutions, such as government agencies or businesses which own or operate physical systems while attempting to address concerns posed by society [42, 64, 82, 96, 97]. Kroger [98] has identified several societal factors that shape the risks faced by critical infrastructures. Such factors include public risk acceptance and awareness, attractiveness of society to attacks, and urbanization. Kroger [98] concludes that risk governance strategies need to be developed to reduce vulnerability and better understand and balance society needs. Bekkers and Thaens [99] introduce four different risk-governance models. In the cybernetic governance model, government is placed at the center above the rest of society and allows the government to intervene, such as in the closing of a hazardous building. The market model puts government in a more restricted role and instead shifts the responsibility of risk awareness and liability to citizens and companies. In the civil society model, government, citizens, and private companies are jointly responsible for risk governance and management and emphasize the importance of involving citizens in risk management policy making. Lastly, the network model looks at the internet as a self-organizing tool capable of managing risks internally. The governance models presented in this conceptual paper can be used to effectively handle risks resulting from interdependencies between systems. 
The interaction between the different kinds of networks and infrastructures and also between the policy regimes and different governance models presents a significant policy challenge in the governance of risk. Many of the early studies conducted by Rose et al. [5] and Cronin et al. [4], as well as several current studies [17, 36, 83, 100], specifically analyze the impacts of interdependencies between the economy and other critical infrastructures. As research methods and analysis techniques evolved, certain mathematical models dominated the study of economic activity and critical infrastructures. Leontief’s input-output model, which was originally used to model economic behavior and later adapted to study infrastructure risk of inoperability and the movement of resources in all infrastructures, has retained trade metrics at its core [16, 18, 21, 66]. The results of such studies can be used to measure reductions in commodities produced and consumed and potential profit losses due to system inoperability. In the HEMP attack scenarios conducted by Haimes et al. [35], the IIM is used to analyze, identify, and highlight critical workforce segments that are essential for the continued operations of infrastructure systems in the aftermath of a HEMP event. Two metrics of impacts on the workforce are depicted: the workforce earnings losses and the number of affected workforce personnel. A workforce earnings loss can be seen as a reduction in income due to the inability of a workforce sector to perform its intended function because of sickness, injury, inaccessibility of
workplace, etc. The number of affected workforce personnel refers to the number of workers unable to perform their functions. The two metrics are evaluated over a 60-day power recovery scenario, a 60-day HEMP-vulnerable equipment inoperability scenario, and the baseline scenario, which combines the two 60-day scenarios. In the baseline scenario, the top three sectors with the highest workforce earnings losses were the electric, gas, and sanitary service sector with $709 million and 11,664 people lost, the industrial machinery and equipment sector with $457 million and 8,205 people lost, and the business services sector with $316 million and 8,117 people lost. With these results, the top economically critical sectors are easily identified and prioritized. Similarly, Macaulay [101] assessed the internal and external risks associated with critical infrastructures and their potential to threaten the US financial services sector. Several metrics and methods are presented, including operational risk, data dependency, and inbound and outbound metrics, as well as a tool for analyzing cascading effects. The dependence of infrastructures on the financial sector and of the financial sector on other infrastructures is quantified in an interdependency matrix. Results reveal that food, health, and energy goods and/or services are dependent on services provided by the financial sector. On a scale of 1 to 10, with 10 the highest level of dependence, dependence was 8.45 for food goods, 7.77 for energy services, and 7.79 for health services. Such services are highly vulnerable to disruptions stemming from the financial sector.

2.2.6 End User Type

In contrast to the different targeted disciplines, the end user type refers to the profession, field, or group of people who will benefit from the study. The multiple studies are aimed at industry professionals, government agencies, or scholarly researchers and scientists.
The financial planners and strategists mentioned in the previous section are classified as industry professionals. Others in this professional type include facility operators and managers who use the results and modeling tools of such studies to optimize performance in terms of reliability, profit and/or efficiency. For example, in a case study on the British Columbia Transmission Corporation bulk power network (BCTC) and its communication system conducted by Ozog et al. [95], an assumed risk matrix was constructed for outage probabilities, allowing calculation of lost mega-watt hours (MWh), and level of risk for each component in the BCTC system. This matrix is useful in assessing the amount of resources to disburse to different components in the system to optimize system restoration. Therefore, studies on investment strategies that focus on emergency management and mitigation are quite beneficial to industry professionals [30, 50, 54, 88, 95, 102]. Government agencies and personnel would also stand to benefit from such studies that provide details of interdependent infrastructures and quantified assessments that may aid in the development of critical infrastructure protection strategies [46, 48, 54, 76, 103]. Infrastructures, such as telecommunications, power, transportation, and water, are under constant threat from malicious attacks and natural disasters. The potential for extensive disruptions across the nation from such events has made emergency management and risk mitigation a top priority for government agencies. However, unlike the power and telecommunications industries, which are for the
Synthesis of Modeling and Simulation Methods
most part operated by private companies, systems such as transportation or water delivery are operated by government agencies. These bodies rely on emergency management strategies to respond to natural and man-made disasters effectively. Take for example the case study presented in [93], in which various system restoration schedules in a post-hurricane scenario are highlighted. Schedule dependencies are affected by eight indices, including the resources involved, cascading effects, and severity of the situation in terms of time duration. These data could be used by transportation agencies to select, prioritize, and manage the effects of schedule dependencies in the post-hurricane recovery process. Scholarly researchers are a source of many of the studies used by industry professionals and government agencies. Likewise, scholarly researchers can draw their future studies from applications and improvements to existing methods, as well as from the introduction of new modeling methods and sophisticated theories [31, 32, 37, 44, 56, 70]. These aspects are often seen in studies classified in the modeling objective category under the Interdependency Modeling and Simulation attribute, where new and updated models are presented. However, scholarly researchers are not restricted to these studies, and can in fact make use of almost any study, as most of the ideas being presented require additional testing and refinement before they can be used. Scholarly researchers carry out this work and continuously produce new and updated methods. Given the multiple years of published literature, it is possible to see how the different modeling methods and approaches have evolved. For example, early economic studies by Rose et al. [5] and Cronin et al.
[4], which investigated the interdependencies between infrastructure systems and economic activity from a holistic view of the economy, have evolved to incorporate elements of supply, demand, and the transfer of input and output resources [16-21, 34-36, 66, 75, 78, 86, 104-106]. Similarly, Agent Based methods have refined their agent behavior definitions, which were once very crude, into sophisticated models that more accurately mimic aspects of society, technical systems, hazards, and imperfect human behaviors in infrastructures [32, 63, 98].
3 Appraisal of Research Literature

Conceptual and strategic papers, such as those by Rinaldi et al. [10], Pederson et al. [9], and others, are valuable resources that provide overviews of existing research and insights into the future direction of the infrastructure interdependency field [85, 107-112]. These conceptual studies examine challenges in the field and explicitly call for action and investment in research. They identify existing modeling tools, objectives, and challenges, and present ideas and areas that need concerted attention. Also included are reports by government bodies, such as the President’s Commission on Critical Infrastructure Protection in the United States, which have pinpointed specific areas where research, development, and university education are lacking. The European Union Commission developed the European Program on Critical Infrastructure Protection (EPCIP) [113], and the Australian parliament released a report entitled “Thinking about the Unthinkable: Australian Vulnerabilities to High-Tech Risks” [114]. Reports on terrorist attacks [89, 106], seismic events [91, 115, 116], and hurricanes [117, 118] provide a qualitative view
G. Satumtira and L. Dueñas-Osorio
of the impacts these disasters have on critical infrastructure systems. An important element that differentiates these disaster reports from those in the remaining body of literature is that they do not use a mathematical method to analyze interdependencies or attempt to systematically quantify the impacts. Instead, they provide a description of events, identify observed interdependencies, and pose opportunities for research endeavors. Together, these types of high-level documents are paving the way to the understanding of infrastructure interdependencies. The following critical analysis encompasses the remaining literature material and characterizes the whole collection of published papers in terms of observed frequencies, relationships across studies, and opportunities for future research. Approximately 200 published studies were analyzed, 162 of which were included in the assessed archival literature collection of the present review. It is clear that scientific and application interest in interdependencies research is growing: according to Figure 2a, the percentage of papers published in the field has increased significantly since 2001 and has so far reached peak values in 2007, 2008, and 2009. In fact, 73 % of the entire assessed literature collection was published in 2005 or later, as seen in Figure 2b, and an astonishing 95 % of all papers in infrastructure interdependencies have been published within the last 10 years alone. These publishing dynamics confirm the nascent nature of the interdependencies field, the growing interest, and the upcoming maturing process, as the already published ideas, models, and tools start percolating from academia to industry and governments and feed back to refine theories and consolidate the foundations of this important area of knowledge.
The number of studies to be published in the coming years is expected to surpass current trends as interest in the field expands to encompass a broader range of disciplines, users, and geographies.
Fig. 2a Frequency of Literature on Interdependent Infrastructures Published Every Year
Fig. 2b Publishing Trends in a Five Year Basis
To assess the published literature, mathematical method was chosen as the primary category, because each study in the collection has a clearly defined, distinct underlying method to abstract the problem. This initial identification of the different papers is the most helpful entry point for researchers and practitioners attempting to align their research methods and ideas, since it is consistent with the current interest in the field as shown by the frequency of published papers per category. Four attribute classifications emerged from the mathematical method viewpoint: Input-Output, Agent Based, network and graph theory, and all other emerging but unclassified methods and tools. The single methodology with the greatest percentage of papers was network and graph theory, used in 22 % of the studies [23, 24, 28, 44, 70, 119-126]. Input-Output theory was the second most encountered methodology in the analysis. This attribute included studies that were based on either the study of system inputs and outputs [2, 4, 5, 101, 106] or the Inoperability Input-Output method (IIM) [16, 20, 21, 34, 35, 78, 86, 106]. The study of inputs and outputs is used to identify interdependencies between infrastructure systems by investigating the transfer of resources between systems and observing their reactions to disturbances in the flow of commodities. The IIM, on the other hand, is a specialized method based on the study of inputs and outputs that also considers supply and demand. The two methods are grouped because they rest on the same underlying principles of input and output analysis. Agent Based methods were used in 10 % of the studies [31, 32, 61, 73, 98, 127-129]. Unclassified methodologies and tools were used in the remainder of the studies, representing 46 %.
Such methodologies and tools included the use of Petri nets [48, 49, 54, 55, 119, 130, 131], Monte Carlo simulations [24, 76], Geographic Information Systems tools (GIS) [102, 132-134], genetic algorithms [135], Markov or Semi-Markov processes [13, 22, 26, 136], and multi-model simulators [105, 137-139]. Many of the studies
classified under network or graph theory, Input-Output and IIM methods, and agent-based methods rely on several of the unclassified methods and tools previously mentioned. The network or graph theory attribute had the highest percentage of studies, 34.5 %, pairing with other methods and tools. For example, power law theory [23, 44], genetic algorithms [140], Fuzzy Numbers [38], Markov and Semi-Markov processes [22, 26], the 0-1 knapsack problem [52], Monte Carlo simulations [23, 24], cross impact matrices and fault tree analysis [3], Petri nets [119], and non-linear programming [120] were the preferred techniques used with network or graph theory methods. Once the different mathematical methodologies were identified, the studies were analyzed by modeling objective, the second most frequently published category. The following modeling objectives were identified in Section 2.2.2: Risk and Vulnerability Analysis, Interdependencies Modeling and Simulation, Risk Mitigation Measures and Infrastructure Protection, and Failure Propagation Analysis. To tie in with the modeling objective category and impart structure to the analysis of the literature, each modeling objective attribute was viewed in terms of the primary category, mathematical method. The percentage of each modeling objective attribute for the top three mathematical models and the unclassified methods is presented in Table 1. The percentages are relative to the total number of papers excluding the conceptual papers, government documents [6-8, 74], and other descriptive works that did not rely on any formal methods. For instance, there are 82 papers classified under Risk and Vulnerability Analysis. Of those, 21 papers utilized the Input-Output method, 9 utilized agent based methods, 26 utilized network or graph theory, and 41 utilized the unclassified methods. Based on these numbers, it would seem that the total number of papers classified under Risk and Vulnerability Analysis is actually 97, not 82.
However, this is not the case. The total percentages in each of the modeling objectives exceed 100 %, because several studies, such as [38, 39], utilize more than one mathematical method. This analysis can be repeated using the data contained in Appendix A.1 to identify correlations (positive and negative) across other categories and attributes.

Table 1 Proportion of Modeling Objectives by Mathematical Method

| Mathematical Method | Risk and Vulnerability Analysis, % | Interdependencies Modeling and Simulation, % | Risk Mitigation Measures and Infrastructure Protection, % | Failure Propagation Analysis, % |
|---|---|---|---|---|
| Network or Graph Theory | 31.71 | 26.92 | 36.59 | 7.69 |
| Agent Based Method | 10.98 | 19.23 | 4.88 | 3.85 |
| Input-Output Simulation | 25.61 | 3.85 | 19.51 | 38.46 |
| Unclassified Methods | 50.00 | 73.08 | 63.41 | 80.77 |
| Total | 118.29 | 123.08 | 124.39 | 130.77 |
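The Risk and Vulnerability Analysis column of Table 1 can be reproduced from the paper counts quoted in the text (82 papers; 21, 9, 26, and 41 method assignments). The short Python sketch below is only an illustration of that arithmetic; the variable and function names are hypothetical, and counts for the other three columns are not given in the text, so they are not reproduced here.

```python
# Paper counts for the Risk and Vulnerability Analysis objective,
# as quoted in the text. Names below are illustrative only.
TOTAL_PAPERS = 82
counts = {
    "Network or Graph Theory": 26,
    "Agent Based Method": 9,
    "Input-Output Simulation": 21,
    "Unclassified Methods": 41,
}


def shares(method_counts, total):
    """Return each method's share of `total`, in percent."""
    return {m: 100.0 * n / total for m, n in method_counts.items()}


pct = shares(counts, TOTAL_PAPERS)
for method, value in pct.items():
    print(f"{method:26s} {value:6.2f}")

# A paper that uses k methods is counted k times, so the number of
# method assignments can exceed the number of papers.
assignments = sum(counts.values())
extra_assignments = assignments - TOTAL_PAPERS
column_total = 100.0 * assignments / TOTAL_PAPERS
print(f"assignments: {assignments}, extra: {extra_assignments}")
print(f"column total: {column_total:.2f} %")
```

The difference between the number of assignments and the number of papers counts extra method assignments, not papers: a single study that combines three methods contributes two of them.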
Identifying the different modeling objective attributes in this manner puts into perspective which mathematical methods are most often used to address each modeling objective. Although the unclassified methods clearly exceeded the other mathematical methods in their use, it is also interesting to look at the top single mathematical methods. Network or graph theory was the second most used mathematical method across all modeling objectives. For the Risk and Vulnerability Analysis attribute, Input-Output Simulation methods [17, 58, 86, 105, 141-143] were the third most used. Likewise, Agent Based methods were the next most often used to address the Interdependencies Modeling and Simulation objective [11, 31, 32, 98]. However, there was not a close third mathematical method used after network or graph theory for the Risk Mitigation Measures and Infrastructure Protection and Failure Propagation Analysis objectives [6, 38, 46, 47, 52, 59]. Following the published frequency order of categories, the literature body is then assessed by the scale of analysis. This category, if not explicitly stated, was identified based on the scale used in the case studies and sample simulations that are often included to illustrate models. For instance, Delamare et al. [70] present an example of the interdependencies between the telecommunication system and the power and electrical systems. In modeling these systems, every router and electrical node needed to support telecommunications is included, which by itself places the system at the network scale. However, the model also assigns the telecommunication network multiple detection scenarios that govern how it reacts to a failure in the electrical system. This pushes the scale of analysis to the advanced network level, because of the added element of multiple paths and timing by which the systems may respond.
Looking at the literature body as a whole, three scales were identified in Section 2.2.3: system of systems, network, and advanced network levels. The system of systems level comprises 29 % of the literature [16, 30, 43, 75, 92, 94, 132, 144-146], the network level 27 % [22, 29, 56, 134, 147-152], and the advanced network level 18 % [32, 57, 93, 100, 137, 153-155]. However, isolating the scale of analysis by mathematical method and modeling objective reveals more useful data. As mentioned before, Risk and Vulnerability Analysis studies saw the highest percentages under network or graph theory and the Input-Output method. Of these studies, roughly 32 % of the network or graph theory studies [22, 47, 94] and 79 % of the input-output studies [17, 34, 35, 58] are analyzed at the system of systems level. Included is an illustrative example by Barker and Haimes [21] of a 15-industry system. The effect of a disruptive event in the mining sector on oil and gas operations is studied, and five risk management strategies are developed to reduce these effects. The 15-industry system is analyzed at the system of systems level, considering the effects of demand perturbation and strategy implementation costs, and does not consider detailed interactions of system components directly. Analysis of each of the strategies based on tradeoffs among multiple economic loss and sensitivity metrics reveals that different strategies are appropriate for different decision makers, depending on their personal interest in potential economic losses and model sensitivity to uncertainty.
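To make the sector-level reasoning behind such input-output studies concrete, the sketch below solves the commonly stated demand-driven form of the Inoperability Input-Output model, q = A* q + c*, where q is the vector of sector inoperabilities, A* is the interdependency matrix, and c* is the demand-side perturbation. It is a minimal illustration, not the model of [21]: the three sectors, the matrix entries, and the 10 % perturbation are invented for the example, and the fixed-point solver assumes the row sums of A* are below one so that the iteration converges.

```python
def solve_iim(a_star, c_star, tol=1e-12, max_iter=10_000):
    """Solve q = A* q + c* by fixed-point iteration.

    Assumes the spectral radius of A* is below 1 (guaranteed here
    because every row sum of the example matrix is below 1).
    """
    n = len(c_star)
    q = list(c_star)
    for _ in range(max_iter):
        q_next = [
            c_star[i] + sum(a_star[i][j] * q[j] for j in range(n))
            for i in range(n)
        ]
        if max(abs(x - y) for x, y in zip(q_next, q)) < tol:
            return q_next
        q = q_next
    raise RuntimeError("iteration did not converge")


# Hypothetical three-sector system (values are illustrative only).
sectors = ["mining", "oil and gas", "electric power"]
a_star = [
    [0.0, 0.1, 0.2],  # mining depends on oil and gas, electric power
    [0.3, 0.0, 0.1],  # oil and gas depends on mining, electric power
    [0.1, 0.2, 0.0],  # electric power depends on mining, oil and gas
]
c_star = [0.10, 0.0, 0.0]  # assumed 10 % demand perturbation in mining

q = solve_iim(a_star, c_star)
for name, value in zip(sectors, q):
    print(f"{name:15s} inoperability = {value:.4f}")
```

Even though only mining is perturbed directly, the other two sectors end up with non-zero inoperability, and mining itself exceeds the initial 10 % because of feedback through its suppliers. This indirect effect is what a system of systems analysis captures without modeling individual components.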
Like the scale of analysis category, attributes in the quantity and quality category had to be identified based on the data used in case study examples. The different sources of data include Public and Private Sector Published Data, Disaster Event Reports, and Solicited Data. More than 50 % of the assessed literature utilized Public and Private Sector Published Data. As described in Section 2, these data sources include data published by the government, such as the Bureau of Economic Analysis's input-output data reports, and also information that is publicly available or common knowledge. The heavy use of public and private published data is evident in every modeling objective attribute and most mathematical methods utilized. This fact also illustrates the potential of extracting untapped data using alternative techniques, such as post-event reports and large-scale surveys. The targeted discipline and end user type categories have similar considerations to account for in the characterization of the existing literature. For targeted discipline, engineering was the most common discipline across all studies. Early conceptual papers called for quantitative analysis of interdependencies, which became an opportunity for engineers to use existing modeling methods and tools or develop new ones, such as Bayesian networks [56], which are still in development. However, as the field progresses and attention grows, input from other disciplines, and therefore the number of studies conducted in collaboration with the economic and social-technical disciplines, is expected to increase. The end user type distribution is somewhat more evenly spread across the attributes. Researchers and scientists, government and policy decision makers, and industry professionals each saw a fair share of studies. Of the studies, 44 % were aimed at scholarly researchers, while 41 % were aimed at industry professionals.
At 65 %, industry professionals are the main end user type of the input-output methodology studies. Agent Based modeling studies are split almost evenly across the three end user types. Lastly, network or graph theory studies are used most often by scholarly researchers and industry professionals and less by policy and government decision makers. The percentage of studies for policy and government decision makers is expected to be slightly less than the percentage for researchers and scientists or industry professionals, because many newer models still require testing and validation before government and policy decision makers can rely on them for critical, high-impact decisions.
3.1 Opportunities and Future Work

Results of the analysis revealed many areas of interest that can build on, merge, or improve existing tools and techniques. One such area involves the use of multiple scales in modeling. As mentioned earlier, there are different scales by which an infrastructure may be analyzed. In the present analysis, the infrastructure scale describes the hierarchy from the lowest level of individual components to the highest level of interacting infrastructure network systems. The other scale, introduced in [9], describes the resolution at which different infrastructures exist (i.e., international, national, and regional infrastructures). While the individual scales are useful independently, joining the two scales (within and across) presents a novel approach for analyzing and synthesizing infrastructure
interdependencies. One benefit is the ability to study a network at multiple infrastructure scales and then apply this analysis to examine interdependencies in one or more resolution scales. De Porcellinis et al. [27] conceptually suggest a similar technique in their mixed holistic reductionistic approach, which allows modeling of systems at different infrastructure scales. As more data become available from monitoring and sensing of infrastructure systems, smart devices at the consumer end, and distributed control at the infrastructure operation level, studies from the advanced network scale to the systems of systems scale at different geographical resolutions will be more feasible. This will be supported by the potential availability of detailed system topologies and reconfigurations, along with time-dependent supply and demand patterns, and social trends of service use within particular cultural contexts. Since the success of multi-scale studies relies on other areas of research, it is pertinent to expand data acquisition and analysis/synthesis techniques and increase the incorporation of societal factors into modeling. Most of the data modeled in the available studies were collected from economic figures, extreme event reports, industry questionnaires, or through assumptions of infrastructure system interactions. However, more detailed data on specific infrastructure component interactions and vulnerabilities are needed in order to model infrastructures more accurately before smart systems are commonplace. This is a major challenge in modeling, as most industries are unwilling to release such detailed information. Future research efforts are faced with the challenge of gathering such information through increased data sharing agreements, collecting more social and behavioral information about infrastructure use and expectations, or developing techniques that exploit the available and future data to accurately depict infrastructure interdependencies.
One such technique currently being developed by Shih et al. [146] involves the concept of data warehouses, which combine data from multiple sources, including the Energy Information Administration, Department of Energy, etc., and manage historical data sets. The use of data warehouses addresses the difficulty of conducting a vulnerability analysis of the impacts across interdependent systems in the event of a large-scale attack where the data required are highly dispersed. Once compiled, the data warehouse can then be used in a geographic information system (GIS) to visually represent the impacts of disruptions across systems. Currently, the method has only been applied to the U.S. coal distribution system. Future expansion to other critical infrastructures would be most beneficial. Another similar method by Chang et al. [76] constructs a database of infrastructure failure interdependencies (IFIs) and their impacts on society, the economy, health, safety, and the environment. The database is primarily used to identify patterns in the severity of impacts from IFIs. The results can then be used to prioritize areas for risk mitigation. Future research using the IFI database should consider failures originating from infrastructures other than electric power and include additional types of hazards. Databases are a useful tool for synthesizing data that are already available, but a recent technique utilizing Bayesian networks can be used in actual physical systems with incomplete information, which can be probabilistically updated as new information becomes available. Bensi et al. [56] present a method for modeling system connectivity using Bayesian networks to
develop a probabilistic decision support system (DSS) for infrastructure operation. System data are integrated in near real time in the Bayesian network to provide a current description of the state of infrastructure systems. This DSS can be used to enhance emergency management services immediately after a major disaster. This technique is similar to the smart self-healing electric power grid by Amin [87, 156, 157], and highlights the need to embrace probabilistic techniques more fully to quantify and manage uncertainty in model predictions. Beyond these potential areas for work, gaps identified in the literature analysis revealed additional areas of research interest. The input-output and network or graph theory approaches are very suitable platforms for stochastic, probability-based analysis of demand and capacity relationships, not only in time but also in space, enhanced by the potential use of random fields and spatial correlation structures [158]. Also, many of the IIM models treat the interaction between inputs and outputs in a linear fashion when in fact these systems react to one another in a highly non-linear, dynamic fashion [159]; hence, game theory and other non-linear science approaches can readily find more applications in complex interdependent infrastructure systems. Likewise, advanced graph theory approaches can take advantage of the emerging concepts now used in bio-engineering and statistical physics [160], such as multipartite networks, community structures, spectral network analyses, and modularity of systems, to reduce the dimension and gain insights on network structure and mechanics at the nexus of system topology and function [161].
Modeling cascading failures across systems is also made easier with interdependency modeling becoming more spatially and temporally aware to the point where modeling tools can be developed to capture these failures in multiple time scales, from millisecond failures to days for restoration of equipment and service. Benchmark systems with standard input and output data sets can also enhance comparability and spark competition among proposed analysis and prediction methods as more researchers join the field of infrastructure interdependencies. Also, comparable systems can take the development of risk reduction principles to a new level, since demonstrable benefits from standardized design recommendations and mitigation strategies can gain the attention of decision makers and infrastructure stakeholders. Finally, future design of infrastructure systems should also include awareness of sustainability for adequate life-cycle operation, maintenance, and decommissioning [162]. Sustainability of infrastructure systems is at the intersection of engineering, economics, and society, and along with the other future areas of work, has the potential to benefit a broader set of disciplines than previously achieved.
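As a small illustration of the kind of cross-system cascading failure analysis discussed in this section, the sketch below propagates failures over a dependency graph that spans several infrastructures. It is a deliberately simplified assumption, not a method taken from any cited study: every node name and dependency is invented, a node is assumed to fail as soon as any one of its suppliers fails, and timing (from milliseconds to days) is ignored.

```python
from collections import deque


def cascade(depends_on, initial_failures):
    """Return the set of failed nodes once the cascade stops.

    `depends_on[n]` lists the nodes that n needs in order to operate.
    Simplifying assumption: n fails as soon as any one of them fails.
    """
    # Invert the map so a failure can be pushed to the nodes relying on it.
    dependents = {}
    for node, suppliers in depends_on.items():
        for supplier in suppliers:
            dependents.setdefault(supplier, []).append(node)

    failed = set(initial_failures)
    queue = deque(initial_failures)
    while queue:
        current = queue.popleft()
        for node in dependents.get(current, []):
            if node not in failed:
                failed.add(node)
                queue.append(node)
    return failed


# Hypothetical coupled system: P = power substation, T = telecom router,
# W = water pump. P2 is assumed to be remotely controlled through T1.
depends_on = {
    "P1": [],
    "T1": ["P1"],
    "P2": ["T1"],
    "T2": ["P2"],
    "W1": ["P2"],
}

print(sorted(cascade(depends_on, ["P1"])))
print(sorted(cascade(depends_on, ["P2"])))
```

In this toy system, losing P1 eventually takes down every node because the power and telecommunication layers depend on each other, whereas losing P2 affects only the nodes downstream of it. More realistic models would replace the any-supplier rule with capacity, redundancy, or probabilistic failure criteria.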
4 Summary and Conclusions

The study of infrastructure interdependencies is a new, but increasingly important and dynamic field for research and high-impact implementations. The percentage of studies published per year has increased rapidly since 2001, with around 97 % of the material published in this past decade alone. Conducting an assessment of the available literature revealed a vast breadth and depth of topics on infrastructure interdependencies with successful
applications in practice for natural disaster, manmade disruption, and operational aging scenarios, where rigorous theory and models are typically developed with intuitive concepts to easily relate to practical systems and enhance their adoption by stakeholders. The top categories are the Mathematical Method, Modeling Objective, and Scale of Analysis. Within these, the most significant attributes, namely network or graph theory, input-output, and Agent Based modeling methods, risk and vulnerability analysis, and the system of systems level, are widely used in the field. Analysis of the literature body revealed that network or graph theory was the method used most often in the studies. Further assessment of the literature body by modeling objective also revealed that input-output methods were often used in the risk and vulnerability analysis attribute. A complete breakdown of each study into its categories and attributes significantly aids in the identification of the strengths and weaknesses of the various models presented. This breakdown and the hierarchical structure presented in this chapter, with support material in the appendices, are introduced with the intention of providing a means for researchers to easily identify studies with competing or similar interests to their own. Several areas of infrastructure interdependency analysis were also identified based on their potential for future research and their unexplored synergistic relations.
These include the development of multi-scale modeling systems, integration of multiple static and dynamic modeling methods, both deterministic and stochastic, adoption of bio-inspired approaches to tame complexity among interacting systems, inclusion of sustainable long-term quality of service and performance objectives for infrastructure operation, exploration of non-linear dynamic models and their coupling with economic, reliability, and computational methods, and the creation of tools to study interdependent reconfigurable and smart urban grids. These themes are clearly interdisciplinary, and successful results to date have only stemmed from deliberate and effortful collaborations among researchers with distinct backgrounds but committed to learning each other's disciplines. Also, the explicit search for compromises between analytical approaches and large-scale computational experiments is bringing the best of each method to improve modeling and prediction of coupled infrastructure performance. This has been the path of rigorous and solid contributions published in the recent literature, which are paving the way for rich research and implementation goals in the infrastructure engineering field.
Appendix A – Literature Matrix and Column Codes

In the attributes matrix provided below and shown partitioned in Figures A.1a–A.1d, the numbers listed in the ID column are consistent with the references of the chapter. The headings of the matrix are explained in Table A.1 after the matrix and correspond to the typical attributes identified in the archival literature on infrastructure interdependencies. The electronic version of the attributes matrix is available for download from the website of Prof. Dueñas-Osorio at http://www.ceve.rice.edu/duenas-osorio.
Fig. A.1a Matrix of attributes for literature review from references 1-40
Fig. A.1b Matrix of attributes for literature review from references 41-80
Fig. A.1c Matrix of attributes for literature review from references 81-120
Fig. A.1d Matrix of attributes for literature review from references 121-162
Table A.1 Literature Matrix Column Codes

| Matrix Code | Attribute Equivalent |
|---|---|
| ID | In-Text Reference |
| 1a | Input-Output Method |
| 1b | Agent Based Method |
| 1c | Network or Graph Theory |
| 1d | Unclassified Methods |
| 2a | Risk and Vulnerability Analysis |
| 2b | Interdependencies Modeling and Simulation |
| 2c | Risk Management Measures and Infrastructure Protection |
| 2d | Failure Propagation Analysis |
| 3a | Systems of Systems |
| 3b | Network |
| 3c | Advanced Network |
| 4a | Public/Private Sector Published Data |
| 4b | Disaster Event Reports |
| 4c | Solicited Data |
| 5a | Engineering |
| 5b | Social-Technical |
| 5c | Economic |
| 6a | Scholarly Researchers |
| 6b | Government Agencies |
| 6c | Industry Professionals |
| 7 | Conceptual and Strategic Papers |
References

1. DeLaurentis, D., Crossley, W.: A taxonomy-based perspective for systems of systems design methods. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 1, pp. 86–91 (2005)
2. Gee, C., Treuner, P.: Methodological aspects of interdependency in quantitative models for regional development planning. North-Holland, Amsterdam (1981)
3. Nojima, N., Kameda, H.: Cross-impact analysis for lifeline interactions. ASCE, Los Angeles (1991)
4. Cronin, F.J., McGovern, P.M., Miller, M.R., Parker, E.B.: Rural economic development implications of telecommunications: evidence from Pennsylvania. Telecommunications Policy 19(7), 545–559 (1995)
5. Rose, A., Benavides, J., Chang, S.E., Szczesniak, P., Lim, D.: The Regional Economic Impact of an Earthquake: Direct and Indirect Effects of Electricity Lifeline Disruptions. Journal of Regional Science 37(3), 437–458 (1997)
6. The Report of the President’s Commission on Critical Infrastructure Protection (PCCIP) Critical Foundations: Protecting America’s Infrastructures (1997)
7. White Paper on The Clinton Administration’s Policy on Critical Infrastructure Protection: Presidential Decision Directive 63 (1998)
8. Homeland Security Act of 2002. Pub. L. No. 107-296, 116 Stat. 2135 (November 25, 2002)
9. Pederson, P., Dudenhoeffer, D., Hartley, S., Permann, M.: Critical Infrastructure Interdependency Modeling: A Survey of U.S. and International Research. Idaho National Laboratory (2006)
10. Rinaldi, S.M., Peerenboom, J.P., Kelly, T.K.: Identifying, understanding, and analyzing critical infrastructure interdependencies. IEEE Control Systems Magazine 21(6), 11–25 (2001)
11. North, M.J.: Toward strength and stability: agent-based modeling of infrastructure markets. Social Science Computer Review 19(3), 307–323 (2001)
12. Schoenwald, D.A., Barton, D.C., Ehlen, M.A.: An agent-based simulation laboratory for economics and infrastructure interdependency. IEEE, Piscataway (2004)
13. Ehlen, M.A., Scholand, A.J.: Modeling interdependencies between power and economic sectors using the N-ABLE agent-based model. IEEE, Piscataway (2005)
14. Brown, T., Beyeler, W., Barton, D.: Assessing infrastructure interdependencies: The challenge of risk analysis for complex adaptive systems. International Journal of Critical Infrastructures 1(1), 108–117 (2004)
15. Balducelli, C., Bologna, S., Di Pietro, A., Vicolo, G.: Analysing interdependencies of critical infrastructures using agent discrete event simulation. International Journal of Emergency Management 2(4), 306–318 (2005)
16. Haimes, Y.Y., Jiang, P.: Leontief-Based Model of Risk in Complex Interconnected Infrastructures. Journal of Infrastructure Systems 7(1), 1–12 (2001)
17. Santos, J.R., Haimes, Y.Y.: Demand-reduction input-output (I-O) analysis for modeling interconnectedness. American Society of Civil Engineers, Santa Barbara (2002)
18. Santos, J.R., Haimes, Y.Y.: Modeling the demand reduction input-output (I-O) inoperability due to terrorism of interconnected infrastructures. Risk Analysis 24(6), 1437–1451 (2004)
19. Haimes, Y.Y.: Models for risk management of systems of systems. International Journal of Systems of Systems Engineering 1(1-2), 222–236 (2008)
20. Haimes, Y.Y., Santos, J.R., Williams, G.M.: Assessing and managing the inoperability of transportation systems and interdependent sectors. International Journal of Risk Assessment & Management 7(6-7), 968–992 (2007)
21. Barker, K., Haimes, Y.Y.: Assessing uncertainty in extreme events: Applications to risk-based decision making in interdependent infrastructure sectors. Reliability Engineering and System Safety 94(4), 819–829 (2009)
22. Nozick, L.K., Turnquist, M.A., Jones, D.A., Davis, J.R., Lawton, C.R.: Assessing the performance of interdependent infrastructures and optimising investments. International Journal of Critical Infrastructures 1(2-3), 144–154 (2005)
23. Duenas-Osorio, L., Craig, J.I., Goodno, B.J., Bostrom, A.: Interdependent response of networked systems. Journal of Infrastructure Systems 13(3), 185–194 (2007)
24. Duenas-Osorio, L., Craig, J.I., Goodno, B.J.: Seismic response of critical interdependent networks. Earthquake Engineering and Structural Dynamics 36(2), 285–306 (2007)
25. Svendsen, N.K., Wolthusen, S.D.: Connectivity models of interdependency in mixed-type critical infrastructure networks. Information Security Technical Report 12(1), 44–55 (2007)
26. Xu, N., Nozick, L.K., Turnquist, M.A., Jones, D.A.: Optimizing investment for recovery in interdependent infrastructures. Inst. of Elec. and Elec. Eng. Computer Society, Big Island (2007)
27. De Porcellinis, S., Panzieri, S., Setola, R.: Modelling critical infrastructure via a mixed holistic reductionistic approach. International Journal of Critical Infrastructures 5(1-2), 86–99 (2009)
28. Ouyang, M., Hong, L., Mao, Z.-J., Yu, M.-H., Qi, F.: A methodological approach to analyze vulnerability of interdependent infrastructures. Simulation Modelling Practice and Theory 17(5), 817–828 (2009)
29. De Porcellinis, S., Setola, R., Panzieri, S., Ulivi, G.: Simulation of heterogeneous and interdependent critical infrastructures. International Journal of Critical Infrastructures 4(1-2), 110–128 (2008)
30. Robert, B., Morabito, L.: The operational tools for managing physical interdependencies among critical infrastructures. International Journal of Critical Infrastructures 4(4), 353–367 (2008)
31. Ensel, C.: A scalable approach to automated service dependency modeling in heterogeneous environments. IEEE Comput. Soc., Los Alamitos (2001)
32. Casalicchio, E., Galli, E., Tucci, S.: Federated agent-based modeling and simulation approach to study interdependencies in IT critical infrastructures. IEEE, Piscataway (2007)
33. Rigole, T., Vanthournout, K., De Brabandere, K., Deconinck, G.: Agents controlling the electric power infrastructure. International Journal of Critical Infrastructures 4(12), 96–109 (2008)
34. Haimes, Y.Y., Horowitz, B.M., Lambert, J.H., Santos, J.R., Lian, C., Crowther, K.G.: Inoperability input-output model for interdependent infrastructure sectors. I: Theory and methodology. Journal of Infrastructure Systems 11(2), 67–79 (2005)
35. Haimes, Y.Y., Horowitz, B.M., Lambert, J.H., Santos, J., Crowther, K., Lian, C.: Inoperability input-output model for interdependent infrastructure sectors. II: Case studies. Journal of Infrastructure Systems 11(2), 80–92 (2005)
36. Leung, M., Haimes, Y.Y., Santos, J.R.: Supply- and output-side extensions to the inoperability input-output model for interdependent infrastructures. Journal of Infrastructure Systems 13(4), 299–310 (2007)
44
G. Satumtira and L. Dueñas-Osorio
37. Chakrabarty, M., Mendonca, D.: Integrating visual and mathematical models for the management of interdependent critical infrastructures. Institute of Electrical and Electronics Engineers Inc., The Hague, Netherlands (2004) 38. Panzieri, S., Setola, R.: Failures propagation in critical interdependent infrastructures. International Journal of Modelling, Identification and Control 3(1), 69–78 (2008) 39. Reed, D.A., Kapur, K.C., Christie, R.D.: Methodology for assessing the resilience of networked infrastructure. IEEE Systems Journal 3(2), 174–180 (2009) 40. Hernández, I., Dueñas-Osorio, L.: Time-sequential evolution of interdependent lifeline systems. In: Proceedings of the 10th International Conference on Structural Safety and Reliability (ICOSSAR), Osaka, Japan (2009) 41. Min, X., Dueñas-Osorio, L.: Inverse Reliability-based Design of Interdependent Lifeline Systems. In: Proceedings of the 2009 Technical Council on Lifeline Earthquake Engineering (TCLEE) Conference, Oakland, California, USA (2009) 42. Lambert, J.H., Sarda, P.: Risk Analysis in Disaster Planning by Superposition of Infrastructure and Societal Networks. American Society of Civil Engineers, Santa Barbara (2002) 43. Carreras, B.A., Newman, D.E., Gradney, P., Lynch, V.E., Dobson, I.: Interdependent risk in interacting infrastructure systems. IEEE Comput. Soc., Los Alamitos (2007) 44. Rosato, V., Issacharoff, L., Tiriticco, F., Meloni, S., De Porcellinis, S., Setola, R.: Modelling interdependent infrastructures using interacting dynamical models. International Journal of Critical Infrastructures 4(1-2), 63–79 (2008) 45. Marti, J.R., Hollman, J.A., Ventura, C., Jatskevich, J.: Dynamic recovery of critical infrastructures: real-time temporal coordination. International Journal of Critical Infrastructures 4(1-2), 17–31 (2008) 46. 
Ibanez, E., McCalley, J., Aliprantis, D., Brown, R., Gkritza, K., Somani, A., Wang, L.: National energy and transportation systems: interdependencies within a long term planning model. IEEE, Piscataway (2008) 47. Lee, E.E., Mitchell, J.E., Wallace, W.A.: Restoration of services in interdependent infrastructure systems: a network flows approach. IEEE Transactions on Systems, Man, and Cybernetics-Part C (Applications and Reviews) 37(6), 1303–1317 (2007) 48. Chee-Wooi, T., Chen-Ching, L., Manimaran, G.: Vulnerability assessment of cybersecurity for SCADA systems. IEEE Transactions on Power Systems 23(4), 1836– 1846 (2008) 49. Gursesli, O., Desrochers, A.A.: Modeling infrastructure interdependencies using Petri nets. Institute of Electrical and Electronics Engineers Inc., Washington (2003) 50. Snow, A.P., Weckman, G.R., Chayanam, K.: Modeling Telecommunication outages due to power loss. International Journal of Industrial Engineering 13(1), 51–60 (2006) 51. Min, H.-S.J., Beyeler, W., Brown, T., Son, Y.J., Jones, A.T.: Toward modeling and simulation of critical national infrastructure interdependencies. IIE Transactions (Institute of Industrial Engineers) 39(1), 57–71 (2007) 52. Iniestra, J.G., Gutierrez, J.: A multi-objective evolutionary methodology for an interdependent transportation project selection problem. IEEE Computer Society, Los Alamitos (2006) 53. Iniestra, J.G., Gutierrez, J.G.: Multicriteria decisions on interdependent infrastructure transportation projects using an evolutionary-based framework. Applied Soft Computing Journal 9(2), 512–526 (2009) 54. Sultana, S., Chen, Z.: Modeling flood induced interdependencies among hydroelectricity generating infrastructures. Journal of Environmental Management 90(11), 3272–3282 (2009)
Synthesis of Modeling and Simulation Methods
45
55. Krings, A., Oman, P.: A simple GSPN for modelling common mode failures in critical infrastructures. IEEE Comput. Soc., Los Alamitos (2003) 56. Bensi, M., Straub, D., Friis-Hansen, P., Der Kiureghian, A.: Modeling infrastructure system performance using BN. In: Furuta, H., Frangopol, D., Shinozuka, M. (eds.) Safety, Reliability and Risk of Structures, Infrastructures and Engineering Systems. CRC Press, Osaka (2009) 57. Peeta, S., Zhang, P.: Modeling Infrastructure Interdependencies: Theory and Practice. In: Furuta, H., Frangopol, D., Shinozuka, M. (eds.) Safety, Reliability and Risk of Structures, Infrastructures and Engineering System. CRC Press, Osaka (2009) 58. Crowther, K.G.: Decentralized risk management for strategic preparedness of critical infrastructure through decomposition of the inoperability input-output model. International Journal of Critical Infrastructure Protection 1, 53–67 (2008) 59. Dudenhoeffer, D.D., Permann, M.R., Manic, M.: CIMS: a framework for infrastructure interdependency modeling and analysis. IEEE, Piscataway (2006) 60. Kim, H.M., Biehl, M., Buzacott, J.A.: M-CI2: Modelling cyber interdependencies between critical infrastructures. Institute of Electrical and Electronics Engineers Computer Society, Perth (2005) 61. Kroger, W.: Critical infrastructures at risk: Securing electric power supply. International Journal of Critical Infrastructures 2(2-3), 273–293 (2006) 62. Sultana, S., Chen, Z.: Modeling infrastructure interdependency among flood plain infrastructures with extended Petri-net. IASTED, Anaheim (2007) 63. Ulieru, M.: Design for resilience of networked critical infrastructures. Inst. of Elec. and Elec. Eng. Computer Society, Cairns (2007) 64. Chai, C.-L., Liu, X., Zhang, W.J., Deters, R., Liu, D., Dyachuk, D., Tu, Y.L., Baber, Z.: Social network analysis of the vulnerabilities of interdependent critical infrastructures. International Journal of Critical Infrastructures 4(3), 256–273 (2008) 65. 
Arboleda, C.A., Abraham, D.M., Richard, J.-P.P., Lubitz, R.: Vulnerability assessment of health care facilities during disaster events. Journal of Infrastructure Systems 15(3), 149–161 (2009) 66. Jiang, P., Haimes, Y.Y.: Risk management for Leontief-based interdependent systems. Risk Analysis 24(5), 1215–1229 (2004) 67. Bagheri, E., Ghorbani, A.A.: The state of the art in critical infrastructure protection: A framework for convergence. International Journal of Critical Infrastructures 4(3), 215–244 (2008) 68. Merz, M., Hiete, M., Bertsch, V.: Multicriteria decision support for business continuity planning in the event of critical infrastructure disruptions. International Journal of Critical Infrastructures 5(1-2), 156–174 (2009) 69. Robert, B., De, C.R., Morabito, L.: Modelling interdependencies among critical infrastructures. International Journal of Critical Infrastructures 4(4), 392–408 (2008) 70. Delamare, S., Diallo, A.-A., Chaudet, C.: High-level modelling of critical infrastructures’ interdependencies. International Journal of Critical Infrastructures 5(1-2), 100– 119 (2009) 71. Hernández, I., Dueñas-Osorio, L.: Sequential propagation of seismic fragility across interdependent lifeline systems. Earthquake Spectra (2010) 72. Santella, N., Steinberg, L., Parks, K.: Decision Making for Extreme Events: Modeling Critical Infrastructure Interdependencies to Aid Mitigation and Response Planning. Review of Policy Research 26(4), 409–422 (2009)
46
G. Satumtira and L. Dueñas-Osorio
73. Klein, R., Rome, E., Beyel, C., Linnemann, R., Reinhardt, W., Usov, A.: Information Modelling and Simulation in Large Interdependent Critical Infrastructures in IRRIIS. In: Critical Information Infrastructure Security: Third International Workshop, pp. 36–47 (2009) 74. Integration Definition for function Modeling (IDEF0). Federal Information Processing Standards Publications (1993) 75. Lamm, G.A., Haimes, Y.Y.: Assessing and managing risks to information assurance: A methodological approach. Systems Engineering 5(4), 286–314 (2002) 76. Chang, S., McDaniels, T., Beaubien, C.A.: Societal Impacts of Infrastructure Failure Interdependencies: Building an Empirical Knowledge Base. In: Tang, A., Werner, S. (eds.) TCLEE 2009: Lifeline Earthquake Engineering in a Multihazard Environment. American Society of Civil Engineers, Oakland (2009) 77. Fovino, I.N., Masera, M.: Emergent disservices in interdependent systems and system-of-systems. Institute of Electrical and Electronics Engineers Inc., Taipei (2007) 78. Lian, C., Santos, J.R., Haimes, Y.Y.: Extreme risk analysis of interdependent economic and infrastructure sectors. Risk Analysis 27(4), 1053–1064 (2007) 79. Irani, Z., Sharif, A., Love, E.D., Kahraman, C.: Applying concepts of fuzzy cognitive mapping to model: The IT/IS investment evaluation process. International Journal of Production Economics 75(1-2), 199–211 (2002) 80. Azarm, M.A., Bari, R., Yue, M., Musicki, Z.: Electrical substation reliability evaluation with emphasis on evolving interdependence on communication infrastructure. Institute of Electrical and Electronics Engineers Inc., Ames (2004) 81. Shahidehpour, M., Yong, F.U., Wiedman, T.: Impact of natural gas infrastructure on electric power systems. Proceedings of the IEEE 93(5), 1042–1056 (2005) 82. Moselhi, O., Hammad, A., Alkass, S., Assi, C., Debbabi, M., Haider, M.: Vulnerability assessment of civil infrastructure systems: A network approach. 
Canadian Society for Civil Engineering, Toronto (2005) 83. Conrad, S.H., LeClaire, R.J., O’Reilly, G.P., Uzunalioglu, H.: Critical national infrastructure reliability modeling and analysis. Bell Labs Technical Journal 11(3), 57–71 (2006) 84. Li, T., Eremia, M., Shahidehpour, M.: Interdependency of natural gas network and power system security. IEEE Transactions on Power Systems 23(4), 1817–1824 (2008) 85. Peerenboom, J.P., Fisher, R.E.: Analyzing cross-sector interdependencies. IEEE Comput. Soc., Los Alamitos (2007) 86. Crowther, K.G., Haimes, Y.Y.: Application of the inoperability input-output model (IIM) for systemic risk assessment and management of interdependent infrastructures. Systems Engineering 8(4), 323–341 (2005) 87. Amin, M.: Challenges in reliability, security, efficiency, and resilience of energy infrastructure: toward smart self-healing electric power grid. IEEE, Piscataway (2008) 88. Rahman, H.A., Beznosov, K., Marti, J.R.: Identification of sources of failures and their propagation in critical infrastructures from 12 years of public failure reports. International Journal of Critical Infrastructures 5(3), 220–244 (2009) 89. Mendonca, D., Lee Ii, E.E., Wallace, W.A.: Impact of the 2001 World Trade Center attack on critical interdependent infrastructures. Institute of Electrical and Electronics Engineers Inc., The Hague (2004)
Synthesis of Modeling and Simulation Methods
47
90. McDaniels, T., Chang, S., Peterson, K., Mikawoz, J., Reed, D.M.: Empirical framework for characterizing infrastructure failure interdependencies. Journal of Infrastructure Systems 13(3), 175–184 (2007) 91. Scawthorn, C., O’Rourke, T., Blackburn, F.: The 1906 San Francisco Earthquake and Fire- Enduring Lessons for Fire Protection and Water Supply. Earthquake Spectra 22(S2), S135–S158 (2006) 92. Lambert, J.H., Sarda, P.: Terrorism Scenario Identification by Superposition of Infrastructure Networks. Journal of Infrastructure Systems 11(4), 211–220 (2005) 93. Lambert, J.H., Patterson, C.E.: Prioritization of schedule dependencies in hurricane recovery of transportation agency. Journal of Infrastructure Systems 8(3), 103–111 (2002) 94. Apostolakis, G.E., Lemon, D.M.: A screening methodology for the identification and ranking of infrastructure vulnerabilities due to terrorism. Risk Analysis 25(2), 361– 376 (2005) 95. Ozog, N., Desjardins, E., Jatskevich, J.: Bulk power system restoration interdependency risk modeling. Inst. of Elec. and Elec. Eng. Computer Society, Vancouver (2008) 96. Brown, A.: Human interoperability and net-centric environments. IEEE, Piscataway (2008) 97. Little, R.G.: A socio-technical systems approach to understanding and enhancing the reliability of interdependent infrastructure systems. International Journal of Emergency Management 2(1-2), 98–110 (2004) 98. Kroger, W.: Critical infrastructures at risk: A need for a new conceptual approach and extended analytical tools. Reliability Engineering and System Safety 93(12), 1781– 1787 (2008) 99. Bekkers, V., Thaens, M.: Interconnected networks and the governance of risk and trust. Information Polity 10(1-2), 37–48 (2005) 100. Jurkauskas, A., Miceviciene, D., Prunskiene, J.: The main principles of modelling the interaction between transport infrastructure development and economy. Transport 20(3), 117–122 (2005) 101. 
Macaulay, T.: Assessing operational risk in the financial sector, using interdependency metrics. International Journal of Critical Infrastructure Protection 1, 45–52 (2008) 102. Unen, H., Elnashai, A., Sahin, M.: Seismic Performance Assessment of Interdependent Utility Network Systems. In: Tang, A., Werner, S. (eds.) TCLEE 2009: Lifeline Earthquake Engineering in a Multihazard Environment. American Society of Civil Engineers, Oakland (2009) 103. Luiijf, E.A.M., Klaver, M.H.A.: Protecting a nation’s critical infrastructure: The first steps. Institute of Electrical and Electronics Engineers Inc., The Hague (2004) 104. Haimes, Y.Y.: Infrastructure interdependencies and homeland security. Journal of Infrastructure Systems 11(2), 65–66 (2005) 105. Santos, J.R., Haimes, Y.Y., Lian, C.: A framework for linking cybersecurity metrics to the modeling of macroeconomic interdependencies. Risk Analysis 27(5), 1283– 1297 (2007) 106. Lian, C., Halmes, Y.Y.: Managing the risk of terrorism to interdependent infrastructure systems through the Dynamic Inoperability Input-Output Model. Systems Engineering 9(3), 241–258 (2006)
48
G. Satumtira and L. Dueñas-Osorio
107. Amin, M.: Infrastructure security: Reliability and dependability of critical systems. IEEE Security and Privacy 3(3), 15–17 (2005) 108. Kelly, T.K.: Critical infrastructure interdependency RD. IEEE, Piscataway (2001) 109. Rinaldi, S.M.: Modeling and simulating critical infrastructures and their interdependencies. Institute of Electrical and Electronics Engineers Computer Society, Big Island (2004) 110. Scalingi, P.L.: DOE energy industry program on critical infrastructure protection. Institute of Electrical and Electronics Engineers Inc., Columbus (2001) 111. Zimmerman, R.: Decision-making and the vulnerability of interdependent critical infrastructure. Institute of Electrical and Electronics Engineers Inc., The Hague (2004) 112. Gheorghe, A.V., Masera, M., De Vries, L., Weijnen, M., Kroger, W.: Critical infrastructures: The need for international risk governance. International Journal of Critical Infrastructures 3(1-2), 3–19 (2007) 113. Communication from the Commission on a European Programme for Critical Infrastructure Protection. Commission of the European Communities, Brussels (2006) 114. Cobb, A.: Thinking about the unthinkable: Australian vulnerabilities to a high-tech risks Library, DoAP, Editor, Canberra (1998) 115. O’Rourke, T.: Critical Infrastructure, Interdependencie, and Resilience. The Bridge, 22–29 (2007) 116. Lau, D.L., Tang, A., Pierre, J.-R.: Performance of lifelines during the 1994 Northridge earthquake. Canadian journal of civil engineering 22(2), 438–451 (1995) 117. Leavitt, W.M., Kiefer, J.J.: Infrastructure Interdependency and the Creation of a Normal Disaster: The Case of Hurricane Katrina and the City of New Orleans. Public Works Management Policy 10(4), 306–314 (2006) 118. Bigger, J.E., Willingham, M.G., Krimgold, F., Mili, L.: Consequences of critical infrastructure interdependencies: Lessons from the 2004 hurricane season in Florida. International Journal of Critical Infrastructures 5(3), 199–219 (2009) 119. 
Chiaradonna, S., Lollini, P., Giandomenico, F.: On a modeling framework for the analysis of interdependencies in electric power systems. Inst. of Elec. and Elec. Eng. Computer Society, Edinburgh (2007) 120. Durango-Cohen, P.L., Sarutipand, P.: Capturing interdependencies and heterogeneity in the management of multifacility transportation infrastructure systems. Journal of Infrastructure Systems 13(2), 115–123 (2007) 121. Lee Ii, E.E., Mitchell, J.E., Wallace, W.A.: Assessing vulnerability of proposed designs for interdependent infrastructure systems. Institute of Electrical and Electronics Engineers Computer Society, Big Island (2004) 122. Mao, Z., Hong, L., Fei, Q., Ouyang, M.: Interdependency analysis of infrastructures. Springer, Wuhan (2009) 123. Shoji, G., Toyota, A.: Modeling of Restoration Process Associated with Critical Infrastructure and Its Interdependency Due to a Seismic Disaster. In: Tang, A., Werner, S. (eds.) TCLEE 2009: Lifeline Earthquake Engineering in a Multihazard Environment, pp. 1–12. American Society of Civil Engineers, Oakland (2009) 124. Svendsen, N., Wolthusen, S.: Multigraph Dependency Models for Heterogeneous Infrastructures. In: Goetz, E., Shenoi, S. (eds.) International Federation for Information Processing, pp. 337–350. Springer, Boston (2007) 125. Svendsen, N.K., Wolthusen, S.D.: Analysis and statistical properties of critical infrastructure interdependency multiflow models. Inst. of Elec. and Elec. Eng. Computer Society, West Point, NY, United states (2007)
Synthesis of Modeling and Simulation Methods
49
126. Zimmerman, R., Restrepo, C.E.: The next step: quantifying infrastructure interdependencies to improve security. International Journal of Critical Infrastructures 2(23), 215–230 (2006) 127. Klein, R.: The EU integrated project IRRIIS on CI dependencies: an overview. IEEE, Piscataway (2009) 128. Bagheri, E., Baghi, H., Ghorbani, A.A., Yari, A.: An agent-based service-oriented simulation suite for critical infrastructure behaviour analysis. International Journal of Business Process Integration and Management 2(4), 312–326 (2007) 129. Luiijf, E.A.M., Klaver, M.H.A.: Critical infrastructure awareness required by civil emergency planning. In: Proceedings - First IEEE International Workshop on Critical Infrastructure Protection, IWCIP 2005, pp. 110–117 (2005) 130. Laprie, J.-C., Kanoun, K., Kaaniche, M.: Modelling interdependencies between the electricity and information infrastructures. Springer, Nuremberg (2007) 131. Fedora, P.A.: Reliability review of north american gas/electric system interdependency. Institute of Electrical and Electronics Engineers Computer Society, Big Island (2004) 132. Abdalla, R., Tao, C.V., Cheng, Q., Li, J.: A network-centric modeling approach for infrastructure interdependency. Photogrammetric Engineering and Remote Sensing 73(6), 681–690 (2007) 133. Johnson, C.W., McLean, K.: Tools for local critical infrastructure protection: computational support for identifying safety and security interdependencies between local critical infrastructures. IET, Stevenage (2008) 134. Yao, B.-H., Xie, L.-L., Huo, E.-J.: Comprehensive study method for lifeline system interaction under seismic conditions. Acta Seismologica Sinica English Edition 17(2), 211 (2004) 135. Permann, M.R.: Toward developing genetic algorithms to aid in critical infrastructure modeling. Inst. of Elec. and Elec. Eng. Computer Society, Woburn (2007) 136. Hausken, K.: Defense and attack of complex and dependent systems. 
Reliability Engineering and System Safety 95(1), 29–42 (2009) 137. Tao, L., Eremia, M., Shahidehpour, M.: Interdependency of natural gas network and power system security. IEEE Transactions on Power Systems 23(4), 1817–1824 (2008) 138. Schiff, A.J.: Documenting damage, disruption, interdependencies and the emergency response of power and communication systems after earthquakes. International Journal of Critical Infrastructures 1(1), 100–107 (2004) 139. Duflos, S., Diallo, A.A., Le Grand, G.: An overlay simulator for interdependent critical information infrastructures. Inst. of Elec. and Elec. Eng. Computer Society, Szklarska (2007) 140. Restrepo, C.E., Simonoff, J.S., Zimmerman, R.: Unraveling geographic interdependencies in electric power infrastructure. Institute of Electrical and Electronics Engineers Computer Society, Kauai (2006) 141. Chen, P., Scown, C., Matthews, H., Garrett, J., Hendrickson, C.: Managing Critical Infrastructure Interdependencies through Economic Input-Output Methods. Journal of Infrastructure Systems 15(3), 200–210 (2009) 142. Gasparri, A., Oliva, G., Panzieri, S.: On the distributed synchronization of on-line IIM interdependency models. IEEE, Piscataway (2009)
50
G. Satumtira and L. Dueñas-Osorio
143. Reed, D., Nojima, N.: Interdependence Between Power Delivery and Other Lifelines. In: Tang, A., Werner, S. (eds.) TCLEE 2009: Lifeline Earthquake Engineering in a multihazard environments, pp. 1–7. American Society of Civil Engineerings, Oakland (2009) 144. Dryden, L.M., Haggerty, M.S., Lane, L.M., Lee, C.S. (eds.): Modeling the impact of infrastructure interdependences on Virginia’s highway transportation system. Institute of Electrical and Electronics Engineers, Charlottesville (2004) 145. Kajitani, Y., Sagai, S.: Modelling the interdependencies of critical infrastructures during natural disasters: A case of supply, communication and transportation infrastructures. International Journal of Critical Infrastructures 5(1-2), 38–50 (2009) 146. Shih, C.Y., Scown, C.D., Soibelman, L., Matthews, H.S., Garrett Jr., J.H., Dodrill, K., McSurdy, S.: Data management for geospatial vulnerability assessment of interdependencies in U.S. power generation. Journal of Infrastructure Systems 15(3), 179– 189 (2009) 147. HadjSaid, N., Tranchita, C., Rozel, B., Viziteu, M., Caire, R.: Modeling cyber and physical interdependencies - application in ICT and power grids. IEEE, Piscataway (2009) 148. Hellstrom, T.: Critical infrastructure and systemic vulnerability: Towards a planning framework. Safety Science 45(3), 415–430 (2007) 149. Liu, C., Fan, Y., Ordonez, F.: A two-stage stochastic programming model for transportation network protection. Computers and Operations Research 36(5), 1582–1590 (2009) 150. Rozel, B., Viziteu, M., Caire, R., Hadjsaid, N., Rognon, J.P.: Towards a common model for studying critical infrastructure interdependencies. IEEE, Piscataway (2008) 151. Urbina, M., Zuyi, L.: Modeling and analyzing the impact of interdependency between natural gas and electricity infrastructures. IEEE, Piscataway (2008) 152. Harper, M.A., Thornton, M.A., Szygenda, S.A.: Disaster tolerant systems engineering for critical infrastructure protection. IEEE, Piscataway (2007) 153. 
Angelou, G.N., Economides, A.A.: A decision analysis framework for prioritizing a portfolio of ICT infrastructure projects. IEEE Transactions on Engineering Management 55(3), 479–495 (2008) 154. Rahman, H.A., Armstrong, M., Mao, D., Marti, J.: I2Sim: A matrix-partition based framework for critical infrastructure interdependencies simulation. Inst. of Elec. and Elec. Eng. Computer Society, Vancouver (2008) 155. Robert, B., Morabito, L., Quenneville, O.: The preventive approach to risks related to interdependent infrastructures. International Journal of Emergency Management 4(2), 166–182 (2007) 156. Amin, M.: Toward self-healing energy infrastructure systems. IEEE Computer Applications in Power 14(1), 20–28 (2001) 157. Amin, M.: Toward Secure and Resilient Interdependent Infrastructures. Journal of Infrastructure Systems 8(3), 67–75 (2002) 158. Chou, C.-C., Chen, C.-T., Tseng, S.-M., Lin, J.-D.: A spatiotemporal model for persisting critical infrastructure interdependencies. Inst. of Elec. and Elec. Eng. Computer Society, Jinan (2008) 159. Friesz, T.: Network Science. Nonlinear Science and Infrastructure Systems, 102 (2007)
Synthesis of Modeling and Simulation Methods
51
160. Newman, M.: The structure and function of complex networks. SIAM Review 45(2), 167–256 (2003) 161. Barrat, A., Barthelemy, M., Vespignani, A.: Dynamical Processes on Complex Networks. Cambridge Univ. Press, Cambridge (2008) 162. Padgett, J., Dennemann, K., Gosh, J.: Sustainable infrastructure subjected to multiple threats. In: Tang, A., Werner, S. (eds.) TCLEE 2009: Lifeline Earthquake Engineering in a Multihazard Environment, American Society of Civil Engineers, Oakland (2009)
Interdependencies between Energy and Transportation Systems for National Long Term Planning*

Eduardo Ibáñez, Konstantina Gkritza, James McCalley, Dionysios Aliprantis, Robert Brown, Arun Somani, and Lizhi Wang
Abstract. The most significant energy-consuming infrastructures and the greatest contributors to greenhouse gases for any nation today are electric and freight/passenger transportation systems. Technological alternatives for producing, transporting, and converting energy for electric and transportation systems are numerous. Addressing costs, sustainability, and resiliency of electric and transportation needs requires long-term assessment, since these capital-intensive infrastructures take years to build and have lifetimes approaching a century. Yet the advent of electrically driven transportation, including cars, trucks, and trains, creates potential interdependencies between the two infrastructures that may be both problematic and beneficial. We are developing a modeling capability to perform long-term electric and transportation infrastructure design at a national level, accounting for their interdependencies. The approach combines network flow modeling with a multiobjective solution method. We describe the approach and compare it with the state of the art in energy planning models. An example is presented to illustrate important features of this new approach.

Eduardo Ibáñez · James McCalley · Dionysios Aliprantis · Arun Somani
Electrical and Computer Engineering, Iowa State University, Ames, Iowa, 50010, USA

Konstantina Gkritza
Civil Engineering, Iowa State University

Robert Brown
Mechanical Engineering, Iowa State University

Lizhi Wang
Industrial and Manufacturing Systems Engineering, Iowa State University

* This material is based upon work supported by the National Science Foundation under Grant No. 0835989.
K. Gopalakrishnan & S. Peeta (Eds.): Sustainable & Resilient Critical Infrastructure Sys., pp. 53–76. © Springer-Verlag Berlin Heidelberg 2010 springerlink.com
1 Introduction

Most US energy usage is for electricity production and vehicle transportation, two interdependent infrastructures. The strength and number of these interdependencies will increase rapidly as hybrid electric transportation systems, including plug-in hybrid electric vehicles and hybrid electric trains, become more prominent. Several new energy supply technologies are reaching maturity, accelerated by public concern over global warming. US DOE-EIA [1] suggests that national expenditures on electric energy and transportation fuels over the next 20 years will exceed $14 trillion, four times the 2010 federal budget [2]. Intentional and strategic energy system design at the national level will have a very large economic impact.

The proposed work is motivated by a recognition that tools, knowledge, and perspective are lacking to design a national system integrating energy and transportation infrastructures while accounting for interdependencies between them, new energy supply technologies, sustainability, and resiliency. Our goal is to identify optimal infrastructure designs in terms of future power generation technologies, energy transport and storage, and hybrid-electric transportation systems, with balance in sustainability, costs, and resiliency. We will characterize interdependencies between the energy resource portfolio and energy/vehicular transportation systems at the national level.

This chapter begins with an overall description of our approach in section 2, including the underlying models for the energy and transportation sectors. In section 3 we identify the most relevant interdependencies that an integrated energy and transportation system could present. The goal of our approach is to identify a set of optimal solutions in terms of competing objectives (cost, sustainability, and resiliency), which are defined in section 4. A review of the state of the art in infrastructure planning software and methodologies follows in section 5, leading to the formulation of our approach in section 6. That formulation is applied to a small test system in section 7, and the chapter concludes with a discussion in section 8.
2 Modeling Approach

The energy system comprises, but is not limited to, electricity, natural gas, liquid fuels, nuclear, biomass, hydroelectric, wind, solar, and geothermal resources. Modeling of national freight and passenger transportation focuses on state-to-state travel. We consider both infrastructure (rail, highways, locks/dams, roads, ports, airports) and fleets (trains, barges, trucks, personal vehicles, airplanes, etc.), and each mode may have several kinds of fleet (e.g., diesel and electric trains, or conventional and plug-in hybrid electric vehicles).
Interdependencies between Energy and Transportation Systems
Fig. 1 Proposed model that integrates the energy and transportation systems at two levels: operation and planning
Fig. 1 captures the scope of our modeling effort. The transportation and energy systems interact mainly at two different stages: operation and investment. At the operational level each system needs to satisfy its demand with the existing capacity. However, operation of the two systems, and ultimately investment, are interdependent; while the transportation sector demands energy in the form of fuel, the energy sector requires the movement of raw bulk energy sources (e.g. coal or natural gas for thermal power plants). At the same time, the cost of meeting those reciprocal demands has an impact on final prices for energy and transportation. The ever-growing public need for energy and transportation creates the necessity to invest in new capacity. Given the potential for increased coupling between energy and transportation, it is apparent that better designs of both can be achieved if these designs are performed together.
2.1 Energy Systems Modeling A generalized network flow transportation model [3,4] is used to model energy systems, where commodity flow is energy, and transportation paths are AC and DC electric transmission, gas pipelines (for natural gas and/or hydrogen), and liquid fuel pipelines (for petroleum-based fuels, biofuels such as ethanol or biodiesel, and anhydrous ammonia). Energy transport by rail, barge, and truck is included in the freight transport model. Each source node, specified with location, is connected to a fictitious source node that supplies all energy. Arcs emanating from each source are characterized by maximum extraction rate (MBtu/month) and extraction cost ($/MBtu/month). Petroleum, coal, natural gas, and uranium have finite capacities, while renewables have infinite capacities. All sources have finite maximum extraction rates. Conversion and transportation are endowed with: capacity (MBtu-capacity/month),
E. Ibáñez et al.
efficiency (%), operational cost ($/MBtu-flow/month), investment cost ($/MBtu-capacity/month), component sustainability metrics (e.g., CO2 tons/MBtu-flow), and component resiliency (e.g., reliability).
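The arc attributes listed above can be pictured as a small record attached to each conversion or transport arc. The following is only a sketch: the class name `EnergyArc`, its field names, and the numerical values are illustrative assumptions and are not taken from the chapter.

```python
from dataclasses import dataclass


@dataclass
class EnergyArc:
    """One conversion or transport arc (hypothetical field names)."""
    capacity: float    # MBtu-capacity/month
    efficiency: float  # fraction between 0 and 1, e.g. 0.9 for 90 %
    op_cost: float     # $/MBtu-flow/month
    inv_cost: float    # $/MBtu-capacity/month
    co2_rate: float    # CO2 tons/MBtu-flow

    def delivered(self, flow: float) -> float:
        """Energy leaving the arc after losses, for a flow entering it."""
        if not 0.0 <= flow <= self.capacity:
            raise ValueError("flow must lie between 0 and the arc capacity")
        return self.efficiency * flow

    def emissions(self, flow: float) -> float:
        """CO2 attributed to the flow on this arc."""
        return self.co2_rate * flow


# Made-up values, used only to exercise the class.
arc = EnergyArc(capacity=100.0, efficiency=0.9, op_cost=2.0,
                inv_cost=50.0, co2_rate=0.05)
```

Keeping efficiency on the arc is what makes the network "generalized": the flow that arrives at the head node is smaller than the flow that left the tail node.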
2.2 Transportation Systems Modeling The freight transport system is modeled as a multi-commodity flow network where the flows are in units of tons of each major commodity. A commodity is major if its transportation requirements comprise at least 2% of the nation’s total freight ton-miles. Data available to make this determination [5] indicate that this criterion includes 23 commodities that comprise 90% of total ton-miles (e.g., the top eight, comprising 55%, are in descending order: coal, cereal grains, foodstuffs, gasoline and aviation fuel, chemicals, gravel, wood products, and base metals). There are two fundamental differences between this formulation and the energy formulation. First, whereas the energy formulation must restrict energy flows of specific forms to particular networks (for example, natural gas or hydrogen cannot move through electric lines or liquid fuel lines), commodities may be transported over any of the transport modes (rail, barge, truck). Second, whereas energy movement requires only infrastructure (electric lines, liquid fuel pipelines, gas pipelines), commodity movement requires both infrastructure (rail, locks/dams, roads, ports) and fleet (trains, barges, trucks), and there may be different kinds of fleets for each mode (e.g., diesel trains or electric trains). To accommodate these differences, the transportation formulation comprises two multi-commodity flows [6], one embedded inside the other. Commodities flow through the network formed by the different types of fleet available. At the same time, the units in those fleets travel along the network formed by the different infrastructures. An effective method to convert this situation into an ordinary network problem is captured in Fig. 2, where the flow from node A to node B is divided first according to the types of infrastructure and then according to the types of available fleets. Capacity limits can then be applied to the appropriate fictitious arcs.
Fig. 2 Decomposition of transportation arc in two steps: infrastructure and fleet
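The two-step decomposition of Fig. 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the tuple layout of an arc, the intermediate node labels, and all capacities are hypothetical and chosen only to show how one physical arc becomes a set of fictitious arcs that can each carry their own capacity limit.

```python
def expand_transport_arc(origin, dest, infra_caps, fleet_caps):
    """Split one origin-destination arc into infrastructure and fleet arcs.

    infra_caps: {infrastructure: capacity}
    fleet_caps: {infrastructure: {fleet: capacity}}
    Returns a list of (tail, head, label, capacity) tuples.
    """
    arcs = []
    for infra, infra_cap in infra_caps.items():
        # Fictitious node reached after choosing an infrastructure.
        mid = f"{origin}-{dest}:{infra}"
        arcs.append((origin, mid, infra, infra_cap))
        # One fictitious arc per fleet type that can use this infrastructure.
        for fleet, fleet_cap in fleet_caps[infra].items():
            arcs.append((mid, dest, fleet, fleet_cap))
    return arcs


# Made-up capacities in tons per period.
arcs = expand_transport_arc(
    "A", "B",
    infra_caps={"rail": 900.0, "highway": 400.0},
    fleet_caps={
        "rail": {"diesel train": 600.0, "electric train": 300.0},
        "highway": {"truck": 400.0},
    },
)
```

In this sketch the infrastructure limit sits on the first fictitious arc and each fleet limit on a second one, so both kinds of capacity bind independently in an ordinary network flow solver.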
3 Interdependencies The described model enables analysis of interdependencies between resource mix, sustainability, cost and resiliency at a national level, and we intend to study these relationships under various assumptions of technological maturity (cost, efficiency, reliability) for wind, solar, hydrogen, nuclear, geothermal, gasification, biofuel production, and hybrid electric transportation systems (plug-in hybrid vehicles or hybrid trains). The following interdependencies are of special interest to us and can be studied within the broad scope of our model:
1. Wind and resource mix: Since wind is carbon-free, low-cost, and renewable, popular thought is to maximize its use; yet use of wind energy is constrained, since: (1) its night-time peak is in anti-phase with the day-time electric load peak; (2) the uncertainty in its day-ahead availability increases costs associated with maintaining higher levels of reserve; (3) the high-voltage transmission necessary to move it from wind-intensive regions to load centers is insufficient. We will investigate interdependencies between wind-supplied energy and three particular resource types: (a) plug-in hybrid electric vehicles (PHEVs) and hybrid trains via charging stations; (b) fuel cells via electrolysis to produce hydrogen; (c) hydro pumped storage.
2. Gasification, carbon, and transportation: Integrated gasification combined cycle (IGCC) units convert fossil fuels and/or biomass to syngas to fire a combined cycle unit. IGCCs obtain high efficiencies and enable effective carbon capture/sequestration. Although expensive to build and presently less reliable to operate, an IGCC plant provides a uniquely attractive degree of versatility via low-carbon use of fossil fuels and/or biomass, and co-production of electric energy, hydrogen, and diesel fuel.
3. Transportation patterns and resource mix: Existing highway and rail infrastructure was designed without consideration of coupling to bulk energy transport. We will use concepts from existing statewide [7] and metropolitan [8] travel demand models to refine our freight and energy models to study how passenger and freight transportation patterns are likely to change as a result of increased penetration of PHEVs, hybrid trains, and biofuels, identifying interdependencies between electric grid operations/expansion, traffic operations (vehicle movements caused by daily/seasonal population shift cycles) and transportation network expansion.
4. Right-of-way (ROW) and resource mix: The Midwest US has 5 times as much potential wind capacity as present regional load, so moving large amounts of wind energy from the Midwest to load centers on the east and west coasts is under consideration [9,10]. The Midwest-to-east-coast route alone requires crossing at least twelve states, each with regulatory oversight reflecting intense public sensitivity to overhead transmission. The relation to rail is intriguing. On the one hand, ROW could be obtained from converted rail routes made available by the reduction in eastward coal transport caused by increased wind availability. On the other hand, large-scale deployment of hybrid trains may be facilitated by deploying overhead transmission in rail ROW while powering trains from that same overhead transmission.
5. Prices of petroleum, natural gas, and electricity: Significant price variability has occurred in each of these three forms of energy. Our model will provide locational marginal prices for any network-transportable energy form [3,4] via the dual variables of the nodal balance constraints [11]. We will explore price interdependencies between these energy forms as affected by coupling resulting from hybrid electric transportation systems.
6. Demand coordination: Demand coordination offers significant opportunity to enhance sustainability and resiliency while decreasing costs. Microprocessor control of residential, commercial, and industrial loads, based on real-time prices, offers an effective means of time-shifting demand, an approach PHEV owners and hybrid train operators could use to make money by charging off-peak and supplying on-peak. Deployment of a high-voltage transmission “national superhighway” [9,10] could enable spatial-temporal coordination, e.g., 11 am Southwest solar could be used to supply East-coast 2 pm peaks, or New York City PHEVs could sell 9 pm energy to Seattle residents for their 6 pm peak. Significant deployment of load-side, distributed solar and wind power resources could heavily reduce the need for centralized generation. PHEVs and hybrid trains, if coupled with high-efficiency engines, could increase their onboard power generation capacity, resulting in more on-peak electric capacity.
7. Appearance of new competition: The study of investment in new technologies can unveil the possible existence of parallel paths to satisfy the same demand in the energy system (e.g., coal plant vs. wind), the transportation system (e.g., rail vs. truck) or both (e.g., fuel transportation vs. electric transmission). These parallel paths could create opportunities for the development of new markets or even the combination of existing ones.
8. Environmental legislation: Pollutants and greenhouse gas emissions are byproducts of transportation use or energy processing operation, as well as of the construction of new infrastructure. Existing or forthcoming legislation can have a tremendous impact on the proposed portfolio for the energy and transportation systems. New policies can take the form of emission limits or “caps”, allowance assignment and trading, or taxes per volume emitted.
9. Capacity investment: Freight system “chokepoints,” particularly ports, and energy system “bottlenecks,” particularly electric transmission systems, are continuous targets of improvements. We will investigate how improvements in one infrastructure affect demand, and the need for capacity, in the other.
To illustrate an emerging transformative interdependency between the energy and transportation infrastructures, which can potentially have a tremendous impact on the operation of both systems, consider PHEVs with vehicle-to-grid (V2G) capabilities [12,13]. The main concept behind V2G technology is to allow vehicles to discharge their stored energy back to the power grid by use of bi-directional power electronic dc/ac interfaces and communications protocols. PHEVs without V2G capability are “just” increased load on the power system, which must nevertheless be capable of supplying the additional power required to charge them (usually during the night, which is optimal from the perspective of the power grid
operator [15]). However, if the V2G technology is further developed, standardized, and widely adopted, then a whole new range of exciting possibilities for power system operation arises. Recent studies have shown that V2G technology can help stabilize the grid during power shortages [15]. Some studies also suggest that PHEVs can be the key component that will permit the penetration of renewable resources to the power grid up to levels of 40–50% [16–18], currently limited by transmission, storage, and reliability considerations. In essence, PHEVs would function as a huge system-wide distributed energy storage element, which would be highly reliable, and would help mitigate the requirement for storing renewable energy locally at the point of generation.
4 Multiple Objectives One of the core features of this approach is its multiobjective nature. The final objective is to find a Pareto front of non-dominated solutions. Rather than imposing a predetermined hierarchy or weighting of objectives before the optimization takes place, the trade-offs can be analyzed a posteriori, giving decision makers and the general public a solid basis for determining where their efforts should be focused. The objectives are grouped in three distinct categories: cost, sustainability and resiliency, which are presented below.
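The notion of a non-dominated solution can be illustrated with a short filter over candidate designs. This is a brute-force sketch for a handful of points, not the solution method of the chapter; the function names and the (cost, emissions) values are made up, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (cost, emissions) pairs for four candidate designs.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_front(candidates)
```

Here the design (3.0, 4.0) is dropped because (2.0, 3.0) is cheaper and cleaner; the three remaining designs are the ones a decision maker would compare a posteriori.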
4.1 Cost The cost objective has two equally important components: operational and investment costs. The latter is usually perceived as more prominent, given the amount of capital it requires over a relatively short period of time. However, the operational component could have a bigger impact on the total cost and should be handled with care. Investment cost is calculated in dollars per unit of increase in maximum flow over a period of time: MBtu-capacity for energy networks or ton-capacity for transportation networks. The concept of overnight cost is used, that is, the cost of financing the entire construction of a given infrastructure if it were paid for in a single payment at one point in time. Among other items, this cost includes materials, labor, financial costs, intellectual property, and dismantling costs. Salvage value, should it exist, is taken into account with the appropriate inflation correction. Operational cost, as opposed to investment cost, is expressed in dollars per unit of energy produced or unit of mass transported. It could include, but is not limited to, the following: labor (operators, drivers), maintenance, byproduct disposal (nuclear waste, ash), or non-fuel materials (e.g. limestone in fossil fuel plants). Useful byproducts could potentially reduce the overall operational cost. There are two traditional operational cost components that we do not assign directly: fuel costs and amortization of the investment. The former is handled by the representation of the fuel sources and distribution networks with the appropriate interconnections, and the latter is included in the investment cost.
4.2 Sustainability We view sustainability in terms of environmental impact and supply longevity. We capture four classes of environmental impacts related to energy and transportation systems: net emissions, nuclear waste, water consumption (e.g., for biofuel production), and resource displacement (e.g., land usage). The most relevant emissions that result from energy and transportation systems are the emissions of four air pollutants (CO, NOx, SO2 and volatile organic compounds) [19], and greenhouse gas emissions (CO2 and methane) [20]. For each environmental impact belonging to any of our four classes (emissions, waste, consumption, displacement) a linear expression is considered. Coefficients representing the impact per unit of flow need to be determined prior to the optimization process. Environmental consequences of investment can also be included and computed by analyzing the nominal impact of the life cycle of each infrastructure, such as emission during the processing of raw materials (steel, concrete) or during construction. We also characterize supply longevity for a depletable resource (e.g., coal, gas, uranium) as the remaining years for the resource if used at the average rate over the simulation time. Sustainability expressions for water and land as depletable resources or air pollutants that should not exceed a predetermined threshold can be modeled as complicating constraints [3] that specify flow relationships between several arcs. Dual variables for these constraints provide valuation of the corresponding metric in terms of its per-unit effect on objectives. These valuations can be linked to market prices as in the case of tradeable SO2 allowances [21–23]. We will explore potential linkages with models reflecting natural capital [24] and applications to carbon cap and trade markets [25].
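Two of the sustainability quantities described above reduce to simple arithmetic once flows are known. The sketch below assumes hypothetical function names, arc labels, and numbers; in particular, it takes the remaining reserve as a given input rather than deriving it from the model.

```python
def linear_impact(coefficients, flows):
    """Environmental impact as a linear expression: sum of coefficient * flow."""
    return sum(coefficients[arc] * flow for arc, flow in flows.items())


def supply_longevity(remaining_reserve, extraction_per_period):
    """Periods the reserve would last at the average extraction rate."""
    average_rate = sum(extraction_per_period) / len(extraction_per_period)
    if average_rate == 0:
        return float("inf")  # resource is never used in the simulation
    return remaining_reserve / average_rate


# Made-up data: impact per unit of flow and flows on two arcs.
co2_per_unit = {"coal_plant": 2.0, "wind_farm": 0.0}
flows = {"coal_plant": 5.0, "wind_farm": 7.0}
total_co2 = linear_impact(co2_per_unit, flows)

# Made-up reserve and three periods of extraction.
years_left = supply_longevity(1000.0, [10.0, 20.0, 30.0])
```

Because both expressions are linear in the flows, they can serve either as objectives or, with a threshold on the right-hand side, as the complicating constraints mentioned in the text.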
4.3 Resiliency The resiliency of the system will be evaluated by studying the impact of fictitious arc failures or cost increases, representing possible contingencies on the different networks. Contingencies may be represented by a decrease in capacity and/or an increase in cost at one or multiple nodes, occurring at any point in time during the simulation span, and for different durations, ranging from a single time step to the complete study period. Arc failure probabilities are a function of exposure to physical failure modes such as natural events (e.g., hurricanes, earthquakes, extreme heat, drought, flood), equipment failures (e.g., train derailments, pipeline explosions, bridge collapse, software/communication system failure), and terrorist acts. They also depend on exposure to sociological and political risk (e.g., another nation’s ability to curtail exports to the U.S. or to demand higher energy prices). A uniquely important influence is the dependence of each arc’s exposure to equipment failure on workforce quantity and quality. Although this part of the work is yet to be developed to its full extent, we envision several options to evaluate the resiliency of the energy and transportation systems. A straightforward approach to be employed within the optimization consists
of computing the increase that contingencies cause in the system’s operational cost [26], once an investment scheme has been determined. We also compute traditional network reliability indices for the system such as unavailability or expected energy not supplied based on, for example, identification of minimum cut sets [27–31]. Since the system at this level is robust to bulk non-deliverability, a more sensitive resiliency metric may be appropriate, e.g., one that is based on energy price variation over time, computable within our model as the integration over time of locational marginal prices. A related metric, the cumulative reduced price [27], depends on price differences between two nodes summed over time. This metric, useful for ranking future investments, complements information obtained from the capacity part of the optimization solution. The possible combinations of causes produce a large space of contingencies, each with its own occurrence probability and very diverse potential impacts. Finding the set of contingencies that represent the highest risk can be achieved with various techniques, such as sensitivity analysis on the operational solution, Monte Carlo simulations, or specialized explicit formulations [32]. This set depends on the structure of the energy and transportation networks, which in turn depends on the profile of the investment decisions. Once the minimum-cost flow problem is solved for the selected contingencies, metrics can be obtained based on the average increase in the overall operational cost with respect to a reference case. Locational marginal prices and minimum cut sets can also be used during the optimization process or in the result analysis.
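The last metric mentioned above, the average increase in operational cost relative to a reference case, can be sketched in a few lines. The function name, the optional probability weighting, and the cost figures are assumptions made for illustration; the chapter does not specify how contingencies are weighted.

```python
def average_cost_increase(reference_cost, contingency_costs, probabilities=None):
    """Mean rise in operational cost over the reference (no-contingency) case.

    If probabilities are given, the mean is weighted by them (an assumption
    of this sketch); otherwise every contingency counts equally.
    """
    increases = [cost - reference_cost for cost in contingency_costs]
    if probabilities is None:
        return sum(increases) / len(increases)
    total_weight = sum(probabilities)
    weighted = sum(p * inc for p, inc in zip(probabilities, increases))
    return weighted / total_weight


# Made-up operational costs from re-solving the flow problem per contingency.
reference = 100.0
with_contingencies = [110.0, 130.0]
plain_mean = average_cost_increase(reference, with_contingencies)
weighted_mean = average_cost_increase(reference, with_contingencies, [3.0, 1.0])
```

Each entry of `with_contingencies` would come from one minimum-cost flow solution with the investment scheme held fixed, which keeps the metric cheap to evaluate for many contingencies.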
5 State-of-the-Art and Model Attributes We have performed a detailed comparison between the most advanced energy planning models, which include the National Energy Modeling System (NEMS) [33], the MARKAL/TIMES suite [34], and WASP-IV [36]. Modeling on the transportation side includes national freight forecasting models and tools [36,37] and statewide passenger travel forecasting models [7]. Transportation investment planning tools include (but are not limited to) the Highway Economic Requirement System (HERS-ST) [38] for highway investments, and RailDec [39] for rail investments. Other related work includes designs for novel energy infrastructure systems [40–42], infrastructure interdependencies [43–48], and systemwide modeling of energy systems [49] including the EPA’s Integrated Planning Model [50]. We concluded that our proposed work would provide the following attributes not currently available in any of these models:
1. Ability to optimize multiple objectives;
2. Use of resource depletability as a sustainability measure;
3. Availability of resiliency metrics;
4. Rigorous modeling of interactions between energy and freight/passenger transportation.
In addition, our proposed solution approach, which combines advanced network flow modeling, multiple decomposition techniques, and multiobjective solution
methods with computation performed via high-performance computing, represents a unique integration of the very best in approach, algorithm, and computing platform in addressing an extreme-dimensionality problem of high technical, political, and social importance today.
6 General Formulation This section presents the formulation used to achieve the characteristics described above. The explanation of the formulation is preceded by an introduction to the nomenclature used.
6.1 Nomenclature

Sets and networks
N: Set of nodes
M: Set of arcs
T: Set of time periods
En: Set of energy networks
Tr: Set of transportation networks
Inf: Set of transportation infrastructures
Fleet: Set of transportation fleet
Efuel: Set of energy networks that provide fuel to transportation
TrEn: Set of transportation networks that provide fuel to generation nodes

Objective functions
CostOp: Total cost of operating the energy and transportation networks
CostInv: Total investment cost
Emissions_k: Total emissions for pollutant k

Parameters
η(i,j,l)(t): Efficiency of arc (i,j) in network l, during time t
lb(i,j,l)(t): Lower bound for flow in arc (i,j) in network l, during time t
ub(i,j,l)(t): Upper bound for flow in arc (i,j) in network l, during time t, due to the initial existing infrastructure
lbInv(i,j,l)(t): Minimum allowed capacity increase in arc (i,j) in network l, at time t
ubInv(i,j,l)(t): Maximum allowed capacity increase in arc (i,j) in network l, at time t
costOp(i,j,l)(t): Operational cost for flow in arc (i,j) in network l, during time t
costInv(i,j,l)(t): Investment cost for capacity increase in arc (i,j) in network l, at time t
kEm(i,j,l)(t): Emission rate for pollutant k for flow in arc (i,j) in network l, during time t
heatRate(i,j,E)(t): Heat rate for thermal generation i at node j, during time t
fuelCons(i,j,l)(t): Fuel consumption for transportation mode i arriving at node j in network l, during time step t
d(j,l)(t): Fixed energy or transportation demand at node j in network l, during time t
r: Discount rate

Decision variables
f(i,j,l)(t): Operational flow of arc (i,j) in network l, during time t
capInv(i,j,l)(t): Capacity increase due to investment in arc (i,j) in network l, during time t
6.2 Formulation Description The optimization problem associated with this model can be described conceptually by (1):

\begin{equation}
\begin{aligned}
\min\ & \{\mathrm{CostOp} + \mathrm{CostInv},\ \mathrm{Emissions}_k\} \\
\text{subject to:}\ & \text{meet energy demand}, \\
& \text{meet transportation demand} \\
\text{decision variables:}\ & \text{flows},\ \mathrm{capInv}
\end{aligned}
\tag{1}
\end{equation}
There are two objectives (cost and emissions), each having an energy and a transportation component. We minimize these objectives under constraints of meeting demands on energy, and freight transport. Decision variables characterize operations (flows) and capacity investments (capInv). The following formulation (2) corresponds to a first approach to the modeling capabilities that have been previously described in this chapter. Each arc is specified by (i,j,l), where i is the origin node, j is the terminal node and l is the network to which it belongs. A key attribute of this model is that networks of different energy and transportation forms are represented separately, linked only to the extent that the energy form of one network can be converted to the energy form of another. The simulation period is specified by T. Objectives (2a) are to minimize operational (2g) and investment (2h) costs, and pollutant emissions (2i), subject to the energy and transport balance constraints (2b) for all nodes and the flow bound constraints for all arcs (2c, 2e). Flow balance at the nodes is enforced by (2b), where the right-hand-side represents demand on the commodity form at node j. Certain energy nodes can have a freight-related demand (2k) to fuel the need for transportation. At the same time, the demand of energy related commodities in some given transport networks (carbon, natural gas) will depend on the generation rate at those nodes (2j). The efficiency parameter η(j,k,l) in (2b) accounts for losses in the energy network, and equals 1 in the transport system since it is assumed to be lossless.
\begin{align}
& \min\ \{\mathrm{CostOp} + \mathrm{CostInv},\ \mathrm{Emissions}_k\} \tag{2a} \\
& \text{subject to:} \notag \\
& \sum_{k} \eta_{(j,k,l)}\, f_{(j,k,l)}(t) - \sum_{i} f_{(i,j,l)}(t) = d_{(j,l)}(t),
  && \forall j \in N,\ \forall l \in \{En, Tr\},\ \forall t \in T \tag{2b} \\
& lb_{(i,j,l)}(t) \le f_{(i,j,l)}(t) \le ub_{(i,j,l)}(t) + \sum_{z=0}^{t} capInv_{(i,j,l)}(z),
  && \forall (i,j) \in M,\ \forall l \in \{En\},\ \forall t \in T \tag{2c} \\
& lbInv_{(i,j,l)}(t) \le capInv_{(i,j,l)}(t) \le ubInv_{(i,j,l)}(t),
  && \forall (i,j) \in M',\ \forall l \in \{En\},\ \forall t \in T \tag{2d} \\
& lb_{(i,j,l)}(t) \le \sum_{l \in Tr} f_{(i,j,l)}(t) \le ub_{(i,j,l)}(t) + \sum_{z=0}^{t} capInv_{(i,j,l)}(z),
  && \forall i \in \{Inf, Fleet\},\ \forall l \in \{Tr\},\ \forall t \in T \tag{2e} \\
& lbInv_{(i,j,l)}(t) \le capInv_{(i,j,l)}(t) \le ubInv_{(i,j,l)}(t),
  && \forall i \in \{Inf, Fleet\},\ \forall j \in M',\ \forall l \in \{Tr\},\ \forall t \in T \tag{2f} \\
& \text{where:} \notag \\
& \mathrm{CostOp} = \sum_{t \in T} \sum_{(i,j,l) \in M} (1+r)^{-t}\, costOp_{(i,j,l)}(t)\, f_{(i,j,l)}(t) \tag{2g} \\
& \mathrm{CostInv} = \sum_{t \in T} \sum_{(i,j,l) \in M'} (1+r)^{-t}\, costInv_{(i,j,l)}(t)\, capInv_{(i,j,l)}(t) \tag{2h} \\
& \mathrm{Emissions}_k = \sum_{t \in T} \sum_{(i,j,l) \in M} kEm_{(i,j,l)}(t)\, f_{(i,j,l)}(t) \tag{2i} \\
& d_{(j,TrEn)}(t) = \sum_{i} heatRate_{(i,j,l)}(t)\, f_{(i,j,l)}(t),
  && \forall j \in N_{TrEn},\ \forall t \in T \tag{2j} \\
& d_{(j,Efuel)}(t) = \sum_{l \in \{Tr\}} \sum_{i \in \{Fleet\}} fuelCons_{(i,j,l)}(t)\, f_{(i,j,l)}(t),
  && \forall j \in N_{Efuel},\ \forall t \in T \tag{2k} \\
& \text{decision variables:} \quad f_{(i,j,l)}(t),\ capInv_{(i,j,l)}(t) \ge 0 \notag
\end{align}
Upper capacity bounds in (2c, 2e) may change due to the presence of the decision variables capInv(i,j,l)(t), which model facility expansion and can be constrained (2d, 2f) to represent minimum and maximum levels of investment. In energy networks, every arc is constrained independently. In the transportation networks, however, the upper bound and capacity investment are assigned to the combination of commodity flows transported between a given pair of nodes by a mode of transportation (infrastructure or fleet). Cost expressions (2g) and (2h) are expressed as present worth using the present worth factor (1+r)^(-t). Operational costs in (2g) are summed over the entire arc set M, but investment costs in (2h) are summed over a specified set M’, which enables consideration of both connected and unconnected nodes while controlling problem dimensionality. Salvage values are taken into consideration when the effective life of the investment exceeds the end of the simulation time [51]. Pollutant emissions (2i) are calculated using the amount of pollutant emitted per unit of flow, kEm(i,j,l)(t). The flows that are assigned a nonzero emission rate are thermal generation units and transportation flows.
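The pieces of (2) that involve only sums can be checked by hand for a given set of flows and investments. The following sketch evaluates the cumulative capacity limit of (2c) and the expressions (2g) and (2i) for a single hypothetical arc; the helper names, the arc label, and all numbers are assumptions, and time periods are indexed from 0 so that the first period is not discounted.

```python
from itertools import accumulate


def capacity_limit(initial_ub, cap_inv):
    """Upper flow bound per period: ub(t) plus all capInv(z) for z <= t."""
    return [ub + added for ub, added in zip(initial_ub, accumulate(cap_inv))]


def present_worth(rate, unit_cost, quantity):
    """Sum over arcs and periods of (1 + r)^(-t) * unit_cost * quantity."""
    return sum(
        (1.0 + rate) ** -t * unit_cost[arc][t] * q
        for arc, series in quantity.items()
        for t, q in enumerate(series)
    )


def total_emissions(emission_rate, flows):
    """Undiscounted sum over arcs and periods of emission rate * flow."""
    return sum(
        emission_rate[arc][t] * f
        for arc, series in flows.items()
        for t, f in enumerate(series)
    )


# One made-up arc over three yearly periods.
arc = ("Midwest", "East", "electricity")
flows = {arc: [40.0, 55.0, 60.0]}
limits = capacity_limit([50.0, 50.0, 50.0], [0.0, 5.0, 5.0])
feasible = all(f <= lim for f, lim in zip(flows[arc], limits))
cost_op = present_worth(0.07, {arc: [2.0, 2.0, 2.0]}, flows)
co2 = total_emissions({arc: [1.0, 1.0, 1.0]}, flows)
```

The same `present_worth` helper covers (2h) when it is called with investment costs and capacity increases instead of operational costs and flows.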
7 Numerical Example To illustrate some of the capabilities of the proposed model a simple example with two geographical regions has been created and analyzed based on a previous model [52] that only took the energy system into consideration.
7.1 Description The illustrated example (Fig. 3) features a high level representation of the energy and transport relations between the Midwestern and Eastern sections of the United States. These areas are also respectively referred to as “1” and “2”. We consider the Midwest region to be delimited by the states between North Dakota, Wisconsin, Mississippi and Texas.
Fig. 3 Proposed example layout displaying the two geographical regions (East Coast and Midwest), and the different energy and transportation layers
The Midwestern area is assumed to produce two types of commodities, coal and corn, which need to be transported to meet the East Coast demand. To simplify the model, it is assumed that coal is produced in the Illinois Basin with enough capacity to meet the thermal generation demand in the East Coast and the Midwest. To transport these commodities, two different infrastructures, railway and highway, can be utilized. Only one type of fleet is accounted for in each infrastructure, train and truck, respectively. Two different energy networks are considered. The diesel network is fed by the production in the Midwest and its mission is to fulfill the need for fuel from trains and trucks. Electricity supply constitutes the second energy network. Both areas can produce electricity from thermal plants, which drive the demand for coal on the transportation side, and are connected by high-voltage power lines to allow energy trading. The Midwestern area has potential to use wind as a source for electricity, although there is no capacity installed at the beginning of the planning period, which lasts 40 years. In order to capture all the components of the formulation, the previous set of physical nodes and arcs can be expanded as shown in Fig. 4. Note that columns
represent the four networks and every row represents a node, corresponding to a physical region.
Fig. 4 Example node and arc expansion for the two-node example. The two physical nodes are displayed horizontally, while the different networks are laid out vertically. Solid arrows represent flows and dashed arrows represent interconnection between networks
In the transportation networks, fictitious nodes have been added between the physical regions to represent the different alternatives for conducting the flow between the Midwest and the East Coast. The transmission line in the electric network is replaced by two arcs in opposite directions to ensure the non-negativity of the flows [3]. The dashed lines represent the increase in demand on a node due to activity on another network, i.e., thermal units increase the demand for coal, and the use of trains and trucks for transportation drives the demand for diesel. To simplify the analysis we assume that there will only be capacity limits and investment in the following parts of the system: train and truck transportation, thermal plants, wind generation and electricity transmission. No investment in infrastructure is considered. Also, there is no retirement of facilities or infrastructure, so the initial capacity is assumed to be available throughout the simulation.
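The node expansion and the treatment of the transmission line can be sketched as follows. The network labels are an assumption (the text states only that there are four networks), as is the choice to give each directed arc the full 50 GW initial transmission capacity from Table 1; the function names are hypothetical.

```python
from itertools import product

REGIONS = ["Midwest", "East Coast"]
# Assumed labels for the four networks of the example.
NETWORKS = ["coal", "corn", "diesel", "electricity"]


def expand_nodes(regions, networks):
    """One model node per (physical region, network) pair."""
    return list(product(regions, networks))


def split_line(node_a, node_b, capacity):
    """Replace a bidirectional line by two directed arcs with flows >= 0."""
    return [(node_a, node_b, capacity), (node_b, node_a, capacity)]


def net_transfer(flow_ab, flow_ba):
    """Signed power transfer from a to b recovered from the two arc flows."""
    return flow_ab - flow_ba


nodes = expand_nodes(REGIONS, NETWORKS)
line = split_line(("Midwest", "electricity"), ("East Coast", "electricity"), 50.0)
```

With positive operational costs on both arcs, a cost-minimizing solution would not send flow in both directions at once, so the signed transfer can be read directly from whichever arc is in use.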
7.2 Parameters The simulation is performed for 40 years with a time step of one year. A more refined model would use shorter, monthly time steps, as suggested in section 2, in order to obtain more accurate results. Operational and investment parameters for energy networks are summarized in Table 1, while Table 2 contains the appropriate parameters of capacity, frequency of travel, costs and emissions for the available fleet. Data for this model have been collected from [53–58].
Table 1 Electricity network parameters

                     Coal fired         Wind            Transmission
Initial capacity     225 / 180 GW       0 GW            50 GW
Max. investment      10 GW/year         3.5 GW/year     5 MW/year
Op. cost             1.7 $/MWh          7 $/MWh         2 $/MWh
Investment cost      2.12 $/MW          1.65 $/MW       825 $/kW-mile
Efficiency           9.95 MMBtu/MWh     35 %            100 %
CO2 emission rate    55.77 lb/MWh       0
Table 2 Transportation parameters

                     Train                 Truck
Capacity             3200 ton/load         26 ton/load
Loads/year           68 loads              104 loads
Initial capacity     3500 trains           100,000 trucks
Operational cost     0.05 $/ton mile       0.16 $/ton mile
Fuel use             341 Btu/ton mile      3357 Btu/ton mile
CO2 emission rate    0.2 lb/ton-mile       0.6 lb/ton-mile
Max. investment      50 trains             100 trucks
Investment cost      31 million $/train    200,000 $/truck
The electric demand is set to 141 GW for the Midwest region and 118 GW for the East Coast, with a growth of 1.5% every year. The amount of corn shipped from the Midwest to the East Coast equals 300 million tons, with a 0.5% yearly growth. The average distance between the two regions is set to 750 miles. Coal is produced in the Illinois Basin with an energy content of 11,800 Btu/short ton and a cost of $85 per short ton. All costs are subject to a constant inflation of 2%, and a constant discount rate of 7% is used in the economic analysis. Salvage values are assigned for the investments made close to the end of the simulation period. All investments are set to depreciate linearly over a period of 15 years.
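A few of these parameters can be combined into quick arithmetic checks. The sketch below rests on assumptions that the chapter does not state explicitly: that growth compounds annually starting from year 0, that a fleet's yearly tonnage is units × tons per load × loads per year (Table 2), and that salvage value is the undepreciated share of an investment at the 40-year horizon. The function names are hypothetical.

```python
def grow(base, rate, years):
    """Value after `years` of compound growth; year 0 returns `base`."""
    return base * (1.0 + rate) ** years


def annual_fleet_tons(units, tons_per_load, loads_per_year):
    """Tons a fleet can move per year if every unit runs all its loads full."""
    return units * tons_per_load * loads_per_year


def salvage_fraction(year_built, horizon=40, life=15):
    """Share of an investment's value left at the horizon, linear depreciation."""
    age = horizon - year_built
    return max(0.0, 1.0 - age / life)


train_tons = annual_fleet_tons(3500, 3200, 68)      # 761,600,000 tons/year
truck_tons = annual_fleet_tons(100_000, 26, 104)    # 270,400,000 tons/year
corn_final_year = grow(300e6, 0.005, 39)            # corn shipped in year 39
midwest_load_final_year = grow(141.0, 0.015, 39)    # GW in year 39
```

Under these assumptions, an investment made in year 30 would retain one third of its value at the horizon, while anything built in year 25 or earlier would be fully depreciated.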
7.3 Base Case Results
The formulation is implemented using Matlab and solved using version 10 of CPLEX. It consists of 914 variables and 1074 constraints. Solution time is under a second, running on a 3.6 GHz Pentium 4 processor with 2 GB of RAM. Table 3 contains a summary of the optimum investment portfolio obtained.

Table 3 Cost and investment results

                   Cost                Investment
TOTAL              2.19 trillion $     -
Operational        2.06 trillion $     -
Investment         125.83 billion $    -
- Coal Midwest     11.40 billion $     68.17 GW
- Transmission     4.87 billion $      38.95 GW
- Wind             90.70 billion $     136.50 GW
- Train            18.87 billion $     1244 trains
As mentioned at the beginning of the chapter, the US DOE-EIA estimates [1] that the national cost of electric energy and transportation fuels over the next 20 years will be around $14 trillion. The estimate given by the model is very reasonable, taking into account that we used a very high-level representation and only a small part of the transportation and energy systems. Investments take place progressively at different moments in time for different arcs. Transmission capacity is added mainly in the first seven years, while new coal generation capacity in the Midwest is constructed between years eight and forty. Investment in new wind capacity is constant and equal to the maximum over the whole simulation period. Finally, investment in trains happens during the last 25 years. During the 40 years of the study, CO2 emissions are estimated to be 33.7 billion tons. Fig. 5 represents the generation mix forecasted for the simulation period.
Fig. 5 Electricity generation for the base case
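The figures reported for the base case can be cross-checked with simple arithmetic. The short Python sketch below uses only values printed in Tables 1 and 3; the variable names are illustrative, and the small difference between the summed components and the reported investment total is presumably due to rounding.

```python
# Investment cost components from Table 3, in billion $
investment_billion_usd = {
    "coal_midwest": 11.40,
    "transmission": 4.87,
    "wind": 90.70,
    "train": 18.87,
}

operational_trillion_usd = 2.06   # operational cost from Table 3
max_wind_gw_per_year = 3.5        # wind investment cap from Table 1
wind_added_gw = 136.50            # wind capacity added, from Table 3

# Sum of the components, to compare with the reported 125.83 billion $
total_investment = sum(investment_billion_usd.values())

# Operational plus investment cost, to compare with the reported 2.19 trillion $
total_cost_trillion = operational_trillion_usd + total_investment / 1000.0

# Number of years of wind additions at the cap implied by the added capacity
wind_years_at_cap = wind_added_gw / max_wind_gw_per_year

if __name__ == "__main__":
    print(total_investment, total_cost_trillion, wind_years_at_cap)
```

The components add up to the reported totals within rounding, and the added wind capacity corresponds to 39 annual additions at the 3.5 GW/year cap.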
The model is also capable of forecasting electricity prices (Fig. 6). Energy in the Midwest is always cheaper than on the East Coast since it does not require the use of transmission or transportation. The price difference is relatively small between years 26 and 33 because the electric transmission line is not congested in that period. The separation is more noticeable in years 1, 2 and 26 to 40 due to congestion on the transportation side. Coal and corn utilize all train capacity, and part of the corn has to be delivered by truck, raising transportation costs.
Fig. 6 Evolution of electricity prices for the two geographical nodes
7.4 Multiobjective Results
Now let us assume that, in order to improve the system’s frequency regulation and load following, investments in wind are required to be paired with some form of electricity storage, doubling the cost of wind investment. Wind energy is then no longer economical, and any solution that seeks a reduction in CO2 emissions by switching from coal to wind will incur a higher cost. Cost and emissions thus become two competing objective functions. By forcing different levels of investment in wind, we can find the trade-off between the two objectives (Fig. 7).
Fig. 7 Front of solutions to analyze the trade-off between emissions and total cost
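The idea of tracing the trade-off by forcing increasing levels of wind can be sketched with a deliberately simplified, single-period toy problem. The Python code below is not the chapter's network flow formulation: the function name and all numerical values (demand, per-MWh costs and emission rate) are hypothetical, and the least-cost solution is written in closed form rather than obtained from a solver.

```python
def wind_forcing_front(demand_mwh, coal_cost, wind_cost, coal_co2, min_wind_shares):
    """Return (min_wind_share, total_cost, emissions) for each forced wind share.

    Assumes wind is the more expensive option, so the least-cost solution
    uses exactly the forced minimum amount of wind and covers the rest with coal.
    """
    if wind_cost <= coal_cost:
        raise ValueError("sketch assumes wind is more expensive than coal")
    front = []
    for share in min_wind_shares:
        wind_mwh = share * demand_mwh          # forced wind generation
        coal_mwh = demand_mwh - wind_mwh       # remainder served by coal
        total_cost = coal_mwh * coal_cost + wind_mwh * wind_cost
        emissions = coal_mwh * coal_co2        # wind is emission-free
        front.append((share, total_cost, emissions))
    return front


# Hypothetical inputs: 100 MWh demand, 30 and 60 $/MWh, 1 ton CO2 per coal MWh
front = wind_forcing_front(100.0, 30.0, 60.0, 1.0, [0.0, 0.5, 1.0])

if __name__ == "__main__":
    for share, cost, co2 in front:
        print(share, cost, co2)
```

Each forced wind level yields one point: cost rises while emissions fall, which is the qualitative shape of the front in Fig. 7.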
A very simple metric to evaluate the reliability of the electric system is the reserve margin, the relative difference between installed thermal generation capacity and demand. The smaller the margin, the more probable it is that demand cannot be met due to an unforeseen loss of generation. By adding the corresponding constraints to the formulation, a minimum reserve margin can be imposed for all time steps. Ensuring a higher level of reliability results in an increase in cost, as one would expect, reflecting a conflict between these two objective functions. Fig. 8 captures the corresponding Pareto front for the base case with different reserve margins.
Fig. 8 Representation of non-dominated solutions for the resiliency (reserve margin) and cost objectives
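The reserve margin defined above can be computed directly. The following Python sketch evaluates it and finds the first year in which a fixed capacity no longer satisfies a required minimum margin under compound demand growth. Pairing the 225 GW initial coal capacity with the 141 GW Midwest demand is an assumption made for illustration, as are the function names and the year-0 indexing.

```python
def reserve_margin(capacity_gw, demand_gw):
    """Relative difference between installed capacity and demand."""
    return (capacity_gw - demand_gw) / demand_gw


def first_year_below(capacity_gw, demand0_gw, growth_rate, min_margin, horizon=40):
    """First year (0-indexed) in which the margin falls below `min_margin`
    if no capacity is added; None if that never happens within the horizon."""
    for year in range(horizon):
        demand = demand0_gw * (1.0 + growth_rate) ** year
        if reserve_margin(capacity_gw, demand) < min_margin:
            return year
    return None


if __name__ == "__main__":
    print(reserve_margin(225.0, 141.0))                  # initial margin
    print(first_year_below(225.0, 141.0, 0.015, 0.25))   # year a 25% margin is violated
```

In an optimization setting, the same expression would appear as a constraint for each time step, so that new capacity must be built before the margin is violated.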
Increasing the reserve margin increases the stability of energy prices. Fig. 9 shows the evolution over time of electricity prices for three different levels of reserve margin (0, 25% and 50%) for both the Midwest and the East Coast.
Fig. 9 Price stability and reserve margin in the two geographical regions for different levels of reserve margin
In previous steps, operational and investment costs were studied with respect to emissions and reserve margin. Both methodologies can be combined to obtain a multiobjective approach to the problem with three objective functions: cost, emissions and reliability. Evaluating a number of combinations of minimum investment in wind generation and minimum reserve margin, we obtain the Pareto surface pictured in Fig. 10, corresponding to the solutions to the multiobjective problem.
Fig. 10 Pareto surface of non-dominated solutions in terms of cost, sustainability (CO2 emissions) and resiliency (reserve margin)
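Once a set of candidate solutions has been evaluated, the non-dominated ones can be identified with a simple pairwise comparison. The Python sketch below assumes each solution is summarized as a tuple (cost, emissions, reserve margin), with lower cost and emissions and a higher reserve margin being preferred. The candidate values are hypothetical and serve only to exercise the filter.

```python
def dominates(a, b):
    """True if solution a is at least as good as b in every objective and
    strictly better in at least one. Tuples are (cost, emissions, margin)."""
    no_worse = a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[0] < b[0] or a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_set(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


# Hypothetical candidates: (cost in trillion $, CO2 in billion tons, reserve margin)
candidates = [
    (2.19, 33.7, 0.00),
    (2.30, 30.0, 0.00),
    (2.40, 33.7, 0.00),   # dominated by the first candidate
    (2.35, 33.7, 0.25),
]

non_dominated = pareto_set(candidates)

if __name__ == "__main__":
    print(non_dominated)
```

The pairwise filter scales quadratically with the number of candidates, which is acceptable for the modest number of wind and reserve margin combinations considered in a study of this kind.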
8 Discussion
A new approach to assess investment in national energy and transportation infrastructures has been presented in this chapter. It features a multiobjective approach, enabling optimization of cost, sustainability and resiliency. A formulation has been developed that allows the implementation of such a model for the minimization of cost and emissions. The formulation has been applied to a simple example representing the Midwestern and Eastern sections of the United States. Even though the system has been represented at a very high level, the order of magnitude of the results in terms of costs, emissions and energy prices agrees with that observed in the real operation of the energy and transportation systems [1,59]. The model forecasts an average operational expenditure of $58 billion per year with average emissions of 893 million tons of CO2 per year, which is consistent with DOE estimates. Electricity prices are also reasonable compared with those in today’s markets.

The model allows the study of some of the interdependencies presented in the chapter. The most interesting one relates the cost of electricity to the operation of the transportation system. If transportation of coal and corn congests the rail connection, electricity prices increase because more corn must be shipped by the higher-priced truck in order to allow more coal to be shipped by rail. We could think of this situation as an extension of the idea of the locational marginal price, in which the price is set by the cost of supplying the next unit of energy. To produce one more MWh of electricity, more coal must be transported, but the rail is congested. The solution is to transfer part of the corn to trucks to enable the shipment of the extra coal. Therefore, the price of energy increases significantly.

An initial attempt at multiobjective calculations has been introduced in the example, in terms of both emissions and reliability. Pareto fronts of solutions have been calculated and plotted, which enable the study of trade-offs between different solutions. New metrics to compute sustainability and resiliency need to be developed to advance this type of calculation.

The example presented here is meaningful, but has been severely restricted in dimensionality in order to illustrate the approach. This methodology can be expanded by introducing new geographical regions interconnected with more arcs, and new energy and transportation networks with a wider range of technologies, transportation infrastructures and fleets, either readily available or expected in the future. The methodology presented in this chapter, when applied to a full-scale representation of the national energy and transportation systems, could be used as an assessment tool for decision makers and the general public to debate the future of these infrastructures. Federal agencies, states and a wide spectrum of private companies could benefit from the comprehensive analysis in order to establish the appropriate balance among the best alternatives in terms of cost, sustainability and resiliency.
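The rail-congestion effect described in this discussion can be given a rough order of magnitude with the parameters of Section 7.2. The Python sketch below estimates the extra cost per MWh when each additional ton of coal shipped by rail displaces one ton of corn onto trucks. It rests on several assumptions that are not stated in the chapter: a coal heat content of 11,800 Btu/lb (23.6 MMBtu per short ton), a one-for-one displacement of corn by coal, and the same 750-mile haul for both commodities. The function name is illustrative.

```python
def congestion_premium_per_mwh(heat_rate_mmbtu_per_mwh=9.95,
                               coal_mmbtu_per_ton=23.6,
                               distance_miles=750.0,
                               truck_cost_per_ton_mile=0.16,
                               rail_cost_per_ton_mile=0.05):
    """Extra transportation cost ($/MWh) when rail is congested.

    Each extra ton of coal on rail pushes one ton of corn onto trucks
    (assumption), so the premium is the truck-rail cost difference applied
    to the coal needed for one more MWh.
    """
    coal_tons_per_mwh = heat_rate_mmbtu_per_mwh / coal_mmbtu_per_ton
    cost_difference = truck_cost_per_ton_mile - rail_cost_per_ton_mile
    return coal_tons_per_mwh * distance_miles * cost_difference


premium = congestion_premium_per_mwh()

if __name__ == "__main__":
    print(round(premium, 2))   # $/MWh added by the displaced corn
```

Under these assumptions the premium is several tens of dollars per MWh, which illustrates why congestion on the transportation side can separate the electricity prices of the two regions so visibly.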
References
[1] US Energy Information Administration. Forecasts and Analysis (2008), http://www.eia.doe.gov/oiaf/forecasting.html
[2] US Government. United States Federal Budget for Fiscal Year 2010. A New Era of Responsibility: Renewing America’s Promise (February 2009), http://www.gpoaccess.gov/usbudget/fy10/
[3] Quelhas, A., Gil, E., McCalley, J., Ryan, S.: A Multiperiod Generalized Network Flow Model of the U.S. Integrated Energy System: Part I – Model Description. IEEE Trans. on Power Syst. 22(2), 829–836 (2007)
[4] Quelhas, A., McCalley, J.: A Multiperiod Generalized Network Flow Model of the U.S. Integrated Energy System: Part II – Simulation Results. IEEE Trans. on Power Syst. 22(2), 837–844 (2007)
[5] U.S. Department of Transportation, Research and Innovative Technology Administration, Bureau of Transportation Statistics. Commodity Flow Survey United States (1997), http://www.bts.gov/publications/commodity_flow_survey/
[6] Bazaraa, M., Jarvis, J., Sherali, H.: Linear Programming and Network Flows, 2nd edn., ch. 12, sec. 4, pp. 587–597. Wiley, New York (1990)
[7] Statewide travel forecasting models, NCHRP Synthesis 388, National Cooperative Highway Research Program, Transportation Research Board, Washington, D.C. (2006)
[8] Metropolitan Travel Forecasting: Current Practice and Future Direction, TRB Special Report 288, Transportation Research Board, Washington, D.C. (2007)
[9] Osborn, D.: presentation to the PSCE WPCC Panel (October 31, 2006), http://www.ieee.org/portal/cms_docs_pes/pes/subpages/meetings-folder/PSCE/PSCE06/panel9/Panel-09-1_Impact_of_Wind_Plant_Operation.pdf
[10] Smith, J., Parsons, B.: What does 20% look like? IEEE Power and Energy Magazine 5(6), 22–33 (2007)
[11] Litvinov, E., Zheng, T., Rosenwald, G., Shamsollahi, P.: Marginal loss modeling in LMP calculation. IEEE Transactions on Power Systems 19(2), 880–888 (2004)
[12] Li, Y.: Scenario-based analysis on the impacts of plug-in hybrid electric vehicles’ (PHEV) penetration into the transportation sector. In: IEEE Intern. Symp. Techn. Soc. (ISTAS), June 1-2 (2007)
[13] Hadley, S.W.: Evaluating the impact of plug-in hybrid electric vehicles on regional electricity supplies. In: 2007 iREP Symp. Bulk Power Syst. Dyn. Control - VII. Revitalizing Operational Reliability, August 19–24 (2007)
[14] Kintner-Meyer, M., Schneider, K., Pratt, R.: Impacts assessment of plug-in hybrid vehicles on electric utilities and regional U.S. power grids. Part 1: Technical analysis. Pacific Northwest National Laboratory, Tech. Rep. (November 2007)
[15] Kempton, W., Letendre, S.E.: Electric vehicles as a new power source for electric utilities. Transp. Res. Pt. D: Transp. Env. 2(3), 157–175 (1997)
[16] Kempton, W., Tomić, J.: Vehicle-to-grid power fundamentals: Calculating capacity and net revenue. J. Power Sources 144, 268–279 (2005)
[17] Short, W., Denholm, P.: A preliminary assessment of plug-in hybrid electric vehicles on wind energy markets, National Renewable Energy Laboratory (NREL), Tech. Rep. (April 2006)
[18] Tomić, J., Kempton, W.: Using fleets of electric-drive vehicles for grid support. J. Power Sources 168, 459–468 (2007)
[19] Environmental Protection Agency, 2002 National Emissions Inventory Booklet, http://www.epa.gov/ttn/chief/net/2002neibooklet.pdf
[20] Energy Information Administration, Emissions of greenhouse gases in the United States 2006 (2007)
[21] Joskow, P., Schmalensee, R., Bailey, E.: The market for sulfur dioxide emissions. The American Economic Review 88(4), 669–685 (1998)
[22] Schmalensee, R., Joskow, P.L., Ellerman, A.D., Montero, J.P., Bailey, E.M.: An interim evaluation of sulfur dioxide emissions trading. Journal of Economic Perspectives 12(3), 53–68 (1998)
[23] Environmental Protection Agency, Office of Air and Radiation, EPA Acid Rain Program – 2002 Progress Report, EPA-430-R-03-011, Washington, DC (November 2003)
[24] Crowards, T.: Natural resource accounting. In: Environmental and Resource Economics, April 1996, vol. 7(3) (1996); ISSN 0924-6460 (Print) 1573-1502 (Online)
[25] Hansell, R., Hansell, T., Fenech, A.: A New Market Instrument for Sustainable Economic and Environmental Development. Journal Environmental Monitoring and Assessment 86(1-2) (2003), doi:10.1023/A:1024019121567; ISSN 0167-6369 (Print) 1573-2959 (Online)
[26] Al-Ammar, Fisher, J.: Resiliency Assessment of the Power System Network to Cyber and Physical Attacks. In: Power Engineering Society General Meeting. IEEE, Los Alamitos (2006)
[27] Gil, E.: Integrated network flow model for a reliability assessment of the national electric energy system. PhD Dissertation, Iowa State University (February 2007)
[28] Jane, C.C., Lin, J.S., Yuan, J.: Reliability evaluation of a limited-flow network in terms of minimal cutsets. IEEE Transactions on Reliability 42(3), 354–368 (1993)
[29] Xue, J.: On multistate system analysis. IEEE Transactions on Reliability R-34, 329–337 (1985)
[30] Yeh, W.: A simple approach to search for all d-MCs of limited-flow network. Reliability Engineering & System Safety 71, 15–19 (2001)
[31] Yeh, W.: Search for all d-mincuts of a limited-flow network. Reliability Engineering & System Safety 78, 123–129 (2002)
[32] Salmeron, J., Wood, K., Baldick, R.: Analysis of Electric Grid Security Under Terrorist Threat. IEEE Transactions on Power Systems 19(2), 905–912 (2004)
[33] US Energy Information Administration. The National Energy Modeling System: An Overview 2003 (March 2003), http://www.eia.doe.gov/oiaf/aeo/overview/
[34] Energy Technology Systems Analysis Programme (ETSAP). MARKAL and TIMES Model Documentation, http://www.etsap.org/documentation.asp
[35] Adica Consulting. WASP-IV Model Documentation, http://www.adica.com/wasp_iv.html
[36] Freight Analysis Framework. Prepared for the U.S. Department of Transportation, Office of Freight Management and Operations, Washington, D.C. (2002), http://www.ops.fhwa.dot.gov/freight/freight_analysis/faf/index.htm
[37] Crainic, T.G., Florian, M., Guelat, J., Spiess, H.: Strategic Planning of Freight Transportation: STAN, An Interactive-Graphic System. Transportation Research Record 1283, Transportation Research Board, National Research Council, Washington, D.C., pp. 97–124 (1990)
[38] HERS-ST, Highway Economic Requirements System - State Version. Developed for the Federal Highway Administration by Jack Faucett Associates, http://www.fhwa.dot.gov/infrastructure/asstmgmt/hersfact.cfm
[39] RailDec, Rail Investment Decision Support Tool. Developed for the Federal Railroad Administration by HDR, http://www.hdrinc.com/15/14/1/default.aspx#raildec
[40] Patil, A., Herder, P., Weijnen, M.: Actionable Solutions to Tackle Future Energy Uncertainties. In: 2005 IEEE International Conference on Systems, Man and Cybernetics, October 2005, vol. 4, pp. 3705–3710 (2005)
[41] Von Jouanne, A., Husain, I., Wallace, A., Yokochi, A.: Gone with the wind: Innovative hydrogen/fuel cell electric vehicle infrastructure based on wind energy sources. IEEE Industry Applications Magazine, 12–19 (July–August 2005)
[42] Delucchi, M.A., Lipman, T.: An Analysis of the Retail and Lifecycle Cost of Battery-Powered Electric Vehicles. Transportation Research D 6, 371–404 (2001), http://www.its.ucdavis.edu/publications/2001/UCD-ITS-RP-01-16.pdf
[43] Zimmerman, R.: Decision-making and the Vulnerability of Interdependent Critical Infrastructures. In: IEEE International Conference on Systems, Man and Cybernetics (2004)
[44] Wijnia, Y., Kom, M., de Jager, S., Herder, P.: Long term optimization of asset replacement in energy infrastructures. In: 2006 IEEE International Conference on Systems, Man, and Cybernetics, Taipei, Taiwan, October 8-11 (2006)
[45] Rinaldi, S., Peerenboom, J., Kelly, T.: Identifying, Understanding, and Analyzing Critical Infrastructure Interdependencies. IEEE Control Systems Magazine, 11–25 (December 2001)
[46] Amin, M.: Towards Self-Healing Energy Infrastructure Systems. IEEE Computer Applications in Power, 20–28 (January 2001)
[47] Reliability Assessment 2002-2011 – The Reliability of Bulk Electric Systems in North America. North American Electric Reliability Council, http://www.nerc.com/~filez/rasreports.html
[48] Geidl, M., Andersson, G.: Optimal Power Flow of Multiple Energy Carriers. IEEE Transactions on Power Systems 22(1), 145–155 (2007)
[49] van Beeck, N.: Classification of energy models. Tilburg University and Eindhoven University of Technology (May 1999)
[50] Environmental Protection Agency. Integrated Planning Model, http://www.epa.gov/airmarkets/progsregs/epa-ipm/index.html
[51] International Atomic Energy Agency (IAEA), Expansion planning for electrical generating systems, A guidebook, Vienna, Austria (1984)
[52] Quelhas, A., Gil, E., McCalley, J.: Nodal Prices in an Integrated Energy System: A Generalized Network Flow Model. Int. J. of Critical Inf. 2(1), 50–69 (2006)
[53] US Department of Energy. 20% Wind Energy by 2030 – Increasing Wind Energy’s Contribution to U.S. Electricity Supply (May 2008), http://www.20percentwind.org/20p.aspx?page=Report
[54] US Energy Information Administration. Form EIA-411, Coordinated Bulk Power Supply Program Report, http://www.eia.doe.gov/cneaf/electricity/page/eia411/eia411.html
[55] US Energy Information Administration. Coal News and Markets Report, http://www.eia.doe.gov/cneaf/coal/page/coalnews/coalmar.html
[56] US Department of Energy. Energy Intensity Indicators. Transportation Sector; Rail and Waterborne from Freight Total spreadsheet; Combination truck from FreightTrucks spreadsheet (2004), http://eere.energy.gov/ba/pba/intensityindicators/trend_data.html
[57] Hendrickson, C., Griffin, W., Morrow, W., Matthews, H., Wakeley, H.: Transport infrastructure requirements for ethanol as a significant alternative fuel. In: 10th Int. Conf. on Applications of Advanced Technologies in Transportation, Athens, Greece (2008)
[58] Association of American Railroads. Class I Railroad Statistics (April 2008), http://www.aar.org/PubCommon/Documents/AboutTheIndustry/Statistics.pdf
[59] US Energy Information Administration. Emissions of Greenhouse Gases Report (November 2007), http://www.eia.doe.gov/oiaf/1605/ggrpt/carbon.html
A Framework for Assessing the Resilience of Infrastructure and Economic Systems
Eric D. Vugrin, Drake E. Warren, Mark A. Ehlen, and R. Chris Camphouse

Abstract. Recent U.S. national mandates are shifting the country’s homeland security policy from one of asset-level critical infrastructure protection (CIP) to all-hazards critical infrastructure resilience, creating the need for a unifying framework for assessing the resilience of critical infrastructure systems and the economies that rely on them. Resilience has been defined and applied in many disciplines; consequently, many disparate approaches exist. We propose a general framework for assessing the resilience of infrastructure and economic systems. The framework consists of three primary components: (1) a definition of resilience that is specific to infrastructure systems; (2) a quantitative model for measuring the resilience of systems to disruptive events through the evaluation of both impacts to system performance and the cost of recovery; and (3) a qualitative method for assessing the system properties that inherently determine system resilience, providing insight and direction for potential improvements in these systems.

Eric D. Vugrin · Drake E. Warren · Mark A. Ehlen
Infrastructure and Economic Systems Analysis Department, Sandia National Laboratories, Albuquerque, New Mexico, USA

R. Chris Camphouse
Performance Assessment and Decision Analysis Department, Sandia National Laboratories, Carlsbad, New Mexico, USA

K. Gopalakrishnan & S. Peeta (Eds.): Sustainable & Resilient Critical Infrastructure Sys., pp. 77–116. © Springer-Verlag Berlin Heidelberg 2010 springerlink.com
Acronyms, Abbreviations, and Initialisms

Acronym    Definition
CIKR       critical infrastructure and key resource
CIP        critical infrastructure protection
CIR        critical infrastructure resilience
CITF       Critical Infrastructure Task Force
CST        Central Standard Time
DHS        U.S. Department of Homeland Security
E.O.       Executive Order
FAST       Fast Analysis and Simulation Team
GDP        gross domestic product
GRP        gross regional product
HSPD       Homeland Security Presidential Directive
LDRD       Laboratory Directed Research and Development
LQR        linear quadratic regulator
MCEER      Multidisciplinary and National Center for Earthquake Engineering Research
MMI        Modified Mercalli Intensity
NBC        Nuclear, biological, or chemical
NIPP       National Infrastructure Protection Plan
NISAC      National Infrastructure Simulation and Analysis Center
NOAA       National Oceanic and Atmospheric Administration
NSTAC      National Security Telecommunications Advisory Committee
PDD        Presidential Decision Directive
POL        petroleum, oil, and lubricants
SME        subject matter expert
SSP        sector-specific plan
TOSE       technical, organizational, social, and economic
1 Introduction
1.1 Protecting the Nation’s Critical Infrastructure
Over the past 25 years, the U.S. government has placed increased emphasis on the vulnerabilities and protection of the nation’s critical infrastructure systems.
Among others, these systems include the electric power grid, the primary and secondary highway systems, and the internet, and together provide the very “backbone” of the U.S. economy and society. Largely due to the terrorist acts of September 11, 2001, the U.S. Department of Homeland Security (DHS) is now concerned about how these systems perform during and after natural and manmade disruptive events, such as hurricanes (e.g., Katrina, Ike), pandemic influenzas (e.g., H5N1, H1N1), chem-bio attacks, and many others. To understand this performance, DHS has primarily focused on conducting consequence analyses that estimate the impacts on these infrastructures (and on the economy and other systems that rely on them) of particular disruptive events. For example, in 2008, the DHS National Infrastructure Simulation and Analysis Center (NISAC) estimated the impacts to critical infrastructures and the economy of Hurricane Ike. Prior to the hurricane’s landfall, NISAC used National Oceanic and Atmospheric Administration (NOAA) forecasts of hurricane path and strength to estimate losses of electric power (Figure 1), damage to infrastructures, and impacts to economic activity (Figure 2).
Fig. 1 Projected electric power outages caused by hurricane damage (NISAC 2008)
Fig. 2 Projected gross domestic product (GDP) losses resulting from hurricane damage (NISAC 2008)
The measures of impact that result from this consequence analysis include, among many others, the extent and duration of power losses and communication outages, the number of people evacuated and of people without basic amenities and the duration of those impacts, loss of banking and finance services, loss of transportation services, damaged and destroyed buildings, and lost gross domestic product (GDP). These estimates are traditionally used to determine how to allocate emergency resources and, in anticipation of similar future disruptive events, which assets deserve the most protection from disruption. While NISAC has primarily focused on consequence analyses, DHS has also recognized the value of understanding the restoration and recovery processes that critical infrastructures undergo following a disruptive event. The concept of critical infrastructure resilience (CIR) includes both of these considerations, and CIR has become a key component of the nation’s CIP policies.
1.2 Resilience as a Homeland Security Mandate
The federal government’s traditional policy toward critical infrastructure security has been one of protection. CIP policies date back at least to 1982, with formation of the National Security Telecommunications Advisory Committee (NSTAC), and with issuance in the mid-1990s of Executive Order (E.O.) 13010, “Critical Infrastructure Protection in the Information Age,” both of which focused on the protection of infrastructure assets. The 1998 Presidential Decision Directive (PDD)-63, “Protecting America's Critical Infrastructures,” also focuses primarily on protection. Following 9/11, the DHS re-energized the government’s efforts toward physical protection of infrastructure assets. Two of the first post-9/11 Presidential directives explicitly focus on protecting critical infrastructure from acts of terrorism: (1) Homeland Security Presidential Directive (HSPD)-3, which states that each threat level “shall prompt the implementation of an appropriate set of protective measures” by responsible infrastructure agencies; and (2) HSPD-7, which calls for “hardening” of key assets. In 2005, DHS Secretary Chertoff charged the Homeland Security Advisory Council to form the Critical Infrastructure Task Force (CITF) to provide recommendations on national policy and objectives. The CITF focused on recommendations that would ensure the optimal delivery of critical infrastructure service in the post-9/11 “all-hazards” environment and reduction of the consequences of the exploitation, destruction, or disruption of critical infrastructures. The CITF’s primary recommendation was that DHS focus on CIR as its top-level strategic objective:

…making resilience the overarching strategic objective would stimulate synergistic actions that are balanced across all three components of risk…protection, in isolation is a brittle strategy. We cannot protect every potential target against every conceivable attack; we will never eliminate all vulnerabilities. Furthermore, it is virtually impossible to define a desired end state – to quantify how much protection is enough – when the goal is to reduce vulnerabilities.

Critical Infrastructure Resilience (CIR) is not a replacement for CIP, but rather an integrating objective designed to foster systems-level investment strategies. Adoption of CIR as the goal provides a readily quantifiable objective – identifying the time required to restore full functionality.

It is businesses that must bear the costs of resilience, that must make cost/benefit decisions in a changing, competitive environment. For example, remaining resilient in the face of disasters that destruct structures and harm employees must be evaluated from a systems functionality perspective so that these businesses make sound investment strategies. (Homeland Security Advisory Council, 2006)
Said differently, the CIR plan attempts to address a basic problem: many of the critical infrastructures and key resources¹ (CIKRs), e.g., the Commercial Facilities Critical Infrastructure, contain so many assets that individual-asset-based risk analysis and policy is difficult and expensive to manage. A systems approach is needed, one where the owners of the individual assets manage their own risks and the overall system functions well during a disruptive event. To do this, the federal government has launched a coordinated set of government resilience initiatives, begun the process of understanding what features create resilience in CIKRs, and called on agencies to start measuring the resilience of their infrastructure systems. The DHS National Infrastructure Protection Plan (NIPP) in particular contains explicit language calling for increasing the resilience of the nation's critical infrastructure; e.g., the overarching goal of the NIPP is to

…build a safer, more secure, and more resilient America by preventing, deterring, neutralizing, or mitigating the effects of deliberate efforts by terrorists to destroy, incapacitate, or exploit elements of our Nation’s CIKR, and to strengthen national preparedness, timely response, and rapid recovery of CIKR in the event of an attack, natural disaster, or other emergency. (U.S. Department of Homeland Security, 2009)
Many of the NIPP sector-specific plans (SSPs) also have broad, if not specific, language that promotes critical infrastructure resilience as a primary objective. In fact, resilience has become such a priority for the United States that President Barack Obama (2009) proclaimed September 2009 as National Preparedness Month, with the goal of recognizing “the importance of preparing for potential emergencies beforehand and to observe this month with appropriate preparedness activities, events, and training to enhance our national resilience.” He further states that “Our goal is to ensure a more resilient Nation -- one in which individuals, communities, and our economy can adapt to changing conditions as well as withstand and rapidly recover from disruption due to emergencies.”
¹ The 18 critical infrastructure and key resources (CIKRs) are, by sector-specific agency (in parentheses): Chemical, Commercial Facilities, Critical Manufacturing, Dams, Emergency Services, Nuclear Reactors, Materials, and Waste (DHS Office of Infrastructure Protection); Agriculture and Food (Department of Agriculture, Food and Drug Administration); Banking (Department of the Treasury); Communications (Department of Homeland Security); Defense Industrial Base (Department of Defense); Energy (Department of Energy); Government Facilities (DHS); Information Technology (DHS); National Monuments and Icons (Department of the Interior); Postal and Shipping (Transportation Security Administration); Public Health and Healthcare (Department of Health and Human Services); Transportation Systems (Transportation Security Administration, U.S. Coast Guard); Drinking Water and Water Treatment (Environmental Protection Agency).

1.3 A Framework for Assessing Infrastructure and Economic Resilience
In support of these directives, Sandia National Laboratories (Sandia) has formulated a new resilience framework, including a definition of resilience and system performance metrics and measurement methodologies, which can be applied to studies of natural and man-made CIKR disruptive events. This framework is expanded herein to include qualitative and quantitative measures. The proposed definition of system resilience is:

Given the occurrence of a particular disruptive event (or set of events), the resilience of a system to that event (or events) is the ability to efficiently reduce both the magnitude and duration of the deviation from targeted system performance levels.
Measurement of system resilience involves two components. The first component is systemic impact, which is defined as the difference between a targeted system performance level and an actual system performance following a disruptive event; i.e., it is measured in terms of the changes in system/economic performance (such as GDP). The second component is the total recovery effort, which is the amount of resources expended during recovery processes following the disruption. To be explicit, the measurement of system resilience requires the quantification of systemic impact and total recovery effort. Given that system resilience is, in part, due to inherent properties of the system, three such properties or capacities are used to define, quantify, and ultimately design for better resilience of the particular system. These properties are (1) absorptive capacity, or the ability of the system to absorb the disruptive event; (2) adaptive capacity, or the ability to adapt to the event; and (3) restorative capacity, or the ability of the system to recover. Better system resilience can then be designed by developing resilience enhancement features that improve one or more of these capacities. In total, the framework creates a methodology for measuring system performance levels and recovery efforts, determining and sometimes measuring system capacities, and developing cost-effective resilience enhancement features, thereby achieving the overarching goal of more resilient critical infrastructure systems. This framework has two primary advantages for infrastructure and economic resilience analysis. First, the framework is general enough to be applied to all 18 of DHS’s CIKR systems. This flexibility is necessary for establishing resilience analysis standards across all CIKR systems. Second, it explicitly considers recovery costs following infrastructure disruptions. 
Recovery is a fundamental aspect of resilience, and evaluation of the recovery costs is necessary to provide a comprehensive resilience assessment. To the authors’ knowledge, this resilience assessment framework is the first of its kind to address both of these considerations.
1.4 Purpose and Scope The purpose of this chapter is to describe a framework for assessing the resilience of critical infrastructures and economic systems. Section 2 begins with a review of existing resilience definitions and frameworks, including definitions of resilience used across different disciplines and applications. Section 3 describes the authors’ proposed, three-part system resilience assessment framework, and Section 4 illustrates how to apply this framework to a specific example. Section 5 summarizes
the framework and describes future directions for further developing and applying the framework.
2 A Survey of Resilience Definitions and Frameworks
There are notable differences of opinion across professional disciplines over the fundamental definition of resilience. These differences often originate from inherent complexities in resilience concepts and how and to which disciplines they are applied; e.g., whether resilience is concerned with deviations from a steady state (“engineering resilience”) or with changes between completely different states (“ecological resilience”). Given the explicit goal of converging toward a single, simple, and flexible definition for use in homeland security applications, this section first reviews existing definitions of resilience and then develops a functional definition for these types of analyses.
2.1 Existing Definitions of Resilience
2.1.1 General and Systems Definitions
The ecologist C. S. Holling is considered by many to be the first to give a systems-level definition of resilience: Resilience is “a measure of the persistence of systems and of their ability to absorb change and disturbance and still maintain the same relationships between populations or state variables” (Holling 1973).
Since that time, others have put forward general and domain-specific definitions; e.g.:
Resilience is “the capacity to cope with unanticipated dangers after they have become manifest, learning to bounce back.” (Wildavsky 1991, p. 77)
Resilience is “the ability of a system to withstand stresses of ‘environmental loading’... [it is] a fundamental quality found in individuals, groups, organizations, and systems as a whole.” (Horne and Orr 1998, p. 31)
Resilience is “the capacity to adapt existing resources and skills to new situations and operating conditions.” (Comfort 1999, p. 21)
Resilience is “both the inherent strength and ability to be flexible and adaptable after environmental shocks and disruptive events.” (Tierney and Bruneau 2007, p. 17)
Resiliency is “the capability of an asset, system, or network to maintain its function during or to recover from a terrorist attack or other incident.” (U.S. Department of Homeland Security 2006)
Resilience is the “ability to resist, absorb, recover from or successfully adapt to adversity or a change in conditions.” (U.S. Department of Homeland Security Risk Steering Committee 2008)
Resilience is the “ability of systems, infrastructures, government, business, and citizenry to resist, absorb, recover from, or adapt to an adverse occurrence that may cause harm, destruction, or loss of national significance.” (U.S. Department of Homeland Security Risk Steering Committee 2008)
Resilience is the “capacity of an organization to recognize threats and hazards and make adjustments that will improve future protection efforts and risk reduction measures.” (U.S. Department of Homeland Security Risk Steering Committee 2008)
Resiliency is “defined as the capability of a system to maintain its functions and structure in the face of internal and external change and to degrade gracefully when it must.” (Allenby 2005)
“Regional economic resilience is the inherent ability and adaptive response that enables firms and regions to avoid maximum potential losses.” (Rose and Liao 2005)
“Engineering resilience […] is the speed of return to the steady state following a perturbation […] ecological resilience […] is measured by the magnitude of disturbance that can be absorbed before the system is restructured….” (Gunderson et al. 2002)
Social resilience is the ability of groups or communities to cope with external stresses and disturbances as a result of social, political, and environmental change (Adger 2000).
Resilience is “the essence of sustainability […] the ability to resist disorder.” (Fiksel 2003)
“Local resiliency with regard to disasters means that a locale is able to withstand an extreme natural event without suffering devastating losses, damage, diminished productivity, or quality of life and without a large amount of assistance from outside the community.” (Mileti 1999)
These definitions all include some aspect of withstanding change, whether by reducing the impact of the change, adapting to the change, or recovering from the change. Many of them assert that one aspect is the speed of the recovery and, for national infrastructure and economic systems, this speed is important; a recovery that takes hours is better than one that takes weeks, all else being equal. Only a few of the above definitions assert that adjusting easily to the change is important; in the case of government homeland security policy, if a disrupted critical infrastructure system can adjust easily and essentially on its own, fewer resources (time and money) need to be committed to the recovery process. 2.1.2 Economic Definitions Rose (2007) defines static economic resilience as …the ability of an entity or system to maintain function (e.g., continue producing) when shocked […it is] primarily a demand-side phenomenon involving users of inputs (customers) rather than producers (suppliers). It pertains to ways to use resources available as
effectively as possible. This is in contrast to supply-side considerations, which definitely require the repair or reconstruction of critical inputs.
Static economic resilience can be thought of as an instantaneous measure of the performance of an entity or system relative to a non-resilient or fragile performance (e.g., where total productive capacity is lost). Rose (2007) defines dynamic economic resilience as …the speed at which an entity or system recovers from a severe shock to achieve a desired state.
Rose (2007) also notes that dynamic resilience is “more complex because it involves a long-term investment problem associated with repair and reconstruction [that] involves serious tradeoffs.” 2.1.3 Relationships between Resilience and Other Systems Concepts Resilience as a concept is related to other system concepts, such as survivability. To illustrate some of these similarities and to more broadly show how Sandia’s overall framework is based on work by others, we can directly compare these two concepts. First, the Multidisciplinary and National Center for Earthquake Engineering Research (MCEER) has developed a comprehensive resilience framework (Bruneau et al. 2003) that lists three constituent features of a resilient system: failure probability, level of impacts, and speed of recovery. Second, the U.S. Army (2005) and others have developed survivability frameworks. The Army defines survivability as …the capability of a system to avoid or withstand manmade hostile environments without suffering an abortive impairment of its ability to accomplish its designated mission.
Survivability has three primary components, defined as follows:
• Susceptibility: “the inability of a system to avoid being hit by a threat mechanism.” (U.S. Army 2006)
• Vulnerability: “…the characteristics of a system that cause it to suffer a definite degradation (loss or reduction of capability to perform the designated mission) as a result of having been subjected to a certain, defined level of effects in an unnatural or manmade hostile environment.” (U.S. Army 2006)
• Recoverability: “…[following combat damage] the ability to take emergency action to prevent loss of the system, to reduce personnel casualties, or to regain weapon system combat mission capabilities.” (U.S. Department of Defense 2004)
Important similarities exist between resilience and survivability. First, as shown in Table 1, each of the three MCEER resilience concepts has a direct survivability corollary: failure probability with susceptibility, level of impacts with vulnerability, and speed of recovery with recoverability.
Table 1 Relationships between MCEER resilience features and survivability features

| MCEER Resilience Concept | MCEER Definition | Survivability Concept |
| --- | --- | --- |
| Failure probability | “Reduced failure probabilities” | Susceptibility |
| Level of impacts | “Reduced consequences from failures, in terms of lives lost, damage, and negative economic and social consequences” | Vulnerability |
| Speed of recovery | “Reduced time to recovery (restoration of a specific system or set of systems to their “normal” level of performance)” | Recoverability |
Second, similar to how our definition of resilience is specific to a particular disruption (e.g., a hurricane), this definition of survivability is specific to a particular threat. For example, Ball (2003) defines vulnerability as the conditional probability that an aircraft is destroyed given that it is hit ($P_{K|H}$). Ball’s mathematical formulation for the probability of combat survivability is $P_S = 1 - P_H P_{K|H}$, where $P_H$ is the probability of being hit. Finally, similar to how our framework includes resilience enhancement features, Ball (2003) describes characteristics that make something survivable in a general sense (i.e., across a range of threats) as “survivability enhancement features.” Ball’s list of features of a survivable aircraft includes items such as the characteristics of the aircraft, the characteristics of a crew, and the tactics used in combat. While each listed feature may not enhance survivability against every threat, each feature does enhance survivability against a set of threats. A list of survivability enhancement features can be further tailored to a specific threat. For example, the U.S. Army’s (2006) nuclear, biological, or chemical (NBC) contamination survivability criteria are “engineering design quantitative criteria expressed in terms of decontaminate-ability, hardness, and compatibility” for enhancing survivability in an environment with NBC contamination. Ball’s survivability enhancement features are similar to, but less specific than, the examples of resilience measures provided by Bruneau et al. (2003).
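Ball’s survivability formulation quoted above can be evaluated directly. The following is a minimal sketch; the function name and the probabilities in the example are illustrative assumptions, not values from Ball (2003).

```python
def survival_probability(p_hit: float, p_kill_given_hit: float) -> float:
    """Combat survivability P_S = 1 - P_H * P_K|H.

    p_hit is the probability of being hit (susceptibility) and
    p_kill_given_hit is the conditional probability of being destroyed
    given a hit (vulnerability).
    """
    for name, p in (("p_hit", p_hit), ("p_kill_given_hit", p_kill_given_hit)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {p}")
    return 1.0 - p_hit * p_kill_given_hit


# Made-up probabilities, for illustration only.
p_s = survival_probability(p_hit=0.5, p_kill_given_hit=0.5)
print(p_s)  # 0.75
```

Lowering either probability raises survivability, which mirrors the way a resilience enhancement feature may act on more than one system capacity.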
2.2 Resilience Assessment Frameworks 2.2.1 Domains of Resilience MCEER (Bruneau et al. 2003) developed the technical, organizational, social, and economic (TOSE) framework for defining system resilience, which was named after the important domains in which it should be applied. The technical domain includes physical systems that are engineered by humans, such as computer systems or the power grid, and their interconnected components. The organizational domain looks at the resilience capacities of organizations and institutions. The social domain
examines population and community characteristics; e.g., the “capacity of globalized, multicultural societies to hold together in the face of systemic shocks such as diseases and terrorist strikes” (The Centre of Excellence for National Security 2008). The economic domain includes economies, which are composed primarily of firms and households. The economic domain can be further refined into three sub-domains: microeconomic (behavior of individual firms and households), mesoeconomic (behavior of a sector, market, or group), and macroeconomic (aggregate behavior of all individual firms, households, and groups) (Rose 2007). While resilience is generally considered to be a positive feature in infrastructure and economic systems, it is worth noting that resilience may be considered undesirable in other domains, depending on the perspective of the analyst. For example, Hills (2000) examined the resilience of sub-Saharan Africa’s police forces and argued that these undesirable statist institutions have demonstrated high levels of resilience.
2.2.2 Complex Adaptive Systems
Recent resilience research in ecology often represents socioeconomic systems as complex systems with multiple equilibria, or “domains of attraction.” Walker et al. (2004) define crucial aspects of resilience in terms of the ease of moving a system across a threshold that leads to a completely different state of the system. These complex systems-based definitions of resilience may not always be applicable within the TOSE framework because disasters rarely cause such a system to shift to a completely different domain of attraction. For example, in ecology, a new domain of attraction may be populated by a completely different set of organisms. On the other hand, large disasters, such as Hurricane Katrina, can create large changes in the long-term behavior of systems, but usually leave the TOSE systems in a state that is recognized to be the same domain of attraction as before the disaster.
2.2.3 Human-Social Systems
The Resilience Alliance (2007a,b) developed a framework for assessing the resilience of social-economic systems, which are driven primarily by ecology (and the ecological definition of resilience) but have impacts on (or are affected by) human-social systems. The Resilience Alliance’s framework is composed of the following steps:
1. Define and understand the system. This step involves understanding the components of the system and how resilience applies to the system.
2. Identify the resilience of what? This step involves demarcating the boundaries of the system, identifying appropriate scales to examine resilience, and identifying the variables of concern. This includes the need to “identify the attributes of the system that…determine the dynamics of the system…[which] are the key targets for management intervention aimed at resilience.”
3. Identify the resilience to what? This step involves the identification of “system drivers and disturbance,” which are external and internal variables and events that drive system change. Identification is accomplished through consultation with stakeholders in the system and development of a historical profile of the system.
4. Identify the people and governance. This step identifies the key players in the system and how the system is governed externally.
5. Assess resilience. This step develops the conceptual and analytical models necessary for identifying the recovery path and recovery efforts needed in the resilience assessment.
6. Identify implications for management intervention. This step uses the results of the modeling in the previous step to inform policymakers or other managers how the system might react to interventions.
7. Synthesize resilience understanding. The final step of the Resilience Alliance’s resilience assessment process is to synthesize the findings of the previous steps.
2.2.4 Seismic Resilience
MCEER (Bruneau et al. 2003) proposes that the seismic resilience of a community to an earthquake can be measured by estimating the expected degradation in the quality of community infrastructure, Q(t). Given the occurrence of an earthquake at time t0, this degradation is measured from the time immediately following the earthquake (t0) until Q(t) returns to its pre-earthquake level (t1). Resilience loss, RL, is calculated as

$$RL = \int_{t_0}^{t_1} \left[ 100 - Q(t) \right] dt \tag{1}$$

or, equivalently, as the shaded area in Figure 3. The framework assumes that infrastructure quality levels are at 100 percent prior to the earthquake event and will return to this level following the earthquake. Though this method is presented in the context of earthquakes, the approach is also appropriate for other types of disturbances.
Fig. 3 Conceptual illustration of MCEER’s seismic resilience loss measurement (adapted from Figure 1 in Bruneau et al. 2003). Axes: quality of infrastructure (%) versus time; RL is the shaded area between t0 and t1.
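When Q(t) is available only as sampled data, the integral in Equation (1) can be approximated numerically. The sketch below uses the trapezoidal rule; the function names and the linear recovery profile are assumptions made for illustration and do not come from Bruneau et al. (2003).

```python
from typing import Sequence


def trapezoid(times: Sequence[float], values: Sequence[float]) -> float:
    """Approximate the integral of sampled values over times (trapezoidal rule)."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("times and values must have equal length >= 2")
    return sum(
        0.5 * (values[k] + values[k + 1]) * (times[k + 1] - times[k])
        for k in range(len(times) - 1)
    )


def resilience_loss(times: Sequence[float], quality_percent: Sequence[float]) -> float:
    """RL: integral of [100 - Q(t)] from t0 = times[0] to t1 = times[-1]."""
    return trapezoid(times, [100.0 - q for q in quality_percent])


# Hypothetical profile: quality drops to 50% at t0 = 0 days and recovers
# linearly to 100% at t1 = 10 days, so RL is a triangle of area 0.5 * 50 * 10.
rl = resilience_loss([0.0, 10.0], [50.0, 100.0])
print(rl)  # 250.0
```

RL carries units of percent times time (here, percent-days), so values are comparable only across cases that use the same time unit.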
2.2.5 Probability-Based Resilience Assessment
Chang and Shinozuka (2004) propose a probabilistic approach for measuring resilience, which they mathematically define in Equation 2 in terms of pre-defined performance standards A, given a seismic event of magnitude i:

$$R = P(A \mid i) = P(r_0 < r^* \text{ and } t_1 < t^*), \tag{2}$$

where
r0 = initial system performance loss;
r* = a predetermined standard of robustness that denotes the “maximum acceptable loss” in system performance following an earthquake;
t1 = time to full recovery; and
t* = the “maximum acceptable disruption time”; i.e., the maximum acceptable length of time for system performance to return to pre-earthquake levels.

Using this formulation, Chang and Shinozuka (2004) define resilience R as “the probability that the system of interest will meet predefined performance standards A [i.e., r0 < r* and t1 < t*] […] r0 > r* and t1 > t*

Fig. 4 Measuring probabilistic resilience (adapted from Chang and Shinozuka 2004). Axes: system performance versus time; curves labeled “Without Earthquake” and “With Earthquake”; marked levels r0 and r*; marked times t0, t*, and t1.
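One way to estimate the probability in Equation (2) is to count the share of simulated or observed outcomes that satisfy both performance standards. The sketch below assumes that a set of (r0, t1) outcomes for events of magnitude i is already available; the function name, thresholds, and sample values are hypothetical and are not taken from Chang and Shinozuka (2004).

```python
from typing import Iterable, Tuple


def estimate_resilience(
    outcomes: Iterable[Tuple[float, float]], r_star: float, t_star: float
) -> float:
    """Estimate R = P(r0 < r* and t1 < t*) as a sample fraction.

    Each outcome is a pair (r0, t1): initial performance loss and time to
    full recovery for one simulated event of the given magnitude.
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("at least one outcome is required")
    met = sum(1 for r0, t1 in outcomes if r0 < r_star and t1 < t_star)
    return met / len(outcomes)


# Hypothetical outcomes: (performance loss in %, recovery time in days).
samples = [(10.0, 3.0), (30.0, 5.0), (15.0, 12.0), (5.0, 1.0)]
r = estimate_resilience(samples, r_star=20.0, t_star=7.0)
print(r)  # 0.5: two of the four outcomes meet both standards
```

The second outcome fails the robustness standard and the third fails the rapidity standard, so each counts against R even though it meets the other criterion.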
2.2.6 Economic Resilience Section 2.1.2 includes Rose’s (2007) definitions of static and dynamic economic resilience. Rose (2007) also proposes metrics for measuring these quantities. Rose proposes that the static economic resilience of a system to a shock be measured as “the ratio of the avoided drop in [system] output and the maximum potential drop”
in system output (Figure 5) and applies this metric to assess the resilience of a system at any given instant in time.

$$\text{Static Resilience} = \frac{\Delta WC - \Delta EP}{\Delta WC}$$

Fig. 5 Static economic resilience. The figure shows system output at levels labeled “No Disruption,” “Expected Performance,” and “Worst Case,” with the output drops marked ΔEP and ΔWC.
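The static resilience ratio shown in Figure 5 can be computed from three output levels. The sketch below assumes that ΔEP and ΔWC are both measured as drops from the no-disruption output level, which is consistent with Rose’s description of an avoided drop relative to a maximum potential drop; the function name and numbers are illustrative.

```python
def static_resilience(no_disruption: float, expected: float, worst_case: float) -> float:
    """Static resilience = (dWC - dEP) / dWC.

    dEP is the drop from the no-disruption output to the expected output;
    dWC is the drop from the no-disruption output to the worst-case output.
    """
    delta_ep = no_disruption - expected
    delta_wc = no_disruption - worst_case
    if delta_wc <= 0:
        raise ValueError("worst-case output must be below the no-disruption output")
    return (delta_wc - delta_ep) / delta_wc


# Hypothetical outputs: 100 without disruption, 80 expected, 50 in the worst case.
# The avoided drop is 50 - 20 = 30, so the ratio is 30 / 50.
print(static_resilience(100.0, 80.0, 50.0))  # 0.6
```

A value of 1 means the entire potential loss was avoided, and a value of 0 means the system performed no better than the worst case.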
Rose (2007) incorporates the time-dependent aspects of system recovery in his approach to measuring dynamic economic resilience. As illustrated in Figure 6, dynamic economic resilience can be calculated as

$$\text{Dynamic Resilience} = \sum_{i=1}^{N} \left[ SOHR(t_i) - SOWR(t_i) \right] \tag{3}$$

where
N = number of time steps considered,
ti = the ith time step,
SOHR = system output under hastened recovery efforts, and
SOWR = system output without hastened recovery efforts.
As with the other frameworks, these economic approaches can be easily adapted to non-economic applications.

Fig. 6 Dynamic economic resilience. Axes: system output versus time; curves labeled “With Hastened Recovery” and “Without Hastened Recovery”; marked events “Shock Occurs” and “Recovery Begins”; a region labeled “Dynamic Resilience.”
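Equation (3) is a sum of output differences over the time steps considered. A minimal sketch follows; the function name and the two output series are hypothetical and serve only to show the arithmetic.

```python
from typing import Sequence


def dynamic_resilience(sohr: Sequence[float], sowr: Sequence[float]) -> float:
    """Sum over time steps of SOHR(t_i) - SOWR(t_i).

    sohr: system output under hastened recovery efforts at each time step.
    sowr: system output without hastened recovery efforts at each time step.
    """
    if len(sohr) != len(sowr):
        raise ValueError("both series must cover the same time steps")
    return sum(h - w for h, w in zip(sohr, sowr))


# Hypothetical outputs at N = 4 time steps after recovery begins.
sohr = [60.0, 80.0, 100.0, 100.0]
sowr = [60.0, 70.0, 85.0, 100.0]
print(dynamic_resilience(sohr, sowr))  # 25.0 = 0 + 10 + 15 + 0
```

Because the result is an unweighted sum, it depends on the length of the time step; comparisons are meaningful only between series sampled at the same interval.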
3 The System Resilience Framework As demonstrated in the previous section, resilience is not a new concept. However, no existing frameworks are general enough to apply to all eighteen CIKRs while explicitly considering the cost of recovery efforts. The authors developed a three-part resilience assessment framework with the goal of addressing both of these issues. The resilience assessment framework consists of a definition of system resilience, an approach for quantitatively measuring system resilience costs, and a qualitative method for evaluating features that determine system resilience. The individual components of the framework are described in this section.
3.1 A Definition of System Resilience for Infrastructure and Economic Systems
Based on the review and comparison of resilience definitions and frameworks, including those described in Section 2, our proposed definition of system² resilience is

Given the occurrence of a particular disruptive event³ (or set of events), the resilience of a system to that event (or events) is the ability to efficiently reduce both the magnitude and duration of the deviation from targeted system performance levels.

This definition emphasizes that resilience is determined by a combination of the impact of the event on the system and the time and cost required for the system to recover. Some clarification of terms is helpful:

• System performance: Given the flexibility of many systems to adjust and reconfigure to a disruptive event, maintaining system structure is not as important as maintaining system performance. For example, a disruptive event may radically change the structure of how a regional economy operates (the structure of markets; the direction and levels of commodity flows), but this is immaterial if the desirable performance levels of gross regional product (GRP) and low unemployment are maintained through and afterwards. As explained below, systems with high levels of adaptive capacity and restorative capacity may be able to achieve targeted system performance levels through even radical (and low-cost) changes in system structure.
• Efficiency: The term “efficiency” means using the lowest possible amount of resources during recovery processes; depending on the domain, these resources could be dollars, repair man-hours, infrastructure replacement assets, or time. By defining it this way, the definition of resilience has the broadest domain application and the framework can have concise applications in other analytical areas, such as consequence analysis, risk analysis, and policy analysis.
• Disruptive event: Our definition considers resilience of a system to a specific disruption. Different disruptions may affect a system in different ways and, thus, necessitate different recovery processes. Hence, a system may have different levels of resilience to different disruptions. Our definition emphasizes that resilience of a system should be considered in the context of a particular disruption.
• Measurement of system resilience costs: The resilience of a system to a specific disruptive event is determined by systemic impact, which is determined by the deviation from the targeted system performance levels; and total recovery effort, which is a function of the duration of recovery (the length of time it takes for the system performance level to permanently recover to the targeted system performance level) and the recovery effort (the costs and efforts required to change the structure of the system to recover to the targeted system performance level). Measuring system resilience costs this way allows one to make tradeoffs between systemic impact and total recovery effort: a system that moves quickly to the targeted system performance, but at a high total recovery effort, may not be preferable to a slower but less costly recovery.
• Targeted system performance level: This is a quantifiable measure of how the system performance metric should behave during and after disruptive events. This can be as simple as the pre-disruption system performance metric continued throughout time, or it may vary according to each particular disruptive event. In a scenario involving a destructive earthquake, for example, the targeted system performance level for emergency services will likely increase, overwhelming even an undamaged emergency services infrastructure, but begin declining in the weeks following the earthquake as demand for emergency services decreases.

² A “system” is a set of related and often explicitly interconnected entities that forms a whole. Engineered systems, such as infrastructure systems, have a collective, measurable purpose.
³ A “disruptive event” is a shock or perturbation that can transform the system into other states (which are often worse, but potentially better). This event is most commonly an event of short duration, such as a natural disaster, but the definition also accommodates longer term disturbances such as changes in technology, business cycles, social trends, and climate change. These events are respectively termed “pulse” and “press” disturbances (Resilience Alliance 2007a).
3.1.1 Example Resilience Domains By design, our definition of resilience is applicable to a large number of system domains. Figure 7 illustrates the broad range of interconnected domains whose system resilience can be evaluated.
Fig. 7 Domains for assessing system resilience. Domains shown: technical, organizational, social, environmental, ecological, and economic.
The definition of system resilience is applicable to human systems (black), environmental/ecological systems (green), and interconnected combinations of these two. All of these domains matter in our definition and measurement of system resilience, and most systems can and do fall across multiple domains.
3.2 A Quantitative Approach to Measuring Resilience
Recovery is an inherent component of system resilience, as defined in Section 3.1. The existing measurement approaches described in Section 2 consider impacts to system output. Recovery is implicitly considered in all of these approaches because recovery directly affects system output. However, none of these approaches explicitly considers the cost of recovery or the dependence of system outputs on the selection of recovery strategies. The authors assert that this is a critical aspect of resilience that needs to be considered in the measurement of resilience. A simple example is presented to illustrate this point.

• Consider two systems that experience identical disruptions at precisely the same time. Both systems suffer the same decreases in system output, and both systems return to pre-disruption levels at precisely the same time. System 2 suffers these system output impacts while only a few recovery resources are spent by external entities to restore system capabilities. However, significantly more resources are expended by external entities to return System 1 to pre-disturbance output levels.
Each of the existing resilience measurement approaches described in Section 2 would consider that these two systems have identical resilience values because those approaches only consider systemic impact. However, the authors assert that System 2 should be considered to be more resilient because System 1 required a greater recovery effort for identical results or system impacts (Figure 8). This section describes a novel approach for explicitly incorporating the cost of recovery into the measurement of resilience costs.
Fig. 8 Graphical comparison of the resilience of two systems
In contrast to other quantitative methods that measure resilience directly, the authors’ quantitative resilience assessment method indirectly evaluates resilience by quantifying resilience costs. Systems that are more resilient to a disruption will have lower costs than systems that are less resilient to that disruption. The resilience cost measurement approach requires quantification of two key components of the definition of system resilience: systemic impact and total recovery effort. As previously mentioned, systemic impact is measured by evaluating the difference between a targeted system performance level and the actual system performance following the disruption. Total recovery effort is measured by analyzing the amount of resources expended during the recovery process. Figure 9 graphically represents systemic impact and total recovery effort for a hypothetical system that has been disrupted. Systemic impact (SI) is quantified by calculating the area between the targeted system performance (TSP) and the actual system performance (SP) curves in Figure 9a. This area is calculated using the formula in Equation (4).

$$SI = \int_{t_0}^{t_f} \left[ TSP(t) - SP(t) \right] dt \tag{4}$$

Similarly, total recovery effort (TRE) is represented by the area under the recovery effort (RE) curve in Figure 9b, and this area is calculated with the formula in Equation (5). Calculation of resilience incorporates both of these quantities.

$$TRE = \int_{t_0}^{t_f} RE(t) \, dt \tag{5}$$
Fig. 9 Systemic impact (a) and total recovery effort (b) are measured by calculating the shaded areas under the curves. Panel (a): system performance versus time, showing SP(t), TSP(t), and the deviation from the targeted system performance level; the shock occurs at t = t0, recovery is complete at t = tf, and the duration is tf − t0. Panel (b): recovery effort RE(t) versus time; recovery effort commences following the shock and recovery is complete at t = tf.
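Because the approach requires only time series data, Equations (4) and (5) can be approximated from sampled curves. The sketch below uses the trapezoidal rule; the function names and the sample series are assumptions chosen for illustration and are not data from the authors’ analyses.

```python
from typing import Sequence


def trapezoid(times: Sequence[float], values: Sequence[float]) -> float:
    """Approximate the integral of sampled values over times (trapezoidal rule)."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("times and values must have equal length >= 2")
    return sum(
        0.5 * (values[k] + values[k + 1]) * (times[k + 1] - times[k])
        for k in range(len(times) - 1)
    )


def systemic_impact(times, tsp, sp) -> float:
    """SI: area between targeted (TSP) and actual (SP) performance, Equation (4)."""
    return trapezoid(times, [target - actual for target, actual in zip(tsp, sp)])


def total_recovery_effort(times, effort) -> float:
    """TRE: area under the recovery effort curve RE(t), Equation (5)."""
    return trapezoid(times, effort)


# Hypothetical daily samples from t0 = 0 to tf = 4.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
tsp = [100.0, 100.0, 100.0, 100.0, 100.0]  # targeted system performance
sp = [100.0, 60.0, 80.0, 90.0, 100.0]      # actual system performance
effort = [0.0, 10.0, 10.0, 5.0, 0.0]       # recovery effort RE(t)

si = systemic_impact(times, tsp, sp)
tre = total_recovery_effort(times, effort)
print(si, tre)  # 70.0 25.0
```

In this example the targeted performance is held at the pre-disruption level; a time-varying TSP(t), as in the emergency services scenario of Section 3.1, would be passed in the same way.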
Before demonstrating how system resilience costs are calculated, it is important to note that the system performance is determined by the recovery effort. That is, different recovery efforts lead to different system performances. For example, if no recovery effort is made following the disruption, the impacts to system performance may be great. In contrast, if recovery resources are deployed shortly after the system shock, system performance may not be significantly affected, and systemic impact may be small. The recognition that systemic impact is implicitly determined by the selected recovery strategy leads to the development of two types of resilience cost measurements:

• Optimal resilience (OR) costs are the resilience costs for the system to a particular disruption, d, when the optimal recovery strategy that minimizes a combination of systemic impact and total recovery effort is employed (see Equation [6]), and
• Recovery-dependent resilience (RDR) costs are the resilience costs of a system to a particular disruption, d, under a particular recovery strategy, RE (see Equation [7]).

$$OR(d) = \min_{RE} \frac{\int_{t_0}^{t_f} \left[ TSP(t) - SP(t,d) \right] dt + \alpha \times \int_{t_0}^{t_f} RE(t,d) \, dt}{\int_{t_0}^{t_f} TSP(t) \, dt} \tag{6}$$

$$RDR(d, RE) = \frac{\int_{t_0}^{t_f} \left[ TSP(t) - SP(t,d) \right] dt + \alpha \times \int_{t_0}^{t_f} RE(t,d) \, dt}{\int_{t_0}^{t_f} TSP(t) \, dt} = \frac{SI + \alpha \times TRE}{\int_{t_0}^{t_f} TSP(t) \, dt} \tag{7}$$
Both OR and RDR costs are linear combinations of SI and TRE. The denominator is a normalization factor that permits the comparison of the resilience of systems whose system performance levels may be of different magnitudes. The parameter α is a non-negative weighting factor that allows the analyst to assign the relative importance of the systemic impact and total recovery effort terms. Assigning a small positive value to α weighs the systemic impact more heavily; a large positive value for α weighs the cost of recovery more heavily. To equally weight SI and TRE, α is set to 1. Several things about this resilience cost measurement approach should be considered:

• Smaller RDR and OR costs indicate increasing resilience.
• No finite RDR and OR values correspond with the concept of a minimally “resilient” system.
• The approach for measuring system resilience is neither model- nor domain-specific. It only requires time series data (either historical or from a model) that represent system output and recovery efforts.
• The summation of the SI and α × TRE terms in Equation 7 requires either that the SI and TRE be measured in the same units and α be a unitless constant or that α be assigned units that are appropriate for converting TRE to the same units as SI.
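Equations (6) and (7) can be sketched numerically as follows. The sketch assumes that SP and RE are expressed in the same units (so that α is unitless) and approximates the minimization in Equation (6) by searching only a finite, analyst-supplied set of candidate recovery strategies; it is therefore not a true optimum over all possible strategies. All function names, strategy names, and data are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def trapezoid(times: Sequence[float], values: Sequence[float]) -> float:
    """Approximate the integral of sampled values over times (trapezoidal rule)."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("times and values must have equal length >= 2")
    return sum(
        0.5 * (values[k] + values[k + 1]) * (times[k + 1] - times[k])
        for k in range(len(times) - 1)
    )


def rdr_cost(times, tsp, sp, effort, alpha: float = 1.0) -> float:
    """Recovery-dependent resilience cost: (SI + alpha * TRE) / integral of TSP."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    si = trapezoid(times, [target - actual for target, actual in zip(tsp, sp)])
    tre = trapezoid(times, effort)
    norm = trapezoid(times, tsp)
    if norm == 0:
        raise ValueError("integral of TSP must be nonzero")
    return (si + alpha * tre) / norm


def optimal_resilience_cost(
    times, tsp, strategies: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    alpha: float = 1.0,
) -> Tuple[str, float]:
    """Smallest RDR cost over the given candidates (name -> (SP series, RE series))."""
    costs = {
        name: rdr_cost(times, tsp, sp, effort, alpha)
        for name, (sp, effort) in strategies.items()
    }
    best = min(costs, key=costs.get)
    return best, costs[best]


times = [0.0, 1.0, 2.0, 3.0, 4.0]
tsp = [100.0] * 5
strategies = {
    # Fast but resource-intensive recovery: SI = 70, TRE = 25.
    "fast": ([100.0, 60.0, 80.0, 90.0, 100.0], [0.0, 10.0, 10.0, 5.0, 0.0]),
    # Slower, cheaper recovery: SI = 100, TRE = 6.
    "slow": ([100.0, 60.0, 65.0, 75.0, 100.0], [0.0, 2.0, 2.0, 2.0, 0.0]),
}

print(optimal_resilience_cost(times, tsp, strategies, alpha=1.0))   # "fast" has the lower cost
print(optimal_resilience_cost(times, tsp, strategies, alpha=10.0))  # "slow" has the lower cost
```

With α = 1 the costs are 95/400 for the fast strategy and 106/400 for the slow one; with α = 10 they become 320/400 and 160/400, so the preferred strategy switches. This illustrates the tradeoff between systemic impact and total recovery effort described in Section 3.1.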
Because the RDR and OR values are dimensionless quantities, they are most informative when used in a comparative manner. For example, they can be used to compare the resilience of different systems to the same disruption. The system that has lower resilience costs will be the more resilient system. RDR and OR values
can also be used to compare the resilience of the same system to different types of disruptions. The system is more resilient to the disruption that results in smaller RDR and OR values. Moreover, they can be used to compare the resilience of a system to a disruption under different recovery strategies. Each different recovery strategy will result in different SI and TRE values. The recovery strategy that results in the smallest RDR values will provide maximal resilience for the system. It should be noted that the definition of optimal resilience costs is similar to commonly used objective functions in optimal control applications.4 This similarity provides a starting point for research into the measurement of optimal resilience that will be discussed in Section 5. 3.2.1 Example Critical Infrastructure System Performance Metrics The resilience framework is designed to be used on any infrastructure or economic system that has at least one quantifiable measure of the performance. Table 2 lists some examples of infrastructure and economic systems and associated performance metrics: Table 2 Example System Performance Metrics
Critical Infrastructure System | System Performance Metrics
Agriculture and Food | Rates of and population exposure to food contamination; average consumer price of food
Chemical | Shipments to critical chemical-based commodities (e.g., pharmaceuticals)
Emergency Services | Lives saved; average response time
Energy: Petroleum, Oil, and Lubricants (POL) | Barrels of refined petroleum product transported to the Midwest; price of domestic refined products; profitability of energy companies
Information Technology | Number and efficacies of cyber attacks
Public Health and Healthcare: H1N1 vaccine production, storage, and distribution system | Rates of morbidity and mortality; cost per vaccine given
Transportation Systems: Highway | Average speed and cost of shipments; number of disrupted shipments
Communications | Number of dropped telephone calls
4 Consider, for example, the tracking formulation of the well-known linear quadratic regulator (LQR) control problem described by Lee and Markus (1986).
3.3 System Capacities That Determine System Resilience
Fiksel (2003) suggests that “it is important to assess not only performance outcomes but also the intrinsic characteristics that contribute to system resilience.” Consequently, our framework features a qualitative analysis component that can be used to explain the results of quantitative measurements or can take the place of quantitative results when no data are available. This analysis is done through consideration of system structures, characteristics, and features. This portion of the framework uses three system capacities to formulate how inherent properties of a system can determine system resilience, specifically by reducing systemic impact and total recovery effort. They are: absorptive capacity, adaptive capacity, and restorative capacity. These capacities are affected by resilience enhancement features, features of the system that can increase one or more of the system capacities. The following subsections describe the characteristics of resilience capacities and related resilience enhancement features and provide examples of each. Figure 10 summarizes the distinguishing characteristics of the capacities.
[Figure 10, text recovered from the diagram]
Resilience Component: System Impact; Total Recovery Effort
Determining Features: Absorptive Capacity | Adaptive Capacity | Restorative Capacity
Distinguishing Characteristics of Capacity: Considers aspects that automatically manifest after the disruption | Considers internal aspects that manifest over time after the disruption | Considers ability to affect and repair internal system features
Effort Required: Automatic/Little Effort | Internal Effort Required | External Effort Often Required
Measurement of Component: Internal Measurement; Exogenous Measurement
Fig. 10 Resilience capacities of a system
3.3.1 Absorptive Capacity
Absorptive capacity is the degree to which a system can automatically absorb the impacts of system perturbations and minimize consequences with little effort. The absorptive capacity is an endogenous feature of the system.
For example, storage can enhance the absorptive capacity; if a chemical plant is disabled but a large amount of collocated storage of its product is undamaged, customers can continue to be supplied by the stored quantities, with little cost to the producer or customer, while the plant is repaired. Examples of resilience enhancement features that can increase absorptive capacity include:
• System robustness, which decreases systemic impact through the strength of individual connections in the system.
• System redundancy, which decreases systemic impact by providing alternate pathways for the system mechanics to operate.
3.3.2 Adaptive Capacity
Adaptive capacity is the degree to which the system is capable of self-organization for recovery of system performance levels.5 It is a set of properties that reflect actions that result from ingenuity or extra effort over time, often in response to a crisis situation. It reflects a dynamic ability of the system to change endogenously throughout the recovery period. Consider the scenario in which a hurricane destroys many high voltage power lines, leaving many customers without electricity. Having customers with emergency generators enhances system adaptive capacity because the system can be changed (customers adapt to the disruptive event by generating power from a fuel source like gasoline rather than connecting to the electric grid) so that some portion of system performance is regained with a relatively small amount of effort. Substitutability, the ability to replace one system component with another, is a resilience enhancement feature that can affect adaptive capacities. Economic systems with a high adaptive capacity can easily adjust to shocks through ordinary means such as input substitution, which can occur in situations where an input is scarce. For example, if beef becomes scarce, prices may increase and consumers may naturally substitute other, less expensive foods, such as poultry, with little effort. Similarly, Rose (2007) provides examples of “adaptive resource substitution,” “adaptive import substitution,” and “adaptive conservation,” where people alter their everyday behavior to recover some portion of the lost system functionality. Other resilience enhancement features that increase adaptive capacity tend to be more difficult to identify because they often rely upon the ingenuity of people faced with adversity.
5 Our definition is similar to the definition by Carpenter et al. (2001) of “self-organization capacity,” which is “the degree to which the system is capable of self-organization (versus lack of organization, or organization forced by external actors).” This definition emphasizes that organization is an endogenous, emergent property, and is similar to Rose’s (2007) concept of “adaptive resilience,” which is “the ability in crisis situations to maintain function on the basis of ingenuity or extra effort.” It is also similar to Sheffi and Rice’s (2005) definition of “flexibility,” which is the “amounts of building organic capabilities that can sense threats and respond to them quickly.”
3.3.3 Restorative Capacity
Restorative capacity is the ability of a system to be repaired easily. Typically, these repairs are dynamic and performed by entities exogenous to the system. In the context of infrastructure policies, the government is often considered to be the exogenous, repairing entity. These repairs usually restore the system to near its original, pre-event state, but can also restore the system to a completely new state or regime that anticipates future system requirements. Restorative capacity primarily affects the total recovery effort, although repairs to the system enabled by the system’s restorative capacity may also increase the system performance, reducing systemic impact. Whereas adaptive capacity reflects the ability of a system to be changed endogenously, restorative capacity reflects the ability to be repaired exogenously. Most importantly, whereas adaptive capacity involves changes that can radically alter the structure of the system to restore system performance, restorative capacity most often involves repairs that are implemented with the goal of returning a system to something near its original structure. For example, the electric power grid has monitoring systems that can automatically detect when and where a break in the grid emerges. Such technologies enhance the restorative capacity of the power grid because repair crews can be sent to the location of the break. These technologies result in a shorter disruption that is easier to repair (in terms of cost and time) than it would be if crews had to search large portions of the grid to find the break before repairing it.
3.3.4 Relationships between System Capacities, Performance, and Recovery
As with system resilience, system capacities are event-specific. However, because many disruptive events may produce similar system performance and recovery results, capacities of a system to respond resiliently to different events may be highly correlated.
For example, a variety of earthquakes could be experienced in California, but the capacities that enhance resilience of a system to one earthquake will be nearly identical to those that enhance resilience to other earthquakes. Therefore, capacities can be identified for classes of disruptive events. Section 4 provides a framework for conducting this qualitative resilience capacity assessment. Absorptive and adaptive capacities are likely most important in the initial stages of large, widespread disruptions, where repair of the system might be impossible in the short term. For example, even if a large earthquake immobilizes the emergency services sector, people trapped in collapsed buildings will need to be rescued long before the police stations and fire engines are repaired. Restorative capacity is likely most important after the disruptive event, when efforts are underway to repair the system. Dependencies across systems can affect resilience capacities. For example:
• Absorptive Capacity. If system A is dependent upon system B to operate, then this relationship will lower system A’s absorptive capacity in scenarios that negatively affect system B.
• Adaptive Capacity. System A may have adaptive capacities that allow the system to reorganize to reduce its dependency upon system B.
• Restorative Capacity. The operation of system A may not depend upon the functionality of system B, but the repair of system A’s components may require system B to be operational. In such cases, the restorative capacity of system A is diminished. Additionally, repair of multiple sectors must often be coordinated (for example, where infrastructures are collocated in urban areas), which further reduces restorative capacity by increasing the complexity of repair.
3.3.5 Resilience Enhancement Features and Resilience Domains
The ability of resilience enhancement features to affect resilience often comes from the particulars of the domain under consideration.6 Resilience enhancement features in the technical domain are often engineering-based solutions that attempt to improve the functional performance level of the infrastructure system. For example, embedded sensors that relay the structural state of bridges to a command center after an earthquake are an example of a technical feature that can be engineered to improve the restorative capacity of highway systems because the sensors will reduce the need for costly and time-consuming inspections of bridges conducted before traffic is allowed back onto the highway. Additionally, design simplicity may be another technical feature that contributes to resilience (Fiksel 2003). A simple design may be more robust (hence, higher absorptive capacity), be easier to adapt (higher adaptive capacity), and easier to repair (higher restorative capacity). Resilience enhancement features in the organizational domain are especially important from a policy standpoint because they can describe how governments and major stakeholders can affect restorative capacity. These organizations and institutions must attempt to choose an optimal recovery effort, taking into account the absorptive, adaptive, and restorative capacities of the system. This is an economic problem because the costs of recovery (i.e., the total recovery effort) must be weighed against the speed of recovery and the reductions in systemic impact brought about by recovery efforts. Economic resilience enhancement features may reflect the ability of the structure of the economy to enable resilience capacities. Because the economy is highly influenced by organizations such as the government, there are many overlaps in the economic domain.
For example, Rose (2007) uses price “gouging” as an example of a behavior that increases resilience to scarcity. Thus, government policies that ban price increases during disasters are non-resilient (or even negatively resilient) because they reduce the absorptive and adaptive capacities enabled by the market price system. Resilience enhancement features from the social domain reflect grassroots characteristics of communities that enhance the resilience capacities. For example, in the heavily damaged Vietnamese Versailles community of New Orleans East following Hurricane Katrina, residents quickly moved back and began “pooling their own resources and volunteer labor” to rebuild despite delays in government aid. The community also worked together to persuade the electric utility to restore power (Chen 2006). Possible capacity measures include egalitarianism, monoculturalism (see, for example, The Centre of Excellence for National Security 2008), and correlates of prosperity (see, for example, Isserman et al. 2009).
6 Focusing on the long-term sustainability of companies, ecosystems, and social systems, Fiksel (2003) identifies four categories of characteristics of resilient systems that can also be considered as resilience enhancement features and offers examples: (1) diversity, or “existence of multiple forms and behaviors”; (2) efficiency, or “performance with modest resource consumption”; (3) adaptability, or the “flexibility to change in response to new pressures”; and (4) cohesion, or the “existence of unifying forces or linkages.”
3.4 Framework Summary
The resilience assessment framework that we have developed consists of three primary components. First, our definition of system resilience indicates what factors need to be considered when assessing the resilience of a system. These factors are quantitatively evaluated in the second component of the framework, our resilience cost measurement methodology. This methodology explicitly includes costs resulting from system impacts and recovery efforts following a system disruption. Third, the qualitative analysis component examines resilience capacities and enhancement features of the system to explain or replace quantitative results. Figure 11 illustrates the relationships between the components of the resilience assessment framework. Thorough application of this resilience assessment framework can result in a comprehensive evaluation of a system’s resilience and can indicate how to further enhance system resilience.
[Figure: diagram linking Absorptive Capacity, Adaptive Capacity, and Restorative Capacity to Resilience, with two plots against time: System Performance (annotated with Recovery Duration and Systemic Impact) and Recovery Effort (annotated with Total Recovery Effort).]
Fig. 11 Capacities and measures of resilience
4 A Qualitative Resilience Assessment for an Earthquake Scenario
The calculation of RDR values is rather straightforward, and calculation of OR values is a point of ongoing research (see Section 5.2.1). The qualitative analysis component of the resilience assessment framework is the most complex, so this section presents an example qualitative analysis to illustrate its application. As a part of the NISAC program, Sandia is conducting a multi-year project to evaluate the potential impacts resulting from a large earthquake in the New Madrid Seismic Zone of the Midwestern United States. The authors performed a resilience assessment of the 18 CIKRs to this earthquake as a part of the multi-year effort (NISAC 2009). The assessment followed the qualitative resilience assessment approach described in Section 3.3, and this section demonstrates how that application was performed and presents some of the results from that assessment. This assessment focuses on the qualitative aspects of the assessment framework because the quantitative measurement approach was not fully developed at the time of that analysis.
4.1 The Earthquake Scenario
The proposed earthquake scenario for the infrastructure assessment was as follows: on January 3, 2009, at 4:00 am Central Standard Time (CST), an earthquake of magnitude 7.7 occurred with an epicenter northwest of Memphis. The earthquake occurred on a fault coincident with the southwest linear zone of modern seismicity of the New Madrid Seismic Zone. The Modified Mercalli Intensity (MMI), which shows the degree of shaking, is shown in Figure 12.
Fig. 12 Ground Shaking Intensities for New Madrid Earthquake Scenario (Source: http://earthquake.usgs.gov/regional/ceus/products/download/regional/nm_sw_mmi.gif, accessed October 15, 2008)
An earthquake in this region is of particular concern because CIKR systems could be severely affected. Several major natural gas transmission pipelines would be ruptured in this scenario. Consequently, natural gas transmission to the Chicago area and Northeast United States could potentially be reduced by up to 25 and 15 percent, respectively. Additionally, cascading electric grid failure would likely occur, leading to a blackout across the entire Eastern Interconnect within 30 minutes of the event. The Mississippi River would probably experience thousands of landslides, making it un-navigable, and the location of the river could actually be moved by up to a mile in some places.
4.2 Approach
The objective of this resilience assessment was to provide a high-order, qualitative evaluation of the resilience of the 18 regional CIKR systems to the earthquake described in the previous section. This assessment focused on the resilience of the CIKRs within the technical domain. To achieve this objective, the authors performed the following steps by interviewing NISAC’s CIKR subject matter experts (SMEs):
• Identified system and subsystem(s) of interest: CIKR systems are often composed of subsystems that would be best analyzed separately, depending, of course, on the goals of the resilience assessment. For example, energy infrastructure can be divided into electric power, petroleum, natural gas, and so on. System boundaries should be set so that the resilience assessment is of a manageable scope.
• Identified system performance metric(s): Because systems are complex and composed of many entities or sub-units, there are numerous possible metrics that could be used. The system performance metrics chosen should be those most fundamental to the purpose of the system from the perspective of the relevant stakeholders. For example, for the purposes of the NISAC (2009) study, the percentage of customers with working electric power was more relevant to the DHS than the profitability of power companies.
• Assessed or simulated the recovery path: Assessing or simulating the recovery path consists of identifying the initial systemic impact as well as the changes in that impact over time as recovery proceeds. The recovery path was assessed qualitatively using both the expertise and models of the SMEs and more formal quantitative models and simulations.
• Assessed or simulated the recovery effort: Assessing or simulating the recovery effort as it proceeds through time is related to identifying the recovery path. Because the recovery path is a function of the recovery effort, identifying both will likely follow similar qualitative or quantitative methods.
• Identified resilience enhancement features and assessed resilience capacities: Analysts asked SMEs in the NISAC (2009) study to identify features of their systems that affect resilience capacities.
The authors evaluated the results of the interviews for each CIKR system and made qualitative assessments (high, medium, or low) for the resilience capacities of each CIKR system. The evaluations were then gathered in a single resilience matrix as described in the following section.
4.3 Sample Results
Figure 13 contains the results of the resilience assessment for two CIKR systems: emergency services and postal and shipping services. Qualitative assessments of the resilience capacities are displayed in a single resilience matrix. In this example, the left column contains the names of the systems being evaluated and the top row lists the resilience capacities. The intersections of the rows and columns contain one of the letters H, M, or L (for high, medium, or low). More detailed explanations of the capacity assessments follow the matrix.
Resiliency Matrix: Earthquake Scenario Analysis Example
System | Absorptive Capacity | Adaptive Capacity | Restorative Capacity
Emergency Services | L | M | L
Postal and Shipping | M | H | H
Emergency Services:
- Absorptive Capacity: Low. Emergency services will likely be overwhelmed. The problem will be exacerbated with destruction of emergency services facilities.
- Adaptive Capacity: Medium. Some emergency services functions, such as search and rescue, will be augmented by ordinary community members.
- Restorative Capacity: Low. Local governments, which may be overwhelmed with impacts from the earthquake, are responsible for repair of the sector. Repair will be slowed because local governments will be unable to offer mutual aid and will have to compete for resources.
Postal and Shipping:
- Absorptive Capacity: Medium. Shipping facilities tend to exhibit redundancy. However, the earthquake may damage an important shipping super hub, thus having a large initial systemic impact.
- Adaptive Capacity: High. Shippers have contingency plans for moving goods in the event of disasters. Shipping routes are flexible and can be rerouted around damaged areas.
- Restorative Capacity: High. Shippers are large businesses with financial resources. They have both the contingency plans and the means for recovering quickly to their initial state.
Fig. 13 An example resilience assessment (NISAC 2009)
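The resilience matrix lends itself to a simple data structure. The sketch below stores the Figure 13 ratings in a Python dictionary and, as an assumption not made in the chapter, treats L < M < H as an ordinal scale so that the weakest capacities of each system can be listed. The names `RATING_ORDER`, `CAPACITIES`, and `weakest_capacities` are hypothetical.

```python
# Ordinal encoding of the qualitative ratings (an illustrative assumption)
RATING_ORDER = {"L": 0, "M": 1, "H": 2}
CAPACITIES = ("absorptive", "adaptive", "restorative")

# Ratings copied from the Figure 13 matrix
resilience_matrix = {
    "Emergency Services": {"absorptive": "L", "adaptive": "M", "restorative": "L"},
    "Postal and Shipping": {"absorptive": "M", "adaptive": "H", "restorative": "H"},
}

def weakest_capacities(matrix, system):
    """Return the capacities that share the lowest rating for one system."""
    ratings = matrix[system]
    lowest = min(RATING_ORDER[r] for r in ratings.values())
    return [c for c in CAPACITIES if RATING_ORDER[ratings[c]] == lowest]

for name in resilience_matrix:
    print(name, "->", weakest_capacities(resilience_matrix, name))
```

A listing like this only points to where resilience enhancement features might be considered first; it does not replace the SME explanations that accompany the matrix.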
5 Summary and Future Work
5.1 Summary
Resilience is a concept that has been considered for decades in many different disciplines. Consequently, numerous resilience definitions and evaluation approaches have been developed, many of them being domain-specific and, thus, not broadly applicable. The federal government’s CIP policy shift towards CIR has necessitated the development of a uniform framework for conducting CIR analysis. Furthermore, because recovery is a fundamental aspect of system resilience, the uniform framework should explicitly consider the costs associated with infrastructure recovery. With these two considerations in mind, the authors have developed a new resilience assessment framework for infrastructure and economic systems.
The framework consists of three components: a definition of system resilience, a quantitative methodology for measuring resilience costs, and a qualitative analysis approach for evaluating resilience-affecting system characteristics. Our definition of system resilience identifies key issues that must be considered when evaluating resilience. These issues include the dependence of measurements on particular disruptive events, the evaluation of system impacts, and the utilization of resources during recovery processes. The quantitative measurement methodology follows directly from our definition of system resilience and requires the quantification of systemic impacts; i.e., deviations from targeted system performance levels, and the costs associated with the total recovery effort. Using these two components allows analysts to compare the relative benefits of alternative resilience policies that specifically make tradeoffs between reductions in level and length of loss of system performance and length and level of cost of recovery. The last component of our resilience assessment framework is a qualitative analysis that can inform or replace quantitative results.
This process evaluates the absorptive, adaptive, and restorative capacities and considers system features that can enhance the resilience of the system. Section 4 contains an illustrative example of how the qualitative analysis can be performed for a hypothetical earthquake scenario. The development of CIR analysis and this framework has several advantages:
• In the context of CIP, it is impossible to secure all critical infrastructures against all threats. Rather than eliminating all threats or hardening all assets, it may be desirable to bolster resilience in critical infrastructure systems so that the functions of those systems can be maintained during and after disruptions, both natural and manmade. The resilience framework provides a way of analyzing systems based upon their overall function.
• The authors’ definition of resilience is flexible. Systems from a number of human and non-human domains can be analyzed within the same framework. Different aspects of the same system (i.e., system performance metrics) can be examined and compared.
• Decisions made following a disruptive event are fundamentally economic, involving the tradeoff between the system performance and the cost (not necessarily measured in dollars) of the chosen recovery effort. The inclusion of the recovery effort in the resilience analysis ensures that the recovery effort is considered both by analysts and by decision-makers.
• Resilience capacity analysis provides a useful framework for assessing why a system behaves as it does. Resilience is affected by a combination of the three capacities discussed in Section 3.3. The identification of resilience enhancement features and the mapping of the features to resilience capacities provide a more intimate knowledge of the structure of the system and increased confidence in forecasts of consequences from shocks to the system.
• Ultimately, agencies like DHS are interested in how to make a system more resilient. This resilience framework can help answer this question and provide analysis of the tradeoffs between the costs of building resilience and increases in resilience.
5.2 Future Work
This chapter documents the initial effort performed to develop a resilience assessment framework for infrastructures and economic systems. We are currently engaged in efforts to further expand upon this framework and develop additional CIR analysis methods. Specifically, three areas of work are currently being performed or investigated. They are:
1. Application of optimal control theory and methods to the resilience measurement methodology;
2. Refinement of the resilience capacity analysis approach; and
3. Development of resilient infrastructure design processes.
We have already begun investigating the application of optimal control techniques to the resilience measurement approach and gained initial insights. The following section illustrates approaches and key research questions. The other two efforts remain in conceptual stages, so only brief discussions about that work are presented.
5.2.1 Application of Optimal Control Methods to the Resilience Measurement Methodology
As discussed previously, the quantitative resilience methodology is based on two key measurable components: systemic impact and total recovery effort. These considerations lend themselves nicely to mathematical formulations used for the development of optimal feedback control laws. When applied to a system, feedback controllers use measured system outputs to regulate system behaviors to target conditions while simultaneously providing a measure of the cost to do so. Feedback control has been successfully used in a wide variety of settings and
applications, from simple household temperature thermostats to more complicated applications such as the mitigation of aero-acoustic noise in supersonic jets. Incorporating feedback control in the quantitative description of resilience enables automatic system recovery from disruption and provides predictions of recovery cost. In the context of feedback control design, numerous formulations are currently available that allow for systematic development of optimal control laws. A particular formulation used for the design of optimal feedback control laws is that of the linear quadratic regulator (LQR). Controls developed with this formulation are able to drive system outputs to target values and have built-in robustness to system disturbances. They are developed by use of a tracking LQR cost function:
\[
\min_{u} \int_{0}^{\infty} \left\{ (x - x_{\text{target}})^{T} Q \, (x - x_{\text{target}}) + u^{T} R \, u \right\} dt \qquad (8)
\]
In the LQR formulation above, x is a vector of system outputs, and x_target is a vector of target operating conditions for those outputs. The goal of the tracking LQR control problem is to minimize the error between x and x_target subject to the spatial and temporal dynamics of the system. The quantity Q is a matrix of weights, allowing for more emphasis on particularly important system outputs in the control problem. The quantity u is a vector of inputs to the system that are used to drive x to x_target. These inputs are the controls and compensate automatically to satisfy a particular tracking objective. R is a matrix of weights on the controls. These weights are used to keep input magnitudes within reasonable bounds. By carefully constructing Q and R, system behaviors can be driven to their target operating conditions while keeping the cost associated with the requisite input reasonable. If impacted by a disturbance, the feedback controllers automatically detect changes in the system due to the disturbance. The inputs u adjust accordingly to keep the outputs x close to the target values x_target. Advantages gained by incorporating feedback control methods in the quantitative resilience methodology are now illustrated in a chemical supply chain example. A schematic illustrating the supply chain is shown in Figure 14.
Fig. 14 A control theory-based schematic of a chemical supply chain
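As an aside before the supply chain example, the tracking LQR problem in Equation (8) can be sketched numerically for a small system. The sketch below assumes a double-integrator plant (a performance level and its rate of change, driven by a single recovery input), identity weights Q, a unit weight R, and SciPy's `solve_continuous_are` to solve the Riccati equation. None of these choices come from the chapter, and the names `A`, `B`, `K`, and `simulate` are hypothetical. Because the assumed target is an equilibrium of the plant, the feedback law u = -K (x - x_target) acts directly on the tracking error.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed plant (not from the chapter): dx/dt = A x + B u
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)            # weights on output errors
R = np.array([[1.0]])    # weight on control effort

# Solve the continuous-time algebraic Riccati equation, then K = R^-1 B^T P
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

x_target = np.array([1.0, 0.0])   # desired operating condition

def simulate(x0, dt=1e-3, t_end=20.0):
    """Forward-Euler simulation of the closed loop; returns final state and cost."""
    x = np.array(x0, dtype=float)
    cost = 0.0
    for _ in range(int(t_end / dt)):
        e = x - x_target                       # tracking error
        u = -K @ e                             # feedback control input
        cost += dt * float(e @ Q @ e + u @ R @ u)
        x = x + dt * (A @ x + B @ u)
    return x, cost

x_final, cost = simulate([0.0, 0.0])
print("gain K:", K)
print("final state:", x_final, "accumulated cost:", cost)
```

For this particular plant and weighting the Riccati equation can be solved by hand, giving K = [1, √3] and an optimal cost of √3 for an initial error of [-1, 0], which the simulated cost should approximate. Increasing R relative to Q would trade slower recovery for less control effort, mirroring the tradeoff between systemic impact and total recovery effort.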
In the example shown in Figure 14, the supply chain is composed of five chemicals V, W, F, G, and H, with their quantities denoted by CV, CW, CF, CG, and CH, respectively. The quantities vary with time and space, and their evolution is
described by a set of five coupled partial differential equations over the spatial interval [0, L]. In this configuration, the normal mode of operation is for species W to dissociate into the three daughter products F, G, and H. The target operating condition is for CG to maintain a constant value at the end of the spatial domain. When the supply of chemical W is disrupted, chemical V can be added to the supply chain and processed, with the resulting daughter product being chemical W. The addition of chemical V can be considered as a means of recovering the chemical supply chain. Under normal operating conditions, targeted system performance levels are attained by specifying a nominal value of CW(t, x) at the beginning of the spatial domain, x = 0. This nominal value results in a concentration profile of species W along the interval [0, L] that yields the desired concentration of daughter product G at x = L. Chemical species V has species W as a daughter product. As a result, the concentration of species V at x = 0 is used as a control input to the supply chain, allowing for production of species W if there is a disruption in its supply. The behavior of the supply chain without disruption, with a nominal value of CW(t, 0) = 1 and a target value of CG(t, L) = 0.0372 with L = 40, is shown in Figure 15 and Figure 16.
Fig. 15 Parent concentrations under nominal conditions
Fig. 16 Daughter concentrations under nominal conditions
In Figure 16, the desired concentration of species G is denoted by the asterisks at x = 40 in the plot of CG(t,x). As is evident, the target value is maintained well under the undisturbed nominal operating condition. In this condition, the emergency supply V is not required for the target performance level to be met, as seen in Figure 16. To illustrate the impacts of disruption on the ability of the default supply chain (without feedback control) to maintain its target output, the nominal configuration is now severely disrupted. At t = 5, species W undergoes a catastrophic failure, with its available supply at x = 0 being completely eliminated. The supply of species W does not begin to recover until t = 75, when it is gradually brought back online to its nominal value. The behavior of the disturbed supply chain with no control is shown in Figure 17 and Figure 18.
A Framework for Assessing the Resilience of Infrastructure and Economic Systems
Fig. 17 Parent concentrations under disturbed conditions and no control
Fig. 18 Daughter concentrations under disturbed conditions and no control
As can be seen in Figure 18, the severe disruption in the supply of species W causes the concentration of daughter species G to severely depart from its target value at x = 40. The severe impact of the disruption on the supply chain is more clearly seen in Figure 19. In that figure, the dashed curve denotes the desired value of CG(t,40). The solid curve denotes the value attained for CG(t,40) for the disrupted supply chain. As shown in the figure, the disruption causes CG(t,40) to depart significantly from its target value.
Fig. 19 Deviation from the target condition under disrupted conditions and no control
The advantage of including optimal control in the quantitative definition of resilience is now apparent. A feedback controller is developed using the LQR formulation in Equation (1) with the aim of holding CG(t,40) at its target value. The control input to the system is the concentration of species V at x = 0. A departure of CW(t,0) from its nominal value causes the feedback control to automatically adjust the concentration of species V to compensate for the disruption. The behavior of the disturbed chemical supply chain with incorporated feedback control is shown in Figure 20 and Figure 21.
Fig. 20 Parent concentration under disturbed conditions with feedback control
E.D. Vugrin et al.
Fig. 21 Daughter concentration under disturbed conditions with feedback control
As seen in Figure 20, species V turns on after failure to force the necessary profile in its daughter chemical, species W. This allows for the target value of CG(t,40) to be maintained, even during the severe disruption to the nominal operating condition. The recovery of species W after t = 75 allows species V to vanish from the system, eventually returning to a value of zero. The ability of the feedback control to maintain the target condition, even during a severe disruption, is more clearly seen in Figure 22. As is evident by comparing the results of Figure 19 and Figure 22, incorporating feedback control allows for much better performance during disruption, and increases the resilience of the chemical supply chain.
Fig. 22 Deviation from the target conditions under disturbed conditions with feedback control
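The benefit of feedback can be illustrated on a far smaller problem than the five-equation supply chain. The Python sketch below is an illustration only and is not the chapter's model: it assumes a hypothetical scalar system dx/dt = a·x + b·u + d(t), where x is the deviation of the output from its target, u is the control input, and d(t) is a disruption active between t = 5 and t = 75. The gain is the standard infinite-horizon LQR gain for a scalar system, obtained from the scalar algebraic Riccati equation. All numerical values and function names are assumptions chosen for illustration.

```python
import math

def scalar_lqr_gain(a, b, q, r):
    """Gain k for u = -k*x minimizing the integral of q*x**2 + r*u**2
    subject to dx/dt = a*x + b*u (scalar algebraic Riccati equation)."""
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r

def simulate(k, a=-0.1, b=1.0, t_end=100.0, dt=0.01):
    """Forward-Euler simulation; returns the largest |x| reached."""
    x, worst = 0.0, 0.0
    steps = int(round(t_end / dt))
    for n in range(steps):
        t = n * dt
        d = -1.0 if 5.0 <= t < 75.0 else 0.0  # hypothetical disruption
        u = -k * x                            # state feedback
        x += dt * (a * x + b * u + d)
        worst = max(worst, abs(x))
    return worst

k = scalar_lqr_gain(a=-0.1, b=1.0, q=100.0, r=1.0)
open_loop = simulate(k=0.0)    # no control
closed_loop = simulate(k=k)    # with LQR feedback
print(f"gain {k:.3f}: worst deviation {open_loop:.3f} without control, "
      f"{closed_loop:.3f} with control")
```

In this toy setting the feedback gain keeps the deviation much smaller during the disruption than the uncontrolled response does, which mirrors qualitatively the comparison between Figure 19 and Figure 22.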
Several circumstances often arise that make the determination of optimal feedback controllers difficult. In most cases, optimal control theory provides feedback control formulations that are suitable for linear systems. The utility of
these methods for systems that are highly nonlinear is questionable. As a result, the development of control methods suitable for nonlinear systems is currently a very active area of research. Another potential hurdle is that the calculation of optimal controllers is computationally intensive. For very large systems, the numerical calculations required to determine the optimal control quickly become intractable, and order reduction methods must be used to reduce the system dimension before an optimal controller can be calculated. The development of system order reduction strategies that lend themselves to typical control formulations is also an ongoing research avenue.

5.2.2 Refinement of Resilience Capacity Analysis

The qualitative analysis of a system’s resilience capacity is the most complex aspect of the resilience assessment framework, so this aspect of the framework could be further refined. This could be done through analysis of a broad spectrum of infrastructure systems that are highly resilient (or in contrast, not resilient) to a particular type of disruption; or through assessment of a particular system that is resilient to an array of threats. These analyses could provide further insights into characteristics and resilience enhancement features that are common to “highly resilient” systems. These types of analyses could even lead to a quantitative methodology for evaluating absorptive, adaptive, and restorative capacities.

5.2.3 Resilient Infrastructure Design

Our framework provides a means for evaluating the resilience of an existing or proposed infrastructure system, but answering the inverse problem could be even more useful. That is, given a design requirement that a system maintain a certain level of resilience, how does one design a system to meet that requirement? Our framework can provide insight into that question (e.g., how does one quantitatively determine if the design standard is met?). However, the development of a formal process for resilient infrastructure design should be further investigated.
Acknowledgements The authors would like to thank all of the Sandia members of the NISAC Fast Analysis and Simulation Team (FAST) that supported the New Madrid Earthquake resilience assessment. The authors would also like to thank Theresa Brown for her critical evaluations as we developed the resilience assessment framework. P. Sue Downes’s enthusiastic support for the development of the framework, especially the quantitative methodology, is greatly appreciated. Finally, the authors are indebted to Judy Jones for her editing skills and, just as importantly, her patience. The authors performed this work with funding from the DHS Office of Infrastructure Protection, the DHS Science and Technology Directorate, and Sandia National Laboratories’ Laboratory Directed Research and Development (LDRD) program.
Sandia is a multiprogram laboratory operated by Sandia Corporation for the Department of Energy’s National Nuclear Security Administration under Contract DE-AC04-94AL85000.
References

Adger, W.N.: Social and Ecological Resilience: Are They Related? Progress in Human Geography 24(3), 347–364 (2000)
Allenby, B., Fink, J.: Toward Inherently Secure and Resilient Societies. Science 309(5737), 1034–1036 (2005)
Ball, R.E.: The Fundamentals of Aircraft Combat Survivability Analysis and Design, 2nd edn. AIAA, Reston (2003)
Bruneau, M., Chang, S.E., Eguchi, R.T., Lee, G.C., O’Rourke, T.D., Reinhorn, A.M., Shinozuka, M., Tierney, K., Wallace, W.A., von Winterfeldt, D.: A Framework to Quantitatively Assess and Enhance the Seismic Resilience of Communities. Earthquake Spectra 19(4), 737–738 (2003)
Carpenter, S., Walker, B., Anderies, J.M., Abel, N.: From Metaphor to Measurement: Resilience of What to What? Ecosystems 4(8), 765–781 (2001)
The Centre of Excellence for National Security (CENS), (Un)problematic Multiculturalism and Social Resilience, Report of a Conference Organized by The Centre of Excellence for National Security at the S. Rajaratnam School of International Studies, Singapore (February 21, 2008), http://www.rsis.edu.sg/cens/publications/conference_reports/Unproblematic%20Multiculturalism.pdf
Chang, S.E., Shinozuka, M.: Measuring Improvements in the Disaster Resilience of Communities. Earthquake Spectra 20(3), 739–755 (2004)
Chen, M.: Do-it-Yourself Disaster Relief Snubs New Orleans Planners. The New Standard (February 2, 2006), http://newstandardnews.net/content/index.cfm/items/2753
Comfort, L.K.: Shared Risk: Complex Systems in Seismic Response. Pergamon, New York (1999)
Fiksel, J.: Designing Resilient, Sustainable Systems. Environmental Science and Technology 37(23), 5330–5339 (2003)
Gunderson, L.H., Holling, C.S., Pritchard Jr., L., Peterson, G.D.: Resilience of Large-Scale Resource Systems. In: Gunderson, L.H., Pritchard Jr., L. (eds.) Resilience and the Behavior of Large-Scale Systems. Island Press, Washington (2002)
Hills, A.: Revisiting Institutional Resilience as a Tool in Crisis Management. Journal of Contingencies and Crisis Management 8(2), 109–118 (2000)
Holling, C.S.: Resilience and Stability of Ecological Systems. Annual Review of Ecology and Systematics 4, 1–23 (1973)
Homeland Security Advisory Council, 2006, Report of the Critical Infrastructure Task Force (January 2006), http://www.dhs.gov/xlibrary/assets/HSAC_CITF_Report_v2.pdf
Horne III, J.F., Orr, J.E.: Assessing Behaviors that Create Resilient Organizations. Employment Relations Today 24(4), 29–39 (1998)
Isserman, A.M., Feser, E., Warren, D.: Why Some Rural Places Prosper and Others Do Not. International Regional Science Review 32(3), 300–342 (2009)
Lee, E.B., Markus, L.: Foundations of Optimal Control Theory. Robert E. Krieger Publishing, Malabar (1986)
Mileti, D.S.: Disasters by Design: A Reassessment of Natural Hazards in the United States. Joseph Henry Press, Washington (1999)
NISAC (National Infrastructure Simulation and Analysis Center), 2008, Hurricane Ike L-48 Impact Analysis, based on NOAA Advisory 37, National Infrastructure Simulation and Analysis Center (NISAC) report to the U.S. Department of Homeland Security (September 11, 2008) (Unclassified, Official Use Only)
NISAC (National Infrastructure Simulation and Analysis Center), 2009, Scenario Analysis of New Madrid Seismic Zone (NMSZ) Earthquake Impacts to Infrastructures and the Economy, National Infrastructure Simulation and Analysis Center (NISAC) report to the U.S. Department of Homeland Security (February 2009) (Unclassified, Official Use Only)
Obama, B.: Presidential Proclamation for National Preparedness Month. The White House (September 4, 2009), http://www.whitehouse.gov/the_press_office/Presidential-Proclamation-National-Preparedness-Month/ (accessed September 28, 2009)
Presidential Decision Directive/NSC-63, subject: Critical Infrastructure Protection. The White House (May 22, 1998), http://www.fas.org/irp/offdocs/pdd/pdd-63.htm
Resilience Alliance, Assessing and Managing Resilience in Social-Ecological Systems: A Practitioners Workbook, Version 1.0 (2007a) (June 2007), http://www.resalliance.org/3871.php
Resilience Alliance, Assessing Resilience in Social-Ecological Systems: A Workbook for Scientists, Version 1.1 (2007b) (June 2007), http://www.resalliance.org/3871.php
Rose, A.: Economic Resilience to Natural and Man-Made Disasters; Multidisciplinary Origins and Contextual Dimensions. Environmental Hazards 7(4), 383–398 (2007)
Rose, A., Liao, S.-Y.: Modeling Regional Economic Resilience to Disasters: A Computable General Equilibrium Analysis of Water Service Disruptions. Journal of Regional Science 45(1), 75–112 (2005)
Sheffi, Y., Rice Jr., J.B.: A Supply Chain View of the Resilient Enterprise. MIT Sloan Management Review, 41–48 (Fall 2005)
Tierney, K., Bruneau, M.: Conceptualizing and Measuring Resilience: A Key to Disaster Loss Reduction. TR News, 14–17 (May 2007)
U.S. Department of Defense, Glossary (2004), http://at.dod.mil/docs/glossary.pdf
U.S. Department of the Army, Survivability of Army Personnel and Materiel, Army Regulation 70-75, Washington, D.C. (2006)
U.S. Department of Homeland Security, National Infrastructure Protection Plan (2006), http://www.dhs.gov/files/programs/editorial_0827.shtm
U.S. Department of Homeland Security, National Infrastructure Protection Plan: Partnering to Enhance Protection and Resiliency (2009), http://www.dhs.gov/xlibrary/assets/NIPP_Plan.pdf
U.S. Department of Homeland Security Risk Steering Committee. DHS Risk Lexicon (2008), http://www.dhs.gov/xlibrary/assets/dhs_risk_lexicon.pdf
Walker, B., Holling, C.S., Carpenter, S.R., Kinzig, A.: Resilience, Adaptability and Transformability in Social-ecological Systems. Ecology and Society 9(2), 5 (2004), http://www.ecologyandsociety.org/vol9/iss2/art5
Wildavsky, A.: Searching for Safety. Transaction Publishers, New Jersey (1991)
Regional Infrastructure Investment Allocation for Sustainability

Terry L. Friesz, Sung H. Chung, and Robert D. Weaver
Abstract. In this chapter, we present a computable theory of multi-region optimal economic growth that may be used to determine tax policy and public infrastructure investment plans that facilitate financial, labor force, and environmental sustainability. A numerical example is presented to illustrate use of our theory.
Terry L. Friesz · Sung H. Chung
Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, PA 16801, USA
e-mail: [email protected], [email protected]

Robert D. Weaver
Department of Agricultural Economics and Rural Sociology, The Pennsylvania State University, University Park, PA 16801, USA
e-mail: [email protected]

K. Gopalakrishnan & S. Peeta (Eds.): Sustainable & Resilient Critical Infrastructure Sys., pp. 117–137. © Springer-Verlag Berlin Heidelberg 2010, springerlink.com

1 Introduction

This chapter shows how aspatial optimal economic growth theory may be extended to study optimal growth of interdependent regions in a national economy, and specifically considers how the allocation of resources to fund infrastructure capital projects can affect population distribution across regions. In particular, the chapter presents an optimal control theory that can be interpreted as a guide to the collection and distribution of tax revenues in support of infrastructure investments. In addition, we modify this theory with sustainability constraints that consider thresholds for regional population and pollution. Specifically, our model uses the regional allocation of national tax revenues to private and public investment as controls for the optimization of national product. Our theoretical model incorporates the following features:
1) private and public capital, the latter allowing public infrastructure investment decisions to be modeled, are distinguished from one another;
2) both public and private capital markets are assumed to be in instantaneous equilibrium;
3) growth dynamics involve no constant returns-to-scale assumption and allow increasing returns;
4) population migration follows a Hotelling-type diffusion model where migration fills location-specific ecological carrying capacities;
5) capital-augmenting technological change is incorporated;
6) fiscal balance is required; and
7) the optimization criterion is the present value of the stream of national income over a finite time horizon.

The model we propose is partly based on the spatial disaggregation of macroeconomic identities relating the rate of change of capital stocks to investments and depreciation. This perspective on disaggregation to create coupled differential equations describing regional growth can be traced back to Datta-Chaudhuri (1967), Sakashita (1967), Ohtsuki (1971), Domazlicky (1977), Bagchi (1984) and Friesz and Luque (1987), although the assumptions we make regarding production functions and technological change draw on those in past literature. Our model deviates from past regional growth models by stepping away from the assumption of a constant proportionate rate of labor force growth for each region. Such a constant proportionate growth (CPG) specification of labor or population is convenient as it allows their dynamics to be uncoupled from that of capital formation and technological change. However, the specification implies that population grows exponentially with respect to time and shows no response to changes in population density, capital or regional income. As an alternative, we replace the unrealistic CPG model of population growth with a Hotelling-type model that includes the effects of spatial diffusion and of the ecological carrying capacities of individual regions, and is intrinsically coupled to the dynamics of capital formation.
2 The Dynamics of Capital Formation

Basic macroeconomic identities can be used to describe the relationship of the rate of change of capital to investment, output and savings. In the simplest framework, output is a function of capital and labor, and the rate of change of capital is equated to investment less depreciation. That is:

$$\frac{dK}{dt} = I - \delta K \tag{1}$$

where $K$ is capital, $dK/dt$ is the time rate of change of capital, $I$ is investment, and $\delta$ is an exogenous depreciation rate. To differentiate public and private capital dynamics, we define $\delta^p$ as the depreciation rate of private capital and $\delta^g$ as the depreciation rate of public capital. As specified, (1) is an aspatial model. It is also important to recognize that (1) is an equilibrium model in which the supply of capital (the left-hand side plus depreciation) is exactly balanced against the demand for capital (investment, on the right-hand side). We shall maintain this
assumption of capital market equilibrium throughout the development that follows. It is also worth noting that (1) is the foundation of the much respected work by Arrow and Kurz (1970) that has explored the interdependence of aspatial private and public sector growth dynamics. To add spatial dimensions to (1) as well as to introduce a distinction between the public and private sectors we note:

$$\frac{dK_i^p}{dt} = I_i^p - \delta^p K_i^p \quad \text{and} \quad \frac{dK_i^g}{dt} = I_i^g - \delta^g K_i^g \tag{2}$$

where the subscript $i \in [1, N]$ refers to the $i$th of $N$ regions and the superscripts $p$ and $g$ refer to the private and public (governmental) sectors, respectively.
Further detail can be introduced to the above dynamics by defining $c_i$ to be the consumption rate of region $i$ and $r$ to be a uniform tax rate imposed by the central government on each region's output. We also define $\eta_i$ to be the share of tax revenues allocated to subsidize private investments in region $i$, and $\nu_i$ to be the share of tax revenues allocated to public (infrastructure) investments in region $i$. We define $Y_i$ to be the output of region $i$. To keep the presentation simple, we assume that all capital (private as well as public) is immobile between regions, although this assumption can be relaxed at the expense of more complicated notation. Consequently, the following two identities hold:

$$I_i^p = (1 - c_i - r)\, Y_i + \eta_i\, r \sum_{j=1}^{N} Y_j \quad \text{and} \quad I_i^g = \nu_i\, r \sum_{j=1}^{N} Y_j \tag{3}$$
We assume the government budget must always balance (surpluses and deficits must be zero), implying all tax revenue must be allocated as stated in the following constraints:

$$\sum_{i=1}^{N} (\eta_i + \nu_i) = 1 \tag{4}$$

$$0 \leq \eta_i \leq 1, \quad 0 \leq \nu_i \leq 1 \tag{5}$$

The latter constraints require allocations to be nonnegative. (Other tax schemes, such as own-region taxes, can easily be described. The one chosen here is meant to be illustrative.) We further assume that the $i$th region's production technology is described by a general production function that allows for differential productivity of public and private capital:

$$Y_i = F_i(K_i^p, K_i^g, L_i) \tag{6}$$
where $L_i$ is the labor force (population) of the $i$th region, and we assume $F_i$ is twice continuously differentiable and concave. By substitution using (2), (3) and (6), we re-write the dynamics for the evolution of private and public sector capital as:

$$\frac{dK_i^p}{dt} = (1 - c_i - r)\, F_i(K_i^p, K_i^g, L_i) + \eta_i\, r \sum_{j=1}^{N} F_j(K_j^p, K_j^g, L_j) - \delta^p K_i^p \quad \forall i \in [1, N] \tag{7}$$

$$\frac{dK_i^g}{dt} = \nu_i\, r \sum_{j=1}^{N} F_j(K_j^p, K_j^g, L_j) - \delta^g K_i^g \quad \forall i \in [1, N] \tag{8}$$
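To make the structure of (7) and (8) concrete, the following Python sketch advances the two capital stocks by forward-Euler steps for two hypothetical regions. It is a minimal illustration under stated assumptions, not the chapter's solution method: labor is held fixed, the production function is taken to be of the Cobb-Douglas form used in the chapter's numerical example, the shares and tax rate are fixed rather than optimized, and all numerical values and function names are made up for illustration.

```python
def output(A, Kp, Kg, L, alpha=0.5):
    """Assumed Cobb-Douglas output A * (Kp * Kg)**(1 - alpha) * L**alpha."""
    return A * (Kp * Kg) ** (1.0 - alpha) * L ** alpha

def step(Kp, Kg, L, A, c, eta, nu, r, delta_p, delta_g, dt):
    """One forward-Euler step of equations (7) and (8) for all regions."""
    n = len(Kp)
    Y = [output(A[i], Kp[i], Kg[i], L[i]) for i in range(n)]
    revenue = r * sum(Y)  # national tax revenue r * sum_j Y_j
    new_Kp = [Kp[i] + dt * ((1 - c[i] - r) * Y[i] + eta[i] * revenue
                            - delta_p * Kp[i]) for i in range(n)]
    new_Kg = [Kg[i] + dt * (nu[i] * revenue - delta_g * Kg[i])
              for i in range(n)]
    return new_Kp, new_Kg

# Hypothetical two-region data
A, c, L = [0.10, 0.12], [0.10, 0.20], [10.0, 10.0]
eta, nu = [0.2, 0.2], [0.3, 0.3]
r = 0.1
assert abs(sum(eta) + sum(nu) - 1.0) < 1e-12  # fiscal balance, constraint (4)

Kp, Kg = [10.0, 10.0], [11.0, 11.0]
for _ in range(100):
    Kp, Kg = step(Kp, Kg, L, A, c, eta, nu, r,
                  delta_p=0.005, delta_g=0.01, dt=0.1)
print(Kp, Kg)
```

Because every region's investment depends on the national revenue term, the regional equations are coupled even though capital itself is immobile between regions.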
3 Population Dynamics

Traditionally, the literature on neoclassical economic growth has assumed constant proportionate labor force growth rates. To simplify notation, we assume full labor force participation, so that $L_i = P_i$. In that tradition, labor force growth is specified as:

$$\frac{dL_i}{dt} = \pi_i L_i, \qquad L_i(0) = L_i^0 \tag{9}$$

for each region $i \in [1, N]$, where $\pi_i$ is a constant, or equivalently population (labor force) grows according to the exponential law $L_i(t) = L_i^0 e^{\pi_i t}$. While (9) is appropriate for an economy isolated from immigration or emigration, it must be replaced if spatial dynamics are to be considered. Of particular interest in a spatial context is to endogenize inter-regional population flows. To do so, we introduce a spatial diffusion model of the Hotelling type (see [Hotelling78]). In Hotelling-type population models, migration is based on diffusion that responds to density differences, wherein the population seeks spatial niches that offer relatively less density than origin locations. While this type of diffusion process endogenizes diffusion, it does not offer a behavioral theory of diffusion. However, in contrast to (9), population density will not continuously increase with time at a location. Instead, as density at one location exceeds that at other locations, migration to the lower density locations will establish an equilibrium in the distribution of density. We further generalize the Hotelling diffusion model by specifying the diffusion process to be conditioned by the capital formation process. Thus, infrastructure investment is introduced as a control on the diffusion of population. We specify a spatial diffusion process that includes a logistic model of birth/death processes conditioned on the ecological carrying capacity of each location alternative. With this specification, labor and population dynamics are linked to the capital dynamics processes (7) and (8), implying spatially allocated capital investment becomes a control of population diffusion processes. Hotelling's original model is in the form of a partial differential equation which is very difficult to solve for realistic spatial boundary conditions and is not readily coupled with ordinary differential equations like (7) and (8). However, Puu (1989) and Puu (1997) have suggested a multiple region alternative to Hotelling's model which captures key features of the diffusion process and the birth/death process in a more tractable mathematical framework. Specifically, noting the population of region $i$ as $P_i$, Puu (1989) proposes the following population diffusion dynamics:

$$\frac{dP_i}{dt} = \gamma_i P_i (\zeta_i - P_i) + \sum_{j \neq i} \lambda_j (P_j - P_i) \quad \forall i \in [1, N] \tag{10}$$
where $\gamma_i$, $\zeta_i$ and $\lambda_i$ are specified as positive, exogenous parameters. The last term, $\sum_{j \neq i} \lambda_j (P_j - P_i)$, specifies a diffusion process conditioned by relative density such that the region draws population from regions with higher population density. $\lambda_j$ is referred to as the coefficient of diffusion for region $j \in [1, N]$ and expresses the relative sensitivity of region $j$'s population to a density differential relative to region $i$. In the first term, the entity $\zeta_i$ is called the fitness measure and describes the ecological carrying capacity of region $i \in [1, N]$; its units are population. Importantly, as the difference $(\zeta_i - P_i)$ is not constrained in (10), we cannot interpret $\zeta_i$ as a threshold of survival. However, within the context of sustainability, the presence of this difference defines an adaptive process in which a soft threshold $\zeta_i$ plays a role. The parameter $\gamma_i$ ensures dimensional consistency and has the units of $(\text{time})^{-1}$. Thus, the first term indicates population growth in region $i$ is driven by its current state relative to its carrying capacity. Clearly this model is not equivalent to Hotelling's, but it does reflect the essential ideas behind diffusion-based population growth and migration and is substantially more tractable from a computational point of view since (10) is a system of ordinary (as opposed to partial) differential equations. To consider the potential role of infrastructure as a control in population diffusion, we now further elaborate the Puu (1989) diffusion process by specifying the fitness measure $\zeta_i$ as endogenous and determined by the state of public capital, i.e. we specify

$$\zeta_i = V_i(K_i^g, t) + \Psi_i(t) \tag{11}$$

where $V_i(K_i^g, t)$ defines carrying capacity as conditioned by the region's public infrastructure and time, and $\Psi_i(t)$ specifies the time dependent, natural or ambient carrying capacity. As an extension of (10), (11) expresses the often made
observation that each region has a natural capacity to support a specific population level, and that level may vary with time and be conditioned by man-made infrastructure. Puu (1989) observes that population models such as the one presented above have one notable shortcoming: for certain initial conditions, population trajectories may include periods of negative population. Negative population is of course meaningless, and population trajectories with this property cannot be accepted as realistic. Consequently, we must include in the final optimal control formulation a state space constraint that forces population to remain nonnegative. In addition, to explicitly consider population sustainability we introduce an absolute constraint that requires population to be above a threshold level, $\Lambda_i$. To be more specific, we assume that if the population of a certain region falls below a critical level, then the economy of that region may collapse. Thus, we add the following constraint:

$$P_i \geq \Lambda_i \quad \forall i \in [1, N] \tag{12}$$
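The behavior of the multi-region diffusion model (10) can be explored with a few lines of code. The Python sketch below is a minimal illustration under stated assumptions: it uses forward-Euler steps, holds the fitness measures fixed rather than tying them to public capital as in (11), and checks the threshold constraint (12) along the trajectory. The two-region data, the threshold value, and the function name are hypothetical.

```python
def population_step(P, gamma, zeta, lam, dt):
    """One forward-Euler step of the diffusion dynamics (10)."""
    n = len(P)
    new_P = []
    for i in range(n):
        # logistic birth/death term relative to carrying capacity
        logistic = gamma[i] * P[i] * (zeta[i] - P[i])
        # diffusion term: inflow from denser regions, outflow to sparser ones
        diffusion = sum(lam[j] * (P[j] - P[i]) for j in range(n) if j != i)
        new_P.append(P[i] + dt * (logistic + diffusion))
    return new_P

# Hypothetical two-region example with fixed carrying capacities
P = [10.0, 10.0]
gamma, zeta, lam = [0.01, 0.02], [100.0, 120.0], [0.01, 0.03]
threshold = 5.0  # hypothetical sustainability threshold, as in (12)

for _ in range(2000):
    P = population_step(P, gamma, zeta, lam, dt=0.01)
    assert all(p >= threshold for p in P)
print(P)
```

In this example each population settles close to, but not exactly at, its own carrying capacity, because the diffusion term continues to move population from the denser region to the sparser one.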
4 Technological Change

As specified, our model does not require balanced growth and does not assume existence of a long-run equilibrium. To consider technological change we allow for factor augmentation that can be biased. Here, as our interest is in public capital investment, we explore a uniform public capital augmenting process such that in each region $i$ an augmentation process $\Phi(t)$ maps physical units of public capital $K_i^g$ into augmented units $\Phi(t) K_i^g$, i.e.

$$\Phi(t) : K_i^g \Rightarrow \Phi(t)\, K_i^g \tag{13}$$

where $\Phi$ is a scalar, non-negative, monotonic function of time. To incorporate this specification in the production function, we redefine the public capital variable in the production function to be augmented public capital. We specify the public capital augmentation process as endogenous, with positive augmentation occurring when the national output/public-capital ratio falls below some threshold and with zero augmentation when the ratio exceeds that threshold. That is, the rate of technological progress $d\Phi/dt$ is defined:

$$\frac{d\Phi}{dt} > 0 \quad \text{if } \sigma \leq \Theta \tag{14}$$

$$\frac{d\Phi}{dt} = 0 \quad \text{if } \sigma > \Theta \tag{15}$$

where $\sigma$ is the national output/public-capital ratio and $\Theta \in \Re_+^1$ is a known reference threshold. It is of interest to note that the output/public-capital ratio is
dependent on several state variables including the levels of aggregate as well as regional public capital:

$$\sigma(K^p, K^g, L, \Phi) = \frac{\sum_{i=1}^{N} Y_i}{\sum_{i=1}^{N} K_i^g} = \frac{\sum_{i=1}^{N} F_i(K_i^p, \Phi K_i^g, L_i)}{\sum_{i=1}^{N} K_i^g} \tag{16}$$

where $K^p$, $K^g$ and $L$ are private capital, public capital and labor respectively. Moreover, $\sigma$ is implicitly time dependent since it is constructed from time varying entities. It follows that the rate of technological progress is

$$\frac{d\Phi}{dt} = \tau \left[ \Theta - \sigma(K^p, K^g, L, \Phi) \right]_+ \tag{17}$$

where $\tau \in \Re_+^1$ is an exogenous constant of proportionality and $[\,\cdot\,]_+$ is the nonnegative orthant projection operator with the property that $[Q]_+ = \max(0, Q)$ for an arbitrary argument $Q$.
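The switching behavior of the augmentation rate is easy to see numerically. The short Python sketch below evaluates the ratio in (16) and the right-hand side of (17) for made-up regional outputs and public capital stocks; the function names and all numbers are assumptions for illustration only.

```python
def output_capital_ratio(Y, Kg):
    """National output / public-capital ratio sigma, as in (16)."""
    return sum(Y) / sum(Kg)

def augmentation_rate(sigma, theta, tau):
    """Right-hand side of (17): tau * max(0, theta - sigma)."""
    return tau * max(0.0, theta - sigma)

# Hypothetical regional outputs and public capital stocks
Y, Kg = [3.0, 4.0], [10.0, 10.0]
sigma = output_capital_ratio(Y, Kg)  # 7 / 20 = 0.35

rate_below = augmentation_rate(sigma, theta=0.5, tau=0.1)  # sigma below threshold
rate_above = augmentation_rate(sigma, theta=0.2, tau=0.1)  # sigma above threshold
print(sigma, rate_below, rate_above)
```

The rate is positive only while the ratio sits below the reference threshold and is exactly zero otherwise, which is the behavior that the projection operator in (17) encodes.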
5 Pollution Dynamics

In this section, we consider pollution dynamics from the sustainability point of view. Effective management of pollution has always been of great interest to scholars. Early work on pollution management considered optimal pollution taxes from a societal perspective represented by a generalized welfare function. This perspective on the management of pollution by optimally imposing tax can be traced back to Pigou (1932), Buchanan (1969), Simpson (1995), or more recently, Canton (2008). Here, we do not consider imposing tax directly on the generation of pollution because we focus on the optimal allocation of tax revenues to infrastructure investments for regional economic growth. Instead, we impose regionally specific pollution constraints and show how the optimal allocation of tax revenues may be modified in the presence of the pollution constraint. We assume that pollution is generated as a direct result of production. We postulate that pollution production is positively proportional to output and may be mitigated by recycling in direct proportion to consumption. We define the change in the state $B_i(t)$ of pollution as:

$$\frac{dB_i}{dt} \equiv \xi_i Y_i - \kappa_i c_i = \xi_i F_i(K_i^p, K_i^g, P_i) - \kappa_i c_i, \qquad B_i(0) = B_i^0 \tag{18}$$

where $\xi_i$ and $\kappa_i$ are region-specific parameters. To consider sustainability, we suppose that regionally specific thresholds $S_i$ exist, derived from social and political processes exogenous to the economy. Thus, we assume regional pollution must be maintained below these thresholds, i.e.
$$B_i \leq S_i \quad \forall i \in [1, N] \tag{19}$$

Though we specify the limit $S_i$ as locally imposed, an alternative specification would be an aggregate, national limit set by regulation, e.g.

$$\sum_{i=1}^{N} S_i \leq S \tag{20}$$
6 Criterion Functional and Final Form of the Model

We next consider the control of the spatial distribution of population through the regional distribution of tax revenues subject to population and pollution sustainability constraints. As constructed, our model implies a key role for public capital investment, which we consider to be public infrastructure. Together, the controls directly affect the state of public infrastructure and indirectly affect the pollution levels and population diffusion. To proceed, we specify the objective of control to be the present value of the national income stream over the time interval $[t_0, t_f]$. Thus, the control problem is expressed as

$$\max J(K^p, K^g, P) = \sum_{i=1}^{N} \int_{t_0}^{t_f} \exp(-\rho t)\, F_i(K_i^p, K_i^g, P_i)\, dt \tag{21}$$

where $\rho > 0$ is the constant nominal rate of discount and $P$ is a vector of region-specific populations:

$$P = (P_i : i \in [1, N]) \tag{22}$$

and the maximization is carried out subject to the dynamic and other constraints previously developed. Hence, the final form of the model is

$$\max J(K^p, K^g, P) \tag{23}$$

subject to

$$\frac{dK_i^p}{dt} = (1 - c_i - r)\, F_i(K_i^p, \Phi K_i^g, P_i) + \eta_i\, r \sum_{j=1}^{N} F_j(K_j^p, \Phi K_j^g, P_j) - \delta^p K_i^p \quad \forall i \in [1, N] \tag{24}$$

$$\frac{dK_i^g}{dt} = \nu_i\, r \sum_{j=1}^{N} F_j(K_j^p, \Phi K_j^g, P_j) - \delta^g K_i^g \quad \forall i \in [1, N] \tag{25}$$

$$\frac{dP_i}{dt} = \gamma_i P_i (\zeta_i - P_i) + \sum_{j \neq i} \lambda_j (P_j - P_i) \quad \forall i \in [1, N] \tag{26}$$

$$\frac{d\Phi}{dt} = \tau \left[ \Theta - \sigma(K^p, K^g, P, \Phi) \right]_+ \tag{27}$$

$$\frac{dB_i}{dt} = \xi_i Y_i - \kappa_i c_i = \xi_i F_i(K_i^p, K_i^g, P_i) - \kappa_i c_i \quad \forall i \in [1, N] \tag{28}$$

$$\sum_{i=1}^{N} (\eta_i + \nu_i) = 1 \tag{29}$$

$$0 \leq \eta_i \leq 1 \quad \forall i \in [1, N] \tag{30}$$

$$0 \leq \nu_i \leq 1 \quad \forall i \in [1, N] \tag{31}$$

$$P_i \geq \Lambda_i \quad \forall i \in [1, N] \tag{32}$$

$$B_i \leq S_i \quad \forall i \in [1, N] \tag{33}$$
where the shares $\eta_i$ and $\nu_i$ for all $i \in [1, N]$, as well as the tax rate $r$, are the control variables and we do not repeat earlier definitional specifications. The state variables are of course $K_i^p$, $K_i^g$, $P_i$, and $B_i$ for all $i \in [1, N]$, as well as $\Phi$. Note that (32) and (33) are sustainability constraints.
7 Numerical Examples

To provide an illustration of the use of regional allocation of public expenditure on infrastructure as a control on regional population distribution, we examine a four-region model. We consider for comparison two cases, defined with and without sustainability constraints. We solve the corresponding mathematical program using a discrete time approximation with 100 equal time steps. In these illustrations we implement optimization of (23) through (33) using GAMS/MINOS. Our parameterization is presented in Table 1. Table 2 presents initial values for the state variables used in the examples.
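A discrete time approximation of this kind replaces the integral in (21) by a finite sum over the time steps. The Python sketch below shows one simple way this could be done, using a left Riemann sum; it is an assumption for illustration and not necessarily the discretization used in the GAMS/MINOS implementation. The function name, the constant income path, and the horizon [0, 10] are hypothetical.

```python
import math

def present_value(income_path, rho, t0, tf):
    """Left Riemann-sum approximation of the discounted integral in (21).

    income_path[n] is national income (the sum of regional outputs)
    at the start of time step n; steps are of equal length."""
    n_steps = len(income_path)
    dt = (tf - t0) / n_steps
    return sum(math.exp(-rho * (t0 + n * dt)) * income_path[n] * dt
               for n in range(n_steps))

# Constant income of 1 over [0, 10], 100 equal steps, rho = 0.05
approx = present_value([1.0] * 100, rho=0.05, t0=0.0, tf=10.0)
exact = (1.0 - math.exp(-0.05 * 10.0)) / 0.05  # closed form for constant income
print(approx, exact)
```

For a constant income path the closed form is available, so the sketch also gives a quick check of the discretization error introduced by 100 equal steps.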
7.1 The Case without Sustainability Constraints

We consider four regions as shown in Figure 1. We suppose that technological augmentation is constant and unitary, i.e. $\Phi(t) = 1$, to retain focus on the role of public infrastructure.
Fig. 1 Capital, Labor, and Tax Flows
Our functional specifications in our illustration include a Cobb-Douglas form
(
)
for the production functions: Yi = Fi K ip , ΦK ig , Pi = Ai ( K iP Φ (t ) K ig )1−α ( Pi )α where Ai is a productivity parameter and α is a coefficient. Paralleling the neoclassical growth literature, we specify that production exhibits constant returns-to-scale in labor and aggregate capital, and specify α as 0.5 and Ai for each region is shown in Table 1. Thus, we consider the marginal productivity of labor to vary across regions, as does the marginal productivity of public capital. We set the private and public capital depreciation rates δ p and δ g at 0.005 and 0.01, respectively, and the constant nominal rate of discount ρ at 0.05 .
With respect to our specification of the fitness measure, we specify $V_i(K_i^g, t)$ as linear in $K_i^g$ and $\Psi_i(t)$ as a constant:

$V_i\left(K_i^g, t\right) = w_i K_i^g$  (34)

$\Psi_i(t) = \psi_i$  (35)
The parameterization presented in Table 1 indicates that regions 2 and 4 have greater total factor productivity and greater intrinsic fitness or carrying capacity relative to regions 1 and 3. It is also specified that regions 1 and 4 have greater fitness response to public infrastructure than do other regions.

Table 1 Numerical Values of Parameters

Parameter   Region 1   Region 2   Region 3   Region 4
A_i         0.100      0.120      0.090      0.125
c_i         0.10       0.20       0.08       0.30
γ_i         0.01       0.02       0.03       0.02
λ_i         0.01       0.03       0.005      0.01
w_i         0.05       0.03       0.01       0.05
ψ_i         100        120        110        150
ξ_i         0.01       0.03       0.05       0.02
κ_i         0.02       0.01       0.06       0.05
Table 2 Initial Values of State Variables

State variable   Region 1   Region 2   Region 3   Region 4
K_i^p            10         10         10         10
K_i^g            11         11         11         11
P_i              10         10         10         10
B_i              0          0          0          0
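To make the parameterization concrete, the following sketch evaluates the Cobb-Douglas production function at the initial state. It assumes the values as read from Tables 1 and 2 (initial private capital 10, public capital 11 and population 10 in every region) and Φ(t) = 1; the names `production` and `initial_output` are illustrative and not part of the GAMS model.

```python
ALPHA = 0.5                                    # labor exponent from the text
PHI = 1.0                                      # technological augmentation, Phi(t) = 1
A = {1: 0.100, 2: 0.120, 3: 0.090, 4: 0.125}   # productivity A_i, Table 1
K_P0, K_G0, P0 = 10.0, 11.0, 10.0              # initial states as read from Table 2

def production(a_i, k_p, k_g, pop, phi=PHI, alpha=ALPHA):
    """Y_i = A_i * (K_i^p * Phi * K_i^g)^(1 - alpha) * P_i^alpha."""
    return a_i * (k_p * phi * k_g) ** (1.0 - alpha) * pop ** alpha

initial_output = {i: production(a, K_P0, K_G0, P0) for i, a in A.items()}
for region, y in initial_output.items():
    print(f"Region {region}: Y = {y:.3f}")
```

Because all regions start from the same stocks, the initial output ordering simply follows A_i, so regions 4 and 2 lead. This is in line with the remark that those regions have greater total factor productivity.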
Results for this first case are presented in Figures 2 through 9. In these figures, the time scale covers the entire simulation time horizon. Figures 2 and 3 show that it is optimal to allocate greater shares of both private and public capital investment to regions 2 and 4. As shown in Figure 4, initial population states adjust quickly, with a relative shift in population into regions 2 and 4. These regional population dynamics follow from our diffusion model specification. This performance is achieved with an optimal tax that is often zero. When the optimal tax is positive, the associated allocations of revenues to the regions are illustrated in Figures 6 and 7. We note that the periodic nature of the tax follows from the sufficiency of initial capital stocks to support optimal production levels.
Fig. 2 Private sector capital growth (plot "Private Sector Growth": growth vs. time, Regions 1 to 4)
Fig. 3 Public sector capital growth (plot "Public Sector Growth": growth vs. time, Regions 1 to 4)
Fig. 4 Population of each region (plot "Population of Each Region": population vs. time, Regions 1 to 4)
Fig. 5 Tax rate (plot "Tax Rate": tax rate vs. time)
Fig. 6 Share of tax revenues to private investment in each region (four panels, Regions 1 to 4: share rate vs. time)
Fig. 7 Share of tax revenues to public investment in each region (four panels, Regions 1 to 4: share rate vs. time)
Fig. 8 Pollution generation in each region (four panels, Regions 1 to 4: pollution generation vs. time)
Fig. 9 Production in each region (plot "Production in Each Region": rate vs. time, Regions 1 to 4)
7.2 The Case with Sustainability Constraints

For the second case, we add the following sustainability constraints to the model used in Section 7.1:

$P_i \ge 10 \quad \forall i \in [1, 4]$  (36)

$B_i \le 40 \quad \forall i \in [1, 4]$  (37)
We again step aside from technological progress and set $\Phi(t) = 1$. Results presented in Figures 10 through 17 illustrate that the output of region 4 has been reduced with the imposition of pollution constraints. The maximum amount of pollution generated from region 4 is now 40, relative to 60 in the first case. Also, the share of tax revenue allocated to the public sector in region 4 has been reduced relative to case 1. Nonetheless, the relative productivity of regions 2 and 4 specified in the parameterization continues to imply time paths similar to those seen for the case where no population or pollution threshold constraints are imposed. As in the first case, we see that initial values and the parameterization result in a zero level for the optimal tax during periods where the current stock of public and private capital is sufficient. However, as depreciation depletes that stock, re-investment is required and is induced through positive tax rates.
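Constraints (36) and (37) are simple bounds on the state paths, so a computed solution can be checked against them directly. The sketch below is one hypothetical way to do so in post-processing; the function name `violations`, the tolerance and the sample trajectories are assumptions for illustration and are not results from the chapter.

```python
P_MIN = 10.0   # population floor, constraint (36)
B_MAX = 40.0   # pollution ceiling, constraint (37)

def violations(population, pollution, p_min=P_MIN, b_max=B_MAX, tol=1e-9):
    """Return (region, step, kind) for every point that breaks (36) or (37).

    population and pollution map a region index to a list of values over time.
    """
    found = []
    for region, path in population.items():
        found += [(region, k, "population")
                  for k, p in enumerate(path) if p < p_min - tol]
    for region, path in pollution.items():
        found += [(region, k, "pollution")
                  for k, b in enumerate(path) if b > b_max + tol]
    return found

# Made-up trajectories: region 4 exceeds the pollution ceiling at the last step.
pop = {1: [10.0, 12.0, 11.0], 4: [10.0, 30.0, 55.0]}
pol = {1: [0.0, 1.0, 2.0], 4: [0.0, 38.0, 60.0]}
print(violations(pop, pol))
```

An empty list means the paths respect both thresholds at every time step.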
Fig. 10 Private sector growth (plot "Private Sector Growth": growth vs. time, Regions 1 to 4)
Fig. 11 Public sector growth (plot "Public Sector Growth": growth vs. time, Regions 1 to 4)
Fig. 12 Population in each region (plot "Population of Each Region": population vs. time, Regions 1 to 4)
Fig. 13 Tax rate (plot "Tax Rate": tax rate vs. time)
Fig. 14 Share of tax revenues to private investment in each region (four panels, Regions 1 to 4: share rate vs. time)
Fig. 15 Share of tax revenues to public investment in each region (four panels, Regions 1 to 4: share rate vs. time)
Fig. 16 Pollution generation in each region (four panels, Regions 1 to 4: pollution generation vs. time)
Fig. 17 Production in each region (plot "Production in Each Region": rate vs. time, Regions 1 to 4)
8 Conclusion

We have presented an optimal control model for allocating tax revenues among regions for public and private infrastructure capital investments. The model incorporates a regional disaggregation of public and private capital accumulation processes and extends the Hotelling population diffusion model to introduce a direct effect of public infrastructure on regional population carrying capacity. The model does not permit a closed form solution. However, numerical methods can be used to provide solutions for given parameterizations. In an illustration, we use a discrete time approximation of the continuous time optimal control problem and provide solution paths that illustrate how this approach to public investment allocation can increase welfare by targeting funds toward the most productive regions. We also show that, in our illustrative case, the imposition of minimum population thresholds, as is often conceived in political and social processes to preserve specific locations, does not substantially alter the optimal allocations. Finally, the imposition of pollution thresholds to keep pollution from exceeding particular limits is considered and shown to result in a redirection of infrastructure allocations.
References

Arrow, K.J., Kurz, M.: Public Investment, the Rate of Return and Optimal Fiscal Policy. The Johns Hopkins University Press, Baltimore (1970)
Bagchi, A.: Stackelberg Differential Games in Economic Models. Springer, Berlin (1984)
Buchanan, J.: External diseconomies, corrective taxes and market structures. American Economic Review 59, 174–177 (1969)
Canton, J., Soubeyran, A., Stahn, H.: Environmental taxation and vertical Cournot oligopolies: How eco-industries matter. Environmental and Resource Economics 40, 369–382 (2008)
Datta-Chaudhuri, M.: Optimum Allocation of Investments and Transportation in a Two-Region Economy. In: Essays on the Theory of Optimal Economic Growth, pp. 129–140. MIT Press, Cambridge (1967)
Domazlicky, B.: A note on the inclusion of transportation in models of the regional allocation of investment. Journal of Regional Science 17, 235–241 (1977)
Friesz, T.L., Luque, J.: Optimal regional growth models: Multiple objectives, singular controls, and sufficiency conditions. Journal of Regional Science 27, 201–224 (1987)
Hotelling, H.: A mathematical theory of population. Environment and Planning A 10, 1223–1239 (1978)
Ohtsuki, Y.A.: Regional allocation of public investment in an n-region economy. Journal of Regional Science 11, 225–233 (1971)
Pigou, A.: The Economics of Welfare, 4th edn. MacMillan, Basingstoke (1932)
Puu, T.: Lecture Notes in Economics and Mathematical Systems. Springer, Heidelberg (1989)
Puu, T.: Mathematical Location and Land Use Theory: An Introduction. Springer, Heidelberg (1997)
Sakashita, N.: Regional allocation of public investments. Papers, Regional Science Association 19, 161–182 (1967)
Simpson, R.: Optimal pollution tax in a Cournot duopoly. Environmental and Resource Economics 6, 359–369 (1995)
A Framework for the Manifestation of Tacit Critical Infrastructure Knowledge

Ebrahim Bagheri* and Ali A. Ghorbani
Abstract. Critical infrastructure systems are tightly-coupled socio-technical systems with complicated behavior. They have emerged as an important focal point of research due both to their vital role in the normal conduct of societal activities and to the inherent complications that make them appealing to researchers. In this chapter, we report on our experience in developing techniques, tools and algorithms for revealing and interpreting the hidden intricacies of such systems. The chapter describes several of our technologies that allow for the guided understanding of the status quo of infrastructure systems through the Astrolabe methodology, the formal profiling of infrastructure systems using the UML-CI meta-modeling mechanism, and the observation of the emergent behavior of these complex systems through the application of the agent-based AIMS simulation suite.
Ebrahim Bagheri, Institute for Information Technology, National Research Council Canada, e-mail: [email protected]
Ali A. Ghorbani, Faculty of Computer Science, University of New Brunswick, Canada, e-mail: [email protected]
* This work was conducted while the first author was with the University of New Brunswick.
K. Gopalakrishnan & S. Peeta (Eds.): Sustainable & Resilient Critical Infrastructure Sys., pp. 139–158. © Springer-Verlag Berlin Heidelberg 2010

1 Introduction

Critical infrastructures are among the most significant technical systems that influence the ordinary life of any person or the normal operation of any industrial sector. Their importance is mainly due to the type of facilities/utilities that they provide. These facilities (either in the form of asset supply or service provisioning) serve as the building block for any other simple or complex functionality of the society. The outputs of the infrastructures, although complex in nature, can be
thought of as the essential atomic inputs to other more complex systems. Evidently, without the proper operation of infrastructure systems, the function of other dependent systems would be disrupted. Interestingly, over the years infrastructure systems themselves have become dependent on each other's outputs, turning vertically integrated systems with only a few points of communication into horizontally integrated systems with various points of interaction in many of their dimensions [1]. Analogous to the dependency of other systems on infrastructures, infrastructure systems themselves are inter-reliant or, in other words, tightly coupled. As has been extensively studied in the field of fault tolerant computing, a complex system built from interacting components is exposed to a high risk of failure, derived from the possibility of malfunction in any of its components. The degree of influence of the failing component in the overall architecture (e.g. in a digital circuit it can be thought of as the number of input/output connections of a specific component) suggests an estimate of the degree of damage or harm that its failure will cause. The high interdependency of infrastructures makes their characteristics somewhat similar to those of such systems. Their extreme inter-connectedness makes one think of them as different components of a single network. A failure in a node of this complex network of interdependent infrastructures can result in catastrophic failures, many of which had not been foreseen. These failures are in many cases the result of the propagation of failure through these interconnected systems. Failure propagation is known as the cascading effect or ripple effect and has been the inspiration for many fruitful research efforts [2]. Critical infrastructures are also vulnerable to terrorist or malicious attacks.
Exploring the different reasons or intentions of those who launch such malicious activities is important for preventing their occurrence. Classifying infrastructures through a terrorist attack vulnerability analysis provides a good understanding of the threats and hazards that infrastructure systems may be facing. Such an analysis makes it much easier to avoid these incidents; at the very least, it makes it possible to create appropriate recovery plans. The tragic 9/11 terrorist attacks in New York City are clear depictions of the threat towards the most influential and operational elements of both society and industry. Researchers have pursued two main directions of investigation for studying the structure and behavior of critical infrastructures, each of which, in our view, has been successful. The researchers in the first group have been mainly involved with the study, analysis, and understanding of the infrastructures' current makeup. Their goal has been to identify methods, techniques, tools and schemes for describing the current status of an infrastructure. Based on this understanding, these researchers exploit various vulnerability, risk and/or threat assessment methods to gain more insight into the operation of an infrastructure. A fine-grained deployment of this process reveals many of the possible causes of failure and, to a great extent, their consequences. It is understandable that the result of this process would
only shed light on the types of failure that are a clear result of a breakdown or malfunction of a system. It should also be noted that although many of the possible roots of failure are detected in these approaches, not all of their consequences are visibly perceived and understood.
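The cascading effect described above can be sketched as a traversal of a dependency graph. The example below is a deliberately crude illustration and not a model from the literature cited here: it assumes that a component fails as soon as any one of its suppliers fails, and the component names and the `dependents` map are hypothetical.

```python
from collections import deque

def cascade(dependents, initial_failure):
    """Return the components that fail, in breadth-first order.

    dependents maps a component to the components that rely on its output.
    Simplifying assumption: a component fails as soon as any supplier fails.
    """
    failed = [initial_failure]
    seen = {initial_failure}
    queue = deque([initial_failure])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                failed.append(nxt)
                queue.append(nxt)
    return failed

# Hypothetical dependency map, for illustration only.
dependents = {
    "power": ["water", "telecom"],
    "telecom": ["banking"],
    "water": ["hospital"],
    "banking": [],
}
print(cascade(dependents, "power"))
```

Even in this small map, a single failure in "power" reaches components that do not depend on it directly, which is the kind of consequence that a purely static assessment can miss.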
Fig. 1 The high-level structure of the framework developed for critical infrastructure modeling, simulation and profiling
From the understanding gained from the study of the infrastructure organization, and the identification of their points of weakness, proper risk mitigation strategies can be proposed and ranked based on, e.g., three metrics, namely cost, time consumption, and effectiveness [3]. Each of these metrics can be weighted, and suitable mitigation strategies can be selected to enhance infrastructure safety and protection according to the priorities of the infrastructure management and their strategic directions. The other direction of research has mainly focused on the understanding of the dynamic behavior of infrastructure systems. In this route, the investigators explore the many paths of infrastructure process and operation in order to identify any causes of instability. The search mainly revolves around the discovery of the paths that introduce more risk of catastrophic failures into the system. Usually such a process is supported by agent-based simulations or by modeling with differential and/or algebraic-differential equations. Some of the
well-known examples of such approaches are the Agent-based Interdependency Modeling (AIMS) suite [4], Aspen [5], and the inoperability input-output model, which is based upon Leontief's I/O model [6]. In this chapter, we describe a platform that consists of several of our technologies that allow for guided understanding, classification and decision making based on the status quo of infrastructure systems. As depicted in Figure 1, the platform consists of the following interacting components:

1. Astrolabe [15] is a collaborative goal-based risk analysis methodology, which is based on the causal analysis of system risks. It allows the analysts both to align the current standpoint of the system with its intentions and to identify any vulnerabilities or hazards that threaten the system's stability;
2. AIMS [4] is a multi-agent modeling and simulation suite that allows its users to create models of an infrastructure and observe the behavior of the modeled system through actual simulations. AIMS provides the means for inserting transients into a running simulation for observing the outcome of the occurrence of an unexpected event;
3. UML-CI [2] is a model driven architecture-based profiling method that provides its users with the required high level meta-models to create a profile of an infrastructure system. Its main advantages are that it gives initial insight for infrastructure analysis and system identification, provides a sound basis for common understanding, communication, and knowledge transfer, and allows the documentation of best practices.

In the rest of this chapter, we provide more detailed information about each of these components and technologies, and discuss how they collectively form a framework for modeling, simulating and profiling critical infrastructure systems.
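Returning to the weighted ranking of mitigation strategies mentioned earlier in this section, a minimal sketch of one possible scoring scheme follows. The linear score, the [0, 1] scales, the weights and the strategy names are all assumptions made for illustration; the ranking method of [3] may differ.

```python
def rank_strategies(strategies, weights):
    """Sort strategies by weighted score, best first.

    Each strategy has 'cost', 'time' and 'effectiveness' scores in [0, 1].
    Cost and time count against a strategy, effectiveness counts for it.
    """
    def score(s):
        return (weights["effectiveness"] * s["effectiveness"]
                - weights["cost"] * s["cost"]
                - weights["time"] * s["time"])
    return sorted(strategies, key=score, reverse=True)

# Hypothetical strategies and management priorities.
strategies = [
    {"name": "redundant feeder", "cost": 0.8, "time": 0.6, "effectiveness": 0.9},
    {"name": "operator training", "cost": 0.2, "time": 0.3, "effectiveness": 0.5},
    {"name": "manual bypass", "cost": 0.1, "time": 0.1, "effectiveness": 0.2},
]
weights = {"cost": 0.3, "time": 0.2, "effectiveness": 0.5}
ranked = [s["name"] for s in rank_strategies(strategies, weights)]
print(ranked)
```

Changing the weights changes the ranking, which is how the priorities of the infrastructure management would enter such a scheme.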
2 Risk Analysis for Critical Infrastructure Systems

Risks are the likelihood and the degree of severity of unplanned or undesirable states. By its nature, the definition of risk is very much dependent on context and contextual factors. What might not be considered a risk in one context may be identified as a major risk in another. Even in the same context, in cases where these factors are qualitatively analyzed, different points of view may rank the severity or likelihood of a risk with dissimilar values, which results in more ambiguity. However, the current common understanding is that analyzing risk requires methods to identify the sources of the events that drive a system or an organization towards the states that can expose it to risk. Therefore, besides the direct events that lead to unsafe conditions, the courses of action guiding these events and, even more importantly, the purpose of these actions need to be identified and well understood. It makes sense to track the sources of risk back to their origins (intentional causes), because without the proper adjustment of these roots, it would be impossible to change the outcomes. In other words, the probable formation of the
branches of a behavior tree in a system is dependent upon the arrangement of its roots. It is undeniable that local changes made to the branches of this tree can have quick and even spontaneous effects, but they do not have long-term durability. From a systems engineering perspective, the roots of the behavior of a system relate to the goals that it pursues. Parsons argues that goal attainment is an indispensable aspect of any system [7]. Let us consider two very different kinds of system, formed on the basis of communal and associational relationships, respectively [8]. A system developed for a communal relationship focuses more on mere member gratification. An example of such a system is a student association that organizes student activities. In a system of associational relationships, the participants no longer join because of the importance or pleasure of the relationship itself; rather, membership is attained so that the results of this relationship can indirectly help the participants create other systems based on communal relationships. For instance, employment in a job is a type of associational relationship that is accepted by a person so that he/she can establish his/her own family (a system of communal relationships). There are fundamental differences between the natures of the systems developed from these two types of relationships. However, one common factor exists in both of them, and that is goal attainment. Even in a student association that has been established for a very informal cause, the inability to meet its objectives may result in the discontinuation of the relationship. Therefore, goal attainment has primacy over all other activities of any type of system. Goals are often the result of the strategy selection process through which system stakeholders identify its direction and decision making criteria [9]. To achieve the system's goals, the stakeholders devise plans to undertake a series of actions.
The implementation of this course of actions places the system in various states and conditions, among which unsafe states may also be found. The existence of these states depends on the degree of willingness of the system to take risks. If the risk is outweighed by the benefits perceived by system stakeholders, then that specific action may be performed. Based on this description, system goals supply the criteria and metrics for action generation and selection. This means that goals are the driving force of system behavior. Hence, the behavior of a system can be justified by its goals. Goals can also be used to appraise current system performance. The appraisal can be based on the gap between the desired system derived from its initial set of goals and its current standing [10]. The iterative process of re-appraisal can be employed to adjust a system, such that it moves towards its initially planned ends. More specifically, a system can be described by its goals and objectives, and the set of scenarios that it undertakes to support the operationalization of its goals. Studies show that very few systems actually achieve their intended goals to the extent of their initial desire [11]. This may be due to several factors. It may be either because the scenarios that have been supporting the attainment of system goals are not in full alignment with these goals, or because these scenarios have been undertaken incompletely or incorrectly. Empirical evidence from the current behavior of a system can help identify the gap between system goals and the present practice. The existence of such a gap is not a rare incident in many systems. Even in political systems, the leaders initially acknowledge the goals of their party, but
over time, as they acquire power, become rather conservative in order to maintain the current situation and as a consequence the initial goals of the party may be sacrificed.
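Two ideas from this section lend themselves to a small numerical sketch: different perspectives may rate the same risk differently, and an action may be performed when its perceived benefit outweighs the risk. The code below assumes, as a common convention that the text does not prescribe, that risk exposure is likelihood multiplied by severity; the rating scales, the perspective names and the `assess` helper are hypothetical.

```python
from statistics import mean, pstdev

def exposure(likelihood, severity):
    """Risk exposure as likelihood times severity (an assumed convention)."""
    return likelihood * severity

def assess(action_benefit, ratings):
    """Combine per-perspective (likelihood, severity) ratings for one action.

    Returns the mean exposure, the spread between perspectives, and whether
    the perceived benefit outweighs the mean exposure.
    """
    scores = [exposure(lik, sev) for lik, sev in ratings.values()]
    avg = mean(scores)
    return {"mean": avg, "spread": pstdev(scores), "acceptable": action_benefit > avg}

# Hypothetical ratings: likelihood in [0, 1], severity on a 1-10 scale.
ratings = {"operations": (0.2, 8), "management": (0.1, 5), "regulator": (0.3, 9)}
result = assess(action_benefit=2.0, ratings=ratings)
print(result)
```

A large spread signals the kind of ambiguity between points of view noted above and suggests that the ratings should be reconciled before the action is accepted or rejected.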
Fig. 2 Key concepts of the Astrolabe methodology
A system can also be caught in a situation where the rapid change of its context leads to the requirement of goal amendment. The need for a quick adjustment can result in a condition where the goals of a system are no longer well defined. This situation can be described with the Garbage Can theory [12]. This theory describes a condition where a system has been left with a set of outdated solutions, which are the remainder of the previous goals of the system. As the goals of the system are amended and the problem statement changes, the system starts looking for a suitable problem statement to match the solutions that it has to offer. Therefore, the risks associated with this state should also be analyzed.
Systems that need to incorporate the role of human resources into their structure also face a different kind of risk. Motivational theory formalizes the reasons behind the involvement of any human resource in a system through inducements and contributions [13]. Inducements are desired aspects of participation. For instance, the inducements of working for a company are a suitable salary along with insurance options. Contributions, on the other hand, have negative utility from the human resource perspective, but are the requirements for participation. Constant traveling for a salesperson is a type of contribution that he/she has to make in that position. In cases where the contributions and inducements of a position in a system contradict each other, risks may arise for the system, since the human resource may not adhere to the requirements of the contribution (this has also been addressed as orthogonal goals of organizations and individuals in the related literature). Given these issues and considerations, Astrolabe provides a collaborative multi-perspective risk analysis methodology that identifies the risks and vulnerabilities of complex systems, e.g. critical infrastructure systems, based on the system's operational activities and its strategic goals and objectives.
2.1 Astrolabe Key Concepts

The risk analysis process in Astrolabe is based on five key concepts shown in Figure 2. Astrolabe aims to fully identify instances of these concepts for any target system. This information can then collectively describe a system and its status. The key concepts defined in Astrolabe are:

1. Perspective is the mental and conceptual standpoint of the representative of a group of related individuals through which they examine the universe of discourse (e.g. the target system being examined).
2. Goal is the conceptualization of the ideal state of affairs for a system. A system may pursue multiple goals.
3. Evidence is an activity currently practiced in the universe of discourse for the attainment of one or more goals.
4. Obstacle is a goal or evidence, possibly from outside the universe of discourse, that obstructs or delays the achievement of one or more system goals.
5. Hindrance is evidence, from within or outside the universe of discourse, that disrupts the normal operation of one or more system activities.
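The five concepts above can be sketched as a small data model. This is a hypothetical encoding written for this discussion and not the representation used by the Astrolabe tooling: the class fields, the `unsupported_goals` helper and the sample instance are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Goal:
    name: str

@dataclass
class Evidence:
    name: str
    supports: list = field(default_factory=list)   # names of goals it serves

@dataclass
class Obstacle:
    name: str
    obstructs: list = field(default_factory=list)  # names of goals it blocks or delays
    external: bool = False                          # may lie outside the universe of discourse

@dataclass
class Hindrance:
    name: str
    disrupts: list = field(default_factory=list)   # names of activities it disrupts

@dataclass
class Perspective:
    owner: str
    goals: list = field(default_factory=list)
    evidences: list = field(default_factory=list)
    obstacles: list = field(default_factory=list)
    hindrances: list = field(default_factory=list)

    def unsupported_goals(self):
        """Goals that no recorded evidence currently works toward."""
        served = {g for e in self.evidences for g in e.supports}
        return [g.name for g in self.goals if g.name not in served]

# Hypothetical instance for a power utility.
view = Perspective(
    owner="grid operator",
    goals=[Goal("continuous supply"), Goal("low emissions")],
    evidences=[Evidence("scheduled maintenance", supports=["continuous supply"])],
    obstacles=[Obstacle("fuel price increase", obstructs=["low emissions"], external=True)],
    hindrances=[Hindrance("staff shortage", disrupts=["scheduled maintenance"])],
)
print(view.unsupported_goals())
```

Listing the goals that no evidence supports is one simple way to surface a gap between a system's intentions and its current practice.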
2.2 Astrolabe Process Overview

The Astrolabe methodology is intended to support the process of identifying the risks that threaten a complex system, such as a critical infrastructure system, and to provide the means to trace the roots of these risks. It aims to provide means for analyzing the sources of risk so that proper mitigation strategies can be selected.
Moreover, it can be used concurrently with system design methodologies to assist risk identification throughout the complex system design process. The Astrolabe methodology is iterative: all of its phases can be revisited at any time. That is, once the steps in any of the phases have been performed, there may be a need to go back to the previous phases and refine some of the information. The Astrolabe methodology has seven major phases. Completing these phases does not end its lifecycle. As is inherent in the nature of risk, analyzing and managing risk is a non-stop activity that requires constant revisits and refinements. Within our proposed methodology, after the completion of each iteration, regular examination of the deliverables and products of each phase is required, so that any change in system goals and activities can be captured and suitable risk identification and analysis activities can be performed. The major phases of the Astrolabe methodology are:

1. Boundary Specification: The functional and intentional spaces of a system usually span multiple domains; therefore, risk analysts should initially specify which aspect of the system attracts their attention the most and is going to be the target of investigation. Boundary specification should also include the identification of the sub-systems of a larger system that are of interest.
2. Perspective Identification and System Analysis: In this phase, risk analysts decide on the parties that are going to be involved in the information elicitation process. For instance, they may decide that only two points of view, one from the CEO and one from the marketing representative, satisfy their needs and requirements.
3. Preliminary Hazard Identification: Threats and vulnerabilities of a system can be identified by a close examination of the goals and evidences that have been identified up to the current point. For example, the representative of a perspective can look at the set of goals and evidences that he/she has identified and think of the risks that threaten their operation. This hazard identification process does not produce a complete list of all system hazards, since no single perspective is complete and the elaboration has not yet been extensive enough.
4. Perspective Integration: Having identified a set of goals and evidences in each perspective, risk analysts should consolidate all this information into a unique representation. Within this unique representation, conflicts between the statements of the different perspectives should be resolved.
5. Process Refinement: The concentration of each perspective on the issues more relevant to its position may result in a sub-optimized view of the problem domain. This issue can be handled by aggregating and refining the information provided by all of the perspectives. In this way, the representative of each perspective will become aware of the goals or evidences that he/she may have missed by viewing the information provided by other perspectives.
6. Risk Analysis: Identification of risk in Astrolabe does not occur in a single phase and crosscuts all of the phases. The risks that are identified throughout all phases are analyzed in this phase. This analysis includes ranking goals, evidences, capabilities, and resources based on the degree of the threats that they pose. Based on this ranking, risk analysts will be able to concentrate on risk filtering and proper risk mitigation strategy selection.
7. Quality Measurement and Validation: The quality of a risk analysis process is very much subject to the expectations and needs of risk analysts and system administrators; however, in many cases, the integrity and correctness of the risk analysis process needs to be validated and the quality of the deliverables assessed. In Astrolabe, the quality of the products, deliverables, and the analysis process is evaluated based on five metrics: accuracy, consistency, completeness, traceability and unambiguity.

In principle, Astrolabe mainly focuses on providing a formal framework for multi-perspective goal and risk negotiation and, in particular, a structured approach to multi-perspective risk identification, prioritization and mitigation. These features of Astrolabe provide several benefits for both experienced and inexperienced risk analysts. Experienced risk analysts can exploit Astrolabe's risk classification and ranking methods to semi-automate the risk analysis procedure and hence decrease the required risk analysis time and improve the risk analysis procedure for complex interconnected systems such as critical infrastructure systems. On the other hand, inexperienced analysts can benefit from the process by analyzing the information provided by Astrolabe. For instance, Astrolabe provides the degree of cross-perspective consistency and information alignment, which can help inexperienced analysts in seeing and evaluating their own performance. Inexperienced risk analysts can benefit from these two features of Astrolabe by comparing themselves, within the framework of Astrolabe, with an experienced risk analyst. The comparison would allow them to observe the degree of deviation of their risk analysis procedure from that of an experienced analyst.
In summary, the most notable features of Astrolabe that make it well suited to risk analysis in critical infrastructure systems are: 1) identifying risk from its system origins and the stakeholders' strategic objectives; 2) incorporating multiple perspectives on system intention and structure into risk analysis, which helps avoid the misinterpretation and oversight of information that is common when analyzing large and complex systems such as critical infrastructures; 3) quality measurement metrics for process validation purposes; and 4) an iterative model for risk management with clearly enumerated phases, steps, and deliverables.
3 Critical Infrastructure Systems Behavior Analysis
Multi-agent systems provide a collective understanding of a very complex system through a unified view of the aggregate behavior of their constituent elements (agents). Infrastructure systems can hence be modeled very well by multi-agent systems. Depending on the modeler's understanding of the structure and organization of an infrastructure, appropriate multi-agent systems can be created to mimic the actual behavior of an infrastructure system.
Fig. 3 An agent-based model of an infrastructure system
There are two main issues in the proper design of a multi-agent architecture: the agents' internal design, and the agents' environmental commitments and collaborations (see Figure 3). The first issue addresses the internal representation of each agent within the multi-agent architecture. This aspect specifies what functionalities/services an agent requires/provides, what resources it owns/consumes, and what goals it pursues. These specifications allow an agent to decide on its reactions under various conditions. For example, an electricity generator can be modeled as an agent in a multi-agent architecture. If the generator's goal is to maximize its profit, it would accept new requests regardless of its current load; if its objective is to provide a very high quality of service to its customers, it would prefer to accept fewer requests and serve them better. Investigators can use a well-known multi-agent architecture known as BDI, which specifies the internal structure of an agent through three concepts: Belief, Desire, and Intention [14]. An agent's belief is its perception of the surrounding environment, its desires are its motivational states or goals, and its intention is its current decision on how to reach those goals (desires). Agent intentions are realized through intelligent planning.
Following the previous example, an electricity generator can forecast the future of the power market and plan not to forward-sell its electricity supplies because it expects electricity prices to rise in the near future. As this example shows, the planning problems in a multi-agent system that simulates an infrastructure system are usually complicated: plans have to be discovered and optimized in a multidimensional space, and hence require complex techniques.
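The generator example can be sketched in a few lines of Python. This is only an illustrative reading of the BDI concepts, not the agent design used in AIMS: the class names, the desire labels `max_profit` and `high_qos`, and the load ceiling `qos_load_limit` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Beliefs:
    """Belief: the agent's perception of its environment."""
    current_load: float    # fraction of capacity in use, 0..1
    current_price: float   # electricity price now
    forecast_price: float  # expected price in the near future


@dataclass
class GeneratorAgent:
    """An electricity generator with a single desire (goal)."""
    desire: str                   # "max_profit" or "high_qos" (hypothetical labels)
    qos_load_limit: float = 0.7   # assumed load ceiling for high service quality

    def accept_request(self, beliefs: Beliefs) -> bool:
        """Intention: whether to accept a new service request."""
        if self.desire == "max_profit":
            return True  # accepts regardless of current load
        return self.qos_load_limit > beliefs.current_load

    def forward_sell(self, beliefs: Beliefs) -> bool:
        """Intention: sell future supply now unless prices are expected to rise."""
        return beliefs.current_price >= beliefs.forecast_price


beliefs = Beliefs(current_load=0.9, current_price=50.0, forecast_price=60.0)
profit_agent = GeneratorAgent(desire="max_profit")
qos_agent = GeneratorAgent(desire="high_qos")
```

With the same beliefs, the profit-driven agent accepts the request while the quality-driven agent declines it, and neither forward-sells because a price rise is forecast. Real planning would replace these one-line rules with a search over many variables.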
Fig. 4 Infrastructure service exchange in the multi-agent architecture of AIMS
An agent's societal interaction within a multi-agent architecture is the other aspect that should be specified when modeling an infrastructure system. It includes features such as Negotiation, Cooperation, and Coordination. Negotiation is a social practice, carried out through communication, for regulating power, resources, and commitments among different agents. Agents are autonomous by definition (as their corresponding infrastructure systems are in the real world), but they require interaction to reach their goals. Successful negotiation can lead to cooperation. Negotiation and
cooperation procedures appear to be able to reveal a set of infrastructure interdependencies. The negotiation of agents (each agent being an infrastructure system) over each other's resources depicts their interdependencies. This reveals many physical or cyber interdependencies; logical interdependencies, however, remain hidden. Logical interdependencies are related to a more sophisticated notion in multi-agent systems: Coordination. Coordination is conceptually different from cooperation, since two coordinated agents may not in fact cooperate with each other and yet make coordinated decisions. Suppose that the price of oil is strictly dependent upon the peace process in the Middle East. Major oil producers and consumers therefore try to coordinate their decisions with the political developments in that region. Oil companies have no direct cooperation with the political side of the issue, but their decisions are highly coordinated with it. This coordinated behavior can be considered a logical interdependency between two infrastructure systems.
Given these issues in multi-agent design for critical infrastructure behavior analysis, we have designed an Agent-based Interdependency Simulation Suite (AIMS) that supports the understanding of the behavior of complex interdependent systems. The major advantages of AIMS over similar existing simulation suites such as CISIA [16] and ASPEN [5] are the following:
1. Firstly, based on the understanding that simulating infrastructure systems requires an in-depth study of their structure, AIMS provides suitable means for initially modeling infrastructure systems. Modeling an infrastructure system in AIMS is therefore a prerequisite for its simulation. The modeling process is supported by static component templates that represent an infrastructure, its subsystems, or even its resources. 
These component templates are stored in AIMS repositories and can be used in various modeling procedures.
2. Secondly, besides the concept of an agent, AIMS centers on the types of services that infrastructure systems can provide. The modeling activity in AIMS therefore needs to take a service-oriented approach in defining the parties that participate in the simulation. This is because real-world infrastructure systems interact in a service-centric fashion. As an example, consider a subset of the interactions between the electricity and telecommunication infrastructures. The telecommunication infrastructure relies on the electricity infrastructure for its power needs. To meet these needs, it has to negotiate with various electricity providers. Each provider responds to the request with a proposal for service provisioning, which may include information on the Quality of Service (QoS), its cost, and its conditions of use. The telecommunication infrastructure can then choose among the proposals. If the environment in which the infrastructure systems operate becomes more complicated, service brokers can assist them in finding the services they require. Figure 4 shows how this process is supported in AIMS.
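The request, proposal, and selection steps of this exchange can be sketched as follows. The selection rule (cheapest proposal that meets a minimum QoS) is an assumption made for illustration; the text does not state how a consumer in AIMS ranks proposals, and the `Proposal` fields and provider names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Proposal:
    """A provider's offer in response to a service request."""
    provider: str
    qos: float    # e.g. promised availability, 0..1 (assumed measure)
    cost: float   # price per unit of service


def choose_proposal(proposals: List[Proposal], min_qos: float) -> Optional[Proposal]:
    """Pick the cheapest proposal whose QoS meets the consumer's minimum.

    Returns None when no provider satisfies the requirement, which is the
    point at which a service broker could be consulted.
    """
    acceptable = [p for p in proposals if p.qos >= min_qos]
    if not acceptable:
        return None
    return min(acceptable, key=lambda p: p.cost)


# The telecommunication infrastructure collects proposals for electricity.
offers = [
    Proposal(provider="PowerCo A", qos=0.999, cost=12.0),
    Proposal(provider="PowerCo B", qos=0.95, cost=8.0),
    Proposal(provider="PowerCo C", qos=0.9995, cost=11.0),
]
selected = choose_proposal(offers, min_qos=0.99)
```

Here the cheapest offer overall is rejected because its QoS is too low, so the consumer settles on the cheapest of the remaining two.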
Fig. 5 The overall structure of the AIMS metamodel
3.1 AIMS Metamodel
The AIMS metamodel consists of four major metaclasses: Model Instance, Component Template, Contract (Binding), and Scenario. The model instance metaclass incorporates the other three metaclasses into a whole and shapes the overall design of the infrastructure that needs to be analyzed and simulated. Suppose that we intend to investigate the behavior of an electricity infrastructure. In this case, the model instance represents that specific infrastructure system, while the other three metaclasses depict its internal setting, its external relationships, and the context of its operation. As seen in Figure 5, a model instance of the AIMS metamodel can contain as many component template and contract (binding) instances as required, but can be associated with only a single scenario instance at a time. More formally, these four components are described below:
1. Component template is an abstract entity that can be instantiated to represent any infrastructure system, sub-system, or resource. Its design is mainly focused on the type of services that it provides. For instance, a Mobile Switching Center can be considered a sub-system of a cellular network; it can therefore be modeled through a component template in AIMS;
2. Contract (Binding) provides the means for two infrastructure systems, modeled as component templates in the model, to agree to share their goods through the exchange of their services. For instance, in Figure 4, a contract has been
established between the electricity infrastructure and the telecommunication infrastructure for the exchange of electricity;
3. Scenarios specify the initial setting of the simulation model and provide guidelines on what paths the simulation has to follow during the course of its execution. Since many of the interesting (and often harmful) events within an infrastructure system are unpredictable and rare, scenarios make it possible to design simulation routes that give rise to problematic events;
4. Model instance is a global metaclass that incorporates all the information required for the execution of a proper infrastructure simulation process, including which component templates are involved, which scenario will be executed, and which contracts will be evaluated between the simulation components.
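The four metaclasses and their multiplicities can be summarized in a small data model. This is a sketch under stated assumptions rather than the AIMS implementation: the field names and the `add_contract` check are hypothetical, and only the structure (many component templates and contracts, exactly one scenario per model instance) comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ComponentTemplate:
    """An infrastructure system, sub-system, or resource, defined by its services."""
    name: str
    services: List[str]


@dataclass
class Contract:
    """A binding under which one component supplies a service to another."""
    provider: ComponentTemplate
    consumer: ComponentTemplate
    service: str


@dataclass
class Scenario:
    """Initial setting plus timed actions that steer the simulation."""
    name: str
    actions: List[Tuple[float, str]] = field(default_factory=list)  # (time, instruction)


@dataclass
class ModelInstance:
    """Global container: any number of components and contracts, one scenario."""
    scenario: Scenario
    components: List[ComponentTemplate] = field(default_factory=list)
    contracts: List[Contract] = field(default_factory=list)

    def add_contract(self, contract: Contract) -> None:
        # Assumed consistency checks; the source does not list AIMS's own rules.
        for party in (contract.provider, contract.consumer):
            if party not in self.components:
                raise ValueError(f"{party.name} is not part of this model")
        if contract.service not in contract.provider.services:
            raise ValueError(f"{contract.provider.name} does not offer {contract.service}")
        self.contracts.append(contract)


electricity = ComponentTemplate("Electricity", services=["electricity"])
telecom = ComponentTemplate("Telecommunication", services=["data link"])
model = ModelInstance(scenario=Scenario("baseline"), components=[electricity, telecom])
model.add_contract(Contract(provider=electricity, consumer=telecom, service="electricity"))
```

Making `scenario` a single required field, while components and contracts are lists, mirrors the one-scenario-at-a-time constraint shown in Figure 5.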
3.2 AIMS Suite Structure
The AIMS suite has a modular design in which all modules interact through appropriate message passing schemes. The suite is composed of four major modules: the AIMS Core Module; the Visualization, Manipulation, and Analysis (VMA) Module; the Scenario Handler Module; and the User Interface Module. Before going into the details of each module, we briefly introduce each of them:
1. AIMS Core Module: The AIMS core module is responsible for creating an active simulation from the models that have been created based on the AIMS metamodel. It is also in charge of controlling the simulation process and managing the interaction of the running simulation with the other three modules. Internally, this module incorporates four operating units: AIMS Controller, Simulation Controller, Market Place, and JADE Facade.
2. Visualization, Manipulation, and Analysis (VMA) Module: This module provides AIMS users with the means to analyze the active simulation. Various VMA modules can be introduced into AIMS to show different perspectives and interpretations of a single simulation.
3. Scenario Handler Module: Once an actual simulation has begun, the scenario handler module parses the scenario section of the model instance. Based on the scenario action information, it then sends appropriate instructions to the AIMS core module at specific points of time in the simulation.
4. User Interface Module: The user interface module provides various graphical user interfaces for the AIMS suite modules. These include the front-end user interface, which is responsible for creating the model instance and executing the model through the AIMS simulator, and the administration user interface, which is in charge of configuring the AIMS suite.
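The scenario handler's role, releasing instructions to the core module at specific simulation times, can be sketched with a time-ordered queue. The class below is hypothetical: the `send` callback stands in for whatever message passing scheme AIMS uses between modules, which the text does not detail, and the instruction strings are invented for illustration.

```python
import heapq
from typing import Callable, List, Tuple


class ScenarioHandler:
    """Releases scenario instructions once simulation time reaches them."""

    def __init__(self, actions: List[Tuple[float, str]], send: Callable[[str], None]):
        # Each action is (simulation_time, instruction); a heap keeps them time-ordered.
        self._queue = list(actions)
        heapq.heapify(self._queue)
        self._send = send  # stand-in for a message to the core module

    def tick(self, now: float) -> int:
        """Send every instruction that is due at time `now`; return how many were sent."""
        sent = 0
        while self._queue and now >= self._queue[0][0]:
            _, instruction = heapq.heappop(self._queue)
            self._send(instruction)
            sent += 1
        return sent


received: List[str] = []  # plays the part of the core module's inbox
handler = ScenarioHandler(
    actions=[(5.0, "fail substation 3"), (2.0, "raise telecom demand")],
    send=received.append,
)
handler.tick(3.0)   # delivers only the instruction scheduled for t = 2.0
handler.tick(10.0)  # delivers the remaining instruction
```

Keeping the handler unaware of what the instructions mean reflects the modular split described above: it only decides when a message is due, and the core module decides what to do with it.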
Fig. 6 A four-layered meta-modeling architecture extending UML for the water and sewage infrastructure
In summary, AIMS is a multi-agent modeling and simulation suite that allows its users to create models of an infrastructure and observe the behavior of the modeled system through simulations. One of the distinguishing features of AIMS is that it provides its users with a set of predefined component templates (e.g. pipes, switches, etc.). The organization of an infrastructure can be built through the instantiation of these component templates into actual software agents. AIMS also provides the means for inserting transients into a running simulation in order to observe the outcome of an unexpected event. The set of these transients, called scenarios, controls and hence affects the operation of the simulation. Visualization and analysis modules can also be plugged into a running simulation to view and study the structure and behavior of the simulated infrastructure system.
4 Profiling Critical Infrastructure Systems
UML-CI is a meta-modeling language that provides the constructs required to formulate, structure, and represent information about a critical infrastructure system. It is based on the profiling mechanism provided by the UML modeling language. A UML profile allows the selection of a subset of the UML base metamodel, denotes the common model elements required for the specific modeling domain, and specifies a set of required well-formedness rules. Well-formedness rules are a set of constraints that accompany a family of models to show their proper composition. The Object Constraint Language (OCL), a strongly typed declarative language based on mathematical set theory and predicate logic, is extensively used with UML for this purpose. Figure 6 depicts an
example of how a base UML class can be gradually extended to provide more specific models for two different instances of a single water and sewage infrastructure metamodel. The major stimuli for devising a reference model for critical infrastructure modeling, UML-CI, are multifold. Here, we elaborate more on some of the most important advantages of an infrastructure profile that have led to the proposal of UML-CI. 1. Base Recognition and System Identification: A unified platform with extensive components allows an initial understanding of the organization of an infrastructure. Novice modelers can exploit this reference model to plan the modeling process. It also suggests the type of information that needs to be collected throughout the modeling practice. 2. Common Understanding and Communication: The elements of the reference model bring about a shared conception of the modeling task between the members of the team. This consequently eases communication and collaboration among the involved people. One of the other major advantages of the employment of UML-CI would be that a mutual understanding between the modelers and the infrastructure stakeholders can be reached more easily. 3. Current Understanding (Knowledge): A completed profile (an instantiated version of UML-CI) can provide a detailed view of the present infrastructure setting; however the level of detail of this completed profile depends vastly on the granularity of the information collected through the information acquisition phase. Many of the interdependencies (e.g. physical, cyber, or geographical) between different infrastructures may be illuminated through this process. The current understanding can also aid in infrastructure management by providing more detailed understanding of the current situation. 4. Knowledge Transfer: Infrastructure modeling projects are extremely specialized, requiring the team members to be acquainted with both the modeling tasks and infrastructure domain. 
As a result, those involved are highly skilled people that pose a great risk if they decide to leave the team. Having a standard modeling notation reduces this threat by providing the opportunity for the group to add new members. The new members can quickly grasp the problem domain using the detailed reference model. 5. Best practices and New Understanding: Different modeling teams based on the same framework can communicate and transfer their experience. This exchange of knowledge can occur through the attachment of thoughts, ideas, recommendations or even standards to the metamodel or its elements. The proper transfer of the best practices would bring about a more concrete understanding of infrastructure organization and behavior. 6. Documentation and Re-use: A reference model such as UML-CI can provide a highly readable and semantically rich documentation of the infrastructure that is being modeled. The created document can be easily understood by both the stakeholders and the modeling team for it has a graphical notation. The models created based on this reference model can be further re-used if need be. Similar infrastructures can also be modeled through the refinement of an existing model without the need for starting from scratch.
Fig. 7 The five high-level critical infrastructure models of UML-CI
4.1 High-Level Critical Infrastructure Models
The UML-CI critical infrastructure reference model consists of five main models, each of which addresses a different issue. Each of the metaclasses within a model is relevant to the concept that governs the model. These five models are shown in Figure 7 and briefly explained below:
1. Ownership and Management Model: The elements within this model provide the means for identifying the managerial aspects of an infrastructure. These characteristics include the specification of the infrastructure stakeholders, the government(s), and the geographical span of the infrastructure. The model also includes features for defining the policymakers, regulations, and the roles they play in the infrastructure operation.
2. Structure and Organization Model: This model provides the means for specifying the makeup of an infrastructure system. It has three major metaclasses: infrastructure, system, and task.
3. Resource Model: Resources are the raw processing materials that are required for the operation of an infrastructure. They are consumed, produced, or processed within the operations of an infrastructure system. This model provides the metaclasses for defining the different types of resources that may be encountered in a critical infrastructure environment.
4. Threat, Risk, Vulnerability (TRV) Model: One of the main reasons for modeling infrastructure systems is to identify the hazards that threaten their operation. The TRV model provides various metaclasses so that these hazards can be categorized and their causes, consequences, and possible mitigation strategies clearly specified.
5. Relationship Model: The relationship model provides many different metaclasses (derived from KernelAssociation) for connecting and joining the concepts that have been identified in the previous models. For example, although the various hazards that threaten the operation of an infrastructure can be specified in the TRV model, they are not specifically attached to the system or task that they actually threaten. To address this concern, the relationship model provides suitable metaclasses so that all of the classes created in the previous models can be integrated into a single model.
To make the proposed infrastructure reference model easier to use, the metaclasses and well-formedness rules of UML-CI have been added to the Together Architect 2006 CASE tool. This facility lets modelers create a model based on the proposed reference model in a graphical environment with drag-and-drop features. The UML-CI metaclasses integrated into this tool can be easily selected, instantiated, and joined together in an integrated environment. 
The integration of UML-CI into this CASE tool gives end-users access to extra facilities, such as automatic document generation and model checking (based on the UML-CI well-formedness rules), that are already available in Together Architect.
In summary, UML-CI is a reference model for profiling and modeling different aspects of a critical infrastructure system. The metaclasses in this reference model are categorized into five major high-level models that address various aspects of infrastructure organization and behavior. The most important concerns addressed in this reference model are critical infrastructure ownership and management, internal organization and system structure, asset classification and identification, and risk profiling. Beyond modeling and profiling critical infrastructure systems, UML-CI also concentrates on subjects of high importance to their management, such as providing ground for common understanding and communication between infrastructure stakeholders, knowledge transfer, and documentation of best practices. Based on the models within UML-CI, infrastructure system knowledge bases can be built to aid the process of infrastructure system modeling, profiling, and management.
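To give a feel for what a well-formedness check over the TRV and relationship models might look like, the sketch below flags hazards that no relationship attaches to a system or task. It is a Python stand-in written for illustration only: the rule itself, and the names `Hazard`, `Target`, and `Threatens`, are assumptions and are not taken from the UML-CI rule set, which would normally be expressed as constraints inside the CASE tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hazard:
    """A threat, risk, or vulnerability defined in the TRV model."""
    name: str


@dataclass(frozen=True)
class Target:
    """A system or task from the structure and organization model."""
    name: str
    kind: str  # "system" or "task"


@dataclass(frozen=True)
class Threatens:
    """A relationship-model link attaching a hazard to what it threatens."""
    hazard: Hazard
    target: Target


def unattached_hazards(hazards: List[Hazard], links: List[Threatens]) -> List[Hazard]:
    """Return hazards that no relationship attaches to any system or task."""
    attached = {link.hazard for link in links}
    return [h for h in hazards if h not in attached]


flood = Hazard("flooding")
intrusion = Hazard("network intrusion")
pump = Target("pumping station", kind="system")

violations = unattached_hazards(
    hazards=[flood, intrusion],
    links=[Threatens(hazard=flood, target=pump)],
)
```

In this example the check reports the network intrusion hazard, which has been defined but never linked to anything it threatens, the gap that the relationship model is meant to close.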
5 Concluding Remarks
Critical infrastructures are networks of interdependent, mostly privately-owned, man-made systems and processes that function collaboratively and synergistically to produce and distribute a continuous flow of essential goods and services [1]. These highly complex systems can be classified as socio-technical organisms that have some sort of hidden consciousness. Due to the potentially severe repercussions that infrastructure failure can induce, they have been the target of many fruitful research activities that attempt to model, simulate, and understand their behavior. In the past few years, we have focused on analyzing various aspects of critical infrastructure systems, developing a variety of technologies that range from analysis of the status quo of infrastructure systems, by means of a goal-based risk analysis methodology, to dynamic analysis of the behavior of critical infrastructure systems using the agent-based interdependency modeling and simulation suite. We have also investigated the important issue of profiling and codifying the knowledge gained about infrastructure systems in a structured and formal representation model called UML-CI. The main strength of our work is that the developed technologies form a complete, all-round framework that provides most of the tools necessary for analyzing, understanding, storing, and reporting on critical infrastructure systems. The developed technologies, namely AIMS, Astrolabe, and UML-CI, form a comprehensive framework whose products can be exchanged between the technologies and further processed to induce and reveal hidden aspects of the behavior and structure of critical infrastructures.
References
[1] Bagheri, E., Ghorbani, A.A.: The state of the art in critical infrastructure protection: a framework for convergence. International Journal of Critical Infrastructures 4(3), 215–244 (2008)
[2] Bagheri, E., Ghorbani, A.A.: UML-CI: A reference model for profiling critical infrastructure systems. Information Systems Frontiers, July 10 (2008) (in press)
[3] Turner, J.V., Hunsucker, J.L.: Effective risk management: a goal based approach. International Journal of Technology Management 17(4), 438–458 (1999)
[4] Bagheri, E., Baghi, H., Ghorbani, A.A., Yari, A.: An agent-based service-oriented simulation suite for critical infrastructure behaviour analysis. International Journal of Business Process Integration and Management 2(4), 312–326 (2007)
[5] Basu, N., Pryor, R., Quint, T.: ASPEN: A Microsimulation Model of the Economy. Computational Economics 12(3) (1998)
[6] Leontief, W.: Input-output economics. Oxford University Press, New York (1966)
[7] Parsons, T.: The System of Modern Societies. Prentice-Hall, Englewood Cliffs (1971)
[8] Curry, J.A.: Sociology for the Twenty-First Century, 3rd edn. Prentice-Hall, Englewood Cliffs (2002)
[9] Scott, W.R.: Organizations: Rational, Natural, and Open Systems. Prentice-Hall, Englewood Cliffs (1992)
[10] Scott, W.R.: Technology and structure: An organization-level perspective. In: Goodman, P.S., Sproull, L.S. (eds.) Technology and Organizations. Jossey-Bass, San Francisco (1990)
[11] Gross, E.: The definition of organizational goals. British Journal of Sociology 20(3), 277–294 (1969)
[12] Cohen, M., March, J., Olsen, J.: A garbage can theory of organizational choice. Administrative Science Quarterly (1972)
[13] Simon, H.A.: Rational decision making in business organizations. American Economic Review 69(4), 493–513 (1979), http://ideas.repec.org/a/aea/aecrev/v69y1979i4p493-513.html
[14] Rao, A.S., Georgeff, M.P.: BDI-agents: From Theory to Practice. In: Proceedings of the First International Conference on Multiagent Systems, ICMAS 1995 (1995)
[15] Bagheri, E., Ghorbani, A.A.: Astrolabe: A Collaborative Multiperspective Goal-Oriented Risk Analysis Methodology. IEEE Transactions on Systems, Man, and Cybernetics, Part A 39(1), 66–85 (2009)
[16] Setola, R., Bologna, S., Casalicchio, E., Masucci, V.: An Integrated Approach For Simulating Interdependencies, Critical Infrastructure Protection II, pp. 229–239. Springer, Heidelberg (2008)
Intelligent Transportation Infrastructure Technologies for Condition Assessment and Structural Health Monitoring of Highway Bridges
Kerop D. Janoyan and Matthew J. Whelan
Abstract. The visual inspection routines mandated through the National Bridge Inspection Standards (NBIS), implemented after the catastrophic collapse of the Silver Bridge in 1967, have almost exclusively provided the framework for bridge management, encompassing rehabilitation planning and reconstruction scheduling. Over the years, despite numerous revisions of the NBIS to introduce special inspection procedures, such as for fracture critical and scour susceptible structures, it has become evident that the visual inspection program falls short of ensuring a safe and efficient operational model for bridge management. All too often, imminent or unforeseen collapse precedes reconstruction efforts, and consequently the public is subjected to abrupt closures instead of anticipated and expediently scheduled rehabilitation projects. Sensor-based non-destructive condition assessment and anomaly detection technologies have long been proposed to compensate for the limitations and subjectivity of the visual inspection program and to provide timely and quantitative evaluation of the structural health of highway bridges. Presented is an overview of the role that intelligent transportation infrastructure technologies are increasingly assuming within bridge management, as well as conceptual strategies for the application of several monitoring approaches. Real-world field response measurements are also presented to demonstrate the current capabilities, typical data collection, and extraction of performance parameters from application of a wireless sensing network platform utilizing both strain transducers and accelerometers. Lastly, identification and localization of non-critical damage onset using a network of vibration sensors is explored through system identification and response prediction using forward innovations.
Kerop D. Janoyan · Matthew J. Whelan, Dept. of Civil and Environmental Engineering, Clarkson University, Potsdam, NY, e-mail: {kerop,whelanmj}@clarkson.edu
K. Gopalakrishnan & S. Peeta (Eds.): Sustainable & Resilient Critical Infrastructure Sys., pp. 159–184. © Springer-Verlag Berlin Heidelberg 2010 springerlink.com
1 Introduction
While structural health monitoring (SHM) is often presented as a means of preventing catastrophic collapse of vulnerable infrastructure and the associated loss of life, property, and transport route, the benefits realized through quantitative operational assessment of structures over time are significantly broader. Beyond assurance of public safety, condition-based maintenance (CBM) can potentially reduce state-wide infrastructure rehabilitation costs by millions of dollars through effectively transitioning from schedule-based to need-based maintenance, informing bridge managers of accurate service limits to assign postings and issue permits, and maximizing remaining useful life through measured assessment of structural integrity. Unanticipated bridge closures place economic strain on businesses, commercial transit, and local residents and ultimately inflate the cost of rehabilitation or reconstruction as work schedules are necessarily accelerated under priority contracts. Although overlooked until closure or collapse, quality of life can be greatly impacted as vehicular users of the transportation system can be subjected to substandard roadway conditions or rerouted to lengthy detours for durations of weeks, if not months or years. From an environmental perspective, a bridge under closure or repair imposes congestion, delays, and detours that lead to increased fuel consumption and emissions. The aerospace industry has long integrated sensor technologies for in-service condition monitoring to aid service mechanics in the maintenance and repair of air vehicles. More recently, such efforts have begun to realize significant life-cycle cost savings and reduction of mission down-time due to unanticipated failures through fault identification and prediction using prognostic health management (PHM) approaches. 
Likewise, the introduction of on-board diagnostics (OBD) within automotive systems is a parallel case of multi-modal sensor integration for component and systems-level condition assessment through application of signal processing routines within an embedded computing platform. That such efforts have come to fruition in advance of analogous systems for civil infrastructure is not simply a reflection of larger military and consumer market-driven budgets, although funding allocations have certainly had a significant impact on the advancement of condition-based maintenance systems. Vehicular passenger safety is readily recognized as an immediate safety concern, while the catastrophic danger of failing infrastructure has long been overlooked as a hidden threat to public safety and regional economic stability. Furthermore, while vehicular systems benefit from mass-produced standardized designs and manageable systems-level scale, the sheer number and scale of the highway bridges in the national inventory has precluded the economic application of integrated cable-based sensor systems. More important, however, is that the aerospace and automotive industries have decidedly integrated condition monitoring into the design and fabrication stages of their systems, while to date structural health monitoring has been an afterthought when applied to highway bridges, often seeking to address condition assessment of aged bridges with little record of the baseline response and historical usage/wear trends. Barring an unforeseen breakthrough in the capabilities of damage diagnostic and prognostication algorithms without such a priori
data, this paradigm will have to shift if reliable, timely, and cost-effective condition assessment technologies are to be realized within transportation infrastructure.
Structural health monitoring can be broadly deconstructed into three fundamental yet interrelated and iterative tasks, originating with sensor and sensor network development, progressing to fault and damage diagnostics capabilities, and ultimately evolving into prognostication for remaining useful life (RUL) prediction. As the development of structural health monitoring through these tasks is somewhat linear, it is not surprising that the first two tasks, with particular bias toward sensor and sensor network advancement, have witnessed the greatest maturation to date. The impetus for renewed vitality in structural health monitoring of large civil constructions over the past decade came largely from the advancement of relevant integrated circuit technology. In particular, low-power chip transceivers have afforded remote bidirectional communication through distributed low-power networking, thereby enabling battery-powered instrumentation to be rapidly installed across structures without the need to route expensive cabling for power and signal lines. Furthermore, micro-electro-mechanical system (MEMS) sensor technology has produced compact yet highly capable and integrated surface-mount sensors at significantly reduced cost and power consumption compared with their discrete counterparts. Successful large-scale deployments have demonstrated the inherent advantages of leveraging embedded technologies to replicate the early cable-based bridge monitoring studies at reduced cost with increased sensor count. 
Anomaly indication and advanced diagnostics from sensor-driven structural assessment techniques have begun to demonstrate systems-level capability toward identifying the onset of structural faults and failure mechanisms, although this is still very much a maturing field of research for highway bridges. As noted before, the range of unique bridge designs, structural materials, subsurface and bearing conditions, site variations, and other differences makes diagnostics a complex problem potentially requiring a non-uniform, or multi-modal, approach dependent on design details and likely failure mechanisms. Toward addressing these unknown factors, a small database of field studies has been compiled to date in which measurements were obtained prior and subsequent to known, generally induced, damage on in-service bridges. Los Alamos researchers simulated fatigue cracks on a bridge consisting of a three-span reinforced concrete deck on two steel plate girders through successive torch cuts through the web and flange of the girder at midspan [1]. Allampalli, et al. [2] extended this test approach by introducing saw cuts at various locations on the web and flange of a single-span bridge with a reinforced concrete deck on two steel girders in New York. In one of the most thorough response assessment studies, which included long-term temperature dependence, Katholieke Universiteit Leuven coordinated a progressive damage study that successively induced sub-structural settlement, foundation inclination, spalling, and anchor head and tendon failures [3]. Most recently, Clarkson University researchers have supplemented this growing database with a study of the effect of diaphragm bolted connection damage and bearing restraint on the vertical and transverse vibrations throughout a simply-supported, end-of-service-life highway span with a reinforced-concrete deck on steel girders [4]. While this is by no means
K.D. Janoyan and M.J. Whelan
an exhaustive summary of the work performed to date, this brief survey begins to demonstrate the diversity of potential fault scenarios and test bed designs. Without continued supplemental testing to demonstrate the performance of proposed damage diagnostic routines on actual in-service data, it is speculative to claim the existence or reliability of any definitive approach for systems-level condition assessment from distributed response measurements. Advancements in the sensor technologies and network layers for structural health monitoring and the development of diagnostic routines lend themselves toward the eventual emergence of prognostic methods capable of assessing the impact of identified faults on systems-level performance and remaining useful life. However, as recently as 2007, the lack of any recognizable advancement in prognostication for civil and mechanical structural systems had been noted [5]. Prognostication is often viewed by academics through a restricted lens in that it is taken to encompass only physics-based approaches toward modeling failure mechanisms and estimating remaining useful life based on quantified strength, stress, or fatigue cycles. It is for this reason that the literature is exceptionally scarce regarding structural prognostics in the context of health monitoring; as pointed out by the aerospace industry, the definition need not be so exclusive. By definition, prognosis should be expanded to include any method, from simple cycle counting and historical probabilistic methods to complex model-based approaches, that provides for indication of a specific fault prior to user-defined undesirable operational states with ample lead time to permit response [6].
In other words, prognostics is essentially an extension of diagnostics except that it must be performed at an early enough stage in the fault development to provide sufficient response time and be able to gauge the relative propagation or change in severity associated with the fault to optimize service life without jeopardizing safety.
2 Sensor Networks for Highway Bridges
Sensors, instrumentation strategies, and data networks are the foundation upon which diagnostic and prognostic methodologies are built, so the selection, implementation, and validation of these components are critical to the ultimate value extracted from any SHM system. It is entirely possible that a span could be instrumented with dozens of sensors, but if no underlying approach for extracting functional performance metrics is used in the selection and placement of the sensors, the end system will yield little information of value. Wireless chip transceiver and low-cost MEMS sensor technology have revolutionized the possibilities of extending in-service monitoring to infrastructure systems, where the long spans between measurement location and central data processor have to date limited application to only a handful of efforts, largely in research. The logistics and instrumentation costs, especially for long lengths of shielded cabling, associated with a distributed sensor network for a highway bridge span have long been the primary obstacles to SHM systems, as the theory and preliminary tests actually date back to the early twentieth century [7]. Recently, applied research has demonstrated the feasibility of large-scale, high-rate, and lossless wireless sensor networks for structural health monitoring of
highway bridges through multiple field tests on actual in-service structures [8], [9], [10]. While much research and development is being performed in the design of new sensing approaches and local diagnostic methodologies for highway bridges, the results of such efforts have to date seen little translation to actual application within management systems maintained by highway departments. This seems to be due largely to the lack of a systems-based approach to providing the customer (highway departments) with an affordable yet effective and highly integrated product or service that functions synergistically with existing management efforts rather than in conflict with them or as a replacement.
Fig. 1 “Smart” Bridge with On-Site Processor and Remote Diagnostics
In the context of a permanently installed structural health monitoring system, a variety of electromechanical sensors are generally affixed to the structure to measure the in-service response with distributed local networking capabilities afforded through either cable-based or wireless communications protocols (Fig. 1). The on-site technology, however, would be only a component within a larger systems-level design that affords remote data processing and collection, database creation for long-term diagnostics and prognostication, synergistic association with the National Bridge Inventory records for enhancing existing bridge management practices, and timely indication for safety assurance (Fig. 2). Unlike vehicle systems where an operator is present, asset management of highway bridges requires that a remote “dashboard” be provided so that alarms and condition status are intelligently routed to regional or state offices to permit bridge engineers to respond immediately. This level of second-tier networking, if properly designed
and coordinated, could be implemented through modem communication from the central processor over the existing Plain Old Telephone System (POTS), modern Digital Subscriber Line (DSL), or satellite communications (SATCOM) systems. Given the relatively high cost of data transmission, especially at high throughput rates, coupled with the lack of desire by bridge engineers to receive raw, unprocessed data, an effective second-tier network could be functionally designed to be analogous to the digital I/O between an embedded OBD system and a simple dashboard of fault-indicating incandescent bulbs. Within this scenario, the data rates would be exceptionally low and bridge managers would be provided a web-based virtual dashboard consisting only of the information of value, namely the fault identification and associated severity or quantitative measurement. Such a low-rate indication system is also well suited for the necessary aggregation and management of condition indexes from the thousands of bridges maintained by each state DOT.
Fig. 2 Systems-Level Intelligent Highway Bridge Management
To facilitate the database development necessary to give rise to advanced prognostication strategies, higher-rate sensor measurements and routinely recorded functional performance measures will also need to be obtained. As small-form non-volatile memory cards have seen significantly increased capacity and decreased cost over recent years, an alternative to remote transmission of such data would be local data storage. Since, as with aerospace and automotive systems, electronic diagnostic systems are complementary to routine visual inspections rather than a substitute for them, the bridge will still be accessed at a minimum biennially by an inspector. An external serial communications interface provided on the central processor could enable inspectors, or private consulting companies, to simply download the data recorded between inspection periods. This compromise would facilitate the necessary database development to enable data-driven prognostication without the cost and communications infrastructure overhead associated with purchasing high-rate data subscriptions from land-line or satellite communication
companies. Once prognostication methods have been identified, they could be built in as software algorithms within the central processor, and the processed results could be transmitted over the low-rate second-tier interface to fully integrate the system with bridge management and inventory systems.
3 Health Indicators and Condition Assessment
All too often, the conceptual development of structural health monitoring systems, particularly in the academic sense, has been characterized by a desire to extract not only measures of functional performance, but also multi-modal diagnostics and extended prognostics using a minimal array of, usually homogeneous, sensors. In contrast, the majority of monitoring systems in place on highway bridges incorporate a large number of mixed transducers to record only basic mechanical, thermal, or environmental parameters. This dichotomy arises primarily out of the diversity of design and functional response associated with the expansive national inventory, compounded by the reality that a database of actual in-service response, including data with identified natural failure phenomena, is simply not available. If a lesson is to be learned from the aerospace and automotive industries, intelligent infrastructure systems will necessitate a period of “coming of age”: unlike vehicle systems, highway structures have been entirely without sensors, alarms, and built-in test systems, so there is no existing diagnostics knowledge base or maintenance indication system on which to build. The natural progression then seems to take the form of an initial development of an on-board diagnostic (OBD) system for highway bridges analogous to those found in vehicles. Given the lack of established quantitative diagnostic and prognostic services within infrastructure, preliminary state and alarm-based systems built on direct functional performance measures are likely to be the most attractive and immediately useful systems to highway departments.
However, given the extensive investment and research being performed in the field on systems-level diagnostics as well as the excessively slow adoption of advanced technologies in the infrastructure sector, it is entirely possible that systems-level fault indication and diagnostics systems may preclude any basic functional diagnostic components. The general transducer principles underlying electro-mechanical sensing of static and dynamic structural response are covered exhaustively in literature and, for the sake of brevity, will not be discussed beyond those concepts critical to effective implementation. Furthermore, the core focus will be directed toward vibration and strain measurement, as these two approaches afford the unique opportunity for global fault detection through discrete, localized sensor elements. Coexisting methodologies, such as acoustic emissions testing, ground penetrating radar, and thermoelectric imaging, should not be discounted as appropriate tools for the modern inspection and evaluation engineer, though these methods do not lend themselves well to permanent installation, continuous assessment, and multimodal fault diagnostics. Such technologies might better be classified as advanced inspection tools, rather than elements of an intelligent transportation system, analogous to the use of a cylinder pressure gauge for troubleshooting engine
performance, whereas integrated sensing components, such as a tachometer or oxygen sensor, monitor and manage the engine performance.
3.1 Functional Performance-Based Diagnostics
Functional performance-based diagnostics are taken here to be applications of sensors to directly measure the behavior of a component or element as a means of condition assessment and fault identification. This approach is generally used when known design or deterioration issues exist on a structure that must be monitored to ensure safe operation. Rarely is such an approach suggested for widespread application, particularly within academic research, as the significant number of discrete structural components in a highway bridge would necessitate a very large array of sensors to encompass them all. However, with the relatively low cost of MEMS sensors and embedded wireless systems, particularly at large-scale manufacturing, such an approach may be more feasible, and perhaps closer to optimal, than the long-sought global diagnostic methods. In particular, a few dozen sensors per span may not be outrageous when placed into the context of parallel diagnostic and safety assurance systems, particularly given the total asset value of a typical highway bridge. For instance, modern automobiles are equipped with as many as one hundred electronic sensors, although their tasks range from safety and engine management to comfort. Nonetheless, the automotive sensor market is the largest sector of the total sensors market, comprising 25% or $10.5 billion as of 2003 [11].
3.1.1 High Rocker Bearing Rotation Monitoring
To illustrate an easily envisioned functional diagnostic component, a system for monitoring the common problem of abnormal behavior in high rocker bearings will be conceptualized. This type of expansion bearing, used to alleviate thermal stresses, is characterized by a curved surface, similar to the base of a rocking chair, with a pinned joint at the plate-rocker interface that allows for significant longitudinal expansion and contraction through rotation about the bearing axis.
Recently, there have been several high-profile bridge failures, including the 2005 incident on an expressway ramp in Albany, NY and the 2008 Birmingham Bridge failure in Western Pennsylvania, that have unexpectedly caused road closures and the allocation of millions of dollars in emergency contracts. In each case, the span had abruptly dropped the height of the bearing, as much as eight inches, at the abutment or pier after thermal stresses had dislodged rocker bearings frozen by the effects of rust and corrosion. Furthermore, secondary damage is also common during these failures, as the girders impact the substructure piers or abutment with enough force to produce cracking and fracture. Rocker bearings are designed to be set vertically within a moderate temperature range, with contraction toward the cold extremes and expansion at high temperatures. Given the seasonal weather variations in each region and the bearing dimensions, express limit states can be devised whereby functionality can be assessed simply by comparing the actual bearing inclination with the expected inclination. The New York State Department of Transportation recently introduced a simple measurement procedure for high rocker bearings to attempt to isolate
failures within the framework of the biennial inspection program [12]. Under the revision, inspectors are required to determine the angle of rotation of the “worst” case bearing for each line of support and compare the state of contraction/expansion with the ambient temperature. If the bearing is found to be contracted in warm weather or expanded in cold weather, a yellow flag is assigned and if extreme bearing rotation is noted outside of the temperature extremes, the structure is red flagged (possible but not mandated bridge closure). While the devised inspection routine is a valuable addition to the visual inspection program, the two-year period between inspections and static, single point nature of the approach provides less than full assurance of the bearing functionality. As an alternative toward realizing an intelligent infrastructure system, simple continuous monitoring of the bearing functionality could easily be introduced using low-cost MEMS inclinometers wirelessly networked to a central diagnostics processing unit. By orienting an inclinometer to measure the rotation angle about the bearing axis, the embedded sensor node would directly infer the functional performance of the bearing over time, thereby replacing the effort of the inspector with a continuous, real-time assessment strategy to fully mitigate the safety issue. Through comparison with a table of state limits associated with ambient temperature embedded in the central processing unit, the diagnostics system could easily and immediately report rocker bearing faults without the need for any higher-level data extraction or advanced, indirect processing methodologies (Fig. 3).
Fig. 3 Conceptual High Rocker Bearing Functional Diagnostic Subsystem
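The table-lookup check described above can be sketched in a few lines. This is a minimal illustration only: the temperature bands, tilt limits, hard limit, and the names `TiltLimit`, `LIMITS`, and `bearing_status` are all hypothetical and are not taken from the NYSDOT procedure or from the authors' system. The sign convention (positive tilt for expansion) is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TiltLimit:
    temp_min_c: float    # lower bound of ambient temperature band (deg C)
    temp_max_c: float    # upper bound of ambient temperature band (deg C)
    tilt_min_deg: float  # smallest acceptable inclination in this band
    tilt_max_deg: float  # largest acceptable inclination in this band


# Hypothetical limit table; positive tilt = expansion, negative = contraction.
LIMITS = [
    TiltLimit(-40.0, 0.0, -8.0, 1.0),   # cold: contraction expected
    TiltLimit(0.0, 20.0, -3.0, 3.0),    # moderate: near vertical expected
    TiltLimit(20.0, 50.0, -1.0, 8.0),   # warm: expansion expected
]
HARD_LIMIT_DEG = 8.0  # hypothetical absolute rotation limit


def bearing_status(temp_c, tilt_deg):
    """Classify one inclinometer reading against the stored limit table."""
    if abs(tilt_deg) > HARD_LIMIT_DEG:
        return "red"        # extreme rotation regardless of temperature
    for band in LIMITS:
        if band.temp_min_c <= temp_c < band.temp_max_c:
            if band.tilt_min_deg <= tilt_deg <= band.tilt_max_deg:
                return "ok"
            return "yellow"  # e.g. contracted in warm or expanded in cold weather
    return "unknown"         # temperature outside the tabulated range
```

For example, `bearing_status(30.0, -5.0)` flags a bearing that is contracted in warm weather, while `bearing_status(30.0, 4.0)` reports normal behavior. In a deployed system the table would be derived from the bearing dimensions and regional climate rather than chosen by hand.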
3.1.2 Ambient Vibration-Based Cable-Force Monitoring
Another direct functional measurement that has been applied to cable-stayed bridges, arch suspension elements, and external tendons is the use of dynamic response data to estimate the force in the cables or tendons. Strain transducers are
ineffective for this task since all in-service cables are inherently under a state of existing pre-stress that essentially becomes the datum to which the transducer measurement is referenced. Applying such instruments after construction permits only the relative change in force to be monitored, not the absolute tendon force to be estimated. To overcome this challenge, the absolute force is commonly estimated using the natural frequencies of the cable, as measured by an accelerometer or velocity transducer. Various linear and nonlinear estimations are available that neglect or account for additional factors, such as bending stiffness and sag extensibility, but the general approach has been to indirectly estimate the cable force based on measured natural frequencies and free vibration characteristics of a cable [13]. The cable-force determination is performed by placing a linear accelerometer with its sensing axis perpendicular to the axis of the cable or tendon (Fig. 4). The free response is measured and then transformed to the frequency domain for natural frequency estimation. For simplicity, the cable response is often assumed to be characterized by the dynamic response of a slender tensioned wire. Under these assumptions, the cable force can be computed directly from the estimated natural frequency and known material properties of the cable from:
$$T = 4 m L^2 \left( \frac{f_n}{n} \right)^2 \qquad (1)$$
where the cable tension force, T, is a function of the mass density (per unit length), m, the cable length, L, and the n-th natural frequency, f_n. As with the case of monitoring the safety of high rocker bearings, a table of operational state limits of cable or tendon natural frequency based on minimum and maximum allowable tension force could easily be stored within a diagnostic processor for real-time condition assessment based on sensor outputs.
Fig. 4 Cable-force estimation from ambient vibration measurement
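As a rough illustration of the taut-wire estimate and the stored frequency limits described above, the following sketch assumes the standard tensioned-string relation, with bending stiffness and sag neglected. The function names and the numerical values in the usage note are hypothetical.

```python
import math


def cable_tension(mass_per_length, length, freq_hz, mode=1):
    """Taut-string tension (N) from the n-th natural frequency.

    mass_per_length in kg/m, length in m, freq_hz in Hz.
    Bending stiffness and sag are neglected (assumption).
    """
    if mass_per_length <= 0 or length <= 0 or freq_hz <= 0 or mode < 1:
        raise ValueError("inputs must be positive and mode >= 1")
    return 4.0 * mass_per_length * length ** 2 * (freq_hz / mode) ** 2


def frequency_limits(mass_per_length, length, t_min, t_max, mode=1):
    """Natural-frequency band (Hz) corresponding to an allowable tension band."""
    def freq(tension):
        return mode / (2.0 * length) * math.sqrt(tension / mass_per_length)
    return freq(t_min), freq(t_max)
```

For a hypothetical 100 m cable with 50 kg/m mass, `cable_tension(50.0, 100.0, 1.0)` evaluates to 2.0e6 N, and `frequency_limits` converts an allowable tension band into the frequency band that a diagnostic processor could compare against directly.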
3.1.3 Strain-Based Performance Assessment
Application of discrete strain transducers to local regions of interest also permits basic functional parameters to be monitored without extensive post-processing. For exhaustive coverage of the typical sensor topologies used to extract axial deformation, uniaxial bending, and biaxial bending response, as well as the influence of creep, gauge length, and temperature effects, the reader is referred to [14]. Given that traffic loading on a bridge results primarily in longitudinal bending, the most commonly utilized topology for the superstructure elements is the uniaxial bending topology. This is also the most widely utilized and preferred configuration for transducer placement in experimental load ratings. In this topology, two strain transducers are placed at the extreme fibers of the cross-section undergoing bending, or as close to them as accessible (Fig. 5).
Fig. 5 Uniaxial Bending Topology for Strain Transducers
The neutral axis location for uniaxial bending is a higher-level, yet easily obtained, parameter of functional performance derived from the set of strain measurements recorded in the uniaxial bending topology. Using an assumption of a linear strain profile, the neutral axis of the cross-section can be located from the two strain measurements and their respective spatial locations (Fig. 6). The curvature produced as a result of an applied loading can also be computed from the difference of these strains divided by the separation distance. From a purely functional standpoint, the location of the neutral axis is an important parameter for concrete deck on steel girder bridges because it reflects the level of composite action developed between the deck and the girder in the cross-section. For cross-sections composed of strictly homogeneous materials, such as steel, the location of the neutral axis will remain constant unless there is a change in the geometry of the cross-section. This enables monitoring of this parameter as a means of assessing section loss, though it requires the assumption that the section loss is not uniform or symmetric about the neutral axis. For concrete cross-sections, cracking will induce changes in the effective cross-sectional area
and moment of inertia, thereby inducing a shift in the location of the neutral axis [14]. For example, consider a concrete deck supported by steel girders. As cracking occurs in the concrete deck, the neutral axis will tend to shift toward the lower flange of the steel girder. Conversely, if section loss occurs in the steel girder, then the neutral axis will tend to shift toward the concrete deck surface. Unfortunately, such a measure is not exactly a direct measure of functional performance as with the prior examples, as the location of the neutral axis does not provide any direct measure of strength or health indication and symmetric degradation need not induce any shift at all despite the change in element strength.
Fig. 6 Neutral Axis Determination from Uniaxial Bending Topology
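The neutral axis and curvature calculation from a pair of strain readings can be sketched as follows, assuming a linear strain profile between the two gauges. The function name, the sign convention (tension positive, sagging curvature positive), and the example values are assumptions made for illustration.

```python
def neutral_axis_and_curvature(eps_top, eps_bot, y_top, y_bot):
    """Return (neutral-axis height, curvature) for a linear strain profile.

    Heights are measured upward from a common datum (m); strains are
    dimensionless with tension positive.
    """
    depth = y_top - y_bot
    if depth <= 0:
        raise ValueError("y_top must be above y_bot")
    delta = eps_bot - eps_top
    if delta == 0:
        raise ValueError("equal strains: pure axial response, no neutral axis")
    curvature = delta / depth          # positive when the bottom fiber is in tension
    y_na = y_bot + eps_bot / curvature  # height at which the linear profile crosses zero
    return y_na, curvature
```

With gauges 1.0 m apart reading -50 microstrain at the top and +100 microstrain at the bottom, the sketch places the neutral axis two-thirds of the way up from the lower gauge with a curvature of 1.5e-4 per metre. Tracking the returned height over time is one way the shift discussed above could be trended.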
Within experimental load testing, moment distribution factors serve to quantify the demand on each of the primary members under a defined load scenario. To develop these estimates, trucks of known weight are parked at locations in designated lanes and the strains recorded in each girder are used to express the live load moment experienced by each girder as a fraction of the total moment (Fig. 7). As load is applied to a structure with redundancy, it is distributed to the nearest primary members and then shed via the slab and diaphragm connections to the remaining structural elements. Section loss, whether through corrosion or cracking, is directly reflected in the distribution factors, as the strain in the affected member is amplified by the loss of stiffness. Furthermore, bearing and diaphragm connection degradation will result in a change in the load transfer mechanism of a bridge, thereby affecting the distribution factors in each case as well [15]. Since distribution factors are normalized, if purely elastic conditions are considered, this approach need not apply solely to known, weighed truck loads, but could be extended to routine traffic in designated lanes. Using a historical database and statistical trending, significant changes in distribution factors could be used to identify system-level faults through the relative changes in load-shedding behavior.
Fig. 7 a) Upper and Lower flange strain measurements as measured at mid-span of girders during truck pass b) Distribution Factors for truck pass in eastern lane
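A minimal sketch of the distribution factor calculation is given below. It assumes elastic behavior, so that each girder's live load moment is proportional to its measured peak strain multiplied by its section modulus. The function name and the optional `section_moduli` argument are hypothetical, and with identical girders the factors reduce to simple strain ratios.

```python
def distribution_factors(strains, section_moduli=None):
    """Fraction of the total live load moment carried by each girder.

    strains: peak bending strain per girder for one load event.
    section_moduli: optional per-girder weights; equal girders are assumed
    when omitted (assumption for illustration).
    """
    if section_moduli is None:
        section_moduli = [1.0] * len(strains)
    if len(section_moduli) != len(strains):
        raise ValueError("one section modulus is required per girder")
    moments = [s * e for s, e in zip(section_moduli, strains)]
    total = sum(moments)
    if total == 0:
        raise ValueError("no measurable response")
    return [m / total for m in moments]
```

Because the factors are normalized, doubling every strain (a heavier vehicle in the same lane) leaves them unchanged under the elastic assumption, which is the property that would allow routine traffic to be used in place of weighed trucks. For four identical girders reading 120, 80, 40, and 10 microstrain, the first girder carries a factor of 0.48.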
3.2 Modal Analysis and Systems Modeling
In contrast to functional performance-based diagnostics that utilize specific sensors to directly measure the behavior of specific primary or safety-critical members, systems-level diagnostic approaches have been sought that utilize a distributed array of sensors to simultaneously enable fault identification and diagnostics across many failure mechanisms. Such multi-modal approaches have been classified by many as the “holy grail” of structural health diagnostics and, while there are many promising and developing methodologies, there remains no recognized definitive algorithm for the highway bridge, let alone one that satisfies all highway bridge structural types and design specifications. Vibration-based, and by extension ultrasonic, structural health monitoring techniques are based on the ability of multiple, properly spaced sensors to estimate a global response from measurements taken at discrete locations. The underlying general dynamics of vibrating structures can be used to mathematically predict the response motion at discrete locations from the discrete mass, stiffness, and damping matrices:
$$M \ddot{y}(t) + C \dot{y}(t) + K y(t) = B f(t) \qquad (2)$$
where M, C, K, and B are matrices of the model order defining the mass, damping, stiffness, and input locations, respectively, and f and y are the excitation force and displacement vectors. It then follows that even localized changes to the mass, stiffness, or damping matrices that arise out of deterioration, malfunction, or structural damage will impact the global response. Therefore, a distributed network with sensors not necessarily located at the source of a fault should still permit its identification and localization. It is on this fundamental principle that the majority of vibration-based structural health monitoring techniques are based.
Fig. 8 Multi-modal Diagnostics through Vibration-Based Systems Modeling
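The claim that a localized stiffness change affects the global response can be illustrated with a deliberately small example: an undamped two-mass chain fixed at one end, with M = diag(m1, m2) and K = [[k1 + k2, -k2], [-k2, k2]]. The model, the parameter values, and the function name are assumptions chosen for illustration. Damping and forcing are neglected, so only the natural frequencies implied by the mass and stiffness matrices are compared.

```python
import math


def natural_frequencies_2dof(m1, m2, k1, k2):
    """Undamped natural frequencies (Hz) of a two-mass chain fixed at one end.

    Solves det(K - lam * M) = 0, which for this model is the quadratic
    m1*m2*lam^2 - (m1*k2 + m2*(k1 + k2))*lam + k1*k2 = 0.
    """
    a = m1 * m2
    b = m1 * k2 + m2 * (k1 + k2)
    c = k1 * k2
    root = math.sqrt(b * b - 4.0 * a * c)
    lams = ((b - root) / (2.0 * a), (b + root) / (2.0 * a))
    return tuple(math.sqrt(lam) / (2.0 * math.pi) for lam in lams)


# Hypothetical values: 1000 kg masses and 4e6 N/m springs.
healthy = natural_frequencies_2dof(1000.0, 1000.0, 4.0e6, 4.0e6)
# A 10 % stiffness loss in the second spring only.
damaged = natural_frequencies_2dof(1000.0, 1000.0, 4.0e6, 3.6e6)
```

Although only one element is altered, both natural frequencies in `damaged` are lower than their counterparts in `healthy`, so a sensor located anywhere on the system would observe the change.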
Since system identification techniques are available to experimentally construct the system model rather than formulating it from as-built geometries, material properties, and boundary assumptions, it is possible to develop a well-constructed baseline model from a measured response and then compare the model over time to assess and localize disparities. However, such a task is more readily performed when the system inputs are measurable and transfer functions can be computed to solve for modal parameters using experimental modal analysis [16]. On highway bridges and other large civil structures, it is infeasible, or in some cases undesirable from a service perspective, to measure system inputs, and so the only tools available for system identification are those that utilize only response measurements, otherwise known as operational modal analysis. Stochastic system identification methods assume that the system inputs can be modeled as having white noise (stochastic) characteristics and from there manipulate the mathematical models to permit estimation of the modal parameters. Under this constraint, though, only the natural frequencies, damping ratios, and mode shapes can be estimated, as there is no means of reasonably estimating modal participation factors with unmeasured system input.
3.2.1 System Identification Techniques for Civil Structures
Perhaps the simplest form of operational modal analysis is frequency domain decomposition, otherwise known as the “peak-picking” method. In this method, the digitized signals are converted to the frequency domain through Fourier transform and eigenfrequencies are identified as the distinct peaks present in the spectrum, as in the case of the transfer function. The magnitude of the corresponding mode shape at each location for the eigenfrequency is then assigned as the absolute value of the spectral peak.
Using a single reference sensor, the phase angles determine the relative direction, or sign, of the mode shape vector based on whether the frequency component of the signal is in-phase or 180° out-of-phase with the reference. Such an approach is relatively simple to perform and has been widely utilized in modal analysis of civil constructions, though there are
significant drawbacks and limitations which have been well noted [10], [17]. The most immediate limitation is that the approach actually produces operational deflection shapes rather than true mode shapes, in that the shape constructed from the spectral data is actually the naturally weighted combination of all mode shapes that would arise if the structure were excited by a pure harmonic at the selected eigenfrequency [18]. Since only spectral content near the eigenfrequency contributes noticeably to the constructed mode shape estimate, the operational deflection shapes are generally similar to mode shapes. However, for structures with closely spaced modes or with significant damping, the estimated mode shape will often be a combination of multiple modes and therefore may not produce accurate response estimation. The approach also does not directly compute damping ratio estimates, though time-series analysis approaches, such as the random decrement technique, could be applied to obtain such estimates if desired [7]. Data-driven stochastic subspace identification (SSI) is a more robust form of output-only system identification that numerically develops an optimal reconstruction of a state-space form of the equation for the governing dynamics of structural vibration. Complete derivation of the SSI theory, algorithm development, and application examples is presented exhaustively by Peeters [18] as well as in a concise subchapter [7]. Stochastic subspace identification restates the matrix differential equation form of the governing equation as a discrete-time state-space model with the additional inclusion of noise terms to account for real-world measurement and modeling inaccuracies. The resulting state-space model then becomes:
$$x_{k+1} = A x_k + B u_k + w_k, \qquad y_k = C x_k + D u_k + v_k \qquad (3)$$
where x_k is the state vector consisting of displacement and velocity, y_k is the vector of measured response outputs, u_k is the measured system input, and w_k and v_k are the process and measurement noise, respectively; A, B, C, and D are the discrete state, input, output, and direct transmission matrices.
The noise terms are assumed to be stochastic, zero-mean, and white. Then, since ambient environmental and traffic inputs on a highway bridge cannot be measured, the system input is implicitly modeled by the noise terms and the state-space model is reduced to:
$$x_{k+1} = A x_k + w_k, \qquad y_k = C x_k + v_k \qquad (4)$$
In this form, the state-space model now provides a framework by which a solution can be obtained from only response measurements, assuming that the inputs reasonably comply with the assumed zero-mean, white-noise characteristics. This general overview of the mathematical framework of SSI is not intended as an introduction to the method or the mechanics of the technique; interested readers and systems implementers are referred to Peeters [18]. Rather, it is given because the form of the state-space model relates strongly to the forward innovation model in linear control theory, which can be exploited to develop a prediction algorithm for system-level diagnostics. As outlined by Peeters and summarized concisely by
[19], the well-known steady-state Kalman model can be stated using matrices drawn or developed directly from the state-space model used in the SSI analysis:
$$\hat{x}_{k+1} = A \hat{x}_k + K e_k, \qquad e_k = y_k - C \hat{x}_k \qquad (5)$$
where $\hat{x}_k$ is the state estimate, $e_k$ is the prediction error (innovation), $A$ and $C$ are the state-space matrices solved for in SSI, and $K$ is the gain matrix, which can be developed through covariances using the experimental response data to fit the model. Once these matrices have been developed, the system model is complete and forward prediction of the response can be performed in a one-stage prediction-correction process whereby the error between the measurements and the model prediction is used to provide the next time-step prediction. The statistical distribution of the prediction errors obtained by applying the Kalman model to the baseline data from which it was derived through SSI is Gaussian as a result of the assumptions enforced on the system input and noise. As anticipated, when the structural system under test is affected by changes, such as when damage or an element fault occurs, the shape of the probability distribution of the errors will be correspondingly affected, as the model no longer predicts the response as well. Given that the Gaussian distribution is symmetric about zero, the statistical mean is not a particularly applicable indicator; rather, the variance, or second central moment, has shown much promise. The application of this simple diagnostic approach is presented in a field case study provided in the subsequent section.
3.2.2 Practical Application of Stochastic Subspace Identification
Despite the more rigorous mathematical foundation and generally more time-consuming analysis, stochastic subspace identification is a heavily favored approach to output-only system identification, since less-well-excited modes and those with relatively high damping can be extracted and a full state-space model of arbitrary order can be constructed for post-processing and forward prediction.
While the process of developing the system invariant can be automated with knowledge of sensor count and references, the actual determination of appropriate model order is somewhat subjective and requires application of engineering judgment based on the anticipated dynamic response of the structure. This subsection utilizes time-history measurements of vertical response from a single-span multi-girder bridge (described within the case study of Section 4) subjected to ambient traffic loading and instrumented with accelerometers at thirty uniformly spaced nodes. An advisable approach to determination of model order, found quite effective by the authors, is to iteratively select the order using the traditionally prescribed approach of exploiting the principal angle and stabilization plots to suggest a model order, followed by post-processing of the state-space model to compare spectral estimates and modal contributions for validation (Fig. 9). The principal angle difference between subspaces is often largest at model orders that closely reconstruct the experimental data, so these model orders are often suggested as appropriate for assigning to the state-space model [20]. Stabilization diagrams also assist in this task by providing a visual indication of the development of system poles deemed
to be stable according to assigned frequency, damping ratio, and modal vector deviation limits as the model order increases. Generally, over-specification of the model order is necessary to reconstruct poorly excited modes with low signal-to-noise ratio [9], [20]. However, over-specification tends to introduce spurious (non-structural) poles at a greater rate than it reveals the less-excited modes. It is therefore suggested that the model order be kept to the minimum needed to capture the predominant modes measured in the signal, and that over-specification be reserved solely for academic purposes of identifying modal parameters from under-excited and over-damped modes.
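The stabilization check described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it converts the eigenvalues of a discrete-time state matrix into natural frequencies and damping ratios, then flags poles that repeat from one model order to the next within relative limits. The function names, the 1% and 5% tolerances, and the 2% damping used in the demonstration are assumptions, and the modal vector deviation limit mentioned in the text is omitted for brevity.

```python
import numpy as np

def modal_parameters(A, dt):
    """Natural frequencies (Hz) and damping ratios from a discrete-time
    state matrix A identified at sampling interval dt (seconds)."""
    mu = np.linalg.eigvals(A).astype(complex)   # discrete-time poles
    lam = np.log(mu) / dt                       # equivalent continuous-time poles
    lam = lam[lam.imag > 0]                     # keep one pole per conjugate pair
    freq = np.abs(lam) / (2.0 * np.pi)
    damp = -lam.real / np.abs(lam)
    order = np.argsort(freq)
    return freq[order], damp[order]

def stable_mask(f_new, z_new, f_old, z_old, f_tol=0.01, z_tol=0.05):
    """Flag poles of the current order that repeat a pole of the previous
    order within relative limits on frequency and damping (assumed limits)."""
    mask = []
    for f, z in zip(f_new, z_new):
        ok = any(abs(f - fo) <= f_tol * fo and abs(z - zo) <= z_tol * abs(zo)
                 for fo, zo in zip(f_old, z_old))
        mask.append(ok)
    return mask

# Demonstration: a single 8.07 Hz mode with an assumed 2% damping ratio,
# represented at an 80 Hz sampling rate.
dt = 1.0 / 80.0
pole = np.exp(complex(-0.02, np.sqrt(1 - 0.02**2)) * 2 * np.pi * 8.07 * dt)
A_demo = np.array([[pole.real, pole.imag], [-pole.imag, pole.real]])
freq, damp = modal_parameters(A_demo, dt)
```

Repeating `modal_parameters` for each candidate order and passing consecutive results to `stable_mask` gives the data behind a stabilization diagram of the kind shown in Fig. 9b.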
Fig. 9 Aids used in determination of model order. a) Principal Angle separation, b) Stability of poles against model order, o – fully stable pole according to parameter limits, x – partially stabilized pole, c) average power spectrum across all data channels
A typical iteration consists of identifying a potential model order from a gap in the principal angle plot, correlating the choice with the stable pole development illustrated in the stabilization diagram, and then developing the state-space model of the chosen degree of freedom. The spectrum of the stochastic process can then be estimated using the double-sided z-transform of the covariance sequence and appropriate factorization properties, as presented in [18]. The spectrum of the state-space model can then be compared to the average power spectrum of the experimental data to gauge how well the system is reconstructed by the model and how the system poles influence the response (Fig. 10). As the spectrum more closely reconstructs the average power spectrum of the experimental measurements, the mode shapes can then be constructed and analyzed using engineering judgment to determine whether the model is constructed predominantly of structural response modes or if significant spurious poles exist in the model. The aim should be to maximize the inclusion of structural response modes afforded through increased model order while minimizing the development of spurious poles through maintaining reasonably low
model order. Optimally low model order is advantageous as the matrix computation for forward prediction is expedited and the prediction model is less subject to influence from the spurious poles arising from violation of process and measurement noise assumptions, which may exhibit stronger variance over time. It should be stressed again that this process is subjective and the quantitative effect on the forward prediction innovations developed from experimental data consistent with the baseline response measurements has yet to be investigated.
Fig. 10 Progression of spectral content development versus increasing model order. a) An under specified model only reconstructs the most dominant structural modes, though neglects contribution from remaining well-excited modes, b) a near optimal specified model produces a spectrum that is consistent with the average power spectrum computed from the experimental measurements and is composed nearly solely of structural response modes, c) An over specified model reconstructs poorly excited and heavily damped modal contributions, but at the expense of inclusion of many spurious (non-structural) poles
The modal parameters composing the state-space model of order 20, which corresponds to ten degrees of freedom, reveal that the model consists of the first-order bending modes in the longitudinal direction and two of the second-order longitudinal bending modes (Fig. 11). Sensor locations correspond to the intersections of grid lines, which represent six equally spaced accelerometers on each of the five superstructure girders. The first mode (3.75 Hz) is actually a spurious pole introduced by the nature of the vehicular loading, which in this case was a large department of transportation maintenance vehicle driven along a single lane of the
bridge. Furthermore, the third mode presented (9.11 Hz) is suspected to be induced by weak interaction with the first longitudinal bending mode of an adjacent span, as evidenced by further ambient vibration testing of the entire bridge span. Aside from these noted discrepancies, the order of the modes is typical of short-to-medium length multi-girder bridge spans. First-order longitudinal bending modes progress in order as the modes of lowest natural frequency, with the development of girder phase differences giving rise to mixed bending modes. For example, the second-order mixed bending pattern of the first-order longitudinal bending mode (12.57 Hz) features exterior girders experiencing first-order longitudinal bending in-phase with each other while 180° out-of-phase with the central girder, and the remaining interior girders are essentially neutral. The characteristics of this pattern are reflected in the second-order longitudinal analogue of the second-order mixed bending mode (29.71 Hz), except that in this case bending in the longitudinal direction is second-order. When the model order is increased beyond the specified ten degrees of freedom, the missing fundamental second-order longitudinal bending mode and additional higher-order bending modes do arise in the model, but only after the introduction of many spurious modes, so the order 20 state-space model has been designated optimal for this bridge span and corresponding instrumentation configuration under ambient loading.
Fig. 11 Mode shape contributions within system model of order 20. Note: Mode shapes from the two poles of least strength are excluded from the figure: the weak DC pole is irrelevant and the fourth-order mixed bending mode (23 Hz) is excluded due to poor excitation/reconstruction
An often overlooked caveat in experimental and operational modal analysis of bridge spans, particularly multi-girder designs, is the effect of spatial aliasing on mode reconstruction. Typically sensor counts are limited due to cost and time associated with field instrumentation, so particular attention must be provided to correct selection of sensor location. Spatial aliasing requirements often limit the
bending modes accurately constructed to the first few orders. Given that the first-order modes are all present in the spectrum prior to any second-order modes, it was advisable to instrument all girders so that spatial aliasing effects did not cause the higher-order mixed bending modes to appear as mirror images of prior modes. For instance, sometimes due to ease of installation or access issues, vibration sensors are placed only on the shoulder of the roadway or curb, thereby measuring the vertical response over only the exterior girders. In such a case for the multi-girder bridge, interpolation of the second-order mixed bending mode of first-order longitudinal bending (12.57 Hz) would appear as the fundamental first-order longitudinal bending mode (8.07 Hz). Likewise, the third-order mixed bending mode of first-order longitudinal bending (19.03 Hz) would appear as the first-order mixed bending mode of the first-order longitudinal bending mode (10.35 Hz). To satisfy spatial aliasing requirements, at least n+1 girders should be instrumented to reconstruct up to the nth-order mixed mode, and n+1 longitudinal sensors should be placed along each girder to reconstruct the nth-order longitudinal bending mode.
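The aliasing effect in the exterior-girder example can be illustrated with the modal assurance criterion (MAC), a standard measure of similarity between two mode shape vectors. The sketch below is illustrative only: the transverse patterns are idealized unit amplitudes inferred from the verbal description of the 8.07 Hz and 12.57 Hz modes, not measured mode shapes, and all names are hypothetical.

```python
def mac(phi_a, phi_b):
    """Modal assurance criterion between two real mode-shape vectors."""
    dot = sum(a * b for a, b in zip(phi_a, phi_b))
    norm_a = sum(a * a for a in phi_a)
    norm_b = sum(b * b for b in phi_b)
    return dot * dot / (norm_a * norm_b)

def pick(shape, girders):
    """Keep only the entries at the instrumented girders."""
    return [shape[g] for g in girders]

# Idealized transverse amplitude at one section of the five girders (assumed).
first_bending = [1.0, 1.0, 1.0, 1.0, 1.0]    # all girders in phase
second_mixed = [1.0, 0.0, -1.0, 0.0, 1.0]    # exterior in phase, central opposed

all_girders = [0, 1, 2, 3, 4]
exterior_only = [0, 4]

# With all five girders instrumented the two shapes are clearly distinct.
mac_full = mac(pick(first_bending, all_girders), pick(second_mixed, all_girders))

# With only the exterior girders instrumented they become indistinguishable.
mac_exterior = mac(pick(first_bending, exterior_only),
                   pick(second_mixed, exterior_only))
```

Under these assumed shapes `mac_full` is 1/15 while `mac_exterior` is exactly 1, which is the mirror-image behavior the text warns about when interior girders are left uninstrumented.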
4 Span Vibration Monitoring and Analysis: A Case Study
The application of vibration-based global diagnostics from a systems model constructed from in-service measurements is illustrated through application to a bridge span under prescribed, progressive damage (Fig. 12). The RT345 Bridge over Big Sucker Brook in Waddington, NY was a steel girder bridge with a 19.1 cm thick reinforced concrete slab, at the end of its 51-year service life. The two-lane structure consisted of three 13.7 m spans for a total span length of 41.7 m, with girders spaced at 2.1 m and simply supported by fixed and high rocker bearings. At the biennial inspection performed six months prior to the test program, the bridge was assigned a sufficiency rating of 61.2%, an operational rating of 44.5 metric tons, and an average daily traffic estimate of 1169 vehicles. Complete details of the test program and extended, definitive results from application of the diagnostic algorithm can be found in the literature [4]. The northern span of this structure was instrumented with a rectangular array of thirty dual-axis accelerometers mounted on the lower web of the girders to monitor the vertical and lateral response to ambient excitation. Within this configuration, each of the five girders was instrumented at six equally spaced locations with a dual-axis accelerometer. The accelerometers employed were low-noise capacitive MEMS sensors, enabling full bandwidth response to static DC, and were interfaced with custom wireless data acquisition and transceiver nodes developed by the authors [8]. Hardware-based signal conditioning was provided in the form of signal amplification of 64 V/V, offset nulling of the gravitational field to bias the signal at the mid-range of the analog-to-digital converter (ADC), and anti-aliasing low-pass filtering in the form of a 5th-order Butterworth design.
The signals were converted using a 12-bit successive approximation ADC at a sampling rate of 512Hz, then passed through a supplemental digital low pass filtering algorithm,
and decimated to an effective sampling rate of 128Hz prior to transmission. This conditioning approach assures superior rejection of alias frequencies above the signal bandwidth while maintaining an exceptionally flat, non-attenuating passband from DC to 50Hz. Furthermore, an oversampling ratio of four is accepted to produce an additional effective bit of resolution in the converter measurement.
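The resolution gain quoted above follows from the usual rule of thumb for oversampling with white quantization noise: each factor of four in sampling rate, followed by filtering and decimation, yields roughly one additional effective bit. A minimal sketch of that arithmetic, using the sampling rates given in the text and a hypothetical helper name:

```python
import math

def extra_bits(oversampling_ratio):
    """Ideal gain in effective resolution (bits) from oversampling,
    assuming white quantization noise removed by the decimation filter."""
    return 0.5 * math.log2(oversampling_ratio)

fs_adc = 512.0                  # converter sampling rate (Hz), from the text
fs_out = 128.0                  # effective rate after decimation (Hz), from the text
ratio = fs_adc / fs_out         # oversampling ratio of four
gain_bits = extra_bits(ratio)   # one additional effective bit
effective_bits = 12 + gain_bits # nominal 12-bit ADC plus the ideal gain
```

The 13-bit figure is an idealized upper bound; the gain actually realized depends on the noise characteristics of the converter and front end.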
Fig. 12 RT345 Bridge over Big Sucker Brook, Waddington, NY (Northern-most span instrumented and subject to progressive, prescribed damage scenarios)
Baseline ambient vibration monitoring was conducted to develop a baseline model of the structural response using stochastic subspace identification. Given the sampling rate used, an excessive number of natural frequencies were identified in the original measurement bandwidth beyond those desirable for the diagnostic implementation, particularly in the lateral direction, which captured many torsional modes of the individual girders. To reduce the model to only first- and second-order bending modes, the measurements were digitally resampled to an effective rate of 80 Hz, following appropriate anti-alias conditioning. Data-driven stochastic subspace identification was then performed using all sensor channels as reference measurements, and the forward innovation model was developed from the state-space model resulting from the SSI solution of the system matrices.
In the following analysis, only the vertical response is utilized in the state-space model formulation to provide a concise study of the effectiveness of a low-order system model in identifying and localizing the onset of multi-modal structural damage. The system model is developed from a 90-second time history of the measured vertical response to ambient traffic loading across the thirty measurement locations, prior to any introduction of damage beyond the existing structural deterioration present in the end-of-service bridge span. This system model corresponds to the 20th-order model developed in the prior section, the spectrum of which is presented in Fig. 10 and the modal contributions of which are presented in Fig. 11. The system model therefore includes all five first-order global bending modes as well as two second-order modes in the form of the first- and second-order mixed bending patterns. Consequently, the resultant 20-pole model is of moderate size and consists of seven span-specific structural modes, one structural mode arising from weak interaction with the unaffected adjacent span, and one spurious mode arising from the nature of the vehicular loading, as previously noted. For an expanded study with a larger 58-pole model that includes the girder lateral response measurement in the diagnostic algorithm, interested readers are referred to the authors' work in [4]. It should be noted that the expanded study finds that the lateral response, as measured on the lower web of the girders, provides clearer, stronger, and more consistent indication and localization of induced damage. This has important implications, considering that a large volume of damage diagnostics studies to date have focused solely on vertical response measurements and/or have featured placement of accelerometers on the deck surface. For each data set, the time history was passed through the baseline forward innovation model to develop the matrix of time-step prediction errors for each sensor channel.
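Passing a time history through the forward innovation model can be sketched as a simple recursion. The following is a minimal illustration under stated assumptions rather than the authors' code: it assumes the standard innovation form, in which the error between each measurement and the model prediction corrects the next state estimate, starts the state estimate at zero, and takes the matrices `A`, `C`, and `K` as already identified. The function name and array layout are hypothetical.

```python
import numpy as np

def prediction_errors(A, C, K, Y):
    """One-step-ahead prediction errors for every sample and channel.

    A (n x n), C (m x n) and K (n x m) are the identified state, output
    and gain matrices; Y has shape (n_samples, m). Returns an array of
    the same shape as Y holding the error at each time step.
    """
    x = np.zeros(A.shape[0])            # state estimate, started at rest (assumption)
    E = np.empty_like(Y, dtype=float)
    for k, y in enumerate(Y):
        e = y - C @ x                   # measurement minus model prediction
        E[k] = e
        x = A @ x + K @ e               # next-step prediction corrected by the error
    return E

# Tiny single-channel demonstration with made-up matrices.
A_demo = np.array([[0.5]])
C_demo = np.array([[1.0]])
K_demo = np.array([[0.5]])
Y_demo = np.array([[1.0], [1.0], [1.0]])
E_demo = prediction_errors(A_demo, C_demo, K_demo, Y_demo)
```

Each column of the returned array is the prediction-error sequence of one sensor channel, which is the quantity whose variance is examined in the diagnostic.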
Comparison of the variance of the prediction errors with the baseline variance of the reference-data prediction errors, through the percentage difference, is then used to assign an index to each sensor for damage localization:
$$DI_i = \frac{\sigma_i^2 - \sigma_{i,\mathrm{ref}}^2}{\sigma_{i,\mathrm{ref}}^2} \times 100\% \qquad (6)$$
where $\sigma_i^2$ is the variance of the prediction errors at sensor $i$ for the data set under test and $\sigma_{i,\mathrm{ref}}^2$ is the corresponding variance for the baseline reference data. Positive damage indices indicate that the forward innovation model based on the reference measurements predicts the time history more poorly relative to the baseline data, while negative damage indices indicate better prediction; only positive indices are then taken to suggest damage in this study. For clarity, the pseudocolor plots presented in the following analysis are drawn with a lower colormap bound of zero so as not to result in shading at locations with damage indices less than zero. The first damage scenario introduced to the structure was simulation of bearing damage through the use of applied restraint from a hydraulic jack at the easternmost high rocker bearing at the north abutment. The damage indices derived through the percentage difference in prediction error variance clearly suggest the
damage localized at the affected exterior bearing (Fig. 13). Given that the rocker bearing provides rotational freedom about the axis perpendicular to the longitudinal axis of the bridge span, it is expected that the introduction of the hydraulic jack restraint about this axis will affect the flexure of the supported girder. The damage indices clearly display the expected profile, with severity strongest near the affected bearing and decaying along the length of the girder. Due to the structural continuity between the exterior and adjacent girders, provided in particular by the end diaphragm and deck, the presence of decaying damage indices along the adjacent girder local to the affected bearing is also well explained.
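The damage index used here, the percentage difference between the prediction-error variance of a test record and that of the baseline record, can be computed per sensor channel as follows. This is a minimal sketch with hypothetical names and made-up error sequences; it also shows the clipping at zero used for the pseudocolor plots.

```python
from statistics import pvariance

def damage_indices(errors, baseline_errors):
    """Percentage difference in prediction-error variance for each channel.

    Both arguments are lists with one error sequence per sensor channel.
    """
    indices = []
    for e_test, e_ref in zip(errors, baseline_errors):
        var_test = pvariance(e_test)
        var_ref = pvariance(e_ref)
        indices.append(100.0 * (var_test - var_ref) / var_ref)
    return indices

# Made-up error sequences for two channels.
baseline = [[1.0, -1.0, 1.0, -1.0], [1.0, -1.0, 1.0, -1.0]]
test = [[2.0, -2.0, 2.0, -2.0], [0.5, -0.5, 0.5, -0.5]]

di = damage_indices(test, baseline)          # positive: poorer prediction
plotted = [max(d, 0.0) for d in di]          # lower colormap bound of zero
```

In this toy case the first channel's error variance quadruples, giving an index of 300, while the second channel is predicted better than the baseline and is therefore clipped to zero for plotting.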
Fig. 13 Bearing damage identified through damage indices of forward innovation model constructed from 20-pole state-space model of system identification and photograph of hydraulic jack inducing reactions at exterior high rocker bearing. Note: sensor locations denoted by ‘o’, piecewise bilinear interpolation of color intensity between data points
After removal of the hydraulic jack, reference measurements were then obtained to quantify any residual effect on the prediction response, which in this case was found to be notably low, with a slight bias of damage indices toward the affected girder. Due to the low magnitudes of the residual damage indices, it was considered sufficient to treat them as an offset to be removed from subsequent damage scenario cases rather than mandating the development of a new forward innovation model based on the post-bearing scenario response. Removal of all six connection bolts at the interior connection of the easternmost intermediate diaphragm clearly signaled damage that would be consistent with the disabling of the affected diaphragm (Fig. 14). The primary in-service function of diaphragm members is to transfer lateral wind loads throughout the superstructure and to aid the deck in distribution of vertical loads across the longitudinal girders and bearing elements. Since lateral response is not considered in this particular analysis, only the effect of loss of load transfer should be evident in the damage indices. This effect can be witnessed in the damage indices, which are nearly uniform and confined solely to the exterior girder, to which the traffic loading would no longer be transferred as effectively without the intermediate diaphragm.
Fig. 14 Diaphragm connection damage identified through indices of forward innovation model and photograph of corresponding bolt removal at diaphragm connection to interior girder
An additional diaphragm connection was compromised to gauge the ability of the approach to isolate multiple cases of damage simultaneously present in the structure. In this scenario, the six bolts at the connection of the western interior diaphragm to the central girder were removed. The damage indices from this case in reference to the post-bearing baseline record reveal both the existing damage along the exterior girder affected by the first diaphragm connection damage and the onset of damage local to the newly affected diaphragm (Fig. 15). To provide better localization and identification of the newly induced damage, the first diaphragm damage indices were then treated as the reference for the present damage state, and the damage local to the newly affected diaphragm readily becomes more apparent.
Fig. 15 Second diaphragm connection affected. a) Damage indices relative to pre-diaphragm #1 connection damage. b) Damage indices relative to post-diaphragm #1 connection damage
Although by no means definitive or rigorously applied to multiple bridges or even extensive measurement databases, the presented approach of statistically analyzing prediction errors arising from applying a forward innovation model in the form
of a steady-state Kalman filter with system matrices developed from stochastic subspace identification has shown much promise for gauging the in-service health of highway structures. The results of an experimental study on an end-of-service bridge span with prescribed damage suggest the ability of the approach to identify and localize multi-modal and multiple occurrence damage to the span. At the very least, the study illustrates the feasibility of global systems-based sensing approaches to tackling multi-modal, multiple occurrence fault and anomaly detection using distributed sensor arrays rather than solely functional performance-based transducers.
References
[1] Dynamic Characterization and Damage Detection in the I-40 Bridge over the Rio Grande. Los Alamos National Laboratory, LA-12767-MS, p. 153 (1994)
[2] Alampalli, S., Fu, G., Dillon, E.W.: Signal versus Noise in Damage Detection by Experimental Modal Analysis. Journal of Structural Engineering 123(2), 237–245 (1997)
[3] Peeters, B., De Roeck, G.: One year monitoring of the Z24-Bridge: environmental influences versus damage events. In: Proceedings of IMAC 18, the International Modal Analysis Conference, San Antonio, TX, pp. 1570–1576 (2000)
[4] Whelan, M.J., Janoyan, K.D.: In-Service Diagnostics of a Highway Bridge from a Progressive Damage Case Study. Journal of Bridge Engineering (2010) (Submitted for Publication)
[5] Farrar, C.R., Lieven, N.A.J.: Damage Prognosis: the future of structural health monitoring. Philosophical Transactions of the Royal Society A 365, 623–632 (2007)
[6] Vachtsevanos, G., Lewis, F., Roemer, M., Hess, A., Wu, B.: Intelligent fault diagnosis and prognosis for engineering systems. John Wiley & Sons, New Jersey (2006)
[7] Wenzel, H., Pichler, D.: Ambient Vibration Monitoring. John Wiley & Sons, West Sussex (2005)
[8] Whelan, M.J., Janoyan, K.D.: Design of a Robust, High-Rate Wireless Sensor Network for Static and Dynamic Structural Health Monitoring. Journal of Intelligent Material Systems and Structures 20(7) (2009)
[9] Whelan, M.J., Gangone, M.V., Janoyan, K.D., Jha, R.: Real-Time Wireless Vibration Monitoring for Operational Modal Analysis of an Integral Abutment Highway Bridge. Engineering Structures 30(10), 2224–2235 (2009)
[10] Whelan, M.J., Gangone, M.V., Janoyan, K.D., Jha, R.: Highway Bridge Assessment using an Adaptive Real-Time Wireless Sensor Network. IEEE Sensors – Special Issue on Sensor Systems for Structural Health Monitoring 9(11), 1405–1413 (2009)
[11] Marek, J., Suzuki, Y., Gopel, W., Queisser, H.-J.: Sensors for Automotive Applications, vol. 4. John Wiley & Sons, Chichester (2003)
[12] New Inspection and Monitoring Requirements for High Rocker Bearings, Technical Advisory – State of New York Department of Transportation Bridge Inspection Manual (October 2005)
[13] Kim, B.H., Park, T.: Estimation of cable tension force using the frequency-based system identification method. Journal of Sound and Vibration 304, 660–676 (2007)
[14] Glišić, B., Inaudi, D.: Fibre Optic Methods for Structural Health Monitoring. John Wiley & Sons, West Sussex (2007)
[15] Hag-Elsafi, O., Kumin, J.: Load testing for bridge rating: Route 22 over Swamp River. New York State Department of Transportation (NYSDOT), Special Report (February 2006)
[16] Inman, D.: Engineering Vibration, 2nd edn. Prentice-Hall, New Jersey (2001)
[17] Ren, W.-X., Zhao, T., Harik, I.E.: Experimental and Analytical Modal Analysis of Steel Arch Bridge. Journal of Structural Engineering 130(7), 1022–1031 (2004)
[18] Peeters, B.: System Identification and Damage Detection in Civil Engineering. Doctoral Dissertation. Katholieke Universiteit Leuven (2000)
[19] Yan, A.-M., De Boe, P., Golinval, J.-C.: Structural Damage Diagnosis by Kalman Model Based on Stochastic Subspace Identification. Structural Health Monitoring 3(2), 103–119 (2004)
[20] Van den Branden, B., Peeters, B., De Roeck, G.D.: Introduction to MACEC v2.0: Modal Analysis on Civil Engineering Constructions User Guide and Case Studies. Katholieke Universiteit Leuven, 47 (1999)
Maintenance Optimization for Heterogeneous Infrastructure Systems: Evolutionary Algorithms for Bottom-Up Methods
Hwasoo Yeo, Yoonjin Yoon, and Samer Madanat
Abstract. This chapter presents a methodology for maintenance optimization for heterogeneous infrastructure systems, i.e., systems composed of multiple facilities with different characteristics such as environments, materials and deterioration processes. We present a two-stage bottom-up approach. In the first step, optimal and near-optimal maintenance policies for each facility are found and used as inputs for the system-level optimization. In the second step, the problem is formulated as a constrained combinatorial optimization problem, where the best combination of facility-level optimal and near-optimal solutions is identified. An Evolutionary Algorithm (EA) is adopted to solve the combinatorial optimization problem. Its performance is evaluated using a hypothetical system of pavement sections. We find that a near-optimal solution (within less than 0.1% difference from the optimal solution) can be obtained in most cases. Numerical experiments show the potential of the proposed algorithm to solve the maintenance optimization problem for realistic heterogeneous systems.
Hwasoo Yeo, Department of Civil and Environmental Engineering, Korea Advanced Institute of Science and Technology, 335 Gwahangno, Yuseong-gu, Daejeon, Republic of Korea, e-mail: [email protected]
Yoonjin Yoon · Samer Madanat, Institute of Transportation Studies and Department of Civil and Environmental Engineering, University of California, Berkeley, CA 94720, U.S.A., e-mail: [email protected], [email protected]
K. Gopalakrishnan & S. Peeta (Eds.): Sustainable & Resilient Critical Infrastructure Sys., pp. 185–199. © Springer-Verlag Berlin Heidelberg 2010 springerlink.com
1 Introduction
Infrastructure management is a periodic process of inspection, maintenance policy selection, and application of maintenance activities. Maintenance, Rehabilitation, and Reconstruction (MR&R) policy selection is an optimization problem where the objective is to minimize the expected total life-cycle cost of keeping the facilities
in the system above a minimum service level while satisfying agency budget constraints. MR&R optimization can be performed using one of two approaches: top-down and bottom-up. In a pavement management system, a top-down approach provides a simultaneous analysis of an entire roadway system. It first aggregates pavement segments having similar characteristics, such as structure, traffic loading, and environmental factors, into mutually exclusive and collectively exhaustive homogeneous groups. The units of policy analysis are the fractions of those groups in specific conditions, and individual road segments are not represented in the optimization. As a result, much of the segment-specific information (history of construction, rehabilitation, and maintenance; materials; structure) is lost. One of the main advantages of a top-down approach is that it enables decision makers to address the trade-off between rehabilitation of a small number of facilities and maintenance of a larger number of facilities, given a budget constraint. On the other hand, the top-down approach does not specify optimal activities for each individual facility, and mapping system-level policies to facility-level activities is left to the discretion of district engineers. One of the early examples of a top-down formulation is the Arizona Department of Transportation (ADOT) Pavement Management System (PMS), which selects maintenance and rehabilitation strategies that minimize life-cycle cost. The ADOT PMS saved $200 million over five years (OECD, 1987). However, the ADOT PMS is designed for homogeneous systems where all facilities are assumed to have the same characteristics, and cannot be applied to a heterogeneous system where individual facility characteristics differ. For a heterogeneous system, composed of facilities with different material, deterioration process, and environmental characteristics, it is necessary to specify optimal maintenance activities at the facility level.
For example, a system of bridges usually consists of facilities of different materials, structural designs, and traffic loads. For maintenance optimization of a heterogeneous system, a bottom-up approach is appropriate to determine maintenance policies at the facility level. In formulating heterogeneous system optimization, Robelin and Madanat (2007) proposed a bottom-up approach as follows. First, identify a set of optimal (or near-optimal) sequences of MR&R activities for each facility over the desired planning horizon. Then, find the optimal combination of MR&R activity sequences for the entire system given a budget constraint. The main advantage of the bottom-up approach is that the identity of individual facilities is preserved, as we maintain the information associated with each facility such as structure, materials, history of construction, MR&R, traffic loading, and environmental factors. However, preserving individual details leads to high combinatorial complexity in the system optimization step. The methodology proposed herein is an attempt to overcome this shortcoming of the bottom-up formulation. We propose a two-stage bottom-up approach to address MR&R planning for an infrastructure system composed of dissimilar facilities undergoing stochastic state transitions over a finite planning horizon. This chapter consists of five sections. In Section 2, state-of-the-art methods for MR&R planning are reviewed. In Section 3, a new two-stage approach for solving the heterogeneous system maintenance
problem is presented. In Section 4, a parametric study is presented to illustrate and evaluate the new approach. Finally, Section 5 presents conclusions.
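The system-level step of the bottom-up approach, choosing one candidate policy per facility so that total expected cost is minimized within the budget, can be illustrated on a toy instance. The sketch below deliberately uses exhaustive enumeration rather than the evolutionary algorithm adopted in this chapter, to show why the number of combinations grows multiplicatively with the number of facilities. The facility names, costs, and single-period budget are made-up values for illustration.

```python
from itertools import product

# Hypothetical candidates per facility, as produced by a facility-level step:
# (expected life-cycle cost, agency cost charged against the budget).
candidates = {
    "A": [(100.0, 40.0), (104.0, 25.0), (110.0, 10.0)],
    "B": [(80.0, 30.0), (83.0, 15.0)],
    "C": [(60.0, 20.0), (66.0, 5.0)],
}
budget = 60.0  # single-period budget (simplifying assumption)

def best_combination(candidates, budget):
    """Enumerate every combination of one candidate per facility and return
    (total expected cost, chosen candidate index per facility) for the
    cheapest combination that respects the budget, or None if infeasible."""
    names = sorted(candidates)
    best = None
    for choice in product(*(range(len(candidates[n])) for n in names)):
        spend = sum(candidates[n][i][1] for n, i in zip(names, choice))
        if spend > budget:
            continue  # violates the budget constraint
        cost = sum(candidates[n][i][0] for n, i in zip(names, choice))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, choice)))
    return best

n_combinations = 3 * 2 * 2
solution = best_combination(candidates, budget)
```

Here the facility-level optima (index 0 for every facility) would require 90 units of budget, so the constrained solution mixes optimal and near-optimal candidates. With only three facilities there are 12 combinations; for realistic systems the count becomes intractable, which is the motivation for a heuristic search.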
2 Literature Review Infrastructure maintenance optimization problems can be classified into single facility problems and multi-facility problems (also known as system-level problems). The single facility problem is concerned with finding the optimal policy, the set of MR&R activities needed for each state of the facility that achieves the minimum expected life-cycle cost. Optimal Control (Friesz and Fernandez, 1979; Tsunokawa and Schofer, 1994), Dynamic Programming (Carnahan, 1988; Madanat and Ben-Akiva, 1994), Nonlinear minimization (Li and Madanat, 2002), and Calculus of Variations (Ouyang and Madanat, 2006) have been used as solution methods. For the system-level problem, the objective is to find the optimal set of MR&R policies for all facilities in the system, which minimizes the expected sum of lifecycle cost within the budget constraint for each year. The optimal solution at the system-level will not coincide with the set of optimal policies for each facility if the budget constraint is binding. Homogeneous system problems have been solved by using linear programming (Golabi et al., 1982; Harper and Majidzadeh, 1991; Smilowitz and Madanat, 2000). The decision variables for linear programming are the proportions of facilities that need a specific MR&R activity at a certain state. This top-down approach has advantages, but as discussed earlier, it cannot be directly applied to MR&R optimization for heterogeneous systems. Fwa et al. (1996) used genetic-algorithms, to address the trade-off between rehabilitation and maintenance. The authors assumed four categories of agency cost structure, based on the relative costs among rehabilitation and three maintenance activities for 30 homogeneous facilities. Durango-Cohen et al. (2007) proposed a quadratic programming platform for multi-facility MR&R problem. 
While the quadratic programming (QP) formulation successfully captures the effect of MR&R interdependency between facility pairs, the applicability of QP is limited to situations in which the costs are quadratic. The numerical example in that study is limited to facilities with the same deterministic deterioration process, where each facility is a member of either a ‘substitutable’ or a ‘complementary’ network. Although intuitively sensible, the determination of ‘substitutable’ or ‘complementary’ networks might not be evident in large-scale networks. Ouyang (2007) developed a new approach for the system-level pavement management problem using multi-dimensional dynamic programming. He expanded the dynamic programming formulation used in the facility-level optimization to multiple facilities. To overcome the computational difficulty associated with the multi-dimensional problem, he adopted an approximation method and applied it to a deterministic, infinite-horizon problem.
H. Yeo, Y. Yoon, and S. Madanat
Robelin and Madanat (2006) used a bottom-up approach for MR&R optimization of a heterogeneous bridge system. At the facility level, all possible combinations of decision variables are enumerated; at the system level, the best combination of the enumerated solutions is determined by searching the solution space. This system-level problem has combinatorial computational complexity. The authors find a set of lower and upper cost bounds for the optimal solution, which narrows the search space. In a related work, Robelin and Madanat (2008) formulated and solved a risk-based MR&R optimization problem for Markovian systems. At the facility level, the optimization consists of minimizing the cost of maintenance and replacement, subject to a reliability constraint. At the system level, the dual of the corresponding problem is solved: risk minimization (i.e., reliability maximization) subject to a budget constraint; specifically, the objective is to minimize the maximum risk across all facilities in the system, subject to the sum of MR&R costs not exceeding the budget constraint. The solution to the system-level problem turns out to have a simple structure, with linear computational time. The approach used in Robelin and Madanat (2008) is limited to risk-based MR&R optimization, where the objective function has a min-max form, and is not applicable to serviceability-based optimization problems. For serviceability-based problems, the objective function takes the form of an expected cost minimization (or expected serviceability maximization), which does not lend itself to solutions with such a simple structure. This motivates the approach proposed in this chapter.
3 Methodologies

Consider an infrastructure system composed of N independent facilities with different attributes such as design characteristics, materials, and traffic loads; such a system is heterogeneous. We assume that a managing agency has to find the best combination of maintenance activities within the budget constraint of the current year. This optimization process is repeated at the start of every year using the outputs of facility inspections. As the optimization is an annual process and future budgets are unknown, future budget constraints are not considered in the current-year optimization. The objective is to find an optimal combination of facility-level maintenance activities that minimizes the total system-level cost. We assume that two variables, cost and activity, can be defined for all facilities, regardless of individual characteristics. We assume that inspections are performed at the beginning of the year, and that the current state of each facility is known.

In our two-stage bottom-up approach, we first solve the facility-level optimization to find a set of best and alternative MR&R activities and costs for each facility. In the second stage, we solve the system-level optimization to find the best combination of MR&R activities across facilities by choosing among the optimal and sub-optimal alternative activities found in the first stage. Fig. 1 illustrates the system-level optimization for N facilities. For each facility, the optimal activity and the 1st and 2nd alternative activities are obtained from the facility-level optimization. The initial solution in the system-level
is the set of optimal activities, [a1, a3, …, a1]. Our objective is to find an optimal combination of MR&R activities, while minimizing the total life-cycle cost within the budget constraint for the current year. Due to the presence of a budget constraint, the optimal activities found in the facility-level optimization are not necessarily included in the system-level solution. Instead, the next alternative activity may replace the optimal activity for certain facilities if needed, as illustrated in the case of facility 2 in Fig. 1.
Fig. 1 System-level solution using the proposed bottom-up approach. For each facility 1, 2, …, N, the facility-level optimization ranks the MR&R activities (a1, a2, a3) as optimal, 1st alternative, and 2nd alternative; the system-level optimization then selects one activity per facility, giving the system-level solution [a1, a2, …, a2].
In this chapter, we develop a general methodology for heterogeneous system optimization, with emphasis on pavement systems because they are among the most widely researched. The pavement problem has the structure common to infrastructure management problems, i.e., probabilistic state transitions, time discounting, and multiple MR&R activities. Therefore, the methodology developed here can be modified and applied to other types of facilities such as bridge systems. In a Pavement Management System (PMS), the state of a pavement can be represented by discrete numbers such as the Pavement Serviceability Rating (PSR), ranging from 1 (worst condition) to 5 (best condition). If pavement deterioration can be represented as a Markovian process, the change in serviceability (PSR) over time depends only on the current state and the maintenance activity applied at the beginning of the time period after inspection. The transition probability matrix Pa(i,j) specifies the probability of a state change from state i to state j after applying maintenance activity a. An MR&R program X = [x1, …, xN] is a set of activities that will be applied to the N facilities in the system in the current year. We assume a finite planning horizon of length T. The vector X must be feasible, i.e., it must satisfy the budget constraint for the current year.
3.1 Facility-Level Optimization

The facility-level optimization solves for the optimal activity and its cost pair (action cost and expected cost-to-go) without accounting for the budget constraint. It also identifies suboptimal alternative policies and their cost pairs. The facility-level optimization for a PMS can be formulated as a dynamic program to obtain an optimal policy and the alternative policies. The dynamic programming formulation that solves for the optimal activity $a^*$ and its expected cost-to-go $V^*$ is:

$$a^*(i,t) = \arg\min_{a \in A} \Big\{ C(a,i) + \alpha \sum_{j \in S} V(j,t+1)\, P_a(i,j) \Big\} \tag{1}$$

$$V^*(i,t) = \min_{a \in A} \Big\{ C(a,i) + \alpha \sum_{j \in S} V(j,t+1)\, P_a(i,j) \Big\} \tag{2}$$
where

$A$: set of feasible maintenance activities, $A = \{a_1, a_2, \ldots\}$
$S$: set of feasible states of the facility
$P_a(i,j)$: transition probability from state $i$ to state $j$ under maintenance activity $a$
$C(a,i)$: agency cost for activity $a$ performed on a facility in state $i$
$\alpha$: discount factor, $\alpha = 1/(1+r)$, where $r$ is the discount rate

Other costs such as user costs are not directly included in the formulation, but can be added to the agency cost. In the PMS example, user costs are considered indirectly by not allowing pavement states below a certain threshold value. We assume that salvage values at time T can be assigned or postulated. Iterating equations (1) and (2) from time T-1 to 1, we obtain the minimum expected total cost-to-go $V^*(i,1)$ from the current year (t = 1) to the end of the planning horizon T.

In the example shown in Fig. 2, the facility state is 8 at time t and three activities are available. After the expected cost-to-go of each activity is computed, the activity with the minimum expected cost-to-go, here $a_3$ with cost $V_1^*(8,t)$, is chosen as the optimal activity. The activity $a_2$ with the second smallest expected cost-to-go, $V_2^*(8,t)$, is the first alternative activity; the one with the third smallest expected cost-to-go, $V_3^*(8,t)$, is the second alternative, and so on. That is, $a_1^*(8,t) = a_3$, $a_2^*(8,t) = a_2$, and $a_3^*(8,t) = a_1$. The k-th alternative activity $a_{k+1}^*$ and its expected cost-to-go $V_{k+1}^*(i,t)$ can be found by using the following equations:

$$a_{k+1}^*(i,t) = \arg\min_{a \in A - \{a_l^*,\, l \le k\}} \Big\{ C(a,i) + \alpha \sum_{j \in S} V(j,t+1)\, P_a(i,j) \Big\}, \quad k = 0, 1, 2, \ldots \tag{3}$$

$$V_{k+1}^*(i,t) = \min_{a \in A - \{a_l^*,\, l \le k\}} \Big\{ C(a,i) + \alpha \sum_{j \in S} V(j,t+1)\, P_a(i,j) \Big\}, \quad k = 0, 1, 2, \ldots \tag{4}$$
Note that when k = 0, the result of Equation (3), $a_1^*$, is the optimal activity, and $V_1^*$, the result of Equation (4), is the expected cost-to-go for the optimal activity. Thus,
Equations (3) and (4) are used to solve for both optimal and alternative activities. Iterating backward in time, the optimal policy and alternative policies $\{a_1^*, a_2^*, a_3^*, \ldots\}$ and their costs $\{V_1^*, V_2^*, V_3^*, \ldots\}$ can be solved for the current year. Although the facility-level optimization can also be formulated and solved as a linear program (for the infinite horizon case), we used dynamic programming because it also produces the alternative policies and costs used as inputs for the system-level optimization without additional calculations.
Fig. 2 Dynamic programming process for facility-level optimization. For a facility in state 8 at time t, the expected cost-to-go of each activity (a1, a2, a3) is computed from the transition probabilities, e.g. Pa1(8,9) and Pa3(8,7), and the values V1(j, t+1) of the reachable states; the activity with the minimum expected cost-to-go is chosen as optimal, followed by the 1st and 2nd alternatives. The recursion runs backward from time T to time 1.
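The backward recursion of Equations (1) to (4) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ranked_activities`, the three-state toy facility, the penalty of 100 for doing nothing in the worst state (a stand-in for the unacceptable-state constraint), and the zero salvage values are all hypothetical.

```python
def ranked_activities(costs, trans, salvage, horizon, alpha):
    """Rank activities by expected cost-to-go, following Eqs. (1)-(4).

    costs[a][i]    : agency cost C(a, i)
    trans[a][i][j] : transition probability P_a(i, j)
    salvage[j]     : assumed cost-to-go at the end of the horizon (time T)
    Returns, for each state i at t = 1, a list of (cost_to_go, activity)
    pairs sorted from the optimal activity to the last alternative.
    """
    n_states = len(salvage)
    value = list(salvage)                      # V(j, T)
    ranking = []
    for _ in range(horizon - 1):               # t = T-1, ..., 1
        ranking = [
            sorted(
                (costs[a][i]
                 + alpha * sum(p * v for p, v in zip(trans[a][i], value)), a)
                for a in range(len(costs))
            )
            for i in range(n_states)
        ]
        # The optimal value V*(i, t) feeds the next (earlier) time step.
        value = [ranking[i][0][0] for i in range(n_states)]
    return ranking


# Hypothetical toy facility: states 0 (good), 1 (fair), 2 (poor).
# Activity 0 = do-nothing, activity 1 = repair (returns the facility to state 0).
costs = [[0, 0, 100],          # 100 is an assumed penalty for neglecting state 2
         [10, 10, 10]]
trans = [[[0.7, 0.3, 0.0], [0.0, 0.6, 0.4], [0.0, 0.0, 1.0]],
         [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]]
alpha = 1 / 1.05               # discount factor for r = 5%
plan = ranked_activities(costs, trans, salvage=[0, 0, 0], horizon=3, alpha=alpha)

for state, options in enumerate(plan):
    print(state, options)
```

Because the activities are simply sorted by expected cost-to-go at each state, the alternatives come out of the same pass as the optimum, which is the property the chapter relies on when it prefers dynamic programming over a linear program.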
3.2 System-Level Optimization

The facility-level optimizations yield a set of activities $\{a_1^*, a_2^*, a_3^*, \ldots\}$ and their expected costs-to-go $\{V_1^*, V_2^*, V_3^*, \ldots\}$ for each facility. Given the agency cost for each activity, the objective is to find the combination of activities (one for each facility) that minimizes the system-wide expected cost-to-go while keeping the total agency cost within the budget. We refer to this combination of activities as the optimal program. Assuming that all facilities are independent, and given a budget constraint, the system-level optimization can be formulated as a constrained combinatorial optimization problem.

Let $M_n = \{0, 1, 2, \ldots\}$ be the alternative activity set for facility n, where 0 represents the optimal activity and i represents the i-th alternative activity. The system-level optimal activity $x_n \in M_n$ will be determined given state $s_n$ for facility n. Let $f_n^C(x_n)$ denote the expected cost-to-go function, and $f_n^B(x_n)$ the activity cost function, for facility n given activity $x_n$ at the current time. Note that $f_n^C(x_n) = V_{x_n+1}^*(s_n, 1)$ for all n, and $f_n^B(x_n) = C(a, s_n)$ for all n, with $a$ the activity selected by $x_n$ in the facility-level problem. The combinatorial optimization problem is:
$$\min \Big( TEC = \sum_{n=1}^{N} f_n^C(x_n) \Big) \tag{5}$$

such that

$$AC = \sum_{n=1}^{N} f_n^B(x_n) \le B \quad (B\text{: budget of the current year}) \tag{6}$$
where $X = [x_1, \ldots, x_N]$ is the optimal program, TEC represents the total system expected cost-to-go from the current year to year T, and AC is the total activity cost. Various methods exist for solving the constrained combinatorial optimization problem, including integer programming and heuristic search algorithms. As the constraints and objective function may include nonlinear terms, as in Durango-Cohen et al. (2007), a general approach must be able to handle nonlinearity. Nonlinear constraints arise when there are functional and economic dependencies between facilities. For example, contiguous facilities are best rehabilitated in the same year to reduce delay costs during the rehabilitation. The challenge, therefore, is how to reduce the computational complexity of the algorithm used to solve for the optimal solution. A simple method, the brute force search, which searches the entire combinatorial solution space, is guaranteed to find the optimal solution. However, with a computational complexity of exponential order, this method cannot be applied to problems of realistic size.

Two-Facility Example
To develop a system-level solution, consider a simple case with only two facilities. Fig. 3 shows the solution space and solution path. From the initial solution X = (0,0), which is the combination of optimal activities without the budget constraint, the solution has to move towards the constrained optimal solution. For the first facility, the decision variable $x_1$ can take four values, i.e., four activities are available: the original optimal policy and three alternatives. In this example, the expected cost-to-go for the optimal policy is 2, and the alternatives’ costs are 8, 15 and 21. The second facility, shown on the vertical axis, has an optimal expected cost-to-go of 3.5 and alternatives’ costs of 8, 12 and 17. The diagonal lines are loci of points with equal total expected cost-to-go (TEC). We seek the minimum total cost combination $f_1^C(x_1) + f_2^C(x_2)$ inside the feasible region defined by the budget constraint. To guarantee global optimality, the solution path has to include every point for which the total cost (TEC) is below that of the optimal solution, as illustrated in Fig. 3(b). Starting from point (2, 3.5), the solution moves to point (2, 8), which gives the smallest increase of total cost, from TEC1 (5.5) to TEC2 (10). By repeating this procedure, we reach the optimal solution, which is the first feasible solution visited. However, as the number of facilities increases, it becomes more difficult to find the next solution point from the current solution. With P activities available for each of N facilities, there exist $P^N$ combinations of movements in the solution space in the worst case. Therefore, we need to develop solution methods that can
avoid the exponential order of complexity. We apply an Evolutionary Algorithm for this purpose.
Fig. 3 System-level optimization for a two-facility example: (a) solution space; (b) solution path. Both panels plot $f_1^C(x_1)$ (values 2, 8, 15, 21) on the horizontal axis against $f_2^C(x_2)$ (values 3.5, 8, 12, 17) on the vertical axis, with the budget constraint, the equal-cost lines TEC1, TEC2 and TEC9, and the optimal point marked.
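The two-facility search can be reproduced by visiting the combinations in order of increasing TEC and stopping at the first one that satisfies the budget. The sketch below uses the expected costs-to-go quoted in the example; the activity costs `f1_B`, `f2_B` and the budget value are hypothetical, since the example does not list them, and were chosen only so that the feasible region excludes the eight cheapest combinations.

```python
from itertools import product

f1_C = [2, 8, 15, 21]       # expected cost-to-go, facility 1 (from the example)
f2_C = [3.5, 8, 12, 17]     # expected cost-to-go, facility 2 (from the example)
f1_B = [10, 8, 5, 3]        # hypothetical activity costs, facility 1
f2_B = [9, 6, 5.5, 5]       # hypothetical activity costs, facility 2
budget = 11                 # hypothetical current-year budget


def first_feasible(fC, fB, budget):
    """Visit combinations in increasing TEC order; return the first feasible one."""
    def tec(x):
        return sum(costs[k] for costs, k in zip(fC, x))

    def ac(x):
        return sum(costs[k] for costs, k in zip(fB, x))

    # Exhaustive enumeration: P**N combinations, so only viable for tiny systems.
    combos = sorted(product(*(range(len(costs)) for costs in fC)), key=tec)
    for visited, x in enumerate(combos, start=1):
        if ac(x) <= budget:
            return x, tec(x), visited
    return None


result = first_feasible([f1_C, f2_C], [f1_B, f2_B], budget)
print(result)   # (alternative indices, TEC, number of points visited)
```

With these assumed costs the first feasible point is the ninth one visited, x = (2, 1), i.e. the point (15, 8) with TEC = 23, which matches the TEC9 label on the optimal point in Fig. 3. The sort over all $P^N$ combinations is exactly the exponential cost that motivates the Evolutionary Algorithm.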
Evolutionary Algorithm
In this section, we discuss the formulation and application of Evolutionary Algorithms (EAs) to heterogeneous infrastructure system management optimization. The complexity of the system-level optimization for a heterogeneous infrastructure system arises from the large number of combinations of possible MR&R activities. Unlike traditional optimization, where the solution search is conducted candidate by candidate, EAs search and evaluate a group of solution candidates, or population of solutions, with the goal of converging to a region that contains solutions satisfying preset criteria. Starting with a group of candidates, EAs select only competitive solutions in the group as parents. Solutions are then mutated or recombined to produce the next generation of solutions. The process of selection, mutation, and crossover is repeated until a set of control criteria is satisfied. EAs are not guaranteed to find the global optimum, and they require careful planning of the parent selection process and control parameters. Among the several EA techniques, we apply Genetic Algorithms (GAs), which are widely studied and used, and provide the most suitable platform for combinatorial optimization problems like ours. The details of our implementation are discussed below. Fig. 4(a) shows the basic concept of the EA search. A set of random offspring is generated according to a predefined normal distribution. Then the evaluation and selection stage determines one solution among all offspring. Before reaching the budget-feasible region, an offspring reducing the total activity cost (AC) with the smallest total expected cost (TEC) increase is selected; after reaching the feasible region, the least-cost solution inside the region is selected to improve the current
solution. Fig. 4(b) shows the application of the EA method to the two-facility example. From point (2, 3.5), several offspring are generated and evaluated, and point (2, 8) is chosen as the next solution because it has the lowest cost increase among all solutions that have a lower activity cost. Repeating this procedure, the algorithm finally reaches the optimal solution of (15, 8) in 7 iterations.

Stage 1: Mutant Offspring Generation

To generate mutant offspring, we use the current solution vector X as a single parent. Let dX be a movement vector. A number of movement vectors are randomly generated according to the normal distribution $dx_n \sim \text{Normal}(0, s^2)$, $n \le N$. The offspring is $X_{\text{offspring}} = X + dX$. The initial solution vector is the optimal one found in the facility-level optimization without the budget constraint. After the movement vectors are generated, their components are rounded to one of the discrete values -3, -2, -1, 0, 1, 2, 3. The smaller the value of s, the more components of the movement vector are zero, resulting in a smaller search space. The number of offspring can be used to control the precision of the search.

Stage 2: Offspring Evaluation and Selection

At this stage, the generated offspring are evaluated to find the best movement from the current solution point. When the current agency activity cost (AC) is greater than the assigned budget, the offspring satisfying the following conditions is selected:

$$\min_{dX} \sum_{n=1}^{N} f_n^C(x_n + dx_n) \tag{7}$$

$$AC = \sum_{n=1}^{N} f_n^B(x_n + dx_n) \le \sum_{n=1}^{N} f_n^B(x_n) \tag{8}$$
When the solution is inside the feasible region, we select an offspring satisfying the following conditions:

$$\min_{dX} \sum_{n=1}^{N} f_n^C(x_n + dx_n) \le \sum_{n=1}^{N} f_n^C(x_n) \quad \text{(minimum TEC)} \tag{9}$$

$$AC = \sum_{n=1}^{N} f_n^B(x_n + dx_n) \le B \quad \text{(budget constraint)} \tag{10}$$
The Stage 2 procedure is repeated until there is no solution improvement.

Stage 3: Optimality Check

To check optimality, the search range is expanded to find a solution closer to the global optimal solution. At each step in which no improved solution is found, s is multiplied by a factor w. Therefore, if k steps pass without solution improvement, offspring with movement vector $dx_n \sim \text{Normal}(0, w^{2k}s^2)$ are evaluated for optimality checking. If an improved solution is found, k is reset to its initial
value, and from the new point, Stage 3 is repeated until no improved solution can be found within a predefined number of iterations.
Fig. 4 The Evolutionary Algorithm process: (a) solution search with random offspring vectors, in which movement vectors dX with components $dx_n \sim N(0, s^2)$ are added to the current solution X (Stage 1), the offspring $X_{new} = X + dX$ are evaluated to find the best $X_{new}$ (Stage 2), and X is replaced with $X_{new}$; (b) EA solution path for the two-facility case, plotted on the $f_1^C(x_1)$ and $f_2^C(x_2)$ axes with the budget constraint and the optimal point marked.
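The three stages can be condensed into a short single-parent search loop. The following Python sketch is an illustration under stated assumptions rather than the authors' Matlab program: the function name, the `patience` stopping rule, the clipping of offspring to valid alternative indices, and the use of a strict AC reduction while infeasible (to guarantee progress) are choices made here. The activity costs and budget for the two-facility data are hypothetical; the expected costs-to-go come from the example, and the defaults for `s`, `w`, and the number of offspring match the values reported in Section 4.2.

```python
import random


def evolutionary_search(fC, fB, budget, s=0.15, w=1.1,
                        n_offspring=100, patience=30, seed=0):
    """Single-parent mutation search over alternative indices (Stages 1-3)."""
    rng = random.Random(seed)
    n_fac = len(fC)

    def tec(x):                      # total expected cost-to-go, Eq. (5)
        return sum(fC[n][x[n]] for n in range(n_fac))

    def ac(x):                       # total activity cost, Eq. (6)
        return sum(fB[n][x[n]] for n in range(n_fac))

    x = [0] * n_fac                  # start from the facility-level optima
    k = 0                            # consecutive steps without improvement
    while k < patience:
        sigma = s * w ** k           # Stage 3: widen the search when stalled
        best = None
        for _ in range(n_offspring):
            # Stage 1: normal movement, rounded and limited to -3..3,
            # then clipped to the available alternatives (assumption).
            child = []
            for n in range(n_fac):
                step = max(-3, min(3, round(rng.gauss(0.0, sigma))))
                child.append(max(0, min(len(fC[n]) - 1, x[n] + step)))
            # Stage 2: acceptance rule depends on feasibility of the parent.
            if ac(x) > budget:
                accept = ac(child) < ac(x)                  # cf. Eq. (8)
            else:
                accept = ac(child) <= budget and tec(child) < tec(x)  # Eqs. (9)-(10)
            if accept and (best is None or tec(child) < tec(best)):
                best = child
        if best is None:
            k += 1
        else:
            x, k = best, 0
    return x, tec(x), ac(x)


# Two-facility example: cost-to-go values from the text, activity costs assumed.
fC = [[2, 8, 15, 21], [3.5, 8, 12, 17]]
fB = [[10, 8, 5, 3], [9, 6, 5.5, 5]]
best_x, best_tec, best_ac = evolutionary_search(fC, fB, budget=11)
print(best_x, best_tec, best_ac)
```

On this tiny instance the search ends at the point (15, 8), i.e. alternative indices [2, 1], though the number of iterations depends on the random seed and will not necessarily equal the 7 reported for Fig. 4(b). Each accepted move either strictly lowers AC (while infeasible) or strictly lowers TEC (while feasible), so the loop always terminates.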
4 Numerical Examples

To evaluate the proposed optimization algorithm and show the applicability of the suggested approach to a realistic problem, we created highway pavement systems with random Transition Probability Matrices and action costs, and compared the optimality of the solutions and the algorithm execution speeds.

4.1 Test System Creation

Virtual highway pavement systems based on realistic data were created. To evaluate the optimality, a 20-facility system was created, and the number of facilities was increased to 2000 to assess the algorithm performance. The planning horizon was set to 40 years, and the interest rate to 5%. The agency activity costs and Transition Probability Matrices were generated randomly with the mean values suggested in Table 1. Table 1 shows the agency activity costs for each state and activity. These values were used as mean values to generate virtual pavement systems for the experiments. Note that pavement states lower than 4 are unacceptable, which is incorporated as a constraint.
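Table 1 Mean activity costs ($/sqyd) by pavement state. States 10 to 4 are acceptable; states 3 to 1 are unacceptable.

| Maintenance activity | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|
| Do-nothing | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Maintenance | 0.5 | 3.0 | 8.5 | 16.5 | 43.5 | 53.5 | 55.5 | 57.0 | 58.0 | 58.5 |
| Reconstruction | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 |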
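The Transition Probability Matrix provides the probabilities of state transition of a pavement segment after a maintenance activity is applied. The matrices shown below are the Transition Probability Matrices for do-nothing ($P_1$) and maintenance ($P_2$), respectively, with rows and columns ordered from state 10 to state 1; blank entries are zero. Values shown are also mean values for each facility, and non-zero components are randomized keeping all row sums to 1. For the activity Reconstruction, the first column is set to 1 while all other elements are set to 0.

$$P_1 = \begin{pmatrix}
0.69 & 0.31 & & & & & & & & \\
 & 0.77 & 0.23 & & & & & & & \\
 & & 0.92 & 0.08 & & & & & & \\
 & & & 0.91 & 0.09 & & & & & \\
 & & & & 0.90 & 0.10 & & & & \\
 & & & & & 0.79 & 0.21 & & & \\
 & & & & & & 0.50 & 0.50 & & \\
 & & & & & & & 1.00 & & \\
 & & & & & & & & 1.00 & \\
 & & & & & & & & & 1.00
\end{pmatrix}$$

$$P_2 = \begin{pmatrix}
0.69 & 0.31 & & & & & & & & \\
0.69 & 0.31 & & & & & & & & \\
 & 0.77 & 0.23 & & & & & & & \\
 & & 0.92 & 0.08 & & & & & & \\
 & & & 0.91 & 0.09 & & & & & \\
 & & & & 0.90 & 0.10 & & & & \\
 & & & & & 0.79 & 0.21 & & & \\
 & & & & & & 0.50 & 0.50 & & \\
 & & & & & & & 1.00 & & \\
 & & & & & & & & 1.00 & 0.00
\end{pmatrix}$$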
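The chapter does not state how the non-zero entries were randomized, so the following Python sketch shows only one possible scheme. It assumes the do-nothing matrix is bidiagonal (each row holds a probability of staying in the current state and a probability of dropping one state), perturbs the stay probability by a uniform amount, and assigns the remainder to the next state so that every row still sums to 1. The function name, the `spread` parameter, and the uniform perturbation are hypothetical.

```python
import random

# Mean probability of remaining in the same state under do-nothing,
# read as the diagonal of P1 with states ordered 10, 9, ..., 1 (assumption).
MEAN_STAY = [0.69, 0.77, 0.92, 0.91, 0.90, 0.79, 0.50, 1.00, 1.00, 1.00]


def random_do_nothing_tpm(mean_stay, spread=0.05, rng=None):
    """Return a randomized bidiagonal transition matrix with unit row sums."""
    rng = rng or random.Random()
    n = len(mean_stay)
    tpm = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(mean_stay):
        if p >= 1.0 or i == n - 1:
            tpm[i][i] = 1.0                    # absorbing rows stay fixed
            continue
        # Perturb the stay probability, keeping it inside [0, 1].
        stay = min(1.0, max(0.0, p + rng.uniform(-spread, spread)))
        tpm[i][i] = stay
        tpm[i][i + 1] = 1.0 - stay             # remainder deteriorates one state
    return tpm


sample_tpm = random_do_nothing_tpm(MEAN_STAY, rng=random.Random(1))
for row in sample_tpm:
    print([round(v, 3) for v in row])
```

Perturbing one entry and deriving the other from it avoids a separate renormalization step; a scheme with more than two non-zero entries per row would need to rescale each row instead.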
Algorithm Verification
Fig. 5 shows the progression of the evolutionary algorithm. AC and TEC in Fig. 5 represent the agency activity cost and the total expected cost-to-go, respectively. Before reaching the feasible region, the solution moves toward the budget-feasible region as AC decreases. TEC does not necessarily decrease during this phase; in most cases, it increases until the feasible region is reached. After reaching the feasible region, the solution moves within it, keeping AC lower than the budget. In this phase TEC always decreases, while AC can either increase or decrease. When both TEC and AC become constant, the algorithm is in stage 3, in which it searches a broader range of the solution space to find a better solution. If no better solution is found within a predetermined number of steps, the algorithm stops.
Fig. 5 Solution progress. TEC (axis range 1360 to 1520) and AC (axis range 100 to 135) are plotted against the iteration number (0 to 100), with the budget line B = 112.7 and the active stages annotated.
4.2 Algorithm Evaluation

Optimality Evaluation

To evaluate the EA solutions, 1000 random experiments were executed for a twenty-facility system. In each experiment, facilities were randomly generated with random activity costs and transition probabilities as described in the previous section. The value of s for offspring randomization was set to 0.15, and the multiplication factor w to 1.1. The number of offspring for each iteration was set to 100. True optimal costs TECopt were calculated by exhaustive search for comparison. Table 2 presents the experiment results. The mean value of the optimal total expected cost ratio (TEC/TECopt) was 1.0007, which is lower than 1.001 (0.1%). In 966 cases out of 1000 experiments, the optimal total cost ratios were lower than 1.001. In other words, in around 97% of cases, the algorithm found near-optimal solutions within a 0.1% difference.

Table 2 Optimal cost and budget ratio of the EA

| Measure | Statistic | EA |
|---|---|---|
| Optimal cost ratio (TEC/TECopt) | Mean | 1.0007 |
| Optimal cost ratio (TEC/TECopt) | Std | 0.0041 |
| Budget ratio (AC/ACopt) | Mean | 0.9970 |
| Budget ratio (AC/ACopt) | Std | 0.0548 |
| Near-optimal cases | | 966 cases/1000 |
Evaluation of Algorithm Execution Speed
Actual CPU times used by the test program (Matlab version 6.03) during the algorithm execution were obtained. A test computer with a 2 GHz CPU was used for experiments. In a test of a system of 200 facilities, the EA reaches the solution in 12.7 seconds. An additional EA test was conducted for systems of up to 2,000 facilities, resulting in good performance as shown in Table 3. When N is 1000, the execution time is less than 14 minutes, and when N is 2000, the execution time is approximately 109 minutes. Noting that the Arizona DOT’s Pavement Management System optimizes 7,400 sections of freeway, it can be claimed that the proposed method can be applied to a real statewide PMS.

Table 3 EA execution time

| Number of facilities, N | 10 | 50 | 100 | 150 | 200 | 500 | 1000 | 2000 |
|---|---|---|---|---|---|---|---|---|
| CPU time (sec) | 0.03 | 0.78 | 2.09 | 4.75 | 12.92 | 142.61 | 817.17 | 6575.52 |
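One way to read Table 3 is to estimate how quickly the CPU time grows with N. The short Python sketch below computes the local slope of log(time) against log(N) between consecutive rows; it is a back-of-the-envelope check on the tabulated values, not an analysis from the chapter, and the function name is hypothetical.

```python
import math

facilities = [10, 50, 100, 150, 200, 500, 1000, 2000]                 # Table 3
cpu_seconds = [0.03, 0.78, 2.09, 4.75, 12.92, 142.61, 817.17, 6575.52]


def local_exponents(sizes, times):
    """Slope of log(time) versus log(size) between consecutive table rows."""
    pairs = list(zip(sizes, times))
    return [
        math.log(t2 / t1) / math.log(n2 / n1)
        for (n1, t1), (n2, t2) in zip(pairs, pairs[1:])
    ]


exponents = local_exponents(facilities, cpu_seconds)
for n, e in zip(facilities[1:], exponents):
    print(f"N up to {n}: local exponent {e:.2f}")
```

Between N = 1000 and N = 2000 the time grows by a factor of about 8, i.e. a local exponent close to 3, so over the tested range the growth looks polynomial rather than exponential. The small-N rows are noisy (timings of a few hundredths of a second), so the early exponents should not be over-interpreted.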
5 Conclusions

We proposed a two-stage bottom-up methodology to solve the MR&R optimization problem for a heterogeneous infrastructure system. To overcome the computational complexity of the brute force search, we developed a method that utilizes the optimal and alternative solutions at the facility level. This approach makes the search process more efficient by specifying the search order for the alternatives. The system-level problem is formulated as a constrained combinatorial optimization, and solutions are found by applying the Evolutionary Algorithm. Evaluation results suggest that the Evolutionary Algorithm is effective in identifying close-to-optimal solutions in a relatively short time for large-scale network problems. Numerical experiments showed that we obtain near-optimal solutions (within less than 0.1% difference from the optimal solution) in most cases, and also showed the potential of the proposed algorithm to solve the maintenance optimization problem for realistic heterogeneous systems. One extension of the proposed method is the optimization of a system composed of diverse types of infrastructure, such as bridges, pavements and other types of facilities. Such an extension would be useful for optimizing DOT maintenance expenditure in a multi-asset management framework.
References

[1] Carnahan, J.V.: Analytical framework for optimizing pavement maintenance. Journal of Transportation Engineering, ASCE 114, 307–322 (1988)
[2] Durango-Cohen, P., Sarutipand, P.: Multi-facility maintenance optimization with coordinated interventions. In: Proceedings of the Eleventh World Conference on Transport Research, Berkeley, CA (2007)
[3] Friesz, T.L., Fernandez, J.E.: A model of optimal transport maintenance with demand responsiveness. Transportation Research 13B, 317–339 (1979)
[4] Fwa, T.F., Chan, W.T., Tan, C.Y.: Genetic-algorithm programming of road maintenance and rehabilitation. Journal of Transportation Engineering, ASCE 122, 246–253 (1996)
[5] Golabi, K., Kulkarni, R., Way, G.: A statewide pavement management system. Interfaces 12, 5–21 (1982)
[6] Harper, W.V., Majidzadeh, K.: Use of expert opinion in two pavement management systems. Transportation Research Record No. 1311, Transportation Research Board, Washington, DC, 242–247 (1991)
[7] Li, Y., Madanat, S.: A steady-state solution for the optimal pavement resurfacing problem. Transportation Research 36A, 525–535 (2002)
[8] Madanat, S., Ben-Akiva, M.: Optimal inspection and repair policies for transportation facilities. Transportation Science 28, 55–62 (1994)
[9] Organization for Economic Cooperation and Development (OECD): Pavement Management Systems. Road Transport Research Report, Paris, France (1987)
[10] Ouyang, Y., Madanat, S.: An Analytical Solution for the Finite-Horizon Pavement Resurfacing Planning Problem. Transportation Research, Part B 40, 767–778 (2006)
[11] Ouyang, Y.: Pavement resurfacing planning on highway networks: A parametric policy iteration approach. Journal of Infrastructure Systems, ASCE 13(1), 65–71 (2007)
[12] Robelin, C.A., Madanat, S.: A bottom-up, reliability-based bridge inspection, maintenance and replacement optimization model. In: Proceedings of the Transportation Research Board Meeting 2006 (CD-ROM), TRB, Washington, DC. Paper #06-0381 (2006)
[13] Robelin, C.A., Madanat, S.: Reliability-Based System-Level Optimization of Bridge Maintenance and Replacement Decisions. Transportation Science 42(4) (2008)
[14] Smilowitz, K., Madanat, S.: Optimal Inspection and Maintenance Policies for Infrastructure Systems. Computer-Aided Civil and Infrastructure Engineering 15(1) (2000)
[15] Tsunokawa, K., Schofer, J.L.: Trend curve optimal control model for highway pavement maintenance: case study and evaluation. Transportation Research 28A, 151–166 (1994)
[16] Forrest, S.: Genetic Algorithms: Principles of Natural Selection Applied to Computation. Science 261, 872–878 (1993)
[17] Goldberg, D.E.: Genetic Algorithms in Search, Optimization, and Machine Learning (1989)
[18] Gray, P., et al.: A Survey of Global Optimization Methods. Sandia National Laboratories (1997)
A Swarm Intelligence Approach for Emergency Infrastructure Inspection Scheduling

Vagelis Plevris, Matthew G. Karlaftis, and Nikos D. Lagaros
Abstract. Natural hazards such as earthquakes, floods and tornadoes can cause extensive failure of critical infrastructures including bridges, water and sewer systems, gas and electricity supply systems, and hospital and communication systems. Following a natural hazard, the condition of structures and critical infrastructures must be assessed and damages have to be identified; inspections are therefore necessary since failure to rapidly inspect and subsequently repair infrastructure elements will delay search and rescue operations and relief efforts. The objective of this work is scheduling structure and infrastructure inspection crews following an earthquake in densely populated metropolitan areas. A model is proposed and a decision support system is designed to aid local authorities in optimally assigning inspectors to critical infrastructures. A combined Particle Swarm – Ant Colony Optimization based framework is developed which proves an instance of a successful application of the philosophy of bounded rationality and decentralized decision-making for solving global optimization problems.
Vagelis Plevris · Nikos D. Lagaros
Institute of Structural Analysis & Seismic Research, National Technical University of Athens, 9, Iroon Polytechniou Str., Zografou Campus, GR-15780 Athens, Greece
e-mail: {vplevris,nlagaros}@central.ntua.gr

Matthew G. Karlaftis
Department of Transportation Planning and Engineering, National Technical University of Athens, 9, Iroon Polytechniou Str., Zografou Campus, GR-15780 Athens, Greece
e-mail: [email protected]

K. Gopalakrishnan & S. Peeta (Eds.): Sustainable & Resilient Critical Infrastructure Sys., pp. 201–230. © Springer-Verlag Berlin Heidelberg 2010 springerlink.com

1 Introduction

Infrastructure networks are vital for the well-being of modern societies; national and local economies depend on efficient and reliable networks that provide added value and competitive advantage to an area’s social and economic growth. The significance
of infrastructure networks increases when natural disasters occur, since restoration of community functions is highly dependent on the affected regions receiving adequate relief resources. Infrastructure networks are frequently characterized as the most important lifelines in cases of natural disasters; recent experience from around the world (hurricanes Katrina and Wilma, the Southeast Asia tsunami, the Loma Prieta and Northridge earthquakes, and others) suggests that, following a natural disaster, infrastructure networks are expected to support relief operations, population evacuation, supply chains and the restoration of community activities. Infrastructure elements such as bridges, pavements, tunnels, water and sewage systems, and highway slopes are highly prone to damage caused by natural hazards, as a result of possible poor construction or maintenance, of design inconsistencies, or of the sheer magnitude of the natural phenomena themselves. Rapid network degradation following these disasters can severely impact both short- and long-run operations, resulting in increased fatalities and in difficulties in population evacuation and in the supply of clean water and food to the affected areas. Much of the state of the art in this research area indicates that attention must be given to three important actions: (i) failsafe design and construction of infrastructure facilities; (ii) effective maintenance and management of the available facilities; and (iii) planning and preparing actions for the rapid repair of infrastructure following disasters. As can be expected, significant research has been undertaken in emergency response to both natural hazards and manmade disasters. Work has concentrated on the four main aspects of the process: mitigation, preparedness, response, and recovery (an excellent collection of emergency response papers, with a heavy focus on quantitative approaches and algorithms, can be found in Altay and Green [1]).
Work on mitigation includes assessing seismic hazards [2], probabilistic damage projection [3-4], and simulation-based decision support systems (DSS) for integrating the emergency process [5-6]. Research on preparedness, a particularly challenging area of network-related problems, has mainly focused on preparing infrastructure networks for dealing with potential disasters and for accommodating evacuation needs [7-12]. Response-related work has revolved around two main research paths: first, planning the response-relief logistics operations [13-16], and, second, assessing the performance of the infrastructure system following the natural hazard [17-20]. Finally, recovery operations have attracted limited attention despite their importance in practice; the available work has concentrated on infrastructure element protection [21], general assessment of relief performance [22], and fund allocation for infrastructure repairs following disasters [23]. It is interesting to note that most research on emergency response, particularly following the disaster, has shied away from dealing with the critical step of damage assessment and its related issues. For example, following an earthquake, all infrastructure elements need to be inspected, damages assessed, and repairs prioritized; these needs pose sets of problems that research has largely ignored to date, such as partitioning the damaged area into sub-areas of responsibility for repair crews, determining inspection sequences (i.e. which infrastructure elements should be inspected first, second, and so on), and allocating funds for repairs. This chapter is focused on issues that are related to inspecting and repairing
infrastructure elements damaged by earthquakes, a highly unpredictable natural disaster of considerable importance to many areas around the world. An explicit effort is made to initiate the development of a process for handling post-earthquake emergency response in terms of optimal infrastructure condition assessment, based on a combined Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) framework. Some of the expected benefits from this work include improvements in infrastructure network restoration times and minimization of adverse impacts from natural hazards on infrastructure networks.
2 Notation and Symbols

2.1 Optimization (Generally)

$x$: Design variables vector
$f(x): R^n \to R$: Objective function
$g(x): R^n \to R^m$: Vector of m inequality constraint functions
$x^L, x^U$: Vectors of length n defining the lower and upper bounds of the design variables, respectively

2.2 Optimum Assignment Problem Definition

$n_{SB}^{(i)}$: Number of structural blocks allocated to the ith inspection crew
$SB_k$: kth structural block
$C_i$: Centre of the ith group of structural blocks (with coordinates $x_{C_i}$ and $y_{C_i}$)
$d(SB_k, C_i)$: Distance of the building block $SB_k$ from the centre of the ith group
$D(k)$: Demand for the kth building block
$G=(N,A)$: Weighted graph, where N is the set of nodes and A is the set of arcs (edges or connections) that fully connects the components of N
$d_{i,j}$ $(i \neq j)$: Distance between two nodes
$p=\{p(1), \dots, p(N)\}$: A permutation, a possible solution to the Travelling Salesman Problem (TSP), where {p(1), …, p(N)} are the node indices
$L(p)$: Total length of a solution to the TSP

2.3 Particle Swarm Optimization

$v_j(t)$: Velocity vector of particle j at time t
$x_j(t)$: Position vector of particle j at time t
$x^{Pb,j}$: Personal ‘best ever’ position of the jth particle
$x^{Gb}$: Global best location found by the entire swarm
$c_1, c_2$: Acceleration coefficients: $c_1$ is the cognitive parameter, $c_2$ is the social parameter
$r_1, r_2$: Two random vectors uniformly distributed in the interval [0, 1]
$\circ$: Hadamard product, i.e. element-wise vector or matrix multiplication
$w$: Inertia weight
$w_{\max}, w_{\min}$: Maximum and minimum values of the inertia weight, respectively
$v^{\max}$: Vector containing the maximum allowable absolute velocity for each dimension
$NP$: Number of particles
$n$: Dimension of particles
$t_{\max}$: Maximum number of iterations for the termination criterion
$k_f$: Number of iterations for which the relative improvement of the objective function satisfies the convergence check
$f_m$: Minimum relative improvement of the value of the objective function
$Gbest_t$: Best value of the objective function found by the PSO at iteration t

2.4 Ant Colony Optimization

$m$: Number of ants
$M^k$: Memory of an ant k currently at node i; contains the nodes already visited
$N_i^k$: The feasible neighbourhood, that is, the set of nodes that have not yet been visited by ant k
$p_{i,j}^k$: The probability with which ant k, currently at node i, chooses to go to node j
$\tau_{i,j}$: The amount of pheromone on the connection between nodes i and j
$\alpha, \beta$: Superscript parameters: $\alpha$ controls the influence of $\tau_{i,j}$, $\beta$ controls the influence of $\eta_{i,j}$
$\eta_{i,j}$: Heuristic information that is available a priori, denoting the desirability of connection i,j
$d_{i,j}$: Distance between nodes i and j
$\rho$: Rate of pheromone evaporation
$A$: The set of arcs (edges or connections) that fully connects the set of nodes
$\Delta\tau_{i,j}^k(t)$: The amount of pheromone ant k deposits on the connections it has visited through its tour $T^k$
$L(T^k)$: Total length of tour $T^k$ of ant k
3 Problem Formulation

A general formulation of a nonlinear optimization problem can be stated as follows

$$
\begin{aligned}
&\min_{x \in R^n} f(x), \qquad x = [x_1, \dots, x_n]^T \\
&\text{subject to} \quad g_k(x) \le 0, \quad k = 1, \dots, m \\
&\phantom{\text{subject to} \quad} x^L \le x \le x^U
\end{aligned}
\tag{1}
$$

where x is the design variables vector of length n, f(x): Rn→R is the objective function to be minimized, g(x): Rn→Rm is the vector of m inequality constraint functions, and xL, xU are two vectors of length n defining the lower and upper bounds of the design variables, respectively. The main objective of this work is to formulate the problem of inspecting the structural systems of a city/area as an optimization problem. This objective is achieved in two steps: in the first step, the structural blocks to be inspected are optimally assigned to a number of inspection crews (assignment problem), while in the second step the order in which each crew visits its group of blocks is determined (inspection prioritization problem). In the formulation of the optimization problems considered in this work, the city/area under investigation is decomposed into NSB structural blocks, while NIG inspection crews are considered for inspecting the structural condition of all structural and infrastructure systems of the city/area.
3.1 Step 1: Optimum Assignment Problem

The assignment problem is defined as a nonlinear programming optimization problem as follows

$$
\begin{aligned}
&\min \sum_{i=1}^{N_{IG}} \sum_{k=1}^{n_{SB}^{(i)}} \left[ d(SB_k, C_i) \cdot D(k) \right] \\
&x_{C_i} = \frac{1}{n_{SB}^{(i)}} \sum_{k=1}^{n_{SB}^{(i)}} x_k, \qquad y_{C_i} = \frac{1}{n_{SB}^{(i)}} \sum_{k=1}^{n_{SB}^{(i)}} y_k \\
&D(k) = A(k) \cdot BP(k)
\end{aligned}
\tag{2}
$$

where $n_{SB}^{(i)}$ is the number of structural blocks allocated to the ith inspection crew, d(SBk,Ci) is the distance of the building block SBk from the centre of the ith group of structural blocks (with coordinates xCi and yCi), while D(k) is the demand for the kth building block, defined as the product of the building block total area A(k) times the built-up percentage BP(k) (i.e. the percentage of the area with a structure). This is a discrete optimization problem, since the design variables x are integer numbers denoting the inspection crew to which each built-up block has been assigned. Thus, the total number of design variables is equal to the number of structural blocks, and the range of each design variable is [1, NIG].
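The objective of Eq. (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not specify the distance measure, so Euclidean distance is assumed, and the function name `assignment_cost` and the sample data are hypothetical.

```python
import math

def assignment_cost(assign, coords, demand, n_crews):
    """Objective of Eq. (2): demand-weighted distance of each block to its group centre.

    assign[k] is the crew index (1..n_crews) of block k, i.e. the design variable.
    coords[k] is the (x, y) position of block k; demand[k] is D(k) = A(k) * BP(k).
    """
    total = 0.0
    for crew in range(1, n_crews + 1):
        members = [k for k, a in enumerate(assign) if a == crew]
        if not members:
            continue  # assumption: an empty group contributes nothing
        # Group centre C_i: mean of the member coordinates
        cx = sum(coords[k][0] for k in members) / len(members)
        cy = sum(coords[k][1] for k in members) / len(members)
        for k in members:
            # Assumption: Euclidean distance between block and centre
            d = math.hypot(coords[k][0] - cx, coords[k][1] - cy)
            total += d * demand[k]
    return total

# Hypothetical data: two pairs of neighbouring blocks, two crews
coords = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
demand = [1.0, 1.0, 2.0, 2.0]
good = assignment_cost([1, 1, 2, 2], coords, demand, n_crews=2)
bad = assignment_cost([1, 2, 1, 2], coords, demand, n_crews=2)
```

Grouping neighbouring blocks together (`good`) gives a much lower cost than splitting them across crews (`bad`), which is the behaviour the assignment step is meant to reward.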
3.2 Step 2: Inspection Prioritization Problem

This problem is a typical Travelling Salesman Problem (TSP) [24], a problem in combinatorial optimization studied in operations research and theoretical computer science. In TSP a salesman spends his time visiting N cities (or nodes) cyclically. Given a list of cities and their pair-wise distances, the task is to find a Hamiltonian tour of minimal length, i.e. to find a closed tour of minimal length that visits each city once and only once. For an N city asymmetric TSP, if all links are present then there are (N-1)! different tours. TSP instances are also formulated as integer optimization problems, and the TSP is NP-hard [25]. Consider a TSP with N cities (vertices or nodes). The TSP can be represented by a complete weighted graph G=(N,A), with N the set of nodes and A the set of arcs (edges or connections) that fully connects the components of N. A cost function is assigned to every connection between two nodes i and j, that is, the distance between the two nodes di,j (i≠j). In the symmetric TSP, di,j=dj,i. A solution to the TSP is a permutation p={p(1), …, p(N)} of the node indices {1, …, N}, as every node must appear only once in a solution. The optimum solution is the one that minimizes the total length L(p) given by

$$
L(p) = \sum_{i=1}^{N-1} d_{p(i),\,p(i+1)} + d_{p(N),\,p(1)}
\tag{3}
$$
Thus, the corresponding prioritization problem is defined as follows

$$
\min \left[ \sum_{k=1}^{n_{SB}^{(i)}-1} d(SB_k, SB_{k+1}) + d\big(SB_{n_{SB}^{(i)}}, SB_1\big) \right], \quad i = 1, \dots, N_{IG}
\tag{4}
$$

where d(SBk, SBk+1) is the distance between the kth and the (k+1)th building blocks. The main objective is to define the shortest possible route between the structural blocks that have been assigned in Step 1 to each inspection group.
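As a small illustration of Eqs. (3) and (4), the closed-tour length of a given visiting order can be computed as below. The function name `tour_length` and the unit-square data are hypothetical, and Euclidean distances between block coordinates are an assumption.

```python
import math

def tour_length(order, coords):
    """Closed-tour length of Eq. (3): consecutive legs plus the return leg."""
    n = len(order)
    return sum(
        math.dist(coords[order[i]], coords[order[(i + 1) % n]])  # (i + 1) % n closes the tour
        for i in range(n)
    )

# Hypothetical example: four blocks at the corners of a unit square
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
perimeter = tour_length([0, 1, 2, 3], square)  # walks around the square
crossing = tour_length([0, 2, 1, 3], square)   # uses both diagonals
```

The order that walks around the square is shorter than the one that crosses the diagonals, so it would be preferred by the minimization of Eq. (4).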
4 Solving the Optimization Problems

4.1 Particle Swarm Optimization Algorithm

4.1.1 Introduction to Particle Swarm Optimization

Many probabilistic search algorithms have been inspired by natural phenomena, such as Evolutionary Programming, Genetic Algorithms and Evolution Strategies, among others. Recently, a family of optimization methods has been developed based on the simulation of social interactions among members of a specific species looking for food or resources in general. One of these methods is Particle Swarm Optimization (PSO) [26], which is based on the behavior of flocks of birds, swarms of bees and schools of fish that adjust their physical movements to avoid predators and seek food. The method has been given considerable attention in recent years among the optimization research community. A swarm of birds or insects or a school of fish searches for food, resources or protection in a very typical manner. If a member of the swarm discovers a desirable path, the rest of the swarm will follow quickly. Every member searches for the best in its locality and learns from its own experience as well as from the others, typically from the best performer among them. Even human beings show a tendency to behave in this way as they learn from their own experience, their immediate neighbors and the ideal performers in society. The PSO method mimics the behavior described above. It is a population-based optimization method built on the premise that social sharing of information among individuals can provide an evolutionary advantage. PSO has been found to be highly competitive for solving a wide variety of optimization problems [27-33]. It can handle non-linear, non-convex design spaces with discontinuities. Compared to other non-deterministic optimization methods it is considered efficient in terms of the number of function evaluations, as well as robust, since it usually leads to results of the same or better quality. Its ease of implementation makes it more attractive, as it does not require specific domain knowledge, while, being a population-based algorithm, it can be implemented straightforwardly in parallel computing environments, leading to a significant reduction of the total computational cost.
PSO has been successfully applied to many fields, such as mathematical function optimization, artificial neural network training and fuzzy system control. In a PSO formulation, multiple candidate solutions coexist and collaborate simultaneously. Each solution is called a “particle” that has a position and a velocity in the multidimensional design space. A particle “flies” in the problem search space looking for the optimal position. As “time” passes during its quest, a particle adjusts its velocity and position according to its own “experience” as well as the experience of other (neighbouring) particles. A particle's experience is built by tracking and memorizing the best position encountered. As every particle remembers the best position it has visited during its “flight”, the PSO possesses a memory. A PSO system combines a local search method (through self experience) with a global search method (through neighbouring experience), attempting to balance exploration and exploitation.

4.1.2 Relationship of PSO with Evolutionary Algorithms (EAs)

PSO shares many similarities with evolutionary computation techniques, such as Genetic Algorithms (GA), but the conceptual difference lies in its definition, which is given in a social rather than a biological context. The common features of the two optimization approaches include the population concept of the design vectors, initialization with a population of random solutions, a fitness value to evaluate performance, searching for optima by updating iterations (generations) based on a stochastic process, no requirement for gradient information or user-defined initial estimates, and no guaranteed final success. However, unlike GA, PSO has no genetic operators such as crossover and mutation. In PSO, the potential solutions fly through the problem space by following a velocity update rule. The information sharing mechanism in PSO is significantly different from that of GA. In GA, chromosomes share information with each other, so the whole population moves like one group towards an optimal area. In PSO, only Gbest (the global best particle) communicates information to the others, forming a one-way information sharing mechanism. According to the study of Hassan et al. [34], PSO and GA can both obtain high quality solutions, yet the computational effort required by PSO to arrive at such solutions is less than the corresponding effort required by GA. According to Angeline [35], two main distinctions can be made between PSO and an evolutionary algorithm:

i. EAs rely on three mechanisms in their processing: parent representation, selection of individuals and the fine tuning of their parameters. In contrast, PSO only relies on two mechanisms, since PSO does not adopt an explicit selection function. The absence of a selection mechanism in PSO is compensated by the use of leaders to guide the search. However, there is no notion of offspring generation in PSO as with EAs.

ii. The manipulation of the individuals is different in EAs and PSO. PSO uses an operator that sets the velocity of a particle to a particular direction. This can be seen as a directional mutation operator in which the direction is defined by both the particle’s personal best and the global best (of the swarm).
If the direction of the personal best is similar to the direction of the global best, the angle of potential directions will be small, whereas a larger angle will provide a larger range of exploration. In contrast, EAs use a mutation operator that can set an individual in any direction (although the relative probabilities for each direction may be different). In fact, the limitations exhibited by the directional mutation of PSO have led to the use of mutation operators similar to those adopted in EAs.

4.1.3 Mathematical Formulation of PSO

Each particle maintains two basic characteristics, velocity and position, in the multi-dimensional search space, which are updated as follows
$$
v_j(t+1) = w\, v_j(t) + c_1 r_1 \circ \left( x^{Pb,j} - x_j(t) \right) + c_2 r_2 \circ \left( x^{Gb} - x_j(t) \right)
\tag{5}
$$

$$
x_j(t+1) = x_j(t) + v_j(t+1)
\tag{6}
$$
where vj(t) denotes the velocity vector of particle j at time t, xj(t) represents the position vector of particle j at time t, vector xPb,j is the personal ‘best ever’ position of the jth particle, and vector xGb is the global best location found by the entire swarm. The acceleration coefficients c1 and c2 indicate the degree of confidence in the best solution found by the individual particle (c1 - cognitive parameter) and by the whole swarm (c2 - social parameter), respectively, while r1 and r2 are two random vectors uniformly distributed in the interval [0, 1]. The symbol “∘” of Eq. (5) denotes the Hadamard product, i.e. the element-wise vector or matrix multiplication.
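A minimal NumPy sketch of one application of Eqs. (5) and (6) to a whole swarm is given below. The function name `pso_step` and the array layout (one row per particle) are assumptions made for illustration; note also that NumPy draws from [0, 1) rather than the closed interval [0, 1].

```python
import numpy as np

def pso_step(x, v, x_pb, x_gb, w, c1, c2, rng):
    """One application of Eqs. (5) and (6) to the whole swarm.

    x, v and x_pb have shape (NP, n); x_gb has shape (n,) and is broadcast.
    """
    r1 = rng.random(x.shape)  # fresh uniform random vector for every particle
    r2 = rng.random(x.shape)
    # '*' between arrays is element-wise, i.e. the Hadamard product of Eq. (5)
    v_new = w * v + c1 * r1 * (x_pb - x) + c2 * r2 * (x_gb - x)
    x_new = x + v_new  # Eq. (6)
    return x_new, v_new

# Hypothetical swarm: 3 particles in 2 dimensions, all already at their best positions
rng = np.random.default_rng(0)
x = np.zeros((3, 2))
v = np.ones((3, 2))
x_new, v_new = pso_step(x, v, x_pb=x.copy(), x_gb=np.zeros(2),
                        w=0.5, c1=2.0, c2=2.0, rng=rng)
```

In this example both attraction terms vanish because every particle already sits at its personal and global best, so only the inertia term `w * v` moves the particles.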
Fig. 1 Visualization of the particle’s movement in a two-dimensional design space
Figure 1 depicts a particle’s movement, in a two-dimensional design space, according to Eqs. (5) and (6). The particle’s current position xj(t) at time t is represented by the dotted circle at the lower left of the drawing, while the new position xj(t+1) at time t+1 is represented by the dotted bold circle at the upper right of the drawing. It can be seen how the particle’s movement is affected by: (i) its velocity vj(t); (ii) the personal best ever position of the particle, xPb,j, at the right of the figure; and (iii) the global best location found by the entire swarm, xGb, at the upper left of the figure. In the above formulation, the global best location found by the entire swarm up to the current iteration (xGb) is used. This is called a fully connected topology (fully informed PSO), as all particles share information with each other about the best performer of the swarm. Other topologies have also been used in the past where, instead of the global best location found by the entire swarm, a local best location of each particle’s neighbourhood is used. Thus, information is shared only among members of the same neighbourhood. The term w of Eq. (5) is the inertia weight, essentially a scaling factor employed to control the exploration abilities of the swarm, which scales the current velocity value affecting the updated velocity vector. The inertia weight was not part of the original PSO algorithm [26], as it was introduced later by Shi and Eberhart [36] in a successful attempt to improve convergence. Large inertia weights will force larger velocity updates, allowing the algorithm to explore the design space globally. Conversely, small inertia values will force the velocity updates to concentrate in the nearby regions of the design space.
The inertia weight can also be updated during iterations. A commonly used inertia update rule is the linearly decreasing one, calculated by the formula:

$$
w_{t+1} = w_{\max} - \frac{w_{\max} - w_{\min}}{t_{\max}} \cdot t
\tag{7}
$$
where t is the iteration number, and wmax and wmin are the maximum and minimum values, respectively, of the inertia weight. In general, the linearly decreasing inertia weight has shown better performance than the fixed one. Particles' velocities in each dimension i (i = 1, …, n) are restricted to a maximum velocity vmax,i. The vector vmax of dimension n holds the maximum absolute velocities for each dimension. It is more appropriate to use a vector rather than a scalar, as in the general case different velocity restrictions can be applied for different dimensions of the particle. If for a given particle j the sum of accelerations of Eq. (5) causes the absolute velocity for dimension i to exceed vmax,i, then the velocity on that dimension is limited to ±vmax,i. The vector parameter vmax is employed to protect the cohesion of the system, in the process of amplification of the positive feedback. The basic PSO has only a few parameters to adjust. Table 1 lists the main parameters, their typical values and other information.

Table 1 Main PSO parameters
| Symbol | Description | Details |
|---|---|---|
| NP | Number of particles | A typical range is 10 to 40. For most problems 10 particles are sufficient to get acceptable results. For some difficult or special problems the number can be increased to 50-100. |
| n | Dimension of particles | It is determined by the problem to be optimized. |
| w | Inertia weight | Usually set to a value less than 1, e.g. 0.95. It can also be updated during iterations. |
| xL, xU | Vectors containing the lower and upper bounds of the n design variables, respectively | They are determined by the problem to be optimized. Different ranges for different dimensions of particles can be applied in general. |
| vmax | Vector containing the maximum allowable velocity for each dimension during one iteration | Usually set to half the length of the allowable interval for the given dimension: vmax,i = (xUi - xLi)/2. Different values for different dimensions of particles can be applied in general. |
| c1, c2 | Cognitive and social parameters | Usually c1=c2=2. Other values can also be used, provided that 0 < c1+c2 < 4 [26]. |
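The linearly decreasing inertia weight of Eq. (7) and the velocity limit listed in Table 1 can be sketched as follows. The function names are hypothetical, and the default values wmax = 0.9 and wmin = 0.4 are assumptions chosen for illustration, not values taken from this chapter.

```python
import numpy as np

def inertia(t, t_max, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight of Eq. (7); defaults are assumed values."""
    return w_max - (w_max - w_min) * t / t_max

def clamp_velocity(v, x_lo, x_hi):
    """Limit each velocity component to +/- v_max,i = (xU_i - xL_i) / 2."""
    v_max = (np.asarray(x_hi, dtype=float) - np.asarray(x_lo, dtype=float)) / 2.0
    return np.clip(v, -v_max, v_max)

# Hypothetical bounds: dimension 1 in [0, 2], dimension 2 in [0, 10]
clamped = clamp_velocity(np.array([5.0, -0.2]), x_lo=[0, 0], x_hi=[2, 10])
```

Using a vector of limits means the first component above is cut back to its own maximum, while the second, which is within its wider range, is left unchanged.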
4.1.4 Convergence Criteria

Because the PSO search is iterative, convergence criteria have to be applied for the termination of the optimization procedure. Two widely adopted convergence criteria are the maximum number of iterations of the PSO algorithm and the minimum error requirement on the calculation of the optimum value of the objective function. The selection of the maximum number of iterations depends, generally, on the complexity of the optimization problem at hand. The second criterion presumes prior knowledge of the global optimal value. This is feasible when testing or fine-tuning the algorithm on mathematical problems whose optimum is known a priori, but it is certainly not the case in practical structural optimization problems, where the optimum is not known in advance.

Table 2 PSO convergence parameters
| Symbol | Description | Details |
|---|---|---|
| tmax | Maximum number of iterations for the termination criterion. | Determined by the complexity of the problem to be optimized, in conjunction with other PSO parameters (n, NP). |
| kf | Number of iterations for which the relative improvement of the objective function satisfies the convergence check. | If the relative improvement of the objective function over the last kf iterations (including the current iteration) is less than or equal to fm, convergence has been achieved. |
| fm | Minimum relative improvement of the value of the objective function. | If the relative improvement of the objective function over the last kf iterations (including the current iteration) is less than or equal to fm, convergence has been achieved. |
In our study, together with the maximum number of iterations, we have implemented the convergence criterion connected to the rate of improvement of the value of the objective function for a given number of iterations. If the relative improvement of the objective function over the last kf iterations (including the current iteration) is less than or equal to a threshold value fm, convergence is assumed to have been achieved. In mathematical terms, denoting as Gbestt the best value for the objective function found by the PSO at iteration t, the convergence check on the relative improvement of the objective function can be written for the current iteration t as follows

$$
\frac{Gbest_{t-k_f+1} - Gbest_t}{Gbest_{t-k_f+1}} \le f_m
\tag{8}
$$
In Table 2 there is a list of the convergence parameters of the PSO used in this study with description and details. A pseudo code of the PSO procedure is given in Figure 2.
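The check of Eq. (8) can be sketched as a small helper that inspects the history of best objective values. The function name `has_converged` and the handling of a zero denominator are assumptions; the sketch also assumes positive objective values, as Eq. (8) does when written without absolute values.

```python
def has_converged(gbest_history, k_f, f_m):
    """Check Eq. (8) on the list of best objective values, one entry per iteration.

    gbest_history[-1] is Gbest_t and gbest_history[-k_f] is Gbest_{t-k_f+1}.
    """
    if len(gbest_history) < k_f:
        return False  # fewer than k_f iterations recorded so far
    old, new = gbest_history[-k_f], gbest_history[-1]
    if old == 0:
        return new == 0  # assumption: avoid dividing by zero
    return (old - new) / old <= f_m

# Hypothetical histories of a minimization run
still_improving = has_converged([100.0, 50.0, 20.0], k_f=3, f_m=0.01)
stalled = has_converged([10.0, 10.0, 10.0, 10.0], k_f=3, f_m=0.01)
```

In a full PSO loop this check would be evaluated once per iteration, alongside the limit on the maximum number of iterations.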
4.1.5 PSO for Integer Optimization

Since both problems defined in Section 3 are integer optimization problems, discrete optimization algorithms are required. For the Step 1 optimization problem described in Section 3.1, a discrete version of the PSO algorithm is employed. In the continuous version of the PSO method, both particle positions and velocities are initialized randomly.

For each particle j
    Initialize particle position by distributing particles randomly in the design space
End
Repeat
    For each particle j
        Calculate fitness value for current position
        If the current fitness value is better than the best fitness value (Pbest) in the particle’s history then
            set current fitness value as the new Pbest and current position as the new xPb,j
    End
    Set Gbest as the best fitness value of all the particles’ Pbest and corresponding position as the new xGb
    For each particle j
        Calculate particle velocity from Eq. (5)
        Update particle position from Eq. (6)
        If, for any dimension i, xi ≤ xLi or xi ≥ xUi, then set xi = xLi or xi = xUi respectively and set corresponding vi = 0
    End
Until the maximum number of iterations is attained or the relative improvement of the objective function is no greater than fm over the last kf iterations
Report results

Fig. 2 Pseudo-code for the main PSO for unconstrained optimization
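The loop of Fig. 2 can be turned into a short runnable sketch for the continuous case. This is not the authors' implementation: the function name `pso_minimize`, the default parameter values (w = 0.7, c1 = c2 = 1.5, 20 particles), the zero initial velocities and the test function are all assumptions made for illustration.

```python
import numpy as np

def pso_minimize(f, x_lo, x_hi, n_particles=20, t_max=200, w=0.7,
                 c1=1.5, c2=1.5, k_f=30, f_m=1e-8, seed=0):
    """Continuous PSO following the loop of Fig. 2 (defaults are assumed values)."""
    rng = np.random.default_rng(seed)
    x_lo, x_hi = np.asarray(x_lo, float), np.asarray(x_hi, float)
    n = x_lo.size
    x = rng.uniform(x_lo, x_hi, size=(n_particles, n))  # random initial positions
    v = np.zeros((n_particles, n))                       # assumption: start at rest
    v_max = (x_hi - x_lo) / 2.0                          # velocity limit from Table 1
    x_pb, f_pb = x.copy(), np.full(n_particles, np.inf)
    history = []
    for t in range(t_max):
        fx = np.array([f(p) for p in x])
        better = fx < f_pb                               # update personal bests
        x_pb[better], f_pb[better] = x[better], fx[better]
        g = np.argmin(f_pb)                              # global best of the swarm
        x_gb, f_gb = x_pb[g].copy(), f_pb[g]
        history.append(f_gb)
        # Convergence check of Eq. (8) over the last k_f iterations
        if len(history) >= k_f and history[-k_f] != 0:
            if (history[-k_f] - f_gb) / history[-k_f] <= f_m:
                break
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (x_pb - x) + c2 * r2 * (x_gb - x)  # Eq. (5)
        v = np.clip(v, -v_max, v_max)
        x = x + v                                                 # Eq. (6)
        out = (x < x_lo) | (x > x_hi)
        x = np.clip(x, x_lo, x_hi)   # put escaping particles back on the bound
        v[out] = 0.0                 # and stop them in that dimension
    return x_gb, f_gb

# Hypothetical test problem: shifted sphere function with its minimum at (1, 1)
best_x, best_f = pso_minimize(lambda p: float(np.sum((p - 1.0) ** 2)),
                              x_lo=[-5, -5], x_hi=[5, 5])
```

The fixed seed only makes the run repeatable; different seeds give different trajectories but should still approach the minimum of this simple test function.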
In this work, the particle positions are generated randomly over the design space using discrete Latin Hypercube Sampling, thus guaranteeing that the initial particle positions will be integers in the acceptable range. Furthermore, in the case of discrete optimization and in particular in integer programming, at every step of the optimization procedure, integer particle positions should also be generated. In order to satisfy this, Eq. (5) is modified as follows
$$
v_j(t+1) = \operatorname{round}\left[ w\, v_j(t) + c_1 r_1 \circ \left( x^{Pb,j} - x_j(t) \right) + c_2 r_2 \circ \left( x^{Gb} - x_j(t) \right) \right]
\tag{9}
$$
where the vector function round(x) rounds each element of the vector x into the nearest integer.
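A sketch of the integer update of Eq. (9) is shown below. The function name `integer_pso_step` and the clipping of positions to the allowed range are assumptions; in addition, `np.rint` rounds exact halves to the nearest even integer, which may differ from the rounding convention intended in Eq. (9).

```python
import numpy as np

def integer_pso_step(x, v, x_pb, x_gb, w, c1, c2, rng, x_lo, x_hi):
    """Eq. (9) followed by Eq. (6); integer positions stay integer."""
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v_new = np.rint(w * v + c1 * r1 * (x_pb - x) + c2 * r2 * (x_gb - x)).astype(int)
    # Assumption: positions are kept inside the allowed range, e.g. [1, N_IG]
    x_new = np.clip(x + v_new, x_lo, x_hi)
    return x_new, v_new

# Hypothetical swarm: 5 particles, 6 structural blocks, 3 inspection crews
rng = np.random.default_rng(1)
x = rng.integers(1, 4, size=(5, 6))        # crew indices 1..3
v = np.zeros((5, 6), dtype=int)
x_new, v_new = integer_pso_step(x, v, x_pb=x.copy(), x_gb=x[0],
                                w=0.9, c1=2.0, c2=2.0, rng=rng, x_lo=1, x_hi=3)
```

Because the rounded velocity is an integer vector, adding it to an integer position always yields another valid integer assignment of blocks to crews.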
4.2 Ant Colony Optimization

4.2.1 Introduction to Ant Colony Optimization

The Ant Colony Optimization (ACO) algorithm [37,38] is a population-based probabilistic technique for solving optimization problems, mainly for finding optimum paths through graphs. The algorithm was inspired by the behaviour of real ants in nature. In many ant species, individuals initially wander randomly and, upon finding a food source, return to their colony, depositing a substance called pheromone on the ground. Other ants smell this substance, and its presence influences the choice of their path, i.e. they tend to follow strong pheromone concentrations rather than travelling completely randomly, returning and reinforcing the trail if they eventually find food. The pheromone deposited on the ground forms a pheromone trail, which allows the ants to find good sources of food that have been previously identified by other ants. As time passes, the pheromone trails start to evaporate, reducing their strength. The more time it takes for an ant to travel down a path and back again, the more time the pheromone trail has to evaporate. A short path gets marched over faster than a long one, and thus the pheromone density remains high as it is laid on the path faster than it can evaporate. If there were no evaporation, the paths chosen by the first ants would tend to be excessively attractive to the following ants and, as a result, the exploration of the solution space would be constrained. In that sense, pheromone evaporation also helps to avoid convergence to a locally optimal solution. Positive feedback eventually leads to most of the ants following a single “optimum” path. The idea of the ant colony algorithm is to mimic this behavior with simulated ants walking around the graph representing the problem to solve. The first algorithm aimed to search for an optimal path in a graph.
The original idea has since diversified to solve a wider class of numerical problems and, as a result, several problems have emerged, drawing on various aspects of the behavior of ants. The initial applications of ACO were in the domain of NP-hard
combinatorial optimization problems, while it was soon also applied to routing in telecommunication networks. In ACO, a set of software agents called artificial ants search for good solutions to the optimization problem of finding the best path on a weighted graph. The ants incrementally build solutions by moving on the graph. The solution construction process is stochastic and is biased by a pheromone model, that is, a set of parameters associated with graph components (either nodes or edges) whose values are modified at runtime by the ants. The advantages of ACO include its easy implementation, the inherent parallelism of its procedures, the positive feedback that accounts for rapid discovery of good solutions in hard combinatorial optimization problems, its suitability for dynamic applications, e.g. a TSP where the distances between the nodes change with time, and its good performance with “ill-structured” problems like network routing.

4.2.2 ACO Applied to the TSP

To apply ACO to the TSP, the construction graph is considered, defined by associating the set of cities with the set of vertices on the graph. The construction graph is fully connected and the number of vertices is equal to the number of cities, since in the TSP it is possible to move from any given city to any other city. The lengths of the edges (connections) between the vertices are set equal to the corresponding distances between the nodes (cities), and pheromone values and heuristic values are set for the edges of the graph. Pheromone values are modified during iterations at runtime and represent the cumulated experience of the ant colony, while heuristic values are problem dependent values that, in the case of the TSP, are set to be the inverse of the lengths of the edges. During an ACO iteration, each ant starts from a randomly chosen vertex of the construction graph. Then, it moves along the edges of the graph keeping a memory of its path.
In order to move from one node to another it probabilistically chooses the edge to follow among those that lead to yet unvisited nodes. Once an ant has visited all the nodes of the graph, a solution has been constructed. The probabilistic rule is biased by pheromone values and heuristic information: the higher the pheromone and the heuristic value associated with an edge, the higher the probability the ant will choose that particular edge. Once all the ants have completed their tours, the iteration is complete and pheromone values on the connections are updated: each of the pheromone values is initially decreased by a certain percentage and then it receives an amount of additional pheromone proportional to the quality of the solutions to which it belongs.

4.2.3 Ant Colony Optimization Algorithm

Consider a population of m ants where at each iteration of the algorithm every ant constructs a “route” by visiting every node sequentially. Initially, ants are put on randomly chosen nodes. At each construction step during an iteration, ant k applies a probabilistic action choice rule, called the random proportional rule, to decide which node to visit next. While constructing the route, an ant k currently at node i maintains a memory $M^k$ which contains the nodes already visited, in the order they were visited. This memory is used in order to define the feasible neighborhood $N_i^k$, that is, the set of nodes that have not yet been visited by ant k. In particular, the probability with which ant k, currently at node i, chooses to go to node j is
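$$
p_{i,j}^{k} = \frac{(\tau_{i,j})^{\alpha} \cdot (\eta_{i,j})^{\beta}}{\sum_{\ell \in N_i^k} \left( (\tau_{i,\ell})^{\alpha} \cdot (\eta_{i,\ell})^{\beta} \right)}, \quad \text{if } j \in N_i^k
\tag{10}
$$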
where τi,j is the amount of pheromone on connection between i and j nodes, α is a parameter to control the influence of τi,j, β is a parameter to control the influence of ηi,j and ηi,j is a heuristic information that is available a priori, denoting the desirability of connection i,j, given by
$$
\eta_{i,j} = \frac{1}{d_{i,j}}
\tag{11}
$$
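The random proportional rule of Eq. (10), with the heuristic of Eq. (11), can be sketched as follows. The function name `transition_probabilities`, the default values α = 1 and β = 2, and the small distance matrix are assumptions for illustration only.

```python
import numpy as np

def transition_probabilities(i, visited, tau, dist, alpha=1.0, beta=2.0):
    """Random proportional rule of Eq. (10) with eta = 1/d from Eq. (11).

    Returns a vector p where p[j] is the probability of moving from node i to j.
    `visited` is the ant's memory and must contain the current node i; nodes in
    it get probability zero. At least one node must still be unvisited.
    """
    n = tau.shape[0]
    feasible = np.array([j not in visited for j in range(n)])
    weights = np.zeros(n)
    eta = 1.0 / dist[i, feasible]                      # heuristic desirability
    weights[feasible] = tau[i, feasible] ** alpha * eta ** beta
    return weights / weights.sum()

# Hypothetical 3-node example with uniform pheromone
dist = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 1.0],
                 [2.0, 1.0, 0.0]])
tau = np.ones((3, 3))
p = transition_probabilities(0, visited={0}, tau=tau, dist=dist)
```

With uniform pheromone the choice is driven by distance alone, so the nearer node receives the larger probability; an ant would then pick its next node with, for example, `rng.choice(len(p), p=p)`.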
According to Eq. (11), the heuristic desirability of going from node i to node j is inversely proportional to the distance between i and j. By definition, the probability of choosing a city outside $N_i^k$ is zero. By this probabilistic rule, the probability of choosing a particular connection i,j increases with the value of the associated pheromone trail τi,j and of the heuristic information value ηi,j. The selection of the superscript parameters α and β is very important: if α=0, the closest cities are more likely to be selected, which corresponds to a classic stochastic greedy algorithm (with multiple starting points, since ants are initially randomly distributed over the nodes). If β=0, only pheromone amplification is at work, that is, only pheromone is used without any heuristic bias (this generally leads to rather poor results [38]).

4.2.4 Pheromone Update Rule

After all the m ants have constructed their routes, the amount of pheromone for each connection between nodes i and j is updated for the next iteration t+1 as follows

$$
\tau_{i,j}(t+1) = (1 - \rho) \cdot \tau_{i,j}(t) + \sum_{k=1}^{m} \Delta\tau_{i,j}^{k}(t), \quad \forall (i,j) \in A
\tag{12}
$$
where ρ is the rate of pheromone evaporation, a constant parameter of the method, A is the set of arcs (edges or connections) that fully connects the set of nodes and $\Delta\tau_{i,j}^{k}(t)$ is the amount of pheromone ant k deposits on the connections it has visited through its tour $T^k$, typically given by
$$
\Delta\tau_{i,j}^{k} =
\begin{cases}
\dfrac{1}{L(T^k)} & \text{if connection } (i,j) \text{ belongs to } T^k \\
0 & \text{otherwise}
\end{cases}
\tag{13}
$$
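The update of Eqs. (12) and (13) can be sketched as below for a symmetric TSP. The function name `update_pheromone`, the symmetric deposit and the value ρ = 0.5 used in the example are assumptions for illustration.

```python
import numpy as np

def update_pheromone(tau, tours, lengths, rho):
    """Eq. (12) with the deposit of Eq. (13), for a symmetric TSP.

    tours[k] is the list of nodes in the order ant k visited them;
    lengths[k] is the closed-tour length L(T^k).
    """
    tau = (1.0 - rho) * tau                          # evaporation on every connection
    for tour, length in zip(tours, lengths):
        # Pair each node with its successor; the last pair closes the tour
        for a, b in zip(tour, tour[1:] + tour[:1]):
            tau[a, b] += 1.0 / length
            tau[b, a] += 1.0 / length                # assumption: symmetric distances
    return tau

# Hypothetical example: one ant, three nodes, tour length 4
tau0 = np.ones((3, 3))
tau1 = update_pheromone(tau0, tours=[[0, 1, 2]], lengths=[4.0], rho=0.5)
```

Connections on the ant's tour end up with more pheromone than untouched entries, and shorter tours deposit more because the deposit is the inverse of the tour length.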
The coefficient ρ must be set to a value