Reliability and Maintainability Assessment of Industrial Systems: Assessment of Advanced Engineering Problems (Springer Series in Reliability Engineering) 3030936228, 9783030936228

This book covers advanced reliability and maintainability knowledge as applied to recent engineering problems. It highli


English Pages 355 [347] Year 2022


Table of contents :
Preface
Contents
Editors and Contributors
Dynamic Availability Analysis for the Flexible Manufacturing System Based on a Two-Step Stochastic Model
1 Introduction
2 Problem Description
3 The Two-Step Stochastic Model
3.1 Step 1: Establishment of the Equivalent FMC Model
3.2 Step 2: Establishment of the Dynamic Availability of the Equivalent FMS
4 Numerical Example
5 Conclusion
References
Integrated Reliability and Risk Assessments of Nuclear Facilities
1 Introduction
2 Nuclear Facilities Under the Scope of Analysis
3 Reliability and Risk Assessments During Facility Lifetime
4 Reliability of Items Important to Safety
4.1 Deterministic and Probabilistic Safety Analysis
4.2 Reliability Characteristics
4.3 Reliability Engineering Tools
4.4 Reliability, Availability and Maintainability (RAM) of Items Important to Safety
4.5 Human Reliability
5 Probabilistic Risk Assessment (PRA)
5.1 PRA Steps
5.2 PRA Tools
5.3 Risk and Reliability Acceptance Criteria
6 Integrated Approach of Reliability and Risk Assessments
7 Case Study of Integrated Reliability and Risk Assessments
8 Concluding Remarks
References
Computational Tools of Media Analysis for Corporate Policy Effectiveness Evaluation: Models and Their Reliability
1 Introduction
2 Theory and Methodology of Topic Mining
3 Application of Topic Mining Algorithms to Real-Life Problems in Business and Public Administration
3.1 Policy Problem: Corporate Reputation Vulnerability
3.2 Data
3.3 Preprocessing
3.4 Latent Dirichlet Allocation
3.5 Hierarchical Dirichlet Process
3.6 Manual Labeling
4 Comparing Three Different Topic Mining Algorithms: Advantages and Disadvantages of Each Method for Practical Problem-Solving in Business
References
Optimal Design of Checkpoint Systems with General Structures, Tasks and Schemes
1 Introduction
2 Standard Model
3 General Checkpoint Models
3.1 General Structures
3.2 Examples of Redundant Modules
3.3 Two Kinds of Random Tasks
3.4 Two Schemes of Checkpoints
4 Four Examples of Random Tasks
4.1 Model 1
4.2 Model 2
4.3 Model 3
5 Conclusion
References
Series System Reliability: An Unified Approach Using Fatal Shock and Stress Models
1 Introduction
2 Methodology
2.1 Series Systems with n Components
2.2 Fatal Shock and Stress Methods
2.3 Inference Methods for a Series System
3 Probabilistic Models
4 Data Application: Operation Time of a Series System
5 Conclusion
References
The New Attempt of Network Reliability in the Assessment of Business Activities for Precision Management
1 Introduction
2 Mathematical Preliminaries
2.1 Process Modeling
2.2 Performance Evaluation of a Process at Time t
3 The SA Approach
3.1 Stage One—Searching for MPs
3.2 Stage Two—Searching for d-MPs
3.3 Stage Three—Evaluating the Network Reliability via Union Probability
4 The Assessment of Individuals and Groups
4.1 The Performance Indicators for Individuals
4.2 The Performance Indicators for Groups
4.3 The Assessments
5 The Bonus Dispatching for the Process
6 The Case Studies
6.1 Discussions of the Case Study on Precision Management
7 Conclusion
References
Optimal Design of Reliability Acceptance Sampling Plans for Multi-stage Production Process
1 Introduction
2 Design of Single Sampling Plan (SSP) Under Type II Censoring
2.1 Formation of Optimization Problem and Its Solution
3 Construction of Double Sampling Plan (DSP) Under Type II Censoring
4 Numerical Results and Comparisons
5 Comparisons of Single Sampling and Double Sampling Plans
6 Sensitivity Analysis of Sampling Plans
7 Some Case Studies
8 Conclusions
References
The Importance of Technical Diagnostics for Ensuring the Reliability of Industrial Systems
1 Introduction
2 Theoretical Settings of Industrial Systems Maintenance
2.1 Maintenance Strategies and Concepts
2.2 Condition Based Maintenance Strategy
3 Technical Diagnostics—Basis of Condition Based Maintenance
3.1 The Concept of Technical Diagnostics
3.2 Condition, Changes and Condition Control of the Technical System
3.3 Technical Diagnostic Processes
3.4 Technical Diagnostic Methods
3.5 Methods of Measurement and Analysis of Vibrations
3.6 Concept, Classification and Digital Signal Processing
3.7 Importance of Technical Diagnostics on the Reliability of Technical Systems
4 Application of Technical Diagnostics in Industrial Practice
4.1 Description of the Reference Event
5 Discussion of Results
6 Conclusion
References
Reliability Assessment of Replaceable Shuffle Exchange Network with an Additional Stage Using Interval-Valued Universal Generating Function
1 Introduction
2 Assumptions
3 Notations
4 Definitions
5 Terminal Reliability of SEN+
6 Broadcast Reliability of SEN+
7 Network Reliability of SEN+
8 Numerical Illustration
8.1 Terminal Reliability of the Replaceable SEN+ under Consideration by Using the IUGF Approach
8.2 MTTF of the SEN+ under Consideration
8.3 Broadcast Reliability of the Considered Replaceable SEN+ by Using the Method of IUGF
8.4 MTTF of the SEN+
8.5 Network Reliability of the Replaceable SEN+ using IUGF Approach
8.6 MTTF of the SEN+
9 Result and Discussion
10 Conclusion
References
Modeling Software Vulnerability Injection-Discovery Process Incorporating Time-Delay and VIKOR Based Ranking
1 Introduction
2 Literature Review
3 Methodology
3.1 Modeling Framework
4 Data Analysis and Model Validation
4.1 Estimation of Parameters
4.2 Goodness-of-Fit Criteria
4.3 Graphical Approach (Goodness of Fit)
4.4 VIKOR Analysis
4.5 Illustrated Examples
5 Conclusion
References
Assessment of Reliability Function and Signature of Energy Plant Complex System
1 Introduction
2 Evolution of the Signature, Minimal Signature, Tail Signature and Cost from Structure Function of the Proposed Model
3 Model Description
4 Numerical Example
5 Result and Discussion
6 Conclusion
References
Reliability Evaluation and Cost Optimization of Solar Air-Conditioner
1 Introduction
2 Working of Solar Air-Conditioner
3 Assumptions and Notations
4 Mathematical Modelling
4.1 Formulation of the Model
4.2 Solution of the Model
4.3 Availability
4.4 Reliability Analysis
4.5 Mean Time to Failure
4.6 MTTF Sensitivity
4.7 Reliability Sensitivity
5 Optimization of Cost of a Solar Air Conditioner Using PSO
6 Results Analysis
7 Conclusion
References
Analyzing Interrelationships Among Software Vulnerabilities Using Fuzzy DEMATEL Approach
1 Introduction
2 Research Methodology
2.1 Dataset Description
2.2 Fuzzy DEMATEL Approach
3 Data Analysis
4 Conclusion
References
Universal Generating Function Approach for Evaluating Reliability and Signature of All-Digital Protection Systems
1 Introduction
1.1 Definitions
1.2 Alternative System Structure Designs
1.3 System Reliability Block Diagram
1.4 Universal Generating Function (UGF)
1.5 Assessment of System Reliability
1.6 Signature Reliability
1.7 Signature of ADPS
1.8 Algorithm for Finding Expected Lifetime of the ADPS with Minimal Signature
1.9 Mean Time to Failure (Expected Lifetime) of the All-Digital Protection System
2 Conclusion
References
Reliability Analysis of 8 × 8 SEN- Using UGF Method
1 Introduction
2 Preliminaries
2.1 Interconnection Network
2.2 Multistage Interconnection Network (MIN)
2.3 Universal Generating Function (UGF)
3 Shuffle Exchange Network with One Stage Less (SEN-)
3.1 New SEN-
4 Reliability Analysis of SEN-
4.1 Terminal Reliability
4.2 Broadcast Reliability
4.3 Network Reliability
5 Conclusion
References

Springer Series in Reliability Engineering

Mangey Ram Hoang Pham   Editors

Reliability and Maintainability Assessment of Industrial Systems Assessment of Advanced Engineering Problems

Springer Series in Reliability Engineering Series Editor Hoang Pham, Department of Industrial and Systems Engineering, Rutgers University, Piscataway, NJ, USA

More information about this series at https://link.springer.com/bookseries/6917


Editors Mangey Ram Department of Mathematics, Computer Science and Engineering Graphic Era University Dehradun, Uttarakhand, India

Hoang Pham Department of Industrial and Systems Engineering Rutgers University Piscataway, NJ, USA

Institute of Advanced Manufacturing Technologies Peter the Great St. Petersburg Polytechnic University Saint Petersburg, Russia

ISSN 1614-7839 ISSN 2196-999X (electronic) Springer Series in Reliability Engineering ISBN 978-3-030-93622-8 ISBN 978-3-030-93623-5 (eBook) https://doi.org/10.1007/978-3-030-93623-5 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Advanced technology has introduced complexity into industry, in both engineering systems and production, and this has increased risk. Reliability analysis is necessary to control and manage that risk. This book, entitled “Reliability and Maintainability Assessment of Industrial Systems: Assessment of Advanced Engineering Problems,” addresses engineering problems of reliability and maintenance that arise from the growing complexity of production and manufacturing, along with demands for quality, general awareness of the need to sustain the level of excellence of the system, updated rules and regulations on equipment reliability, government-recommended requirements for specifying dependability and maintainability performance, and the effect on income of the high cost of failures and repairs. The book aims to give engineers and researchers greater awareness of reliability and maintainability engineering and to help them apply it. It is meant for those who take reliability engineering as a subject of study, and the material is intended for readers at the postgraduate or senior undergraduate level.

Mangey Ram, Dehradun, India
Hoang Pham, Piscataway, USA

Acknowledgements

The editors acknowledge Springer for this opportunity and professional support. Also, we would like to thank all the chapter authors and reviewers for their availability for this work.


Contents

Dynamic Availability Analysis for the Flexible Manufacturing System Based on a Two-Step Stochastic Model
Wenbin Zeng, Guixiang Shen, Ilia Frenkel, Igor Bolvashenkov, Jörg Kammermann, Hans-Georg Herzog, Lev Khvatskin, and Anatoly Lisnianski

Integrated Reliability and Risk Assessments of Nuclear Facilities
Vanderley de Vasconcelos, Graiciany de Paula Barros, Antônio Carlos Lopes da Costa, and Wellington Antonio Soares

Computational Tools of Media Analysis for Corporate Policy Effectiveness Evaluation: Models and Their Reliability
Gregory S. Khvatsky, Dmitry G. Zaytsev, Valentina V. Kuskova, and Anna A. Sokol

Optimal Design of Checkpoint Systems with General Structures, Tasks and Schemes
Kenichiro Naruse and Toshio Nakagawa

Series System Reliability: An Unified Approach Using Fatal Shock and Stress Models
Ricardo Puziol de Oliveira, Marcos Vinicius de Oliveira Peres, Wesley Bertoli, and Jorge Alberto Achcar

The New Attempt of Network Reliability in the Assessment of Business Activities for Precision Management
Shin-Guang Chen

Optimal Design of Reliability Acceptance Sampling Plans for Multi-stage Production Process
M. Kumar

The Importance of Technical Diagnostics for Ensuring the Reliability of Industrial Systems
D. Lj. Branković, Z. N. Milovanović, and V. Z. Janičić Milovanović

Reliability Assessment of Replaceable Shuffle Exchange Network with an Additional Stage Using Interval-Valued Universal Generating Function
Amisha Khati and S. B. Singh

Modeling Software Vulnerability Injection-Discovery Process Incorporating Time-Delay and VIKOR Based Ranking
Mohini Agarwal, Deepti Aggrawal, Subhrata Das, Adarsh Anand, and Navneet Bhatt

Assessment of Reliability Function and Signature of Energy Plant Complex System
Monika Negi, Megha Shah, Akshay Kumar, Mangey Ram, and Seema Saini

Reliability Evaluation and Cost Optimization of Solar Air-Conditioner
Ashok Singh Bhandari, Mangey Ram, Akshay Kumar, and Sushil Chandra Dimri

Analyzing Interrelationships Among Software Vulnerabilities Using Fuzzy DEMATEL Approach
Misbah Anjum, P. K. Kapur, Vernika Agarwal, and Vivek Kumar

Universal Generating Function Approach for Evaluating Reliability and Signature of All-Digital Protection Systems
Soni Bisht and S. B. Singh

Reliability Analysis of 8 × 8 SEN- Using UGF Method
Vaibhav Bisht and S. B. Singh

Editors and Contributors

About the Editors

Prof. Mangey Ram received the Ph.D. degree, with a major in Mathematics and a minor in Computer Science, from G. B. Pant University of Agriculture and Technology, Pantnagar, India. He has been a faculty member for around thirteen years and has taught several core courses in pure and applied mathematics at undergraduate, postgraduate, and doctorate levels. He is currently Research Professor at Graphic Era (Deemed to be University), Dehradun, India, and Visiting Professor at Peter the Great St. Petersburg Polytechnic University, Saint Petersburg, Russia. Before joining Graphic Era, he was Deputy Manager (Probationary Officer) with Syndicate Bank for a short period. He is Editor-in-Chief of International Journal of Mathematical, Engineering and Management Sciences; Journal of Reliability and Statistical Studies; and Journal of Graphic Era University; Series Editor of six Book Series with Elsevier, CRC Press (Taylor & Francis Group), Walter De Gruyter Publisher Germany, and River Publisher; and Guest Editor and Associate Editor with various journals. He has published more than 250 publications (journal articles, books, book chapters, and conference articles) with IEEE, Taylor & Francis, Springer Nature, Elsevier, Emerald, World Scientific and many other national and international journals and conferences. He has also published more than 50 books (authored or edited) with international publishers such as Elsevier, Springer Nature, CRC Press (Taylor & Francis Group), Walter De Gruyter Publisher Germany, and River Publisher. His fields of research are reliability theory and applied mathematics. He is Senior Member of the IEEE, Senior Life Member of Operational Research Society of India, Society for Reliability Engineering, Quality and Operations Management in India, and Indian Society of Industrial and Applied Mathematics. He has been a member of the organizing committee of a number of international and national conferences, seminars, and workshops. He was conferred the “Young Scientist Award” by the Uttarakhand State Council for Science and Technology, Dehradun, in 2009. He was awarded the “Best Faculty Award” in 2011, the “Research Excellence Award” in 2015, and the “Outstanding Researcher Award” in 2018 for his significant contribution in academics and research at Graphic Era Deemed to be University, Dehradun, India. Recently, he received the “Excellence in Research of the Year-2021 Award” from the Honorable Chief Minister of Uttarakhand State, India.

Dr. Hoang Pham is Distinguished Professor and former Chairman (2007–2013) of the Department of Industrial and Systems Engineering at Rutgers University, New Jersey, USA. Before joining Rutgers, he was Senior Engineering Specialist with the Boeing Company and the Idaho National Engineering Laboratory. He has served as editor-in-chief, editor, associate editor, guest editor, and board member of many journals. He is Editor of Springer Book Series in Reliability Engineering and has served as conference chair and program chair of over 40 international conferences. He is the author or coauthor of 7 books and has published over 200 journal articles and 100 conference papers, and has edited 15 books including Springer Handbook in Engineering Statistics and Handbook in Reliability Engineering. His numerous awards include the 2009 IEEE Reliability Society Engineer of the Year Award. He is Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and the Institute of Industrial Engineers (IIE).

Contributors

Jorge Alberto Achcar: University of São Paulo, São Paulo, Brazil
Mohini Agarwal: Amity School of Business, Amity University Uttar Pradesh, Noida, India
Vernika Agarwal: Amity International Business School, Amity University, Noida, Uttar-Pradesh, India
Deepti Aggrawal: University School of Management and Entrepreneurship, Delhi Technological University, Delhi, India
Adarsh Anand: Department of Operational Research, University of Delhi, Delhi, India
Misbah Anjum: Amity Institute of Information Technology, Amity University, Noida, Uttar-Pradesh, India
Wesley Bertoli: Federal University of Technology, Curitiba, Paraná, Brazil
Ashok Singh Bhandari: Department of Mathematics, Graphic Era Hill University, Uttarakhand, India
Navneet Bhatt: Anil Surendra Modi School of Commerce, SVKM’s Narsee Monjee Institute of Management Studies (Deemed to be University), Mumbai, India
Soni Bisht: Department of Mathematics, Eternal University, Himachal Pradesh, Baru Sahib, India
Vaibhav Bisht: Department of Mathematics, Statistics and Computer Science, G.B. Pant University of Agriculture and Technology, Pantnagar, India
Igor Bolvashenkov: Institute of Energy Conversion Technology, Technical University of Munich (TUM), Munich, Germany
D. Lj. Branković: Department of Hydro and Thermal Engineering, Faculty of Mechanical Engineering Banja Luka, University of Banja Luka, Banja Luka, Bosnia and Herzegovina
Shin-Guang Chen: Tungnan University, New Taipei City, Taiwan
Subhrata Das: Department of Operational Research, University of Delhi, Delhi, India
Antônio Carlos Lopes da Costa: Centro de Desenvolvimento da Tecnologia Nuclear - CDTN, Pampulha, Belo Horizonte, Minas Gerais, Brazil
Ricardo Puziol de Oliveira: Maringá State University, Maringá, Brazil
Marcos Vinicius de Oliveira Peres: Maringá State University, Maringá, Brazil
Graiciany de Paula Barros: Centro de Desenvolvimento da Tecnologia Nuclear - CDTN, Pampulha, Belo Horizonte, Minas Gerais, Brazil
Vanderley de Vasconcelos: Centro de Desenvolvimento da Tecnologia Nuclear - CDTN, Pampulha, Belo Horizonte, Minas Gerais, Brazil
Sushil Chandra Dimri: Department of Computer Science and Engineering, Graphic Era Deemed To Be University, Uttarakhand, India
Ilia Frenkel: Center for Reliability and Risk Management, Shamoon College of Engineering, Beer Sheva, Israel
Hans-Georg Herzog: Institute of Energy Conversion Technology, Technical University of Munich (TUM), Munich, Germany
Jörg Kammermann: Institute of Energy Conversion Technology, Technical University of Munich (TUM), Munich, Germany
P. K. Kapur: Amity Center for Inter-Disciplinary Research, Amity University, Noida, Uttar-Pradesh, India
Amisha Khati: Department of Mathematics, Statistics and Computer Science, G. B. Pant University of Agriculture and Technology, Pantnagar, India
Lev Khvatskin: Center for Reliability and Risk Management, Shamoon College of Engineering, Beer Sheva, Israel
Gregory S. Khvatsky: International Laboratory for Applied Network Research, HSE University, Moscow, Russia
Akshay Kumar: Department of Mathematics, Graphic Era Hill University, Dehradun, Uttarakhand, India
M. Kumar: Department of Mathematics, National Institute of Technology, Calicut, Kerala, India
Vivek Kumar: Department of Operational Research, University of Delhi, Delhi, India
Valentina V. Kuskova: International Laboratory for Applied Network Research, HSE University, Moscow, Russia
Anatoly Lisnianski: Center for Reliability and Risk Management, Shamoon College of Engineering, Beer Sheva, Israel
V. Z. Janičić Milovanović: Banja Luka, Bosnia and Herzegovina
Z. N. Milovanović: Department of Hydro and Thermal Engineering, Faculty of Mechanical Engineering Banja Luka, University of Banja Luka, Banja Luka, Bosnia and Herzegovina
Toshio Nakagawa: Department of Business Administration, Aichi Institute of Technology, Aichi, Japan
Kenichiro Naruse: Faculty of Social and Environmental Studies, Josai International University, Chiba, Japan
Monika Negi: Department of Mathematics, Graphic Era Hill University, Dehradun, Uttarakhand, India
Mangey Ram: Department of Mathematics, Computer Science and Engineering, Graphic Era Deemed To Be University, Uttarakhand, India; Institute of Advanced Manufacturing Technologies, Peter the Great St. Petersburg Polytechnic University, Saint Petersburg, Russia
Seema Saini: Department of Mathematics, Graphic Era Deemed To Be University, Dehradun, Uttarakhand, India
Megha Shah: Department of Mathematics, Graphic Era Hill University, Dehradun, Uttarakhand, India
Guixiang Shen: School of Mechanical and Aerospace Engineering, Jilin University, Changchun, China
S. B. Singh: Department of Mathematics, Statistics and Computer Science, G.B. Pant University of Agriculture and Technology, Pantnagar, India
Wellington Antonio Soares: Centro de Desenvolvimento da Tecnologia Nuclear - CDTN, Pampulha, Belo Horizonte, Minas Gerais, Brazil
Anna A. Sokol: International Laboratory for Applied Network Research, HSE University, Moscow, Russia
Dmitry G. Zaytsev: International Laboratory for Applied Network Research, HSE University, Moscow, Russia
Wenbin Zeng: Fine Mechanics and Physics, Changchun Institute of Optics, Chinese Academy of Science, Changchun, China

Dynamic Availability Analysis for the Flexible Manufacturing System Based on a Two-Step Stochastic Model

Wenbin Zeng, Guixiang Shen, Ilia Frenkel, Igor Bolvashenkov, Jörg Kammermann, Hans-Georg Herzog, Lev Khvatskin, and Anatoly Lisnianski

Abstract The paper proposes a dynamic availability analysis approach for the flexible manufacturing system (FMS) under a stochastic environment in which machine failures and starvation or blockage of the production process occur randomly. Accurate knowledge of the availability of the FMS, which changes dynamically over time under stochastic conditions, can greatly benefit the improvement or re-design of the system. The two-step stochastic model proposed in this paper integrates the intermediate buffers into the associated workstations equivalently, in terms of the relationships between upstream and downstream production rates. Calculation procedures for the relevant dynamic availability are established using the Lz-transform method, which overcomes the state-explosion problem that is common in FMS performance analysis. Meanwhile, the impacts of intermediate buffers on the FMS dynamic availability are also revealed, which assists in determining the appropriate volumes of the buffers to satisfy the various FMS production demands. A numerical example is presented to illustrate the effectiveness and rationality of the approach.

Keywords Flexible manufacturing system · Dynamic availability analysis · Two-step stochastic model · Intermediate buffers · Multi-state system · Lz-transform method

W. Zeng: Fine Mechanics and Physics, Changchun Institute of Optics, Chinese Academy of Science, Changchun 130022, China
G. Shen: School of Mechanical and Aerospace Engineering, Jilin University, Changchun 130022, China
I. Frenkel (B) · L. Khvatskin · A. Lisnianski: Center for Reliability and Risk Management, Shamoon College of Engineering, 84100 Beer Sheva, Israel
I. Bolvashenkov · J. Kammermann · H.-G. Herzog: Institute of Energy Conversion Technology, Technical University of Munich (TUM), 80333 Munich, Germany

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_1

1 Introduction

With the increasing demand for customized, high-quality products, leveraging the advantages of new technologies in the context of Industry 4.0 is seen as an important factor for manufacturers to maintain competitiveness while responding to the sustainability challenge [11, 27]. Industry 4.0, predicted to be the fourth industrial revolution, will embed information technology more deeply into manufacturing automation [24]. Namely, Industry 4.0 will use the Internet of things (IoT) in production systems to create a cyber-physical production system (CPPS) [17], which is characterized by sharing production operation information over the Internet with multiple systems inside the smart factory [16]. The flexible manufacturing system (FMS) is an inevitable choice for companies responding to rapid market changes and customer customization requirements; it has become a hot topic and plays an important role in the realization of Industry 4.0. It can achieve the goals of Industry 4.0, which include a higher level of operational efficiency and productivity as well as a higher level of automation [12, 35]. Therefore, research on the performance analysis of FMS is not only beneficial to its design, improvement and control, but can also serve as one of the foundations for implementing the intelligent factory.

An FMS is generally defined as an integrated computer-controlled system with a set of workstations, interconnected by an automated material handling system (MHS), which can simultaneously perform several different tasks [3, 9]. According to Groover [9] and Jin and Liu [14], the workstations (such as CNC machine tools, industrial robots, washing and measuring machines) are the processing stations of the FMS, where several machines are used at the same stage to meet productivity and production balance requirements. The automated material handling system transfers parts from one station to another, while the central computer controls and coordinates the performance of the workstations and the material handling system. Moreover, finite intermediate buffers are set between workstations so that processing tasks can be completed when upstream or downstream devices have failed.

Due to the essential and complicated nature of the FMS, there have been various studies on the performance metrics of the FMS as well as of the flexible manufacturing cell (FMC). Under the assumption that the elements of the FMS are available at all times, Singholi et al. [33] presented a framework based on the Taguchi experimental design for evaluating the effects of varying levels of machine and routing flexibility on the makespan, average waiting time (AWT), and average utilization of an FMS. Dosdogru et al. [6] used an integrated genetic algorithm and Monte Carlo method to solve the stochastic flexible job shop scheduling problem and to measure the impact of routing flexibility on shop performance. Jain and Raj [13] applied three different approaches to analyze the intensity of performance variables in an organization and proposed an FMS performance index to quantify the intensity of the factors that affect the FMS. Gyulai et al. [10] proposed a simulation-based optimization method that uses lower-level shop floor data to calculate robust production plans for the final assembly lines of a flexible, multi-stage production system. Rybicka et al. [26] demonstrated how discrete event simulation (DES) can address complexity in an FMS to optimize production line performance.

However, the reliability of machines has a considerable impact on the practical performance of manufacturing systems. The disturbances caused by breakdowns lead to scheduling problems, which decrease the productivity of the entire manufacturing process. This makes machine reliability an important consideration in the performance evaluation of FMS, especially in light of the increasing complexity of such systems in recent years. Consequently, many researchers have recognized the importance of the reliability features of the FMS, and much work has been contributed. Koulamas [15] developed a semi-Markov model to study the effects of tool failures on the performance of an FMC. Das et al. [5] proposed an approach that provides flexible routing, which ensures high overall performance of the cellular manufacturing system (CMS) by minimizing the impact of machine failure through the provision of alternative process routes. Elleuch et al. [8] proposed a Markov-based model for reducing the severity of breakdowns and improving the performance of the CMS with unreliable machines. Loganathan et al. [23] suggested a methodology for the availability evaluation of manufacturing systems using a semi-Markov model that considers variable failure or repair rates. Zeng et al. [38] developed a multi-state transition model for CNC machine tools that included the setup of production, and introduced the Lz-transform method for the productivity and availability evaluation of the multi-state FMS. Savsar [28] developed stochastic models to determine the performance of an FMC under random operational conditions, including random failures of cell components in addition to random processing times, random machine loading and unloading times, and random pallet transfer times. Moreover, Savsar [30] developed a stochastic multi-state model to analyze the performance measures of a cell that is allowed to operate in degraded mode; the results can be useful for practitioners analyzing the performance of an FMC at the design or operational stage. Chen et al. [4] proposed a process performance evaluation chart to examine the manufacturing performance of bearing connectors, which can also be used for multiple evaluations of manufacturing performance in other manufacturing industries. Manocher and Hamid [25] developed a discrete-event simulation model to investigate the effectiveness of a reliability plan focusing on the most critical machines in a manufacturing cell. More studies on the performance analysis of manufacturing systems can be found in Anupma and Jayswal [2], Savsar [29], Li and Huang [18], and Alhourani [1].


Although the aforementioned studies have been undertaken, only a few include the influence of intermediate buffers on the performance measures of an FMS. Tan and Gershwin [34] proposed a tool for the performance evaluation of general Markovian continuous material flow in two-stage production systems with one intermediate buffer. Liu et al. [22] modeled and analyzed the throughput of a two-stage manufacturing system with multiple independent unreliable machines at each stage and one finite-sized buffer between the stages. Duan et al. [7] suggested a reliability modeling and evaluation methodology for a repairable, non-series, multi-state FMS with finite buffers by using an improved vector universal generating function to satisfy the market demands on the capacity and capability of the system. However, only steady-state FMS performance metrics, such as steady-state availability, theoretical production rates, utilization rates, efficient production rates and production loss, are calculated in these works. Therefore, a two-step stochastic model is presented in this paper as an attempt to evaluate the dynamic performance of the FMS, especially its availability, and to meet various production demands by selecting appropriate buffer volumes.

The remainder of the article is structured as follows. The description of the FMS and the corresponding assumptions are given in the section "Problem Description". The section "The Two-Step Stochastic Model" then introduces the procedure for establishing the required model in detail. Subsequently, an exploratory "Numerical Example" is used to demonstrate the effectiveness of the proposed approach, and remarks as well as future work are discussed in the "Conclusion".

2 Problem Description

The flexible manufacturing system (FMS) investigated in this paper consists of $m$ flexible manufacturing cells (FMCs) and the corresponding intermediate buffers for a part family; the schematic is shown in Fig. 1. Each FMC contains a varying number of identical processing machines and a robot for automatic loading and unloading. Note that the machines process the same work independently, and each is connected in series with the robot. The buffers are situated between every two FMCs.

Fig. 1 Schematic of the FMS consisting of multiple FMCs and corresponding buffers: FMC is the abbreviation of the flexible manufacturing cell; M, R and B represent the processing machines, robots and intermediate buffers, respectively

The $i$-th FMC, $i = 1, \ldots, m$, is taken as an example to illustrate the multi-state characteristics of the devices. Machine $M_j^{(i)}$, $j = 1, \ldots, H$, is one machine of the $i$-th FMC that has $K_j$ performance rates, represented by the vector $g_j^{(i)} = \{g_{j1}^{(i)}, \ldots, g_{jK_j}^{(i)}\}$, and robot $R^{(i)}$ is a binary-state device (working or total failure) with the performance vector $r^{(i)} = \{r_0^{(i)}, r_1^{(i)}\}$. Let $g_{j1}^{(i)}$ and $r_0^{(i)}$ represent the worst performance (total failure) of the machines and the robot, and $g_{jK_j}^{(i)}$ and $r_1^{(i)}$ the best. Therefore, the performance $G_{FMC}^{(i)}(t)$ of the FMC at any instant $t \ge 0$ is a random variable with

$$G_{FMC}^{(i)}(t) \in \Omega_f\left\{ r^{(i)}, g_1^{(i)}, \ldots, g_H^{(i)} \right\},$$

where $\Omega_f$ is the structure operator. Including all devices in this FMC, the state space of the FMC is regarded as the product of the states of all processing machines and the robot.

In order to assess the dynamic availability of the FMS, the following assumptions are made:

I. The maintenance strategy for all devices is corrective maintenance with "as bad as old" repair, i.e. minimal repair. Moreover, the repair times of all devices follow the exponential distribution.

II. Processing machines have aging characteristics and their failure rates obey the Weibull distribution, while the failure rates of the robots follow the exponential distribution. The failure of any equipment is state-based, that is, the equipment fails only while processing a blank and does not fail during a shutdown caused by starvation or blockage.

III. The blank transportation time and the robots' loading and unloading times are negligible compared to the machines' processing time. There is no fault during the blank transition.

IV. All machines located in the first FMC have enough blanks to be machined, and all machines located in the terminal FMC have enough storage space. In addition, when the buffer $B^{(i)}$ is under blockage or starvation, all FMCs in the upstream or in the downstream, respectively, are shut down.

3 The Two-Step Stochastic Model

Duan et al. [7] studied the relationships between the buffers and the productivities of the upstream and downstream stations in order to establish the steady-state transition equation and obtain the probability of blanks remaining in the buffer. However, the productivities of the workstations are set as constants, which means that the aging and performance degradation of the machines, which clearly influence the productivity of the system, are excluded from that equation. Consequently, this paper attempts to bring such factors in to explore the dynamic performance of the FMS, which results in a two-step stochastic model. The specific procedures are as follows.

3.1 Step 1: Establishment of the Equivalent FMC Model

The FMS steady-state performance models were established by Duan et al. [7]. These models cannot be used directly for systems with stochastic characteristics; however, the logic of the analysis is still worth adopting. Zhou and Liu [39] proposed a method by which the FMS (shown in Fig. 1) can be decomposed into $(m - 1)$ processing domains, each with two FMCs and one buffer, as shown in Fig. 2. Suppose the volume of buffer $B^{(i)}$ is $b_i$, the production rate of the upstream $FMC^{(i)}$ is $PR^{(i)}(t)$ and that of the downstream $FMC^{(i+1)}$ is $PR^{(i+1)}(t)$. Then, in a very short period, at most one blank enters or leaves the buffer, and the intensity of workpieces entering and leaving the buffer area is determined by the productivity of the upstream and downstream stations. The change process of the number of blanks in the buffer is therefore as shown in Fig. 3. The change process of the blanks in the buffer $B^{(i)}$ can be regarded as a nonlinear birth–death process, i.e. a discrete-state continuous-time Markov process [37], and the corresponding state transition intensity matrix $a^{(i)}$ is

Fig. 2 Schematic of one processing domain

Fig. 3 The change process of the number of blanks in the buffer




$$
a^{(i)} =
\begin{bmatrix}
-PR^{(i)}(t) & PR^{(i)}(t) & 0 & \cdots & 0 \\
PR^{(i+1)}(t) & -\left[PR^{(i)}(t) + PR^{(i+1)}(t)\right] & PR^{(i)}(t) & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & PR^{(i+1)}(t) & -\left[PR^{(i)}(t) + PR^{(i+1)}(t)\right] & PR^{(i)}(t) \\
0 & \cdots & 0 & PR^{(i+1)}(t) & -PR^{(i+1)}(t)
\end{bmatrix}
$$
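The structure of this matrix can be illustrated with a minimal sketch. It assumes the production rates are frozen at one instant $t$ and that state $k$ means $k$ blanks are stored in the buffer; the function name and argument names are hypothetical, not taken from the chapter.

```python
def buffer_intensity_matrix(pr_up, pr_down, volume):
    """Return the (volume + 1) x (volume + 1) intensity matrix of one buffer.

    State k means k blanks are stored. pr_up is the upstream production rate
    (blanks entering) and pr_down the downstream rate (blanks leaving), both
    evaluated at a single instant t.
    """
    n = volume + 1
    a = [[0.0] * n for _ in range(n)]
    for k in range(n):
        if k < volume:        # a blank arrives from the upstream FMC
            a[k][k + 1] = pr_up
        if k > 0:             # a blank leaves to the downstream FMC
            a[k][k - 1] = pr_down
        a[k][k] = -sum(a[k])  # diagonal entry makes the row sum to zero
    return a


# Illustrative numbers only: 12 units/h upstream, 10 units/h downstream, b = 3.
example = buffer_intensity_matrix(12.0, 10.0, 3)
```

Each row sums to zero, as required of a transition intensity matrix, and the first and last rows have a single off-diagonal entry because an empty buffer cannot lose a blank and a full buffer cannot accept one.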

Moreover, Zhou and Liu [39] explained that the initial probability distribution has no effect on the dynamic probability of the above process; the dynamic probability of the number of blanks in $B^{(i)}$ at time $t$ can then be calculated. Subsequently, considering the influence of the front and rear buffers, the FMC has the following five independent states and equivalent probabilities:

(1) The equivalent probability that $FMC^{(i)}$ operates normally is $\bar{p}^{B}_{(i-1)0}(t)\, P_V^{(i)}(t)\, \bar{p}^{B}_{ib_i}(t)$;

(2) The equivalent probability that $FMC^{(i)}$ lacks blanks while its output is normal is $p^{B}_{(i-1)0}(t)\, P_V^{(i)}(t)\, \bar{p}^{B}_{ib_i}(t)$;

(3) The equivalent probability that the input of $FMC^{(i)}$ is normal while its output is blocked is $\bar{p}^{B}_{(i-1)0}(t)\, P_V^{(i)}(t)\, p^{B}_{ib_i}(t)$;

(4) The equivalent probability that the input of $FMC^{(i)}$ is empty while its output is blocked is $p^{B}_{(i-1)0}(t)\, P_V^{(i)}(t)\, p^{B}_{ib_i}(t)$;

(5) The equivalent probability that $FMC^{(i)}$ suffers an equipment failure is $\bar{P}_V^{(i)}(t)$,

where $P_V^{(i)}(t)$ is the dynamic probability of normal operation of $FMC^{(i)}$ when the impact of the buffers is excluded, $\bar{p}^{B}_{(i-1)0}(t)$ and $p^{B}_{(i-1)0}(t)$ are the dynamic probabilities of $B^{(i-1)}$ for non-starvation and starvation, respectively, and $\bar{p}^{B}_{ib_i}(t)$ and $p^{B}_{ib_i}(t)$ are the dynamic probabilities of $B^{(i)}$ for non-blockage and blockage, respectively.
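As a quick consistency check, the five equivalent probabilities can be computed at one instant and summed. This is only a sketch under two assumptions that the chapter does not state explicitly: the non-starvation and non-blockage probabilities are the complements of the starvation and blockage probabilities, and the failure probability is the complement of the normal-operation probability. All names and numbers below are hypothetical.

```python
def equivalent_state_probabilities(p_starve, p_work, p_block):
    """Return the five equivalent state probabilities of one FMC at one instant.

    p_starve: probability that the upstream buffer is empty (starvation)
    p_work:   probability that the FMC itself operates normally
    p_block:  probability that the downstream buffer is full (blockage)
    """
    p_not_starve = 1.0 - p_starve   # assumed complement
    p_not_block = 1.0 - p_block     # assumed complement
    return {
        "normal": p_not_starve * p_work * p_not_block,
        "starved": p_starve * p_work * p_not_block,
        "blocked": p_not_starve * p_work * p_block,
        "starved_and_blocked": p_starve * p_work * p_block,
        "failed": 1.0 - p_work,     # assumed complement
    }


# Illustrative values only.
states = equivalent_state_probabilities(p_starve=0.1, p_work=0.95, p_block=0.2)
```

Under these assumptions the five probabilities add up to one, which is what one would expect of a complete set of mutually exclusive states.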

Therefore, the equivalent FMS can be established on the basis of these probabilistic relationships, and the result is shown in Fig. 4. According to Fig. 4 and the above analysis, it is necessary to determine the dynamic productivity of each FMC and the equivalent system performance model in order to complete the dynamic performance analysis. Hence, the Lz-transform method is introduced to accomplish this task, and the specific analysis procedure is as follows.

Fig. 4 Schematic of the equivalent FMS

Machine $M_j^{(i)}$, $j = 1, \ldots, H$, in the $i$-th FMC has $K_j$ states, where the transition intensities between states may be functions of time $t$, represented by the transition intensity matrix $a_j = \left[a_{juv}(t)\right]$, $j = 1, \ldots, H$, $u, v = 1, \ldots, K_j$. The machine can then be regarded as a discrete-state continuous-time (DSCT) Markov process [36] $G_j^{(i)}(t) \in g_j^{(i)} = \{g_{j1}^{(i)}, \ldots, g_{jK_j}^{(i)}\}$, which is described by the following triplet notation:

$$G_j^{(i)}(t) = \left\langle g_j^{(i)}, a_j, p_j(0) \right\rangle, \tag{1}$$

where $p_j(0)$ is the initial probability distribution of the machine. The state $K_j$ with the highest performance level is generally defined as the initial state, which is expressed by Eq. (2):

$$p_j(0) = \left[ p_{j1}(0) = \Pr\left\{G_j^{(i)}(0) = g_{j1}^{(i)}\right\}, \ldots, p_{jK_j}(0) = \Pr\left\{G_j^{(i)}(0) = g_{jK_j}^{(i)}\right\} \right] = [0, \ldots, 1] \tag{2}$$

The Chapman-Kolmogorov differential equations should be solved in order to find the state probabilities of such a system. Once the state-transition diagram for the machine is built, Eq. (3) can be written as follows:

$$\frac{dp_{ju}(t)}{dt} = \sum_{\substack{v=1 \\ v \ne u}}^{K_j} p_{jv}(t)\, a_{jvu}(t) - p_{ju}(t) \sum_{\substack{v=1 \\ v \ne u}}^{K_j} a_{juv}(t), \quad u = 1, \ldots, K_j \tag{3}$$

Introducing the row vector $p_j(t) = \left[p_{j1}(t), p_{j2}(t), \ldots, p_{jK_j}(t)\right]$, Eq. (3) can be rewritten in matrix notation:

$$\frac{dp_j(t)}{dt} = p_j(t)\, a_j \tag{4}$$
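System (4) generally has no closed-form solution when the intensities depend on time, so it is usually integrated numerically. The following is a minimal sketch using the classical fourth-order Runge-Kutta scheme; the choice of integrator, the function names and the two-state test device are assumptions made for illustration and are not taken from the chapter.

```python
import math


def solve_state_probabilities(intensity, p0, t_end, steps):
    """Integrate dp/dt = p(t) a(t) from 0 to t_end with classical RK4.

    intensity(t) must return the intensity matrix a(t) as a list of rows,
    where a[v][u] is the transition intensity from state v to state u.
    """
    def deriv(t, p):
        a = intensity(t)
        n = len(p)
        return [sum(p[v] * a[v][u] for v in range(n)) for u in range(n)]

    h = t_end / steps
    t, p = 0.0, list(p0)
    for _ in range(steps):
        k1 = deriv(t, p)
        k2 = deriv(t + h / 2, [x + h / 2 * k for x, k in zip(p, k1)])
        k3 = deriv(t + h / 2, [x + h / 2 * k for x, k in zip(p, k2)])
        k4 = deriv(t + h, [x + h * k for x, k in zip(p, k3)])
        p = [x + h / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
             for x, c1, c2, c3, c4 in zip(p, k1, k2, k3, k4)]
        t += h
    return p


# Hypothetical two-state device: state 0 = failed, state 1 = working.
lam, mu = 0.5, 2.0  # illustrative constant failure and repair rates
p_end = solve_state_probabilities(
    lambda t: [[-mu, mu], [lam, -lam]], [0.0, 1.0], t_end=1.0, steps=1000)
# Known closed-form solution for constant rates, used only as a check.
exact_p1 = mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * 1.0)
```

With constant rates the numerical result can be compared against the known exponential solution of the two-state model. For repair rates that are orders of magnitude larger than the failure rates, the step size has to be chosen small enough for an explicit scheme such as this one to remain stable.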

Consequently, the probabilities of the states at the moment $t$ should be found as the solution of system (4) under the initial conditions (2). When the machine state transitions are caused by its failure and repair events, the corresponding transition intensities are expressed by the machine's failure and repair rates [20, 21]. Without loss of generality, the stochastic process of the machine contains minor and major failures and repairs. The state-transition process with the corresponding state performance rates of the machine is presented in Fig. 5.


Fig. 5 State-transition process for repairable machine with minor and major failures and repairs

Once the dynamic probabilities of the machine's states are obtained, the Lz-transform of the discrete-state continuous-time (DSCT) Markov process (1) can be defined as:

$$L_z\left\{G_j^{(i)}(t)\right\} = \sum_{u=1}^{K_j} p_{ju}(t) \cdot z^{g_{ju}^{(i)}} \tag{5}$$

where $G_j^{(i)}(t)$ is the stochastic output performance process of the single machine, $p_{ju}(t)$ is the probability that the process is in state $u$ at instant $t \ge 0$ for the given initial state probability distribution $p_j(0)$, $g_{ju}^{(i)}$ is the performance in state $u$, and $z$ is, in the general case, a complex variable. The definition and proof of the Lz-transform can be found in [19–21]. The Lz-transform of all processing machines, operated in parallel, can be obtained by using the Universal Generating Operator (UGO) $\Omega_{f_{par}}$ [20, 21]:

$$L_z\left\{G_M^{(i)}(t)\right\} = \Omega_{f_{par}}\left\{ L_z\left[G_1^{(i)}(t)\right], \ldots, L_z\left[G_H^{(i)}(t)\right] \right\} = \sum_{U=1}^{K_M} P_{MU}(t) \cdot z^{g_{MU}^{(i)}} \tag{6}$$

where $G_M^{(i)}(t)$ is the stochastic output performance process of the processing machines operating in parallel, $K_M$ is the number of possible states of the process, $P_{MU}(t)$ is the probability that the process is in state $U$, $U = 1, \ldots, K_M$, at instant $t \ge 0$, and $g_{MU}^{(i)}$ is the performance in state $U$. The Lz-transform of the whole $i$-th FMC is obtained by using the serial Universal Generating Operator (UGO) $\Omega_{f_{ser}}$, since the group of processing machines and the robot work in series:

$$L_z\left\{G^{(i)}(t)\right\} = \Omega_{f_{ser}}\left\{ L_z\left[G_M^{(i)}(t)\right], L_z\left[G_R^{(i)}(t)\right] \right\} = \sum_{V=1}^{K^{(i)}} P_V^{(i)}(t) \cdot z^{g_V^{(i)}} \tag{7}$$

where $G^{(i)}(t)$ is the stochastic output performance process of the whole $FMC^{(i)}$, $G_R^{(i)}(t)$ is the output performance process of the robot of the FMC, $K^{(i)}$ is the number of possible states of the process, $P_V^{(i)}(t)$ is the probability that the process is in state $V$, $V = 1, \ldots, K^{(i)}$, at instant $t \ge 0$, and $g_V^{(i)}$ is the performance in state $V$. The dynamic production rate of the $i$-th FMC at instant $t \ge 0$ can be calculated by Eq. (8):

$$PR^{(i)}(t) = \sum_{V=1}^{K^{(i)}} g_V^{(i)} \cdot P_V^{(i)}(t), \quad i = 1, \ldots, m \tag{8}$$

The $PR^{(i)}(t)$ results are then used to calculate the dynamic probability of the number of blanks in $B^{(i)}$. Finally, the equivalent probabilities of the FMCs and the equivalent FMS shown in Fig. 4 are obtained.
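Equations (5) to (8) can be sketched in code by storing an Lz-transform, at a fixed instant $t$, as a mapping from performance level to probability. The sketch below assumes independent devices and the structure functions commonly used for flow-type multi-state systems, namely the sum of rates for parallel elements and the minimum for series elements; the chapter does not spell these functions out, and all names and numbers here are hypothetical.

```python
from collections import defaultdict


def compose(lz_a, lz_b, f):
    """Universal generating operator: combine two Lz-transforms term by term."""
    out = defaultdict(float)
    for ga, pa in lz_a.items():
        for gb, pb in lz_b.items():
            out[f(ga, gb)] += pa * pb  # independent devices: probabilities multiply
    return dict(out)


def parallel(lz_a, lz_b):
    """Assumed parallel structure function: performance rates add up."""
    return compose(lz_a, lz_b, lambda x, y: x + y)


def series(lz_a, lz_b):
    """Assumed series structure function: the slower element limits the rate."""
    return compose(lz_a, lz_b, min)


def production_rate(lz):
    """Eq. (8): expected performance over all states."""
    return sum(g * p for g, p in lz.items())


# Illustrative state probabilities at one instant t (units/h -> probability).
machine = {0: 0.1, 6: 0.9}
robot = {0: 0.05, 30: 0.95}
machines = parallel(machine, machine)   # two identical machines
cell = series(machines, robot)          # machine group in series with the robot
rate = production_rate(cell)
```

Repeating this composition on a grid of time points, with the state probabilities taken from the solution of system (4), would give the dynamic production rate curve of one FMC.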

3.2 Step 2: Establishment of the Dynamic Availability of the Equivalent FMS

Step 1 incorporates the influence of the intermediate buffers on the FMS into the FMCs to generate equivalent FMCs that are subject to neither starvation nor blockage; an equivalent FMS that is subject to neither starvation nor blockage is generated as well. Moreover, due to the existence of the intermediate buffers, the rigid connection of the production system is transformed into a flexible connection [31]. When a station in a flexibly connected production line is shut down, the upstream and downstream stations can continue processing because the buffer stores and supplies blanks, which prevents the production line from being forced to shut down by the shutdown of a single station. From the perspective of reliability, the production line is transformed from a single-circuit series structure into a parallel redundant structure [32]. Therefore, the dynamic availability $A_{FMS}(t)$ of the equivalent FMS is:

$$A_{FMS}(t) = 1 - \tilde{A}_{FMS}(t) = 1 - \prod_{i=1}^{m} \left[ 1 - \bar{A}_{FMC}^{(i)}(t) \right] \tag{9}$$

where $\tilde{A}_{FMS}(t)$ is the unavailability of the equivalent FMS and $\bar{A}_{FMC}^{(i)}(t)$ is the availability of the equivalent FMC, with $\bar{A}_{FMC}^{(i)}(t) = \bar{p}^{B}_{(i-1)0}(t)\, A_{FMC}^{(i)}(t)\, \bar{p}^{B}_{ib_i}(t)$, where $\bar{p}^{B}_{(i-1)0}(t)$ and $\bar{p}^{B}_{ib_i}(t)$ are the dynamic probabilities of non-starvation of $B^{(i-1)}$ and non-blockage of $B^{(i)}$. Here $A_{FMC}^{(i)}(t)$ is the availability of $FMC^{(i)}$ when the impact of the buffers is excluded. The FMC is required to process at least as many blanks as the production demand; otherwise it should be shut down and repaired immediately. Therefore, $A_{FMC}^{(i)}(t)$ is calculated as follows [20, 21]:

$$A_{FMC}^{(i)}(t) = \sum_{g_V^{(i)} \ge d} P_V^{(i)}(t) \tag{10}$$

where $d$ is the production demand, which may be constant or stochastic. Thus, the dynamic performance measures of the FMS can be calculated based on Eqs. (9) and (10); the specific calculation indices and approach are presented within the numerical example.
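Equations (9) and (10) translate almost directly into code. The following minimal sketch evaluates them at a single instant; the dictionary representation of the FMC states and all numerical values are hypothetical and serve only to show the arithmetic.

```python
def fmc_availability(state_probs, demand):
    """Eq. (10): total probability of the states whose performance meets the demand."""
    return sum(p for g, p in state_probs.items() if g >= demand)


def equivalent_fmc_availability(a_fmc, p_not_starved, p_not_blocked):
    """Buffer-adjusted availability of one FMC, as used inside Eq. (9)."""
    return p_not_starved * a_fmc * p_not_blocked


def fms_availability(equivalent_availabilities):
    """Eq. (9): one minus the product of the equivalent FMC unavailabilities."""
    unavailability = 1.0
    for a in equivalent_availabilities:
        unavailability *= 1.0 - a
    return 1.0 - unavailability


# Illustrative FMC state probabilities at one instant (units/h -> probability).
cell_states = {0: 0.0595, 6: 0.171, 12: 0.7695}
a_low = fmc_availability(cell_states, demand=6)     # lower demand
a_high = fmc_availability(cell_states, demand=12)   # higher demand
a_eq = equivalent_fmc_availability(a_high, p_not_starved=0.98, p_not_blocked=0.97)
a_sys = fms_availability([0.9, 0.8])
```

Raising the demand removes states from the sum in Eq. (10), so the same FMC is less available under a higher demand.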

4 Numerical Example

This section introduces an exploratory numerical example to demonstrate the proposed approach in detail; the schematic of the flexible manufacturing system is depicted in Fig. 6. It is assumed that the blank transportation process is failure-free and that all processing machines and robots are three-state and binary-state systems, respectively, as shown in Fig. 7. In particular, minor and major failures are included in the machines' state-transition process, as well as aging characteristics: the failure rates follow the two-parameter Weibull distribution, while the repair rates of the machines obey the exponential distribution. The failure rates and repair rates of the robots both follow the exponential distribution. Note that the loading and unloading times of the robots are neglected compared to the processing times.

Fig. 6 Schematic of the FMS

Fig. 7 State-transition process of machines (a) and robots (b)

The parameters of each device are given in Table 1. The shape parameters $\beta^{(i)}$ and scale parameters $\alpha^{(i)}$ describe the Weibull process of the machines' failures. The reciprocal of the mean time to repair (MTTR$^{-1}$) and the mean time to failure (MTTF) are given to describe the exponential distributions. In addition, two constant production demands, $d_1 = 12$ units/h and $d_2 = 24$ units/h, are included to show more about the impact of the buffers on the FMS.

Taking the constraint $d_1$ as the detailed example, the first step in evaluating the dynamic performance metrics of the FMS is to obtain the dynamic production rates of the FMCs. Taking $FMC^{(2)}$ as an example, the differential system of the state probabilities of machine $M_j^{(2)}$, $j = 1, 2, 3, 4$, is obtained based on Eq. (3) and Fig. 7a. For simplicity, the index indicating the FMC is omitted in the following calculation.

$$
\begin{cases}
\dfrac{dp_1(t)}{dt} = p_3(t)\lambda_{3,1}(t) + p_2(t)\lambda_{2,1}(t) - p_1(t)\mu_{1,3}(t) \\[2mm]
\dfrac{dp_2(t)}{dt} = p_3(t)\lambda_{3,2}(t) - p_2(t)\left[\lambda_{2,1}(t) + \mu_{2,3}(t)\right] \\[2mm]
\dfrac{dp_3(t)}{dt} = p_1(t)\mu_{1,3}(t) + p_2(t)\mu_{2,3}(t) - p_3(t)\left[\lambda_{3,1}(t) + \lambda_{3,2}(t)\right]
\end{cases}
\tag{11}
$$

The initial conditions are $p_1(0) = 0$, $p_2(0) = 0$, $p_3(0) = 1$. Similarly, the differential system of the state probabilities of the second robot $R^{(2)}$ is obtained based on Eq. (3) and Fig. 7b:

$$
\begin{cases}
\dfrac{dP_0^R(t)}{dt} = P_1^R(t)\lambda_{1,0}^R(t) - P_0^R(t)\mu_{0,1}^R(t) \\[2mm]
\dfrac{dP_1^R(t)}{dt} = P_0^R(t)\mu_{0,1}^R(t) - P_1^R(t)\lambda_{1,0}^R(t)
\end{cases}
\tag{12}
$$

The initial conditions are $P_0^R(0) = 0$, $P_1^R(0) = 1$. Then the Lz-transforms of machine $M_j$, $j = 1, 2, 3, 4$, and of the robot in $FMC^{(2)}$ are obtained based on Eq. (5) and the results of systems (11) and (12):

$$
\begin{cases}
L_z\left\{G_j(t)\right\} = p_3(t) z^{g_3} + p_2(t) z^{g_2} + p_1(t) z^{0} \\
L_z\left\{G_R(t)\right\} = P_1^R(t) z^{g_1^R} + P_0^R(t) z^{0}
\end{cases}
\tag{13}
$$
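The right-hand side of system (11) can be written down directly once a form for the time-dependent failure rates is fixed. The sketch below assumes the standard Weibull hazard rate with scale $\alpha$ and shape $\beta$, $\lambda(t) = (\beta/\alpha)(t/\alpha)^{\beta-1}$, and constant repair rates; this form and all parameter values are assumptions for illustration and are not confirmed by the chapter.

```python
def weibull_rate(t, alpha, beta):
    """Assumed Weibull hazard rate with scale alpha and shape beta."""
    return (beta / alpha) * (t / alpha) ** (beta - 1)


def machine_rhs(t, p, lam31, lam32, lam21, mu13, mu23):
    """Right-hand side of system (11) for a three-state machine.

    p = [p1, p2, p3]: total failure, degraded, best state.
    lam31, lam32, lam21: (alpha, beta) pairs of the Weibull failure rates.
    mu13, mu23: constant repair rates.
    """
    p1, p2, p3 = p
    l31 = weibull_rate(t, *lam31)
    l32 = weibull_rate(t, *lam32)
    l21 = weibull_rate(t, *lam21)
    dp1 = p3 * l31 + p2 * l21 - p1 * mu13
    dp2 = p3 * l32 - p2 * (l21 + mu23)
    dp3 = p1 * mu13 + p2 * mu23 - p3 * (l31 + l32)
    return [dp1, dp2, dp3]


# Hypothetical parameters, chosen only to exercise the function.
rhs = machine_rhs(500.0, [0.01, 0.04, 0.95],
                  lam31=(3000.0, 2.0), lam32=(2500.0, 2.5), lam21=(1500.0, 1.8),
                  mu13=20.0, mu23=10.0)
```

The three derivatives cancel each other out, which reflects the fact that the state probabilities always sum to one. Feeding this right-hand side to any standard ODE integrator, starting from $p(0) = [0, 0, 1]$, would produce the state probabilities needed in Eq. (13).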

The Lz-transform of the whole $FMC^{(2)}$ is obtained through Eqs. (6) and (7), and the dynamic production rate of $FMC^{(2)}$ is calculated through Eq. (8). The result for $FMC^{(2)}$ is depicted in Fig. 8. Following the same procedure, the productivities of the remaining FMCs can be evaluated, as shown in Fig. 9. Once the dynamic productivities of the FMCs are determined, the next task is to calculate the dynamic models of the equivalent FMCs. Based on the state transition intensity matrix $a^{(i)}$ and the Chapman-Kolmogorov differential equations, the first state probability of the equivalent FMCs, i.e. the equivalent availability of $FMC^{(1)}$, is:

Table 1 Parameters of each device

Machines: failure rates given as Weibull parameters [α(i), β(i)], repair rates as MTTR−1 (/h), performance in units/h

Device   λ3,2(t)       λ3,1(t)       λ2,1(t)       μ2,3   μ1,3   g3   g2
M(1)     [2600, 3.2]   [3500, 2.3]   [2100, 1.9]   12     24     12   6
M(2)     [2800, 2.4]   [2300, 1.9]   [1200, 1.6]   30     60     6    3
M(3)     [2500, 2.7]   [3100, 2.4]   [1500, 1.8]   25     50     8    4
M(4)     [1800, 1.8]   [3400, 2.0]   [1400, 1.5]   15     30     12   6
M(5)     [3200, 2.2]   [2900, 3.3]   [1700, 2.3]   10     20     24   12

Robots: failure rates given as MTTF (h), repair rates as MTTR−1 (/h), performance in units/h

Device   λ1,0   μ0,1   g1
R(1)     2500   10     30
R(2)     2200   20     30
R(3)     2800   15     30
R(4)     3500   10     30
R(5)     4000   8      30

$$\bar{A}_{FMC}^{(1)}(t) = A_{FMC}^{(1)}(t)\, \bar{p}^{B}_{1b_1}(t) \tag{14}$$

where $\bar{A}_{FMC}^{(1)}(t)$ is the equivalent availability of $FMC^{(1)}$ and $\bar{p}^{B}_{1b_1}(t)$ is the dynamic probability of non-blockage of $B^{(1)}$. Then, as the volume $b_1$ changes, the corresponding shifts of $\bar{A}_{FMC}^{(1)}(t)$ are depicted in Fig. 10.

Fig. 8 Dynamic production rate of $FMC^{(2)}$ under constraint $d_1$

Fig. 9 Dynamic production rates of $FMC^{(1)}$, $FMC^{(3)}$, $FMC^{(4)}$ and $FMC^{(5)}$ under constraint $d_1$


Fig. 10 The equivalent availability of F MC (1) with the change of volume b1

The results in Fig. 10 show that as $b_1$ increases, the equivalent availability of $FMC^{(1)}$ increases but gradually reaches a plateau. Note that availability is chosen as the criterion for determining the appropriate buffer volume in this paper, while other criteria in terms of the system reliability associated cost (RAC), with which the optimal buffer volume could be determined, will be discussed in further research. Therefore, $b_1 = 40$ is selected as an appropriate volume for buffer $B^{(1)}$. The changes of the equivalent availability of $FMC^{(2)}$ with the set volume $b_1$ and a varying $b_2$ can be obtained by the same procedure, as shown in Fig. 11. An appropriate volume for buffer $B^{(2)}$ is thus confirmed as $b_2 = 30$. Similarly, the equivalent availabilities of $FMC^{(3)}$, $FMC^{(4)}$ and $FMC^{(5)}$ with changing volumes $b_3$ and $b_4$ are depicted in Fig. 12. In this way, a group of appropriate FMS intermediate buffer volumes can be obtained, as shown in Table 2.

Fig. 11 The equivalent availability of $FMC^{(2)}$ with the volume $b_1 = 40$ and a varying $b_2$

Fig. 12 The equivalent availability of $FMC^{(3)}$, $FMC^{(4)}$ and $FMC^{(5)}$ with the changes of volume of $b_3$ and $b_4$, respectively

Table 2 An appropriate volume of each intermediate buffer

Buffer   B(1)   B(2)   B(3)   B(4)
Volume   40     30     30     40

In accordance with the aforementioned analysis results and Eq. (9), the dynamic availability of the FMS under the production demands $d_1$ and $d_2$ is plotted in Fig. 13.

Fig. 13 The availability of the FMS under $d_1$ and $d_2$

It can be concluded from Fig. 13 that the volumes in Table 2 ensure that the FMS possesses high availability under $d_1$ while it performs poorly under $d_2$, which means that the production capacity requirements have a significant impact on the FMS availability. The approach proposed in this paper not only quantifies this impact but, more importantly, provides a new solution for evaluating and improving the dynamic availability of the FMS during its life cycle. Nevertheless, it is inadequate to determine the optimal buffer combination merely from the perspective of reliability, because production cost is an inevitable topic in the design, operation and optimization of an FMS. Therefore, it will be interesting to incorporate cost factors into the proposed model in order to obtain a truly optimal buffer combination.
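The plateau behaviour used to pick the buffer volumes can be illustrated with a deliberately simplified sketch. It replaces the dynamic model of the chapter with the steady-state distribution of a birth-death buffer with constant rates, for which the probability of holding $k$ blanks is proportional to $(PR_{up}/PR_{down})^k$; this simplification, the stopping tolerance and all names are assumptions made for illustration only.

```python
def non_blockage_probability(pr_up, pr_down, volume):
    """Steady-state probability that a buffer of the given volume is not full.

    Assumes constant upstream and downstream production rates, so that the
    stationary probability of k stored blanks is proportional to rho ** k.
    """
    rho = pr_up / pr_down
    weights = [rho ** k for k in range(volume + 1)]
    return 1.0 - weights[-1] / sum(weights)


def smallest_volume_on_plateau(pr_up, pr_down, max_volume, tol=1e-3):
    """Return the smallest volume after which one more slot gains less than tol."""
    previous = non_blockage_probability(pr_up, pr_down, 1)
    for b in range(2, max_volume + 1):
        current = non_blockage_probability(pr_up, pr_down, b)
        if current - previous < tol:
            return b - 1
        previous = current
    return max_volume


# Illustrative rates only: upstream 10 units/h, downstream 12 units/h.
curve = [non_blockage_probability(10.0, 12.0, b) for b in range(1, 30)]
chosen = smallest_volume_on_plateau(10.0, 12.0, max_volume=100)
```

The curve rises quickly for small volumes and then flattens, which is the qualitative shape described for Figs. 10 to 12. A cost-based criterion, as suggested in the text, would replace the fixed tolerance with a trade-off between the availability gain and the cost of one more buffer slot.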

5 Conclusion

A flexible manufacturing system consisting of several flexible manufacturing cells and finite intermediate buffers is considered in this paper. A two-step stochastic model based on the Lz-transform method is proposed to evaluate the dynamic availability of the FMS and to determine appropriate intermediate buffer volumes that allow the system to possess high availability throughout its life cycle while meeting various production demands. On the one hand, Figs. 10, 11 and 12 show that the equivalent availability of the FMCs increases as the buffer volumes increase and gradually reaches a plateau, which is helpful for selecting an appropriate intermediate buffer volume. Moreover, the results in Fig. 13 illustrate the efficiency and rationality of the proposed method. On the other hand, the purpose of buffer optimization is to ensure and improve the productivity and efficiency of the manufacturing system during the design and operation phases, so the system associated cost plays a pivotal role in the decision-making process. Therefore, it will be of important theoretical and practical significance to incorporate cost factors into the proposed model in order to investigate the optimal buffer arrangement systematically and exhaustively.


References

1. Alhourani F (2016) Cellular manufacturing system design considering machines reliability and parts alternative process routings. Int J Prod Res 54(3):846–863
2. Anupma Y, Jayswal SC (2019) Evaluation of batching and layout on the performance of flexible manufacturing system. Int J Adv Manuf Technol 101:1435–1449
3. Browne J, Dubois D, Rathmill K et al (1984) Classification of flexible manufacturing systems. FMS Mag 114–117
4. Chen KS, Yu CM, Hus TH et al (2019) A model for evaluating the performance of the bearing manufacturing process. Appl Sci 9(15):3105
5. Das K, Lashkari R, Sengupta S (2007) Reliability consideration in the design and analysis of cellular manufacturing systems. Int J Prod Econ 105(1):243–262
6. Dosdogru AT, Gocken M, Geyik F (2015) Integration of genetic algorithm and Monte Carlo to analyze the effect of routing flexibility. Int J Adv Manuf Technol 81:1379–1389
7. Duan JG, Xie N, Li LH (2019) Modelling and evaluation of multi-state reliability of repairable non-series manufacturing system with finite buffers. Adv Mech Eng 11(6):1–13
8. Elleuch M, Bacha HB, Masmoudi F et al (2008) Analysis of cellular manufacturing systems in the presence of machine breakdowns. J Manuf Technol Manag 19(2):235–252
9. Groover MP (2007) Automated assembly system, automation, production systems, and computer-integrated manufacturing. Prentice Hall Press, Upper Saddle River, NJ
10. Gyulai D, Pfeiffer A, Monostori L (2017) Robust production planning and control for multi-stage systems with flexible final assembly lines. Int J Prod Res 55(13):3657–3673
11. Halse LL, Jæger B (2019) Operationalizing industry 4.0: understanding barriers of industry 4.0 and circular economy. In: Ameri F, Stecke K, von Cieminski G, Kiritsis D (eds) Advances in production management systems. Towards smart production management systems. APMS 2019. IFIP advances in information and communication technology, vol 567. Springer, Cham
12. He C, Zhang SY, Qiu LM et al (2019) Assembly tolerance design based on skin model shapes considering processing feature degradation. Appl Sci 9(16):3216
13. Jain V, Raj T (2016) Modeling and analysis of FMS performance variables by ISM, SEM and GTMA approach. Int J Prod Econ 171(1):84–96
14. Jin R, Liu K (2013) Multimode variation modeling and process monitoring for serial-parallel multistage manufacturing processes. IIE Trans 45(6):617–629
15. Koulamas CP (1992) A stochastic model for a machining cell with tool failure and tool replacement considerations. Comput Oper Res 19(8):717–729
16. Kumar N, Kumar J (2019) Efficiency 4.0 for industry 4.0. Human Technol 15(1):55–78
17. Lee J, Bagheri B, Kao HA (2015) A cyber-physical systems architecture for industry 4.0 based manufacturing systems. Manuf Let 3:18–23
18. Li JS, Huang NJ (2007) Quality evaluation in flexible manufacturing systems: a markovian approach. Math Prob Eng. Article ID 057128, 24 pages. https://doi.org/10.1155/2007/57128
19. Lisnianski A (2012) Lz-transform for a discrete-state continuous-time Markov process and its application to multi-state system reliability. In: Lisnianski A, Frenkel I (eds) Recent advances in system reliability. Springer-Verlag, London, pp 79–95
20. Lisnianski A, Frenkel I, Ding Y (2010) Multi-state system reliability analysis and optimization for engineers and industrial managers. Springer, London
21. Lisnianski A, Frenkel I, Khvatskin L (2021) Modern dynamic reliability analysis for multi-state systems. Springer series in reliability engineering. Springer, Cham
22. Liu JL, Yang S, Wu AG et al (2012) Multi-state throughput analysis of a two-stage manufacturing system with parallel unreliable machines and a finite buffer. Eur J Oper Res 219(2):296–304
23. Loganathan MK, Girish K, Gandhi OP (2016) Availability evaluation of manufacturing systems using Semi-Markov model. Int J Comput Integ M 29(7):720–735
24. MacDougall W (2014) Industry 4.0: smart manufacturing for the future. Berlin, Germany, GTAI
25. Manocher D, Hamid S (2019) Analysis of critical machine reliability in manufacturing cells. J Ind Eng Manag 12(1):70–82
26. Rybicka J, Tiwari A, Enticott S (2016) Testing a flexible manufacturing system facility production capacity through discrete event simulation: automotive case study. Int J Mech Aerosp Ind Mechatron Manuf Eng 10(4):668–672
27. Sanghavi D, Parikh S, Raj SA (2019) Industry 4.0: tools and implementation. Manag Prod Eng Rev 10(3):3–13
28. Savsar M (2000) Reliability analysis of a flexible manufacturing cell. Reliab Eng Syst Saf 67(2):147–152
29. Savsar M (2004) Performance analysis of an FMS operating under different failure rates and maintenance policies. Int J Flex Manuf Sys 16:229–249
30. Savsar M (2011) Multi-state reliability modeling of a manufacturing cell. Int J Perform Eng 7(3):217–228
31. Shu S (1992) An analysis of the repairable computer integrated manufacturing system (CIMS) with buffers and a study of the system reliability. Acta Automatica Sinica 18(1):15–22
32. Shu S, Zhang Y (1995) Reliability analysis of series production lines. Control Theory Appl 12(2):177–182
33. Singholi A, Ali M, Sharma C (2013) Evaluating the effect of machine and routing flexibility on flexible manufacturing system performance. Int J Serv Oper Manag 16(2):240–261
34. Tan B, Gershwin SB (2009) Analysis of a general Markovian two-stage continuous-flow production system with a finite buffer. Int J Prod Econ 120(2):327–339
35. Thames L, Schaefer D (2016) Software-defined cloud manufacturing for industry 4.0. Procedia CIRP 52:12–17
36. Trivedi K (2019) Probability and statistics with reliability, queuing and computer science applications. Wiley, New York
37. Zeng W, Chen P (2008) Volatility smile, relative deviation and trading strategies: a general diffusion model for stock price movements based on nonlinear birth-death process. China Econ Quart 7(4):1415–1436
38. Zeng W, Shen G, Chen B et al (2019) Lz-transform method and Markov reward approach for flexible manufacturing system performance evaluation. Appl Sci 9(19):4153
39. Zhou J, Liu Z (2006) Relationship between machine utilization and buffer capacity. Tool Eng 40(9):24–26

Integrated Reliability and Risk Assessments of Nuclear Facilities

Vanderley de Vasconcelos, Graiciany de Paula Barros, Antônio Carlos Lopes da Costa, and Wellington Antonio Soares

Abstract Reliability is the probability that a system will perform its intended function satisfactorily during its lifetime under specified environmental and operating conditions. Risk can be measured by assessing the probability of an undesired event (e.g., a system failure) and the magnitude of its consequences. Therefore, risk and reliability are complementary variables. Licensing of nuclear facilities requires safety and risk assessment. Probabilistic risk assessment implies carrying out a quantitative assessment of the reliability of items important to safety (IISs). In addition, reliability assessments are required during all lifetime phases for optimizing plant performance, maintainability and safety. Thus, the outcomes of reliability and risk assessments are interchangeable and complementary. This chapter proposes a framework for integrating reliability and risk assessments of IISs of nuclear facilities using tools of reliability engineering, such as Markov models, fault tree analysis, event tree analysis, reliability block diagrams and life data analysis. Based on frequency-dose limits to the public, risk acceptance criteria are also suggested in the scope of the framework. A case study is presented that demonstrates the advantages of the integrated approach when applied to the preliminary plant design phase of a nuclear fuel fabrication facility in order to meet risk acceptance criteria and licensing requirements.

V. de Vasconcelos · G. de Paula Barros · A. C. L. da Costa · W. A. Soares
Centro de Desenvolvimento da Tecnologia Nuclear - CDTN, Av. Presidente Antônio Carlos, 6.627 Campus da UFMG, Pampulha, Belo Horizonte, Minas Gerais 31270-901, Brazil

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_2

1 Introduction

Reliability is usually defined as the probability that an item, component, or engineering system will perform its intended function without failing during a given period of time under operating conditions. Among the several risk concepts used in different scientific areas, risk can be defined as the potential for loss resulting from exposure to hazards. Sometimes, risk is measured by assessing the probability (or frequency) of occurrence of an undesired event (e.g., an accident) and the magnitude of its consequences (severity) [1]. From this viewpoint, risk and reliability can be considered complementary variables. Thus, even if these analyses are not performed at the same time during the plant lifetime, or by the same team, their outcomes should be interchangeable and complementary.

This chapter proposes a framework for integrating reliability and risk assessments applied to the design of items important to safety (IISs) of nuclear facilities (NFs). This kind of integration is a trend in advanced reactor designs within reliability assurance programs (RAPs), which exploit the optimization of performance and safety of such plants using well-established reliability, availability and maintainability (RAM) and probabilistic risk assessment (PRA) tools [2]. The application of this framework to nuclear facilities, which are simpler and less potentially hazardous than nuclear power plants (NPPs), can also bring benefits during all plant life phases, supporting design, operation and licensing processes.

NFs differ from NPPs in many aspects. They use a wide variety of technologies and processes and, in addition to fissile material and waste, they often handle, process and store large quantities of toxic, corrosive, or combustible chemicals. Furthermore, a common characteristic of these facilities is continuous improvement and innovation, research and development, as well as a greater dependence on human factors, all of which widen the range of potential hazards and the possibility of a release of hazardous chemical or radioactive materials. Despite that, the methodologies used for reliability and risk assessments of NPPs can be adapted and integrated for use throughout the NF lifetime in order to optimize the safety and performance of such facilities [3].

An overview of an integrated approach for reliability and risk assessments of NFs is shown in Fig. 1. Supported by reliability engineering tools and risk acceptance criteria, the IISs are analyzed in an integrated way throughout the facility lifetime. The optimization of plant performance and safety is carried out using well-established techniques of PRA and reliability engineering, such as life data analysis (LDA), reliability block diagrams (RBD), failure mode and effects analysis (FMEA), event tree analysis (ETA), fault tree analysis (FTA), Markov models and Monte Carlo simulation, among others. Special emphasis should be given to aspects of human reliability analysis (HRA), sensitivity analysis of IISs and uncertainty assessment of quantitative reliability and risk results.

Fig. 1 Overview of integrated approach for reliability and risk assessments of nuclear facilities

2 Nuclear Facilities Under the Scope of Analysis Nuclear-based electric power depends on heat generation within the reactor core, which is usually an assembly of uranium fuel rods (nuclear fuel), the uranium sometimes being enriched in the naturally present uranium-235 isotope (fissile atom). The
heat results from a sustained and carefully controlled “chain reaction”, in which neutrons cause fissions of fissile atoms within the core. The fission process involves the splitting of the fissile atoms into lighter radioactive fission products, the production of further neutrons, which sustain the chain reaction, and the release of a considerable amount of energy [4]. Spent nuclear fuel can be reprocessed to extract recyclable energy material, inside the so-called “nuclear fuel cycle”. This cycle involves all industrial activities necessary to produce electricity from uranium in NPPs, including “front-end” activities (fuel fabrication), “service period” activities (electricity generation in NPPs) and “back-end” activities (reprocessing, reuse and disposal). The fuel cycle is referred to as a “closed fuel cycle” if the spent fuel is reprocessed; otherwise, it is referred to as an “open fuel cycle”. Therefore, nuclear fuel cycle facilities (NFCFs) can include the following industrial operations:
• Processing of uranium ores;
• Uranium conversion and enrichment;
• Nuclear fuel fabrication;
• Reprocessing of spent nuclear fuel;
• Nuclear waste management and disposal;
• Separation of radionuclides from fissile products;
• Research and development to support NFCFs.

NFCFs are installations other than NPPs, research reactors and critical assemblies, which pose potential hazards to workers, the public and the environment [5]. The terms “nuclear fuel cycle facility” and “nuclear facility” and their initials are used interchangeably in this chapter. The variety of processes associated with NFCFs can result in a broad range of potential accidents. Some types of facilities with low
radioactive inventories and low criticality risk, or facilities handling natural uranium, have a low hazard potential. Other NFCFs (e.g., reprocessing facilities) may have a potential risk comparable with NPPs. The following specific features need to be considered when performing a safety assessment of NFCFs [6]:
• The nature and variety of technologies result in a broad range of hazardous conditions;
• Presence of a wide diversity of radioactive, fissile, toxic, combustible, explosive or reactive materials;
• Radioactive and other hazardous materials are present and handled throughout different parts of the facility;
• The possibility of occurrence of nuclear criticality accidents;
• Need for frequent changes in equipment, processes, and operating procedures, which can pose additional hazards;
• Operations of NFCFs usually require more operator actions than NPPs, making operators more vulnerable;
• Many NFCFs worldwide are obsolete and lack adequate design documentation and information on operating experience, which requires additional efforts and specific approaches to safety assessment.

The loss of the main safety functions of NFCFs may lead to release of radiological or hazardous materials. The most common safety functions required by NFCFs are [7]:
• Prevention of criticality of fissile materials;
• Radiological protection;
• Confinement of radioactive or hazardous materials to prevent or mitigate unplanned releases to the environment;
• Removal of decay heat.

This chapter presents a simplified or graded version of the complete probabilistic risk assessment (PRA) traditionally used in NPPs, which is compatible with the complexity and level of hazard presented by NFs. PRA is useful to complement the conventional deterministic safety assessment (DSA), whereby plant safety can be assessed and improved. PRA provides a comprehensive and structured approach not only to identifying failure scenarios and facility damage but also to estimating numerical values for the risks. 
Likewise, it offers a systematic approach to verify whether the reliability and independence of safety systems are adequate for implementing defence-in-depth provisions and to assess whether the risks are “as low as reasonably achievable” (ALARA criterion). The current trend in such analyses is to make less conservative assumptions and to use best-estimate values [6]. PRA applications to both NPPs and NFCFs are based on common principles. Nevertheless, there are some significant differences, mainly due to plant design and processes. Compared to NPPs, NFCFs are usually characterized by a larger diversity of technologies and processes. Apart from processing radioactive or fissile materials, larger quantities of chemical materials are handled frequently. Consequently, these materials must be taken into account appropriately in PRAs of NFCFs.

Fig. 2 Integration of reliability and risk assessments during facility lifetime and licensing process

3 Reliability and Risk Assessments During Facility Lifetime In relation to NFs, risk assessment is defined as “an assessment of the radiation risks and other risks associated with normal operation and possible accidents involving facilities and activities. This will normally include consequence assessment, together with some assessment of the probability of those consequences arising.” Throughout this chapter, safety assessment is used with the concept of “an assessment of all aspects of a practice that are relevant to protection and safety”, a scope that encompasses risk assessment [5]. The integration of reliability and risk assessments of items important to safety (IISs) can be carried out at all phases of the facility lifetime. At the beginning of the lifetime, these assessments are almost independent of each other, although they can be done together, at the same time, and by the same team in order to optimize resources. Throughout the facility lifetime, this integration becomes even more necessary, not only for cost-effectiveness reasons but also, and mainly, for improving safety and quality assurance of the outcomes [2, 8]. Figure 2 shows the integration of reliability and risk assessments during facility lifetime and licensing process. The arrow width illustrates the need or potential for integration of these assessments from the initial phases of the lifetime or licensing process to the decommissioning of the facility.


To detail this integration, Table 1 lists the safety and risk assessments typically required in the licensing process of NFs, as well as the reliability, availability and maintainability (RAM) assessments required by design, operation and performance issues. For each phase in the lifetime of an NF, Table 1 shows some examples of typical documents and authorizations required during the licensing process, relating them to safety, risk and RAM assessments. In the phase of the feasibility study and preliminary design (pre-licensing), there is still no formal licensing process, but some decisions are taken about reference facilities, the strategic plan for licensing, and acceptable levels of risk, safety and reliability. At this point, a definition and categorization of a list of IISs, preliminary deterministic and probabilistic safety assessments, a qualitative analysis of hazard and accident scenarios, as well as a selection of design principles applied to safety are carried out. On the other hand, reliability characteristics and functional requirements of IISs, as well as automation and control philosophies, are defined. During the siting and site evaluation phase, regulatory bodies usually require some licensing documents, such as site evaluation reports, environmental impact assessments, and emergency preparedness and response plans. To support public inquiries, identification of potential vulnerabilities and areas of importance to risk, demonstration of facility safety taking into account specific site characteristics, and long-term safety assessment are carried out. At the same time, the project team proceeds with the availability and performance assessments taking into account specific site characteristics. 
Both deterministic and probabilistic safety assessments should be completed before initiating the detailed design process, so that any necessary changes in the basic design philosophy are identified early, because later changes could become excessively expensive. These assessments require technical specification of IISs, as well as definitions about maintenance, inspections, tests, requirements, man-machine interface and human reliability, among others. As the detailed design evolves to the final design, the importance of an integrated assessment of safety, risk, maintainability and availability increases, not only for economic reasons, but also because of quality assurance requirements, which impose high reliability and safety and lower risks. In addition to technical specifications and PRA, the elaboration of safety-related documents such as physical protection and fire protection plans is more effective when performed through integrated activities, using the same tools, such as FTA, ETA and RBD. Considerations about the influence of human factors on facility design and operation are also analyzed in this phase. For granting a construction permit, a regulatory body requires, among other documents, a preliminary safety analysis report (PSAR). Therefore, an assessment of the potential impact of construction on neighboring areas and a demonstration of compliance of IISs with safety requirements will need life data analysis and supplier assessment, as well as the elaboration of a pre-service inspection strategy. Integration of safety assessment and procurement process is carried out to specify reliability standards and purchase the most reliable components available.

Table 1 Examples of typical safety, risk, reliability, availability and maintainability assessments during facility lifetime and licensing process of nuclear facilities (NFs)

Facility lifetime phases | Safety and risk assessments | RAM assessments | Licensing process
1. Feasibility study and preliminary design | Definition and categorization of a list of IISs; Preliminary deterministic and probabilistic safety assessments | Definition of reliability characteristics of IISs; Definition of automation and control philosophies | Definition of reference facilities; Acceptable levels of risk, safety and reliability
2. Siting and site evaluation | Vulnerabilities and areas of importance to risk; Demonstration of facility safety | Demonstration of facility availability taking into account specific site characteristics | Site approval; Site evaluation report; Emergency plan
3. Design | Deterministic safety assessment (DSA); Probabilistic risk assessments (PRA); Specifications for safety functions | RAM assessment of IISs; Maintenance, inspection and test requirements; Human reliability analysis (HRA) | Technical specifications; Fire protection plan; Technical design documents
4. Manufacturing and construction | Assessment of potential impact of construction on neighboring areas | Life data analysis; Suppliers assessment | Construction permit; Preliminary safety analysis report (PSAR)
5. Commissioning | As-built safety assessment; Safety assessment of commissioning tests; Analysis of risks and opportunities | Assessment of maintenance, in-service inspection and testing strategies; Unconformities analysis | Authorization for use of nuclear material; Quality assurance program
6. Operation | Data specialization of failure rate of IISs; Living PRA; Verification of operator fitness for duties related to safety | Data specialization of reliability characteristics of IISs; Analyses of facility performance using RBI and RCM; HRA for facility operation | Authorization for operation; As-built final safety analysis report (FSAR); Operating procedures
7. Technical modifications and innovations | Safety review of risks and hazards; Assessing of safety factors against current standards | Evaluation of reliability characteristics of IISs; Evaluation of facility performance | Approval of technical modifications; Periodic safety review (PSR)
8. Ageing management | Categorizations of IISs to ageing management; Prioritization of actions for safety improvement | Understanding of ageing mechanisms; Prioritization of corrective actions for RAM improvements | Ageing management plan and reports
9. Decommissioning | Safety assessment of decommissioning; Assessment of safety of workers, public and environment | Assessment of progressive shutdown and new configurations of IISs | Authorization for decommissioning; Decommissioning plan and reports

The authorization for use of nuclear material and commissioning require, among others, the assessment of as-built safety, measures for controlling nuclear materials, and commissioning tests. These assessments complement or require the definition of in-service inspection and testing strategies (e.g., risk-based inspection, RBI), and maintenance strategies (e.g., reliability-centered maintenance, RCM). On the other hand, the definition of a quality assurance program requires an integrated unconformities analysis, an analysis of actions to address risks and opportunities, and the elaboration of training and qualification plans, to meet the current requirements of the ISO 9001 standard. For granting authorization for operation, a regulatory body requires, among other documents, a final safety analysis report (FSAR) that takes into account data specialization of the failure rate of IISs and operating experience. Sometimes, the safety assessment is complemented by a living PRA, helping to ascertain the compliance of NF plans and procedures with safety requirements (e.g., related to the operation, radiation protection, fire protection, emergency preparedness and waste management). During the operation, a continuous assessment of the effectiveness of preventive and corrective actions, availability of IISs, facility performance, use of RBI and RCM strategies, and verification of operator fitness for duties should be integrated and carried out by the same team, if possible, for cost-effectiveness reasons. During facility lifetime, many technical modifications and innovations are necessary, which should be approved by a regulatory body, sometimes under periodic safety review plans (PSRs). These technical modifications require a safety review of risks and hazards, an assessment of safety factors against current standards, and a re-evaluation of facility safety, reliability characteristics of IISs, and facility performance after modifications. 
These safety, risk and RAM assessments are complementary; if not carried out in an integrated way, they would become expensive and prone to inconsistencies.

Ageing management is a phase of the facility lifetime that also requires simultaneous risk, safety, and RAM assessments. To determine the best strategy for ageing management of IISs and their impact on risk changes, it is necessary to understand how ageing affects the reliability of individual IISs. Examples of these analyses include categorizations of IISs to ageing management, safety review of IISs under current conditions, prioritization of corrective actions for safety improvement, mitigation of ageing effects, understanding of ageing mechanisms, review of reliability characteristics of IISs under current conditions, and prioritization of corrective actions for RAM improvement. At the end of the facility lifetime, an authorization for decommissioning from the regulatory body is also necessary. On the one hand, this requires a safety assessment of decommissioning (potential new hazards), a demonstration of safe management of radioactive waste, an assessment of the proposed end-state, including radiation doses, and an assessment of environmental impact and safety of workers and the public. On the other hand, it requires an assessment of progressive shutdown and new configurations of IISs, an assessment of RAM during decommissioning (e.g., reliability of hoisting devices to prevent accidents and environmental impact), and an assessment of knowledge management effectiveness. These are fully integrated safety and reliability assessments, which should be carried out by the same team, at the same time and using the same tools and data.

4 Reliability of Items Important to Safety Reliability assessments of NFs are required, among others, by quality assurance (QA) programs and technical specifications of IISs [2]. Safety and risk assessments of NFs are licensing requirements [8, 9]. Two complementary methods (deterministic and probabilistic) are usually required to assess and improve the safety during the lifetime of these facilities.

4.1 Deterministic and Probabilistic Safety Analysis According to the International Atomic Energy Agency (IAEA) [5], deterministic analysis, in the context of safety of NFs, “implies focusing on accident types, releases of radioactive materials and consequences, without considering the probabilities of different event sequences”. Deterministic safety analysis (DSA) is performed to predict the facility response to postulated initiating events (e.g., design basis accidents, DBAs) and to demonstrate the adequacy of the design of engineered safety features or other IISs, both to control and to mitigate their consequences [6]. DSA is based on conservative assumptions made at all steps of quantification of accident sequences, showing that the facility and its safety systems meet the safety requirements and ensuring that the end-states in terms of consequences are acceptable [10].


Probabilistic risk assessment—PRA (sometimes called probabilistic safety analysis—PSA) evaluates the probabilities of occurrence of the accident sequences and their consequences. This type of evaluation is used to identify risks, mitigation measures, and weaknesses in facility design that might dominate the risks. PRA is used in a complementary way with DSA, helping the selection of new accident sequences to be included in safety analysis [10].

4.2 Reliability Characteristics There are many reliability characteristics that are important to facility safety and performance assessments. Table 2 summarizes some of these characteristics, with their respective definitions. The selected reliability characteristics are probability of failure given time, B(X) life, mean life, reliability given time, availability given time, failure rate, and maintainability. These variables are interrelated and their definitions are complemented by Eqs. 1–5, presented next [11]. Assuming an exponential life distribution for an item $i$ with a constant failure rate, $\lambda_i$, the reliability of item $i$ as a function of time $t$, $R_i(t)$, is given by Eq. 1:

$$R_i(t) = e^{-\lambda_i t}. \tag{1}$$
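As an illustration of Eq. 1, the following minimal Python sketch evaluates the reliability and the corresponding probability of failure of a single item. The function names and the numerical values (a failure rate of 1e-4 per hour and a 1000 h mission time) are hypothetical and chosen only for illustration; the sketch assumes the constant failure rate model stated above.

```python
import math


def reliability(failure_rate: float, t: float) -> float:
    """Eq. 1: R_i(t) = exp(-lambda_i * t) for a constant failure rate."""
    if failure_rate < 0 or t < 0:
        raise ValueError("failure_rate and t must be non-negative")
    return math.exp(-failure_rate * t)


def unreliability(failure_rate: float, t: float) -> float:
    """Probability of failure by time t, i.e. 1 - R_i(t)."""
    return 1.0 - reliability(failure_rate, t)


# Hypothetical item: 1e-4 failures per hour, 1000 h mission time.
lam = 1.0e-4
mission_time = 1000.0
print(f"R(t) = {reliability(lam, mission_time):.4f}")
print(f"F(t) = {unreliability(lam, mission_time):.4f}")
```

The time unit of `t` must match the unit of the failure rate (here, hours and failures per hour).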

Table 2 Selected reliability characteristics important to facility safety and performance assessments [11]

Reliability characteristics | Definitions
Probability of failure given time | The probability that an item will be failed at a particular point in time. The probability of failure is also known as “unreliability” and it is the complement of the reliability (i.e., one minus the reliability)
B(X) life | The estimated time when the probability of failure will reach a specified point (X%)
Mean life | The average time that the items in the population are expected to operate before failure. This is often referred to as “mean time to failure” (MTTF)
Reliability given time | The probability that an item will operate successfully at a particular point in time and under specified operational conditions
Availability given time | Probability that an item is operational at a given time (i.e., has not failed or it has been restored after failure)
Failure rate | The number of failures per unit time that can be expected to occur for the item
Maintainability | Probability that an item is successfully restored after failure. “Mean time to repair” (MTTR) is often a reliability characteristic used to measure maintainability
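The B(X) life defined in Table 2 can be made concrete for the exponential model of Eq. 1: setting the probability of failure equal to X% and solving for time gives t = -ln(1 - X/100)/λ. The sketch below is only an illustration under that constant-failure-rate assumption; the function name and the numerical values are hypothetical.

```python
import math


def bx_life(failure_rate: float, x_percent: float) -> float:
    """Time at which the probability of failure reaches X% (exponential model)."""
    if failure_rate <= 0:
        raise ValueError("failure_rate must be positive")
    if not 0.0 < x_percent < 100.0:
        raise ValueError("x_percent must be strictly between 0 and 100")
    return -math.log(1.0 - x_percent / 100.0) / failure_rate


# Hypothetical item with 1e-4 failures per hour: B(10) life in hours.
b10 = bx_life(1.0e-4, 10.0)
print(f"B(10) life = {b10:.1f} h")
```

For other life distributions (e.g., Weibull), the same definition applies but the closed-form expression differs.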

Mean time to failure, MTTF, can be obtained by integrating the reliability function of item $i$ from zero to infinity, as given by Eq. 2:

$$\mathit{MTTF} = \int_0^\infty R_i(t)\,dt. \tag{2}$$
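For the exponential model of Eq. 1, the integral in Eq. 2 has a closed form. The short derivation below is a worked instance under that constant-failure-rate assumption only; other life distributions lead to different expressions.

```latex
\begin{align*}
\mathit{MTTF} &= \int_0^\infty R_i(t)\,dt
               = \int_0^\infty e^{-\lambda_i t}\,dt \\
              &= \left[-\frac{1}{\lambda_i}\,e^{-\lambda_i t}\right]_0^\infty
               = \frac{1}{\lambda_i}.
\end{align*}
```

For example, a hypothetical item with a failure rate of $10^{-4}$ per hour would have an MTTF of $10^{4}$ hours.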

As the reliability of an item is the probability that it will operate successfully for a given time, in dealing with repairable items, some other definitions are needed, as described below. Maintainability, $M_i(t)$, is defined as the probability of performing a successful repair action within a given time, $t$. It therefore measures the ease and speed with which a system can be restored to its operational status after a failure occurs. Considering an item $i$ in which repair times are distributed exponentially, with a constant repair rate, $\mu_i$, its maintainability is given by Eq. 3:

$$M_i(t) = 1 - e^{-\mu_i t}. \tag{3}$$

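Mean time to repair, MTTR, can be obtained by integrating the complement of the maintainability function, $1 - M_i(t)$, from zero to infinity, as given by Eq. 4. Since $M_i(t)$ tends to 1 as $t$ grows, it is the complement, i.e., the probability that the repair has not yet been completed, that is integrable:

$$\mathit{MTTR} = \int_0^\infty \left[1 - M_i(t)\right] dt. \tag{4}$$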
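For the exponential repair model of Eq. 3, the mean repair time equals $1/\mu_i$. The sketch below checks this numerically by integrating the probability that the repair is still in progress, $1 - M_i(t)$, with a simple trapezoidal rule. The function names, the integration settings and the repair rate of 0.25 per hour are hypothetical choices made for illustration.

```python
import math


def maintainability(repair_rate: float, t: float) -> float:
    """Eq. 3: M_i(t) = 1 - exp(-mu_i * t) for a constant repair rate."""
    return 1.0 - math.exp(-repair_rate * t)


def mean_time_to_repair(repair_rate: float,
                        horizon_factor: float = 50.0,
                        steps: int = 200_000) -> float:
    """Trapezoidal integration of 1 - M_i(t) from 0 to horizon_factor / mu_i."""
    t_max = horizon_factor / repair_rate  # tail beyond this is negligible
    h = t_max / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (1.0 - maintainability(repair_rate, k * h))
    return total * h


# Hypothetical repair rate of 0.25 repairs per hour: expected MTTR close to 4 h.
mu = 0.25
print(f"MTTR = {mean_time_to_repair(mu):.4f} h (1/mu = {1.0 / mu:.4f} h)")
```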

Availability can be defined as the probability an item/system is operational at a given time (i.e., it has not failed or it has been restored after failure). Then, availability for repairable systems takes into account both reliability and maintainability. In other words, availability, $A(t)$, is the probability a system is operating properly when it is requested for use, and in the case of a single item, $i$, is given by Eq. 5:

$$A_i(t) = \frac{\text{Item uptime}}{\text{Item uptime} + \text{Item downtime}} = \frac{\mathit{MTTF}}{\mathit{MTTF} + \mathit{MTTR}}. \tag{5}$$
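A short numerical illustration of Eq. 5 follows. The function name and the values (an MTTF of 1000 h and an MTTR of 4 h) are hypothetical and serve only to show the order of magnitude of the result.

```python
def availability(mttf: float, mttr: float) -> float:
    """Eq. 5: A = MTTF / (MTTF + MTTR)."""
    if mttf <= 0 or mttr < 0:
        raise ValueError("mttf must be positive and mttr non-negative")
    return mttf / (mttf + mttr)


# Hypothetical item: MTTF = 1000 h, MTTR = 4 h.
a = availability(1000.0, 4.0)
print(f"Availability = {a:.5f}")
```

With these values the item is expected to be operational about 99.6% of the time; shortening the repair time or lengthening the time to failure both raise the availability.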

In the case of systems with items in series or parallel configurations, the reliability characteristics are calculated using concepts of Boolean algebra and probability theory. For more complex configurations, reliability engineering tools and computer codes are usually required [11, 12].
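For the simple series and parallel configurations mentioned above, the system reliability can be sketched as follows, assuming that item failures are statistically independent (an assumption that excludes common cause failures). The function names and the example system, two redundant pumps followed by one valve, are hypothetical.

```python
from math import prod


def series(reliabilities):
    """Series configuration: the system works only if every item works."""
    return prod(reliabilities)


def parallel(reliabilities):
    """Parallel (redundant) configuration: the system fails only if all items fail."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical system: two redundant pumps (R = 0.90 each) in series with a valve (R = 0.99).
pumps = parallel([0.90, 0.90])
system = series([pumps, 0.99])
print(f"Pump group reliability = {pumps:.4f}")
print(f"System reliability     = {system:.4f}")
```

The example shows why redundancy is listed among the reliability design principles: the pump group is more reliable than either pump alone, while the series valve limits the overall result.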

4.3 Reliability Engineering Tools There are many tools of reliability engineering, usually supported by computer codes, developed to solve reliability-related problems. Some of the most important reliability engineering tools are briefly described next.


Reliability block diagram (RBD) is a graphic tool that represents a system as connections of blocks (system items) in series or in parallel, according to their reliability relationships. RBD analyzes the effect of item failures on system performance, assuming that each item is in an operating or failed state, and associating with each item a failure probability or a reliability characteristic. Its results are used for reliability and availability estimations [11].

Fault tree analysis (FTA) is also a graphic and logical technique that analyzes the probability of occurrence of an undesired state of a system (top event) from the probability of occurrence of its causes (basic events). The top events are selected taking into account the items that are critical to safety or reliability. A fault tree can be described as logical combinations of events (e.g., item failures or human errors), through “OR” or “AND” gates, that will result in the top event. Fault trees can be analyzed qualitatively or quantitatively. Qualitative analysis involves the identification of the combinations of basic events that lead to the top event (minimal cut sets). Quantitative analysis involves the estimation of the probability of occurrence of top events, when the probabilities of the basic events are known. These results are used for both risk and reliability analyses [12, 13].

Reliability analysis and PRAs of IISs require life data (also called lifetimes), such as time, distance or cycles of operation before they fail. Life data analysis (LDA) uses observed item lifetimes to estimate failure rates of IISs and to fit statistical distributions to available data. The most common distributions used for modelling reliability and failure behavior are Weibull, normal, lognormal and Gamma, among others. Computer programs are required to select and fit the adequate distributions and estimate important reliability characteristics and probability of failure at specific times. Sometimes, accelerated life testing (ALT) is needed in order to get lifetimes of items of high reliability [12].

Markov models can be required to estimate reliability characteristics and failure probabilities (e.g., piping rupture rates) from available data, such as inspection and repair rates, flaw occurrence rates and leak rates. They use decision trees for modeling intermediate states of an item before its failure. Markov models are appropriate whenever the stochastic behavior of an item depends on the states of its components. These models assume that the probability that an item will perform a transition from one state to another at any time depends only on the initial and final states of the transition. In general, Markov models are more adequate than FTA for modeling reliability and failure probability of items as a function of time [14].
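To illustrate how a Markov model gives a reliability characteristic as a function of time, the sketch below treats a single repairable item with two states (up and down), a constant failure rate and a constant repair rate. This two-state model is a textbook simplification chosen here for illustration, not a model taken from the chapter; the function names and rates are hypothetical. The closed-form solution is compared with a simple explicit Euler integration of the state equation.

```python
import math


def availability_two_state(failure_rate: float, repair_rate: float, t: float) -> float:
    """Closed-form P(up at time t) for a two-state Markov model, item up at t = 0."""
    total = failure_rate + repair_rate
    steady_state = repair_rate / total
    return steady_state + (failure_rate / total) * math.exp(-total * t)


def euler_up_probability(failure_rate: float, repair_rate: float,
                         t_end: float, dt: float = 1.0e-3) -> float:
    """Explicit Euler integration of dP_up/dt = -lambda * P_up + mu * (1 - P_up)."""
    p_up = 1.0  # item assumed operational at t = 0
    for _ in range(int(round(t_end / dt))):
        p_up += dt * (-failure_rate * p_up + repair_rate * (1.0 - p_up))
    return p_up


# Hypothetical rates: 1e-3 failures per hour, 0.25 repairs per hour.
lam, mu = 1.0e-3, 0.25
for hours in (1.0, 10.0, 100.0):
    print(hours, availability_two_state(lam, mu, hours), euler_up_probability(lam, mu, hours))
```

For long times the result tends to μ/(λ + μ), which coincides with MTTF/(MTTF + MTTR) when both distributions are exponential.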

4.4 Reliability, Availability and Maintainability (RAM) of Items Important to Safety According to [5], an item important to safety (IIS) is an item whose failure could lead to radiation exposure of workers or the public. IISs include structures, systems, and components (SSCs) that prevent foreseeable operational events from resulting in accident conditions, as well as devices necessary to mitigate the consequences of failures or malfunctioning of SSCs. The reliability of IISs shall be in accordance with their safety significance. The design and technical specification of IISs shall be such as to ensure that they can be qualified, installed, operated, and maintained to be capable of withstanding, with sufficient reliability and effectiveness, all conditions specified in their design basis. Table 3 shows some examples of design principles and criteria, which are commonly used in NFs in order to increase safety and reliability of IISs, as well as reduce risks [9]. The design principles and criteria cited in Table 3 are grouped in three focuses: reliability, safety, and risk. For example, the design of IISs shall take into account common cause failures (CCFs), using concepts of redundancy, diversity, physical separation, and functional independence, in order to achieve the intended reliabilities. The improvement of maintainability, e.g., including self-checking capability, periodic testing, or adopting reliability-centered maintenance (RCM), increases the reliability and availability of IISs. The concepts of fail-safe design, defence-in-depth, and single failure design shall be incorporated into the design of IISs, so that even if they fail the safety functions are performed. In the same way, prevention, precautionary, and protection principles, as well as radioprotection or risk-based inspection (RBI), shall be used, as appropriate, to prevent, reduce and mitigate risks to individuals and the environment. Reliability engineering tools for assessing the reliability characteristics are widely used to verify and prove these principles and criteria [2, 9].

Table 3 Examples of design principles and criteria applied to reliability, safety and risk [9]

Design focuses | Principles and criteria
Reliability | Standby and redundancy, physical separation, functional independence, diversity, safety factors, maintainability, availability
Safety | Fail-safe design, double contingency, single failure design, ALARA (as low as reasonably achievable), defence-in-depth, passive safety, fault tolerance, inherently safe, licensing requirements
Risk | Prevention principle, precautionary principle, protection principle, radioprotection, limitation of risks to individuals and environment, design basis accidents (DBAs), risk-based inspections (RBI), IAEA safety principles

4.5 Human Reliability Human reliability analysis (HRA) studies and models the interactions between humans and systems to estimate human error probabilities (HEPs), taking into account work environment, opportunities to recover from errors and their consequences. The best-known HRA techniques are technique for human error rate prediction (THERP), holistic decision tree (HDT), cognitive reliability and error analysis method (CREAM), human error assessment and reduction method (HEART), and standardized plant analysis risk-human reliability analysis (SPAR-H). Many of these techniques use the so-called performance-shaping factors (PSFs) to help HEP prediction from specific work situations [15]. PSFs are defined as variables that may affect human performance in an HRA and are used for adjusting basic HEPs assumed as nominal under certain conditions. PSFs usually adopted in HRA methods include available time, stress, complexity of systems and tasks, training, worker experience, operational procedures, human factors, and workload, among others. Some HRA methods, such as SPAR-H, categorize human tasks into two basic types: action or diagnosis. Routine operation, starting equipment, calibration or maintenance, and other activities of following procedures are examples of action tasks. On the other hand, diagnosis tasks depend on knowledge/experience for analyzing current facility conditions, planning and prioritizing activities, and making decisions about alternative actions. SPAR-H assigns a value of $10^{-2}$ to the nominal HEP for diagnosis tasks, which is ten times the nominal HEP for action tasks ($10^{-3}$). Some PSF levels cause an increase of the HEP (negative influence), while other PSF levels cause a decrease of the HEP (positive influence). Figure 3 illustrates this relationship for diagnosis and action tasks, including 5th and 95th percentiles, representing the uncertainties involved in the estimates of HEPs. A lower bound of $10^{-5}$ is suggested [15].

Fig. 3 Influence of PSF on nominal error rate (based on [15])


SPAR-H treats mathematically the influence of PSF levels on the HEP using specific multipliers for diagnosis and action tasks (Table 4), according to Eq. 6:

$$\mathit{HEP} = \mathit{NHEP} \times \prod_{i=1}^{8} S_i, \tag{6}$$

where $\mathit{NHEP}$ is the nominal HEP ($10^{-2}$ for diagnosis and $10^{-3}$ for action), and $S_i$ is the multiplier associated with the corresponding $\mathrm{PSF}_i$ level (Table 4). As an example, in an accident situation, when a diagnosis task ($\mathit{NHEP} = 10^{-2}$) of high complexity ($S_3 = 5$) is carried out under extreme stress ($S_2 = 5$), considering the other six levels as nominal ($S_1 = S_4 = S_5 = S_6 = S_7 = S_8 = 1$), the HEP value obtained from Eq. 6 would be as shown in Eq. 7:

$$\mathit{HEP} = \mathit{NHEP} \times \prod_{i=1}^{8} S_i = 10^{-2} \times 1 \times 5 \times 5 \times 1 \times 1 \times 1 \times 1 \times 1 = 0.25. \tag{7}$$

In this theoretical example, a diagnosis task that under normal conditions would have a human error probability of 1% reaches 25% under accident conditions. When three or more negative PSF influences are present (a negative influence being any PSF level with a multiplier greater than 1), the SPAR-H method recommends replacing Eq. 6 with an adjusted expression, Eq. 8, for computing HEP [16]. This helps to reduce double counting of interdependencies between PSF effects and prevents the HEP value from exceeding 1.

$$\mathrm{HEP} = \frac{\mathrm{NHEP} \times \prod_{i=1}^{8} S_i}{\mathrm{NHEP} \times \left(\prod_{i=1}^{8} S_i - 1\right) + 1}. \tag{8}$$
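A minimal Python sketch of Eqs. 6 and 8 is given here for illustration. The function name `spar_h_hep` and its arguments are invented for this example and are not part of SPAR-H; the switch to Eq. 8 when three or more multipliers exceed 1 follows the rule described in the text.

```python
from math import prod

# Nominal HEPs quoted in the text for the two SPAR-H task types.
NOMINAL_HEP = {"diagnosis": 1e-2, "action": 1e-3}


def spar_h_hep(task_type, multipliers):
    """Return the HEP for one task from its eight PSF multipliers S_1..S_8."""
    nhep = NOMINAL_HEP[task_type]
    composite = prod(multipliers)                    # product of the S_i in Eq. 6
    negative = sum(1 for s in multipliers if s > 1)  # number of negative PSF influences
    if negative >= 3:
        # Eq. 8: the adjustment keeps the result below 1.
        return nhep * composite / (nhep * (composite - 1) + 1)
    return nhep * composite                          # Eq. 6


# Example from the text: high complexity and extreme stress, other PSFs nominal.
print(spar_h_hep("diagnosis", [1, 5, 5, 1, 1, 1, 1, 1]))
# Hypothetical case with a third negative PSF (low experience, S_4 = 10).
print(spar_h_hep("diagnosis", [1, 5, 5, 10, 1, 1, 1, 1]))
```

The first call reproduces the value of Eq. 7. In the second, hypothetical case Eq. 6 alone would give 2.5, which is not a valid probability, whereas Eq. 8 returns roughly 0.72.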

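5 Probabilistic Risk Assessment (PRA)

Mathematically, the risk from an undesired event can be expressed as a product of the frequency of occurrence and severity, as shown in Eq. 9 [17]:

$$\text{Risk}\left[\frac{\text{consequence}}{\text{time}}\right] = \text{frequency}\left[\frac{\text{event}}{\text{time}}\right] \times \text{severity}\left[\frac{\text{consequence}}{\text{event}}\right]. \tag{9}$$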

Risk assessment is the combination and integration of the probabilities (or frequencies) and the severities (or consequences) for identified hazards, taking into account the effectiveness of controls and barriers (defence-in-depth levels). It provides an input to risk evaluation and decisions about risk management, through the adoption of adequate risk acceptance criteria [18].


Table 4 Evaluation of PSFs for diagnosis and action portions of tasks according to SPAR-H method [15]

| PSF | PSF level | Diagnosis multiplier | Action multiplier |
|---|---|---|---|
| 1. Available time | Inadequate time | P(failure) = 1.0 | P(failure) = 1.0 |
| | Barely adequate time (≈2/3 × nominal) | 10 | 10 |
| | Nominal time | 1 | 1 |
| | Extra time (between 1 and 2 × nominal and >30 min) | 0.1 | 0.1 |
| | Expansive time (>2 × nominal and >30 min) | 0.01 | 0.01 |
| | Insufficient information | 1 | 1 |
| 2. Stress/stressors | Extreme | 5 | 5 |
| | High | 2 | 2 |
| | Nominal | 1 | 1 |
| | Insufficient information | 1 | 1 |
| 3. Complexity | Highly complex | 5 | 5 |
| | Moderately complex | 2 | 2 |
| | Nominal | 1 | 1 |
| | Obvious diagnosis | 0.1 | Not applicable |
| | Insufficient information | 1 | 1 |
| 4. Experience/training | Low | 10 | 3 |
| | Nominal | 1 | 1 |
| | High | 0.5 | 0.5 |
| | Insufficient information | 1 | 1 |
| 5. Procedures | Not available | 50 | 50 |
| | Incomplete | 20 | 20 |
| | Available, but poor | 5 | 5 |
| | Nominal | 1 | 1 |
| | Diagnostic/symptom oriented | 0.5 | Not applicable |
| | Insufficient information | 1 | 1 |
| 6. Ergonomics/HMI | Missing/misleading | 50 | 50 |
| | Poor | 10 | 10 |
| | Nominal | 1 | 1 |
| | Good | 0.5 | 0.5 |
| | Insufficient information | 1 | 1 |
| 7. Fitness for duty | Unfit | P(failure) = 1.0 | P(failure) = 1.0 |
| | Degraded fitness | 5 | 5 |
| | Nominal | 1 | 1 |
| | Insufficient information | 1 | 1 |
| 8. Work processes | Poor | 2 | 5 |
| | Nominal | 1 | 1 |
| | Good | 0.8 | 0.5 |
| | Insufficient information | 1 | 1 |

5.1 PRA Steps

PRA is a quantitative assessment of risks in terms of hazards, such as hazardous chemicals or a radioactive release, and their effects on the health of the public and the environment, in the case of nuclear facilities (NFs). In addition to fissile materials, NFs usually handle large quantities of toxic, corrosive, or flammable materials. Therefore, these materials must be taken into account in typical PRAs. Furthermore, the safety of NFs is more frequently influenced by human errors than that of NPPs. To continually improve processes, operational procedures and technical equipment can undergo frequent changes. In contrast to PRAs for NPPs, which have different levels of consequences (from core damage to environmental impact), typical end-states of NF accident sequences have only one stage (e.g., the release of hazardous materials or criticality). Therefore, PRAs of NFs must take into account these differences compared to traditional PRAs of NPPs. Figure 4 shows the main steps necessary to perform a PRA of an NF.

Fig. 4 Main PRA steps of a nuclear facility

Step 1 “Management and organization” covers the actions and activities necessary to carry out the PRA, including the definition of objectives, scope, project management, team, training, budget, and schedule, as well as the quality assurance (QA) program.

Step 2 “Identification and selection of initiating events (IEs)” seeks a list, as complete as possible, of internal and external events that upset normal operation and require a facility response in order to avoid undesirable outcomes. Operator errors, damaged process pipes or filters, transfer errors, ageing of SSCs, and degradation of control functions are examples of typical IEs for NFs [3].

Step 3 “Scenario modeling” develops a model that links the IEs and the defence-in-depth levels available to prevent or mitigate the undesirable facility end-states. This model describes the sequence of events that follows an IE, including the operator actions initiated to control the accident. The most common tools for modeling complex systems are ETA, FTA, and RBD, among others. HRA methods are used for modeling human factors affecting the behavior and performance of operators.

Step 4 “Data assessment and parameter estimating” consists of gathering the information needed to quantify the frequencies of occurrence of operational sequences and the magnitude of consequences. Parameters such as frequencies of IEs, component reliabilities, safety system unavailability, and human error probabilities are necessary for frequency estimation. For consequence assessment, specific parameters related to the amount, form, and transport of hazardous and radioactive materials during postulated accidents are required. General data related to effects on people and the environment are also required.

Step 5 “Sequence frequency” and step 6 “Consequence evaluation” consist of quantifying the scenarios using the models developed in step 3 and the data gathered in step 4. These steps result in the assessment of the frequency of accident sequences and the estimation of potential consequences, generally in terms of radiation doses. There are many uncertainties related to both frequency and consequence assessments. Assessment of uncertainties related to data, methods and models is of fundamental importance to risk management. Uncertainty assessment usually involves determining the probability density function (pdf) of some parameters associated with outcomes, based on the pdfs of the input data. For instance, uncertainty propagation of failure probability in fault trees and event trees should be considered in PRAs, in order to get the confidence bounds of the quantitative risk metrics.

Step 7 “Risk assessment” involves the integration (or product) of frequency and severity assessments using Eq. 9, and a comparison of the estimated levels of risk with the risk acceptance criteria set by regulators. For risks involving radioactive materials, frequency-dose relationships are usually obtained and compared with acceptance criteria (frequency-dose curves). Uncertainty assessment is also important in this step to rank risks and verify compliance with standards and guidelines. Where appropriate for risk management, sensitivity studies should be made for the main assumptions and parameters to determine their relative importance and guide the decision-making processes.

Step 8 “Documentation” of the PRA, as part of the QA program, should cover a compilation of assumptions, data, methods, detailed analyses, and interpretation of the results. This is essential for future verification, validation, auditing, updating, and improvement of the assessments.

5.2 PRA Tools

Many risk assessment tools that estimate both the frequency of accidents and their consequences can be used to support the implementation of the PRA steps. Some commonly used tools are briefly described next [13, 19].

Failure mode and effect analysis (FMEA) is a qualitative technique for identifying potential failure modes, causes, effects and control mechanisms of failures in processes or systems. Sometimes, a semi-quantitative analysis of risks is carried out for prioritizing corrective actions. This analysis is based on scores assigned to the severity of effects, the likelihood of occurrence and the detection of failures. FMEA is used at the initial steps of PRAs for screening the postulated initiating events (IEs) and accident scenarios.

Event tree analysis (ETA) uses a binary decision tree for modeling accident scenarios. It starts with an IE and proceeds through a sequence of successes or failures of defence-in-depth levels until the end-states are reached. Each end-state is an accident sequence whose consequence should be quantitatively assessed. Each accident sequence has its own frequency of occurrence, depending on the frequency of the IE and on the probability of success or failure of the defence-in-depth levels. These probabilities can be estimated using logical tools such as RBD and FTA, or by analyzing available life data, using LDA or ALT, for example.

Monte Carlo simulation (MCS) is a stochastic modeling method based on direct simulation of systems or processes. Through random sampling of the probability density functions of input parameters, the uncertainties of output parameters are obtained. It is simple in principle, but requires computer programs to be implemented because of the high number of samples necessary to get accurate results. In PRA, MCS is mostly used for uncertainty propagation in FTAs and ETAs, as well as in consequence estimates of accident sequences [10, 20].
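As an illustration of Monte Carlo uncertainty propagation through a single event tree sequence, the following Python sketch samples an initiating event frequency and two barrier failure probabilities and reports the 5th, 50th and 95th percentiles of the sequence frequency. Everything numerical in it is hypothetical: the lognormal form, the medians, the error factor of 3 (ratio of the 95th percentile to the median), the sample size and the function names are assumptions chosen only to show the mechanics.

```python
import math
import random


def lognormal_sampler(median, error_factor, rng):
    """Return a function that draws from a lognormal with the given median.

    error_factor is the ratio of the 95th percentile to the median.
    """
    mu = math.log(median)
    sigma = math.log(error_factor) / 1.645  # 1.645 is the 95th percentile of N(0, 1)
    return lambda: rng.lognormvariate(mu, sigma)


def percentile(sorted_values, q):
    """Nearest-rank percentile (q in 0..100) of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q / 100 * len(sorted_values)))
    return sorted_values[index]


def sequence_frequency_bounds(n_samples=20000, seed=1):
    """Propagate input uncertainty to the frequency of one accident sequence."""
    rng = random.Random(seed)
    # Hypothetical inputs: IE frequency (per year) and two failure probabilities.
    draw_ie = lognormal_sampler(1e-1, 3.0, rng)
    draw_p1 = lognormal_sampler(1e-2, 3.0, rng)
    draw_p2 = lognormal_sampler(1e-1, 3.0, rng)
    samples = sorted(
        draw_ie() * min(draw_p1(), 1.0) * min(draw_p2(), 1.0)  # probabilities capped at 1
        for _ in range(n_samples)
    )
    return tuple(percentile(samples, q) for q in (5, 50, 95))


p05, p50, p95 = sequence_frequency_bounds()
print(f"5th: {p05:.2e}  median: {p50:.2e}  95th: {p95:.2e}")
```

The spread between the 5th and 95th percentiles is what would be compared with an acceptance limit in place of the single point estimate.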
Analytical models are used in safety assessment mainly through the determination of phenomenological, statistical or empirical mathematical equations to model, for instance, accident consequences. They are mainly based on mass, energy and momentum conservation principles. Analytical models can represent process conditions, atmospheric dispersions, explosion shockwaves, radiation heat, or radiological doses, among others, at various points in time after an accident. The consequences can be expressed as fatalities, injuries, radiation doses, damage to structures, or environmental impact [10].

Other tools used in PRAs, such as RBD, FTA and Markov models, are also used in RAM analyses and have already been described in Sect. 4.3 of this chapter.

5.3 Risk and Reliability Acceptance Criteria

Risk acceptance criteria are limits and conditions defined by regulatory bodies to ensure adequate safety levels during licensing processes. In the case of NPPs, these criteria can be, for example, core damage frequency (CDF) and large early release frequency (LERF). In the case of NFs, risk acceptance criteria are usually defined in terms of frequency-consequence (e.g., frequency-dose) ranges for workers or the public. On the other hand, these criteria can be more subjective, for instance, the demonstration by the licensee that IISs or other SSCs meet certain principles and criteria, such as the ALARA principle, single failure criterion, fail-safe design or double contingency principle [21].

NUREG 1860 [22] presents a frequency-consequence (F-C) curve based on the principle that event frequencies and doses are inversely related, which is broadly consistent with ICRP 64 [23]. Table 5 shows these proposed numerical criteria in terms of frequency per reactor-year and the corresponding total effective dose equivalent (TEDE) ranges for the public. These frequency-dose ranges for the public are based on U.S. Code of Federal Regulations (CFR) Parts 20, 50 and 100, as well as on Environmental Protection Agency (EPA) Action Guidelines. While these regulatory requirements provide some dose points for deriving the dose ranges, considerable engineering judgment was necessary to assign the corresponding frequencies. Although these ranges were proposed for NPPs, their bases and criteria can be applied to NFs, e.g., the ALARA principle, stochastic and deterministic effects of radiation, radiological protection, and emergency preparedness.

Figure 5 shows a graphical comparison of the safety criteria of Germany, Great Britain, ICRP 64, NUREG 1860, and Switzerland. The values plotted in Fig. 5 are based on the data available in [3, 22]. Note that some frequency-dose curves are represented only by step lines separating the acceptable and unacceptable regions (Germany, Switzerland and NUREG 1860), while other criteria present ALARA regions between the acceptable and unacceptable regions (ICRP 64 and Great Britain).

Apart from risk acceptance criteria, there are reliability acceptance criteria to meet the safety, performance, and QA requirements. These criteria are based primarily on verifying the compliance of the reliability characteristics (Table 2) with the specified values. For instance, the vendor must include documented bases for the estimated MTTF and MTTR of IISs. Specifying RAM levels for IISs depends not only on their safety significance, but also on many other factors, such as energy efficiency and economic evaluation (e.g., pricing, maintenance, and inspection costs).


Table 5 Proposed dose-frequency ranges for the public, according to NUREG 1860 [22]

| Dose range (mSv) | Frequency (per reactor-year) | Comment (all doses are total effective dose equivalent, TEDE) |
|---|---|---|
| 0.01–0.05 | 1.0 | 0.05 mSv/year is the ALARA dose in 10 CFR 50 App I |
| 0.05–1 | $10^{-2}$ | 1 mSv/year is the public dose limit from licensed operation in 10 CFR 20 |
| 1–10 | $10^{-3}$ | 10 mSv/event is the off-site trigger of EPA Protection Action Guidelines |
| 10–250 | $10^{-4}$ | 250 mSv/event triggers abnormal occurrence reporting and is the limit in 10 CFR 50.34 and in 10 CFR 100 for siting |
| 250–1000 | $10^{-5}$ | 500 mSv is a trigger for deterministic effects (i.e., some early health effects are possible) |
| 1000–3000 | $10^{-6}$ | In this range the threshold for early fatality is exceeded |
| 3000–5000 | $5 \times 10^{-7}$ | Above 3–4 Sv, early fatality is quite likely |
| >5000 | $10^{-7}$ | Above 5 Sv early fatality is very likely and the curve shall be capped |
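The step limits of Table 5 lend themselves to a simple lookup, sketched below in Python. This is only an illustration: the function names are invented here, a dose that falls exactly on a boundary is assigned to the lower range, and doses below 0.01 mSv are rejected because the table does not cover them. Both conventions are assumptions, not part of NUREG 1860.

```python
# Upper dose bound (mSv) of each range in Table 5 and the frequency limit
# (per reactor-year) proposed for that range.
FC_STEPS = [
    (0.05, 1.0),
    (1.0, 1e-2),
    (10.0, 1e-3),
    (250.0, 1e-4),
    (1000.0, 1e-5),
    (3000.0, 1e-6),
    (5000.0, 5e-7),
    (float("inf"), 1e-7),
]


def frequency_limit(dose_msv):
    """Return the Table 5 frequency limit for a given dose in mSv."""
    if dose_msv < 0.01:
        raise ValueError("Table 5 does not cover doses below 0.01 mSv")
    for upper_msv, limit in FC_STEPS:
        if dose_msv <= upper_msv:  # boundary doses go to the lower range (assumption)
            return limit


def is_acceptable(frequency_per_year, dose_msv):
    """True when the point lies under the frequency-dose step curve."""
    return frequency_per_year < frequency_limit(dose_msv)


# Hypothetical points: the same 120 mSv dose at two different frequencies.
print(is_acceptable(1e-5, 120.0))
print(is_acceptable(1e-3, 120.0))
```

A point estimate that passes this check can still fail once its upper confidence bound is considered, so the comparison is best repeated with the 95th percentile of the estimated frequency.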

Fig. 5 Comparison of safety criteria (risk based) for nuclear facilities


The acceptable levels of complex IISs depend on the evaluation of the whole system and its interdependencies. RAM criteria and principles, such as redundancy, diversity, physical separation, fail-safe, single failure and defence-in-depth, among others (Table 3), should be carefully evaluated. Reliability tools, as described in Sect. 4.3, are useful for this verification, mainly taking into account common mode failures. Double contingency is a requirement of particular relevance for NFs. According to this requirement, a “criticality accident cannot occur unless at least two unlikely, independent and concurrent changes in process conditions have occurred” [5]. This implies not only following the above-mentioned criteria of Table 3, but also estimating the probability of occurrence of failures that can result in criticality accidents, including human errors.

6 Integrated Approach of Reliability and Risk Assessments

Reliability, availability, and maintainability (RAM) assessments of items important to safety (IISs) are carried out in quality assurance (QA) programs to define the technical specifications and the required availability levels [2]. On the other hand, safety assessment, conducted using both deterministic and probabilistic approaches, intends to demonstrate that the facility complies with regulatory requirements. The safety assessment shall also assure that defence-in-depth has been implemented in the facility design [9]. RAM assessments, PRA, and the definition of acceptance criteria are complementary and can be carried out in an integrated way. The Venn diagram in Fig. 6 illustrates this integration, showing the interrelationships: technical specifications, reliability data, risk criteria, and safety and availability analyses. The integration also involves the use of common tools, such as RBD, FTA, ETA, MCS, FMEA, LDA, and Markov models.

Fig. 6 Venn diagram illustrating the interrelationships among RAM assessments, PRA and acceptance criteria

A framework to integrate RAM assessments and PRA is presented in Fig. 7. According to this framework, the design basis accidents (DBAs) are defined initially through the analysis of postulated initiating events (IEs), which can include equipment failures, operator errors, and human-induced or natural events that challenge the IISs. The accident scenario analysis includes the modeling of accident sequences, as well as the estimation of their probabilities of occurrence and consequences. The quantitative risk assessment, integrating the probability (or frequency) and severity of consequences, can identify important scenarios beyond DBAs, which in turn shall be analyzed quantitatively, adding new accident sequences to the accident scenarios. Based on the risk assessment, the IISs are ranked according to their significance in preventing or mitigating the consequences of accident sequences.

Fig. 7 Proposed framework for integrating reliability and risk assessments of nuclear facilities

PRA, using reliability and maintainability data and taking into account the technical specifications of IISs, shall include uncertainty assessment, in order to assure that the facility design has adopted adequate safety margins and meets the defined risk acceptance criteria. Frequency-consequence or frequency-dose curves, which depend on the regulatory bodies of different countries (e.g., Fig. 5), are the most used criteria to evaluate whether the facility complies with the acceptable levels of risk. At this point, it is important to consider the confidence bounds of quantitative assessments (represented by the 5th and 95th percentiles of the pdf of the estimated risks), obtained through uncertainty assessment, in order to assure an adequate level of conservatism.

7 Case Study of Integrated Reliability and Risk Assessments

In order to illustrate the use of the integrated approach to reliability and risk assessments, a simple example involving the analysis of a criticality accident in a nuclear fuel fabrication plant (NFFP) was developed. Nuclear facilities such as fuel fabrication plants handle fissile material and must manage their processes and activities to ensure criticality safety. Nuclear subcriticality depends on many parameters, such as mass, concentration, geometry, volume, enrichment, and density of fissile material. It is also influenced by the presence of other materials in the facility, such as moderators, absorbers, and reflectors. Subcriticality is then ensured through different defence-in-depth levels of protection for preventing failures, ensuring detection, and mitigating the consequences of a criticality accident. They can be classified as passive, active, and administrative protection levels [24].

Examples of passive defence-in-depth levels, which do not depend on control systems or human interventions, are designing the geometry and volume of pipes, vessels, and structures to be inherently safe against criticality, limiting the uranium enrichment, the presence of neutron-absorbing materials, and shielding for mitigating accidental radiation doses. Examples of active defence-in-depth levels are automatic process control, neutron and gamma monitors, and computer systems for controlling the movement of fissile material. Examples of administrative defence-in-depth levels are operating procedures for avoiding dangerous situations and for controlling the isotopic composition, mass, density, concentration, chemical composition, degree of moderation, and spacing between fissile material systems. Other examples are the definition of areas authorized to contain significant quantities of fissile material, and access control.
Active components that require human actions in response to indicators or alarms are better classified as administrative defence-in-depth levels.

Consider a probabilistic risk assessment (PRA) of a nuclear fuel fabrication plant (NFFP) carried out at the preliminary facility design phase. The design basis accident (DBA) was assumed to be a criticality excursion producing an initial burst of $1.0 \times 10^{18}$ fissions in 0.5 s, followed successively at 10 min intervals by 47 bursts of $1.9 \times 10^{17}$ fissions, resulting in a total of $1.0 \times 10^{19}$ fissions in 8 h. According to reports of this kind of accident, this event produces little or no mechanical damage in the structures, systems and components (SSCs) of the NFFP [25]. It is also assumed that the system for diagnosing criticality accidents and mitigating their consequences is moderately complex and that there are only poorly elaborated checking procedures to anticipate hazardous conditions. The semi-empirical Eqs. 10 and 11, respectively, are recommended for estimating the prompt gamma and neutron doses following the postulated accident [25]:

$$D_\gamma = \frac{2.1 \times 10^{-19}\, N\, d^{-2}\, e^{-3.4 d}}{\mu_c}, \tag{10}$$

where: $D_\gamma$ = prompt gamma dose (mSv), $N$ = number of fissions, $d$ = distance from the source (km), $\mu_c$ = dose reduction factor ($\mu_c$ = 2.5 for concrete wall thickness = 8 in. and $\mu_c$ = 5 for concrete wall thickness = 20 in.).

$$D_n = \frac{7.0 \times 10^{-19}\, N\, d^{-2}\, e^{-5.2 d}}{\mu_c}, \tag{11}$$

where: $D_n$ = prompt neutron dose (mSv), $N$ = number of fissions, $d$ = distance from the source (km), $\mu_c$ = dose reduction factor ($\mu_c$ = 2.3 for concrete wall thickness = 8 in. and $\mu_c$ = 4.6 for concrete wall thickness = 20 in.).

To investigate design alternatives, define reliability and risk acceptance criteria, and make initial considerations about the influence of human factors on design and operation, a preliminary PRA can be used in an integrated way with reliability assessment techniques. The generic event tree shown in Fig. 8 was developed to investigate the scenarios of a criticality accident in an NFFP, where λ is the frequency of the initiating event, and P1, P2 and P3 are the failure probabilities of the passive, active and administrative defence-in-depth levels, respectively. An initiating event for this kind of accident could be an uncontrolled increase in the concentration of fissile material in piping and vessels of the facility. A frequency of occurrence of λ = $10^{-1}$ per year is conservatively assumed for such an event [26], with P1, P2 and P3 taken as $10^{-2}$, $10^{-1}$, and $10^{-1}$, respectively [27]. These are very conservative figures, since they are one order of magnitude greater than the recommended upper limits for single passive and active defence-in-depth levels. The figure of $10^{-1}$ assumed for the administrative defence-in-depth level is also conservative and is recommended for rare unplanned administrative events. It is also compatible with SPAR-H estimates for moderately complex diagnosis tasks with poor procedures. Assuming that the administrative defence-in-depth levels only mitigate the accident and do not prevent it, and that the defence-in-depth levels are independent and each one is capable of avoiding the accident, the only sequences in which the criticality accident occurs are S1 and S2.
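The quantification of the two accident sequences can be sketched in a few lines of Python. The expressions for S1 and S2 below are an assumption about the structure of the event tree (criticality requires failure of both the passive and the active level, and the administrative level only decides whether the accident is mitigated); the function name is invented for this example. With the values assumed above, they give $10^{-5}$ and $9 \times 10^{-5}$ per year.

```python
def sequence_frequencies(ie_frequency, p_passive, p_active, p_admin):
    """Frequencies (per year) of the two sequences in which criticality occurs.

    Assumed structure: the accident requires failure of both the passive and
    the active level; the administrative level only decides whether the
    accident is mitigated (S2) or not (S1).
    """
    criticality = ie_frequency * p_passive * p_active
    s1 = criticality * p_admin          # administrative level fails: unmitigated
    s2 = criticality * (1.0 - p_admin)  # administrative level works: mitigated
    return s1, s2


# Values assumed in the text for the initial basic design.
s1, s2 = sequence_frequencies(ie_frequency=1e-1, p_passive=1e-2,
                              p_active=1e-1, p_admin=1e-1)
print(f"S1 = {s1:.1e} per year, S2 = {s2:.1e} per year")

# SPAR-H cross-check of p_admin with the Table 4 multipliers: nominal diagnosis
# HEP x moderately complex (2) x procedures available but poor (5).
p_admin_spar_h = 1e-2 * 2 * 5
print(f"SPAR-H estimate of the administrative failure probability: {p_admin_spar_h:.2f}")
```

Only two PSFs are negative in the cross-check, so Eq. 6 applies without the adjustment of Eq. 8.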


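Fig. 8 Generic event tree for the criticality accident analysis in an NFFP

Table 6 Estimated frequency-doses for accident sequences S1 and S2 of the case study

| Alternative design | S1, 8" concrete wall: Freq. (year$^{-1}$) | S1, 8" concrete wall: Dose (mSv) | S1, 20" concrete wall: Freq. (year$^{-1}$) | S1, 20" concrete wall: Dose (mSv) | S2, 8" concrete wall: Freq. (year$^{-1}$) | S2, 8" concrete wall: Dose (mSv) | S2, 20" concrete wall: Freq. (year$^{-1}$) | S2, 20" concrete wall: Dose (mSv) |
|---|---|---|---|---|---|---|---|---|
| 1 | $10^{-5}$ | 241 | $10^{-5}$ | 120 | $9 \times 10^{-5}$ | 46.9 | $9 \times 10^{-5}$ | 23.5 |
| 2 | $10^{-6}$ | 241 | $10^{-6}$ | 120 | $9 \times 10^{-6}$ | 46.9 | $9 \times 10^{-6}$ | 23.5 |
| 3 | $10^{-7}$ | 241 | $10^{-7}$ | 120 | $9 \times 10^{-7}$ | 46.9 | $9 \times 10^{-7}$ | 23.5 |

1: initial basic design; 2: redundant active defence-in-depth; 3: improved administrative defence-in-depth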

Doses and frequencies for sequences S1 and S2 were estimated for three alternative designs: the initially assumed basic design; an improved active defence-in-depth level, adding redundancy; and additional improvements to the administrative defence-in-depth level. Table 6 presents these estimated values. For simplicity, the external doses from dispersion of fission products were neglected, and only prompt gamma and neutron doses were considered in the calculations for the public individuals of the critical group, assuming they are located at a distance of 0.1 km.


Fig. 9 Frequency-dose curve (based on NUREG 1860 [22]) and accident sequences S1 and S2, comparing different design alternatives

The total doses estimated for sequence S1, using Eqs. 10 and 11, were 241 mSv for the 8 in. concrete wall thickness and 120 mSv for the 20 in. concrete wall thickness. These doses were estimated considering that the public individuals remain at the distance of 0.1 km throughout the course of the accident. The doses from sequence S2, estimated supposing an evacuation time of one hour, were reduced to 46.9 mSv for the 8 in. concrete wall thickness and 23.5 mSv for the 20 in. concrete wall thickness. The frequencies of occurrence of sequences S1 and S2, estimated according to the equations presented in Fig. 8, were $10^{-5}$ and $9 \times 10^{-5}$ per year, respectively. The estimated risks are plotted against a frequency-dose curve based on the NUREG 1860 criteria [22], taken as reference. Both sequences S1 and S2, for the initial basic design of the defence-in-depth levels, lie under the curve, in an acceptable region, as can be seen in Fig. 9. However, the dots are very close to the curve, and if the uncertainties in the estimated frequencies and doses were taken into account, the dots would certainly lie in an unacceptable region [20]. Therefore, further design alternatives should be investigated. Adopting redundancy at the active defence-in-depth level, its reliability becomes 0.99, which revises P2 to $10^{-2}$. Revising the design to simplify the diagnosis system and improving maintenance, checklists, and operating procedures, the nominal value of $10^{-3}$ can be adopted for P3, according to the SPAR-H method. In this way, new values are estimated for the criticality risks, which lie in the acceptable region in Fig. 9, far from the limits of the frequency-dose curve and with adequate safety margins.
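The dose figures quoted above can be checked with a short Python sketch of Eqs. 10 and 11. Two points are assumptions made here rather than statements from the text: the dose reduction factor is applied as a divisor of the unshielded dose, and the one-hour evacuation case counts the initial burst plus five of the 10 min bursts. The function and variable names are illustrative.

```python
import math

# Dose reduction factors for the concrete wall, from the definitions of Eqs. 10 and 11,
# keyed by wall thickness in inches.
GAMMA_REDUCTION = {8: 2.5, 20: 5.0}
NEUTRON_REDUCTION = {8: 2.3, 20: 4.6}


def prompt_dose_msv(fissions, distance_km, wall_in):
    """Total prompt dose (gamma + neutron) in mSv behind a concrete wall."""
    gamma = 2.1e-19 * fissions * distance_km**-2 * math.exp(-3.4 * distance_km)
    neutron = 7.0e-19 * fissions * distance_km**-2 * math.exp(-5.2 * distance_km)
    # Assumption: the reduction factor divides the unshielded dose.
    return gamma / GAMMA_REDUCTION[wall_in] + neutron / NEUTRON_REDUCTION[wall_in]


total_fissions = 1.0e19           # whole 8 h excursion (sequence S1)
first_hour = 1.0e18 + 5 * 1.9e17  # assumed: initial burst + 5 bursts before evacuation (S2)

for wall in (8, 20):
    dose_s1 = prompt_dose_msv(total_fissions, 0.1, wall)
    dose_s2 = prompt_dose_msv(first_hour, 0.1, wall)
    print(f"{wall} in. wall: S1 = {dose_s1:.1f} mSv, S2 = {dose_s2:.1f} mSv")
```

Under these assumptions the sketch gives values within rounding of the 241, 120, 46.9 and 23.5 mSv listed in Table 6.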


8 Concluding Remarks

A framework integrating reliability, availability and maintainability (RAM) assessments of items important to safety and probabilistic risk assessment (PRA) of nuclear facilities (NFs) is proposed in this chapter. The use of common reliability engineering tools and reliability data could optimize the safety and availability analyses. The outcomes of this proposal can also support licensees in achieving a high level of safety and reliability through the integrated and complementary use of RAM and PRA tools in all phases of the NF lifetime, and in meeting licensing requirements. A case study demonstrating the advantages of applying the integrated framework to the preliminary design phase of a nuclear fuel fabrication plant was presented. Design and administrative control alternatives were analyzed to reduce the criticality accident risk. This case study highlighted that the use of redundancy in active defence-in-depth levels and the consideration of human factors from the feasibility study and preliminary design phase help to meet the reliability and risk acceptance criteria, as well as nuclear licensing requirements.

Acknowledgements The authors thank the following institutions: CNEN (Brazilian Commission of Nuclear Energy), CDTN (Center of Nuclear Technology Development) and FINEP (Funding of Science, Technology and Innovation).

References

1. Christensen F, Andersen O, Duijm J (2003) Risk terminology: a platform for common understanding and better communication. J Hazard Mater 103:181–203
2. International Atomic Energy Agency (2001) Reliability assurance programme guidebook for advanced light water reactors. IAEA-TECDOC-1264. IAEA, Vienna
3. International Atomic Energy Agency (2002) Procedures for conducting probabilistic safety assessment for non-reactor nuclear facilities. IAEA-TECDOC-1267. IAEA, Vienna
4. Winteringham F, Peter W (1992) Energy use and the environment. Lewis Publishers Inc, London
5. International Atomic Energy Agency (2018) IAEA safety glossary: terminology used in nuclear safety and radiation protection. 2018 edn. IAEA, Vienna
6. International Atomic Energy Agency (2020) Safety analysis and licensing documentation for nuclear fuel cycle facilities. Safety Reports Series No. 102. IAEA, Vienna
7. International Atomic Energy Agency (2017) Safety of nuclear fuel cycle facilities. IAEA Safety Standards Series No. SSR-4. IAEA, Vienna
8. International Atomic Energy Agency (2010) Licensing process of nuclear installations. Specific Safety Guide No. SSG-12. IAEA, Vienna
9. International Atomic Energy Agency (2016) Safety of nuclear power plants: design, specific safety requirements. SSR-2/1 (Rev. 1). IAEA, Vienna
10. Vasconcelos V, Soares W, Costa A et al (2019) Deterministic and probabilistic safety analyses. In: Advances in system reliability engineering. Elsevier, London
11. Reliasoft Corporation (2015) System analysis reference: reliability, availability and optimization. ReliaSoft, Tucson
12. Reliasoft Corporation (2015) Life data analysis reference. ReliaSoft, Tucson
13. Stamatelatos M (2002) Probabilistic risk assessment procedures guide for NASA managers and practitioners: Version 1.1. NASA, Washington DC
14. Fleming K (2004) Markov models for evaluating risk-informed in-service inspection strategies for nuclear power plant piping systems. Reliab Eng Syst Saf 83:27–45
15. U.S. Nuclear Regulatory Commission (2004) The SPAR-H human reliability analysis method. NUREG/CR 6883. USNRC, Washington DC
16. Park J, Jung W, Kim J (2020) Inter-relationships between performance shaping factors for human reliability analysis of nuclear power plants. Nucl Eng Technol 52:87–100
17. U.S. Nuclear Regulatory Commission (1975) WASH-1400: reactor safety study. NUREG 75/014. USNRC, Washington DC
18. International Organization for Standardization (2009) Risk management: principles and guidelines. ISO 31000, 1st edn. ISO/IEC, Geneva
19. Calixto E (2013) Gas and oil reliability engineering, modeling and analysis. Elsevier, Amsterdam
20. Vasconcelos V, Soares W, Costa A, Raso A (2019) Treatment of uncertainties in probabilistic risk assessment. In: Reliability and maintenance-an overview of cases. IntechOpen, London. https://www.intechopen.com/chapters/65179
21. International Atomic Energy Agency (2001) Applications of probabilistic safety assessment (PSA) for nuclear power plants. TECDOC-1200. IAEA, Vienna
22. U.S. Nuclear Regulatory Commission (2007) Feasibility study for a risk-informed and performance based regulatory structure for future plant licensing. NUREG 1860, v. 1. USNRC, Washington DC
23. International Commission on Radiological Protection (1993) ICRP publication 64: protection from potential exposure: a conceptual framework. Ann ICRP 23(1). Ottawa
24. International Atomic Energy Agency (2014) Criticality safety in the handling of fissile material. Specific Safety Guide No. SSG-27. IAEA, Vienna
25. U.S. Nuclear Regulatory Commission (1979) Regulatory guide 3.34, Revision 1. Assumptions used for evaluating the potential radiological consequences of accidental nuclear criticality in a uranium fuel fabrication plant. Washington DC
26. Los Alamos National Laboratory (2000) A review of criticality accidents, 2000 revision. LA-13638. Los Alamos, New Mexico
27. Canadian Nuclear Safety Commission (2018) REGDOC-2.4.3. Nuclear criticality safety. Ottawa, Ontario

Computational Tools of Media Analysis for Corporate Policy Effectiveness Evaluation: Models and Their Reliability Gregory S. Khvatsky, Dmitry G. Zaytsev, Valentina V. Kuskova, and Anna A. Sokol

1 Introduction Human decision-making is becoming more complex due to increasing amount of information t that needs to be analyzed in today’s modern systems. Decision-making in business corporations is not an exception. Such complexity raises a question about reliability or broader efficiency and effectiveness of corporate decision-making. Corporate efficiency and effectiveness is strongly associated with inclusion of the modern computational or data science tools into the strategic corporate decisionmaking. A special concept describes this relationship—business intelligence (BI). BI is “the use of all the organization’s resources: data, applications, people, and processes in order to increase its knowledge, implement and achieve its strategy, and adapt to the environment’s dynamism” [2, p. 6]. However, BI faces some challenges. Among them are the ability to identify “patterns, trends, rules, and relationships” [2, p. 6]. Alnoukari and Hanano [2, p. 6] from large amounts of data that are available, Most of that data are too complex for humans to process. Data complexity leads to the problem where complex corporate efficiency and effectiveness evaluation systems are limited to integration of computational tools

G. S. Khvatsky · D. G. Zaytsev (B) · V. V. Kuskova · A. A. Sokol International Laboratory for Applied Network Research, HSE University, 11 Pokrovsky Bulvar, Pokrovka Complex, Moscow 101000, Russia e-mail: [email protected] G. S. Khvatsky e-mail: [email protected] V. V. Kuskova e-mail: [email protected] A. A. Sokol e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_3


into the simpler models and approaches such as SWOT analysis, scenario analysis, and forecasting [2, 3, 36]. Nevertheless, efficiency is a concept that is universally used across all social sciences, not only in management. In political science, efficiency is associated with governments and states [34]; in sociology, with various social institutions (such as civil society) and the social environment [16]; in economics, with commerce efficiency [39]; in psychology, with efficient behaviors of individuals [13]. This multidimensional nature of efficiency has also led to the propagation of attenuating concepts such as effectiveness [23] and efficacy [28].1

Therefore, to establish a reliable system of corporate efficiency evaluation that is not limited to simpler models, one must take into consideration the increasing complexity of decision-making in modern systems, including large corporations. Outside the narrower field of management, this complexity is reflected in social sciences, and mostly in policy science. Policy science is a newer interdisciplinary field that studies policy-making, and it integrates contemporary developments in economics, management, sociology, and political science [20]. The context-specific, problem-oriented, and evidence-based approach of policy science also allows policy analysts, corporate consultants, and business intelligence agents to better integrate data science methods into their decision-making process, thereby increasing corporate policy effectiveness.

Changing the focus from corporate efficiency to corporate policy effectiveness embeds the necessary complexity into the evaluation of decision-making. The concept of “policy” provides a broader context of the social environment, which serves as an ecological niche in which large corporations function and for which they carry certain responsibilities. Policy effectiveness focuses not only on the efficient results (any outputs achieved with the minimum use of resources), but also on policy outcomes (corporate policy effects that may be valuable for the general public and the society at large).

Table 1 provides questions for corporate policy effectiveness evaluation, derived from the stages of the applied problem-solving approach, and the corresponding stages in the policy cycle theory and practice. To increase effectiveness, policy designers, managers, practitioners, scientists, and social engineers usually have to answer these questions using different methods and tools. Therefore, the table also serves as a summary of the commonly used methods and computational tools.

We argue that computational tools are not inferior, and in some cases may be superior, to the traditional methods of corporate policy effectiveness evaluation. They could provide additional answers when decision-makers are dealing with big corporate data where human processing is impossible. Examples of such data are data from old and new media (social networks), which have lately become the source of many important insights that affect corporate effectiveness.

1 Although these concepts (efficiency, effectiveness, and efficacy) differ, a detailed analysis of the differences is beyond the scope of this chapter.

Table 1 Corporate policy effectiveness evaluation: questions and methods

| Applied problem-solving | Stages in policy cycle | Questions | Traditionally used methods | Computational tools |
|---|---|---|---|---|
| Problem recognition | Agenda-setting (problem definition) | Who is concerned about the problem? What is their status? Are there new ideas? | Stakeholder analysis; mass survey; expert survey; focus groups | Named-entity recognition (NER); social networks analysis (SNA); natural language processing (NLP): topic modeling |
| Proposal of solution | Policy formulation (constructing the policy alternatives) | What policy tools are used? Are there new actors with new ideas? | Experts interviews (brainstorming, consulting) | NLP: topic modeling; NER & SNA with NLP: topic modeling |
| Choice of solution | Decision-making (selection of preferred policy option; policy design) | What are the effects of policy constraints? What is the policy capacity level? | Cost–benefit analysis; scenario analysis; SWOT analysis | NLP: topic modeling and sentiment analysis |
| Putting solution into effect | Policy implementation | What is the relationship of policy means with policy goals (both outputs and outcomes)? | Interviews with policy actors | SNA |
| Monitoring (and evaluation) of results | Policy (monitoring and) evaluation | What are the learning outcomes? | Program monitoring; performance management; program evaluation | NER, SNA, NLP: topic modeling and sentiment analysis |

Natural Language Processing in general, and Topic Modeling in particular, have become key instruments for analyzing old and new media data for a large corporation (Table 1). Therefore, in this chapter we focus on different techniques of topic mining and their reliability for policy-making in a large corporation. Our main contribution is an integration of computational tools such as topic mining into specific tasks of business intelligence, which then leads to an increase in corporate policy effectiveness. The context for this study came from the case of one of the largest transnational corporations (TNK) based in Russia (“the Company”). The problem faced by the Company was a concern with its reputation. The management wanted to implement a new corporate policy aimed at building a more effective image and improving the Company’s reputation in old and new media. To demonstrate how computational tools were used to improve business intelligence, we first provide a short overview of the existing methods of topic mining.


Second, we describe how we use three popular data science methods of topic mining to tackle a real-life business problem. Third, we provide an analysis of the advantages and disadvantages of the topic mining methods related to the case of the Russian TNK. In conclusion, we provide recommendations for development of a system of monitoring and evaluation of corporate policy effectiveness using computational policy tools, topic mining procedures in particular.

2 Theory and Methodology of Topic Mining

Text data is one of the most important sources of information and knowledge. In fact, about 95% of big data come in text or some other unstructured form.2 Therefore, the ability to extract ideas and understanding from such data is a considerable advantage for the successful possession and control of information in business [18, 27]. Social media plays a decisive role in propagating textual information, having a great influence on what information reaches the public [10, 31]. Today’s coverage of information has greatly increased the ability of both ordinary people and large corporations to publish their content and expand their reach. At the same time, as information and knowledge from these media continue to be digitized and stored in the form of news articles, web pages, social media posts and the like, analyzing all this information will help businesses manage it in time to respond to various events [19, 29]. That is why learning the characteristics of social media content becomes important for a number of tasks, such as recent news discovery, personalized message recommendation, friend recommendation, sentiment analysis, and others.

However, analyzing social media content presents certain challenges. Typical social media contains large amounts of data consisting of brief writings with serious fragmentation. Many documents are “off-topic” to the subject of interest; excess highlights often complicate readability. Users face no limitations on the content they post; their language is not sufficiently standardized, unlike that of news or scientific articles. The language may include urban dialects, digital forms, and abbreviations, making it difficult to recognize and understand. All of these issues create problems for data reduction when processing social texts. The challenge of identifying common topics is to detect “hot” topics in social media data streams.
In natural language processing, including topic modeling, there are multiple approaches to detection, labeling, and pattern discovery. Most of these tasks are done by machine learning algorithms. In this study we demonstrate three different approaches to topic mining and outline their benefits and drawbacks. Topic modeling has become a popular analytical tool for evaluating text data. Researchers use topic modeling for different reasons: to find similarity of post content in social media [9, 15], explore environmental data [11], and recommend scientific articles [37]. It also has a special place in management research [14]. There are several

2 https://builtin.com/big-data.

definitions of “topic modeling;” in this study, we will use the definition of [7], who define it as “an unsupervised machine learning method that learns the underlying themes in a large collection of otherwise unorganized documents” [7, p. 419]. Topic modeling makes it possible to capture the hidden semantic structures in a document. A typical document is composed of a mixture of topics, and each topic is composed of a set of words [17, 32, 33]. Therefore, the methods used are usually aimed at separating these sets from each other.

One popular topic modeling method is Latent Dirichlet Allocation (LDA). Originally proposed for use with genetic data [30], the algorithm quickly found its way into text analysis and machine learning [5], where it remains popular to this day. LDA is a three-level hierarchical Bayesian model, where documents are represented as random mixtures over latent topics. Each of the topics is a probability distribution over individual units of analysis (e.g., individual words). Topics in LDA do not mean the same as “topics” or “discourses” in qualitative text analysis, but rather probability distributions over words that need additional interpretation. The model has one hyperparameter: the number of topics to estimate. While there are methods that allow for computational estimation of the number of topics in a particular corpus [26], using them for large corpora requires a large amount of computational effort. LDA for traditional text mining has made some progress and achieved good results [5, 12]. Despite its wide popularity for complex text data, it is not very efficient at analyzing short texts from social media, which have their own unique properties, so special care is required when modeling social media data with LDA.

Another popular method is the Hierarchical Dirichlet Process (HDP). HDP is an extension of LDA, designed to be used when the number of “topics” in the document is not known [40]. Using this method, researchers can effectively improve the efficiency of text segmentation. Because both are so-called bag-of-words methods [8, 21], they can be used without manual intervention since they are not concerned with text structure. This is somewhat of a limitation, since the topics obtained automatically by these methods can be correlated with each other, and this can complicate their interpretation [6, 38].

There is another problem that complicates topic modeling. Sometimes, words in the training data are spelled the same, but have different meanings and represent different topics. Such words as “Elephant” or “Donkey” belong to the “animal” topic, but these words are also related to the “American politics” topic. Both LDA and HDP are insensitive to the original topic of such words. One way to address the above limitations is the “manual labeling” method. Experiments with real data sets have shown that the manual labeling method is efficient and flexible for solving problems related to brand image [1, 40]. In this study, we compare the manual labeling method with several others for use in corporate policy efficiency evaluation.
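To make the bag-of-words limitation concrete, the following minimal Python sketch may help. It is an illustration only and not part of the pipeline used in this study; the helper name `bag_of_words` and the sample sentences are hypothetical. It shows that two texts with different word order receive the same representation, and that a word such as “elephant” is counted identically regardless of the sense in which it is used.

```python
from collections import Counter


def bag_of_words(text):
    """Return word counts for a text, ignoring word order (hypothetical helper)."""
    return Counter(text.lower().split())


# Two sentences with the same words in a different order.
doc_a = "the elephant chased the donkey"
doc_b = "the donkey chased the elephant"

# Word order is discarded, so both documents get the same representation.
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)

# The count for "elephant" carries no information about whether the animal
# or the political symbol is meant.
elephant_count = bag_of_words(doc_a)["elephant"]
```

Any model that sees only such counts, as LDA and HDP do, has to rely on co-occurring words to separate the two senses.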


3 Application of Topic Mining Algorithms to Real-Life Problems in Business and Public Administration

3.1 Policy Problem: Corporate Reputation Vulnerability

The impact of social media on corporate reputation should not be underestimated: with the right words and enough reposts, the reputation of a company can be damaged in a matter of minutes [24]. By now, most companies know this very well. The case presented in this chapter is based on a large Russian multinational corporation (the Company). The Company was well aware of the effects that social media had on the opinions of its stakeholders, so it was working to design a corporate policy aimed at effective public image management. The Company approached the authors of this study with several problems. First, it wanted to have an understanding of its then-current corporate image: the sentiment around the Company, the topics that were generating the largest number of positive and negative comments, and the overall sentiment of public discussion around itself. Second, it wanted to design a media monitoring tool that would provide near-immediate results of frequent media scanning for the purpose of evaluating any changes in corporate reputation. This was a complex project consisting of several parts; this chapter reports on its topic mining component, which was the first problem that the Company posed to the authors.

3.2 Data

The data used for this project were collected via a media monitoring platform called YouScan. This platform provides a tool that collects data from a variety of sources, including social media websites, online news websites, web forums, blogs, and posts made in public channels of messaging applications. YouScan also maintains an archive of such data and provides access to it. The workflow for getting data from YouScan is as follows: the user defines a “topic” that is denoted by several text queries. The system searches the archive for posts matching the queries, and starts monitoring for new data. The user can then export the data for offline processing via a built-in API.

In total, we collected 3,118,412 publications in Russian (texts and metadata information), ranging in time from August 30, 2017 to October 29, 2019. The publications we collected could be separated into multiple groups based on the role they played in the discussions that happened on the online platforms monitored by YouScan. The most common publication type is the ‘post’ (1,270,434 publications). This type of publication is a text that was published on an online platform by a user. The descriptions of all publication types and their counts are provided in Table 2.

Table 2 Publication types with counts and descriptions

| Type | Count | Description |
|---|---|---|
| post | 1,270,434 | A text that was published on an online platform by a user |
| repost | 841,506 | A publication of someone else’s post made by another user or community on their own site or page |
| comment | 596,834 | A comment on a post or a repost |
| replyComment | 315,674 | A comment on a post or a repost that is a reply to another comment |
| Extended repost | 93,964 | A repost of a post made by another user or community that has some content added by the user or community that made the repost |

Table 3 Top-10 data sources and publication counts

| Source | Count | Percentage of total (%) |
|---|---|---|
| vk.com | 1,050,353 | 33.68 |
| facebook.com | 793,709 | 25.45 |
| ok.ru | 342,666 | 10.99 |
| twitter.com | 294,028 | 9.43 |
| telegram.me | 108,743 | 3.49 |
| youtube.com | 77,498 | 2.49 |
| instagram.com | 53,444 | 1.71 |
| livejournal.com | 38,632 | 1.24 |
| talks.by | 20,408 | 0.65 |
| pikabu.ru | 19,012 | 0.61 |
| Other sources | 319,919 | 10.26 |

YouScan collects data from a wide variety of sources, including, but not limited to, social network websites, public Telegram channels, and a selection of news websites. A more complete breakdown of the data sources for our dataset is provided in Table 3. As shown in Table 3, almost 80% of the data came from social network websites. The most common of these, and the most common overall, is VKontakte (VK), one of the most popular social network websites in the Russian-language segment of the Internet. It is followed by Facebook, Odnoklassniki (ok.ru), another social network targeted at the Russian-language segment of the Internet, then Twitter and Instagram. Another large source of data is Telegram, a messaging application that allows its users to create public channels to which they can post. Other users can then subscribe to these channels and discuss the posts in them. Telegram is also popular in Russia, with a penetration rate of more than 25%.3 Among the top-10 most common data sources are blog websites such as LiveJournal (similar to Blogspot) and Pikabu (similar to Reddit), which are also targeted at the Russian-speaking audience. Also among the

3 https://www.statista.com/statistics/867549/top-active-social-media-platforms-in-russia/.

Table 4 Distribution of publications across different resource types

| Resource type | Count | Percentage of total (%) |
|---|---|---|
| social | 2,628,923 | 84 |
| news | 219,663 | 7 |
| blog | 98,944 | 3 |
| messenger | 92,528 | 3 |
| forum | 76,852 | 2 |
| reviews | 1502 | 0 |

most common sources of data are YouTube and a large Belarus-based web forum (talks.by) that is now defunct. The distribution of publication counts across the different types of media indexed by YouScan is presented in Table 4. As Table 4 shows, the most popular type of resource is also social network websites. Social media accounts for 84% of all publications in the dataset.

YouScan collects a large amount of information on each of the publications. Of prime interest for this study are the texts, but other information is also present. YouScan collects data on the time and date the publication was made, the source of the publication, information about the author of the publication (e.g., name, age, and gender for a person, and a URL and a name for a community), and the amount of engagement (shares, likes, reposts) a publication has received. What information is collected depends on the source of the publication. For example, it is not possible to collect engagement rates from a web forum that does not allow its users to “like” and “share” the discussions.

3.3 Preprocessing

Before doing any analysis, we applied multiple preprocessing steps to our data to clean it and improve the interpretability of the results obtained later. Our first step was text cleaning. Since the original corpus was in Russian, we removed English words. We also removed emoji characters, hashtags, punctuation, and web URLs from the texts in our database. The purpose of this step was to eliminate words that could confuse the lemmatizer and to remove information that was not relevant to our analysis.

Next, we performed the lemmatization procedure on all of the texts in the database. The Russian language has a fusional morphology, which means that words have different forms according to their roles in a sentence. This can lead to problems with text analysis, since bag-of-words based algorithms such as LDA may recognize different grammatical forms of the same word as different words. Because of that, a lemmatization procedure that puts all words into their dictionary form is required when performing bag-of-words based analysis.


For this study, we used Mystem—a closed-source, but free to use lemmatizer for Russian and Ukrainian languages. It uses a hybrid approach combining a suffix tree and a dictionary. This approach allows for both accurate lemmatization of known words and guessing dictionary forms of words not originally present in the dictionary [35]. It should be noted that in our case, Mystem failed to properly lemmatize the name of a large Russian state enterprise that was very common in our dataset, so we had to create a custom dictionary in order to perform this step in a proper manner. As a final step, we removed stopwords from our lemmatized texts. Stopwords are words that carry no meaning and can be detrimental to bag-of-words based analysis. Examples of such words for the English language are the verb “to be”, words such as “and,” “or,” and “maybe”. There are lists of such words for almost all languages, and we have used one of these lists for this study. After performing all of these steps, the texts were finally ready for further analysis.
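The three preprocessing steps described in this section (cleaning, lemmatization, stopword removal) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `preprocess`, the regular expressions, and the toy lemma table and stopword list are all hypothetical, and the lookup table merely stands in for a real lemmatizer such as Mystem.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
CYRILLIC_WORD_RE = re.compile(r"[а-яё]+")


def preprocess(text, lemmas, stopwords):
    """Clean a Russian text, map tokens to dictionary forms, drop stopwords.

    `lemmas` is a hypothetical lookup table standing in for a real
    lemmatizer such as Mystem; unknown tokens are left unchanged.
    """
    text = text.lower()
    text = URL_RE.sub(" ", text)       # remove web URLs
    text = HASHTAG_RE.sub(" ", text)   # remove hashtags
    # Keeping only runs of Cyrillic letters also drops English words,
    # emoji, digits, and punctuation.
    tokens = CYRILLIC_WORD_RE.findall(text)
    tokens = [lemmas.get(tok, tok) for tok in tokens]
    return [tok for tok in tokens if tok not in stopwords]


# Toy inputs (assumptions for illustration only).
toy_lemmas = {"компании": "компания", "строят": "строить"}
toy_stopwords = {"и", "в", "на"}
example = preprocess(
    "Компании строят в Сибири! 😀 #новости https://example.com Hello",
    toy_lemmas,
    toy_stopwords,
)
```

One side effect of this particular cleaning rule is that hyphenated words are split into separate tokens; whether that is acceptable depends on the corpus.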

3.4 Latent Dirichlet Allocation

The first algorithm that we used for this study was Latent Dirichlet Allocation (LDA). This algorithm has readily available open-source implementations, which makes it particularly easy to use. For our application of LDA, we used the implementation provided by the Gensim Python library. The Gensim library is a collection of NLP algorithms and models. The library strives to provide implementations of various models that are able to process arbitrarily large corpora that might not fit into the RAM of the computer running the models. This property was quite useful for this study, as our corpus of 3,118,412 publications did not fit into the RAM of the computer running the models.

We did not use algorithmic detection of the number of topics due to the extreme computational cost of fitting the large number of models needed to estimate the value of the hyperparameter. For this study, we chose to train an LDA model with 100 topics and use its results for further analysis. While this number may appear somewhat arbitrary, methodological studies aimed at detecting the number of topics for topic modeling usually stop at about 100 (e.g., [41]). Therefore, we started with this large number to make sure we picked up most of the topics.

With our particular implementation of LDA we faced a set of problems. First, when building the dictionary of words that are present in the corpus, Gensim may remove words that are not very common in the corpus, for example, words that are present in fewer than 10 publications. Such words, however, could still be important for our analysis. Second, Gensim may also remove words that are too common, for example, words that are present in more than 70% of the publications in the corpus. One way to solve this problem was to tweak the parameters used for this word removal. However, we decided to take an alternative approach, and created a list of words that should not be removed from the dictionary even if they were rare. These words included words that, for example, referred to events and people important for the industry under investigation.
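The dictionary-pruning rule with a protected word list can be sketched as follows. This is a pure-Python illustration of the idea rather than Gensim code; the function name `build_vocabulary` and the toy corpus are hypothetical. The parameter names are chosen to mirror the `no_below`, `no_above`, and `keep_tokens` arguments of Gensim's `Dictionary.filter_extremes`, which, to our understanding, serves the same purpose.

```python
from collections import Counter


def build_vocabulary(docs, no_below=10, no_above=0.7, keep_tokens=()):
    """Return the set of tokens kept after document-frequency filtering.

    docs: list of token lists (one list per publication).
    no_below: minimum number of documents a token must appear in.
    no_above: maximum fraction of documents a token may appear in.
    keep_tokens: tokens that are never removed (hypothetical keep list).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each token once per document

    keep = set(keep_tokens)
    vocab = set()
    for token, df in doc_freq.items():
        frequent_enough = df >= no_below
        not_too_common = df <= no_above * n_docs
        if token in keep or (frequent_enough and not_too_common):
            vocab.add(token)
    return vocab


# Toy corpus of 4 documents with thresholds scaled down for illustration.
toy_docs = [
    ["company", "plant", "safety"],
    ["company", "plant", "festival"],
    ["company", "safety"],
    ["company", "plant"],
]
vocab = build_vocabulary(toy_docs, no_below=2, no_above=0.75,
                         keep_tokens=["festival"])
```

In the toy corpus, "company" appears in every document and is dropped as too common, while "festival" appears only once but survives because it is on the keep list.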


A third problem was the presence of publications that were not relevant to our analysis. These publications (e.g., advertisements) decreased the performance of LDA: discovered topics often had words from the advertisements with very high probabilities, which made interpretation of the discovered topics much more difficult. The fourth problem we faced was a large number of topics that were highly correlated with one another, thus making the model interpretation much harder. This was most likely due to the large number of topics in the model. It could also be due to the way that the original search query was constructed, because certain words (that were part of the query itself) were present in a very large number of posts. Again, this presented a problem for our analysis, as it made it much harder for the training algorithm to differentiate the topics present in the posts from one another.

Yet another problem that affected the model interpretability was the choice of the procedure used to assign topics to individual publications. LDA assigns a probability distribution of topics to each of the publications in the corpus. This means that technically, all publications have all topics present within them with various degrees of “affinity.” Determining which topics are present in which publication becomes quite complicated as a result. An obvious choice would be to simply select the most probable topic for each of the publications. However, doing so is based on an implicit assumption that a publication can only have exactly one topic present within it. After using this approach on our data, we found that this assumption was not true for many of the publications present in our dataset. Thus, we decided to implement an alternative approach. We picked a threshold value and selected all the topics with the probability equal to or higher than this value.
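The difference between the single-topic rule and the threshold rule can be sketched as follows. This is a minimal illustration, not the study's code: the function names and the example distribution are hypothetical, and the default threshold of 0.05 is the value adopted in this study.

```python
def assign_topics(topic_probs, threshold=0.05):
    """Return IDs of all topics whose probability is at least `threshold`.

    topic_probs: list of (topic_id, probability) pairs for one publication,
    i.e. the per-document topic distribution produced by a topic model.
    """
    return [topic_id for topic_id, prob in topic_probs if prob >= threshold]


def most_probable_topic(topic_probs):
    """Single-topic alternative: pick only the most probable topic."""
    return max(topic_probs, key=lambda pair: pair[1])[0]


# Hypothetical distribution for one publication.
doc_topics = [(8, 0.46), (42, 0.31), (17, 0.05), (93, 0.02)]
multi = assign_topics(doc_topics)         # topics 8, 42 and 17
single = most_probable_topic(doc_topics)  # topic 8 only
```

With the single-topic rule, the substantial weight on topic 42 in this example would be lost, which is the kind of information the threshold rule is meant to retain.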
This approach allows for multiple topics to be present in one document, but of course, it depends on determining the appropriate threshold value. For this study, we set the probability threshold to the standard 0.05, or 5%.

After building the models, we visualized them using another Python library called pyLDAvis. This library can convert LDA topic models to interactive visualizations that can be used to analyze and interpret the meaning of topics discovered in the corpus algorithmically. The library also visualizes the relationships between different topics via what is called a “topic map.” PyLDAvis builds these maps by first computing a matrix of similarities between all possible pairs of topics and then embedding it into a 2D space using classic multidimensional scaling. An example of such a map for the LDA model we built on our corpus of 3,118,412 publications is presented in Fig. 1.

It can be seen from the map that topics 93, 17, 8, 42 and 24 are the most common in the corpus. After in-depth analysis, we determined that they represent important discourses present in the media. Topic 93 represents the discourse that is related to the Company as well as its subsidiaries. Publications related to this topic describe corporate activities of the Company, such as public events that it organizes. A sizable part of the publications related to this topic were also related to organizations that were important not only to the Company itself, but to the industry as a whole. For example, certain environmental protection agencies, societies and the events they organize became included in our sample.


Fig. 1 An LDA model topic map

Topic 17 represents the discourse that is related to the workers of the industry that the Company operates in. The publications related to this topic had discussions on the identity of typical industry workers and descriptions of their daily lives. This discourse also contained the discussions of dangers that are inherent to the industry for the workers themselves, the environment, and humanity as a whole. Topic 8 represents the discourse related to one of the worst disasters to happen to the industry in question, its short- and long-lasting effects on the lives of the affected people, the environment, and the future of the industry. Another important aspect of the discourse is the continued discussion of the ways the disaster was represented in various media (e.g., TV shows and video games). Topic 42 represents a discourse that is related to that of topic 8. The discourse for this topic is also centered on the same disastrous event, but it centers more on the media depictions of the event, the long-term effects on the environment surrounding the epicenter of the event, and the exploration of this environment.


Topic 24 represents a discourse that is connected with the industry in question as a whole. It also contains discussions on its relationship with other industries. While we found the results of the analysis useful, because of the problems with the analysis described above, we decided to apply other topic discovery algorithms to our dataset. Our goal was to compare the results obtained with different algorithms and models.

3.5 Hierarchical Dirichlet Process

Another method that we attempted to use for topic modeling and topic discovery was the Hierarchical Dirichlet Process (HDP) topic model. While it is harder to use than LDA, the availability of a free and open-source implementation makes the use of this model relatively easy. For our application of HDP, we used the implementation provided by the Gensim Python library. The implementation of HDP provided by Gensim is based on an online algorithm that allows for analysis of arbitrarily large text corpora. As in the case of LDA, this property was very useful due to the large size of our corpus.

For this model, we focused only on some types of publications in our corpus. We removed comments from our analysis since they did not add as much substance to the discussion of the Company and industry as posts and (extended) reposts. The size of the final corpus used to train the model was 2,205,904 publications, and it consisted of posts, reposts, and extended reposts. Since HDP can infer the number of topics from the data, we did not employ any algorithmic procedures for topic count estimation. For this study, we set the maximum possible number of topics in the model to 150.

HDP and LDA are both based on the bag-of-words approach and as such share a common set of problems. We followed the same procedure for building the dictionary as outlined in the section on LDA. We used the same list of words that were important for our analysis to prevent them from being removed from the dictionary for being too common or too rare. When building the dictionary, we used the same parameters for word exclusion, removing words from the dictionary if they were present in fewer than 10 publications or in more than 70% of publications.

Another issue that was shared to a degree between the HDP and LDA modeling approaches was that the topics were highly correlated and contained similar sets of probable words. In the case of HDP, this problem was alleviated by the fact that the number of topics inferred from the text by the model was much lower than the number we used for the part of our analysis based on LDA. Also relevant to this algorithm were the issues caused by the presence of irrelevant publications in the dataset.

Assigning topics to individual publications also proved problematic. As in the case of LDA, HDP assigns a probability distribution over all the topics to each of the documents. However, this issue affects HDP much more than LDA, because the implementation of HDP we used generates a distribution not over the inferred


number of topics, but over the maximum possible number of topics specified when training the model. To remedy this, we decided to use an approach similar to what we did for LDA: we set a probability threshold of 5%. As with LDA, we could assign more than one topic to a document.

After training the model, we created a visualization procedure that allowed us to see the relationships between different topics. While HDP models can be transformed into LDA models, this transformation destroys a large portion of the available data, such as the number of topics inferred by the model. This severely limits the usefulness of converting HDP models to LDA models to aid in their visualization and interpretation. To visualize the model we used the following procedure. First, for each topic we counted the number of documents related to the topic using the rules described above. Next, for each of the topics inferred by the model, we obtained a sample of the 1000 most probable words with their probability scores. We then created a topic-word matrix with rows representing topics and columns representing sampled words. If for some topic a word was not among the 1000 most probable, we assigned it a probability score of 0. Next, we used t-SNE to embed this large matrix into a 2-dimensional space. An example of a topic map created using this procedure is presented in Fig. 2. The size of the bubble represents the number of documents related to each of the topics.

As Fig. 2 shows, the model has discovered 20 topics in our dataset of publication texts. Topics 0, 1, 2, 3 and 6 have the largest number of documents related to them, and further analysis suggests that they represent important discourses that exist in social media.
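The construction of the topic-word matrix used for the HDP topic map can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the function name `build_topic_word_matrix` and the toy input are hypothetical, and the toy topics carry two words each instead of 1000.

```python
def build_topic_word_matrix(top_words_by_topic):
    """Build a dense topic-word matrix from per-topic lists of top words.

    top_words_by_topic: dict mapping topic ID to a list of
    (word, probability) pairs (the most probable words of that topic).
    Returns (topic_ids, vocabulary, matrix), where matrix[i][j] is the
    probability of vocabulary[j] in topic topic_ids[i], or 0.0 if the word
    is not among that topic's sampled words.
    """
    topic_ids = sorted(top_words_by_topic)
    vocabulary = sorted({w for pairs in top_words_by_topic.values()
                         for w, _ in pairs})
    column = {word: j for j, word in enumerate(vocabulary)}

    matrix = [[0.0] * len(vocabulary) for _ in topic_ids]
    for i, topic_id in enumerate(topic_ids):
        for word, prob in top_words_by_topic[topic_id]:
            matrix[i][column[word]] = prob
    return topic_ids, vocabulary, matrix


# Toy input with 2 words per topic instead of 1000 (illustrative values).
toy_topics = {
    0: [("disaster", 0.30), ("zone", 0.20)],
    1: [("investment", 0.25), ("zone", 0.10)],
}
topic_ids, vocabulary, matrix = build_topic_word_matrix(toy_topics)
# The rows of `matrix` could then be passed to a 2-D embedding method
# such as t-SNE to obtain topic coordinates for plotting.
```

Filling missing entries with zeros keeps every topic in the same vector space, so that topics sharing few top words end up far apart after the embedding step.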

Fig. 2 HDP model topic map

G. S. Khvatsky et al.

Topics 0, 3 and 6 represent a discourse that is centered around one of the largest disasters to happen to the industry under investigation. This discourse also refers to the geographical and geopolitical area where the event took place, as well as the lives of people involved in the containment measures and otherwise affected by the tragedy as well as its long-term effects. Topic 0 is focused on the lives of people affected by the tragedy, and by its effects on the area surrounding the site. Topic 3 is concerned with the strain that this tragedy put on the industry under investigation in the country where it happened, as well as how this strain affects large economic sectors of this country. Topic 6 is concerned with the general dangers of the continued development of the industry in question, and their connection with one of the largest disasters that happened in connection with that industry. Topic 1 represents a discourse centered around investments that the Company makes in its home country and how it helps develop the remote parts of the country. This discourse also contains discussions around new large-scale projects undertaken by the corporation and the jobs it creates. Topic 2 represents a discourse centered around new technological developments in the industry in question and the role of the Company in these developments. This discourse also includes discussions on the investments and job creation by the corporation.

3.6 Manual Labeling

The third method that we used for topic analysis in this study is what we call “manual labeling.” While this method requires human expert involvement, it allows for simultaneous topic discovery and interpretation. It also allowed us to filter out publications that were detrimental to the analysis, such as advertisements. The manual labeling method is centered around assigning topics to publications based on a dictionary created by a panel of experts. A topic can be assigned to a document based on the text of the publication (for example, if a word is present in the text or a regular expression matches it) or on the set of other topics already assigned to the document. For example, when analyzing a dataset related to computer hardware, one can define a topic called “motherboards.” This topic will be assigned to a publication if it contains the word “motherboard” or the word “mainboard.” Other topics might be “graphics cards” (with the words being “graphics card” or “display adapter”) and “keyboards” (with the word “keyboard”). It would then be possible to define two further topics. One is “core components,” assigned to publications that already have the topic “motherboards” or “graphics cards” assigned to them. Another is “peripherals,” assigned to publications that have the topic “keyboards” assigned to them. The hierarchy of topics can be arbitrarily deep. It would also be possible to define the topic “computers,” assigned only to those publications that have both the “core components” and the “peripherals” topics assigned to them. This topic hierarchy is represented in Fig. 3.

Fig. 3 Example of a topic hierarchy
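The computer-hardware example above can be sketched as a small dictionary-based labeler. This is only an illustration under stated assumptions: the dictionary layout (the keys `patterns`, `any_of` and `all_of`), the regular expressions and the function name `label_publication` are hypothetical and are not taken from the implementation used in the study.

```python
import re

# Hypothetical dictionary for the computer-hardware example.
# "patterns": regular expressions searched in the publication text.
# "any_of":   assign the topic if at least one listed topic is already assigned.
# "all_of":   assign the topic only if every listed topic is already assigned.
TOPIC_DICTIONARY = {
    "motherboards": {"patterns": [r"\bmotherboards?\b", r"\bmainboards?\b"]},
    "graphics cards": {"patterns": [r"\bgraphics cards?\b", r"\bdisplay adapters?\b"]},
    "keyboards": {"patterns": [r"\bkeyboards?\b"]},
    "core components": {"any_of": ["motherboards", "graphics cards"]},
    "peripherals": {"any_of": ["keyboards"]},
    "computers": {"all_of": ["core components", "peripherals"]},
}


def label_publication(text, dictionary=TOPIC_DICTIONARY):
    """Return the set of topics assigned to one publication (binary assignment)."""
    assigned = set()
    # Text-based rules: a topic matches if any of its patterns occurs in the text.
    for topic, rule in dictionary.items():
        if any(re.search(p, text, re.IGNORECASE) for p in rule.get("patterns", [])):
            assigned.add(topic)
    # Topic-based rules: repeat until nothing changes, so that hierarchies
    # of any depth are resolved regardless of the order of the entries.
    changed = True
    while changed:
        changed = False
        for topic, rule in dictionary.items():
            if topic in assigned:
                continue
            any_of = rule.get("any_of", [])
            all_of = rule.get("all_of", [])
            matches_any = bool(any_of) and any(t in assigned for t in any_of)
            matches_all = bool(all_of) and all(t in assigned for t in all_of)
            if matches_any or matches_all:
                assigned.add(topic)
                changed = True
    return assigned
```

Iterating to a fixed point keeps the sketch independent of the order in which topics are listed; a publication that matches no rule simply receives an empty set of topics.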

For our analysis, we created our own implementation of this method in the Python programming language. In our implementation, the definition of each topic contained sets of regular expressions (used to search for substrings in the text), sets of other topics, or a combination of both.

In order to build the topic dictionary using expert opinion, we devised the following cyclic procedure. (The experts were people with subject matter knowledge of the industry.) Starting with an initially empty dictionary, we sampled from our database a set of documents that did not have any topics assigned to them. We then had a panel of experts analyze the sampled documents and make additions and modifications to the dictionary so that all the sampled documents had topics assigned to them. In the case of our study, the experts identified various discourses that existed in our dataset. They also identified the connections between the discourses that had already been discovered. We repeated this procedure until all remaining unassigned documents were irrelevant to our analysis (e.g., unclassified advertisements).

One of the biggest challenges inherent to this type of analysis is that it is both time-consuming and susceptible to the subjectivity of the experts. While the first problem remains unsolved, we tried to remedy the second by using a panel of multiple experts. Additionally, we used independent coding, followed by a discussion of the coding results among the experts, to reach consensus and make the coding as objective as possible. Another problem with this method is that it does not allow the level of affinity of a publication to a particular topic to be determined, as our implementation only allows for binary topic assignment. Still, the method allows arbitrarily many topics from the dictionary to be assigned to a single publication, so it does not require the one-topic-per-document assumption to hold. Another issue with this method (which it shares with other bag-of-words text analysis methods) is that it largely ignores the structure of the text of the publications. What makes this issue important is that particular care should be taken when building
the dictionary, since ambiguous entries can greatly affect the quality of the extracted topics. For example, if the dictionary is designed to use single words to assign topics to documents, topics may be assigned to documents that are not relevant to them. A general recommendation in this case is to include as much context as possible when adding text-based entries to the dictionary. Although it would have been possible, our implementation of this method did not make use of the additional metainformation available for the publications in our dataset. Since the analysis performed for this study was largely exploratory, we reserved the use of metainformation for future work. However, after performing this analysis, we used the metadata available for the publications for further analysis.

After performing all the procedures outlined above, we discovered 189 topics in the texts of publications in our dataset. We visualized them using the following procedure. First, for each of the topics in the dictionary, we created a sample of the publications related to the topic. Next, we used a pre-trained word2vec model to convert the text of each of the sampled publications to a vector with 1536 components. Next, for each of the topics we computed the component-wise average of the vectors of all the documents related to the topic. Then, we used t-SNE to embed the average vectors into a 2-dimensional space. This visualization (or topic map) is presented in Fig. 4. The sizes of the bubbles represent the number of documents in the dataset that are related to each of the topics.

As Fig. 4 shows, some topics are more prevalent in the dataset than others. We have determined that topics 159, 12, 17, 1 and 121 are the most common in the full corpus. Topic 159 is related to the corporation under investigation, its social media accounts and publications, and social media discussions centered around its public image. It should be noted that this topic was defined using a single word, namely the name of the Company, so it is quite general and requires care when interpreted.
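The averaging step of the visualization procedure described above can be sketched as follows. The sketch assumes that document vectors are already available (for instance from a pre-trained embedding model); the function names and the toy 3-component vectors are hypothetical stand-ins for the 1536-component vectors mentioned in the text.

```python
def component_average(vectors):
    """Component-wise mean of equally sized document vectors."""
    if not vectors:
        raise ValueError("at least one vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def topic_centroids(doc_vectors, doc_topics):
    """Average the vectors of all documents related to each topic.

    doc_vectors: {doc_id: vector}; doc_topics: {doc_id: set of topic ids}.
    A document with several topics contributes to each of them.
    """
    by_topic = {}
    for doc_id, topics in doc_topics.items():
        for topic in topics:
            by_topic.setdefault(topic, []).append(doc_vectors[doc_id])
    return {topic: component_average(vs) for topic, vs in by_topic.items()}


# Toy data (hypothetical): 3-component vectors, two topics.
vectors = {"a": [1.0, 0.0, 2.0], "b": [3.0, 2.0, 0.0], "c": [0.0, 4.0, 4.0]}
topics = {"a": {12}, "b": {12, 17}, "c": {17}}
centroids = topic_centroids(vectors, topics)
```

The per-topic averages in `centroids` are the vectors that would then be embedded into two dimensions.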

Fig. 4 Manual labeling model topic map

Topics 12, 17 and 121 are related to each other. They describe a discourse centered around one of the largest disasters that happened in the industry under investigation. Topic 12 is more focused on the geographical area where the accident happened, and the effects of the incident on the surrounding nature and environment. It also contains publications dedicated to lives of those affected by the disaster. Topic 17 describes a discourse centered around both technical and cultural implications of the accident and its containment, as well as related media. Topic 121 is focused on a discourse surrounding the actual site of the accident, the present efforts for continued containment, modern explorations of the site, and the ways the site of the accident is represented in media. Topic 1 describes a more general discourse that is focused on other disasters and accidents related to the industry, and their cultural, environmental, and media impact. It should be noted that the dictionary was constructed in a way that minimized the intersection between this topic and topics 12, 17, and 121.

4 Comparing Three Different Topic Mining Algorithms: Advantages and Disadvantages of Each Method for Practical Problem-Solving in Business

The drawbacks and benefits of each method described above are summarized in Table 5. This table can be used as a “roadmap” for building a system for monitoring and evaluating corporate policy effectiveness using computational tools for mining old and new media data.

It should be noted that there are many possible avenues for improving the analysis presented in this study. The first is to use other, more modern methods for automated topic discovery. The methods used in this study made use only of the text data available for publications, not the metainformation. For example, the topic model proposed by [25] can take location and time data into account when modeling topics from the corpus, and both time and location data are available in our dataset. Another topic model, proposed by [22], is a variant of LDA that can take connections between documents into account. These data are also available in our dataset in the form of connections between posts and reposts, posts and comments, and comments and replies to comments. It is entirely possible that including metadata in the topic discovery process would allow better, less correlated, and more easily interpretable topics to be discovered in our dataset.

Another possible avenue for improvement is to modify the manual labeling topic discovery process. While it is better than having the experts read millions of publications one by one, the process is still laborious and time-consuming. One possible way to improve this process is to use active learning. It may be possible to start with an automatically generated grouping of publications. Such groupings can be generated by one of two methods. The first is via a clustering algorithm that allows for detection of publications that do not belong to any of the clusters (e.g., DBSCAN).

Table 5 Comparison of topic modeling methods used in our study

| | LDA | HDP | Manual labeling |
| --- | --- | --- | --- |
| Determination of the number of topics | The number of topics is a hyperparameter; for a large dataset, determining the optimal number of topics requires a large amount of computational effort | The number of topics is inferred from the data, although the maximum number of topics is still a hyperparameter of the model | The number of topics is inferred from the data by a panel of experts, and can be from 1 to an arbitrarily large value. However, the sampling approach may still leave some of the topics in the data undiscovered by the expert panel |
| Topic interpretability | Discovered topics are not necessarily interpretable. A suboptimal choice of the number of topics may further decrease topic interpretability | Discovered topics are not necessarily interpretable | Topics are human-defined, and thus are interpretable by definition |
| Robustness to dirty data | The method is not robust to dirty data. Presence of data not related to the subject of analysis in the dataset may negatively affect the quality of discovered topics | The method is not robust to dirty data. Presence of data not related to the subject of analysis in the dataset may negatively affect both the quality of discovered topics and the inferred number of topics | The method is robust to dirty and unrelated data in the sense that the expert panel has a means to move all the unrelated data to a separate topic and remove it from further analysis |
| Inferring topics from documents | This method allows for inference; however, assigning topics to individual documents requires further consideration | This method allows for inference; however, assigning topics to individual documents requires further consideration. In addition, the difference between the maximum number of topics and the inferred number of topics may further complicate this procedure | The method allows for assigning topics to individual documents. However, it does not allow for computation of a level of “affinity” of a document to a particular topic |
| Using text structure to determine topics | LDA is a bag-of-words method and does not take the structure of the text into account | HDP is a bag-of-words method and does not take the structure of the text into account | It is possible to integrate text structure into the topic dictionary to a degree |

The second is via a topic modeling algorithm. After the automatic groups are generated, the expert panel can work on interpreting and improving the discovered document groupings. It is also possible to combine both automated topic modeling approaches with the manual labeling approach we applied in this study. This combined approach can help with removing posts that are irrelevant to the analysis, such as advertisements. It can also help with manual labeling. As a result, it can help improve the performance of the automated topic discovery models.

Continuing with recommendations for applying the methods described in this chapter in a future system for monitoring and evaluating corporate policy effectiveness, we would like to suggest the following improvements to our methodology. Our first recommendation is to take special care when creating the search query used to obtain the initial dataset for analysis. Since the topic modeling methods we used were affected by irrelevant publications returned by our query, a better, narrower query has a large potential for improving the performance of all the presented models. Another recommendation is to use an algorithm to determine the number of topics for the LDA model. We did not use one due to lack of time (the project had a tight deadline) and limited computational power. However, in the future, using an optimal number of topics for the LDA model should improve the quality and interpretability of the discovered topics.

Notes

Funding: The article was prepared within the framework of the HSE University Basic Research Program.

References

1. Allahyari M, Pouriyeh S, Kochut K, Arabnia HR (2017) A knowledge-based topic modeling approach for automatic topic labeling. Int J Adv Comput Sci Appl 8(9):335–349. https://doi.org/10.14569/IJACSA.2017.080947
2. Alnoukari M, Hanano A (2017) Integration of business intelligence with corporate strategic management. J Intell Stud Bus 7(2). https://doi.org/10.37380/jisib.v7i2.235
3. Alnoukari M, Razouk R, Hanano A (2016) BSC-SI, a framework for integrating strategic intelligence in corporate strategic management. Int J Strateg Inf Technol Appl (IJSITA) 7(1):32–44. https://doi.org/10.4018/IJSITA.2016010103
4. https://builtin.com/big-data. What is big data?
5. Blei DM (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
6. Boyd-Graber J, Blei DM (2010) Syntactic topic models. arXiv:1002.4665 [cs, math, stat]. http://arxiv.org/abs/1002.4665. Accessed 21 July 2021
7. Chaney A, Blei D (2012) Visualizing topic models. Proc Int AAAI Conf Web Soc Media 6(1):419–422
8. Chasin R et al (2014) Word sense disambiguation in the clinical domain: a comparison of knowledge-rich and knowledge-poor unsupervised methods. J Am Med Inform Assoc 21(5):842–849. https://doi.org/10.1136/amiajnl-2013-002133
9. Cheng X et al (2014) BTM: topic modeling over short texts. IEEE Trans Knowl Data Eng 26(12):2928–2941. https://doi.org/10.1109/TKDE.2014.2313872

10. Drury G (2008) Opinion piece: social media: should marketers engage and how can it be done effectively? J Direct Data Digit Mark Pract 9(3):274–277. https://doi.org/10.1057/palgrave.dddmp.4350096
11. Girdhar Y, Giguère P, Dudek G (2013) Autonomous adaptive underwater exploration using online topic modeling. In: Desai JP et al (eds) Experimental robotics: the 13th international symposium on experimental robotics. Springer tracts in advanced robotics. Springer International Publishing, Heidelberg, pp 789–802. https://doi.org/10.1007/978-3-319-00065-7_53
12. Griffiths TL, Steyvers M (2004) Finding scientific topics. Proc Natl Acad Sci 101(suppl 1):5228–5235. https://doi.org/10.1073/pnas.0307752101
13. Haley KJ, Fessler DMT (2005) Nobody’s watching? Subtle cues affect generosity in an anonymous economic game. Evol Hum Behav 26(3):245–256. https://doi.org/10.1016/j.evolhumbehav.2005.01.002
14. Hannigan TR et al (2019) Topic modeling in management research: rendering new theory from textual data. Acad Manag Ann 13(2):586–632. https://doi.org/10.5465/annals.2017.0099
15. Hong L, Davison BD (2010) Empirical study of topic modeling in Twitter. In: Proceedings of the first workshop on social media analytics. Association for computing machinery (SOMA ’10), New York, NY, USA, pp 80–88. https://doi.org/10.1145/1964858.1964870
16. Hoxha G (2015) Limited efficiency of civil society during the democratization process of Albania. Thesis. Epoka University. http://dspace.epoka.edu.al/handle/1/1762. Accessed 17 November 2020
17. Jelodar H et al (2019) Latent Dirichlet allocation (LDA) and topic modeling: models, applications, a survey. Multimed Tools Appl 78(11):15169–15211. https://doi.org/10.1007/s11042-018-6894-4
18. Kaplan AM (2015) Social media, the digital revolution, and the business of media. Int J Media Manag 17(4):197–199. https://doi.org/10.1080/14241277.2015.1120014
19. Kaplan AM, Haenlein M (2010) Users of the world, unite! The challenges and opportunities of social media. Bus Horiz 53(1):59–68. https://doi.org/10.1016/j.bushor.2009.09.003
20. Lasswell HD (1970) The emerging conception of the policy sciences. Policy Sci 1(1):3–14
21. Lau JH et al (2012) Word sense induction for novel sense detection. In: Proceedings of the 13th conference of the European chapter of the association for computational linguistics. EACL 2012. Association for computational linguistics, Avignon, France, pp 591–601. https://aclanthology.org/E12-1060. Accessed 21 July 2021
22. Liu Y, Xu S (2017) A local context-aware LDA model for topic modeling in a document network. J Am Soc Inf Sci 68(6):1429–1448. https://doi.org/10.1002/asi.23822
23. Magalhães PC (2014) Government effectiveness and support for democracy. Eur J Polit Res 53(1):77–97. https://doi.org/10.1111/1475-6765.12024
24. McCorkindale T, DiStaso MW, Carroll C (2013) The power of social media and its influence on corporate reputation. In: The handbook of communication and corporate reputation, vol 9, no. 1, pp 497–512
25. Mei Q, Zhai C (2006) A mixture model for contextual text mining. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’06; The 12th ACM SIGKDD international conference, Philadelphia, PA, USA. ACM Press, p 649. https://doi.org/10.1145/1150402.1150482
26. Mimno D et al (2011) Optimizing semantic coherence in topic models. In: Proceedings of the 2011 conference on empirical methods in natural language processing, EMNLP 2011. Association for computational linguistics, Edinburgh, Scotland, UK, pp 262–272. https://aclanthology.org/D11-1024. Accessed 13 July 2021
27. Mulhern F (2009) Integrated marketing communications: from media channels to digital connectivity. J Mark Commun 15(2–3):85–101. https://doi.org/10.1080/13527260902757506
28. Niemi RG, Craig SC, Mattei F (1991) Measuring internal political efficacy in the 1988 national election study. Am Polit Sci Rev 85(4):1407–1413. https://doi.org/10.2307/1963953
29. Pentina I, Tarafdar M (2014) From “information” to “knowing”: exploring the role of social media in contemporary news consumption. Comput Hum Behav 35:211–223. https://doi.org/10.1016/j.chb.2014.02.045

30. Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155(2):945–959
31. Qualman E (2012) Socialnomics: how social media transforms the way we live and do business. Wiley
32. Ramamonjisoa D (2014) Topic modeling on users’s comments. In: 2014 third ICT international student project conference (ICT-ISPC), pp 177–180. https://doi.org/10.1109/ICT-ISPC.2014.6923245
33. Rani S, Kumar M (2021) Topic modeling and its applications in materials science and engineering. Mater Today Proc 45:5591–5596. https://doi.org/10.1016/j.matpr.2021.02.313
34. Rayp G, Sijpe NVD (2007) Measuring and explaining government efficiency in developing countries. J Dev Stud 43(2):360–381. https://doi.org/10.1080/00220380601125230
35. Segalovich I (2003) A fast morphological algorithm with unknown word guessing induced by a dictionary for a web search engine. In: MLMTA, p 273
36. Tewolde MH, Gubán P (2010) The means of analysis and evaluation for corporate performances. Ann Univ Apulensis Ser Oecon 1(12):738–749. https://doi.org/10.29302/oeconomica.2010.12.1.43
37. Wang C, Blei DM (2011) Collaborative topic modeling for recommending scientific articles. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining. Association for computing machinery (KDD’11), New York, NY, USA, pp 448–456. https://doi.org/10.1145/2020408.2020480
38. Wang H et al (2019) Optimization of topic recognition model for news texts based on LDA. J Digital Inf Manag 17(5):257. https://doi.org/10.6025/jdim/2019/17/5/257-269
39. Wen HJ, Lim B, Lisa Huang H (2003) Measuring e-commerce efficiency: a data envelopment analysis (DEA) approach. Ind Manag Data Syst 103(9):703–710. https://doi.org/10.1108/02635570310506124
40. Yau C-K et al (2014) Clustering scientific documents with topic modeling. Scientometrics 100(3):767–786. https://doi.org/10.1007/s11192-014-1321-8
41. Zhao W, Chen JJ, Perkins R, Liu Z, Ge W, Ding Y, Zou W (2015) A heuristic approach to determine an appropriate number of topics in topic modeling. In: BMC Bioinf 16(13):1–10. BioMed Central, 2015

Optimal Design of Checkpoint Systems with General Structures, Tasks and Schemes

Kenichiro Naruse and Toshio Nakagawa

Abstract This chapter proposes some kinds of checkpoint systems with general structures, tasks and schemes. We have already considered redundancy techniques which are duplex and majority systems, and have applied them to two checkpoint models in which their interval times are constant and random. Giving overheads for checkpoints, we have obtained the mean execution times until the process succeeds, and have derived optimal checkpoint times to minimize them. In this chapter, we first introduce the standard checkpoint model, and propose general checkpoint models which include parallel, series and bridge systems. Furthermore, we consider tandem and bulk tasks, and apply them to two schemes and compare optimal policies theoretically and numerically. Finally, as examples of the above models, we give four models, obtain their mean execution times analytically and discuss which scheme is better numerically.

K. Naruse (B) Faculty of Social and Environmental Studies, Josai International University, 1 Gumyo, Togane, Chiba 283-8555, Japan e-mail: [email protected]
T. Nakagawa Department of Business Administration, Aichi Institute of Technology, 1247 Yachigusa, Yakusa, Toyota, Aichi 470-0392, Japan e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_4

1 Introduction

It is of great importance to design computer systems with high reliability. In particular, technologies have recently advanced to the point of building spacecraft bound for Mars, the International Space Station, self-driving cars, and so on. Most of these consist of computing units that need high reliability and high-speed processing. Therefore, we have to design such computer systems with high quality in advance. However, we cannot eliminate all computer errors, and to mask these errors, we have to design highly reliable computers with multiple units and fault-tolerant performance. Some errors may occur due to space radiation, electromagnetic waves, low-quality hardware, overheating of elements, overcurrent, and so on. These errors lead to faults or failures and might cause serious damage to systems. To prevent such faults, several types of fault-tolerant technologies, including system redundancies and configurations, have been offered [1, 5, 13]. Using such techniques, we can achieve high reliability and effective performance of the target computer systems in actual fields.

Failures due to errors may be found, and computer systems then lose their consistency. To guard against such critical incidents, recovery methods are required to restore a consistent state that existed just before the failure. The most standard method is to take copies of normal states at suitable times, which are called checkpoint times. When errors of the process occur and are detected, we execute a rollback operation to the nearest checkpoint time and restore the consistent state of the process.

The simplest scheme for error detection and recovery is as follows [6]: we execute two independent modules and compare their states at checkpoint times. If the states of the two modules do not match, we go back to the most recent checkpoint and retry. Furthermore, we recently considered six modules with duplex and majority schemes, which consist of three clusters with two modules and two clusters with three modules, and compared them [9].

Several studies have been conducted to determine optimal checkpoint frequencies: the reliability and performance of a double modular system with one spare module were evaluated [7, 11]. In addition, the performance of checkpoint schemes with task duplication was evaluated [14, 15]. An optimal instruction retry period that minimizes the possibility of dynamic failures using a triple modular controller was derived [4]. Evaluation models with finite checkpoints and bounded rollback were discussed [10].

In this chapter, we obtain the mean execution time to complete the process for the standard checkpoint model using renewal processes, and derive the optimal periodic checkpoint interval in Sect. 2. We generalize checkpoint models in Sect. 3: we consider a K-out-of-n structure in Sect. 3.1 and obtain the reliability of general redundant systems, which include parallel, series, majority decision, and bridge systems, in Sect. 3.2. We take up tandem and bulk tasks and compare numerically which task is better in Sect. 3.3. Section 3.4 introduces two checkpoint schemes, compare-checkpoint and store-checkpoint. To make the above models easy to apply in practice, we give four examples of random tasks in Sect. 4: we consider a parallel structure with double modules in Sect. 4.1, 4 tasks with 3 schemes in Sect. 4.2, 6 tasks with 9 schemes with double modules in Sect. 4.3(1), and triple majority modules in Sect. 4.3(2). We obtain the mean execution times of each model, show numerical examples, and discuss which model is better.

2 Standard Model

Suppose that $S$ $(0 < S < \infty)$ is a native execution time of processes and does not include any checkpoint overheads. It is assumed that errors occur according to a general distribution $F(t)$, and the failure rate is $h(t) \equiv f(t)/\bar{F}(t)$, where $f(t)$ is a density function of $F(t)$ and $\bar{\Phi}(t) \equiv 1 - \Phi(t)$ for any function $\Phi(t)$. To detect such errors, we divide $S$ equally into $N$ $(N = 1, 2, \ldots)$ time intervals and place periodic checkpoints at planned times $kT$ $(k = 1, 2, \ldots)$, where $T \equiv S/N$. If we detect errors at a checkpoint time, we roll back to the previous checkpoint and re-execute the process, where it is assumed that errors occur according to an identical distribution $F(t)$ in each checkpoint interval $((k-1)T, kT]$ and the failure rate $h(t)$ increases with $t$.

Introduce a constant overhead $C_1$ for one checkpoint. Then, the mean time $L(N)$ to complete the process with a native execution time $S$ is the total of the execution times and the overheads $C_1$ for the checkpoints. From the assumption that if some errors are detected at a checkpoint time, the process is rolled back to the previous checkpoint, the mean execution time for each checkpoint interval $((k-1)T, kT]$ is given by a renewal equation [8, p. 126]:

$$
L(1) = (T + C_1)\bar{F}(T) + [T + C_1 + L(1)]F(T). \tag{1}
$$

Solving (1) with respect to $L(1)$,

$$
L(1) = \frac{T + C_1}{\bar{F}(T)}. \tag{2}
$$

Thus, the mean time to complete the process is

$$
L(N) \equiv N L(1) = \frac{N(T + C_1)}{\bar{F}(T)} = \frac{S + N C_1}{\bar{F}(S/N)} = \frac{S(1 + C_1/T)}{\bar{F}(T)}. \tag{3}
$$

We find optimal $N_1^*$ to minimize $L(N)$. For this purpose, we derive optimal $\widetilde{T}_1$ $(0 < \widetilde{T}_1 \le S)$ to minimize

$$
\widetilde{L}(T) \equiv \frac{1 + C_1/T}{\bar{F}(T)}.
$$

Differentiating $\widetilde{L}(T)$ with respect to $T$ and setting it equal to zero,

$$
h(T)(T^2 + C_1 T) - C_1 = 0, \tag{4}
$$

whose left-hand side increases strictly with $T$ from $-C_1$ to $\infty$. Thus, there exists a finite $\widetilde{T}_1$ $(0 < \widetilde{T}_1 < \infty)$ which satisfies (4).
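For completeness, the step from the objective function to (4) can be written out as follows. The derivation uses only the definitions given above, with $\bar{F}(T) \equiv 1 - F(T)$, so that $\bar{F}'(T) = -f(T)$ and $h(T) = f(T)/\bar{F}(T)$:

$$
\frac{d}{dT}\left[\frac{1 + C_1/T}{\bar{F}(T)}\right]
= \frac{-(C_1/T^2)\,\bar{F}(T) + (1 + C_1/T)\,f(T)}{\bar{F}(T)^2} = 0
\iff \left(1 + \frac{C_1}{T}\right) h(T) = \frac{C_1}{T^2}
\iff h(T)(T^2 + C_1 T) - C_1 = 0.
$$

At $T = 0$ the left-hand side of the last expression equals $-C_1$, which is the starting value quoted in the text.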

Therefore, we have the following optimal number $N_1^*$, using the partition method [8, p. 42]:

(i) When $\widetilde{T}_1 < S$, we set $[S/\widetilde{T}_1] \equiv N$ and calculate $L(N)$ from (3). If $L(N) \le L(N+1)$, then $N_1^* = N$, and conversely, if $L(N) > L(N+1)$, then $N_1^* = N + 1$.
(ii) When $\widetilde{T}_1 \ge S$, $N_1^* = 1$, i.e., we do not make any checkpoint, and the mean time is $(S + C_1)/\bar{F}(S)$.

Note that $\widetilde{T}_1$ given in (4) does not depend on $S$. Thus, if $S$ is very large, changes greatly, or is uncertain, then we may adopt $\widetilde{T}_1$ as an approximation of the checkpoint time $T_1^*$. For example, when $S$ is a random variable with a distribution $L(t)$ and mean $l$,

$$
L(N) = \frac{l(1 + C_1/T)}{\bar{F}(T)}.
$$

In particular, when $F(t) = 1 - e^{-\lambda t}$, i.e., $h(t) = \lambda$, (4) is

$$
T^2 + C_1 T - \frac{C_1}{\lambda} = 0. \tag{5}
$$

Solving (5) with respect to $T$,

$$
\widetilde{T}_1 = \frac{C_1}{2}\left[\sqrt{1 + \frac{4}{\lambda C_1}} - 1\right]. \tag{6}
$$

From (5), $\widetilde{T}_1 \ge S$ if and only if $S^2 + C_1 S \le C_1/\lambda$. Thus, if $1/\lambda \ge (S^2/C_1) + S$, then $N_1^* = 1$.
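The partition method for the exponential case can be sketched in Python as follows. This is a minimal illustration, assuming $F(t) = 1 - e^{-\lambda t}$; the function names and the parameter values are hypothetical and chosen only for the example. The helper `t_tilde` returns the positive root of the quadratic (5) computed with the standard quadratic formula.

```python
import math


def mean_time(n, s, c1, lam):
    """L(N) from (3) with F(t) = 1 - exp(-lam * t): (S + N*C1) / exp(-lam*S/N)."""
    return (s + n * c1) * math.exp(lam * s / n)


def t_tilde(c1, lam):
    """Positive root of T^2 + C1*T - C1/lam = 0, i.e. of (5)."""
    return (c1 / 2.0) * (math.sqrt(1.0 + 4.0 / (lam * c1)) - 1.0)


def optimal_n(s, c1, lam):
    """Partition method: compare L(N) and L(N+1) for N = floor(S / t_tilde)."""
    t = t_tilde(c1, lam)
    if t >= s:
        return 1  # case (ii): no intermediate checkpoint
    n = math.floor(s / t)
    if mean_time(n, s, c1, lam) <= mean_time(n + 1, s, c1, lam):
        return n
    return n + 1


# Hypothetical parameter values, chosen only for illustration.
S, C1, LAM = 1.0, 0.01, 0.1
n_star = optimal_n(S, C1, LAM)
```

Because the objective is first decreasing and then increasing in $T$, comparing only the two integers adjacent to $S/\widetilde{T}_1$ is sufficient; a brute-force search over $N$ is a simple way to cross-check the result.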

3 General Checkpoint Models

3.1 General Structures

We propose general structures with redundant modules to mask errors and detect them at checkpoint times. As one such general structure, we consider the following K-out-of-n system, whose theoretical properties and practical applications have been extensively collected [12] and whose reliability characteristics have been investigated: a K-out-of-n $(1 \le K \le n)$ structure can operate if and only if at least $K$ of the total $n$ modules are operable. When errors of each module occur according to an identical distribution $F(t)$, the reliability of the structure at time $t$ is

$$
R_K(t) = \sum_{j=K}^{n} \binom{n}{j} \bar{F}(t)^{j} F(t)^{n-j}. \tag{7}
$$

Furthermore, suppose that $K$ is a random variable with a probability function $p_k \equiv \Pr\{K = k\}$ $(k = 0, 1, 2, \ldots, n)$ for a specified $n$ $(n = 1, 2, \ldots)$, where $p_0 \equiv 0$ [2, 3]. The distribution of $K$ is denoted by $P_k \equiv \Pr\{K \le k\} = \sum_{j=0}^{k} p_j$ $(k = 0, 1, \ldots, n)$, where $P_0 \equiv 0$, $P_n \equiv 1$, and $P_k$ increases with $k$ from 0 to 1. Then, the reliability of the structure at time $t$ is, from (7),

$$
R_P(t) = \sum_{k=0}^{n} p_k \sum_{j=k}^{n} \binom{n}{j} \bar{F}(t)^{j} F(t)^{n-j}
= \sum_{k=1}^{n} P_k \binom{n}{k} \bar{F}(t)^{k} F(t)^{n-k}, \tag{8}
$$

and the mean time is

$$
M_P \equiv \int_0^{\infty} R_P(t)\,dt = \sum_{k=1}^{n} P_k \binom{n}{k} \int_0^{\infty} \bar{F}(t)^{k} F(t)^{n-k}\,dt. \tag{9}
$$

Note that $P_k$ represents the probability that when $k$ modules have no error, the structure is normal, i.e., the process executes correctly.

3.2 Examples of Redundant Modules

(1) Parallel Structure
When $P_k = 1$ $(k = 1, 2, \ldots, n)$,

$$ R_P(t) = 1 - F(t)^n. \tag{10} $$

(2) Series Structure
When $P_k = 0$ $(k = 1, 2, \ldots, n-1)$ and $P_n = 1$,

$$ R_S(t) = \overline{F}(t)^n. \tag{11} $$

(3) Majority Decision Structure
When $n = 2m + 1$ and $k = m + 1$, $P_j = 0$ $(j = 1, 2, \ldots, m)$ and $P_j = 1$ $(j = m+1, \ldots, 2m+1)$,

$$ R_M(t) = \sum_{k=m+1}^{2m+1} \binom{2m+1}{k} \overline{F}(t)^{k} F(t)^{2m+1-k}. $$

In particular, when $m = 1$,

$$ R_M(t) = 3\overline{F}(t)^2 - 2\overline{F}(t)^3. \tag{12} $$


K. Naruse and T. Nakagawa

Fig. 1 Bridge structure with 5 modules with 2-terminal network

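(4) Bridge Structure
When the structure with 5 modules is given in Fig. 1, $P_1 = 0$, $P_2 = 1/5$, $P_3 = 4/5$, $P_4 = P_5 = 1$,

$$ \begin{aligned} \overline{F}_B(t) &= \frac{1}{5}\binom{5}{2}\overline{F}(t)^2 F(t)^3 + \frac{4}{5}\binom{5}{3}\overline{F}(t)^3 F(t)^2 + \binom{5}{4}\overline{F}(t)^4 F(t) + \overline{F}(t)^5 \\ &= \overline{F}(t)^2\left[1 + 2F(t) + F(t)^2 - 2F(t)^3\right]. \end{aligned} \tag{13} $$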
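As a quick numerical check of (8) against the four special cases (10) to (13), the following sketch evaluates the general formula for a given module survival probability $p = \overline{F}(t)$. The function name and the value $p = 0.9$ are illustrative choices, not part of the chapter.

```python
from math import comb

def structure_reliability(P, p):
    """Eq. (8): sum over k = 1..n of P_k * C(n, k) * p^k * (1 - p)^(n - k).

    P[k - 1] holds P_k, and p is the survival probability of one module.
    """
    n = len(P)
    return sum(P[k - 1] * comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(1, n + 1))

p = 0.9  # illustrative module survival probability
parallel = structure_reliability([1, 1, 1], p)            # Eq. (10) with n = 3
series = structure_reliability([0, 0, 1], p)              # Eq. (11) with n = 3
majority = structure_reliability([0, 1, 1], p)            # Eq. (12), 2-out-of-3
bridge = structure_reliability([0, 1 / 5, 4 / 5, 1, 1], p)  # Eq. (13)
```

Each structure is described only by its vector $(P_1, \ldots, P_n)$, so other redundant configurations can be compared by changing that vector alone.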

Example 1 Consider a majority decision structure with 3 modules when $F(t) = 1 - e^{-\lambda t}$. If the states of two or more of the three modules agree with each other, then the process is correct, i.e., the structure can mask a single module error. Then, the reliability of the structure is, from (12),

$$ R_M(t) = 3e^{-2\lambda t} - 2e^{-3\lambda t}. \tag{14} $$

Assuming that $C_2$ is the overhead of a majority comparison with three modules, the mean time to complete the process is, from (3),

$$ L_M(N) = \frac{S + N C_2}{3e^{-2\lambda S/N} - 2e^{-3\lambda S/N}} = \frac{S(1 + C_2/T)}{3e^{-2\lambda T} - 2e^{-3\lambda T}} \equiv L_M(T). \tag{15} $$

Differentiating $L_M(T)$ with respect to $T$ and setting it equal to zero,

$$ \left(e^{\lambda T} - 1\right)\left(T^2 + C_2 T - \frac{C_2}{2\lambda}\right) = \frac{C_2}{6\lambda}, \tag{16} $$

whose left-hand side increases strictly with $T$ from 0 to $\infty$. Thus, there exists a finite and unique $\widetilde{T}_2$ $(0 < \widetilde{T}_2 < \infty)$ which satisfies (16), and using the partition method in Sect. 2, we obtain optimal $N_2^*$ and $T_2^*$ which minimize $L_M(N)$ in (15). Therefore, for any structure with reliability $R_P(t)$ in (8), we obtain the mean time in (3) and discuss optimal policies to minimize it.

Table 1 presents optimal $N_2^*$ and the mean execution time $L_M(N_2^*)$ in (15) for $\lambda$ and $C_2$. For example, when $S = 1.0$, $\lambda = 0.150$, $C_2 = 0.05$, optimal $N_2^*$ is 3. Thus, when we make 3 checkpoints until time $S$, it is optimal and the mean execution time is 1.117. This indicates that $N_2^*$ increases with $\lambda$ from 1 and decreases with $C_2$, and $L_M(N_2^*)$ increases with $\lambda$ and $C_2$ from $S = 1.0$. This means that if error rate $\lambda$ is small and overhead $C_2$ is large, we should not place any checkpoints until time $S$.

Table 1 Optimal $N_2^*$ and mean time $L_M(N_2^*)$ when $S = 1.0$, $G(t) = 1 - e^{-t}$

| $\lambda$ | $N_2^*$ ($C_2 = 0.001$) | $L_M(N_2^*)$ ($C_2 = 0.001$) | $N_2^*$ ($C_2 = 0.005$) | $L_M(N_2^*)$ ($C_2 = 0.005$) | $N_2^*$ ($C_2 = 0.05$) | $L_M(N_2^*)$ ($C_2 = 0.05$) |
|---|---|---|---|---|---|---|
| 0.150 | 5 | 1.008 | 3 | 1.022 | 3 | 1.117 |
| 0.125 | 4 | 1.007 | 3 | 1.020 | 1 | 1.092 |
| 0.100 | 4 | 1.006 | 2 | 1.017 | 1 | 1.077 |
| 0.075 | 3 | 1.005 | 2 | 1.014 | 1 | 1.066 |
| 0.050 | 2 | 1.004 | 2 | 1.012 | 1 | 1.057 |
| 0.025 | 2 | 1.002 | 1 | 1.007 | 1 | 1.052 |
| 0.001 | 1 | 1.001 | 1 | 1.005 | 1 | 1.050 |
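A minimal sketch of how (15) can be evaluated for the majority structure follows. It is an illustration under stated assumptions: instead of solving (16) and applying the partition method, it simply searches $N = 1, \ldots, N_{\max}$ by brute force, and the function names and the bound `n_max` are hypothetical. The parameter values are those of the ($\lambda = 0.150$, $C_2 = 0.001$) cell of Table 1.

```python
import math

def majority_reliability(t, lam):
    """Eq. (14): 2-out-of-3 majority structure with exponential errors."""
    return 3.0 * math.exp(-2.0 * lam * t) - 2.0 * math.exp(-3.0 * lam * t)

def mean_time_majority(n, s, lam, c2):
    """Eq. (15): (S + N*C2) / R_M(S/N)."""
    return (s + n * c2) / majority_reliability(s / n, lam)

def best_n(s, lam, c2, n_max=200):
    """Brute-force minimizer of L_M(N) over N = 1..n_max (an assumption,
    used here in place of the partition method)."""
    return min(range(1, n_max + 1),
               key=lambda n: mean_time_majority(n, s, lam, c2))

n_star = best_n(1.0, 0.150, 0.001)
l_star = mean_time_majority(n_star, 1.0, 0.150, 0.001)
```

For these inputs the search is in line with the corresponding Table 1 entry ($N_2^* = 5$, mean time about 1.008).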

3.3 Two Kinds of Random Tasks

(1) Tandem task
Suppose that the process executes $N$ tasks $(N = 1, 2, \ldots)$, each of which has a processing time $Y_j$ $(j = 1, 2, \ldots, N)$ with an identical distribution $G(t) \equiv \Pr\{Y_j \le t\}$ with finite mean $1/\theta$ and is executed successively, which is called Tandem task in Fig. 2. Supposing that $T$ is a random variable with $G(t)$ in Sect. 2, the mean time to complete the process is, from (1),

$$ L_T(N) \equiv \int_0^\infty \left\{ (t + N C_1)\,\overline{F}(t) + \left[t + N C_1 + L_T(N)\right] F(t) \right\} dG^{(N)}(t), \tag{17} $$

where $G^{(N)}(t)$ is the $N$-fold Stieltjes convolution of $G(t)$ with itself, and $G^{(0)}(t) \equiv 1$ for $t \ge 0$. Solving (17) with respect to $L_T(N)$, the mean time to complete the process is

$$ L_T(N) = \frac{N(1/\theta + C_1)}{\int_0^\infty \overline{F}(t)\,dG^{(N)}(t)}. \tag{18} $$

(2) Bulk task
Suppose that the process executes $N$ tasks simultaneously, which is called Bulk task in Fig. 3. If some tasks with errors are detected, then the process returns to the previous checkpoint and executes all of the $N$ tasks, and ends when all of the $N$ tasks have no errors. Then, letting $C_N$ be the overhead for $N$ tasks, the mean time to complete the process is

$$ L_B(N) = \int_0^\infty \left\{ (t + C_N)\,\overline{F}(t) + \left[t + C_N + L_B(N)\right] F(t) \right\} dG(t)^N. \tag{19} $$

Solving (19) with respect to $L_B(N)$,

$$ L_B(N) = \frac{\int_0^\infty \left[1 - G(t)^N\right] dt + C_N}{\int_0^\infty \overline{F}(t)\,dG(t)^N}. \tag{20} $$

Fig. 2 Tandem task

Fig. 3 Bulk task

(20)

We compare $L_T(N)$ and $L_B(N)$ when $C_N = C_1 N$: From (18) and (20),

$$ L_T(N) - L_B(N) = \frac{N(1/\theta + C_1)}{\int_0^\infty G^{(N)}(t)\,dF(t)} - \frac{\int_0^\infty \left[1 - G(t)^N\right] dt + N C_1}{\int_0^\infty G(t)^N\,dF(t)}. $$

Noting that $G(t)^N \ge G^{(N)}(t)$ $(N = 1, 2, \ldots)$ and $\left[1 - G(t)^N\right]/N$ decreases with $N$ from $\overline{G}(t)$ to 0, we easily have

$$ \frac{\int_0^\infty G(t)^N\,dF(t)}{\int_0^\infty G^{(N)}(t)\,dF(t)} \ge \frac{(1/N)\int_0^\infty \left[1 - G(t)^N\right] dt + C_1}{1/\theta + C_1}. $$

Thus, if $C_N = N C_1$, then Bulk task is better than Tandem one, because we can begin to execute $N$ tasks simultaneously for Bulk one. However, overhead $C_N$ might be larger than $N C_1$.


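Table 2 Overhead $\widetilde{C}_2$ and mean time $L_B(2)$ when $N = 2$, $L_T(2) = L_B(2)$, $C_1 = 1$, $G(t) = 1 - e^{-\theta t}$ and $F(t) = 1 - e^{-\lambda t}$

| $\lambda$ | $\widetilde{C}_2$ ($\theta = 1.00$) | $L_B(2)$ ($\theta = 1.00$) | $\widetilde{C}_2$ ($\theta = 0.75$) | $L_B(2)$ ($\theta = 0.75$) | $\widetilde{C}_2$ ($\theta = 0.50$) | $L_B(2)$ ($\theta = 0.50$) | $\widetilde{C}_2$ ($\theta = 0.25$) | $L_B(2)$ ($\theta = 0.25$) |
|---|---|---|---|---|---|---|---|---|
| 0.1 | 2.690 | 4.840 | 2.958 | 5.994 | 3.545 | 8.640 | 5.667 | 19.600 |
| 0.05 | 2.598 | 4.410 | 2.817 | 5.310 | 3.286 | 7.260 | 4.909 | 14.400 |
| 0.01 | 2.520 | 4.080 | 2.698 | 4.792 | 3.059 | 6.242 | 4.196 | 10.816 |
| 0.005 | 2.510 | 4.040 | 2.682 | 4.729 | 3.030 | 6.121 | 4.099 | 10.404 |
| 0.001 | 2.502 | 4.008 | 2.670 | 4.679 | 3.006 | 6.024 | 4.020 | 10.080 |
| 0.0005 | 2.501 | 4.004 | 2.668 | 4.673 | 3.003 | 6.012 | 4.010 | 10.040 |
| 0.0001 | 2.500 | 4.001 | 2.667 | 4.668 | 3.001 | 6.002 | 4.002 | 10.008 |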

Example 2 When $N = 2$, $G(t) = 1 - e^{-\theta t}$ and $F(t) = 1 - e^{-\lambda t}$,

$$ L_T(2) = \frac{2(1/\theta + C_1)}{\left[\theta/(\theta + \lambda)\right]^2}, \qquad L_B(2) = \frac{3/(2\theta) + C_2}{2\theta^2/\left[(\theta + \lambda)(2\theta + \lambda)\right]}. $$

Thus, if $2\theta(\lambda + 2\theta)C_2 - 8\theta(\theta + \lambda)C_1 > 5\lambda + 2\theta$, then Tandem task is better than Bulk one.

Table 2 presents overhead $\widetilde{C}_2$ and mean time $L_B(2)$ for $\lambda$ and $\theta$ when $N = 2$, $L_T(2) = L_B(2)$ and $C_1 = 1$. For example, when $\lambda = 0.05$ and $\theta = 1.00$, $\widetilde{C}_2 = 2.598$ and $L_B(2) = L_T(2) = 4.410$. Values of $\widetilde{C}_2$ and $L_B(2)$ increase with $\lambda$ and $1/\theta$. This indicates that if $C_2 > \widetilde{C}_2$, then Tandem task is better than Bulk one.
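The closed forms of Example 2 are easy to evaluate directly. The sketch below computes $L_T(2)$, $L_B(2)$ and the break-even overhead obtained by turning the inequality of Example 2 into an equality and solving for $C_2$. The function names are illustrative, and the sample inputs are the $\lambda = 0.05$, $\theta = 1.00$, $C_1 = 1$ case discussed in the text.

```python
def tandem_mean_time(theta, lam, c1):
    """L_T(2) for exponential G and F (Example 2)."""
    return 2.0 * (1.0 / theta + c1) / (theta / (theta + lam)) ** 2

def bulk_mean_time(theta, lam, c2):
    """L_B(2) for exponential G and F (Example 2)."""
    success = 2.0 * theta**2 / ((theta + lam) * (2.0 * theta + lam))
    return (3.0 / (2.0 * theta) + c2) / success

def break_even_overhead(theta, lam, c1):
    """C2 at which L_T(2) = L_B(2), from
    2*theta*(lam + 2*theta)*C2 - 8*theta*(theta + lam)*C1 = 5*lam + 2*theta."""
    numerator = 5.0 * lam + 2.0 * theta + 8.0 * theta * (theta + lam) * c1
    return numerator / (2.0 * theta * (lam + 2.0 * theta))

theta, lam, c1 = 1.0, 0.05, 1.0
c2_tilde = break_even_overhead(theta, lam, c1)
l_tandem = tandem_mean_time(theta, lam, c1)
l_bulk = bulk_mean_time(theta, lam, c2_tilde)
```

Any bulk overhead above `c2_tilde` makes the tandem arrangement preferable for these parameters, which is the reading of Table 2 given in the text.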

3.4 Two Schemes of Checkpoints

We consider the following three schemes of checkpoints [8, p. 138]: Store-checkpoint (SCP) stores the state of the process, Compare-checkpoint (CCP) compares the state of the process that is functioning when the structure consists of more than two modules, and Compare-and-store checkpoint (CSCP) stores and compares the state of the process. Suppose that $S$ is a native execution time, and to detect errors, we provide the structure with two independent modules, where errors of each module occur at constant rate $\lambda$, i.e., $F(t) = 1 - e^{-\lambda t}$. Then, the probability that no errors occur in $(0, t]$ is given by $\overline{F}(t)^2 = e^{-2\lambda t}$. In addition, we introduce the overhead $C_2$ for the comparison of two states and $C_S$ for the comparison and the store of them, where $C_S \ge C_2$.


Fig. 4 Two schemes

Under the above assumptions, we consider two schemes in Fig. 4: In Scheme 1, the mean execution time for each interval $((k-1)T, kT]$ is, from (1),

$$ L_1(1) = (T + C_2 + C_S)\,e^{-2\lambda T} + \left[T + C_2 + L_1(1)\right]\left(1 - e^{-2\lambda T}\right). \tag{21} $$

Solving (21) with respect to $L_1(1)$,

$$ L_1(1) = \frac{T + C_2}{e^{-2\lambda T}} + C_S. $$

Thus, the mean time to complete the process is

$$ L_1(N) \equiv L_1(T) \equiv N L_1(1) = \frac{S + N C_2}{e^{-2\lambda S/N}} + N C_S = S\left[\frac{1 + C_2/T}{e^{-2\lambda T}} + \frac{C_S}{T}\right]. \tag{22} $$

In Scheme 2, the mean time for $((k-1)T, kT]$ is

$$ \begin{aligned} l_2(k) &= C_2 + T + e^{-2\lambda T}\,l_2(k+1) + \left(1 - e^{-2\lambda T}\right) l_2(1) \quad (k = 1, 2, \ldots, N-1), \\ l_2(N) &= C_2 + T + e^{-2\lambda T}\,C_S + \left(1 - e^{-2\lambda T}\right) l_2(1). \end{aligned} \tag{23} $$

Solving (23) with respect to $l_2(1)$,

$$ L_2(N) \equiv L_2(T) \equiv l_2(1) = \frac{(C_2 + S/N)\left(1 - e^{-2\lambda S}\right)}{\left(1 - e^{-2\lambda S/N}\right)e^{-2\lambda S}} + C_S = \frac{(C_2 + T)\left(1 - e^{-2\lambda S}\right)}{\left(1 - e^{-2\lambda T}\right)e^{-2\lambda S}} + C_S, \tag{24} $$

where note that $L_1(S) = L_2(S)$.


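Table 3 Optimal $N_1^*$, $N_2^*$ and mean time $L_1(N_1^*)$, $L_2(N_2^*)$ when $S = 1.0$, $C_S = 0.1$, $G(t) = 1 - e^{-t/\mu}$

| $\lambda$ | $N_1^*$ ($C_2 = 0.001$) | $L_1(N_1^*)$ ($C_2 = 0.001$) | $N_1^*$ ($C_2 = 0.005$) | $L_1(N_1^*)$ ($C_2 = 0.005$) | $N_2^*$ ($C_2 = 0.001$) | $L_2(N_2^*)$ ($C_2 = 0.001$) | $N_2^*$ ($C_2 = 0.005$) | $L_2(N_2^*)$ ($C_2 = 0.005$) |
|---|---|---|---|---|---|---|---|---|
| 0.200 | 2 | 1.424 | 2 | 1.434 | 14 | 1.365 | 6 | 1.409 |
| 0.150 | 2 | 1.364 | 2 | 1.373 | 12 | 1.295 | 6 | 1.331 |
| 0.125 | 2 | 1.335 | 2 | 1.344 | 11 | 1.262 | 5 | 1.294 |
| 0.100 | 2 | 1.307 | 2 | 1.316 | 10 | 1.229 | 5 | 1.258 |
| 0.075 | 1 | 1.263 | 1 | 1.268 | 9 | 1.198 | 4 | 1.221 |
| 0.050 | 1 | 1.206 | 1 | 1.211 | 7 | 1.167 | 3 | 1.185 |
| 0.025 | 1 | 1.152 | 1 | 1.157 | 5 | 1.136 | 2 | 1.149 |
| 0.001 | 1 | 1.103 | 1 | 1.107 | 1 | 1.103 | 1 | 1.107 |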

We find optimal $T_1^*$ and $T_2^*$ to minimize $L_1(T)$ in (22) and $L_2(T)$ in (24). Differentiating $L_1(T)$ with respect to $T$ and setting it equal to zero,

$$ e^{2\lambda T}\left[2\lambda T(T + C_2) - C_2\right] = C_S. \tag{25} $$

Thus, there exists a finite and unique $T_1^*$ $(0 < T_1^* < \infty)$ which satisfies (25). Differentiating $L_2(T)$ with respect to $T$ and setting it equal to zero,

$$ \frac{1}{2\lambda}\left(e^{2\lambda T} - 1\right) - T = C_2. \tag{26} $$

Thus, there exists a finite and unique $T_2^*$ $(0 < T_2^* < \infty)$ which satisfies (26). Using the partition method in Sect. 2, we can get optimal $N_1^*$ and $N_2^*$ to minimize $L_1(N)$ and $L_2(N)$, respectively.

Example 3 Table 3 presents optimal $N_1^*$ and $N_2^*$ and the mean times $L_1(N_1^*)$ in (22) and $L_2(N_2^*)$ in (24) for $\lambda$ and $C_2$ when $C_S = 0.1$ and $S = 1$. For example, when $\lambda = 0.100$ and $C_2 = 0.001$, $N_1^* = 2$ and $N_2^* = 10$, i.e., we should place 1 CSCP in Scheme 1 and 9 CCPs in Scheme 2. This indicates that both $N_1^*$ and $N_2^*$ increase with $\lambda$ and decrease with $C_2$.
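The two schemes can be compared numerically from (22) and (24). The sketch below is an illustration only: it minimizes both mean times by brute force over $N$ rather than by solving (25) and (26), and the function names and the bound `n_max` are hypothetical. The inputs are the $\lambda = 0.100$, $C_2 = 0.001$ case of Example 3.

```python
import math

def scheme1(n, s, lam, c2, cs):
    """Eq. (22): CSCP at the end of every interval of length S/N."""
    return (s + n * c2) * math.exp(2.0 * lam * s / n) + n * cs

def scheme2(n, s, lam, c2, cs):
    """Eq. (24): CCPs at intermediate intervals and one CSCP at time S."""
    t = s / n
    numerator = (c2 + t) * (1.0 - math.exp(-2.0 * lam * s))
    denominator = (1.0 - math.exp(-2.0 * lam * t)) * math.exp(-2.0 * lam * s)
    return numerator / denominator + cs

def argmin_n(mean_time, n_max=500, **params):
    """Brute-force search over N = 1..n_max (assumed search bound)."""
    return min(range(1, n_max + 1), key=lambda n: mean_time(n, **params))

params = dict(s=1.0, lam=0.100, c2=0.001, cs=0.1)
n1 = argmin_n(scheme1, **params)
n2 = argmin_n(scheme2, **params)
l1 = scheme1(n1, **params)
l2 = scheme2(n2, **params)
```

With a single interval the two functions coincide, which mirrors the remark $L_1(S) = L_2(S)$ below (24).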

4 Four Examples of Random Tasks

We can give four checkpoint models with random tasks by combining structures, tasks and schemes.


4.1 Model 1

We adopt a parallel structure with 2 modules, $N$ $(N = 1, 2, \ldots)$ tandem tasks, and Schemes 1 and 2 in Fig. 4, in which errors of each module occur at constant rate $\lambda$, i.e., $F(t) = 1 - e^{-\lambda t}$, and each task has an identical distribution $G(t)$. Then, for Scheme 1 in Fig. 4, the mean execution time of task $k$ is, from (21),

$$ L_1(1) = \int_0^\infty (t + C_2 + C_S)\,e^{-2\lambda t}\,dG(t) + \int_0^\infty \left[t + C_2 + L_1(1)\right]\left(1 - e^{-2\lambda t}\right) dG(t). \tag{27} $$

Solving (27) with respect to $L_1(1)$,

$$ L_1(1) = \frac{C_2 + 1/\theta}{G^*(2\lambda)} + C_S, \tag{28} $$

where $G^*(s) \equiv \int_0^\infty e^{-st}\,dG(t)$ for $\mathrm{Re}(s) \ge 0$. Thus, the mean time to complete the process is

$$ L_1(N) \equiv N L_1(1) = N\left[\frac{C_2 + 1/\theta}{G^*(2\lambda)} + C_S\right]. \tag{29} $$

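Similarly, for Scheme 2,

$$ \begin{aligned} l_2(k) &= C_2 + \frac{1}{\theta} + \int_0^\infty \left[l_2(k+1)\,e^{-2\lambda t} + l_2(1)\left(1 - e^{-2\lambda t}\right)\right] dG(t) \quad (k = 1, 2, \ldots, N-1), \\ l_2(N) &= C_2 + \frac{1}{\theta} + \int_0^\infty \left[C_S\,e^{-2\lambda t} + l_2(1)\left(1 - e^{-2\lambda t}\right)\right] dG(t). \end{aligned} \tag{30} $$

Solving (30) with respect to $l_2(1)$,

$$ L_2(N) \equiv l_2(1) = \frac{(C_2 + 1/\theta)\left\{1 - \left[G^*(2\lambda)\right]^N\right\}}{\left[1 - G^*(2\lambda)\right]\left[G^*(2\lambda)\right]^N} + C_S. \tag{31} $$

Therefore, the mean execution time per one task is

$$ \widetilde{L}_2(N) \equiv \frac{l_2(1)}{N} = \frac{(C_2 + 1/\theta)\left\{1 - \left[G^*(2\lambda)\right]^N\right\}}{N\left[1 - G^*(2\lambda)\right]\left[G^*(2\lambda)\right]^N} + \frac{C_S}{N}. \tag{32} $$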

We find optimal $N_2^*$ to minimize $\widetilde{L}_2(N)$. Forming the inequality $\widetilde{L}_2(N+1) - \widetilde{L}_2(N) \ge 0$,

$$ \frac{1}{\left[G^*(2\lambda)\right]^{N+1}} \sum_{j=1}^{N} \left\{1 - \left[G^*(2\lambda)\right]^{j}\right\} \ge \frac{C_S}{C_2 + 1/\theta}, \tag{33} $$

whose left-hand side increases strictly with $N$ to $\infty$. Thus, there exists a finite and unique minimum $N_2^*$ $(1 \le N_2^* < \infty)$ which satisfies (33). If

$$ \frac{1 - G^*(2\lambda)}{\left[G^*(2\lambda)\right]^2} \ge \frac{C_S}{C_2 + 1/\theta}, $$

then $N_2^* = 1$.

Example 4 Table 4 presents optimal $N_2^*$ and the mean times $\widetilde{L}_2(N_2^*)$ in (32) and $L_1(N_2^*)$ in (29). We compute $N_2^*$ and compare $\widetilde{L}_2(N_2^*)$ and $L_1(N_2^*)$ for $\lambda$ and $C_S$ when $C_2 = 1 \times 10^{-4}$, $C_S = 0.1$ and $1/\theta = 1$. For example, when $\lambda = 0.005$, $C_2 = 0.001$, optimal $N_2^*$ is 4. Thus, when we make 3 CCPs and 1 CSCP for 4 tasks, it is optimal and the mean execution time is 1.051. Clearly, Scheme 2 is better than Scheme 1 with 4 tasks.

Table 4 Optimal $N_2^*$ and mean time $\widetilde{L}_2(N_2^*)$ and $L_1(N_2^*)$ when $C_S = 0.1$ and $G(t) = 1 - e^{-t}$

| $\lambda$ | $N_2^*$ ($C_2 = 0.001$) | $\widetilde{L}_2(N_2^*)$ ($C_2 = 0.001$) | $L_1(N_2^*)$ ($C_2 = 0.001$) | $N_2^*$ ($C_2 = 0.005$) | $\widetilde{L}_2(N_2^*)$ ($C_2 = 0.005$) | $L_1(N_2^*)$ ($C_2 = 0.005$) | $N_2^*$ ($C_2 = 0.05$) | $\widetilde{L}_2(N_2^*)$ ($C_2 = 0.05$) | $L_1(N_2^*)$ ($C_2 = 0.05$) |
|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 1 | 1.301 | 1.301 | 1 | 1.306 | 1.306 | 1 | 1.360 | 1.360 |
| 0.05 | 1 | 1.201 | 1.201 | 1 | 1.206 | 1.206 | 1 | 1.255 | 1.255 |
| 0.01 | 3 | 1.075 | 3.363 | 3 | 1.079 | 3.375 | 3 | 1.126 | 3.513 |
| 0.005 | 4 | 1.051 | 4.444 | 4 | 1.055 | 4.460 | 4 | 1.102 | 4.642 |
| 0.001 | 10 | 1.022 | 11.030 | 10 | 1.026 | 11.070 | 10 | 1.072 | 11.521 |
| 0.0005 | 13 | 1.016 | 14.326 | 13 | 1.020 | 14.378 | 13 | 1.065 | 14.964 |
| 0.0001 | 30 | 1.007 | 33.036 | 30 | 1.011 | 33.156 | 30 | 1.057 | 34.506 |
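Inequality (33) lends itself to a direct search for $N_2^*$. The following sketch assumes exponential task times, $G(t) = 1 - e^{-\theta t}$, so that $G^*(s) = \theta/(\theta + s)$, and returns the smallest $N$ satisfying (33). The function names and the search bound `n_max` are illustrative assumptions.

```python
def lst_exponential(s, theta):
    """G*(s) for G(t) = 1 - exp(-theta * t)."""
    return theta / (theta + s)

def per_task_time_scheme2(n, g, c2, cs, theta):
    """Eq. (32) with g = G*(2*lambda)."""
    return (c2 + 1.0 / theta) * (1.0 - g**n) / (n * (1.0 - g) * g**n) + cs / n

def total_time_scheme1(n, g, c2, cs, theta):
    """Eq. (29) with g = G*(2*lambda)."""
    return n * ((c2 + 1.0 / theta) / g + cs)

def optimal_n2(g, c2, cs, theta, n_max=10_000):
    """Smallest N for which inequality (33) holds."""
    rhs = cs / (c2 + 1.0 / theta)
    partial_sum = 0.0
    for n in range(1, n_max + 1):
        partial_sum += 1.0 - g**n
        if partial_sum / g ** (n + 1) >= rhs:
            return n
    raise ValueError("inequality (33) not satisfied up to n_max")

theta, lam, c2, cs = 1.0, 0.005, 0.001, 0.1
g = lst_exponential(2.0 * lam, theta)
n2_star = optimal_n2(g, c2, cs, theta)
l2_per_task = per_task_time_scheme2(n2_star, g, c2, cs, theta)
l1_total = total_time_scheme1(n2_star, g, c2, cs, theta)
```

For these inputs the result is in line with the $\lambda = 0.005$, $C_2 = 0.001$ row of Table 4.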

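4.2 Model 2

We consider 3 schemes with 4 tasks in Fig. 5 for double modules, and discuss which schemes are better. When $G(t) = 1 - e^{-\theta t}$ and $F(t) = 1 - e^{-\lambda t}$, the mean execution times of each scheme are obtained as follows: Let $l_k$ $(k = 1, 2, 3, 4)$ be the mean time for task $k$ to complete the final task 4.

Fig. 5 4 tasks with 2 CSCP

For Scheme 1,

$$ \begin{aligned} l_1 &= \int_0^\infty (C_S + t + l_2)\,e^{-2\lambda t}\,dG(t) + \int_0^\infty (C_2 + t + l_1)\left(1 - e^{-2\lambda t}\right) dG(t), \\ l_2 &= \int_0^\infty (C_S + t)\,e^{-2\lambda t}\,dG^{(3)}(t) + \int_0^\infty (C_2 + t + l_2)\left(1 - e^{-2\lambda t}\right) dG^{(3)}(t). \end{aligned} $$

For Scheme 2,

$$ \begin{aligned} l_1 &= \int_0^\infty (C_S + t + l_3)\,e^{-2\lambda t}\,dG^{(2)}(t) + \int_0^\infty (C_2 + t + l_1)\left(1 - e^{-2\lambda t}\right) dG^{(2)}(t), \\ l_3 &= \int_0^\infty (C_S + t)\,e^{-2\lambda t}\,dG^{(2)}(t) + \int_0^\infty (C_2 + t + l_3)\left(1 - e^{-2\lambda t}\right) dG^{(2)}(t). \end{aligned} $$

For Scheme 3,

$$ \begin{aligned} l_1 &= \int_0^\infty (C_S + t + l_4)\,e^{-2\lambda t}\,dG^{(3)}(t) + \int_0^\infty (C_2 + t + l_1)\left(1 - e^{-2\lambda t}\right) dG^{(3)}(t), \\ l_4 &= \int_0^\infty (C_S + t)\,e^{-2\lambda t}\,dG(t) + \int_0^\infty (C_2 + t + l_4)\left(1 - e^{-2\lambda t}\right) dG(t). \end{aligned} $$

Solving the above equations with respect to $l_1$, respectively,

$$ \begin{aligned} L_1 = L_3 &= 2C_S + \frac{1/\theta + C_2\left[1 - G^*(2\lambda)\right]}{G^*(2\lambda)} + \frac{3/\theta + C_2\left[1 - G^*(2\lambda)^3\right]}{G^*(2\lambda)^3}, \\ L_2 &= 2\left\{C_S + \frac{2/\theta + C_2\left[1 - G^*(2\lambda)^2\right]}{G^*(2\lambda)^2}\right\}. \end{aligned} $$

Comparing $L_1$ and $L_2$, we easily have $L_2 < L_1$ because

$$ \frac{3/\theta + C_2}{G^*(2\lambda)^3} + \frac{1/\theta + C_2}{G^*(2\lambda)} > \frac{2(2/\theta + C_2)}{G^*(2\lambda)^2}. $$

This means that we should place the checkpoint at the middle point of the number of tasks.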
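The comparison of the three schemes in Model 2 can be checked numerically. In the sketch below, each CSCP-terminated block of $m$ tasks contributes $C_S + \{m/\theta + C_2[1 - G^*(2\lambda)^m]\}/G^*(2\lambda)^m$, which is how the closed forms for $L_1 = L_3$ and $L_2$ are assembled. The helper names and parameter values are illustrative.

```python
def model2_times(g, theta, c2, cs):
    """Return (L1, L2) of Model 2, where g = G*(2*lambda).

    L1 (= L3) splits the 4 tasks into blocks of 1 and 3 tasks,
    L2 splits them into two blocks of 2 tasks.
    """
    def block(m):
        # Mean time of one block of m tasks closed by a CSCP.
        return cs + (m / theta + c2 * (1.0 - g**m)) / g**m

    l1 = block(1) + block(3)
    l2 = 2.0 * block(2)
    return l1, l2

theta, lam, c2, cs = 1.0, 0.05, 0.001, 0.1
g = theta / (theta + 2.0 * lam)  # G*(2*lambda) for exponential task times
l1, l2 = model2_times(g, theta, c2, cs)
gap = l1 - l2
```

The difference `gap` equals the left-hand side minus the right-hand side of the inequality used in the text to conclude $L_2 < L_1$.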

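4.3 Model 3

(1) Double Modules
We consider 6 tasks with 9 schemes in Fig. 6. The mean execution times of each scheme for 6 tandem tasks with a distribution $G(t)$ when $F(t) = 1 - e^{-\lambda t}$ are obtained as follows:

Fig. 6 6 Tasks with CCP and CSCP

$$ \begin{aligned} L_1 &= C_S - C_2 + \frac{C_2 + 6/\theta}{G^*(2\lambda)^6}, \\ L_2 &= 2\left[C_S - C_2 + \frac{C_2 + 3/\theta}{G^*(2\lambda)^3}\right], \\ L_3 &= 3\left[C_S - C_2 + \frac{C_2 + 2/\theta}{G^*(2\lambda)^2}\right], \\ L_4 &= 6\left[C_S - C_2 + \frac{C_2 + 1/\theta}{G^*(2\lambda)}\right], \\ L_5 &= C_S - C_2 + \frac{(C_2 + 3/\theta)\left[1 + G^*(2\lambda)^3\right]}{G^*(2\lambda)^6}, \\ L_6 &= C_S - C_2 + \frac{(C_2 + 2/\theta)\left[1 + G^*(2\lambda)^2 + G^*(2\lambda)^4\right]}{G^*(2\lambda)^6}, \\ L_7 &= C_S - C_2 + \frac{(C_2 + 1/\theta)\left[1 + G^*(2\lambda) + G^*(2\lambda)^2 + G^*(2\lambda)^3 + G^*(2\lambda)^4 + G^*(2\lambda)^5\right]}{G^*(2\lambda)^6}, \\ L_8 &= 2\left\{C_S - C_2 + \frac{(C_2 + 1/\theta)\left[1 + G^*(2\lambda) + G^*(2\lambda)^2\right]}{G^*(2\lambda)^3}\right\}, \\ L_9 &= 3\left\{C_S - C_2 + \frac{(C_2 + 1/\theta)\left[1 + G^*(2\lambda)\right]}{G^*(2\lambda)^2}\right\}. \end{aligned} $$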

For example, $L_1$ and $L_7$ are obtained as follows: Let $l_k$ $(k = 1, 2, 3, 4, 5, 6)$ be the mean execution time for task $k$ to complete the final task 6. For Scheme 1,

$$ \begin{aligned} l_1 &= \int_0^\infty (C_S + t)\,e^{-2\lambda t}\,dG^{(6)}(t) + \int_0^\infty (C_2 + t + l_1)\left(1 - e^{-2\lambda t}\right) dG^{(6)}(t) \\ &= (C_S - C_2)\,G^*(2\lambda)^6 + C_2 + \frac{6}{\theta} + l_1\left[1 - G^*(2\lambda)^6\right]. \end{aligned} $$

Solving it with respect to $l_1$,

$$ L_1 \equiv l_1 = C_S - C_2 + \frac{C_2 + 6/\theta}{G^*(2\lambda)^6}. $$

For Scheme 7,

$$ \begin{aligned} l_k &= \int_0^\infty (C_2 + t + l_{k+1})\,e^{-2\lambda t}\,dG(t) + \int_0^\infty (C_2 + t + l_1)\left(1 - e^{-2\lambda t}\right) dG(t) \quad (k = 1, 2, 3, 4, 5), \\ l_6 &= \int_0^\infty (C_S + t)\,e^{-2\lambda t}\,dG(t) + \int_0^\infty (C_2 + t + l_1)\left(1 - e^{-2\lambda t}\right) dG(t). \end{aligned} $$

Solving it with respect to $l_1$,

$$ L_7 \equiv l_1 = C_S - C_2 + \frac{(C_2 + 1/\theta)\left[1 + G^*(2\lambda) + G^*(2\lambda)^2 + G^*(2\lambda)^3 + G^*(2\lambda)^4 + G^*(2\lambda)^5\right]}{G^*(2\lambda)^6}. $$

Example 5 Table 5 presents $L_1 \sim L_9$ for $\lambda$ when $C_S = 0.1$, $C_2 = 0.001$ and $G(t) = 1 - e^{-t}$, using double modules. When $\lambda = 0.1$, $L_4$ is the smallest and when $\lambda = 0.001$, $L_7$ is the smallest. Table 6 presents $L_1 \sim L_9$ for $C_2$ when $C_S = 0.1$ and $G(t) = 1 - e^{-t}$. Comparing Tables 5 and 6, when $\lambda$ becomes smaller, $L_1 \sim L_9$ become smaller, and when $C_2$ becomes smaller, $L_1 \sim L_9$ become smaller. Thus, to improve the mean execution time, we need to decrease the error rate $\lambda$ and the overhead $C_2$.

Table 5 Mean times of $L_1 \sim L_9$ when $C_S = 0.1$, $C_2 = 0.001$ and $G(t) = 1 - e^{-t}$

| $\lambda$ | $L_1$ | $L_2$ | $L_3$ | $L_4$ | $L_5$ | $L_6$ | $L_7$ | $L_8$ | $L_9$ |
|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 18.018 | 10.570 | 8.941 | 7.801 | 14.246 | 13.105 | 12.027 | 8.943 | 8.225 |
| 0.05 | 10.730 | 8.187 | 7.561 | 7.201 | 9.410 | 8.995 | 8.595 | 7.487 | 7.234 |
| 0.01 | 6.857 | 6.567 | 6.543 | 6.720 | 6.663 | 6.600 | 6.540 | 6.447 | 6.484 |
| 0.005 | 6.469 | 6.382 | 6.421 | 6.660 | 6.377 | 6.347 | 6.319 | 6.325 | 6.393 |
| 0.001 | 6.172 | 6.236 | 6.324 | 6.612 | 6.155 | 6.150 | 6.147 | 6.228 | 6.321 |
| 0.0005 | 6.136 | 6.218 | 6.312 | 6.606 | 6.128 | 6.126 | 6.126 | 6.216 | 6.312 |

Table 6 Mean times of $L_1 \sim L_9$ when $C_S = 0.1$, $\lambda = 0.05$ and $G(t) = 1 - e^{-t}$

| $C_2$ | $L_1$ | $L_2$ | $L_3$ | $L_4$ | $L_5$ | $L_6$ | $L_7$ | $L_8$ | $L_9$ |
|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 10.768 | 8.219 | 7.592 | 7.230 | 9.513 | 9.164 | 8.962 | 7.746 | 7.427 |
| 0.01 | 10.737 | 8.193 | 7.566 | 7.206 | 9.429 | 9.026 | 8.662 | 7.535 | 7.269 |
| 0.005 | 10.733 | 8.189 | 7.563 | 7.203 | 9.418 | 9.009 | 8.625 | 7.508 | 7.250 |
| 0.001 | 10.730 | 8.187 | 7.561 | 7.201 | 9.410 | 8.995 | 8.595 | 7.487 | 7.234 |

(2) Triple Majority Module
Consider a majority decision structure with 3 modules, i.e., a 2-out-of-3 structure. When the states of two or more of the three modules agree with each other, the process in this interval is correct. Then, the correct probability of the process in the interval $((k-1)t, kt]$ is given in (14). Thus, replacing $G^*(2\lambda)$ in $L_k$ $(k = 1, 2, \ldots, 9)$ with

$$ \int_0^\infty \left(3e^{-2\lambda t} - 2e^{-3\lambda t}\right) dG(t) = 3G^*(2\lambda) - 2G^*(3\lambda), $$

we easily have $L_k$. For example,

$$ \begin{aligned} L_3 &= 3\left\{C_S - C_2 + \frac{C_2 + 2/\theta}{\left[3G^*(2\lambda) - 2G^*(3\lambda)\right]^2}\right\}, \\ L_9 &= 3\left\{C_S - C_2 + \frac{(C_2 + 1/\theta)\left[1 + 3G^*(2\lambda) - 2G^*(3\lambda)\right]}{\left[3G^*(2\lambda) - 2G^*(3\lambda)\right]^2}\right\}. \end{aligned} $$

For general modules given in (8), we may replace $G^*(2\lambda)$ with

$$ \sum_{k=1}^{n} P_k \binom{n}{k} \int_0^\infty \overline{F}(t)^{k} F(t)^{n-k}\,dG(t). \tag{34} $$

Example 6 Table 7 presents $L_1 \sim L_9$ for $\lambda$ when $C_S = 0.1$, $C_2 = 0.001$ and $G(t) = 1 - e^{-t}$, using triple majority modules. Table 8 presents $L_1 \sim L_9$ for $C_2$ when $C_S = 0.1$ and $G(t) = 1 - e^{-t}$. Comparing Tables 5 and 7, all mean times in Table 5 are larger than those in Table 7; however, when $\lambda$ is small, the mean times in the two tables are almost the same. In general, $C_2$ for triple modules would be larger than that for double modules. Comparing Tables 6 and 8, all mean times in Table 8 are smaller than those in Table 6. We should select a triple module under these conditions.
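The nine closed forms of Model 3 differ only in how the geometric sums of the success transform are grouped, so they can be evaluated with one helper. The sketch below assumes exponential task times, $G^*(s) = \theta/(\theta + s)$, and uses $G^*(2\lambda)$ for double modules and $3G^*(2\lambda) - 2G^*(3\lambda)$ for triple majority modules, as described above. All function and variable names are illustrative.

```python
def lst_exp(s, theta):
    """G*(s) for G(t) = 1 - exp(-theta * t)."""
    return theta / (theta + s)

def scheme_times(g, theta, c2, cs):
    """Return [L_1, ..., L_9] of Model 3 for a success transform g."""
    base = cs - c2

    def geo(step, terms):
        # 1 + g^step + g^(2*step) + ... with `terms` terms.
        return sum(g ** (step * i) for i in range(terms))

    return [
        base + (c2 + 6 / theta) / g**6,                    # L_1
        2 * (base + (c2 + 3 / theta) / g**3),              # L_2
        3 * (base + (c2 + 2 / theta) / g**2),              # L_3
        6 * (base + (c2 + 1 / theta) / g),                 # L_4
        base + (c2 + 3 / theta) * geo(3, 2) / g**6,        # L_5
        base + (c2 + 2 / theta) * geo(2, 3) / g**6,        # L_6
        base + (c2 + 1 / theta) * geo(1, 6) / g**6,        # L_7
        2 * (base + (c2 + 1 / theta) * geo(1, 3) / g**3),  # L_8
        3 * (base + (c2 + 1 / theta) * geo(1, 2) / g**2),  # L_9
    ]

theta, lam, c2, cs = 1.0, 0.1, 0.001, 0.1
g_double = lst_exp(2 * lam, theta)
g_triple = 3 * lst_exp(2 * lam, theta) - 2 * lst_exp(3 * lam, theta)
double = scheme_times(g_double, theta, c2, cs)
triple = scheme_times(g_triple, theta, c2, cs)
best_double = 1 + min(range(9), key=lambda i: double[i])  # scheme index, 1-based
```

For $\lambda = 0.1$ the double-module values are in line with the first row of Table 5, with Scheme 4 giving the smallest mean time.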


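Table 7 Mean times $L_1 \sim L_9$ when $C_S = 0.1$, $C_2 = 0.001$ and $G(t) = 1 - e^{-t}$ for triple majority modules

| $\lambda$ | $L_1$ | $L_2$ | $L_3$ | $L_4$ | $L_5$ | $L_6$ | $L_7$ | $L_8$ | $L_9$ |
|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 7.692 | 6.949 | 6.790 | 6.840 | 7.272 | 7.136 | 7.004 | 6.697 | 6.668 |
| 0.05 | 6.545 | 6.419 | 6.445 | 6.672 | 6.433 | 6.397 | 6.362 | 6.349 | 6.412 |
| 0.01 | 6.121 | 6.210 | 6.307 | 6.603 | 6.116 | 6.116 | 6.117 | 6.211 | 6.308 |
| 0.005 | 6.105 | 6.203 | 6.302 | 6.601 | 6.105 | 6.106 | 6.108 | 6.206 | 6.304 |
| 0.001 | 6.100 | 6.200 | 6.300 | 6.600 | 6.101 | 6.102 | 6.105 | 6.204 | 6.303 |
| 0.0005 | 6.100 | 6.200 | 6.300 | 6.600 | 6.101 | 6.102 | 6.105 | 6.204 | 6.303 |

Table 8 Mean times $L_1 \sim L_9$ when $C_S = 0.1$, $\lambda = 0.05$ and $G(t) = 1 - e^{-t}$ for triple majority modules

| $C_2$ | $L_1$ | $L_2$ | $L_3$ | $L_4$ | $L_5$ | $L_6$ | $L_7$ | $L_8$ | $L_9$ |
|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 6.549 | 6.422 | 6.448 | 6.676 | 6.487 | 6.502 | 6.620 | 6.552 | 6.564 |
| 0.01 | 6.546 | 6.419 | 6.446 | 6.673 | 6.443 | 6.416 | 6.410 | 6.387 | 6.440 |
| 0.005 | 6.546 | 6.419 | 6.445 | 6.672 | 6.437 | 6.405 | 6.383 | 6.366 | 6.424 |
| 0.001 | 6.545 | 6.419 | 6.445 | 6.672 | 6.433 | 6.397 | 6.362 | 6.349 | 6.412 |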

5 Conclusion

We have proposed the optimal design of checkpoint systems with general structures, tasks and schemes for random checkpoint models. In particular, we have considered several useful checkpoint systems, obtained their mean execution times and compared them theoretically and numerically. The results would be useful for designers of systems that need highly reliable, high-speed computing. By choosing an appropriate system from this chapter, system designers will be able to build optimal checkpoint systems. Furthermore, if the system failure rates are known, they can easily choose an optimal kind of checkpoint scheme and obtain its checkpoint times. This study can be applied to several systems with high reliability requirements such as spacecraft, stock exchange systems, aircraft, drones and so on, because such systems need to provide high reliability and performance by implementing recovery methods and redundant techniques.

References

1. Abd-El-Barr M (2007) Reliable and fault-tolerant. Imperial College Press, London
2. Ito K, Nakagawa T (2019) Reliability properties of K-out-of-N: G systems. In: Ram M, Dohi T (eds) Systems engineering. CRC Press, pp 25–40
3. Ito K, Zhao X, Nakagawa T (2017) Random number of units for K-out-of-n systems. Appl Math Model 45:563–572
4. Kim H, Shin KG (1996) Design and analysis of an optimal instruction-retry policy for TMR controller computers. IEEE Trans Comput 45(11):1217–1225
5. Lee PA, Anderson T (1990) Fault tolerance principles and practice. Dependable computing and fault-tolerant systems. Springer, Wien
6. Nakagawa S, Fukumoto S, Ishii N (2003) Optimal checkpointing intervals of three error detection schemes by a double modular redundancy. Math Comput Model 38:1357–1363
7. Nakagawa S, Okuda Y, Yamada S (2003) Optimal checkpointing interval for task duplication with spare processing. In: Ninth ISSAT international conference on reliability and quality in design, Honolulu, Hawaii, vol 2003, pp 215–219
8. Nakagawa T (2008) Advanced reliability models and maintenance policies. Springer, London
9. Naruse K, Nakagawa T (2020) Optimal checkpoint intervals, schemes and structures for computing modules. In: Pham H (ed) Reliability and statistical computing. Springer, pp 265–287
10. Ohara M, Suzuki R, Arai M, Fukumoto S, Iwasaki K (2006) Analytical model on hybrid state saving with a limited number of checkpoints and bound rollbacks (reliability, maintainability and safety analysis). IEICE Trans Fundam Electron Commun Comput Sci 89(9):2386–2395
11. Pradhan DK, Vaidya NH (1992) Rollforward checkpointing scheme: concurrent retry with nondedicated spares. IEEE Computer Society Press, pp 166–174
12. Ram M, Dohi T (2019) Systems engineering: reliability analysis using k-out-of-n structures. CRC Press
13. Siewiorek DP, Swarz RS (eds) (1982) The theory and practice of reliable system design. Digital Press, Bedford, Massachusetts
14. Ziv A, Bruck J (1997) Performance optimization of checkpointing schemes with task duplication. IEEE Trans Comput 46:1381–1386
15. Ziv A, Bruck J (1998) Analysis of checkpointing schemes with task duplication. IEEE Trans Comput 47:222–227

Series System Reliability: An Unified Approach Using Fatal Shock and Stress Models Ricardo Puziol de Oliveira, Marcos Vinicius de Oliveira Peres, Wesley Bertoli, and Jorge Alberto Achcar

Abstract This chapter introduces a new approach to estimate the reliability and series system reliability functions based on the structure of shock and stress models. Conceptually, shock may refer to a deterioration process that destroys the entire system. Conversely, stress can be related to two components kept working independently, both having maintenance scheduled at a fixed time. In this context, we introduce multivariate models derived from those structures to evaluate the reliability function of an n-component series system.

1 Introduction

Systems' reliability is not limited to the study of individual components' reliability. Better models that imply better inferences to study the reliability of component systems, often structured in different configurations, are important, especially in applications across manufacturing, engineering and management sectors [1–11]. As emphasized by [12], a complex system is often used to perform a task that simpler systems could have done but with lower reliabilities. Depending on the assumed structure, if one has many components working simultaneously, then the system's overall reliability may decrease, and the maintenance would be more complex. According to [13], the calculations to assess the reliability of a system should be based on two essential operations:

R. P. de Oliveira (B) · M. V. de Oliveira Peres Maringá State University, Maringá, Brazil W. Bertoli Federal University of Technology, Curitiba, Paraná, Brazil e-mail: [email protected] J. A. Achcar University of São Paulo, São Paulo, Brazil e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_5


R. P. de Oliveira et al.

1. A reliability measurement, as precise as possible, of the components used in the system environment is needed;
2. The calculation of the reliability of some complex combinations of these components is also needed.

In this way, our primary goal is to obtain better inferences for system reliability originated by block diagrams that depict the functional relationship of various elements in a physical system. For example, a block diagram having two blocks is called a series system, usually assumed in many engineering experiments. Such a configuration has the following property: if a single component fails, then the entire system fails. In probabilistic terms, we say that each system-related component response is represented by a random variable that could be binary (fail/no-fail) or positive real-valued (lifetime). In such a context, it is possible to evaluate the system reliability by estimating the overall probability of failure or the survival function [12, 14].

Several studies related to series system reliability usually assume that the components' lifetimes are not dependent. However, in many practical engineering situations, such an assumption is far from reasonable. For instance, two specific system components may be subject to the same set of stresses/fatal shocks or may receive the same load/voltages. Besides, the components may share the same working environment, so one may expect them to perform similarly. Thus, when obtaining the reliability of 2-component systems, it is crucial to adopt statistical models that accommodate dependence structures between the components' lifetimes (e.g., bivariate models) [15]. In many practical applications, not considering the dependence as a source of variation could affect the assessment of the entire system's reliability [16–19]. Therefore, various bivariate distributions could be assumed for modeling correlated lifetimes of a series system.
Notably, the Bivariate Exponential distributions have been widely used for estimating systems' reliability [20–28]. A system's reliability is typically obtained as the product of reliabilities at a fixed time $t_i = \min\{t_i\}$ $(i = 1, \ldots, n)$. Under the independence assumption, and considering a 2-component series system as a special case, one can assume two random variables $T_1$ and $T_2$ following the same distribution, but with different parameters. Usually, $T_1$ and $T_2$ are univariate lifetimes of a series system and so we have $t_i = \min\{t_{1i}, t_{2i}\}$. Several lifetime models could be chosen for each component of a series system. For example, $T_1$ and $T_2$ could be Weibull or Exponential random variables. A mixture of these distributions ($T_1$ following a Weibull and $T_2$ following an Exponential distribution) or a random field discretization can also be considered [29–41]. However, there may be a deterioration process damaging both components simultaneously. In this context, the lifetimes' dependence cannot be ignored, as it is strongly related to the deterioration process. In this case, it is highly recommended to adopt a bivariate distribution for the inferential process.

In the present study, we introduce a time-dependent framework for a deterioration process using multivariate models for an n-component series system. Such an approach can provide better estimates for the system reliability instead of adopting independent univariate distributions for each component. Besides, since obtaining a system's reliability function based on component reliabilities poses a great challenge, using probabilistic models, especially multivariate ones, can be helpful to determine system failure rates and warranty periods accurately.

2 Methodology

2.1 Series Systems with n Components

Firstly, we consider a 2-component series system whose components' lifetimes are given, respectively, by $T_1$ and $T_2$. Thus, the system's reliability function at a fixed time $t$ is given by

$$ R(t) = P(\min\{T_1, T_2\} > t). $$

Typically, the lifetimes $T_1$ and $T_2$ are correlated, which has implications for evaluating the system reliability [13, 31, 32, 42]. Under the dependence assumption, the reliability function of the system at a fixed time $t$ is given by

$$ R(t) = P(T > t) = P(T_1 > t, T_2 > t) = S(t, t), $$

where $S(t, t)$ denotes the joint survival function (sf) and $T = \min\{T_1, T_2\}$, that is,

$$ S(t, t) = 1 - F_1(t) - F_2(t) + F(t, t), $$

where $F_j(t_j) = P(T_j \le t_j)$ $(j = 1, 2)$ is the marginal cumulative distribution function (cdf) of $T_j$ and $F(t, t)$ is the joint cdf. By the induction principle, the reliability function of the system at a fixed time $t$, assuming $n$ dependent components, is simply given by

$$ R(t) = P(T > t) = P(T_1 > t, \ldots, T_n > t) = S(t, \ldots, t), $$

where $S(t, \ldots, t)$ represents the multivariate sf and $T = \min\{T_1, \ldots, T_n\}$. By ignoring the dependence structure, one can simply evaluate the system reliability by the product law of reliabilities.

2.2 Fatal Shock and Stress Methods

The shock model structure introduced by [22] presupposes three existing independent shock sources in the environment where a 2-component system is placed. Following [43], such a structure is defined as follows.


Definition 1 ([43]) Suppose independent Poisson processes W1 (t, δ 1 ), W2 (t, δ 2 ), and W3 (t, δ 3 ) represent the occurrence number of fatal shocks for a 2-component system. Thus, 1. W1 (t, δ 1 ) represents the fatal shocks transmitted to component 1; 2. W2 (t, δ 2 ) represents the fatal shocks transmitted to component 2; 3. W3 (t, δ 3 ) represents the fatal shocks transmitted equally and independently to both components. If X = min{W1 , W3 } and Y = min{W2 , W3 } denote, respectively, the lifetimes of components 1 and 2, then the probability of the system is working until the occurrence of an overall failure is given by P(X > x, Y > y) = P(W1 (x) = 0, W2 (y) = 0, W3 (max{x, y}) = 0) = P(min{W1 , W3 } > x, min{W2 , W3 } > y) = P({W1 > x, W3 > x}, {W2 > y, W3 > y}) = P(W1 > x, W2 > y, W3 > z), where z = max{x, y}. Once W j ( j = 1, 2, 3) are mutually independent, we have that P(X > x, Y > y) = P(W1 > x)P(W2 > y)P(W3 > z). in which the dependence structure of the random variables X and Y is related to the common source of shock. By the induction principle, we can easily generalize the Definition 1 to a ncomponent series system. However, let us first introduce a 3-component system in Definition 2 (see [43]). In this case, one can notice that seven Poisson processes representing the fatal shocks. If we consider a 4-component system, we will have 15 Poisson processes for the shocks and so on. In general, for a n-component system, we can assume 2n − 1 Poisson processes for the shocks. Such a quantity denotes the number of extra model parameters that need to be estimated to characterize the full system’s reliability. Definition 2 ([43]) For a 3-component system and following the same structure for a 2-component series system, we have: 1. W1 (t, δ 1 ), W2 (t, δ 2 ), and W3 (t, δ 3 ) represents the fatal shocks, respectively, to components 1, 2, and 3; 2. 
W12 (t, δ 12 ), W13 (t, δ 13 ), and W23 (t, δ 23 ) represents the fatal shocks, respectively, to the component pairs (1, 2), (1, 3), and (2, 3);

Series System Reliability: An Unified Approach …

97

3. W123 (t, δ 123 ) represents an overall fatal shock to components 1, 2, and 3. If X = min{W1 , W12 , W13 , W123 }, Y = min{W2 , W12 , W23 , W123 }, and Z = min{W3 , W13 , W23 , W123 }, then the probability of the system is working until the occurrence of an overall failure is given by P(X > x, Y > y, Z > z) = P(W1 (x) = 0, W2 (y) = 0, W3 (z) = 0, W12 (max{x, y}) = 0, W13 (max{x, z}) = 0, W23 (max{y, z}) = 0, W123 (max{x, y, z}) = 0). Once the random variables W j ( j = 1, 2, 3), Wi j (i = 1, 2), and W123 are mutually independent, we have that P(X > x, Y > y, Z > z) = P(W1 > x)P(W2 > y)P(W3 > z) × P(W12 > max{x, y}) × P(W13 > max{x, z}) × P(W23 > max{y, z}) × P(W123 > max{x, y, z}). Since in the n-dimensional space, similar arguments hold, we have that: P(X 1 > x1 , . . . , X n > xn ) = P(W1 > x1 ) · · · P(Wn > xn ) × P(W12 > max{x1 , x2 }) × ... × P(W123 > max{x1 , x2 , x3 }) × ... × P(W12...n > max{x1 , . . . , xn }) For the structure of a stress model, also introduced by [12, 22], it is assumed that a 2-component system is subject to individual and independent stresses, say U1 and U2 . Besides, the authors suppose that the system has an overall stress source, U3 , which is transmitted to both components equally and independently. The stresses for the components are X 1 = max{U1 , U3 } and X 2 = max{U2 , U3 }. Similar arguments hold for the multivariate sf of the shock model. Here, the difference in the estimation of the reliability is, in the case of 2-component, that the reliability function of the system (T = min{X 1 , X 2 }), under the dependence assumption, is given by R(t) = R1 (t)R2 (t) + R3 (t) − R1 (t)R2 (t)R3 (t),


where $R_i(t)$ ($i = 1, 2, 3$) is the survival function (sf) of the stress $U_i$. Notably, the system reliability functions based on stress models differ substantially from those derived from fatal shock models.
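The two structures can be compared numerically with a short sketch. It assumes exponential waiting times for the shocks, and the rates and the helper names (`shock_sources`, `shock_survival`, `stress_reliability`) are hypothetical choices made only for illustration.

```python
import math
from itertools import combinations


def shock_sources(n):
    """All non-empty subsets of components 1..n; each is one shock source."""
    comps = range(1, n + 1)
    return [frozenset(c) for k in comps for c in combinations(comps, k)]


def shock_survival(x, rates):
    """P(X_1 > x_1, ..., X_n > x_n) under exponential fatal shocks.

    x     : thresholds, x[i - 1] belongs to component i
    rates : dict {shock source (frozenset of components): exponential rate}
            (the exponential form is an assumption of this sketch)
    """
    log_s = 0.0
    for source, rate in rates.items():
        # No shock from this source before the largest threshold it affects.
        log_s -= rate * max(x[i - 1] for i in source)
    return math.exp(log_s)


def stress_reliability(r1, r2, r3):
    """R(t) = R1 R2 + R3 - R1 R2 R3, given the values R_i(t)."""
    return r1 * r2 + r3 - r1 * r2 * r3


# Hypothetical rates: individual shocks plus one common shock.
rates = {frozenset({1}): 0.07, frozenset({2}): 0.05, frozenset({1, 2}): 0.01}
print(len(shock_sources(3)), len(shock_sources(4)))
print(shock_survival([10.0, 20.0], rates))
```

The number of shock sources returned for n components is the number of non-empty subsets, which is the parameter count discussed after Definition 1.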

2.3 Inference Methods for a Series System

Without loss of generality, suppose a series system with two components; the procedure for an n-component system is analogous. In general, the existing literature on the reliability of 2-component series systems is based only on the failure times and the indication of which component failed. Specifically, we have the complete lifetime data for one component but only partial information for the other, since it may not have failed yet. In this case, we have that $T = \min\{T_1, T_2\}$. Now, given a random sample $(T_1, \ldots, T_m)$ of size $m$ from a series system with two components, one can define the indicator variable

$$\delta_i = \begin{cases} 1, & \text{for } T_{1i} < T_{2i}, \\ 0, & \text{for } T_{1i} \ge T_{2i}. \end{cases}$$

Hence, the individual contributions of $T_{1i}$ and $T_{2i}$ to the likelihood function of the parameter vector $\beta$ are given by

$$P(T_{1i} = t_{1i}, T_{2i} > t_{2i}) = -\frac{\partial S(t_{1i}, t_{2i})}{\partial t_{1i}}, \quad \text{if } \delta_i = 1,$$

and

$$P(T_{1i} > t_{1i}, T_{2i} = t_{2i}) = -\frac{\partial S(t_{1i}, t_{2i})}{\partial t_{2i}}, \quad \text{if } \delta_i = 0.$$

Therefore, the full likelihood function of vector $\beta$ (see [44]) is given by

$$L(\beta) = \prod_{i=1}^{m} \left( -\frac{\partial S(t_{1i}, t_{2i})}{\partial t_{1i}} \right)^{\delta_i} \left( -\frac{\partial S(t_{1i}, t_{2i})}{\partial t_{2i}} \right)^{1 - \delta_i}. \tag{1}$$

Note that Eq. (1) could have no compact form depending on the probability distributions adopted for each component. This implies that the maximum likelihood estimator and the Fisher information matrix of β could only be obtained using optimization algorithms. Nevertheless, from a fully Bayesian perspective, one may use a Gibbs Sampling (GS) algorithm, which is a Markov Chain Monte Carlo (MCMC) method (see [45, 46]) for obtaining posterior inferences.
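A minimal sketch of the log of Eq. (1) follows. It assumes that both arguments of $S$ are set to the observed system time $t_i$, uses a central finite difference for the partial derivatives, and takes independent exponential lifetimes as a smooth illustrative survival function. The names `neg_partial`, `log_likelihood`, and `S_indep` are hypothetical.

```python
import math


def neg_partial(S, t1, t2, which, h=1e-6):
    """Central-difference estimate of -dS/dt_which at (t1, t2)."""
    if which == 1:
        return -(S(t1 + h, t2) - S(t1 - h, t2)) / (2 * h)
    return -(S(t1, t2 + h) - S(t1, t2 - h)) / (2 * h)


def log_likelihood(S, times, deltas):
    """log L = sum_i [delta_i log(-dS/dt1) + (1 - delta_i) log(-dS/dt2)],
    evaluated at t1 = t2 = observed system failure time (assumption)."""
    total = 0.0
    for t, d in zip(times, deltas):
        which = 1 if d == 1 else 2
        total += math.log(neg_partial(S, t, t, which))
    return total


def S_indep(t1, t2, b1=0.07, b2=0.05):
    """Illustrative smooth survival function: independent exponentials."""
    return math.exp(-b1 * t1 - b2 * t2)


times = [3.0, 8.0, 1.5]   # hypothetical system failure times
deltas = [1, 0, 1]        # 1 if component 1 failed first
print(log_likelihood(S_indep, times, deltas))
```

For the fatal shock models, $S$ has a kink along $t_1 = t_2$ because of the $\max$ term, so a one-sided derivative (or the analytic expression) would be needed instead of the central difference used here.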


We have considered the GS algorithm to generate pseudo-random samples from the joint posterior distribution of the model parameters. After that, we have obtained the Monte Carlo (MC) estimates under the squared error loss function. For instance, let $Z_i = (X_i, Y_i)$ be a Markov chain and $f(x, y)$ be a given joint probability density. Also, let $f(x; y)$ and $f(y; x)$ be the conditional probability density functions. In this case, the steps of the adopted GS algorithm are:

1. Choose initial values $Z_0 = (X_0, Y_0)$ and set $i = 1$;
2. Generate $X_i \sim f(x_i; Y_{i-1} = y_{i-1})$ and $Y_i \sim f(y_i; X_i = x_i)$;
3. Set $i = i + 1$ and return to Step 2.

Using this procedure, we have generated chains with N = 210,000 values for each parameter, where the first 10,000 values were discarded as the burn-in period. The convergence of the chains was monitored using trace plots and Geweke's diagnostic. Then, to obtain pseudo-independent samples from the joint posterior distribution, one of every 100 generated values was kept, which resulted in chains of size 2,000 for each model parameter. The analyses were carried out in the R software [47] using the R2jags package [48].
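The three steps above, together with the burn-in and thinning scheme, can be sketched as follows. The target here is a toy standard bivariate normal with correlation `rho`, chosen only because its conditionals are known in closed form; it is not one of the posterior distributions of this chapter, and the sketch does not reproduce the JAGS sampler used through R2jags.

```python
import random


def gibbs_bivariate_normal(rho, n_iter, burn_in, thin, seed=1):
    """Two-block Gibbs sampler for a standard bivariate normal (toy target)."""
    rng = random.Random(seed)
    sd = (1.0 - rho * rho) ** 0.5   # conditional standard deviation
    x, y = 0.0, 0.0                 # Step 1: initial values
    kept = []
    for i in range(1, n_iter + 1):
        x = rng.gauss(rho * y, sd)  # Step 2: X_i given Y_{i-1}
        y = rng.gauss(rho * x, sd)  #         Y_i given X_i
        if i > burn_in and (i - burn_in) % thin == 0:
            kept.append((x, y))     # keep one draw out of every `thin`
    return kept


draws = gibbs_bivariate_normal(0.5, n_iter=210_000, burn_in=10_000, thin=100)
print(len(draws))
```

With 210,000 iterations, a burn-in of 10,000, and a thinning interval of 100, the retained chain has (210,000 - 10,000) / 100 = 2,000 draws, which matches the chain size reported above.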

3 Probabilistic Models

This section illustrates the proposed methodology using the 2-component systems described in Definition 1 to derive probabilistic models that can be used to describe a system's reliability. The obtained models are presented in Tables 1 and 2.

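Table 1 Models derived from the fatal shock structure, with $z = \max\{x_1, x_2\}$

Model | Components | $S(x_1, x_2)$ | Marginal $S_{X_j}(x_j)$, $j = 1, 2$
BE-I | $W_1 \sim Exp(\beta_1)$, $W_2 \sim Exp(\beta_2)$, $W_3 \sim Exp(\beta_3)$ | $\exp\{-\beta_1 x_1 - \beta_2 x_2 - \beta_3 z\}$ | $\exp\{-(\beta_j + \beta_3) x_j\}$
BR-I | $W_1 \sim Ray(\beta_1)$, $W_2 \sim Ray(\beta_2)$, $W_3 \sim Ray(\beta_3)$ | $\exp\left\{-\frac{x_1^2}{2\beta_1^2} - \frac{x_2^2}{2\beta_2^2} - \frac{z^2}{2\beta_3^2}\right\}$ | $\exp\left\{-\frac{\beta_j^2 + \beta_3^2}{2\beta_j^2 \beta_3^2} x_j^2\right\}$
BL-I | $W_1 \sim Lin(\beta_1)$, $W_2 \sim Lin(\beta_2)$, $W_3 \sim Exp(\beta_3)$ | $e^{-\beta_3 z} \prod_{i=1}^{2} \left(1 + \frac{\beta_i x_i}{1 + \beta_i}\right) e^{-\beta_i x_i}$ | $\left(1 + \frac{\beta_j x_j}{1 + \beta_j}\right) e^{-\beta_j x_j - \beta_3 x_j}$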
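As a hedged illustration, the joint survival functions of the three fatal shock models can be coded directly, and their marginals recovered by setting the other argument to zero. The function names (`be1`, `br1`, `bl1`, `lindley_sf`) and the parameter values in the usage lines are hypothetical.

```python
import math


def lindley_sf(x, b):
    """Survival function of the Lindley distribution with parameter b."""
    return (1.0 + b * x / (1.0 + b)) * math.exp(-b * x)


def be1(x1, x2, b1, b2, b3):
    """BE-I: exponential individual and common shocks."""
    z = max(x1, x2)
    return math.exp(-b1 * x1 - b2 * x2 - b3 * z)


def br1(x1, x2, b1, b2, b3):
    """BR-I: Rayleigh individual and common shocks."""
    z = max(x1, x2)
    return math.exp(-x1**2 / (2 * b1**2) - x2**2 / (2 * b2**2)
                    - z**2 / (2 * b3**2))


def bl1(x1, x2, b1, b2, b3):
    """BL-I: Lindley individual shocks, exponential common shock."""
    z = max(x1, x2)
    return lindley_sf(x1, b1) * lindley_sf(x2, b2) * math.exp(-b3 * z)


# Series-system reliability at time t is S(t, t); marginal of X_1 is S(x, 0).
t = 10.0
print(be1(t, t, 0.07, 0.05, 0.01), be1(t, 0.0, 0.07, 0.05, 0.01))
```

Because a series system works only while both components work, its reliability at time $t$ under any of these models is $S(t, t)$.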

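Table 2 Models derived from the stress structure, with $z = \min\{x_1, x_2\}$

Model | Components | $F(x_1, x_2)$ | Marginal $F_{X_j}(x_j)$, $j = 1, 2$
BE-II | $W_1 \sim Exp(\beta_1)$, $W_2 \sim Exp(\beta_2)$, $W_3 \sim Exp(\beta_3)$ | $\left(1 - \exp\{-\beta_3 z\}\right) \prod_{i=1}^{2} \left(1 - \exp\{-\beta_i x_i\}\right)$ | $F_E(x_i, \beta_i) F_E(x_i, \beta_3)$
BR-II | $W_1 \sim Ray(\beta_1)$, $W_2 \sim Ray(\beta_2)$, $W_3 \sim Ray(\beta_3)$ | $\left(1 - \exp\left\{-\frac{z^2}{2\beta_3^2}\right\}\right) \prod_{i=1}^{2} \left(1 - \exp\left\{-\frac{x_i^2}{2\beta_i^2}\right\}\right)$ | $F_R(x_i, \beta_i) F_R(x_i, \beta_3)$
BL-II | $W_1 \sim Lin(\beta_1)$, $W_2 \sim Lin(\beta_2)$, $W_3 \sim Exp(\beta_3)$ | $\left(1 - e^{-\beta_3 z}\right) \prod_{i=1}^{2} \left[1 - \left(1 + \frac{\beta_i x_i}{1 + \beta_i}\right) e^{-\beta_i x_i}\right]$ | $F_L(x_i, \beta_i) F_E(x_i, \beta_3)$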

4 Data Application: Operation Time of a Series System

This section presents an application of the proposed methodology using data from a 2-component series system. For each component, we assume that the failure times can be approximated by an Exponential distribution with parameter β = 0.07 for the first component and β = 0.05 for the second one. The system's reliability was obtained for an initial operation time of 50 h using the classical methodology, the product-law of reliabilities (PLR), that is, assuming independence between the lifetimes [13]. Here, the main goal is to evaluate the system reliability using both the proposed models and the classical PLR methodology. In this way, we may verify whether the proposed models better accommodate the dependence structure related to a fatal shock or overall stress that affects both components simultaneously. To perform a Bayesian analysis, we have chosen noninformative Gamma prior distributions ($\alpha_j = \kappa_j = 0.001$) for $\beta_j$ ($j = 1, 2$) and an informative Gamma prior distribution with $\alpha_3 = \bar{T}$ and $\kappa_3 = V(T)$ for the fatal shock models, where $T$ is the random variable related to the operation time, $\bar{T}$ is the sample mean, and $V(T)$ is the sample variance. Moreover, we have adopted noninformative Uniform prior distributions, $U(0, 10)$, for $\beta_j$ ($j = 1, 2, 3$) in the stress models to prevent computational instability. To evaluate the reliability of the series system, we have used the parameter estimates to compute the full system's reliability for an initial operation time of 50 h. The obtained results are presented in Figs. 1, 2 and 3. From Figs. 1, 2 and 3, one may notice that only the fatal shock models provide reliability estimates that agree closely with those of the PLR methodology.
This result may be related to the fact that we have considered a series system; the results could be different for a parallel system, where perhaps only the stress models would provide similar estimates of the system reliability, given the structure from which the stress models are derived. The BL-I model and the PLR method provide similar results, which may be related to the flexibility of the Lindley distribution in reliability analysis (see [49]). For the BE-I model, the obtained result is quite similar to that of the PLR methodology, which is expected since we consider similar modeling structures. The results based on the Rayleigh distribution are good, but such a model is not always as flexible as the other ones. Despite the differences in the obtained results, both methodologies are useful for this data analysis. In addition, the proposed models have the advantage of incorporating dependence structures: a fatal shock or stress that damages both components of the system simultaneously.

Fig. 1 System reliability for an initial operation time of 50 h considering the BE-I (left panel), the BE-II (right panel), and the PLR methodology (horizontal axis: Time (Hours); vertical axis: $\hat{R}(t)$)

Fig. 2 System reliability for an initial operation time of 50 h considering the BR-I (left panel), the BR-II (right panel), and the PLR methodology (horizontal axis: Time (Hours); vertical axis: $\hat{R}(t)$)

Fig. 3 System reliability for an initial operation time of 50 h considering the BL-I (left panel), the BL-II (right panel), and the PLR methodology (horizontal axis: Time (Hours); vertical axis: $\hat{R}(t)$)
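The PLR baseline used in this section can be sketched in a few lines. The sketch assumes that the stated parameters 0.07 and 0.05 are exponential rates per hour; the function name `plr_reliability` is hypothetical.

```python
import math


def plr_reliability(t, rates):
    """Product-law reliability of a series system with independent
    exponential component lifetimes (rates assumed to be per hour)."""
    return math.prod(math.exp(-b * t) for b in rates)


# Evaluate the baseline on the time grid shown on the figure axes.
for t in (0, 12, 25, 38, 50):
    print(t, plr_reliability(t, (0.07, 0.05)))
```

Under independence the series reliability is simply $\exp\{-(0.07 + 0.05)t\}$, so any departure of a fitted model from this curve reflects the estimated common shock or stress component.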

5 Conclusion

In this chapter, we considered both stress and fatal shock structures. Based on the obtained results, the fatal shock models led to accurate inferential results for the full system reliability, even with noninformative priors in the Bayesian analysis. Also, as the real-data application showed, the proposed methodology could be applied in many industrial situations where a series system fails due to a common shock source that destroys both components. Based on the reliability estimates, it is possible to conclude that any of the proposed models could be an excellent alternative to the PLR methodology, which relies on the independence assumption. Furthermore, other probabilistic models could be assumed for the random variables $W_i$ and $U_i$ ($i = 1, 2, 3$), which would generate even more flexible models for accommodating other information about the entire system.

References

1. Hall PL, Strutt JE (2003) Probabilistic physics-of-failure models for component reliabilities using Monte Carlo simulation and Weibull analysis: a parametric study. Reliab Eng Syst Saf 80(3):233–242
2. Xie L, Zhou J, Hao C (2004) System-level load-strength interference based reliability modeling of k-out-of-n system. Reliab Eng Syst Saf 84(3):311–317
3. Nandal J, Chauhan SK, Malik SC (2015) Reliability and MTSF evaluation of series and parallel systems. Int J Stat Reliab Eng 2(1):74–80
4. Chauhan SK, Malik SC (2016) Reliability evaluation of series-parallel and parallel-series systems for arbitrary values of the parameters. Int J Stat Reliab Eng 3(1):10–19
5. Zhou W, Xiang W, Hong HP (2017) Sensitivity of system reliability of corroding pipelines to modeling of stochastic growth of corrosion defects. Reliab Eng Syst Saf 167:428–438
6. Chi Z, Chen R, Huang S, Li YF, Bin Z, Zhang W (2020) Multi-state system modeling and reliability assessment for groups of high-speed train wheels. Reliab Eng Syst Saf 107026
7. Okabe T, Otsuka Y (2020) Proposal of a validation method of failure mode analyses based on the stress-strength model with a support vector machine. Reliab Eng Syst Saf 107247
8. Torres-Alves GA, Morales-Nápoles O (2020) Reliability analysis of flood defenses: the case of the Nezahualcoyotl Dike in the Aztec City of Tenochtitlan. Reliab Eng Syst Saf 107057
9. Stefenon SF, Ribeiro MHDM, Nied A, Mariani VC, Coelho LS, Rocha DFM, Grebogi RB, Ruano AEB (2020) Wavelet group method of data handling for fault prediction in electrical power insulators. Int J Electr Power Energy Syst 123:106269
10. Ribeiro MHDM, Coelho LS (2020) Ensemble approach based on bagging, boosting and stacking for short-term prediction in agribusiness time series. Appl Soft Comput 86:105837
11. Yousefi N, Coit DW, Song S (2020) Reliability analysis of systems considering clusters of dependent degrading components. Reliab Eng Syst Saf 202:107005
12. Oliveira RP, Achcar JA, Mazucheli J, Bertoli W (2021) A new class of bivariate Lindley distributions based on stress and shock models and some of their reliability properties. Reliab Eng Syst Saf 211:107528
13. Aggarwal KK (2012) Reliability engineering, vol 3. Springer Science & Business Media
14. Jensen PA, Bard JF (2003) Operations research models and methods, vol 1. Wiley
15. Thoft-Christensen P, Murotsu Y (1986) Reliability of series systems. In: Application of structural systems reliability theory. Springer, Berlin, Heidelberg
16. Li J, Coit DW, Elsayed EA (2011) Reliability modeling of a series system with correlated or dependent component degradation processes. In: 2011 international conference on quality, reliability, risk, maintenance, and safety engineering. IEEE, pp 388–393
17. Hu L, Yue D, Li J (2012) Availability analysis and design optimization for a repairable series-parallel system with failure dependencies. Int J Innov Comput Inf Control 8(10):6693–6705
18. Park JH (2017) Time-dependent reliability of wireless networks with dependent failures. Reliab Eng Syst Saf 165:47–61
19. Jafary B, Mele A, Fiondella L (2020) Component-based system reliability subject to positive and negative correlation. Reliab Eng Syst Saf 107058
20. Gumbel EJ (1960) Bivariate exponential distributions. J Am Stat Assoc 55(292):698–707
21. Freund JE (1961) A bivariate extension of the exponential distribution. J Am Stat Assoc 56(296):971–977
22. Marshall AW, Olkin I (1967) A generalized bivariate exponential distribution. J Appl Probab 4(02):291–302
23. Marshall AW, Olkin I (1967) A multivariate exponential distribution. J Am Stat Assoc 62(317):30–44
24. Downton F (1970) Bivariate exponential distributions in reliability theory. J R Stat Soc Ser B (Methodol) 32(3):408–417
25. Hawkes AG (1972) A bivariate exponential distribution with applications to reliability. J R Stat Soc Ser B (Methodol) 129–131
26. Block HW, Basu AP (1974) A continuous, bivariate exponential extension. J Am Stat Assoc 69(348):1031–1037
27. Sarkar SK (1987) A continuous bivariate exponential distribution. J Am Stat Assoc 82(398):667–675
28. Arnold BC, Strauss D (1988) Bivariate distributions with exponential conditionals. J Am Stat Assoc 83(402):522–527
29. Burr IW (1968) On a general system of distributions. J Am Stat Assoc 63
30. Burr IW (1973) Parameters for a general system of distributions to match a grid of α3 and α4. Commun Stat Theory Methods 2
31. Singh C, Billinton R (1977) System reliability, modelling and evaluation, vol 769. Hutchinson London
32. Blanchard BS, Fabrycky WJ (1990) Systems engineering and analysis, vol 4. Prentice Hall Englewood Cliffs, NJ
33. Chao MT, Fu JC (1991) The reliability of a large series system under Markov structure. Adv Appl Probab 23(4):894–908
34. Mori Y, Ellingwood BR (1993) Time-dependent system reliability analysis by adaptive importance sampling. Struct Saf 12(1):59–73
35. Hulting FL, Robinson JA (1994) The reliability of a series system of repairable subsystems: a Bayesian approach. Naval Res Logist 41(4):483–506
36. Zhang T, Horigome M (2001) Availability and reliability of system with dependent components and time-varying failure and repair rates. IEEE Trans Reliab 50(2):151–158
37. Rausand M, Høyland A (2004) System reliability theory: models, statistical methods, and applications, vol 396. Wiley
38. Kołowrocki K (2008) Reliability of large systems. Wiley Online Library
39. Eryilmaz S, Tank F (2012) On reliability analysis of a two-dependent-unit series system with a standby unit. Appl Math Comput 218(15):7792–7797
40. Zhen H, Mahadevan S (2015) Time-dependent system reliability analysis using random field discretization. J Mech Des 137(10):101404
41. Oliveira RP, Achcar JA (2019) Use of Basu-Dhar bivariate geometric distribution in the analysis of the reliability of component series systems. Int J Quality Reliab Manag
42. Chowdhury S, Kundu A (2017) Stochastic comparison of parallel systems with log-Lindley distributed components. Oper Res Lett 45(3):199–205
43. de Oliveira RP (2019) Multivariate lifetime models to evaluate long-term survivors in medical studies. PhD thesis, University of São Paulo
44. Lawless JF (1982) Statistical models and methods for lifetime data. Wiley Series in probability and mathematical statistics. Wiley, New York
45. Gelfand AE, Smith AFM (1990) Sampling-based approaches to calculating marginal densities. J Am Stat Assoc 85(410):398–409
46. Chib S, Greenberg E (1995) Understanding the Metropolis-Hastings algorithm. Am Stat 49(4):327–335
47. R Core Team (2020) R: a language and environment for statistical computing. R foundation for statistical computing, Vienna, Austria
48. Su Y-S, Yajima M (2012) R2jags: a package for running jags from R. R package version 0.03-08. http://CRAN.R-project.org/package=R2jags
49. Ghitany ME, Atieh B, Nadarajah S (2008) Lindley distribution and its application. Math Comput Simul 78(4):493–506

The New Attempt of Network Reliability in the Assessment of Business Activities for Precision Management

Shin-Guang Chen

Abstract Network reliability has been applied to many real-world applications. The popular areas are traffic planning, computer network planning, power transmission planning, etc. However, very few cases have been reported in social applications. This chapter introduces a new attempt at applying network reliability to the assessment of business activities. Such an attempt is a step toward precision management in business administration. With the capability of conducting precision management, many improvements in business practices can take place. Precision performance evaluation for individuals and groups is presented, and a case study illustrates the approach proposed in this chapter.

1 Introduction

Network reliability has been applied to many real-world applications. The popular areas are traffic planning, computer network planning, power transmission planning, etc. Since 1954, the maximum flow problem has attracted widespread attention [22] and has expanded to many other application areas [13]. The reliability problem without flow in the binary-state network was first discussed by Aggarwal et al. [3]. Lee [16] extended the problem to include cases with streams. The binary-state network reliability problem was first solved with minimum paths (MPs) by Aggarwal et al. [2]. The multi-state network reliability problem was first solved by Xue [24]. A multi-state network (MSN) is a network whose flow has a random state. Lin et al. [17] used MPs or minimum cuts (MCs) [14] to illustrate the reliability calculation of an MSN. They also set up three stages in these calculations: (a) search for all MPs [10, 12, 23] / MCs [1, 8, 15, 28]; (b) search for all d-MPs [19, 25] / d-MCs [26] from these MPs/MCs; (c) from these d-MPs [6, 7, 30] / d-MCs [12, 18], calculate their joint probability. Lin [19] streamlined the staged approach (SA) and proposed a simpler and better method to evaluate the reliability of MSNs. SA has since been extensively studied in the literature. A comprehensive discussion can be found in [4], and a good improvement in algorithms can be found in [11]. Based on this progress, we can now apply network reliability to wider areas such as social applications.

The following sections are arranged as follows. The mathematical requirements for SA are presented in Sect. 2. The details of SA [11] are given in Sect. 3. The performance evaluation for individuals and groups in a process is given in Sect. 4. The bonus dispatching in a process is given in Sect. 5. A case study is presented in Sect. 6. Then, we draw the conclusion of this chapter in Sect. 7.

S.-G. Chen, Tungnan University, New Taipei City, Taiwan
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_6

2 Mathematical Preliminaries

Assume that the process is an MSN. Let $(E, V)$ be an MSN, where $E$ is the set of edges representing the precedence relationships between individuals in a process, and $V = \{a_i \mid 1 \le i \le N\}$ is the set of nodes representing the individuals in a process. Let $M = (m_1, m_2, \ldots, m_N)$ be a vector with $m_i$ (an integer) being the maximal capacity of $a_i$. The process is assumed to satisfy the following assumptions.

1. The capacity of $a_i$ is an integer-valued random variable which takes values from the set $\{0, 1, 2, \ldots, m_i\}$ according to a given distribution $\mu_i$.
2. The edges are perfect.
3. Flow in the process satisfies the flow-conservation law [13].
4. The states of the nodes are statistically independent of each other.

2.1 Process Modeling

From the input to the output of the process, we call $\rho_1, \rho_2, \ldots, \rho_z$ the MPs. The process is described by two variables at time $t$: the status variable $S_t = (s_{1t}, s_{2t}, \ldots, s_{Nt})$ and the stream variable $F_t = (f_{1t}, f_{2t}, \ldots, f_{zt})$, where $s_{it}$ denotes the current capacity state of $a_i$ at time $t$ and $f_{jt}$ denotes the current work flow on $\rho_j$ at time $t$. Then, $F_t$ is feasible if and only if

$$\sum_{j=1}^{z} \{ f_{jt} \mid a_i \in \rho_j \} \le m_i, \quad \text{for } i = 1, 2, \ldots, N. \tag{1}$$

The above inequality states that the total stream through $a_i$ has a maximal limit. Then, we have $\kappa_{M_t} \equiv \{F_t \mid F_t \text{ is feasible under } M\}$.


At the same time, $F_t$ exists under $S_t = (s_{1t}, s_{2t}, \ldots, s_{Nt})$ if and only if

$$\sum_{j=1}^{z} \{ f_{jt} \mid a_i \in \rho_j \} \le s_{it}, \quad \text{for } i = 1, 2, \ldots, N. \tag{2}$$

For clarity, let $\kappa_{S_t} = \{F_t \mid F_t \text{ is feasible under } S_t\}$. The maximal flow under $S_t$ is defined as $\mho_{S_t} \equiv \max\{\sum_{j=1}^{z} f_{jt} \mid F_t \in \kappa_{S_t}\}$. An $F_t \in \kappa_M$ also exists such that the sum of streams in $F_t$ equals $d$. We have the following equation,

$$\sum_{j=1}^{z} f_{jt} = d. \tag{3}$$

Let $\mathbf{F}_t = \{F_t \mid F_t \in \kappa_M \text{ and complies with Eq. (3)}\}$. The lemmas below [19] hold.

Lemma 1 Let $S_t$ be a d-MP. Then an $F_t \in \mathbf{F}_t$ exists such that

$$s_{it} = \sum_{j=1}^{z} \{ f_{jt} \mid a_i \in \rho_j \}, \quad \text{for each } i = 1, 2, \ldots, N. \tag{4}$$

Given $F_t \in \mathbf{F}_t$, a state vector $S_{F_t} = (s_{1t}, s_{2t}, \ldots, s_{Nt})$ can be built by Eq. (4). The set $\Omega_t = \{S_{F_t} \mid F_t \in \mathbf{F}_t\}$ is created. Let $\Omega_{\min,t} = \{S_t \mid S_t \text{ is a minimum vector in } \Omega_t\}$. We have

Lemma 2 $\Omega_{\min,t}$ includes all the d-MPs at time $t$.
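The feasibility conditions of Eqs. (1) and (2) and the state vector of Eq. (4) reduce to one bookkeeping step: add up, for every node, the flow on the MPs that pass through it. A minimal sketch follows, using a hypothetical three-node process with two MPs; the names `load` and `is_feasible` are illustrative.

```python
def load(flows, paths, n_nodes):
    """Left-hand side of Eqs. (1), (2), and (4): total flow through each a_i.

    flows : flow f_j assigned to each MP
    paths : each MP as a tuple of 1-based node indices
    """
    s = [0] * n_nodes
    for f, path in zip(flows, paths):
        for i in path:
            s[i - 1] += f
    return s


def is_feasible(flows, paths, caps):
    """F is feasible under a capacity vector (M or S_t) if no node is overloaded."""
    return all(l <= c for l, c in zip(load(flows, paths, len(caps)), caps))


# Hypothetical process: rho_1 = {a1, a2}, rho_2 = {a1, a3}, M = (3, 2, 2).
paths = [(1, 2), (1, 3)]
M = (3, 2, 2)
print(load((1, 2), paths, 3), is_feasible((1, 2), paths, M))
print(load((2, 2), paths, 3), is_feasible((2, 2), paths, M))
```

In this toy case the flow (1, 2) loads the nodes with (3, 1, 2) and is feasible, whereas (2, 2) would push four units through $a_1$, which exceeds its capacity of three.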

2.2 Performance Evaluation of a Process at Time t

Given a demand $d_t$ at time $t$, the reliability $\omega_{d_t}$ is the probability that the maximal flow $\mho_{S_t}$ of the network under state $S_t$ is greater than or equal to $d_t$ at time $t$, i.e., $\omega_{d_t} \equiv \Pr\{S_t \mid \mho_{S_t} \ge d_t\}$. To obtain $\omega_{d_t}$, we search for the minimum vectors in $\{S_t \mid \mho_{S_t} \ge d_t\}$. A minimum vector $S_t$ is a d-MP iff (a) $\mho_{S_t} \ge d_t$ and (b) $\mho_{W_t} < d_t$ for all $W_t \ne S_t$ such that $W_t < S_t$, where $W_t \le S_t$ iff $w_{jt} \le s_{jt}$ for each $j = 1, 2, \ldots, N$, and $W_t < S_t$ iff $W_t \le S_t$ and $w_{jt} < s_{jt}$ for at least one $j$. Suppose there are in total $q$ d-MPs for $d_t$: $S_{1t}, S_{2t}, \ldots, S_{qt}$. The probability (i.e., the reliability) is derived as

$$\omega_{d_t} = \Pr\left\{ \bigcup_{k=1}^{q} \{ S_t \mid S_t \ge S_{kt} \} \right\}. \tag{5}$$

This value can be computed by the reduction method in [6].
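For a very small network, Eq. (5) can be evaluated literally by listing every state vector and adding the probabilities of those that dominate at least one d-MP. The sketch below does exactly that; it is a brute-force check under the independence assumption, not the method of [6], and the two-node distributions and d-MPs are hypothetical.

```python
from itertools import product


def reliability_by_enumeration(dmps, dists):
    """Eq. (5) by direct enumeration (only practical for tiny examples).

    dmps  : list of d-MP state vectors
    dists : dists[i][c] = Pr{node i has capacity c}
    """
    total = 0.0
    for state in product(*(range(len(d)) for d in dists)):
        if any(all(s >= x for s, x in zip(state, dmp)) for dmp in dmps):
            p = 1.0
            for s, dist in zip(state, dists):
                p *= dist[s]        # node states are independent
            total += p
    return total


# Hypothetical two-node example with capacities 0..2.
dists = [[0.1, 0.3, 0.6], [0.2, 0.3, 0.5]]
dmps = [(2, 0), (1, 1), (0, 2)]
print(reliability_by_enumeration(dmps, dists))
```

The number of states grows as the product of the capacity ranges, which is why the staged approach works with the d-MPs and a union probability instead of enumerating states.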


3 The SA Approach

The SA approach includes three stages of computation [4]. Stage one searches for all MPs in a network. Stage two searches for all d-MPs under the MPs. Stage three calculates the union probability (the reliability) under all d-MPs. The following subsections briefly introduce these stages.

3.1 Stage One: Searching for MPs

The most popular method to search for MPs is given in [10]. It can be summarized as follows.

Lemma 3 $\rho_j \cup v$ creates a cycle if and only if $\exists v_k \in \rho_j$ and $v_k = v$.

Lemma 4 (Minimal Path detection) $\rho$ is an MP iff $\rho$ has no cycles, and the corresponding $w = s \cup v_1 \cup v_2 \cup \ldots \cup v_k \cup t$, where $v_k \in B - (S \cup T)$, $s \in S$, and $t \in T$.

The algorithm is briefly described as follows.

3.1.1 Algorithm (for Minimum Paths)

Since all undirected networks can be transformed to directed ones, the following efficient algorithm [10] for directed networks to find MPs is introduced here.

Algorithm 1: // for minimum paths
1. Get L. // Get the network data.
2. While v ∈ L(0) = S, and v ≠ φ, select a candidate.
3. R = v ∪ L − S. // Temporary list R (starting from 0).
4. Set j = 0, ρ = w = φ. // Initialization variables.
5. If j < 0, then ρ = ∪_{2≤l} {ρ(l)}, got MP and do backtrack.
6. Else begin v = R(j).
7. If w(k) = v holds, backtrack.
8. Else if j ∈ v and {j} ≠ φ, ρ = {j} ∪ ρ, w = {v} ∪ w, and run Step 6.
9. Else do backtrack. End.
end While.

3.1.2 Explanation Example

A popular ARPA network [29] is given in Fig. 1. Table 1 shows the results. The efficiency of the approach is very good.

Fig. 1 The ARPA network (nodes s, a, b, c, d, t; arcs numbered 1 to 9)

Table 1 The MPs searched by the proposed approach

MPs: {s, b, a, c, t}, {s, b, a, c, d, t}, {s, a, c, b, d, t}, {s, b, d, c, t}, {s, a, b, d, c, t}, {s, a, b, c, d, t}, {s, a, b, c, t}, {s, a, b, d, t}, {s, b, c, d, t}, {s, a, c, t}, {s, b, c, d, t}, {s, b, c, t}, {s, b, d, t}
Cycles: {a, c, b}, {a, c, d, b}, {b, d, c}
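The idea behind Lemmas 3 and 4 (extend a path only with nodes it does not already contain, and stop at the sink) can be sketched as a plain depth-first search. This is not the linked-path-structure algorithm of [10], and the edge list below is an assumption about the six-node network, since the figure itself is not reproduced here.

```python
def minimal_paths(adj, source, sink):
    """Enumerate all cycle-free source-to-sink paths by depth-first search."""
    paths, stack = [], [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == sink:
            paths.append(path)
            continue
        for nxt in adj[node]:
            if nxt not in path:          # extending would create a cycle
                stack.append((nxt, path + [nxt]))
    return paths


# Assumed undirected edge list (nine arcs) for the six-node network.
edges = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "c"), ("b", "c"),
         ("b", "d"), ("c", "d"), ("c", "t"), ("d", "t")]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

mps = minimal_paths(adj, "s", "t")
print(len(mps))
for p in sorted(mps):
    print(p)
```

Under the assumed edge list the search returns 13 distinct paths from s to t.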

3.2 Stage Two: Searching for d-MPs

An elegant method has been reported in [9]. This method led to a more efficient approach [11], which this subsection briefly describes. A non-negative linear Diophantine equation has the following form:

$$c_1 f_1 + c_2 f_2 + \cdots + c_z f_z = d, \quad c_i, f_i, d \in \mathbb{Z}^+. \tag{6}$$

It can be shown that the following lemmas hold [9].

Lemma 5 For Eq. (6), if $c_i = 1, \forall i$, the number of solutions is $\binom{q+d-1}{d}$, and the complexity is $O\left(\binom{q+d-1}{d}\right)$.

Lemma 6 The number of solutions is no greater than $\binom{q+d-1}{d}$, and the complexity is no greater than $O\left(\binom{q+d-1}{d}\right)$.

In both lemmas, $q$ is the number of unknowns in Eq. (6), that is, $q = z$.

Lin and Chen [20] pointed out a very efficient method for d-MP verification, which was pioneered by Yeh [27]. This method can be used along with the exact enumeration method to obtain a better way of finding the d-MPs in the network. According to Lin and Chen [20], the following lemmas hold.


Property 1 If $a_i$ is crossed by a path, then $s_i > 0$.

Lemma 7 $\rho_j \cup \gamma$ creates a cycle if and only if $\exists \gamma_k \in \rho_j$ with $\gamma_k = \gamma$.

One can find all cycles or loops under Property 1 and Lemma 7. The following lemma also holds [27].

Lemma 8 If $S$ has no cycle, then $S$ is a d-MP.

3.2.1 Algorithm for Searching d-MPs

Assume that ⊕ is a union operation for vectors. Let L be the network's linked path structure (LPS) [10]. Chen and Lin proposed the following algorithm, in which $\Omega_{\min}$ collects the d-MPs found.

Algorithm 2:
1. Let D = {1, 1, ..., 1}.
2. Set r_1 ⊕ r_2 ⊕ ... ⊕ r_q ≡ D.
3. {f_i | f_i = #(r_i) for all i}. // #(r_i) is the count of 1's.
4. if F meets the following equations, $\sum_{j=1}^{z} \{f_j \mid a_i \in \rho_j\} \le m_i$ and $\sum_{j=1}^{z} f_j = d$, for 1 ≤ i ≤ N, then $s_i = \sum_{j=1}^{z} \{f_j \mid a_i \in \rho_j\}$, for 1 ≤ i ≤ N,
5. for 1 ≤ i ≤ N do
6. if s_i > 0, then I = I ∪ {i}.
7. endfor
8. for i ∈ I do
9. λ = φ. // the path trace.
10. if not(L_i ∈ λ), then keep on tracing L_i until to the sinks, then continue, else λ = λ ∪ L_i.
11. else backtrack to Step 2.
12. endfor
13. Ω_min = Ω_min ∪ {S}. Backtrack to Step 2. Return. // Finish search.

3.2.2 Illustrative Examples

A simple network in Fig. 2 is presented for the explanation of the proposed approach. Four MPs are found: ρ1 = {a1 , a3 }, ρ2 = {a2 , a4 }, ρ3 = {a1 , a5 , a4 }, and ρ4 = {a2 , a6 , a3 }. The corresponding information flows are F = { f 1 , f 2 , f 3 , f 4 }. The observed data transmission distribution of the links are listed in Table 2. The demand of data sink is set to 6.

Fig. 2 A simple network (arcs a1 to a6 between the data source and the data sink)

Table 2 The observed data transmission distributions of the links for Fig. 2

Arcs | 0 | 1 | 2 | 3 | 4 | 5 | 6
a1 | 0.000 | 0.000 | 0.001 | 0.008 | 0.069 | 0.316 | 0.606
a2 | 0.000 | 0.000 | 0.006 | 0.061 | 0.309 | 0.624 | 0.000
a3 | 0.000 | 0.000 | 0.000 | 0.006 | 0.055 | 0.292 | 0.647
a4 | 0.000 | 0.000 | 0.002 | 0.019 | 0.114 | 0.369 | 0.497
a5 | 0.000 | 0.000 | 0.008 | 0.073 | 0.328 | 0.590 | 0.000
a6 | 0.000 | 0.000 | 0.001 | 0.008 | 0.069 | 0.316 | 0.606

Based on Table 2, the 6-MPs are as follows: $\Omega_{\min}$ = {
S1 = (0, 6, 0, 6, 0, 0), S2 = (0, 6, 1, 5, 0, 1), S3 = (0, 6, 2, 4, 0, 2), S4 = (0, 6, 3, 3, 0, 3), S5 = (0, 6, 4, 2, 0, 4), S6 = (0, 6, 5, 1, 0, 5), S7 = (0, 6, 6, 0, 0, 6),
S8 = (1, 5, 0, 6, 1, 0), S9 = (1, 5, 1, 5, 0, 0), S10 = (1, 5, 2, 4, 0, 1), S11 = (1, 5, 3, 3, 0, 2), S12 = (1, 5, 4, 2, 0, 3), S13 = (1, 5, 5, 1, 0, 4), S14 = (1, 5, 6, 0, 0, 5),
S15 = (2, 4, 0, 6, 2, 0), S16 = (2, 4, 1, 5, 1, 0), S17 = (2, 4, 2, 4, 0, 0), S18 = (2, 4, 3, 3, 0, 1), S19 = (2, 4, 4, 2, 0, 2), S20 = (2, 4, 5, 1, 0, 3), S21 = (2, 4, 6, 0, 0, 4),
S22 = (3, 3, 0, 6, 3, 0), S23 = (3, 3, 1, 5, 2, 0), S24 = (3, 3, 2, 4, 1, 0), S25 = (3, 3, 3, 3, 0, 0), S26 = (3, 3, 4, 2, 0, 1), S27 = (3, 3, 5, 1, 0, 2), S28 = (3, 3, 6, 0, 0, 3),
S29 = (4, 2, 0, 6, 4, 0), S30 = (4, 2, 1, 5, 3, 0), S31 = (4, 2, 2, 4, 2, 0), S32 = (4, 2, 3, 3, 1, 0), S33 = (4, 2, 4, 2, 0, 0), S34 = (4, 2, 5, 1, 0, 1), S35 = (4, 2, 6, 0, 0, 2),
S36 = (5, 1, 0, 6, 5, 0), S37 = (5, 1, 1, 5, 4, 0), S38 = (5, 1, 2, 4, 3, 0), S39 = (5, 1, 3, 3, 2, 0), S40 = (5, 1, 4, 2, 1, 0), S41 = (5, 1, 5, 1, 0, 0), S42 = (5, 1, 6, 0, 0, 1),
S43 = (6, 0, 0, 6, 6, 0), S44 = (6, 0, 1, 5, 5, 0), S45 = (6, 0, 2, 4, 4, 0), S46 = (6, 0, 3, 3, 3, 0), S47 = (6, 0, 4, 2, 2, 0), S48 = (6, 0, 5, 1, 1, 0), S49 = (6, 0, 6, 0, 0, 0)}.
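The enumeration step can be illustrated with a brute-force sketch: list every non-negative solution of $f_1 + f_2 + f_3 + f_4 = 6$, map each one to a state vector with Eq. (4), and keep the vectors that are not dominated by another one. The sketch uses a direct minimality comparison rather than the cycle check of Algorithm 2, and it assumes a maximal capacity of 6 on every arc, which is what the listed vectors imply. The names `state_of` and `d_mps` are hypothetical.

```python
from itertools import product
from math import comb

# MPs of the simple network as tuples of arc indices (taken from the text).
PATHS = [(1, 3), (2, 4), (1, 5, 4), (2, 6, 3)]
N_ARCS = 6


def state_of(flows):
    """Eq. (4): s_i is the total flow on the MPs that use arc a_i."""
    s = [0] * N_ARCS
    for f, path in zip(flows, PATHS):
        for i in path:
            s[i - 1] += f
    return tuple(s)


def d_mps(d, caps):
    """Return (sorted minimal state vectors, number of flow solutions)."""
    flows = [f for f in product(range(d + 1), repeat=len(PATHS)) if sum(f) == d]
    states = {state_of(f) for f in flows}
    states = {s for s in states if all(si <= ci for si, ci in zip(s, caps))}

    def dominated(s):
        # True if some other candidate is componentwise <= s.
        return any(w != s and all(wi <= si for wi, si in zip(w, s))
                   for w in states)

    return sorted(s for s in states if not dominated(s)), len(flows)


mps, n_solutions = d_mps(6, caps=(6,) * N_ARCS)
print(n_solutions, comb(len(PATHS) + 6 - 1, 6), len(mps))
```

With four MPs and d = 6, the equation has 84 solutions, in line with the binomial count of Lemma 5, and 49 of the resulting state vectors are minimal under the stated capacity assumption.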


3.3 Stage Three: Evaluating the Network Reliability via Union Probability

Deriving the union probability of the d-MPs is a way to obtain the network reliability. We usually employ the inclusion-exclusion principle (IEP) to do so. It can be traced back to the idea of Abraham de Moivre (1718) [21]. To explain this concept, Eq. (7) shows how IEP obtains the probability of $\{A_1 \cup A_2 \cup A_3\}$.

$$\Pr\{A_1 \cup A_2 \cup A_3\} = \Pr\{A_1\} + \Pr\{A_2\} + \Pr\{A_3\} - \Pr\{A_1 \cap A_2\} - \Pr\{A_2 \cap A_3\} - \Pr\{A_1 \cap A_3\} + \Pr\{A_1 \cap A_2 \cap A_3\} \tag{7}$$

IEP sums the probabilities of the individual events, subtracts the probabilities of the pairwise intersections, and adds the probability of the intersection of all events. By set theory, the complexity of IEP equals the complexity of the power set expansion, that is, $O(2^q - 1)$. The Recursive Inclusion-Exclusion Principle (RIEP) can do a better job than IEP just by rearranging Eq. (7) into its recursive form, Eq. (8). The complexity is still $O(2^q - 1)$, but RIEP is about 10 times faster than the normal IEP.

$$\Pr\{A_1 \cup A_2 \cup A_3\} = \Pr\{A_1\} + \Pr\{A_2\} - \Pr\{A_1 \cap A_2\} + \Pr\{A_3\} - \Pr\{A_2 \cap A_3\} - \Pr\{A_1 \cap A_3\} + \Pr\{A_1 \cap A_2 \cap A_3\}$$
$$= \Pr\{A_1\} + \Pr\{A_2\} - \Pr\{A_1 \cap A_2\} + \Pr\{A_3\} - \Pr\{A_2 \cap A_3 \cup A_1 \cap A_3\} \tag{8}$$

Chen [6] reported a reduction approach to speed up the derivation of the union probability. He emphasized its suitability for monotonic cases such as network reliability problems, where the network reliability is the union probability of the d-MPs.

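3.3.1 The Approach of IEP

Given $\{A_1, A_2, \ldots, A_n\}$, the union probability by IEP is

$$\Pr\left\{ \bigcup_{i=1}^{n} A_i \right\} = \sum_{k=1}^{n} \left( (-1)^{k-1} \sum_{\substack{I \subset \{1, \ldots, n\} \\ |I| = k}} \Pr\left\{ \bigcap_{i \in I} A_i \right\} \right). \tag{9}$$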
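Eq. (9) can be sketched for events of the form $A_k = \{S \ge X_k\}$, where $X_k$ is a d-MP. The sketch relies on two facts: node states are independent, so $\Pr\{S \ge X\}$ is a product of tail probabilities, and the intersection of such events is again of the same form, with the componentwise maximum of the vectors. The data and function names below are hypothetical.

```python
from itertools import combinations


def prob_at_least(x, dists):
    """Pr{S >= x} for independent nodes; dists[i][c] = Pr{node i has capacity c}."""
    p = 1.0
    for xi, dist in zip(x, dists):
        p *= sum(dist[xi:])
    return p


def join(vectors):
    """Componentwise maximum: the vector of the intersection event."""
    return tuple(max(c) for c in zip(*vectors))


def union_iep(dmps, dists):
    """Eq. (9): alternating sum over all non-empty subsets of the d-MPs."""
    total = 0.0
    for k in range(1, len(dmps) + 1):
        sign = (-1) ** (k - 1)
        for subset in combinations(dmps, k):
            total += sign * prob_at_least(join(subset), dists)
    return total


# Hypothetical two-node example with capacities 0..2.
dists = [[0.1, 0.3, 0.6], [0.2, 0.3, 0.5]]
dmps = [(2, 0), (1, 1), (0, 2)]
print(union_iep(dmps, dists))
```

The double loop visits every non-empty subset of the d-MPs, which is the $2^q - 1$ growth mentioned for IEP.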

Fig. 3 Rearrangement of RIEP

3.3.2 The Approach of RIEP

A conceptual diagram of RIEP is presented in Fig. 3. Let $\pi(X_1, X_2, \ldots, X_n) \equiv \Pr\{X_1 \cup X_2 \cup \cdots \cup X_n\}$. Then, RIEP can be expressed by the following equation.

$$\pi(A_1) = \Pr\{A_1\}, \quad \text{when } n = 1,$$
$$\pi(A_1, A_2, \ldots, A_n) = \sum_{i=1}^{n} \left( \Pr\{A_i\} - \pi(A_1 \cap A_i, A_2 \cap A_i, \ldots, A_{i-1} \cap A_i) \right), \quad \text{when } n > 1. \tag{10}$$

The Approach of RRIEP

We firstly denote what is a reduction for the union events. Definition 1 {X 1 , X 2 , . . . , X k } is a reduction of {Y1 , Y2 , . . . , Yn }, if for any k, n ∈ N and k < n such that {X 1 ∪ X 2 ∪ . . . ∪ X k } = {Y1 ∪ Y2 ∪ . . . ∪ Yn }. One have the following lemmas hold. Lemma 9 {X 1 , X 2 , …, X k } is a reduction of {Y1 , Y2 , …, Yn }, then π(X 1 , X 2 , . . . , X k ) = π(Y1 , Y2 , . . . , Yn ). Lemma 10 The complexity of π(X 1 , X 2 , . . . , X k ) is no greater than or equal to the complexity of π(Y1 , Y2 , . . . , Yn ).


Let $\{B_1, B_2, \ldots, B_{k_i}\}$ be a reduction of $\{A_1 \cap A_i, A_2 \cap A_i, \ldots, A_{i-1} \cap A_i\}$. By Lemma 9, Eq. (10) is equal to

$$\pi(A_1) = \Pr\{A_1\}, \quad \text{when } n = 1,$$
$$\pi(A_1, A_2, \ldots, A_n) = \sum_{i=1}^{n} \left( \Pr\{A_i\} - \pi(A_1 \cap A_i, A_2 \cap A_i, \ldots, A_{i-1} \cap A_i) \right) = \sum_{i=1}^{n} \left( \Pr\{A_i\} - \pi(B_1, B_2, \ldots, B_{k_i}) \right), \quad \text{when } n > 1. \tag{11}$$

By Lemma 10, RRIEP is no less efficient than RIEP. To trim the union entities, one can refer to the monotonicity property.

Property 2 Given the state vectors $\bar{Q} = (\bar{q}_1, \bar{q}_2, \ldots, \bar{q}_h)$ and $\bar{W} = (\bar{w}_1, \bar{w}_2, \ldots, \bar{w}_h)$, $\bar{Q}$ and $\bar{W}$ comply with the following rules:

1. $\bar{Q} \le \bar{W}$ if and only if $\bar{q}_j \le \bar{w}_j$ for each $j = 1, 2, \ldots, h$.
2. $\bar{Q} < \bar{W}$ if and only if $\bar{Q} \le \bar{W}$ and $\bar{q}_j < \bar{w}_j$ for at least one $j$.

3.3.4 The Algorithm (for Union Probability)

Let $\bar{Q} \oplus \bar{W} \equiv \{\bar{g}_i \mid \bar{g}_i = \max(\bar{q}_i, \bar{w}_i),\ \bar{q}_i \in \bar{Q},\ \bar{w}_i \in \bar{W},\ \forall i\}$. $\omega_{dt}$ can be derived by the following steps.

Algorithm 3: // for union probability
1. dMP = {X_1, X_2, ..., X_h}.
2. ω = 0.
3. If h = 0, then return ω.
4. For 1 ≤ i ≤ |dMP| do
5.     ω_2 = Pr{X ≥ X_i}.
6.     For 1 ≤ j < i do
7.         X_{j,i} = X_j ⊕ X_i.
8.     End (for).
9.     Let E = {X_{j,i} | 1 ≤ j < i} and W_l, W_k ∈ E.
10.    For k ∉ J and 1 ≤ k ≤ ||E|| do // Reduction process.
11.        For l ∉ J and k < l ≤ ||E|| do
12.            If W_l ≤ W_k, then J = J ∪ {k} and go to Step 9.
13.            Else, if W_l > W_k, then J = J ∪ {l}.
14.        End (for).
15.        dMP* = dMP* ∪ {W_k}.
16.    End (for).
17.    Run Step 1 employing dMP* to get ω_3.
18.    ω = ω + ω_2 − ω_3.
19. End (for).
20. Return ω.
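The recursion with reduction can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the author's PROLOG implementation: components are assumed independent with discrete state distributions (`dists`), so that Pr{X ≥ v} factorizes, and the reduction simply keeps the minimal vectors of each joined set, since a vector that dominates another contributes nothing to the union. All function names and the example data are hypothetical.

```python
from itertools import product

def prob_at_least(v, dists):
    """Pr{X >= v} for independent components; dists[j][s] = Pr{x_j = s}."""
    p = 1.0
    for need, dist in zip(v, dists):
        p *= sum(dist[need:])
    return p

def join(q, w):
    """Q (+) W: component-wise maximum of two state vectors."""
    return tuple(max(a, b) for a, b in zip(q, w))

def reduce_vectors(vectors):
    """Keep only minimal vectors: if u <= w, then {X >= w} lies inside {X >= u}."""
    unique = list(dict.fromkeys(vectors))
    return [w for w in unique
            if not any(u != w and all(a <= b for a, b in zip(u, w))
                       for u in unique)]

def rriep(dmps, dists):
    """Pr{ union_i {X >= X_i} } by the reduced recursive form of Eq. (11)."""
    omega = 0.0
    for i, x_i in enumerate(dmps):
        joined = [join(x_j, x_i) for x_j in dmps[:i]]
        omega += prob_at_least(x_i, dists) - rriep(reduce_vectors(joined), dists)
    return omega

def brute_force(dmps, dists):
    """Reference value by enumerating every system state."""
    total = 0.0
    for state in product(*(range(len(d)) for d in dists)):
        if any(all(s >= v for s, v in zip(state, x)) for x in dmps):
            p = 1.0
            for s, d in zip(state, dists):
                p *= d[s]
            total += p
    return total

# Hypothetical data: three components with states 0, 1, 2.
dists = [[0.1, 0.3, 0.6]] * 3
dmps = [(2, 0, 1), (1, 1, 1), (0, 2, 1), (1, 2, 0)]
reliability = rriep(dmps, dists)
```

Dropping the call to `reduce_vectors` turns the same function into plain RIEP, which is a convenient way to compare the two on small inputs.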

Table 3 The computation comparison (sec)

# of d-MPs    IEP         RIEP        RRIEP
15            1.765       0.390       0.0003
18            19.891      1.891       0.0005
20            75.953      2.953       0.016
25            9256.750    9.359       0.016
30            –*          43.343      0.015
35            –           106.131     0.035
40            –           415.580     0.054
45            –           1109.876    0.066
50            –           2134.090    0.083
55            –           5884.508    0.108
59            –           8652.496    0.110

* Not stopped over 1 day

The reduction of union entities begins in Step 9. Because RRIEP decreases the computation effort in most cases, it is a good approach to shorten the computation time for most applications. However, in the worst case, it still needs $O(2^Q - 1)$ to obtain the reliability.

3.3.5 Explanation Example

We randomly generate vectors of length 20 as d-MPs and compare the three approaches (IEP, RIEP, and RRIEP). The results are listed in Table 3. All cases were run in PROLOG on a laptop.

4 The Assessment of Individuals and Groups

When we think of an MSN as a process network, we get precise tools for precision management practices. The first step is to assess the performance of the individuals in a process. This can be done by modeling the process as an MSN in which the individuals are nodes and their precedence relationships in the process are edges. The work flow through the network then reflects the performance of the process, and we can define the following indicators to reflect the performance of individuals in a process.


4.1 The Performance Indicators for Individuals

Since $S_t = \{s_{1t}, s_{2t}, \ldots, s_{Nt}\}$ is the capacity vector of the individuals at time $t$, we can obtain the empirical throughput distributions $\{\mu_{1t}, \mu_{2t}, \ldots, \mu_{Nt}\}$. The expected throughput at time $t$ for individual $a_i$ is defined as follows.

\[
E_{it} = \sum_{j=1}^{\max\{s_{ik} \mid 1\le k\le t\}} j\,\mu_{it}(j). \tag{12}
\]

Further, we can define the individual importance in the process. This can be done by defining the covering set of the individual $a_i$ in the process. Let $P_i = \{\rho_j \mid a_i \in \rho_j\}$ be the MP set involving $a_i$. The covering set $\zeta_i$ of $a_i$ is defined by the following.

Definition 2 Assume $a_i, a_j \in V$. $j$ is in $\zeta_i$ iff $P_j \subseteq P_i$.

Then, we define the individual importance $\Upsilon_i$ by the following.

\[
\Upsilon_i = \frac{\|\zeta_i\|}{N}, \tag{13}
\]

where $\|\cdot\|$ is the number of members in a set. By the above definition, $\Upsilon_i = 1.0$ means that $a_i$ is a critical individual in the process, since it covers all the flows in the process, and $\Upsilon_i$ cannot be 0, because $a_i$ must at least cover itself.

We can also define the individual contribution in the process. This can be done by defining the survivability $\omega_{dt,i}$ of $a_i$ in the process.

Definition 3 The survivability $\omega_{dt,i}$ of $a_i$ is the process reliability at time $t$ when $a_i$ is unavailable (i.e., $\mu_{it}(0) = 1$).

Then, we define the individual contribution $C_i$ by the following.

\[
C_i = \frac{\sum_{j=1}^{t}\big(\omega_{dj} - \omega_{dj,i}\big)}{\sum_{j=1}^{t}\omega_{dj}}. \tag{14}
\]

By the above definition, $C_i = 1.0$ means that $a_i$ has the greatest real contribution in the process, since at no time can the flows in the process exist without it. On the contrary, $C_i = 0.0$ means that $a_i$ has no real contribution in the process, since at no time are the flows in the process affected by its absence.
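The two indicators of Eqs. (13) and (14) can be sketched in a few lines of Python. The three-path network below is a hypothetical example chosen for illustration (it is not claimed to be the network of the case study), and the function names are assumptions.

```python
def importance(nodes, minimal_paths):
    """Individual importance (Eq. 13): size of the covering set divided by N."""
    # P_a: indices of the minimal paths that involve node a.
    paths_of = {a: {k for k, mp in enumerate(minimal_paths) if a in mp}
                for a in nodes}
    # b belongs to the covering set of a iff P_b is a subset of P_a.
    return {a: sum(1 for b in nodes if paths_of[b] <= paths_of[a]) / len(nodes)
            for a in nodes}

def contribution(reliability, survivability):
    """Individual contribution (Eq. 14) from two time series of equal length."""
    lost = sum(w - s for w, s in zip(reliability, survivability))
    return lost / sum(reliability)

# Hypothetical network: a1 and a5 lie on every minimal path.
nodes = ["a1", "a2", "a3", "a4", "a5"]
minimal_paths = [{"a1", "a2", "a5"}, {"a1", "a3", "a5"}, {"a1", "a4", "a5"}]
imp = importance(nodes, minimal_paths)

# Hypothetical reliabilities: the process never survives without the individual.
c_critical = contribution([0.5, 0.6, 0.7], [0.0, 0.0, 0.0])
```

In this example the nodes shared by all paths get importance 1.0, while each node on a single path only covers itself and gets 1/N.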


4.2 The Performance Indicators for Groups

Let $\mu_t$ be the empirical distribution of the process output observed up to time $t$. The throughput of the group during this period is defined as the expected value during this period:

\[
E_t = \sum_{j=1}^{\max\{\text{process output at time } i \,\mid\, 1\le i\le t\}} j\,\mu_t(j). \tag{15}
\]

4.3 The Assessments

Along with the above definitions, we get the following assessments for individuals as well as groups.

4.3.1 The Assessment of Individual Performance

Let $\delta_{it}$ be the standard output of individual $a_i$ at time $t$, relative to the standard output $d_t$. Then the performance appraisal $\lambda_{it}$ of individual $a_i$ can be calculated by the following formula:

\[
\lambda_{it} = \frac{E_{it}}{\sum_{j=1}^{t}\delta_{ij}\,/\,t}. \tag{16}
\]

Therefore, when λit > 1, the individual has over-standard performance; when λit = 1, the individual has standard performance; when λit < 1, the individual has below-standard performance.

4.3.2 The Assessment of Group Performance

The group performance assessment $\lambda_t$ can be calculated using the following formula:

\[
\lambda_t = \frac{E_t}{\sum_{i=1}^{t} d_i\,/\,t}. \tag{17}
\]

So when λt > 1, the group has over-standard performance; when λt = 1, the group has standard performance; when λt < 1, the group has below-standard performance.
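The throughput and performance formulas, Eqs. (12) and (15) to (17), amount to the mean of the empirical distribution divided by the mean standard output. The following Python sketch illustrates this; the function names are assumptions, the throughput series is the a1 column of Table 4, and the standard output of 4 per period is assumed constant over time.

```python
from collections import Counter

def expected_throughput(observations):
    """Expected value of the empirical throughput distribution (Eqs. 12, 15)."""
    t = len(observations)
    mu = {j: c / t for j, c in Counter(observations).items()}
    # Sum j * mu(j) for j = 1 .. max observed throughput.
    return sum(j * mu[j] for j in range(1, max(observations) + 1) if j in mu)

def performance(observations, standards):
    """Performance appraisal (Eqs. 16, 17): throughput over mean standard output."""
    return expected_throughput(observations) / (sum(standards) / len(standards))

# Throughput of a1 over t = 1..10 (Table 4); standard output assumed to be 4.
a1 = [6, 5, 5, 0, 6, 5, 5, 4, 5, 4]
e_a1 = expected_throughput(a1)
lambda_a1 = performance(a1, [4] * 10)
```

A value of `lambda_a1` above 1 corresponds to over-standard performance in the sense described above.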


5 The Bonus Dispatching for the Process

The purpose of bonus dispatching is to distribute the profit of the process at time t fairly to all individuals in the process. There are two calculation methods.

1. Taking the individual contribution as a coefficient, the bonus indicator with index i1 is calculated as

\[
\lambda_{it}\, C_i. \tag{18}
\]

This method is affected by the individual capacity; that is, if the individual capacity is not properly allocated, some bottlenecks will appear. This kind of allocation can reflect the bottleneck contribution.

2. Taking the individual importance as a coefficient, the bonus indicator with index i2 is calculated as

\[
\lambda_{it}\, \Upsilon_i. \tag{19}
\]

This method is not affected by the individual capacity, but only reflects the importance of the position in the process; that is, the dynamic bottleneck contribution is not reflected.

6 The Case Studies

Figure 4 shows the network of a distribution process in a well-known company. The standard total output is $d_t = 4$ deliveries per unit time, t = 1, 2, ..., 10. The detailed information is listed in Table 4. According to the above formulas, the group and individual performance indicators at time t = 10 are shown in Table 5.

Fig. 4 A distribution process network in a well-known company (nodes: a1 Order Entry, a2 Salesman 1, a3 Salesman 2, a4 Transporter, a5 Accountant)


Table 4 The initial throughput of the distribution process

t     a1     a2     a3     a4     a5     λt     ωdt
1     6      0*     3      2      5      5      0.1230
2     5      3      2      3      4      4      0.1728
3     5      3      3      3      5      5      0.3087
4     0      3      2      3      5      5      0.4368
5     6      3      0      3      3      3      0.3479
6     5      3      3      3      6      6      0.4928
7     5      3      3      3      6      6      0.6561
8     4      2      1      3      0      0      0.5616
9     5      3      2      0      6      6      0.5688
10    4      3      3      3      6      6      0.5760
Si    1.0    0.2    0.2    0.2    1.0
Ci    1.0    1.0    1.0    0.07   1.0

* Absence

Table 5 The group and individual performance indicators when t = 10

Items    a1      a2      a3      a4      a5      Et      λt      ω4t
δit      4       2       2       2       4
Eit      4.5     2.6     2.2     2.6     4.6
λit      1.13    1.3     1.1     1.3     1.15
i1       1.13    1.3     1.1     0.09    1.15
i2       1.13    0.33    0.28    0.33    1.15
Group                                            4.6     1.15    0.58

6.1 Discussions of the Case Study on Precision Management

The following features can be learned from the results of the case study on precision management.

1. The case of this chapter involves the random absence of individuals.
2. The performance of every individual in this case has exceeded the standard performance, but the probability of achieving the corporate goal is only 0.58. Therefore, there is a great risk of process shutdown. For example, if someone is absent for a long time, the process may be shut down.
3. According to the indicators in this case, the risk is due to improper capacity planning of individuals. The Ci index shows that everyone except a4 has the greatest substantial contribution, which is the reason for the high risk. According to Chen's optimal capacity planning scheme [5], the standard outputs of a2, a3, and a4 should be 4, 4, and 1. Therefore, the two salesman activities must increase manpower, and the transporter activity only needs the least capable person. Even abolishing a4 would not affect the overall performance too much.
4. It would be unfair to implement incentive measures in accordance with the individual performance λit, because the truly important roles such as a1 and a5 are not highlighted, while those with no substantive contribution such as a4 receive the highest rewards. Fairer reward indicators are the bonus indicators i1 and i2 of Eqs. (18) and (19): the former rewards fairly based on the actual contribution; the latter highlights the real importance in the process.
5. The group performance has exceeded the standard performance by 15%, and individuals can be rewarded according to this ratio.

7 Conclusion

From the perspective of network reliability, this chapter makes an attempt to assess process performance by network reliability for precision management. This is done by objectively evaluating the overall process performance, as well as by analyzing the different importance and contribution of each individual in the process. The results show that network reliability is very suitable for assessing social activities that contain processes. The case study shows that process improvement can be facilitated by precise analysis of the performance of individuals as well as groups. This is the basis of the so-called precision management of business activities. Future research can inspect multi-commodity flows in a process to show the merits of the theories in network reliability.

Acknowledgements This paper was supported in part by the Ministry of Science and Technology, Taiwan, Republic of China, under Grant No. MOST 107-2221-E-236-004-MY3.

References

1. Abel U, Bicker R (1982) Determination of all minimal cut-sets between a vertex pair in an undirected graph. IEEE Trans Reliab R-31:167–171
2. Aggarwal KK, Chopra YC, Bajwa JS (1982) Capacity consideration in reliability analysis of communication systems. IEEE Trans Reliab 31:177–80
3. Aggarwal KK, Gupta JS, Misra KB (1975) A simple method for reliability evaluation of a communication system. IEEE Trans Commun 23:563–565
4. Chen SG (2020) Computation in network reliability. In: Pham H (ed) Reliability and statistical computing. Springer, pp 107–126
5. Chen SG (2012) An optimal capacity assignment for the robust design problem in capacitated flow networks. Appl Math Model 36(11):5272–5282
6. Chen SG (2014) Reduced recursive inclusion-exclusion principle for the probability of union events. In: 2014 IEEE international conference on industrial engineering and engineering management, Selangor, Malaysia, pp 1–3
7. Chen SG (2014) Reduced recursive sum of disjoint product in network reliability. In: 2014 the 20th ISSAT international conference on reliability and quality in design, Seattle, Washington, USA, pp 170–173
8. Chen SG (2015) Search for all MCs with backtracking. Int J Reliab Qual Perform 6(2):101–106
9. Chen SG (2018) A novel approach to search for all solutions of a non-negative linear Diophantine equation. In: 24th ISSAT international conference on reliability and quality in design, pp 55–57
10. Chen SG, Lin YK (2012) Search for all minimal paths in a general large flow network. IEEE Trans Reliab 61(4):949–956
11. Chen SG, Lin YK (2020) A permutation-and-backtrack approach for reliability evaluation in multistate information networks. Appl Math Comput 373:125024
12. Colbourn CJ (1987) The combinatorics of network reliability. Oxford University Press, UK
13. Ford LR, Fulkerson DR (1962) Flows in networks. Princeton University Press, NJ
14. Jane CC, Lin JS, Yuan J (1993) On reliability evaluation of a limited-flow network in terms of minimal cutsets. IEEE Trans Reliab 42:354–361, 368
15. Jasmon GB, Foong KW (1987) A method for evaluating all the minimal cuts of a graph. IEEE Trans Reliab R-36:538–545
16. Lee SH (1980) Reliability evaluation of a flow network. IEEE Trans Reliab 29:24–26
17. Lin JS, Jane CC, Yuan J (1995) On reliability evaluation of a capacitated-flow network in terms of minimal pathsets. Networks 25:131–138
18. Lin YK (2001) On reliability evaluation of a stochastic-flow network in terms of minimal cuts. J Chin Inst Ind Eng 18:49–54
19. Lin YK (2001) A simple algorithm for reliability evaluation of a stochastic-flow network with node failure. Comput Oper Res 28(13):1277–1285
20. Lin YK, Chen SG (2019) An efficient searching method for minimal path vectors in multi-state networks. Ann Oper Res 1–12
21. Roberts FS, Tesman B (2009) Applied combinatorics, 2nd edn. CRC Press
22. Schrijver A (2002) On the history of the transportation and maximum flow problems. Math Program 91(3):437–445
23. Shen Y (1995) A new simple algorithm for enumerating all minimal paths and cuts of a graph. Microelectron Reliab 35:973–976
24. Xue J (1985) On multistate system analysis. IEEE Trans Reliab 34(4):329–337
25. Yeh WC (2001) A simple algorithm to search for all d-MPs with unreliable nodes. Reliab Eng Syst Saf 73:49–54
26. Yeh WC (2001) A simple approach to search for all d-MCs of a limited-flow network. Reliab Eng Syst Saf 71:15–19
27. Yeh WC (2002) A simple method to verify all d-minimal path candidates of a limited-flow network and its reliability. Int J Adv Manuf Technol 20(1):77–81
28. Yeh WC (2006) A simple algorithm to search for all MCs in networks. Eur J Oper Res 174:1694–1705
29. Yeh WC (2007) A simple heuristic algorithm for generating all minimal paths. IEEE Trans Reliab 56(3):488–494
30. Zuo MJ, Tian Z, Huang HZ (2007) An efficient method for reliability evaluation of multistate networks given all minimal path vectors. IIE Trans 39:811–817

Optimal Design of Reliability Acceptance Sampling Plans for Multi-stage Production Process M. Kumar

Abstract One of the goals in the manufacturing industry is to follow manufacturing standards which ensure that the manufactured products meet the expectations of consumers. An acceptance sampling plan is a tool to ensure that the quality of products meets the minimum expected standards. In this chapter, an attempt is made to derive lot acceptance single and double sampling plans based on Type II censored data. The units in the lot have multiple quality characteristics and are processed through a multi-stage process. We assume that the quality characteristics of units follow the exponential distribution. The acceptance criterion for the given lot is derived based on the mean life of units in the sample at every stage. Further, two non-linear optimization problems, which minimize the expected total testing cost at the acceptable quality level, are solved. In addition, sensitivity analysis studies are conducted to assess the behavior of total testing costs with respect to changes in the producer's and consumer's risks and sample sizes. Several numerical examples and two case studies are presented to illustrate the resulting sampling plans.

1 Introduction

Acceptance sampling is one of the oldest techniques in quality control; it helps to decide on the acceptance of a lot which contains a finite or infinite number of units. It occupies the middle ground between no inspection and 100% inspection. Complete inspection is often not desirable when the cost of testing is high compared to that of passing a defective unit, or when the testing is destructive. Sampling schemes protect both the producer and the consumer: they enable the producer to have good lots accepted and the consumer to reject bad lots. The use of sampling schemes minimizes nonconforming products and maximizes the profit of the industry.

M. Kumar (B) Department of Mathematics, National Institute of Technology, Calicut, Kerala, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_7


There exists a variety of sampling schemes in the literature. The acceptance sampling plans which incorporate the lifetime information of a product are known as reliability sampling plans. They are time censored (Type I), failure censored (Type II), Progressive censored, both time and failure censored (Hybrid) or the mixture of these censorings. Some of the earlier researchers have studied the design and development of the life test sampling plans. The first attempt on the development of the life test sampling plans was done by Epstein and Sobel [1]. They derived the sampling plan for exponential distribution under Type I and Type II censoring. The expected number of failures were obtained in Type I censoring and expected testing time in Type II. Later, many acceptance sampling plans were considered by many authors for different distributions. For example, Epstein [2], Jun et al. [3], Bartholomew [4], Aslam and Jun [5], Azam et al. [6], Sai and Wu [7], Kantam et al. [8], Balamurali and Usha [9], Srinivasa Rao [10], Aslam [11], Aslam and Jun [12] and Chien-Wei Wu et al. [13], Vellaisamy and Sankar [14]. In all these works, they have considered the number of failures. Kumar and Ramyamol [15] have considered reliability acceptance sampling plan for exponential distribution in which the lifetime of failed units is taken into account under repetitive group sampling. Ramyamol and Kumar [16] have also designed acceptance sampling plans for mixture distributions. However, the above-mentioned works concentrated on the single stage processes. Asokan and Balamurali [17] considered multi-attribute acceptance sampling plan for multiple quality characteristics based on a single sample. Using acceptance sampling plan, Duffuaa et al. [18] developed process targeting model for a product with two dependent quality characteristics. 
They determined the first quality characteristic by setting the first process, whereas the determination of the second quality characteristic depends on the setting of the two processes. The quality of the product is controlled by a single sample inspection plan, where inspection is assumed to be error free. Moskowitz et al. [19] proposed a multi-stage Bayesian acceptance sampling model for a serial production process incorporating multiple-type defects that may occur in each stage. Lee and Tagaras [20] derived an economic acceptance sampling plan in a complex multi-stage production system. The first stage can be viewed either as the first manufacturing stage or an incoming lot of components or raw materials on which an operation is performed at the second stage. The planning and scheduling of production in a multi-task or multi-stage batch manufacturing process (for example in the case of industries such as chemical manufacturing, food processing, and oil refining) was developed and studied in the work carried out by Francesco Gaglioppa et al. [21]. In this paper, a reliability acceptance sampling plan is developed for a multi-stage production process, in which units pass sequentially from Machine N to Machine 1. In real life, several products such as semiconductor devices and television tubes are processed through a multi-stage production process, and the literature has addressed acceptance sampling plans for these situations. By exploring the literature, Aslam et al. [22] found that there is no study on the development of an acceptance sampling plan for multiple quality characteristics from a multi-stage process using the time-truncated life test. They considered the development of this sampling plan for the product coming from a two-stage process by assuming that the lifetime of the product follows the Weibull distribution with known

or unknown shape parameters. They took the number of failures as the criterion for accepting the lot and ignored the failure times of the units tested. The cost of testing is also not addressed in their work. The "number of failures" criterion is not enough to judge whether a lot is good or bad. For example, a lot accepted on this information alone (for instance, a lot of a certain brand of bulbs) may actually perform badly when the units are deployed in the market and thereby receive negative feedback from consumers. This results in a huge loss to the industry which produces such units. In the light of the above, we propose to develop a complete acceptance sampling plan which overcomes the drawbacks of the existing sampling plan for multi-stage processes. We include the lifetime information of failed units, instead of depending only on the "number of failures", and make an attempt to develop a reliability acceptance sampling plan for a multi-stage process which minimizes the expected total testing cost.

The rest of the chapter is organized as follows. In Sect. 2, we define a single sampling plan for a multi-stage production process in which the lifetime of units in each stage follows the exponential distribution with different parameters. In Sect. 3, we consider the double sampling plan. Section 4 discusses the results obtained under single and double sampling plans. In Sect. 5, the proposed single and double acceptance sampling plans are compared, and our double sampling plan is compared with the existing plan (see [22]). Section 6 contains a sensitivity analysis of the expected total testing cost. Case studies are presented in Sect. 7, and concluding remarks are given in Sect. 8.

2 Design of Single Sampling Plan (SSP) Under Type II Censoring

Let N be the number of stages (quality characteristics) under consideration. A single sample of size $n_i$, i = 1, 2, ..., N, is taken from the ith stage, and the quality checking is carried out at the end of the ith stage. Throughout the paper, we assume that the stages are independent and that the quality characteristic at the ith stage, i = 1, 2, ..., N, follows the exponential distribution with different parameters. Let $\theta_i$ be the parameter of the exponential distribution in the ith stage (ith quality characteristic) of the process. Then the pdf of the exponential distribution for the ith stage is given by

\[
f(x_i) =
\begin{cases}
\dfrac{1}{\theta_i}\exp\!\Big(\dfrac{-x_i}{\theta_i}\Big), & x_i > 0,\ \theta_i > 0,\\[6pt]
0, & \text{otherwise},
\end{cases} \tag{1}
\]

where $x_i$ is the value of the random variable $X_i$, which represents the lifetime of the ith quality characteristic. The formulation of the acceptance sampling plan for the N stage process is given in the following steps 1 and 2.

1. At the ith stage, i = 1, 2, ..., N, select a random sample of size $n_i$ from the submitted lot and put the units on test. Continue the test until the total number of failures observed is $r_i$. Let $X_{1,n_i}, X_{2,n_i}, \ldots, X_{r_i,n_i}$ be the lifetimes of the failed units in increasing order.
2. Calculate $\hat{\theta}_{r_i,n_i}$, the maximum likelihood estimate (MLE) of $\theta_i$. The expression for the MLE is

\[
\hat{\theta}_{r_i,n_i} = \frac{\sum_{j=1}^{r_i} X_{j,n_i} + (n_i - r_i)\,X_{r_i,n_i}}{r_i}
\]

(see [2]). If $\hat{\theta}_{r_i,n_i} \ge t_i$, accept the units processed in the ith stage and go to the (i + 1)th stage. Otherwise, stop testing at the ith stage and reject the entire lot. Note that the entire lot is accepted only when $\hat{\theta}_{r_i,n_i} \ge t_i$ for all i = 1, 2, ..., N.

In the above defined single sampling plan, the parameters $r_i$ and $t_i$, i = 1, 2, ..., N, are determined by modeling the problem as an optimization problem which minimizes the average total testing cost at the acceptable quality level, satisfying the Type I and Type II error constraints. The acceptance probability at the ith stage is given by $P_i = P(\hat{\theta}_{r_i,n_i} \ge t_i)$.

According to Epstein and Sobel [2], $2 r_i \hat{\theta}_{r_i,n_i}/\theta_i$ follows the chi-square distribution with $2r_i$ degrees of freedom ($\chi^2(2r_i)$). Hence $P_i = P\big(\chi^2(2r_i) \ge 2 r_i t_i/\theta_i\big)$, i = 1, 2, ..., N. Using the property

\[
P\big(\chi^2(2r_i) \ge a\big) = \sum_{j=0}^{r_i-1} \frac{e^{-a/2}\,(a/2)^j}{j!}
\]

of the chi-square distribution, we can write

\[
P_i = \sum_{j=0}^{r_i-1} \frac{e^{-r_i t_i/\theta_i}\,\big(r_i t_i/\theta_i\big)^j}{j!}.
\]

For an N stage process, the final acceptance of the lot depends on the acceptance at all the stages. Since the stages are independent, the final acceptance probability is given by

\[
P = \prod_{i=1}^{N} P_i = \prod_{i=1}^{N}\sum_{j=0}^{r_i-1} \frac{e^{-r_i t_i/\theta_i}\,\big(r_i t_i/\theta_i\big)^j}{j!}.
\]
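The MLE and the acceptance probabilities above translate directly into code. The following Python sketch is an illustration only; the function names and the numerical inputs used in the checks are hypothetical.

```python
import math

def mle_type2(failure_times, n):
    """MLE of theta from the first r ordered failure times out of n units on test."""
    x = sorted(failure_times)
    r = len(x)
    # Total time on test: observed failures plus (n - r) survivors censored at x_r.
    return (sum(x) + (n - r) * x[-1]) / r

def stage_acceptance(r, t, theta):
    """P_i = P(chi2(2r) >= 2rt/theta), written as a Poisson-type sum."""
    a = r * t / theta
    return sum(math.exp(-a) * a ** j / math.factorial(j) for j in range(r))

def lot_acceptance(rs, ts, thetas):
    """P = product of the stage acceptance probabilities (independent stages)."""
    p = 1.0
    for r, t, theta in zip(rs, ts, thetas):
        p *= stage_acceptance(r, t, theta)
    return p

# Hypothetical data: 3 failures observed out of 5 units on test.
theta_hat = mle_type2([1.0, 2.0, 3.0], n=5)
```

For r = 1 the sum collapses to exp(−t/θ), which gives a quick sanity check of the implementation.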

Let $T_i$ denote the time to get $r_i$ failures from a sample of size $n_i$ at the ith stage. Then $T_i$ is a random variable with pdf

\[
g_{r_i,n_i}(x) =
\begin{cases}
\dfrac{n_i!\; e^{-(n_i-r_i+1)x/\theta_i}}{(r_i-1)!\,(n_i-r_i)!\,\theta_i}\,\big[1 - e^{-x/\theta_i}\big]^{r_i-1}, & x > 0,\ \theta_i > 0,\\[6pt]
0, & \text{otherwise}.
\end{cases}
\]

Moreover, the expected testing time is given by

\[
E(T_i) = \theta_i \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1}
\]

(see [2]). Next, we calculate the expected total testing time based on Bernoulli random variables as follows. Let

\[
Z_i =
\begin{cases}
1, & \text{if the } i\text{th quality is acceptable},\\
0, & \text{otherwise}.
\end{cases} \tag{2}
\]

The expected value of $Z_i$ is calculated as

\[
E(Z_i) = 1\cdot P(i\text{th quality is acceptable}) + 0\cdot P(i\text{th quality is unacceptable}) = P(\hat{\theta}_{r_i,n_i} \ge t_i) = P_i.
\]

Now the total random testing time is

\[
T = T_1 + Z_1 T_2 + Z_1 Z_2 T_3 + \cdots + (Z_1 Z_2 \cdots Z_{N-1})\,T_N = \sum_{i=1}^{N}\Big(\prod_{k=0}^{i-1} Z_k\Big) T_i, \qquad Z_0 = 1.
\]

Hence the expected total testing time is

\[
E(T) = \sum_{i=1}^{N}\Big(\prod_{k=0}^{i-1} E(Z_k)\Big) E(T_i) = \sum_{i=1}^{N}\Big(\prod_{k=0}^{i-1} P_k\Big)\,\theta_i \sum_{j=1}^{r_i}\frac{1}{n_i - j + 1}, \qquad P_0 = 1,\ Z_0 = 1,
\]

since the $Z_k$'s and $T_i$'s are independent.

In the subsequent sections, we propose the design of an optimal acceptance sampling plan which minimizes the expected total testing cost. Note that the final total cost of conducting the proposed acceptance sampling plan generally consists of the cost of installing all the test units at the beginning of the test, the cost of the test units, and the operational cost (salary of operators, utilities, depreciation of test equipment, and so on), in addition to the expected total testing cost. The final total cost can be computed once the minimum expected total testing cost is obtained.

2.1 Formation of Optimization Problem and Its Solution

The acceptance sampling plan defined above involves parameters $(r_i, t_i)$, i = 1, 2, ..., N, that need to be determined. In this section, we discuss the formulation of an optimization problem which helps to find the parameters of the plan. The major components of an acceptance sampling plan are the producer's and consumer's risks. Therefore, we find $(r_i, t_i)$, i = 1, 2, ..., N, which suitably control the producer's and consumer's risks and minimize the expected total testing cost. To accommodate these pieces of information, we model our problem as an optimization problem whose objective function is the expected total testing cost at the acceptable quality level. The constraints associated with the problem are the Type I error at the acceptable quality level and the Type II error at the unacceptable quality level. If $C_i$ is the cost of testing a sample of size $n_i$ for unit time at the ith stage, then the expected total testing cost of the process is given by

\[
\sum_{i=1}^{N}\Big[\Big(\prod_{k=0}^{i-1} P_k\Big)\theta_i \sum_{j=1}^{r_i}\frac{1}{n_i - j + 1}\Big] C_i, \qquad P_0 = 1.
\]

Then our optimization problem to obtain the plan parameters $(r_i, t_i)$, i = 1, 2, ..., N, is given by

Problem $Q_1$:

\[
\text{Min}\ \sum_{i=1}^{N}\Big[\Big(\prod_{k=0}^{i-1} P_k\Big)\theta_i \sum_{j=1}^{r_i}\frac{1}{n_i - j + 1}\Big] C_i \tag{3}
\]

such that

\[
P(\text{Type I error at acceptable quality level}) \le \alpha, \tag{4}
\]

\[
P(\text{Type II error at unacceptable quality level}) \le \beta. \tag{5}
\]
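The objective (3) can be evaluated with a short loop that carries the running product of acceptance probabilities. The Python sketch below is illustrative; the function names and the numbers in the example are hypothetical.

```python
def expected_stage_time(theta, n, r):
    """E(T_i) = theta * sum_{j=1..r} 1 / (n - j + 1)."""
    return theta * sum(1.0 / (n - j + 1) for j in range(1, r + 1))

def expected_total_cost(thetas, ns, rs, accept_probs, costs):
    """Objective (3): stage i is reached only if stages 1..i-1 were accepted."""
    total = 0.0
    reach = 1.0  # running product prod_{k<i} P_k, with P_0 = 1
    for theta, n, r, p, c in zip(thetas, ns, rs, accept_probs, costs):
        total += reach * expected_stage_time(theta, n, r) * c
        reach *= p
    return total

# Hypothetical two-stage example with unit costs.
cost = expected_total_cost(thetas=[10.0, 10.0], ns=[2, 2], rs=[2, 2],
                           accept_probs=[0.5, 0.9], costs=[1.0, 1.0])
```

With unit costs the same function returns the expected total testing time E(T).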

The acceptable and unacceptable quality levels are defined as follows. We have N stages (quality characteristics), and $\theta_i$, i = 1, 2, ..., N, is the parameter of the exponential distribution which represents the lifetime of the ith quality characteristic. At the acceptable quality level, every $\theta_i$ should attain a minimum value: $\theta_i \ge \theta_i^*$, i = 1, 2, ..., N. At the unacceptable quality level, $\theta_i \le \theta_i^0$ for some i, 1 ≤ i ≤ N. We assume that $(\theta_i^*, \theta_i^0)$, 1 ≤ i ≤ N, are known. In view of this, Problem $Q_1$ can be rewritten as

Problem $Q_2$:

\[
\text{Min}\ \sum_{i=1}^{N}\Big[\Big(\prod_{k=0}^{i-1} P_k\Big)\theta_i \sum_{j=1}^{r_i}\frac{1}{n_i - j + 1}\Big] C_i \tag{6}
\]

such that

\[
\prod_{i=1}^{N} P_i \ge 1 - \alpha, \qquad \theta_i \ge \theta_i^*,\ \forall\, 1 \le i \le N, \tag{7}
\]

\[
\prod_{i=1}^{N} P_i \le \beta, \qquad \theta_i \le \theta_i^0,\ \text{for some } 1 \le i \le N. \tag{8}
\]

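Observe that the above Problem $Q_2$ can be rewritten as

Problem $Q_3$:

\[
\text{Min}\ \sum_{i=1}^{N}\Big[\Big(\prod_{k=0}^{i-1} P_k\Big)\theta_i \sum_{j=1}^{r_i}\frac{1}{n_i - j + 1}\Big] C_i \tag{9}
\]

such that

\[
\min_{(\theta_1,\theta_2,\ldots,\theta_N)}\ \prod_{i=1}^{N} P_i \ge 1 - \alpha, \qquad \theta_i \ge \theta_i^*,\ \forall\, 1 \le i \le N, \tag{10}
\]

\[
\max_{(\theta_1,\theta_2,\ldots,\theta_N)}\ \prod_{i=1}^{N} P_i \le \beta, \qquad \theta_i \le \theta_i^0,\ \text{for some } 1 \le i \le N, \tag{11}
\]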

where α and β are the producer's and consumer's risks, respectively. Note that for an N stage process one has to find 2N parameters, $(r_1, t_1), (r_2, t_2), \ldots, (r_N, t_N)$. The problem is intractable when N ≥ 3. Hence we solve our problem for the case N = 2, and Problem $Q_3$ can be written as Problem $Q_4$ as follows:

\[
\text{Min}\ \sum_{i=1}^{2}\Big[\Big(\prod_{k=0}^{i-1} P_k\Big)\theta_i \sum_{j=1}^{r_i}\frac{1}{n_i - j + 1}\Big] C_i \tag{12}
\]

such that

\[
\min_{(\theta_1,\theta_2)}\ \prod_{i=1}^{2} P_i \ge 1 - \alpha, \qquad \theta_i \ge \theta_i^*,\ \forall\, 1 \le i \le 2, \tag{13}
\]

\[
\max_{(\theta_1,\theta_2)}\ \prod_{i=1}^{2} P_i \le \beta, \qquad \theta_i \le \theta_i^0,\ \text{for some } 1 \le i \le 2. \tag{14}
\]

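Observe that $\prod_{i=1}^{2} P_i$ on the left-hand side of the inequalities (13) and (14) is increasing in each $\theta_i$. The minimum of $\prod_{i=1}^{2} P_i$ in (13) therefore occurs at $(\theta_1^*, \theta_2^*)$. Consider the feasible region described in (14) and divide it into three parts as follows: $A_1 = \{(\theta_1, \theta_2) : \theta_1 \le \theta_1^0 \text{ and } \theta_2 \le \theta_2^0\}$, $A_2 = \{(\theta_1, \theta_2) : \theta_1 \ge \theta_1^0 \text{ and } \theta_2 \le \theta_2^0\}$, and $A_3 = \{(\theta_1, \theta_2) : \theta_1 \le \theta_1^0 \text{ and } \theta_2 \ge \theta_2^0\}$. The regions $A_2$ and $A_3$ are unbounded, so Problem $Q_4$ cannot be solved over them. Therefore, we consider only the bounded region $A_1$ and note that the point of maximum is $(\theta_1^0, \theta_2^0)$. So the final optimization problem becomes

Problem $Q_5$:

\[
\text{Min}\ \sum_{i=1}^{2}\Big[\Big(\prod_{k=0}^{i-1} P_k\Big)\theta_i \sum_{j=1}^{r_i}\frac{1}{n_i - j + 1}\Big] C_i \tag{15}
\]

such that

\[
\prod_{i=1}^{2} P_i \ge 1 - \alpha \quad \text{at } (\theta_1^*, \theta_2^*), \tag{16}
\]

\[
\prod_{i=1}^{2} P_i \le \beta \quad \text{at } (\theta_1^0, \theta_2^0). \tag{17}
\]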

The optimization problem for N = 2 is solved by genetic algorithm solver in Matlab. We give several numerical examples in Table 1.
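For readers without access to a genetic algorithm solver, Problem $Q_5$ can also be explored with a plain exhaustive grid search, which is a different and much cruder technique than the one used in the chapter. The Python sketch below assumes hypothetical inputs (sample sizes, quality levels, risks, and unit costs) and restricts each $t_i$ to a grid between 0 and $\theta_i^*$; none of these values come from Table 1.

```python
import math
from itertools import product

def accept(r, t, theta):
    """Stage acceptance probability P_i for plan (r, t) at mean life theta."""
    a = r * t / theta
    return sum(math.exp(-a) * a ** j / math.factorial(j) for j in range(r))

def stage_time(theta, n, r):
    """Expected time to observe r failures among n units."""
    return theta * sum(1.0 / (n - j + 1) for j in range(1, r + 1))

def solve_q5(n, theta_good, theta_bad, cost, alpha, beta, steps=50):
    """Grid search over (r1, t1, r2, t2); returns (cost, (r1, t1), (r2, t2)) or None."""
    options = []
    for i in range(2):
        opts = []
        for r in range(1, n[i] + 1):
            for s in range(1, steps + 1):
                t = theta_good[i] * s / steps
                opts.append((r, t,
                             accept(r, t, theta_good[i]),   # P_i at theta*
                             accept(r, t, theta_bad[i])))   # P_i at theta0
        options.append(opts)
    best = None
    for (r1, t1, g1, b1), (r2, t2, g2, b2) in product(*options):
        if g1 * g2 < 1 - alpha or b1 * b2 > beta:
            continue  # violates constraint (16) or (17)
        c = (stage_time(theta_good[0], n[0], r1) * cost[0]
             + g1 * stage_time(theta_good[1], n[1], r2) * cost[1])
        if best is None or c < best[0]:
            best = (c, (r1, t1), (r2, t2))
    return best

# Hypothetical inputs, chosen only to exercise the search.
plan = solve_q5(n=(10, 10), theta_good=(1000.0, 1000.0), theta_bad=(200.0, 200.0),
                cost=(1.0, 1.0), alpha=0.05, beta=0.10)
```

The grid resolution limits the accuracy of the thresholds, so the result should be read as a starting point for a finer search rather than as an optimal plan.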

3 Construction of Double Sampling Plan (DSP) Under Type II Censoring A double sampling plan, in general, is more difficult to construct and hence need not be easy for implementation as compared to the single sampling plan. However, it may give similar levels of the consumer’s and the producer’s risk and requires less sampling in the long run than a single sampling plan. We consider the double sampling plan for multi-stage process which is defined as follows. Here one may note that each stage can have two samples. The DSP algorithm is given as follows. For 1 ≤ i ≤ N :

130

M. Kumar

1. Select a random sample of size $n_i$.
2. Observe $r_i$ failures.
3. Calculate the MLE $\hat\theta^{(1)}_{r_i,n_i}$ of $\theta_i$.
   a. If $\hat\theta^{(1)}_{r_i,n_i} \ge t_{i2}$, accept the units produced in the $i$th stage. Set $i = i + 1$, go to Step 1, else go to (b).
   b. If $t_{i1} \le \hat\theta^{(1)}_{r_i,n_i} < t_{i2}$, take one more sample from the $i$th stage, fixing the same sample size $n_i$, and observe $r_i$ failures. Calculate the MLE $\hat\theta^{(2)}_{r_i,n_i}$ of $\theta_i$ using the new sample and go to (c). Else go to (e).
   c. If $\hat\theta^{(1)}_{r_i,n_i} + \hat\theta^{(2)}_{r_i,n_i} \ge 2t_{i2}$, accept the units from the $i$th stage. Set $i = i + 1$, go to Step 1. Else go to (d).
   d. Reject the lot, go to Step 5.
   e. Reject the lot, go to Step 5.
4. If all the stages are accepted, then accept the entire lot.
5. End.

The unknown quantities $(t_{i1}, t_{i2}, r_i)$, $i = 1, 2, \ldots, N$, are determined by formulating an optimization problem which minimizes the expected total testing cost. In the double sampling plan, the acceptance probability $P$ is calculated as follows. At the $i$th stage, we accept the units if

\[
\hat\theta^{(1)}_{r_i,n_i} \ge t_{i2} \quad\text{or}\quad \left( t_{i1} \le \hat\theta^{(1)}_{r_i,n_i} < t_{i2} \ \text{and}\ \hat\theta^{(1)}_{r_i,n_i} + \hat\theta^{(2)}_{r_i,n_i} \ge 2t_{i2} \right).
\]

Then the acceptance probability at the $i$th stage is given by

\[
P_i = P\big(\hat\theta^{(1)}_{r_i,n_i} \ge t_{i2}\big) + P\big((t_{i1} \le \hat\theta^{(1)}_{r_i,n_i} \le t_{i2}) \cap (\hat\theta^{(1)}_{r_i,n_i} + \hat\theta^{(2)}_{r_i,n_i} \ge 2t_{i2})\big).
\]

Here $\hat\theta^{(1)}_{r_i,n_i}$ and $\hat\theta^{(2)}_{r_i,n_i}$ are independent random variables. Denote $\hat\theta^{(1)}_{r_i,n_i} = X$ and $\hat\theta^{(2)}_{r_i,n_i} = Y$. Then the pdf of $X$ (see [1]) is given by

\[
f(x) =
\begin{cases}
\left(\dfrac{2r_i}{\theta_i}\right)^{r_i} \dfrac{x^{r_i-1}}{2^{r_i}\,\Gamma(r_i)} \exp\left(-\dfrac{r_i x}{\theta_i}\right), & x > 0,\ \theta_i > 0, \\[2ex]
0, & \text{otherwise.}
\end{cases}
\tag{18}
\]

Note that Y also has the same pdf. The joint pdf of X and Y is f (x) f (y) since X and Y are independent. Thus


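\[
\begin{aligned}
P\big((t_{i1} \le X \le t_{i2}) \cap (X + Y \ge 2t_{i2})\big)
&= \int_{t_{i1}}^{t_{i2}} \int_{2t_{i2}-x}^{\infty}
\left(\frac{2r_i}{\theta_i}\right)^{r_i} \frac{x^{r_i-1}}{2^{r_i}\Gamma(r_i)} \exp\left(-\frac{r_i x}{\theta_i}\right)
\left(\frac{2r_i}{\theta_i}\right)^{r_i} \frac{y^{r_i-1}}{2^{r_i}\Gamma(r_i)} \exp\left(-\frac{r_i y}{\theta_i}\right) dy\, dx \\
&= \int_{t_{i1}}^{t_{i2}}
\left(\frac{2r_i}{\theta_i}\right)^{r_i} \frac{x^{r_i-1}}{2^{r_i}\Gamma(r_i)} \exp\left(-\frac{r_i x}{\theta_i}\right)
\exp\left(-\frac{r_i (2t_{i2}-x)}{\theta_i}\right)
\sum_{j=0}^{r_i-1} \frac{1}{j!}\left(\frac{r_i (2t_{i2}-x)}{\theta_i}\right)^{j} dx \\
&= \sum_{j=0}^{r_i-1} \left(\frac{r_i}{\theta_i}\right)^{r_i+j} \exp\left(-\frac{2r_i t_{i2}}{\theta_i}\right) \frac{1}{j!\,\Gamma(r_i)}
\int_{t_{i1}}^{t_{i2}} x^{r_i-1} (2t_{i2}-x)^{j}\, dx .
\end{aligned}
\]

Observe that

\[
P(X \ge t_{i2}) + P\big((t_{i1} \le X \le t_{i2}) \cap (X + Y \ge 2t_{i2})\big)
= P\left(\chi^2(2r_i) \ge \frac{2r_i t_{i2}}{\theta_i}\right)
+ \sum_{j=0}^{r_i-1} \left(\frac{r_i}{\theta_i}\right)^{r_i+j} \exp\left(-\frac{2r_i t_{i2}}{\theta_i}\right) \frac{1}{j!\,\Gamma(r_i)}
\int_{t_{i1}}^{t_{i2}} x^{r_i-1} (2t_{i2}-x)^{j}\, dx .
\]

Hence

\[
P_i = \sum_{j=0}^{r_i-1} \frac{e^{-r_i t_{i2}/\theta_i}\,(r_i/\theta_i)^{j}}{j!}
\left[ t_{i2}^{\,j} + \frac{e^{-r_i t_{i2}/\theta_i}\,(r_i/\theta_i)^{r_i}}{\Gamma(r_i)} \int_{t_{i1}}^{t_{i2}} x^{r_i-1} (2t_{i2}-x)^{j}\, dx \right].
\]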
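The stage acceptance probability derived above can be checked numerically. The following is a minimal sketch, not the author's implementation: the function names are hypothetical, the integral is evaluated with a composite Simpson rule, and the Monte Carlo check assumes that the MLE follows a gamma distribution with shape $r_i$ and scale $\theta_i/r_i$, which is the density in (18).

```python
import math
import random


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    if a == b:
        return 0.0
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def stage_acceptance_prob(r, t1, t2, theta):
    """P_i = P(X >= t2) + P(t1 <= X <= t2 and X + Y >= 2*t2) for one stage."""
    a = r / theta  # rate parameter of the gamma density of the MLE
    # First term: gamma survival function written as a Poisson-type sum.
    first = sum(math.exp(-a * t2) * (a * t2) ** j / math.factorial(j)
                for j in range(r))
    # Second term: sum over j of a^(r+j) e^(-2 a t2) / (j! Gamma(r)) * integral.
    second = 0.0
    for j in range(r):
        integral = simpson(lambda x: x ** (r - 1) * (2 * t2 - x) ** j, t1, t2)
        second += (a ** (r + j) * math.exp(-2 * a * t2)
                   / (math.factorial(j) * math.gamma(r)) * integral)
    return first + second


def stage_acceptance_prob_mc(r, t1, t2, theta, trials=200_000, seed=1):
    """Monte Carlo estimate of the same probability (assumed gamma model)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.gammavariate(r, theta / r)  # first-sample MLE
        if x >= t2:
            hits += 1
        elif x >= t1 and x + rng.gammavariate(r, theta / r) >= 2 * t2:
            hits += 1  # accepted on the combined evidence of two samples
    return hits / trials


if __name__ == "__main__":
    # Illustrative values only: r = 2, t1 = 0.35, t2 = 0.8, theta = 1.
    print(stage_acceptance_prob(2, 0.35, 0.8, 1.0))
    print(stage_acceptance_prob_mc(2, 0.35, 0.8, 1.0))
```

The two printed values should agree to within Monte Carlo error, which gives a quick consistency check on the closed-form expression before it is used inside an optimizer.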

Now, the independence of all the stages gives the total acceptance probability, $P = \prod_{i=1}^{N} P_i$. As in the single sampling plan, we find the plan parameters $(r_i, t_{i1}, t_{i2})$, $i = 1, 2, \ldots, N$, by formulating an optimization problem which minimizes the expected total testing cost at the acceptable quality level while satisfying the Type I and Type II error constraints. We have

\[
E(Z_i) = P\big(\hat\theta^{(1)}_{r_i,n_i} \ge t_{i2}\big) + P\big((t_{i1} \le \hat\theta^{(1)}_{r_i,n_i} \le t_{i2}) \cap (\hat\theta^{(1)}_{r_i,n_i} + \hat\theta^{(2)}_{r_i,n_i} \ge 2t_{i2})\big) = P_i,
\]

where $Z_i$ is the Bernoulli random variable defined in Sect. 2. Let $T_i$ be the random testing time to get $r_i$ failures from a sample of size $n_i$. Then the random testing time for the $i$th stage, $i = 1, 2, \ldots, N$, in the DSP is $(1 + P_{i1})T_i$, where $P_{i1} = P\big(t_{i1} \le \hat\theta^{(1)}_{r_i,n_i} \le t_{i2}\big)$. Also

\[
E(T_i) = \theta_i \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1}.
\]

Hence the total random testing time is

\[
T = (1 + P_{11})T_1 + (1 + P_{21})Z_1 T_2 + \cdots + (1 + P_{N1})(Z_{N-1} \cdots Z_1)T_N .
\]

If $C_i$ is the cost of testing a sample of size $n_i$ for unit time in the $i$th stage, then the expected total testing cost is

\[
\sum_{i=1}^{N} \left(\prod_{k=0}^{i-1} P_k\right) \theta_i (1 + P_{i1}) \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1}\, C_i ,
\]

where $P_0 = 1$ and

\[
P_{i1} = \sum_{m=0}^{r_i-1} \left[ \frac{e^{-r_i t_{i1}/\theta_i}\,(r_i t_{i1}/\theta_i)^{m}}{m!} - \frac{e^{-r_i t_{i2}/\theta_i}\,(r_i t_{i2}/\theta_i)^{m}}{m!} \right].
\]

Then our optimization problem becomes Problem $Q_6$:

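\[
\min \sum_{i=1}^{N} \left[ \left(\prod_{k=0}^{i-1} P_k\right) \theta_i (1 + P_{i1}) \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1} \right] C_i
\tag{19}
\]

such that

\[
\prod_{i=1}^{N} P_i \ge 1 - \alpha, \quad \theta_i \ge \theta_i^*,\ \forall\, 1 \le i \le N
\tag{20}
\]

\[
\prod_{i=1}^{N} P_i \le \beta, \quad \theta_i \le \theta_i^0 \ \text{for some } 1 \le i \le N .
\tag{21}
\]

Observe that the inequalities in (20) and (21) are equivalent to

\[
\min_{(\theta_1, \theta_2, \ldots, \theta_N)} \prod_{i=1}^{N} P_i \ge 1 - \alpha, \quad \theta_i \ge \theta_i^*,\ \forall\, 1 \le i \le N
\tag{22}
\]

\[
\max_{(\theta_1, \theta_2, \ldots, \theta_N)} \prod_{i=1}^{N} P_i \le \beta, \quad \theta_i \le \theta_i^0 \ \text{for some } 1 \le i \le N .
\tag{23}
\]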

For an $N$-stage process, one has to find $3N$ parameters, namely $(r_i, t_{i1}, t_{i2})$, $i = 1, 2, \ldots, N$, to implement the DSP. The problem is hard to deal with for $N \ge 3$, so we solve it for the case $N = 2$. The problem can then be rewritten as Problem $Q_7$:

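\[
\min \sum_{i=1}^{2} \left[ \left(\prod_{k=0}^{i-1} P_k\right) \theta_i (1 + P_{i1}) \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1} \right] C_i
\tag{24}
\]

such that

\[
\min_{(\theta_1, \theta_2)} \prod_{i=1}^{2} P_i \ge 1 - \alpha, \quad \theta_i \ge \theta_i^*,\ \forall\, 1 \le i \le 2
\tag{25}
\]

\[
\max_{(\theta_1, \theta_2)} \prod_{i=1}^{2} P_i \le \beta, \quad \theta_i \le \theta_i^0 \ \text{for some } 1 \le i \le 2 .
\tag{26}
\]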

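Now divide the feasible region of (26) into three parts, as in Sect. 2: $A_1 = \{(\theta_1, \theta_2) : \theta_1 \le \theta_1^0 \text{ and } \theta_2 \le \theta_2^0\}$, $A_2 = \{(\theta_1, \theta_2) : \theta_1 \ge \theta_1^0 \text{ and } \theta_2 \le \theta_2^0\}$ and $A_3 = \{(\theta_1, \theta_2) : \theta_1 \le \theta_1^0 \text{ and } \theta_2 \ge \theta_2^0\}$. If we choose $t_{i2} \ge \theta_i/2$, then $\prod_{i=1}^{2} P_i$ in (25) and (26) is increasing with respect to each $\theta_i$. The minimum in (25) therefore occurs at $(\theta_1^*, \theta_2^*)$. Since the feasible region of inequality (26) is unbounded, we consider the region $A_1$ only, where the maximum occurs at $(\theta_1^0, \theta_2^0)$. So Problem $Q_7$ becomes $Q_8$: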

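\[
\min \sum_{i=1}^{2} \left[ \left(\prod_{k=0}^{i-1} P_k\right) \theta_i (1 + P_{i1}) \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1} \right] C_i
\tag{27}
\]

such that

\[
\prod_{i=1}^{2} P_i \ge 1 - \alpha \quad \text{at } (\theta_1^*, \theta_2^*)
\tag{28}
\]

\[
\prod_{i=1}^{2} P_i \le \beta \quad \text{at } (\theta_1^0, \theta_2^0) .
\tag{29}
\]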

The optimization problem can be solved using the genetic algorithm solver in Matlab. Examples are given in Table 2. It is observed from Tables 3 and 4 that a change in sample size will affect the expected total testing time and cost.
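To make the objective of Problems $Q_6$ to $Q_8$ concrete, the sketch below evaluates the expected total testing cost for given plan parameters. It is an illustration under stated assumptions rather than the procedure used to produce the tables: all function and key names are hypothetical, and the stage acceptance probabilities $P_i$ are passed in by the caller instead of being computed here.

```python
import math


def expected_stage_time(theta, r, n):
    """E(T_i) = theta * sum_{j=1}^{r} 1 / (n - j + 1)."""
    return theta * sum(1.0 / (n - j + 1) for j in range(1, r + 1))


def poisson_type_sum(r, t, theta):
    """sum_{m=0}^{r-1} exp(-r t / theta) (r t / theta)^m / m!."""
    lam = r * t / theta
    return sum(math.exp(-lam) * lam ** m / math.factorial(m) for m in range(r))


def second_sample_prob(r, t1, t2, theta):
    """P_i1 = P(t1 <= theta_hat <= t2), the chance a second sample is drawn."""
    return poisson_type_sum(r, t1, theta) - poisson_type_sum(r, t2, theta)


def expected_total_cost(stages, accept_probs):
    """Expected total testing cost of a multi-stage DSP.

    stages: list of dicts with keys theta, r, n, t1, t2, cost (hypothetical layout).
    accept_probs: stage acceptance probabilities P_1, ..., P_N supplied by the caller.
    """
    total = 0.0
    reach = 1.0  # prod_{k=0}^{i-1} P_k with P_0 = 1
    for stage, p_accept in zip(stages, accept_probs):
        p_i1 = second_sample_prob(stage["r"], stage["t1"], stage["t2"], stage["theta"])
        stage_time = expected_stage_time(stage["theta"], stage["r"], stage["n"])
        total += reach * (1.0 + p_i1) * stage_time * stage["cost"]
        reach *= p_accept  # later stages are tested only if this one is accepted
    return total
```

A function of this shape could serve as the objective handed to a general-purpose optimizer, with the constraints (28) and (29) checked separately at $(\theta_1^*, \theta_2^*)$ and $(\theta_1^0, \theta_2^0)$.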

4 Numerical Results and Comparisons

In this section, we present some numerical results for single and double sampling plans in Tables 1 and 2, respectively. Various comparisons of results based on SSP $(r_i, t_i)$ and DSP $(r_i, t_{i1}, t_{i2})$ are shown in Tables 3 and 4. First, we compare SSP $(r_i, t_i)$ and DSP $(r_i, t_{i1}, t_{i2})$. Next, DSP $(r_i, t_{i1}, t_{i2})$ is compared with the existing plan in the literature [22]. Aslam et al. [22] consider the quality ratio and calculate the minimum sample number for different quality ratios. The main drawback of their work is that the plan parameters are calculated using the quality ratio, which ignores the actual measured lifetime of the product for a specified period. Our DSP $(r_i, t_{i1}, t_{i2})$ is based on the lifetime of the product: one can choose the sample size, the units are tested for failure, and the testing cost changes according to the lifetime of the products. For example, in Aslam et al. [22], when $\alpha = 0.05$, $\beta = 0.1$, ratio $= 2$ and $a = 0.5$, their plan gives an average sample number of 41.66, and the lot is accepted or rejected based only on the number of defectives obtained. Hence the test plan ignores the actual lifetime of the units in the lot. The total testing time involved is 1200 units. Next, we consider our DSP $(r_i, t_{i1}, t_{i2})$ problem. Taking $\alpha = 0.05$, $\beta = 0.1$, $\theta_1^* = 100$, $\theta_1^0 = 30$, $\theta_2^* = 300$, $\theta_2^0 = 100$, $n_1 = 15$ and $n_2 = 18$ gives an expected total testing cost of 114.2. Note that the actual testing cost may be even less than 114.2. It is observed that the proposed plan DSP $(r_i, t_{i1}, t_{i2})$ has an advantage of savings in expected testing costs of about 50%.

5 Comparisons of Single Sampling and Double Sampling Plans

In this section, we compare our single sampling and double sampling plans. Our plans are also compared with existing sampling plans in the literature (see [22]). Our plans perform better than the existing plans in terms of testing costs incurred (see Tables 3, 4, 5 and 6).


Table 1 Examples of single sampling plan (SSP $(r_i, t_i)$). The values of $t_1$, $t_2$ and $r_1$, $r_2$ for various choices of $(\theta_1^*, \theta_1^0)$, $(\theta_2^*, \theta_2^0)$, $(\alpha, \beta)$ and $C_1 = C_2 = 1$

| $(\alpha, \beta)$ | $(\theta_1^*, \theta_1^0)$ | $(\theta_2^*, \theta_2^0)$ | $(n_1, n_2)$ | $t_1$ | $t_2$ | $(r_1, r_2)$ | ETC at $\theta_1^*, \theta_2^*$ |
|---|---|---|---|---|---|---|---|
| 0.1, 0.25 | 1500, 700 | 900, 500 | 20, 20 | 850 | 400 | 12, 1 | 1711 |
| 0.1, 0.05 | 1500, 700 | 900, 500 | 20, 20 | 1000 | 500 | 14, 6 | 2439.4 |
| 0.1, 0.05 | 1000, 200 | 300, 100 | 15, 18 | 550 | 200 | 9, 10 | 1375.3 |
| 0.05, 0.01 | 100, 30 | 300, 100 | 15, 18 | 50 | 100 | 7, 5 | 208.58 |
| 0.05, 0.05 | 100, 30 | 300, 100 | 15, 18 | 55 | 111 | 7, 7 | 256.49 |
| 0.1, 0.05 | 100, 30 | 300, 100 | 15, 18 | 59 | 99 | 5, 2 | 118.5925 |
| 0.05, 0.06 | 500, 100 | 300, 100 | 10, 10 | 203 | 109 | 4, 3 | 535.1864 |
| 0.1, 0.1 | 500, 100 | 300, 100 | 10, 10 | 242 | 112 | 2, 3 | 371.9771 |
| 0.05, 0.05 | 500, 100 | 300, 100 | 10, 10 | 201 | 101 | 5, 3 | 636.4034 |

Table 2 Examples of double sampling plan (DSP $(r_i, t_{i1}, t_{i2})$). The values of $(t_{11}, t_{12})$, $(t_{21}, t_{22})$ and $r_1$, $r_2$ for various choices of $(\theta_1^*, \theta_1^0)$, $(\theta_2^*, \theta_2^0)$, $(\alpha, \beta)$ and $C_1 = C_2 = 1$

| $(\alpha, \beta)$ | $(\theta_1^*, \theta_1^0)$ | $(\theta_2^*, \theta_2^0)$ | $(n_1, n_2)$ | $(t_{11}, t_{12})$ | $(t_{21}, t_{22})$ | $(r_1, r_2)$ | ETC at $\theta_1^*, \theta_2^*$ |
|---|---|---|---|---|---|---|---|
| 0.1, 0.05 | 1500, 500 | 900, 300 | 20, 20 | 275, 650 | 65, 485 | 4, 3 | 802.96 |
| 0.05, 0.1 | 1500, 500 | 900, 300 | 20, 20 | 150, 669 | 65, 372 | 4, 3 | 819.296 |
| 0.1, 0.25 | 1500, 500 | 900, 300 | 20, 20 | 40, 396 | 85, 381 | 2, 2 | 609.9 |
| 0.1, 0.05 | 1000, 300 | 300, 50 | 15, 10 | 99, 387 | 18, 98 | 2, 2 | 409.699 |
| 0.05, 0.1 | 1000, 300 | 300, 50 | 15, 10 | 57, 226 | 19, 97 | 2, 2 | 468.645 |
| 0.1, 0.25 | 300, 80 | 800, 350 | 10, 15 | 17, 96 | 71, 326 | 2, 2 | 412.442 |
| 0.1, 0.15 | 300, 80 | 800, 350 | 10, 15 | 14, 119 | 68, 365 | 2, 3 | 480.88 |
| 0.1, 0.25 | 300, 80 | 300, 80 | 10, 15 | 14, 97 | 11, 57 | 2, 1 | 204.76 |
| 0.2, 0.25 | 300, 80 | 300, 80 | 10, 15 | 19, 97 | 11, 58 | 1, 1 | 150.26 |
| 0.15, 0.15 | 700, 90 | 400, 100 | 10, 15 | 25, 154 | 11, 91 | 1, 1 | 267.05 |

Table 3 Comparison between single (SSP $(r_i, t_i)$) and double (DSP $(r_i, t_{i1}, t_{i2})$) sampling plans. Comparisons of SSP $(r_i, t_i)$ and DSP $(r_i, t_{i1}, t_{i2})$ for various choices of $(\theta_1^*, \theta_1^0)$, $(\theta_2^*, \theta_2^0)$, $(\alpha, \beta)$ and $C_1 = C_2 = 1$

| $(\alpha, \beta)$ | $(\theta_1^*, \theta_1^0)$ | $(\theta_2^*, \theta_2^0)$ | $(n_1, n_2)$ | ETC under single sampling | ETC under double sampling |
|---|---|---|---|---|---|
| 0.1, 0.25 | 1500, 700 | 900, 500 | 20, 20 | 1711 | 978.3 |
| 0.05, 0.05 | 100, 30 | 300, 100 | 15, 18 | 256.49 | 114.2 |
| 0.05, 0.05 | 500, 100 | 300, 100 | 10, 10 | 636.4034 | 233.544 |

Table 4 Comparison between double sampling plan DSP $(r_i, t_{i1}, t_{i2})$ and the existing sampling plan of Aslam et al. [22]. Comparisons of double sampling plan with the existing plan for various choices of $(\theta_1^*, \theta_1^0)$, $(\theta_2^*, \theta_2^0)$, $(\alpha, \beta)$ and $C_1 = C_2 = 1$

| $(\alpha, \beta)$ | $(\theta_1^*, \theta_1^0)$ | $(\theta_2^*, \theta_2^0)$ | $(n_1, n_2)$ | ETC under double sampling | Expected total testing cost for existing plan, $a = 0.5$ and ratio $= 2$ | Average sample number for existing plan |
|---|---|---|---|---|---|---|
| 0.05, 0.25 | 1500, 700 | 900, 500 | 20, 20 | 978.3 | 1200 | 40.13 |
| 0.05, 0.05 | 100, 30 | 300, 100 | 15, 18 | 114.2 | 200 | 65.66 |
| 0.05, 0.1 | 1500, 700 | 900, 500 | 20, 20 | 1015.4 | 1200 | 53.47 |


Table 5 Effect of change in sample number on cost in single sampling plan (SSP $(r_i, t_i)$). The values of ETC for various choices of $(\theta_0, \theta_1)$, $(\alpha, \beta)$ and $C = 1$

| $(\alpha, \beta)$ | $(\theta_1^*, \theta_1^0)$ | $(\theta_2^*, \theta_2^0)$ | $(n_1, n_2)$ | $t_1$ | $t_2$ | $(r_1, r_2)$ | ETC at $\theta_1^*, \theta_2^*$ |
|---|---|---|---|---|---|---|---|
| 0.1, 0.25 | 1500, 700 | 900, 500 | 15, 10 | 850 | 400 | 12, 1 | 3086.9 |
| 0.1, 0.25 | 1500, 700 | 900, 500 | 25, 10 | 850 | 400 | 12, 1 | 1392.6 |
| 0.1, 0.25 | 1500, 700 | 900, 500 | 15, 20 | 850 | 400 | 12, 1 | 2953.3 |
| 0.1, 0.05 | 1000, 200 | 300, 100 | 15, 25 | 550 | 200 | 9, 10 | 1272.8 |
| 0.1, 0.05 | 1000, 200 | 300, 100 | 25, 25 | 550 | 200 | 9, 10 | 711.6130 |
| 0.1, 0.05 | 1000, 200 | 300, 100 | 15, 15 | 550 | 200 | 9, 10 | 1474.9 |
| 0.05, 0.01 | 100, 30 | 300, 100 | 10, 12 | 50 | 100 | 7, 5 | 365.1710 |
| 0.05, 0.01 | 100, 30 | 300, 100 | 15, 20 | 50 | 100 | 7, 5 | 193.7672 |
| 0.05, 0.01 | 100, 30 | 300, 100 | 25, 20 | 50 | 100 | 7, 5 | 156.4667 |

Table 6 Effect of change in sample number on cost in double sampling plan DSP $(r_i, t_{i1}, t_{i2})$. The values of ETC for various choices of $(\theta_1^*, \theta_1^0)$, $(\theta_2^*, \theta_2^0)$, $(\alpha, \beta)$ and $C_1 = C_2 = 1$

| $(\alpha, \beta)$ | $(\theta_1^*, \theta_1^0)$ | $(\theta_2^*, \theta_2^0)$ | $(n_1, n_2)$ | $(t_{11}, t_{12})$ | $(t_{21}, t_{22})$ | $(r_1, r_2)$ | ETC at $\theta_1^*, \theta_2^*$ |
|---|---|---|---|---|---|---|---|
| 0.1, 0.05 | 1500, 500 | 900, 300 | 20, 20 | 275, 650 | 65, 485 | 4, 3 | 802.96 |
| 0.1, 0.05 | 1500, 500 | 900, 300 | 10, 20 | 275, 650 | 65, 485 | 4, 3 | 1439.30 |
| 0.1, 0.05 | 1500, 500 | 900, 300 | 10, 10 | 275, 650 | 65, 485 | 4, 3 | 1709.41 |
| 0.1, 0.05 | 1000, 300 | 300, 50 | 15, 10 | 99, 387 | 18, 98 | 2, 2 | 409.699 |
| 0.1, 0.05 | 1000, 300 | 300, 50 | 25, 10 | 99, 387 | 18, 98 | 2, 2 | 287.9 |
| 0.1, 0.05 | 1000, 300 | 300, 50 | 5, 15 | 99, 387 | 18, 98 | 2, 2 | 1003.8 |
| 0.1, 0.25 | 300, 80 | 800, 350 | 10, 15 | 17, 96 | 71, 326 | 2, 2 | 412.442 |
| 0.1, 0.25 | 300, 80 | 800, 350 | 15, 15 | 17, 96 | 71, 326 | 2, 2 | 365.79 |
| 0.1, 0.25 | 300, 80 | 800, 350 | 15, 25 | 17, 96 | 71, 326 | 2, 2 | 256.06 |
| 0.1, 0.25 | 300, 80 | 300, 80 | 10, 15 | 14, 97 | 11, 57 | 2, 1 | 204.76 |
| 0.1, 0.25 | 300, 80 | 300, 80 | 5, 5 | 14, 97 | 11, 57 | 2, 1 | 471.02 |
| 0.1, 0.25 | 300, 80 | 300, 80 | 5, 25 | 14, 97 | 11, 57 | 2, 1 | 323.4 |
| 0.15, 0.15 | 700, 90 | 400, 100 | 10, 15 | 25, 154 | 11, 91 | 1, 1 | 267.05 |
| 0.15, 0.15 | 700, 90 | 400, 100 | 10, 5 | 25, 154 | 11, 91 | 1, 1 | 416.5 |

6 Sensitivity Analysis of Sampling Plans

In this section, we examine the sensitivity of the derived sampling plans to some of their important parameters, and investigate the behavior of the expected total testing cost under variations in the producer's and consumer's risks and in the sample sizes. The results of the sensitivity analysis are shown in Figs. 1, 2, 3 and 4.


Fig. 1 Sensitivity of expected total testing cost to changes in α in DSP (horizontal axis: α; vertical axis: expected total testing cost; $\theta_1^* = 150$, $\theta_2^* = 90$, $\theta_1^0 = 40$, $\theta_2^0 = 30$, $\beta = 0.1$)

We observe from the above sensitivity analysis that the expected total testing cost is a decreasing function of α (see Fig. 1). On the other hand, ETC fluctuates with respect to changes in β: it is neither decreasing nor increasing overall. It can also be noted from Fig. 2 that ETC has an increasing trend in the interval (0.06, 0.85). The sensitivity of the expected total testing cost to changes in the two parameters $n_1$ and $n_2$ is shown in Figs. 3 and 4. It is clear from the figures that ETC is a decreasing function of $(n_1, n_2)$ when all the other parameters in the sampling plan are kept fixed.

Fig. 2 Sensitivity of expected total testing cost to changes in β in DSP (horizontal axis: β; vertical axis: expected total testing cost; $\theta_1^* = 150$, $\theta_2^* = 90$, $\theta_1^0 = 40$, $\theta_2^0 = 30$, $\alpha = 0.1$)

7 Some Case Studies

Let us consider the following examples to understand the results obtained in the previous sections.

Example 1 We consider the example described in Aslam et al. [22], a two-stage inspection process of ball bearings. The compressive strength and the maximum stress of a ball bearing are the first and second quality characteristics, respectively. Assume that the compressive strength and the stress of a ball bearing are independent and that each follows the exponential distribution. Let us consider the single and double sampling plans for lot acceptance. Assume that $\alpha = 0.05$, $\beta = 0.25$. Let the specified mean failure time under the test compressive load or the stress load of a ball bearing be $\theta_1^* = \theta_2^* = 0.25$ and $\theta_1^0 = \theta_2^0 = 0.1$. Table 7 and Table 8 give the sample values.

Table 7 Failure times of 20 ball bearings related with strength (stage I)
Sample I: 0.2439 0.4728 0.1764 0.1239 0.8787 0.3995 0.1687 0.5582
Sample II: 0.4447 0.3119 0.3733 0.7539 0.1624 0.5920 0.8900 0.1050
0.3858 0.4427 0.5280 0.1079

Single sampling plan: Choose the first sample of each stage with $n_1 = n_2 = 10$. Then the plan parameters are calculated using Problem $Q_5$. Here $t_1 = 0.11$, $t_2 = 0.115$, $r_1 = 5$ and $r_2 = 6$. Consider sample 1 given in Table 7. We take the first five failure times, which give $\hat\theta_{5,10} = 0.60554$, and $0.60554 > t_1 = 0.11$. So we accept the lot from the first stage and move to the second stage. In the second stage, the first 6 failure times recorded in sample 1 given in Table 8 are under consideration. Then we


Fig. 3 Sensitivity of expected total testing cost to changes in $(n_1, n_2)$ in SSP (axes: $n_1$, $n_2$, expected total testing cost; $\theta_1^* = 100$, $\theta_2^* = 300$, $\theta_1^0 = 30$, $\theta_2^0 = 100$, $\alpha = 0.05$, $\beta = 0.01$)

Table 8 Failure times of 20 ball bearings related with stress (stage II)
Sample I: 0.3337 0.0379 0.3248 0.2935 0.3901 0.3608 0.5570 0.3029
Sample II: 0.3042 0.2392 0.9623 0.4297 0.2629 0.6186 0.5351 0.1603
0.6922 0.0971 0.6047 0.6976

have $\hat\theta_{6,10} = 0.5083 > t_2 = 0.115$. Hence we accept the product from the second stage, which leads to the final acceptance of the entire lot.

Double sampling plan: Let $n_1 = n_2 = 10$. Then the decision-making parameter values are $t_{11} = 0.1$, $t_{12} = 0.2$, $t_{21} = 0.1$, $t_{22} = 0.184$, $r_1 = 2$ and $r_2 = 1$. These values are obtained using Problem $Q_8$. Thus if the average failure time $\hat\theta^{(1)}_{2,10}$ is greater than 0.2, we accept the units from the first stage. In our example, from sample 1 given in Table 7, $\hat\theta^{(1)}_{2,10} = 0.567 > 0.2$. So we accept the units from the first stage. In the second stage, $r_2 = 1$. From sample 1 given in Table 8, the first failure time is 0.0379 and $\hat\theta^{(1)}_{1,10}$ is 0.379, which is greater than $t_{22} = 0.184$. Hence we accept the units from the second stage also. Thus we finally accept the entire lot.

Fig. 4 Sensitivity of expected total testing cost to changes in $(n_1, n_2)$ in DSP (axes: $n_1$, $n_2$, expected total testing cost; $\theta_1^* = 1500$, $\theta_2^* = 900$, $\theta_1^0 = 500$, $\theta_2^0 = 300$, $\alpha = 0.1$, $\beta = 0.05$)

Table 9 Failure times of 10 units with mean life 1 (stage I)
0.0989 0.0906 2.3275 0.6035 2.0637 0.4583 1.2783 0.0434 0.0357 1.8476

Table 10 Failure times of 10 units with mean life 5 (stage II)
0.1492 3.6142 9.7636 0.4401 0.2189 1.1140 4.3166 1.1647 0.2068 2.1099

Example 2 In this second example, we have generated two samples of size 10 from exponential distributions with means 1 and 5, respectively. The samples are given in Tables 9 and 10. We illustrate SSP and DSP for this example below.

Single sampling plan: Set $\alpha = 0.1$, $\beta = 0.2$, $n_1 = 10$, $n_2 = 10$, $\theta_1^* = 1$, $\theta_1^0 = 0.5$ and $\theta_2^* = 5$, $\theta_2^0 = 2$. Our plan parameters are $r_1 = r_2 = 5$, $t_1 = 0.395$ and $t_2 = 3$. From Table 9, select the first 5 failure times, which gives $\hat\theta_{5,10} = 0.60368$. Since $0.60368 > t_1 = 0.395$, we accept the product from the first stage. In the second stage $r_2 = 5$. From Table 10, $\hat\theta_{5,10} = 1.5398 < t_2 = 3$. Hence we reject the product from the second stage, which leads to the final rejection of the entire lot.
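The estimates in this example can be reproduced with a short sketch. The estimator is not restated in this section, so the code assumes the usual Type II censored MLE for the exponential mean, namely the total time on test (the $r$ smallest failure times plus $n - r$ copies of the $r$th smallest) divided by $r$. The function name is hypothetical. Under that assumption the code returns the values 0.60368 and 1.5398 quoted above.

```python
def mle_type2(failure_times, r):
    """MLE of the exponential mean from the r smallest of n failure times.

    Assumed form: (sum of the r smallest times + (n - r) * r-th smallest) / r.
    """
    ordered = sorted(failure_times)
    n = len(ordered)
    if not 1 <= r <= n:
        raise ValueError("r must satisfy 1 <= r <= n")
    total_time_on_test = sum(ordered[:r]) + (n - r) * ordered[r - 1]
    return total_time_on_test / r


# Data from Tables 9 and 10.
table9 = [0.0989, 0.0906, 2.3275, 0.6035, 2.0637,
          0.4583, 1.2783, 0.0434, 0.0357, 1.8476]
table10 = [0.1492, 3.6142, 9.7636, 0.4401, 0.2189,
           1.1140, 4.3166, 1.1647, 0.2068, 2.1099]

if __name__ == "__main__":
    print(round(mle_type2(table9, 5), 5))   # stage I estimate with r1 = 5
    print(round(mle_type2(table10, 5), 4))  # stage II estimate with r2 = 5
```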


Double sampling plan: Set $\alpha = 0.1$, $\beta = 0.2$, $n_1 = 10$, $n_2 = 10$, $\theta_1^* = 1$, $\theta_1^0 = 0.5$ and $\theta_2^* = 5$, $\theta_2^0 = 2$. Our plan DSP $(r_i, t_{i1}, t_{i2})$ gives $r_1 = r_2 = 2$, $t_{11} = 0.35$, $t_{12} = 0.8$, $t_{21} = 2.12$ and $t_{22} = 4$. From Table 9, $\hat\theta^{(1)}_{2,10} = 0.375 > t_{11} = 0.35$. Since this value is also below $t_{12} = 0.8$, we take another sample of size 10, which is given below:

Sample II: 1.4491 1.0408 0.1970 4.1732 3.1460 1.7779 0.4321 0.3124 0.4343 0.7965.

From this sample $\hat\theta^{(2)}_{2,10} = 1.5043$, and $(\hat\theta^{(1)}_{2,10} + \hat\theta^{(2)}_{2,10})/2 = 0.93965 > 0.8$. Hence we accept the units from the first stage. In the second stage, $\hat\theta^{(1)}_{2,10} = 1.0052 < t_{21} = 2.12$. Therefore, we reject the units from the second stage, which leads to the rejection of the entire lot.
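The per-stage decision rule of the DSP algorithm in Sect. 3 can be written as a small function. This is a sketch with hypothetical names, applied here to the estimates quoted in this example rather than recomputed from the raw data.

```python
def dsp_stage_decision(theta1_hat, t1, t2, theta2_hat=None):
    """Decision for one stage of the double sampling plan.

    Returns "accept", "reject", or "resample" (a second sample is needed
    and its estimate theta2_hat has not been supplied yet).
    """
    if theta1_hat >= t2:
        return "accept"            # step (a): accept on the first sample
    if theta1_hat < t1:
        return "reject"            # step (e): first estimate below t1
    if theta2_hat is None:
        return "resample"          # step (b): draw a second sample
    # steps (c)/(d): compare the sum of the two estimates with 2 * t2
    return "accept" if theta1_hat + theta2_hat >= 2 * t2 else "reject"


if __name__ == "__main__":
    # Stage I of Example 2: t11 = 0.35, t12 = 0.8.
    print(dsp_stage_decision(0.375, 0.35, 0.8))
    print(dsp_stage_decision(0.375, 0.35, 0.8, theta2_hat=1.5043))
    # Stage II of Example 2: t21 = 2.12, t22 = 4.
    print(dsp_stage_decision(1.0052, 2.12, 4.0))
```

Returning "resample" as a separate state keeps the function usable in a sequential test, where the second sample is only collected when the first estimate falls between the two thresholds.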

8 Conclusions

In this chapter, we developed two sampling schemes, namely single and double sampling plans, for a multistage process, based on the information about the failure times of units obtained under Type II censoring. The multistage production process in which stages are independent is discussed by Aslam et al. [22]. They considered the number of failures as the criterion for accepting the lot and ignored the failure times of the units tested. The cost of testing is also not addressed in their work. In this work, we have removed these drawbacks by considering the average life of units in the sample. When the sample size increases, the expected total testing cost decreases, as is clear from the sensitivity analysis of ETC with respect to changes in the sample sizes $(n_1, n_2)$. Also, as α increases, ETC decreases. However, ETC shows a fluctuating trend under changes in β. Our plan DSP $(r_i, t_{i1}, t_{i2})$ has an advantage of savings in expected total testing cost of about 50% as compared to Aslam et al. [22]. The actual testing cost may be even less than that reported. As future work, one may consider a dependent production process with several stages.

Acknowledgements The author would like to express his gratitude to the editors for their constructive comments, which improved the presentation of the chapter. The author would also like to thank Dr. Ramyamol P C for her computational assistance.

References

1. Epstein B, Sobel M (1953) Life testing. J Am Stat Assoc 48:486–502
2. Epstein B (1954) Truncated life tests in the exponential case. Ann Math Stat 25:555–564
3. Jun CH, Balamurali S, Lee SH (2006) Variable sampling plans for Weibull distribution lifetimes under sudden death testing. IEEE Trans Reliab 55:53–58
4. Bartholomew DJ (1963) The sampling distribution of an estimate arising in life testing. Technometrics 5:361–374
5. Aslam M, Jun CH (2009) A group acceptance sampling plans for truncated life tests based on the Inverse Rayleigh and Log-Logistic distributions. Pakistan J Stat 25:1–13
6. Azam M, Aslam M, Balamurali S, Javaid A (2015) Two stage group acceptance sampling plan for half normal percentiles. J King Saud University - Sci 27:239–243
7. Tsai TR, Wu SJ (2006) Acceptance sampling based on truncated life test for generalized Rayleigh distribution. J Appl Stat 33:595–600
8. Kantam RR, Rosaiah K, Srinivas Rao G (2001) Acceptance sampling plans based on life tests: log-logistic model. J Appl Stat 28:121–128
9. Balamurali S, Usha M (2013) Optimal design of variable chain sampling plan by minimizing the average sample number. Int J Manuf Eng 3:1–10
10. Srinivasa Rao G (2009) A group acceptance sampling plans for lifetimes following a generalized exponential distribution. Econ Quality Control 24:75–85
11. Aslam M (2007) Double acceptance sampling based on truncated life-tests in Rayleigh distribution. Eur J Sci Res 17:605–611
12. Aslam M, Jun CH (2009) A group acceptance sampling plan for truncated life test having Weibull distribution. J Appl Stat 36:1021–1027
13. Chien-Wei W, Aslam M, Jun C-H (2012) Variables sampling inspection scheme for resubmitted lots based on the process capability index Cpk. Eur J Oper Res 217:560–566
14. Vellaisamy P, Sankar S (2005) A unified approach for modeling and designing attribute sampling plans for monitoring dependent production processes. Methodol Comput Appl Probab 7:307–323
15. Kumar M, Ramyamol PC (2016) Optimal reliability acceptance sampling plan for exponential distribution. Econ Quality Control 31:23–36
16. Ramyamol PC, Kumar M (2019) Optimal design of variable acceptance sampling plans for mixture distribution. J Appl Stat 46:2700–2721
17. Asokan MV, Balamurali S (2000) Multi attribute single sampling plans. Econ Quality Control 15:103–108
18. Duffuaa SO, Al-Turki UM, Kolus AA (2009) Process-targeting model for a product with two dependent quality characteristics using acceptance sampling plans. Int J Prod Res 47:4041–4046
19. Moskowitz H, Plante R, Tang K (1996) Multi stage multi attribute acceptance sampling in serial production systems. IIE Trans 130–137
20. Lee HL, Tagaras G (1992) Economic acceptance sampling plans in complex multi-stage production systems. Int J Prod Res 30:2615–2632
21. Gaglioppa F, Miller LA, Benjaafar S (2008) Multi-task and multi stage production planning and scheduling for process industries. Oper Res 56:1010–1025
22. Aslam M, Azam M, Jun C-H (2015) Acceptance sampling plans for multi-stage process based on time–truncated test for Weibull distribution. Int J Adv Manuf Technol 79:1779–1785

The Importance of Technical Diagnostics for Ensuring the Reliability of Industrial Systems

D. Lj. Branković, Z. N. Milovanović, and V. Z. Janičić Milovanović

Abstract The success and sustainability of a business industrial system is largely determined by the effectiveness of the production system that forms its basic integral part. The reliability of the technical system, i.e., the probability of performing the projected goal function in the observed time period, is, along with readiness and functional convenience, a basic indicator of the effectiveness of the production system. Maintenance, as an integral part of the production system, has the function of providing the projected level of optimal reliability by implementing activities aimed at ensuring the required level of technical readiness of parts of the technical system. One way to ensure the optimal level of reliability of the technical system is to apply condition-based maintenance concepts, which allow monitoring and control of the technical parameters describing the state of the elements of technical systems within the production processes. The basis of condition-based maintenance is the introduction of procedures for monitoring and control of condition parameters using technical diagnostic methods. Knowing the current state of the parameters that determine how well the technical system performs its projected goal function makes it possible to respond to malfunctions in time and to keep the system from entering failure. The paper presents a systematic approach to regular monitoring and control of condition parameters of parts of the technical system of paper machines, specifically vibration levels on high-voltage motors, within an industrial system for the production of toilet paper, using technical diagnostic methods and portable vibration control devices. By responding in time to an observed increase in vibration levels at a specific critical position, through technical diagnostic methods and complex maintenance intervention activities, it is possible to avoid unwanted failure, long-term unplanned production downtime and very high maintenance costs. Along with the description of the specific event and the preventive action of the maintenance service, the paper indicates the possibility of connecting the observed working position with other critical positions that are equipped with systems for continuous monitoring of operating parameters, specifically the vibration levels of bearings. By constantly seeking ways to raise the overall level of reliability, maintenance of the production system plays a major role in providing the conditions for continuous production, for achieving planned results and profits, and for the sustainability of the production and business industrial system in an increasingly demanding world market.

Keywords Technical diagnostic · Effectiveness · Reliability · Condition based maintenance

D. Lj. Branković · Z. N. Milovanović
Department of Hydro and Thermal Engineering, Faculty of Mechanical Engineering Banja Luka, University of Banja Luka, Stepe Stepanovića 71, Banja Luka, Bosnia and Herzegovina

V. Z. J. Milovanović
Solunska 8a, 78000 Banja Luka, Bosnia and Herzegovina

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_8

1 Introduction

The production of goods with new use value, as well as the provision of the required quality of services within the production process, is the basis of the economic development of social communities. The organization of production processes arises from the planned integration of human, material and energy resources and their transformation into the desired form within production industrial systems. Production systems are composed of a series of complex technical systems, i.e., integrated functional units by means of which input quantities (material values) are transformed into semi-finished or finished products. An integral part of production processes are the activities that ensure the continuity of operation of technical systems so that they fulfill the projected goal function. These activities are planned, organized and managed by a special organizational unit of the production system: maintenance. All maintenance activities within the production industrial system are closely related to other units that participate in the production process and form part of the integrated system support to production.

2 Theoretical Settings of Industrial Systems Maintenance

The need for maintenance in industrial systems dates from the first application of technical systems during the first industrial revolution and the introduction of mechanization in production processes. A feature of any technical system is that, under the influence of external and internal disturbances during operation, it can reach a state of failure that interrupts its function and stops part or all of the production process. An appropriate response from the maintenance service is required to prevent the failure condition, or to deal with it if it occurs. Within the period of the first industrial revolution, there was no need for a systematic approach


Fig. 1 Industrial revolutions

to maintenance. Maintenance was reduced to interventions after failures of parts of technical systems had occurred. With the development of technical means, from electrification during the second industrial revolution and the introduction of electronics and automation during the third, all the way to digitalization and the application of modern software solutions in the so-called Industries 4.0 and 5.0 (Fig. 1), maintenance has become one of the key factors for ensuring high performance of efficient production systems. The development of technological processes and industry in the second half of the twentieth century brought a new way of understanding the maintenance process. As this is a very complex concept, there are many ways to define the maintenance process. Rejec [1] cites the 1963 congress of the OECD (Organization for Economic Co-operation and Development) and presents one of the first definitions of maintenance:

Maintenance is the function of the company entrusted with constant control over the plants and the performance of certain repairs and audits, which enables constant functional capacity and preservation of production plants, auxiliary plants and other equipment.

There are numerous scientific papers and studies that try to describe the concept of maintenance in more detail. Papić and Milovanović [2] point out that maintenance ceases to be seen as a need that has a random character and becomes a joint action of a set of segments or elements in order to ensure the functional state of the technical system in accordance with the requirements and criteria. The process of maintaining assets, as one of the most important parts of the overall production process, has the task of preventing and eliminating system failures, primarily through streamlining and optimizing their use and increasing productivity and cost-effectiveness in the production and operation process.


The traditional understanding of maintenance as the processes or activities of repairing damaged parts of a technical system is changing (Tsang [3]). The new understanding is defined by the maintenance terminology standard EN 13306:2001 [4] (Bengtsson [5]) of CEN (Comité Européen de Normalisation, European Committee for Standardization). Standard EN 13306:2001 is partly based on the standard IEC 60050 (191), according to which the term maintenance is defined as:

Maintenance is a combination of all the technical, administrative and management activities during the life cycle of parts intended to maintain or restore the condition of the system in which the required function can be provided.

Although all interpretations of the concept of maintenance have their specifics, maintenance in general can be understood as the implementation of all measures necessary for a machine, plant or entire factory to function properly, delivering performance within the prescribed limits, with the required effects and quality, without failure and with the prescribed protection of the environment, under the assumption that all conditions are secured, i.e., that the necessary logistical support is available (Majdandžić [6]). As part of the business system, the maintenance system has the role of integrating the organizational structure, technology and information system and activating all available material and human resources within the design/redesign process. Maintenance is the process of providing conditions for fulfilling the goal function in relation to defined costs, effectiveness, availability, reliability, etc. The success of maintenance is determined by providing the greatest possible degree of certainty over time. Ensuring the functional state of the system is possible only through the interaction of the maintenance system with parts of the business system and the environment. In this case we are talking about a system approach to maintenance, Fig. 2. Maintenance is the process of engaging all the resources and activities necessary to meet the predetermined characteristics and conditions of the observed system within a given time interval.

Fig. 2 System approach to maintenance

The Importance of Technical Diagnostics …


With the construction and establishment of a production system comes the need to define procedures that eliminate possible causes of failure in the operation of the system. The working state and the failure state are the main states of the production system. The permanent goal of the maintenance process is to minimize the time spent in failure and to provide working conditions in which the production system fulfills its planned objective function.
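The goal of minimizing the time spent in failure can be made concrete with the usual availability ratio. The following is only a minimal sketch: the function names and the operating record are hypothetical and are not taken from the chapter. It assumes that the observation period splits cleanly into time in the working state and time in the failure state.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of the observed time the system spent in the working state."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


def mean_times(uptime_hours: float, downtime_hours: float, failures: int):
    """Return (MTBF, MTTR): mean time between failures and mean time to repair."""
    if failures <= 0:
        raise ValueError("at least one failure is needed")
    return uptime_hours / failures, downtime_hours / failures


# Hypothetical record: 950 h in operation, 50 h in failure, 5 failures.
a = availability(950.0, 50.0)
mtbf, mttr = mean_times(950.0, 50.0, 5)
print(a, mtbf, mttr)
```

With these assumed figures the same availability follows from MTBF / (MTBF + MTTR), which shows that downtime can be reduced either by failing less often or by repairing faster.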

2.1 Maintenance Strategies and Concepts

Ensuring the basic maintenance function, i.e., secure operation of the technical system at the highest possible level of reliability and availability, is possible only if there is a clear maintenance strategy. The maintenance strategy is the highest level of maintenance, determined by the management in accordance with the following requirements:

• ensuring the maximum availability of the technical system, which enables the performance of the required function with minimal downtime and costs,
• meeting safety requirements for workers and equipment,
• achieving the necessary durability of the technical system,
• quality assurance of services provided by maintenance processes.

The maintenance strategy is determined based on the characteristics and needs of other parts of the business system, Fig. 3. Historically, the meaning and role of maintenance have changed over time. Unlike the traditional understanding, in which the maintenance of parts of the technical

Fig. 3 Conditions for choosing a maintenance strategy, [2]


D. Lj. Branković et al.

system was entrusted to individuals and maintenance was seen as a non-profit business entity, the basic characteristic of modern maintenance is the organization and implementation of team activities that prevent and eliminate failures at the lowest possible cost. This gives the maintenance service a great influence on the economic results and a large share in the creation of profit in the operation of the system. Table 1 presents the basic characteristics of the old (traditional) and new (modern) understanding of maintenance. Maintenance strategies can be divided in many ways, but each division reflects the way in which maintenance procedures are performed, which characterizes the basic division of maintenance strategies, Fig. 4. The most important characteristics of the basic maintenance strategies are:

• Corrective maintenance strategy, the so-called “Wait and see” strategy: maintenance activities are performed after the technical system fails. There are no previously planned interventions other than regular cleaning and lubrication operations. This strategy is justified for simple, cheap and less important elements of the technical system,

Table 1 Characteristics of traditional and modern understanding of maintenance organization

| Characteristic | Old (traditional way of maintenance) | New (modern way of maintenance) |
|---|---|---|
| Orientation | On repair | On reliability |
| Activity | Repair | Improve |
| Concept | “Firefighting” | Anticipate, plan and program activities |
| Executor | Repairman | Member of the business team |
| Occurrence of failure | Failure elimination | Elimination of causes |
| Consequence in accordance with the set goal | Maintenance costs reduction | Increasing of working time |
| Business advancement | “Action program of the week/month” | Continuous improvement |
| Belief | Failures are inevitable | Failures are only exceptions |
| Priorities defining | Failures that occurred | Eliminating the cause of the failure |
| Number of failures | Large | Just a few failures |
| Share of planned operations | Low | High |
| Number of complaints | Large | Small |
| Reliability level | Low | High |
| Maintenance cost level | High | Small |
| Planning period | Short-term | Long-term |
| Investments character | Non-profit investment | Attracts investment |


Fig. 4 Basic division of maintenance strategies

• Opportunistic maintenance strategy: an upgrade of the “Wait and see” strategy that introduces elements of preventive maintenance only for those parts of technical systems where it is necessary,
• Preventive maintenance strategy: methods of failure prevention based on the scientific theory of reliability of the elements of the technical system. It is used for more expensive and critical equipment, while providing the desired level of safety for both workers and equipment,
• Condition based maintenance strategy: the decision to carry out maintenance activities is made by monitoring certain technical parameters of the condition of parts of the technical system. It is used for monitoring and maintaining very expensive, critical and complex technical systems such as aircraft, power plants, the process industry and the like,
• Risk based maintenance strategy: an upgrade of the “Condition based maintenance” strategy for parts of complex high-risk technical systems such as nuclear power plants, petrochemical plants and the like.

In addition to the above basic maintenance strategies, the following ways of organizing maintenance activities in industrial systems can be found in practice:

• Logistics maintenance strategy: if the priority of the maintenance strategy is the development of elements of logistics support such as administrative tasks, ergonomic and environmental factors, and the realization and distribution of profits, then we are talking about logistics maintenance. This strategy basically supports the fully installed equipment, with an emphasis on its design, production and conditions of use. Within the basic activities, spare parts are produced and repaired in service or in-house workshops, and raw materials and completely new spare parts are procured,
• Total Productive Maintenance (TPM) strategy: a higher-level maintenance strategy from the late twentieth century that puts the staff, i.e., the individual in the production process, at the center of interest, ahead of the equipment and means of work. The common goal of all employees in the company is to raise performance and results through a new way of organizing the system. Compared to the competition, which uses the same equipment, a better business result is


achieved. Equipment used in production processes should have a projected level of capability with the maximum possible return on investment. Applying the concept of eliminating “small” losses and changing the overall philosophy of work and business requires the full commitment of all participants in the production process and a high level of awareness and motivation to work,
• Terotechnological maintenance strategy: maintenance professionals are given the greatest importance and role during the design and construction of production systems and in all phases of the life cycle. From the initial idea, through procurement, installation and operation, to write-off, i.e., withdrawal of equipment from use, qualified maintenance personnel have an active role in providing functional and economic support to all processes in the system,
• Expert systems strategy: expert systems are built on large databases supported by computer systems that hold data, knowledge and reasoning mechanisms, i.e., artificial intelligence, Fig. 5,
• Self-maintenance strategy: the highest level of maintenance, characteristic of highly automated companies, e.g., technological systems with robotic lines,
• So-called maintenance-free strategy: replacement of parts of technical systems, subassemblies or assemblies after the expiration of the working

Fig. 5 Example of configuration and operation of an expert system


function, which means operation until failure occurs, without the intention to repair the failed element and return it to a functional condition. The strategy is acceptable in all cases in which engaging in-house maintenance or the services of external companies is not economical compared with the availability and price of a completely new part, subassembly or assembly. If the lifespan of such a part or assembly is long enough to ensure the write-off of the old part while achieving the planned productivity within the technological process, this strategy provides for the complete replacement of the failed part with a new spare part or assembly.

2.2 Condition Based Maintenance Strategy

A large number of researchers have defined the concept of condition based maintenance. Bengtsson [5] states that condition based maintenance belongs to the strategy of preventive maintenance and is based on performance and/or parameter monitoring and subsequent diagnostics of the technical condition. Butcher [7] defines this maintenance technology as a series of maintenance activities based on assessing the condition of equipment in real or near real time. The condition assessment is obtained from purpose-built sensors and/or external tests or measurements using portable measuring instruments [7]. According to Kothamasu et al. [8], condition based maintenance is a decision-making strategy in which maintenance activities are decided by observing the state of the system and its constituent components. In contrast to the basic division of maintenance shown in Fig. 4, Shin and Jun [9] and Erbe et al. [10] divide maintenance into three basic categories: preventive, corrective, and condition based maintenance. Preventive maintenance refers to periodic scheduled inspections of elements of technical systems at clearly defined time intervals. Although condition based maintenance also has the properties of failure prevention, its rationale is that degradation does not occur instantly, because failure is usually preceded by processes that bring the system into an abnormal state. Condition based maintenance is defined according to EN 13306 [4] as: Preventive maintenance based on monitoring of performance and/or parameters and subsequent corrective activities.

Monitoring is defined according to standard EN 13306 [4] as: Activity realized manually or automatically, with the aim of observing the current state of the system.

The main difference between monitoring and inspection is that monitoring is used to assess any changes in system parameters over time. Monitoring can be continuous, performed at time intervals or performed after a certain number of operations. It is usually carried out in the operating conditions of the system.


Parameter and performance monitoring can be predetermined as continuous or intermittent. The preventive maintenance strategy uses diagnostic devices to analyze the current state of the observed parts of the system and, using knowledge of technical diagnostics, of the operating conditions of the equipment and of the technological procedures within production processes, determines appropriate preventive maintenance activities. By using maintenance intervals and condition based maintenance tasks, maintenance activities become predictable. In this sense, standard EN 13306 [4] defines predictive maintenance as: Condition based maintenance carried out following a forecast obtained by analysis and assessment of unambiguous parameters of system degradation.

The term system degradation is defined according to standard EN 13306 [4] as: An irreversible process of disrupting one or more characteristics of a system either by time or by the action of an external influence.

The basic determinant of the term degradation is the possibility of the system reaching a state of failure. Often the term degradation also refers to wear. In terms of the previous definition, degradation can also be explained as a disruption of the operating conditions of the system, because the basic task of the condition based maintenance strategy is to diagnose the disturbance before it becomes an irreversible process. From the financial point of view, it is very important to determine the economic feasibility of a particular maintenance strategy. Where capital value, criticality and safety of the complete system, safety of the production process, etc. are high, condition based maintenance is the better solution, although it requires a significant investment in diagnostic equipment. The basic precondition is that the observed system has measurable parameters of the work process. Numerous analyses address the decision on which maintenance method and strategy to apply in each case, Al-Najjar and Alsyouf [11]. The basis of monitoring is the existence of a condition based maintenance system, which can be defined as: A system that uses condition based maintenance to determine and plan predictive maintenance activities independently or in interaction with other systems or humans.
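The idea of a forecast derived from degradation parameters can be illustrated with a simple trend extrapolation. This is a minimal sketch under strong assumptions: the degradation parameter is assumed to grow roughly linearly, the readings and the limit value are hypothetical, and a least-squares line is only one of many possible forecasting models; the chapter does not prescribe a method.

```python
def linear_fit(times, values):
    """Ordinary least-squares line through paired (time, value) observations."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be equal")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


def time_to_limit(times, values, limit):
    """Time at which the fitted trend reaches the limit, or None if it is not rising."""
    slope, intercept = linear_fit(times, values)
    if slope <= 0:
        return None
    return (limit - intercept) / slope


# Hypothetical weekly readings of a degradation parameter (e.g. vibration in mm/s).
weeks = [0, 1, 2, 3, 4]
readings = [2.0, 2.5, 3.0, 3.5, 4.0]
forecast_week = time_to_limit(weeks, readings, limit=7.0)
print(forecast_week)
```

The returned time is only as good as the linearity assumption; in practice the forecast would be recomputed after each new reading and the maintenance activity scheduled with a safety margin before the limit.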

Tavner et al. [12] emphasize the growing importance of condition monitoring for automation and for reducing human oversight within power plants. They analyze the effects of monitoring the condition of rotating electrical machines with techniques that are applicable while these machines are in operation, leaving aside the many off-line inspection techniques. The results of condition monitoring relate to the electromagnetic and dynamic behavior of electrical machines (with special emphasis on control with modern power electronics) and to the behavior of the insulation system of electrical machines [12]. Condition based maintenance has the function of converting measurable energy input quantities (temperature, vibration, noise …) into the desired form or effect (condition of elements, comparison with allowed values for safe operation, forecasts


for failure, etc.) in space and time. The condition based maintenance system uses the characteristics and parameters of the monitoring technique (vibrations, sound, heat …) to detect performance disturbances or changes in the operating characteristics of the observed system. Based on the previous definitions, the techniques or activities of the monitoring system can be:

• subjective: noticed and observed by workers,
• objective:
  • use of portable measuring devices (manual),
  • stand-alone continuous systems for measuring predefined parameters.

It should be noted that monitoring activities are usually carried out during the normal operation of the system, which in many cases makes it possible to obtain relevant data on the state of the system. Monitoring can be planned, on request or continuous. Based on the monitored characteristics and parameters, the condition based maintenance system should determine the subsequent activities that are realized within predictive maintenance. Advanced maintenance systems, supported by computer programs, are able to plan maintenance activities autonomously. They provide input on the current state of the system and an estimate of the state the observed system should be in at a future time, given the required operating conditions. After the state of the system has been predicted, the human role is reduced to deciding on the schedule of maintenance activities. Integrating the human as a part of the system in this way makes it possible to meet all the requirements set within condition based maintenance.

3 Technical Diagnostics: Basis of Condition Based Maintenance

3.1 The Concept of Technical Diagnostics

In the last few decades, the term technical diagnostics has penetrated into all branches of technology, especially electrical engineering and mechanical engineering. The term covers all measures used to assess the current condition or to forecast the behavior of machinery, devices, equipment and the like over a certain period of time, without disassembly or destruction. The basic task accomplished by technical diagnostic measures is the timely prediction of malfunctions of parts of the technical system. The implementation of technical diagnostic measures increases the overall degree of reliability, availability and effectiveness of the complete technical system. Technical diagnostics is the science of recognizing the condition of a technical system and, thanks to systems for monitoring the state variables, the occurrence


of malfunctions can be predicted. By eliminating the factors of possible system failure, the efficiency and reliability of the system increase [13]. Technical diagnostics provides a large number of benefits both for the normal and safe operation of the production technical system and for achieving the maintenance objective function. The most important characteristics of technical diagnostics are:

• optimization of the planning of preventive repairs and overhaul works of the plant based on the assessment of the actual condition of the technical system,
• avoidance of conditions during operation that may cause the technical system to fail,
• reduction of downtime and unplanned activities, which results in optimization of reliability and prevention of emergency situations,
• extension of the service life of the equipment, as the results of technical diagnostics determine and indicate when individual parts and assemblies should be replaced,
• forming a database suitable for easier viewing and access for the purpose of future investments in critical places and obsolete equipment.

3.2 Condition, Changes and Condition Control of the Technical System

Technical diagnostics is a tool for assessing the condition of the components of the technical system. The state of the system is the basic characteristic of the observed technical system. The state of the technical system is determined by a set of physical parameters whose monitoring and comparison over time can define the behavior of the system at a given time and the forecast of its state in the future after the action of internal or external disturbances, Fig. 6. The term system state can be described by the following features:

• working condition,
• failure state,
• quality of functioning,
• safety and security, etc.

Fig. 6 General overview of the technical system [14]


The failure of a technical system is defined as the cessation of the ability of an element or of the system as a whole to perform the functions for which it was designed. The technical system may have impaired performance due to the action of various factors that can cause different degrees of damage. Causes of impairment of working ability can be:

• technological (conditions of technological processes),
• construction (errors during the process of construction of parts),
• exploitation (periods of operation or use, methods of use and maintenance),
• wear of parts (friction, wear …).

The assessment of the condition of the diagnosed object is defined by the limit values of the corresponding characteristic or parameter. The values that determine the normal functioning of the system are conditioned by the system design, method of construction, mode of functioning and change of environmental conditions (Design, Production Technology, Operational, Change of Condition: DPTOCC). Comparing the measured values of the observed parameter with the prescribed limit values creates the basis for deciding whether the observed object or part fulfills the designed objective function or whether an appropriate maintenance activity, such as adjustment, is necessary. The state of an object, which should perform the designed function under certain conditions and in a certain period of time, is described by a certain set of parameters (characteristics). Depending on the basic purpose of the observed system, the characteristics of the monitored object can be: voltage, noise level, fluid flow, operating temperature, bearing vibrations, wall thickness of a pressure vessel, etc. Changes in the parameters can decrease the degree of functionality of the object and, over a longer period of time, can lead to a complete interruption of the designed function, i.e., the occurrence of failure. Parameters that affect the function of the object can have:

• a constant (monotonic),
• a growing or
• a sudden effect on the condition of the observed object.

The basic precondition for making any decision related to the further use of the observed object is knowledge of working conditions and allowable wear limits. The wear limit of the observed part represents the boundary between the operational usability of the elements or parts of the system and their damage. The task of technical diagnostics is to determine the technical condition of the system elements at a precisely defined time and within precisely defined limits of accuracy. Procedures for determining the technical condition of the observed element of the system include appropriate instrumentation or observations by the senses of a diagnostic specialist. The state limits of the system elements that are of special interest in diagnostic procedures refer primarily to states of damage (fatigue, wear, etc.). The criteria for determining the boundaries are:

• technical–technological (production, assembly, etc.),
• economic (replacement cost of system parts, risks, etc.),
• ergonomic,
• security, etc.

For example, technical–technological criteria can be the thickness of the thermally treated layer of a shaft, the clearance of the shaft and bearing assembly, etc. Economic criteria take into account the cost of construction or repair, etc. Safety and security must be considered in accordance with the possible consequences of the failure. Ergonomic aspects have recently been increasingly analyzed because of the human–machine relationship (e.g., vibrations, noise, temperature, various types of radiation, required manpower for maintenance, etc.). Most often, several criteria are taken into account when assessing the condition. Boundaries can be determined based on:

• empirical procedures or experiences,
• experimental research within real systems,
• tests on test benches,
• theoretical calculations.

Condition assessment is a very complex procedure and its definition depends on a set of measures that need to be taken to maintain the functionality of the system in the next period of operation.
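The comparison of a measured parameter with prescribed limit values, which this section describes as the basis for the maintenance decision, can be sketched as follows. The two-level scheme (a warning level below the allowed limit), the names and the numerical limits are assumptions made for illustration; the text itself speaks only of limit values in general.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    warning: float  # assumed level above which the part needs closer attention
    allowed: float  # assumed limit above which the designed function is not met


def assess(measured: float, limits: Limits) -> str:
    """Classify one measured value against the prescribed limits."""
    if measured > limits.allowed:
        return "maintenance required"
    if measured > limits.warning:
        return "increased monitoring"
    return "normal operation"


# Hypothetical limits for a bearing operating temperature in degrees Celsius.
bearing_limits = Limits(warning=70.0, allowed=85.0)
states = [assess(t, bearing_limits) for t in (55.0, 78.0, 91.0)]
print(states)
```

In a real assessment several criteria (technical, economic, ergonomic, safety) would be combined, so a single threshold check of this kind is only the simplest building block.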

3.3 Technical Diagnostic Processes

The existence of a technical system with measurable parameters is the first prerequisite for establishing technical diagnostics. Technical diagnostics is a process composed of activities or steps that include:

• installation of measuring sensors on the object (observed system),
• stabilization of the operating mode (systems and equipment) to obtain relevant results,
• measurement of diagnostic signals,
• comparison of the obtained values with the prescribed values from the norms,
• drawing a conclusion about the current state of the elements of the system under observation,
• forecasting the future state of the system, with recommendations on the sustainability of that state.

Defining critical places, i.e., system parts that by their function can cause failures of parts or of the complete system, narrows the process of eliminating potential risks that endanger the system. Each individual critical location should contain some measurable physical quantities that change over time during operation. Sensors for monitoring and measuring the characteristic values of the system state are placed in critical places. The process involves measuring and converting these quantities


Fig. 7 Technical diagnostic processes

into values that can be displayed in a clear and concise manner. The measured values are compared with the prescribed values from the standards, on the basis of which a conclusion is drawn about the state of the operating characteristics of the system. If we monitor the behavior of the observed part of the technical system by collecting, measuring, converting and processing certain physical quantities that describe the state of the technical system, we can talk about the diagnostic procedure of the technical system, Fig. 7. The processes of technical diagnostics involve monitoring of the state parameters of the technical system by the workers or operators of machines and devices, by maintenance workers trained to handle diagnostic equipment, or by the equipment manufacturer through its technical support service.
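The chain of measuring a diagnostic signal, converting it into a single displayable value and comparing that value with a prescribed one can be sketched in a few lines. This is an illustration only: the root-mean-square (RMS) value is chosen here as the conversion step, and the signal and the prescribed limit are hypothetical rather than taken from any standard.

```python
import math


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    if not samples:
        raise ValueError("no samples")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def diagnose(samples, prescribed_limit):
    """Convert raw samples to one value and compare it with the prescribed limit."""
    value = rms(samples)
    state = "within limit" if value <= prescribed_limit else "limit exceeded"
    return value, state


# Hypothetical vibration velocity samples: one full period of a 4 mm/s sine wave.
n = 1000
signal = [4.0 * math.sin(2 * math.pi * k / n) for k in range(n)]
value, state = diagnose(signal, prescribed_limit=4.5)
print(round(value, 3), state)
```

For a pure sine the RMS value equals the amplitude divided by the square root of two, which gives a quick plausibility check of the conversion step before the result is compared with the limit.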

3.4 Technical Diagnostic Methods

A measurement method, i.e., a method of identifying the technical condition, is a set of procedures that define the relationships between measured quantities. Measurement methods can be divided according to several different criteria. If the measured quantity is read as an absolute value, we are talking about an absolute method. If the ratio of the measured quantity to another, predefined quantity is determined, it is a relative method. Depending on whether the measurement is made on the object itself without taking a sample, or experimentally by examining the characteristics of the materials that make up the object, we are talking about methods without or with destruction of the material. If the measured quantity is read directly, these are direct methods. If the measured quantity is obtained by calculation through a predefined relation, we are talking about indirect methods. The contact measurement method uses measuring devices that are in direct contact with the measured medium. If instead the measurement is realized through a medium or a wave, we can talk about non-contact methods. If the measurement refers to a large number of individual results on the basis of which a conclusion is made, these are differential methods. Simultaneous measurement of several parameters defines complex measurement methods.
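Two of the distinctions above, relative versus absolute and indirect versus direct, can be shown with small functions. The example is a sketch with hypothetical names and values; electrical power computed from voltage and current is used only as a familiar case of a quantity obtained through a predefined relation.

```python
def relative_reading(measured: float, reference: float) -> float:
    """Relative method: ratio of the measured quantity to a predefined reference."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return measured / reference


def indirect_power(voltage_v: float, current_a: float) -> float:
    """Indirect method: DC power calculated from directly read voltage and current."""
    return voltage_v * current_a


# Hypothetical readings.
ratio = relative_reading(measured=3.0, reference=2.0)    # vibration vs. baseline
power_w = indirect_power(voltage_v=24.0, current_a=2.5)  # watts
print(ratio, power_w)
```

Here the voltage and current are direct, absolute readings, while the ratio and the power are derived from them, which is why the uncertainty of the derived values depends on every reading that enters the relation.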


When choosing a method for identifying the state of the technical system, it is essential that the measurement procedures, i.e., diagnostics or tests to check the technical condition of the elements or of the overall system, are realized without destroying the elements and without dismantling the technical system, because each disassembly and reassembly changes the geometry and the conditions of contact. This leads to an increased intensity of wear due to the repetition of the running-in phase. In general, all procedures or methods of diagnosing the state of technical systems can be divided into subjective and objective procedures. Subjective methods of technical diagnostics are based primarily on the experience of the responsible technical persons in charge of monitoring the current state of the technical system and forecasting its future state. The level of education, combined with the necessary experience, is a key parameter of a successful diagnosis. The main task when choosing an objective measurement method is to observe the characteristics of the system and the physical quantities that can be monitored. The use of objective methods to identify the condition can be based on:

• temperature recording (using thermometers, thermocouples, thermistors, thermosensitive paint, infrared radiation and heat flow),
• vibration recording (displacement, velocity, acceleration and shock pulse meters, stethoscope) and length measurement: mechanical, electrical, laser rangefinder, etc.,
• registration of defects in the material (magnetism, dye penetrants, eddy currents, radiography, ultrasound, hardness, sound resonance, corona emission and optical fiber),
• registration of deposition, corrosion and erosion (ultrasound, radiography, cathodic potential and weighing),
• flow and pressure recording (using a neon-Freon detector, manometer and micro turbine meter),
• registration of electrical parameters (cable fault meter, frequency meter, phase angle meter, universal instruments for checking electrical circuits and voltage meter in transients),
• load recording (dynamometers: mechanical, hydraulic and electrical),
• registration of physical and chemical quantities (spectrograph, humidity meter, O2 and CO2 gas meter, pH meter, viscosity meter and meter of the presence of metal particles in the lubricant), etc.

Objective methods of identifying the state of the technical system cover only a certain segment of measurable values and of determining their relationship with changes in system state parameters, the characteristics of their change, diagnosing the state of system parts in a certain time interval and predicting the future state of performance. In practice, there is no unambiguous criterion for introducing a single diagnostic system that would be appropriate for all technical production systems. The choice of the control-diagnostic system concept to be proposed and selected depends on a number of factors, namely:


• the nature of the technological or production process,
• technical characteristics of production capacities (media, machines, equipment …),
• availability of measured quantities or diagnostic parameters,
• availability of measuring equipment on the market,
• the existence of trained workers for the implementation and monitoring of the diagnostic system,
• financial capabilities of the business system, etc.

Depending on whether the cause of failure of the technical system is determined during operation or during plant downtime, we can talk about operational and overhaul diagnostics. Both are an integral part of condition based maintenance.

3.5 Methods of Measurement and Analysis of Vibrations

The basic characteristic of technical systems that have moving parts is the creation of mechanical forces that excite oscillation of the complete system or of its individual parts. If this oscillation continues for a long period of time and exceeds the limits beyond which normal and uninterrupted work is disrupted, it can lead to:

• wear of system elements,
• changes in the geometry and position of system parts,
• failure of electronic components,
• damage to the insulation of cables in contact,
• disturbance of workload balance,
• disturbance of lubrication between moving parts,
• the appearance of noise, causing harm and disease in humans,
• material fatigue, damage to parts and eventually fracture, which causes the complete system to fail.

Precisely for the purpose of recognizing and monitoring the state of the system when vibrations occur, and of avoiding harmful events, methods of vibration monitoring have been developed, often abbreviated as vibrodiagnostics (Fig. 8). The main characteristic of vibration systems is the intense interaction of forces and vibrations, Fig. 9. There are numerous reasons for the development and application of vibration analysis methods in industrial practice. Randal [15] points out that vibration analysis is by far the most common method for determining the state of a machine because it has a number of advantages over other methods. The main advantage is its immediate response to external change, so it can be used for continuous or intermittent monitoring. Modern signal processing techniques enable the detection


Fig. 8 Reasons for vibration monitoring

Fig. 9 Vibration conditions

and filtering of very weak, masked signals that can be indicators of damage to the monitored elements of the technical system [15]. The main benefits that can be expected after the introduction of vibrodiagnostics are:

• elimination or minimization of unplanned downtimes caused by failure of the technical system,
• minimizing corrective maintenance and transition to preventive maintenance,
• reduction of the risk of material damage,
• increasing the operational reliability of the technical system,
• increase in the mean time between failures,

The Importance of Technical Diagnostics …


• introduction of the “Just-In-Time” model in the spare parts business,
• improving maintenance management.

3.6 Concept, Classification and Digital Signal Processing

In science, the term signal refers to a physical quantity that depends on time, spatial coordinates or some other independent variable. A signal is the carrier of a message that was created in the process of signal generation. Given the current level of development of technology, and especially electronics, a signal in most cases means an electrical signal that is suitable for further processing and presentation. There are two basic signal classes: stationary and non-stationary, Fig. 10. If the value of the signal can be predicted at any given time, we speak of deterministic signals. Random (stochastic) signals are characterized by uncertainty, i.e., their occurrence and value cannot be accurately predicted at some point in the future. To represent such a signal using a time function, we can only use known values from the past, and in that case we speak of random functions. Non-stationary continuous signals can be viewed as random signals or can be divided into smaller intervals and viewed as transient. If the beginning and end of the signal observation are related to a constant value, most often zero, we speak of transient signals. By nature, signals can be divided into continuous and discrete. Continuous signals correspond to one of the continuous functions in time. Discrete signals appear as arrays of separate elements and can have a finite number of different values.

Fig. 10 Signal classifications

3.6.1 Frequency Signal Analysis

Frequency domain signal analysis is one of the most widely accepted methods today. As the signal is always recorded in the time domain, the recorded signal must be transformed into the frequency domain. The theoretical basis of this transformation is the possibility of decomposing a complex function of time into a set of sine functions:

$$x(t) = A_1 \sin(\omega_1 t + \varphi_1) + \cdots + A_n \sin(\omega_n t + \varphi_n), \tag{1}$$

where A_i is the amplitude of the i-th sinusoidal function, ω_i is the circular frequency of the i-th sinusoidal function, and φ_i is the phase of the i-th sinusoidal function. The decomposition of the time function x(t) into a continuous complex spectrum F(jω) in the frequency domain is performed using the Fourier integral, which reads (Fig. 11):

Fig. 11 Fundamentals of oscillation theory

$$F(j\omega) = \int_{-\infty}^{+\infty} x(t)\, e^{-j\omega t}\, dt. \tag{2}$$
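For sampled data, the integral in (2) is replaced by a discrete Fourier transform, usually computed with an FFT. The following is a minimal sketch in plain Python, not the implementation used by any analyzer mentioned in this chapter. The sampling rate, the number of samples and the two sine components are assumed values chosen for illustration, and the function names are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def amplitude_spectrum(samples, fs):
    """Return (frequency in Hz, single-sided amplitude) pairs up to fs/2."""
    n = len(samples)
    spec = fft(samples)
    pairs = []
    for k in range(n // 2):
        scale = 1.0 / n if k == 0 else 2.0 / n
        pairs.append((k * fs / n, abs(spec[k]) * scale))
    return pairs


# Assumed test signal: a sum of two sine functions in the form of Eq. (1).
FS = 1024.0  # sampling rate in Hz (assumption)
N = 1024     # number of samples, power of two (assumption)
COMPONENTS = [(1.0, 50.0, 0.3), (0.4, 120.0, 1.1)]  # (A_i, f_i in Hz, phi_i)

signal = [
    sum(a * math.sin(2 * math.pi * f * (i / FS) + p) for a, f, p in COMPONENTS)
    for i in range(N)
]

spectrum = amplitude_spectrum(signal, FS)
# The two largest spectral lines should sit at the component frequencies.
peaks = sorted(spectrum, key=lambda fa: fa[1], reverse=True)[:2]
for freq, amp in peaks:
    print(f"{freq:6.1f} Hz  amplitude {amp:.3f}")
```

Because the assumed frequencies fall exactly on 1 Hz spectral lines, the amplitudes A_i are recovered directly; for real measurements, windowing and leakage would have to be considered.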

The Fourier integral is derived from the concept that a periodic function expressed as a Fourier series can fully represent an aperiodic function if its period tends to infinity. In expression (2) the function F(jω) is called the spectral density of amplitudes. In general, there are two ways to transform a signal from the time domain to the frequency domain. The first is to use specialized devices, spectrum analyzers, which at first glance resemble classical oscilloscopes. After digitizing the signal, a spectrum analyzer immediately performs the Fourier transformation and plots the result on a monitor that is an integral part of the device. The second is to digitize the signal with analog-to-digital (A/D) converters that are built into a computer. Both approaches use Fast Fourier Transform (FFT) algorithms, which, depending on the number of measurement results, save up to 99% of the calculation time (Fig. 12). There are numerous scientific papers that show the advantages of applying advanced methods of frequency signal processing. Lee [16] describes the concept of the Acceleration Enveloping method and the result of applying it to the early detection of rolling element damage in a bearing. By proper analysis of the waveform using Enveloping, it is possible to determine whether a potential problem originates from the bearing cage, the outer or inner ring, or the rolling element of the bearing itself. In this particular case, the detection of problems on the outer ring of the bearing is described, showing how the observed frequency could be isolated and differentiated in the total vibration spectrum [16]. Kazzaz and

Fig. 12 Fourier signal processing (FFT) in vibration monitoring


Fig. 13 Principle of frequency analysis [18]

Sigh [17] cite an example of on-line processing, implemented in the C++ programming language, of digital signals obtained from a system for monitoring process state parameters. The obtained signals are analyzed in detail through specially developed modified algorithms in order to determine the state of technical systems as accurately as possible. By using high-quality techniques to protect the input signals entering the monitoring system, operators can be warned in time of the possibility of serious equipment failures [17]. Figure 13 shows the principle of frequency analysis in the case of monitoring the operating parameters of an electric motor assembly and its driven unit (reducer, mixer, mill, etc.). Measurement of the vibration level (amplitude) and analysis of the frequency spectrum can indicate the condition of the operating element and forecast the future behavior of its operating parameters. All physical structures and machines that contain rotating components generate vibration. Veltkamp [19] emphasizes that vibrations generated on machines have become a well-used parameter for assessment within condition monitoring. It is one of the most versatile techniques and can detect about 70% of common mechanical faults associated with rotating machines [19]. The measured values are compared with the values prescribed by the standard for each type of equipment or working element. An example from ISO 10816-3, defining oscillation amplitude values and allowable limit values for rotating elements, is shown in Fig. 14. In addition to accelerometers, there are other ways to measure and analyze vibration signals, such as ultrasonic defect detection. In their study, Kim et al. [21] give the results of testing low-velocity bearing vibration signals where significantly


Fig. 14 Limit values of oscillation amplitudes of rotating parts [20]

better results are shown for ultrasonic measurement techniques than for vibration acceleration measurements [21].
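The comparison of measured values with prescribed limit values, mentioned above in connection with Fig. 14, amounts to a simple threshold lookup. The sketch below illustrates the idea with four evaluation zones labeled A to D. The zone boundaries in the code are placeholders and are not taken from the standard; in practice they would be read from Fig. 14 or from the standard itself for the relevant machine group and foundation type. The function name and the sample readings are hypothetical.

```python
from bisect import bisect_right

# Placeholder zone boundaries for vibration velocity in mm/s RMS.
# These numbers are illustrative assumptions, NOT values quoted from the standard.
ZONE_LIMITS = (2.3, 4.5, 7.1)
ZONES = ("A", "B", "C", "D")


def classify(v_rms, limits=ZONE_LIMITS):
    """Map an RMS vibration velocity to an evaluation zone.

    A value exactly on a boundary is assigned to the worse zone,
    which is a conservative assumption made for this sketch.
    """
    if v_rms < 0:
        raise ValueError("RMS velocity cannot be negative")
    return ZONES[bisect_right(limits, v_rms)]


# Hypothetical readings from three measuring points.
readings = {"L1": 1.0, "L2": 9.0, "L3": 5.0}
zones = {point: classify(value) for point, value in readings.items()}
for point, zone in sorted(zones.items()):
    print(point, zone)
```

Keeping the limits in a single tuple makes it easy to swap in the values that apply to a particular class of machine.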

3.7 Importance of Technical Diagnostics on the Reliability of Technical Systems

The application of technical diagnostics within the concept of condition-based maintenance is a set of activities that positively affect the effectiveness of technical systems (availability, reliability, sustainability, safety, efficiency, etc.) and reduce energy consumption and environmental pollution. Condition-based maintenance usually gives the user sufficient time to plan, schedule, and make the necessary repairs before the equipment reaches a technical failure condition. This can avoid major failures and costly repairs. Monitoring, collection and analysis of signals that characterize the behavior of the technical system make it possible to detect the potential cause of a future failure early, long before the increased noise or temperature that inevitably precedes the failure, Fig. 15. Previous long-term practice in the maintenance of technical systems has held that the best way to optimize the performance of system elements has been


Fig. 15 Example of early warning of the possibility of failure

overhaul or replacement at a fixed interval. This is based on the assumption that there is a direct relationship between the operating time (or number of cycles) of the equipment and the likelihood of failure. Moubray [22] emphasizes that the relationship between operating time (age) and failure holds for some types of failure, but that it is no longer generally credible because equipment is now much more complex and sophisticated than it was fifteen years ago. This understanding of maintenance has led to significant changes in techniques for predicting equipment failure dates. Unless there is a clear indication of age and equipment obsolescence factors, overhaul or replacement at fixed intervals may have little or no effect on improving the reliability of rotating equipment. In this case, it is more realistic to develop maintenance strategies that assume that equipment failure can occur at any time, and not at a fixed time. Choosing the right maintenance strategy is a very professional and sensitive process. In the long run, it can result in a series of events that have a decisive impact on the overall result of the business system. Choosing an appropriate maintenance strategy for a production system in continuous operation means planning the operation of the complete system and the conditions under which it is provided. Proper choice of maintenance strategy within the continuous production process means providing the following conditions:

• the optimal number of qualified maintenance workers deployed in shift and intervention work,
• the optimal number of available spare parts,


• availability and application of modern technical systems for monitoring operating parameters,
• positive impact on the degree of effectiveness of the production system (reliability, availability, readiness),
• planning and implementation of repairs in terms of failure prevention,
• cost tracking control,
• application of safety measures at work,
• positive impact on environmental protection.

Providing these conditions means optimizing maintenance procedures. A wrong or inadequate choice of maintenance strategy may give a better financial result in the short term, since it usually amounts to avoiding financial investment in the maintenance service (personnel, equipment, organization, management…). In the long run, the wrong strategy inevitably requires large investments. The term investment here refers to investment in equipment and professional staff and to the improvement of the technical means used by the maintenance service to carry out the planned strategy. Today, the modern understanding of maintenance does not view it as “repair”; maintenance should instead be approached as a set of targeted activities aimed at failure prevention. The correct choice of maintenance strategy is a rational planning of the operating costs of the entire production system over a long period of time.

4 Application of Technical Diagnostics in Industrial Practice

Technical diagnostics is a very suitable and desirable methodology for application in industrial production systems in terms of control and insight into the state of the elements of technical systems and prediction of their state and expected behavior in the future. To illustrate the application of technical diagnostics and its significance, i.e., the effects achieved in providing normal conditions for the functioning of technical systems, this section describes a specific event: the observation and repair of a critical situation at the working position of a high-voltage motor of the paper machine within an industrial system for the production of toilet paper in Banja Luka, Republika Srpska, Bosnia and Herzegovina, Fig. 16. The paper machine is a complex technical system whose ultimate function is the production of low-grammage hygienic paper with a grammage of 15–40 g/m². The complete technological process of preparation and creation of paper tape consists of the following technical units:

• paper pulp preparation plant: the input semi-finished product is paper pulp ground in water with a dry matter content of up to 5%,


Fig. 16 Paper machine of an industrial plant in Banja Luka

• machine section of the paper machine plant: devices for mixing (mixing tubs, pumps, fresh and regenerated water tanks, …), homogenization (equalizers, mills, concentration meters, …), transport (pumps, pipelines and related instrumentation), purification (vertical and centrifugal purifiers) and the central pump for delivery of prepared paper pulp to the paper machine,
• paper machine: a technical system consisting of several functionally dependent units, among which can be distinguished:
  – wet part: subsystem for injecting paper pulp (flow), a system of rollers that move the sieves through which water is separated from the pulp by pressing, and presses (rotary rollers for additional squeezing of water from the pulp by pressure and vacuum),
  – dry part: cylinder for drying paper with steam and creping with the help of a system of knives, and a system for winding the already formed paper tape created in the process of creping,
  – recuperation: a system that provides the conditions for drying paper, which includes steam–air tube exchangers, high-voltage motors for hot air fans, the hot air drying hood and all accompanying instrumentation (condensers, air flow regulators, thermocouples, etc.).

The output of the described technological process is a semi-finished product: a paper strip that is wound on winding shafts and then passed to a cutting machine and cut to the formats needed either for processing in the company's own paper converting plants or for sale as a final semi-finished product with a defined specification of quality parameters.


4.1 Description of the Reference Event

In order to analyze the importance of the application of technical diagnostics within a production system that operates in a continuous mode, a specific event that occurred in 2019 during one of the regular diagnostic inspections of the technical assemblies of the paper machine is described. The maintenance service of this production system uses three forms of preventive diagnostic examination and condition control of the characteristic elements of the paper machine:

• an on-line diagnostic system for monitoring the state of vibration (at critical positions of the paper machine presses, Fig. 17),
• periodic measurement of a number of predefined elements of the paper machine with a portable vibrodiagnostic device and subsequent analysis of the measured values by an external specialized company,
• use of the company's own portable measuring device for instant control of vibration condition by shift maintenance workers.

Fig. 17 Structure of the on-line diagnostic system at paper machine press positions


In the case of the described paper machine, one of the regular preventive activities of the maintenance service is contracted vibrodiagnostic examination, carried out by an external service from the Czech Republic. Experts from this specialized external company come to the site once or twice a month and inspect the previously defined measuring points of the functional elements of the paper machine (drive and guide rollers, presses, drying cylinder) and auxiliary units (high-voltage recuperation motors with couplings and fan bearings, mill motors and vacuum plants). Examples of parts of the diagnostic report with the values of the measured parameters are shown in Figs. 18, 19 and 20. The external company uses portable vibration measuring devices such as the Microlog CMVA65 and Microlog CMVA55 as on-site diagnostic equipment, Fig. 21. These devices can measure, archive and instantly analyze the vibration spectrum. During the regular control and measurement of the vibration condition of the bearings at the position of the high-voltage motor that drives the hot air recirculation fan for drying the paper strip on the dry side of the hood, on March 21, 2019, the measured bearing vibration values at position L2 were significantly higher than allowed, Fig. 22. Two methods implemented in SKF Microlog frequency analyzers were used to evaluate the condition of the motor bearings: “Enveloping” and HFD (High Frequency Detection) (Fig. 23). Both trend diagrams show a visible increase in the overall values, which indicates worsening of the bearing condition. Figure 24 shows the frequency spectrum measured with the Enveloping method and Fig. 25 shows the measured signal in the time domain. Impacts occurring at a frequency of approximately 575 rpm are visible in the time domain. This frequency is also clearly visible and dominant in the frequency spectrum. The frequency of 575 rpm is 0.40 times the rotational speed of the motor, which corresponds to the cage fault frequency of the bearing. The appearance of this frequency indicates a problem with the cage of the L2 bearing. A cage problem is severe because the cage separates the rolling elements. If the cage breaks, the rolling elements change position and the bearing is immediately blocked. The rolling elements then no longer roll but slide on the bearing races, which leads to a large increase in temperature and, in most cases, to damage to the rotor shaft. Physical contact between the rotor and stator is also possible, which leads to total destruction of the motor. Comparing the obtained results with previous measurements (trend/measurement history), it could be clearly seen that there was a sudden jump in vibration values at the position of the front bearing L2 of the high-voltage motor (the side toward the motor fan), a dangerous situation that can at any time cause a serious problem not only on the bearing itself but also on the motor and the associated motor-coupling-fan bearing assembly.
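The relation between the observed 575 rpm component and the shaft speed can be checked with the standard kinematic formula for the cage (fundamental train) frequency of a rolling bearing with a stationary outer ring, FTF = (f_r / 2)(1 − (d / D) cos α). The sketch below is only an illustration: the text gives the ratio 0.40 and the observed frequency, but not the bearing geometry, so the rolling element diameter d, pitch diameter D and contact angle α used here are hypothetical values chosen because they reproduce that ratio.

```python
import math


def cage_frequency_ratio(element_d, pitch_d, contact_angle_deg=0.0):
    """Cage (fundamental train) frequency divided by shaft speed.

    Valid for a bearing whose outer ring is stationary and whose
    inner ring rotates with the shaft.
    """
    cos_a = math.cos(math.radians(contact_angle_deg))
    return 0.5 * (1.0 - (element_d / pitch_d) * cos_a)


# Values quoted in the text.
OBSERVED_FAULT_RPM = 575.0
REPORTED_RATIO = 0.40

# Shaft speed implied by the two quoted values.
implied_shaft_rpm = OBSERVED_FAULT_RPM / REPORTED_RATIO

# Hypothetical geometry (assumption): d/D = 0.2 with zero contact angle.
ratio = cage_frequency_ratio(element_d=20.0, pitch_d=100.0)

print(f"implied shaft speed: {implied_shaft_rpm:.1f} rpm")
print(f"cage frequency ratio for assumed geometry: {ratio:.2f}")
print(f"fault frequency: {OBSERVED_FAULT_RPM / 60.0:.2f} Hz")
```

A cage frequency somewhat below half the shaft speed is typical for this bearing arrangement, which is why a dominant component at about 0.4 times the running speed points to the cage.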


Fig. 18 Example of a part of a diagnostic report - values of roller bearing vibrations


Fig. 19 Example of a part of a diagnostic report—analysis of roller bearing condition


Fig. 20 Example of a part of a diagnostic report—condition of engine bearings


Fig. 21 Microlog CMVA65 portable vibration meter and analysis

Fig. 22 Trend of vibration value measurement at the position of bearing L2 of a high voltage hot air fan motor on recuperation with the Enveloping method

Figure 26 shows the working position of the high-voltage motor that drives the fan for recirculation of hot air on the recuperation (“dry”) side of the paper drying hood. Based on experience from previous years, when sudden bearing failure, brush problems or voltage problems on several occasions led to serious failures and burnouts of both high-voltage motors (Figs. 27 and 28) and other motor units (Fig. 29), a state of high risk of a serious emergency was declared.


Fig. 23 Trend of measuring oscillation amplitude values with HFD method

Fig. 24 Frequency spectrum measurement and analysis results

Fig. 25 Measured signal values in the time domain

The conclusion of the expert diagnostic finding is that the cage problem of the bearing is severe and can lead to immediate damage to the bearing and ultimately to severe motor damage. Based on the expert opinion of the specialist who performed the measurement and who recommended the replacement of the L2 bearing of the high-voltage motor for hot air recirculation on recuperation


Fig. 26 High voltage recuperation motor operating position

Fig. 27 Breakdown of the winding of the high-voltage fan motor from 2017

Fig. 28 High vibrations and cracks of the high-voltage motor support, 2018


Fig. 29 DC motor drive failure

in the shortest possible time, the responsible maintenance personnel adopted executive measures and set the terms for specific preventive and corrective activities:

• a decision to extend operation in high-risk mode, with intensive, half-hourly temperature measurement by shift maintenance at the critical place (the high-voltage motor bearing) and a clear instruction to shut down the entire production process immediately if the temperature rises above 55 °C, because that would mean immediate entry into an emergency condition. In addition to constant temperature control, a written order was issued for intensive lubrication (greasing) of this position. The decision to continue operation, and to accept the associated risk, was made on the basis of a technical assessment of the situation, taking into account the needs of the production cycle, fulfillment of the production plan and compliance with obligations to customers,
• preparation of emergency maintenance activities during the next day, which were related to:
  – providing the required number of workers (locksmiths and electricians of the maintenance service, auxiliary technological staff and support of logistics workers),
  – providing the necessary tools and devices: various types of manual chain hoists, certified ropes, devices for dismantling bearings and couplings, and forklifts,
  – providing a crane as a service of an external provider, since the company does not have its own means of lifting heavy loads. The crane had to be urgently ordered and delivered to the site. The position of the high-voltage motor requires the crane to operate from outside the production hall, so that the extraction of the motor from the working position is carried out in the shortest possible time,
• realization of the high-voltage motor replacement as an emergency maintenance activity.
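The temporary operating rule described in the first measure above (half-hourly bearing temperature readings with immediate shutdown above 55 °C) can be written down as a simple check. The sketch below is only an illustration of that rule: the limit and the reading interval come from the text, while the data structure, the function name and the sample log are hypothetical.

```python
from dataclasses import dataclass

SHUTDOWN_LIMIT_C = 55.0     # shutdown threshold quoted in the text
READING_INTERVAL_MIN = 30   # half-hourly readings quoted in the text


@dataclass
class Reading:
    minute: int      # minutes since the start of high-risk operation
    temp_c: float    # measured bearing temperature in degrees Celsius


def first_shutdown_reading(readings, limit_c=SHUTDOWN_LIMIT_C):
    """Return the first reading that exceeds the limit, or None."""
    for reading in readings:
        if reading.temp_c > limit_c:
            return reading
    return None


# Hypothetical log of readings taken every 30 minutes (assumed values).
temps = [48.5, 50.1, 52.7, 56.2, 57.0]
log = [Reading(i * READING_INTERVAL_MIN, t) for i, t in enumerate(temps)]

trigger = first_shutdown_reading(log)
if trigger is None:
    print("continue operation")
else:
    print(f"shut down: {trigger.temp_c} °C at minute {trigger.minute}")
```

The rule uses a strict comparison, matching the instruction to stop when the temperature rises above the limit.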

4.1.1 Remediation of a Critical Situation in the Work Position

Maintenance activities are often very complicated and specific to certain types of industrial production plants because of numerous external and internal factors that affect the conditions and manner of their implementation. This case is a real example of very demanding preparation and realization, with several unpredictable situations that had to be solved while the observed problem was being eliminated. Several specifics were present in the preparation of the maintenance activities:

• careful planning of the most favorable period for shutting down the plant with regard to the current production plan and the already defined deadlines for delivery of semi-finished products to the customer,
• available working space at the working position of the high-voltage electric motor,
• use of the presence of a technician for diagnostic measurement,
• the required number of workers for the intervention,
• crane engagement: external company and availability of equipment,
• preparation of the necessary tools (based on events in the history of replacement of this motor), …

The repair, i.e., the replacement of the problematic high-voltage motor and the elimination of unforeseen problems that occurred during the maintenance activities, is shown in Figs. 30, 31, 32 and 33. The lightweight partition wall made of 120 mm thick panels had to be dismantled in order to give the crane access and significantly shorten the time needed to pull the high-voltage motor out of the working position. It takes on average up to 6 h to remove the motor toward the outside and insert the spare motor, while the same activity carried out toward the interior of the production hall took 16 h because of numerous physical obstacles such as the inner panel wall, steam and condensate piping, the impossibility of the crane reaching the extraction space, etc.

Fig. 30 Working and spare high-voltage electric motor of the recuperative fan


Fig. 31 Dismantling a light partition panel wall on recuperation

Fig. 32 Disassembly of the coupling from the working motor, finishing of the spacer ring and installation of the coupling on the spare motor

Fig. 33 Setting the spare high-voltage motor to the working position using a truck crane

The most complex problem that arose during the replacement of the high-voltage motor was the discovery that the mounting holes on the spare motor did not match the support points on the metal structure and the concrete base. The reason was that the spare motor came from a different manufacturer and was intended as a replacement for two similar positions, and its support dimensions had been made for the second position. After the necessary measures for position correction were defined, new corresponding holes were cut in the metal support, Fig. 34. This


Fig. 34 Correcting the location of the motor support on the concrete base

Fig. 35 Centering the connector and connecting the cables to the high voltage motor

necessary intervention took a significant share of the total time spent on motor replacement. With the installation and centering of the coupling and the connection of the motor to the existing cables, the basic maintenance activities were completed, Figs. 35 and 36. After almost 14 h of continuous work on the replacement of the high-voltage motor, and successful testing of the motor with parameters corresponding to the operating conditions, the conditions for resuming the operation of the plant were created. After the resumption of production, the light panel wall was reinstalled, which completed the maintenance activity. A control vibration measurement at this position after the motor replacement showed satisfactory values.

5 Discussion of Results

The application of technical diagnostic methods has a very positive impact on the degree of realization of planned production activities of industrial systems. In the case of the previously described industrial system for the production of toilet paper, the following characteristic indicators can be highlighted:


Fig. 36 Spare high-voltage motor installed and tested in working position

1. Reduction of the total downtime: preventing failure states of the elements of technical systems that would have greater negative consequences for their technical condition reduces the total downtime, which increases the reliability and overall effectiveness of the technical production system. In the specific case, the total production stop caused by the decision on the planned shutdown of the production plant based on the results of technical diagnostics was 14 h, including unplanned, so-called hidden problems that had to be solved operatively during the maintenance work and which, by rough estimate, extended the expected duration of the downtime by some 4 h. Compared with an almost identical event from the past, which lasted a total of 22 h and involved a serious accident (motor burnout, bearing damage on the fan coupling side, waiting for the organization and engagement of an external company to provide a crane, and no unforeseen/unplanned additional problems), production downtime was reduced by 8 to 10 h in the case of only one accident. Taking an average saving of 9 h, the production (P_o) that could be realized in that time is estimated from

$$P_o = 9\ \text{h} \cdot 4.8\ \frac{\text{t}}{\text{h}} = 43.2\ \text{t} \tag{3}$$

where 4.8 t/h is the average production of the paper machine per hour of operation. If, given the diversity of the product range, an average gross margin (GM) of 290 €/t of paper produced is assumed, the direct potential production loss (T_p) can be


calculated, i.e., the potential profit (P_r) realized by selling this production:

$$T_p = P_r = GM \cdot P_o = 290\ \frac{\text{€}}{\text{t}} \cdot 43.2\ \text{t} \approx 12{,}580\ \text{€} \tag{4}$$

which is a significant financial amount, given that the estimate relates to only one accident event in a system with a constant production cycle.

2. Maintenance cost savings: the maintenance cost savings can be estimated by comparing the costs of a preventive planned repair with the costs of a possible emergency situation. The costs common to both cases are the labor cost of the maintenance workers engaged in replacing the high-voltage motor during the downtime and the cost of the crane. The additional costs in the case of an accident are:

• the cost of rewinding the burnt motor (C_v):

$$C_v = P_{VM} \cdot C_{PVM} = 132\ \text{kW} \cdot 100\ \frac{\text{€}}{\text{kW}} \approx 13{,}000\ \text{€} \tag{5}$$

where P_VM is the high-voltage motor power (kW) and C_PVM is the estimated cost of motor rewinding per unit of power,

• the transport cost (C_t) to the nearest service center (in this case abroad), including customs duties and taxes, of 1,500 € (estimate based on previous cases).

Total savings of currently identifiable maintenance costs (C_m):

$$C_m = C_v + C_t = 13{,}000\ \text{€} + 1{,}500\ \text{€} = 14{,}500\ \text{€} \tag{6}$$
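The estimates in Eqs. (3) to (6) can be reproduced with a few lines of arithmetic. The sketch below uses only the input values quoted in the text; the variable names are illustrative. The exact products are 12,528 € for the lost margin and 13,200 € for the rewinding cost, so the figures printed in the text appear to be rounded estimates, and the sum computed here differs slightly from a sum of the printed values.

```python
# Input values quoted in the text.
HOURS_SAVED = 9.0             # average downtime reduction, h
OUTPUT_T_PER_H = 4.8          # average paper machine output, t/h
GROSS_MARGIN_EUR_PER_T = 290.0
MOTOR_POWER_KW = 132.0
REWIND_COST_EUR_PER_KW = 100.0
TRANSPORT_COST_EUR = 1500.0

# Eq. (3): production that can be realized in the saved time.
production_t = HOURS_SAVED * OUTPUT_T_PER_H

# Eq. (4): avoided loss of gross margin on that production.
lost_margin_eur = GROSS_MARGIN_EUR_PER_T * production_t

# Eq. (5): cost of rewinding the burnt motor.
rewind_cost_eur = MOTOR_POWER_KW * REWIND_COST_EUR_PER_KW

# Eq. (6): identifiable maintenance cost savings.
maintenance_saving_eur = rewind_cost_eur + TRANSPORT_COST_EUR

# Sum of the production and maintenance savings.
total_saving_eur = lost_margin_eur + maintenance_saving_eur

print(f"production:          {production_t:.1f} t")
print(f"avoided margin loss: {lost_margin_eur:,.0f} EUR")
print(f"rewinding cost:      {rewind_cost_eur:,.0f} EUR")
print(f"maintenance saving:  {maintenance_saving_eur:,.0f} EUR")
print(f"total saving:        {total_saving_eur:,.0f} EUR")
```

Keeping the inputs as named constants makes it straightforward to repeat the estimate for a different downtime reduction or margin.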

Total savings in production costs (avoided loss of production) and maintenance costs (C_u):

$$C_u = T_p + C_m = 12{,}580\ \text{€} + 14{,}500\ \text{€} = 27{,}080\ \text{€} \tag{7}$$

3. Integration of vibration level monitoring into a single on-line monitoring system: in this case it is possible to connect vibration monitoring of the bearings of the high-voltage electric motors that drive the recuperation air fans to the already installed on-line diagnostic system Octavis, Fig. 17. The existing on-line diagnostic system Octavis, manufactured by IFM Electronic, is a simple modular system that can monitor the condition of bearings and vibration values using high-frequency signal analysis. The four-channel PLC unit type VSE100 forms the basis of the diagnostic system and has the


possibility of continuous measurement and evaluation of the signal obtained from the VSE001 acceleration sensor, Fig. 37. All collected values are stored in a pre-set structure of the software system. Two four-channel PLC units are currently installed, with a total of 6 vibration monitoring sensors at critical positions of the paper machine press (Fig. 38), so two channels remain free. Together with another four-channel PLC unit, these could be used for condition diagnostics to secure both critical positions of the high-voltage motors on recuperation. The measured results are forwarded over the Ethernet network to the control system with the VES003 software installed, where they are archived as long-term trends of individual measurements. The diagnostic system enables real-time oscillation frequency analysis for any sensor connected to the system.

Fig. 37 Electrical cabinet of the vibrodiagnostic system with VSE001 PLC measuring units manufactured by IFM electronic

Fig. 38 Vibration sensors at measuring points of critical positions presses the paper machine


Fig. 39 Opening of the scolding high-voltage motor recovery

4. Fire prevention: the working position of the high-voltage motor of the air blower for drying on the dry side of recuperation is especially interesting from the point of view of fire protection because a large amount of paper dust often accumulates in the working environment, Fig. 39.

Certainly, an accident followed by the burning of a high-voltage electric motor at this position can cause a fire and ignite the paper dust, so monitoring the condition of the motor would significantly increase the level of safety and protection of this position with respect to the increased fire risk.

5. Technical diagnostics and classification of downtime: this specific event opened a debate about the categorization of the resulting production downtime, i.e., the correct understanding of the character of the maintenance action taken after technical diagnostics indicates that the optimal condition of a monitored component of the technical system has been violated. Under the established system for evaluating the activities of the maintenance service and measuring the overall effectiveness of the paper machine production system, the question arose whether, when maintenance interventions follow diagnostic measurement, analysis and a decision to stop the production plant, the production downtime should be treated as maintenance downtime due to failure, because the production process is stopped at that moment, or as preventive maintenance downtime. The condition-based maintenance concept is a form of preventive maintenance in which decisions on subsequent maintenance activities are made on the basis of periodic or continuous control of the technical condition of the elements of the technical system. The form

and manner of realization of planned activities depend on the measurement results and on how far the observed components of the technical system have departed from their optimal technical characteristics. On the other hand, there is a clear difference between scheduled maintenance and condition-based maintenance. In scheduled maintenance, all subsequent maintenance activities are carried out according to a precisely defined time schedule, while in condition-based maintenance the current technical condition and the forecast of the technical system's behavior in the near future are the most important factors. If the measured parameters of the technical system are outside normal operating tolerances, or if the monitored quantities show a markedly negative trend that tends to move from an inappropriate state to an increased or high risk of emergency, a decision to intervene is made. Maintenance interventions that follow from technical diagnostics and from the resulting decisions are therefore treated as a form of preventive maintenance, because they prevent far greater and more serious consequences that could significantly disrupt the production process (high maintenance costs, extremely expensive spare parts, engagement of specialized firms, etc.). In this particular case as well, the production stoppage was accepted and recorded as the use of planned time for preventive repairs by the maintenance service.

6. Optimization of spare parts/units: the specificity of the working positions of the high-voltage recuperation motors (dry and wet side) is that the recuperation area contains two motors with the same operating characteristics, differing in the construction of the metal frame with a chain crane that assists in the manipulation and extraction of the high-voltage motor on the "wet" side of the air recirculation. Accordingly, at least one spare motor with identical operating characteristics is required, which provides the conditions for minimal downtime in the event of unforeseen accident situations or of replacement when these units need an overhaul. As there is a real possibility of a simultaneous problem at both working positions, there is a justified need to secure these positions with an online diagnostic system while keeping only the one existing spare unit. This raises the reliability of this part of the paper machine technical system by ensuring the continuity of the production process while optimizing the stock of spare units and their spare parts.
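The condition-based decision rule described in item 5 (intervene when a measured value leaves its tolerance band or when its trend is markedly negative) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the least-squares trend estimate, the thresholds and the sample readings are all hypothetical and are not taken from the Octavis system or from the plant data.

```python
from statistics import mean


def trend_slope(values):
    """Least-squares slope of equally spaced measurements (units per sample)."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def intervention_needed(values, tolerance_limit, slope_limit):
    """True if the latest value exceeds the tolerance or the trend is too steep.

    Both limits are hypothetical tuning parameters, not values from the source.
    """
    if not values:
        return False
    return values[-1] > tolerance_limit or trend_slope(values) > slope_limit


# Hypothetical vibration readings (illustrative numbers only)
readings = [2.0, 2.5, 3.0, 3.5, 4.0]
print(intervention_needed(readings, tolerance_limit=7.1, slope_limit=0.3))
```

In this sketch the latest reading is still inside the assumed tolerance, so only the trend criterion can trigger the intervention; a real system would also need filtering of measurement noise before a slope is trusted.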

6 Conclusion

Technical diagnostics within the monitoring of technical system elements proves to be a particularly suitable tool for failure prevention in industrial systems that operate continuously, where the imperative of high reliability and efficient utilization of production capacity does not allow frequent stops for preventive inspection and repair of technical system parts. By monitoring the condition of system elements in operation using computer technology and modern software solutions, it is possible to obtain high-quality, timely and objective information on the current state of the elements of technical systems in operation, as well as a precise prognosis for the coming period, on the basis of which timely executive decisions can be made and the optimal level of utilization of production capacities ensured. Ensuring a high degree of reliability is a prerequisite for optimal time utilization of production capacities and maximum realization of production plans at minimal cost, which is how industrial systems try to stand out from the competition and survive in the increasingly demanding global market.


Reliability Assessment of Replaceable Shuffle Exchange Network with an Additional Stage Using Interval-Valued Universal Generating Function

Amisha Khati and S. B. Singh

Abstract In this paper, a Shuffle Exchange Network with an additional stage (SEN+) is examined whose component probabilities are not known precisely. To overcome this difficulty, its reliability is determined using the interval-valued universal generating function (IUGF) method, so that the transient state probabilities are obtained as intervals. The reliability is analyzed in terms of three parameters: terminal, broadcast, and network reliability. In addition, if any Switching Element (SE) of the network fails and the system stops working, that SE is replaced at a fixed replacement rate so that the network can continue operating. A numerical example is provided to illustrate the proposed technique.

Keywords Terminal reliability (TR) · Broadcast reliability (BR) · Network reliability (NR) · Mean time to failure (MTTF) · Interval-valued universal generating function (IUGF)

A. Khati · S. B. Singh (B)
Department of Mathematics, Statistics and Computer Science, G. B. Pant University of Agriculture and Technology, Pantnagar, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_9

1 Introduction

In parallel computer systems, interconnection networks provide communication among memory modules, processors, and other devices by interconnecting them. Interconnection networks are currently used in a wide range of applications, from very-large-scale integration (VLSI) circuits to wide-area computer networks. For certain applications they have been studied in depth for many years. Over the last 10 years, however, interconnection network technology has developed rapidly and is now being built into the current generation of multiprocessor systems. The technology is mature enough to find its way into commercial products, while continually opening new avenues for development and application. Many features are pivotal to the progress of interconnection networks.

To gauge the performance of a network, reliability is often taken as a vital factor. The reliability, and accordingly the overall performance, of an interconnection network depends on how its components are interconnected. The reliability of a Multistage Interconnection Network (MIN) can be regarded as a measure of the network's ability to transfer messages or data from the input to the output terminals. One of the most widely used MINs is the Shuffle Exchange Network (SEN), which provides a unique path between any source node and sink node; if any path in the system becomes defective, the whole system fails. To overcome this problem, engineers designed SEN+, a double-path MIN, by adding one more stage to the existing SEN. The reliability of SEN+ and SEN has been assessed by many researchers using various approaches. Fard and Gunawan (2005) introduced a modified SEN consisting of 1 × 2 switching elements at the source nodes, 2 × 2 at the intermediate stages, and 2 × 1 at the sink nodes, and estimated the terminal reliability of SEN and the modified SEN. Their results indicate that the terminal reliability is higher for networks of size greater than 4 × 4. Yunus and Othman (2011) examined six types of SEN with additional stages: SEN+, Irregular Augmented SEN (IASEN), Irregular Augmented Shuffle Network (IASN), Generalized SEN (GSEN), Improved Irregular Augmented Shuffle Multistage Interconnection Network (IIASN), and Irregular Modified Alpha Network (ALN). They observed that adding stages provides more redundant paths, and that the auxiliary links reduce latency and increase fault tolerance. However, increasing the network size increases the cost and complexity of the network. Bistouni and Jahanshahi [4] introduced an approach, called rearranging links, to improve the reliability of MINs. They applied it to SEN+ and ASEN and evaluated the terminal reliability, finding that rearranging the links is an efficient way to raise the reliability of MINs and that it leads to a less expensive MIN. Bistouni and Jahanshahi [2] showed how to improve the fault tolerance and reliability of SEN by increasing the number of stages and deduced that the reliability of SEN+ is better than that of both SEN and SEN with two extra stages (SEN+2); the reliability of SEN+ was always found to exceed that of SEN+2. Bistouni and Jahanshahi [3] evaluated the reliability of SENs using Reliability Block Diagrams (RBDs). Yunus and Othman [14] proposed a new network, SEN with minus one stage (SEN-), and compared it with SEN, SEN+, and SEN+2 on three reliability indices, viz., terminal, broadcast, and network reliability. All three indices were found to be higher for SEN- than for the other three networks. Bisht and Singh [1] computed the terminal, broadcast, and network reliability of SENs by applying the UGF method. Bistouni and Jahanshahi [5] analyzed the reliability importance of the switching elements in SEN, SEN+, and SEN+2. They concluded that a highly reliable network can be achieved by replacing the sensitive switching elements and by using highly reliable switching elements. From the above discussion it can be seen that none of these researchers examined networks with uncertainty, although in real-world situations networks commonly carry many uncertainties. It is therefore important to analyze the reliability of SEN+ with its uncertainty taken into account, which makes the reliability analysis of the network considerably more credible. The causes of uncertainty in complex systems are:

(a) Environmental factors such as temperature, strain, humidity, vibration, and stress, which introduce uncertainty into the device and its components.
(b) When evaluating a complex system, it is in some cases very difficult to obtain exact reliability data.
(c) When a system is used routinely, its overall performance degrades with time, and consequently the probabilities of its components change with time.

The interval-valued universal generating function (IUGF) is a way to analyze the reliability of a network with uncertainties. Li et al. [8] proposed a way to assess the reliability of a multi-state system (MSS) when the available data about the components is insufficient. In such cases, instead of precise values of the probabilities, interval-valued probabilities of the components can be considered. To obtain the interval-valued reliability of the MSS, an IUGF was developed, and the results show that the proposed method is efficient when the state probabilities of the components are imprecise (or uncertain). Pan et al. [10] gave an approach for assessing the interval-valued reliability of an MSS with uncertainty. They constructed an algorithm for the IUGF method and validated it on examples. Kumar et al. [7] considered a 2-out-of-4 system comprising two components arranged in series and evaluated its interval-valued reliability using the IUGF technique. Singh [9] assessed the reliability and MTTF of a non-repairable MSS using the IUGF approach, taking into account the uncertainties in the probabilities and failure rates of the components of the system.

Moreover, all of these researchers considered a SEN+ in which a failed component cannot be replaced by a working one. In practice, however, a network is needed in which any failed component can be replaced so that the entire network does not stop working. Keeping these facts in mind, we consider a SEN+ whose component probabilities are not known precisely and determine its reliability using the IUGF method, so that the transient state probabilities are evaluated as intervals. The reliability is examined with respect to three parameters: terminal, broadcast, and network reliability. In the considered network, if any SE fails and the network stops operating, that SE can be replaced so that the network continues to operate. The proposed SEN+ is of size 8 × 8. The differential equations governing the network's behavior are obtained by applying the supplementary variable technique. The probabilities of all the components are obtained using the Laplace transform, and from them the reliability of the network is computed. The MTTF of the SEN+ is also analyzed. Finally, the model is demonstrated with a numerical example.

2 Assumptions

1. Initially, the network is in good condition, i.e., all the nodes and links are working properly.
2. The network considered is an 8 × 8 SEN+ in which each SE is of size 2 × 2.
3. Each component of the network is either in a working state or in a failed state.
4. Replacement is carried out only when the network fails completely, and after replacement the network is as good as new.
5. The failure rates of different components are different, while the replacement rates of all the components are assumed to be the same.

3 Notations

$x$: Elapsed replacement time
$\lambda_i$: Failure rate of the ith component
$\eta$: Replacement rate for all of the network's components
$\underline{P}_i$: Lower bound of the probability of the ith component, represented by a real number
$\overline{P}_i$: Upper bound of the probability of the ith component, represented by a real number
$[R]$: Interval-valued reliability of the network

4 Definitions

(i) Terminal Reliability: The terminal reliability (TR) of a network is defined as the probability of the existence of at least one fault-free path between a source–destination pair.
(ii) Broadcast Reliability: The broadcast reliability (BR) of a network is defined as the probability of successful communication between a single source node and all the destination nodes.
(iii) Network Reliability: The network reliability (NR) of a network is defined as the probability that all the source nodes are connected to all the destination nodes.
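To make the idea of an interval-valued terminal reliability concrete, the following minimal sketch evaluates a static series-parallel reliability expression at the lower and upper ends of the component intervals. It is not the IUGF procedure of this chapter. It assumes a reliability block diagram in which the first and last SE are in series with two redundant paths of two SEs each (an assumed pairing of SE2 with SE4 and SE3 with SE5), and the interval values are hypothetical. Because the expression is non-decreasing in every component reliability, its bounds are attained at the interval endpoints.

```python
def terminal_reliability(r):
    """TR of the assumed RBD: SE1 and SE6 in series with two parallel two-SE paths."""
    r1, r2, r3, r4, r5, r6 = r
    path_a = r2 * r4  # assumed first redundant path (SE2, SE4)
    path_b = r3 * r5  # assumed second redundant path (SE3, SE5)
    return r1 * (1 - (1 - path_a) * (1 - path_b)) * r6


def interval_terminal_reliability(intervals):
    """Bounds of TR for six (lower, upper) component reliability intervals."""
    for lo, hi in intervals:
        if not 0.0 <= lo <= hi <= 1.0:
            raise ValueError("each interval must satisfy 0 <= lower <= upper <= 1")
    lower = terminal_reliability([lo for lo, _ in intervals])
    upper = terminal_reliability([hi for _, hi in intervals])
    return lower, upper


# Hypothetical interval-valued SE reliabilities (illustrative numbers only)
intervals = [(0.90, 0.95)] * 6
tr_low, tr_high = interval_terminal_reliability(intervals)
print(f"TR lies in [{tr_low:.4f}, {tr_high:.4f}]")
```

The endpoint evaluation works here only because the structure function is monotone; for expressions that are not monotone in the uncertain parameters, proper interval arithmetic or an optimization over the intervals would be needed.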


5 Terminal Reliability of SEN+

The RBD for the terminal reliability of SEN+ is shown in Fig. 1 and the transition state diagram for the considered SEN+ is shown in Fig. 2.

(a) Formulation of mathematical model:

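Fig. 1 RBD for the TR of an 8 × 8 SEN+

Fig. 2 Transition state diagram for Fig. 1

Applying the supplementary variable technique to the proposed model, the set of difference-differential equations for the lower bound probabilities of the components of the model is obtained as:

\begin{align}
\left(\frac{d}{dt} + D\right)\underline{P}_0(t) &= \sum_{i \in \{1,8,9,10,11,12\}} \int_0^\infty \underline{P}_i(x,t)\,\eta(x)\,dx \tag{1}\\
\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \eta(x)\right)\underline{P}_1(x,t) &= 0 \tag{2}\\
\left(\frac{d}{dt} + \lambda_4\right)\underline{P}_2(t) &= \lambda_2\,\underline{P}_0(t) \tag{3}\\
\left(\frac{d}{dt} + \lambda_2\right)\underline{P}_3(t) &= \lambda_4\,\underline{P}_0(t) \tag{4}\\
\left(\frac{d}{dt} + \lambda_5\right)\underline{P}_4(t) &= \lambda_3\,\underline{P}_0(t) \tag{5}\\
\left(\frac{d}{dt} + \lambda_3\right)\underline{P}_5(t) &= \lambda_5\,\underline{P}_0(t) \tag{6}\\
\left(\frac{d}{dt} + \lambda_2 + \lambda_4\right)\underline{P}_6(t) &= \lambda_5\,\underline{P}_4(t) + \lambda_3\,\underline{P}_5(t) \tag{7}\\
\left(\frac{d}{dt} + \lambda_3 + \lambda_5\right)\underline{P}_7(t) &= \lambda_2\,\underline{P}_3(t) + \lambda_4\,\underline{P}_2(t) \tag{8}\\
\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \eta(x)\right)\underline{P}_i(x,t) &= 0, \qquad i = 8, 9, 10, 11, 12 \tag{9--13}
\end{align}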

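Boundary conditions:

\begin{align}
\underline{P}_1(0,t) &= \lambda_1\,\underline{P}_0(t) \tag{14}\\
\underline{P}_8(0,t) &= \lambda_2\,\underline{P}_6(t) \tag{15}\\
\underline{P}_9(0,t) &= \lambda_4\,\underline{P}_6(t) \tag{16}\\
\underline{P}_{10}(0,t) &= \lambda_3\,\underline{P}_7(t) \tag{17}\\
\underline{P}_{11}(0,t) &= \lambda_5\,\underline{P}_7(t) \tag{18}\\
\underline{P}_{12}(0,t) &= \lambda_6\,\underline{P}_0(t) \tag{19}
\end{align}

Initial conditions: $\underline{P}_0(t) = 1$ at $t = 0$, and all other state probabilities are zero at $t = 0$.

(b) Solution of the model: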

On taking the Laplace transform of Eqs. (1) to (13) along with the boundary conditions (14) to (19) and using the initial conditions, we get:

\begin{align}
(s + D)\,\underline{P}_0(s) &= 1 + \sum_{i \in \{1,8,9,10,11,12\}} \int_0^\infty \underline{P}_i(x,s)\,\eta(x)\,dx \tag{20}\\
(s + \lambda_4)\,\underline{P}_2(s) &= \lambda_2\,\underline{P}_0(s) \tag{21}\\
(s + \lambda_2)\,\underline{P}_3(s) &= \lambda_4\,\underline{P}_0(s) \tag{22}\\
(s + \lambda_5)\,\underline{P}_4(s) &= \lambda_3\,\underline{P}_0(s) \tag{23}\\
(s + \lambda_3)\,\underline{P}_5(s) &= \lambda_5\,\underline{P}_0(s) \tag{24}\\
(s + \lambda_2 + \lambda_4)\,\underline{P}_6(s) &= \lambda_5\,\underline{P}_4(s) + \lambda_3\,\underline{P}_5(s) \tag{25}\\
(s + \lambda_3 + \lambda_5)\,\underline{P}_7(s) &= \lambda_4\,\underline{P}_2(s) + \lambda_2\,\underline{P}_3(s) \tag{26}\\
\left(s + \frac{\partial}{\partial x} + \eta(x)\right)\underline{P}_i(x,s) &= 0, \qquad i = 8, 9, 10, 11, 12 \tag{27--31}
\end{align}

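Boundary conditions:

\begin{align}
\underline{P}_1(0,s) &= \lambda_1\,\underline{P}_0(s) \tag{32}\\
\underline{P}_8(0,s) &= \lambda_2\,\underline{P}_6(s) \tag{33}\\
\underline{P}_9(0,s) &= \lambda_4\,\underline{P}_6(s) \tag{34}\\
\underline{P}_{10}(0,s) &= \lambda_3\,\underline{P}_7(s) \tag{35}\\
\underline{P}_{11}(0,s) &= \lambda_5\,\underline{P}_7(s) \tag{36}\\
\underline{P}_{12}(0,s) &= \lambda_6\,\underline{P}_0(s) \tag{37}
\end{align}

(c) Transition state probabilities: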

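The equations from (20) to (37) lead to the following transition state probabilities of the considered model:

\begin{align}
\underline{P}_0(s) &= \frac{1}{s + D} \tag{38}\\
\underline{P}_1(s) &= \lambda_1\,\frac{1 - S(s)}{s}\,\underline{P}_0(s) \tag{39}\\
\underline{P}_2(s) &= \frac{\lambda_2}{s + \lambda_4}\,\underline{P}_0(s) \tag{40}\\
\underline{P}_3(s) &= \frac{\lambda_4}{s + \lambda_2}\,\underline{P}_0(s) \tag{41}\\
\underline{P}_4(s) &= \frac{\lambda_3}{s + \lambda_5}\,\underline{P}_0(s) \tag{42}\\
\underline{P}_5(s) &= \frac{\lambda_5}{s + \lambda_3}\,\underline{P}_0(s) \tag{43}\\
\underline{P}_6(s) &= \frac{\lambda_5 \lambda_3}{s + \lambda_2 + \lambda_4}\left[\frac{1}{s + \lambda_5} + \frac{1}{s + \lambda_3}\right]\underline{P}_0(s) \tag{44}\\
\underline{P}_7(s) &= \frac{\lambda_2 \lambda_4}{s + \lambda_3 + \lambda_5}\left[\frac{1}{s + \lambda_2} + \frac{1}{s + \lambda_4}\right]\underline{P}_0(s) \tag{45}\\
\underline{P}_8(s) &= \lambda_2\,\frac{1 - S(s)}{s}\,\underline{P}_6(s) \tag{46}\\
\underline{P}_9(s) &= \lambda_4\,\frac{1 - S(s)}{s}\,\underline{P}_6(s) \tag{47}\\
\underline{P}_{10}(s) &= \lambda_3\,\frac{1 - S(s)}{s}\,\underline{P}_7(s) \tag{48}\\
\underline{P}_{11}(s) &= \lambda_5\,\frac{1 - S(s)}{s}\,\underline{P}_7(s) \tag{49}\\
\underline{P}_{12}(s) &= \lambda_6\,\frac{1 - S(s)}{s}\,\underline{P}_0(s) \tag{50}
\end{align}

where $D = \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 + \lambda_5 + \lambda_6$.

Similarly, we can find the expressions for the upper bounds of the transition state probabilities of the network's components by replacing $\underline{P}_i(s)$ by $\overline{P}_i(s)$ (i = 0 to 12) and the lower-bound failure rates $\lambda_j$ by their upper bounds $\overline{\lambda}_j$ (j = 1 to 6) in Eqs. (38) to (50).

(i) Interval-valued reliability of the network: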

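The interval-valued terminal reliability of the replaceable SEN+ is given as:

\begin{equation}
R = \left[\ \underline{P}_0 + \underline{P}_2 + \underline{P}_3 + \underline{P}_4 + \underline{P}_5 + \underline{P}_6 + \underline{P}_7,\ \ \overline{P}_0 + \overline{P}_2 + \overline{P}_3 + \overline{P}_4 + \overline{P}_5 + \overline{P}_6 + \overline{P}_7\ \right] \tag{51}
\end{equation}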

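(ii) Mean time to failure:

The lower bound of the MTTF for the TR of SEN+ is obtained as:

\begin{align}
(\mathrm{MTTF})_{TR} &= \frac{1}{D} + \frac{\lambda_2}{D\lambda_4} + \frac{\lambda_4}{D\lambda_2} + \frac{\lambda_3}{D\lambda_5} + \frac{\lambda_5}{D\lambda_3} \notag\\
&\quad + \frac{\lambda_3 \lambda_5}{D(\lambda_2 + \lambda_4)}\left[\frac{1}{\lambda_5} + \frac{1}{\lambda_3}\right] + \frac{\lambda_2 \lambda_4}{D(\lambda_3 + \lambda_5)}\left[\frac{1}{\lambda_2} + \frac{1}{\lambda_4}\right] \tag{52}
\end{align}

The upper bound of the MTTF can be found by replacing the lower-bound failure rates $\lambda_i$ by their upper bounds $\overline{\lambda}_i$ (i = 1 to 6) in Eq. (52).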
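As a numerical illustration of Eq. (52), the sketch below evaluates the expression once with a set of lower-bound failure rates and once with a set of upper-bound failure rates. The rates are hypothetical and the function name `mttf_tr` is introduced only for this example. Because larger failure rates can give a smaller value of the expression, the two results are sorted before being reported as an interval; this ordering step is a choice made for the sketch.

```python
def mttf_tr(lam):
    """Evaluate Eq. (52); lam maps the component index 1..6 to a failure rate."""
    l1, l2, l3, l4, l5, l6 = (lam[i] for i in range(1, 7))
    D = l1 + l2 + l3 + l4 + l5 + l6
    # Terms with one failed SE on a redundant path
    single = l2 / (D * l4) + l4 / (D * l2) + l3 / (D * l5) + l5 / (D * l3)
    # Terms with both SEs of one redundant path failed
    pair_35 = l3 * l5 / (D * (l2 + l4)) * (1 / l5 + 1 / l3)
    pair_24 = l2 * l4 / (D * (l3 + l5)) * (1 / l2 + 1 / l4)
    return 1 / D + single + pair_35 + pair_24


def mttf_tr_interval(lam_lower, lam_upper):
    """Evaluate Eq. (52) at both sets of rates and return the results in order."""
    a, b = mttf_tr(lam_lower), mttf_tr(lam_upper)
    return min(a, b), max(a, b)


# Hypothetical failure-rate bounds per unit time (illustrative numbers only)
lam_lower = {i: 0.01 for i in range(1, 7)}
lam_upper = {i: 0.02 for i in range(1, 7)}
print(mttf_tr_interval(lam_lower, lam_upper))
```

With all six rates equal to a common value a, the expression reduces to 7/(6a), which gives a quick sanity check on any implementation.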

6 Broadcast Reliability of SEN+

The RBD for the BR of SEN+ is shown in Fig. 3 and the transition state diagram for the broadcast reliability of SEN+ is shown in Fig. 4.

(a) Formulation of mathematical model:

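Fig. 3 Broadcast RBD for an 8 × 8 SEN+

Fig. 4 Transition state diagram for Fig. 3

By applying the supplementary variable technique to the proposed model, the difference-differential equations for the lower bound probabilities of the components of the network in different states, governing the network's behavior, are obtained as:

\begin{align}
\left(\frac{d}{dt} + E\right)\underline{P}_0(t) &= \sum_{i \in \{3,10,11,12,13,14,17,20,21,22\}} \int_0^\infty \underline{P}_i(x,t)\,\eta(x)\,dx \tag{53}\\
\left(\frac{d}{dt} + \lambda_3 + \lambda_4 + \lambda_5\right)\underline{P}_1(t) &= \lambda_2\,\underline{P}_0(t) \tag{54}\\
\left(\frac{d}{dt} + \lambda_2 + \lambda_6 + \lambda_7\right)\underline{P}_2(t) &= \lambda_3\,\underline{P}_0(t) \tag{55}\\
\left(\frac{d}{dt} + \lambda_5\right)\underline{P}_4(t) &= \lambda_4\,\underline{P}_1(t) \tag{56}\\
\left(\frac{d}{dt} + \lambda_4\right)\underline{P}_5(t) &= \lambda_5\,\underline{P}_1(t) \tag{57}\\
\left(\frac{d}{dt} + \lambda_3 + \lambda_6 + \lambda_7\right)\underline{P}_6(t) &= \lambda_4\,\underline{P}_5(t) + \lambda_5\,\underline{P}_4(t) \tag{58}\\
\left(\frac{d}{dt} + \lambda_7\right)\underline{P}_7(t) &= \lambda_6\,\underline{P}_2(t) \tag{59}\\
\left(\frac{d}{dt} + \lambda_6\right)\underline{P}_8(t) &= \lambda_7\,\underline{P}_2(t) \tag{60}\\
\left(\frac{d}{dt} + \lambda_2 + \lambda_4 + \lambda_5\right)\underline{P}_9(t) &= \lambda_7\,\underline{P}_7(t) + \lambda_6\,\underline{P}_8(t) \tag{61}\\
\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \eta(x)\right)\underline{P}_i(x,t) &= 0, \qquad i = 10, 11, 12, 13, 14 \tag{62--66}\\
\left(\frac{d}{dt} + \lambda_7\right)\underline{P}_{15}(t) &= \lambda_6\,\underline{P}_6(t) \tag{67}\\
\left(\frac{d}{dt} + \lambda_6\right)\underline{P}_{16}(t) &= \lambda_7\,\underline{P}_6(t) \tag{68}\\
\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \eta(x)\right)\underline{P}_{17}(x,t) &= 0 \tag{69}\\
\left(\frac{d}{dt} + \lambda_5\right)\underline{P}_{18}(t) &= \lambda_4\,\underline{P}_9(t) \tag{70}\\
\left(\frac{d}{dt} + \lambda_4\right)\underline{P}_{19}(t) &= \lambda_5\,\underline{P}_9(t) \tag{71}\\
\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \eta(x)\right)\underline{P}_i(x,t) &= 0, \qquad i = 20, 21, 22 \tag{72--74}
\end{align}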

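Boundary conditions:

\begin{align}
\underline{P}_3(0,t) &= \lambda_3\,\underline{P}_1(t) + \lambda_2\,\underline{P}_2(t) \tag{75}\\
\underline{P}_{12}(0,t) &= \lambda_9\,\underline{P}_0(t) \tag{76}\\
\underline{P}_{11}(0,t) &= \lambda_8\,\underline{P}_0(t) \tag{77}\\
\underline{P}_{13}(0,t) &= \lambda_{10}\,\underline{P}_0(t) \tag{78}\\
\underline{P}_{22}(0,t) &= \lambda_2\,\underline{P}_9(t) \tag{79}\\
\underline{P}_{17}(0,t) &= \lambda_7\,\underline{P}_{15}(t) + \lambda_6\,\underline{P}_{16}(t) \tag{80}\\
\underline{P}_{20}(0,t) &= \lambda_5\,\underline{P}_{18}(t) + \lambda_4\,\underline{P}_{19}(t) \tag{81}\\
\underline{P}_{21}(0,t) &= \lambda_3\,\underline{P}_6(t) \tag{82}\\
\underline{P}_{22}(0,t) &= \lambda_2\,\underline{P}_9(t) \tag{83}
\end{align}

Initial conditions: $\underline{P}_0(t) = 1$ at $t = 0$, and all other state probabilities are zero at $t = 0$.

(b) Solution of the model: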

By taking the Laplace transform of Eqs. (53) to (83) and using the initial conditions, we get:

\begin{align}
(s + E)\,\underline{P}_0(s) &= 1 + \sum_{i \in \{3,10,11,12,13,14,17,20,21,22\}} \int_0^\infty \underline{P}_i(x,s)\,\eta(x)\,dx \tag{84}\\
(s + \lambda_3 + \lambda_4 + \lambda_5)\,\underline{P}_1(s) &= \lambda_2\,\underline{P}_0(s) \tag{85}\\
(s + \lambda_2 + \lambda_6 + \lambda_7)\,\underline{P}_2(s) &= \lambda_3\,\underline{P}_0(s) \tag{86}\\
\left(s + \frac{\partial}{\partial x} + \eta(x)\right)\underline{P}_3(x,s) &= 0 \tag{87}\\
(s + \lambda_5)\,\underline{P}_4(s) &= \lambda_4\,\underline{P}_1(s) \tag{88}\\
(s + \lambda_4)\,\underline{P}_5(s) &= \lambda_5\,\underline{P}_1(s) \tag{89}\\
(s + \lambda_3 + \lambda_6 + \lambda_7)\,\underline{P}_6(s) &= \frac{\lambda_4 \lambda_5\,\underline{P}_1(s)}{s + \lambda_4} + \frac{\lambda_4 \lambda_5\,\underline{P}_1(s)}{s + \lambda_5} \tag{90}\\
(s + \lambda_7)\,\underline{P}_7(s) &= \lambda_6\,\underline{P}_2(s) \tag{91}\\
(s + \lambda_6)\,\underline{P}_8(s) &= \lambda_7\,\underline{P}_2(s) \tag{92}\\
(s + \lambda_2 + \lambda_4 + \lambda_5)\,\underline{P}_9(s) &= \frac{\lambda_7 \lambda_6\,\underline{P}_2(s)}{s + \lambda_7} + \frac{\lambda_7 \lambda_6\,\underline{P}_2(s)}{s + \lambda_6} \tag{93}\\
\left(s + \frac{\partial}{\partial x} + \eta(x)\right)\underline{P}_i(x,s) &= 0, \qquad i = 1, 2, 3, 4, 11, 12, 13, 14 \tag{94--101}\\
(s + \lambda_7)\,\underline{P}_{15}(s) &= \lambda_6\,\underline{P}_6(s) \tag{102}\\
(s + \lambda_6)\,\underline{P}_{16}(s) &= \lambda_7\,\underline{P}_6(s) \tag{103}\\
\left(s + \frac{\partial}{\partial x} + \eta(x)\right)\underline{P}_{17}(x,s) &= 0 \tag{104}\\
(s + \lambda_5)\,\underline{P}_{18}(s) &= \lambda_4\,\underline{P}_9(s) \tag{105}\\
(s + \lambda_4)\,\underline{P}_{19}(s) &= \lambda_5\,\underline{P}_9(s) \tag{106}\\
\left(s + \frac{\partial}{\partial x} + \eta(x)\right)\underline{P}_i(x,s) &= 0, \qquad i = 20, 21, 22 \tag{107--109}
\end{align}

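Boundary conditions:

\begin{align}
\underline{P}_3(0,s) &= \lambda_3\,\underline{P}_1(s) + \lambda_2\,\underline{P}_2(s) \tag{110}\\
\underline{P}_{10}(0,s) &= \lambda_1\,\underline{P}_0(s) \tag{111}\\
\underline{P}_{11}(0,s) &= \lambda_8\,\underline{P}_0(s) \tag{112}\\
\underline{P}_{12}(0,s) &= \lambda_9\,\underline{P}_0(s) \tag{113}\\
\underline{P}_{13}(0,s) &= \lambda_{10}\,\underline{P}_0(s) \tag{114}\\
\underline{P}_{14}(0,s) &= \lambda_{11}\,\underline{P}_0(s) \tag{115}\\
\underline{P}_{17}(0,s) &= \lambda_7\,\underline{P}_{15}(s) + \lambda_6\,\underline{P}_{16}(s) \tag{116}\\
\underline{P}_{20}(0,s) &= \lambda_5\,\underline{P}_{18}(s) + \lambda_4\,\underline{P}_{19}(s) \tag{117}\\
\underline{P}_{22}(0,s) &= \lambda_2\,\underline{P}_9(s) \tag{118}
\end{align}

(c) Transition state probabilities: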

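On solving Eqs. (84) to (109) and using the boundary conditions (110) to (118), the following transient state probabilities are obtained:

\begin{align}
\underline{P}_0(s) &= \frac{1}{s + E} \tag{119}\\
\underline{P}_1(s) &= \frac{\lambda_2\,\underline{P}_0(s)}{s + \lambda_3 + \lambda_4 + \lambda_5} \tag{120}\\
\underline{P}_2(s) &= \frac{\lambda_3\,\underline{P}_0(s)}{s + \lambda_2 + \lambda_6 + \lambda_7} \tag{121}\\
\underline{P}_3(s) &= \left[\lambda_3\,\underline{P}_1(s) + \lambda_2\,\underline{P}_2(s)\right]\frac{1 - S(s)}{s} \tag{122}\\
\underline{P}_4(s) &= \frac{\lambda_2 \lambda_4\,\underline{P}_0(s)}{(s + \lambda_5)(s + \lambda_3 + \lambda_4 + \lambda_5)} \tag{123}\\
\underline{P}_5(s) &= \frac{\lambda_2 \lambda_5\,\underline{P}_0(s)}{(s + \lambda_4)(s + \lambda_3 + \lambda_4 + \lambda_5)} \tag{124}\\
\underline{P}_6(s) &= \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_6\,\underline{P}_0(s)}{(s + \lambda_4)(s + \lambda_7)(s + \lambda_3 + \lambda_4 + \lambda_5)} + \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_6\,\underline{P}_0(s)}{(s + \lambda_5)(s + \lambda_7)(s + \lambda_3 + \lambda_4 + \lambda_5)} \tag{125}\\
\underline{P}_7(s) &= \frac{\lambda_3 \lambda_6\,\underline{P}_0(s)}{(s + \lambda_7)(s + \lambda_2 + \lambda_6 + \lambda_7)} \tag{126}\\
\underline{P}_8(s) &= \frac{\lambda_3 \lambda_7\,\underline{P}_0(s)}{(s + \lambda_6)(s + \lambda_2 + \lambda_6 + \lambda_7)} \tag{127}\\
\underline{P}_9(s) &= \frac{\lambda_3 \lambda_6 \lambda_7\,\underline{P}_0(s)}{(s + \lambda_6)(s + \lambda_2 + \lambda_4 + \lambda_5)(s + \lambda_2 + \lambda_6 + \lambda_7)} + \frac{\lambda_3 \lambda_6 \lambda_7\,\underline{P}_0(s)}{(s + \lambda_7)(s + \lambda_2 + \lambda_4 + \lambda_5)(s + \lambda_2 + \lambda_6 + \lambda_7)} \tag{128}\\
\underline{P}_{10}(s) &= \lambda_1\,\underline{P}_0(s)\,\frac{1 - S(s)}{s} \tag{129}\\
\underline{P}_{11}(s) &= \lambda_8\,\underline{P}_0(s)\,\frac{1 - S(s)}{s} \tag{130}\\
\underline{P}_{12}(s) &= \lambda_9\,\underline{P}_0(s)\,\frac{1 - S(s)}{s} \tag{131}\\
\underline{P}_{13}(s) &= \lambda_{10}\,\underline{P}_0(s)\,\frac{1 - S(s)}{s} \tag{132}\\
\underline{P}_{14}(s) &= \lambda_{11}\,\underline{P}_0(s)\,\frac{1 - S(s)}{s} \tag{133}\\
\underline{P}_{15}(s) &= \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_6\,\underline{P}_0(s)}{(s + \lambda_4)(s + \lambda_7)(s + \lambda_3 + \lambda_4 + \lambda_5)} + \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_6\,\underline{P}_0(s)}{(s + \lambda_5)(s + \lambda_7)(s + \lambda_3 + \lambda_4 + \lambda_5)} \tag{134}\\
\underline{P}_{16}(s) &= \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_7\,\underline{P}_0(s)}{(s + \lambda_4)(s + \lambda_6)(s + \lambda_3 + \lambda_4 + \lambda_5)} + \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_7\,\underline{P}_0(s)}{(s + \lambda_5)(s + \lambda_6)(s + \lambda_3 + \lambda_4 + \lambda_5)} \tag{135}\\
\underline{P}_{17}(s) &= \left[\lambda_7\,\underline{P}_{15}(s) + \lambda_6\,\underline{P}_{16}(s)\right]\frac{1 - S(s)}{s} \tag{136}\\
\underline{P}_{18}(s) &= \frac{\lambda_3 \lambda_4 \lambda_6 \lambda_7\,\underline{P}_0(s)}{(s + \lambda_5)(s + \lambda_7)(s + \lambda_2 + \lambda_4 + \lambda_5)(s + \lambda_2 + \lambda_6 + \lambda_7)} + \frac{\lambda_3 \lambda_4 \lambda_6 \lambda_7\,\underline{P}_0(s)}{(s + \lambda_5)(s + \lambda_6)(s + \lambda_2 + \lambda_4 + \lambda_5)(s + \lambda_2 + \lambda_6 + \lambda_7)} \tag{137}\\
\underline{P}_{19}(s) &= \frac{\lambda_3 \lambda_5 \lambda_6 \lambda_7\,\underline{P}_0(s)}{(s + \lambda_4)(s + \lambda_7)(s + \lambda_2 + \lambda_4 + \lambda_5)(s + \lambda_2 + \lambda_6 + \lambda_7)} + \frac{\lambda_3 \lambda_5 \lambda_6 \lambda_7\,\underline{P}_0(s)}{(s + \lambda_4)(s + \lambda_6)(s + \lambda_2 + \lambda_4 + \lambda_5)(s + \lambda_2 + \lambda_6 + \lambda_7)} \tag{138}\\
\underline{P}_{20}(s) &= \left[\lambda_5\,\underline{P}_{18}(s) + \lambda_4\,\underline{P}_{19}(s)\right]\frac{1 - S(s)}{s} \tag{139}\\
\underline{P}_{21}(s) &= \lambda_3\,\underline{P}_6(s)\,\frac{1 - S(s)}{s} \tag{140}\\
\underline{P}_{22}(s) &= \lambda_2\,\underline{P}_9(s)\,\frac{1 - S(s)}{s} \tag{141}
\end{align}

where $E = \lambda_1 + \lambda_2 + \lambda_3 + \lambda_8 + \lambda_9 + \lambda_{10} + \lambda_{11}$.

Similarly, we can find the expressions for the upper bounds of the transition state probabilities of the network's components by replacing $\underline{P}_i(s)$ by $\overline{P}_i(s)$ (i = 0 to 22) and the lower-bound failure rates $\lambda_j$ by their upper bounds $\overline{\lambda}_j$ (j = 1 to 11) in Eqs. (119) to (141).

(i) Interval-valued reliability: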

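The interval-valued BR of the SEN+ is given as:

\begin{equation}
R = \left[\ \sum_{i \in W} \underline{P}_i,\ \ \sum_{i \in W} \overline{P}_i\ \right], \qquad W = \{0, 1, 2, 4, 5, 6, 7, 8, 9, 15, 16, 18, 19\} \tag{142}
\end{equation}

(ii) Mean time to failure:

The lower bound of the MTTF for the BR of SEN+ is given as:

\begin{align}
(\mathrm{MTTF})_{BR} &= \frac{1}{E} + \frac{\lambda_2}{E(\lambda_3 + \lambda_4 + \lambda_5)} + \frac{\lambda_3}{E(\lambda_2 + \lambda_6 + \lambda_7)} + \frac{\lambda_2 \lambda_4}{E\lambda_5(\lambda_3 + \lambda_4 + \lambda_5)} \notag\\
&\quad + \frac{\lambda_2 \lambda_5}{E\lambda_4(\lambda_3 + \lambda_4 + \lambda_5)} + \frac{\lambda_2 \lambda_4 \lambda_5}{E\lambda_4(\lambda_3 + \lambda_4 + \lambda_5)} + \frac{\lambda_2 \lambda_4 \lambda_5}{E\lambda_5(\lambda_3 + \lambda_4 + \lambda_5)} \notag\\
&\quad + \frac{\lambda_3 \lambda_6}{E\lambda_7(\lambda_2 + \lambda_6 + \lambda_7)} + \frac{\lambda_3 \lambda_7}{E\lambda_6(\lambda_2 + \lambda_6 + \lambda_7)} \notag\\
&\quad + \frac{\lambda_3 \lambda_6 \lambda_7}{E\lambda_7(\lambda_2 + \lambda_6 + \lambda_7)(\lambda_2 + \lambda_4 + \lambda_5)} + \frac{\lambda_3 \lambda_6 \lambda_7}{E\lambda_6(\lambda_2 + \lambda_6 + \lambda_7)(\lambda_2 + \lambda_4 + \lambda_5)} \notag\\
&\quad + \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_6}{E\lambda_4 \lambda_7(\lambda_3 + \lambda_4 + \lambda_5)} + \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_6}{E\lambda_5 \lambda_7(\lambda_3 + \lambda_4 + \lambda_5)} \notag\\
&\quad + \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_7}{E\lambda_4 \lambda_6(\lambda_3 + \lambda_4 + \lambda_5)} + \frac{\lambda_2 \lambda_4 \lambda_5 \lambda_7}{E\lambda_5 \lambda_6(\lambda_3 + \lambda_4 + \lambda_5)} \notag\\
&\quad + \frac{\lambda_3 \lambda_4 \lambda_7 \lambda_6}{E\lambda_5 \lambda_7(\lambda_2 + \lambda_6 + \lambda_7)(\lambda_2 + \lambda_4 + \lambda_5)} + \frac{\lambda_3 \lambda_4 \lambda_7 \lambda_6}{E\lambda_5 \lambda_6(\lambda_2 + \lambda_6 + \lambda_7)(\lambda_2 + \lambda_4 + \lambda_5)} \notag\\
&\quad + \frac{\lambda_3 \lambda_5 \lambda_7 \lambda_6}{E\lambda_4 \lambda_7(\lambda_2 + \lambda_6 + \lambda_7)(\lambda_2 + \lambda_4 + \lambda_5)} + \frac{\lambda_3 \lambda_5 \lambda_7 \lambda_6}{E\lambda_4 \lambda_6(\lambda_2 + \lambda_6 + \lambda_7)(\lambda_2 + \lambda_4 + \lambda_5)} \tag{143}
\end{align}

We can determine the upper bound of the MTTF for the BR of SEN+ by replacing the lower-bound failure rates $\lambda_i$ by their upper bounds $\overline{\lambda}_i$ (i = 1 to 11) in Eq. (143).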
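The factor $(1 - S(s))/s$ that multiplies every replacement-state probability in Sects. 5 and 6 (for example in the expressions for states 1 and 12 of the terminal model and states 3 and 10 of the broadcast model) is used without derivation. The following is a brief sketch of where it comes from. It assumes the usual supplementary-variable convention that $S(s)$ is the Laplace transform of the replacement-time density; the chapter does not define $S(s)$ explicitly, so this reading is an assumption, and $P_j$ stands for either bound of the probability of a replacement state $j$.

```latex
% Transformed equation for a replacement state j, solved along x
\left(s + \frac{\partial}{\partial x} + \eta(x)\right) P_j(x,s) = 0
\quad\Longrightarrow\quad
P_j(x,s) = P_j(0,s)\,\exp\!\left(-s x - \int_0^x \eta(u)\,du\right)

% Integrating over the elapsed replacement time x (integration by parts)
P_j(s) = \int_0^\infty P_j(x,s)\,dx = P_j(0,s)\,\frac{1 - S(s)}{s},
\qquad
S(s) = \int_0^\infty \eta(x)\,\exp\!\left(-s x - \int_0^x \eta(u)\,du\right) dx

% Special case of a constant replacement rate
\eta(x) = \eta
\quad\Longrightarrow\quad
S(s) = \frac{\eta}{s + \eta},
\qquad
\frac{1 - S(s)}{s} = \frac{1}{s + \eta}
```

Substituting the boundary value $P_j(0,s)$ of each replacement state then gives the product form seen in the transition state probabilities.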

7 Network Reliability of SEN+

The network RBD for an 8 × 8 SEN+ is shown in Fig. 5 and the transition state diagram for the network reliability of SEN+ is shown in Fig. 6.

(a) Formulation of mathematical model:

By applying supplementary variable technique to the assumed model, the differencedifferential equations for lower bound probabilities of the components of the network in different states, governing the network’s behavior are as: 

 ∞ ∞ d + F P 0 (t) = P 1 (x, t)η(x)d x + P 2 (x, t)η(x)d x dt 0

Fig. 5 Network RBD for an 8 × 8 SEN+

Fig. 6 Transition state diagram for Fig. 5

0

208

A. Khati and S. B. Singh

∞ +

∞ P 3 (x, t)η(x)d x +

0 ∞

+

P 4 (x, t)η(x)d x 0 ∞

P 7 (x, t)η(x)d x + 0 ∞

+

P 10 (x, t)η(x)d x 0

∞ P 14 (x, t)η(x)d x +

0 ∞

+ 

P 15 (x, t)η(x)d x 0 ∞

P 19 (x, t)η(x)d x + 0



P 20 (x, t)η(x)d x

(144)

0

d + λ6 P5 (t) = λ5 P 0 (t) dt   d + λ5 P6 (t) = λ6 P 0 (t) dt   ∂ ∂ + + η(x) P 7 (x, t) = 0 ∂x ∂t   d + λ8 P8 (t) = λ7 P 0 (t) dt   d + λ7 P9 (t) = λ8 P 0 (t) dt   ∂ ∂ + + η(x) P 10 (x, t) = 0 ∂x ∂t   d + λ10 P11 (t) = λ9 P 0 (t) dt   d + λ9 P12 (t) = λ10 P 0 (t) dt   d + λ11 + λ12 P13 (t) = λ9 P 12 (t) + λ10 P11 (t) dt   ∂ ∂ + + η(x) P 14 (x, t) = 0 ∂x ∂t   ∂ ∂ + + η(x) P 15 (x, t) = 0 ∂x ∂t

(145) (146) (147) (148) (149) (150) (151) (152) (153) (154) (155)

Reliability Assessment of Replaceable Shuffle Exchange Network …

 d + λ12 P16 (t) = λ11 P0 (t) dt   d + λ11 P17 (t) = λ12 P0 (t) dt   d + λ9 + λ10 P18 (s) = λ12 P16 (s) + λ11 P17 (s) dt   ∂ ∂ + + η(x) P 19 (x, t) = 0 ∂x ∂t   ∂ ∂ + + η(x) P 20 (x, t) = 0 ∂x ∂t   ∂ ∂ + + η(x) P 21 (x, t) = 0 ∂x ∂t   ∂ ∂ + + η(x) P 22 (x, t) = 0 ∂x ∂t   ∂ ∂ + + η(x) P 23 (x, t) = 0 ∂x ∂t   ∂ ∂ + + η(x) P 24 (x, t) = 0 ∂x ∂t

209



(156) (157) (158) (159) (160) (161) (162) (163) (164)

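Boundary conditions:

\begin{align}
\underline{P}_1(0,t) &= \lambda_1\,\underline{P}_0(t) \tag{165}\\
\underline{P}_2(0,t) &= \lambda_2\,\underline{P}_0(t) \tag{166}\\
\underline{P}_3(0,t) &= \lambda_3\,\underline{P}_0(t) \tag{167}\\
\underline{P}_4(0,t) &= \lambda_4\,\underline{P}_0(t) \tag{168}\\
\underline{P}_7(0,t) &= \lambda_6\,\underline{P}_5(t) + \lambda_5\,\underline{P}_6(t) \tag{169}\\
\underline{P}_{10}(0,t) &= \lambda_8\,\underline{P}_8(t) + \lambda_7\,\underline{P}_9(t) \tag{170}\\
\underline{P}_{14}(0,t) &= \lambda_{11}\,\underline{P}_{13}(t) \tag{171}\\
\underline{P}_{15}(0,t) &= \lambda_{12}\,\underline{P}_{13}(t) \tag{172}\\
\underline{P}_{19}(0,t) &= \lambda_9\,\underline{P}_{18}(t) \tag{173}\\
\underline{P}_{20}(0,t) &= \lambda_{10}\,\underline{P}_{18}(t) \tag{174}\\
\underline{P}_{21}(0,t) &= \lambda_{13}\,\underline{P}_0(t) \tag{175}\\
\underline{P}_{22}(0,t) &= \lambda_{14}\,\underline{P}_0(t) \tag{176}\\
\underline{P}_{23}(0,t) &= \lambda_{15}\,\underline{P}_0(t) \tag{177}\\
\underline{P}_{24}(0,t) &= \lambda_{16}\,\underline{P}_0(t) \tag{178}
\end{align}

Initial conditions: $\underline{P}_0(t) = 1$ at $t = 0$, and all other state probabilities are zero at $t = 0$.

(b) Solution of the model: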

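By taking the Laplace transform of Eqs. (144) to (178) and using the initial conditions, we get:

\[
\begin{aligned}
(s+F)\,\underline{P}_0(s) ={}& 1 + \int_0^\infty \underline{P}_1(s,x)\,\eta(x)\,dx + \int_0^\infty \underline{P}_2(s,x)\,\eta(x)\,dx + \int_0^\infty \underline{P}_3(s,x)\,\eta(x)\,dx + \int_0^\infty \underline{P}_4(s,x)\,\eta(x)\,dx \\
&+ \int_0^\infty \underline{P}_7(s,x)\,\eta(x)\,dx + \int_0^\infty \underline{P}_{10}(s,x)\,\eta(x)\,dx + \int_0^\infty \underline{P}_{14}(s,x)\,\eta(x)\,dx + \int_0^\infty \underline{P}_{15}(s,x)\,\eta(x)\,dx \\
&+ \int_0^\infty \underline{P}_{19}(s,x)\,\eta(x)\,dx + \int_0^\infty \underline{P}_{20}(s,x)\,\eta(x)\,dx
\end{aligned}
\tag{179}
\]

\begin{align}
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_1(s,x) &= 0 \tag{180}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_2(s,x) &= 0 \tag{181}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_3(s,x) &= 0 \tag{182}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_4(s,x) &= 0 \tag{183}\\
(s+\lambda_6)\,\underline{P}_5(s) &= \lambda_5\,\underline{P}_0(s) \tag{184}\\
(s+\lambda_5)\,\underline{P}_6(s) &= \lambda_6\,\underline{P}_0(s) \tag{185}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_7(s,x) &= 0 \tag{186}\\
(s+\lambda_8)\,\underline{P}_8(s) &= \lambda_7\,\underline{P}_0(s) \tag{187}\\
(s+\lambda_7)\,\underline{P}_9(s) &= \lambda_8\,\underline{P}_0(s) \tag{188}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_{10}(s,x) &= 0 \tag{189}\\
(s+\lambda_{10})\,\underline{P}_{11}(s) &= \lambda_9\,\underline{P}_0(s) \tag{190}\\
(s+\lambda_9)\,\underline{P}_{12}(s) &= \lambda_{10}\,\underline{P}_0(s) \tag{191}\\
(s+\lambda_{11}+\lambda_{12})\,\underline{P}_{13}(s) &= \lambda_9\,\underline{P}_{12}(s) + \lambda_{10}\,\underline{P}_{11}(s) \tag{192}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_{14}(s,x) &= 0 \tag{193}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_{15}(s,x) &= 0 \tag{194}\\
(s+\lambda_{12})\,\underline{P}_{16}(s) &= \lambda_{11}\,\underline{P}_0(s) \tag{195}\\
(s+\lambda_{11})\,\underline{P}_{17}(s) &= \lambda_{12}\,\underline{P}_0(s) \tag{196}\\
(s+\lambda_9+\lambda_{10})\,\underline{P}_{18}(s) &= \lambda_{12}\,\underline{P}_{16}(s) + \lambda_{11}\,\underline{P}_{17}(s) \tag{197}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_{19}(s,x) &= 0 \tag{198}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_{20}(s,x) &= 0 \tag{199}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_{21}(s,x) &= 0 \tag{200}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_{22}(s,x) &= 0 \tag{201}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_{23}(s,x) &= 0 \tag{202}\\
\left(s+\frac{\partial}{\partial x}+\eta(x)\right)\underline{P}_{24}(s,x) &= 0 \tag{203}
\end{align}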

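Boundary conditions:

\begin{align}
\underline{P}_1(0,s) &= \lambda_1\,\underline{P}_0(s) \tag{204}\\
\underline{P}_2(0,s) &= \lambda_2\,\underline{P}_0(s) \tag{205}\\
\underline{P}_3(0,s) &= \lambda_3\,\underline{P}_0(s) \tag{206}\\
\underline{P}_4(0,s) &= \lambda_4\,\underline{P}_0(s) \tag{207}\\
\underline{P}_{14}(0,s) &= \lambda_{11}\,\underline{P}_{13}(s) \tag{208}\\
\underline{P}_{15}(0,s) &= \lambda_{12}\,\underline{P}_{13}(s) \tag{209}\\
\underline{P}_{19}(0,s) &= \lambda_9\,\underline{P}_{18}(s) \tag{210}\\
\underline{P}_{20}(0,s) &= \lambda_{10}\,\underline{P}_{18}(s) \tag{211}\\
\underline{P}_7(0,s) &= \lambda_6\,\underline{P}_5(s) + \lambda_5\,\underline{P}_6(s) \tag{212}\\
\underline{P}_{10}(0,s) &= \lambda_8\,\underline{P}_8(s) + \lambda_7\,\underline{P}_9(s) \tag{213}
\end{align}

Initial conditions: $\underline{P}_0(t) = 1$ at $t = 0$, and all other state probabilities are zero at $t = 0$.

(c) Transition state probabilities: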

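On solving Eqs. (179) to (203) and using the boundary conditions (204) to (213), the following transition state probabilities are obtained:

\begin{align}
\underline{P}_0(s) &= \frac{1}{s+F} \tag{214}\\
\underline{P}_1(s) &= \lambda_1\,\underline{P}_0(s)\left[\frac{1-S(s)}{s}\right] \tag{215}\\
\underline{P}_2(s) &= \lambda_2\,\underline{P}_0(s)\left[\frac{1-S(s)}{s}\right] \tag{216}\\
\underline{P}_3(s) &= \lambda_3\,\underline{P}_0(s)\left[\frac{1-S(s)}{s}\right] \tag{217}\\
\underline{P}_4(s) &= \lambda_4\,\underline{P}_0(s)\left[\frac{1-S(s)}{s}\right] \tag{218}\\
\underline{P}_5(s) &= \frac{\lambda_5\,\underline{P}_0(s)}{s+\lambda_6} \tag{219}\\
\underline{P}_6(s) &= \frac{\lambda_6\,\underline{P}_0(s)}{s+\lambda_5} \tag{220}\\
\underline{P}_7(s) &= \left[\lambda_6\,\underline{P}_5(s) + \lambda_5\,\underline{P}_6(s)\right]\left[\frac{1-S(s)}{s}\right] \tag{221}\\
\underline{P}_8(s) &= \frac{\lambda_7\,\underline{P}_0(s)}{s+\lambda_8} \tag{222}\\
\underline{P}_9(s) &= \frac{\lambda_8\,\underline{P}_0(s)}{s+\lambda_7} \tag{223}\\
\underline{P}_{10}(s) &= \left[\lambda_8\,\underline{P}_8(s) + \lambda_7\,\underline{P}_9(s)\right]\left[\frac{1-S(s)}{s}\right] \tag{224}\\
\underline{P}_{11}(s) &= \frac{\lambda_9\,\underline{P}_0(s)}{s+\lambda_{10}} \tag{225}\\
\underline{P}_{12}(s) &= \frac{\lambda_{10}\,\underline{P}_0(s)}{s+\lambda_9} \tag{226}\\
\underline{P}_{13}(s) &= \frac{\lambda_9\lambda_{10}\,\underline{P}_0(s)}{(s+\lambda_9)(s+\lambda_{11}+\lambda_{12})} + \frac{\lambda_9\lambda_{10}\,\underline{P}_0(s)}{(s+\lambda_{10})(s+\lambda_{11}+\lambda_{12})} \tag{227}\\
\underline{P}_{14}(s) &= \lambda_{11}\,\underline{P}_{13}(s)\left[\frac{1-S(s)}{s}\right] \tag{228}\\
\underline{P}_{15}(s) &= \lambda_{12}\,\underline{P}_{13}(s)\left[\frac{1-S(s)}{s}\right] \tag{229}\\
\underline{P}_{16}(s) &= \frac{\lambda_{11}\,\underline{P}_0(s)}{s+\lambda_{12}} \tag{230}\\
\underline{P}_{17}(s) &= \frac{\lambda_{12}\,\underline{P}_0(s)}{s+\lambda_{11}} \tag{231}\\
\underline{P}_{18}(s) &= \frac{\lambda_{11}\lambda_{12}\,\underline{P}_0(s)}{(s+\lambda_{12})(s+\lambda_9+\lambda_{10})} + \frac{\lambda_{11}\lambda_{12}\,\underline{P}_0(s)}{(s+\lambda_{11})(s+\lambda_9+\lambda_{10})} \tag{232}\\
\underline{P}_{19}(s) &= \lambda_9\,\underline{P}_{18}(s)\left[\frac{1-S(s)}{s}\right] \tag{233}\\
\underline{P}_{20}(s) &= \lambda_{10}\,\underline{P}_{18}(s)\left[\frac{1-S(s)}{s}\right] \tag{234}\\
\underline{P}_{21}(s) &= \lambda_{13}\,\underline{P}_0(s)\left[\frac{1-S(s)}{s}\right] \tag{235}\\
\underline{P}_{22}(s) &= \lambda_{14}\,\underline{P}_0(s)\left[\frac{1-S(s)}{s}\right] \tag{236}\\
\underline{P}_{23}(s) &= \lambda_{15}\,\underline{P}_0(s)\left[\frac{1-S(s)}{s}\right] \tag{237}\\
\underline{P}_{24}(s) &= \lambda_{16}\,\underline{P}_0(s)\left[\frac{1-S(s)}{s}\right] \tag{238}
\end{align}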
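The factor $[1-S(s)]/s$ attached to every failed-state probability follows from a standard supplementary-variable argument. The sketch below works it out for state 1 only. It assumes, as is usual for this technique but not restated in this section, that $S(s)$ is the Laplace transform of the replacement-time density $\eta(x)\exp\!\big(-\int_0^x \eta(u)\,du\big)$.

```latex
\begin{align*}
\Big(s + \frac{\partial}{\partial x} + \eta(x)\Big)\underline{P}_1(s,x) = 0
 \;&\Rightarrow\;
 \underline{P}_1(s,x) = \underline{P}_1(0,s)\,\exp\!\Big(-sx - \int_0^x \eta(u)\,du\Big), \\
\underline{P}_1(s) = \int_0^\infty \underline{P}_1(s,x)\,dx
 &= \lambda_1\,\underline{P}_0(s) \int_0^\infty e^{-sx}\exp\!\Big(-\int_0^x \eta(u)\,du\Big)\,dx \\
 &= \lambda_1\,\underline{P}_0(s)\,\frac{1 - S(s)}{s},
 \qquad
 S(s) = \int_0^\infty e^{-sx}\,\eta(x)\exp\!\Big(-\int_0^x \eta(u)\,du\Big)\,dx .
\end{align*}
```

The second line uses the boundary condition (204), and the last equality is an integration by parts. The same steps applied to the other states that carry the supplementary variable $x$ give the remaining bracketed expressions.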

where $F = \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 + \lambda_5 + \lambda_6 + \lambda_7 + \lambda_8 + \lambda_9 + \lambda_{10} + \lambda_{11} + \lambda_{12} + \lambda_{13} + \lambda_{14} + \lambda_{15} + \lambda_{16}$.

Similarly, the upper bounds of the transition state probabilities of the network's components are obtained by replacing $\underline{P}_i(s)$ by $\overline{P}_i(s)$ (i = 0 to 24) and the lower-bound rates $\lambda_j$ by the upper-bound rates $\overline{\lambda}_j$ (j = 1 to 16) in Eqs. (214) to (238).

(i) Interval-valued reliability:

The interval-valued NR of the considered network is given as:

\[
R = \Big[\sum_{i \in O} \underline{P}_i,\; \sum_{i \in O} \overline{P}_i\Big], \qquad O = \{0, 5, 6, 8, 9, 11, 12, 13, 16, 18\} \tag{239}
\]

(ii) Mean time to failure:

The lower bound of the MTTF for the NR of the considered SEN+ is obtained as:

\[
\begin{aligned}
(\mathrm{MTTF})_{NR} ={}& \frac{1}{F} + \frac{\lambda_5}{F\lambda_6} + \frac{\lambda_6}{F\lambda_5} + \frac{\lambda_7}{F\lambda_8} + \frac{\lambda_8}{F\lambda_7} + \frac{\lambda_9}{F\lambda_{10}} + \frac{\lambda_{10}}{F\lambda_9} \\
&+ \frac{\lambda_9}{F(\lambda_{11}+\lambda_{12})} + \frac{\lambda_{10}}{F(\lambda_{11}+\lambda_{12})} + \frac{\lambda_{11}}{F\lambda_{12}} + \frac{\lambda_{12}}{F\lambda_{11}} \\
&+ \frac{\lambda_{11}}{F(\lambda_9+\lambda_{10})} + \frac{\lambda_{12}}{F(\lambda_9+\lambda_{10})}
\end{aligned}
\tag{240}
\]

The upper bound of the MTTF for NR is determined in a similar manner by replacing $\lambda_i$ by $\overline{\lambda}_i$ (i = 1 to 16) in Eq. (240).
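Eq. (240) is simple enough to evaluate directly. The following is a minimal sketch, not the authors' code: the helper name `mttf_nr` and the dictionary layout are assumptions made for illustration, and the rates are the smaller of the two sets of failure rates quoted in Sect. 8.5. With those rates the expression evaluates to about 4.396042, which agrees with the constant bound 4.39604205 reported in Table 24.

```python
# Hypothetical helper (illustration only): evaluates the right-hand side of
# Eq. (240) for one set of failure rates lambda_1..lambda_16.
def mttf_nr(lam):
    """lam maps the index j (1..16) to the failure rate lambda_j."""
    F = sum(lam[j] for j in range(1, 17))  # F = lambda_1 + ... + lambda_16
    terms = [
        1.0,
        lam[5] / lam[6], lam[6] / lam[5],
        lam[7] / lam[8], lam[8] / lam[7],
        lam[9] / lam[10], lam[10] / lam[9],
        # the two terms over (lambda_11 + lambda_12), combined
        (lam[9] + lam[10]) / (lam[11] + lam[12]),
        lam[11] / lam[12], lam[12] / lam[11],
        # the two terms over (lambda_9 + lambda_10), combined
        (lam[11] + lam[12]) / (lam[9] + lam[10]),
    ]
    return sum(terms) / F


# Smaller set of failure rates quoted in Sect. 8.5.
rates = [0.01, 0.03, 0.05, 0.07, 0.09, 0.11, 0.13, 0.15,
         0.17, 0.19, 0.21, 0.23, 0.25, 0.27, 0.28, 0.29]
lam_small = {j: r for j, r in enumerate(rates, start=1)}

print(round(mttf_nr(lam_small), 6))
```

Passing the other set of rates from Sect. 8.5 to the same function gives the second evaluation of Eq. (240).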

8 Numerical Illustration

8.1 Terminal Reliability of the Replaceable SEN+ under Consideration by Using the IUGF Approach

Let the lower bounds of the failure rates of the SEN+ be $\lambda_1 = 0.01$, $\lambda_2 = 0.03$, $\lambda_3 = 0.05$, $\lambda_4 = 0.07$, $\lambda_5 = 0.09$, $\lambda_6 = 0.11$ and the upper bounds be $\overline{\lambda}_1 = 0.02$, $\overline{\lambda}_2 = 0.04$, $\overline{\lambda}_3 = 0.06$, $\overline{\lambda}_4 = 0.08$, $\overline{\lambda}_5 = 0.1$, $\overline{\lambda}_6 = 0.13$. On substituting these failure rates in Eqs. (15) to (17) and Eqs. (25) to (27) and taking the inverse Laplace transform, we get the expressions for the various transition state probabilities. Tables 1 and 2 show how the transition state probabilities of the operating states change with time. From Eq. (29), we can determine the variation of the reliability of the replaceable SEN+ with time, which is presented in Table 3 and shown in Fig. 7.

8.2 MTTF of the SEN+ under Consideration

By using Eq. (52), we can find the bounds of the MTTF for varying bounds of the failure rates. The results are presented in Tables 4, 5, 6 and 7, and the changes in the bounds of the MTTF with the different failure rates are displayed in Figs. 8 and 9, where $B = \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6$ and $\overline{B} = \overline{\lambda}_1, \overline{\lambda}_2, \overline{\lambda}_3, \overline{\lambda}_4, \overline{\lambda}_5, \overline{\lambda}_6$.


Table 1 Variations in the transition state probabilities of operating states w. r. t. time

| t | $[\underline{P}_0(t), \overline{P}_0(t)]$ | $[\underline{P}_2(t), \overline{P}_2(t)]$ | $[\underline{P}_3(t), \overline{P}_3(t)]$ | $[\underline{P}_4(t), \overline{P}_4(t)]$ |
|---|---|---|---|---|
| 0 | [1] | [0, 0] | [0, 0] | [0, 0] |
| 1 | [0.657047, 0.697676] | [0.024281, 0.031302] | [0.05786, 0.06394] | [0.040047, 0.04641] |
| 2 | [0.431711, 0.486752] | [0.03958, 0.049463] | [0.096518, 0.10345] | [0.06454, 0.072566] |
| 3 | [0.283554, 0.339596] | [0.048723, 0.059173] | [0.121829, 0.127003] | [0.078479, 0.085718] |
| 4 | [0.186374, 0.236928] | [0.053675, 0.063503] | [0.137877, 0.141062] | [0.085324, 0.09704] |
| 5 | [0.122456, 0.165299] | [0.055799, 0.064455] | [0.146584, 0.147584] | [0.087468, 0.090764] |
| 6 | [0.08046, 0.115325] | [0.05604, 0.063332] | [0.148616, 0.152667] | [0.08656, 0.087816] |
| 7 | [0.052866, 0.08046] | [0.055052, 0.060982] | [0.147985, 0.154875] | [0.083128, 0.083797] |
| 8 | [0.034735, 0.056135] | [0.053284, 0.057948] | [0.14553, 0.154961] | [0.077736, 0.079744] |
| 9 | [0.022823, 0.039164] | [0.051044, 0.05458] | [0.142074, 0.153621] | [0.07195, 0.07512] |
| 10 | [0.014996, 0.027324] | [0.048544, 0.051098] | [0.137963, 0.151347] | [0.066166, 0.070231] |

Table 2 Variations in the transition state probabilities of operating states w. r. t. time

| t | $[\underline{P}_5(t), \overline{P}_5(t)]$ | $[\underline{P}_6(t), \overline{P}_6(t)]$ | $[\underline{P}_7(t), \overline{P}_7(t)]$ |
|---|---|---|---|
| 0 | [0, 0] | [0, 0] | [0, 0] |
| 1 | [0.073612, 0.079088] | [0.003779, 0.004894] | [0.003779, 0.047245] |
| 2 | [0.12138, 0.126447] | [0.012749, 0.016056] | [0.00845, 0.0102818] |
| 3 | [0.151291, 0.153227] | [0.024289, 0.029788] | [0.015577, 0.0172214] |
| 4 | [0.16673, 0.16891] | [0.036702, 0.043891] | [0.022809, 0.0263046] |
| 5 | [0.17113, 0.178767] | [0.048924, 0.057113] | [0.029502, 0.0386] |
| 6 | [0.171495, 0.181595] | [0.060315, 0.068806] | [0.035341, 0.0556191] |
| 7 | [0.167228, 0.181828] | [0.07052, 0.078689] | [0.040203, 0.0795118] |
| 8 | [0.162312, 0.178236] | [0.079374, 0.086707] | [0.044083, 0.035263] |
| 9 | [0.155548, 0.173745] | [0.086831, 0.092934] | [0.047039, 0.038385] |
| 10 | [0.148282, 0.16857] | [0.092925, 0.097513] | [0.049162, 0.040892] |

Table 3 Variations in the terminal reliability bounds with time

| t | $[\underline{R}(t), \overline{R}(t)]$ |
|---|---|
| 0 | [1] |
| 1 | [0.86045, 0.912732116] |
| 2 | [0.7254864, 0.795321] |
| 3 | [0.6929866, 0.7198718] |
| 4 | [0.6774008, 0.695321] |
| 5 | [0.670961, 0.68424] |
| 6 | [0.6591051, 0.667854] |
| 7 | [0.6014996, 0.6331808] |
| 8 | [0.54823, 0.5994108] |
| 9 | [0.5304517, 0.5601235] |
| 10 | [0.48002479, 0.5189737] |

Fig. 7 TR versus time (curves: lower bound of TR, upper bound of TR)

Table 4 MTTF w. r. t. λ1 ,λ2 ,λ3 λ3 MTTF w. r. t.λ1

MTTF w. r. t.λ2

MTTF w. r. t.λ3

0.1

[13.03921569,18.25581396]

0.2

[10.901639344, 18.25581396]

0.3

[9.3661972, 18.25581396]

0.4

[8.20987654,18.25581396693]

[9.7852875,18.25581396] [10.343975,18.25581396]

0.5

[7.3076923077, 18.25581396]

[9.006275,18.25581396] [9.2865307,18.25581396]

0.6

[6.581584158, 18.25581396]

0.7

[5.990990991, 18.25581396]

0.8

[5.4958677686, 18.25581396]

[6.99432086,18.25581396]

0.9

[5.0763358779, 18.25581396]

[6.55872122,18.25581396] [5.6768021,18.25581396]

0.91 [5.0378787879, 18.25581396]

[13.87868,18.25581396] [13.121886,18.25581396] [10.702502,18.25581396] [12.338651,18.25581396] [10.06095,18.25581396] [11.763298,18.25581396]

[8.87429610,18.25581396] [8.6520746,18.25581396] [7.7942064,18.25581396] [7.5090764,18.25581396]

[6.51860825,18.25581396]

[6.201986,18.25581396] [5.641976,18.25581396]


Table 5 MTTF w. r. t. λ4, λ5, λ6

| Failure rate | MTTF w. r. t. λ4 | MTTF w. r. t. λ5 | MTTF w. r. t. λ6 |
|---|---|---|---|
| 0.1 | [12.742349, 18.25581396] | [15.9032, 18.25581396] | [16.625, 18.25581396] |
| 0.2 | [11.138954, 18.25581396] | [14.5412, 18.25581396] | [13.3, 18.25581396] |
| 0.3 | [10.347424, 18.25581396] | [13.985, 18.25581396] | [11.08333, 18.25581396] |
| 0.4 | [9.1762086, 18.25581396] | [12.98642, 18.25581396] | [9.5, 18.25581396] |
| 0.5 | [8.88542087, 18.25581396] | [11.54345, 18.25581396] | [8.3125, 18.25581396] |
| 0.6 | [8.175201, 18.25581396] | [10.878202, 18.25581396] | [7.388889, 18.25581396] |
| 0.7 | [7.5560428, 18.25581396] | [9.6541509, 18.25581396] | [6.65, 18.25581396] |
| 0.8 | [6.1753207, 18.25581396] | [9.0095414, 18.25581396] | [6.04545455, 18.25581396] |
| 0.9 | [5.34730725, 18.25581396] | [8.7341765, 18.25581396] | [5.5416667, 18.25581396] |
| 0.91 | [5.3110843, 18.25581396] | [8.706428, 18.25581396] | [5.4958677, 18.25581396] |

Table 6 MTTF w. r. t. λ1, λ2, λ3

| Failure rate | MTTF w. r. t. λ1 | MTTF w. r. t. λ2 | MTTF w. r. t. λ3 |
|---|---|---|---|
| 0.09 | [6.6500002, 18.70851] | [6.6500002, 17.70692] | [6.6500002, 20.29365] |
| 0.091 | [6.6500002, 18.66609] | [6.6500002, 17.68255] | [6.6500002, 20.26063] |
| 0.092 | [6.6500002, 18.62386] | [6.6500002, 17.65889] | [6.6500002, 20.22849] |
| 0.093 | [6.6500002, 18.58182] | [6.6500002, 17.63593] | [6.6500002, 20.19709] |
| 0.094 | [6.6500002, 18.53997] | [6.6500002, 17.61974] | [6.6500002, 20.16652] |
| 0.095 | [6.6500002, 18.49831] | [6.6500002, 17.59199] | [6.6500002, 20.13742] |
| 0.096 | [6.6500002, 18.45683] | [6.6500002, 17.59024] | [6.6500002, 20.10765] |
| 0.097 | [6.6500002, 18.41554] | [6.6500002, 17.55059] | [6.6500002, 20.0793] |
| 0.098 | [6.6500002, 18.37443] | [6.6500002, 17.53271] | [6.6500002, 20.03863] |
| 0.099 | [6.6500002, 18.33351] | [6.6500002, 17.51156] | [6.6500002, 20.02469] |

Table 7 MTTF w. r. t. λ4, λ5, λ6

| Failure rate | MTTF w. r. t. λ4 | MTTF w. r. t. λ5 | MTTF w. r. t. λ6 |
|---|---|---|---|
| 0.09 | [6.6500002, 22.92815] | [6.6500002, 25.35714] | [6.6500002, 24.21102] |
| 0.091 | [6.6500002, 22.93929] | [6.6500002, 25.37771] | [6.6500002, 24.14002] |
| 0.092 | [6.6500002, 22.95099] | [6.6500002, 25.3987] | [6.6500002, 24.06943] |
| 0.093 | [6.6500002, 22.96324] | [6.6500002, 25.42009] | [6.6500002, 23.99926] |
| 0.094 | [6.6500002, 22.97531] | [6.6500002, 25.444186] | [6.6500002, 23.92949] |
| 0.095 | [6.6500002, 22.98927] | [6.6500002, 25.467032] | [6.6500002, 23.86013] |
| 0.096 | [6.6500002, 22.99413] | [6.6500002, 25.48649] | [6.6500002, 23.79117] |
| 0.097 | [6.6500002, 23.01724] | [6.6500002, 25.512802] | [6.6500002, 23.72261] |
| 0.098 | [6.6500002, 23.03191] | [6.6500002, 25.53246] | [6.6500002, 23.65444] |
| 0.099 | [6.6500002, 23.04701] | [6.6500002, 25.548714] | [6.6500002, 23.58666] |


Fig. 8 MTTF versus λ1 ,λ2 ,λ3 ,λ4 ,λ5 ,λ6

Fig. 9 MTTF versus failure rate


8.3 Broadcast Reliability of the Considered Replaceable SEN+ by Using the Method of IUGF

Let the upper bounds of the failure rates of the proposed SEN+ be $\overline{\lambda}_1 = 0.02$, $\overline{\lambda}_2 = 0.04$, $\overline{\lambda}_3 = 0.06$, $\overline{\lambda}_4 = 0.08$, $\overline{\lambda}_5 = 0.1$, $\overline{\lambda}_6 = 0.12$, $\overline{\lambda}_7 = 0.14$, $\overline{\lambda}_8 = 0.16$, $\overline{\lambda}_9 = 0.18$, $\overline{\lambda}_{10} = 0.2$, $\overline{\lambda}_{11} = 0.22$, $\overline{\lambda}_{12} = 0.24$ and the lower bounds be $\lambda_1 = 0.01$, $\lambda_2 = 0.03$, $\lambda_3 = 0.05$, $\lambda_4 = 0.07$, $\lambda_5 = 0.09$, $\lambda_6 = 0.11$, $\lambda_7 = 0.13$, $\lambda_8 = 0.15$, $\lambda_9 = 0.17$, $\lambda_{10} = 0.19$, $\lambda_{11} = 0.21$, $\lambda_{12} = 0.23$. Substituting these failure rates in Eqs. (119) to (141) and taking the inverse Laplace transform, we get the expressions for the various transition state probabilities. Tables 8, 9 and 10 show how the transition state probabilities of the operating states of the SEN+ change with time. Using Eq. (142), we can evaluate the variation of the reliability of the considered replaceable SEN+ w. r. t. time, which is given in Table 11 and depicted in Fig. 10.

Table 8 Variations in the transition state probabilities of operating states w. r. t. time

| t | $[\underline{P}_0(t), \overline{P}_0(t)]$ | $[\underline{P}_1(t), \overline{P}_1(t)]$ | $[\underline{P}_2(t), \overline{P}_2(t)]$ | $[\underline{P}_4(t), \overline{P}_4(t)]$ |
|---|---|---|---|---|
| 0 | [1] | [0, 0] | [0, 0] | [0, 0] |
| 1 | [0.414783, 0.444858] | [0.018286, 0.02324] | [0.029493, 0.033728] | [0.000734, 0.06951] |
| 2 | [0.1720449, 0.1978986] | [0.022957, 0.027921] | [0.035634, 0.08977] | [0.001566, 0.014795] |
| 3 | [0.07136127, 0.0880368] | [0.022228, 0.025962] | [0.033039, 0.034677] | [0.003448, 0.02263] |
| 4 | [0.029599, 0.039163] | [0.019672, 0.022081] | [0.028096, 0.029818] | [0.004556, 0.030231] |
| 5 | [0.0122773, 0.0174224] | [0.016626, 0.018057] | [0.021812, 0.022391] | [0.005376, 0.037623] |
| 6 | [0.00509243, 0.0077505] | [0.013795, 0.01449] | [0.016573, 0.017606] | [0.005928, 0.044966] |
| 7 | [0.00211225, 0.003448] | [0.011324, 0.011516] | [0.012449, 0.013669] | [0.006256, 0.052368] |
| 8 | [0.000876, 0.0015338] | [0.009242, 0.009108] | [0.009294, 0.010536] | [0.00602, 0.059987] |
| 9 | [0.0003634, 0.00068233] | [0.007519, 0.007185] | [0.006915, 0.008088] | [0.00641, 0.06796] |
| 10 | [0.000150733, 0.00030354] | [0.006108, 0.00566] | [0.00515, 0.006195] | [0.006312, 0.076418] |


Table 9 Variations in the transition state probabilities of operating states w. r. t. time





P 5 (t), P 5 (t) P 6 (t), P 6 (t) P 7 (t), P 7 (t) P 8 (t), P 8 (t) t 0

[0,0]

[0,0]

[0,0]

1

[0.000951, 0.00136]

2

[0.002741, 0.003802] [0.000604, 0.001073] [0.005115, 0.00626]

3

[0.004542, 0.006126] [0.000968, 0.001415]

4

[0.006059, 0.007967] [0.001253, 0.001664] [0.010228, 0.01186]

5

[0.007223, 0.009278] [0.001452, 0.001838] [0.011561, 0.01309]

6

[0.008053, 0.010122] [0.001575, 0.001954] [0.012202, 0.01351]

[0.015241, 0.01668]

7

[0.008597, 0.010587] [0.001638, 0.002025] [0.012316, 0.01336]

[0.015565, 0.01669]

8

[0.008905, 0.010758] [0.001654, 0.002061] [0.012053, 0.01282]

[0.015423, 0.01623]

9

[0.009029, 0.00709]

[0.014953, 0.01545]

[0.000217, 0.000612] [0.001857, 0.00235] [0.00807, 0.009608]

[0.001636, 0.002071] [0.011536, 0.01204]

10 [0.009008, 0.010499] [0.001594, 0.002062]

[0,0] [0.00221, 0.002758] [0.006137, 0.00742] [0.00977, 0.011487] [0.012502, 0.01432] [0.01428, 0.001598]

[0.01086, 0.011136] [0.014268, 0.01449]

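8.4 MTTF of the SEN+

Using Eq. (143), we can determine the upper and lower bounds of the MTTF of the SEN+ corresponding to its BR. The bounds obtained by varying the failure rates are presented in Tables 12, 13, 14, 15, 16, 17, 18 and 19, and their graphical versions are given in Figs. 11 and 12, where $C = \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6, \lambda_7, \lambda_8, \lambda_9, \lambda_{10}, \lambda_{11}$ and $\overline{C} = \overline{\lambda}_1, \overline{\lambda}_2, \overline{\lambda}_3, \overline{\lambda}_4, \overline{\lambda}_5, \overline{\lambda}_6, \overline{\lambda}_7, \overline{\lambda}_8, \overline{\lambda}_9, \overline{\lambda}_{10}, \overline{\lambda}_{11}$.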

8.5 Network Reliability of the Replaceable SEN+ Using the IUGF Approach

Let the lower bounds of the failure rates of the proposed SEN+ be $\lambda_1 = 0.01$, $\lambda_2 = 0.03$, $\lambda_3 = 0.05$, $\lambda_4 = 0.07$, $\lambda_5 = 0.09$, $\lambda_6 = 0.11$, $\lambda_7 = 0.13$, $\lambda_8 = 0.15$, $\lambda_9 = 0.17$, $\lambda_{10} = 0.19$, $\lambda_{11} = 0.21$, $\lambda_{12} = 0.23$, $\lambda_{13} = 0.25$, $\lambda_{14} = 0.27$, $\lambda_{15} = 0.28$, $\lambda_{16} = 0.29$ and the upper bounds be $\overline{\lambda}_1 = 0.02$, $\overline{\lambda}_2 = 0.04$, $\overline{\lambda}_3 = 0.06$, $\overline{\lambda}_4 = 0.08$, $\overline{\lambda}_5 = 0.1$, $\overline{\lambda}_6 = 0.12$, $\overline{\lambda}_7 = 0.14$, $\overline{\lambda}_8 = 0.16$, $\overline{\lambda}_9 = 0.18$, $\overline{\lambda}_{10} = 0.2$, $\overline{\lambda}_{11} = 0.22$, $\overline{\lambda}_{12} = 0.24$, $\overline{\lambda}_{13} = 0.26$, $\overline{\lambda}_{14} = 0.28$, $\overline{\lambda}_{15} = 0.285$, $\overline{\lambda}_{16} = 0.295$. Substituting these failure rates in Eqs. (214) to (238) and taking the inverse Laplace transform, we get the expressions for the various transition state probabilities. Tables 20, 21 and 22 show how the transition state probabilities of the operating states change with time. Using Eq. (239), we can find the variation of the network reliability of the replaceable SEN+ with time, which is given in Table 23 and shown in Fig. 13.
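As a quick illustration of the inverse-transform step, Eq. (214) inverts to $P_0(t) = e^{-Ft}$, so the interval for the fully operating state needs only the two sums of failure rates. The sketch below is an illustration under that reading and not the authors' procedure; the function name `p0_interval` is hypothetical. At t = 1 and t = 2 it gives values that agree, to the printed precision, with the entries [0.068563154, 0.079659] and [0.004700906, 0.006346] listed for Table 20.

```python
import math

# Failure-rate bounds quoted in Sect. 8.5 (lambda_1..lambda_16).
lower = [0.01, 0.03, 0.05, 0.07, 0.09, 0.11, 0.13, 0.15,
         0.17, 0.19, 0.21, 0.23, 0.25, 0.27, 0.28, 0.29]
upper = [0.02, 0.04, 0.06, 0.08, 0.10, 0.12, 0.14, 0.16,
         0.18, 0.20, 0.22, 0.24, 0.26, 0.28, 0.285, 0.295]


def p0_interval(t):
    """Interval for P0(t) = exp(-F t), the inverse transform of 1/(s + F)."""
    f_small, f_large = sum(lower), sum(upper)
    # exp(-F t) decreases in F, so the larger F yields the lower endpoint.
    return math.exp(-f_large * t), math.exp(-f_small * t)


for t in (1, 2):
    lo, hi = p0_interval(t)
    print(t, round(lo, 6), round(hi, 6))
```

The other operating states involve products of such exponentials and can be handled the same way, one partial fraction at a time.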

Table 10 Variations in the transition state probabilities of operating states w. r. t. time

| t | $[\underline{P}_9(t), \overline{P}_9(t)]$ | $[\underline{P}_{15}(t), \overline{P}_{15}(t)]$ | $[\underline{P}_{16}(t), \overline{P}_{16}(t)]$ | $[\underline{P}_{18}(t), \overline{P}_{18}(t)]$ |
|---|---|---|---|---|
| 0 | [0, 0] | [0, 0] | [0, 0] | [0, 0] |
| 1 | [0.000172, 0.000235] | [0.0000051, 0.0000092] | [0.00000536, 0.000012] | [0.0000055, 0.0000106] |
| 2 | [0.001009, 0.001332] | [0.0000309, 0.0000541] | [0.0000363, 0.0000654] | [0.0000594, 0.0007022] |
| 3 | [0.002526, 0.003237] | [0.0000797, 0.0001361] | [0.0000952, 0.0001629] | [0.0002202, 0.0002248] |
| 4 | [0.004491, 0.005593] | [0.0001461, 0.0002439] | [0.000152, 0.00029242] | [0.00016302, 0.000479] |
| 5 | [0.006644, 0.00805] | [0.00022296, 0.000362] | [0.0002706, 0.0003762] | [0.000849, 0.00100913] |
| 6 | [0.00877, 0.010351] | [0.0003039, 0.0004871] | [0.00037144, 0.000589] | [0.0013171, 0.001853] |
| 7 | [0.010722, 0.012338] | [0.0003839, 0.0005937] | [0.0004724, 0.0006210] | [0.001942, 0.00233001] |
| 8 | [0.012408, 0.013929] | [0.0004593, 0.0007098] | [0.00056908, 0.000989] | [0.0024625, 0.003154] |
| 9 | [0.013779, 0.015102] | [0.0005277, 0.0008124] | [0.0006624, 0.0009891] | [0.003127, 0.003924] |
| 10 | [0.014824, 0.015871] | [0.00058772, 0.000877] | [0.00073805, 0.001090] | [0.0037298, 0.0047326] |

Table 11 Changes in the broadcast reliability bounds with time

| t | $[\underline{R}(t), \overline{R}(t)]$ |
|---|---|
| 0 | [1] |
| 1 | [0.468719, 0.57868] |
| 2 | [0.41220131, 0.54691109] |
| 3 | [0.3860208, 0.3992534] |
| 4 | [0.36142308, 0.3714442] |
| 5 | [0.312922595, 0.312922595] |
| 6 | [0.3119157, 0.31232434] |
| 7 | [0.271104265, 0.301650249] |
| 8 | [0.247908, 0.26147536] |
| 9 | [0.1913648, 0.21088568] |
| 10 | [0.1378953, 0.1680076] |

Fig. 10 BR versus time (axes: reliability against time; curves: lower bound of BR, upper bound of BR)

Table 12 MTTF w. r. t. λ1, λ2, λ3

| Failure rate | MTTF w. r. t. λ1 | MTTF w. r. t. λ2 | MTTF w. r. t. λ3 |
|---|---|---|---|
| 0.1 | [3.8774968, 27.954032] | [4.604626, 27.954032] | [5.687244, 27.954032] |
| 0.2 | [3.6580158, 27.954032] | [6.683774, 27.954032] | [6.652804, 27.954032] |
| 0.3 | [3.1534619, 27.954032] | [8.438454, 27.954032] | [7.653553, 27.954032] |
| 0.4 | [2.5027476, 27.954032] | [9.930167, 27.954032] | [8.582255, 27.954032] |
| 0.5 | [1.8402555, 27.954032] | [10.46414, 27.954032] | [9.417767, 27.954032] |
| 0.6 | [1.2604490, 27.954032] | [12.31958, 27.954032] | [10.16279, 27.954032] |
| 0.7 | [0.8079801, 27.954032] | [13.2896, 27.954032] | [10.82652, 27.954032] |
| 0.8 | [0.4867350, 27.954032] | [14.14345, 27.954032] | [11.54981, 27.954032] |
| 0.9 | [0.2765539, 27.954032] | [14.90108, 27.954032] | [11.95038, 27.954032] |
| 0.91 | [0.1562452, 27.954032] | [15.04965, 27.954032] | [12.06321, 27.954032] |

Table 13 MTTF w. r. t. λ4, λ5, λ6

| Failure rate | MTTF w. r. t. λ4 | MTTF w. r. t. λ5 | MTTF w. r. t. λ6 |
|---|---|---|---|
| 0.1 | [20.39979, 27.954032] | [15.9842, 27.954032] | [12.984207, 27.954032] |
| 0.2 | [18.9774, 27.954032] | [13.97621, 27.954032] | [11.41526, 27.954032] |
| 0.3 | [15.34459, 27.954032] | [12.98246, 27.954032] | [10.85541, 27.954032] |
| 0.4 | [13.00737, 27.954032] | [11.41096, 27.954032] | [9.983425607, 27.954032] |
| 0.5 | [11.37702, 27.954032] | [10.39325, 27.954032] | [8.298473, 27.954032] |
| 0.6 | [10.17459, 27.954032] | [9.52351, 27.954032] | [8.119777, 27.954032] |
| 0.7 | [9.753145, 27.954032] | [8.20653, 27.954032] | [8.0094112, 27.954032] |
| 0.8 | [8.518797, 27.954032] | [8.001438, 27.954032] | [7.8984723, 27.954032] |
| 0.9 | [7.924162, 27.954032] | [7.923206, 27.954032] | [7.798434, 27.954032] |
| 0.91 | [7.870775, 27.954032] | [7.902426, 27.954032] | [7.698818, 27.954032] |

Table 14 MTTF w. r. t. λ7, λ8, λ9

| Failure rate | MTTF w. r. t. λ7 | MTTF w. r. t. λ8 | MTTF w. r. t. λ9 |
|---|---|---|---|
| 0.1 | [7.853127, 27.954032] | [4.539508, 27.954032] | [4.652996, 27.954032] |
| 0.2 | [7.513118, 27.954032] | [4.0460836, 27.954032] | [4.135997, 27.954032] |
| 0.3 | [7.3555276, 27.954032] | [3.6494087, 27.954032] | [3.722396, 27.954032] |
| 0.4 | [7.2048652, 27.954032] | [3.323569, 27.954032] | [3.49723, 27.954032] |
| 0.5 | [7.118563, 27.954032] | [3.051145, 27.954032] | [3.101997, 27.954032] |
| 0.6 | [7.0101625, 27.954032] | [2.819998, 27.954032] | [2.872102, 27.954032] |
| 0.7 | [6.90373453, 27.954032] | [2.621406, 27.954032] | [2.658855, 27.954032] |
| 0.8 | [6.881042, 27.954032] | [2.448945, 27.954032] | [2.484309, 27.954032] |
| 0.9 | [6.7151872, 27.954032] | [2.297776, 27.954032] | [2.326498, 27.954032] |
| 0.91 | [6.7019423, 27.954032] | [2.283679, 27.954032] | [2.312048, 27.954032] |

Table 15 MTTF w. r. t. λ10, λ11

| Failure rate | MTTF w. r. t. λ10 | MTTF w. r. t. λ11 |
|---|---|---|
| 0.1 | [4.772304, 27.954032] | [4.897891, 27.954032] |
| 0.2 | [4.229997, 27.954032] | [4.328369, 27.954032] |
| 0.3 | [3.798364, 27.954032] | [3.877497, 27.954032] |
| 0.4 | [3.560932, 27.954032] | [3.511695, 27.954032] |
| 0.5 | [3.154574, 27.954032] | [3.208963, 27.954032] |
| 0.6 | [2.896932, 27.954032] | [2.934981, 27.954032] |
| 0.7 | [2.6557518, 27.954032] | [2.737057, 27.954032] |
| 0.8 | [2.402058, 27.954032] | [2.549587, 27.954032] |
| 0.9 | [2.355947, 27.954032] | [2.386152, 27.954032] |
| 0.91 | [2.34113, 27.954032] | [2.370953, 27.954032] |

Table 16 MTTF w. r. t. λ1, λ2, λ3

| Failure rate | MTTF w. r. t. λ1 | MTTF w. r. t. λ2 | MTTF w. r. t. λ3 |
|---|---|---|---|
| 0.1 | [1.26434352, 3.09359758] | [1.26434352, 1.654956] | [1.26434352, 1.536492] |
| 0.2 | [1.26434352, 2.78423782] | [1.26434352, 1.761904] | [1.26434352, 2.390001] |
| 0.3 | [1.26434352, 2.531125] | [1.26434352, 2.009773] | [1.26434352, 3.199891] |
| 0.4 | [1.26434352, 2.320198] | [1.26434352, 2.298437] | [1.26434352, 3.961509] |
| 0.5 | [1.26434352, 2.1513129] | [1.26434352, 2.595035] | [1.26434352, 4.676214] |
| 0.6 | [1.26434352, 1.988741] | [1.26434352, 2.886925] | [1.26434352, 5.346969] |
| 0.7 | [1.26434352, 1.8307822] | [1.26434352, 3.168934] | [1.26434352, 5.977091] |
| 0.8 | [1.26434352, 1.740149] | [1.26434352, 3.43903] | [1.26434352, 6.569825] |
| 0.9 | [1.26434352, 1.637787] | [1.26434352, 3.696612] | [1.26434352, 7.128206] |
| 0.91 | [1.26434352, 1.628209] | [1.26434352, 3.721682] | [1.26434352, 7.182264] |

Table 17 MTTF w. r. t. λ4, λ5, λ6

| Failure rate | MTTF w. r. t. λ4 | MTTF w. r. t. λ5 | MTTF w. r. t. λ6 |
|---|---|---|---|
| 0.1 | [1.26434352, 2.8468719] | [1.26434352, 2.78429] | [1.26434352, 3.85526] |
| 0.2 | [1.26434352, 2.4268227] | [1.26434352, 2.68398] | [1.26434352, 3.81627] |
| 0.3 | [1.26434352, 2.2214942] | [1.26434352, 2.2845] | [1.26434352, 3.757654] |
| 0.4 | [1.26434352, 2.10470249] | [1.26434352, 2.14352] | [1.26434352, 3.6225] |
| 0.5 | [1.26434352, 2.03287682] | [1.26434352, 2.04436] | [1.26434352, 3.516455] |
| 0.6 | [1.26434352, 1.9860925] | [1.26434352, 1.983529] | [1.26434352, 3.475505] |
| 0.7 | [1.26434352, 1.82286785] | [1.26434352, 1.897982] | [1.26434352, 3.388433] |
| 0.8 | [1.26434352, 1.488691] | [1.26434352, 1.769752] | [1.26434352, 3.287852] |
| 0.9 | [1.26434352, 1.27589] | [1.26434352, 1.668726] | [1.26434352, 3.161679] |
| 0.91 | [1.26434352, 1.255834] | [1.26434352, 1.6483829] | [1.26434352, 3.14492] |

Table 18 MTTF w. r. t. λ7, λ8, λ9

| Failure rate | MTTF w. r. t. λ7 | MTTF w. r. t. λ8 | MTTF w. r. t. λ9 |
|---|---|---|---|
| 0.1 | [1.26434352, 3.92946] | [1.26434352, 3.66342118] | [1.26434352, 3.01864] |
| 0.2 | [1.26434352, 3.87847] | [1.26434352, 3.42744197] | [1.26434352, 2.887845] |
| 0.3 | [1.26434352, 3.7123] | [1.26434352, 3.30020844] | [1.26434352, 2.78429] |
| 0.4 | [1.26434352, 3.68763] | [1.26434352, 3.19098693] | [1.26434352, 2.67742] |
| 0.5 | [1.26434352, 3.55954] | [1.26434352, 3.0881076] | [1.26434352, 2.55070] |
| 0.6 | [1.26434352, 3.42497] | [1.26434352, 2.9854911] | [1.26434352, 2.44289] |
| 0.7 | [1.26434352, 3.383862] | [1.26434352, 2.84319] | [1.26434352, 2.376763] |
| 0.8 | [1.26434352, 3.292675] | [1.26434352, 2.74945] | [1.26434352, 2.209894] |
| 0.9 | [1.26434352, 3.15458] | [1.26434352, 2.686855] | [1.26434352, 2.16682] |
| 0.91 | [1.26434352, 3.126243] | [1.26434352, 2.643192] | [1.26434352, 2.13875] |

Table 19 MTTF w. r. t. λ10, λ11

| Failure rate | MTTF w. r. t. λ10 | MTTF w. r. t. λ11 |
|---|---|---|
| 0.1 | [1.26434352, 2.8634218] | [1.26434352, 2.209894] |
| 0.2 | [1.26434352, 2.74945] | [1.26434352, 2.198549] |
| 0.3 | [1.26434352, 2.642497] | [1.26434352, 2.142497] |
| 0.4 | [1.26434352, 2.559854] | [1.26434352, 2.024987] |
| 0.5 | [1.26434352, 2.320989] | [1.26434352, 1.985416] |
| 0.6 | [1.26434352, 2.242455] | [1.26434352, 1.820984] |
| 0.7 | [1.26434352, 2.17494] | [1.26434352, 1.74945] |
| 0.8 | [1.26434352, 2.0985413] | [1.26434352, 1.64247] |
| 0.9 | [1.26434352, 2.042497] | [1.26434352, 1.498519] |
| 0.91 | [1.26434352, 2.009894] | [1.26434352, 1.474945] |

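8.6 MTTF of the SEN+

The bounds of the MTTF at various failure rates are given in Tables 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 and 35 and are represented graphically in Figs. 14 and 15, where in Fig. 14 $A = \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6, \lambda_7, \lambda_8, \lambda_9, \lambda_{10}, \lambda_{11}, \lambda_{12}, \lambda_{13}, \lambda_{14}, \lambda_{15}, \lambda_{16}$ and in Fig. 15 $\overline{A} = \overline{\lambda}_1, \overline{\lambda}_2, \overline{\lambda}_3, \overline{\lambda}_4, \overline{\lambda}_5, \overline{\lambda}_6, \overline{\lambda}_7, \overline{\lambda}_8, \overline{\lambda}_9, \overline{\lambda}_{10}, \overline{\lambda}_{11}, \overline{\lambda}_{12}, \overline{\lambda}_{13}, \overline{\lambda}_{14}, \overline{\lambda}_{15}, \overline{\lambda}_{16}$.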


Fig. 11 MTTF versus λ1, λ2, λ3, λ4, λ5, λ6, λ7, λ8, λ9, λ10, λ11

Fig. 12 MTTF versus Failure rates

9 Result and Discussion

This work analyzes the reliability bounds of a replaceable shuffle exchange network under uncertainty by using the IUGF technique. The reliability bounds of the proposed SEN+ have been determined for three reliability parameters: TR, BR, and NR. The results show that the bounds of all three reliability indices of the SEN+ decrease with increasing time, as can be observed from Figs. 7, 10 and 13.

Table 20 Variations in the transition state probabilities of operating states w. r. t. time

| t | $[\underline{P}_0(t), \overline{P}_0(t)]$ | $[\underline{P}_5(t), \overline{P}_5(t)]$ | $[\underline{P}_6(t), \overline{P}_6(t)]$ |
|---|---|---|---|
| 0 | [1] | [0, 0] | [0, 0] |
| 1 | [0.068563154, 0.079659] | [0.0303536, 0.0319571] | [0.037611, 0.0388965] |
| 2 | [0.004700906, 0.006346] | [0.0296097, 0.031967] | [0.03737, 0.037861853] |
| 3 | [0.00032231, 0.000505] | [0.026718034, 0.0272403] | [0.034392, 0.03444167] |
| 4 | [0.0000222098, 0.000040266] | [0.02395027, 0.0241703] | [0.031451, 0.031176649] |
| 5 | [0.000001515, 0.000003207] | [0.02143789, 0.021456692] | [0.028745, 0.0282106] |
| 6 | [0.000000104, 0.00000025555] | [0.01190137, 0.0192217] | [0.026271, 0.025526118] |
| 7 | [0.000000007122, 0.00000002035] | [0.0168637, 0.0172195] | [0.023296, 0.02401] |
| 8 | [0.000000000488, 0.000000001621] | [0.014957, 0.015426] | [0.020899, 0.021944] |
| 9 | [0.000000000033, 0.0000000001291] | [0.01326545, 0.013819] | [0.01891022, 0.020055] |
| 10 | [0.00000000000229, 0.0000000000103] | [0.0117654, 0.0123795] | [0.017110672, 0.018329] |

Table 21 Variations in the transition state probabilities of operating states w. r. t. time

| t | $[\underline{P}_8(t), \overline{P}_8(t)]$ | $[\underline{P}_9(t), \overline{P}_9(t)]$ | $[\underline{P}_{11}(t), \overline{P}_{11}(t)]$ | $[\underline{P}_{12}(t), \overline{P}_{12}(t)]$ |
|---|---|---|---|---|
| 0 | [0, 0] | [0, 0] | [0, 0] | [0, 0] |
| 1 | [0.042997, 0.04353225] | [0.049902, 0.050444] | [0.054291, 0.054448] | [0.061337, 0.061509] |
| 2 | [0.040080452, 0.04015] | [0.047312, 0.047794] | [0.048311, 0.049221] | [0.055438, 0.056793] |
| 3 | [0.034359, 0.034803] | [0.041368, 0.042234] | [0.03981, 0.041048] | [0.046594, 0.048304] |
| 4 | [0.029292796, 0.02997] | [0.03598, 0.037155] | [0.032611, 0.033973] | [0.038938, 0.040784] |
| 5 | [0.0249626, 0.025809] | [0.031281, 0.032628] | [0.026701, 0.028096] | [0.032525, 0.03441] |
| 6 | [0.02127821, 0.22208] | [0.027194, 0.02865] | [0.021861, 0.023235] | [0.027168, 0.029031] |
| 7 | [0.018226655, 0.01911] | [0.023642, 0.025158] | [0.017898, 0.019214] | [0.022692, 0.024492] |
| 8 | [0.015446517, 0.01645] | [0.020553, 0.022091] | [0.014654, 0.015889] | [0.018954, 0.020663] |
| 9 | [0.013162653, 0.01416] | [0.017868, 0.019398] | [0.011997, 0.01314] | [0.015832, 0.017433] |
| 10 | [0.011216473, 0.01218] | [0.015534, 0.017033] | [0.009823, 0.010866] | [0.013224, 0.014708] |

Thus, the performance of the network degrades with time. Among the three reliability indices, the bounds of TR are the highest, followed by the BR bounds and then the NR bounds. In addition to the reliability bounds of the network, the bounds of the MTTF of the network at different failure rates were also evaluated, and the results can be summarized as follows:


Table 22 Variations in the transition state probabilities of operating states w. r. t. time

| t | $[\underline{P}_{13}(t), \overline{P}_{13}(t)]$ | $[\underline{P}_{16}(t), \overline{P}_{16}(t)]$ | $[\underline{P}_{17}(t), \overline{P}_{17}(t)]$ | $[\underline{P}_{18}(t), \overline{P}_{18}(t)]$ |
|---|---|---|---|---|
| 0 | [0, 0] | [0, 0] | [0, 0] | [0, 0] |
| 1 | [0.012796, 0.013627] | [0.064744, 0.065271] | [0.071605, 0.072462] | [0.019407, 0.020271] |
| 2 | [0.024539, 0.025431] | [0.055368, 0.05706] | [0.062374, 0.064509] | [0.037725, 0.038341] |
| 3 | [0.029866, 0.030354] | [0.043858, 0.04575] | [0.050393, 0.05275] | [0.046426, 0.46418] |
| 4 | [0.031016, 0.03102] | [0.034521, 0.036383] | [0.040464, 0.042795] | [0.047581, 0.048573] |
| 5 | [0.029389, 0.029827] | [0.027157, 0.02891] | [0.032475, 0.034692] | [0.045223, 0.046888] |
| 6 | [0.02666, 0.027441] | [0.021362, 0.02297] | [0.026062, 0.028121] | [0.041009, 0.43151] |
| 7 | [0.02355, 0.024552] | [0.016804, 0.018251] | [0.020195, 0.022794] | [0.03606, 0.038486] |
| 8 | [0.020423, 0.021562] | [0.013219, 0.014501] | [0.016785, 0.018477] | [0.031033, 0.033582] |
| 9 | [0.017491, 0.018693] | [0.010398, 0.011521] | [0.01347, 0.014977] | [0.026289, 0.028835] |
| 10 | [0.014848, 0.016057] | [0.008179, 0.009154] | [0.01081, 0.01214] | [0.022008, 0.024459] |

Table 23 Changes in the network reliability of the SEN+ with time

| t | $[\underline{R}(t), \overline{R}(t)]$ |
|---|---|
| 0 | [1] |
| 1 | [0.462659, 0.575439] |
| 2 | [0.442011, 0.4598109] |
| 3 | [0.41389608, 0.423992534] |
| 4 | [0.38614208, 0.39962971442] |
| 5 | [0.37131528, 0.3802322595] |
| 6 | [0.329157, 0.3427192434] |
| 7 | [0.3027110465, 0.31260249] |
| 8 | [0.27097618, 0.299147536] |
| 9 | [0.2598855568, 0.265413648] |
| 10 | [0.249997855, 0.25802036] |
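The statement that the reliability bounds decrease with time can be checked mechanically against the tabulated values. The short sketch below is an illustration rather than part of the original analysis: it copies the network reliability intervals from Table 23 and tests that both endpoints are non-increasing in t and that each lower bound does not exceed its upper bound. The helper name `is_non_increasing` is hypothetical.

```python
# Network reliability intervals copied from Table 23 for t = 0, 1, ..., 10.
nr_bounds = [
    (1.0, 1.0),
    (0.462659, 0.575439),
    (0.442011, 0.4598109),
    (0.41389608, 0.423992534),
    (0.38614208, 0.39962971442),
    (0.37131528, 0.3802322595),
    (0.329157, 0.3427192434),
    (0.3027110465, 0.31260249),
    (0.27097618, 0.299147536),
    (0.2598855568, 0.265413648),
    (0.249997855, 0.25802036),
]


def is_non_increasing(values):
    """True if every value is at least as large as its successor."""
    return all(a >= b for a, b in zip(values, values[1:]))


lower_ok = is_non_increasing([lo for lo, _ in nr_bounds])
upper_ok = is_non_increasing([hi for _, hi in nr_bounds])
ordered = all(lo <= hi for lo, hi in nr_bounds)
print(lower_ok, upper_ok, ordered)
```

The same check can be applied to any of the other interval tables by swapping in their rows.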

Fig. 13 Network Reliability versus time (axes: reliability against time; curves: lower bound of NR, upper bound of NR)

Table 24 MTTF w. r. t. λ1, λ2, λ3

| Failure rate | MTTF w. r. t. λ1 | MTTF w. r. t. λ2 | MTTF w. r. t. λ3 |
|---|---|---|---|
| 0.1 | [3.724784, 4.39604205] | [3.751972, 4.39604205] | [3.77956, 4.39604205] |
| 0.2 | [3.594547, 4.39604205] | [3.61986, 4.39604205] | [3.645533, 4.39604205] |
| 0.3 | [3.473109, 4.39604205] | [3.496736, 4.39604205] | [3.840953, 4.39604205] |
| 0.4 | [3.359609, 4.39604205] | [3.381712, 4.39604205] | [3.821614, 4.39604205] |
| 0.5 | [3.253292, 4.39604205] | [3.274014, 4.39604205] | [3.295001, 4.39604205] |
| 0.6 | [3.153498, 4.39604205] | [3.172964, 4.39604205] | [3.158427, 4.39604205] |
| 0.7 | [3.059644, 4.39604205] | [3.083559, 4.39604205] | [3.096507, 4.39604205] |
| 0.8 | [2.971245, 4.39604205] | [2.998489, 4.39604205] | [3.005966, 4.39604205] |
| 0.9 | [2.887754, 4.39604205] | [2.904069, 4.39604205] | [2.920569, 4.39604205] |
| 0.91 | [2.886943, 0.39604205] | [2.903249, 4.39604205] | [2.91974, 4.39604205] |

Table 25 MTTF w. r. t. λ4, λ5, λ6

| Failure rate | MTTF w. r. t. λ4 | MTTF w. r. t. λ5 | MTTF w. r. t. λ6 |
|---|---|---|---|
| 0.1 | [3.807557, 4.39604205] | [3.984653, 4.39604205] | [3.852282, 4.39604205] |
| 0.2 | [3.621883, 4.39604205] | [3.93228, 4.39604205] | [3.893866, 4.39604205] |
| 0.3 | [3.544967, 4.39604205] | [4.009738, 4.39604205] | [4.049092, 4.39604205] |
| 0.4 | [3.426801, 4.39604205] | [4.12914, 4.39604205] | [4.121983, 4.39604205] |
| 0.5 | [3.372291, 4.39604205] | [4.24622, 4.39604205] | [4.194467, 4.39604205] |
| 0.6 | [3.212626, 4.39604205] | [4.265932, 4.39604205] | [4.233761, 4.39604205] |
| 0.7 | [3.497356, 4.39604205] | [4.281708, 4.39604205] | [4.253427, 4.39604205] |
| 0.8 | [3.023648, 4.39604205] | [4.290619, 4.39604205] | [4.296045, 4.39604205] |
| 0.9 | [2.937258, 4.39604205] | [4.298344, 4.39604205] | [4.318242, 4.39604205] |
| 0.91 | [2.92889, 4.39604205] | [4.321165, 4.39604205] | [4.356728, 4.39604025] |


Table 26 MTTF w. r. t. λ7, λ8, λ9

| Failure rate | MTTF w. r. t. λ7 | MTTF w. r. t. λ8 | MTTF w. r. t. λ9 |
|---|---|---|---|
| 0.1 | [3.972555, 4.39604205] | [4.275923, 4.39604205] | [4.253029, 4.39604205] |
| 0.2 | [3.763703, 4.39604205] | [4.123972, 4.39604205] | [4.246949, 4.39604205] |
| 0.3 | [3.757352, 4.39604205] | [4.148283, 4.39604205] | [4.235865, 4.39604205] |
| 0.4 | [3.796785, 4.39604205] | [4.210882, 4.39604205] | [4.21873, 4.39604205] |
| 0.5 | [3.851167, 4.39604205] | [4.244789, 4.39604205] | [4.21601, 4.39604205] |
| 0.6 | [3.910577, 4.39604205] | [4.286146, 4.39604205] | [4.20352, 4.39604205] |
| 0.7 | [3.971024, 4.39604205] | [4.316264, 4.39604205] | [4.118623, 4.39604205] |
| 0.8 | [4.030703, 4.39604205] | [4.338612, 4.39604205] | [4.035371, 4.39604205] |
| 0.9 | [4.088757, 4.39604205] | [4.351192, 4.39604205] | [3.072834, 4.39604205] |
| 0.91 | [4.105164, 4.39604205] | [4.359421, 4.39604205] | [3.044328, 4.39604205] |

Table 27 MTTF w. r. t. λ10, λ11, λ12

| Failure rate | MTTF w. r. t. λ10 | MTTF w. r. t. λ11 | MTTF w. r. t. λ12 |
|---|---|---|---|
| 0.1 | [4.19842, 4.39604205] | [4.08431, 4.39604205] | [3.796355, 4.39604205] |
| 0.2 | [4.001618, 4.39604205] | [3.84139, 4.39604205] | [3.59327, 4.39604205] |
| 0.3 | [3.808442, 4.39604205] | [3.764321, 4.39604205] | [3.43058, 4.39604205] |
| 0.4 | [3.729194, 4.39604205] | [3.683219, 4.39604205] | [3.37327, 4.39604205] |
| 0.5 | [3.659392, 4.39604205] | [3.55942, 4.39604205] | [3.282759, 4.39604205] |
| 0.6 | [3.550143, 4.39604205] | [3.42964, 4.39604205] | [3.166469, 4.39604205] |
| 0.7 | [3.492753, 4.39604205] | [3.37275, 4.39604205] | [3.103403, 4.39604205] |
| 0.8 | [3.329802, 4.39604205] | [3.247428, 4.39604205] | [3.004065, 4.39604205] |
| 0.9 | [3.272829, 4.39604205] | [3.188528, 4.39604205] | [2.929254, 4.39604205] |
| 0.91 | [3.249143, 4.39604205] | [3.129543, 4.39604205] | [2.910376, 4.39604205] |

Table 28 MTTF w. r. t. λ13, λ14, λ15

| Failure rate | MTTF w. r. t. λ13 | MTTF w. r. t. λ14 | MTTF w. r. t. λ15 |
|---|---|---|---|
| 0.1 | [4.079525, 4.39604205] | [4.112161, 4.39604205] | [4.12042, 4.39604205] |
| 0.2 | [3.921838, 4.39604205] | [3.951001, 4.39604205] | [3.96162, 4.39604205] |
| 0.3 | [3.77956, 4.39604205] | [3.807550, 4.39604205] | [3.86296, 4.39604205] |
| 0.4 | [3.69259, 4.39604205] | [3.671573, 4.39604205] | [3.678141, 4.39604205] |
| 0.5 | [3.66836, 4.39604205] | [3.544967, 4.39604205] | [3.503427, 4.39604205] |
| 0.6 | [3.404107, 4.39604205] | [3.426801, 4.39604205] | [3.432522, 4.39604205] |
| 0.7 | [3.393737, 4.39604205] | [3.316259, 4.39604205] | [3.395374, 4.39604205] |
| 0.8 | [3.219265, 4.39604205] | [3.212626, 4.39604205] | [3.217654, 4.39604205] |
| 0.9 | [3.096597, 4.39604205] | [3.115274, 4.39604205] | [3.120001, 4.39604205] |
| 0.91 | [3.095575, 4.39604205] | [3.105862, 4.39604205] | [3.110561, 4.39604205] |


Table 29 MTTF w. r. t. λ16

| Failure rate | MTTF w. r. t. λ16 |
|---|---|
| 0.1 | [3.455598, 4.39604205] |
| 0.2 | [3.343221, 4.39604205] |
| 0.3 | [3.275208, 4.39604205] |
| 0.4 | [3.139055, 4.39604205] |
| 0.5 | [3.07426, 4.39604205] |
| 0.6 | [2.95839, 4.39604205] |
| 0.7 | [2.89842, 4.39604205] |
| 0.8 | [2.797389, 4.39604205] |
| 0.9 | [2.653007, 4.39604205] |
| 0.91 | [2.646179, 4.39604205] |

Table 30 MTTF w. r. t. λ1, λ2, λ3

| Failure rate | MTTF w. r. t. λ1 | MTTF w. r. t. λ2 | MTTF w. r. t. λ3 |
|---|---|---|---|
| 0.1 | [3.835971515, 4.245033] | [3.835971515, 4.27769] | [3.835971515, 4.48467] |
| 0.2 | [3.835971515, 4.08897] | [3.835971515, 4.11925] | [3.835971515, 4.31085] |
| 0.3 | [3.835971515, 3.94397] | [3.835971515, 3.97214] | [3.835971515, 4.14999] |
| 0.4 | [3.835971515, 3.8089] | [3.835971515, 3.83517] | [3.835971515, 4.00071] |
| 0.5 | [3.835971515, 3.726472] | [3.835971515, 3.70733] | [3.835971515, 3.73221] |
| 0.6 | [3.835971515, 3.56474] | [3.835971515, 3.58774] | [3.835971515, 3.61103] |
| 0.7 | [3.835971515, 3.42315] | [3.835971515, 3.47562] | [3.835971515, 3.49708] |
| 0.8 | [3.835971515, 3.35] | [3.835971515, 3.3703] | [3.835971515, 3.39085] |
| 0.9 | [3.835971515, 3.25204] | [3.835971515, 3.27117] | [3.835971515, 3.29053] |
| 0.91 | [3.835971515, 3.24256] | [3.835971515, 3.26158] | [3.835971515, 3.28082] |

Table 31 MTTF w. r. t. λ4, λ5, λ6

| Failure rate | MTTF w. r. t. λ4 | MTTF w. r. t. λ5 | MTTF w. r. t. λ6 |
|---|---|---|---|
| 0.1 | [3.835971515, 4.34453] | [3.835971515, 4.36641] | [3.835971515, 4.40186] |
| 0.2 | [3.835971515, 4.1812] | [3.835971515, 4.33703] | [3.835971515, 4.48618] |
| 0.3 | [3.835971515, 4.02971] | [3.835971515, 4.44362] | [3.835971515, 4.6746] |
| 0.4 | [3.835971515, 3.88881] | [3.835971515, 4.57498] | [3.835971515, 4.87625] |
| 0.5 | [3.835971515, 3.75743] | [3.835971515, 4.70988] | [3.835971515, 5.00642] |
| 0.6 | [3.835971515, 3.65638] | [3.835971515, 4.84193] | [3.835971515, 5.26432] |
| 0.7 | [3.835971515, 3.51962] | [3.835971515, 4.9689] | [3.835971515, 5.44485] |
| 0.8 | [3.835971515, 3.42843] | [3.835971515, 5.09006] | [3.835971515, 5.67248] |
| 0.9 | [3.835971515, 3.31011] | [3.835971515, 5.20528] | [3.835971515, 5.77759] |
| 0.91 | [3.835971515, 3.30029] | [3.835971515, 5.21647] | [3.835971515, 5.72164] |


Table 32 MTTF w. r. t. λ7, λ8, λ9

| Failure rate | MTTF w. r. t. λ7 | MTTF w. r. t. λ8 | MTTF w. r. t. λ9 |
|---|---|---|---|
| 0.1 | [3.835971515, 4.54641] | [3.835971515, 4.54378] | [3.835971515, 4.74465] |
| 0.2 | [3.835971515, 4.33949] | [3.835971515, 4.41388] | [3.835971515, 4.33062] |
| 0.3 | [3.835971515, 4.33309] | [3.835971515, 4.32902] | [3.835971515, 4.24552] |
| 0.4 | [3.835971515, 4.38629] | [3.835971515, 4.58728] | [3.835971515, 4.25222] |
| 0.5 | [3.835971515, 4.42506] | [3.835971515, 4.61997] | [3.835971515, 4.29603] |
| 0.6 | [3.835971515, 4.48312] | [3.835971515, 4.72925] | [3.835971515, 4.35682] |
| 0.7 | [3.835971515, 4.57598] | [3.835971515, 4.79535] | [3.835971515, 4.42547] |
| 0.8 | [3.835971515, 4.60006] | [3.835971515, 4.82485] | [3.835971515, 4.49736] |
| 0.9 | [3.835971515, 4.65637] | [3.835971515, 4.95941] | [3.835971515, 4.56997] |
| 0.91 | [3.835971515, 4.66189] | [3.835971515, 4.96724] | [3.835971515, 4.57721] |

Table 33 MTTF w. r. t. λ10, λ11, λ12

| Failure rate | MTTF w. r. t. λ10 | MTTF w. r. t. λ11 | MTTF w. r. t. λ12 |
|---|---|---|---|
| 0.1 | [3.835971515, 4.70832] | [3.835971515, 4.8825] | [3.835971515, 4.86329] |
| 0.2 | [3.835971515, 4.35621] | [3.835971515, 4.41449] | [3.835971515, 4.43705] |
| 0.3 | [3.835971515, 4.30557] | [3.835971515, 4.31137] | [3.835971515, 4.35555] |
| 0.4 | [3.835971515, 4.33882] | [3.835971515, 4.32792] | [3.835971515, 4.3658] |
| 0.5 | [3.835971515, 4.40521] | [3.835971515, 4.33622] | [3.835971515, 4.41197] |
| 0.6 | [3.835971515, 4.48602] | [3.835971515, 4.38514] | [3.835971515, 4.4741] |
| 0.7 | [3.835971515, 4.57281] | [3.835971515, 4.44231] | [3.835971515, 4.54334] |
| 0.8 | [3.835971515, 4.66133] | [3.835971515, 4.5031] | [3.835971515, 4.61526] |
| 0.9 | [3.835971515, 4.75942] | [3.835971515, 4.56502] | [3.835971515, 4.6875] |
| 0.91 | [3.835971515, 4.73825] | [3.835971515, 4.57122] | [3.835971515, 4.69469] |

Table 34 MTTF w. r. t. λ13, λ14, λ15

| Failure rate | MTTF w. r. t. λ13 | MTTF w. r. t. λ14 | MTTF w. r. t. λ15 |
|---|---|---|---|
| 0.1 | [3.835971515, 4.6731] | [3.835971515, 4.92123] | [3.835971515, 4.73276] |
| 0.2 | [3.835971515, 4.48467] | [3.835971515, 4.7127] | [3.835971515, 4.53959] |
| 0.3 | [3.835971515, 4.31085] | [3.835971515, 4.34452] | [3.835971515, 4.34484] |
| 0.4 | [3.835971515, 4.14999] | [3.835971515, 4.21095] | [3.835971515, 4.19757] |
| 0.5 | [3.835971515, 4.00071] | [3.835971515, 4.0297] | [3.835971515, 4.04436] |
| 0.6 | [3.835971515, 3.87352] | [3.835971515, 3.89945] | [3.835971515, 3.91162] |
| 0.7 | [3.835971515, 3.73221] | [3.835971515, 3.75743] | [3.835971515, 3.77016] |
| 0.8 | [3.835971515, 3.56542] | [3.835971515, 3.70842] | [3.835971515, 3.66255] |
| 0.9 | [3.835971515, 3.49748] | [3.835971515, 3.63464] | [3.835971515, 3.53019] |
| 0.91 | [3.835971515, 3.48652] | [3.835971515, 3.6228] | [3.835971515, 3.51962] |

Table 35 MTTF w. r. t. λ16

| Failure rate | MTTF w. r. t. λ16 |
|---|---|
| 0.1 | [3.835971515, 4.75298] |
| 0.2 | [3.835971515, 4.55819] |
| 0.3 | [3.835971515, 4.37873] |
| 0.4 | [3.835971515, 4.21557] |
| 0.5 | [3.835971515, 4.05912] |
| 0.6 | [3.835971515, 3.82498] |
| 0.7 | [3.835971515, 3.78299] |
| 0.8 | [3.835971515, 3.65855] |
| 0.9 | [3.835971515, 3.54203] |
| 0.91 | [3.835971515, 3.35079] |

Fig. 14 MTTF versus failure rates

MTTF with respect to Terminal Reliability: Fig. 8 shows that as the values of λ1, λ2, λ3, λ4, λ5 and λ6 increase, the lower bound of the MTTF of the considered SEN+ decreases, while the upper bound of the MTTF remains constant for all values of these failure rates. Fig. 9 shows that the upper bound of the MTTF of the network decreases slowly with increasing values of λ1, λ2, λ3 and λ6 and increases with increasing values of λ4 and λ5, while the lower bound stays unaltered for all failure rates.

MTTF with respect to Broadcast Reliability:


Fig. 15 MTTF versus failure rate

Fig. 11 shows that the lower bound of the MTTF decreases with increasing values of λ1, λ4, λ5, λ6, λ7, λ8, λ9, λ10 and λ11 and increases with increasing λ2 and λ3. The upper bound of the MTTF remains constant with respect to all these parameters. Fig. 12 shows that the upper bound of the MTTF of the considered network decreases with increasing values of λ1, λ4, λ5, λ6, λ7, λ8, λ9, λ10 and λ11 and increases with increasing values of λ2 and λ3. The lower bound of the MTTF remains unchanged for all these parameters.

MTTF with respect to Network Reliability: Fig. 14 shows that as the failure rates λ1, λ3, λ4, λ7, λ8, λ9, λ10, λ11, λ12, λ13, λ14, λ15 and λ16 increase, the lower bound of the MTTF of the proposed SEN+ decreases, whereas it increases with increasing λ6. As λ2 and λ5 increase, the lower bound of the MTTF first decreases and then increases. Here, too, the upper bound remains constant. Fig. 15 shows that the upper bound of the MTTF decreases with increasing values of λ1, λ2, λ3, λ4, λ7, λ8, λ9, λ10, λ11, λ12, λ13, λ14, λ15 and λ16 and increases with increasing λ6. If λ5 is increased, the upper bound of the MTTF first decreases and then increases. The lower bound of the MTTF remains the same for all the failure rates.


The results show that the lower bound of the MTTF of the network is lowest with respect to λ1 and highest with respect to λ4, and both values correspond to the NR of the network. These lowest and highest values are 0.1562452 and 20.39972, respectively. The upper bound of the MTTF is lowest with respect to λ4, with value 1.255834, and highest with respect to λ2, with value 27.954032. The lowest and highest values of the upper bound of the MTTF correspond to the BR of the SEN+.

10 Conclusion

In this work, we have considered a SEN+ whose component probabilities are not known precisely and determined its reliability using the IUGF method. The transient state probabilities are obtained as intervals. The reliability is evaluated with respect to three measures: Terminal, Broadcast, and Network reliability. In addition, if any SE fails and the network stops operating, that SE can be replaced to keep the network in operation. The proposed SEN+ is of size 8 × 8. The RBDs for the Terminal, Broadcast, and Network reliability of the SEN+ have been presented. The differential equations governing the network's behavior have been obtained with the supplementary variable technique. The probabilities of all the components are obtained using the Laplace transform method, and hence the reliability of the network is computed. Furthermore, the MTTF is examined, and finally the model is demonstrated with a numerical example.


Modeling Software Vulnerability Injection-Discovery Process Incorporating Time-Delay and VIKOR Based Ranking

Mohini Agarwal, Deepti Aggrawal, Subhrata Das, Adarsh Anand, and Navneet Bhatt

Abstract Researchers have widely focused on software quality attributes such as reliability and maintainability. However, the growing reliance on software and software-based products has redirected the focus of researchers and engineers towards security. A vulnerability in software can be due to design flaws, implementation errors, configuration errors, etc.; it makes the software prone to attacks and can be used for malicious activities. Timely detection and fixing of these loopholes supports the development of safe and secure software and minimizes the effort and resources required to fix them afterwards. With the aim of modeling the vulnerability discovery process, this chapter proposes a time delay-based formulation for vulnerability injection and discovery, modeled using infinite server queuing theory. Two vulnerability discovery data sets have been used for empirical validation. Further, VIKOR, a well-known Multi Criteria Decision Making (MCDM) technique, has been used to rank the proposed models.

Keywords Discovery · Injection · Security breaching · VIKOR · Vulnerability

M. Agarwal
Amity School of Business, Amity University Uttar Pradesh, Noida 201303, India

D. Aggrawal
University School of Management and Entrepreneurship, Delhi Technological University, Delhi, India
e-mail: [email protected]

S. Das · A. Anand (B)
Department of Operational Research, University of Delhi, Delhi 110007, India

N. Bhatt
Anil Surendra Modi School of Commerce, SVKM’s Narsee Monjee Institute of Management Studies (Deemed to be University), Mumbai 400056, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_10


M. Agarwal et al.

1 Introduction

Software security is crucial to ensure data security and prevent the misuse of software [28]. Software vulnerabilities are bugs or loopholes in a software system that make it prone to breaches [22]. The presence of a vulnerability is a potential risk to the software, whereas an exploit is a piece of code that takes advantage of a security flaw. The vulnerability discovery process is like a security check for programmers and developers and assists in improving software quality. From the viewpoint of software security, it is very important to review whether the bugs that are discovered might be exploited by a hacker. A software vulnerability follows a life cycle that consists of injection, discovery, disclosure, patch and exploitation [8]. Not all software vulnerabilities are equally harmful; they are analyzed based on their ease of detection, their exploitability and the damage they can cause. In the arena of software engineering, fault removal and vulnerability handling have been given the utmost importance [10]. The most convenient way to deal with vulnerabilities is to provide software patches, i.e., corrective code that closes the loophole. Software firms tend to deploy software updates as soon as possible, and most vulnerability exploitations can be avoided by applying these patches. Common Vulnerabilities and Exposures (CVE), the vulnerability database, provides instances of publicly disclosed loopholes present in any software. Many instances of vulnerabilities being exploited can also be found there. As quoted by [32], the first half of 2019 saw more than 3800 publicly disclosed software breaches exposing 4.1 billion compromised records. It is even more astonishing that 3.2 billion of these records were exposed by only eight breaches. Moreover, in 2017 the infamous WannaCry ransomware affected the software community on a massive scale [23].

These breaches affect the software and its dependent processes as well, which makes it a prime responsibility to discover the loopholes before attackers exploit them, even though a huge amount of resources is required for finding, disclosing, and fixing the attack-prone areas of susceptible software. The discovery process is undertaken both internally and externally: testers (internal, part of the development team) and users (external) are jointly engaged in discovering vulnerabilities throughout the lifespan of the software. As described above, the life cycle of a vulnerability consists of injection, discovery, disclosure, patch release and exploitation [4], and researchers have worked extensively on each stage of the vulnerability life cycle, be it injection, discovery and disclosure, or patch scheduling policies. The present framework differs from some of the previous works in several aspects, as summarized in Table 1.

The chapter is arranged as follows: after the brief discussion of the importance of vulnerabilities in Sect. 1, a brief review of the literature on the vulnerability discovery process is given in Sect. 2. The methodology and model development are discussed in Sect. 3, followed by data analysis and model validation in Sect. 4. Section 5 presents the conclusion, followed by the list of references.

Table 1 Contrast between current and previous research

| Aspect | Present research | [4] | [15] | Research on concept of VDM* |
|---|---|---|---|---|
| Utility of any MCDM technique for model selection | Yes | Yes | No | No |
| Time delay concept in VDM | Yes | No | No | No |
| Modeling vulnerability discovery process | Yes | Yes | Yes | Yes |
| Role of Infinite Server Queuing theory | Yes | No | No | No |

* Studies like [1, 2] and [27]

2 Literature Review

Vulnerability Discovery Models (VDMs) are probabilistic models that help in mathematically assessing and quantifying the vulnerabilities present in software. Several practitioners have developed mathematical structures based on different distribution functions [2, 4, 6, 7, 17, 18, 27]. The earliest software vulnerability discovery model was Anderson’s Thermodynamic Model [5]. In 2005, Rescorla presented linear and exponential vulnerability detection processes, whereas the Alhazmi and Malaiya model is an S-shaped, logistic VDM in which the vulnerability discovery process is described by three phases: learning, linear and saturation [2]. Alhazmi and Malaiya [2] also proposed an effort-based model for the effort consumed, in terms of resources and budget, in vulnerability detection. Hanebutte and Oman [13] analyzed software faults and vulnerabilities and concluded that these are not distinct categories but two interrelated sets that should not be treated independently. Woo et al. [33] quantitatively examined the Vulnerability Discovery Process (VDP) for two major servers, Apache and IIS, assessing feasibility on the basis of both time and effort. Younis et al. [35] investigated the applicability of a VDM called the Folded model, based on the folded normal distribution, and compared it with Alhazmi and Malaiya’s Logistic (AML) model [2]. The work of [21] explores various vulnerability detection techniques, mainly static analysis, fuzzing and penetration testing. Massacci and Nguyen [24] proposed an empirical framework for evaluating the performance of existing VDMs in terms of quality and predictability and demonstrated the superiority of their work over traditional approaches. They further established that the simplest linear model works best for the first 6–12 months after the release date; otherwise, logistic-based models are better choices.

[11] suggested that the probability of exploitation rises with the vulnerabilities that are inherent to a software organization, the effectiveness of cyberattacks, and the hacker’s advantage, but falls with the allocation of resources for security and the associated financial loss of the organization. Additionally, they investigated three central decision instances in which security investment, information sharing, or both would respectively be regulated by a social planner. Likewise, security has been studied in depth in [12, 20, 31, 34]. Kapur et al. [17] proposed a VDM based on a logistic rate, whereas the model proposed by [4] uses a hump-shaped vulnerability detection process. Schatz and Bashroush [29] conducted a systematic literature review to identify and analyze the state of the art in the economic assessment of information security investments in organizations. Kaur et al. [18] discussed a vulnerability correction model that captures the time gap between the correction of leading and dependent vulnerabilities. Kudjo et al. [19] examined the statistical significance of annual seasonal patterns and trends in vulnerability discovery using a weighted moving window, and further worked on regression-based models as vulnerability predictors.

The literature shows that the above research focuses on assessing vulnerabilities based on different distribution patterns. The work presented in this chapter, in contrast, quantifies the vulnerabilities present in a software system on the basis of an injection-based discovery process, in order to understand the rationale behind the introduction of vulnerabilities and its impact on their discovery pattern. The concept has its roots in the field of software reliability, where the fault removal phenomenon depends significantly on the fault detection process and the time delay between detection and correction [16].

3 Methodology

In this chapter, an integrated modeling approach is proposed based on infinite server queuing theory, following the idea given by [14]. We formulate different sets of models to illustrate several scenarios of vulnerability discovery using the unification approach. Because several models are developed, it is essential to select the best one; for this selection we use a ranking approach, VIKOR [25, 26, 30]. In the VIKOR approach, several comparison criteria are calculated for all the models, and each alternative is compared with the others based on the best calculated value of each attribute.

3.1 Modeling Framework

Infinite server queuing theory has been used extensively in various fields, such as software engineering [16] and managerial decision making [3], to name a few. The proposed modeling framework builds on the infinite server queuing theory used by [3], which helps in describing the time lag between a vulnerability being injected and its discovery. We consider the case in which the vulnerability discovery process follows a Non-Homogeneous Poisson Process (NHPP). The present modeling framework is based on the following assumptions:

• Injection is synonymous with the introduction of vulnerabilities as part of the software launch.
• The vulnerability injection and discovery processes follow a Non-Homogeneous Poisson Process (NHPP).
• The initial count of vulnerabilities present is constant, and the discovery process has no influence on this initial number.
• The count of non-discovered vulnerabilities affects the discovered vulnerabilities.

The following notation is used:

$V_i(t)$: Mean value function for the vulnerability injection process
$V_d(t)$: Mean value function for the vulnerability discovery process
$v$: Number of vulnerabilities in the software
$\lambda, \beta$: Shape and scale parameters

Let the counting processes $\{M(t), t \ge 0\}$ and $\{N(t), t \ge 0\}$ represent the number of vulnerabilities injected and discovered, respectively, up to time $t$, with the process starting at time $t = 0$. Then the distribution of $N(t)$ is given by [3]:

$$\Pr\{N(t) = n\} = \sum_{m=0}^{\infty} \Pr\{N(t) = n \mid M(t) = m\}\, \frac{(V_i(t))^{m} e^{-V_i(t)}}{m!} \tag{1}$$

Taking the count of vulnerabilities introduced to be $m$, the probability that $n$ loopholes are discovered while the system undergoes the discovery process is:

$$\Pr\{N(t) = n \mid M(t) = m\} = \binom{m}{n} (p(t))^{n} (1 - p(t))^{m-n} \tag{2}$$

where $p(t)$ represents the probability that an arbitrary vulnerability is discovered by time $t$, which can be defined using the Stieltjes convolution and the conditional distribution of arrival times:

$$p(t) = \int_{0}^{t} F_d(t - u)\, \frac{dV_i(u)}{V_i(t)} \tag{3}$$


The probability mass function of the cumulative count of vulnerabilities discovered by time $t$ is then:

$$\Pr\{N(t) = n\} = \frac{\left(\int_{0}^{t} F_d(t-u)\, dV_i(u)\right)^{n} e^{-\int_{0}^{t} F_d(t-u)\, dV_i(u)}}{n!} \tag{4}$$

Equation (4) shows that $N(t)$ follows an NHPP with MVF $\int_{0}^{t} F_d(t-u)\, dV_i(u)$. Therefore,

$$V_d(t) = \int_{0}^{t} F_d(t-u)\, dV_i(u) \tag{5}$$
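Eq. (5) can also be evaluated numerically, which is a convenient way to check any closed form derived from it. The following is a minimal sketch, not the authors' implementation: the function name `discovered_mvf`, the step count, and the parameter values are hypothetical, and the example assumes an injection MVF $V_i(t) = v(1 - e^{-\lambda t})$ with $V_i(0) = 0$ and an exponential discovery-time distribution. A jump of $V_i$ at $u = 0$ (the constant injection pattern) is not captured by this sum and would have to be added separately.

```python
import math

def discovered_mvf(t, f_d, v_i, steps=2000):
    """Approximate V_d(t) = integral_0^t F_d(t - u) dV_i(u), as in Eq. (5).

    The Stieltjes integral is approximated by a midpoint sum over the
    increments of V_i, so V_i does not have to be differentiated.
    """
    if t <= 0:
        return 0.0
    h = t / steps
    total = 0.0
    for k in range(steps):
        u_lo, u_hi = k * h, (k + 1) * h
        u_mid = 0.5 * (u_lo + u_hi)
        total += f_d(t - u_mid) * (v_i(u_hi) - v_i(u_lo))
    return total

# Hypothetical parameter values, chosen only for illustration.
V, LAM = 100.0, 0.3

def v_i(u):
    # Assumed exponential injection MVF with V_i(0) = 0.
    return V * (1.0 - math.exp(-LAM * u))

def f_d(x):
    # Assumed exponential discovery-time distribution function.
    return 1.0 - math.exp(-LAM * x)

approx = discovered_mvf(5.0, f_d, v_i)
# Closed form for the same-rate exponential case, for comparison.
exact = V * (1.0 - (1.0 + LAM * 5.0) * math.exp(-LAM * 5.0))
print(approx, exact)
```

Under these assumptions the two printed values should agree closely, since the closed form used for comparison is the same-rate exponential case of Eq. (5).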

Henceforth, the mean value function for vulnerability injection and the distribution pattern of vulnerability discovery can assist in determining the count of vulnerabilities discovered in the system at any instant of time. For analytical understanding, eight different cases are considered.

VDM-I: Assume that the injection of vulnerabilities follows a constant pattern whereas the discovery process follows an exponential pattern, i.e., $V_i(t) \sim 1(t)$ and $F_d(t) \sim \exp(\lambda)$. Substituting these functional forms in Eq. (5) leads to:

$$V_d(t) = v\left(1 - e^{-\lambda t}\right) \tag{6}$$

The functional form given by Eq. (6) is like Rescorla’s Exponential (RE) model (2005).

VDM-II: Consider another case in which injection follows the constant pattern whereas discovery follows an S-shaped pattern that incorporates learning, i.e., $V_i(t) \sim 1(t)$ and $F_d(t) \sim \mathrm{logistic}(\lambda, \beta)$. Substituting these functional forms in Eq. (5) leads to:

$$V_d(t) = v\, \frac{1 - e^{-\lambda t}}{1 + \beta e^{-\lambda t}} \tag{7}$$

Equation (7) defines the logistic distribution-based vulnerability discovery model under a constant pattern of vulnerability injection, which is in line with Alhazmi and Malaiya’s Logistic (AML) model (2005).

VDM-III: Here we assume that vulnerability injection remains the same over time and that the vulnerability discovery process follows a Weibull distribution, i.e., $V_i(t) \sim 1(t)$ and $F_d(t) \sim \mathrm{Weibull}(\lambda, k)$. Substituting these functional forms in Eq. (5) leads to:

$$V_d(t) = v\left(1 - e^{-\lambda t^{k}}\right) \tag{8}$$

The functional form of the VDM stated in Eq. (8) is equivalent to the model proposed by [15], where the researcher worked with the Weibull distribution for vulnerability discovery.

VDM-IV: Let the injection and discovery distribution functions follow a one-stage Erlang distribution with the same rate, i.e., $V_i(t) \sim \exp(\lambda)$ and $F_d(t) \sim \exp(\lambda)$. Substituting in Eq. (5) gives:

$$V_d(t) = v\left(1 - (1 + \lambda t)e^{-\lambda t}\right) \tag{9}$$
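The substitution behind Eq. (9) can be worked out directly from Eq. (5). The steps below are a sketch under the assumption that $V_i(t) \sim \exp(\lambda)$ means $V_i(t) = v(1 - e^{-\lambda t})$ and that $F_d(t) = 1 - e^{-\lambda t}$; the chapter does not spell these forms out.

```latex
\begin{aligned}
dV_i(u) &= v\lambda e^{-\lambda u}\,du, \qquad F_d(t-u) = 1 - e^{-\lambda (t-u)},\\
V_d(t) &= \int_0^t \left(1 - e^{-\lambda (t-u)}\right) v\lambda e^{-\lambda u}\,du
        = v\lambda \int_0^t \left(e^{-\lambda u} - e^{-\lambda t}\right) du\\
       &= v\left(1 - e^{-\lambda t}\right) - v\lambda t\, e^{-\lambda t}
        = v\left(1 - (1 + \lambda t)\,e^{-\lambda t}\right).
\end{aligned}
```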

VDM-V: Consider the case in which vulnerability injection and discovery both follow an exponential pattern with different rates, i.e., $V_i(t) \sim \exp(\lambda_1)$ and $F_d(t) \sim \exp(\lambda_2)$. Using Eq. (5) we get:

$$V_d(t) = v\left[1 - \frac{1}{\lambda_1 - \lambda_2}\left(\lambda_1 e^{-\lambda_2 t} - \lambda_2 e^{-\lambda_1 t}\right)\right] \tag{10}$$

VDM-VI: Consider the vulnerability injection intensity to change exponentially over time and the vulnerability discovery process to follow a logistic distribution, i.e., $V_i(t) \sim \exp(\lambda)$ and $F_d(t) \sim \mathrm{logistic}(\lambda, \beta)$. Again making use of Eq. (5), we get:

$$V_d(t) = v\left(1 - e^{-\lambda t}\right) + v(1 + \beta)e^{-\lambda t} \log\left[\frac{(1 + \beta)e^{-\lambda t}}{1 + \beta e^{-\lambda t}}\right] \tag{11}$$

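VDM-VII: Assume that the intensities of the vulnerability injection and discovery processes both follow a logistic distribution, i.e., $V_i(t) \sim \mathrm{logistic}(\lambda, \beta)$ and $F_d(t) \sim \mathrm{logistic}(\lambda, \beta)$. Again making use of Eq. (5), we get:

$$V_d(t) = v\left[\frac{1 - e^{-\lambda t}}{1 - \beta^{2} e^{-\lambda t}}\right] + v\, \frac{(1 + \beta)^{2} e^{-\lambda t}}{\left(1 - \beta^{2} e^{-\lambda t}\right)^{2}} \left[\lambda t + 2\log\left(\frac{(1 + \beta)e^{-\lambda t}}{1 + \beta e^{-\lambda t}}\right)\right] \tag{12}$$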

VDM-VIII: Yet another case has a constant pattern for the injection intensity, while the vulnerability discovery process follows a hump-shaped rate of detection, i.e., $V_i(t) \sim 1(t)$ and $F_d(t) \sim$ hump-shaped distribution$(\lambda, \beta)$. Substituting these functional forms in Eq. (5) leads to:

$$V_d(t) = v\left[1 - e^{-\lambda\left(\frac{1}{1 + \beta e^{-\lambda t}} - \frac{1}{1 + \beta}\right)}\right] \tag{13}$$

Equation (13) defines the hump-shaped vulnerability discovery model under a constant pattern of vulnerability injection, which is in line with [4].

246

M. Agarwal et al.

Further, the Eqs. (6)–(13) can assist in determining the count of discovered vulnerabilities based on the pattern of vulnerability injection which can be useful instruments for framing patching and update schedules.
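To illustrate how these closed forms might be evaluated in practice, the following minimal sketch codes Eqs. (8), (9), (10) and (13) as plain Python functions. The function names and the parameter values in the usage lines are hypothetical choices for illustration, not part of the chapter, and the formulas are coded as read from the equations above.

```python
import math

def vdm_weibull(t, v, lam, k):
    """Eq. (8): constant injection, Weibull discovery."""
    return v * (1.0 - math.exp(-lam * t ** k))

def vdm_erlang_same_rate(t, v, lam):
    """Eq. (9): exponential injection and discovery with a common rate."""
    return v * (1.0 - (1.0 + lam * t) * math.exp(-lam * t))

def vdm_exp_two_rates(t, v, lam1, lam2):
    """Eq. (10): exponential injection (lam1) and discovery (lam2), lam1 != lam2."""
    if lam1 == lam2:
        raise ValueError("Eq. (10) needs distinct rates; use Eq. (9) instead")
    mix = (lam1 * math.exp(-lam2 * t) - lam2 * math.exp(-lam1 * t)) / (lam1 - lam2)
    return v * (1.0 - mix)

def vdm_hump(t, v, lam, beta):
    """Eq. (13): constant injection, hump-shaped discovery rate."""
    expo = -lam * (1.0 / (1.0 + beta * math.exp(-lam * t)) - 1.0 / (1.0 + beta))
    return v * (1.0 - math.exp(expo))

# Illustrative (assumed) parameter values, time measured in arbitrary periods.
for t in (0, 12, 60):
    print(t, vdm_erlang_same_rate(t, v=5000.0, lam=0.07),
          vdm_exp_two_rates(t, v=5000.0, lam1=0.10, lam2=0.04))
```

Each function returns zero at t = 0, and Eq. (10) approaches Eq. (9) as the two rates approach each other, which gives a quick consistency check on any implementation.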

4 Data Analysis and Model Validation

The competency and suitability of a model can be demonstrated by its ability to replicate observed behavior and to forecast from it. To assess the performance of the proposed VDMs, we used data for two software products, Mozilla Firefox (VDS-I) and Windows Server 2008 (VDS-II), obtained from CVE Details, a Common Vulnerabilities and Exposures (CVE) security vulnerability database [9]. Several comparison criteria are then calculated to compare the models.

4.1 Estimation of Parameters

Parameters were estimated with SAS, a comprehensive and flexible statistical software package for data management and analytics. Its Non-Linear Regression (NLR) module was used; this analytical method distinguishes between predictor and response variables. Here, time is the predictor and the number of vulnerabilities is the response variable.
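The chapter's estimates come from SAS. As a rough stand-in for readers without SAS, the sketch below fits the VDM-IV form, V_d(t) = v[1 − (1 + λt)e^{−λt}], by non-linear least squares in plain Python. It relies on the fact that, for fixed λ, the best v has a closed form, so only λ needs a one-dimensional search (a coarse grid followed by golden-section refinement). The function names, search bounds and the synthetic data are assumptions made for illustration; this is not the SAS procedure itself.

```python
import math

def erlang_shape(t, lam):
    """VDM-IV curve with v = 1."""
    return 1.0 - (1.0 + lam * t) * math.exp(-lam * t)

def profile_fit(times, counts, lam):
    """For a fixed rate lam, return the least-squares v and the resulting SSE."""
    g = [erlang_shape(t, lam) for t in times]
    v = sum(y * gi for y, gi in zip(counts, g)) / sum(gi * gi for gi in g)
    sse = sum((y - v * gi) ** 2 for y, gi in zip(counts, g))
    return v, sse

def fit_vdm4(times, counts, lam_lo=1e-3, lam_hi=1.0, grid=400, iters=80):
    """Coarse grid over lam, then golden-section refinement around the best point."""
    step = (lam_hi - lam_lo) / grid
    lams = [lam_lo + i * step for i in range(grid + 1)]
    best = min(range(grid + 1), key=lambda i: profile_fit(times, counts, lams[i])[1])
    a, b = lams[max(best - 1, 0)], lams[min(best + 1, grid)]
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if profile_fit(times, counts, c)[1] < profile_fit(times, counts, d)[1]:
            b = d
        else:
            a = c
    lam = 0.5 * (a + b)
    v, sse = profile_fit(times, counts, lam)
    return v, lam, sse

# Synthetic, noise-free data generated from assumed parameters (not CVE data).
times = list(range(1, 25))
counts = [5000.0 * erlang_shape(t, 0.07) for t in times]
v_hat, lam_hat, sse = fit_vdm4(times, counts)
print(v_hat, lam_hat, sse)
```

With real, noisy counts the same routine applies unchanged; only the data lists would be replaced.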

4.2 Goodness-of-Fit Criteria

To examine the predictive nature of the developed models, comparison criteria were computed to assess the models numerically. Six criteria are used, viz. SSE, MSE, Variance, Root MSE, R-square and Adj. R-square [3].
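As a hedged illustration of how five of these criteria are commonly computed, the sketch below evaluates SSE, MSE, Root MSE, R-square and Adj. R-square from observed and predicted counts. It assumes the usual regression conventions, in particular MSE = SSE / (n − p) with p fitted parameters; the chapter defers the exact definitions (including that of Variance, which is omitted here) to [3]. The function name and the sample numbers are hypothetical.

```python
import math

def fit_criteria(observed, predicted, n_params):
    """Return common goodness-of-fit criteria for a fitted curve."""
    n = len(observed)
    dof = n - n_params                      # residual degrees of freedom (assumed convention)
    sse = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    mse = sse / dof
    mean_y = sum(observed) / n
    sst = sum((y - mean_y) ** 2 for y in observed)
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / dof
    return {"SSE": sse, "MSE": mse, "RootMSE": math.sqrt(mse), "R2": r2, "AdjR2": adj_r2}

# Tiny made-up example: five observations and a two-parameter model.
print(fit_criteria([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0], n_params=2))
```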

4.3 Graphical Approach (Goodness of Fit)

Figures 1 and 2 depict the goodness-of-fit curves, i.e., they display how the observed and the model-predicted vulnerabilities vary over time. Classifying models from a pictorial representation alone can be ambiguous; hence, a more discriminating approach is needed.


Fig. 1 Goodness of fit for VDS-I

Fig. 2 Goodness of fit for VDS-II

4.4 VIKOR Analysis

VIKOR, an important multi-criteria decision-making approach, was proposed by Serafim Opricovic [25, 26, 30] and is also known as the compromise ranking method. The technique helps choose the best option among several alternatives in the presence of conflicting criteria. Its utility lies in facilitating decision making in situations where decision makers are unable to express their preferences. It uses an aggregate function that expresses closeness to the ideal solution in order to rank the alternatives. Along with the ranking, VIKOR also provides a compromise solution with an advantage rate, which maximizes the group utility for the majority and minimizes the individual regret for the opponent. The optimal solution is closest to the ideal solution and farthest from the nadir solution. The steps involved in its implementation are as follows:

Step-1: Establish a matrix of criteria and alternatives.

Step-2: Normalize the decision matrix.


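Step-3: Calculate the weighted normalized decision matrix and, from it, the measures $S_i$ and $R_i$ for each alternative $i$ over the $m$ criteria:

$$S_i = \sum_{j=1}^{m} w_j \frac{x_j^{*} - x_{ij}}{x_j^{*} - x_j^{-}}$$

$$R_i = \max_{j}\left[ w_j \frac{x_j^{*} - x_{ij}}{x_j^{*} - x_j^{-}} \right]$$

Here $x_{ij}$ is the value of criterion $j$ for alternative $i$, $w_j$ is the weight of criterion $j$, and $x_j^{*}$ and $x_j^{-}$ are the best and worst values of criterion $j$.

Step-4: Determine the ideal solution and the nadir (negative ideal) solution:

Ideal solution: $S^{*} = \min_i S_i$, $R^{*} = \min_i R_i$

Nadir solution: $S^{-} = \max_i S_i$, $R^{-} = \max_i R_i$

Step-5: Compute the distance for each alternative.

Step-6: Calculate the relative closeness to the ideal solution:

$$Q_i = v\left(\frac{S_i - S^{*}}{S^{-} - S^{*}}\right) + (1 - v)\left(\frac{R_i - R^{*}}{R^{-} - R^{*}}\right)$$

In this step $v \in [0, 1]$ is the weight given to the strategy of maximum group utility; it is unrelated to the vulnerability count $v$ used in the VDMs.

Step-7: Rank the preference order. The alternative with the smallest value of $Q_i$ is the best solution.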
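The steps above can be sketched in a few lines of Python. The sketch below is a minimal illustration, not the authors' implementation: the function name `vikor`, the equal criterion weights and the value v = 0.5 are assumptions, since the chapter does not state the weights used. The input values are the VDS-I comparison criteria reported in Table 3 (ordered SSE, MSE, Variance, Root MSE, R2, Adj. R2); with these assumptions the sketch appears to reproduce the Table 4 values for several models.

```python
def vikor(matrix, benefit, weights=None, v=0.5):
    """Rank alternatives; matrix maps name -> list of criterion values.

    benefit[j] is True when larger values of criterion j are better.
    Assumes every criterion takes at least two distinct values.
    """
    names = list(matrix)
    m = len(benefit)
    if weights is None:
        weights = [1.0 / m] * m             # equal weights (assumption)
    best, worst = [], []
    for j in range(m):
        col = [matrix[a][j] for a in names]
        best.append(max(col) if benefit[j] else min(col))
        worst.append(min(col) if benefit[j] else max(col))
    S, R = {}, {}
    for a in names:
        gaps = [weights[j] * (best[j] - matrix[a][j]) / (best[j] - worst[j])
                for j in range(m)]
        S[a], R[a] = sum(gaps), max(gaps)
    s_star, s_minus = min(S.values()), max(S.values())
    r_star, r_minus = min(R.values()), max(R.values())
    Q = {a: v * (S[a] - s_star) / (s_minus - s_star)
            + (1 - v) * (R[a] - r_star) / (r_minus - r_star) for a in names}
    return S, R, Q, sorted(names, key=Q.get)

table3 = {  # SSE, MSE, Variance, Root MSE, R2, Adj. R2 (VDS-I)
    "VDM-I":    [502610, 31413.1, 163.65, 177.2, 0.916, 0.916],
    "VDM-II":   [33948, 2424.9, 45.52, 49.2428, 0.994, 0.993],
    "VDM-III":  [24580.3, 1755.7, 39.09, 41.9015, 0.996, 0.995],
    "VDM-IV":   [24346.5, 1623.1, 39.00, 40.2877, 0.996, 0.996],
    "VDM-V":    [24213.1, 1614.2, 38.88938, 40.1772, 0.996, 0.995],
    "VDM-VI":   [31697.4, 2113.2, 44.26052, 45.9692, 0.994, 0.994],
    "VDM-VII":  [24894.4, 1659.6, 39.44352, 40.738, 0.995, 0.995],
    "VDM-VIII": [33957.3, 2425.5, 45.52526, 49.2496, 0.994, 0.993],
}
benefit = [False, False, False, False, True, True]  # errors: lower is better
S, R, Q, ranking = vikor(table3, benefit)
print(ranking)
```

Changing `weights` or `v` shows how sensitive the ranking is to the decision maker's priorities, which is worth checking before relying on a single ordering.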

4.5 Illustrated Examples

The purpose of these examples is to evaluate the appropriateness of the VIKOR method, i.e., to show that an all-inclusive classification of the alternative models can be obtained by considering several attributes of the models for a given dataset.

Example 1: Vulnerability discovery data for Mozilla Firefox (VDS-I) were collected from CVE Details [9] and used for assessment, optimal model selection and ranking of the models on the basis of six performance criteria, i.e., SSE, MSE, Variance, Root MSE, R2 and Adj. R2. The parameter values of the eight proposed models were estimated by the non-linear least squares (NLLS) regression technique and are provided in Table 2. The values of the six performance criteria for each VDM are given in Table 3. From Table 3 alone, it is difficult to select the best model. To overcome this difficulty, we applied the VIKOR method to classify the models by considering all six criteria together; the resulting ranking is shown in Table 4. The ranking is based on the relative closeness value of each model, which is established by treating the six performance measures as joint contributing attributes in the VIKOR analysis. The model with the lowest value is given rank 1, the one with the second lowest value rank 2, and so on. According to the VIKOR results, VDM-IV is ranked first, followed by VDM-V. Hence, the model in which the injection and discovery distribution functions follow a one-stage Erlang distribution with the same rate attains the first rank.

Table 2 Parameter estimates for VDS-I

Parameter | VDM-I | VDM-II | VDM-III | VDM-IV | VDM-V | VDM-VI | VDM-VII | VDM-VIII
v | 9850 | 2670.246 | 4707.227 | 5761.476 | 6524.02 | 2870.237 | 4818.152 | 14,853.58
λ1 | 0.010351 | 0.210592 | 0.003688 | 0.067987 | 0.101144 | 0.199337 | 0.089582 | 0.210377
λ2 | - | - | - | - | 0.040864 | - | - | -
β | - | 14.67787 | - | - | - | 4.21 | 0.23 | 16.25926
k | - | - | 1.734185 | - | - | - | - | -

Table 3 Comparison criteria for VDS-I

Criteria | VDM-I | VDM-II | VDM-III | VDM-IV | VDM-V | VDM-VI | VDM-VII | VDM-VIII
SSE | 502,610 | 33,948 | 24,580.3 | 24,346.5 | 24,213.1 | 31,697.4 | 24,894.4 | 33,957.3
MSE | 31,413.1 | 2424.9 | 1755.7 | 1623.1 | 1614.2 | 2113.2 | 1659.6 | 2425.5
Variance | 163.65 | 45.52 | 39.09 | 39.00 | 38.88938 | 44.26052 | 39.44352 | 45.52526
Root MSE | 177.2 | 49.2428 | 41.9015 | 40.2877 | 40.1772 | 45.9692 | 40.738 | 49.2496
R2 | 0.916 | 0.994 | 0.996 | 0.996 | 0.996 | 0.994 | 0.995 | 0.994
Adj. R2 | 0.916 | 0.993 | 0.995 | 0.996 | 0.995 | 0.994 | 0.995 | 0.993

Table 4 VIKOR analysis for VDS-I

Models | Si | Ri | Qi | Rank
VDM-I | 1 | 0.166667 | 1 | 8
VDM-II | 0.038227 | 0.011027 | 0.05159777 | 6
VDM-III | 0.005368 | 0.002097 | 0.00834962 | 3
VDM-IV | 0.000378 | 0.000148 | 0 | 1
VDM-V | 0.002083 | 0.002083 | 0.0066646 | 2
VDM-VI | 0.027952 | 0.007175 | 0.03489317 | 5
VDM-VII | 0.00608 | 0.002083 | 0.00866386 | 4
VDM-VIII | 0.038249 | 0.011035 | 0.05163355 | 7

Example 2: Vulnerability discovery data for Windows Server 2008 (VDS-II) were collected from CVE Details [9] and used for assessment, optimal model selection and ranking of the models on the basis of the same six performance criteria, i.e., SSE, MSE, Variance, Root MSE, R2 and Adj. R2. The parameter values of the eight proposed models were estimated by the non-linear least squares (NLLS) regression technique and are provided in Table 5. The values of the six performance criteria for each VDM are given in Table 6. As before, it is difficult to judge from this table which model performs best. Thus, the VIKOR approach was used to rank the models on all six criteria taken together; the result is shown in Table 7. The model with the lowest value is given rank 1, the one with the second lowest value rank 2, and so on. The results show that VDM-II is ranked first, followed by VDM-VI.

The outcomes of the above examples show that no single model explains every scenario best, and that the predictive capabilities of the models vary accordingly. Also, VDM-I is ranked 8 on both data sets, implying that exponential growth in vulnerability discovery is least suited to describing the discovery pattern.

Table 5 Parameter estimates for VDS-II

Parameter | VDM-I | VDM-II | VDM-III | VDM-IV | VDM-V | VDM-VI | VDM-VII | VDM-VIII
v | 8402.65 | 2996.618 | 4519 | 9614 | 8364.35 | 6143.2 | 3787.1 | 9314.05
λ1 | 0.010504 | 0.253477 | 0.001766 | 0.049576 | 0.065066 | 0.150076 | 0.187941 | 0.296056
λ2 | - | - | - | - | 0.045 | - | - | -
k | - | - | 2.069643 | - | - | - | - | -
β | - | 29.40423 | - | - | - | 6.696046 | 2.019402 | 36.593999

Table 6 Comparison criteria for VDS-II

Criteria | VDM-I | VDM-II | VDM-III | VDM-IV | VDM-V | VDM-VI | VDM-VII | VDM-VIII
SSE | 359,416 | 18,222.6 | 28,684.3 | 36,223 | 39,928.1 | 24,074.5 | 31,869.6 | 24,474.7
MSE | 29,951.3 | 1656.6 | 2607.7 | 3018.6 | 3327.3 | 2188.6 | 2897.2 | 2225
Variance | 162.9228 | 38.96507 | 48.43078 | 54.6645 | 57.27339 | 43.9689 | 50.80567 | 44.9426
Root MSE | 173.1 | 40.7014 | 51.0653 | 54.9416 | 57.6831 | 46.7824 | 53.826 | 47.1696
R2 | 0.853 | 0.992 | 0.988 | 0.985 | 0.983 | 0.990 | 0.987 | 0.990
Adj. R2 | 0.853 | 0.991 | 0.987 | 0.985 | 0.983 | 0.989 | 0.985 | 0.989

Table 7 VIKOR analysis for VDS-II

Models | Si | Ri | Qi | Rank
VDM-I | 1 | 0.166667 | 1 | 8
VDM-II | 0 | 0 | 0 | 1
VDM-III | 0.046113 | 0.013046 | 0.0621956 | 4
VDM-IV | 0.07149 | 0.021109 | 0.09907072 | 6
VDM-V | 0.08689 | 0.024616 | 0.11729418 | 7
VDM-VI | 0.025189 | 0.007655 | 0.03555907 | 2
VDM-VII | 0.059657 | 0.016522 | 0.07939336 | 5
VDM-VIII | 0.027395 | 0.008142 | 0.03812452 | 3

5 Conclusion

This chapter has focused on modeling the impact of vulnerability injection on the discovery process. As the saying goes, "Vulnerability finding increases total software quality"; it is better to discover vulnerabilities before exploiters do. Here, we formulated a unification strategy based on infinite server queuing theory to derive eight different VDMs, each based on a different combination of injection and discovery patterns. These models were tested for prediction capability on two data sets collected from the CVE repository. Moreover, for a more in-depth comparison, the VIKOR ranking procedure was applied and ranks were assigned on the basis of six comparison criteria. It was observed that the predictive capabilities of the proposed models vary with the shape of the data. To the authors' knowledge, this is the first time that VDMs have been studied on the basis of the injection pattern of vulnerabilities. In future, we wish to extend vulnerability discovery modeling with a random lag function approach, which would allow more flexibility in locating loopholes, and to establish its equivalence with the proposed approach.

References

1. Alhazmi OH, Malaiya YK, Ray I (2007) Measuring, analyzing and predicting security vulnerabilities in software systems. Comput Secur 26(3):219–228
2. Alhazmi OH, Malaiya YK (2005) Modeling the vulnerability discovery process. In: Proceedings of the 16th IEEE international symposium on software reliability engineering. IEEE, Chicago, IL, pp 138–147
3. Anand A, Agarwal M, Aggrawal D, Singh O (2016) Unified approach for modeling innovation adoption & optimal model selection for the diffusion process. J Adv Manage Res-An Emerald Insight 13(2):154–178
4. Anand A, Bhatt N (2016) Vulnerability discovery modeling and weighted criteria based ranking. J Indian Soc Probab Stat 1–10
5. Anderson R (2002) Security in open versus closed systems: the dance of Boltzmann, Coase and Moore. Cambridge University, England, Technical report, pp 1–15
6. Arora A, Nandkumar A, Telang R (2006) Does information security attack frequency increase with vulnerability disclosure? An empirical analysis. Inf Syst Front 8(5):350–362
7. Bhatt N, Anand A, Aggrawal D (2019) Improving system reliability by optimal allocation of resources for discovering software vulnerabilities. Int J Qual Reliab Manage
8. Bhatt N, Anand A, Yadavalli VSS, Kumar V (2017) Modeling and characterizing software vulnerabilities. Int J Math, Eng Manage Sci (IJMEMS) 2(4):288–299
9. CVE (2019) https://www.cvedetails.com/. Accessed 20 Jan 2020
10. Chatterjee S, Saha D, Sharma A (2021) Multi-upgradation software reliability growth model with dependency of faults under change point and imperfect debugging. J Softw: Evol Process e2344
11. Gao X, Zhong W, Mei S (2015) Security investment and information sharing under an alternative security breach probability function. Inf Syst Front 17(2):423–438
12. Garg S, Singh RK, Mohapatra AK (2019) Analysis of software vulnerability classification based on different technical parameters. Inf Sec J: A Glob Perspect 28(1–2):1–19
13. Hanebutte N, Oman PW (2005) Software vulnerability mitigation as a proper subset of software maintenance. J Softw Maint Evol Res Pract 17(6):379–400
14. Inoue S, Yamada S (2002) A software reliability growth model based on infinite server queuing theory. In: Proceedings 9th ISSAT international conference on reliability and quality in design. Honolulu, HI, pp 305–309
15. Joh H, Kim J, Malaiya YK (2008) Vulnerability discovery modeling using Weibull distribution. In: 2008 19th international symposium on software reliability engineering (ISSRE). IEEE, pp 299–300
16. Kapur PK, Pham H, Gupta A, Jha PC (2011) Software reliability assessment with OR applications. Springer, London Limited
17. Kapur PK, Sachdeva N, Khatri SK (2015) Vulnerability discovery modeling. In: International conference on quality, reliability, infocom technology and industrial technology management, pp 34–54
18. Kaur J, Anand A, Singh O (2019) Modeling software vulnerability correction/fixation process incorporating time lag. In: Boca Raton FL (ed) Recent advancements in software reliability assurance. CRC Press, pp 39–58
19. Kudjo PK, Chen J, Brown SA, Mensah S (2019) The effect of weighted moving windows on security vulnerability prediction. In: 2019 34th IEEE/ACM international conference on automated software engineering workshop (ASEW). IEEE, pp 65–68
20. Kumar A, Ram M (2018) System reliability analysis based on Weibull distribution and hesitant fuzzy set. Int J Math Eng Manag Sci 3(4):513–521. https://doi.org/10.33889/IJMEMS.2018.3.4-037
21. Liu B, Shi L, Cai Z, Li M (2012) Software vulnerability discovery techniques: a survey. In: 2012 fourth international conference on multimedia information networking and security. IEEE, pp 152–156
22. Liu Q, Xing L (2021) Survivability and vulnerability analysis of cloud RAID systems under disk faults and attacks. Int J Math Eng Manag Sci 6(1):15–29. https://doi.org/10.33889/IJMEMS.2021.6.1.003
23. MSRC Team (2017) Customer Guidance for WannaCrypt attacks. Accessed 25 Jan 2020
24. Massacci F, Nguyen VH (2014) An empirical methodology to evaluate vulnerability discovery models. IEEE Trans Softw Eng 40(12):1147–1162
25. Opricovic S (1998) Multicriteria optimization of civil engineering systems. Faculty Civ Eng, Belgrade 2(1):5–21
26. Opricovic S, Tzeng GH (2004) Compromise solution by MCDM methods: a comparative analysis of VIKOR and TOPSIS. Eur J Oper Res 156(2):445–455
27. Rescorla E (2005) Is finding security holes a good idea? IEEE Secur Priv 3(1):14–19
28. Ryan KT (2016) Software processes for a changing world. J Softw: Evol Process 28(4):236–240
29. Schatz D, Bashroush R (2017) Economic valuation for information security investment: a systematic literature review. Inf Syst Front 19(5):1205–1228
30. Tong LI, Chen CC, Wang CH (2007) Optimization of multi-response processes using the VIKOR method. The Int J Adv Manuf Technol 31(11–12):1049–1057
31. Verma R, Parihar RS, Das S (2018) Modeling software multi up-gradations with error generation and fault severity. Int J Math Eng Manag Sci 3(4):429–437. https://doi.org/10.33889/IJMEMS.2018.3.4-030
32. Winder D (2019) https://www.forbes.com/sites/daveywinder/2019/08/20/data-breaches-expose-41-billion-records-in-first-six-months-of-2019/#6e20808bd549. Accessed 25 Jan 2020
33. Woo SW, Joh H, Alhazmi OH, Malaiya YK (2011) Modeling vulnerability discovery process in Apache and IIS HTTP servers. Comput Secur 30(1):50–62
34. Yang SS, Choi H, Joo H (2010) Vulnerability analysis of the grid data security authentication system. Inf Secur J: A Glob Perspect 19(4):182–190
35. Younis A, Joh H, Malaiya Y (2011) Modeling learning less vulnerability discovery using a folded distribution. In: Proceedings of SAM, vol 11, pp 617–623

Assessment of Reliability Function and Signature of Energy Plant Complex System

Monika Negi, Megha Shah, Akshay Kumar, Mangey Ram, and Seema Saini

Abstract The objective of this chapter is to evaluate the performance of a complex system of an energy plant. The model uses the universal generating function (UGF) technique to analyze the performance of the complex repairable condensate system of an energy plant; reliability analysis is used to enhance the capability of the complex system by reducing the failure rate. The illustrated system is divided into six subsystems arranged in a series-parallel combination. Using the z-transformation with the UGF technique, we find the reliability, signature, tail signature, expected time and expected cost. These outcomes provide a beneficial probabilistic approach for analyzing the signature and the reliability of the system in an effective manner.

Keywords Structure function · Signature · Tail signature · Thermo plant energy system · Universal generating function

M. Negi · M. Shah · A. Kumar (B)
Department of Mathematics, Graphic Era Hill University, Dehradun, Uttarakhand, India

M. Ram
Department of Mathematics, Computer Science and Engineering, Graphic Era Deemed To Be University, Uttarakhand, India
Institute of Advanced Manufacturing Technologies, Peter the Great St. Petersburg Polytechnic University, 195251 Saint Petersburg, Russia

S. Saini
Department of Mathematics, Graphic Era Deemed To Be University, Dehradun, Uttarakhand, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_11


1 Introduction

Reliability characterizes the capability of a system and of its elements to perform as required, where each may be perfectly functioning or totally failed. Osaki and Nakagawa [30] gave a short survey of stochastic systems, listing selected references on system reliability models based on Markov techniques. Dhillon [4] considered two stochastic models for studying the availability and reliability of k-out-of-n systems with critical human errors (where a redundant unit stops working because of human error) and common-cause failures, using supplementary variables and the Markov method. Sahner and Trivedi [33] described a software tool for the study of stochastic models: SHARPE (Symbolic Hierarchical Automated Performance Evaluator), which was developed for the design of real systems and has been used for this purpose for many years. An inexact two-stage stochastic programming (ITSP) model was offered by Huang et al. [10] for water resources management under uncertainty. The authors analyzed a hypothetical case study that illustrated the use of the developed method; the solutions obtained were evaluated and interpreted to generate decision alternatives and to identify the significant factors influencing the system's efficiency. Weng [37] explored the combined use of satellite remote sensing, stochastic modeling and geographic information systems (GIS). The investigation revealed that integrating satellite remote sensing and GIS is an efficient technique for evaluating the rate, direction and spatial pattern of land use change.

Zacharof and Butler [38] proposed a model whose chief aim was to support the management of landfill sites, the subsequent leachate and biogas production, so as to optimize the disposal process and reduce the environmental impact. A general analytic model-based technique for the end-to-end performability study of a cloud service was presented in [7], with the approach explained for an Infrastructure-as-a-Service cloud. In contrast to a single one-level monolithic model, the method yields a highly accurate model that is tractable and scalable; it can easily be extended to other kinds of cloud services and is appropriate for private, public and hybrid clouds. Michael et al. [25] explained Markov and discrete event simulation (DES) models, which are useful for failure interaction. They also introduced experimental designs that were performed with both approaches and concluded that the DES model was as precise as the Markov model. Machida et al. [21] presented an automatic model composition approach that derives, from a system design, a hierarchical stochastic model that is a sensible combination of combinatorial and state-space models. The resulting hierarchical stochastic model can then be solved analytically and significantly faster. Andrzejczak [1] described the most common methods used for modeling repairable systems, and explained stochastic processes as examples of reliability models that support maintenance-related decisions.

The availability of casting processes in foundry work was explained in [16] through the Laplace transformation and the supplementary variable method; by applying a probability approach, the defects can be overcome. The outcomes were illustrated by graphs from which the behavior of the model after the appearance of any error can readily be seen. Malik and Munday [22] showed that the productivity of models with hardware redundancy can be enhanced by adopting correct repair policies and elements with low failure rates, and by improving the repair time. Redundancy is also regarded as an instrument for increasing the longevity of systems. Manatos et al. [23] studied a multi-state repairable production system with preventive maintenance, implemented to enhance dependability and performance. They gave a generic model of the production system based on semi-Markov processes for examining the limiting behavior of the system. Ram and Goyal [31] examined the performance of an ATM channel and inspected the dependability measures of a repairable ATM system, helping reliability engineers and designers determine how reliability could be enhanced through suitable alterations. The study also covers a sensitivity analysis of the reliability attributes with respect to changes in the values of the input parameters. Goyal and Ram [8] assessed how a wind power plant performs by examining its reliability measures, with the aim of improving the plant's performance with the aid of different techniques. To secure farm-based biogas plant projects in the long run, Dennehy et al. [3] examined the need to attain stable stocks of co-substrates and to size plants according to the availability of the co-substrates that drive methane production.

Komal [11] discussed the fuzzy reliability of a coal-fired thermal plant using fuzzy methods with membership functions based on triangular fuzzy numbers. Gupta [9] presented an approach for evaluating the performance of the condensate system of a coal-based thermal power plant. The evaluation was carried out through availability matrices developed from various combinations of the repair and failure rates of every subsystem, in order to maximize power generation and keep the plant working in the long term. A two-stage stochastic degradation model was proposed in [5]. In the first stage, the degradation processes of two correlated performance characteristics are described by a correlated bivariate Wiener process; the second stage is modeled by a univariate Wiener process. The system fails completely in the last stage, when both degradation levels reach their thresholds. Explicit expressions for system performance are derived on the basis of maintenance policies and non-identical inspections. Kumar et al. [17] illustrated an unconventional approach for the availability analysis of the Digital Feed Water Control System of a nuclear power plant, which accounts for the maintenance and repair of the main-steam safety valves. A stochastic Petri net is used to model the system; it captures all model prerequisites, the partial failures of the subsystems and common-cause failures.


In the context of the signature reliability of multi-state systems, Levitin [18] addressed a redundancy optimization problem whose principal aim was to reduce the total investment cost while meeting demands, represented by an increasing demand curve, with a stated probability. Levitin [19] illustrated, using the UGF, the reliability evaluation and optimization of models that generalize the consecutive k-out-of-r-from-n:F system consisting of n consecutively ordered multi-state units, where each component has its own states. Levitin [20] gave a comprehensive description of the UGF technique and its application to both binary and multi-state system reliability analysis. Navarro et al. [28, 29] discussed coherent systems with dependent components. The authors also introduced two families of univariate distributions, the hyper-minimal and hyper-maximal distributions; in reliability theory, these families have interesting applications to the lifetime distributions of coherent systems. Samaniego [34] and Samaniego et al. [35] computed dynamic signatures and used them to compare the reliability of used and new systems. The authors showed how a new system can be compared with another that has been burned in up to the kth element failure, and identified the situations in which burn-in is inferior (or superior) to fielding a new system. Coherent systems are pertinent subjects in reliability and survival studies; they were examined in [26], where the signature also serves as a handy tool for calculating system reliability and expected lifetime and for comparing different systems by means of stochastic orderings. There, the elements are independent and identically distributed (i.i.d.) with a common continuous distribution.

Navarro and Rychlik [27] established bounds and comparisons for the expected lifetimes of mixed or coherent systems with independent units having unspecified distributions, depending on the expected lifetimes of the elements and on their domination polynomials, which are based only on the system structure function. Using the reliability functions of the given models, Kumar and Singh [13] evaluated the signature reliability of complex and k-out-of-n coherent systems. Kumar and Singh [14] computed the expected cost and the Barlow-Proschan index from the reliability or structure function of a linear multi-state sliding window system by Owen's method and the UGF technique. Kumar and Singh [12, 15] considered a sliding window coherent system (SWCS) in the event of multiple failures and analyzed the A-within-B-from-D/G SWCS incorporating multiple failures; Owen's method and the UGF method were used to assess the signature and its measures in the proposed model. Rushdi and Ghaleb [32] studied the weighted k-out-of-n complex coherent system and discussed the system's characteristic properties with some effective techniques. Triantafyllou [36] evaluated the signature of the complex (k,d)-out-of-n:F constrained system with the help of a proposed algorithm; the mean lifetime and conditional mean residual lifetime were also computed from the signature expressions.

In this study we consider a thermo power plant complex system arranged in a series-parallel combination and use the UGF method to find the tail signature, signature, mean time to failure and cost. Section 1 reviews the literature on reliability and signature analysis. Section 2 presents formulas for computing the signature and related measures. Section 3 describes the model configuration. Section 4 gives numerical results on the basis of Owen's and the UGF method. Sections 5 and 6 present the results and discussion and the conclusion.

Notation

n = total number of components in the multi-state system
p = working probability of the system
q = failure probability of the system
ϕ, ⊗ = shifting or multiplication operator
R = reliability of the system
E(X) = expected X of the components
E(T) = expected working time of units
S = signature of the system with l components
s = tail signature of system with l components
C = cost of the system having i units
H = number of components which is associated with n units

2 Evolution of the Signature, Minimal Signature, Tail Signature and Cost from Structure Function of the Proposed Model

The signature and related measures of coherent systems with i.i.d. elements can be evaluated, for instance, by order-statistics and reliability-function methods [2, 28, 29], in the following manner. The signature is defined as

$$S_l = \frac{1}{\binom{n}{n-l+1}} \sum_{\substack{H \subseteq [n] \\ |H| = n-l+1}} \phi(H) - \frac{1}{\binom{n}{n-l}} \sum_{\substack{H \subseteq [n] \\ |H| = n-l}} \phi(H) \tag{1}$$

For a system with i.i.d. components, the structure function takes the polynomial form

$$H(p) = \sum_{e=1}^{n} C_e \binom{n}{e} p^{e} q^{n-e}, \qquad C_e = \sum_{i=n-e+1}^{n} s_i, \quad e = 1, 2, \ldots, n \tag{2}$$

The tail signature of the system is

$$S_l = \sum_{i=l+1}^{n} s_i = \frac{1}{\binom{n}{n-l}} \sum_{|H| = n-l} \phi(H)$$

which forms the $(n+1)$-tuple $S = (S_0, \ldots, S_n)$; both the tail signature and the signature of the considered system have to be found. The polynomial function is transformed into $P(X) = X^{n} H\left(\frac{1}{X}\right)$, and by Taylor's expansion the tail signature is

$$S_l = \frac{(n-l)!}{n!} D^{l} P(1), \quad l = 0, 1, \ldots, n \tag{3}$$

following the method proposed in [24]. The signature is then

$$s = S_{l-1} - S_l, \quad l = 1, \ldots, n \tag{4}$$

Finally,

$$E(T) = \mu \sum_{i=1}^{n} \frac{C_i}{i} \quad \text{and} \quad E(X) = \sum_{i=1}^{n} i \cdot s_i, \quad i = 1, 2, \ldots, n$$

are used to determine the expected lifetime and the cost of the system from the reliability function [6, 26]. They are based on the minimal signature and on the number of failed components of the system, with the mean value taken as one.
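For small systems, the subset-counting formula of Eq. (1) can be checked by brute force. The sketch below is an illustrative assumption-laden example rather than the chapter's UGF procedure: it enumerates all subsets of working components of each size, averages the structure function over them, and takes differences. The names `signature`, `tail_signature` and the three toy structure functions are hypothetical.

```python
from itertools import combinations
from math import comb

def mean_phi(n, phi, size):
    """Average of the structure function over all working sets of a given size."""
    total = sum(phi(frozenset(h)) for h in combinations(range(n), size))
    return total / comb(n, size)

def tail_signature(n, phi):
    """Entry l (l = 0..n): probability the system still works after l failures."""
    return [mean_phi(n, phi, n - l) for l in range(n + 1)]

def signature(n, phi):
    """Entry l (l = 1..n): probability that the l-th failure kills the system."""
    tail = tail_signature(n, phi)
    return [tail[l - 1] - tail[l] for l in range(1, n + 1)]

# Toy structure functions on three components; the argument is the working set.
series = lambda working: int(len(working) == 3)
parallel = lambda working: int(len(working) >= 1)
two_of_three = lambda working: int(len(working) >= 2)

print(signature(3, series), signature(3, parallel), signature(3, two_of_three))

# Expected number of failures at system breakdown, sum of i * s_i.
expected_failures = sum(i * s for i, s in enumerate(signature(3, two_of_three), start=1))
```

The enumeration grows as 2^n, so this is only practical as a cross-check for systems with a modest number of components.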

3 Model Description

A thermo power plant system generalizes the coal-fired power plant system. It is a complex arrangement of equipment that converts the chemical energy of fossil fuel into thermal energy. This heat is transferred to water at high temperature and pressure to raise steam, which is used to generate power in a turbine or steam engine. The water is converted into steam with the help of the heat generated by burning coal in the boiler. The exhaust steam is then liquefied to water as it passes through a condenser. The condensed steam (condensate) is driven from X1 (condenser) through X2 (CEP 1) and X3 (CEP 2) into X4 (GS condenser). It then proceeds through X5 (drain cooler) and, after its temperature is raised in X6 (heater 1), X7 (heater 2) and X8 (heater 3), finally reaches X9 (deaerator). The complete working of the condensate system is shown in Fig. 1.

4 Numerical Example
The thermo-plant complex structure comprises nine components. Components 1, 2 and 3 are connected in series, and components 4 to 9 are arranged in a series–parallel combination. The performance of the system depends upon the working condition of the components attached to it. Each unit either works or fails, and the performance rates 1 and 0 are assigned to working and non-working components respectively, as shown in Fig. 1.

Fig. 1 Block diagram of thermo-plant

The system structure function is defined as

$$X = \phi(X_1, X_2, \ldots, X_9) = \max\left(X_1 X_2 X_3 \cdot \max(X_4 X_5, X_6, X_7, X_8),\; X_9\right).$$

To calculate the reliability function of the considered system, each component is written in the form of a u-function

$$u_j(z) = (1 - p_j) z^0 + p_j z^1, \quad 1 \le j \le 9.$$

Using the composition operator $\otimes_\phi$, the u-function of the system is obtained from its structure as $U(z) = \otimes_\phi(u_1(z), \ldots, u_9(z))$.

The reliability function can be obtained recursively through the intermediate variables

X10 = X1 X2 X3
X11 = X4 X5
X12 = max(X11, X6, X7, X8)
X13 = X10 X12
X = max(X9, X13)

The system reliability is now calculated as follows.

$$U_{10}(z) = u_1(z) \otimes_{\min} u_2(z) \otimes_{\min} u_3(z)$$
$$U_{10}(z) = [(1-p_1)z^0 + p_1 z^1] \otimes_{\min} [(1-p_2)z^0 + p_2 z^1] \otimes_{\min} [(1-p_3)z^0 + p_3 z^1]$$
$$U_{10}(z) = [(1-p_1)(1-p_2) + p_1(1-p_2) + p_2(1-p_1) + p_1 p_2 (1-p_3)]z^0 + p_1 p_2 p_3 z^1$$

$$U_{11}(z) = u_4(z) \otimes_{\min} u_5(z) = [(1-p_4)z^0 + p_4 z^1] \otimes_{\min} [(1-p_5)z^0 + p_5 z^1]$$
$$U_{11}(z) = [(1-p_4)(1-p_5) + p_4(1-p_5) + p_5(1-p_4)]z^0 + p_4 p_5 z^1$$

To keep the remaining expressions readable, the bracketed groups that recur in them are abbreviated as

$$Q_{10} = (1-p_1)(1-p_2) + p_1(1-p_2) + p_2(1-p_1) + p_1 p_2(1-p_3),$$
$$Q_{11} = (1-p_4)(1-p_5) + p_4(1-p_5) + p_5(1-p_4),$$
$$P_{78} = p_7(1-p_8) + (1-p_7)p_8 + p_7 p_8.$$

$$U_{12}(z) = U_{11}(z) \otimes_{\max} u_6(z) \otimes_{\max} u_7(z) \otimes_{\max} u_8(z)$$
$$U_{12}(z) = Q_{11}(1-p_6)(1-p_7)(1-p_8) z^0 + P_{12} z^1,$$
$$P_{12} = Q_{11}(1-p_6) P_{78} + [Q_{11} p_6 + p_4 p_5][(1-p_7)(1-p_8) + P_{78}].$$

$$U_{13}(z) = U_{12}(z) \otimes_{\min} U_{10}(z) = Q_{13} z^0 + P_{12}\, p_1 p_2 p_3 z^1,$$
$$Q_{13} = Q_{11}(1-p_6)(1-p_7)(1-p_8)[Q_{10} + p_1 p_2 p_3] + P_{12} Q_{10}.$$

$$U(z) = U_{13}(z) \otimes_{\max} u_9(z) = Q_{13}(1-p_9) z^0 + \{Q_{13}\, p_9 + P_{12}\, p_1 p_2 p_3\} z^1. \tag{5}$$

Hence the reliability function, obtained as the expected value of the u-function (see [20]), is

$$R = Q_{13}\, p_9 + P_{12}\, p_1 p_2 p_3. \tag{6}$$

If all the components are i.i.d., that is, $p_1 = p_2 = \ldots = p_9 = p$, the required reliability function of the proposed system is

$$R = p + 3p^4 - 5p^5 + 5p^7 - 4p^8 + p^9. \tag{7}$$
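The composition steps above can be reproduced mechanically. The sketch below is an illustration under stated assumptions, not the authors' implementation: each u-function is stored as a dictionary from performance level (0 or 1) to a polynomial in p (a coefficient list, lowest power first), and the helper names `poly_mul`, `poly_add`, `compose` and `unit` are hypothetical.

```python
from itertools import product

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_add(a, b):
    """Add two polynomials, padding the shorter one with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def compose(u, v, op):
    """Composition operator: combine two u-functions with op (min or max)."""
    out = {}
    for (gu, pu), (gv, pv) in product(u.items(), v.items()):
        g = op(gu, gv)
        out[g] = poly_add(out.get(g, [0]), poly_mul(pu, pv))
    return out

def unit():
    """u_j(z) = (1 - p) z^0 + p z^1 for an i.i.d. component."""
    return {0: [1, -1], 1: [0, 1]}

u = {j: unit() for j in range(1, 10)}
U10 = compose(compose(u[1], u[2], min), u[3], min)                   # series 1-2-3
U11 = compose(u[4], u[5], min)                                       # series 4-5
U12 = compose(compose(compose(U11, u[6], max), u[7], max), u[8], max)
U13 = compose(U12, U10, min)
U = compose(U13, u[9], max)

R = U[1]  # coefficients of p^0 .. p^9 in the reliability function
print(R)
```

The coefficient list of `R` can be read directly against Eq. (7), and the coefficients of the two performance levels add up to the constant polynomial 1, which is a quick sanity check on the composition.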

The minimal signature is read from the coefficients of Eq. (7): minimal signature = (1, 0, 0, 3, −5, 0, 5, −4, 1). The tail signature is then calculated from the reliability function using Eq. (3) as S = (1, 1, 66/72, 396/504, 1920/3024, 708/1512, 2016/6048, 4032/18144, 4032/36288, 0). Using $s_l = S_{l-1} - S_l$, $l = 1, 2, \ldots, 9$, and the values of the tail signature, the signature is s = (0, 6/72, 4752/36288, 229824/1524096, 762048/4572288, 1233792/9144576, 12192768/109734912, 73156608/658409472, 4032/36288).

The expected lifetime is determined from Eq. (7) with $p = e^{-t}$, that is, for components with unit mean lifetime:

$$E(T) = \int_0^{\infty} \left(e^{-t} + 3e^{-4t} - 5e^{-5t} + 5e^{-7t} - 4e^{-8t} + e^{-9t}\right) dt = 1.075. \tag{8}$$

The expected value of X is obtained from the system signature as

$$E(X) = \sum_{i=1}^{9} i\, s_i = 5.472. \tag{9}$$

Using Eqs. (8) and (9), the expected cost rate of the system is E(X)/E(T) = 5.09.
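The numerical values of this example can be retraced with exact rational arithmetic. The sketch below assumes only the minimal signature of Eq. (7) and the formulas of Eqs. (3) and (4) with μ = 1; the variable and function names are illustrative.

```python
from fractions import Fraction
from math import factorial

n = 9
# C[i] is the coefficient of p^i in H(p) = p + 3p^4 - 5p^5 + 5p^7 - 4p^8 + p^9
C = [0, 1, 0, 0, 3, -5, 0, 5, -4, 1]

# P(X) = X^n H(1/X): the coefficient of X^k is C[n - k]
P = [C[n - k] for k in range(n + 1)]

def derivative_at_one(coeffs, order):
    """Value of the order-th derivative of the polynomial at X = 1."""
    total = 0
    for k, c in enumerate(coeffs):
        if k >= order:
            total += c * (factorial(k) // factorial(k - order))
    return total

# Tail signature S_l = (n - l)! / n! * D^l P(1), Eq. (3)
tail = [Fraction(factorial(n - l), factorial(n)) * derivative_at_one(P, l)
        for l in range(n + 1)]
# Signature s_l = S_{l-1} - S_l
sig = [tail[l - 1] - tail[l] for l in range(1, n + 1)]

E_T = sum(Fraction(C[i], i) for i in range(1, n + 1))     # Eq. (4) with mu = 1
E_X = sum(i * s for i, s in enumerate(sig, start=1))
cost_rate = E_X / E_T

print(float(E_T), float(E_X), float(cost_rate))
```

Because `Fraction` reduces automatically, the printed tail signature entries appear in lowest terms (for example 11/12 instead of 66/72); they are the same numbers as those listed in the text.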

5 Result and Discussion
In this work, the signature is used to compare the system with respect to the working probability and the failure probability of its elements. The mean time to failure, the cost analysis and the failure probability of the elements indicate that the system is in a good state. Furthermore, most of the components are not in a substandard state and will operate effectively. These results suggest that the system can perform well.

6 Conclusion
This chapter examines the efficacy of the components of the thermo-plant energy system, for which the UGF and the system structure have been used to calculate the reliability function. We evaluated both the signature (0, 6/72, 4752/36288, 229824/1524096, 762048/4572288, 1233792/9144576, 12192768/109734912, 73156608/658409472, 4032/36288) and the tail signature (1, 1, 66/72, 396/504, 1920/3024, 708/1512, 2016/6048, 4032/18144, 4032/36288, 0) of the system; these values came out to be low, which makes the model more productive. From the outcomes of the case study, the mean time to failure of the system, 1.075, is high, while the expected cost of the system, 5.09, is moderate. If the water supply fails, the system will collapse. This is one of the major limitations of the considered system.

References
1. Andrzejczak K (2015) Stochastic modelling of the repairable system. J KONBiN 35(1):5–14
2. Boland PJ (2001) Signatures of indirect majority systems. J Appl Probab 38(2):597–603
3. Dennehy C, Lawlor PG, Gardiner GE, Jiang Y, Shalloo L, Zhan X (2017) Stochastic modelling of the economic viability of on-farm co-digestion of pig manure and food waste in Ireland. Appl Energy 205:1528–1537
4. Dhillon BS (1992) Stochastic modelling of k-out-of-n units family of systems. Int J Syst Sci 23(8):1277–1287
5. Dong Q, Cui L, Si S (2020) Reliability and availability analysis of stochastic degradation systems based on bivariate Wiener processes. Appl Math Model 79:414–433
6. Eryilmaz S (2012) The number of failed elements in a coherent system with exchangeable elements. IEEE Trans Reliab 61(1):203–207
7. Ghosh R, Trivedi KS, Naik VK, Kim DS (2010) End-to-end performability analysis for infrastructure-as-a-service cloud: an interacting stochastic models approach. In: 2010 IEEE 16th Pacific rim international symposium on dependable computing, pp 125–132

8. Goyal N, Ram M (2017) Stochastic modelling of a wind electric generating power plant. Int J Qual Reliab Manage 34(1):103–127
9. Gupta S (2019) Stochastic modelling and availability analysis of a critical engineering system. Int J Qual Reliab Manage 36(5):782–796
10. Huang GH, Loucks DP (2000) An inexact two-stage stochastic programming model for water resources management under uncertainty. Civil Eng Syst 17(2):95–118
11. Komal S (2019) Fuzzy reliability analysis of the compressor house unit (CHU) system in a coal fired thermal power plant using TBGFLT technique. Int J Qual Reliab Manage 36(5):686–707
12. Kumar A, Ram M (2019) Computation interval-valued reliability of sliding window system. Int J Math, Eng Manage Sci 4(1):108–115
13. Kumar A, Singh SB (2017) Computations of the signature reliability of the coherent system. Int J Qual Reliab Manage 34(6):785–797
14. Kumar A, Singh SB (2018) Signature reliability of linear multi-state sliding window system. Int J Qual Reliab Manage 35(10):2403–2413
15. Kumar A, Singh SB (2019) Signature A-within-B-from-D/G sliding window system. Int J Math, Eng Manage Sci 4(1):95–107
16. Kumar A, Varshney A, Ram M (2015) Sensitivity analysis for casting process under stochastic modelling. Int J Ind Eng Comput 6(3):419–432
17. Kumar P, Singh LK, Chaudhari N, Kumar C (2020) Availability analysis of safety-critical and control systems of NPP using stochastic modeling. Ann Nuclear Energy 147:107657
18. Levitin G (2001) Redundancy optimization for multi-state system with fixed resource-requirements and unreliable sources. IEEE Trans Reliab 50(1):52–59
19. Levitin G (2002) Optimal allocation of elements in a linear multi-state sliding window system. Reliab Eng Syst Saf 76(3):245–254
20. Levitin G (2005) The universal generating function in reliability analysis and optimization, p 442. Springer, London. https://doi.org/10.1007/1-84628-245-4
21. Machida F, Xiang J, Tadano K, Maeno Y (2013) Composing hierarchical stochastic model from SysML for system availability analysis. In: 2013 IEEE 24th international symposium on software reliability engineering (ISSRE), pp 51–60
22. Malik SC, Munday VJ (2014) Stochastic modelling of a computer system with hardware redundancy. Int J Comput Appl 89(7):26–30
23. Manatos A, Koutras VP, Platis AN (2016) Dependability and performance stochastic modelling of a two-unit repairable production system with preventive maintenance. Int J Prod Res 54(21):6395–6415
24. Marichal JL, Mathonet P (2013) Computing system signatures through reliability functions. Stat Probab Lett 83(3):710–717
25. Michael S, Mariappan V, Kamat V (2011) Stochastic modelling of failure interaction: Markov model versus discrete event simulation. Int J Adv Oper Manage 3(1):1–18
26. Navarro J, Rubio R (2009) Computations of signatures of coherent systems with five components. Commun Stat-Simul Comput 39(1):68–84
27. Navarro J, Rychlik T (2010) Comparisons and bounds for expected lifetimes of reliability systems. Eur J Oper Res 207(1):309–317
28. Navarro J, Ruiz JM, Sandoval CJ (2007) Properties of coherent systems with dependent components. Commun Stat-Theory Meth 36(1):175–191
29. Navarro J, Rychlik T, Shaked M (2007) Are the order statistics ordered? A survey of recent results. Commun Stat-Theory Meth 36(7):1273–1290
30. Osaki S, Nakagawa T (1976) Bibliography for reliability and availability of stochastic systems. IEEE Trans Reliab 25(4):284–287
31. Ram M, Goyal N (2016) Automated teller machine network inspection under stochastic modelling. J Eng Sci Technol Rev 9(5):1–8
32. Rushdi AMA, Ghaleb FAM (2021) Reliability characterization of binary-imaged multi-state coherent threshold systems. Int J Math, Eng Manage Sci 6(1):309–321
33. Sahner RA, Trivedi KS (1993) A software tool for learning about stochastic models. IEEE Trans Educ 36(1):56–61

34. Samaniego FJ (2007) System signatures and their applications in engineering reliability, vol 110. Springer Science & Business Media. ISBN 978-0-387-71796-8
35. Samaniego FJ, Balakrishnan N, Navarro J (2009) Dynamic signatures and their use in comparing the reliability of new and used systems. Naval Res Logistics (NRL) 56(6):577–591
36. Triantafyllou IS (2021) On the lifetime and signature of constrained (k, d)-out-of-n: F reliability systems. Int J Math, Eng Manage Sci 6(1):66–78
37. Weng Q (2002) Land use change analysis in the Zhujiang Delta of China using satellite remote sensing, GIS and stochastic modelling. J Environ Manage 64(3):273–284
38. Zacharof AI, Butler AP (2004) Stochastic modelling of landfill leachate and biogas production incorporating waste heterogeneity. Model formulation and uncertainty analysis. Waste Manage 24(5):453–462

Reliability Evaluation and Cost Optimization of Solar Air-Conditioner

Ashok Singh Bhandari, Mangey Ram, Akshay Kumar, and Sushil Chandra Dimri

Abstract As we all know, synergy files are the channel for aspiring engineers and technologists who are striving for a better and more sustainable world. During summers, when the temperature can soar to the mid-40s °C in many places around the world, air conditioning becomes more than just a luxury; however, it is costly to cool spaces, particularly in areas with a high level of humidity. One advantage of cooling over heating is that cooling is required most when there is more heat energy around to tap into; in other words, more solar energy is available exactly when it is needed. Here we are interested in the technology that uses heat from the sun to provide energy directly in the thermodynamic cycle of an air conditioner. This research work evaluates the reliability measures of solar air conditioners, which include availability, mean time to failure (MTTF), and sensitivity analysis, with their graphical representation, by using the Markov process. Along with the reliability assessment, the Particle Swarm Optimization (PSO) technique is applied to find the minimum cost of the system while taking maximum reliability as a constraint.

Keywords Solar air conditioners · Solar energy · Markov process · Reliability · Availability · Mean time to failure · Sensitivity analysis · Particle swarm optimization

A. S. Bhandari (B) · A. Kumar
Department of Mathematics, Graphic Era Hill University, Uttarakhand, India

M. Ram
Department of Mathematics, Computer Science and Engineering, Graphic Era Deemed To Be University, Uttarakhand, India
Institute of Advanced Manufacturing Technologies, Peter the Great St. Petersburg Polytechnic University, 195251 Saint Petersburg, Russia

S. C. Dimri
Department of Computer Science and Engineering, Graphic Era Deemed To Be University, Uttarakhand, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_12


1 Introduction
The need for air conditioning has increased rapidly in recent times due to global warming and climate change. Air conditioning is defined as the simultaneous processing of temperature, humidity, purification, and distribution of the air current in compliance with the requirements of the space being conditioned [18]. In general, air conditioning is a system for controlling the humidity, ventilation, and temperature in a building or vehicle, typically to maintain a cool atmosphere in warm conditions. All these processes need a continuous energy supply to operate, and air conditioning commonly takes up half of a building's electricity consumption [19]. The limited availability and the environmental impact of conventional sources of energy such as coal, petroleum, and natural gas are serious challenges in the twenty-first century. Some renewable sources of energy, like hydroelectric power plants and wind turbines, are not always feasible to set up where they are required. Critical concerns of the present energy sector include the continuous growth in energy demand, the rapid depletion of conventional energy resources, and the impact on the climate [2, 3, 17, 20, 26–29]. By 2050 the demand for energy supply could double or even triple as the global population grows and developing nations expand their economies. This has already raised concerns over potential power supply difficulties, depletion of fuel sources, and further environmental effects. The primary purpose of solar air conditioners is to reduce the use of electricity and to rely increasingly on a renewable, clean source of energy [7]. The growing carbon footprint caused by the increasing use of air conditioners has also raised interest in solar cooling. Solar air conditioners are widely used in commercial, residential, and industrial settings. Growing efforts to save power and increasing environmental concerns have further increased interest in solar air conditioners. Easy maintenance and cost effectiveness are additional benefits. After solar air conditioners are installed at home, electricity consumption can drop by 50% or more. The solar panels on the rooftop do not require much care and maintenance; they may occasionally need some repair work, which is far less than for traditional air-cooling systems. Solar air conditioners benefit from the abundance of solar energy, as air conditioning is mostly used when the outside temperature is high. Solar panels work best in bright daylight and can also be used in weaker sunlight, as they are equipped with batteries. These batteries can last for 12–24 h and can be recharged by sunlight. Therefore, using a solar AC can also save money.

A Markov model with the state transition diagram of a solar-powered air conditioner has been designed to study the reliability measures: availability, MTTF, cost, and sensitivity analysis. In this regard, Lameiro and Duff [14] presented a model approach to the generalized solar energy space heating performance analysis problem. The authors developed a stochastic model for the performance analysis of solar energy space heating systems and implemented that model in a working FORTRAN computer program. Gangloff [5] described intractable common mode failures, which are not easy to find with the regular Markov block diagram approach. To evaluate the common mode of failure, the author extended the fault tree analysis techniques. In the domain of failure and repair of a system, several authors [8, 9, 12] (Cui and Li 2007) calculated the reliability of complex systems having different modes of failure and repair techniques. Tushar et al. [25] introduced a model that captures the intertemporal dependency of solar irradiance using Markov chain segmentation. State transitions were created for different time intervals so that the model could generate the solar states during daytime. Using a numerical example, the authors established the interdependency of solar states and showed that the transition from state to state is time-dependent. They calculated a probability transition matrix from real solar data sets and discussed the improvement. Ram [16] used the copula technique to calculate reliability measures of three-state systems, and the findings of the work were significantly improved. Singh et al. [23] analyzed the cost for a machine consisting of two subsystems. Along with the cost analysis, the authors discussed various measures of reliability for the system and established the relationship between availability and time for a complex repairable system. Ram and Singh [21] improved the availability, MTTF, and expected profit for a complex system. The authors implemented the Gumbel-Hougaard family copula technique and concluded that in the long run the availability of the system becomes constant and the expected profit decreases rapidly.

Particle swarm optimization (PSO) is a population-based stochastic optimization technique developed by Kennedy and Eberhart [13]. It shares some properties of evolutionary computation:
1. It is initialized with a population of random solutions.
2. It searches for optima by updating generations.
3. Updating generations depends on past generations.
In PSO, the potential solutions, called particles, fly through the problem space by following the current optimum particles [10]. The particles are updated by the following equations. Equation (A) calculates a new velocity for every particle based on its previous velocity, the particle's location at which the best fitness ($p_{id}$) has been achieved so far, and the best particle among the neighbors at which the best fitness ($p_{gd}$) has been achieved so far. Equation (B) updates every particle's position in the solution hyperspace.

$$v_{(i+1)d} = w\, v_{id} + c_1\, \mathrm{rand}(v)\,(p_{id} - x_{id}) + c_2\, \mathrm{Rand}(v)\,(p_{gd} - x_{id}) \tag{A}$$
$$x_{(i+1)d} = x_{id} + v_{(i+1)d} \tag{B}$$

where d is the dimension, the positive constants $c_1$ and $c_2$ are learning factors, rand(v) and Rand(v) are random functions, and the inertia weight w has provided improved performance in several applications. So,

NewV = (ParticleInertia) + (CognitiveTerm) + (SocialTerm)

Kennedy and Eberhart [13] introduced a methodology for nonlinear optimization, named particle swarm optimization. This metaheuristic technique evolved through the simulation of a basic social model, and the authors described the connections between PSO and genetic algorithms. The authors concluded that PSO is a very simple and strong technique for the optimization of several types of functions. Shi and Eberhart [22] experimentally studied the efficiency of PSO. The authors showed the high rate of convergence toward the optimal solution in PSO, which makes the technique dependable, and suggested a new technique to advance PSO. Some authors [1, 4] also compared genetic algorithm and PSO techniques and discussed advancements in PSO. Liu et al. [15] introduced a hybrid particle swarm optimization technique as an upgrade of PSO with the help of chaos, which is a property of certain nonlinear systems. The authors concluded that the hybrid PSO performs exploration as well as exploitation by implementing chaotic local search. Jiang et al. [11] discussed an improvement of PSO that partitions the swarm into several sub-swarms, where each sub-swarm performs PSO; after a period the whole population is shuffled, and new sub-swarms are created to perform PSO again. This technique remarkably improved the ability of the swarm to explore and exploit. The authors concluded that the improved PSO greatly benefits calculation accuracy and effective global optimization.
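The velocity and position updates described above can be sketched in a few lines. The following is a minimal illustration, not the cost model of this chapter: the objective is a hypothetical sphere function, the swarm uses a single global best instead of a neighborhood best, and the parameter values (w = 0.7, c1 = c2 = 1.5, 20 particles, 200 iterations) are assumptions chosen only for the example.

```python
import random

def pso(f, dim, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5,
        lo=-5.0, hi=5.0, seed=0):
    """Minimise f over R^dim with a basic global-best particle swarm."""
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]              # best position seen by each particle
    pbest_val = [f(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # NewV = inertia + cognitive term + social term
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]           # position update
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val

def sphere(p):
    """Hypothetical test objective with its minimum 0 at the origin."""
    return sum(c * c for c in p)

best_x, best_val = pso(sphere, dim=2)
print(best_x, best_val)
```

For a constrained problem such as minimum cost under a reliability requirement, one common option is to add a penalty to `f` for infeasible particles; whether that matches the formulation used later in the chapter is not stated in this section.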

2 Working of Solar Air-Conditioner
Solar air conditioners can be classified into several types by the arrangement of their components and their working procedure. The most popular type consists of panels placed on top of the building. The panels heat the water flowing in pipes and turn it into a gaseous state, which removes heat from the ventilating air and cools it down. The cool air is then blown into the building. The air produced is very cool and dry; therefore, the efficiency of cooling the building is very high.

The second type of solar air conditioner works on photovoltaic solar energy. In these systems, photovoltaic panels are placed around the building to generate electricity, which runs a conventional air conditioner for cooling. Surplus electricity is fed to the grid through inverters or stored in batteries, although batteries can lose about 20% of the power because of their chemical inefficiency.

The third type of solar air conditioner works on a completely different procedure. In the 1880s, ammonia and water were mixed in sealed steel chambers to create a cooling effect in the water. This chilled water was pumped into the air handler to absorb heat from the space where cooling is required. Because the capacity of the chamber is limited, the ammonia must be separated from the water so that the cycle can operate again; otherwise an unlimited volume would be needed. To recycle the chemicals, the mixture is heated by water at 250 °F, which separates the ammonia from the water. Evacuated-tube solar panels are used to heat the water to 250 °F or more for this separation. After separation, both can be reused for the chilling process.


Fig. 1 Block diagram of a solar air conditioner

Solar air conditioners have no harmful effects and produce no waste products, since they use water as a refrigerant instead of a harmful coolant. Additionally, because they are solar powered, they are independent of the power grid and conserve energy by avoiding the heat and friction losses in the wiring of a conventional air conditioner. These solar air systems work best in sunny, dry climates, for example, the South-western United States or Northern Africa. As these systems can also run on battery power, they can perform well in rainy, cloudy weather. Figure 1 shows the block diagram of a solar air conditioner.

3 Assumptions and Notations
The proposed system is structured by making the following assumptions:
(i) The system is assumed to be in satisfactory working condition in the initial state.
(ii) Degraded and failed states are repairable with different rates.
(iii) After repair, the unit is treated as good as new.
(iv) Constant failure and repair rates are assumed.

The notations used for the solar air conditioner are described in Table 1.

4 Mathematical Modelling
4.1 Formulation of the Model
The following differential equations have been drawn from the state transition diagram (Fig. 2) of the solar air conditioner.

Table 1 Notations
Symbol          Description
t               Time
s               Variable for Laplace transformation
x               Supplementary variable
λ1/λ2/λ3/λ4     Hazard rate of unit inverter/battery/solar panel/charge controller
μ               Repair rate of the system from the degraded state P1(t) to P0(t)
P0(t)           Probability of initial (good) state
P1(t)           Probability of the degraded state when the system has one broken-down solar panel unit
P2(t)           Completely failed state probability caused by battery failure
P3(t)           Completely failed state probability caused by inverter failure
P4(t)           Completely failed state probability caused by failure of both solar panels
P5(t)           Completely failed state probability caused by failure of charge controller
φ(x)            Rate of repairing all the failed states

Fig. 2 State transition diagram

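$$\left[\frac{d}{dt} + \lambda_1 + \lambda_2 + 2\lambda_3 + \lambda_4\right] P_0(t) = \mu P_1(t) + \int_0^{\infty} \varphi(x)\left[P_2(x,t) + P_3(x,t) + P_4(x,t) + P_5(x,t)\right] dx \tag{1}$$

$$\left[\frac{d}{dt} + \lambda_2 + \lambda_3 + \lambda_4 + \mu\right] P_1(t) = 2\lambda_3 P_0(t) \tag{2}$$

$$\left[\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \varphi(x)\right] P_2(x,t) = 0 \tag{3}$$

$$\left[\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \varphi(x)\right] P_3(x,t) = 0 \tag{4}$$

$$\left[\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \varphi(x)\right] P_4(x,t) = 0 \tag{5}$$

$$\left[\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \varphi(x)\right] P_5(x,t) = 0 \tag{6}$$

Boundary conditions

$$P_2(0,t) = \lambda_2\left[P_0(t) + P_1(t)\right] \tag{7}$$

$$P_3(0,t) = \lambda_1\left[P_0(t) + P_1(t)\right] \tag{8}$$

$$P_4(0,t) = \lambda_3 P_1(t) \tag{9}$$

$$P_5(0,t) = \lambda_4\left[P_0(t) + P_1(t)\right] \tag{10}$$

Initial conditions

$$P_0(0) = 1 \tag{11}$$

and all other state probabilities are zero at t = 0.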

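4.2 Solution of the Model
Taking the Laplace transformation of Eqs. (1) to (10) and then using Eq. (11), the solution obtained for the model is given below:

$$\left[s + \lambda_1 + \lambda_2 + 2\lambda_3 + \lambda_4\right]\bar{P}_0(s) = 1 + \mu \bar{P}_1(s) + \int_0^{\infty} \varphi(x)\left[\bar{P}_2(x,s) + \bar{P}_3(x,s) + \bar{P}_4(x,s) + \bar{P}_5(x,s)\right] dx \tag{12}$$

$$\left[s + \lambda_2 + \lambda_3 + \lambda_4 + \mu\right]\bar{P}_1(s) = 2\lambda_3 \bar{P}_0(s) \tag{13}$$

$$\left[\frac{\partial}{\partial x} + s + \varphi(x)\right]\bar{P}_2(x,s) = 0 \tag{14}$$

$$\left[\frac{\partial}{\partial x} + s + \varphi(x)\right]\bar{P}_3(x,s) = 0 \tag{15}$$

$$\left[\frac{\partial}{\partial x} + s + \varphi(x)\right]\bar{P}_4(x,s) = 0 \tag{16}$$

$$\left[\frac{\partial}{\partial x} + s + \varphi(x)\right]\bar{P}_5(x,s) = 0 \tag{17}$$

Rewriting (14), (15), (16) and (17) as

$$\left[\frac{\partial}{\partial x} + s + \varphi(x)\right]\bar{P}_i(x,s) = 0 \tag{18}$$

for i = 2, 3, 4, 5.

Boundary conditions

$$\bar{P}_2(0,s) = \lambda_2\left[\bar{P}_0(s) + \bar{P}_1(s)\right] \tag{19}$$

$$\bar{P}_3(0,s) = \lambda_1\left[\bar{P}_0(s) + \bar{P}_1(s)\right] \tag{20}$$

$$\bar{P}_4(0,s) = \lambda_3 \bar{P}_1(s) \tag{21}$$

$$\bar{P}_5(0,s) = \lambda_4\left[\bar{P}_0(s) + \bar{P}_1(s)\right] \tag{22}$$

Solving these equations gives

$$\bar{P}_0(s) = \frac{1}{D(s)} \tag{23}$$

where, with $\bar{S}_\varphi(s)$ denoting the Laplace transform of the repair-time density,

$$D(s) = (s + \lambda_1 + \lambda_2 + 2\lambda_3 + \lambda_4) - \frac{2\lambda_3\, \mu}{s + \lambda_2 + \lambda_3 + \lambda_4 + \mu} - \left[\frac{2\lambda_3(\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4)}{s + \lambda_2 + \lambda_3 + \lambda_4 + \mu} + \lambda_1 + \lambda_2 + \lambda_4\right]\bar{S}_\varphi(s) \tag{24}$$

From (13)

$$\bar{P}_1(s) = \frac{2\lambda_3}{s + \lambda_2 + \lambda_3 + \lambda_4 + \mu}\,\bar{P}_0(s) \tag{25}$$

From (18) to (22) and (25)

$$\bar{P}_2(s) = \frac{1 - \bar{S}_\varphi(s)}{s}\,\lambda_2\left[\bar{P}_0(s) + \bar{P}_1(s)\right] \tag{26}$$

$$\bar{P}_3(s) = \frac{1 - \bar{S}_\varphi(s)}{s}\,\lambda_1\left[\bar{P}_0(s) + \bar{P}_1(s)\right] \tag{27}$$

$$\bar{P}_4(s) = \frac{1 - \bar{S}_\varphi(s)}{s}\,\lambda_3 \bar{P}_1(s) \tag{28}$$

$$\bar{P}_5(s) = \frac{1 - \bar{S}_\varphi(s)}{s}\,\lambda_4\left[\bar{P}_0(s) + \bar{P}_1(s)\right] \tag{29}$$

$$\bar{P}_{up}(s) = \bar{P}_0(s) + \bar{P}_1(s) \tag{30}$$

$$\bar{P}_{down}(s) = \bar{P}_2(s) + \bar{P}_3(s) + \bar{P}_4(s) + \bar{P}_5(s) \tag{31}$$

It is noticed that

$$\bar{P}_{up}(s) + \bar{P}_{down}(s) = \frac{1}{s} \tag{32}$$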

4.3 Availability
The availability of a system describes its working condition at a particular time; it is the sum of the probabilities of the working states of the system. The authors calculated the availability of the solar air conditioner by substituting the failure and repair rates λ1 = 0.8, λ2 = 0.03, λ3 = 0.05, λ4 = 0.09, μ = 0.02 and φ(x) = 0.05 [6]:

$$P_{up}(t) = 0.1463142691\, e^{-1.207641122t}\cos(0.1282345336t) + 0.1825319221\, e^{-1.207641122t}\sin(0.1282345336t) + 0.8536857307\, e^{-0.05471775631t}$$

Table 2 shows the availability of the solar air conditioner at different times, and Fig. 3 illustrates the values of the table.

Table 2 Availability of solar air conditioner
Time (t)    Pup(t)
0           0.9999999998
1           0.8585806188
2           0.7819738439
3           0.7298988710
4           0.6876053199
5           0.6498918490
6           0.6149406606
7           0.5820896673
8           0.5510623139
9           0.5217098429
10          0.4939271627

Fig. 3 Availability of solar air conditioner (availability versus time t)
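The tabulated availabilities can be regenerated from the closed-form expression. The sketch below simply evaluates that expression; it assumes the slow decay constant is 0.05471775631, which is the value that reproduces Table 2 (for example Pup(10) ≈ 0.4939), and the function name `availability` is illustrative.

```python
import math

def availability(t):
    """Evaluate the closed-form P_up(t) reported for the solar air conditioner."""
    a = 1.207641122       # decay rate of the oscillating terms
    b = 0.1282345336      # angular frequency of the oscillating terms
    c = 0.05471775631     # slow decay rate (assumed; chosen to match Table 2)
    return (0.1463142691 * math.exp(-a * t) * math.cos(b * t)
            + 0.1825319221 * math.exp(-a * t) * math.sin(b * t)
            + 0.8536857307 * math.exp(-c * t))

for t in range(11):
    print(t, round(availability(t), 10))
```

Evaluating the function at t = 0, 1, ..., 10 gives a quick consistency check between the formula and the table.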

4.4 Reliability Analysis
The reliability of the solar air conditioner is one of its basic quality attributes; it governs the behavior of every component and can be defined as the probability that the solar air conditioner performs adequately for a specified time interval in a specified environment. With no repair facility available, i.e., μ = 0, φ(x) = 0 and λ1 = 0.8, λ2 = 0.03, λ3 = 0.05, λ4 = 0.09 [6], the reliability is

$$\bar{R}(s) = \frac{1 + \dfrac{0.10}{s + 0.17}}{s + 0.30}$$

Taking the inverse Laplace transform, the authors calculated the reliability of the model in terms of time t, which is

$$R(t) = e^{-0.235t}\left(\cosh(0.065t) + 0.538\sinh(0.065t)\right)$$

Table 3 gives the reliability of the model for t = 0 to t = 10, and Fig. 4 is the graphical representation of the reliability of the solar air conditioner with respect to time.

Table 3 Reliability of solar air conditioner
Time (t)    R(t)
0           1.000000000
1           0.8199309868
2           0.6741644717
3           0.5557434437
4           0.4592117352
5           0.3802722923
6           0.3155266205
7           0.2622755327
8           0.2183662793
9           0.1820748625
10          0.1520151112
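The inversion step can be checked numerically. Writing the transform as (s + 0.27)/((s + 0.17)(s + 0.30)) and splitting it into partial fractions gives R(t) = (10/13)e^{−0.17t} + (3/13)e^{−0.30t}, which is the same function as the hyperbolic form with arguments 0.065t and coefficient 7/13 ≈ 0.538. The sketch below assumes only the transform stated above; the function names are illustrative.

```python
import math

def reliability(t):
    """Partial-fraction inverse of (s + 0.27) / ((s + 0.17)(s + 0.30))."""
    return (10 / 13) * math.exp(-0.17 * t) + (3 / 13) * math.exp(-0.30 * t)

def reliability_hyperbolic(t):
    """Equivalent hyperbolic form; 7/13 is the coefficient rounded to 0.538 in the text."""
    return math.exp(-0.235 * t) * (math.cosh(0.065 * t)
                                   + (7 / 13) * math.sinh(0.065 * t))

for t in range(11):
    print(t, round(reliability(t), 10))
```

Both forms agree to machine precision, and the printed values can be compared with Table 3.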

4.5 Mean Time to Failure

MTTF is the expected operating time of the system before it fails. Table 4 and Fig. 5 show how the MTTF of the solar air conditioner changes as each failure rate is varied.

Fig. 4 Reliability of solar air conditioner (reliability versus time t)

Table 4 MTTF of solar air conditioner

Variation in λi    MTTF with respect to failure rates
                   λ1              λ2              λ3              λ4
0.1                1.714285714     1.464404814     2.003284072     2.032085562
0.2                1.481481481     1.442170916     1.690912924     1.884781885
0.3                1.303571429     1.391527600     1.464404814     1.730909091
0.4                1.163101604     1.330581356     1.291470434     1.589526093
0.5                1.049382716     1.267454350     1.154719937     1.464404814
0.6                0.9554655868    1.205844801     1.043740573     1.354861355
0.7                0.8766233765    1.147376450     0.9518375888    1.259037887
0.8                0.8095238095    1.092666091     0.8744797033    1.174949495
0.9                0.7517482518    1.041835358     0.8084774818    1.100803958

4.6 MTTF Sensitivity

MTTF sensitivity is obtained by partially differentiating the MTTF with respect to each failure rate. Each rate is then varied in turn over 0.1, 0.2, …, 0.9 while the failure rates are otherwise fixed at λ1 = 0.8, λ2 = 0.03, λ3 = 0.05, λ4 = 0.09 [6] in the partial derivatives. The resulting values are given in Table 5 and shown graphically in Fig. 6.

Fig. 5 Mean time to failure of solar air conditioner

Table 5 MTTF sensitivity of solar air conditioner

Variation in λi    ∂MTTF/∂λ1        ∂MTTF/∂λ2        ∂MTTF/∂λ3        ∂MTTF/∂λ4
0.1                −15.51011029     −2.785142478     1.383559623      4.858441875
0.2                −9.003601439     −1.640794726     −0.807076446     3.066658569
0.3                −5.873651235     −1.137909405     −9.182736449     2.046115319
0.4                −4.131725531     −0.8601343456    −1.200072965     1.440297957
0.5                −3.063725490     −0.6846626419    −1.092408624     1.056832959
0.6                −2.362039402     −0.5638594440    −0.9682831634    0.8009739836
0.7                −1.876459468     −0.4756696512    −0.8521394955    0.6229670216
0.8                −1.526562182     −0.4085276152    −0.7501425268    0.4948867964
0.9                −1.266131453     −0.3557837044    −0.6625703911    0.4001512852

Fig. 6 MTTF sensitivity of solar air conditioner

4.7 Reliability Sensitivity

The sensitivity of the reliability of the solar air conditioner is analyzed by taking the partial derivative of the reliability function, obtained after the inverse Laplace transformation, with respect to each input parameter. The input parameters are then fixed at λ1 = 0.8, λ2 = 0.03, λ3 = 0.05, λ4 = 0.09 [6] in the partial derivatives of the reliability expression. Table 6 contains the numerical values of the reliability sensitivity, which are shown graphically in Fig. 7.

Table 6 Reliability sensitivity of solar air conditioner

Time (t)    ∂R(t)/∂λ1         ∂R(t)/∂λ2        ∂R(t)/∂λ3       ∂R(t)/∂λ4
0           0                 0                0               0
1           −0.3850328887     −0.4174266903    0.3341754247    −0.4174266903
2           −0.3099803012     −0.3969378011    0.6618856904    −0.3969378011
3           −0.2007384159     −0.3360541034    0.7658142659    −0.3360541034
4           −0.1274532735     −0.2980814046    0.7267230318    −0.2980814046
5           −0.08521134701    −0.2783179595    0.6278076002    −0.2783179595
6           −0.06124393423    −0.2661764893    0.5158654217    −0.2661764893
7           −0.04689343589    −0.2554315687    0.4116244354    −0.2554315687
8           −0.03750226945    −0.2435809364    0.3221518873    −0.2435809364
9           −0.03077464807    −0.2300916402    0.2483868571    −0.2300916402
10          −0.02560774555    −0.2152497664    0.1888986146    −0.2152497664

Fig. 7 Reliability sensitivity of solar air conditioner

5 Optimization of Cost of a Solar Air Conditioner Using PSO

PSO is used here to minimize the cost of the system while meeting the required reliability of the whole system and of its components. The following nonlinear programming problem is solved:

Minimize
\[ C = K_1 R_1^{\alpha_1} + K_2 R_2^{\alpha_2} + K_3 R_3^{\alpha_3} + 2 K_4 R_4^{\alpha_4} \]
subject to the constraints
\[ R_2 R_3 R_4 (2 R_1 - R_1^2) < 1, \]
\[ -\left[ R_2 R_3 R_4 (2 R_1 - R_1^2) \right] \le -0.3, \]
\[ R_i < 1 \ \text{and} \ -R_i < 0 \ \text{for} \ i = 1, 2, 3, 4. \tag{33} \]

Fig. 8 Cost convergence curve of solar air conditioner

The constants are assigned the values K1 = 200, K2 = 100, K3 = 100, K4 = 150 and αi = 0.6 for i = 1, 2, 3, 4 [24]. Here C is the total cost of the system and Rs = R2 R3 R4 (2R1 − R1²) is the reliability of the solar air conditioner; R1, R2, R3, and R4 are the reliabilities of the solar panels, battery, charge controller, and inverter, respectively. The optimal solution was obtained with the Particle Swarm Optimization algorithm, with 100 random particles initialized in the solution space, inertia weight w = 0.9, acceleration constants c1 = 2 and c2 = 2.05, and 1000 iterations. The optimized cost reported for the above nonlinear problem is 365.0355, with the system reliability at approximately the required level of 0.3 and R1 = 0.2278, R2 = 0.9088, R3 = 0.9020, and R4 = 0.9065. Fig. 8 shows the PSO convergence curve, which converges to the optimized cost of the system.
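A minimal sketch of how such a constrained PSO run might be set up is given below. It is not the authors' implementation and is not claimed to reproduce the reported cost. The constants K, α, w, c1, c2 and the constraint Rs ≥ 0.3 are taken from the text. Everything else is an assumption made for illustration: infeasible particles are handled by an infinite-cost "death penalty", the swarm is initialized by rejection sampling of feasible points, velocities are clamped to ±0.2, and the random seed is fixed.

```python
import random

K = (200.0, 100.0, 100.0, 150.0)  # cost coefficients K1..K4 from the text
ALPHA = 0.6                       # common exponent alpha_i from the text
R_MIN = 0.3                       # required system reliability


def system_reliability(r):
    """Rs = R2 R3 R4 (2 R1 - R1^2)."""
    r1, r2, r3, r4 = r
    return r2 * r3 * r4 * (2.0 * r1 - r1 ** 2)


def cost(r):
    """Objective as written in the text: K1 R1^a + K2 R2^a + K3 R3^a + 2 K4 R4^a."""
    r1, r2, r3, r4 = r
    return (K[0] * r1 ** ALPHA + K[1] * r2 ** ALPHA
            + K[2] * r3 ** ALPHA + 2.0 * K[3] * r4 ** ALPHA)


def feasible(r):
    return all(0.0 < x < 1.0 for x in r) and system_reliability(r) >= R_MIN


def fitness(r):
    # Assumption: infeasible points get infinite cost, so they never become a best.
    return cost(r) if feasible(r) else float("inf")


def pso(n_particles=100, n_iter=1000, w=0.9, c1=2.0, c2=2.05, v_max=0.2, seed=0):
    rng = random.Random(seed)
    pos = []
    while len(pos) < n_particles:  # rejection-sample feasible starting points
        cand = [rng.random() for _ in range(4)]
        if feasible(cand):
            pos.append(cand)
    vel = [[0.0] * 4 for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(4):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # velocity clamp (assumption)
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    best, best_cost = pso(n_particles=30, n_iter=200)
    print("best R:", [round(x, 4) for x in best])
    print("cost:", round(best_cost, 4), "Rs:", round(system_reliability(best), 4))
```

Because the cost increases with every Ri, a minimizer is expected to sit on the constraint boundary Rs = 0.3. The solution reported in the text gives Rs ≈ 0.29999, which is consistent with that expectation.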

6 Results Analysis

In this work, the authors studied several reliability characteristics of the system, namely availability, reliability, MTTF, and the sensitivity of reliability and MTTF, under four types of failure and using a Markov process. The investigation of the solar air conditioning system led to the following observations.

Figure 3 plots the availability of the solar air conditioner against time. The availability decreases with time, rapidly at first and then more slowly, and is approximately 0.5 by t = 10 (Table 2) for the given failure and repair rates. Figure 4 shows the reliability of the solar air conditioner with respect to time: the reliability also decreases with time and tends to zero as time increases. The MTTF of the solar air conditioner is illustrated in Fig. 5, where L1, L2, L3, and L4 denote the failure rates λ1, λ2, λ3 and λ4, respectively. Figures 6 and 7 show the MTTF sensitivity (against variation in the failure rates) and the reliability sensitivity (against time) of the solar air conditioner. The MTTF sensitivity increases as λ1 and λ2 are varied, decreases as λ4 is varied, and, for λ3, falls quickly and then rises. The sensitivity of the reliability of the solar air conditioner can be observed in Fig. 7, where L1, L2, L3, and L4 again denote the failure rates λ1, λ2, λ3 and λ4, respectively. Figure 8 is the PSO convergence curve for the cost of the solar air conditioner, which converges to 365.0355.

7 Conclusion

This work proposes a Markov model of a solar air conditioner. The overall assessment indicates that, with regular maintenance of the solar air conditioner, availability remains steady after some time. It is worth noting that the solar air conditioner is sensitive to component failure. It is therefore important to control the component failure rates in order to achieve a highly reliable solar air conditioner. The optimization technique is able to determine the number of redundant components and their location for higher reliability and lower cost. The results of this study are useful for technologists, engineers, and plant managers.

References

1. Angeline PJ (1998) Using selection to improve particle swarm optimization. In: IEEE international conference on evolutionary computation. Anchorage, Alaska, May 4–9, 1998
2. Chai B, Yang Z (2014) Impacts of unreliable communication and modified regret matching based anti-jamming approach in smart microgrid. Ad Hoc Netw 22:69–82
3. Chai B, Chen J, Yang Z, Zhang Y (2014) Demand response management with multiple utility companies: a two-level game approach. IEEE Trans Smart Grid 5(2):722–731
4. Eberhart RC, Shi YH (1998) Comparison between genetic algorithms and particle swarm optimization. In: 1998 annual conference on evolutionary programming. San Diego
5. Gangloff WC (1975) Common mode failure analysis. IEEE Trans Power Apparatus Syst 94(1):27–30
6. Goyal N, Ram M, Kaushik A (2017) Performability of solar thermal power plant under reliability characteristics. Int J Syst Assurance Eng Manage 8(2):479–487
7. Gugulothu R, Somanchi NS, Banoth HB, Banothu K (2015) A review on solar powered air conditioning system. Procedia Earth Planetary Sci 11:361–367
8. Gupta PP, Agarwal SC (1984) A parallel redundant complex system with two types of failure under preemptive-repeat repair discipline. Microelectron Reliab 24(3):395–399
9. Gupta PP, Sharma MK (1993) Reliability and MTTF evaluation of a two duplex-unit standby system with two types of repair. Microelectron Reliab 33(3):291–295
10. Hu X, Eberhart RC, Shi Y (2003) Engineering optimization with particle swarm. In: Proceedings of the 2003 IEEE swarm intelligence symposium. SIS’03 (Cat. No. 03EX706). IEEE, pp 53–57
11. Jiang Y, Hu T, Huang C, Wu X (2007) An improved particle swarm optimization algorithm. Appl Math Comput 193(1):231–239
12. Ke J-B, Lee W-C, Wang K-H (2007) Reliability and sensitivity analysis of a system with multiple unreliable service stations and standby switching failures. Phys A 380:455–469
13. Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of ICNN’95 - international conference on neural networks, vol 4. IEEE, pp 1942–1948
14. Lameiro GF, Duff WS (1979) A Markov model of solar energy space and hot water heating systems. Sol Energy 22(3):211–219
15. Liu B, Wang L, Jin YH, Tang F, Huang DX (2005) Improved particle swarm optimization combined with chaos. Chaos, Solitons Fractals 25(5):1261–1271
16. Ram M (2010) Reliability measures of a three-state complex system: a copula approach. Appl Appl Math: An Int J 5(10):1483–1492
17. Hassan NU, Pasha MA, Yuen C, Huang S, Wang X (2013) Impact of scheduling flexibility on demand profile flatness and user inconvenience in residential smart grid system. Energies 6(12):6608–6635
18. Ochi M, Ohsumi K (1989) Fundamental of refrigeration and air conditioning. Ochi Engineering Consultant Office
19. Pachauri RK, Reisinger A (2007) IPCC fourth assessment report. IPCC, Geneva
20. Khan RH, Brown J, Khan JY (2013) Pilot protection schemes over a multi-service WiMAX network in the smart grid. In: 2013 IEEE international conference on communications workshops (ICC). IEEE, pp 994–999
21. Ram M, Singh SB (2010) Availability, MTTF and cost analysis of complex system under preemptive-repeat repair discipline using Gumbel-Hougaard family copula. Int J Qual Reliab Manage
22. Shi Y, Eberhart RC (1999) Empirical study of particle swarm optimization. In: Proceedings of the 1999 congress on evolutionary computation-CEC99 (Cat. No. 99TH8406), vol 3. IEEE, pp 1945–1950
23. Singh VV, Ram M, Rawal DK (2013) Cost analysis of an engineering system involving subsystems in series configuration. IEEE Trans Autom Sci Eng 10(4):1124–1130
24. Tillman FA, Hwang CL, Fan LT, Lai KC (1970) Optimal reliability of a complex system. IEEE Trans Reliab 19(3):95–100
25. Tushar W, Huang S, Yuen C, Zhang JA, Smith DB (2014) Synthetic generation of solar states for smart grid: a multiple segment Markov chain approach. In: IEEE PES innovative smart grid technologies. IEEE, Europe, pp 1–6
26. Tushar W, Chai B, Yuen C, Smith DB, Wood KL, Yang Z, Poor HV (2014) Three-party energy management with distributed energy resources in smart grid. IEEE Trans Industr Electron 62(4):2487–2498
27. Tushar W, Yuen C, Chai B, Smith DB, Poor HV (2014) Feasibility of using discriminate pricing schemes for energy trading in smart grid. In: 2014 IEEE global communications conference. IEEE, pp 3138–3144
28. Khalid YI, Hassan NU, Yuen C, Huang S (2014) Demand response management for power throttling air conditioning loads in residential smart grids. In: 2014 IEEE international conference on smart grid communications (SmartGridComm). IEEE, pp 650–655
29. Liu Y, Hassan NU, Huang S, Yuen C (2013) Electricity cost minimization for a residential smart grid with distributed generation and bidirectional power transactions. In: 2013 IEEE PES innovative smart grid technologies conference (ISGT). IEEE, pp 1–6

Analyzing Interrelationships Among Software Vulnerabilities Using Fuzzy DEMATEL Approach

Misbah Anjum, P. K. Kapur, Vernika Agarwal, and Vivek Kumar

Abstract The increasing number of data breaches is pressuring software organizations to create and maintain secure software. Although not all attacks can be anticipated or prevented, many can be avoided by eliminating vulnerabilities in software. Once vulnerabilities are discovered, security teams need a mechanism for understanding their interdependence in order to identify which vulnerabilities pose the greatest risk and demand immediate attention. Limited budgets and limited tester time add pressure to select the vulnerabilities that are highly severe and that cause other severe flaws. Addressing such defects requires an understanding of the cause-effect relationships among the vulnerabilities. The present objective is to propose a framework for assessing the interrelationships of various software vulnerabilities using the Fuzzy Decision-making Trial and Evaluation Laboratory (F-DEMATEL). Fuzzy theory assists in the decision mapping, whereas the DEMATEL technique is used to illustrate the contextual links between the kinds of vulnerability. The framework is validated with data from a software testing company located in northern India.

Keywords Vulnerabilities · Multi-Criteria Decision Making (MCDM) · Prioritization · Fuzzy DEMATEL

M. Anjum
Amity Institute of Information Technology, Amity University, Noida, Uttar-Pradesh, India

P. K. Kapur
Amity Center for Inter-Disciplinary Research, Amity University, Noida, Uttar-Pradesh, India

V. Agarwal (B)
Amity International Business School, Amity University, Noida, Uttar-Pradesh, India

V. Kumar
Department of Operational Research, University of Delhi, Delhi, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_13

1 Introduction

Technological innovation has grown across the globe with the increasing popularity of electronic equipment, advanced sensors and other networking technologies, which are generally grouped under the Internet of Things (IoT) umbrella [1]. A statistical forecast shows that the number of these intelligent devices in the global infrastructure is expected to rise from 26 billion to over 75 billion by 2025 [2]. This rapid growth in the use of networked intelligent devices brings a growing cybersecurity risk [3]. In the past, computer users may have been the only individuals who had to care about vulnerabilities [4]; with changing communication dynamics, anyone using a smartphone, smart watch, smart TV, or any other connected device is now vulnerable to the theft of their information [5]. Organizations and companies today take great care to safeguard their networks. However, even with sustained and accountable defense expenditure, they remain at risk [6]. This is because attackers can circumvent an organization's security via unknown vulnerabilities that security personnel have not listed [7]. A determined hacker can uncover a flaw in a well-guarded network through repeated testing [8], and intruders may use network configuration flaws to breach the intended target [6]. Historical statistics show that computer threats of differing complexity and lethality exist in various forms. Computer vulnerabilities have also been widely documented, and collective efforts have been made to compile lists of vulnerabilities, such as the National Vulnerability Database (NVD) and Common Vulnerabilities and Exposures (CVE) [9]. In recent years, the number of vulnerabilities recorded in the NVD has continued to rise: 17,302 vulnerabilities were reported in 2019 [10], whereas a total of 16,555 security vulnerabilities were reported in 2018.
IT management needs to detect and evaluate vulnerabilities on various hardware and software platforms in order to prevent losses caused by vulnerabilities [11]. Many quantitative models have been proposed in the existing literature. Such models allow developers to allocate resources for the testing process, planning and security patch creation [12–16]. In addition, developers can use Vulnerability Discovery Models (VDM) to assess risk and estimate the redundancy needed in resources and procedures to deal with potential breaches [17]. As repairing all the vulnerabilities within the given time and budget is a highly laborious process, the security team must prioritize the vulnerabilities and remedy the riskiest ones first [18]. Prioritization techniques are divided into two major groups: quantitative and qualitative. Qualitative systems offer a grading method for defining the severity of vulnerabilities in software, whereas quantitative scoring systems associate each vulnerability with a numerical score [19]. Multi-criteria decision-making (MCDM) approaches have recently become a prominent tool for assessing vulnerabilities [20]. MCDM is regarded as a complex decision-making technique that incorporates both quantitative and qualitative elements [21]. The literature offers a broad range of MCDM procedures. The concepts and applications of fuzzy MCDM are continually evolving [22], as they help to remove the vagueness in the data. In recent years, several fuzzy MCDM techniques have been proposed and frequently used, such as fuzzy TOPSIS [23], fuzzy ELECTRE [24] and FBWM [25]. Regarding the use of MCDM in prioritizing vulnerabilities, a recent study by Sibal et al. [26] proposed a hybrid strategy based on Normalized Criteria Distance (NCD), the Analytic Hierarchy Process (AHP), and DEMATEL. Narang et al. [27] proposed the DEMATEL method to show the interdependence among various vulnerabilities by suggesting a cause–effect theory. In conventional DEMATEL, the assessments are expressed as crisp values, which cannot capture the ambiguity of human judgment. The following research objectives follow from the previous discussion:

• To identify the vulnerability types of a software system.
• To evaluate the interrelationships, in terms of cause and effect, among the identified software vulnerability types.

To address these research objectives, the manuscript proposes a framework for assessing the interrelationships of various software vulnerabilities by employing the “Fuzzy Decision-making Trial and Evaluation Laboratory (F-DEMATEL)”. The DEMATEL method helps to show the contextual links among the types of vulnerabilities, while fuzzy set theory accounts for the ambiguity in the decision-making process. The manuscript is organized as follows. The research methodology used in this study is described in Sect. 2: Section 2.1 describes the dataset and Sect. 2.2 discusses fuzzy DEMATEL. Section 3 presents the data analysis. Section 4 concludes the study.

2 Research Methodology

This section describes the research method. The study first identifies the vulnerability types and then uses the fuzzy DEMATEL MCDM technique to discover the interconnections between them and to evaluate the influence that one type has on another.

2.1 Dataset Description

The dataset for this study was collected from the National Vulnerability Database (NVD) [28] and CVE [29], which are established sources of vulnerability data. It comprises nine categories of vulnerabilities, which are listed with their notations in Table 1.

Table 1 Software vulnerability types

Notation    Vulnerability type
G1          SQL injection (SQLI)
G2          Cross site scripting (XSS)
G3          Buffer overflow (BO)
G4          Cross site request forgery (CSRF)
G5          File inclusion (FI)
G6          Code execution (CE)
G7          Information gain (IG)
G8          Gain of privileges (GP)
G9          Race condition (RC)

2.2 Fuzzy DEMATEL Approach

To analyze the linguistic assessments made by the decision makers (DMs) on the selected vulnerabilities, we combine fuzzy set theory with DEMATEL [30]. The steps involved are:

Step 1: “The decision-making body and short-listing elements are recognized”. A decision-making committee is established to help identify the study objective, and the vulnerabilities are short-listed.

Step 2: “Fuzzy language scale detection”. The assessment scale for the metrics, with the matching fuzzy numbers, is determined in this phase. Table 2 shows the fuzzy scale used.

Table 2 Fuzzy scale

Linguistic term        Notation    Triangular fuzzy number
Very high influence    VHi         (0.75, 1.0, 1.0)
High influence         Hi          (0.5, 0.75, 1.0)
Low influence          Li          (0.25, 0.5, 0.75)
Very low influence     VLi         (0, 0.25, 0.5)
No influence           Ni          (0, 0, 0.25)

Step 3: “Determining the assessments of decision-making body”. The relationships between the short-listed metrics are assessed by the DMs to generate the initial direct matrix (IDM) in this step.

Step 4: “IDM matrix defuzzification”. The verbal evaluations by the DMs are defuzzified into crisp values [31] using the CFCS (Converting Fuzzy data into Crisp Scores) method. Let $Z_{ij}^{l} = (a_{ij}^{l}, b_{ij}^{l}, c_{ij}^{l})$ be the effect of metric i on metric j for the lth DM. $Z_{ij}^{l}$ is a triangular fuzzy number (TFN), where $a_{ij}^{l}$, $b_{ij}^{l}$ and $c_{ij}^{l}$ are its left, middle and right values. The steps of the CFCS method are as follows:

Step 4.1: “Normalization”:

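\[ xa_{ij}^{l} = \frac{a_{ij}^{l} - \min a_{ij}^{l}}{\max c_{ij}^{l} - \min a_{ij}^{l}} \tag{1} \]

\[ xb_{ij}^{l} = \frac{b_{ij}^{l} - \min a_{ij}^{l}}{\max c_{ij}^{l} - \min a_{ij}^{l}} \tag{2} \]

\[ xc_{ij}^{l} = \frac{c_{ij}^{l} - \min a_{ij}^{l}}{\max c_{ij}^{l} - \min a_{ij}^{l}} \tag{3} \]

Step 4.2: “Compute the normalized left (lxa) and right (lxc) value”:

\[ lxa_{ij}^{l} = \frac{xb_{ij}^{l}}{1 + xb_{ij}^{l} - xa_{ij}^{l}} \tag{4} \]

\[ lxc_{ij}^{l} = \frac{xc_{ij}^{l}}{1 + xc_{ij}^{l} - xb_{ij}^{l}} \tag{5} \]

Step 4.3: “Total normalized crisp value needs to be calculated”:

\[ x_{ij}^{l} = \frac{lxa_{ij}^{l} \left( 1 - lxa_{ij}^{l} \right) + lxc_{ij}^{l}\, lxc_{ij}^{l}}{1 - lxa_{ij}^{l} + lxc_{ij}^{l}} \tag{6} \]

Step 4.4: “Calculate crisp value”:

\[ k_{ij}^{l} = \min a_{ij}^{l} + x_{ij}^{l} \left( \max c_{ij}^{l} - \min a_{ij}^{l} \right) \tag{7} \]

Step 4.5: “Determine the integrated value”:

\[ k_{ij} = \frac{1}{l} \left( k_{ij}^{1} + k_{ij}^{2} + \dots + k_{ij}^{l} \right) \tag{8} \]

Step 5: “Determination of Normalized direct relation matrix”:

\[ K = [k_{ij}]_{n \times n} = \frac{A}{s}, \quad \text{where} \quad s = \max\left( \max_{1 \le i \le n} \sum_{j=1}^{n} k_{ij},\ \max_{1 \le j \le n} \sum_{i=1}^{n} k_{ij} \right) \]

Step 6: “Total relation matrix T is calculated”:

\[ T = [t_{ij}]_{n \times n} = \lim_{m \to \infty} \left( K + K^{2} + K^{3} + \dots + K^{m} \right) = K (I - K)^{-1} \]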

Step 7: “Building the causal diagram”. Let Ri and Cj be the n × 1 and 1 × n vectors holding the row sums and column sums of T, respectively. Ri is the overall influence exerted by factor i on the other factors, whereas Cj is the overall influence received by factor j. (Ri + Ci) thus reflects the importance of the ith factor in the decision-making process, while (Ri − Ci) shows the net effect of the ith factor. The DMs can set a threshold to filter out the relations that have only a minimal influence. Values above the threshold are mapped in the causal diagram, which plots the dataset (Ri + Ci, Ri − Ci).
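The CFCS defuzzification of Step 4 can be sketched in a few lines. The code below follows Eqs. (1) to (8) for a single triangular fuzzy number. The names `cfcs`, `SCALE` and `integrated_value` are illustrative, and taking min a = 0 and max c = 1 over the whole linguistic scale is an assumption, since the text does not state the set over which the minimum and maximum are taken.

```python
# Linguistic scale of Table 2 as (left, middle, right) triangular fuzzy numbers.
SCALE = {
    "VHi": (0.75, 1.0, 1.0),
    "Hi": (0.5, 0.75, 1.0),
    "Li": (0.25, 0.5, 0.75),
    "VLi": (0.0, 0.25, 0.5),
    "Ni": (0.0, 0.0, 0.25),
}


def cfcs(tfn, lo=0.0, hi=1.0):
    """Defuzzify one TFN (a, b, c); lo = min a and hi = max c (assumed 0 and 1)."""
    a, b, c = tfn
    span = hi - lo
    xa, xb, xc = (a - lo) / span, (b - lo) / span, (c - lo) / span  # Eqs. (1)-(3)
    lxa = xb / (1 + xb - xa)                                        # Eq. (4)
    lxc = xc / (1 + xc - xb)                                        # Eq. (5)
    x = (lxa * (1 - lxa) + lxc * lxc) / (1 - lxa + lxc)             # Eq. (6)
    return lo + x * span                                            # Eq. (7)


def integrated_value(labels):
    """Average the crisp scores given by several DMs for one pair (i, j), Eq. (8)."""
    scores = [cfcs(SCALE[label]) for label in labels]
    return sum(scores) / len(scores)


for label, tfn in SCALE.items():
    print(f"{label:4s} {cfcs(tfn):.4f}")
```

Under these assumptions "Li" maps to a crisp score of 0.5, and the scores increase monotonically from "Ni" to "VHi".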

3 Data Analysis

The objective of this analysis is to investigate the interdependency among software vulnerability types by applying F-DEMATEL. The aim is to help the security team remove the vulnerabilities that give rise to other vulnerabilities. The first step is to list the software vulnerability categories and then partition them into cause and effect groups. The decision-making team consists of experts with 5–10 years of development and testing experience. To assess the vulnerability types, the decision makers use the five-point fuzzy linguistic scale given in Table 2. The initial direct matrix (IDM) built from the decision makers' assessments of the identified vulnerability types is given in Table 3. The normalized direct-relation (NDR) matrix is calculated as described in Step 5 of Sect. 2.2 and is shown in Table 4. The total-relation matrix is then obtained using Step 6 and is given in Table 5. Finally, the cause and effect groups are identified as specified in Step 7 and presented in Table 6.

Table 3 Initial direct matrix

      G1     G2     G3     G4     G5     G6     G7     G8     G9
G1    Ni     VLi    Li     Li     VLi    VLi    Hi     VHi    VLi
G2    Li     Ni     VLi    VHi    Li     VLi    VHi    Li     VLi
G3    VLi    Li     Ni     Hi     Li     VLi    Li     Li     VLi
G4    VLi    Hi     Li     Ni     Li     VLi    Li     VLi    VLi
G5    Li     Li     Hi     Li     Ni     VLi    Li     Hi     Li
G6    VLi    Li     Hi     Li     VLi    Ni     Li     Li     VLi
G7    Li     VLi    VLi    Li     Hi     Li     Ni     Li     Hi
G8    Li     Hi     Li     VLi    Li     VLi    Li     Ni     VLi
G9    VLi    Li     VLi    Li     Hi     Li     Li     VLi    Ni

Table 4 Normalized direct matrix

      G1       G2       G3       G4       G5       G6       G7       G8       G9
G1    0.065    0.052    0.100    0.100    0.052    0.052    0.149    0.186    0.052
G2    0.100    0.065    0.052    0.186    0.100    0.052    0.186    0.100    0.052
G3    0.052    0.100    0.065    0.149    0.100    0.052    0.100    0.100    0.052
G4    0.052    0.149    0.100    0.065    0.100    0.052    0.100    0.052    0.052
G5    0.100    0.100    0.149    0.100    0.065    0.052    0.100    0.149    0.100
G6    0.052    0.100    0.149    0.100    0.052    0.065    0.100    0.100    0.052
G7    0.100    0.052    0.052    0.100    0.149    0.100    0.065    0.100    0.149
G8    0.100    0.149    0.100    0.052    0.100    0.052    0.100    0.065    0.052
G9    0.052    0.100    0.052    0.100    0.149    0.100    0.100    0.052    0.065

Table 5 Total-relation matrix

      G1       G2       G3       G4       G5       G6       G7       G8       G9
G1    0.393    0.466    0.476    0.539    0.471    0.321    0.610    0.602    0.357
G2    0.464    0.522    0.473    0.675    0.566    0.353    0.701    0.568    0.396
G3    0.366    0.498    0.429    0.575    0.500    0.309    0.549    0.501    0.344
G4    0.351    0.519    0.442    0.480    0.481    0.297    0.530    0.438    0.330
G5    0.468    0.567    0.573    0.606    0.538    0.356    0.630    0.623    0.441
G6    0.361    0.493    0.506    0.529    0.450    0.321    0.545    0.498    0.340
G7    0.449    0.497    0.466    0.575    0.594    0.391    0.566    0.554    0.474
G8    0.420    0.542    0.466    0.493    0.504    0.313    0.560    0.480    0.348
G9    0.369    0.497    0.424    0.534    0.549    0.361    0.553    0.462    0.362
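Steps 5 and 6 of Sect. 2.2 can be illustrated on a small matrix. The sketch below normalizes a direct-relation matrix by the largest row or column sum and then accumulates the series K + K² + K³ + …, which equals K(I − K)⁻¹ when the series converges. The 2 × 2 input is a hypothetical example chosen so that the result can be checked by hand; it is not data from the study, and the function names are illustrative.

```python
def normalise(a):
    """Step 5: divide by s, the largest row sum or column sum of the matrix."""
    n = len(a)
    row_max = max(sum(row) for row in a)
    col_max = max(sum(a[i][j] for i in range(n)) for j in range(n))
    s = max(row_max, col_max)
    return [[v / s for v in row] for row in a]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def total_relation(k, tol=1e-12, max_terms=10_000):
    """Step 6: T = K + K^2 + K^3 + ..., summed until the terms are negligible."""
    n = len(k)
    total = [row[:] for row in k]
    term = [row[:] for row in k]
    for _ in range(max_terms):
        term = matmul(term, k)
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
        if max(abs(v) for row in term for v in row) < tol:
            return total
    raise ValueError("series did not converge; spectral radius of K may be >= 1")


# Hypothetical 2 x 2 direct-relation matrix (not from the chapter).
A = [[0.0, 2.0],
     [1.0, 0.0]]
K = normalise(A)          # [[0, 1], [0.5, 0]]
T = total_relation(K)     # closed form K (I - K)^-1 = [[1, 2], [1, 1]]
print(K)
print(T)
```

Summing the series avoids a matrix inversion routine and keeps the sketch dependency-free; a linear-algebra library would normally compute K(I − K)⁻¹ directly.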

Table 6 Identification of cause and effect

      Ri       Ci       Ri + Ci    Ri − Ci    Group
G1    4.235    3.641    7.875      0.594      Cause
G2    4.718    4.601    9.319      0.118      Cause
G3    4.069    4.254    8.323      −0.185     Effect
G4    3.869    5.005    8.874      −1.136     Effect
G5    4.802    4.653    9.455      0.149      Cause
G6    4.043    3.023    7.066      1.020      Cause
G7    4.567    5.244    9.811      −0.678     Effect
G8    4.126    4.726    8.851      −0.600     Effect
G9    4.111    3.392    7.502      0.719      Cause
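The quantities in Table 6 follow directly from the total-relation matrix, as the sketch below illustrates using the rounded entries of Table 5. Taking the threshold as the mean of all entries of T is an assumption: the text does not say how the threshold was chosen, but the mean of the Table 5 entries rounds to the quoted value 0.476. Function names are illustrative, and diagonal (self-influence) entries are skipped when listing links.

```python
LABELS = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G9"]

# Total-relation matrix T, entries as printed in Table 5 (rounded to 3 decimals).
T = [
    [0.393, 0.466, 0.476, 0.539, 0.471, 0.321, 0.610, 0.602, 0.357],
    [0.464, 0.522, 0.473, 0.675, 0.566, 0.353, 0.701, 0.568, 0.396],
    [0.366, 0.498, 0.429, 0.575, 0.500, 0.309, 0.549, 0.501, 0.344],
    [0.351, 0.519, 0.442, 0.480, 0.481, 0.297, 0.530, 0.438, 0.330],
    [0.468, 0.567, 0.573, 0.606, 0.538, 0.356, 0.630, 0.623, 0.441],
    [0.361, 0.493, 0.506, 0.529, 0.450, 0.321, 0.545, 0.498, 0.340],
    [0.449, 0.497, 0.466, 0.575, 0.594, 0.391, 0.566, 0.554, 0.474],
    [0.420, 0.542, 0.466, 0.493, 0.504, 0.313, 0.560, 0.480, 0.348],
    [0.369, 0.497, 0.424, 0.534, 0.549, 0.361, 0.553, 0.462, 0.362],
]


def row_col_sums(t):
    """Return (R, C): row sums (influence given) and column sums (influence received)."""
    n = len(t)
    r = [sum(row) for row in t]
    c = [sum(t[i][j] for i in range(n)) for j in range(n)]
    return r, c


def classify(t, labels):
    """Cause if R - C > 0, otherwise Effect."""
    r, c = row_col_sums(t)
    return {lab: ("Cause" if ri - ci > 0 else "Effect")
            for lab, ri, ci in zip(labels, r, c)}


def strong_links(t, labels, threshold):
    """For each factor, list the other factors it influences above the threshold."""
    n = len(t)
    return {labels[i]: [labels[j] for j in range(n) if j != i and t[i][j] > threshold]
            for i in range(n)}


threshold = sum(map(sum, T)) / len(T) ** 2   # assumption: mean of all entries
R, C = row_col_sums(T)
groups = classify(T, LABELS)
links = strong_links(T, LABELS, threshold)
for lab, ri, ci in zip(LABELS, R, C):
    print(f"{lab}  R={ri:.3f}  C={ci:.3f}  R+C={ri + ci:.3f}  R-C={ri - ci:+.3f}  {groups[lab]}")
```

Because the inputs are rounded to three decimals, the sums can differ from Table 6 in the last digit, but the signs of R − C, and hence the cause and effect groups, are unaffected.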

A threshold value is defined on the total-relation matrix to filter out minor influences; it is 0.476 in this case. (R + C) indicates the total strength of the effects exerted and received, as given in Table 6: the larger the (R + C) value, the greater the degree of interaction with other attributes. If (R − C) is positive, the attribute affects other attributes; if it is negative, the attribute is affected by them. G7 (Information Gain) is the most prominent attribute, with the greatest (R + C) value. With the threshold fixed, Table 5 shows which vulnerability influences which: an entry greater than the threshold indicates that the row vulnerability influences the column vulnerability, and the entry itself is the intensity of the effect. Vulnerability G1 influences G3, G4, G7 and G8 with intensities of 0.476, 0.539, 0.610 and 0.602, respectively. G2 influences G4, G5, G7 and G8. G3 influences G2, G4, G5, G7 and G8. G4 influences G2, G5 and G7. G5 and G6 each influence G2, G3, G4, G7 and G8. G7 influences G2, G4, G5 and G8, while G8 and G9 each influence G2, G4, G5 and G7. The causal diagram obtained by mapping the dataset (Ri + Ci, Ri − Ci) from Table 6 is shown in Fig. 1. The diagram divides the vulnerability types visually into a cause group and an effect group: G1, G2, G5, G6 and G9 fall in the cause group, and G3, G4, G7 and G8 in the effect group.

Fig. 1 IRD diagram

4 Conclusion

Software systems perform important and versatile functions in our everyday lives, and software components are at the heart of almost every advanced and complex system. Many organizations therefore rely on interdependent, networked software systems for their business decisions. However, organizations have lost large amounts of money and reputation owing to security breaches and software vulnerabilities. This work identifies and assesses the contextual relationships between categories of software vulnerabilities, using fuzzy theory and the DEMATEL methodology, to help the security team minimize risk. The nine selected vulnerability types are classified into cause and effect groups. According to Table 6, SQLI, XSS, FI, CE and RC are the causative vulnerabilities, whereas BO, CSRF, IG and GP are the affected vulnerabilities. Based on the (R + C) values, the priority order of the vulnerabilities is G7 > G5 > G2 > G4 > G8 > G3 > G1 > G9 > G6. The most critical vulnerability, with the greatest (R + C) value, is G7, which requires prompt attention. Its (Ri − Ci) value is negative, so it acts as an effect, and its impact can be reduced by working directly on its causes. Similarly, G3, G4 and G8 are effects of other vulnerabilities. Security managers need to focus on the causative vulnerabilities so that the other vulnerabilities can be controlled and losses minimized.

References

1. Gourisetti SNG, Mylrea M, Patangia H (2019) Cybersecurity vulnerability mitigation framework through empirical paradigm (CyFEr): prioritized gap analysis. IEEE Syst J 14(2):1897–1908
2. Statista L. Internet of things (iot) connected devices installed base worldwide from 2015 to 2025 (in billions), Statista (2019). https://www.statista.com/statistics/471264/iot-number-of-connected-devices-worldwide
3. Gubbi J, Buyya R, Marusic S, Palaniswami M (2013) Internet of Things (IoT): a vision, architectural elements, and future directions. Future Gener Comput Syst 29(7):1645–1660
4. Kapur PK, Pham H, Gupta A, Jha PC (2011) Software reliability assessment with OR applications. Springer, London, p 364
5. Carnegie Mellon University, Security Vulnerabilities. https://www.sei.cmu.edu/our-work/security-vulnerabilities/index.cfm
6. Singh UK, Joshi C, Kanellopoulos D (2019) A framework for zero-day vulnerabilities detection and prioritization. J Inf Secur Appl 46:164–172
7. Huang CC, Lin FY, Lin FYS, Sun YS (2013) A novel approach to evaluate software vulnerability prioritization. J Syst Softw 86(11):2822–2840
8. Shrivastava AK, Sharma R, Kapur PK (2015) Vulnerability discovery model for a software system using stochastic differential equation. In: 2015 international conference on futuristic trends on computational analysis and knowledge management (ABLAZE). IEEE, pp 199–205
9. Dondo MG (2008) A vulnerability prioritization system using a fuzzy risk analysis approach. In: IFIP international information security conference. Springer, Boston, MA, pp 525–540
10. National Vulnerability Database, published on January 1, 2020. https://nvd.nist.gov/general/news
11. Liu Q, Zhang Y, Kong Y, Wu Q (2012) Improving VRSS-based vulnerability prioritization using analytic hierarchy process. J Syst Softw 85(8):1699–1708
12. Kimura M (2006) Software vulnerability: definition, modelling, and practical evaluation for e-mail transfer software. Int J Press Vessels Pip 83(4):256–261
13. Okamura H, Tokuzane M, Dohi T (2013) Quantitative security evaluation for software system from vulnerability database. J Softw Eng Appl 06:15
14. Kapur PK, Garg RB (1992) A software reliability growth model for an error-removal phenomenon. Softw Eng J 7(4):291–294
15. Kansal Y, Kapur PK, Kumar U, Kumar D (2017) User-dependent vulnerability discovery model and its interdisciplinary nature. Int J Life Cycle Reliab Saf Eng, Springer 6(1):23–29
16. Younis A, Joh H, Malaiya Y (2011) Modeling learningless vulnerability discovery using a folded distribution. In: Proceedings of SAM, vol 11, pp 617–623
17. Arora A, Krishnan R, Nandkumar A, Telang R, Yang Y (2004) Impact of vulnerability disclosure and patch availability-an empirical analysis. In: Third workshop on the economics of information security, vol 24, pp 1268–1287
18. Anjum M, Agarwal V, Kapur PK, Khatri SK (2020) Two-phase methodology for prioritization and utility assessment of software vulnerabilities. Int J Syst Assurance Eng Manage 11(2):289–300
19. Liu Q, Zhang Y (2011) VRSS: a new system for rating and scoring vulnerabilities. Comput Commun 34(3):264–273
20. Kazimieras Zavadskas E, Antucheviciene J, Chatterjee P (2019) Multiple-criteria decision-making (MCDM) techniques for business processes information management
21. Govindan K, Rajendran S, Sarkis J, Murugesan P (2015) Multicriteria decision making approaches for green supplier evaluation and selection: a literature review. J Clean Prod 98:66–83
22. Mardani A, Jusoh A, Zavadskas EK (2015) Fuzzy multiple criteria decision-making techniques and applications–two decades review from 1994 to 2014. Expert Syst Appl 42(8):4126–4148
23. Zhang X, Xu Z (2015) Soft computing based on maximizing consensus and fuzzy TOPSIS approach to interval-valued intuitionistic fuzzy group decision making. Appl Soft Comput 26:42–56
24. Chen N, Xu Z (2015) Hesitant fuzzy ELECTRE II approach: a new way to handle multi-criteria decision-making problems. Inf Sci 2015(292):175–197
25. Anjum M, Kapur PK, Agarwal V, Khatri SK (2020) A framework for prioritizing software vulnerabilities using fuzzy best-worst method. In: 2020 8th international conference on reliability, infocom technologies and optimization (trends and future directions) (ICRITO). IEEE, pp 311–316
26. Sibal R, Sharma R, Sabharwal S (2017) Prioritizing software vulnerability types using multicriteria decision-making techniques. Life Cycle Reliab Saf Eng 6(1):57–67
27. Narang S, Kapur PK, Damodaran D, Majumdar R (2018) Prioritizing types of vulnerability on the basis of their severity in multi-version software systems using DEMATEL technique. In: 2018 7th international conference on reliability, infocom technologies and optimization (trends and future directions) (ICRITO). IEEE, pp 162–167
28. National Vulnerability Database, nvd.nist.gov/, 2020
29. CVE Details, The Ultimate Security Vulnerability Data source, www.cvedetails.com. 2020
30. Agarwal V, Govindan K, Darbari JD, Jha PC (2016) An optimization model for sustainable solutions towards implementation of reverse logistics under collaborative framework. Int J Syst Assurance Eng Manage 7(4):480–487
31. Opricovic S, Tzeng GH (2003) Defuzzification within a multicriteria decision model. Int J Uncertainty, Fuzziness Knowl-Based Syst 11(05):635–652

Universal Generating Function Approach for Evaluating Reliability and Signature of All-Digital Protection Systems

Soni Bisht and S. B. Singh

Abstract Reliability analysis of the components of all-digital protection systems (ADPS) is a key part of the system reliability quantification process. This paper focuses on the reliability indices and signature evaluation of the ADPS. First, the reliability of the ADPS is obtained using the universal generating function (UGF), both for independent and identically distributed components and for non-identically distributed components. Second, the signature is evaluated with the help of Owen's method using different algorithms, where the considered systems are coherent. Lastly, the mean time to failure is calculated with the help of the minimal signature. The paper relates the reliability block diagram (RBD) to the system reliability and signature of the considered systems.

Keywords Digital protection system · Reliability block diagram · Coherent system · Universal generating function (UGF) · Signature reliability · Minimal signature

S. Bisht (B) Department of Mathematics, Eternal University, Himachal Pradesh, Baru Sahib, India
S. B. Singh Department of Mathematics, Statistics and Computer Science, G.B. Pant University of Agriculture and Technology, Pantnagar, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_14

1 Introduction

In an electrical power system, power distribution is an important factor in delivering the desired electricity to consumers. The design of electric power systems has changed owing to environmental factors, economic problems and customer demands for more reliable and efficient electricity. To realize the concept of the future grid, the current need is to develop advanced grid resources, communication techniques and information technologies in the electric power system. A future grid is a modernized power system with intelligent technologies that significantly help to control and monitor the power system. The future grid contains many extensive control systems that deal with different aspects such as voltage, current, energy distribution at transformers and substation transformers, distribution in different switching devices, and smart meters [4, 21].

The recent development of non-conventional devices and the widespread use of digital relays motivate the analysis of the performance of an ADPS. An IEC 61850 digital process system is connected to digital relays through the output of the non-conventional device. IEC 61850 substations have been found to be more reliable than traditional substations, which have been in use for decades. Many of the features specified in IEC 61850, such as fibre optics, redundancy settings, and a self-testing and monitoring system, contribute to this increased reliability. Usually, a conventional protection system is less reliable than a digital protection system. Digital protection systems contain more electronic devices than conventional ones, viz. merging units (MUs), Ethernet switches, time sources and protective relays (PRs). The reliability of the system can be improved in two ways. One is to replace the copper wires with fibre optics, as proposed by the IEC 61850 process bus. The other is redundancy, viz. (i) adding redundant components to selected components of the MUs, and redundant MUs, named MU1 and MU2; (ii) a communication link to the protective relay (PR) and its redundant components, named PR1 and PR2; and (iii) Ethernet communication media (EC) and their redundant components, named EC1, EC2, EC3, EC4, EC5 and EC6 [2].

In the past, many techniques have been used to analyze different reliability aspects of conventional protection systems; one of them is the Markov process. Singh and Patton [17] discussed a model of a system and its related protection system.
The protection system recognizes faults and faulty components in the system in order to prevent damage and to minimize the effect of a faulty component on the operation of the whole system. Schweitzer et al. described the reliability analysis of protective systems using fault tree analysis. Many techniques applied to the reliability analysis of conventional protection systems are also applied to digital protection systems. Schweitzer et al. first introduced the unconventional architectures of the ADPS and found the reliability with the help of the minimal path method. Here, the researchers analyzed the component importance of all alternative architectures. Chauhan et al. [2] discussed the ranking of all components present in a digital protection system using a cost-based heuristic redundancy importance measure.

Reliability analysis with the help of the UGF is a novel approach in reliability engineering, especially for digital protection systems. Several authors have applied the UGF to systems such as series, parallel, l-out-of-m and consecutive m-out-of-n systems with identical and different elements, and have analyzed the reliability of binary-state and multistate systems with the help of the UGF using different algorithms [7, 8, 18]. Rausand and Arnljot [13] studied the probabilistic and statistical reliability of different complex systems. Ushakov [19] evaluated the unreliability of a weighted (w-l + 1)-out-of-m: system. Negi and Singh [12] computed the reliability, expected lifetime and Birnbaum measure of non-repairable complex systems that have two subsystems, A and B, with weighted u-out-of-v: G and weighted m-out-of-n: G configurations respectively, with the help of the UGF.

Da et al. [3] examined the importance of components in a coherent system with the help of the Birnbaum measure and the structural importance. Boland et al. [1] introduced a new component importance measure, called redundancy importance, for coherent systems. Samaniego and Francisco [15] discussed the concept of the signature, which is an essential tool for studying the stochastic and aging properties of coherent systems. It gives limited information about the failure status of the system lifetime of coherent systems. Navarro and Rychlik [11] obtained upper and lower bounds for the reliability functions and expected lifetimes of coherent systems based on dependent exchangeable absolutely continuous components with the help of the signature. Kumar and Singh [5] evaluated the reliability of an l-out-of-m coherent system with the help of the UGF; the authors assessed the tail signature, signature reliability, Barlow-Proschan index, mean time to failure and estimated cost. Navarro and Rubio [10] developed an algorithm for determining the signature reliability and MTTF of a five-component coherent system. Marichal and Mathonet [9] examined the reliability function, tail signature, signature and minimal signatures of coherent systems using the structure function. Kumar and Singh [6] calculated the reliability, signature reliability, mean time to failure, Barlow-Proschan index and expected cost of sliding window systems using different algorithms.

From the above discussion, it is evident that much work has been done on improving the reliability of non-conventional protection systems using different techniques. Keeping this in view, the present paper studies the models corresponding to different architectures of digital protection systems composed of non-conventional devices: external time sources, MUs, Ethernet switches, EC and PRs. We also analyze the reliability for independent and identically distributed components and for non-identically distributed components. The main focus of our study is to estimate the reliability and signature of the digital protection system with the help of the UGF and Owen's method, to show the impact of the various components on the system.

1.1 Definitions

1.1.1 Structure Function

The mapping $\phi : \{0, 1\}^m \to \{0, 1\}$ of a system consisting of m components is called the structure function of the coherent system. A system works if certain combinations of components work:

$$\phi(x) = \begin{cases} 1, & \text{if the system works} \\ 0, & \text{if the system fails} \end{cases}$$

1.1.2 Coherent Systems

A system is considered coherent if each of its components is relevant (a component is relevant when, for at least one configuration of the other components, the state of the system depends on whether that component is operating or not) and its structure function is monotone (a structure function is monotone when the state of the system never improves after the failure of a component).
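The two conditions in this definition can be checked by brute-force enumeration for small systems. The following is a minimal sketch, not part of the chapter's method; the function name `is_coherent` and the two example structure functions are illustrative assumptions.

```python
from itertools import product

def is_coherent(phi, m):
    """Return True if phi: {0,1}^m -> {0,1} is monotone and every component is relevant."""
    relevant = [False] * m
    for x in product((0, 1), repeat=m):
        for i in range(m):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]   # same state with component i working
                if phi(y) < phi(x):
                    return False               # a working component made the system fail
                if phi(y) > phi(x):
                    relevant[i] = True         # component i changes the system state here
    return all(relevant)

# Hypothetical examples: a 2-out-of-3 system and a system that ignores its third component.
two_out_of_three = lambda x: int(sum(x) >= 2)
ignores_third = lambda x: int(x[0] and x[1])

print(is_coherent(two_out_of_three, 3))
print(is_coherent(ignores_third, 3))
```

The enumeration visits all 2^m states, so it is only practical for systems with a small number of components.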

1.1.3 Structure Design of ADPS

A digital protection system is constructed from MUs, time sources, Ethernet switches, EC and PRs.

1.1.4 Merging Unit

An MU acquires AC current and voltage from conventional current and voltage transformers. It digitizes the AC current and voltage signals and sends them over the Ethernet network in the form of sampled values.

1.1.5 Time Sources

A time source is an external synchronization source that the system needs for time synchronization. A time source is required to manage the overall system reliability.

1.1.6 Ethernet Switch

This device is used to establish a network connection between the attached components. An Ethernet switch understands the packet addressing scheme and sends each data packet only to its destination port.

1.1.7 Ethernet Communication Media (EC)

EC plays a very significant role in all-digital protection systems. The EC is a part of the intelligent electronic device (IED). The term IED is used in the electric power industry to describe the controllers of power system equipment, such as circuit breakers.

Fig. 1 Structure designs (SDs) of digital protection system (panels: structure designs 1 to 5)

1.1.8 Protective Relay

A protective relay is a relay device designed to trip a circuit breaker when a fault is detected. The first PRs were electromagnetic devices, relying on coils to detect abnormal operating conditions such as overcurrent, overvoltage, and reverse flow of current and voltage. The purpose of a protective relay is to protect circuits and equipment from damage within a few seconds.

1.2 Alternative System Structure Designs

Zhang et al. described various alternative structure designs of the ADPS. The first structure design has redundancy only in the PRs, whereas the second has redundancy in the MUs and PRs, which are dependent on each other. The third structure design has two redundant and independent PRs. In the fourth structure design, the switches are in cascade form so as to provide cross backup for the merging units. The fifth structure design adds redundancy in the Ethernet switches and EC (Fig. 1).

1.3 System Reliability Block Diagram

An RBD is a graphical representation of the components of a system and the connections between them, which can be used to establish the overall system reliability, even for a large-scale system. It can therefore be concluded that the RBD method is a precise method for the analysis of such systems. Many researchers have observed that RBDs are one of the most important tools for system reliability analysis.

Here, we consider the five reliability block diagrams in Fig. 2, which correspond to the structure designs of the ADPS.

Fig. 2 RBD of the different structure designs (panels: (i) RBD of structural design 1; (ii) RBD of structural design 2; (iii) RBD of structural design 3; (iv) RBD of structural design 4; (v) RBD of structural design 5)

1.4 Universal Generating Function (UGF)

In reliability engineering, there are various methods for the reliability assessment of systems. The universal generating function is one of the useful tools because it reduces the complexity and computation time of the assessment. The UGF was first introduced by Ushakov [18]. The UGF of an independent discrete random variable X is expressed in polynomial form as:

$$U(z) = \sum_{m=1}^{M} p_m z^{l_m} \tag{1}$$

where the variable X has M possible values, $p_m$ is the probability that X is equal to $l_m$, and z is any variable.

Consider r independent discrete random variables $X_1, X_2, \ldots, X_r$. Let the UGFs of these random variables be $U_1(z), U_2(z), \ldots, U_r(z)$ respectively, and let $f(X_1, X_2, \ldots, X_r)$ be an arbitrary function. The combination of the r UGFs is defined by the composition operator $\otimes_f$, whose properties strictly depend on the properties of the function $f(X_1, X_2, \ldots, X_r)$. Therefore, the composition $U(z)$ is defined as:

$$U(z) = \otimes_f\big(U_1(z), U_2(z), \ldots, U_r(z)\big)$$

The UGFs of a structure design containing two types of components are expressed as follows. For the series arrangement, the UGF can be expressed as:

$$U_c(z) \otimes_{ser} U_d(z) = \sum_{m=1}^{M} k_{cm} z^{g_{cm}} \otimes_{ser} \sum_{n=1}^{N} k_{dn} z^{g_{dn}} = \sum_{m=1}^{M} \sum_{n=1}^{N} k_{cm} k_{dn} z^{ser(g_{cm}, g_{dn})} \tag{2}$$

The UGF of a parallel system is given by:

$$U_c(z) \otimes_{par} U_d(z) = \sum_{m=1}^{M} k_{cm} z^{g_{cm}} \otimes_{par} \sum_{n=1}^{N} k_{dn} z^{g_{dn}} = \sum_{m=1}^{M} \sum_{n=1}^{N} k_{cm} k_{dn} z^{par(g_{cm}, g_{dn})} \tag{3}$$

where $\otimes_{ser}$ and $\otimes_{par}$ are the composition operators over the u-functions associated with the series and parallel systems respectively. $ser(g_{cm}, g_{dn})$ and $par(g_{cm}, g_{dn})$ yield the entire performance rates of the binary-state elements c and d at states m and n respectively.
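The composition in Eqs. (2) and (3) can be illustrated with a short sketch. It assumes a UGF is stored as a dictionary mapping each performance value (the exponent of z) to its probability (the coefficient); the names `compose` and `binary` and the example probabilities 0.9 and 0.8 are illustrative and do not come from the chapter. For binary-state elements, min plays the role of the series function and max the role of the parallel function.

```python
from collections import defaultdict
from itertools import product

def compose(f, *ugfs):
    """Compose UGFs (dicts: performance -> probability) with the structure function f."""
    result = defaultdict(float)
    for combo in product(*(u.items() for u in ugfs)):
        performance = f(*(g for g, _ in combo))   # exponent of the resulting term
        probability = 1.0
        for _, k in combo:
            probability *= k                      # product of the coefficients
        result[performance] += probability        # collect like powers of z
    return dict(result)

def binary(p):
    """UGF of a binary element: p*z^1 + (1 - p)*z^0."""
    return {1: p, 0: 1.0 - p}

series = compose(min, binary(0.9), binary(0.8))     # works only if both elements work
parallel = compose(max, binary(0.9), binary(0.8))   # works if at least one element works
print(series[1], parallel[1])
```

Collecting like powers of z after each composition keeps the number of terms small, which is what makes the method attractive for larger structures.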

1.5 Assessment of System Reliability

1.5.1 Reliability of Structural Design 1

The reliability evaluation of structure design 1 by UGF can be done in the following two steps:

(i) Consider the structure design that contains non-identically distributed components.
(ii) Assume the architectures that contain independent and identically distributed components.

The reliability expression of structure design 1 ($R_{SD1}$) with the use of UGF can be obtained as:

$$R_{SD1} = \min\big(\min(MU, EC, ES),\ \max(\min(PR1, EC1), \min(PR2, EC2)),\ \max(TS1, TS2)\big) \tag{4}$$

(a) Reliability computation of structure design 1 when all components in the system are non-identically distributed.

The UGF of the different components present in the design can be expressed as:

$$u_i(z) = p_i z^1 + (1 - p_i) z^0 \tag{5}$$

Here, $p_i$ is the probability that the i-th component works, where i = TS1, TS2, MU, EC, ES, PR1, PR2, EC1, EC2. Now, apply the composition operator $\otimes(u_{TS1}(z), u_{TS2}(z), u_{MU}(z), u_{EC}(z), u_{ES}(z), u_{PR1}(z), u_{EC1}(z), u_{PR2}(z), u_{EC2}(z))$ to obtain the system structure function.

If $X_1$ is the state variable corresponding to the subsystem consisting of elements TS1 and TS2, then the structure function is obtained as:

$$X_1(z) = \max(TS1, TS2)$$

When $X_2$ is the state variable related to the subsystem containing the elements MU, EC and ES, then the structure function can be written as:

$$X_2(z) = \min(MU, EC, ES)$$

When $X_3$ is the state variable corresponding to the subsystem consisting of elements PR1 and EC1, then the structure function is given by:

$$X_3(z) = \min(PR1, EC1)$$

If $X_4$ is the state variable corresponding to the subsystem consisting of elements PR2 and EC2, then the structure function can be written as:

$$X_4(z) = \min(PR2, EC2)$$

When $X_5$ is the state variable related to the subsystem containing the elements $X_3$ and $X_4$, then the structure function can be evaluated by:

$$X_5(z) = \max(X_3, X_4)$$

If X is the state variable corresponding to the subsystem containing the elements $X_1$, $X_2$ and $X_5$, then the structure function can be computed as:

$$X(z) = \min(X_1, X_2, X_5)$$

Assuming the values of the $p_i$'s, i = TS1, TS2, ES, PR1, PR2, EC1, EC, and substituting them in Eq. (5), we have the UGFs of the components of the system as:

$$u_{TS1}(z) = 0.9z^1 + 0.1z^0,\quad u_{TS2}(z) = 0.95z^1 + 0.05z^0,\quad u_{ES}(z) = 0.95z^1 + 0.05z^0,\quad u_{PR1}(z) = 0.8z^1 + 0.2z^0,$$
$$u_{PR2}(z) = 0.85z^1 + 0.15z^0,\quad u_{EC1}(z) = 0.85z^1 + 0.15z^0,\quad u_{EC}(z) = 0.8z^1 + 0.2z^0.$$

The values for MU and EC2 are not listed; the results for $U_2(z)$ and $U_4(z)$ below correspond to $p_{MU} = 0.9$ and $p_{EC2} = 0.7$.

To find the system reliability of structure design 1, we apply the composition operators to the UGFs of the different components. By doing so, we have

$$\begin{aligned}
U_1(z) &= u_{TS1}(z) \otimes_{\max} u_{TS2}(z) = (0.9z^1 + 0.1z^0) \otimes_{\max} (0.95z^1 + 0.05z^0) = 0.995z^1 + 0.005z^0, \\
U_2(z) &= u_{MU}(z) \otimes_{\min} u_{EC}(z) \otimes_{\min} u_{ES}(z) = 0.684z^1 + 0.316z^0, \\
U_3(z) &= u_{PR1}(z) \otimes_{\min} u_{EC1}(z) = 0.68z^1 + 0.32z^0, \\
U_4(z) &= u_{PR2}(z) \otimes_{\min} u_{EC2}(z) = 0.595z^1 + 0.405z^0, \\
U_5(z) &= U_3(z) \otimes_{\max} U_4(z) = 0.8704z^1 + 0.1296z^0.
\end{aligned}$$

Finally, the reliability of structure design 1 is computed as:

$$U(z) = U_1(z) \otimes_{\min} U_2(z) \otimes_{\min} U_5(z) = 0.592376832z^1 + 0.407623168z^0$$

(b) When the components of structure design 1, namely the merging unit, Ethernet communication media, Ethernet switch, time sources and protective relays, have identical probabilities, i.e., the probabilities of all the components are the same, the structure function becomes:

$$R_{SD1} = 2p^9 - 4p^8 - p^7 + 4p^6 \tag{6}$$

Taking into account different component reliabilities, one obtains the reliability of structure design 1 as listed in Table 1 and depicted in Fig. 3.
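The numerical composition for case (a) can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: for independent binary elements the min-composition reduces to a product of reliabilities and the max-composition to one minus the product of unreliabilities; the helper names `series` and `parallel` are illustrative; and the values for MU (0.9) and EC2 (0.7) are not listed in the text but are assumed here because they reproduce the printed $U_2(z)$ and $U_4(z)$.

```python
# Component reliabilities; "MU" and "EC2" are assumptions inferred from U2(z) and U4(z).
p = {"TS1": 0.9, "TS2": 0.95, "ES": 0.95, "PR1": 0.8, "PR2": 0.85,
     "EC1": 0.85, "EC": 0.8, "MU": 0.9, "EC2": 0.7}

def series(*rs):
    """min-composition of independent binary elements: all must work."""
    out = 1.0
    for r in rs:
        out *= r
    return out

def parallel(*rs):
    """max-composition of independent binary elements: at least one must work."""
    fail = 1.0
    for r in rs:
        fail *= 1.0 - r
    return 1.0 - fail

U1 = parallel(p["TS1"], p["TS2"])        # time sources
U2 = series(p["MU"], p["EC"], p["ES"])   # merging unit, EC, Ethernet switch
U3 = series(p["PR1"], p["EC1"])
U4 = series(p["PR2"], p["EC2"])
U5 = parallel(U3, U4)                    # redundant protective relay branches
R_SD1 = series(U1, U2, U5)
print(round(R_SD1, 9))
```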

Table 1 Values of reliability of structure design 1

Component reliability    Reliability of SD1 (R_SD1)
0.90                     0.7004392
0.91                     0.7295654
0.92                     0.7590206
0.93                     0.7887547
0.94                     0.8187153
0.95                     0.8488473
0.96                     0.8790933
0.98                     0.9396874
0.99                     0.9699172

Fig. 3 Component reliability versus reliability of SD1 (horizontal axis: component reliability; vertical axis: reliability of SD 1)
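The entries of Table 1 follow from evaluating Eq. (6) on the listed grid of component reliabilities. A minimal sketch, in which the function name `r_sd1` is an illustrative choice and the polynomial is taken exactly as printed in Eq. (6):

```python
def r_sd1(p):
    """Eq. (6) as printed: reliability of structure design 1 for identical components."""
    return 2 * p**9 - 4 * p**8 - p**7 + 4 * p**6

for p in (0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.98, 0.99):
    print(f"{p:.2f}  {r_sd1(p):.7f}")
```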

1.5.2 Reliability of Structural Design 2

The reliability $R_{SD2}$ of structural design 2, with the help of the UGF method, is expressed as:

$$R_{SD2} = \min\big(\max(\min(MU1, EC1), \min(MU2, EC2)),\ ES,\ \max(\min(PR1, EC3), \min(PR2, EC4)),\ \max(TS1, TS2)\big) \tag{7}$$

(a) Reliability of structure design 2 when the components in the system are non-identically distributed. In this case, the system structure functions are as follows:

State variable    Subsystem consisting of elements
X1                max(TS1, TS2)
X2                min(MU1, EC1)
X3                min(MU2, EC2)
X4                max(X2, X3)
X5                min(PR1, EC3)
X6                min(PR2, EC4)
X7                max(X5, X6)
X8                min(X1, X4, ES, X7)

Substituting the assumed values of the $p_i$'s in Eq. (5), we have the UGFs of the components of structure design 2 as:

$$u_{TS1}(z) = 0.9z^1 + 0.1z^0,\quad u_{TS2}(z) = 0.95z^1 + 0.05z^0,\quad u_{MU1}(z) = 0.8z^1 + 0.2z^0,\quad u_{MU2}(z) = 0.85z^1 + 0.15z^0,$$
$$u_{PR1}(z) = 0.8z^1 + 0.2z^0,\quad u_{PR2}(z) = 0.85z^1 + 0.15z^0,\quad u_{ES}(z) = 0.9z^1 + 0.1z^0,\quad u_{EC1}(z) = 0.7z^1 + 0.3z^0,$$
$$u_{EC2}(z) = 0.75z^1 + 0.25z^0,\quad u_{EC3}(z) = 0.85z^1 + 0.15z^0,\quad u_{EC4}(z) = 0.7z^1 + 0.3z^0.$$

To evaluate the reliability of structure design 2, first compute the following by applying the composition operators to the UGFs of the components:

$$\begin{aligned}
U_1(z) &= u_{TS1}(z) \otimes_{\max} u_{TS2}(z) = (0.9z^1 + 0.1z^0) \otimes_{\max} (0.95z^1 + 0.05z^0) = 0.995z^1 + 0.005z^0, \\
U_2(z) &= u_{MU1}(z) \otimes_{\min} u_{EC1}(z) = 0.56z^1 + 0.44z^0, \\
U_3(z) &= u_{MU2}(z) \otimes_{\min} u_{EC2}(z) = 0.6375z^1 + 0.3625z^0, \\
U_4(z) &= U_2(z) \otimes_{\max} U_3(z) = 0.8405z^1 + 0.1595z^0, \\
U_5(z) &= u_{PR1}(z) \otimes_{\min} u_{EC3}(z) = 0.68z^1 + 0.32z^0, \\
U_6(z) &= u_{PR2}(z) \otimes_{\min} u_{EC4}(z) = 0.595z^1 + 0.405z^0, \\
U_7(z) &= U_5(z) \otimes_{\max} U_6(z) = 0.8704z^1 + 0.1296z^0, \\
U_{ES}(z) &= 0.9z^1 + 0.1z^0.
\end{aligned}$$

Finally, the reliability of structure design 2 is obtained as:

$$U(z) = U_1(z) \otimes_{\min} U_4(z) \otimes_{\min} U_{ES}(z) \otimes_{\min} U_7(z) = 0.6551z^1 + 0.3449z^0$$

(b) When the components are independent and identically distributed ($p_i = p$), the structure function of the reliability of structure design 2 is expressed as:

$$R_{SD2} = -p^{11} + 2p^{10} + 4p^9 - 8p^8 - 4p^7 + 8p^6 \tag{8}$$

Table 2 lists the reliability of structure design 2 corresponding to different component reliabilities, and the same is represented graphically in Fig. 4.

Table 2 Values of reliability of structure design 2

Component reliability    Reliability of SD2
0.90                     0.82783096
0.91                     0.85007248
0.92                     0.87148760
0.93                     0.89196890
0.94                     0.91140759
0.95                     0.92969391
0.96                     0.94671769
0.98                     0.97653804
0.99                     0.98911714

Fig. 4 Component reliability versus reliability of SD2 (horizontal axis: component reliability; vertical axis: reliability of SD 2)
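As a check on Eq. (8), the polynomial can be derived from the state variables of structure design 2 by setting every component reliability to p. This is a sketch of the intermediate step, assuming independent components, so that a min-composition multiplies reliabilities and a max-composition of two branches with reliabilities a and b gives a + b - ab:

$$\begin{aligned}
X_1 &= 2p - p^2, \qquad X_4 = X_7 = 2p^2 - p^4, \qquad ES = p, \\
R_{SD2} &= (2p - p^2)\, p\, (2p^2 - p^4)^2 = (2p^2 - p^3)(4p^4 - 4p^6 + p^8) \\
&= 8p^6 - 4p^7 - 8p^8 + 4p^9 + 2p^{10} - p^{11}.
\end{aligned}$$

For p = 0.90 this evaluates to approximately 0.82783096, which agrees with the first row of Table 2.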

1.5.3 Reliability of Structural Design 3

The reliability of structure design 3 with the help of UGF is given by:

$$R_{SD3} = \min\big(\max(TS1, TS2),\ \max(\min(MU1, EC1, ES1, PR1, EC3), \min(MU2, EC2, ES2, PR2, EC4))\big) \tag{9}$$

(a) When all components in the system are non-identically distributed, the reliability of structure design 3 is computed as follows.

If $X_1$ is the state variable corresponding to the subsystem containing the elements TS1 and TS2, then the structure function can be expressed as:

$$X_1(z) = \max(TS1, TS2)$$

If $X_2$ is the state variable related to the subsystem containing the elements MU1, EC1, ES1, PR1 and EC3, then the structure function can be computed as:

$$X_2(z) = \min(MU1, EC1, ES1, PR1, EC3)$$

When $X_3$ is the state variable corresponding to the subsystem containing the elements MU2, EC2, ES2, PR2 and EC4, then the structure function is given by:

$$X_3(z) = \min(MU2, EC2, ES2, PR2, EC4)$$

If $X_4$ is the state variable related to the subsystem containing the elements $X_2$ and $X_3$, then the structure function is computed by:

$$X_4(z) = \max(X_2, X_3)$$

When X is the state variable corresponding to the subsystem containing the elements $X_1$ and $X_4$, then the structure function becomes:

$$X(z) = \min(X_1, X_4)$$

Putting the assumed values of the $p_i$'s in Eq. (5), the UGFs of the components can be expressed as:

$$u_{TS1}(z) = 0.9z^1 + 0.1z^0,\quad u_{TS2}(z) = 0.95z^1 + 0.05z^0,\quad u_{MU1}(z) = 0.8z^1 + 0.2z^0,\quad u_{MU2}(z) = 0.85z^1 + 0.15z^0,$$
$$u_{PR1}(z) = 0.8z^1 + 0.2z^0,\quad u_{PR2}(z) = 0.85z^1 + 0.15z^0,\quad u_{ES1}(z) = 0.9z^1 + 0.1z^0,\quad u_{ES2}(z) = 0.95z^1 + 0.05z^0,$$
$$u_{EC1}(z) = 0.7z^1 + 0.3z^0,\quad u_{EC2}(z) = 0.75z^1 + 0.25z^0,\quad u_{EC3}(z) = 0.85z^1 + 0.15z^0,\quad u_{EC4}(z) = 0.7z^1 + 0.3z^0.$$

Now applying the composition operators, we have

$$\begin{aligned}
U_1(z) &= u_{TS1}(z) \otimes_{\max} u_{TS2}(z) = (0.9z^1 + 0.1z^0) \otimes_{\max} (0.95z^1 + 0.05z^0) = 0.995z^1 + 0.005z^0, \\
U_2(z) &= u_{MU1}(z) \otimes_{\min} u_{EC1}(z) \otimes_{\min} u_{ES1}(z) \otimes_{\min} u_{PR1}(z) \otimes_{\min} u_{EC3}(z) = 0.34272z^1 + 0.65728z^0, \\
U_3(z) &= u_{MU2}(z) \otimes_{\min} u_{EC2}(z) \otimes_{\min} u_{ES2}(z) \otimes_{\min} u_{PR2}(z) \otimes_{\min} u_{EC4}(z) = 0.360346875z^1 + 0.639653125z^0, \\
U_4(z) &= U_2(z) \otimes_{\max} U_3(z) = 0.579568794z^1 + 0.420431206z^0.
\end{aligned}$$

At last, the reliability of structure design 3 is given by:

$$U(z) = U_1(z) \otimes_{\min} U_4(z) = 0.576670z^1 + 0.42333z^0$$

(b) It is worth mentioning that when all components are identical ($p_i = p$), the structure function of the reliability becomes:

$$R_{SD3} = p^{12} - 2p^{11} - 2p^7 + 4p^6 \tag{10}$$

Table 3 lists the reliability of structure design 3 corresponding to different component reliabilities, which is also represented graphically in Fig. 5.
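Both cases for structure design 3 can be reproduced numerically. The sketch below assumes independent binary components and treats the final composition of $U_1$ and $U_4$ as a min-composition, which is the reading consistent with the outer min in Eq. (9) and with the printed result; the helper names are illustrative, and the component value for EC3 is taken as 0.85.

```python
from math import prod

# Component reliabilities taken from the UGFs listed for structure design 3.
p = {"TS1": 0.9, "TS2": 0.95, "MU1": 0.8, "MU2": 0.85, "PR1": 0.8, "PR2": 0.85,
     "ES1": 0.9, "ES2": 0.95, "EC1": 0.7, "EC2": 0.75, "EC3": 0.85, "EC4": 0.7}

def series(*rs):
    """min-composition: every element must work."""
    return prod(rs)

def parallel(*rs):
    """max-composition: at least one element must work."""
    return 1.0 - prod(1.0 - r for r in rs)

U1 = parallel(p["TS1"], p["TS2"])
U2 = series(*(p[k] for k in ("MU1", "EC1", "ES1", "PR1", "EC3")))
U3 = series(*(p[k] for k in ("MU2", "EC2", "ES2", "PR2", "EC4")))
U4 = parallel(U2, U3)
R_SD3 = series(U1, U4)

def r_sd3_identical(q):
    """Eq. (10): all components share the same reliability q."""
    return q**12 - 2 * q**11 - 2 * q**7 + 4 * q**6

print(round(U4, 9), round(R_SD3, 6), round(r_sd3_identical(0.90), 9))
```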

Table 3 Values of reliability of structure design 3

Component reliability    Reliability of SD3
0.90                     0.823978544
0.91                     0.851693121
0.92                     0.878118434
0.93                     0.902948198
0.94                     0.925847836
0.95                     0.946452873
0.96                     0.964367298
0.98                     0.990372479
0.99                     0.997498265

Fig. 5 Component reliability versus reliability of SD3 (horizontal axis: component reliability; vertical axis: reliability of SD 3)

1.5.4 Reliability of Structural Design 4

The reliability of structural design 4 by the proposed method is:

$$R_{SD4} = \min\big(\max(TS1, TS2),\ \max(\min(MU1, EC1), \min(MU2, EC2)),\ \min(ES1, ES2),\ \max(\min(PR1, EC3), \min(PR2, EC4))\big) \tag{11}$$

The system structure function of structure design 4, when all components in the system are non-identically distributed, is given by:

State variable    Subsystem consisting of elements
X1(z)             max(TS1, TS2)
X2(z)             min(MU1, EC1)
X3(z)             min(MU2, EC2)
X4(z)             max(X2, X3)
X5(z)             min(ES1, ES2)
X6(z)             min(PR1, EC3)
X7(z)             min(PR2, EC4)
X8(z)             max(X6, X7)
X(z)              min(X1, X4, X5, X8)

Again, the UGFs of the components for the considered values of the $p_i$'s are obtained as:

$$u_{TS1}(z) = 0.9z^1 + 0.1z^0,\quad u_{TS2}(z) = 0.95z^1 + 0.05z^0,\quad u_{MU1}(z) = 0.8z^1 + 0.2z^0,\quad u_{MU2}(z) = 0.85z^1 + 0.15z^0,$$
$$u_{PR1}(z) = 0.8z^1 + 0.2z^0,\quad u_{PR2}(z) = 0.85z^1 + 0.15z^0,\quad u_{ES1}(z) = 0.9z^1 + 0.1z^0,\quad u_{ES2}(z) = 0.95z^1 + 0.05z^0,$$
$$u_{EC1}(z) = 0.7z^1 + 0.3z^0,\quad u_{EC2}(z) = 0.75z^1 + 0.25z^0,\quad u_{EC3}(z) = 0.85z^1 + 0.15z^0,\quad u_{EC4}(z) = 0.7z^1 + 0.3z^0.$$

To evaluate the system reliability of structure design 4, let us apply the composition operators:

$$\begin{aligned}
U_1(z) &= u_{TS1}(z) \otimes_{\max} u_{TS2}(z) = (0.9z^1 + 0.1z^0) \otimes_{\max} (0.95z^1 + 0.05z^0) = 0.995z^1 + 0.005z^0, \\
U_2(z) &= u_{EC1}(z) \otimes_{\min} u_{MU1}(z) = 0.56z^1 + 0.44z^0, \\
U_3(z) &= u_{EC2}(z) \otimes_{\min} u_{MU2}(z) = 0.6375z^1 + 0.3625z^0, \\
U_4(z) &= U_2(z) \otimes_{\max} U_3(z) = 0.8405z^1 + 0.1595z^0, \\
U_5(z) &= u_{ES1}(z) \otimes_{\min} u_{ES2}(z) = 0.855z^1 + 0.145z^0, \\
U_6(z) &= u_{PR1}(z) \otimes_{\min} u_{EC3}(z) = 0.68z^1 + 0.32z^0, \\
U_7(z) &= u_{PR2}(z) \otimes_{\min} u_{EC4}(z) = 0.595z^1 + 0.405z^0, \\
U_8(z) &= U_6(z) \otimes_{\max} U_7(z) = 0.8704z^1 + 0.1296z^0.
\end{aligned}$$

At last, the reliability of structure design 4 can be computed as:

$$U(z) = U_1(z) \otimes_{\min} U_4(z) \otimes_{\min} U_5(z) \otimes_{\min} U_8(z) = 0.62236z^1 + 0.37764z^0$$

(b) When the components of structural design 4 are independent and identically distributed ($p_i = p$), the structure function is given by:

$$R_{SD4} = -p^{12} + 2p^{11} + 4p^{10} - 8p^9 - 4p^8 + 8p^7 \tag{12}$$

Considering different component reliabilities, one obtains the reliability of structure design 4 as listed in Table 4 and depicted in Fig. 6.

Table 4 Values of reliability of structural design 4

Component reliability    Reliability of SD4
0.90                     0.745047864
0.91                     0.773565963
0.92                     0.801768596
0.93                     0.829531084
0.94                     0.856723137
0.95                     0.883209219
0.96                     0.908848990
0.98                     0.957007283
0.99                     0.979226854

Fig. 6 Component reliability versus reliability of SD4 (horizontal axis: component reliability; vertical axis: reliability of SD 4)
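Eq. (12) can be checked against the state variables of structure design 4 by setting every component reliability to p. The following sketch of the intermediate step assumes independent components, with a min-composition multiplying reliabilities and a two-branch max-composition giving a + b - ab:

$$\begin{aligned}
X_1 &= 2p - p^2, \qquad X_4 = X_8 = 2p^2 - p^4, \qquad X_5 = p^2, \\
R_{SD4} &= (2p - p^2)\, p^2\, (2p^2 - p^4)^2 = (2p^3 - p^4)(4p^4 - 4p^6 + p^8) \\
&= 8p^7 - 4p^8 - 8p^9 + 4p^{10} + 2p^{11} - p^{12}.
\end{aligned}$$

For p = 0.90 this gives approximately 0.745047864, in agreement with the first row of Table 4.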

1.5.5 Reliability of Structural Design 5

The reliability of structure design 5 with the help of UGF is given by:

$$R_{SD5} = \min\big(\max(TS1, TS2),\ \max(\min(MU1, MU2)),\ \max(\min(EC1, ES1, EC3), \min(EC2, ES2, EC4)),\ \max(\min(EC5, ES3, EC7), \min(EC6, ES4, EC8)),\ \max(PR1, PR2)\big) \tag{13}$$

When $X_1$ is the state variable corresponding to the subsystem containing the elements TS1 and TS2, then the structure function can be written as:

$$X_1(z) = \max(TS1, TS2)$$

If $X_2$ is the state variable related to the subsystem containing the elements EC1, ES1 and EC3, then the structure function can be expressed as:

$$X_2(z) = \min(EC1, ES1, EC3)$$

If $X_3$ is the state variable corresponding to the subsystem containing the elements EC2, ES2 and EC4, then the structure function is given by:

$$X_3(z) = \min(EC2, ES2, EC4)$$

If $X_4$ is the state variable related to the subsystem containing the elements $X_2$ and $X_3$, then the structure function can be computed as:

$$X_4(z) = \max(X_2, X_3)$$

If $X_5$ is the state variable corresponding to the subsystem containing the elements EC5, ES3 and EC7, then the structure function can be obtained as:

$$X_5(z) = \min(EC5, ES3, EC7)$$

If $X_6$ is the state variable corresponding to the subsystem containing the elements EC6, ES4 and EC8, then the structure function becomes:

$$X_6(z) = \min(EC6, ES4, EC8)$$

If $X_7$ is the state variable related to the subsystem containing the elements $X_5$ and $X_6$, then the structure function can be obtained as:

$$X_7(z) = \max(X_5, X_6)$$

When X 8 is the state variable related to the subsystem containing the elements X 4 and X 7 , then the structure function can be expressed as: X 8 (z) = max(X 4 , X 7 ) When X is the state variable corresponding to the subsystem containing the elements X 1 andX 8 , then the structure function becomes: X (z) = max(X 1 , X 8 ) UGFs’ of the structure design 5 corresponding to considered values of pi ’s using Eq. (5) are given by: u T S1 (z) = 0.9z 1 + 0.1z 0 , u T S2 (z) = 0.95z 1 + 0.05z 0 , u MU 1 (z) = 0.8z 1 + 0.2z 0 , u MU 2 (z) = 0.85z 1 + 0.15z 0 , u P R1 (z) = 0.8z 1 + 0.2z 0 , u P R2 (z) = 0.85z 1 + 0.15z 0 , u E S1 (z) = 0.9z 1 + 0.1z 0 , u E S2 (z) = 0.95z 1 + 0.05z 0 , u EC1 (z) = 0.7z 1 + 0.3z 0 , u EC2 (z) = 0.75z 1 + 0.25z 0 , u EC3 (z) = 0.85z 1 + 0.25z 0 , u EC4 (z) = 0.7z 1 + 0.3z 0 , u EC5 (z) = 0.9z 1 + 0.1z 0 , u EC6 (z) = 0.95z 1 + 0.05z 0 , u EC7 (z) = 0.88z 1 + 0.12z 0 , u EC8 (z) = 0.8z 1 + 0.2z 0 u E S3 (z) = 0.98z 1 + 0.02z 0 u E S4 (z) = 0.95z 1 + 0.05z 0 . Applying composition operators to UGFs, we have. U1 (z) = u T S1 (z) ⊗ u T S2 (z) = (0.9z 1 +0.1z 0 ) ⊗ (0.95z 1 +0.05z 0 ) = (0.995z 1 + 0.005z 0 ),

max

max

U2 (z) = u EC1 (z) ⊗ u E S1 (z) ⊗ u EC3 (z) = 0.5355z 1 + 0.4645z 0 , min

min

U3 (z) = u EC2 (z) ⊗ u E S2 (z) ⊗ u EC4 (z) = 0.49875z 1 + 0.50125z 0 , min

min

U4 (z) = U2 (z) ⊗ U3 (z) = 0.767169375z 1 + 0.232830625z 0 , max

U5 (z) = u EC5 (z) ⊗ u E S3 (z) ⊗ u EC7 (z) = 0.77616z 1 + 0.22384z 0 , min

min

Table 5 Values of reliability of structure design 5

Component reliability    Reliability of SD5
0.90                     0.557636753
0.91                     0.600090709
0.92                     0.643788337
0.93                     0.688492578
0.94                     0.733922695
0.95                     0.779751827
0.96                     0.825604915
0.98                     0.915632991
0.99                     0.921059188

$U_6(z) = u_{EC6}(z) \otimes_{\min} u_{ES4}(z) \otimes_{\min} u_{EC8}(z) = 0.722z^1 + 0.278z^0$,

$U_7(z) = U_5(z) \otimes_{\max} U_6(z) = 0.93777248z^1 + 0.06222752z^0$,

$U_8(z) = U_4(z) \otimes_{\min} U_7(z) = 0.719430327z^1 + 0.280569673z^0$.

Finally, the reliability of structure design 5 can be computed as:

$U(z) = U_1(z) \otimes_{\min} U_8(z) = 0.715833175z^1 + 0.284166825z^0$
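The composition steps above can be replayed numerically. The following is a minimal sketch, not the authors' implementation: the helper names `binary`, `compose` and `chain` are hypothetical, each element is treated as a two-state UGF stored as a `{state: probability}` dictionary, and the order of composition is the one that reproduces the printed coefficients 0.719430327 and 0.715833175 (the merged EC/ES branches in parallel, then in series with each other and with the TS pair).

```python
from itertools import product


def binary(p):
    """Two-state UGF p*z^1 + (1-p)*z^0 as a {state: probability} dict."""
    return {1: p, 0: 1.0 - p}


def compose(u1, u2, f):
    """Compose two UGFs under the structure function f (min or max)."""
    out = {}
    for (g1, p1), (g2, p2) in product(u1.items(), u2.items()):
        g = f(g1, g2)
        out[g] = out.get(g, 0.0) + p1 * p2
    return out


def chain(f, *ugfs):
    """Left-to-right composition of several UGFs under the same f."""
    result = ugfs[0]
    for u in ugfs[1:]:
        result = compose(result, u, f)
    return result


# Element reliabilities taken from the UGFs listed in the text.
TS1, TS2 = binary(0.9), binary(0.95)
ES1, ES2, ES3, ES4 = binary(0.9), binary(0.95), binary(0.98), binary(0.95)
EC1, EC2, EC3, EC4 = binary(0.7), binary(0.75), binary(0.85), binary(0.7)
EC5, EC6, EC7, EC8 = binary(0.9), binary(0.95), binary(0.88), binary(0.8)

U1 = chain(max, TS1, TS2)
U2 = chain(min, EC1, ES1, EC3)
U3 = chain(min, EC2, ES2, EC4)
U4 = chain(max, U2, U3)
U5 = chain(min, EC5, ES3, EC7)
U6 = chain(min, EC6, ES4, EC8)
U7 = chain(max, U5, U6)
U8 = chain(min, U4, U7)   # series composition reproduces 0.7194...
U = chain(min, U1, U8)    # series composition reproduces 0.7158...

print(f"U8: {U8[1]:.9f}  U: {U[1]:.9f}")
```

The coefficient of $z^1$ in each dictionary is the probability that the corresponding subsystem works, so `U[1]` is the reliability of the whole design under these assumptions.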

(b) When the components are independent and identically distributed, the reliability of structure design 5 is expressed as:

\[ R_{SD5} = -p^{18} + 2p^{17} + 4p^{15} - 8p^{14} - 4p^{12} + 8p^{11} \tag{14} \]

Table 5 lists the reliability of structure design 5 for different values of the component reliability, and Fig. 7 represents the same graphically.
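Individual entries of Table 5 can be checked by evaluating the polynomial of Eq. (14) directly. The sketch below is illustrative only; the function names `r_sd5` and `r_sd5_factored` are hypothetical, and the factored form $p^{11}(2 - p)(2 - p^{3})^{2}$ is an algebraically equivalent rewriting used purely as a cross-check.

```python
def r_sd5(p):
    """Reliability of structure design 5 for i.i.d. components, Eq. (14)."""
    return -p**18 + 2*p**17 + 4*p**15 - 8*p**14 - 4*p**12 + 8*p**11


def r_sd5_factored(p):
    """Equivalent factored form of Eq. (14), used only as a cross-check."""
    return p**11 * (2 - p) * (2 - p**3) ** 2


for p in (0.90, 0.95, 0.98):
    print(f"{p:.2f}  {r_sd5(p):.9f}")
```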

Fig. 7 Component reliability versus reliability of SD 5 (x-axis: component reliability; y-axis: reliability of SD 5)

1.6 Signature Reliability

The idea of the signature was first introduced by Samaniego [14] for systems whose components have continuous and i.i.d. lifetimes. In recent decades, the signature has proved to be one of the most powerful tools for quantifying the reliability of coherent systems. The signature is also a very effective tool in the optimal design and reliability economics of systems. Let $G_1, G_2, \ldots, G_v$ be the i.i.d. component lifetimes with a continuous distribution function, and let $T$ be the lifetime of the network. The reliability at time $t \geq 0$ is then given by

\[ P(T > t) = \sum_{u=1}^{v} s_u\, P(G_{u:v} > t) \tag{15} \]

where $G_{1:v} \leq G_{2:v} \leq \ldots \leq G_{v:v}$ are the ordered lifetimes of the components and $s_u = P(T = G_{u:v})$, $u = 1, 2, \ldots, v$, is the probability that the system fails at the $u$th component failure. The vector $s = (s_1, s_2, \ldots, s_v)$ is called the signature of the system.
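To make Eq. (15) concrete, the sketch below evaluates it for three-component systems whose signatures are well known (series, 2-out-of-3 and parallel). It is illustrative only: the function names are hypothetical, and the survival probability of the $u$th order statistic is computed from the standard binomial identity $P(G_{u:v} > t) = \sum_{j=0}^{u-1} \binom{v}{j} F(t)^{j} (1 - F(t))^{v-j}$, where $F(t)$ is the common component failure probability at time $t$.

```python
from math import comb


def order_stat_survival(u, v, F):
    """P(G_{u:v} > t): fewer than u of the v components have failed by time t.

    F is the common component failure probability F(t) at the time of interest.
    """
    return sum(comb(v, j) * F**j * (1 - F) ** (v - j) for j in range(u))


def system_reliability(signature, F):
    """Eq. (15): mixture of order-statistic survival functions weighted by s_u."""
    v = len(signature)
    return sum(s * order_stat_survival(u, v, F)
               for u, s in enumerate(signature, start=1))


F = 0.2  # assumed component failure probability at the chosen time t
print(system_reliability((1, 0, 0), F))  # series system of 3 components
print(system_reliability((0, 1, 0), F))  # 2-out-of-3 system
print(system_reliability((0, 0, 1), F))  # parallel system of 3 components
```

For the series and parallel cases the result reduces to $(1 - F)^3$ and $1 - F^3$, which gives a simple sanity check on the formula.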

1.6.1 Algorithm for Evaluating the Signature of the All-Digital Protection System with the Help of the Structure Function

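Step 1: Analyze the signature from the structure function $\varphi$ by

\[ s_n = \frac{1}{\binom{m}{m-n+1}} \sum_{\substack{h \subseteq [m] \\ |h| = m-n+1}} \varphi(h) \;-\; \frac{1}{\binom{m}{m-n}} \sum_{\substack{h \subseteq [m] \\ |h| = m-n}} \varphi(h) \tag{16} \]

and compute a polynomial function of the digital protection system by

\[ h(u) = \sum_{a=1}^{b} C_a \binom{b}{a} p^{a} q^{b-a} \tag{17} \]

where

\[ C_a = \sum_{f=b-a+1}^{b} s_f, \quad a = 1, 2, \ldots, b. \]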

Step 2: Calculate the tail signature of the system, $S = (S_0, \ldots, S_m)$, using

\[ S_n = \sum_{e=n+1}^{m} s_e = \frac{1}{\binom{m}{m-n}} \sum_{|h| = m-n} \varphi(h) \tag{18} \]

Step 3: With the help of the Taylor expansion about $x = 1$, find the reliability function in the form of a polynomial:

\[ p(x) = x^{m}\, h\!\left(\frac{1}{x}\right) \tag{19} \]

Step 4: Estimate the tail signature of the ADPS reliability function with the help of Eq. (18) by

\[ S_n = \frac{(m-n)!}{m!}\, D^{n} p(1), \quad n = 0, 1, \ldots, m \tag{20} \]

where $D^{n} p(1)$ denotes the $n$th derivative of $p$ evaluated at $x = 1$.

Step 5: Find the signature of the ADPS using Eq. (20):

\[ s_n = S_{n-1} - S_n, \quad n = 1, \ldots, m \tag{21} \]
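Steps 3 to 5 can be sketched in a few lines using exact rational arithmetic. This is an illustrative sketch rather than the authors' code: the function names and the dictionary representation of $h(u)$ are hypothetical, and the normalisation $(m-n)!/m!$ is an assumption that is checked here only against the tail signature printed for structure design 1, $(1, 2/3, 11/36, 1/21, 0, \ldots)$.

```python
from fractions import Fraction
from math import factorial


def tail_signature(h, m):
    """Tail signature (S_0, ..., S_m) from the reliability polynomial h(u).

    h maps exponent -> integer coefficient; m is the number of components.
    """
    # Step 3: p(x) = x^m * h(1/x), so the term c*u^e becomes c*x^(m-e).
    p = {m - e: c for e, c in h.items()}
    tail = []
    for n in range(m + 1):
        # n-th derivative of p at x = 1: each term c*x^k contributes c*k!/(k-n)!.
        d = sum(c * (factorial(k) // factorial(k - n))
                for k, c in p.items() if k >= n)
        # Step 4: scale by (m-n)!/m!.
        tail.append(Fraction(factorial(m - n), factorial(m)) * d)
    return tail


def signature(tail):
    """Step 5: s_n = S_{n-1} - S_n for n = 1, ..., m."""
    return [tail[n - 1] - tail[n] for n in range(1, len(tail))]


# Structure design 1: h(u) = 2u^9 - 4u^8 - u^7 + 4u^6 with m = 9 components.
h_sd1 = {9: 2, 8: -4, 7: -1, 6: 4}
S = tail_signature(h_sd1, 9)
s = signature(S)
print([str(x) for x in S])
print([str(x) for x in s])
```

Using `Fraction` keeps every entry exact, so the components of the signature can be checked to sum to one without rounding error.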

1.7 Signature of ADPS

1.7.1 Signature of Structure Design 1

Using Eqs. (6) and (17), one can get the structure function of structure design 1 as:

\[ h(u) = 2u^{9} - 4u^{8} - u^{7} + 4u^{6} \tag{22} \]

The tail signature ($S_{SD1}$) and the signature ($s_{SD1}$) of structure design 1 are computed by Eqs. (20) and (21), respectively, as:

$S_{SD1} = (1, 2/3, 11/36, 1/21, 0, 0, 0, 0, 0, 0)$

$s_{SD1} = (1/3, 13/36, 195/756, 1/21, 287/4410, 0, 0, 0, 0, 0)$

1.7.2 Signature of Structure Design 2

With the help of Eqs. (8) and (17), we get the structure function of structure design 2 as:

\[ h(u) = -u^{11} + 2u^{10} + 4u^{9} - 8u^{8} - 4u^{7} + 8u^{6} \tag{23} \]

The tail signature ($S_{SD2}$) and the signature ($s_{SD2}$) of structure design 2 can be computed by Eqs. (20) and (21), respectively, as:

$S_{SD2} = (1, 10/11, 36/55, 56/165, 6/55, 4/231, 0, 0, 0, 0, 0, 0)$

$s_{SD2} = (1/11, 14/55, 2860/9075, 38/165, 1166/12705, 4/231, 0, 0, 0, 0)$

1.7.3 Signature of Structure Design 3

With the help of Eq. (10) and using Eq. (17), we get the structure function h(u) from the RBD of structure design 3 as:

\[ h(u) = u^{12} - 4u^{11} - 2u^{7} + 4u^{6} \tag{24} \]

The tail signature ($S_{SD3}$) and the signature ($s_{SD3}$) of structure design 3 can be obtained by Eqs. (20) and (21), respectively, as:

$S_{SD3} = (1, 1, 20/33, 3/11, 10/99, 1/36, 1/231, 0, 0, 0, 0)$

$s_{SD3} = (0, 13/33, 11/33, 17/99, 261/3564, 195/8316, 1/231, 0, 0, 0, 0, 0)$

1.7.4 Signature of Structure Design 4

Using Eqs. (12) and (17), we get the structure function of structure design 4 as:

\[ h(u) = -u^{12} + 2u^{11} + 4u^{10} - 8u^{9} - 4u^{8} + 8u^{7} \tag{25} \]

The tail signature ($S_{SD4}$) and the signature ($s_{SD4}$) of structure design 4 can be obtained by Eqs. (20) and (21), respectively, as:

$S_{SD4} = (1, 5/6, 6/11, 14/55, 4/55, 1/99, 0, 0, 0, 0, 0, 0, 0)$

$s_{SD4} = (1/6, 19/66, 16/55, 10/55, 341/5445, 1/99, 0, 0, 0, 0, 0, 0)$

1.7.5 Signature of Structure Design 5

With the help of Eqs. (14) and (17), we get the structure function of structure design 5 as:

\[ h(u) = -u^{18} + 2u^{17} + 4u^{15} - 8u^{14} - 4u^{12} + 8u^{11} \tag{26} \]

The tail signature ($S_{SD5}$) and the signature ($s_{SD5}$) of structure design 5 can be obtained by Eqs. (20) and (21), respectively, as:

$S_{SD5} = (1, 1/9, 1/17, 7/204, 8/765, 43/119, 53/357, 2/3978, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)$

$s_{SD5} = (1/3, 13/36, 195/756, 1/21, 287/4410, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)$

1.8 Algorithm for Finding Expected Lifetime of the ADPS with Minimal Signature

Step 1: Determine the MTTF of the ADPS, which has i.i.d. components with mean $\mu$.

Step 2: Evaluate the minimal signature of the ADPS with an expected lifetime of the reliability function by using

\[ H_T(w) = \sum_{r=1}^{n} C_r\, H_{1:r}(w) \tag{27} \]

where $H_{1:r}(w) = P_u(z_{1:r} > w)$ for $r = 1, 2, \ldots, n$.

Step 3: Calculate the expected lifetime $E(T)$ of the ADPS, which has i.i.d. components, by

\[ E(T) = \mu \sum_{r=1}^{n} \frac{C_r}{r} \tag{28} \]

where $C = (C_1, C_2, \ldots, C_n)$ is the vector of coefficients of the minimal signature.
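Equation (28) is a one-line computation once the minimal signature is known. The sketch below is illustrative: the function name `expected_lifetime` is hypothetical, $\mu = 1$ is an assumption (it is the value that reproduces the figures reported in this chapter), and the two minimal signatures are the ones listed for structure designs 1 and 5 in Sect. 1.9. The $C_r / r$ weighting corresponds to the mean of the minimum of $r$ i.i.d. exponential lifetimes being $\mu / r$.

```python
from fractions import Fraction


def expected_lifetime(min_signature, mu=1.0):
    """Eq. (28): E(T) = mu * sum_r C_r / r for a minimal signature (C_1, ..., C_n)."""
    total = sum(Fraction(c, r) for r, c in enumerate(min_signature, start=1))
    return mu * float(total)


# Minimal signatures as listed for structure designs 1 and 5.
ms_sd1 = (0, 0, 0, 0, 0, 4, -1, -4, 2)
ms_sd5 = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, -4, 0, -8, 4, 0, 2, -1)

print(f"E(T) SD1 = {expected_lifetime(ms_sd1):.6f}")
print(f"E(T) SD5 = {expected_lifetime(ms_sd5):.6f}")
```

Because the coefficients of a minimal signature sum to one, a quick sanity check is that `sum(ms_sd1)` and `sum(ms_sd5)` both equal 1.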

1.9 Mean Time to Failure (Expected Lifetime) of the All-Digital Protection System

1.9.1 MTTF of Structure Design 1

The expected lifetime of structure design 1 is obtained from its minimal signature. From Eq. (22), the structure function is:

\[ h(u) = 2u^{9} - 4u^{8} - u^{7} + 4u^{6} \]

The minimal signature of structure design 1 ($MS_{SD1}$), with the help of Eq. (27), is given by $MS_{SD1} = (0, 0, 0, 0, 0, 4, -1, -4, 2)$.

Hence, the expected lifetime of structure design 1 is evaluated by Eq. (28) as:

$E(T)_{SD1} = 0.246031$

1.9.2 MTTF of Structure Design 2

The expected lifetime of structure design 2 is computed from its minimal signature. From Eq. (23), the structure function is:

\[ h(u) = -u^{11} + 2u^{10} + 4u^{9} - 8u^{8} - 4u^{7} + 8u^{6} \]

The minimal signature of structure design 2 ($MS_{SD2}$), with the help of Eq. (27), is computed as $MS_{SD2} = (0, 0, 0, 0, 0, 8, -4, -8, 4, 2, -1)$.

Hence, the expected lifetime of structure design 2 is estimated by Eq. (28) as:

$E(T)_{SD2} = 0.252453$

1.9.3 MTTF of Structure Design 3

The expected lifetime of structure design 3 is obtained from its minimal signature. From Eq. (24), the structure function is:

\[ h(u) = u^{12} - 4u^{11} - 2u^{7} + 4u^{6} \]

The minimal signature of structure design 3 ($MS_{SD3}$), using Eq. (27), is specified as $MS_{SD3} = (0, 0, 0, 0, 0, 4, -2, 0, 0, 0, -4, 1)$.

Hence, the expected lifetime of structure design 3 is calculated by Eq. (28) as:

$E(T)_{SD3} = 0.100649$

1.9.4 MTTF of Structure Design 4

The expected lifetime of structure design 4 is obtained from its minimal signature. From Eq. (25), the structure function is:

\[ h(u) = -u^{12} + 2u^{11} + 4u^{10} - 8u^{9} - 4u^{8} + 8u^{7} \]

The minimal signature of structure design 4 ($MS_{SD4}$), by Eq. (27), is computed as $MS_{SD4} = (0, 0, 0, 0, 0, 0, 8, -4, -8, 4, 2, -1)$.

Hence, the expected lifetime of structure design 4 is evaluated by Eq. (28) as:

$E(T)_{SD4} = 0.254310$

1.9.5 MTTF of Structure Design 5

The expected lifetime of structure design 5 is obtained from its minimal signature. From Eq. (26), the structure function is:

\[ h(u) = -u^{18} + 2u^{17} + 4u^{15} - 8u^{14} - 4u^{12} + 8u^{11} \]

The minimal signature of structure design 5 ($MS_{SD5}$), with the help of Eq. (27), is determined as $MS_{SD5} = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, -4, 0, -8, 4, 0, 2, -1)$.

Hence, the mean time to failure of structure design 5 is estimated by Eq. (28) as:

$E(T)_{SD5} = 0.151268$

2 Conclusion

The present paper analyzed the reliability of the all-digital protection system with non-identically and identically distributed components. The reliability of all the structure designs was computed with the help of the UGF. In addition, unlike in earlier studies, the signature reliability of the all-digital protection systems was studied with the help of Owen's method. The expected lifetime of the digital protection systems with the help of the minimal signature was also studied for the first time. In these systems, many redundant components were considered in order to increase the system reliability. The study also reveals that, when the systems have non-identical components, architecture 5 is the most reliable, with a value of 0.715833175, while structure design 2 is the second most reliable, with a value of 0.6559. The signature analysis was carried out to examine the impact of the failure probabilities of the components in the architectures, which will help engineers and system engineers in designing such systems. The MTTF of all the architectures was also obtained with the help of the minimal signature for all five proposed architectures. We also found that the MTTF of structure design 2 is the highest, while structure design 3 attains the lowest value.

References

1. Boland PJ, El Neweihi E, Proschan F (1988) Active redundancy allocation in coherent systems. Probab Eng Inf Sci 2(3):343–353
2. Chauhan U, Singh V, Rani A, Pahuja GL (2015) Ranking of all digital protection system components using cost-based heuristic redundancy importance measure. In: International conference on recent developments in control, automation and power engineering (RDCAPE), pp 141–145
3. Da G, Zheng B, Hu T (2012) On computing signatures of coherent systems. J Multivariate Anal 103(1):142–150
4. Djekic Z, Portillo L, Kezunovic M (2008) Compatibility and interoperability evaluation of all-digital protection systems based on IEC 61850-9-2 communication standard. In: Power and energy society general meeting-conversion and delivery of electrical energy in the 21st century. IEEE, pp 1–5
5. Kumar A, Singh SB (2017) Computations of the signature reliability of the coherent system. Int J Qual Reliab Manage 34(6):785–797
6. Kumar A, Singh SB (2017) Signature reliability of sliding window coherent system. In: Mathematics applied to engineering, pp 83–95
7. Levitin G (2005) The universal generating function in reliability analysis and optimization, vol 6. Springer, London
8. Lisnianski A, Levitin G (2003) Multi-state system reliability: assessment, optimization, and applications, vol 6. World Scientific Publishing Co Inc.
9. Marichal JL, Mathonet P (2013) Computing system signatures through reliability functions. Stat Probab Lett 83(3):710–717
10. Navarro J, Rubio R (2009) Computations of signatures of coherent systems with five components. Commun Stat-Simul Comput 39(1):68–84
11. Navarro J, Rychlik T (2007) Reliability and expectation bounds for coherent systems with exchangeable components. J Multivariate Anal 98(1):102–113
12. Negi S, Singh SB (2015) Reliability analysis of the non-repairable complex system with weighted subsystems connected in series. Appl Math Comput 262:79–89
13. Rausand M, Høyland A (2004) System reliability theory: models, statistical methods and applications, vol 396. Wiley
14. Samaniego FJ (2007) System signatures and their applications in engineering reliability, vol 110. Springer Science & Business Media
15. Samaniego FJ (1985) On closure of the IFR class under formation of coherent systems. IEEE Trans Reliab 34(1):69–72
16. Scheer GW, Dolezilek DJ (2000) Comparing the reliability of Ethernet network topologies in substation control and monitoring networks. In: Western power delivery automation conference. Spokane, Washington
17. Singh C, Patton AD (1980) Protection system reliability modeling: unreadiness probability and mean duration of undetected faults. IEEE Trans Reliab 29(4):339–340
18. Ushakov I (1986) Universal generating function. J Comput Sci Syst 24:118–129
19. Ushakov IA (1994) Handbook of reliability engineering. Wiley
20. Wu JS, Chen RJ (1994) An algorithm for computing the reliability of weighted-k-out-of-n systems. IEEE Trans Reliab 43(2):327–328
21. Zhang P, Portillo L, Kezunovic M (2006a) Compatibility and interoperability evaluation for all-digital protection system through automatic application test. In: Power engineering society general meeting. IEEE

Reliability Analysis of 8 × 8 SEN- Using UGF Method

Vaibhav Bisht and S. B. Singh

Abstract The aim of this article is to assess the reliability of an 8 × 8 shuffle exchange network with one stage less (SEN-). The reliability of the 8 × 8 SEN- is assessed using three criteria: terminal, broadcast, and network reliability. The paper offers an effective algorithm for calculating the reliability of an 8 × 8 SEN- using the universal generating function (UGF) method. With the help of the UGF, the reliability is estimated for both identical and distinct switching element probabilities. The article also investigates the impact of removing one stage from the usual 8 × 8 shuffle exchange network (SEN).

Keywords Shuffle Exchange Network (SEN) · 8 × 8 SEN- · Universal Generating Function (UGF) method · Terminal reliability · Broadcast reliability · Network reliability

Paper type Research paper

1 Introduction

In this modern era of communication technology, there is a need to form networks in which a large number of processors (hundreds or even thousands) are linked. Complex problems have arisen in air traffic control, weather forecasting, and even defence that can only be resolved with a large amount of computational power. As a result, future system performance can be improved significantly by applying the principles of parallel processing. When several processors are connected to the memory modules through a network, this network is called an interconnection network. It is worth mentioning that the whole computer network system plays a key role in transmitting signals/messages to their destinations (Trivedi 2008). Furthermore, these networks depend upon two types of switching topologies for the transmission of signals/messages from the source to the destination, namely packet switching and circuit switching. Circuit switching depends completely on the physical paths that lie between the source and the sink.

V. Bisht (B) · S. B. Singh
Department of Mathematics, Statistics and Computer Science, G.B. Pant University of Agriculture and Technology, Pantnagar, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment of Industrial Systems, Springer Series in Reliability Engineering, https://doi.org/10.1007/978-3-030-93623-5_15

2 Preliminaries

2.1 Interconnection Network

Over the past ten years, interconnection networks have developed steadily and are now an integral part of contemporary multiprocessor systems. This development is opening new avenues for development and application in numerous fields. The different components of a computer system are interconnected by communication networks, and these networks are referred to as interconnection networks. An interconnection network is simply an interconnection of connecting nodes. In an interconnection network, switching elements are used to connect the source node to the destination node. The reliability of an interconnection network depends on the reliability of its switching elements and of their interconnections. Direct and indirect networks are the two types of interconnection networks. In a direct network, messages are sent across the edges of the network via point-to-point connections between the processing nodes. The direct network is also called the router-based network or static network. Direct networks include the star graph, hypercube, torus, mesh, and trees. In an indirect network, messages between any two separate nodes are routed through the network's switches. In parallel computing, indirect interconnection networks are widely employed for switching and routing between dynamic nodes. The communication of signals/messages in an indirect network is accomplished with the help of switches [4, 16]. Negi and Singh [15] assessed the reliability of a non-repairable complex system whose subsystems were connected in series. The main purpose of an interconnection network is that, when a single processor cannot handle a task involving a huge amount of data, the task is broken into parallel tasks that are performed simultaneously, which reduces the processing time. Hence, interconnection networks play a very important part in constructing large parallel processing systems. Interconnection networks are widely used in many real-life applications such as telephone switches, industrial networks, supercomputers, and many more.

2.2 Multistage Interconnection Network (MIN)

Since a dynamic topology consists of numerous links that can be reconfigured by setting the movable switching components, multistage interconnection networks (MINs) play an essential role in dynamic network systems. MINs are made up of several stages of linked switching elements (SEs) organised in a predetermined architecture. Integrated circuits, telecommunication switches, multiprocessor systems, and computer communications are all examples of uses for these networks. Since these networks are large and have a complicated topology, it is necessary to enhance their performance and hence to increase their reliability. It is important to note that the SEs are often linked in a different number of stages. Blocking, non-blocking, and rearrangeable non-blocking MINs are the three types of MINs. In blocking networks, a connection between a free source/destination pair can conflict with connections already existing in the network, so the likelihood of establishing it is low. In non-blocking networks, messages are communicated from every input node to every output node without affecting the network's pre-defined topology; this requires several paths between each source and destination node, which results in extra stages in the system. Gunawan [9] discussed the basics of reliability engineering, probability distributions, and certain fundamentals of probability theory. A MIN typically has m inputs and m outputs, as well as n (= log2 m) stages, and every stage has m/2 switching elements, where m is the size of the network. Frequently used MINs include the shuffle exchange network (SEN), the SEN with extra stages, the Benes network, the Gamma interconnection network (GIN), the extra-stage GIN, the Clos network, the Omega network, the multistage cube network, and many more. It is important to note that network reliability is determined not only by the components in the network, but also by the topology of the network.

The failure of MINs is mostly due to an inadequate choice of network architecture and insufficient routing methods. In the past, many engineers and academics have proposed different methods to enhance the performance of MINs and make them more dependable. Trivedi [17] used the continuous-time Markov chain technique to assess the dependability of MINs. Rajkumar and Goyal [16] connected and investigated different MIN topologies in terms of reliability, fault tolerance, and cost-efficiency. Blake and Trivedi [5] investigated the dependability of single-path MINs and focused on fault-tolerant schemes for improving network resilience; they derived the reliability of 8 × 8 and 16 × 16 shuffle exchange networks. Bistouni and Jahanshahi [3] proposed a new technique to increase the reliability and fault tolerance of the SEN by augmenting the switching stages, and found that the SEN with one extra stage (SEN + 1) was more reliable than the SEN or the SEN with two extra stages (SEN + 2). Fard and Gunawan [8] estimated the terminal reliability of a modified SEN with 2 × 2 SEs at the intermediate stages, 1 × 2 SEs at the source nodes, and 2 × 1 SEs at the terminal nodes, and compared it with the conventional shuffle exchange network. Chinnaiah [7] proposed a new MIN called the replicated SEN and compared it with SENs and Benes networks. Bisht and Singh [2] calculated the reliability of the 4 × 4 SEN, SEN + 1, and SEN + 2 by the UGF method and found that the 4 × 4 SEN is the most reliable, SEN + 2 is the least reliable, and the reliability of SEN + 1 lies between the two.

2.3 Universal Generating Function (UGF)

The reliability of complex networks and engineering systems is calculated using a variety of approaches [11, 14]. Because of its simplicity and time-saving nature, the UGF is one of the most commonly used techniques among them. Ushakov [18] provided the foundation for this technique. Levitin and Lisnianski [13] proposed a method for calculating the relevance of element dependability in different multi-state systems (MSSs) based on the UGF. Levitin [12] extended the UGF approach for multi-state system analysis to the case when the performance distributions of some elements depend on the states of another element or group of elements. Bisht and Singh [1] proposed using the UGF to assess reliability indices such as the signature reliability and MTTF of complex bridge networks whose component lifetimes are independent and identically distributed. Chacko [6] used the UGF technique to quickly assess the reliability of a group of continuous MSSs; the suggested UGF technique can swiftly assess the reliability factors for a variety of structures.

The UGF method is a generalisation of the ordinary generating function. It represents the probability mass function of a discrete random variable by a polynomial, in the manner of the moment generating function. If a discrete random variable $k$ has $M$ possible values and $r_m$ is the probability that $k$ equals $k_m$, then the UGF of $k$ is given by the polynomial:

\[ U(z) = \sum_{m=1}^{M} r_m z^{k_m} \tag{1} \]

Consider the independent discrete random variables $Y_1, Y_2, \ldots, Y_p$. Let $U_1(z), U_2(z), \ldots, U_p(z)$ be the UGFs of $Y_1, Y_2, \ldots, Y_p$, and let $f(Y_1, Y_2, \ldots, Y_p)$ be an arbitrary function. The composition operator $\otimes_f$ describes the combination of the $p$ UGFs, and its properties depend on the properties of $f(Y_1, Y_2, \ldots, Y_p)$. As a result, $U_r(z)$ is written as follows:

\[ U_r(z) = \otimes_f \left( U_1(z), U_2(z), \ldots, U_p(z) \right) \]

For two components connected in series, the UGF is given by:

\[ u_1(z) \otimes_{ser} u_2(z) = \sum_{k_1=1}^{K_1} p_{1k_1} z^{g_{1k_1}} \otimes_{ser} \sum_{k_2=1}^{K_2} p_{2k_2} z^{g_{2k_2}} = \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} p_{1k_1}\, p_{2k_2}\, z^{ser(g_{1k_1},\, g_{2k_2})} \tag{2} \]

The UGF for two components connected in parallel is given by:

\[ u_1(z) \otimes_{par} u_2(z) = \sum_{k_1=1}^{K_1} p_{1k_1} z^{g_{1k_1}} \otimes_{par} \sum_{k_2=1}^{K_2} p_{2k_2} z^{g_{2k_2}} = \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} p_{1k_1}\, p_{2k_2}\, z^{par(g_{1k_1},\, g_{2k_2})} \tag{3} \]

where $\otimes_{ser}$ and $\otimes_{par}$ signify the series and parallel composition operators, respectively, in the network. Equations (4) and (5) may be used to calculate the structure functions for the series and parallel systems, respectively:

\[ \varphi_{ser}(p_1, \ldots, p_n) = \min\{p_1, \ldots, p_n\} \tag{4} \]

\[ \varphi_{par}(p_1, \ldots, p_n) = \max\{p_1, \ldots, p_n\} \tag{5} \]

where $p_1, \ldots, p_n$ are the elements connected in the system.
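Equations (1) to (5) translate directly into a small routine that multiplies probabilities and combines performance levels with the chosen structure function. The following is a minimal sketch under stated assumptions: the function name `compose_ugf` and the example performance levels (0, 5 and 10 units) are hypothetical and serve only to show a case with more than two states.

```python
from collections import defaultdict


def compose_ugf(u1, u2, f):
    """Composition of two UGFs, Eqs. (2)-(3).

    Each UGF is a dict {performance level g: probability p}, i.e. sum p * z^g.
    f is the structure function applied to the exponents: min for a series
    connection (Eq. 4) and max for a parallel connection (Eq. 5).
    """
    result = defaultdict(float)
    for g1, p1 in u1.items():
        for g2, p2 in u2.items():
            result[f(g1, g2)] += p1 * p2  # collect like powers of z
    return dict(result)


# Hypothetical multi-state elements (performance levels in arbitrary units).
u1 = {0: 0.1, 5: 0.3, 10: 0.6}
u2 = {0: 0.2, 10: 0.8}

series_ugf = compose_ugf(u1, u2, min)
parallel_ugf = compose_ugf(u1, u2, max)
print(series_ugf)
print(parallel_ugf)
```

For two-state elements with levels 1 (working) and 0 (failed), the coefficient of $z^1$ in the composed UGF is the reliability of the combination.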

3 Shuffle Exchange Network with One Stage Less (SEN-)

Figure 1 depicts a SEN- of size 8 × 8. The 8 × 8 SEN- does not meet the primary criterion of a multistage interconnection network (MIN), namely full connection: some of the source nodes are not connected to every destination node. For example, source 000 is connected to outputs 000, 001, 010 and 011, but it is not connected to output nodes 100, 101, 110 and 111. The main requirement of a MIN is that every input node should be able to transmit signals to every destination through at least one path. As a result, this network does not meet the requirements of a MIN.

Fig. 1 The 8 × 8 SEN-

3.1 New SEN-

In the new 8 × 8 SEN- structure, a MUX is used at each source and a DEMUX at each destination, so that the signal from each input can be transmitted to each output node. For the sake of convenience, 2 × 1 MUXs and 1 × 2 DEMUXs are used to transmit signals from one input to all outputs [10]. MUXs and DEMUXs of larger size can also be used to make the system more redundant. This new 8 × 8 SEN- provides two paths between every source and destination. The primary benefit of this strategy is that it creates totally independent paths between the source and the destination, making the entire system fault resilient at both ends. There are m MUXs at the source nodes and m DEMUXs at the destination nodes of this 8 × 8 SEN-, with a total of log2 m − 1 stages in the network and m/2 SEs per stage. Figure 2 depicts the new SEN- of size 8 × 8.

4 Reliability Analysis of SEN-

With the aid of the UGF, the reliability of this SEN- may be assessed based on three measures: (1) terminal reliability (TR), (2) broadcast reliability (BR), and (3) network reliability (NR). To calculate the reliability, we make the following assumptions:

• The reliability of the 2 × 2 switching element is taken to be $p$.
• The reliability of the 2 × 1 MUX and of the 1 × 2 DEMUX is taken to be $p^{1/2}$.

Fig. 2 New 8 × 8 SEN-

4.1 Terminal Reliability

A network's terminal reliability (TR) may be defined as the probability that at least one fault-free path exists between a given source/destination pair of the network. Figure 3 shows the block diagram of terminal reliability for the SEN- of size 8 × 8. With the aid of the UGF, the terminal reliability of the SEN- may be determined as follows:

\[ R_{TR}(\text{SEN-}) = \max(\min(p_1, p_3, p_5, p_7), \min(p_2, p_4, p_6, p_8)) \]

where $p_1, p_2, \ldots, p_8$ are the probabilities of the switching elements in this 8 × 8 SEN-.

Fig. 3 TR of 8 × 8 SEN- (two parallel paths: elements 1, 3, 5, 7 in series and elements 2, 4, 6, 8 in series)

(1) If the SEN- elements are not identical and the probabilities of the network components are distinct, the UGFs of the different SEs are represented by:

\[ u_{s_j}(z) = p_{s_j} z^1 + (1 - p_{s_j}) z^0 \]

where $p_{s_j}$ is the probability of switch $s_j$, $j = 1, 2, \ldots, 8$.

Assume that the UGFs $u_{s_j}(z)$ of the switches $s_j$, $j = 1, 2, \ldots, 8$, are given as follows:

$u_{s1}(z) = 0.99z^1 + 0.01z^0$
$u_{s2}(z) = 0.98z^1 + 0.02z^0$
$u_{s3}(z) = 0.97z^1 + 0.03z^0$
$u_{s4}(z) = 0.96z^1 + 0.04z^0$
$u_{s5}(z) = 0.95z^1 + 0.05z^0$
$u_{s6}(z) = 0.94z^1 + 0.06z^0$
$u_{s7}(z) = 0.93z^1 + 0.07z^0$
$u_{s8}(z) = 0.92z^1 + 0.08z^0$

After applying the composition operators to the distinct SEs according to their combination, we get the following UGFs:

$U_A(z) = u_{s1}(z) \otimes_{\min} u_{s3}(z) \otimes_{\min} u_{s5}(z) \otimes_{\min} u_{s7}(z) = 0.84842505z^1 + 0.15157495z^0$

$U_B(z) = u_{s2}(z) \otimes_{\min} u_{s4}(z) \otimes_{\min} u_{s6}(z) \otimes_{\min} u_{s8}(z) = 0.81360384z^1 + 0.18639616z^0$

Finally, the terminal reliability of the SEN- is determined as follows:

$U(z) = U_A(z) \otimes_{\max} U_B(z) = 0.971747011z^1 + 0.028252988z^0$

Hence, the terminal reliability (TR) of this 8 × 8 SEN- is calculated as:

TR (SEN−) = 0.971747011

(2) When the SEN- elements are identical and all SEs have the same probability, the structure function is given by:

\[ R_{TR}(\text{SEN-}) = 2p^{3} - p^{6} \]

With the help of the suggested UGF technique, the TR of the 8 × 8 SEN- is evaluated for different switching element reliabilities and compared with that of the 8 × 8 SEN, as shown in Table 1.

Table 1 TR of 8 × 8 SEN-

Switching reliability    TR of 8 × 8 SEN- by UGF    TR of 8 × 8 SEN [16]
0.90                     0.926559                   0.72900
0.95                     0.979658                   0.85737
0.96                     0.986714                   0.88473
0.99                     0.999118                   0.97029
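The terminal-reliability figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the authors' algorithm: the helper names are hypothetical, and the two-state series and parallel rules (product of reliabilities, and one minus the product of unreliabilities) are used in place of an explicit UGF expansion, which gives the same coefficient of $z^1$ for binary elements.

```python
from math import prod


def series(ps):
    """All elements on a path must work (min composition of two-state UGFs)."""
    return prod(ps)


def parallel(ps):
    """At least one branch must work (max composition of two-state UGFs)."""
    return 1 - prod(1 - p for p in ps)


def tr_distinct(path_a, path_b):
    """Terminal reliability of two parallel paths with distinct elements."""
    return parallel([series(path_a), series(path_b)])


def tr_identical(p):
    """Closed form for identical elements: 2p^3 - p^6."""
    return 2 * p**3 - p**6


# Switch probabilities s1, s3, s5, s7 and s2, s4, s6, s8 from the text.
tr = tr_distinct([0.99, 0.97, 0.95, 0.93], [0.98, 0.96, 0.94, 0.92])
print(f"TR (distinct) = {tr:.9f}")
for p in (0.90, 0.95, 0.96, 0.99):
    print(f"{p:.2f}  {tr_identical(p):.6f}")
```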

4.2 Broadcast Reliability The likelihood of a network/message being sent from a single source to all destination pairs is said to be broadcast reliability (BR). Figure 4 shows the block diagram for broadcast reliability for SEN-, which is of the size 8 × 8. The SEN- broadcast reliability may be computed using the UGF technique as follows: R B R (SEN−) = max(min( p1 , p3 , . . . .. p17 ), min( p2 , p4 , . . . .. p18 )) where, p1 , p2 ,….., p18 are the switching elements probabilities of these components in the network. (1)

If all the SEN- elements are distinct and the probability of the components of the network is not the same, then the UGFs of the various SEs are provided by:   u s j (z) = ps j z 1 + 1 − ps j z 0 where, ps j is the probability for the switch s j , j = 1, 2, . . . , 18. Consider the network’s UGFs u s j (z) for switches s j , j = 1, 2, . . . , 18 as: u s1 (z) = 0.99z 1 + 0.01z 0 u s2 (z) = 0.98z 1 + 0.02z 0 u s3 (z) = 0.97z 1 + 0.03z 0 u s4 (z) = 0.96z 1 + 0.04z 0 u s5 (z) = 0.95z 1 + 0.05z 0 u s6 (z) = 0.94z 1 + 0.06z 0 u s7 (z) = 0.93z 1 + 0.07z 0 u s8 (z) = 0.92z 1 + 0.08z 0 u s9 (z) = 0.91z 1 + 0.09z 0 u s10 (z) = 0.90z 1 + 0.10z 0

1

3

5

7

9

11

2

4

6

8

10

12

Fig. 4 BR of 8 × 8 SEN-

13

14

15

16

17

18

338

V. Bisht and S. B. Singh

$$\begin{aligned}
u_{s_{11}}(z) &= 0.89z^1 + 0.11z^0 \\
u_{s_{12}}(z) &= 0.88z^1 + 0.12z^0 \\
u_{s_{13}}(z) &= 0.87z^1 + 0.13z^0 \\
u_{s_{14}}(z) &= 0.86z^1 + 0.14z^0 \\
u_{s_{15}}(z) &= 0.85z^1 + 0.15z^0 \\
u_{s_{16}}(z) &= 0.84z^1 + 0.16z^0 \\
u_{s_{17}}(z) &= 0.83z^1 + 0.17z^0 \\
u_{s_{18}}(z) &= 0.82z^1 + 0.18z^0
\end{aligned}$$

Applying the composition operators to the SEs according to their arrangement gives the following UGFs:

$$U_A(z) = u_{s_1}(z) \underset{\min}{\otimes} u_{s_3}(z) \underset{\min}{\otimes} u_{s_5}(z) \underset{\min}{\otimes} \cdots \underset{\min}{\otimes} u_{s_{17}}(z) = 0.421755886z^1 + 0.578244114z^0$$

$$U_B(z) = u_{s_2}(z) \underset{\min}{\otimes} u_{s_4}(z) \underset{\min}{\otimes} u_{s_6}(z) \underset{\min}{\otimes} \cdots \underset{\min}{\otimes} u_{s_{18}}(z) = 0.38170668z^1 + 0.61829332z^0$$

Finally, the UGF for the broadcast reliability of the 8 × 8 SEN- is

$$U(z) = U_A(z) \underset{\max}{\otimes} U_B(z) = 0.642475527z^1 + 0.357524473z^0$$

As a result, the broadcast reliability (BR) of the 8 × 8 SEN- is

$$BR(\text{SEN-}) = 0.642475527 \tag{2}$$
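The composition above can be checked numerically. The following Python sketch is illustrative only: it stores each u-function as a dictionary mapping a state (1 = working, 0 = failed) to its probability, and the helper names `switch_ugf` and `compose` are hypothetical, not taken from the chapter. It assumes, as the products of UGF coefficients do, that the SEs fail independently.

```python
from functools import reduce
from itertools import product


def switch_ugf(p):
    """u-function of one SE: state 1 with probability p, state 0 otherwise."""
    return {1: p, 0: 1.0 - p}


def compose(u, v, op):
    """Combine two u-functions with the structure operator op (min or max)."""
    out = {}
    for (gu, pu), (gv, pv) in product(u.items(), v.items()):
        g = op(gu, gv)
        out[g] = out.get(g, 0.0) + pu * pv  # collect like terms in z
    return out


# p_s1 = 0.99, p_s2 = 0.98, ..., p_s18 = 0.82, as listed in the text
probs = [round(0.99 - 0.01 * j, 2) for j in range(18)]
ugfs = [switch_ugf(p) for p in probs]

# min-composition over s1, s3, ..., s17 and over s2, s4, ..., s18
U_A = reduce(lambda u, v: compose(u, v, min), ugfs[0::2])
U_B = reduce(lambda u, v: compose(u, v, min), ugfs[1::2])

# max-composition of the two groups
U = compose(U_A, U_B, max)

print(f"U_A: {U_A[1]:.9f} z^1 + {U_A[0]:.9f} z^0")
print(f"U_B: {U_B[1]:.9f} z^1 + {U_B[0]:.9f} z^0")
print(f"BR : {U[1]:.9f}")
```

The coefficient of z^1 in the final u-function is the BR; the printed values should agree with the coefficients quoted above to the precision given there.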

When all SEs of the SEN- are identical and have the same probability $p$, the structure function is

$$R_{BR}(\text{SEN-}) = 2p^{13/2} - p^{13}$$

With the suggested UGF technique, the BR of the 8 × 8 SEN- is evaluated for several switching element reliabilities and compared with that of the 8 × 8 SEN, as shown in Table 2.
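As a worked instance of the structure function for identical SEs, the case p = 0.90 can be evaluated by hand. The exponents are taken as given in the text; the factored form in the first line is only an algebraic identity, and the intermediate values are rounded to seven decimals.

```latex
\begin{align*}
R_{BR}(\text{SEN-}) &= 2p^{13/2} - p^{13} = 1 - \left(1 - p^{13/2}\right)^{2} \\
p = 0.90:\quad p^{13/2} &\approx 0.5041692, \qquad p^{13} \approx 0.2541866 \\
R_{BR}(\text{SEN-}) &\approx 2(0.5041692) - 0.2541866 = 0.7541518 \approx 0.754152
\end{align*}
```

This agrees with the entry for a switching reliability of 0.90 in Table 2.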

Table 2 Broadcast reliability of 8 × 8 SEN-

| Switching reliability | BR of 8 × 8 SEN- by UGF | BR of 8 × 8 SEN [16] |
|---|---|---|
| 0.90 | 0.754152 | 0.478297 |
| 0.95 | 0.919616 | 0.698337 |
| 0.96 | 0.945684 | 0.751447 |
| 0.98 | 0.984856 | 0.868126 |
| 0.99 | 0.996000 | 0.932065 |

Fig. 5 NR of the 8 × 8 SEN- (block diagram of switching elements 1–24)

4.3 Network Reliability

Network reliability (NR) is the probability of successful signal transmission from all source nodes to all sink nodes. Figure 5 shows the NR block diagram of the 8 × 8 SEN-. Using the UGF technique, the NR of the SEN- is computed as

$$R_{NR}(\text{SEN-}) = \max\bigl(\min(p_1, p_3, \ldots, p_{23}),\ \min(p_2, p_4, \ldots, p_{24})\bigr) \tag{1}$$

where $p_1, p_2, \ldots, p_{24}$ are the probabilities of the switching elements in the network.

When the SEs of the SEN- are distinct, so that their probabilities are not the same, the UGF of each SE is given by

$$u_{s_j}(z) = p_{s_j} z^1 + \left(1 - p_{s_j}\right) z^0$$

where $p_{s_j}$ is the probability of switch $s_j$, $j = 1, 2, \ldots, 24$.

Assume that the UGFs $u_{s_j}(z)$ of the switches $s_j$, $j = 1, 2, \ldots, 24$, in this network are as follows:

$$\begin{aligned}
u_{s_1}(z) &= 0.99z^1 + 0.01z^0 \\
u_{s_2}(z) &= 0.98z^1 + 0.02z^0 \\
u_{s_3}(z) &= 0.97z^1 + 0.03z^0 \\
u_{s_4}(z) &= 0.96z^1 + 0.04z^0 \\
u_{s_5}(z) &= 0.95z^1 + 0.05z^0 \\
u_{s_6}(z) &= 0.94z^1 + 0.06z^0 \\
u_{s_7}(z) &= 0.93z^1 + 0.07z^0 \\
u_{s_8}(z) &= 0.92z^1 + 0.08z^0 \\
u_{s_9}(z) &= 0.91z^1 + 0.09z^0 \\
u_{s_{10}}(z) &= 0.90z^1 + 0.10z^0 \\
u_{s_{11}}(z) &= 0.89z^1 + 0.11z^0 \\
u_{s_{12}}(z) &= 0.88z^1 + 0.12z^0 \\
u_{s_{13}}(z) &= 0.87z^1 + 0.13z^0 \\
u_{s_{14}}(z) &= 0.86z^1 + 0.14z^0 \\
u_{s_{15}}(z) &= 0.85z^1 + 0.15z^0 \\
u_{s_{16}}(z) &= 0.84z^1 + 0.16z^0 \\
u_{s_{17}}(z) &= 0.83z^1 + 0.17z^0 \\
u_{s_{18}}(z) &= 0.82z^1 + 0.18z^0 \\
u_{s_{19}}(z) &= 0.81z^1 + 0.19z^0 \\
u_{s_{20}}(z) &= 0.80z^1 + 0.20z^0 \\
u_{s_{21}}(z) &= 0.79z^1 + 0.21z^0 \\
u_{s_{22}}(z) &= 0.78z^1 + 0.22z^0 \\
u_{s_{23}}(z) &= 0.77z^1 + 0.23z^0 \\
u_{s_{24}}(z) &= 0.76z^1 + 0.24z^0
\end{aligned}$$

Applying the composition operators to the SEs according to their arrangement gives the following UGFs:

$$U_A(z) = u_{s_1}(z) \underset{\min}{\otimes} u_{s_3}(z) \underset{\min}{\otimes} u_{s_5}(z) \underset{\min}{\otimes} \cdots \underset{\min}{\otimes} u_{s_{23}}(z) = 0.207808825z^1 + 0.792191174z^0$$

$$U_B(z) = u_{s_2}(z) \underset{\min}{\otimes} u_{s_4}(z) \underset{\min}{\otimes} u_{s_6}(z) \underset{\min}{\otimes} \cdots \underset{\min}{\otimes} u_{s_{24}}(z) = 0.181020576z^1 + 0.818979423z^0$$

Finally, the UGF for the network reliability of the 8 × 8 SEN- is

$$U(z) = U_A(z) \underset{\max}{\otimes} U_B(z) = 0.351211727z^1 + 0.648788272z^0$$
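The max-composition step can be written out explicitly for two binary-state groups. In the sketch below, P_A and P_B are symbols introduced here for illustration; they denote the z^1 coefficients of U_A(z) and U_B(z) (0.207808825 and 0.181020576), and results are rounded to eight decimals.

```latex
\begin{align*}
U(z) &= U_A(z) \underset{\max}{\otimes} U_B(z) \\
     &= \bigl[1 - (1 - P_A)(1 - P_B)\bigr] z^{1} + (1 - P_A)(1 - P_B)\, z^{0} \\
(1 - P_A)(1 - P_B) &= 0.792191174 \times 0.818979423 \approx 0.64878827 \\
1 - (1 - P_A)(1 - P_B) &\approx 1 - 0.64878827 = 0.35121173
\end{align*}
```

State 0 of the combined system occurs only when both groups are in state 0, which is why the z^0 coefficient is the product of the two z^0 coefficients.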

Hence, the network reliability (NR) of the 8 × 8 SEN- is

$$NR(\text{SEN-}) = 0.351211727 \tag{2}$$

When all SEs of the 8 × 8 SEN- are identical and have the same probability $p$, the structure function is

$$R_{NR}(\text{SEN-}) = 2p^{8} - p^{16}$$

With the suggested UGF technique, the NR of the 8 × 8 SEN- is evaluated for several switching element reliabilities and compared with the NR of the 8 × 8 SEN, as shown in Table 3.

Table 3 Network reliability of 8 × 8 SEN-

| Switching reliability | NR of 8 × 8 SEN- by UGF | NR of 8 × 8 SEN [16] |
|---|---|---|
| 0.90 | 0.675632 | 0.2824295 |
| 0.95 | 0.886714 | 0.540360 |
| 0.96 | 0.922376 | 0.612709 |
| 0.98 | 0.977728 | 0.7847147 |
| 0.99 | 0.994032 | 0.8863849 |
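The SEN- column of Table 3 can be reproduced from the structure function for identical SEs. The following Python sketch is a check under stated assumptions, not the authors' implementation: the function name `parallel_of_two_series` and the exponent parameter `k` are hypothetical, and the tabulated values are copied from Table 3 as printed.

```python
def parallel_of_two_series(p: float, k: float) -> float:
    """Evaluate 2*p**k - p**(2*k), i.e. 1 - (1 - p**k)**2."""
    q = p ** k
    return 2.0 * q - q * q


# (switching reliability, NR of SEN- by UGF, NR of SEN [16]) as tabulated
table3 = [
    (0.90, 0.675632, 0.2824295),
    (0.95, 0.886714, 0.540360),
    (0.96, 0.922376, 0.612709),
    (0.98, 0.977728, 0.7847147),
    (0.99, 0.994032, 0.8863849),
]

for p, nr_tab, nr_sen in table3:
    nr = parallel_of_two_series(p, 8)  # R_NR(SEN-) = 2p^8 - p^16
    print(f"p={p:.2f}  formula={nr:.6f}  table={nr_tab:.6f}  "
          f"ratio to SEN={nr / nr_sen:.2f}")
```

With `k = 13 / 2` the same function evaluates the BR structure function 2p^(13/2) - p^13. The last printed column is simply the quotient of the two tabulated columns and shows how the advantage of the SEN- over the SEN narrows as the switching reliability approaches 1.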

5 Conclusion

Reliability is one of the major concerns for most networks, particularly communication networks. This paper demonstrates how the proposed UGF approach can be used to assess the reliability of the 8 × 8 SEN-. It emphasizes that the 4 × 4 SEN- is not possible and that the use of MUX and DEMUX at the source and destination makes this network possible. The reliability of the 8 × 8 SEN- is examined from three perspectives: TR, BR, and NR. A comparison of the reliabilities obtained with the suggested technique shows that the terminal reliability (TR), broadcast reliability (BR) and network reliability (NR) of the 8 × 8 SEN- are considerably higher than those of the 8 × 8 SEN. For example, at a switching reliability of 0.90 the TR, BR and NR of the 8 × 8 SEN- are 0.926559, 0.754152 and 0.675632, against 0.72900, 0.478297 and 0.2824295 for the 8 × 8 SEN. Hence, the 8 × 8 SEN- is more reliable than the 8 × 8 SEN.

References

1. Bisht S, Singh SB (2019) Signature reliability of binary state node in complex bridge networks using universal generating function. Int J Qual Reliab Manage 36(2):186–201
2. Bisht V, Singh SB (2021) Reliability estimation of 4 × 4 SENs using UGF method. J Reliab Stat Stud 173–198
3. Bistouni F, Jahanshahi M (2014) Analyzing the reliability of shuffle-exchange networks using reliability block diagrams. Reliab Eng Syst Saf 132:97–106
4. Bistouni F, Jahanshahi M (2014) Improved extra group network: a new fault-tolerant multistage interconnection network. J Supercomput 69(1):161–199
5. Blake JT, Trivedi KS (1989) Multistage interconnection network reliability. IEEE Trans Comput 38(11):1600–1604
6. Chacko VM (2018) Continuous multistate system universal generating function. Reliab: Theory Appl 13(4):51
7. Chinnaiah B (2019) Multistage interconnection networks in reliability shuffle exchange networks. In: First international conference on artificial intelligence and cognitive computing. Springer, Singapore, pp 571–581
8. Fard NS, Gunawan I (2005) Terminal reliability improvement of shuffle-exchange network systems. Int J Reliab, Qual Saf Eng 12(01):51–60
9. Gunawan I (2014) Fundamentals of reliability engineering: applications in multistage interconnection networks. Wiley
10. Gupta S, Pahuja GL (2016) A new SEN minus: design and reliability measures. Int J Reliab, Qual Saf Eng 23(04):1650012
11. Kumar A, Singh SB (2019) Signature A-within-B-from-D/G sliding window system. Int J Math, Eng Manage Sci 4(1):95–107
12. Levitin G (2004) A universal generating function approach for the analysis of multi-state systems with dependent elements. Reliab Eng Syst Saf 84(3):285–292
13. Levitin G, Lisnianski A (1999) Importance and sensitivity analysis of multi-state systems using the universal generating function method. Reliab Eng Syst Saf 65(3):271–282
14. Singh SB (2017) Reliability analysis of multi-state complex system having two multi-state subsystems under uncertainty. J Reliab Stat Stud 161–177
15. Negi S, Singh SB (2015) Reliability analysis of non-repairable complex system with weighted subsystems connected in series. Appl Math Comput 262:79–89
16. Rajkumar S, Goyal NK (2016) Review of multistage interconnection networks reliability and fault-tolerance. IETE Tech Rev 33(3):223–230
17. Trivedi KS (2008) Probability & statistics with reliability, queuing and computer science applications. Wiley
18. Ushakov IA (1986) A universal generating function. Soviet J Comput Syst Sci 24(5):118–129