English Pages 195 [196] Year 2023
Milan Z. Bjelica
Systems, Functions and Safety
A Flipped Approach to Design for Safety
Milan Z. Bjelica University of California, San Diego San Diego, CA, USA
ISBN 978-3-031-15822-3
ISBN 978-3-031-15823-0 (eBook)
https://doi.org/10.1007/978-3-031-15823-0
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 Request lecturer material: sn.pub/lecturer-material This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
A Letter From Your Instructor
Dear Reader, I am so happy you have selected this book, and perhaps also taken the corresponding course on systems, functions, and safety! This means you have realized that any technical wizardry around new achievements of humanity may come at a huge cost – harming humans, damaging our environment, or our property. By reading this book you acknowledge that it is not enough to be just any engineer – hardware engineer, computer engineer, software engineer, or mechanical engineer; it is important to look at your designs holistically, and to understand what consequences your designs, implementations, and overall doing may have and what you can specifically do about it. Yes, this book is about system engineering with a strong focus on reliability and the accompanying metrics. However, it is also a book that takes you on a journey of how these aspects can be applied to modern, twenty-first-century endeavors that you might be taking these days. Even if you come from reliability-heavy disciplines, such as mechanical engineering, it is still good that you have encountered this book. It would be easy for you to contrast your knowledge with many real-world examples, exercises, and applications listed – a great opportunity to recap and reestablish your perspective on the complex system designs now mostly depending on new suspects: software and high-performance computing hardware. The purpose of this book is not to lay down intensive theoretical constructs and background of the system engineering and reliability theory; this part is intentionally left lightweight. You can always refer to the bibliography at the end of this book to find additional great, in-depth material in those areas. Instead, the purpose of this book is to take you by the hand and lead you step by step, to make you understand and practice whatever is important to become ready to understand the world of system safety and functional safety in the context of your next engineering project. 
You would feel much more relaxed – but cautious – when you return to your desk after you complete what I have prepared for you here.
I need you to be ready for an experiment with this book. It is not like most other books you might have read. Start with Chapter 1. Read the introduction to get yourself motivated. It is only a two-page read! After this, sit back and relax while you watch an accompanying lecture video. You can access those videos by following the links in the book. Each video lasts for about 30 minutes and will explain briefly whatever you need to understand in the chapter. Then, after watching the video, take a short break. Have a coffee. Think a bit. Then come back and read the lecture note section from the book. Pay attention to bold and underlined parts. These lecture notes are very brief and will contain only what is the most relevant for the topic. Lecture notes are short but essential. You can keep them as your vital reference for the future. After going through the lecture this way, now I need you to practice. Dive into the calculation examples and exercise sections. Do whatever is required from you. It is excellent if you can bring in a peer or two to work out the exercise together. Fill in the given sheets or make the required drawings. Think and discuss along the way. If you are in a live course, use your instructor to discuss further. Take about 1 hour to complete the exercise, not more. Then, look at some examples of how other people solved the exercise. Their solutions are not ideal – those might be exactly like yours! All the shortcomings are addressed in the review text that you can read for each exercise solution. This is an actual review provided by me, the instructor. Learn from those examples and compare them with your work. There will be many similarities! If you have a chance, give your solution to your instructor for the original review. When you are done, you now need to assess your knowledge. Take a look again at the key recap points/questions. Try to remember what you did by thinking about those points out loud. Spend a few minutes doing this. 
Finally, take a short quiz. You have ten questions requiring a simple yes or no answer. It seems easy at first, but it's not. You need to understand the chapter to get every question right. After writing down your answers, check the key at the end of the chapter. If you got something wrong, go back to the text and try to understand why. If you have your instructor nearby, ask for additional clarification. It is very important to get every answer right and to understand why you got it right. Once you do that, you are done with the chapter and move to the next one. If you do what I suggest, you will participate in the “flipped approach” to the design for safety. You will need 1–2 days per chapter, and in a couple of weeks, you will not only read the book but understand and be able to apply what I wanted to teach you. Please evangelize this topic further. This book is about the safety culture. The safety culture is what we need the most if we want to be safe and sustainable with our next-generation tech. Have great fun comprehending this book and see you in each and every chapter along the way! Yours sincerely, Prof. Dr. Milan Z. Bjelica
Acknowledgement
The work on this book was partially supported by the autonomous province Vojvodina of Republic of Serbia, Province Secretariat for High Education, Science, and Research, under Grant 142-451-2339/2022-01/02.
Contents
1 Safety-Critical Systems
   Introduction
   Video Lesson
   Lecture Notes
   Exercise 1
   Exercise 1 Template
   Exercise 1 Sample Solutions
      Solution 1
      Solution 2
      Solution 3
      Solution 4
   Key Recap Questions
   Self-assessment
   Self-assessment Key
2 System Requirements and Functions
   Introduction
   Video Lesson
   Lecture Notes
   Exercise 2
   Exercise 2 Template
   Exercise 2 Sample Solutions
      Solution 1
      Solution 2
      Solution 3
      Solution 4
   Key Recap Questions
   Self-assessment
   Self-assessment Key
3 System Safety
   Introduction
   Video Lesson
   Lecture Notes
   Calculation Examples
      Task 1
   Exercise 3
   Exercise 3 Template
   Exercise 3 Sample Solutions
      Solution 1
      Solution 2
      Solution 3
      Solution 4
   Key Recap Questions
   Self-assessment
   Self-assessment Key
4 System Safety Process
   Introduction
   Video Lesson
   Lecture Notes
   Exercise 4
   Exercise 4 Template
   Exercise 4 Sample Solutions
      Solution 1
      Solution 2
      Solution 3
      Solution 4
   Key Recap Questions
   Self-assessment
   Self-assessment Key
5 Functional Safety
   Introduction
   Video Lesson
   Lecture Notes
   Exercise 5
   Exercise 5 Sample Solutions
      Solution 1
      Solution 2
      Solution 3
      Solution 4
   Key Recap Questions
   Self-assessment
   Self-assessment Key
6 Defining Safety Functions
   Introduction
   Video Lesson
   Lecture Notes
   Your First Safety Project!
      Required Output
      Submission Deadline
      Assessment
      Sample Solution to the Project
7 Safety Integrity and Random Failures
   Introduction
   Video Lesson
   Lecture Notes
   Calculation Examples
      Task 1
      Task 2
   Exercise 7
   Exercise 7 Solution
   Key Recap Questions
   Self-assessment
   Self-assessment Key
8 Safety Integrity of Composite Systems
   Introduction
   Video Lesson
   Lecture Notes
   Calculation Examples
      Task 1
      Task 2
   Exercise 8
   Exercise 8 Sample Solutions
      Solution 1
      Solution 2
      Solution 3
      Solution 4
   Key Recap Questions
   Self-assessment
   Self-assessment Key
9 Safety Integrity Improvement Methods
   Introduction
   Video Lesson
   Lecture Notes
   Calculation Examples
      Task 1
      Task 2
      Task 3
   Exercise 9
   Exercise 9 Solution
   Key Recap Questions
   Self-assessment
   Self-assessment Key
10 Proving the Safety Integrity
   Introduction
   Video Lesson
   Lecture Notes
   Calculation Examples
      Task 1
      Task 2
   Exercise 10
   Exercise 10 Solution
   Key Recap Questions
   Self-assessment
   Self-assessment Key
11 Practical SIL Calculation
   Introduction
   Video Lesson
   Lecture Notes
   Now Try for Yourself!
      Required Output
      Assessment
      Sample Solution to the Project
12 System Safety Checklist
   Video Lesson
   Lecture Notes
   Self-assessment
   Self-assessment Key
Bibliography
Index
Chapter 1
Safety-Critical Systems
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_1
Introduction
Most industries nowadays struggle with the ever-increasing complexity of their products as well as the tools and the equipment involved. Wherever we look, we are overwhelmed with the abundance of new possibilities, many of which are enabled by emerging technical functionalities. Take a look at your smartphone, for example. It allows you to communicate at any distance; it reveals worldwide events to you and provides you with a live broadcast from any place almost instantly; the wealth of the world’s knowledge is a click away; you are kept entertained, informed, and connected at all times. All this comes at the cost of the huge complexity of smartphones today, including the most sophisticated chips and millions of lines of code in the software. It is not rare, however, that such a complex system fails: at times, we have a poor Internet connection, our applications hang, or the battery drains. Also, we usually replace the device every 2 or 3 years.
Complexity is usually the consequence of the digital (r)evolution. In the 1990s, computers started to take over media broadcasting, starting from satellite TV, on to the set-tops and TV sets, to digital flat TV screens. Then, the twenty-first century brought us the miniaturized computer in the form of a smartphone, and digitization continued and consumed all communications and broadcast. We further witnessed the revolution in mobility, bringing complex applications, autonomous driving algorithms, and modern driving assistance functions to vehicles. The aerospace sector, railway, production plants, energy sector, and others are getting more connected and computerized every year. The digitization momentum has started to reach all the remaining industries, putting all the things, and people, in the global communication network. Enabling a machine to communicate with any other machine, with a powerful computer behind each device, allows the utilization of a multitude of new algorithms and the introduction of an unprecedented set of new functions. Automatic optimization of busy traffic across intersections, unmanned processes in production plants, and the car that takes you from point A to point B while you watch a live broadcast of a tennis match from across the globe: all this is slowly becoming the reality.
All aforementioned examples are based on complex systems, which require various interacting elements to achieve their end purpose. The listed new functions, which are enabled by communications, complex processors, and software, are making the systems even more complex, which may affect their reliability. In case the systems control high energy, such as kinetic, electrical, or chemical, they may harm people and the environment or damage property. We call those systems safety-critical, and they range from simple battery-powered devices, through passenger vehicles, aerospace, or rail, all the way to factories, power plants, or chemical plants. The digitization of such systems introduces higher complexity, which can lower reliability and increase the risk of mishaps that cause harm or damage.
Safety-critical systems are not new; they have been around for many decades, and even centuries. However, such complex, hardware- and software-rich installations require a fresh look at how those systems need to be defined, designed, implemented, maintained, and decommissioned. Some areas, such as software, bring together engineers who have little or no previous insight into the processes and methods required for safety-critical system design, whereas the role of those engineers becomes dominant in safety-critical systems. Therefore, we start the journey in this book by introducing safety-critical systems and understanding their key properties and aspects.
Apart from the main terminology, such as a system, critical system, and system of systems, this chapter sheds light on required engineering processes in system engineering, emphasizing system delineation, requirement elicitation, and the required process models and traceability.
Video Lesson
This chapter has a corresponding video lesson: sfs1.nit-institute.com
Lecture Notes
A system is a combination of different interacting elements which are organized to achieve one or more stated purposes. The system fulfills its purpose through the provision of system functions. Correct orchestration of system elements and system functions is of vital interest for safe system design. A partial view taken instead of the system view may lead to unforeseeable component and function interactions (remember: Uber Tempe accident, safety driver vs the autopilot; Ariane 5 disaster, self-destruct mechanism vs position sensing; power production vs safety test in Chernobyl).
The system needs to be evaluated regarding its effects on the environment. Therefore, a system must be delineated from the environment, with the system boundary clearly defined (Fig. 1.1).
Fig. 1.1 System delineation: a clear split is needed between the system and its environment, via the definition of the system boundary
Technical systems provide functions using hardware, software, and mechanical parts. The system is safety-critical if it can cause harm (to users, people, the environment) or damage (to the property). The system may cause harm or damage if it produces or controls energy, such as kinetic energy, electrical energy, or chemical/radioactive processes. Also, systems providing critical support (such as life support or decision support) may be safety-critical.
The Swiss cheese model illustrates how hazards may materialize into accidents (causing damage and/or harm) through a chain of events passing through the “cheese holes” in many slices of cheese, each slice depicting one protective barrier of the system (e.g., system design and organization, implementation with respect to faults, exploitation with respect to failures, etc.) (Fig. 1.2).
Fig. 1.2 Swiss cheese model: hazards may propagate and become failures causing damage or harm
A system of systems (SoS) is a system comprising other systems which are usually heterogeneous, distributed, and with well-defined interfaces. One of the systems within the SoS is our system in focus, but it must not be overlooked that there are other systems it interacts with which may provide or consume critical information based on the functions of our system (Fig. 1.3).
System delineation is performed by starting with the imminent environment and listing all the participating components which are obviously present. Then, the analysis “zooms out” to adjacent systems (as in SoS); then operators, users, and operation modes; and then other related systems (such as IT), as the last layer before the system boundary (delineation line). Outside of the boundary, the environment elements shall be listed (elements affecting the system, and those affected by the system). Finally, the intended functions, use cases, and legislation (standards) must be contrasted with the design.
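As a rough numerical illustration of the Swiss cheese model from these notes, the sketch below multiplies the chance that a hazard finds a "hole" in each protective barrier. It rests on a strong assumption that the notes do not make, namely that the barriers fail independently of each other. The function name and all numbers are invented for illustration only.

```python
from math import prod


def accident_probability(p_hazard, p_holes):
    """Probability that one hazard passes through every barrier.

    ASSUMPTION (not from the lecture notes): barriers fail independently,
    so the probabilities of the individual "holes" can be multiplied.
    """
    if not all(0.0 <= p <= 1.0 for p in [p_hazard, *p_holes]):
        raise ValueError("probabilities must lie in [0, 1]")
    return p_hazard * prod(p_holes)


# Invented numbers: a hazard occurs with probability 0.5, and each slice
# (design, implementation, operation, ...) lets it through with the
# probability given in the list.
three_barriers = accident_probability(0.5, [0.1, 0.2, 0.1])
four_barriers = accident_probability(0.5, [0.1, 0.2, 0.1, 0.05])
print(three_barriers, four_barriers)
```

Under this simplification, every additional barrier can only lower the result. Real barriers often share causes (for example, the same organizational weakness), so the independence assumption tends to be optimistic.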
Fig. 1.3 System of systems: our regarded system (system-in-focus) may not be alone!
Fig. 1.4 Some important system engineering process groups and processes
• Enterprise processes: Enterprise Environment Management, Investment Management, System Life Cycle Management, Resource Management, Quality Management
• Project processes: Project Planning, Project Assessment, Project Control, Decision-making, Risk Management, Configuration Management, Information Management
• Agreement processes: Acquisition, Supply
• Technical processes: Stakeholder Requirements Definition, Requirements Analysis, Architectural Design, Implementation, Integration, Verification, Transition, Validation, Operation, Maintenance, Disposal
System engineering (SE) concentrates on the design and application of the whole as distinct from the parts (regarding the “added value” of the whole), therefore providing all means to define, design, and utilize systems. SE regards the entirety of the design, including technical aspects, but also social and environmental aspects (Fig. 1.4).
The system engineering process provides a top-down decomposition approach, starting from the requirements definition and analysis, architectural design, and implementation (left side of the V model), and then assembling and exploiting “the whole” through integration, verification (“is the system implemented correctly”), validation (“is the right system implemented”), operation, maintenance, and disposal processes (right side of the V model) (Fig. 1.5).
Fig. 1.5 V model with the notion of traceability
Starting from the requirements, processes produce artifacts (items, documentation) that need to be clearly linked (“used by,” “derived from”) with the inherent provision of traceability.
Agile development (e.g., SCRUM), where the implementation is organized in periodic sprints and items are implemented based on a periodically updated backlog, can be used, but all processes, phases, and traceability aspects still need to be respected (Fig. 1.6).
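The “derived from” links described above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Artifact` record and invented identifiers (none of them come from the book): it reports every artifact that cannot be traced back to a requirement.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    ident: str                 # hypothetical identifier, e.g. "REQ-1"
    kind: str                  # "requirement", "design", "code", "test", ...
    derived_from: list = field(default_factory=list)


def untraced(artifacts):
    """Return ids of non-requirement artifacts with no path to a requirement."""
    by_id = {a.ident: a for a in artifacts}

    def reaches_requirement(ident, seen):
        # Unknown ids and already visited ids (cycles) do not count as traced.
        if ident in seen or ident not in by_id:
            return False
        seen.add(ident)
        artifact = by_id[ident]
        if artifact.kind == "requirement":
            return True
        return any(reaches_requirement(p, seen) for p in artifact.derived_from)

    return [a.ident for a in artifacts
            if a.kind != "requirement"
            and not reaches_requirement(a.ident, set())]


artifacts = [
    Artifact("REQ-1", "requirement"),
    Artifact("ARCH-1", "design", ["REQ-1"]),
    Artifact("CODE-1", "code", ["ARCH-1"]),
    Artifact("TEST-1", "test", ["REQ-1"]),
    Artifact("CODE-2", "code"),  # implemented in a sprint without any link
]
print(untraced(artifacts))
```

A check of this kind could run at the end of every sprint, so that backlog items implemented without a link to a requirement are noticed early.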
Exercise 1
Electric scooters are increasingly being used today and are the subject of controversies. They are classified as different vehicle types in different countries; in some they are banned from traffic, and in others they are restricted to certain lanes or speeds. One of the countries where legislation is still pending has selected you to analyze this system, delineate it from the environment, and discuss its criticality (Fig. 1.7).
Current situation per country: sfs1.link.nit-institute.com
Exercise 1
7
Fig. 1.6 Agile methodology (SCRUM) shall not compromise traceability (elements shown: Product Backlog, Sprint Planning, Sprint Backlog, Daily Scrum, Sprint Review, Sprint Retrospective, Release increment)
Fig. 1.7 Your system – an electric scooter with geofencing
Your Tasks for the Exercise
• Discuss the functions of the scooter. Consider modes of operation, user interfaces, components, and the environment. Make rough notes.
• Think about the usage of geofencing to prevent riders from operating the scooter in specific areas and discuss it. Which additional functions and components would be required for this feature?
• By using the system delineation sheet, decompose the scooter into its inherent components, adjacent systems, operators/users, other related systems, and the environment. Make sure to clearly define the system boundary and to place all components, users, operators, and subsystems within their respective layers. Make sure you identify any system of systems!
• For each of the components, deduce whether it is safety-critical.
• Is the electric scooter as a whole safety-critical? Which assumptions can support or debunk this claim?
To-Do List
• Perform the exercise individually or with your peers. One person can share the screen and keep notes while all contribute.
• Create presentables (e.g., drawing, Excel calculation).
• Discuss your solution and share it with others.
Note: Digital files for this exercise are available at sfs1.ex.nit-institute.com
Exercise 1 Template
Exercise 1 Sample Solutions See several exemplary solutions to the exercise:
Solution 1
Review Comments by the Instructor
• The delineation of the system is mostly correct. Please note that in the iterations within a pre-project or concept phase, the system boundary might “shrink” to exclude items that would not impact the design or safety as much (e.g., “paying system”).
• In the decomposition, however, it would be good to keep the level of detail higher in the narrower shells (e.g., in the Z-shell). For example, the power chain (powertrain) might need to be decomposed further so that the electric motor is clearly visible (as well as, e.g., the electronic brake). A power supply system, similarly, would need to expose the existence of a battery pack (with high fire hazards!).
• Please note that software, although important to consider, is worthless if noted in a generic way, so specificity is essential here. Make sure to distinguish software elements that are important to regard in their own right from elements that are inherent parts of other controllers.
• Notably absent are HMI units, which show information to and accept commands from operators/users. They are frequently overlooked in the design, but their complexity and non-intuitiveness may cause (un)foreseeable misuse.
• The other systems shell is usually where IT systems (Internet-based/cloud-based) are placed, such as the map update IT system for the geofencing function, but also the SW upgrade back end.
• In the environment, it would be good to add more granularity (e.g., pedestrians, types of obstacles, etc.) as opposed to using generic notions, such as “Traffic.”
[Diagram, Solution 1 (Group name: Group2): Electric Scooter with Geofencing – System View and Delineation. Shells shown: Z-Shell, Adjacent systems, Operators / Users, Other systems (System of Systems?), ENVIRONMENT. Elements shown: Engine, Rental operator, Battery, Brakes, Controller Unit, Speaker, Weight control unit, Comm. Module, GNSS, Sensors, Info Display, Folding and Carrying, Security, People, Power Source, Wheels, Internet, Satellite System, Traffic: -type..., Road Infrastructure, Weather Forecast, Weather. Legend: Safety Critical.]
Solution 2
Review Comments by the Instructor
The view is mostly correct, with several notes to consider:
• Although it is good to include the information display, other (input) types of HMIs would also need to be considered, and their criticality carefully assessed (note: some units, although only showing information, can be critical with regard to the decision-making of operators and to the misuse that can happen when, e.g., the operator infers the wrong state of the system).
• The operator shell could include maintenance and service personnel. Maintenance/service operation modes are easily overlooked, although these special modes can uncover important hazards.
• The notion of “Security” shall be elaborated more closely, e.g., which operator or (sub)system is especially security-critical.
• The consideration of the Internet being within the system boundary is usually too much even for this phase; communication issues are usually addressed as failures on communication modules/adapters instead. • In the listing on page 2, modes of operation are good to have, but notably, here we miss modes such as “In-service,” “Folded,” or “Transport.”
Solution 3
Review Comments by the Instructor
• It is very good to have the leg (stand) analyzed since, e.g., unfolding of this stand while driving may pose a serious hazard.
• Make sure to deepen the analysis with respect to the Maintenance/Service mode and the corresponding personnel and equipment.
Solution 4
[Diagram, Solution 4 (Group name: GROUP 4): Electric Scooter with Geofencing – System View and Delineation. Shells shown: Z-Shell, Adjacent systems, Operators / Users, Other systems (System of Systems?), ENVIRONMENT. Elements shown: Control unit, Brake control module, Power supply, Mobile (un)locking, User interface, Location module, SW update module, Light control, Module for vehicle speed, Steering module, Tire inflating system, Charging station, Driver, Passenger, Maintenance, Geo mapping server, SW update server, Geofencing planning and operations system, Pedestrians, Other vehicles, Animals on the way, Traffic conditions, Road conditions, Weather conditions.
Geofencing features noted in the diagram: location modules; slow down before traffic crossing; no max speed in pedestrian zone.
Notes in the diagram: Energy – kinetic (could potentially harm driver and pedestrians) -> scooter is safety critical.]
Review Comments by the Instructor
The analysis is mostly correct, with some outstanding remarks:
• The Z-shell mostly includes controllers; however, the actual actuators and other components are very important for safety, such as the brakes themselves, the battery, knobs/levers, etc.
• The breakdown of potentially affected/affecting entities seems OK, although it is not clear why animals are considered separately.
• Some overlooked “environmental” effects, such as the weight limit, can be added. Also, the descriptions shall be as precise as possible, and generic notions such as “conditions” shall be avoided.
Key Recap Questions To make sure you fully understood safety-critical systems, reflect on the following:
• Think about a system of choice.
• Think about users, system environment, and boundary.
• Is your system critical?
• Which components does it have?
• Are there subsystems?
• Think about hazards.
Self-assessment Now take the time to self-assess your knowledge by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book.
1. A system engineer in a company focuses on the implementation of the system components which are assigned to his company, with little or no regard to the use cases and functions that would be provided by the end system or product.
2. When analyzing systems for safety, direct users and operators of the system are usually located within the system boundary.
3. The system is safety-critical only if it can kill or injure people while performing its functions.
4. The system is always safety-critical if it controls a chemical process that may cause a fire.
5. If the amount of kinetic energy produced by the system is somewhat reduced (e.g., vehicle speed is limited), the safety-criticality of the system may change as a result.
6. Systems that only provide information to the operators via screens are usually not regarded as safety-critical.
7. One layer of protection in a Swiss cheese model (e.g., one system component, or one single process), if failed, can only be a major factor for an accident in case other slices (layers) also fail (exhibit “holes”).
8. When analyzing the safety of an aircraft, an air traffic control system must also be regarded for total safety assessment since air travel is actually a system of systems.
9. In a system engineering process, a verification test plan must always consist of test cases that are referencing implementation units which again reference system architecture documentation and the respective requirements.
10. The correctly implemented system, verified through the complete test plan with perfect coverage, is always a safe system.
Self-assessment Key
1. False 2. True 3. False 4. True 5. True 6. False 7. False 8. True 9. True 10. False
Chapter 2
System Requirements and Functions
Introduction If you try to remember your last project, it is not uncommon that you will recall a document with the title similar to “requirements specification.” Unfortunately, many would remember this document being an incomplete set of things the product should do, and that many aspects were left unsaid and decided during the development phases. Some requirements were vaguely defined, puzzling the developers about how to decompose and implement them. Frequently, it happened that some things were known but not written down, or that the final design deviated from the actual desires of stakeholders which were omitted or simply not formulated correctly. Now let us put that situation in the context of safety-critical systems. The problems become obvious. If we do not know what the system shall exactly do, and which functions it shall exactly perform, we will end up not knowing what kind of harm or damage the system may produce as a side effect. Incorrect, incomplete, or missing requirements specifications, according to research, actually account for about 40% of all problems with system safety. This is truly an alarming indicator! This is why we need to dedicate this chapter to system requirements and system functions and try to understand how requirements engineering and corresponding processes should be performed in any design, and most importantly in safety-critical system designs. It is essential to fully scope the list of clearly described functions the system shall perform since only by listing functions we can analyze potential failures and failure modes of those functions. That way we can figure out if and to what extent our system behavior may result in a mishap. Requirements are all about communication. In the same way that two persons exchange information about anything through speech, requirements communicate the intended functions of the system to all the stakeholders and make sure that everybody is aligned. 
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_2
Stakeholders being people, the requirements must be expressed in a way that is easily understood and unambiguous. Since each stakeholder looks at the system from his or her unique angle, much important
information can be left out in communication because it is regarded as “goes without saying.” This is why a careful requirements engineering process is required. This process starts with requirement elicitation. The need to elicit requirements, instead of expecting them to be specified for you by some “invisible external entity,” changes our thinking. We (i.e., the requirement engineers!) shall gather and figure out requirements by interacting with all the stakeholders in any way required, so that we draw out every single bit and document each requirement correctly. It is also very important to know who your stakeholders are and to understand that the list of stakeholders is much broader than we immediately think. Our stakeholders are not only the management and the customers. The list also includes end users, suppliers, developers, shareholders, competitors, and standardization bodies, as well as society: people that might be affected by the system, governments and legislators, trade unions, and associations. Requirements engineering equally requires the requirements to be expressed, formulated, and documented in a specific, nonambiguous, and consistent way. Several processes must also be followed meticulously, specifying how the requirements shall be negotiated, validated, coordinated, updated, tracked, traced, and changed. This chapter will give you a much-needed introduction to all these aspects and make you aware of the available practices so that you do not miss this critical first step in the design of a safety-critical system.
Video Lesson This chapter has a corresponding video lesson: sfs2.nit-institute.com
Lecture Notes A correct definition of system requirements is essential to correctly define system functions and afterward assess hazards, making the requirements definition among the most important phases with respect to system safety. Bad requirements specifications are frequently encountered in practice (~40%) which is alarming in the sense of safety (Figs. 2.1 and 2.2). The requirement is a needed condition or capability of the system. Functional requirements define system functions that satisfy the user’s need to achieve a goal. Requirements need to be elicited (gathered, figured out) since they are not given directly by the stakeholders. Then, requirements need to be correctly documented (formulated, expressed), communicated (negotiated, validated, coordinated), and finally managed (updated, tracked, traced). Requirements are hierarchically organized so that they support traceability (e.g., high-level requirements -> system requirements -> functional requirements -> technical/hardware/software requirements), and they also reference the design specification and items which specify, implement, and verify them. Requirements are elicited from various stakeholders: end users, customers, shareholders, competitors, managers, developers, and suppliers, but also government/legislators, trade unions, and other affected groups in the society. Our stakeholders may also be in the environment (beyond system boundary!). Requirements need to be clearly documented (in a semiformal or formal way, in writing). They define what shall be done, instead of how it shall be implemented. Ambiguity in wording shall be avoided, and completeness shall be sought. Consistency and traceability throughout the requirements specification are essential, as well as procedures for maintaining the specification and agreeing to requirement definitions and changes.
Fig. 2.1 Distribution of problems with incorrect, incomplete, or missing requirements specifications. (Source: Fraunhofer study on functional safety in automotive – ISO 26262, 2013) (panels: INCORRECT, INCOMPLETE, MISSING; response scale in each panel: never, rarely, frequently)
Fig. 2.2 An infamous “Project Tree” illustration of what happens with the requirements specified incorrectly
The most important requirement groups are functional requirements (“The vehicle shall drive autonomously in a traffic jam on a highway”), quality requirements (“The display shall be readable in bright sunlight”), and constraint requirements (“The power consumption of the controller shall not exceed 10 W”). Safety requirements are a special case of constraint requirements (“Autonomous driving shall be disabled if the driver becomes inattentive”). Requirements engineering is a part of system engineering, dealing with requirements development (elicitation, documentation) and requirements management (change management, coordination and communication, tracking and monitoring, escalation).
Methods to develop and elicit requirements include the Kano model, which classifies the initial set of requirements according to the level of goal (need) fulfillment and the level of satisfaction the inquired stakeholder expresses with this fulfillment. This method requires a questionnaire in which, for each of the requirements, the stakeholder uses the Likert scale (“I like that,” “I expect that,” “I am neutral,” “I tolerate that,” “I dislike that”) to answer the positive question (“What if the requirement/feature is there/working”) and the negative question (“What if the requirement/feature is not there/not working”). Based on the answers, the requirements are classified as “Performance” (key differentiator for the system), “Must have” (mandatory), “Delighter” (not expected, but increasing value), “Questionable” (need further elicitation), “Indifferent” (not perceived as important), and “Reverse” (shall be removed or redefined) (Figs. 2.3 and 2.4).
Fig. 2.3 Kano model
Rows: answer to “What if the feature is there / working.” Columns: answer to “What if the feature is not there / not working.”
(there \ not there) | I like that | I expect that | I am neutral | I tolerate that | I dislike that
I like that | Questionable | Delighter | Delighter | Delighter | Performance
I expect that | Reverse | Questionable | Indifferent | Indifferent | Must have
I am neutral | Reverse | Indifferent | Indifferent | Indifferent | Must have
I tolerate that | Reverse | Indifferent | Indifferent | Indifferent | Must have
I dislike that | Reverse | Reverse | Reverse | Reverse | Questionable
Fig. 2.4 The table used in a Kano requirement elicitation model questionnaire
Elicitation takes place in brainstorming sessions (in workshops with stakeholders) or in wider inquiries via questionnaires. The formulation of the requirement is in writing, in natural language: “The system” + “shall/should/will/may” + [process verb] / “provide” [whom] “with the ability to” [process verb] / “be able to” [process verb], followed by the [object]. All the requirements together form a system requirements specification (SRS). This specification has several formal notation elements, such as ID (with clear nomenclature), Name, Description (requirements definition), Author, Version, Traceability (Derived from/Used by/Related to), Priority, Note, etc. (Fig. 2.5).
ID | Name | Description | Author | Ver. | Derived from | Used by | Prio | Note
FR_001 | Traffic Jam Pilot (TJP) Activation | The vehicle shall enable the TJP function to be activated if the speed is below 40mph, driver is attentive and the guard rail is detected. | Bjelica | 0.5 | HLR_001 | TR_001 TR_002 TR_003 TR_004 | 1 | To be discussed.
FR_002 | Cruise control start | The vehicle shall maintain the current speed when activated by the driver. | Bjelica | 1.0 | HLR_002 | TR_005 TR_006 | 2 | -
FR_003 | Cruise control stop | The vehicle shall stop maintaining the current speed if the driver makes any action upon vehicle pedals or the steering wheel. | Bjelica | 1.0 | HLR_002 | TR_007 TR_008 | 2 | Discuss other means of stopping.
QR_001 | ISO 26262 | Digital Cockpit Domain Controller shall adhere to ISO 26262 ASIL B. | Valls | 0.7 | HLR_037 | All TR | 2 | Under review
Fig. 2.5 Example excerpt from a system requirements specification, showing traceability (figure annotation: “TRACEABILITY!”)
With the requirements at hand (especially the functional ones), we can try to figure out what happens if a function fails and whether there is a hazard (potential for damage or harm) to people, the environment, or property (i.e., whether the failure is dangerous). Safety requirements then state identified constraints that shall prevent this dangerous failure from happening. Further, safety functions can be defined that implement the safety requirements.
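The Kano questionnaire table (Fig. 2.4) is essentially a lookup from two answers to one category, which the following minimal sketch makes explicit. The short answer keys (`"like"`, `"expect"`, ...) and the function name `classify` are assumptions introduced only for illustration; the category entries are copied from the table.

```python
# Answer scale, in the column order of the table (hypothetical short keys).
ANSWERS = ["like", "expect", "neutral", "tolerate", "dislike"]

# Row key: answer to "What if the feature is there / working".
# Column order: answer to "What if the feature is not there / not working".
KANO_TABLE = {
    "like":     ["Questionable", "Delighter", "Delighter", "Delighter", "Performance"],
    "expect":   ["Reverse", "Questionable", "Indifferent", "Indifferent", "Must have"],
    "neutral":  ["Reverse", "Indifferent", "Indifferent", "Indifferent", "Must have"],
    "tolerate": ["Reverse", "Indifferent", "Indifferent", "Indifferent", "Must have"],
    "dislike":  ["Reverse", "Reverse", "Reverse", "Reverse", "Questionable"],
}


def classify(when_present, when_absent):
    """Return the Kano category for one stakeholder's pair of answers."""
    return KANO_TABLE[when_present][ANSWERS.index(when_absent)]


print(classify("like", "dislike"))     # Performance
print(classify("neutral", "dislike"))  # Must have
print(classify("like", "neutral"))     # Delighter
```

Keeping the table as data rather than as nested conditions makes it easy to compare against the printed questionnaire cell by cell.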
Exercise 2 For the electric scooter with geofencing, discussed in Chapter 1, now we need to construct a brief system requirements specification (on the functional level). In the first part of the exercise (20 min), your team shall construct a requirements specification of at least ten requirements, denoting the requirements that present system functions (functional requirements), and other requirements related to constraints (safety!) or quality. In the second part of the exercise (10 min), the requirements shall be reviewed with the stakeholders, which are from your counterpart team, using the Kano model. In the last 10 min of the exercise, your team shall prioritize the requirements according to the outputs of the Kano analysis (Fig. 2.6). Your Tasks for the Exercise • Based on the discussion from Chap. 1, now fill in the requirements table correctly. To show traceability, consider some high-level requirements but do not spend time writing them in. Also, use dummy technical requirements for reference at this point.
ID | Name | Description | Author | Ver. | Derived from | Used by | Prio | Note
FR_001 | Traffic Jam Pilot (TJP) Activation | The vehicle shall enable the TJP function to be activated if the speed is below 40mph, driver is attentive and the guard rail is detected. | Bjelica | 0.5 | HLR_001 | TR_001 TR_002 TR_003 TR_004 | 1 (P) | -
FR_002 | Cruise control start | The vehicle shall maintain the current speed when activated by the driver. | Bjelica | 1.0 | HLR_002 | TR_005 TR_006 | 2 (D) | -
FR_003 | Cruise control stop | The vehicle shall stop maintaining the current speed if the driver makes any action upon vehicle pedals or the steering wheel. | Bjelica | 1.0 | HLR_002 | TR_007 TR_008 | 3 (I) | Discuss other means of stopping.
FR_004 | Fast acceleration | The vehicle shall accelerate upon pedal press using the highest torque available. | Stevic | 0.5 | HLR_005 | TR_012 | (R) | Removed after Kano.
QR_001 | ISO 26262 | Digital Cockpit Domain Controller shall adhere to ISO 26262 ASIL B. | Valls | 0.7 | HLR_037 | All TR | 2 | Under review
Fig. 2.6 System requirements specification after Kano analysis (prioritized)
• Mark all functional requirements with the prefix FR, quality with QR, constraint with CT, and safety with SF.
• Make sure to follow the formulation wording correctly (e.g., “The system shall . . .”), making sure to be as clear, nonambiguous, and complete as possible.
• Present the table to the counterpart group. Split all requirements into the categories (Delighter, Performance, Must Have, Questionable, Indifferent, or Reverse) according to the Kano analysis, prioritize them (Performance, then Delighters, then Indifferent), and remove the Reverse ones.
• What about the completeness of the stakeholder list? Can you actually remove some requirements based on any stakeholder response? Get ready to discuss!
To-Do List
• Perform the exercise individually or with your peers. One person can share the screen and keep notes while all contribute.
• Create presentables (e.g., drawing, Excel calculation).
• Discuss your solution and share it with others.
Note: Digital files for this exercise are available at sfs2.ex.nit-institute.com
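The prioritization step of the exercise can be sketched in a few lines. The ordering Performance, Delighter, Indifferent and the removal of Reverse requirements follow the task description; placing Must Have first and Questionable last is an assumption made here for illustration, since the task text does not rank them. The function name `prioritize` is hypothetical.

```python
# Rank per Kano letter. P, D, I order is from the exercise text;
# M first and Q last are ASSUMPTIONS of this sketch.
KANO_RANK = {"M": 0, "P": 1, "D": 2, "I": 3, "Q": 4}


def prioritize(requirements):
    """Drop Reverse (R) requirements and sort the rest by Kano rank.

    `requirements` is a list of (requirement_id, kano_letter) pairs.
    sorted() is stable, so requirements with equal rank keep their order.
    """
    kept = [r for r in requirements if r[1] != "R"]
    return sorted(kept, key=lambda r: KANO_RANK[r[1]])


# Kano letters as shown in Fig. 2.6.
spec = [("FR_003", "I"), ("FR_004", "R"), ("FR_002", "D"), ("FR_001", "P")]
print(prioritize(spec))  # [('FR_001', 'P'), ('FR_002', 'D'), ('FR_003', 'I')]
```

The resulting order matches the priorities 1 (P), 2 (D), 3 (I) in Fig. 2.6, with FR_004 removed after the Kano analysis.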
Exercise 2 Template
Electric scooter with geofencing – Requirements specification
ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R) | Note
FR_001 | | | | | | | | |
QR_001 | | | | | | | | |
SF_001 | | | | | | | | |
Exercise 2 Sample Solutions See several exemplary solutions to the exercise:
Solution 1
Electric scooter with geofencing – Requirements specification
ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R)
FR_001 | Acceleration | The vehicle shall accelerate when accelerate lever is pulled on. | Vanja Arbutina | 0.5 | HLR_001 | TR_001 TR_011 | 1 | M
FR_002 | Braking | The vehicle shall slow down when brake lever is pulled on. | Nebojsa Cvijic | 0.5 | HLR_002 | TR_002 TR_022 | 1 | M
FR_003 | Folding | The vehicle shall be able to be folded. | Vladimir Pavlovic | 0.5 | HLR_003 | TR_003 | 2 | D
FR_004 | Steering | The vehicle shall change direction according to steering handle. | Caner Dur | 0.5 | HLR_004 | TR_004 | 1 | M
FR_005 | Charging | The vehicle shall be able to be recharged. | Dario Peric | 0.5 | HLR_005 | TR_010 | 1 | M
FR_006 | Battery indicator | The vehicle shall have battery state indication. | Vanja Arbutina | 0.5 | HLR_006 | TR_009 | 3 | D
FR_007 | GPS | The vehicle shall have communication with GPS. | Nebojsa Cvijic | 0.7 | HLR_007 | TR_005 | 2 | M
CR_001 | Power consumption | The vehicle shall not consume more than 50W at average. | Vladimir Pavlovic | 0.5 | HLR_005 | TR_006 | 3 | D
SR_001 | Geofencing | The vehicle shall indicate when leaving safe zone. | Caner Dur | 0.7 | HLR_007 | TR_015 | 2 | M
QR_001 | Load weight | The vehicle shall be able to carry 120kg | Dario Peric | 0.5 | HLR_010 | TR_008 | 2 | M
Note (one entry in the table): To be more clarified in future.
Review Comments by the Instructor • FR_007 seems more like a technical requirement than a functional requirement. Functional requirements describe system functions (goals) – the system provides these functions toward their users/operators, and the effects of those functions manifest toward the environment, at the system boundary (as well as failures of those functions!). Instead, say, FR_007 might cover the need for geofencing (as in SR_001), whereas FR_007 might become TR_XXX derived from SR_001. • Generally, when deriving functional requirements, it is good to first list the operation modes of the system (e.g., driving, parked, rolling, carrying) and then list various functions for each of the modes.
Solution 2 Electric scooter with geofencing – Requirements specification ID
Name
Description
Author
Ver.
Derived from
Used by
Priority
Kano (P/M/D/I/Q/R)
CR_001
Maximum velocity
The maximum velocity of the system shall not exceed 30 km/h.
Jelić
1
HLR_001
TR_001 TR_002
2
D
QR_001
ISO 26262
The system shall be developed according to ISO 26262 ASIL B
Miškulin
1
HLR_002
ALL TR
1
M
CR_002
Battery life in idle state
The system battery shall be able to provide power for at least 2 hours in idle mode
Brkljač
1
HLR_001
TR_003
1
M
Folding mode
The system shall be foldable for carry
HLR_004
TR_004 TR_005
5
M
1
M
1
M
2
M
FR_001
SR_001
Full stop
FR_002
Acceleration
FR_003
Braking
1
HLR_001
Miškulin
1
HLR_001
Ilić
1
HLR_001
Ilić
1
HLR_001
TR_009
3
M
Brkljač
1
HLR_003
TR_010
4
M
Jelić
1
HLR_004
TR_011
3
M
2
M
2
D
FR_004
Steering
FR_005
Shutdown when geofenced
The system will be able to safely shutdown when exiting geofenced area.
Low battery warning
SR_003
Leg stand during driving
QR_002
Display adaptiveness
1
Lubina
The system shall be able to change direction when steered.
SR_002
Jelić
The system must be able to fully stop within 10m at maximum velocity and load. The system shall be able to accelerate when holding the acceleration lever The system shall be able to decelerate when holding the brakes
The system shall indicate low battery warning when the percentage drops below 10% The system shall provide that the leg stand shall not be opened during driving The system shall provide that the display is adaptable to brightness in the environment
TR_006 TR_007 TR_008 TR_006 TR_007 TR_008 TR_006 TR_007 TR_008
Lubina
1
HLR_005
TR_006 TR_007
Brkljač
1
HLR_002
TR_020
Note
Discuss other standards?
Not understandable. Which direction and how steered? How to shutdown when exiting area? Can be elaborated as technical requirement.
Display remains clearly visible
Review Comments by the Instructor • Usually, functional requirements come first, stemming directly from high-level requirements. Constraint and quality requirements are very important, but in the presentation to the stakeholders, they shall be put second since they sound more technical. Generally, the requirement engineer shall act as a facilitator of the communication among the stakeholders; therefore, easy-to-understand requirements (nonambiguous, agreeable) are desired. • Kano analysis yielded Delighter for the CR_001 which states the maximum allowed velocity of the scooter. Usually, this requirement delights safety engineers or legislators/auditors, so it is indeed possible. Please note here that
depending on the stakeholders, Kano analysis may give different results, and sometimes the results must be averaged or weighted according to the importance of a stakeholder. It is in the end all about the argumentation so many approaches in prioritization and agreements are possible as long as the problem has been analyzed from various angles and among all relevant stakeholder groups. • Make sure not to have a vague requirement definition. For example, QR_002 states “adaptable to brightness.” This is unclear. What kind of adaptation shall be provided? Instead, “display shall remain visible for all daylight brightness levels” would be more clear. Also, “change direction when steered” in FR_004 is also vague; better: “change the direction according to the position of the steering handle.” • FR_005 is rather a safety requirement; however, it must also have additional requirements (maybe derived from it) which would specify in more detail how such a critical operation must be handled (e.g., producing an alarm, slowing down, etc.).
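The remark about weighting Kano results by stakeholder importance can be illustrated with a simple weighted vote. This is only one possible approach and not a prescribed method; the weights, the stakeholder roles in the comments, and the function name `weighted_kano` are all hypothetical.

```python
from collections import defaultdict


def weighted_kano(votes):
    """Return the Kano category with the largest total stakeholder weight.

    `votes` is a list of (category, weight) pairs, one per stakeholder.
    On a tie, the category that was voted for first wins.
    """
    totals = defaultdict(float)
    for category, weight in votes:
        totals[category] += weight
    return max(totals, key=totals.get)


# Hypothetical answers for a "maximum velocity" requirement.
votes = [
    ("Delighter", 3.0),    # e.g., safety engineer, weighted higher (assumption)
    ("Indifferent", 1.0),  # e.g., end user
    ("Indifferent", 1.0),  # e.g., another end user
]
print(weighted_kano(votes))  # Delighter
```

Whatever aggregation is chosen, the weights themselves need to be argued for and agreed on with the stakeholders, in line with the comment above that prioritization is in the end about argumentation.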
Solution 3 Electric scooter with geofencing – Requirements specification ID
Name
FR_001
Increasing speed
FR_002
Decrease velocity
FR_003
HMI information
FR_004
Lights
Description The scooter shall increase speed when the driver pulls the gas lever The scooter shall decrease speed when the driver release the gas lever The scooter shall present information about the vehicle speed, battery percentage, driving mode and mileage on display The scooter shall turn on the lights when the visibility (light conditions) drops below a level The scooter should provide function to be folded for easier transport
Author
Ver.
Derived from
Used by
Manic
1
HLR_001
TR_001 TR_0011
Beric
1
HLR_001
TR_004
Basta
1
HLR_001
TR_005 TR_012 TR_013
Basta
1
HLR_002
TR_006
Priority
Kano (P/M/D/I/Q/R)
1
M
1
M
Note
LIGHT_LEVEL is referenced in the definition part of the requirements
FR_005
Fold
Basta
1
HLR_002
TR_007
SF_001
Maximum velocity
The scooter shall not exceed maximum velocity of 25 km/h
Videnovic
1
HLR_002
TR_003
2
P
SF_002
Maximum acceleration
The scooter shall not exceed the maximum acceleration of
Basta
1
HLR_002
TR_008
1
M
MAX_ACC_VALUE is referenced in the definition part of the requirements
SF_003
Battery level
The scooter shall not be able to drive if battery level below 10%
basta
1
HLR_002
TR_009
Q
Rejected after review
basta
1
HLR_002
TR_010
1
M
Karan
1
HLR_002
TR_002
3
I
SF_004
Weight limit
QR_001
Constant velocity
The scooter shall not be able to drive if there is weight below 25kg The scooter should remain constant velocity per certain drive mode (SPORT, ECO, CITY)
Review Comments by the Instructor • Again, different stakeholders may respond differently to requirements (e.g., Must-Have for SF_004 could change if we are to make inquiries among children). • SF_003 is removed after Kano; however, it is questionable whether the safety engineer was among the stakeholders (to bring additional perspective to this requirement).
• In FR_005, make sure not to use vague terms, such as “easier transport.” Better: “to enable the carrying with one hand.” • Make sure to have variables resolved in the same document, in the definitions section, and that they are kept together with the specification so that those values cannot be arbitrarily interpreted.
Solution 4 Electric scooter with geofencing – Requirements specification

| ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R) |
|---|---|---|---|---|---|---|---|---|
| FR_001 | Acceleration | The electric scooter shall enable acceleration and deceleration. | Bojkić | 1.0 | HLR_001 | TR_001, TR_002 | 1 | M |
| FR_002 | Steering | The electric scooter shall provide steering control for the driver. | Bojkić | 1.0 | HLR_001 | TR_003 | 1 | M |
| FR_003 | Braking | The electric scooter shall provide braking control for the driver. | Bojkić | 1.0 | HLR_001 | TR_004 | 1 | M |
| FR_004 | Weight limit | The electric scooter shall prevent starting if the weight exceeds the limit. | Barić | 1.0 | HLR_002 | TR_005, TR_006, TR_007 | 2 | D |
| FR_005 | Geofencing | The electric scooter should prevent driving outside of the defined geofencing zone. | Mihić | 1.0 | HLR_002 | TR_008, TR_009 | 2 | D |
| FR_006 | Battery alert | The electric scooter should give the driver an alert if the battery level is insufficient for the travel distance. | Mihić | 1.0 | HLR_002 | TR_010, TR_011, TR_012 | | |
| FR_007 | Mobile control | The electric scooter should provide controls for turning on and off via the mobile application. | Simić | 1.0 | HLR_003 | TR_013 | | |
| FR_008 | Sound signaling | The electric scooter may be able to produce sound alerts. | Simić | 1.0 | HLR_004 | TR_014, TR_015 | | |
| QR_001 | Light signaling | The electric scooter shall have lights bright enough so that the driver can navigate safely on the road with insufficient visibility. | Kaštelan | 1.0 | HLR_004 | TR_016, TR_017 | | |
| SF_001 | Velocity control | The electric scooter shall prevent powering off unless it is not stationary. | Mihić | 1.0 | HLR_037 | TR_125 | | |

Note: “Safety requirement?”; “To the TR level?”
Group 4
Review Comments by the Instructor
• FR_008 is rather a technical requirement that can be derived from a functional requirement in which sound alerts may be utilized (e.g., battery low alert). Functions at the scooter level shall clearly describe the goals of the complete system toward the users/operators and with respect to the environment.
• Geofencing may even be stated as a safety goal (SG), which is defined together with the high-level requirements (HLRs). Then, safety requirements (SFs) are derived from SGs, and FRs are derived from HLRs, whereas SFs are related to FRs.
• FR_007 might be made clearer by stating the connection with the use case (e.g., renting), which can be set up in the HLR from which FR_007 was derived.
• The weight limit shall be set in FR_004, or in an additional FR if the configurability of this limit (e.g., in the maintenance phase) is to be enabled (although this is unlikely).
• QR_001 is impossible to implement as it is currently formulated.
Key Recap Questions
In this chapter, you have been reading about and practicing exemplary requirements engineering practices through requirement elicitation:
• Think about a recent system you worked with.
• Remember its requirements specification.
• Were you involved in the elicitation?
• How would YOU perform the elicitation if given the task?
• Imagine how bad (or nonexistent) requirements impact safety.
Self-assessment
Now take the time to self-assess your knowledge by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book.
1. Requirement elicitation is performed among the project team members in a company developing a system.
2. Social groups outside the system boundary (in the system environment) are among the stakeholders in the requirements development process.
3. Failing to fulfill a constraint requirement may bear the risk to system safety.
4. The requirement engineer is not in charge of requirements communication, coordination, and escalation.
5. To apply a Kano analysis, the initial requirements specification needs to be assembled as a proposal to the stakeholders.
6. If a stakeholder expects that a listed feature is not present in the system, but would like it to be present, then this feature is assessed to be a Delighter according to Kano analysis.
7. Kano analysis helps us to detect features that “go without saying” and are usually not directly stated by the stakeholders.
8. System requirements specification allows a hierarchical view of the requirements, allowing traceability from, e.g., the functional requirements to technical requirements, but traceability to other artifacts (design specification, items, tests) is not maintained.
9. Safety requirements can only be defined by assessing failures of functions implementing functional requirements.
10. Dangerous failures of system functions, which implement functional requirements of the system, always lead to system hazards and potential accidents; therefore, the requirements specification needs to be complete before the safety assessment.
Self-assessment Key
1. False
2. True
3. True
4. False
5. True
6. True
7. True
8. False
9. False
10. True
Chapter 3
System Safety
Introduction
To set the stage for understanding system safety, how it is defined, and how it can be provided, we first needed to understand how systems are defined and specified. With this done in Chaps. 1 and 2, through the demystification of safety-critical systems and their requirements and functions, we can now knowingly dive into the challenging world of system safety. We deem a system safe if it can be reasonably shown that it cannot cause unacceptable harm to people or the environment, or damage to property. It is immediately obvious that safety is not an absolute category; we always accept some level of risk involved in using the system. Therefore, it is important to be able to express the level of risk that is acceptable. To express this risk, we first need to be able to quantify it. Methods to assess and quantify the risk involved with a system are a major consideration in the field of system safety, and they are addressed in this chapter. To assess risk correctly, we need to understand what our system is doing. This means we need to go back to the system requirements and functions to assess what happens when those functions fail. Further, the system boundary tells us which system components may fail, and which targets in the system environment might be affected by those failures. This allows us to elicit hazards, which are all potential situations causing the system to fail in a harmful or dangerous way. Only by luring out all imaginable hazards can we start quantifying the risk of each hazard and evaluate the extent of harm or damage each hazard may cause if activated. Notions such as accident, incident, hazard, risk, fault, error, failure, severity, and probability are all very important and precisely defined in the field of system safety, so they are carefully addressed in this chapter as well.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_3
Learning the quantitative and qualitative methods to express the risk of a hazard allows us to figure out the acceptability of the risk and to treat it adequately to bring it down to the acceptable level. Active and passive safety measures are introduced, which allow us to act starting from the earliest design phases and inherently build safety into the systems we design. Active measures allow the safety mechanism designed for the system to react when an operational anomaly or abnormality is detected and bring the system to a state in which the hazard can no longer activate (e.g., cutting off the power supply, applying brakes, sounding alarms, opening gates, turning off a function). Passive measures may help reduce the risk by introducing protective gear or equipment (e.g., safety helmets, safety belts in cars, warning signs or labels), or by providing additional safety training. Active measures are more interesting since they include technical implementations which potentially require electronics (hardware), software, and mechanical parts, and they represent subsystems that are critical with regard to their reliability. The field dealing with the prescription of active measures and their correct design and assessment is called functional safety, and it is specifically considered from Chap. 5 onward. Finally, process models required in system engineering are now extended with the respective processes of safety engineering. Those processes include systematic ways of how hazards are analyzed and assessed, how they can be removed, and what needs to be done with the remaining (residual) failures in the context of risk acceptability and the requirements set forth by the safety standards, the legislators, and finally, us all as societies.
Video Lesson This chapter has a corresponding video lesson: sfs3.nit-institute.com
Lecture Notes
Safety is a condition of being protected from danger, risk, or injury. Safety, therefore, is freedom from conditions causing death, injury, occupational illness, damage or loss of property, or damage to the environment. Safety is not absolute; we want our system to function with an acceptable minimum of accidental loss. Various areas of safety exist in general (food safety, occupational health and safety, public safety, job safety, drug safety, etc.). System safety deals with the safety of technical systems, which, when deployed, shall not cause harm or damage. Functional safety is related to system safety, regarding only active measures which are needed to keep or bring the system to the safe state (Fig. 3.1).

Fig. 3.1 Types of safety (labels: Safety; Food Safety; Occupational Health & Safety; Public Safety; Environment Safety; Drug Safety; Job Safety; System Safety, for technical systems; Functional Safety)

A chain of events with a system may lead to an accident if the last event in the chain causes damage or harm to a target (person, environment, property). If the chain only partially executes, with damage or harm avoided, we regard it as an incident (almost an accident). A hazard is the potential for an accident (e.g., “driving on the icy road”). It is an imagined situation that may materialize as an accident if actuated (e.g., “braking when driving”). When assessing safety, we need to list all the hazards and evaluate them: which conditions would allow a hazard to materialize into an accident. Each hazard, therefore, has a causal factor (“icy road, braking”) and an associated probability (“chance of braking on the icy road”). It also has targets that can be harmed or damaged, as well as the severity of the harm or damage caused (assessing consequences). Within a safety analysis and its associated processes, we seek to remove hazards, thereby also removing the potential accidents/incidents (Fig. 3.2).

Fig. 3.2 Hazard and its components leading to incidents and accidents (labels: Causal factors; Probability; Actuation (Event); Hazard; Target(s); Incident; Accident; Harm; Damage; Severity)

Risk is the probability (P) that the causal factors of the hazard will materialize into an accident of a certain severity (S). Each hazard, therefore, has an associated risk as a quantitative or qualitative measure to assess the seriousness of the hazard with regard to its required consideration and removal. The quantitative expression of risk is the product of probability, in [0, 1], and severity (e.g., number of fatalities):

$$R_H = P \cdot S$$

Risk can also be expressed in “units per hour”:

$$R_{H\Delta t} = R_H \cdot \frac{1}{\Delta t} \left[\frac{U_{sev}}{h}\right]$$
Qualitative risk is a more usual definition, where the risk is expressed as a descriptive term (e.g., “Low,” “Medium,” “Serious,” or “High”), or as a letter or number (A–D, I–IV) with gradation of seriousness of the risk maintained. Each standard prescribes a risk assessment matrix (table), where probability category is stated descriptively, in the first column (“Frequent,” “Probable,” “Occasional,” “Remote,” “Improbable,” “Incredible”). Likewise, the severity category is stated descriptively, in the first row (“Catastrophic,” “Critical,” “Marginal,” “Negligible”). The intersection of values for probability and severity gives the resulting risk. Each category is described to help the assessor correctly select each value, but very often their selection is based on argumentation (Fig. 3.3). Once the risk is defined, its acceptance is discussed. If the risk is too high, the associated hazard needs to be treated and reassessed until the risk envelope drops below the decided threshold (Fig. 3.4).
| Category | Range (failures per year) | Catastrophic | Critical | Marginal | Negligible |
|---|---|---|---|---|---|
| Frequent | > 10^-3 | I | I | I | II |
| Probable | 10^-3 to 10^-4 | I | I | II | III |
| Occasional | 10^-4 to 10^-5 | I | II | III | III |
| Remote | 10^-5 to 10^-6 | II | III | III | IV |
| Improbable | 10^-6 to 10^-7 | III | III | IV | IV |
| Incredible | < 10^-7 | IV | IV | IV | IV |

Fig. 3.3 Risk assessment matrix as per IEC 61508
Fig. 3.4 Risk envelope and risk acceptability
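The lookup in the risk assessment matrix can be sketched in a few lines of Python. This is only an illustration: the matrix values are transcribed from Fig. 3.3, while the function names and the acceptance threshold (risk classes III and IV counted as acceptable) are assumptions made for the example, not something prescribed by the text or the standard.

```python
# Severity columns and probability rows as they appear in Fig. 3.3.
SEVERITIES = ("Catastrophic", "Critical", "Marginal", "Negligible")

RISK_MATRIX = {
    "Frequent":   ("I",   "I",   "I",   "II"),
    "Probable":   ("I",   "I",   "II",  "III"),
    "Occasional": ("I",   "II",  "III", "III"),
    "Remote":     ("II",  "III", "III", "IV"),
    "Improbable": ("III", "III", "IV",  "IV"),
    "Incredible": ("IV",  "IV",  "IV",  "IV"),
}

RISK_ORDER = ("I", "II", "III", "IV")  # I is the most serious class


def risk_class(probability: str, severity: str) -> str:
    """Return the risk class at the intersection of the two categories."""
    return RISK_MATRIX[probability][SEVERITIES.index(severity)]


def is_acceptable(risk: str, threshold: str = "III") -> bool:
    """Assumed acceptance rule: a class is acceptable if it is no more
    serious than the threshold class."""
    return RISK_ORDER.index(risk) >= RISK_ORDER.index(threshold)


risk = risk_class("Remote", "Catastrophic")
print(risk, is_acceptable(risk))  # II False
```

A hazard whose class fails the check would be treated and reassessed, which corresponds to moving it down the risk envelope.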
Hazards can be prevented by addressing various phases of system development and execution. An internal fault is a condition in the system that was introduced during design or development and is inherently present (it is in the system “all the time” – bugs, wrong design decisions, inadequate material properties, etc.). An external fault is a lack of consideration for an external effect in the design (e.g., not considering ice on the road). Faults are dormant until activated by a causal factor. Once a fault is activated, errors appear in the system (wrong internal state, calculations, variable values, specific internal behavior). Errors may propagate all the way to the system boundary, materializing as failures of system functions (functions not producing the prescribed effects or not fulfilling goals set in the design). Failures may be dangerous, meaning that under specific conditions, they materialize into incidents or accidents, with potentially harmful consequences (Fig. 3.5).

Fig. 3.5 Failure chain

Hazard and risk analysis always starts by analyzing failures of system functions and their various failure modes (how the function may fail). If a dangerous failure is detected, a new hazard is defined and its risk is assessed. A hazard can be removed by strict adherence to the process model in the system life cycle (fault prevention), by removing faults once detected (through redesign – fault removal), by detecting and correcting errors (during runtime – fault tolerance), or by addressing failures (via passive safety measures, e.g., protective equipment). Failures are therefore classified as dangerous, detectable, and undetectable; dangerous undetectable failures are considered remaining, and their respective hazards and risks need to be assessed and discussed with regard to acceptability.
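The classification of failures described above can be illustrated with a small Python sketch. The class, the function name, and the three example entries are hypothetical; the flags are assumed values chosen for illustration, not results of an actual analysis.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Failure:
    description: str
    dangerous: bool   # may materialize into an incident or accident
    detectable: bool  # can be noticed (and acted upon) during operation


def remaining_failures(failures: List[Failure]) -> List[Failure]:
    """Dangerous undetectable failures, whose risk still has to be argued."""
    return [f for f in failures if f.dangerous and not f.detectable]


# Illustrative entries only (assumed flags).
failures = [
    Failure("brake is always applied", dangerous=False, detectable=True),
    Failure("no braking, reported by a monitor", dangerous=True, detectable=True),
    Failure("no braking, unnoticed", dangerous=True, detectable=False),
]

for failure in remaining_failures(failures):
    print(failure.description)  # no braking, unnoticed
```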
Calculation Examples
Task 1 Calculate the quantitative risk for a hazard: “a fault in a power press machine in the factory causes an unintentional start during maintenance and kills maintenance workers.” The maintenance crew, on average, has five workers, and the death or serious injury is imminent (unavoidable) in case of an unintentional start. Factory records show that unintentional starts happen once in every 100 maintenance cycles, with one maintenance per year on schedule.

Solution

$$R_H = P \cdot S$$

Severity can be expressed as a number of fatalities:

$$S = 5\ \text{fatalities}$$

Probability can be expressed as:

$$P = \frac{1}{100} = 0.01$$

Therefore:

$$R_H = P \cdot S = 0.05 = \frac{5\ \text{fatalities}}{100\ \text{maintenance cycles}}$$

$$R_{H\Delta t} = R_H \cdot \frac{1}{\Delta t} = \frac{0.05\ \text{fatalities}}{1\ \text{year}} = \frac{0.05\ \text{fatalities}}{365 \cdot 24\ \text{h}} = 5.7 \cdot 10^{-6}\ \frac{\text{fatalities}}{\text{h}}$$
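The same calculation can be checked numerically. The following is a minimal Python sketch; the function names are made up for the example, and a year is taken as 365 · 24 h, as in the solution above.

```python
HOURS_PER_YEAR = 365 * 24  # same convention as the worked solution


def quantitative_risk(probability: float, severity: float) -> float:
    """Risk as the product of probability (in [0, 1]) and severity."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return probability * severity


def risk_per_hour(risk: float, interval_hours: float) -> float:
    """Risk expressed per hour of the exposure interval."""
    return risk / interval_hours


# One unintentional start per 100 maintenance cycles, five workers affected.
r_h = quantitative_risk(1 / 100, 5)
# One maintenance cycle per year, so the interval is one year.
r_h_dt = risk_per_hour(r_h, HOURS_PER_YEAR)

print(f"{r_h:.2f} fatalities per maintenance cycle")  # 0.05
print(f"{r_h_dt:.1e} fatalities per hour")            # 5.7e-06
```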
Exercise 3
For the electric scooter with geofencing, discussed in Chaps. 1 and 2, now we need to identify hazards and assess the risk for each of the hazards. Hazards can be identified by assessing the failures of functions of the electric scooter, based on the requirements specification developed in Chap. 2. To assess the failures correctly, consider as many failure modes as possible for each function. You may also use guidewords, such as no, always (stuck), reverse (opposite), more, less, early, and late to help you figure out failure modes. For each of the identified failures, assess whether it is dangerous or not. In case a failure is dangerous, mark it as being a hazard. Then assess the risk for the hazard by using IEC 61508 risk assessment matrix. Mark and shortly describe causal factors for each hazard in the form of a failure chain according to the class of cause (external fault, internal fault, error) (Fig. 3.6).

| Function | Failure mode | Is a hazard? | Probability | Severity | Risk category | Failure chain (fault – error – failure) |
|---|---|---|---|---|---|---|
| FR_001 | Traffic Jam Pilot (TJP) activates on a highway with vehicle speed above 60 mph | Yes | Remote | Catastrophic | II | Fault in the speed sensor leads to error (wrong speed value) in the algorithm for TJP activation, leading to TJP turning on in inappropriate situation. |

Fig. 3.6 Example of a hazard and risk evaluation sheet

Your Tasks for the Exercise
• Based on the functional requirements from Chap. 2, now analyze hazards and fill in the hazard and risk evaluation sheet.
• To fill in the sheet, analyze failure modes for at least two functions of your choice. Use guidewords to help you pinpoint the possible failure modes. For each failure mode, create a new row in the sheet. If you determine that the failure is dangerous, mark it as a hazard and assess the risk.
• Risk assessment is done according to the IEC 61508 risk assessment matrix.
• Discuss and describe the failure chain leading to the potential accident. Distinguish between faults (external, internal), errors, and the failure in your description.

To-Do List
• Perform the exercise individually or with your peers. One can share the screen and keep notes, all contribute.
• Create presentables (e.g., drawing, Excel calculation).
• Discuss your solution and share it with others.

Note: Digital files for this exercise are available at sfs3.ex.nit-institute.com
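One possible way to prepare the sheet is to generate an empty row per guideword for each chosen function and then fill the rows in during the discussion. The Python sketch below assumes this workflow; the class and function names are hypothetical, and the example failure mode is illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional

# Guidewords as listed in the exercise description.
GUIDEWORDS = ("no", "always (stuck)", "reverse (opposite)",
              "more", "less", "early", "late")


@dataclass
class SheetRow:
    function_id: str
    function_name: str
    guideword: str
    failure_mode: str = ""             # described by the analyst
    is_hazard: Optional[bool] = None   # None until the failure is assessed
    probability: Optional[str] = None  # assessed only if the row is a hazard
    severity: Optional[str] = None


def candidate_rows(function_id: str, function_name: str) -> List[SheetRow]:
    """One empty row per guideword, as a prompt for the analysis."""
    return [SheetRow(function_id, function_name, g) for g in GUIDEWORDS]


rows = candidate_rows("FR_003", "Braking")
rows[0].failure_mode = "No braking when the brake lever is pulled"
rows[0].is_hazard = True
print(len(rows))  # 7
```

Rows for which no sensible failure mode exists can simply be dropped; the guidewords are a prompt, not a checklist that must be filled completely.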
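Exercise 3 Template
Power Press – Preliminary Hazard List (PHL)

| Hazard ID | Hazard description | P | S | R | R(.) | MEM (risk acceptable?) | Safety measure description | P | S | R | R(.) | MEM (risk acceptable?) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| H1 | | | | | | | | | | | | |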
Exercise 3 Sample Solutions See several exemplary solutions to the exercise:
Solution 1 Electric scooter with geofencing – Hazard and risk evaluation sheet

| Function ID | Function name | Failure mode description | Hazard? | Probability | Severity | Risk category | Failure chain |
|---|---|---|---|---|---|---|---|
| FR_002 | Braking | No Reaction when brake Lever is pull down | Y | Remote | Catastrophic | II | When cable is cut, leads to error in brake system, and it can cause a hazard incident/accident – undetectable failure on boundary of system |
| FR_002 | Braking | Brake Lever Stuck, and it is not possible to brake | Y | Frequent | Critical | I | When brake lever is stuck, leads to error in brake system, and it can cause a detectable failure on boundary of system |
| FR_002 | Braking | Reverse Reaction, when we press brake, scooter accelerate | Y | Remote | Critical | III | When brake lever is pressed, accelerate system due to bug in design, and it can cause a detectable failure on boundary of system |
| FR_002 | Braking | When we press brake lever, braking is more aggressive | Y | | | | Fault? Error? Failure? |
| FR_002 | Braking | When we press brake lever, braking is not sufficient | Y | | | | Fault? Error? Failure? |
| FR_002 | Braking | When we press brake lever, reaction is delayed | Y | | | | Fault? Error? Failure? |
| FR_002 | Braking | When we press brake lever, reaction is too early | N | Remote | Negligible | IV | Fault? Error? Failure? |
| FR_005 | Charging | Without possibility to charge the vehicle`s battery | N | Occasional | Marginal | III | Fault? Error? Failure? |
| FR_005 | Charging | Charging continues even when battery is full | Y | | | | Error |
| FR_005 | Charging | Instead, Battery charge, battery goes to discharging | N | | | | Fault? Error? Failure? |
| FR_005 | Charging | Battery is charged less then maximum battery capacity | N | | | | Fault? Error? Failure? |
Review Comments by the Instructor
• When assessing the probability of a risk, argumentation needs to be provided, targeting failure rates (statistics) based on reports from previous system versions or similar systems. At the very least, the rationale needs to be consistent throughout the PHL (e.g., probability of misuse is always higher than the probability of the failure of mechanical components, which is again higher than the probability of failure of electronics).
• Severity consideration may be impacted by the operation mode; therefore, in the usual PHI, the failure mode of the function is combined with each of the operating modes, to lure out all potential hazards related to that function failing. For example, driving speed of the scooter in various operating modes (e.g., pedestrian mode or speed mode) might affect the severity of the risk (e.g., in pedestrian mode, when the scooter is being driven around 5 km/h, we can argue that the driver might “jump off” and prevent the accident in many cases, and that also the severity of impact in some cases may be lower).
• Please make sure to use the guidewords correctly; e.g., if the function is “stuck,” this means it is “always-on,” e.g., always braking (vs. brake lever stuck, which is a misinterpretation of the function).
• Please note that in “stuck” failure modes, it is possible not to have a hazard (e.g., always braking means not available, but indeed safe since the vehicle is not moving). This helps decompose the failure rate of the braking subsystem since not all failures are dangerous (all safe failures, and also all dangerous detectable failures – we will see about those in subsequent lectures and courses – can be disregarded in the final SIL consideration).
Solution 2 Electric scooter with geofencing – Hazard and risk evaluation sheet

| Function ID | Function name | Failure mode description | Hazard? | Probability | Severity | Risk category | Failure chain |
|---|---|---|---|---|---|---|---|
| FR_001 | | no, always (stuck), reverse (opposite), more, less, early, late | Y/N | | | I-IV | Fault? Error? Failure? |
| FR_001 | Folding mode | Folding happens during driving. | Y | Remote | Critical | III | Fault: Metal pin is worn out. Error: Metal pin breaks. Failure: Scooter starts folding during driving. |
| FR_002 | Acceleration | Acceleration lever gets stuck and scooter accelerates out of control. | Y | Remote | Catastrophic | II | Fault: Jacket gets stuck between lever and steering wheel. Error: Acceleration lever unable to return to normal. Failure: Scooter accelerates out of control. |
| FR_002 | Acceleration | Acceleration lever reaction time is late. (Latency in response) | N | | | | |
| FR_002 | Acceleration | Acceleration lever gives more speed when used. | Y | Probable | Critical | I | Fault: Acceleration lever not calibrated. Error: Control unit gives wrong acceleration value. Failure: Stability while driving lost. |
| FR_003 | Braking | Braking works less when heavy rain. | Y | Occasional | Catastrophic | I | External Fault: Heavy rain. Internal Fault: Wires not isolated properly. Error: Breaks do not receive breaking signals. Failure: Not managing to break in time. |
| FR_003 | Braking | Braking handle breaks with more force then it should. | Y | Occasional | Marginal | III | Fault: Breaking lever not calibrated. Error: Control unit gives wrong breaking values. Failure: Breaking discs break with more force. |
Review Comments by the Instructor
• Make sure to properly analyze the environment; harm and damage are not only to the vehicle and the driver but also potentially to other traffic participants. For example, in a sudden folding case, if the scooter is allowed on the roads, heavy consequences can occur to many traffic participants in general, causing potentially many persons harmed.
• Generally, the output you created is really good!
Solution 3 Electric scooter with geofencing – Hazard and risk evaluation sheet

| Function ID | Function name | Failure mode description | Hazard? | Probability | Severity | Risk category | Failure chain |
|---|---|---|---|---|---|---|---|
| FR_001 | Increasing speed | Velocity increases rapidly | Y | Improbable | Critical | III | Fault in speed control unit leads to error (wrong acceleration value), that further leads to speed increase that is higher than wanted. |
| FR_001 | Increasing speed | Velocity does not increases when the driver pulls the gas lever | Y | Remote | Critical | III | Fault? Error? Failure? |
| FR_002 | Decrease velocity | Velocity does not decrease if driver releases the gas lever | Y | Remote | Critical | III | Mechanical fault in gas lever leads to losing control over speed control unit, the scooter maintains speed and causes traffic accident. |
| FR_002 | Decrease velocity | Velocity abruptly jumps to maximum value after reaching 0 level of acceleration | Y | Occasional | Critical | II | Negative value of speed is interpreted as maximum speed, causing the scooter to increase speed. |
| FR_002 | Decrease velocity | Velocity decreases rapidly | Y | Improbable | Marginal | IV | Fault? Error? Failure? |
| FR_005 | Fold | It is impossible to open scooter if it is folded | N | Remote | Negligible | IV | Fault? Error? Failure? |
| FR_005 | Fold | Scooter unfolded by its own | N | Remote | Marginal | III | Fault? Error? Failure? |
Review Comments by the Instructor
• Be careful about the faults originating from software (bugs). Those are systematic faults and are hard (and usually impossible) to model probabilistically. Instead, specific measures are prescribed for the development (by the standard) to prevent and remove systematic faults. It is good, however, to analyze these kinds of faults and point out the importance of systematic development, but make sure to balance your PHI so that faults due to misuse, environment (external faults), and hardware/mechanical wear are also taken into account with enough weight.
Solution 4 Electric scooter with geofencing – Hazard and risk evaluation sheet

| Function ID | Function name | Failure mode description | Hazard? | Probability | Severity | Risk category |
|---|---|---|---|---|---|---|
| FR_003 | Braking | Brake never works when trying to brake while driving at high speeds? | Y | Improbable | Catastrophic | III |
| FR_003 | Braking | Brake accidentaly activates when driving at high speed. | Y | Remote | Catastrophic | II |
| FR_003 | Braking | Brake activates late, seconds after manually pressing the brake. | Y | Occasional | Marginal | III |
| FR_005 | Geofencing | Geofencing zone does not detect the scooter leaving the zone, so that the driver can keep driving the scooter even if he is outside of the zone. | N | Improbable | Marginal | IV |
| FR_005 | Geofencing | Geofencing falsely flags the scooter as being outside of the zone, so that the scooter can not be driven. | Y | Improbable | Catastrophic | III |
| FR_004 | Weight limit | Weight limit falsely senses that the driver is over the required weight limit. | Y | Probable | Critical | I |
| FR_004 | Weight limit | Weight limit falsely senses that the drivers weight is not heavy enough. | N | Probable | Negligible | III |

Failure chain (entries in the order given in the sheet):
• Fault in the users driving skills leads to a driving error in terms of pressing the brake which causes an accident.
• Fault in the register overflow leads to false values in the variables that controls braking which causes an error in slowing down the vehicle in time.
• Fault in the geofencing server leads to an error when the scooter tries updating the zone where driving is permited, allowing the driver to drive in zones where it is not allowed.
• Fault in the geofencing mapping system leads to an error in calculating coordinates which leads to potential accidents possibly during high speeds.
• Fault in the weight sensor software leads to an error in the weight calculation that allows a child that is under the weight limit to operate the scooter.
• Fault in the control software leads to an error when flagging the measured weight which leads to a failure where the scooter can not be driven.
Review Comments by the Instructor
• Please see comments to other groups, especially for hazards related to systematic faults in software.
• In case a hazard is not identified, there is no need to assess the risk (no hazard – no risk).
• Improper training and faults due to misuse are great to always consider because many hazards are actuated by a human factor.
• I see you have corrected my other comments from the live session, so now this analysis looks pretty good (in the parts which are completed, of course).
Key Recap Questions
In this chapter, you have been exploring the main concepts of system safety and performing initial steps to identify system hazards. Now, with regard to the system you discussed:
• For each system function, discuss potential failures.
• Are there any hazards involved? If so, what are the risks?
• Try to quantify the risk!
• What caused the failure? Go back following the failure chain!
• What can we do about faults and errors?
Self-assessment
Now take the time to self-assess your knowledge by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book.
1. System safety deals with methods that need to provide absolute protection from harm or damage.
2. A safety belt in the vehicle is a safety measure prescribed by functional safety.
3. Reported incidents are an indicator of an imminent accident.
4. Hazards always have the same usual targets (people, environment, property) regardless of the safety standard applied for their assessment.
5. Hazard A is more serious and shall be prioritized over Hazard B if Hazard A, if actuated, causes the death of 100 people, and Hazard B, if actuated, causes the death of 50 people.
6. Let us say that for System A, the maintenance phase, which happens once a year, is considered hazardous. One of the ways to remove this hazard is to decrease the frequency of system maintenance.
7. A fault in the system is always considered a causal factor for a hazard.
8. Errors in the system may lead to dangerous failures of system functions.
9. System faults can be detected and removed during the system operation.
10. If we detect a dangerous failure of a system function, we may prevent the accident.
Self-assessment Key
1. False
2. False
3. True
4. False
5. False
6. True
7. False
8. True
9. False
10. True
Chapter 4
System Safety Process
Introduction In Chap. 3, we have introduced all the important terminology of system safety and understood how safety can generally be assessed and ensured. One big consideration in that field is the systematicity of all processes and procedures involved. Therefore, process models for overall system development now need to be extended with appropriate process models for safety-specific and safety-relevant activities. It is also necessary to position all safety activities in the context of the overall system life cycle to understand how those activities are interlinked and when and how they are appropriately performed. Including safety in all phases of a system development project is providing an inherent view of safety, and designing all safety prescriptions from the start in a proactive way. Being proactive about safety is an essential consideration in system safety engineering today. Opposite to proactive safety, reactive safety as a traditional concept is based on merely logging incidents and accidents once they occur, and performing safety “fixes” subsequently. It is already understood that reactive safety is in some programs too costly with regard to caused damages, and in some programs, it cannot at all be tolerated as a primary method of safety assurance (e.g., aerospace programs, transportation of people, etc.). Instead, proactive safety makes sure that all design decisions which are taken from the start consider existing or introduced hazards. Then, together with the requirement analysis phase, appropriate safety requirements are also defined. Based on the safety requirements, a system architecture is appropriately defined, including safety-related subsystems and safety functions. System implementation immediately includes all safety measures, and subsequent verification and validation phases seek to validate the safety and provide an argumentation in the form of the final safety case. 
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_4
Proactive system safety engineering includes some very specific activities. For example, hazard identification is performed very early, immediately from a so-called pre-project phase, in which we usually only estimate the scope of work, provide rough effort estimation, and sketch the top-level system architecture. The list of hazards is provided as one of the key inputs for the requirements elicitation process, in which functional hazard evaluation is performed to evaluate the risk for each hazard. Based on the quantification of risk, appropriate safety goals are formulated, which are then decomposed to specific safety requirements. Those requirements can be allocated to specific affected elements in the system technical concept during the system design. This way, safety is conceptualized before any specific implementation of the system is even commenced. In this chapter, the system safety process is discussed and practiced in more detail, with the goal to understand all additional safety-related phases, artifacts they produce, and traceability that needs to be maintained throughout all artifacts across various processes involved. Additionally, important sub-processes are laid out algorithmically, with specific actions prescribed to bring the risks down to the tolerable zone.
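The chain from hazards through safety goals and safety requirements to allocated elements can be pictured as a small data structure with a traceability check. The following Python sketch is only an illustration under assumed names: the classes, the `trace_gaps` function, and all IDs and texts are hypothetical and do not come from a particular tool or standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SafetyRequirement:
    req_id: str
    text: str
    allocated_to: str = ""  # element of the technical concept; empty if unallocated


@dataclass
class SafetyGoal:
    goal_id: str
    text: str
    requirements: List[SafetyRequirement] = field(default_factory=list)


@dataclass
class Hazard:
    hazard_id: str
    description: str
    goals: List[SafetyGoal] = field(default_factory=list)


def trace_gaps(hazards: List[Hazard]) -> List[str]:
    """IDs at which the chain hazard -> goal -> requirement -> element breaks."""
    gaps = []
    for hazard in hazards:
        if not hazard.goals:
            gaps.append(hazard.hazard_id)
        for goal in hazard.goals:
            if not goal.requirements:
                gaps.append(goal.goal_id)
            gaps.extend(r.req_id for r in goal.requirements if not r.allocated_to)
    return gaps


# Hypothetical entries, for illustration only.
sf = SafetyRequirement("SF_1", "Detect loss of braking", allocated_to="brake controller")
h1 = Hazard("H1", "No braking while driving",
            [SafetyGoal("SG_1", "Avoid loss of braking", [sf])])
h2 = Hazard("H2", "Folding while driving")  # no safety goal formulated yet

print(trace_gaps([h1, h2]))  # ['H2']
```

An empty result would indicate that every listed hazard is traced down to an allocated element; it says nothing about whether the goals and requirements themselves are adequate.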
Video Lesson This chapter has a corresponding video lesson: sfs4.nit-institute.com
Lecture Notes Proactive safety considers a safety process that includes all safety-related sub-processes which follow the system life cycle from its inception, until the decommissioning, and is aligned with system engineering processes and project management. Proactive safety is inherent to the system design (considered from day zero), as opposed to reactive safety (“fly-crash-fix-fly”) which was traditionally used to analyze incidents and accidents and based on them implement safety measures. ISAPro model considers the split of the system life cycle into the problem space (pre-project phase, conceptualization, ideation, usually to create the proposal to the customer), model space (requirements development and system design), solution space (implementation, construction, integration, and testing), and maintenance (during system operation). Each engineering phase or project management phase is intertwined with an appropriate safety process (Fig. 4.1). Preliminary hazard identification (PHI) is a safety process in which the team, based on the rough concept and the initial listing of high-level requirements, identifies a preliminary hazard list (PHL) and performs their early assessment (risk evaluation), considering all phases of the system life cycle. The output of PHI is the list of overall safety goals (high-level safety requirements, e.g., “The system shall avoid collision with other vehicles in the proximity”). The risk associated with each hazard in the PHL can be allocated to the respective element in the rough technical concept, to give an early indication of the criticality and the required reliability/redundancy of system components (this helps the early estimation of time and cost!), and even create the preliminary safety concept. The functional hazard evaluation (FHE) phase is where specific safety requirements are defined based on the initially defined safety goals, in close connection with
Fig. 4.1 ISAPro process model. (Source: ISaPro®: A Process Model for Safety Applications) The figure aligns the development project phases (project initialization, project start, project controlling, project close down) and the engineering phases (rough concept, requirements analysis, design, construction, integration and test, operation and technical maintenance, disposal) across the problem space, model space, solution space, and maintenance with the safety processes PHI, FHE, PSSE, SSE, and OpSSE, supported by configuration management, quality management (verification, validation), problem solving management, and change management.
the requirements development phase, extending the PHL by performing a complete hazard identification at this point. This phase is based on the requirement analysis process running in parallel (based on, e.g., functional requirements), with specific safety requirements produced as output. Safety requirements are derived from safety goals and high-level requirements and are related to functional requirements. FHE further helps the requirement analysis to correctly define all technical safety requirements (TSR), which specify safety features (and safety functions) for the upcoming system design phase.
In PHI and FHE, hazards are identified via workshops with a multitude of stakeholders and experts of various relevant backgrounds, and risk analysis and evaluation are performed for each hazard according to the appropriate safety standard, to correctly quantify or categorize the risk (e.g., “ASIL B,” “SIL 2”). Then, it is judged whether the risk is acceptable or not, based on the associated failure rates (number of dangerous failures or fatalities per year or per hour).
The Minimum Endogenous Mortality (MEM) principle takes as its reference a mortality of 2 · 10^-4 fatalities per person per year. MEM states that any new technical system must not contribute more than 1/20 of that figure, that is, it must be shown to cause at most 10^-5 fatalities per person per year (it is assumed that one person uses about 20 technical systems every day). GAMAB (globalement au moins aussi bon, globally at least as good) allows referencing our system to an equivalent system already in use, judging that our system is at least as good as the existing (and approved!) system. The ALARP (as low as reasonably practicable) method splits the risk into intolerable, tolerable, and acceptable levels and further allows us to argue that additional measures to decrease a tolerable risk would be grossly disproportionate to the benefit they bring (e.g., too costly or impractical).
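The MEM arithmetic above can be written out as a short check. This is only a sketch: the constant and function names are illustrative assumptions, and the two figures are the ones quoted in the text (2 · 10^-4 reference mortality, about 20 technical systems per person).

```python
import math

# Figures quoted in the text above
MEM_REFERENCE = 2e-4         # fatalities per person per year
ASSUMED_SYSTEM_COUNT = 20    # technical systems one person is assumed to use


def mem_limit(reference=MEM_REFERENCE, systems=ASSUMED_SYSTEM_COUNT):
    """Share of the reference mortality a single technical system may cause."""
    return reference / systems


def is_acceptable_by_mem(fatality_rate):
    """True if the rate (fatalities per person per year) is within the MEM limit."""
    limit = mem_limit()
    # isclose() treats a rate equal to the limit as acceptable despite rounding
    return fatality_rate < limit or math.isclose(fatality_rate, limit)


print(mem_limit())
print(is_acceptable_by_mem(1e-6), is_acceptable_by_mem(1e-3))
```

A rate taken from the risk analysis of a hazard would be passed to `is_acceptable_by_mem`; anything above the limit calls for risk reduction measures.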
We can then judge the risk to be tolerable and argue accordingly (which is usually challenging and needs appropriate referencing, as in GAMAB). If the risk is judged to be too high (intolerable), we need to implement risk reduction measures. Respecification or redesign may be needed so that the technical concept is changed to allow the reduction of risk (e.g., adding safety functions to bring the system to the safe state in case dangerous failures or states are detected, or adding other safety elements such as warnings and alarms). This shall be done in FHE (together with the design phase), since later on it can be very costly. Additional measures may be to (re)define procedures (e.g., for operation) and to enhance user/operator training. Finally, passive measures may be employed, such as protective gear, enclosures/barriers, area demarcation, and safety labels. After prescribing the safety measures, hazard identification needs to be repeated (maybe we introduced new hazards!) and risks reevaluated until all risk is judged to be acceptably low and all hazards are closed (Fig. 4.2).
The preliminary system safety evaluation (PSSE) process is used to check whether the system design phase (system architecture and technical design) produces outputs that comply with the safety requirements, to make sure that the design is safe before the implementation starts. The system safety evaluation (SSE) process extends the verification and validation phases of the V-model to verify that all safety-related elements are correctly implemented and that the system fulfills the overall safety goals. Safety is continuously evaluated during the operation of the system as well, in the operational system safety evaluation (OpSSE), by logging and tracking all safety-relevant system behavior (incidents, accidents), which can help in maintenance (repair) and further system (re)design.
Standards usually allow process tailoring, to match the standard provisions more closely with the specific project. Tailoring can be used to find a trade-off between the level of processes required for safety compliance and the cost needed to implement and maintain them (Fig. 4.3).
Fig. 4.2 Hazard identification and risk evaluation process steps: hazard identification, risk analysis, and risk evaluation; if the risk is too high, risk avoidance or risk reduction measures are applied and the steps are repeated; if the risk is acceptably low, the hazard is closed and the next hazard is taken.
Fig. 4.3 Process tailoring. (Source: INCOSE)
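The loop of identification, evaluation, risk reduction, and repeated identification described above can be sketched as a small work queue. Everything here is an illustrative assumption rather than part of the process model: risk is simplified to an integer class from 1 (worst) to 4 (best), classes 3 and 4 are treated as acceptably low, and `propose_measure` is a hypothetical callback standing in for the team's decision.

```python
from collections import deque

ACCEPTABLE_FROM_CLASS = 3  # assumption: classes 3 and 4 count as acceptably low


def process_hazards(hazards, propose_measure, max_steps=100):
    """Work through (name, risk_class) pairs until every hazard is closed."""
    queue = deque(hazards)
    closed = []
    steps = 0
    while queue:
        steps += 1
        if steps > max_steps:
            raise RuntimeError("hazards still open after max_steps")
        name, risk_class = queue.popleft()           # next hazard
        if risk_class >= ACCEPTABLE_FROM_CLASS:      # risk evaluation
            closed.append((name, risk_class))        # hazard closed
            continue
        # Risk too high: apply a measure, then re-evaluate the same hazard
        new_class, new_hazards = propose_measure(name, risk_class)
        queue.append((name, new_class))
        queue.extend(new_hazards)                    # measures may add hazards
    return closed


def guard_measure(name, risk_class):
    """Hypothetical measure: improves the class by two, adds one new hazard."""
    new_hazards = [(name + "-guard-jam", 3)] if risk_class == 1 else []
    return min(risk_class + 2, 4), new_hazards


print(process_hazards([("H1", 1), ("H2", 3)], guard_measure))
```

The `max_steps` guard reflects that a real team would escalate rather than iterate indefinitely if no measure brings the risk down.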
Exercise 4
Your team is in the middle of the preliminary hazard identification (PHI) phase, with the preliminary hazard list identified for the power press machine in the factory (Fig. 4.4). Your goal is first to analyze and evaluate the risk according to IEC 61508. Consider Catastrophic severity only in case of multiple fatalities; Critical in case of a single fatality or irreversible injury; Marginal in case of a nonfatal, reversible injury; and Negligible in case of minor or no injuries. Consider probability according to the failure range given in the table (see M3 lecture note). Then, argue the risk acceptance according to MEM (10^-5 deaths per person per year). In case the risk is too high, prescribe a safety measure and reevaluate the risk until you can close it.
H1: Operator places his hand in a pressing area due to misuse and gets injured.
H2: Press starts unintentionally during maintenance, with press mechanism active, potentially harming maintenance workers.
H3: Press moves or topples due to imbalance, potentially harming the operator.
Fig. 4.4 Power press machine
Your Tasks for the Exercise
• Analyze and evaluate the risk associated with hazards H1, H2, and H3 according to IEC 61508 and the description above.
• Express the risk quantitatively (in failures per year).
• Evaluate the risk acceptability according to MEM.
• In case of unacceptable risk, prescribe a suitable safety measure.
• Reevaluate the risk and reassess its acceptability to close the hazard.
To-Do List
• Perform the exercise individually or with your peers. One can share the screen and keep notes while all contribute.
• Create presentables (e.g., drawing, filled-in sheet).
• Discuss your solution and share it with others.
Note: Digital files for this exercise are available at sfs4.ex.nit-institute.com
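The risk class lookup used in the first task can be sketched as a small table. The matrix from the M3 lecture note is not reproduced in this chapter, so the values below are an assumption: they follow the example risk classification matrix commonly shown with IEC 61508 (classes 1 to 4, where 1 is intolerable and 4 is negligible) and should be replaced by the table from your own lecture notes if it differs.

```python
SEVERITIES = ("Catastrophic", "Critical", "Marginal", "Negligible")

# Assumed example matrix: one row per frequency, one column per severity
RISK_CLASS = {
    "Frequent":   (1, 1, 1, 2),
    "Probable":   (1, 1, 2, 3),
    "Occasional": (1, 2, 3, 3),
    "Remote":     (2, 3, 3, 4),
    "Improbable": (3, 3, 4, 4),
    "Incredible": (4, 4, 4, 4),
}


def risk_class(frequency, severity):
    """Look up the risk class for a frequency/severity pair."""
    return RISK_CLASS[frequency][SEVERITIES.index(severity)]


print(risk_class("Frequent", "Critical"))
print(risk_class("Remote", "Critical"))
```

The lookup gives only the qualitative class; the quantitative rate and the MEM argument still have to be made separately for each hazard.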
Exercise 4 Template
Power Press – Preliminary Hazard List (PHL)
Hazard ID | Hazard description | P | S | R | R (.) | MEM (risk acc?) | Safety measure description | P | S | R | R (.) | MEM (risk acc?)
H1 | Operator places his hand in a pressing area due to misuse and gets injured | | | | | | | | | | |
H2 | Press starts unintentionally during maintenance, with press mechanism active, potentially harming maintenance workers | | | | | | | | | | |
H3 | Press moves or topples due to imbalance and potentially harming the operator | | | | | | | | | | |
Exercise 4 Sample Solutions See several exemplary solutions to the exercise:
Solution 1
Power Press – Preliminary Hazard List (PHL)
Hazard ID | Hazard description | P | S | R | R (.) | MEM (risk acc?) | Safety measure description | P | S | R | R (.) | MEM (risk acc?)
H1 | Operator places his hand in a pressing area due to misuse and gets injured | Frequent | Critical | 1 | >10^-3 | no | Sensor shall detect human hand when the machine comes into contact with the skin. When the sensor detects human skin it will not allow operating. | Remote | Critical | 3 | 10^-5 to 10^-6 | yes
H2 | Press starts unintentionally during maintenance, with press mechanism active, potentially harming maintenance workers | Improbable | Critical | 3 | 10^-6 | yes | - | - | - | - | - | -
H3 | Press moves or topples due to imbalance and potentially harming the operator | Probable | Catastrophic | 1 | 10^-3 to 10^-4 | no | Press machine shall be nailed down to the floor so that it shall not be moved without the use of special equipment. | Incredible | Catastrophic | 4 | |
all technical safety requirement derived from SR004 should inherit
Send braking signal in 0.02s from detection
Review Comments by the Instructor
• Requirements definition is usually one field composed of all the required aspects. Currently, you have split it into Description and Intervention, which can be done while brainstorming, but the final SRS shall be made in a standard requirements specification format.
• One note: in case you have a very high-level safety requirement (e.g., Folding shall not happen during driving) which does not prescribe any measure, this is usually called a safety goal (SG) and placed at the same level as HLRs.
• Make sure to address freedom from interference aspects as well.
• ASIL levels are inherited by TSRs from the SRs, based on, e.g., PHI. It is strange to have different ASIL levels at this point. It seems you have started to think prematurely about the implementation and ASIL allocation to functional blocks, or perhaps even ASIL decomposition (which is possible in theory but not convincingly performed here).
Exercise 5 Sample Solutions
Solution 3
Electric scooter with geofencing – Requirements specification
ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R) | Note
FR_001 | Increasing speed | The scooter shall increase speed when the driver pulls the gas lever | Manic | 1 | HLR_001 | TR_001, TR_0011 | 1 | M |
FR_002 | Decrease velocity | The scooter shall decrease speed when the driver releases the gas lever | Beric | 1 | HLR_001 | TR_004 | 1 | M |
FR_003 | HMI information | The scooter shall present information about the vehicle speed, battery percentage, driving mode and mileage on display | Basta | 1 | HLR_001 | TR_005, TR_012, TR_013 | | |
FR_004 | Lights | The scooter shall turn on the lights when the visibility (light conditions) drops below a level | Basta | 1 | HLR_002 | TR_006 | | | LIGHT_LEVEL is referenced in the definition part of the requirements
FR_005 | Fold | The scooter should provide function to be folded for easier transport | Basta | 1 | HLR_002 | TR_007 | | |
SF_001 | Maximum velocity | The scooter shall not exceed maximum velocity of 25 km/h | Videnovic | 1 | HLR_002 | TR_003 | 2 | P |
SF_002 | Maximum acceleration | The scooter shall not exceed the maximum acceleration of | Basta | 1 | HLR_002 | TR_008 | 1 | M | MAX_ACC_VALUE is referenced in the definition part of the requirements
SF_003 | Battery level | The scooter shall not be able to drive if battery level below 10% | basta | 1 | HLR_002 | TR_009 | | Q | Rejected after review
SF_004 | Weight limit | The scooter shall not be able to drive if there is weight below 25kg | basta | 1 | HLR_002 | TR_010 | 1 | M |
QR_001 | Constant velocity | The scooter should remain constant velocity per certain drive mode (SPORT, ECO, CITY) | Karan | 1 | HLR_002 | TR_002 | 3 | I |
SR_001 | Lever engaged control | Control unit shall slow down scooter when hand is not present on the lever | Manic | 1 | HR_008 | TR_006, TR_007, TR_008, TR_009 | 1 | |
TR_006 | Lever pressure detection | The contact sensor shall detect pressure on the lever | Karan | 1 | SR_001 | | | |
TR_007 | Lever contact pressure | Sensor monitor shall detect that lever is released within 250ms | Basta | 1 | SR_001 | | | |
TR_008 | Sensor monitor signal detection | The sensor monitor shall generate output signals within 50ms. | Videnovic | 1 | SR_001 | | | |
TR_009 | Slowing down safe state | Speed control unit shall slow down system within 3s if sensor monitor doesn’t detect pressure | Beric | 1 | SR_001 | | | |
Note: ASIL A. According to ISO 26262, although high exposure is present, at the same time user has high controllability due to low speed
Review Comments by the Instructor
• It is not obvious where SR_001 was derived from. It is fine for the exercise, but usually it is related to the FR or derived from the HLR based on PHI. The definition of the safety function seems sane.
• Aspects such as freedom from interference and SIL allocation to requirements also need to be provided.
Solution 4
Electric scooter with geofencing – Requirements specification
ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R) | Note
FR_001 | Acceleration | The electric scooter shall enable acceleration and deceleration. | Bojkić | 1.0 | HLR_001 | TR_001, TR_002 | | |
FR_002 | Steering | The electric scooter shall provide steering control for the driver. | Bojkić | 1.0 | HLR_001 | TR_003 | 1 | M |
FR_003 | Braking | The electric scooter shall provide braking control for the driver. | Bojkić | 1.0 | HLR_001 | TR_004 | 1 | M |
FR_004 | Weight limit | The electric scooter shall prevent starting if the weight exceeds the limit. | Barić | 1.0 | HLR_002 | TR_005, TR_006, TR_007 | 2 | D |
FR_005 | Geofencing | The electric scooter should prevent driving outside of the defined geofencing zone. | Mihić | 1.0 | HLR_002 | TR_008, TR_009 | 2 | D |
FR_006 | Battery alert | The electric scooter should give the driver an alert if the battery level is insufficient for the travel distance. | Mihić | 1.0 | HLR_002 | TR_010, TR_011, TR_012 | | |
FR_007 | Mobile control | The electric scooter should provide controls for turning on and off via the mobile application. | Simić | 1.0 | HLR_003 | TR_013 | | |
FR_008 | Sound signaling | The electric scooter may be able to produce sound alerts. | Simić | 1.0 | HLR_004 | TR_014, TR_015 | | |
QR_001 | Light signaling | The electric scooter shall have lights bright enough so that the driver can navigate safely on the road with insufficient visibility. | Kaštelan | 1.0 | HLR_004 | TR_016, TR_017 | | |
SF_001 | Velocity control | The electric scooter shall prevent powering off unless it is not stationary. | Mihić | 1.0 | HLR_037 | TR_125 | 1 | M |
Notes: Safety requirement? To the TR level?
ID | Name | Description | Ver. | Derived from | Used by
SF_101 | Braking safety | The electric scooter shall not allow unintentional braking while driving. | 1.0 | HLR_001 | TR_101, TR_102, TR_103, TR_104
TR_101 | Mechanical system blockade | The mechanical braking system shall deactivate after the driver releases the brake lever. | 1.0 | SF_101 |
TR_102 | Dependence on pressure | The braking level shall be linearly dependent on the pressure on the brake lever. | 1.0 | SF_101 |
TR_103 | Response time | The response time of the braking system shall be between and sec. | 1.0 | SF_101 |
TR_104 | SIL | The electric scooter braking system shall have SIL 2. | 1.0 | SF_101 |
EUC: braking system
ECS: electric scooter control system
Safe state: the electric scooter brakes at 50% intensity when the problem with the braking system is detected.
Safe operation: the electric scooter shall limit the maximum speed to 10 kmph after it enters the safe state.
Review Comments by the Instructor
• It seems you have added TRs with the intention to enhance the quality of the overall functions (as additional quality requirements) rather than prescribing the safety function, which was the goal of the exercise.
• The safety function addressing SF_101 should probably be able to detect (by MONITORING) whether the braking was unintentional (e.g., by a pressure sensor on the handle or a similar method) and then perform an INTERVENTION (e.g., by signaling the malfunction and entering the safe state; the question is what the safe state could be if we need to prevent the braking, which probably should not be done).
• It is good to detail EUC/ECS, etc. for the exercise, but, e.g., the safe state requires the use of the braking system as a remedy, whereas the braking system itself is the cause of the problem in the first place. The safety function, therefore, needs to provide freedom from interference by finding an alternative way to brake (e.g., a redundant or additional braking system, such as another mechanical brake).
Key Recap Questions
In this chapter, you have been analyzing the prescription of active safety measures (safety functions). Now, with regard to the system you discussed:
• Think up a safety function.
• Which safety requirement/safety goal does it address?
• How is the safe state defined?
• What is the safety integrity level of the function?
• Discuss freedom from interference!
• Can you make your system fail-operational?
Self-assessment
Now take the time to self-assess your knowledge by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book.
1. Safety functions always address hazards exhibited by the functions of EUC, but not the functions of ECS.
2. Safety functions execute on top of the safety-related system (SRS) which can be made an integral part of the ECS only if the freedom from interference is provided.
3. A safe state is a state in which a system induces no harm or damage, and in which all system functions are always turned off.
4. The safety function shall monitor the system only for dangerous failures of its functions.
5. In case SRS fails, the system is immediately considered unsafe.
6. The electronic control unit within SRS shall always comply with the safety integrity level determined from the risk category of the hazard which is actuated by a dangerous failure of an EUC function.
7. A safety function operating in an on-demand mode can be made of less reliable components than the equivalent safety function operating in a continuous mode.
8. Safety integrity level is among the most important requirements for the safety function.
9. Methods of implementation of the safety function (e.g., system engineering processes, software development techniques) shall always be at the highest possible quality according to the standard regardless of the SIL allocated to that safety function.
10. Hazard A is identified, with risk evaluated according to ISO 26262 as ASIL C. The function of the system exhibiting this hazard then always needs to be reworked so that it satisfies the requirements for ASIL C according to the standard.
Self-assessment Key
1. False
2. True
3. False
4. False
5. True
6. True
7. True
8. True
9. False
10. False
Chapter 6
Defining Safety Functions
Introduction
In Chaps. 1 through 5, we introduced the top-level aspects of system safety and functional safety. To bring these considerations into practice, it is now important to analyze specific functional safety standards and an example system, to see which provisions the standards give with respect to functional safety. In this chapter, we work out only the part of the process that results in the definition of safety functions and their safety integrity requirements. We could say that this process yields correct inputs to the system requirements specification, in terms of additional safety requirements (including the top-level safety requirement/safety goal, but also technical safety requirements and quality safety requirements/safety integrity requirements).
All the general provisions in functional safety stem from the originating standard, IEC 61508. This standard deals with electrical, electronic, or programmable electronic (E/E/PE) systems and specifically with safety functions including sensors, controllers, and actuators. Functional safety standards for different specific areas are derivatives (profiles) of IEC 61508, with the goal of defining aspects more closely according to the specificities of systems in the regarded area (e.g., power plants, heavy machinery, automotive). For example, heavy machines (such as loading trucks, cranes, forklifts, and harvesters) need to be compliant with either IEC 62061 or ISO 13849, and also with ISO 26262 in case they are driven on regular roads. IEC 62061 focuses on electrical, electronic, and programmable electronic control systems specifically for machinery, whereas ISO 13849 provides a broader look at the safety life cycle, including safety requirements and guidance on the principles for the design and integration of safety-related parts of complex mechanical/electrical systems of machinery.
Those standards are currently applied simultaneously to the same system and are therefore good learning examples. They have different provisions, e.g., for risk assessment, which are similar in nature but revealing in how different standards cater to essentially the same practice.
The example in this chapter starts by laying out an idea for a heavy machinery system. The analyzed system is a wrecking ball crane. Although a bit outdated, this system is a great reminder of our childhood/cartoon memories and is easy to understand for people with no specific technical background in construction and demolition projects.
The initial step is inevitably understanding our system. We need to be able to define the system, its scope, main components, and the system boundary. By doing this, we set the stage for the hazard and risk analysis (HARA) phase, in which we need to list all thinkable hazards: dangerous events, hazardous situations, dangerous faults, malevolent or unauthorized activities, and foreseeable misuse. Then, we analyze each hazard with respect to its probability and severity. Here we apply specific provisions from IEC 62061, in which probability is a composite of three distinctly evaluated categories (frequency and duration of use, probability of hazard event, and avoidance), which are summed up and then used with the severity category to look up the table and obtain the resulting safety integrity level (SIL): SIL 1, SIL 2, or SIL 3, SIL 3 being the most strict. In ISO 13849, however, three categories are used to assess risk (severity of injury, frequency of exposure to hazard, and probability of avoiding harm), and based on their combination, the resulting performance level (PL) is deduced: PL a, PL b, PL c, PL d, or PL e, PL e being the most strict. SIL and PL serve the same purpose: they effectively define the safety integrity requirements that safety-related systems (and safety functions within them) must fulfill if they are designed and used to mitigate the respective hazard, according to the selected standard.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_6
Safety functions are designed to monitor the events leading to hazards (e.g., via specific sensors) and then intervene in case those events are observed (e.g., by bringing the system to the safe state). Safety functions are then prescribed by their safety requirements, defining these monitoring and intervention phases, which shall obviously fully eliminate the risk of the hazard or reduce it to the tolerable zone. Safety functions inherit the SIL or PL resulting from the hazard, and this SIL or PL effectively becomes their safety integrity requirement. This further means that safety functions must be designed so that all processes, procedures, and architectural constraints must be met as prescribed by the standard for the given SIL or PL. Further, safety integrity against random failures must be provided, showing that the residual failures are low enough with regard, again, to the SIL or PL prescribed and their respective quantitative failure metrics. Let’s now do this interesting analysis for our wrecking ball crane!
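The monitoring and intervention phases of a safety function can be illustrated with a minimal sketch. All names and the threshold value are hypothetical, and the latched safe state is a design assumption made for the example, not a provision of either standard.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    SAFE_STATE = "safe state"


@dataclass
class SafetyFunction:
    """Monitors one sensor reading and latches the safe state on a violation."""
    threshold: float              # hypothetical limit for the monitored quantity
    state: State = State.NORMAL

    def step(self, reading: float) -> State:
        # Monitoring: compare the sensor reading with the limit
        if abs(reading) > self.threshold:
            # Intervention: bring the system to the safe state and keep it there
            self.state = State.SAFE_STATE
        return self.state


monitor = SafetyFunction(threshold=5.0)
for reading in (1.0, 6.0, 0.5):
    print(reading, monitor.step(reading).value)
```

Latching means the function stays in the safe state even after the reading returns to normal, so that a deliberate reset would be needed to resume operation.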
Video Lesson This chapter has a corresponding video lesson: sfs6.nit-institute.com
Lecture Notes
The procedure can be described graphically with the analysis sheet shown in the figure below. We then magnify this procedure step by step until the safety functions are correctly defined (Fig. 6.1).
The wrecking ball crane has, among others, two hazards identified in this example.
Hazard 1: Machine tipping due to imbalance and harming operator/workers
Affecting: Whole machine
Operation mode: In operation
Countermeasure: Balance monitoring
Hazard 2: Wrecking ball hits the operator
Affecting: Whole machine
Operation mode: In operation
Countermeasure: Ball position monitoring
According to IEC 62061, the probability of occurrence of harm resulting from an identified hazard is expressed via three categories using the table below.
Probability of occurrence of harm
Frequency and duration | F
≤1 h | 5
>1 h to ≤1 day | 5
>1 day to ≤2 weeks | 4
>2 weeks to ≤1 year | 3
>1 year | 2
Probability of hazard event | P
Very high | 5
Likely | 4
Possible | 3
Rarely | 2
Negligible | 1
Avoidance | A
Impossible | 5
Possible | 3
Likely | 1
Class CI is calculated as the sum of all parameters, CI = F + P + A.
Fig. 6.1 The overall procedure for the wrecking ball crane example
For Hazard 1, we can judge the frequency of use of the wrecking ball crane to be less than each day (say, it is used once per week). This results in F = 4. The probability of wrecking ball crane tipping due to imbalance is judged to be caused by operator misuse and only when brought to the extreme position, rendering this situation to happen rarely. This results in P = 2.
Finally, if this happens, this harm is impossible to avoid, that is, the operator cannot somehow escape and evade this accident. This results in A = 5. Final CI is therefore CI = 4 + 2 + 5 = 11. Note how the selection of those categories can be subjective, so the proper argumentation for the reviewer is extremely important. The final selection of the safety integrity level (SIL) as per IEC 62061 is conducted by using CI and the evaluated severity of harm, according to the table below.
Severity of hazard consequences | S | Class CI (F + P + A): 4 | 5–7 | 8–10 | 11–13 | 14–15
Death, losing eye or arm | 4 | SIL 2 | SIL 2 | SIL 2 | SIL 3 | SIL 3
Permanent, losing fingers | 3 | | | SIL 1 | SIL 2 | SIL 3
Reversible, medical attention | 2 | | | | SIL 1 | SIL 2
Reversible, first aid | 1 | | | | | SIL 1
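The class calculation and the table lookup can be sketched in a few lines. The function and variable names are illustrative assumptions, the values are transcribed from the table above, and empty cells are returned as `None` (no SIL assigned by this table).

```python
# Class bands of the table above, as inclusive (low, high) ranges
CLASS_BANDS = ((4, 4), (5, 7), (8, 10), (11, 13), (14, 15))

# Required SIL per severity S, one entry per class band
SIL_TABLE = {
    4: (2, 2, 2, 3, 3),
    3: (None, None, 1, 2, 3),
    2: (None, None, None, 1, 2),
    1: (None, None, None, None, 1),
}


def hazard_class(f, p, a):
    """CI = F + P + A."""
    return f + p + a


def required_sil(severity, ci):
    """Return the required SIL for a severity and class, or None if empty."""
    for index, (low, high) in enumerate(CLASS_BANDS):
        if low <= ci <= high:
            return SIL_TABLE[severity][index]
    raise ValueError(f"class {ci} is outside the table")


ci = hazard_class(f=4, p=2, a=5)   # values judged for Hazard 1
for severity in (4, 3, 2, 1):
    print(severity, ci, required_sil(severity, ci))
```

For the class CI = 11 obtained for Hazard 1, the loop prints the SIL for each possible severity, which makes it easy to see how strongly the severity judgment drives the result.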
For Hazard 1, the consequences can be a permanent injury of the crane operator in the cabin, but not necessarily death (the cabin is not expected to be damaged to the extent that fatal harm could occur, especially taking into account the passive safety measures, and the operation happens at a construction site where co-workers can seek medical attention). This results in S = 3. Finally, for S = 3 and CI = 11, the table yields SIL 2. It is often denoted as the required SIL, SILr 2.
For Hazard 2, we can attempt the analysis according to ISO 13849. Three categories are prescribed here. The severity of injury resulting from a hazard, denoted as S, can be classified as S1 (slight, usually reversible injury) or S2 (severe, usually irreversible injury, including death). Frequency and/or duration of stay (exposure to hazard), denoted as F, can be classified as F1 (rare to often and/or short exposure to hazard) or F2 (frequent to continuous and/or long exposure to hazard). Finally, the probability of avoiding or limiting harm, denoted as P, can be classified as P1 (possible under certain conditions) or P2 (hardly possible). The final performance level (PL) is obtained by following the tree as in the figure below (Fig. 6.2).
Hazard 2 is judged to bear severity S = S2 (the injury sustained can be irreversible), and the exposure to the hazard is expected to be short (not continuous), with F = F1. Finally, the probability of avoiding the harm is very low (hardly possible, only under extreme circumstances), with P = P2. Now, S2 - F1 - P2 results in PLr d.
For each of the hazards, safety functions can now be defined to effectively remove the hazards altogether. The safety function for Hazard 1 can be defined by the following requirements for the safety-related system (SRS):
Safety requirement 1.1 (SR 1.1): SRS shall monitor the balance of the crane in order to predict a near-tipping event.
Fig. 6.2 Performance level decision tree as per ISO 13849. Starting from S, then F, then P, the branches lead to: S1-F1-P1: PL a; S1-F1-P2 and S1-F2-P1: PL b; S1-F2-P2 and S2-F1-P1: PL c; S2-F1-P2 and S2-F2-P1: PL d; S2-F2-P2: PL e.
Technical safety requirement 1.1.1 (TR 1.1.1, derived from SR 1.1): SRS shall use gyroscopes positioned at the crane top, cabin, and the back of the vehicle to detect imbalance.
Safety requirement 1.2 (SR 1.2): SRS shall, in case of imbalance over the threshold (near-tipping), cut off the controls and command the vehicle to sound an alarm and bring up the crane.
Quality safety requirement 1.3 (QSR 1.3): SRS shall achieve SILr 2.
The safety function for Hazard 2 can be defined by the following requirements for the safety-related system (SRS):
Safety requirement 2.1 (SR 2.1): SRS shall monitor ball position to prevent the wrecking ball from reaching the vehicle.
Technical safety requirement 2.1.1 (TR 2.1.1, derived from SR 2.1): SRS shall use two position sensors (at crane arm and ball pulley) to monitor crane height and ball descent and to detect possible collision if swung.
Safety requirement 2.2 (SR 2.2): SRS shall disallow dangerous positioning by limiting the controls extent continuously.
Quality safety requirement 2.3 (QSR 2.3): SRS shall achieve PLr d.
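The performance level decision tree used for Hazard 2 can also be written as a lookup. This is a sketch with illustrative names; the mapping follows the usual form of the ISO 13849 risk graph and agrees with the S2 - F1 - P2 result derived in the lecture notes.

```python
# Required performance level for each path through the decision tree
PL_TREE = {
    ("S1", "F1", "P1"): "a",
    ("S1", "F1", "P2"): "b",
    ("S1", "F2", "P1"): "b",
    ("S1", "F2", "P2"): "c",
    ("S2", "F1", "P1"): "c",
    ("S2", "F1", "P2"): "d",
    ("S2", "F2", "P1"): "d",
    ("S2", "F2", "P2"): "e",
}


def required_pl(severity, frequency, avoidance):
    """Return the required performance level (PLr) for one path of the tree."""
    return PL_TREE[(severity, frequency, avoidance)]


# Hazard 2: irreversible injury, short exposure, avoidance hardly possible
print("PLr", required_pl("S2", "F1", "P2"))
```

Keeping the tree as data rather than nested conditions makes the judgment for each category easy to review against the standard's figure.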
Your First Safety Project! By using the knowledge and existing exercise materials obtained from all chapters so far, now you need to work and submit your assignment. If possible, you can carry out this assignment in a group of four or five. Each group would need to define the workflow for project execution first. Discuss this workflow with your instructor, if possible. Key points of the workflow shall define the amount of common work (e.g., during workshops and brainstorming sessions, integration of the work, preparation of the presentation of the work) and individual work (e.g., preparing parts of the required items for submission).
Together with your team, you need to choose a safety-critical system. It can be the electric scooter we used during exercises, but also any other system of interest. As assistance, you can consider, for example, any of the following:
• Specialty vehicles, e.g., forklift truck
• Scooters and hoverboards
• Special machines, e.g., excavator
• Factory equipment, e.g., wood cutting
Upon selection of the system, perform a quick system delineation and decomposition of the system into components, considering users/operators and the environment (see Exercise 1).

After that, perform requirements elicitation. First, for each of the systems, define a few high-level requirements (HLRs) focusing only on the top-level goals of the system. Then, decompose each HLR into several functional requirements (FRs). If you come across some quality/constraint requirements at this point, feel free to include them as well. You do not need to document the elicitation process (e.g., using Kano) or perform it systematically. You may use Exercise 2 as a reference.

Based on the functional requirements, perform the preliminary hazard identification (PHI), discussing and enumerating all the hazards (creating the preliminary hazard list, PHL). Evaluate risk and define the appropriate safety measures. Perform risk assessment according to whichever of the standards discussed during lectures you feel best suits your system (IEC 61508, IEC 62061, or ISO 13849). Devise safety measures for each hazard. Make sure to prescribe at least two active safety measures, using the concepts of functional safety. You do not need to quantify the risk and provide formal acceptance argumentation, but make sure that, after reassessment, the SIL decreases to the lowest level existing in the standard. You may use Exercises 3 and 4 as a reference.

Update the requirement specification by defining top-level safety requirements according to the identified hazards. For at least two safety requirements of your choice, prescribe active safety measures and derive their technical safety requirements (TSRs) further. You may use Exercise 5 as a reference.

Finally, illustrate your safety concept, focusing on at least one safety function, in a draft drawing. The format of this illustration is free; however, make sure to consider either the system state diagram (depicting relevant events and the safe state) or the architectural block diagram (depicting the data flow which describes fault-error-failure propagation as well as "trapping" the error/failure with a safety function).

Create a presentation in which you describe the process you followed, as well as the most important (interesting) details from the documentation you produced. Please make sure that you include both safety functions you defined in the presentation. It would be beneficial to present your findings to a live audience, since this would simulate your argumentation to a reviewer. Use 15 minutes for the presentation in total, and please adhere strictly to the time. Make sure you split the presentation so that each group member can contribute and present a part.
Required Output
• System delineation/decomposition sheet (may be free-form)
• System requirements specification (SRS), in the Exercise 2 template
• Preliminary hazard list (PHL) including the reassessment after safety measures (adapt the template starting from either Exercise 3 or Exercise 4)
• Safety concept illustration (drawing, free-form)
• Final presentation (in ppt)
Submission Deadline Take 1 or 2 weeks to work the project out.
Assessment
You or your peers (or your instructor) can assess your work. The project total score is 20 points. Up to 10 points shall be allocated for your submission alone. Each released document artifact brings you 2 points (if complete and mostly correct), 1 point (if incomplete and/or partially incorrect), or 0 points (if not provided, mostly incomplete, or mostly incorrect). The other 10 points are assigned based on your presentation: up to 5 points for a convincing presentation (proof) with solid, believable argumentation, and up to 5 points for the presentation content, specifically the communication clarity and understandability of the presented material. You need to score at least 14 points to pass this review simulation!
Sample Solution to the Project The team selected a forklift truck as their considered system (available also in digital form at sfs6.ex.nit-institute.com). They gave the following presentation (Fig. 6.3): The group analyzed and delineated the system as in Fig. 6.4. System requirements specification for the forklift truck was provided, as well as preliminary hazard list (PHL).
[Slides of the presentation "Defining Safety Functions: Forklift" by Project 1, Group 4 (Maja Baric, Ivan Kastelan, Nebojsa Cvijic, Filip Mihic, Miroslav Videnovic). The slides cover: the procedure followed (research about the system; system delineation, i.e., defining the system components, system boundary and environment; high-level requirements; functional requirements; preliminary hazard identification and preliminary hazard list; safety requirements; technical safety requirements; safety concept); the high-level requirements HLR_001 to HLR_003; the preliminary hazard list entries H01 and H08; the safety requirements SR_001 and SR_006 with the technical safety requirements derived from them; and the forklift system state diagram regarding SR_001. The same requirements and hazards appear in full in the tables that follow.]
Fig. 6.3 Project presentation sent for a review (outline)
Fig. 6.4 System delineation sheet (forklift truck)
ID | Name | Description | Author | Ver. | Derived from | Used by

High-level requirements
HLR_001 | Lifting | The forklift shall allow lifting and lowering of the load | Group 4 | 1.0 | – | FR_001, FR_002, FR_003, CR_001
HLR_002 | Driving | The forklift shall provide the ability to be driven | Group 4 | 1.0 | – | FR_004, FR_005, FR_006, FR_007, FR_008, FR_009, FR_010, FR_011, CR_002
HLR_003 | Tilting | The forklift shall allow tilting of the forks | Group 4 | 1.0 | – | FR_012, FR_013, FR_014, CR_003

Functional requirements
FR_001 | Lifting the forks | The forklift shall move the forks up when the lever is pushed up | Group 4 | 1.0 | HLR_001 |
FR_002 | Lowering the forks | The forklift shall move the forks down when the lever is pulled down | Group 4 | 1.0 | HLR_001 |
FR_003 | Fixed fork height | The forks shall not move up or down when the lever is in a neutral position | Group 4 | 1.0 | HLR_001 |
FR_004 | Acceleration | The forklift shall accelerate when the gas pedal is pressed | Group 4 | 1.0 | HLR_002 |
FR_005 | Deceleration | The forklift shall decelerate when the brake pedal is pressed | Group 4 | 1.0 | HLR_002 |
FR_006 | Parking | The forklift shall not allow movement when the parking lever is pulled | Group 4 | 1.0 | HLR_002 |
FR_007 | Steering left | The forklift shall turn left when the steering wheel is turned left | Group 4 | 1.0 | HLR_002 |
FR_008 | Steering right | The forklift shall turn right when the steering wheel is turned right | Group 4 | 1.0 | HLR_002 |
FR_009 | Forward movement | The forklift shall move forward when the forward-reverse lever is in the forward position | Group 4 | 1.0 | HLR_002 |
FR_010 | Reverse movement | The forklift shall move in reverse when the forward-reverse lever is in the reverse position | Group 4 | 1.0 | HLR_002 |
FR_011 | Power supply | The engine shall burn propane gas to power the forklift | Group 4 | 1.0 | HLR_002 |
FR_012 | Tilt forward | The forks shall tilt forward when the tilt control lever is pushed forward | Group 4 | 1.0 | HLR_003 |
FR_013 | Tilt backward | The forks shall tilt backward when the tilt control lever is pulled backward | Group 4 | 1.0 | HLR_003 |
FR_014 | Fixed tilt angle | The tilt angle of the forks shall remain unchanged when the tilt control lever is in the neutral position | Group 4 | 1.0 | HLR_003 |

Constraint requirements
CR_001 | Load limit | The forklift shall lift the load with mass up to 2 metric tons | Group 4 | 1.0 | HLR_001 |
CR_002 | Maximum speed | The forklift shall not exceed the maximum speed of 10 km/h | Group 4 | 1.0 | HLR_002 |
CR_003 | Maximum tilt angle | The tilt angle of the forks shall not exceed ±10 degrees | Group 4 | 1.0 | HLR_003 |

Safety requirements
SR_001 | Blocked view | The collision warning system shall activate when driving forward while the driver's view is blocked by the load | Group 4 | 1.0 | FR_009 | TSR_001, TSR_002, TSR_003, TSR_004, TSR_006, QSR_001
SR_002 | Blind spot preview | The driver assistance system shall display areas around the vehicle which are obstructed from the view of the driver while driving in reverse | Group 4 | 1.0 | FR_010 | TSR_003, TSR_004, TSR_005, TSR_006, QSR_001
SR_003 | Forklift imbalance prevention | The forklift shall prevent vehicle tipping by monitoring and correcting the forklift balance | Group 4 | 1.0 | FR_001, FR_004, FR_005, FR_007, FR_008, FR_009, FR_010, FR_012, FR_013 | TSR_007, TSR_008, TSR_009, TSR_010, TSR_011, TSR_012, QSR_002, QSR_003, QSR_004
SR_004 | Load balancing system | The forklift shall have a load balancing system that shall detect the load position relative to the forks and correct the position if over the threshold | Group 4 | 1.0 | FR_004, FR_005, FR_007, FR_008, FR_012, FR_013 | TSR_013, TSR_014, TSR_015, QSR_005
SR_005 | Lift and tilt hydraulics safety | The forklift shall prevent potential loss of hydraulic pressure by having a backup hydraulic hose | Group 4 | 1.0 | FR_001, FR_002, FR_012, FR_013 | TSR_016, TSR_017, TSR_018, QSR_006
SR_006 | Forklift fire safety | The forklift shall provide a fire extinguishing system that releases noncorrosive fire-extinguishing fluid in case of a potential fire near the engine | Group 4 | 1.0 | FR_011 | TSR_019, TSR_020, TSR_021, QSR_007
SR_007 | Wearing protective gear | All operators within the system boundary shall wear protective gear | Group 4 | 1.0 | HLR_001, HLR_002, HLR_003 | –
SR_008 | Passing forklift safety training | All operators within the system boundary shall pass forklift safety training | Group 4 | 1.0 | HLR_001, HLR_002, HLR_003 | –
SR_009 | Seat belt | The forklift shall have a brightly colored seat belt that shall prevent the driver from falling out of the seat while driving | Group 4 | 1.0 | HLR_002 | –

Technical safety requirements
TSR_001 | Forward driving visual alarm | The visual alarm in the cabin shall activate when driving forward with a speed above 3 km/h while the driver's view is blocked by the load | Group 4 | 1.0 | SR_001 | –
TSR_002 | Forward driving sound alarm | The sound alarm system shall activate when driving forward with a speed above 5 km/h while the driver's view is blocked by the load | Group 4 | 1.0 | SR_001 | –
TSR_003 | Near object detection | The radar shall measure a distance from an object or pedestrian while driving | Group 4 | 1.0 | SR_001, SR_002 | –
TSR_004 | Environment preview | A monitor of the driver assistance system shall display the environment around the vehicle | Group 4 | 1.0 | SR_001, SR_002 | –
TSR_005 | Reverse driving visual alarm | A monitor for driver assistance shall display visual warning signals during reverse movement if the distance from the object is less than 3 m | Group 4 | 1.0 | SR_002 | –
TSR_006 | Collision avoidance | The forklift shall apply brakes and stop automatically if the driver assistance system detects a possible collision | Group 4 | 1.0 | SR_001, SR_002 | –
QSR_001 | Radar detection range | The radar shall monitor 360 degrees around the forklift | Group 4 | 1.0 | SR_001, SR_002 | –
TSR_007 | Forklift load position monitoring | The forklift shall be equipped with two gyroscopes, one positioned just between rear wheels and one just between forks | Group 4 | 1.0 | SR_003 | –
TSR_008 | Forklift load mass monitoring | The forklift forks shall be equipped with force sensors to monitor force due to load | Group 4 | 1.0 | SR_003 | –
TSR_009 | Forklift balance monitoring | The forklift shall be able to continuously monitor balance by using data from gyroscopes and force sensors | Group 4 | 1.0 | SR_003 | –
TSR_010 | Forklift imbalance detection | By monitoring the balance, the forklift shall be able to detect near-tipping events | Group 4 | 1.0 | SR_003 | –
TSR_011 | Forklift safe state transition due to imbalance detection | The forklift shall be put into the safe state if a near-tipping event is detected | Group 4 | 1.0 | SR_003 | –
TSR_012 | Forklift safe state due to imbalance protection | The forklift safe state shall cut off the lift and tilt control from the user, take over lowering and backward tilting to the initial position, and turn on the emergency alarm | Group 4 | 1.0 | SR_003 | –
QSR_002 | Forklift imbalance prevention performance level | The forklift balance monitoring and imbalance prevention shall be developed according to PL e | Group 4 | 1.0 | SR_003 | –
QSR_003 | Forklift lifting performance level | The forklift lifting and lowering function shall be developed according to PL e | Group 4 | 1.0 | SR_003 | –
QSR_004 | Forklift tilting performance level | The forklift tilting function shall be developed according to PL e | Group 4 | 1.0 | SR_003 | –
TSR_013 | Load imbalance monitoring | The load balancing system shall monitor the load tilt relative to the forks with sensors placed at each fork | Group 4 | 1.0 | SR_004 | –
TSR_014 | Load imbalance detection | The load balancing system shall detect an imbalance of the load if the tilt of the load is above the threshold angle of 5 degrees | Group 4 | 1.0 | SR_004 | –
TSR_015 | Load imbalance correction | The load balancing system shall correct load imbalance by activating the fork extensions that shall push the load toward the equilibrium position | Group 4 | 1.0 | SR_004 | –
QSR_005 | Load balancing system level | The load balancing system shall be developed according to PL d | Group 4 | 1.0 | SR_004 | –
TSR_016 | Forklift hydraulic hose monitoring | The forklift shall be equipped with a sensor that detects when a hydraulic hose bursts | Group 4 | 1.0 | SR_005 | –
TSR_017 | Forklift hydraulic hose burst alert | The forklift shall provide a visual alert in the form of a glowing red light informing the driver that the primary hose has burst, that the backup hose is in use, and that maintenance is due | Group 4 | 1.0 | SR_005 | –
TSR_018 | Forklift backup hydraulic hose activation | The forklift shall allow the use of a backup hydraulic hose in case the system has detected that the primary hose has burst | Group 4 | 1.0 | SR_005 | –
QSR_006 | Forklift hydraulics performance level | The forklift hydraulics system shall be developed according to PL d | Group 4 | 1.0 | SR_005 | –
TSR_019 | Forklift fire sensor | The forklift shall be equipped with a sensor that detects a fire | Group 4 | 1.0 | SR_006 | –
TSR_020 | Forklift fire suppression display | The forklift shall be equipped with an interactive display that allows the driver to activate the fire suppression manually | Group 4 | 1.0 | SR_006 | –
TSR_021 | Automatic fire suppression activation | The forklift shall automatically activate the release of fire-extinguishing fluid once a potential fire is detected by the sensor | Group 4 | 1.0 | SR_006 | –
QSR_007 | Forklift fire suppression performance level | The forklift fire suppression system shall be developed according to PL d | Group 4 | 1.0 | SR_006 | –
Forklift – preliminary hazard list (PHL)
S = severity, F = frequency, P = probability, PL = performance level, Acc? = acceptable. The columns marked "after" give the reassessment with the safety measures in place.

Hazard ID | Hazard description | S | F | P | PL | Acc? | Safety measure description | S after | F after | P after | PL after | Acc? after
H01 | The driver's view is blocked by the load while driving forward, potentially hitting and irreversibly harming pedestrians | S2 | F1 | P2 | d | NO | Active safety measures: install collision warning system; install lighting alert when driving forward with restricted visibility; install fixed sensors with a 360-degree detection zone to detect a pedestrian | S2 | F1 | P1 | c | YES
H02 | While driving in reverse, due to increased driving difficulty, the forklift potentially hits and irreversibly harms pedestrians | S2 | F1 | P2 | d | NO | Active safety measures: install forklift driver assistance systems; install fixed sensors with a 360-degree detection zone to detect a pedestrian | S2 | F1 | P1 | c | YES
H03 | The forklift tips forward while carrying the load due to imbalance, potentially injuring the driver or pedestrian/crew | S2 | F2 | P2 | e | NO | Active safety measures: balance monitoring and imbalance prevention. Passive safety measures: wearing a seat belt; wearing a protective helmet; passing forklift safety training | S2 | F1 | P1 | c | YES
H04 | The forklift tips to the side while making a sharp turn due to imbalance, potentially injuring the driver or pedestrian/crew | S2 | F2 | P2 | e | NO | Active safety measures: balance monitoring and imbalance prevention. Passive safety measures: wearing a seat belt; wearing a protective helmet; passing forklift safety training | S2 | F1 | P1 | c | YES
H05 | The load falls off the forks, potentially harming the driver in the cabin or pedestrians/crew | S2 | F1 | P2 | d | NO | Active safety measures: install the load balancing system that will detect load imbalance and correct its position. Passive safety measures: install the locks that will hold the load firmly attached to the forks | S2 | F1 | P1 | c | YES
H06 | The driver falls out of the cabin while driving, potentially dying or obtaining an irreversible injury | S2 | F2 | P2 | e | NO | Active safety measures: install the automatic cabin door locking system. Passive safety measures: install the brightly colored seat belt | S2 | F1 | P1 | c | YES
H07 | Forklift hydraulic hose bursts, leading to a loss of pressure which drops the forks, potentially killing multiple people | S2 | F1 | P2 | d | NO | Active safety measures: install backup hydraulic hose. Passive safety measures: regular hose maintenance; crew training (don't walk under/near forks and safety gear) | S2 | F1 | P1 | c | YES
H08 | Escaping propane gas is ignited by the hot engine, potentially irreversibly harming multiple people | S2 | F1 | P2 | d | NO | Active safety measures: automatic fire detection and suppression system with the ability of manual activation that releases noncorrosive fluid to extinguish the fire. Passive safety measures: crew training (fire safety) | S2 | F1 | P1 | c | YES
H09 | The forklift tips backward while carrying the load on an inclined surface, potentially harming or killing the driver or pedestrians nearby | S2 | F1 | P2 | d | NO | Active safety measures: monitor the balance of the forklift and activate an alarm signal if the forklift becomes out of balance; the forklift should sound an alarm and give a light signal while carrying a heavy load, warning the pedestrians around to keep a safe distance | S2 | F1 | P1 | c | YES
H10 | The forklift hits a power cable while the forks are in an elevated position, causing electrical harm to or killing the driver | S2 | F1 | P2 | d | NO | Active safety measures: install a camera with object detection on top of the forks, so the driver can have a better view of the position and detect the surrounding objects. Passive safety measures: isolate the cabin of the driver from the forks, such that the electric current will not endanger the driver | S2 | F1 | P1 | c | YES
Fig. 6.5 State transition diagram depicting the draft initial safety concept for the forklift truck
Finally, the team provided their safety concept in the form of a state transition diagram, depicting the safe state (Fig. 6.5).
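The S/F/P parameters in the sample preliminary hazard list can be cross-checked against the performance levels the team assigned. The following is a minimal sketch, assuming the team applied the ISO 13849-1 risk graph; the names `RISK_GRAPH`, `required_pl`, and `PHL_SAMPLE` are illustrative and not part of the project documentation.

```python
# Required performance level lookup, assuming the ISO 13849-1 risk graph.
# Keys are (severity, frequency, probability) as used in the hazard list.
RISK_GRAPH = {
    ("S1", "F1", "P1"): "a",
    ("S1", "F1", "P2"): "b",
    ("S1", "F2", "P1"): "b",
    ("S1", "F2", "P2"): "c",
    ("S2", "F1", "P1"): "c",
    ("S2", "F1", "P2"): "d",
    ("S2", "F2", "P1"): "d",
    ("S2", "F2", "P2"): "e",
}


def required_pl(severity, frequency, probability):
    """Return the performance level for one combination of risk parameters."""
    return RISK_GRAPH[(severity, frequency, probability)]


# A few entries copied from the forklift PHL: (S, F, P, PL stated by the team).
PHL_SAMPLE = {
    "H01 initial": ("S2", "F1", "P2", "d"),
    "H03 initial": ("S2", "F2", "P2", "e"),
    "H08 initial": ("S2", "F1", "P2", "d"),
    "H01 reassessed": ("S2", "F1", "P1", "c"),
}

for label, (s, f, p, stated_pl) in PHL_SAMPLE.items():
    computed = required_pl(s, f, p)
    status = "consistent" if computed == stated_pl else "check this row"
    print(f"{label}: stated PL {stated_pl}, risk graph PL {computed} ({status})")
```

A lookup of this kind is only a consistency check on the table; the choice of the S, F, and P values themselves remains an engineering judgment.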
Chapter 7
Safety Integrity and Random Failures
Introduction
The first half of the textbook covered all the aspects required to understand the safety-critical system, specify its requirements, perform hazard and risk assessment, figure out potential safety precautions in the form of functional safety prescriptions (safety functions), and allocate safety integrity level requirements to them. The second half of the textbook addresses aspects of verification, that is, how we can show that our safety function design and implementation conform to the requirements prescribed for the designated safety integrity level.

Safety integrity as a notion is related to the resilience of the system to any dangerous failure. Safety functions, as key functional safety prescriptions, aim to prevent accidents by detecting anomalies and/or failures that may lead to accidents and bringing the system to a safe state before the accident has a chance to materialize. Failures of the safety functions themselves shall also be avoided, since their correct operation is directly responsible for accident prevention. Furthermore, safety integrity levels are allocated to safety functions, defining the level of quality their design and implementation need to meet to make them appropriate for the fulfillment of their tasks.

In this chapter, we dissect the specific properties of safety integrity that safety functions need to show. The first set of properties is related to systematic safety integrity: making sure that the processes and procedures we selected to design and implement safety functions minimize the chance of faults in the design and implementation. The second set of properties is related to random failures, usually attributed to physical aspects of the parts used in the safety function implementation. Those parts shall show high robustness against wear and tear and shall demonstrate durability. Random failures are analyzed using probabilistic models, and this chapter deals with the respective metrics for quantifying random failures. Safety integrity against random failures, therefore, can be a deal-breaker for the compliance of your safety-related system with the required functional safety standard, even if your design and implementation were flawless. Each component that is selected needs to fulfill the reliability targets defined by the standard.

Reliability is formally defined as the probability that the system or its component is still operational after a certain elapsed runtime ("survival" probability). In this chapter, we explore the related metrics which are usually provided by the component manufacturers, such as failure rate and mean time to failure (MTTF), and which can be used to assess the reliability of that component. Based on the calculations available within reliability theory, it is possible to assess the composite reliability of the complete safety-related system and therefore provide formal proof that the safety integrity against random failures is in line with the required safety integrity level allocated to that system.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_7
Video Lesson This chapter has a corresponding video lesson: sfs7.nit-institute.com
Lecture Notes
Safety integrity is resistance against dangerous failures. It is among the requirements for safety functions, expressed as a required safety integrity level (e.g., SIL, ASIL, PL, etc.) stemming from the risk evaluation of the hazard the safety function mitigates. Safety integrity is verified by quantifying the failures of the safety function and comparing them with the required values from the functional safety standard, and also by verifying whether all the procedures prescribed by the standard have been met for the safety function according to its safety integrity level.

Safety integrity is assessed against random failures, depending on the system and hardware reliability (e.g., IEC 61508 requires the frequency of dangerous failures for SIL 4 to be anywhere from 10^-9 to 10^-8 per hour) (Fig. 7.1).

Safety integrity level (SIL) | Average frequency of a dangerous failure of the safety function [h^-1] (PFH)
4 | ≥ 10^-9 to < 10^-8
3 | ≥ 10^-8 to < 10^-7
2 | ≥ 10^-7 to < 10^-6
1 | ≥ 10^-6 to < 10^-5
Fig. 7.1 Required failure rates for each safety integrity level as defined by IEC 61508

Safety integrity against systematic failures, which are not random in nature, is also regarded: faults in specifications and design, faulty processes and documentation, software bugs, and other design-time faults which can be latent (hidden). Artificial intelligence (AI) is also prone to systematic faults, but due to the pseudorandom nature of these faults and the impossibility of discovering them systematically (they originate in learning processes during the training phases), its failures can be modeled probabilistically through testing.

Random failures happen due to manufacturing material fatigue, wear and tear, as well as environmental influences (interferences), which mostly affect mechanical and hardware (electronic) components. A safety function usually consists of both electronic components (controllers executing the logic, and sensors) and electromechanical components (actuators which bring the system to the safe state, e.g., relays, switches, brake pads, valves, etc.), so its random failures must be quantified and compared with the required values from the functional safety standards.

Random failures are quantified using the failure probability F(t). This is the cumulative distribution function of a random variable T (time to failure) evaluated at t, expressing the probability that the failure happens within the time interval [0, t]. Therefore:
\[ F(0) = 0, \quad \lim_{t \to \infty} F(t) = 1, \quad F(t) \in [0, 1] \]
Another important quantification is the reliability R(t). As opposed to failure probability, reliability is the probability that the system is going to survive until the time t. Therefore:
\[ R(t) = 1 - F(t), \quad R(0) = 1, \quad \lim_{t \to \infty} R(t) = 0, \quad R(t) \in [0, 1] \]
Please note that both F(t) and R(t) are conditional probabilities (Fig. 7.2).
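The bands of Fig. 7.1 can be written down as a simple lookup. The sketch below is illustrative only: the function name `sil_from_pfh` is an assumption, and the band limits are the ones listed in Fig. 7.1 (lower limit inclusive, upper limit exclusive).

```python
# SIL bands from Fig. 7.1: (SIL, lower PFH limit inclusive, upper PFH limit exclusive),
# with the PFH expressed in dangerous failures per hour.
SIL_BANDS = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]


def sil_from_pfh(pfh_per_hour):
    """Return the SIL whose band contains the given PFH, or None if no band does."""
    for sil, lower, upper in SIL_BANDS:
        if lower <= pfh_per_hour < upper:
            return sil
    return None  # outside the ranges listed in Fig. 7.1


for pfh in (5e-9, 2e-7, 3e-6, 1e-4):
    print(f"PFH = {pfh:.0e} per hour -> SIL {sil_from_pfh(pfh)}")
```

Values outside the listed ranges deliberately return `None` here, since Fig. 7.1 does not assign them to any level.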
Fig. 7.2 Graphs of failure probability F(t) and reliability R(t) of a system over the time t (vertical axis from 0 to 1; horizontal axis: time t from 0 to 10)
The trend of failures at a given moment in time is expressed with the failure rate:
\[ h(t) = \frac{\frac{d}{dt} F(t)}{R(t)} = \frac{\text{units failed in } [t, t + dt]}{\text{units survived until } t} \cdot \frac{1}{dt} \]
The fundamental relation between the reliability and the failure rate is:
\[ R(t) = e^{-\int_0^t h(u)\,du} \]
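The relation between R(t) and h(t) can be checked numerically. The sketch below integrates a given failure rate with the trapezoidal rule; the function name `reliability_from_hazard` and the two example failure rates (a constant one and a linearly increasing one) are assumptions chosen only because their closed-form results are easy to verify.

```python
import math


def reliability_from_hazard(hazard, t, steps=10_000):
    """Approximate R(t) = exp(-integral of h(u) from 0 to t) with the trapezoidal rule."""
    dt = t / steps
    area = 0.0
    for i in range(steps):
        u0 = i * dt
        u1 = (i + 1) * dt
        area += 0.5 * (hazard(u0) + hazard(u1)) * dt
    return math.exp(-area)


# Constant failure rate h(u) = lam: the closed form is exp(-lam * t).
lam = 0.05
r_const = reliability_from_hazard(lambda u: lam, 10.0)
print(r_const, math.exp(-lam * 10.0))

# Hypothetical wear-out behavior h(u) = a * u: the closed form is exp(-a * t**2 / 2).
a = 0.01
r_wear = reliability_from_hazard(lambda u: a * u, 10.0)
print(r_wear, math.exp(-a * 10.0**2 / 2))
```

In both cases the numerical value and the closed form should agree to within floating-point error, since the trapezoidal rule is exact for constant and linear integrands.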
The failure rate is usually higher right after the system production, due to process faults and manufacturing errors (early "infant mortality" failures), and then it decreases (decreasing failure rate, DFR) and becomes nearly constant. After a sufficient system runtime, the failure rate starts to increase again, due to wear-out failures (increasing failure rate, IFR). This typical failure rate graph is called the bathtub curve. In practice, the system is released after it is stress-tested ("skipping" the DFR zone) and regarded only during the constant failure rate period, which allows simplifications and brings many benefits for reliability calculation and comparisons (Fig. 7.3).

Fig. 7.3 Bathtub curve (left) and failure probability F(t) and reliability R(t) for an e-system (left panel: failure rate over time, showing the decreasing failure rate region with early "infant mortality" failures, the constant failure rate region with constant (random) failures, the increasing failure rate region with wear-out failures, and the observed failure rate; right panel: F(t) and R(t) on a scale from 0 to 1 over time from 0 to 10)

These systems are called e-systems, for which a constant failure rate is defined as:
\[ h(t) = \text{const.} = \lambda \implies R(t) = e^{-\lambda t} \]
Manufacturers of components usually release λ values in the documentation (or they can be calculated using handbooks such as MIL, FIDES, etc.). Another typical value that is used is the mean time to failure (MTTF), which is the expected value of the time to failure T, for which the following holds:
\[ \text{MTTF} = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-\lambda t}\,dt = \frac{1}{\lambda} \; [\text{h}] \]
Interestingly, for e-systems the failure probability at MTTF is always ~0.63 (not 0.5!). The failure rate is usually expressed as "failures per hour" (h^-1) or "failures in time" (FIT), for which the following holds:
\[ \lambda = x\ \text{h}^{-1} = x \cdot 10^{9}\ \text{FIT} \implies \lambda = y\ \text{FIT} = y \cdot 10^{-9}\ \text{h}^{-1} \]
More reliable systems have smaller λ and larger MTTF, and vice versa (Fig. 7.4).

Fig. 7.4 Failure probability and reliability for the e-system for different failure rate values (MTTF is expressed as 1/λ). Two plots over time from 0 to 100 for λ = 0.2, λ = 0.05, and λ = 0.01: failure probability F(t), with the level ~0.63 marked at t = 1/λ, and reliability R(t), with the level ~0.37 marked at t = 1/λ.
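The e-system relations above can be collected into a few helper functions. This is a minimal sketch; the function names are illustrative assumptions, and the three λ values are the ones shown in Fig. 7.4.

```python
import math


def fit_to_per_hour(rate_fit):
    """Convert a failure rate from FIT to failures per hour (1 FIT = 1e-9 per hour)."""
    return rate_fit * 1e-9


def per_hour_to_fit(rate_per_hour):
    """Convert a failure rate from failures per hour to FIT."""
    return rate_per_hour * 1e9


def reliability(lam, t):
    """R(t) = exp(-lam * t) for a constant failure rate lam."""
    return math.exp(-lam * t)


def failure_probability(lam, t):
    """F(t) = 1 - R(t)."""
    return 1.0 - reliability(lam, t)


def mttf(lam):
    """MTTF = 1 / lam, in the time unit in which lam is expressed."""
    return 1.0 / lam


# At t = MTTF the exponent is -1, so F and R do not depend on lam.
for lam in (0.2, 0.05, 0.01):
    t = mttf(lam)
    print(f"lambda = {lam}: MTTF = {t:g}, "
          f"F(MTTF) = {failure_probability(lam, t):.2f}, "
          f"R(MTTF) = {reliability(lam, t):.2f}")
```

For every λ the loop reports F(MTTF) of about 0.63 and R(MTTF) of about 0.37, which are the levels marked in Fig. 7.4.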
Calculation Examples
Task 1
A company released 1000 units into pilot deployment on January 5, 2021. Each month, the company recorded the number of units still operational, as shown in Table 7.1. Calculate the failure probability F(t), reliability R(t), and failure rate h(t) after each month of runtime.

Solution
Reliability is the "probability of survival"; therefore, it is the ratio of the number of units surviving (remaining) after a monthly inspection, x_m, to the total number of units:
\[ R(m) = \frac{\text{units surviving}}{\text{total units}} = \frac{x_m}{1000} \]
Failure probability is then:
\[ F(m) = 1 - R(m) \]
The failure rate can be expressed as:
\[ h(m) = \frac{\text{units failed in } [m_{prev}, m]}{\text{units survived until } m_{prev}} = \frac{x_{m_{prev}} - x_m}{x_{m_{prev}}} \]
Therefore:

Date of inspection | Units surviving (x_m) | Units failed since last inspection (x_mprev - x_m) | F(m) | R(m) | h(m)
February 5, 2021 | 798 | 202 | 0.202 | 0.798 | 0.2
March 5, 2021 | 694 | 104 | 0.306 | 0.694 | 104/798 = 0.13
April 5, 2021 | 638 | 56 | 0.362 | 0.638 | 0.08
May 5, 2021 | 607 | 31 | 0.393 | 0.607 | 0.05
June 5, 2021 | 577 | 30 | 0.423 | 0.577 | 0.05
July 5, 2021 | 548 | 29 | 0.452 | 0.548 | 0.05
August 5, 2021 | 494 | 54 | 0.506 | 0.494 | 0.1
September 5, 2021 | 373 | 121 | 0.627 | 0.373 | 0.24
October 5, 2021 | 221 | 152 | 0.779 | 0.221 | 0.41
November 5, 2021 | 0 | 221 | 1 | 0 | 1
Table 7.1 Units remaining in operation after each inspection
Date of inspection | Units remaining in operation
February 5, 2021 | 798
March 5, 2021 | 694
April 5, 2021 | 638
May 5, 2021 | 607
June 5, 2021 | 577
July 5, 2021 | 548
August 5, 2021 | 494
September 5, 2021 | 373
October 5, 2021 | 221
November 5, 2021 | 0
Task 2 The constant failure rate of a power supply component, according to the manufacturer, is rated at λ = 200 FIT. Calculate the failure probability and reliability of the power supply after a runtime of 5 years, as well as the MTTF in hours.

Solution First we need to convert the failure rate to $\mathrm{h}^{-1}$:
$$\lambda = 200\ \mathrm{FIT} = 200 \cdot 10^{-9}\ \mathrm{h}^{-1} = 2 \cdot 10^{-7}\ \mathrm{h}^{-1}$$
We need to convert the requested runtime to hours:
$$t = 5a = 5 \cdot 365 \cdot 24\ \mathrm{h} = 43\,800\ \mathrm{h}$$
Failure probability can be expressed as:
$$F(t) = 1 - e^{-\lambda t}$$
$$F(43\,800\ \mathrm{h}) = 1 - e^{-2 \cdot 10^{-7}\ \mathrm{h}^{-1} \cdot 43\,800\ \mathrm{h}} = 0.0087$$
Reliability can be expressed as:
$$R(t) = e^{-\lambda t} = 1 - F(t)$$
$$R(43\,800\ \mathrm{h}) = 1 - 0.0087 = 0.9913$$
Mean time to failure:
$$MTTF = \frac{1}{\lambda} \Longrightarrow MTTF = \frac{1}{2 \cdot 10^{-7}\ \mathrm{h}^{-1}} = 5\,000\,000\ \mathrm{h}$$
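The same steps can be sketched in a few lines of Python. The helper name `constant_rate_metrics` is an assumption made for illustration; the only inputs are the 200 FIT rating and the 5-year runtime from the task.

```python
import math

FIT_TO_PER_HOUR = 1e-9  # 1 FIT = one failure per 10^9 device-hours


def constant_rate_metrics(fit, hours):
    """Failure probability, reliability, and MTTF for a constant failure rate."""
    lam = fit * FIT_TO_PER_HOUR          # failure rate in 1/h
    reliability = math.exp(-lam * hours)  # R(t) = exp(-lambda * t)
    return {
        "lambda": lam,
        "F": 1 - reliability,
        "R": reliability,
        "MTTF": 1 / lam,                  # in hours
    }


runtime_h = 5 * 365 * 24  # 5 years expressed in hours (43 800 h)
result = constant_rate_metrics(200, runtime_h)
print(f"F = {result['F']:.4f}, R = {result['R']:.4f}, MTTF = {result['MTTF']:.0f} h")
```

Because λt is small here (about 0.0088), F(t) is very close to λt itself, which is a handy sanity check for hand calculations.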
Exercise 7 One of the safety functions prescribed for your system needs a relay switch as an actuator to cut off the power and bring your system into the safe state. Your company evaluated the relay switch component for reliability by performing accelerated life testing (ALT) on a sample of 10,000 switches over the course of 3 weeks. The switch is expected to be activated once per hour during system runtime, because the system function for which the safety function is prescribed executes in on-demand mode, once per hour. During ALT, the test is performed by activating the relay switch once every 10 seconds and logging the total number of failed switches every 10,000 cycles, as shown in the ALT log.

Cycles tested: 10,000; 20,000; 30,000; 40,000; 50,000; 60,000; 70,000; 80,000; 90,000; 100,000; 110,000; 120,000; 130,000; 140,000; 150,000

0.99):
$$A = \frac{MUT}{MUT + MDT} = \frac{MUT}{MTBF}$$
Calculation Examples

Task 1 An excerpt from a pseudo-FMEDA is given in the table below (F = 1 marks a dangerous failure; D = 1 marks an undetectable failure; R is the failure rate).

| Component | Failure code | Failure cause | Safety measure | F | D | R |
|---|---|---|---|---|---|---|
| uC | 101 | Power cutoff | Redesign, safety PMIC | 1 | 1 | 20 FIT |
| uC | 102 | Time desync | Redesign, backup RTC | 1 | 1 | 5 FIT |
| uC | 103 | Faulty data, output stuck at 1 | Systematic safety integrity assurance | 0 | 1 | 10 FIT |
| uC | 104 | Faulty data, output stuck at 0 | Plausibility check, cross-correlation of outputs | 1 | 0 | 10 FIT |
| uC | 105 | Memory corruption | Diagnosis (BIST) | 1 | 0 | 10 FIT |
| uC | 106 | Bond wires detachment | Manufacturing process reassessment, soldering QA | 1 | 1 | 5 FIT |
| LVDS deser. | 201 | Wire break | Redesign, PCB reassessment | 1 | 1 | 1 FIT |
| LVDS deser. | 202 | Stuck at last frame | Plausibility check at uC | 1 | 0 | 1 FIT |
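Calculate the diagnostic coverage and the safe failure fraction for the microcontroller.

Solution First, we need to classify the failures of the microcontroller from the FMEDA:
• Safe failures: 103
• Dangerous failures: 101, 102, 104, 105, 106
• Undetectable failures: 101, 102, 103, 106
• Dangerous undetectable failures: 101, 102, 106

Now we can calculate failure rates for each failure class:
$$\lambda_s = \lambda_{103} = 10\ \mathrm{FIT}$$
$$\lambda_d = \lambda_{101} + \lambda_{102} + \lambda_{104} + \lambda_{105} + \lambda_{106} = (20 + 5 + 10 + 10 + 5)\ \mathrm{FIT} = 50\ \mathrm{FIT}$$
$$\lambda = \lambda_s + \lambda_d = 10\ \mathrm{FIT} + 50\ \mathrm{FIT} = 60\ \mathrm{FIT}$$
$$\lambda_{du} = \lambda_{101} + \lambda_{102} + \lambda_{106} = (20 + 5 + 5)\ \mathrm{FIT} = 30\ \mathrm{FIT}$$
$$\lambda_{dd} = \lambda_d - \lambda_{du} = 50\ \mathrm{FIT} - 30\ \mathrm{FIT} = 20\ \mathrm{FIT}$$
Diagnostic coverage is:
$$DC\,[\%] = \frac{\lambda_{dd}}{\lambda_d} = \frac{20}{50} = 40\%$$
Safe failure fraction is:
$$SFF\,[\%] = \frac{\lambda_s + \lambda_{dd}}{\lambda_s + \lambda_d} = \frac{10 + 20}{10 + 50} = 50\%$$
or, equivalently:
$$SFF\,[\%] = 1 - \frac{\lambda_{du}}{\lambda} = 1 - \frac{30}{60} = 50\%$$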
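The classification and the two metrics can be reproduced from the FMEDA rows directly. The sketch below encodes only the microcontroller rows from the task; the tuple layout and the function name `dc_and_sff` are illustrative choices, not part of the book.

```python
# Microcontroller rows from the pseudo-FMEDA:
# (failure code, F = dangerous flag, D = undetectable flag, rate in FIT)
UC_FMEDA = [
    (101, 1, 1, 20),
    (102, 1, 1, 5),
    (103, 0, 1, 10),
    (104, 1, 0, 10),
    (105, 1, 0, 10),
    (106, 1, 1, 5),
]


def dc_and_sff(rows):
    """Return (diagnostic coverage, safe failure fraction) as fractions."""
    lam_s = sum(rate for _, dangerous, _, rate in rows if not dangerous)
    lam_d = sum(rate for _, dangerous, _, rate in rows if dangerous)
    lam_du = sum(rate for _, dangerous, undetectable, rate in rows
                 if dangerous and undetectable)
    lam_dd = lam_d - lam_du  # dangerous detectable
    dc = lam_dd / lam_d
    sff = (lam_s + lam_dd) / (lam_s + lam_d)
    return dc, sff


dc, sff = dc_and_sff(UC_FMEDA)
print(f"DC = {dc:.0%}, SFF = {sff:.0%}")
```

Keeping the flags per failure mode, rather than pre-summed rates, makes it easy to re-run the calculation after a safety measure changes a failure from undetectable to detectable.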
Task 2 A log of the operation of three units of your system is given, depicting the uptimes and downtimes of the system. Calculate the availability of the system.
10 Proving the Safety Integrity
Solution We can count the number of uptimes and downtimes, and sum up their durations, for all units directly from the log:

| Unit | Number of uptimes | Number of downtimes | Duration of all uptimes | Duration of all downtimes |
|---|---|---|---|---|
| A | 4 | 4 | 12 | 4 |
| B | 5 | 5 | 11 | 5 |
| C | 4 | 4 | 9 | 7 |
| All units | 13 | 13 | 32 | 16 |

$$MUT = \frac{\sum \text{uptimes of all units}}{\text{No. of uptimes of all units}} = \frac{32}{13} = 2.46$$
$$MDT = \frac{\sum \text{downtimes of all units}}{\text{No. of downtimes of all units}} = \frac{16}{13} = 1.23$$
Now we can calculate the availability of the system:
$$A = \frac{MUT}{MUT + MDT} = \frac{2.46}{2.46 + 1.23} = 0.67 = 67\%$$
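As a quick cross-check, the availability can be computed from the per-unit totals. This is a minimal sketch; the dictionary layout and the function name `availability` are assumptions for illustration, and the numbers are the per-unit counts and durations listed in the solution.

```python
# Per-unit totals read from the log:
# (uptime duration, number of uptimes, downtime duration, number of downtimes)
units = {
    "A": (12, 4, 4, 4),
    "B": (11, 5, 5, 5),
    "C": (9, 4, 7, 4),
}


def availability(unit_log):
    """Return (MUT, MDT, A) pooled over all units."""
    up_total = sum(u[0] for u in unit_log.values())
    up_count = sum(u[1] for u in unit_log.values())
    down_total = sum(u[2] for u in unit_log.values())
    down_count = sum(u[3] for u in unit_log.values())
    mut = up_total / up_count      # mean up time
    mdt = down_total / down_count  # mean down time
    return mut, mdt, mut / (mut + mdt)


mut, mdt, a = availability(units)
print(f"MUT = {mut:.2f}, MDT = {mdt:.2f}, A = {a:.0%}")
```

Since the number of uptimes equals the number of downtimes here, the result reduces to total uptime divided by total observed time, 32/48.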
Exercise 10 Continue the exercise from Chap. 9, now attempting to increase reliability by analyzing the failures of system components and attempting to assess dangerous undetectable failures. The initial pseudo-FMEDA sheet is given below:
Additionally, try to respecify the detectability of failures for the diversified configuration of the system. Finally, calculate safe failure fraction (SFF) and diagnostic coverage (DC), and compare all safety integrity metrics against the requirements from the functional safety standard (see further table), considering that the required SIL for the described SRS is SIL 2:
Safe failure fraction of an element
SIL 2
SUB2: SIL 2
SUB3: $(1-\beta)^2 \lambda_d^2 T + \beta \lambda_d = 2.4 \cdot 10^{-9} + 1.08 \cdot 10^{-7} = 1.104 \cdot 10^{-7}$ ⇒ SIL 2
SRS1 ⇒ SIL 2 = SILr 2
Once the SRS is finally validated, the system shall be put in operation and monitored in the field trial. All potential issues (incidents, accidents, errors) need to be logged and carefully investigated.
We must also consider the potential failures due to systematic faults in programmable components. Those faults may be removed only in the design phase, or, if known, may be tolerated via a fault tolerance mechanism by using measures for HW/SW prescribed in standards.
To consider systematic faults in the example, it is possible either to fully document the process used to develop software for the logic components (L), using ASPICE L2 and the HW/SW measures prescribed by the standards, or to resort to components which are proven in use.
SRS 1 acc. IEC 62061 (RBD): SUB1 (Subsystem C): Gyro 1, Gyro 2, Gyro 3, Diagnostics; SUB2 (Subsys. A): L, I/O; SUB3 (Subsys. B): R, R, CCF; SRS1 (Subsys. A): SUB1, SUB2, and SUB3 in series.

SRS 2 acc. ISO 13849 (Category 2): I (Pos), L, O (I/O), TE
Pos. sensor (rotary): B10 = 10,000,000 op, hop = 8, dop = 90, t = 2 s
nop = (dop·hop·3600)/t = 1,296,000
SFF = 50% ⇒ B10d = B10/0.5 = 20,000,000
I/O uC: MTBF = 300a
Logic uC: MTBF = 300a
MTTFd pos: B10d/(0.1·nop) = 154a
1/MTTFchannel = 1/154 + 1/300 + 1/300 ⇒ MTTFchannel = 76a (high)
DC = 95% (medium)
DC medium & MTTF high & Category 2 ⇒ PL d = PLr d
Let us remember the first safety function defined for the wrecking ball crane in our example, by looking at its requirements. Safety requirement 1.1 (SR 1.1): SRS shall monitor the balance of the crane in order to predict a near-tipping event.
11 Practical SIL Calculation

Fig. 11.1 Technical safety concept for the SRS serving SR 1.1 (blocks: Gyro 1 (top), Gyro 2 (cabin), Gyro 3 (back), Logic, Diag, Relay 1, Relay 2, I/O)
Technical safety requirement 1.1.1 (TR1.1.1, derived from SR1.1): SRS shall use gyroscopes positioned at the crane top, cabin, and the back of the vehicle to detect imbalance.
Safety requirement 1.2 (SR 1.2): SRS shall, in case of imbalance over the threshold (near-tipping), cut off the controls and command the vehicle to sound an alarm and bring up the crane.
Quality safety requirement 1.3 (QSR 1.3): SRS shall achieve SILr 2.

By following the guidelines from IEC 62061, we can create a technical safety concept including an architecture containing three distinct gyroscope sensors as inputs (top, cabin, and back: Gyro 1, Gyro 2, and Gyro 3, respectively) which feed their readings to a logic component (L). The logic component implements the algorithm required in SR 1.2, sending the cutoff command to two redundant relays at the same time (Relay 1 and Relay 2) to disable the controls. Also, the alarm command is sent to the appropriate I/O block, which consists of the sound actuators (e.g., speakers). To enhance the safety integrity, a separate diagnostics unit is used to detect failures on any of the gyroscopes to prevent misjudgments of the system (Fig. 11.1).

Now we can decompose our technical safety concept using the guidelines from IEC 62061. We can notice that our concept matches three design patterns with the prescribed RBDs. Input gyroscopes are equivalent to IEC 62061 subsystem C. Logic and I/O can be represented as a simple series (as prescribed by subsystem A), whereas redundant relays represent subsystem B. All three subsystems finally compose a final series, allowing us to calculate the final reliability of the defined SRS (Figs. 11.2 and 11.3).

The next step is to consult the specification of each selected component according to the information released by the manufacturer (or results from a previously conducted FMExA analysis). For example:
• Gyroscope model is ADiS16385, for which the manufacturer specifies $\lambda_d = 0.86 \cdot 10^{-6}\ \mathrm{h}^{-1}$.
• Relays are identical, with $\lambda_d$ = 600 FIT.
• Common cause failures for relays can be modeled by means of the beta-factor model, β = 18%.
• Additionally, the manufacturer suggests a proof test interval to check and replace relays every T = 10,000 h.
• The combined I/O block is based on a microcontroller with MTBF = 300a, whereas the logic microcontroller is declared as SIL 2 compliant; we estimate this as being at the "middle" of the range: ~5·10⁻⁷.
• Diagnostic coverage for detecting dangerous failures on gyroscopes is 90%.
Lecture Notes
Fig. 11.2 Possible design templates available with IEC 62061

Fig. 11.3 RBD templates for the technical safety concept of SR 1.1 as per IEC 62061 (SUB1 (Subsystem C): Gyro 1, Gyro 2, Gyro 3, Diagnostics; SUB2 (Subsys. A): L, I/O; SUB3 (Subsys. B): R, R, CCF; SRS1 (Subsys. A): SUB1, SUB2, SUB3)
By knowing this information, we can now perform the calculations. IEC 62061 gives ready-made formulae for each of the subsystems, although they all stem from the calculations devised in Chaps. 7, 8, 9, 10, and 11. For example, for Subsystem C, IEC 62061 gives the following final failure rate formula:
$$\lambda_F = N \lambda_d (1 - DC)$$
As we remember from before, failure rates of the series configuration can be summed, meaning:
$$\lambda_F = \lambda_1 + \lambda_2 + \ldots + \lambda_N$$
When having the diagnostics in place, only the undetectable portion of the failure rate remains in the calculation:
Fig. 11.4 Safety integrity levels and their required failure rate targets as per IEC 62061
$$\lambda_{du} = \lambda_d (1 - DC)$$
When we plug in the actual values:
$$\lambda_F = 3 \cdot 0.86 \cdot 10^{-6} \cdot (1 - 0.9) = 0.26 \cdot 10^{-6}$$
IEC 62061 gives the requirements on which targets need to be achieved for each SIL (Fig. 11.4). We can therefore see that for SUB1, the obtained value falls within SIL 2. For Subsystem B, IEC 62061 also gives a final formula to use:
$$\lambda_F = (1 - \beta)^2 \lambda_d^2 T + \beta \lambda_d$$
The same result is obtained by following the "classic" calculation approach for T:
$$R_k(T) = 1 - (1 - R(T))^2$$
$$e^{-\lambda_k T} = 1 - 1 + 2R(T) - R^2(T) = 2e^{-\lambda T} - e^{-2\lambda T}$$
$$\lambda_k = -\frac{\ln\left(2e^{-\lambda T} - e^{-2\lambda T}\right)}{T}$$
$$\lambda_F = -\frac{\ln\left(2e^{-(1-\beta)\lambda_d T} - e^{-2(1-\beta)\lambda_d T}\right)}{T} + \beta \lambda_d$$
When plugging in the values for SUB3:
$$\lambda_F = 1.104 \cdot 10^{-7} \Rightarrow \text{SIL 2}$$
Finally, for SUB1, SUB2, and SUB3 as a series, we get:
$$\lambda_{TOT} = (1 + 2.6 + 5) \cdot 10^{-7} = 8.6 \cdot 10^{-7} \Rightarrow \text{SIL 2}$$
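The subsystem calculations can be sketched numerically to confirm that the ready-made Subsystem B formula and the "classic" derivation agree. The function names are illustrative. The `sil_from_rate` helper encodes the commonly cited IEC 62061 failure-rate bands as an assumption, since Fig. 11.4 is not reproduced in the text, and SUB2 is taken as 5·10⁻⁷ exactly as in the sum above.

```python
import math


def subsystem_c(n, lam_d, dc):
    """Series of n identical elements with diagnostics: N * lambda_d * (1 - DC)."""
    return n * lam_d * (1 - dc)


def subsystem_b_formula(lam_d, beta, t):
    """Ready-made formula for two redundant elements with common cause failures."""
    return (1 - beta) ** 2 * lam_d ** 2 * t + beta * lam_d


def subsystem_b_classic(lam_d, beta, t):
    """'Classic' route: reliability of the parallel pair at T, converted back to a rate."""
    lam = (1 - beta) * lam_d  # independent part of the failure rate
    r_pair = 2 * math.exp(-lam * t) - math.exp(-2 * lam * t)
    return -math.log(r_pair) / t + beta * lam_d


def sil_from_rate(rate_per_hour):
    """Assumed bands: SIL 3 [1e-8, 1e-7), SIL 2 [1e-7, 1e-6), SIL 1 [1e-6, 1e-5)."""
    for sil, low, high in ((3, 1e-8, 1e-7), (2, 1e-7, 1e-6), (1, 1e-6, 1e-5)):
        if low <= rate_per_hour < high:
            return sil
    return None  # outside the tabulated bands


sub1 = subsystem_c(3, 0.86e-6, 0.9)                    # gyroscopes with diagnostics
sub2 = 5e-7                                            # value used in the text for SUB2
sub3 = subsystem_b_formula(600e-9, 0.18, 10_000)       # redundant relays
sub3_classic = subsystem_b_classic(600e-9, 0.18, 10_000)
total = sub1 + sub2 + sub3

print(f"SUB1 = {sub1:.3e}  SUB3 = {sub3:.3e} (classic {sub3_classic:.3e})")
print(f"total = {total:.3e} -> SIL {sil_from_rate(total)}")
```

The two Subsystem B results differ only in the fourth significant digit, because the ready-made formula is a first-order approximation of the logarithm for small λT.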
Fig. 11.5 Available architectural categories as templates given by ISO 13849

Fig. 11.6 Defined technical safety concept for the SRS 2 (Category 2; blocks: I (Pos), L, O (I/O), TE)
We can see that the obtained values conform to the safety integrity requirement QSR1.3 of SILr 2.

The second safety function for the wrecking ball crane was defined using the following requirements:
Safety requirement 2.1 (SR 2.1): SRS shall monitor ball position to prevent the wrecking ball from reaching the vehicle.
Technical safety requirement 2.1.1 (TR2.1.1, derived from SR2.1): SRS shall use two position sensors (at crane arm and ball pulley) to monitor crane height and ball descent and to detect possible collision if swung.
Safety requirement 2.2 (SR 2.2): SRS shall disallow dangerous positioning by limiting the controls extent continuously.
Quality safety requirement 2.3 (QSR 2.3): SRS shall achieve PLr d.

We now create a technical safety concept according to the guidelines from ISO 13849. We first look at the suggested design templates from this standard (Fig. 11.5). Our system requires a set of input sensors (nonredundant!) that provide continuous input to the logic, which in turn continuously decides whether to limit further control possibility by sending appropriate signals to the output. We can include a "test" module (diagnostics) to introduce some diagnostic coverage and therefore decrease the output failure rate. This architecture is most similar to the Category 2 design (Fig. 11.6).

By consulting manufacturer information, we find that the rotary position sensors are declared with B10 = 10,000,000 operations (clicks). For the use case of the wrecking ball crane, based on the worktime logs we can extract the yearly number of operations: the machine is operated 8 hours each day, 90 days each year on average. If one cycle of the position sensor (time between clicks when used) is 2 seconds, we can calculate the total yearly number of sensor operations, with hop = 8, dop = 90, t = 2 s:
$$n_{op} = \frac{d_{op} \cdot h_{op} \cdot 3600}{t} = 1\,296\,000$$
We have also performed an FMEDA and decided that, out of all failures, 50% are safe failures which can be excluded (safe failure fraction, SFF = 50%). Only the remaining dangerous fraction, 1 − SFF = 0.5, counts toward dangerous failures, so we have an updated B10 value:
$$B_{10d} = \frac{B_{10}}{1 - SFF} = \frac{10\,000\,000}{0.5} = 20\,000\,000$$
The manufacturer declares MTBF values for the I/O and logic controllers to be 300a. Diagnostic coverage of the configuration is 95%. Category 2 is a simple series configuration, for which the following holds:
$$\lambda_F = \lambda_{POS} + \lambda_L + \lambda_{IO}$$
$$\frac{1}{MTTF_{channel}} = \frac{1}{MTTF_{dPOS}} + \frac{1}{MTTF_L} + \frac{1}{MTTF_O}$$
We can calculate MTTF for dangerous failures of the position sensors:
$$MTTF_{dPOS} = \frac{B_{10d}}{0.1 \cdot n_{op}} = 154a$$
Finally:
$$\frac{1}{MTTF_{channel}} = \frac{1}{154a} + \frac{1}{300a} + \frac{1}{300a}$$
$$MTTF_{channel} = 76a$$
To decide the achieved performance level as per ISO 13849, we need to use the obtained MTTF for one series in a configuration (denoted as "channel") as well as
the overall DC value and the selected system category. We can then plug those values into appropriate tables given by the standard to obtain the achieved PL. For MTTF of the channel, we deduce the value as high:

| MTTFD denotation of each channel | Range of each channel |
|---|---|
| Low | 3a ≤ MTTFD < 10a |
| Medium | 10a ≤ MTTFD < 30a |
| High | 30a ≤ MTTFD < 100a |

For the DC of the system, we deduce the value medium:

| Diagnostic coverage (DC) denotation | Range |
|---|---|
| None | DC < 60% |
| Low | 60% ≤ DC < 90% |
| Medium | 90% ≤ DC < 99% |
| High | DC ≥ 99% |

Then we can use the final table to determine the PL:

| Category | B | 1 | 2 | 2 | 3 | 3 | 4 |
|---|---|---|---|---|---|---|---|
| DCavg | None | None | Low | Med. | Low | Med. | High |
| MTTFD of each channel: Low | a | n/a | a | b | b | c | n/a |
| MTTFD of each channel: Medium | b | n/a | b | c | c | d | n/a |
| MTTFD of each channel: High | n/a | c | c | d | d | d | e |
We can see that the obtained performance level is PL d, which meets the requirement QSR 2.3 of PLr d.
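The whole ISO 13849 walk-through for SRS 2 can be sketched as a small script. All function and variable names are illustrative assumptions; the band limits and the PL lookup simply transcribe the three tables given in the text, and the dangerous fraction is taken as 1 − SFF = 0.5.

```python
def mttf_d_years(b10, dangerous_fraction, d_op, h_op, t_cycle_s):
    """MTTFd of a wearing component from its B10 value and duty cycle."""
    n_op = d_op * h_op * 3600 / t_cycle_s  # operations per year
    b10d = b10 / dangerous_fraction        # B10 restricted to dangerous failures
    return b10d / (0.1 * n_op)


def series_mttf(*mttfs):
    """MTTF of a series channel: the reciprocals (failure rates) add up."""
    return 1 / sum(1 / m for m in mttfs)


def mttf_band(years):
    """Denotation of the channel MTTFd, per the ranges in the text."""
    if 3 <= years < 10:
        return "Low"
    if 10 <= years < 30:
        return "Medium"
    if 30 <= years < 100:
        return "High"
    raise ValueError("MTTFd outside the tabulated ranges")


def dc_band(dc):
    """Denotation of the diagnostic coverage (dc given as a fraction)."""
    if dc < 0.60:
        return "None"
    if dc < 0.90:
        return "Low"
    if dc < 0.99:
        return "Medium"
    return "High"


MTTF_ORDER = ("Low", "Medium", "High")
# (category, DCavg) -> PL for MTTFd Low / Medium / High; None means n/a.
PL_TABLE = {
    ("B", "None"): ("a", "b", None),
    ("1", "None"): (None, None, "c"),
    ("2", "Low"): ("a", "b", "c"),
    ("2", "Medium"): ("b", "c", "d"),
    ("3", "Low"): ("b", "c", "d"),
    ("3", "Medium"): ("c", "d", "d"),
    ("4", "High"): (None, None, "e"),
}


def performance_level(category, dc, mttf):
    return PL_TABLE[(category, dc)][MTTF_ORDER.index(mttf)]


mttf_pos = mttf_d_years(10_000_000, 0.5, d_op=90, h_op=8, t_cycle_s=2)
mttf_channel = series_mttf(mttf_pos, 300, 300)
pl = performance_level("2", dc_band(0.95), mttf_band(mttf_channel))
print(f"MTTFd(pos) = {mttf_pos:.0f}a, MTTF(channel) = {mttf_channel:.0f}a, PL {pl}")
```

Encoding the lookup as a dictionary keyed by (category, DCavg) keeps the n/a cells explicit, so an unsupported combination surfaces as `None` or a `KeyError` instead of a silently wrong PL.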
Now Try for Yourself!

Based on the outputs from Project 1, you now need to provide argumentation and evidence for your safety case, related to the safety integrity of your safety functions and other important aspects.

As a group, revisit your system requirements specification and, if needed, extend it so that you have at least five safety requirements and corresponding derived safety requirements (technical, quality) which describe five active risk mitigation measures (safety functions). In the group, for each of the safety functions, define the list of written evidence for the safety case (just the list, not the evidence itself!) which is, in your view, required to close the safety case with respect to those safety functions. One of the pieces of evidence would surely be the evidence on the fulfillment of the prescribed safety integrity level with regard to random failures and other quantitative safety integrity metrics.

After the group part, each of the team members shall select one safety function and provide evidence, through calculation, that the safety function fulfills the respective safety integrity level with regard to quantitative failure metrics (as defined in the Chap. 10 exercise). Remember to:
• Define a rough architectural block diagram for the corresponding safety-related system (SRS)
• Draw a corresponding reliability block diagram (RBD)
• Find out failure quantifiers from relevant online sources, handbooks, or references for each of the components, and be able to argue your choice (it does not need to be exact, as this is just an exercise, but the choice needs to be reasonable)
• Make sure to use only dangerous undetectable failures; base the numbers on the Chap. 10 exercise FMEDA, guesstimates, or rule of thumb (e.g., consider 50% of all failures to be dangerous, and 90% of all dangerous failures to be detectable); you do not need to derive any specific FMEDA
• Perform the calculation to find out the reliability of the complete SRS after 10 years of runtime and decide on the output SIL; use tables as in the Chap. 10 exercise
• Discuss the SIL with regard to safety integrity requirements
• In case the SIL is not met, apply any of the safety integrity improvement methods you find suitable (e.g., derating, hot spare, item-level redundancy, majority voting, etc.), update the RBD, and recalculate reliability so that the SIL is met
• In case the SIL was met and no safety integrity improvement was needed, apply the improvement anyway so that the MTTF of the SRS is increased by 30%, using the same methods as in the previous point
• In case the changes violate the system requirement specification, backtrace and perform the required changes in the specification.
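The calculation steps in the list above can be sketched as follows. Everything numeric here is hypothetical: the three components and their total failure rates are made-up placeholders, the 50%/90% split is the rule of thumb quoted in the list, and the improvement step duplicates the weakest block as a simple parallel pair while ignoring common cause failures.

```python
import math

HOURS_PER_YEAR = 365 * 24
FIT = 1e-9  # failures per hour

# Hypothetical total failure rates in FIT (illustrative only, not from the book).
components = {"sensor": 1000, "logic": 500, "actuator": 2000}


def lambda_du(total_fit, dangerous=0.5, detectable=0.9):
    """Dangerous undetectable rate (1/h) from the rule of thumb in the list."""
    return total_fit * FIT * dangerous * (1 - detectable)


rates = {name: lambda_du(fit) for name, fit in components.items()}

# Series RBD: failure rates add up.
lam_series = sum(rates.values())
reliability_10y = math.exp(-lam_series * 10 * HOURS_PER_YEAR)
mttf_series = 1 / lam_series

# Improvement: duplicate the weakest block (item-level redundancy, no CCF).
# R(t) = exp(-lam_rest*t) * (2*exp(-lam_weak*t) - exp(-2*lam_weak*t)),
# and integrating R(t) from 0 to infinity gives the MTTF below.
lam_weak = max(rates.values())
lam_rest = lam_series - lam_weak
mttf_improved = 2 / (lam_rest + lam_weak) - 1 / (lam_rest + 2 * lam_weak)
gain = mttf_improved / mttf_series - 1

print(f"lambda_du(series) = {lam_series:.3e} 1/h")
print(f"R(10 years) = {reliability_10y:.4f}")
print(f"MTTF: {mttf_series:.3e} h -> {mttf_improved:.3e} h (+{gain:.0%})")
```

With real component data, a beta-factor term for common cause failures would reduce the gain from redundancy, so the 30% target should be rechecked after including it.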
As a group, create the final presentation in which you will describe the updates to the specification, discuss the list of evidence, and introduce the safety functions and their requirements. Then, each team member shall describe his/her safety claim for his/her corresponding safety function and present his/her findings. It would be beneficial if you can present your findings live to your peers or your instructor. Spend 15 minutes on the presentation in total. Please strictly adhere to the time. For example, you can allot 5 minutes for the overall group part of the presentation, and 2 minutes per team member to describe the particular claims around each of the five safety functions. Feel free to organize the presentation of the overall group part however you find suitable (one or more people can present this part).
Required Output
• Updated system requirement specification, in the Chap. 2 template.
• PDF with the claim for your safety function, including:
(a) Architectural block diagram of the SRS
(b) Corresponding RBD
(c) Calculation steps and results
(d) Discussion around the achieved SIL
(e) Application of the SIL improvement method (revisited architecture, RBD, etc.)
(f) Recalculation steps and final results
• If you are using a calculation Excel sheet, feel free to provide it instead of writing all calculation details in the PDF (this is not mandatory, but then the calculation shall be presented in detail in the PDF).
• Final presentation (in ppt).

Submission Deadline Take 1 or 2 weeks to work the project out.
Assessment You or your peers (or your instructor) can assess your work. Take 20 points as the total possible score.

Up to 5 points can be allocated based on the group part of your submission and presentation, and will be the same for each group member. Up to 2 points can be given for the safety function requirements (2, mostly complete and correct; 1, incomplete or partially incorrect; 0, major flaws or not provided). Up to 3 points can be given for the safety case composition and the presented claims (3, mostly complete and correct; 2, at most one notable missing artifact in the list; 1, several aspects incorrect and/or missing; 0, mostly incomplete or incorrect).

Ten points can be assigned based on the individual submission part, with up to 2 points for each aspect (2, mostly complete and correct; 1, incomplete or partially incorrect; 0, completely incorrect or missing): 2 points for the architectural block diagram, 2 points for the RBD, 2 points for the SIL calculation, 2 points for the prescribed SIL improvement methods, and 2 points for the recalculation and final SIL achievement or reliability improvement.

Finally, 5 points can be assigned based on your presentation and argumentation (up to 3 points for convincing presentation proof and up to 2 points for the presentation content, specifically addressing the communication clarity and understandability of the presented material).
Sample Solution to the Project The team selected a forklift truck as their considered system and continued the previously started project (the full solution is available in digital form at sfs11.ex.nitinstitute.com). They gave the following presentation:
Required Evidence for the Safety Case
• System functions are defined, and hazards are identified and evaluated according to the functional safety standard (ISO 13849 in our case).
• Safety functions are defined to mitigate the hazards of system functions.
• The performance level that is allocated to the system function is inherited from the risk of the hazard.
• SRS is realized by respecting restrictions by architecture according to the functional safety standard.
• Dangerous undetectable random failures of SRS are quantified by failure rate and compared with values required by the functional safety standard.
• In case PL is not met, redundancy or other improvement methods may be applied, dangerous undetectable random failures are re-quantified, and it is proved that the SRS meets the PL according to the functional safety standard.
• Traceability from safety requirements and then through technical safety requirements, with regard to functional requirements they are related to, and then through safety design and implementation is established.
• Safety functions are implemented according to all provisions of the appropriate functional safety standard, including the process of its definition, implementation, and verification, PL requirements, as well as specifically prescribed implementation techniques (at the system, hardware, and software level). This applies especially to failures due to systematic faults, which cannot be quantified as random failures.
• SRS is verified on each level against test cases for compliance with the technical safety requirements.
• Compliance with quality safety requirements is demonstrated through various failure metrics derived from the used architectural elements and their composition (failure probability, reliability, failure rate, MTTF, safe failure fraction, diagnostic coverage, etc.), as well as from failure statistics from the field trials and stress tests.
• Traceability from test cases toward safety architecture and design is established.
• Safety processes and procedures defined by the company, which are aligned with the functional safety standard, are followed:
(a) Documentation is reviewed, and reports are provided as evidence of the completeness of documentation and correct traceability.
(b) Implementation is reviewed, and reports are provided as evidence of compliance with the functional safety standard.
(c) It is proved that there are no open items that can be traced to the hazards which exhibit intolerable risk.
• It is proved that the performance level of the system security cannot be compromised.
• Training for employees regarding safety processes and procedures defined by the company is periodically organized.
• Internal audits and audits of external authorities are performed periodically regarding safety processes and procedures that are defined within the company.

High-level requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| HLR_001 | Lifting | The forklift shall allow lifting and lowering of the load | Group 4 | 1.0 | – | FR_001, FR_002, FR_003, CR_001 |
| HLR_002 | Driving | The forklift shall provide the ability to be driven | Group 4 | 1.0 | – | FR_004, FR_005, FR_006, FR_007, FR_008, FR_009, FR_010, FR_011, CR_002 |
| HLR_003 | Tilting | The forklift shall allow tilting of the forks | Group 4 | 1.0 | – | FR_012, FR_013, FR_014, CR_003 |

Functional requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| FR_001 | Lifting the forks | The forklift shall move the forks up when the lever is pushed up | Group 4 | 1.0 | HLR_001 | |
| FR_002 | Lowering the forks | The forklift shall move the forks down when the lever is pulled down | Group 4 | 1.0 | HLR_001 | |
| FR_003 | Fixed fork height | The forks shall not move up or down when the lever is in a neutral position | Group 4 | 1.0 | HLR_001 | |
| FR_004 | Acceleration | The forklift shall accelerate when the gas pedal is pressed | Group 4 | 1.0 | HLR_002 | |
| FR_005 | Deceleration | The forklift shall decelerate when the brake pedal is pressed | Group 4 | 1.0 | HLR_002 | |
| FR_006 | Parking | The forklift shall not allow movement when the parking lever is pulled | Group 4 | 1.0 | HLR_002 | |
| FR_007 | Steering left | The forklift shall turn left when the steering wheel is turned left | Group 4 | 1.0 | HLR_002 | |
| FR_008 | Steering right | The forklift shall turn right when the steering wheel is turned right | Group 4 | 1.0 | HLR_002 | |
| FR_009 | Forward movement | The forklift shall move forward when the forward-reverse lever is in the forward position | Group 4 | 1.0 | HLR_002 | |
| FR_010 | Reverse movement | The forklift shall move in reverse when the forward-reverse lever is in the reverse position | Group 4 | 1.0 | HLR_002 | |
| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| FR_011 | Power supply | The engine shall burn propane gas to power the forklift | Group 4 | 1.0 | HLR_002 | |
| FR_012 | Tilt forward | The forks shall tilt forward when the tilt control lever is pushed forward | Group 4 | 1.0 | HLR_003 | |
| FR_013 | Tilt backward | The forks shall tilt backward when the tilt control lever is pulled backward | Group 4 | 1.0 | HLR_003 | |
| FR_014 | Fixed tilt angle | The tilt angle of the forks shall remain unchanged when the tilt control lever is in the neutral position | Group 4 | 1.0 | HLR_003 | |

Constraint requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| CR_001 | Load limit | The forklift shall lift the load with mass up to 2 metric tons | Group 4 | 1.0 | HLR_001 | |
| CR_002 | Maximum speed | The forklift shall not exceed the maximum speed of 10 km/h | Group 4 | 1.0 | HLR_002 | |
| CR_003 | Maximum tilt angle | The tilt angle of the forks shall not exceed ±10 degrees | Group 4 | 1.0 | HLR_003 | |

Safety requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| SR_001 | Blocked view | The collision warning system shall activate when driving forward while the driver view is blocked by the load | Group 4 | 1.0 | FR_009 | TSR_001, TSR_002, TSR_003, TSR_004, TSR_006, QSR_001, QSR_002 |
| SR_002 | Blind spot preview | Driver assistance system shall display areas around the vehicle which are obstructed from the view of the driver while driving in reverse and safely stop the vehicle if a probable collision is detected | Group 4 | 1.0 | FR_010 | TSR_003, TSR_004, TSR_005, TSR_006, QSR_001, QSR_003 |
| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| SR_003 | Forklift imbalance prevention | The forklift shall prevent vehicle tipping by monitoring and correcting the forklift balance | Group 4 | 1.0 | FR_001, FR_004, FR_005, FR_007, FR_008, FR_009, FR_010, FR_012, FR_013 | TSR_007, TSR_008, TSR_009, TSR_010, TSR_011, TSR_012, QSR_004 |
| SR_004 | Load balancing system | The forklift shall have a load balancing system that shall detect the load position relative to the forks and correct the position if over the threshold | Group 4 | 1.0 | FR_004, FR_005, FR_007, FR_008, FR_012, FR_013 | TSR_013, TSR_014, TSR_015, QSR_005 |
| SR_005 | Lift and tilt hydraulics safety | Wireless Hose Diagnostic Unit (HDU) shall continuously monitor hose assemblies using a 433 MHz frequency communication protocol and alert the user when the hose is damaged | Group 4 | 1.0 | FR_001, FR_002, FR_012, FR_013 | TSR_016, TSR_017, TSR_018, QSR_006 |
| SR_006 | Forklift fire safety | The forklift shall provide a fire extinguishing system that releases noncorrosive fire-extinguishing fluid in case of a potential fire near the engine | Group 4 | 1.0 | FR_011 | TSR_019, TSR_020, TSR_021, QSR_007 |
| SR_007 | Wearing protective gear | All operators within the system boundary shall wear protective gear | Group 4 | 1.0 | HLR_001, HLR_002, HLR_003 | – |
| SR_008 | Passing forklift safety training | All operators within the system boundary shall pass forklift safety training | Group 4 | 1.0 | HLR_001, HLR_002, HLR_003 | – |
| SR_009 | Seat belt | The forklift shall have a brightly colored seat belt that shall prevent the driver from falling out of the seat while driving | Group 4 | 1.0 | HLR_002 | – |
Technical safety requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| TSR_001 | Forward driving visual alarm | The visual alarm in the cabin shall activate when driving forward with a speed above 3 km/h while the driver's view is blocked by the load | Group 4 | 1.0 | SR_001 | – |
| TSR_002 | Forward driving sound alarm | The sound alarm system shall activate when driving forward with a speed above 5 km/h while the driver's view is blocked by the load | Group 4 | 1.0 | SR_001 | – |
| TSR_003 | Near object detection | The radar shall measure a distance from an object or pedestrian while driving | Group 4 | 1.0 | SR_001, SR_002 | – |
| TSR_004 | Environment preview | A monitor of the driver assistance system shall display the environment around the vehicle | Group 4 | 1.0 | SR_001, SR_002 | – |
| TSR_005 | Reverse driving visual alarm | A monitor for driver assistance shall display visual warning signals during reverse movement if the distance from the object is less than 3 m | Group 4 | 1.0 | SR_002 | – |
| TSR_006 | Collision avoidance | The forklift shall apply brakes and stop automatically if the driver assistance system detects a possible collision | Group 4 | 1.0 | SR_001, SR_002 | – |
| QSR_001 | Radar detection range | The radar shall monitor 360 degrees around the forklift | Group 4 | 1.0 | SR_001, SR_002 | – |
| QSR_002 | Blocked view performance level | The collision warning system shall be developed according to PL d | Group 4 | 1.0 | SR_001 | – |
| QSR_003 | Blind spot view performance level | The driver assistance system shall be developed according to PL d | Group 4 | 1.0 | SR_002 | – |
| TSR_007 | Forklift load position monitoring | The forklift shall be equipped with two gyroscopes, one positioned just between rear wheels and one just between forks | Group 4 | 1.0 | SR_003 | – |
| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| TSR_008 | Forklift load mass monitoring | The forklift forks shall be equipped with force sensors to monitor force due to load | Group 4 | 1.0 | SR_003 | – |
| TSR_009 | Forklift balance monitoring | The forklift shall be able to continuously monitor balance by using data from gyroscopes and force sensors | Group 4 | 1.0 | SR_003 | – |
| TSR_010 | Forklift imbalance detection | By monitoring the balance, the forklift shall be able to detect near-tipping events | Group 4 | 1.0 | SR_003 | – |
| TSR_011 | Forklift safe state transition due to imbalance detection | The forklift shall be put into the safe state if a near-tipping event is detected | Group 4 | 1.0 | SR_003 | – |
| TSR_012 | Forklift safe state due to imbalance protection | The forklift safe state shall cut off the lift and tilt control from the driver, take over lowering and backward tilting to the initial position, and turn on the emergency alarm | Group 4 | 1.0 | SR_003 | – |
| QSR_004 | Forklift imbalance prevention performance level | The forklift balance monitoring and imbalance prevention shall be developed according to PL d | Group 4 | 1.0 | SR_003 | – |
| TSR_013 | Load imbalance monitoring | The load balancing system shall monitor the load tilt relative to the forks with sensors placed at each fork | Group 4 | 1.0 | SR_004 | – |
| TSR_014 | Load imbalance detection | The load balancing system shall detect an imbalance of the load if the tilt of the load is above the threshold angle of 5 degrees | Group 4 | 1.0 | SR_004 | – |
| TSR_015 | Load imbalance correction | The load balancing system shall correct load imbalance by activating the fork extensions that shall push the load toward the equilibrium position | Group 4 | 1.0 | SR_004 | – |
178
ID QSR_005
TSR_016
11
Name Load balancing system level Hydraulic hose state updates
TSR_017
Hydraulic hose sensors
TSR_018
Wireless HDU
QSR_006
Hydraulic hose monitoring system performance level Forklift fire sensor
TSR_019
TSR_020
Forklift fire suppression display
TSR_021
Automatic fire suppression activation
QSR_007
Forklift fire suppression performance level
Practical SIL Calculation
Description The load balancing system shall be developed according to PL d Wireless Hose Diagnostic Unit shall transmit performance data regularly to provide SMS text and email messages to signal impending hose failure Sensors shall monitor and detect potential issues and transmit data to the HDU The wireless HDU shall update the server when the sensors have signaled impending hose failure The forklift hydraulics system shall be developed according to PL d
Author Group 4
Ver. 1.0
Derived from SR_004
Used by –
Group 4
1.0
SR_005
–
Group 4
1.0
SR_005
–
Group 4
1.0
SR_005
–
Group 4
1.0
SR_005
–
The forklift shall be equipped with a sensor that detects a fire The forklift shall be equipped with an interactive display that allows the driver to activate the fire suppression manually The forklift shall automatically activate the release of fireextinguishing fluid once a potential fire is detected by the sensor The forklift fire suppression system shall be developed according to PL d
Group 4
1.0
SR_006
–
Group 4
1.0
SR_006
–
Group 4
1.0
SR_006
–
Group 4
1.0
SR_006
–
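The "Derived from" column is what makes the requirements table traceable: every technical (TSR) and quality (QSR) requirement points back to at least one safety requirement (SR). A minimal sketch of how that trace could be checked mechanically is shown below. The dictionary copies only a subset of the table rows, the helper names `trace_map` and `untraced` are hypothetical, and the rule that each SR should have at least one TSR and one QSR child is an assumption made for illustration, not a rule stated in the exercise.

```python
from collections import defaultdict

# Subset of the requirements table: requirement ID -> parent SR IDs ("Derived from").
DERIVED_FROM = {
    "TSR_001": ["SR_001"],
    "TSR_003": ["SR_001", "SR_002"],
    "TSR_005": ["SR_002"],
    "QSR_002": ["SR_001"],
    "QSR_003": ["SR_002"],
    "TSR_007": ["SR_003"],
    "QSR_004": ["SR_003"],
    "TSR_013": ["SR_004"],
    "QSR_005": ["SR_004"],
    "TSR_016": ["SR_005"],
    "QSR_006": ["SR_005"],
    "TSR_019": ["SR_006"],
    "QSR_007": ["SR_006"],
}


def trace_map(derived_from):
    """Invert the child -> parents mapping into parent -> sorted list of children."""
    children = defaultdict(list)
    for child, parents in derived_from.items():
        for parent in parents:
            children[parent].append(child)
    return {parent: sorted(kids) for parent, kids in children.items()}


def untraced(safety_requirements, derived_from):
    """Return the SRs that lack a TSR child or a QSR child (assumed completeness rule)."""
    forward = trace_map(derived_from)
    gaps = []
    for sr in safety_requirements:
        kids = forward.get(sr, [])
        has_tsr = any(kid.startswith("TSR_") for kid in kids)
        has_qsr = any(kid.startswith("QSR_") for kid in kids)
        if not (has_tsr and has_qsr):
            gaps.append(sr)
    return gaps


if __name__ == "__main__":
    srs = [f"SR_{i:03d}" for i in range(1, 7)]  # SR_001 .. SR_006
    print("children of SR_003:", trace_map(DERIVED_FROM)["SR_003"])
    print("SRs with trace gaps:", untraced(srs, DERIVED_FROM))
```

The same inverted map could be used to fill the "Used by" column of the parent requirements, which is left empty in the table.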
[Figure: Architectural block diagram, original and improved designs. Blocks shown: gyroscopes, force sensors, logic, I/O, relays, lift hydraulic cylinder, tilt hydraulic cylinder.]
[Figure: Reliability block diagram, original and improved designs. Blocks shown: gyroscopes, force sensors, logic, I/O, relays, lift hydraulic cylinder, tilt hydraulic cylinder, and common cause failure (CCF) blocks.]
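The original and improved reliability block diagrams differ in how much redundancy they contain and in the CCF blocks placed in series with the redundant groups. As a rough illustration of how such a diagram could be evaluated, the sketch below assumes constant failure rates, so that R(t) = exp(-λt), and a beta-factor split of each channel's failure rate into an independent part and a common-cause part. The failure rates, the beta value, the mission time, and the simplified two-block structure are all hypothetical and are not taken from the exercise.

```python
import math


def reliability(failure_rate, hours):
    """R(t) = exp(-lambda * t) for a constant failure rate given per hour."""
    return math.exp(-failure_rate * hours)


def series(*blocks):
    """Series structure: every block must survive."""
    result = 1.0
    for r in blocks:
        result *= r
    return result


def parallel(*blocks):
    """Parallel structure: at least one block must survive."""
    all_fail = 1.0
    for r in blocks:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail


def redundant_pair(failure_rate, beta, hours):
    """Two redundant channels with a beta-factor CCF block in series.

    The fraction beta of the failure rate is treated as common cause and
    defeats both channels at once; the rest fails independently.
    """
    independent = reliability((1.0 - beta) * failure_rate, hours)
    common_cause = reliability(beta * failure_rate, hours)
    return series(parallel(independent, independent), common_cause)


# Hypothetical values, chosen only to make the comparison visible.
HOURS = 8760.0          # one year of operation
LAMBDA_SENSOR = 2e-6    # failures per hour, sensor channel
LAMBDA_LOGIC = 1e-6     # failures per hour, logic block
BETA = 0.1              # assumed common-cause fraction

single = series(reliability(LAMBDA_SENSOR, HOURS),
                reliability(LAMBDA_LOGIC, HOURS))
improved = series(redundant_pair(LAMBDA_SENSOR, BETA, HOURS),
                  reliability(LAMBDA_LOGIC, HOURS))

print(f"single sensor channel: {single:.6f}")
print(f"redundant sensor pair: {improved:.6f}")
```

With beta equal to 1 the redundant pair collapses to a single channel, which is why the common-cause fraction limits how much reliability the added sensors and relays can buy.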
Chapter 12
System Safety Checklist
Video Lesson
This chapter has a corresponding video lesson: sfs12.nit-institute.com
Lecture Notes
To make sure that our organization is capable of developing a safe system, and that our technical system will be safe in the end, this lecture provides the final checklist needed to close the safety circle. The safety circle consists of six top-level areas. Three are expertise- and knowledge-based: knowing the system, knowing how to perform hazard and risk evaluation correctly, and knowing the various safety prescriptions which assure the safety of the technical system implementation. Three are organizational or process-based: applying an inherent safety process, performing audits and relevant certifications, and living safety by nurturing the safety culture in the company (Fig. 12.1).
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_12
Fig. 12.1 Safety circle with all the items which need to be closed to achieve safety. Labels in the figure: Knowing the system, Hazards and Risk Evaluation, Safety prescriptions, FuSa, SySa, Inherent safety process, V&V, Certification and audits, Living safety.
Knowing the system requires knowledge of the system environment as well as knowledge of the system requirements. System delineation separates the system from its environment. Only if we know what the system does, in the form of requirements, can we know what can go wrong with it. By knowing the environment, we understand who and what is affected by the operation of our system.
Hazard and risk evaluation requires knowledge of the methods used to identify hazards and to evaluate risk for acceptability as completely as possible (e.g., FFA, HAZOP, FMEA, FTA, ETA). Risk acceptability is defined by the standards or by the justification of tolerable risk. Verifiability needs to be assured, so that the analysis can be argued by using references/sources and the selected values, e.g., risk categories, are plausible. Human factors must be regarded when hazards are identified: misuse, including reasonably foreseeable misuse, must be considered as an actuating factor for a hazard, and it often stems from the complexity of the user interface or a lack of training. Humans tend to habituate actions and disregard new and changed procedures. Humans are also often inattentive, and inattentiveness can come from paying too much attention to one aspect while completely overlooking another; therefore, balance in the human interface design must be sought.
Safety prescriptions include the knowledge of basic concepts and terminology from system and safety engineering, as well as functional safety with respect to active safety measures. Which prescription needs to be applied, and how, is usually regulated by the authorities of the area or country where the system is deployed. Many fields have prescribed safety standards, which need to be examined and, if relevant and/or required, applied to the system design.
Security aspects deal with freedom from attack (intrusions), which needs to be provided to assure safety (freedom from harm). Any technical element or process in the system, if compromised by an attacker, can decrease the safety integrity of the system and, therefore, make the system unsafe. Confidentiality in access to all system artifacts and elements must be assured to make the system both safe and secure. Security does not always imply safety (e.g., security measures might disable or delay some safety-critical access in case of incidents). Conversely, safety measures by themselves can compromise security (e.g., unlocking gates to prevent accidents due to fire hazards allows intruders to access the premises without authentication). These aspects require simultaneous regard for both security and safety, which go hand in hand.
Software usually participates in the implementation of a safety function, and we must make sure that its design is given proper attention in the safety sense. System safety evaluation needs to include the evaluation of software safety integrity, making sure that all software engineering and system engineering processes are strictly followed for the prevention of systematic faults (bugs).
An inherent safety process is required to deal with safety in a proactive way. The safety life cycle needs to be followed along the V-model, tapping into the project management and system engineering processes (PHI, FHE, SSE, OpSSE). General quality management principles (standards and process models, such as ISO 9001 and ASPICE) are usually required and strictly monitored in safety-critical system development, including quality assurance (QA) methods through traceability and explicit verification and validation phases. Companies using the inherent safety process need to define safety roles, appoint the appropriate personnel, and give them essential decision powers. Safety roles alone are not sufficient for safety; they only complement the well-established safety processes in the company!
Audits are important: we need processes to check whether we continuously perform all the aforementioned procedures and prescriptions for our system and within our company. External audit and certification then allow us to prove our capabilities and the safety of the released systems to others.
Finally, companies need to nurture a safety culture. First, the management of the company needs to understand all safety aspects and be educated in that regard, not prioritizing the company's monetary performance without paying meticulous attention to impacts on safety. Management shall enforce the safety roles and encourage those who hold them to question all management decisions and escalate properly. Safety culture shall propagate throughout the company, requiring the definition of safety objectives and having safety procedures/principles disseminated and adopted by all employees, and not only by the safety and QA departments! A proactive approach to safety, even in thinking and attitude (will this action that I make impact safety?), is essential to company success: safety is everyone's responsibility!
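The six areas of the safety circle lend themselves to a simple checklist structure. The sketch below is one hypothetical way to record which areas are closed. The class `Area` and the function `open_areas` are illustrative names, the item wording paraphrases the lecture notes, and the True/False states are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Area:
    """One of the six top-level areas of the safety circle."""
    name: str
    kind: str  # "knowledge" or "process"
    items: dict = field(default_factory=dict)  # checklist item -> closed (True/False)

    def is_closed(self) -> bool:
        # An area without any items counts as open: nothing has been shown yet.
        return bool(self.items) and all(self.items.values())


def open_areas(areas):
    """Names of the areas that still have at least one open item."""
    return [area.name for area in areas if not area.is_closed()]


# Hypothetical project status; only the area names come from the lecture notes.
SAFETY_CIRCLE = [
    Area("Knowing the system", "knowledge",
         {"environment understood": True, "requirements known": True}),
    Area("Hazard and risk evaluation", "knowledge",
         {"hazards identified": True, "human factors considered": False}),
    Area("Safety prescriptions", "knowledge",
         {"applicable standards examined": True, "security aspects regarded": True}),
    Area("Inherent safety process", "process",
         {"safety life cycle followed": True, "safety roles appointed": True}),
    Area("Certification and audits", "process",
         {"internal audits performed": True, "external certification obtained": False}),
    Area("Living safety", "process",
         {"management educated in safety": True, "safety objectives defined": True}),
]

if __name__ == "__main__":
    for name in open_areas(SAFETY_CIRCLE):
        print("still open:", name)
```

Treating an area as closed only when every item is closed mirrors the idea that the circle is closed only when all six areas are covered.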
Self-assessment
Now take the time to self-assess your knowledge about all the required safety aspects by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book.
1. If the system is removed from one original environment and placed into another environment, its safety properties remain the same.
2. A user interface element that requires too much user attention to be operated properly can be an actuating factor for a hazard.
3. Misuse of the system does not need to be regarded if we define specific training and prescribe procedures for system operation.
4. Safety prescriptions are first applied according to the safety standards, and only after that with regard to the regulations of the authorities in the area in which the system shall be deployed.
5. When proving the safety integrity of the system, we must prove that the security of the system cannot be compromised.
6. Together with the enforced system integrity, security requires confidentiality and availability of the system to be maintained.
7. Software debugging and bug reporting by users after the system release is an essential practice to make sure the system is safe.
8. Quality management and the application of process models, as in ISO 9001 and ASPICE, are required together with the inherent safety process, to make sure that the company is capable of developing safe systems.
9. The safety manager in the company is fully responsible for the safety of the developed system.
10. Safety culture in the company needs to be nurtured and starts from the company management, who shall not prioritize the monetary performance of the company over the unacceptable impact that this might have on safety.
Self-assessment Key
1. False
2. True
3. False
4. False
5. True
6. True
7. False
8. True
9. False
10. True
Bibliography
An Introduction to Functional Safety and IEC 61508, Application Note AN9025, MTL Instruments Group, 2002.
Automotive SPICE, Process Reference Model, Process Assessment Model, v3.1, VDA QMC Working Group, 2017.
Bjelica, Milan. My Big, Fat, Safe Software Stack: Functional Safety for Complex Software for Next-Generation Vehicles. 7th Conference on the Engineering of Computer Based Systems, 2021.
Blanchard, Benjamin S. System engineering management. Wiley, 2004.
BKCASE Editorial Board. The Guide to the Systems Engineering Body of Knowledge (SEBoK), v. 1.3. R.D. Adcock (EIC). Hoboken, NJ: The Trustees of the Stevens Institute of Technology, 2014.
Dubrova, Elena. Fault-tolerant design. New York: Springer, 2013.
Ericson, Clifton A. Hazard Analysis Techniques for System Safety, 2nd Edition, 2015.
FIDES guide 2009: Reliability Methodology for Electronic Systems, Edition A, FIDES Consortium, September 2010.
Gell-Mann, Murray. What is complexity? Complexity and industrial clusters. Physica-Verlag HD, 2002. 13–24.
IEC 61508: Functional Safety of Electrical/Electronic/Programmable Electronic Safety-related Systems, IEC, Geneva, 2010.
IEC 62061:2021, Safety of machinery – Functional safety of safety-related control systems, IEC, Geneva, 2021.
IEEE/ISO/IEC 15288-2008, ISO/IEC/IEEE International Standard – Systems and software engineering – System life cycle processes, 2008.
INCOSE Systems Engineering Handbook v. 3.2, INCOSE-TP-2003-002-03.2, January 2010.
ISO 13849-1:2015, Safety of machinery – Safety-related parts of control systems, ISO, Geneva, 2015.
ISO 26262: Road vehicles – Functional safety. 2nd ed., parts 01-12. ISO, Geneva, 2018.
ISO 9001:2015-11: Quality management systems – Requirements, ISO, Geneva, 2015.
Le Guen, Jean, and Risk Assessment Policy Unit. Reducing risks, protecting people. 1999.
Manson, Steven M. Simplifying complexity: a review of complexity theory. Geoforum 32.3 (2001): 405-414.
MIL-STD-882E: Department of Defense Standard Practice – System Safety, 2012.
Rausand, M. and Hoyland, A. System Reliability Theory: Models, Statistical Methods, and Applications, 2nd edition. Wiley, 2004.
Reason, James. Human error: models and management. BMJ 320.7237 (2000): 768-770.
Roland, Harold E., and Brian Moriarty. System safety engineering and management. Wiley, 1991.
Sebron, Walter, Hans Tschürtz, and Peter Krebs. The shell model – a method for system boundary analysis. European Conference on Software Process Improvement. Springer, Cham, 2018.