Lecture Notes in Networks and Systems 268
Matteo Zallio Carlos Raymundo Ibañez Jesus Hechavarria Hernandez Editors
Advances in Human Factors in Robots, Unmanned Systems and Cybersecurity Proceedings of the AHFE 2021 Virtual Conferences on Human Factors in Robots, Drones and Unmanned Systems, and Human Factors in Cybersecurity, July 25–29, 2021, USA
Lecture Notes in Networks and Systems Volume 268
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas— UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/15179
Editors Matteo Zallio Department of Engineering University of Cambridge Cambridge, UK
Carlos Raymundo Ibañez Department of Engineering Peruvian University of Applied Sciences Lima, Peru
Jesus Hechavarria Hernandez Universidad de Guayaquil Guayaquil, Ecuador
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-3-030-79996-0 ISBN 978-3-030-79997-7 (eBook) https://doi.org/10.1007/978-3-030-79997-7 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Advances in Human Factors and Ergonomics 2021
AHFE 2021 Series Editors Tareq Z. Ahram, Florida, USA Waldemar Karwowski, Florida, USA
12th International Conference on Applied Human Factors and Ergonomics and the Affiliated Conferences (AHFE 2021) Proceedings of the AHFE 2021 Virtual Conferences on Human Factors in Robots, Drones and Unmanned Systems, and Human Factors in Cybersecurity, July 25–29, 2021, USA.
Title: Editors
Advances in Neuroergonomics and Cognitive Engineering: Hasan Ayaz, Umer Asgher and Lucas Paletta
Advances in Industrial Design: Cliff Sungsoo Shin, Giuseppe Di Bucchianico, Shuichi Fukuda, Yong-Gyun Ghim, Gianni Montagna and Cristina Carvalho
Advances in Ergonomics in Design: Francisco Rebelo
Advances in Safety Management and Human Performance: Pedro M. Arezes and Ronald L. Boring
Advances in Human Factors and Ergonomics in Healthcare and Medical Devices: Jay Kalra, Nancy J. Lightner and Redha Taiar
Advances in Simulation and Digital Human Modeling: Julia L. Wright, Daniel Barber, Sofia Scataglin and Sudhakar L. Rajulu
Advances in Human Factors and System Interactions: Isabel L. Nunes
Advances in the Human Side of Service Engineering: Christine Leitner, Walter Ganz, Debra Satterfield and Clara Bassano
Advances in Human Factors, Business Management and Leadership: Jussi Ilari Kantola, Salman Nazir and Vesa Salminen
Advances in Human Factors in Robots, Unmanned Systems and Cybersecurity: Matteo Zallio, Carlos Raymundo Ibañez and Jesus Hechavarria Hernandez
Advances in Human Factors in Training, Education, and Learning Sciences: Salman Nazir, Tareq Z. Ahram and Waldemar Karwowski
Advances in Human Aspects of Transportation: Neville Stanton
Advances in Artificial Intelligence, Software and Systems Engineering: Tareq Z. Ahram, Waldemar Karwowski and Jay Kalra
Advances in Human Factors in Architecture, Sustainable Urban Planning and Infrastructure: Jerzy Charytonowicz, Alicja Maciejko and Christianne S. Falcão
Advances in Physical, Social & Occupational Ergonomics: Ravindra S. Goonetilleke, Shuping Xiong, Henrijs Kalkis, Zenija Roja, Waldemar Karwowski and Atsuo Murata
Advances in Manufacturing, Production Management and Process Control: Stefan Trzcielinski, Beata Mrugalska, Waldemar Karwowski, Emilio Rossi and Massimo Di Nicolantonio
Advances in Usability, User Experience, Wearable and Assistive Technology: Tareq Z. Ahram and Christianne S. Falcão
Advances in Creativity, Innovation, Entrepreneurship and Communication of Design: Evangelos Markopoulos, Ravindra S. Goonetilleke, Amic G. Ho and Yan Luximon
Advances in Human Dynamics for the Development of Contemporary Societies: Daniel Raposo, Nuno Martins and Daniel Brandão
Preface
This book deals with two areas of critical importance both in the digital society and in the field of human factors: “Robots, Drones and Unmanned Systems” and “Human Factors in Cybersecurity”. Researchers are conducting cutting-edge investigations in the area of unmanned systems to inform and improve how humans interact with robotic platforms. Many of the efforts focused on refining the underlying algorithms that define system operation and on revolutionizing the design of human–system interfaces. The multi-faceted goals of this research are to improve ease of use, learnability, suitability, interaction, and human–system performance, which in turn will reduce the number of personnel hours and dedicated resources necessary to train, operate, and maintain the systems. As our dependence on unmanned systems grows along with the desire to reduce the manpower needed to operate them across both the military and the commercial sectors, it becomes increasingly critical that system designs are safe, efficient, and effective and provide humans with reliable solutions to daily challenges. Optimizing human–robot interaction and reducing cognitive workload at the user interface require research emphasis to understand what information the operator requires, when they require it, and in what form it should be presented, so they can intervene and take control of unmanned platforms when it is necessary. With a reduction in manpower, each individual’s role in system operation becomes even more important to the overall success of the mission or task at hand. Researchers are developing theories as well as prototype user interfaces to understand how best to support human–system interaction in complex operational environments. 
Because humans tend to be the most flexible and integral part of unmanned systems, the human factors and unmanned systems’ focus considers the role of the human early in the design and development process in order to facilitate the design of effective human–system interaction and teaming. This book addresses a variety of professionals, researchers, and students in the broad field of robotics, drones, and unmanned systems who are interested in the design of multi-sensory user interfaces (auditory, visual, and haptic), user-centered design, and task–function allocation when using artificial intelligence/automation to offset cognitive workload for the human operator.
This book additionally deals with the role of human factors in cybersecurity. It is in fact the human element that makes cyberspace complex and adaptive. According to international cybersecurity reports, people are both an essential part of the cybersecurity challenge and part of its solution. Cyber-intrusions and attacks have increased dramatically over the last decade, exposing sensitive personal and business information, disrupting critical operations, and imposing high costs on the economy. Therefore, understanding how people behave in the digital environment and investigating the role of human error in security attacks are fundamental for developing an effective approach to cybersecurity in a variety of contexts. This book gathers studies on the social, economic, and behavioral aspects of cyberspace and reports on technical and analytical tools for increasing cybersecurity. It describes new educational and training methods for management and employees aimed at raising cybersecurity awareness. It discusses key psychological and organizational factors influencing cybersecurity. Additionally, it offers a comprehensive perspective on ways to manage cybersecurity risks for a range of different organizations and individuals, presenting inclusive, multidisciplinary, and integrated user-centered design approaches combining technical and behavioral elements. As editors, we hope its informative content will provide inspiration, leading the reader to formulate new, innovative research questions, applications, and potential solutions for creating effective human-centered solutions by teaming with robots and unmanned systems. Contributions have been organized into five sections:
Human Factors in Robots, Drones and Unmanned Systems
1. Human Factors and Unmanned Aerial Vehicles
2. Robots in Transportation Systems
3. Drones, Robots and Humanized Behaviors
4. Robotic Systems for Social Interactions

Cybersecurity
5. Human Factors in Cybersecurity

Each section contains research papers that have been reviewed by members of the International Editorial Board. Our sincere thanks and appreciation to the board members as listed below:
Human Factors in Robots, Drones and Unmanned Systems P. Bonato, USA R. Brewer, USA G. Calhoun, USA R. Clothier, Australia N. Cooke, USA L. Elliott, USA
K. Estabridis, USA D. Ferris, USA J. Fraczek, Poland J. Geeseman, USA J. Gratch, USA S. Hill, USA E. Holder, USA M. Hou, Canada L. Huang, USA C. Johnson, UK M. LaFiandra, USA S. Lakhmani, USA J. Lyons, USA K. Neville, USA J. Norris, USA J. Pons, Spain C. Stokes, USA P. Stütz, Germany R. Taiar, France J. Thomas, USA A. Trujillo, USA A. Tvaryanas, USA H. Van der Kooij, The Netherlands D. Vincenzi, USA E. Vorm, USA H. Widlroither, Germany H. Zhou, UK
Cybersecurity P. Aggarwal, USA M. Bashir, USA R. Buckland, Australia A. Burov, Ukraine B. Caulkins, USA R. Chadha, USA G. Denker, USA V. Dutt, India F. Greitzer, USA E. Huber, Austria J. Jones, USA A. Moallem, USA P. Morgan, UK D. Nicholson, USA W. Patterson, USA
J. Still, USA A. Tall, USA M. Ter Louw, USA E. Whitaker, USA

July 2021
Matteo Zallio Carlos Raymundo Ibañez Jesus Hechavarria Hernandez
Contents
Human Factors and Unmanned Aerial Vehicles

Concept for Cross-platform Delegation of Heterogeneous UAVs in a MUM-T Environment
  Siegfried Maier and Axel Schulte
Swarms, Teams, or Choirs? Metaphors in Multi-UAV Systems Design
  Oscar Bjurling, Mattias Arvola, and Tom Ziemke
Visual Communication with UAS: Estimating Parameters for Gestural Transmission of Task Descriptions
  Alexander Schelle and Peter Stütz
A Distributed Mission-Planning Framework for Shared UAV Use in Multi-operator MUM-T Applications
  Gunar Roth and Axel Schulte

Robots in Transportation Systems

Conditional Behavior: Human Delegation Mode for Unmanned Vehicles Under Selective Datalink Availability
  Carsten Meyer and Axel Schulte
Lethal Autonomous Weapon Systems: An Advocacy Paper
  Guermantes Lailari
Measuring the Impact of a Navigation Aid in Unmanned Ship Handling via a Shore Control Center
  Gökay Yayla, Chris Christofakis, Stijn Storms, Tim Catoor, Paolo Pilozzi, Yogang Singh, Gerben Peeters, Muhammad Raheel Afzal, Senne Van Baelen, Dimiter Holm, Robrecht Louw, and Peter Slaets
A Computational Assessment of Ergonomics in an Industrial Human-Robot Collaboration Workplace Using System Dynamics
  Guilherme Deola Borges, Rafael Ariente Neto, Diego Luiz de Mattos, Eugenio Andres Diaz Merino, Paula Carneiro, and Pedro Arezes
A New Modular Intensive Design Solution for ROVs
  Qianqian Jing, Jing Luo, and Yunhui Li

Drones, Robots and Humanized Behaviors

Designing for the Unknown: Using Structured Analysis and Design Technique (SADT) to Create a Pilot Domain for a Shore Control Centre for Autonomous Ships
  Dag Rutledal
Reporting of Ethical Conduct in Human-Robot Interaction Research
  Julia Rosén, Jessica Lindblom, and Erik Billing
Exploratory Analysis of Research Publications on Robotics in Costa Rica Main Public Universities
  Juan Pablo Chaverri, Adrián Vega, Kryscia Ramírez-Benavides, Ariel Mora, and Luis Guerrero
A Century of Humanoid Robotics in Cinema: A Design-Driven Review
  Niccolò Casiddu, Claudia Porfirione, Francesco Burlando, and Annapaola Vacanti
RESPONDRONE - A Multi-UAS Platform to Support Situation Assessment and Decision Making for First Responders
  Max Friedrich, Satenik Mnatsakanyan, David Kocharov, and Joonas Lieb

Robotic Systems for Social Interactions

Exploring Resilience and Cohesion in Human-Autonomy Teams: Models and Measurement
  Samantha Berg, Catherine Neubauer, Christa Robison, Christopher Kroninger, Kristin E. Schaefer, and Andrea Krausman
Multi-modal Emotion Recognition for User Adaptation in Social Robots
  Michael Schiffmann, Aniella Thoma, and Anja Richert
Robot Design Needs Users: A Co-design Approach to HRI
  Francesco Burlando, Xavier Ferrari Tumay, and Annapaola Vacanti
Social Robotic Platform to Strengthen Literacy Skills
  Mireya Zapata, Jacqueline Gordón, Andrés Caicedo, and Jorge Alvarez-Tello
Structural Bionic Method of Climbing Robot Based on Video Key Frame Extraction
  Xinxiong Liu and Yue Sun
Prototype System for Control the ScorBot ER-4U Robotic Arm Using Free Tools
  Elizabeth Chávez-Chica, Jorge Buele, Franklin W. Salazar, and José Varela-Aldás

Human Factors in Cybersecurity

Detecting Cyberattacks Using Linguistic Analysis
  Wayne Patterson
Digital Image Forensics Using Hexadecimal Image Analysis
  Gina Fossati, Anmol Agarwal, and Ebru Celikel Cankaya
Identifying Soft Biometric Features from a Combination of Keystroke and Mouse Dynamics
  Sally Earl, James Campbell, and Oliver Buckley
CyberSecurity Privacy Risks
  Linda R. Wilbanks
Sharing Photos on Social Media: Visual Attention Affects Real-World Decision Making
  Shawn E. Fagan, Lauren Wade, Kurt Hugenberg, Apu Kapadia, and Bennett I. Bertenthal
Prosthetic Face Makeups and Detection
  Yang Cai
Analysis of Risks to Data Privacy for Family Units in Many Countries
  Wayne Patterson
Exploring Understanding and Usage of Two-Factor Authentication Account Recovery
  Jeremiah D. Still and Lauren N. Tiller
Strategies of Naive Software Reverse Engineering: A Qualitative Analysis
  Salsabil Hamadache, Markus Krause, and Malte Elson
How Safely Do We Behave Online? An Explanatory Study into the Cybersecurity Behaviors of Dutch Citizens
  Rick van der Kleij, Susanne van ’t Hoff-De Goede, Steve van de Weijer, and Rutger Leukfeldt
Evaluating the BOLT Application: Supporting Human Observation, Metrics, and Cognitive Work Analysis
  Aryn Pyke, Ben Barone, Blaine Hoffman, Michael Kozak, and Norbou Buchler
Vulnerability Analysis Through Ethical Hacking Techniques
  Ángel Rolando Delgado-Pilozo, Viviana Belen Demera-Centeno, and Elba Tatiana Zambrano-Solorzano
User Perceptions of Phishing Consequence Severity and Likelihood, and Implications for Warning Message Design
  Eleanor K. Foster, Keith S. Jones, Miriam E. Armstrong, and Akbar S. Namin

Author Index
Human Factors and Unmanned Aerial Vehicles
Concept for Cross-platform Delegation of Heterogeneous UAVs in a MUM-T Environment

Siegfried Maier and Axel Schulte

Institute of Flight Systems, Universität Der Bundeswehr Munich, Werner-Heisenberg-Weg 39, 85577 Neubiberg, Germany
{siegfried.maier,axel.schulte}@unibw.de
Abstract. In this article we present a new approach enabling human pilots to delegate and negotiate tasks with other human pilots in Manned-Unmanned Teaming (MUM-T) missions. This incorporates a concept for cross-platform delegation of tasks during missions, bridging different hierarchical leadership levels. So far, we focused on a single human user guiding several unmanned systems. Within this scope we will consider human cooperation to enable the delegation of tasks between several MUM-T compounds on a systems-of-systems level. The methods we already use to delegate tasks within a single MUM-T package will now be extended by delegating tasks from one system to other systems, coordination between packages and situation-dependent deployment of UAVs from another MUM-T compound, all supervised and coordinated by the highest hierarchical instance in the overall structure. Results from initial expert feedback sessions showed that the presented concept represents an early, but already valid approach. However, revisions are still needed concerning the interaction between the human users and the technical functions.

Keywords: Cross-platform · Mission planning · Task delegation · Manned-unmanned teaming
1 Introduction and Background

MUM-T missions employ teams of manned and unmanned systems. These compounds consist of at least one manned vehicle and at least one unmanned vehicle, both controlled by the human cockpit crew aboard the manned aircraft. One field of research deals with the mission planning and management of several unmanned vehicles or aircraft by a single human during the execution of highly dynamic missions [1, 2]. As a starting point for the conception and design of a MUM-T technology solution, we conduct a work process analysis followed by a work system cognitive design according to [3]. The work system notation provides a graphical and semantic description language to create a top-level system design for complex Human-Autonomy Teaming (HAT) systems. This results in the work process for MUM-T missions shown in Fig. 1.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 3–9, 2021. https://doi.org/10.1007/978-3-030-79997-7_1
Fig. 1. Work process for MUM-T missions
The Work Process (WProc) represents the MUM-T Mission to be performed. It receives its Work Objective (WObj), the “MUM-T Mission Objective”, from the Command and Control (C2) center. The WProc is embedded in a Work Environment (WEnv), which provides various inputs in the form of information from the physical and tactical environment. The information or physical effects generated by the WProc, called the Work Process Output (WPOut), are the Mission Result. The initial Work System (WSys) design corresponding to the WProc defined in Fig. 1 is shown in Fig. 2.
Fig. 2. Work system for task based guidance
A WSys comprises two different roles, the Worker and the Tools. The Worker knows, understands, and pursues the WObj on its own initiative. The Tools receive tasks from the Worker and only act when told to do so. In most cases, Tools are conventional automated systems, such as vehicles including conventional automation. In addition, a WSys may contain Cognitive Agents, which understand the tasks delegated by Workers and can realize those tasks by using their dedicated Tools. Currently, we investigate the guidance of several Unmanned Aerial Vehicles (UAVs) from the cockpit of aerial vehicles in the fast-jet and helicopter domain [4, 5]. In both domains, the cockpit crew is responsible for delegating tasks to the unmanned team members and for monitoring the mission execution, in addition to the conventional pilot tasks. Therefore, we developed our Task-Based Guidance approach according to [6], in which the cockpit crew delegates tasks to the cognitive agents on board the UAVs using dedicated interfaces. In previous studies, our work focused on guiding a small number of UAVs from aboard a single manned platform [7]. Here the pilot specifies the tasks to be performed during the mission, assigns them to his unmanned team members with the help of a Tasking Interface (TI) and creates a mission plan. The result is displayed in the same interface that is used for task delegation. The described approach was implemented in our research simulator and experimentally evaluated using a single fast-jet with a single-seat cockpit. The experiments were performed with pilots of the German Air Force. The exact execution of the experiment and the results are described in [8].
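The three roles described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names mirror the roles in the text, while the method names and the fixed two-step task decomposition are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tool:
    """Conventional automation: acts only when it is commanded."""
    name: str
    log: List[str] = field(default_factory=list)

    def execute(self, command: str) -> str:
        self.log.append(command)
        return f"{self.name} executed '{command}'"


@dataclass
class CognitiveAgent:
    """Understands a delegated task and realizes it with its dedicated tool."""
    name: str
    tool: Tool

    def perform(self, task: str) -> List[str]:
        # Hypothetical decomposition: every task becomes a transit and an action.
        commands = [f"fly to area of {task}", f"carry out {task}"]
        return [self.tool.execute(c) for c in commands]


@dataclass
class Worker:
    """Knows and pursues the work objective on its own initiative."""
    name: str
    objective: str

    def delegate(self, task: str, agent: CognitiveAgent) -> List[str]:
        return agent.perform(task)


uav = Tool("UAV-1")
agent = CognitiveAgent("Agent-1", uav)
pilot = Worker("Pilot", objective="MUM-T Mission Objective")
results = pilot.delegate("reconnaissance", agent)
```

The sketch keeps the asymmetry described in the text: only the Worker holds an objective, the agent interprets tasks, and the tool merely executes commands.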
2 Problem Definition

In large-scale military air operations, several teams (also called packages) consisting of different aircraft with different subtasks work together to achieve a common overall mission objective. This is achieved through a so-called Composite Air Operation (COMAO). Within a COMAO, there is a distinct hierarchy and a distinct responsibility for each participant. In general, there are three roles with dedicated tasks within a COMAO, which must be assumed by the team members who are in the air [9]. The Mission Commander (MC) has the overall view of the mission and acts as the superior planning authority in the air. One hierarchical level below, the Flight Leads (FL) are deployed in different packages with different mission subtasks. On the lowest level in a COMAO are the individual team members, who have to fulfil their specially assigned tasks by using their own abilities. These so-called wingmen are usually manned aircraft, but they will be replaced by UAVs in our MUM-T mode of operation (Fig. 3). Applying the work process and work system analysis to a COMAO, and considering only the MC and one FL, yields the WProc shown in Fig. 4.
Fig. 3. COMAO using a MUM-T approach
Fig. 4. Work process for a COMAO, only MC and one FL considered
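The three-level COMAO hierarchy described above can be sketched as a small data structure. This is an illustrative assumption rather than part of the described system: the class and field names, the example call signs and the `chain_of_command` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Package:
    """One MUM-T package: a Flight Lead with UAVs acting as wingmen."""
    flight_lead: str
    subtask: str
    uavs: List[str] = field(default_factory=list)


@dataclass
class Comao:
    """Composite Air Operation led by a Mission Commander."""
    mission_commander: str
    objective: str
    packages: List[Package] = field(default_factory=list)

    def chain_of_command(self, uav: str) -> List[str]:
        # Walk the hierarchy top-down: MC, then the responsible FL, then the UAV.
        for package in self.packages:
            if uav in package.uavs:
                return [self.mission_commander, package.flight_lead, uav]
        raise KeyError(f"{uav} is not part of this COMAO")


comao = Comao(
    mission_commander="MC",
    objective="overall mission objective",
    packages=[
        Package("FL-1", "escort", ["UAV-1", "UAV-2"]),
        Package("FL-2", "reconnaissance", ["UAV-3"]),
    ],
)
chain = comao.chain_of_command("UAV-3")
```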
Currently, in air operations the communication between the MC and the FLs is mainly voice based. In MUM-T, an additional channel for task delegation via voice commands can be inconsistent with the previously mentioned modes of operation for the unmanned assets, making the overall interaction complex and potentially too demanding for the user. Furthermore, alongside Task-Based Guidance of UAVs [8] and Task-Based Guidance of the own aircraft, this would introduce a major break and heterogeneity in the pilots' interactions with our MUM-T mission management tools in the cockpit. Therefore, the aim is to apply our approach of Task-Based Guidance, which has proven extremely useful for manned-to-unmanned delegation [8], to manned-to-manned delegation as in a COMAO. This yields a type of task delegation between pilots that corresponds to the leadership of the UAVs and provides uniform, general mission management while executing MUM-T missions at COMAO level. In this context we speak of cross-platform delegation interfaces.
3 Concept

The first step is the top-level design of the Work System as shown in Fig. 5.
Fig. 5. Work system COMAO
The MC receives the main mission objective and corresponding sub-mission objectives from C2 as WObj (see Fig. 4) and provides them to the FL as part of his WPOut. During the mission execution, the MC receives information from the environment and from the FL. Based on this information, the MC can delegate new tasks to the FL which complement their individual mission objectives. The described system can of course be extended by any number of FLs, depending on the mission requirements. The WSys shows the complexity of the tasking interactions of the MC, which are threefold: (1) flying his tactical aircraft by use of conventional flight control and flight guidance interfaces (potentially in the near future also by a task-based approach), (2) managing a team of UAVs by use of a dedicated TI, (3) delegating tasks to other manned or manned-unmanned flights mainly by use of voice communication, but also supported by a rather inflexible data-link system (e.g. MIDS/Link 16). Our approach is to integrate all these activities into a single, cockpit-integrated delegation manager, which provides a unified mode of interaction for the three mentioned channels. Therefore, we suggest the task-based method and interaction design that we have widely used for UAV delegation. This leads to the WSys shown in Fig. 6.
Fig. 6. WSys with delegation manager
The tasks defined by the MC are passed to a cognitive agent, the so-called Delegation Manager (DM), via appropriate interfaces. The DM shall unify and comprise the interaction modes in a single framework and support the formulation and routing of tasking dialogues. This agent can then determine a suitable recipient for the task to be delegated and send it to this receiver. Furthermore, the DM supports the coordination of the individual MUM-T packages in order to carry out coordinated operations if necessary. This is achieved by inserting and parameterizing constraints between planned tasks. Some parts of the described concept have already been implemented and are described in the following section.
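The two DM functions named above, selecting a recipient and recording constraints between planned tasks, can be sketched as follows. The text does not specify how the DM chooses a recipient, so the rule used here (a capable recipient with the shortest task queue) is an assumption, as are all class, method and capability names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Recipient:
    """A possible receiver of a task, e.g. a Flight Lead or a UAV agent."""
    name: str
    capabilities: Set[str]
    queue: List[str] = field(default_factory=list)


class DelegationManager:
    """Hypothetical DM: routes tasks and stores constraints between them."""

    def __init__(self, recipients: List[Recipient]):
        self.recipients = recipients
        self.constraints: List[Tuple[str, str, str]] = []

    def delegate(self, task: str, required: str) -> Optional[str]:
        # Assumed selection rule: capable recipient with the fewest queued tasks.
        capable = [r for r in self.recipients if required in r.capabilities]
        if not capable:
            return None
        chosen = min(capable, key=lambda r: len(r.queue))
        chosen.queue.append(task)
        return chosen.name

    def add_constraint(self, first: str, relation: str, second: str) -> None:
        # Example relation: "before", to sequence tasks across packages.
        self.constraints.append((first, relation, second))


dm = DelegationManager([
    Recipient("FL-1", {"recon", "strike"}),
    Recipient("FL-2", {"recon"}),
])
first = dm.delegate("recon area A", "recon")
second = dm.delegate("recon area B", "recon")
third = dm.delegate("strike target C", "strike")
missing = dm.delegate("jam radar", "jamming")
dm.add_constraint("recon area A", "before", "strike target C")
```

A real DM would presumably weigh workload, position and mission context; the queue length stands in for those factors only to keep the example short.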
4 Implementation and Early Results

With the existing research simulator (a full hardware and software setup for testing and validating concepts and hypotheses with human-in-the-loop experiments), it is possible to execute full military air operations with a single MUM-T compound, consisting of one human in a single-seat aircraft and any number of UAVs. This has been extended to execute missions on COMAO level with multiple MUM-T packages. After the hardware and software extensions (additional complete simulation environments and cockpits, as well as the possibility to exchange data between the systems), a simulation mission was created deploying three MUM-T packages and two aircraft (one single seat and one dual seat). Each package consists of one human and three UAVs. One of the humans assumes the role of MC (inside the rear cockpit of the dual-seat aircraft) and the other two act as FL, both in the front cockpits of their aircraft. FLs can delegate tasks to their own team members using the existing functionalities and interfaces (see Fig. 7).
Fig. 7. Cockpit interface section with tactical map and TI
After selecting a Tasking Object, the Task Selection opens, where a suitable task can be selected. Afterwards the TI opens. Here, the task can be delegated to the Team Members or the own Aircraft. The resulting mission plan with the assigned tasks (boxes in the TI) is also shown in the TI. If necessary, the individual tasks can be configured more precisely in the Task Config. To give the MC a full overview of the mission and its progress, and to let him delegate tasks to the FLs, the TI has been extended so that the MC additionally sees the other packages (Fig. 8), while the FLs have the view above.
Fig. 8. TI MC
In addition to the process described above, the MC has the option of delegating tasks to the FLs beyond the boundaries of his own compound. This process is shown in Fig. 9.
Fig. 9. Sequence for delegating tasks from MC to FL
The MC selects a task in his cockpit interface, whereupon his TI opens. Now he chooses the Package to delegate the task to. The task is sent to the receiver cockpit, and a message is displayed to the recipient and to the sender indicating that the task has been received or successfully sent, respectively. At the same time, the sent task is added to a dedicated interface on the receiver’s side. The FL can select the task from this interface, which centers his map in the cockpit interface on the corresponding target object and opens the TI. After the task has been added to the mission plan, the plan is updated and the MC will also see the updated mission plan of the corresponding FL in his TI. Depending on the overall situation and the workload of the FLs, it would be conceivable that the MC could also insert and/or parameterize tasks directly in the mission plan of the other packages, for example to facilitate coordinated maneuvers.
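The sequence in Fig. 9 can be read as a small message exchange between two cockpits. The following Python sketch is only an illustration of that reading: the class `Cockpit`, its fields, the task name, and the call signs are hypothetical and are not taken from the research simulator.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    target: str


@dataclass
class Cockpit:
    """Hypothetical stand-in for the cockpit interface of one package."""
    callsign: str
    messages: list = field(default_factory=list)      # notifications shown to the crew
    inbox: list = field(default_factory=list)         # dedicated interface for received tasks
    mission_plan: list = field(default_factory=list)  # tasks accepted into the plan
    map_center: str = ""

    def send(self, task, receiver):
        # The task goes to the receiver cockpit; both sides get a message.
        receiver.inbox.append(task)
        receiver.messages.append(f"received '{task.name}' from {self.callsign}")
        self.messages.append(f"sent '{task.name}' to {receiver.callsign}")

    def accept(self, task):
        # Selecting the task centers the map on its target object; the task
        # is then added to the receiver's own mission plan.
        self.inbox.remove(task)
        self.map_center = task.target
        self.mission_plan.append(task)


mc = Cockpit("MC")
fl = Cockpit("FL-1")
recce = Task("reconnoitre", "target object A")

mc.send(recce, fl)
fl.accept(recce)

# What the MC would see in his TI after the FL has updated the plan.
mc_view_of_fl_plan = [task.name for task in fl.mission_plan]
```

In this sketch the FL stays in control of his own mission plan, because a task only enters the plan through `accept`. The direct insertion by the MC that the text describes as conceivable would bypass that step.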
5 Conclusions and Outlook
Initial experiments with test persons, including German Air Force pilots, showed that the current implementation of the TI for the MC can become confusing, especially for more complex, longer missions. Another aspect that needs to be examined and revised
is the time required by the planning algorithm to create mission plans that become more and more complex as the mission duration increases. In essence, the current concept and implementation make it possible to carry out proven missions designed for one MUM-T system that have been extended to the corresponding configuration with three MUM-T systems. The direct insertion and parameterization of tasks by the MC into the FL’s mission plan and the DM’s capabilities still need to be fully defined, implemented, and evaluated through meaningful experiments. In addition, the TI will be redesigned so that it is displayed on its own dedicated interface. This way, the TI becomes larger and thus more manageable. Furthermore, this has the advantage that dependencies between individual tasks can be displayed more easily across packages, thus facilitating coordination between packages. In addition, the tactical map is no longer covered by the TI, which further increases the overview and operability of the cockpit interface.
References
1. Gangl, S., Lettl, B., Schulte, A.: Management of multiple unmanned combat aerial vehicles from a single-seat fighter cockpit in manned-unmanned fighter missions. In: AIAA Infotech@Aerospace (I@A) Conference, p. 4899 (2013)
2. Behymer, K., et al.: Initial evaluation of the intelligent multi-uxv planner with adaptive collaborative/control technologies (IMPACT) (2017)
3. Schulte, A., Donath, D., Lange, D.S.: Design patterns for human-cognitive agent teaming. In: Harris, D. (ed.) EPCE 2016. LNCS (LNAI), vol. 9736, pp. 231–243. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-40030-3_24
4. Schmitt, F., Roth, G., Schulte, A.: Design and Evaluation of a Mixed-Initiative Planner for Multi-Vehicle Missions (2017)
5. Heilemann, F., Schmitt, F., Schulte, A.: Mixed-initiative mission planning of multiple UCAVs from aboard a single seat fighter aircraft. In: AIAA Scitech 2019 Forum, p. 2205 (2019)
6. Uhrmann, J., Schulte, A.: Task-based guidance of multiple UAV using cognitive automation. In: COGNITIVE 2011, The Third International Conference on Advanced Cognitive Technologies and Applications, pp. 47–52 (2011)
7. Heilemann, F., Schulte, A.: Interaction concept for mixed-initiative mission planning on multiple delegation levels in multi-UCAV fighter missions. In: Karwowski, W., Ahram, T. (eds.) IHSI 2019. AISC, vol. 903, pp. 699–705. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11051-2_106
8. Heilemann, F., Schulte, A.: Experimental evaluation of an adaptive planning assistance system in manned unmanned teaming missions. In: Schmorrow, D.D., Fidopiastis, C.M. (eds.) HCII 2020. LNCS (LNAI), vol. 12197, pp. 371–382. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50439-7_25
9. Fredriksen, P.K.: 26 Interaction in Aerial Warfare: The Role of the Mission Commander in Composite Air Operations (COMAO). In: Interaction: ‘Samhandling’ Under Risk. Cappelen Damm Akademisk/NOASP (Nordic Open Access Scholarly Publishing) (2018)
Swarms, Teams, or Choirs? Metaphors in Multi-UAV Systems Design
Oscar Bjurling1(B), Mattias Arvola2, and Tom Ziemke2
1 Digital Systems, RISE Research Institutes of Sweden, c/o Linköpings Universitet, 581 83 Linköping, Sweden
[email protected]
2 Department of Computer and Information Science, Linköpings Universitet, Linköping University, 581 83 Linköping, Sweden
{Mattias.Arvola,Tom.Ziemke}@liu.se
Abstract. Future Unmanned Aerial Vehicles (UAVs) are projected to fly and operate in swarms. The swarm metaphor makes explicit and implicit mappings regarding system architecture and human interaction to aspects of natural systems, such as bee societies. Compared to the metaphor of a team, swarming agents as individuals are less capable, more expendable, and more limited in terms of communication and coordination. Given their different features and limitations, the two metaphors could be useful in different scenarios. We also discuss a choir metaphor and illustrate how it can give rise to different design concepts. We conclude that designers and engineers should be mindful of the metaphors they use because they influence, and limit, how to think about and design for multi-UAV systems.
Keywords: Drone swarm · Human-swarm interaction · Metaphor
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 10–15, 2021. https://doi.org/10.1007/978-3-030-79997-7_2
1 Introduction
Swarm robotics is a field dedicated to how relatively simple robotic agents, such as Unmanned Aerial Vehicles (UAVs, or drones), can collaborate by mimicking the behaviors of biological systems, such as ant colonies or beehives. In this paper we explore this apparent swarm metaphor, compare it to other generative metaphors, and discuss its implications for the design and development of drone swarm systems. Before diving into the nature of swarm behavior, however, we will briefly discuss the fundamental properties and function of metaphor in everyday life and, importantly, in systems design.
Figurative speech (metaphor, metonymy, etc.) is not a mere linguistic curiosity, but an indication of how we think. Conceptual metaphor is at the heart of both thought and action in our everyday lives [1–3]. The central mechanism of conceptual metaphor theory is cross-domain mapping, which is the process of identifying and establishing links between a source domain and a target domain that we wish to understand, experience, or explain [1, 2]. For instance, in the LOVE-IS-A-JOURNEY metaphor, the TRAVELLERS in the JOURNEY source domain correspond to the LOVERS in the LOVE target domain,
and a journey’s DESTINATION maps onto the RELATIONSHIP GOALS of a romantic relationship [1]. While some mappings between the source and target domains are explicit as in the above examples, others are inferred, or entailed [1]. For instance, the LOVE-IS-A-JOURNEY metaphor entails that, because a journey can be exhausting to the point where some rest is needed, the same could be true for love, as in “we’re taking a break in our relationship”. Additionally, conceptual metaphor typically serves to understand a relatively abstract target domain (e.g. TIME) in terms of a source domain that we can more readily experience (e.g. MOTION), which is also why conceptual metaphors are typically unidirectional in nature [1]. Primary metaphor theory attempts to resolve the apparent contradiction of a target domain already having an invariant conceptual structure (or image schema) that dictates how the source domain can be mapped onto it, while simultaneously being in need of “borrowing” the source domain structure due to itself being too abstract [1]. Primary metaphors are “basic” metaphors that are grounded in our direct bodily experience, such as KNOWING-IS-SEEING. Compound (or complex) metaphors involve the conceptual blending of primary metaphors, like THEORIES-ARE-BUILDINGS [1].
Metaphor has long been an important tool for designers, especially in the digital age where new, abstract concepts must be presented in reasonably intuitive ways. A ubiquitous example is the COMPUTERS-ARE-OFFICE-DESKTOPS metaphor through which a computer’s internal mechanisms (and our conceptual understanding thereof) are structured in terms of “files”, “folders”, or “waste bins” [4, 5]. In this way, metaphor enables the user’s interpretation of digital designs by providing semantic information and physical reference points [5].
Metaphor is also a powerful tool for designers themselves in all stages of the design process: to reframe a design problem to highlight certain elements while concealing others [1, 5], communicate their conceptual design model to others [4], or formulate and test design theories [5]. Furthermore, design metaphors can be either descriptive or prescriptive in nature. Descriptive metaphors primarily serve to recontextualize a design problem to give the designer a better understanding of it. Prescriptive metaphors cast the design problem in an entirely different light, helping the designer to generate novel design solutions [6]. This second kind of design metaphor is more commonly referred to as generative metaphor [3]. Generative metaphors are a special case of “seeing-as” metaphors that enable new ways of viewing the world, which promotes innovation [3]. As an example, Schön [3] describes how, tasked with improving the paint-transferring properties of a paintbrush with synthetic bristles, somebody in a product development team pondered aloud how “a paintbrush is a kind of pump”. This prompted the team to think about the spaces in between the bristles as channels for the paint to flow through. When studying how the “channels” of the synthetic brush reacted to the brush being pressed onto the canvas, the team noticed that they bent at a sharper angle than natural bristles. They believed that this restricted the flow of paint and researched different ways to improve this aspect of the synthetic brush until it transferred paint in much the same way as its natural counterpart. This PAINTBRUSHES-ARE-PUMPS metaphor, Schön argues, is generative precisely because it unlocked new ways of thinking about the problem and its possible solutions. Importantly, Schön further argues that there is no a priori similarity between paintbrushes and pumps. Instead, the similarities are realized by directly perceiving or experiencing an artifact in a certain context, like
holding a brush to apply paint to a canvas [3]. In other words, (generative) metaphors create similarities [2].
2 The Swarm Metaphor
To analyze the swarm metaphor, one must first understand the fundamental properties of swarms. The noun “swarm” typically evokes images of a great number of agents, specifically bees, that move in a dense formation. A definition of swarming focuses on how members of a group interact: “Swarming: A collection of autonomous individuals relying on local sensing and reactive behaviors interacting such that a global behavior emerges from the interactions” [7, p. 3]. This essentially describes what insects like ants and bees do. What are the benefits of this behavior that prompt us to want to implement it in multi-UAV systems? It is decentralized, meaning that there is no leader, which, together with its robustness and scalability, contributes to its general resilience to change [7]. Moreover, swarming agents’ ability to self-organize means that a swarm is flexible, adapting dynamically to changes in the environment. Concerning drones, the intended product (or goal) of all this, the emergent, global-level behavior, is that a swarm of simple (i.e. cheap, expendable) agents can perform the same tasks as a single complex (i.e. expensive) agent [7]. In other words, swarms trade efficiency for simplicity. The MULTI-UAV-SYSTEMS-ARE-SWARMS metaphor, henceforth simply the “swarm metaphor”, is a compound metaphor relying on the unification of several primary metaphors. The mappings of the swarm metaphor are presented in Table 1.

Table 1. Selection of role mappings of the swarm metaphor (using bees).

Source: Bee swarm                            Target: Multi-UAV system
Bees                                     →   Drones
Beehive                                  →   Home base
Beekeeper                                →   Drone operator
Touch/Pheromones/Dance (communicate)     →   Short range (local) data links
Foraging for food (survival)             →   Search/retrieve task (mission)
Swarming                                 →   Coordinating group activity

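The quoted definition of swarming (local sensing, reactive behavior, emergent global behavior) can be made concrete with a toy example. The Python sketch below is not taken from Clough [7]; it assumes a deliberately simple one-dimensional world in which each agent only senses neighbours within a fixed radius and reacts by moving toward their mean position. The function names, the radius, and the gain are illustrative assumptions.

```python
def step(positions, sensing_radius=2.0, gain=0.5):
    """One synchronous update: every agent moves toward the mean position
    of the agents it can sense locally (itself included)."""
    updated = []
    for p in positions:
        neighbours = [q for q in positions if abs(q - p) <= sensing_radius]
        local_mean = sum(neighbours) / len(neighbours)
        updated.append(p + gain * (local_mean - p))
    return updated


def spread(positions):
    """Distance between the two outermost agents."""
    return max(positions) - min(positions)


positions = [0.0, 1.5, 3.0, 4.5, 6.0]
initial_spread = spread(positions)
for _ in range(20):
    positions = step(positions)
final_spread = spread(positions)
```

No agent knows the position of the whole group and there is no leader, yet the group contracts. This contraction is the kind of global behavior that, in the sense of the definition, emerges from purely local rules.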
The DRONES-AS-BEES mapping entails that drones and bees share some attributes. Individual bees have limited cognitive capacities but compensate by acting as a collective. This suggests that drones, too, require only enough computational power and sensor quality to function individually while maintaining the ability to collaborate with other drones. This reduces cost. However, adhering too strongly to the metaphor may needlessly limit the capacities of the swarm. Drones, equipped with sophisticated sensors and software systems, have greater potential for intelligent behavior than insects. In short, designers must be careful not to wear the swarm metaphor like a straitjacket. Another inference is that because worker bees essentially sacrifice themselves when attacking
and stinging intruders, perhaps drones could have similar abilities, like risking structural damage or battery drain to complete a task. This ties into the idea of individual drones being relatively expendable. From a system design standpoint, the metaphor prompting us to think of drones as bees brings to the fore our existing understanding of how singular bees typically behave and projects these bee qualities onto drones. Next, the DRONE-OPERATORS-AS-BEEKEEPERS mapping entails aspects of control. Beekeepers maintain bee colonies to use them to collect honey from the hive. By the same token, the drone operator assumes a supervisory role, using the drone swarm to accomplish a goal. Furthermore, beekeepers can only indirectly affect the behavior of individual bees or the hive, and the drone operator is also limited to implicit control of the swarm and its member drones, according to the metaphor. Indeed, methods for controlling drone swarms are a prominent topic in human-swarm interaction research. The COORDINATION-AS-SWARMING mapping can be tied into the BEE-COMMUNICATION-AS-NETWORK-COMMUNICATION mapping to explain the focus in swarm research on having the drones (or software agents) cluster in swarms by utilizing local sensors and data link networks. This is an important line of research, but the metaphor risks distracting researchers from exploring other solutions. What if the drone communication network is global to the swarm? Existing on-board communication equipment already has a range of several kilometers, raising questions about the purpose of having drones fly in close formation. A more interesting entailment, which ties back to the question of control, is that bees (like ants) use stigmergy (depositing pheromones in the environment) to communicate. This local communication is studied and simulated in swarm research by enabling drones to deploy virtual beacons.
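As a rough illustration of stigmergy with virtual beacons, the following Python sketch lets one drone mark searched grid cells and lets another drone choose its next cell from the marks alone. The grid representation, the evaporation rate, and all function names are assumptions made for this example and do not describe a particular swarm system.

```python
def deposit(beacons, cell, amount=1.0):
    """A drone marks a grid cell by strengthening the virtual beacon there."""
    beacons[cell] = beacons.get(cell, 0.0) + amount


def evaporate(beacons, rate=0.5):
    """Beacons fade over time, so stale information gradually disappears."""
    for cell in list(beacons):
        beacons[cell] *= (1.0 - rate)


def choose_next(beacons, candidates):
    """Another drone reads only the beacons of the candidate cells and moves
    to the least marked one, which pushes the search toward unvisited ground."""
    return min(candidates, key=lambda cell: beacons.get(cell, 0.0))


beacons = {}
deposit(beacons, (0, 0))   # drone A has searched cell (0, 0)
deposit(beacons, (0, 1))   # drone A has searched cell (0, 1)
evaporate(beacons)

# Drone B never talks to drone A; it only reads the environment.
next_cell = choose_next(beacons, [(0, 0), (0, 1), (1, 0)])
```

The drones coordinate without addressing each other directly, which is the property that distinguishes this form of communication from the data links discussed above.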
In summary, the swarm metaphor, as a generative metaphor in multi-UAV systems design, is useful in the sense that it promotes simple solutions to tackle complex tasks. However, its simplicity poses different design challenges; the performance outcome of the swarm is inherently probabilistic as there is no way of knowing what any individual drone will do or how its behavior will impact the swarm overall. Designing usable emergent systems, like drone swarms, is notoriously difficult [7]. We will now briefly compare this approach to another generative metaphor: teams.
3 The Team Metaphor
In the MULTI-UAV-SYSTEMS-ARE-TEAMS metaphor, or the “team metaphor”, the DRONES map onto TEAM MEMBERS. They typically have specialized and complementary skillsets to cover different tasks and responsibilities, and drones are entailed to be more capable and autonomous than in the swarm metaphor. This further means that drone team members are typically not expendable like their swarm counterparts. In the team metaphor, the role of the DRONE OPERATOR maps onto the MISSION COMMANDER, who supports and monitors the team, providing information and tasks. From a control standpoint, this is more complex than the BEEKEEPER of the swarm as the MISSION COMMANDER is in direct communication with the team, enabling direct influence on each member. Regarding communication, while swarm members broadcast basic information to all in their vicinity, team members communicate semantically rich information to specific individuals [7]. The COORDINATING-GROUP-ACTIVITY of
the multi-UAV target system corresponds to PLANNING in the team metaphor, highlighting that teams engage in deliberate, goal-driven activities where all team members have some awareness of the team’s mission and task allocations. This is not the case for swarms. A final inference is that teams can have team leaders, meaning that there could be a leader drone coordinating local task allocation and acting as a liaison between the team and the drone operator. Table 2 juxtaposes swarms and teams, and a comparison suggests that they are suited for different things. No single one of these generative metaphors can adequately facilitate our understanding of what multi-UAV systems can be. Why do bees swarm whereas wolves work as a team? Is one approach superior to the other? In nature, the two systems evolved for different reasons, in entirely different creatures. However, Clough [7, p. 8] observes that “simple animals use simple ways of working with each other”, and notes that “swarming things are not smart. If they were, they’d team!” This suggests that the two strategies are not necessarily mutually exclusive in designing autonomous multi-UAV systems. Perhaps the two can be applied in parallel, with focus shifting between them. Or maybe the two metaphors can be unified into a new conceptual blend. This brings us to our final metaphor.

Table 2. Comparison of Swarm and Team properties. Adapted and expanded from [7].

Attribute            Swarm           Team
Temporal             Reactive        Predictive
Composition          Homogenous      Heterogenous
Interrelationships   Simple          Complex
Predictability       Probabilistic   Deterministic
Individual worth     Expendable      Critical
Efficiency           Low             High
Relative cost        Low             High
Controllability      Low             High

4 The Choir Metaphor
Consider the multi-UAV system in terms of a choir or an orchestra [8, 9]. In such a metaphor, DRONES map onto SINGERS, the DRONE OPERATOR becomes the CHOIR CONDUCTOR, the MISSION is equivalent to the SONG, and so on. Like team members, choir singers are highly skilled and have different responsibilities, but they are divided into subgroups singing different harmonies of the song. They must pay close attention to sing in sync with their subgroup as well as the rest of the choir. Naturally, they must know the parts of the song and their role in it. In this metaphor, the individual singer is less critical than in a team, but also less expendable than in a swarm, sitting somewhere in between. While the composer orchestrates the piece and its harmonies, the conductor
guides the choir throughout their performance. The conductor sets the tempo for the entire choir and, using gaze, body language, and special signs, communicates with and instructs the entire choir, any of its harmonic subgroups, or even individual singers to pay closer attention or modulate their singing. In a multi-UAV system, the drones could keep a local copy of the mission (like sheet music) in case they lose data connection links with ground control (analogous to forgetting the lyrics).
5 Conclusion
The choir metaphor immediately brings to the forefront design ideas and solutions that differ from those suggested by the swarm or team metaphors. The point here is not that one of the generated design concepts is better, but that they are different and that each of them can be useful depending on the context of use. The power of metaphors, conceptual and generative alike, is fully realized when we engage with them in a dialogic way. While explicit source-domain role mappings are more immediately intuitive and obvious, the inferred or entailed mappings are usually only revealed upon closer inspection of the metaphor. This back-and-forth between people (in language or in system design) and the metaphor reveals its strengths and, perhaps more importantly, its weaknesses. The use of metaphors therefore requires a level of reflective analytical care [2]. The use of generative metaphors in multi-UAV systems design is no different in this regard. We ought to be mindful of, and question, the metaphors we use and consider what assumptions, explicit or inferred, they prompt us to use as a basis for our design work.
References
1. Evans, V.: Cognitive Linguistics: A Complete Guide. Edinburgh University Press, Edinburgh (2019)
2. McClintock, D., Ison, R., Armson, R.: Conceptual metaphors: a review with implications for human understandings and systems practice. Cybern. Hum. Knowing 11, 25–47 (2004)
3. Schön, D.A.: Generative metaphor: a perspective on problem-setting in social policy. In: Ortony, A. (ed.) Metaphor and Thought, pp. 137–163. Cambridge University Press, Cambridge (1993)
4. Neale, D.C., Carroll, J.M.: The role of metaphors in user interface design. In: Helander, M.G., Landauer, T.K., Prabhu, P.V. (eds.) Handbook of Human-Computer Interaction. Elsevier, New York, pp. 441–462 (1997). https://doi.org/10.1016/B978-044481862-1.50086-8
5. Dove, G., Lundqvist, C.E., Halskov, K.: The life cycle of a generative design metaphor. In: Proceedings of the 10th Nordic Conference on Human-Computer Interaction - NordiCHI 2018, pp. 414–425. ACM Press, New York, New York, USA (2018). https://doi.org/10.1145/3240167.3240190
6. Hey, J., Linsey, J., Agogino, A.M., Wood, K.L.: Analogies and metaphors in creative design. Int. J. Eng. Educ. 37, 283–294 (2008)
7. Clough, B.T.: UAV swarming? So what are those swarms, what are the implications, and how do we handle them? In: AUVSI Unmanned System Conference, Orlando, FL (2002)
8. Boy, G.A.: The Orchestra: a conceptual model for function allocation and scenario-based engineering in multi-agent safety-critical systems. In: Proceedings of the European Conference on Cognitive Ergonomics, VTT, Helsinki, Finland, pp. 187–193 (2009)
9. Boy, G.A.: Orchestrating Human-Centered Design. Springer, London (2013). https://doi.org/10.1007/978-1-4471-4339-0
Visual Communication with UAS: Estimating Parameters for Gestural Transmission of Task Descriptions
Alexander Schelle(B) and Peter Stütz
Institute of Flight Systems, Bundeswehr University Munich, 85577 Neubiberg, Germany
{Alexander.Schelle,Peter.Stuetz}@unibw.de
Abstract. Visual communication, and especially that based on gestures, can be an alternative to radio-based command and control links to small unmanned aerial vehicles (UAVs). In order to allow this natural form of communication to be an equivalent substitute to a conventional data link with regard to the efficient transfer of task data, the human-UAV-interaction must consist of as few gestural parameters as possible to keep the mental load for the user low, but at the same time provide enough information for the UAV to estimate the user's intention. For this purpose, an experimental study under laboratory conditions with N = 20 German-speaking participants has been conducted to generate a dataset with 120 verbal formulations for six common UAV related tasks. The collected data has been evaluated regarding the distribution of sentences, words and parts of speech. Results indicate that users prefer short task descriptions prioritizing verbs and nouns (84.15%) with a mean number of 2.55 (SD = 1.57) words per sentence. All tasks considered were described with an average of 1.73 sentences (SD = 1.00). A slot filling approach is presented to map parameters in user intents.
Keywords: Human factors · Human-UAV-Interaction · Gestural task formulation
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 16–24, 2021. https://doi.org/10.1007/978-3-030-79997-7_3
1 Introduction
Unmanned aerial systems (UAS) are predominantly being used as airborne sensor platforms for reconnaissance, surveillance and documentation tasks. The transmission of control and mission data from a ground control station to the unmanned aerial vehicle (UAV) is realized via a digital radio data link on almost all systems. However, this connection can be disturbed by topographic or electromagnetic influences or generally not be available or desired at all. In order to enable access to the flying system in such situations, a concept including a functional prototype with basic functions was presented in previous work [1] and evaluated in real flight tests. Static gestures of an authorized person were detected onboard by means of a stabilized RGB camera including a convolutional pose machine and translated into UAV tasks with few parameters (take off/land, fly to direction, find objects of type, etc.) in a context-dependent manner, taking into account the current operating conditions and conversation topic. However, in order to
allow visual communication to be an equivalent substitute to a conventional data link with regard to the efficient transfer of task data, such a system must not only support a greater variety of parameters but also master a balancing act. On the one hand, the number of parameters to be transferred must be kept as small as possible to keep the conversation short and the mental load for the user low. On the other hand, a sufficient number of parameters must be transmitted to allow the system to estimate the user's intention. Once this minimum set of necessary parameters for each task type has been found, it is then necessary to consider which gesture-based form of representation can best be used to encode each parameter type. In addition to ergonomic and cognitive factors, the visual and computational limitations of the flying system must also be considered. In the following, the focus is first on the human aspect of this kind of Human-Machine interaction and the identification of relevant parameters for the description of common UAV tasks.
2 Related Work
Existing gestural forms of communication (national sign languages, marshalling signs in aviation, military hand signals) are only partially suitable for communication with UAVs, since they contain small-scale movements that cannot be adequately detected from a flying aircraft or only have a limited vocabulary of domain-specific commands. Currently there is little literature that explicitly uses gestures to transmit complex tasks containing multiple parameters to UAVs. Existing systems primarily use gestures to trigger basic on-board functions (e.g. photo/video recording), manipulate the flight attitude, the direction of movement [2, 3] or to specify trajectories [4]. Similar challenges arise in the cooperation of divers with unmanned water vehicles [5]. The decoding of task descriptions and the estimation of user intentions also play an important role in the field of chatbots and voice assistants. Conversational interfaces like Snips [6] use the slot filling method [7] in combination with a natural language understanding engine to estimate the user's intention (intent) and its parameters (slots) from individual utterances.
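The idea behind slot filling can be sketched in a few lines of Python. The example below is a simplified illustration and does not use the Snips API; the intent names, the slot names, and the function `fill_slots` are hypothetical.

```python
# Hypothetical intent templates: each UAV task type lists the slots
# (parameters) that must be filled before the task can be executed.
INTENT_SLOTS = {
    "fly_to": ["destination"],
    "search": ["object_type", "area"],
    "transport": ["payload", "destination"],
}


def fill_slots(intent, observed):
    """Return the slots filled so far and the names of those still missing."""
    required = INTENT_SLOTS[intent]
    filled = {name: observed[name] for name in required if name in observed}
    missing = [name for name in required if name not in observed]
    return filled, missing


# Parameters decoded so far from the user's utterances or gestures.
filled, missing = fill_slots("transport", {"payload": "box"})
```

Here `missing` contains `destination`, so a dialogue system built along these lines would ask the user for exactly that parameter before executing the task. This is why the number of slots per intent determines the length of the conversation.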
3 Generating Minimal Parameter Sets for UAV Related Tasks
Looking at current and possible future use cases for small UAVs that can take off and land vertically, six main reference tasks can be identified:
• flight task (relocate UAV)
• data acquisition task (generate image/video data)
• evacuation task (navigate user out of dangerous situation)
• manipulation task (help user to interact with nearby objects)
• search task (explore environment and find objects of interest)
• transport task (relocate payload)
Each of these tasks can be described in various ways with different numbers of parameters. Therefore, it must first be determined what kind of parameters and how many of them a UAV should need to be able to perform a given task. The following section describes the study carried out, discusses the operationalization and processing of the data obtained and provides the results followed by a discussion.
3.1 Experiment
An experimental study has been conducted under laboratory conditions to generate a dataset with verbally formulated UAV tasks.
Sample Description. The sample consisted of N = 20 German-speaking participants ranging in age from 23 to 35 years (M = 28.65, SD = 3.54, 3 female, 17 male) who contributed a total of 120 verbal formulations for six different task scenarios. All participants were research assistants of the Institute of Flight Systems with an engineering or software background and a high intrinsic motivation to participate. No monetary compensation was paid.
Test Setup. The experiments took place in a test room that was isolated from external influences. The participants stood on a marked position and had a direct line of sight to the examiner, who sat at a table about four meters in front of the participant. A screen with a diagonal of approximately 1.90 m served as a presentation medium and was mounted on a raised position behind the examiner. The participant was presented with video animations as well as a photo of a flying multicopter in order to increase immersion. Animations with little to no linguistic elements were chosen as a graphical representation form for the tasks to be commanded to the UAV in order to minimize possible influences on the participants during the verbal task formulation by the choice of words of the examiner. All animations (Fig. 1) showed a top view of a user (blue circle) and a UAV (stylized quadcopter). At the beginning of each animation, the UAV was located in front of the user in immediate vicinity and performed one or more actions (Table 1).
Test Procedure. First, each participant received a briefing on the test procedure. The task was to put himself/herself in the position of the user in the respective animation.
Each of the six runs followed the same pattern: after the animation had been presented twice, the participant was asked to verbally describe the actions seen and to name the respective goal of the user in the animation. This served as a comprehension check to ensure that all relevant information for the respective job in the animation was captured by the participant. In this conversation, the examiner's choice of words was kept as general as possible in order to minimize any influence on the participant. After that, the participant was asked to formulate the shortest possible sequence of verbal commands which would cause the UAV to perform the actions shown in the animation. The basic assumption here was that the UAV would have the same cognitive capabilities as a human communication partner and could derive the actions necessary to achieve this goal on its own.

Fig. 1. Animations used in study for graphical description of UAV tasks (blue dot with little arrow = user with indicated view direction, green object with four rings = UAV).

Operationalization. Each participant was video and audio recorded during the experiments. The given statements for each animation were transcribed and saved as text files. These files have been processed using a common Python based natural language processing toolkit [8] to obtain the most compact form of the task descriptions. The transformation consisted of the following steps:
1. Break each statement down into sentences (sentence tokenization [9])
2. Break each sentence down into words and punctuation (word tokenization [10])
3. Remove punctuation and common German words that do not add much value to the meaning of the sentence, e.g. articles (stop words removal)
4. Group together different forms of the same word and reduce them to their base form (lemmatization [11])
5. Label each word's part-of-speech (POS) according to POS Tiger tagset (POS tagging [11])
6. Convert POS tags according to Universal POS tagset [12]
7. Calculate distribution of sentences, POS and lemmata for each task
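The seven steps above can be sketched with the Python standard library alone. The following block is not the toolkit used in the study [8]; the stop word list, the lemma table, the tag lexicon, and the tag conversion table are tiny hand-made stand-ins that only cover the example statement, and all names are assumptions for illustration.

```python
import re
from collections import Counter

# Illustrative resources only; a real toolkit ships full models for these.
STOP_WORDS = {"der", "die", "das", "den", "dem", "zum", "zu", "und"}
LEMMAS = {"flieg": "fliegen", "scanne": "scannen"}
TIGER_TAGS = {"fliegen": "VVIMP", "scannen": "VVIMP", "haus": "NN", "baum": "NN"}
TIGER_TO_UNIVERSAL = {"VVIMP": "VERB", "VVFIN": "VERB", "NN": "NOUN"}


def sentences(statement):
    # Step 1: split after sentence-final punctuation.
    return [s for s in re.split(r"(?<=[.!?])\s+", statement.strip()) if s]


def words(sentence):
    # Step 2: words and punctuation marks become separate tokens.
    return re.findall(r"\w+|[^\w\s]", sentence)


def compact(sentence):
    # Steps 3 to 6: drop punctuation and stop words, lemmatize, tag, convert.
    result = []
    for token in words(sentence):
        lower = token.lower()
        if not token.isalnum() or lower in STOP_WORDS:
            continue
        lemma = LEMMAS.get(lower, lower)
        tag = TIGER_TO_UNIVERSAL.get(TIGER_TAGS.get(lemma, ""), "X")
        result.append((lemma, tag))
    return result


statement = "Flieg zum Haus. Scanne den Baum!"
processed = [compact(s) for s in sentences(statement)]

# Step 7: distribution of sentences, parts of speech and lemmata.
sentence_count = len(processed)
pos_counts = Counter(tag for sent in processed for _, tag in sent)
lemmata_per_sentence = [len(sent) for sent in processed]
```

For the example statement this yields two sentences with two lemmata each, one verb and one noun per sentence. The lookup tables would have to be replaced by trained models before the sketch could process real transcripts.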
Table 1. Descriptions of the presented animations. At the beginning of each animation, the UAV hovers in front of the user.
Animation  Visible actions
A1         UAV flies to a house
A2         UAV flies to a tree to the left of the user and scans it from all sides
A3         UAV and user are located inside a maze; UAV flies the shortest way from the maze to the house; user follows UAV
A4         UAV picks up a box to the right of the user; user moves to the left; UAV follows user with box
A5         UAV flies to four white locations within a 300 m radius and marks three of them red and the last one green, where it stays; a helicopter then lands next to the UAV
A6         UAV picks up box to the right of the user; UAV flies to house, drops box, flies back to the user and lands in front of him
3.2 Results
Distribution of Sentences. The overall number of sentences used to formulate the tasks seen in all animations followed a bimodal distribution (Mode = 1 and 2, M = 1.73, SD = 1.00). Table 2 shows the distribution for each animation.
Table 2. Number of sentences used for the formulation of the actions seen in each animation
Animation  Mode  Mean  SD
A1         1     1.10  .31
A2         1, 2  1.55  .61
A3         1     1.20  .41
A4         2     2.00  .65
A5         1     1.35  .59
A6         2     3.20  1.28
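As a quick consistency check, and assuming that every animation was described by the same number of participants, the overall mean reported above should equal the average of the per-animation means in Table 2. A short Python sketch with the table values:

```python
# Per-animation mean number of sentences, copied from Table 2.
means = {"A1": 1.10, "A2": 1.55, "A3": 1.20, "A4": 2.00, "A5": 1.35, "A6": 3.20}

# With equal group sizes (an assumption), the overall mean is the mean of the group means.
overall_mean = sum(means.values()) / len(means)
print(round(overall_mean, 2))
```

The result agrees with the reported overall mean of 1.73. The overall standard deviation cannot be recovered this way, because it also depends on the spread between the animations.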
Distribution of Parts of Speech. Across all formulations, there is a clear preference for verbs and nouns (including proper nouns like UAV). These three categories together already account for 84.1% of the parts of speech used. Table 3 lists all parts of speech (after the stop word removal step).
Table 3. Distribution of the utilized parts of speech for task formulations in animations A1–A6
Part of speech  Percentage  Percentage in German vocabulary [13] for comparison
(proper) nouns  44.9        74.9
verbs           39.2        9.8
numerals        4.7         N/A
adverbs         4.3         1.2
adjectives      2.8         13.5
particles       2.6         N/A
adpositions     1.5         N/A
others          0           0.6
Number of Lemmata per Sentence and per Task. After dividing all verbal expressions into individual sentences, a unimodal distribution of the number of lemmata per sentence can be observed (Mode = 2, M = 2.55, SD = 1.57). Split into groups, 23.56% of the sentences consisted of only one lemma, whereas the majority of the sentences (56.73%) consisted of 2 to 3 lemmata. More than 3 lemmata were used in only 19.8% of the sentences. The boxplot in Fig. 2 shows the lemma count for each animation, including multiple outliers that can be attributed to a few participants who preferred more detailed formulations than others. Table 4 provides a more detailed look at the recorded data and shows examples of the respective shortest and longest utterances for each animation.
Fig. 2. Boxplot showing the distribution of the number of lemmatized words used per animation (white circle = mild outlier, star = extreme outlier, number = sample from dataset, bold line = median, whiskers = 1.5 × interquartile range)
Table 4. Overview of the generated dataset including mean, standard deviation (SD), minimum and maximum of lemma count per animation. Examples from the dataset are lemmatized and translated literally; sentences are separated by semicolons.
Animation  Mean  SD    Min  Max  Example from dataset
A1         2.90  1.45  2    7    shortest: fly house
                                 longest: fly direction 10 clock house; hold position
A2         4.15  2.62  2    12   shortest: scan tree
                                 longest: Drone fly direction 10 clock 20 meter tree; scan tree; hold position
A3         3.15  1.46  2    7    shortest: lead house
                                 longest: UAV find way maze; fly way house
A4         3.45  1.88  2    10   shortest: follow package
                                 longest: UAV fly right 2 meter object; grab object; take; follow
A5         6.00  1.97  2    12   shortest: search helipad
                                 longest: UAV scout radius 300 meter area; rate area; good rate; depart area
A6         6.90  2.91  4    15   shortest: transport object house; land
                                 longest: UAV fly right 2 meter; take object; fly house; lay object down; come back; land
Discussion and Further Observations. Participants primarily used verbs and nouns to phrase the tasks seen in the animations (Table 3). This is not surprising, since 84.7% [13] of the German vocabulary consists of these parts of speech and most of the UAV tasks consisted of some kind of action and an object the UAV had to interact with. Most participants tried to keep their formulations short and needed only one or two sentences in most cases. More than one sentence was usually chosen for animations that showed several individual actions taking place in succession (animations A2, A4 and A6). Two participants addressed their task descriptions directly to the UAV (e.g. UAV fly house). All others started their sentences with a verb followed by a noun/proper noun or adverb, which corresponds to the sentence order of the German imperative (e.g. fly house, grab box). In 30% of the sentences, participants added directional instructions to specify objects (scan tree left, grab box right); all others did not mention that parameter. The distance of the user to the object did not seem to be a decisive factor (animations A1, A3, A5 vs. A2, A4, A6). Therefore, it must be assumed that users expect some autonomy from the UAV in selecting the proper object of interest based only on their descriptions.
3.3 Conclusion and Outlook
The goal of this study was to find out, from a user perspective, which kinds of parameters and how many of them a UAV needs in order to perform a given task. The findings indicate that users prefer short formulations for UAV tasks, ranging from one to three words and mostly starting with a verb followed by nouns and adverbs as specifying parameters. Transferred to gestural human-UAV interaction, these tasks can be mapped to intents according to the slot-filling method, with slots serving as parameters. Table 5 shows the proposed minimal slot sets for common UAV tasks. Slots of type action correspond to the specific capabilities of the UAV, whereas destination, object, subject, hint, specification and radius represent the respective types of information needed for the execution of those capabilities.
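To make the slot-filling idea concrete, the following Python sketch models an intent as an action with a set of required slots. The slot names are the types listed above; the two example actions and their minimal slot sets are hypothetical and chosen only for illustration, so they should not be read as the content of Table 5.

```python
from dataclasses import dataclass, field

# Slot types named in the text; the action identifies the UAV capability itself.
SLOT_TYPES = {"destination", "object", "subject", "hint", "specification", "radius"}

# Hypothetical minimal slot sets per action (illustrative, not the values of Table 5).
REQUIRED_SLOTS = {
    "fly": {"destination"},
    "scan": {"object"},
}


@dataclass
class Intent:
    action: str
    slots: dict = field(default_factory=dict)

    def missing_slots(self):
        """Return the required slots that have not been filled yet."""
        unknown = set(self.slots) - SLOT_TYPES
        if unknown:
            raise ValueError(f"unknown slot types: {sorted(unknown)}")
        return REQUIRED_SLOTS[self.action] - set(self.slots)

    def is_executable(self):
        """An intent can be executed once every required slot is filled."""
        return not self.missing_slots()


# "fly house" fills the destination slot; "scan" alone still lacks its object.
fly = Intent("fly", {"destination": "house"})
scan = Intent("scan")
```

A gestural interface could use the same structure: one gesture selects the action, and further gestures fill the remaining slots until the intent becomes executable.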
Further experiments will show whether the intent detection and slot filling method of conversational interfaces can be transferred to a gestural form of interaction. Regarding the slots, it must also be investigated which gestural forms of representation are best suited for the corresponding parameter entities. Finding an appropriate method to combine several individual user intents into a mission will be another field of study.
References
1. Schelle, A., Stütz, P.: Gestural transmission of tasking information to an airborne UAV. In: Yamamoto, S., Mori, H. (eds.) HIMI 2018. LNCS, vol. 10904, pp. 318–335. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-92043-6_27
2. Peshkova, E., et al.: Exploring intuitiveness of metaphor-based gestures for UAV navigation. In: 2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), Lisbon, pp. 175–182. IEEE (2017). https://doi.org/10.1109/ROMAN.2017.8172298
3. Sanders, B., Vincenzi, D., Shen, Y.: Investigation of Gesture Based UAV Control. In: Chen, J. (ed.) Advances in Human Factors in Robots and Unmanned Systems. Advances in Intelligent Systems and Computing, vol. 595, pp. 205–215. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-60384-1_20
4. Chandarana, M., et al.: A natural interaction interface for UAVs using intuitive gesture recognition. In: Savage-Knepshield, P., Chen, J. (eds.) Advances in Human Factors in Robots and Unmanned Systems. Advances in Intelligent Systems and Computing, vol. 499, pp. 387–398. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-41959-6_32
5. Xu, P.: Gesture-based Human-robot Interaction for Field Programmable Autonomous Underwater Robots. arXiv preprint arXiv:1709.08945 (2017)
6. Coucke, A., et al.: Snips Voice Platform: An embedded Spoken Language Understanding system for private-by-design voice interfaces (2018). http://arxiv.org/pdf/1805.10190v3
7. McTear, M., Callejas, Z., Griol, D.: The Conversational Interface. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-32967-3
8. Loper, E., Bird, S.: NLTK: The Natural Language Toolkit (2002). http://arxiv.org/pdf/cs/0205028v1
9. Kiss, T., Strunk, J.: Unsupervised multilingual sentence boundary detection. Comput. Linguist. 32, 485 (2006). https://doi.org/10.1162/coli.2006.32.4.485
10. Taylor, A., Marcus, M., Santorini, B.: The Penn Treebank: An Overview. In: Ide, N., Véronis, J., Abeillé, A. (eds.) Treebanks. Text, Speech and Language Technology, vol. 20, pp. 5–22. Springer Netherlands, Dordrecht (2003). https://doi.org/10.1007/978-94-010-0201-1_1
11. Wartena, C.: A Probabilistic Morphology Model for German Lemmatization (2019)
12. Petrov, S., Das, D., McDonald, R.: A Universal Part-of-Speech Tagset (2011). http://arxiv.org/pdf/1104.2086v1
13. Bibliographisches Institut GmbH: Die Verteilung der Wortarten im Rechtschreibduden (2020). https://cdn.duden.de/public_files/2020-10/Verteilung-der-Wortarten_2020.svg. Accessed 22 Jan 2021
A Distributed Mission-Planning Framework for Shared UAV Use in Multi-operator MUM-T Applications
Gunar Roth and Axel Schulte
University of the Bundeswehr Munich, Werner-Heisenberg-Weg 39, 85577 Neubiberg, Germany
{Gunar.Roth,Axel.Schulte}@unibw.de
Abstract. This contribution presents a framework to support the planning of Manned-Unmanned Teaming (MUM-T) missions in which unmanned aerial vehicles (UAVs) are shared among multiple distributed users. To negotiate the provision of requested UAV services, we modeled a centralized task allocation procedure that uses individual utility values to represent the preferences of requester and provider. Our initial approach solves the planning problem classically, by directly accounting for utility and costs. A second approach extends this in favor of a universal application that does not rely on direct accountability. This second approach, however, may be limited in finding the optimal solution and incurs increased computation costs. Both approaches were integrated and tested in a helicopter research simulator. Study results indicate that both approaches provide the same solution quality. We conclude that, regardless of the implemented method, UAVs can be effectively shared between distributed users through a centralized task allocation model.
Keywords: Assistant system · Manned-unmanned teaming · Shared UAVs · Mission planning
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 25–32, 2021. https://doi.org/10.1007/978-3-030-79997-7_4
1 Introduction and Background
The continuous development of unmanned systems increases their potential utility in both civil and military applications. They can carry out dangerous tasks without human risk and long-lasting or repetitive tasks without getting tired or losing focus. They may also outperform manned systems, as they are not constrained by human physical limits. Possible civilian applications include agriculture, search and rescue, disaster relief, and terrain mapping. Such systems also have many valuable military applications, such as supporting reconnaissance, engagement, or electronic warfare.
In Manned-Unmanned Teaming (MUM-T) missions, a manned command vehicle is used to manage one or more unmanned systems that support mission execution [1, 2]. Together, they are intended to act as a team and contribute to increasing the overall mission efficiency and effectiveness. In these teams, humans contribute most significantly and uniquely with their high-level cognitive abilities, such as planning and problem solving [3]. These abilities, as well as their retention of the authority to decide about potential weapon use, make the human vital in such a team. Unmanned vehicles, on the other hand, can contribute to mission success by supporting reconnaissance, for example through automated sensing and perception, by increasing the manned system’s sensor range and by automating task execution. In particular, reducing the risk to the manned command vehicle by enabling it to remain distant from potential enemies or hazards provides significant utility in these scenarios.
Previous research studied concepts for controlling multiple unmanned systems in manned-unmanned teams [1, 4], whereby the unmanned systems were used exclusively by a single user. We aim, however, to allow multiple users access to the unmanned systems in a networked operations space. Sharing unmanned systems has the potential to utilize their resources more efficiently and to provide their capabilities to additional users who would not otherwise have the infrastructure necessary to operate them. In a previous article, we presented a concept for the shared use of unmanned systems in MUM-T, discussing authority distribution and levels of interoperability [5]. The unique contribution of this article is the presentation of a framework to support the coordinated use of limited unmanned systems by multiple users. In particular, we present and evaluate two approaches for assisting human decision-making in the planning of MUM-T missions with shared unmanned aerial vehicles (UAVs).
2 Related Work
Several efforts address the problem of connecting agents that require data or a capability with those who can provide it. The terms “mediator,” “facilitator,” and “middle-agent” [6–8] are used interchangeably in the relevant literature to describe agents located between the requester and the provider that can process requests among initially unacquainted agents. Decker et al. [9] investigated middle-agents under privacy considerations and distinguished “matchmaking” and “brokering” behaviors. IMPACT [10] is a framework that supports multiagent interactions and agent interoperability in an application-independent manner. This system uses a matchmaking middle-agent to allocate requests to suitable agents. Services are requested using a verb-noun expression. If there is no exact match between requested and available services, the system can use a thesaurus to find services that are similar to the requested one. Mahmoud and Mohamed [11] propose a framework that applies a broker mechanism to enable collaborative UAV behavior. The UAVs manage their data and organize their missions through a cloud architecture. Collaboration methods are offered as services that can be requested over the internet by multiple users. A cloud-based web service serves as a broker to assign the requested services to the available UAVs. These applications are designed only for the interaction between one user and multiple agents. Additionally, their processes of allocating a provider to a requester do not involve human decision-making. Our effort differs in that we want to apply the broker approach to coordinate UAV use between multiple operators and to assist the associated mission planning.
3 Application
Our application focuses on a manned two-seated helicopter that is teamed with three UAVs to accomplish dynamic and complex military missions [12]. The UAVs support the reconnaissance of mission-relevant areas to increase the helicopter’s sensor range and are controlled by the pilots according to a task-based guidance approach [13]. This approach enables the helicopter crew to provide high-level mission tasks to the UAVs rather than requiring low-level flight control. Using automation on board the UAV, a complex, high-level task is broken down into executable sub-tasks that can be processed in sequence. In addition, the UAVs are made available to a third-party user within the mission area who also requires their capabilities. This additional user (the client) can request specific tasks as well as temporary access to the UAVs during mission execution. The helicopter crew (the host) can respond to these requests, considering the impact on their mission plan, and, if desired, task the UAVs appropriately. For this decision, the pilots have to develop a mental model of the UAV tasks required by the client and manage it in accordance with their own UAV-related mission requirements. Furthermore, they must establish the optimal task allocation, comprising task assignment and scheduling. This allocation burden introduces additional cognitive effort and a non-negligible extra task load for the pilots, possibly without providing additional benefit to the execution of the original mission. In the worst case, such a request would even interfere with the ongoing mission plan and negatively affect the crew’s mission performance. Our goal is to support this decision-making and to encourage a shared UAV employment that is most beneficial to each of the users. Therefore, we are developing an assistant system functionality that evaluates requests according to their impact on mission efficiency and generates support for integrating them into the local mission plan.
4 Implementation
The implemented framework was designed to support mission planning by proposing solutions for implementing foreign requests into the helicopter mission plan. In principle, the helicopter crew and the client plan their missions individually and execute them independently. Interaction between them occurs only when the client relies on support from the pilots. In order to enable this support, the services of their subordinate UAVs are offered to be used by the client, as long as their task load permits this. The client is then given the opportunity to request the offered services.
To coordinate such requests, we use a centralized task allocation procedure that is modelled using three distributed agents: a requesting agent (R-Agent), a providing agent (P-Agent) and a middle-agent.
The R-Agent is located at the requesting side and represents the client’s interests in the negotiation with the middle-agent. To this end, it generates requests according to the client’s specifications, communicates them to the middle-agent, and returns the results to the client.
The P-Agent is located at the providing side. It generates solutions for integrating a request into the local mission plan, considering the interests of the host. The quality of the resulting mission plans is then used for negotiating with the middle-agent. Once the negotiation is complete, the P-Agent communicates the request and the proposed solution to the host and forwards the decision back to the middle-agent.
The middle-agent mediates between the R-Agent and the P-Agent. Its task is to coordinate the request and provision of services. We developed it on the basis of a broker architecture and designed it to equally consider the interests of both parties. It receives a request from the R-Agent, negotiates it with the potential P-Agent, and transmits the result back to the R-Agent. As a central intermediary, the broker also serves to preserve private information. All communication between client and host passes through the broker and can accordingly be anonymized if desired.
In order to consider the respective interests of client and host, we pursue a negotiation approach based on individual utility values representing their preferences. These values are used by the middle-agent to rate different solutions against each other and are thus used as decision parameters for the integration.
Utility Client. In order to integrate a request into the mission plan in the interest of the client, we need a representation of the request’s usefulness for mission execution. This value need not be constant, but can be variable over a period of time. One can imagine that certain tasks have a higher utility if their execution is temporally well integrated into the mission. However, the same task can have a lower utility value at some non-optimal point in time. In our application, the client’s preference and thus the utility of a request depends exclusively on the time of its execution. This time-dependent process can be described by a utility function. To manually define such a function, the client can specify a preferred time representing maximum utility. The utility value reduces as time deviation increases. This decrease can be modeled in different ways (e.g. linear). The utility can furthermore be restricted by specifying deadlines. Before the earliest and after the latest deadline, the utility is zero.
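The time-dependent client utility described above can be sketched in Python as follows. The linear decrease is one of the options mentioned in the text; the function name, the parameter names, the normalization to a maximum of 1 and the choice to let the utility reach zero exactly at the deadlines are assumptions made for illustration.

```python
def client_utility(t, preferred, earliest, latest, max_utility=1.0):
    """Utility of executing a request at time t (all times in seconds).

    The utility peaks at the preferred time, decreases linearly towards the
    deadlines and is zero before the earliest and after the latest deadline.
    """
    if not earliest <= preferred <= latest:
        raise ValueError("preferred time must lie between the deadlines")
    if t < earliest or t > latest:
        return 0.0
    if t == preferred:
        return max_utility
    # Distance from the preferred time, relative to the deadline on that side.
    span = preferred - earliest if t < preferred else latest - preferred
    return max_utility * (1.0 - abs(t - preferred) / span)
```

For example, with a preferred time of 100 s and deadlines at 40 s and 160 s, an execution at 70 s or at 130 s yields half of the maximum utility, and any execution outside the deadlines yields none.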
Rather than a manual definition, it would also be conceivable to determine this utility function automatically using a mission cost metric for the client. Transmitting this function allows the middle-agent to perform the negotiation without communicating again with the R-Agent. Alternatively, it would be possible to query the utility values incrementally during the negotiation process. However, this would involve additional communication overhead.
Utility Host. Fulfilling a request may require the pilots to accept a task or the temporary absence of a UAV that would not have been necessary for their mission accomplishment. Due to this additional effort, it is possible that they will not experience any immediate benefit to mission completion. In the worst case, the request can have a negative effect on efficiency or even conflict with the helicopter mission. Nevertheless, there are certain considerations that can be taken into account in order to meet the pilots’ preferences. Basically, there should always be an incentive to utilize the UAVs as efficiently as possible and thus keep mission costs low (i.e., distances traveled, mission duration, etc.). In addition, the decision-making process should induce a minimum of additional workload. Another consideration is therefore to integrate the request in such a way that the resulting mission plan deviates as little as possible from the original mission plan. In this way, the proposed solution remains quickly comprehensible and allows fast decision-making with low mental load.
Broker. The broker’s job is to negotiate a request with the P-Agent so that a maximum overall utility is achieved across both parties.
In our first approach (depicted in Fig. 1), the request is directly forwarded to the pilots’ P-Agent. The P-Agent then determines how to best integrate the request into the local mission plan by feeding the client’s utility function directly into the P-Agent’s allocation metric. In this approach, the P-Agent solves the scheduling problem by directly considering both requester and provider preferences, thus providing a single solution that is optimized from the allocation metric’s point of view.
In our second approach (depicted in Fig. 2), the broker identifies promising allocations (pairs of UAV and time) to integrate a request, each representing an alternative integration of the requested task into the local mission plan. For this estimation, the broker keeps a continuously updated overview of how the offered services are currently loaded. These alternatives are then forwarded to the P-Agent and evaluated according to their effect on mission cost. Within this approach, the utility function of the request is neither forwarded to the P-Agent nor considered in its allocation metric. Instead, the P-Agent solves the planning problems under the allocation constraints as they are requested by the broker. The broker then receives several solutions for integrating the request into the local mission plan and evaluates them using a multi-criteria decision-making approach. We compare the mission costs incurred with the utility value for the client and rank the solutions using a TOPSIS [14] analysis.
In both approaches, the helicopter crew is confronted with the request and the generated solution proposal on a display in their cockpit. A tactical map shows the spatial properties and a timeline indicates the temporal impact of the request on the mission plan.
Based on this information, the pilots have to decide about the integration of the request. They can either accept the proposed solution, modify it or completely decline the request. If the request is granted, the corresponding service is executed by the UAVs and the result is returned to the client via the broker. If the request is denied, the client is notified that the request cannot be served at the moment.
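The ranking step of the second approach can be illustrated with a compact TOPSIS sketch in Python. It follows the standard procedure (vector normalization, weighting, distances to the ideal and anti-ideal points, closeness score). The two criteria match the text (mission cost as a cost criterion, client utility as a benefit criterion), but the equal weights and the three candidate allocations are hypothetical values chosen only for illustration.

```python
import math


def topsis(alternatives, weights, benefit):
    """Return one closeness score in [0, 1] per alternative (higher is better).

    alternatives: list of criterion-value lists, one per alternative
    weights:      one weight per criterion
    benefit:      True if larger is better for that criterion, else False
    """
    n_criteria = len(weights)
    # Vector normalization per criterion, followed by weighting.
    norms = [math.sqrt(sum(a[j] ** 2 for a in alternatives)) or 1.0 for j in range(n_criteria)]
    weighted = [[weights[j] * a[j] / norms[j] for j in range(n_criteria)] for a in alternatives]
    # Ideal and anti-ideal points depend on whether a criterion is a benefit or a cost.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.5)
    return scores


# Hypothetical candidate allocations: [increase in mission cost in %, client utility].
candidates = {"UAV1 @ 120 s": [1.0, 0.9], "UAV2 @ 90 s": [4.0, 1.0], "UAV3 @ 200 s": [0.5, 0.3]}
scores = topsis(list(candidates.values()), weights=[0.5, 0.5], benefit=[False, True])
ranking = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
```

Because each criterion is normalized separately, mission cost and client utility do not need to share a common scale, which is the property the second approach relies on.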
Fig. 1. Framework diagram of our first approach to manage the task allocation between a host and a client in a MUM-T scenario.
Fig. 2. Framework diagram of our second task allocation approach in a MUM-T scenario.
Multiple Hosts. Although our application considers only a single host (namely the helicopter crew), the underlying concept of the broker is suitable for the coordination between multiple hosts and multiple clients. Accordingly, its overview of available services is extended to cover multiple hosts. Upon receiving a request, the broker uses an estimate to determine which of the registered hosts are best suited to fulfill the request. Our approaches are extended in that the request is routed to the P-Agents of all potential hosts. Depending on the approach, the broker then receives one or more solutions from each P-Agent that integrate the request into the local mission plan. These solutions can in turn be ranked based on the TOPSIS analysis, regardless of the chosen approach. The host with the highest-ranked solution is ultimately assigned the request. If this host rejects the request, the broker can contact the host of the next-highest-ranked solution.
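The fallback behavior for multiple hosts can be sketched as a loop over the ranked solutions. The data layout (a list of host and score pairs, for example closeness scores from a TOPSIS ranking), the callback standing in for the crew's decision and the rule that each host is asked only once are assumptions made for illustration.

```python
def assign_request(ranked_solutions, accepts):
    """Offer a request to hosts in order of decreasing solution score.

    ranked_solutions: iterable of (host, score) pairs
    accepts:          callable(host) -> bool, standing in for the host's decision
    Returns the first host that accepts, or None if every host declines.
    """
    asked = set()
    for host, _score in sorted(ranked_solutions, key=lambda pair: pair[1], reverse=True):
        if host in asked:
            continue  # assumption: a host offering several solutions is asked only once
        asked.add(host)
        if accepts(host):
            return host
    return None


# Hypothetical scores; host_B offers the best solution but declines the request.
solutions = [("host_A", 0.62), ("host_B", 0.86), ("host_A", 0.41), ("host_C", 0.38)]
winner = assign_request(solutions, accepts=lambda host: host != "host_B")
```

If no host accepts, the function returns None, which corresponds to notifying the client that the request cannot be served at the moment.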
5 Evaluation and Discussion
Specifically with regard to solving the scheduling problem, our first approach is preferable to the second. The first requires less computation and communication during negotiation, and we obtain a solution that minimizes the mission cost for the host. However, a requirement for this approach is that the utility for the client can outweigh additional mission costs for the host. Accordingly, the client utility needs to be appropriately scaled to fit the other costs and utilities contained in the allocation metric, which can be difficult to achieve when no information about the individual metrics is shared. In addition, the utility function needs to be passed to the P-Agent, which can entail privacy issues. The R-Agent’s utility function, as well as the P-Agent’s allocation metric, are sensitive information that the parties may not want to disclose.
Our second approach addresses these issues. Due to the indirect multi-criteria decision-making comparison, this approach allows negotiation regardless of whether mission costs and client utility have the same scaling and can outweigh each other. Moreover, it preserves private information, since client preferences are not forwarded to the provider. However, these benefits come at the cost of potentially limiting
the solutions, because the alternatives selected for evaluation are based purely on the broker’s estimation. The closer this estimation matches the allocation metric of the P-Agent, the more likely we are to obtain the same solution as from the first approach. Additionally, this approach increases computation and communication cost for both the P-Agent and the broker. However, as long as we consider only a few hosts and requests, this increase may be negligible in our application.
We validated both approaches in a simulated multi-operator MUM-T environment. A helicopter research simulator served as command vehicle, equipped with three subordinate UAVs. The services of the UAVs primarily served the helicopter crew’s mission execution, but were made available to the additional user as far as their capacities allowed. In eight different use cases, a requested service was integrated into the crew’s mission plan. Each planning problem was solved with both approaches. To ensure direct comparability of the two approaches, the client utility was scaled to match the allocation metric of the P-Agent. In all use cases, the second approach provided the same UAV allocation, with negligible deviations in time allocation (M = 0.75 s, SD = 2.81 s) and in the incurred increase in mission cost (M = 1.77%, SD = 2.48%). Accordingly, our second approach was able to provide the same solutions with almost identical mission costs and client utility.
6 Conclusion
Our framework aims to support the coordinated shared use of UAVs by multiple users in a networked operations space. We consider a helicopter equipped with multiple UAVs and an additional user who also requires the capabilities of these UAVs. To support the mission planning in this multi-operator MUM-T application, we have developed an assistant system function. It is designed to coordinate the request of UAV services and their provision by evaluating potential integrations and proposing optimized solutions. This functionality was implemented in a helicopter research simulator. To account for the respective preferences of both parties, we have defined individual utility values. Coordinating the request and provision of UAV services is based on a broker architecture with three distributed agents: one supports the interests of the provider, one supports the interests of the requester, and the third coordinates the negotiation by communicating with the other two agents. We modeled the negotiation procedure with two approaches that use different mechanisms to integrate the request into the helicopter mission plan. Their respective advantages and disadvantages in terms of universal use and performance were discussed. Furthermore, we have shown how both approaches can be extended to account for multiple hosts. A validation showed that our second approach was able to provide solutions almost identical to those of our first approach while offering enhanced privacy and universality. Although we demonstrated that our system can generate valid solutions, we still need to evaluate whether they can support the pilots’ allocation task. This is to be investigated in an upcoming human-in-the-loop experiment with helicopter pilots in our research simulator.
References
1. Uhrmann, J., Strenzke, R., Rauschert, A., Meitinger, C., Schulte, A.: Manned-unmanned teaming: artificial cognition applied to multiple UAV guidance. In: NATO RTO SCI-202 Symposium on Intelligent Uninhabited Vehicle Guidance Systems, Neubiberg (2009)
2. Strenzke, R., Uhrmann, J., Benzler, A., Maiwald, F., Rauschert, A., Schulte, A.: Managing cockpit crew excess task load in military manned-unmanned teaming missions by dual-mode cognitive automation approaches. In: AIAA Guidance, Navigation, and Control Conference, Portland, pp. 6237–6260 (2011)
3. Schulte, A., Donath, D.: Cognitive engineering approach to human-autonomy teaming (HAT). In: 20th International Symposium on Aviation Psychology, Dayton (2019)
4. Chen, J.Y., Barnes, M.J., Harper-Sciarini, M.: Supervisory control of multiple robots: human-performance issues and user-interface design. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 41(4), 435–454 (2010)
5. Roth, G., Schulte, A.: A concept on the shared use of unmanned assets by multiple users in a manned-unmanned-teaming application. In: Harris, D., Li, W.-C. (eds.) HCII 2020. LNCS (LNAI), vol. 12187, pp. 189–202. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-49183-3_15
6. Wiederhold, G.: The Architecture of Future Information Systems. A Mediator Architecture for Abstract Data Access. Report No. STAN-CS-90–1303, Stanford University, pp. 1–36 (1990)
7. Genesereth, M.R., Ketchpel, S.P.: Software agents. Commun. ACM 37(7), 48 (1994)
8. Klusch, M., Sycara, K.: Brokering and matchmaking for coordination of agent societies: a survey. In: Omicini, A., Zambonelli, F., Klusch, M. (eds.) Coordination of Internet Agents, pp. 197–224. Springer, Heidelberg (2001). https://doi.org/10.1007/978-3-662-04401-8_8
9. Decker, K., Williamson, M., Sycara, K.: Matchmaking and brokering. In: Proceedings of the Second International Conference on Multi-Agent Systems, vol. 432 (1996)
10. Arisha, K., Ozcan, F., Ross, R., Kraus, S., Subrahmanian, V.S.: IMPACT: the interactive Maryland platform for agents collaborating together. In: Proceedings International Conference on Multi Agent Systems, pp. 385–386. IEEE (1998)
11. Mahmoud, S., Mohamed, N.: Collaborative UAVs cloud. In: 2014 International Conference on Unmanned Aircraft Systems (ICUAS), pp. 365–373. IEEE (2014)
12. Uhrmann, J., Schulte, A.: Concept, design and evaluation of cognitive task-based UAV guidance. Int. J. Adv. Intell. Syst. 5(1), 145–158 (2012)
13. Uhrmann, J., Schulte, A.: Task-based guidance of multiple UAV using Cognitive automation. In: COGNITIVE 2011, The Third International Conference on Advanced Cognitive Technologies and Applications, pp. 47–52 (2011)
14. Hwang, C.L., Yoon, K.: Multiple attribute decision making. In: Yoon, K. (ed.) Methods and Applications A State-of-the-Art Survey. LNE, vol. 186. Springer, Berlin (1981). https://doi.org/10.1007/978-3-642-48318-9
Robots in Transportation Systems
Conditional Behavior: Human Delegation Mode for Unmanned Vehicles Under Selective Datalink Availability
Carsten Meyer and Axel Schulte
Institute of Flight Systems (IFS), Universität der Bundeswehr Munich (UBM), Werner-Heisenberg-Weg 39, 85577 Neubiberg, Germany
{carsten.meyer,axel.schulte}@unibw.de
Abstract. In this article, we investigate the workshare and task allocation of humans and cognitive agents onboard unmanned vehicles (UV) in military scenarios with temporarily unavailable datalink. Human control over UV depends on a reliable datalink connection. Useful operation under temporary datalink interruption requires automatic capabilities of the UV to react to developing situations and pursue the mission objective without human input. We aim to increase controllability during these times by the pre-definition of behaviors (what actions to take if certain conditions are met). The introduction of such working agreements between human and UV-onboard agent raises the question of task allocation and workshare between human and UV. Based on this discussion, we present the implementation of the proposed solution in our helicopter simulator with cockpit-based UV control. Although this article focusses on military application, the findings promise relevance to other fields such as the integration of UV into civilian airspace.
Keywords: Human autonomy teaming · Task-based delegation · Datalink · Deliberation · Controllability
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 35–42, 2021. https://doi.org/10.1007/978-3-030-79997-7_5

1 Introduction and Background
Human-autonomy teaming (HAT), as a concept for looking at highly automated human-machine systems, has gained a lot of popularity in recent years in the military context [1]. It describes a team consisting of humans and intelligent technical agents in collaborative pursuit of a common work objective. Mission effectiveness and the survivability of humans are increased by deploying unmanned systems instead of manned systems for critical tasks. Research interests of the Institute of Flight Systems at the University of the Bundeswehr Munich include Manned-Unmanned Teaming (MUM-T) military scenarios, consisting of one manned aircraft teamed up with several Unmanned Aerial Vehicles (UAV). The UAV are used, for example, as scouts ahead of the manned ship to detect possible threats so that they can subsequently be avoided. A human Air Mission Commander (AMC) can issue commands to their UAV, which execute them and transmit their actions and state (position, velocities, system states, etc.). The AMC bears authority over, and accountability for, the UAV in the researched MUM-T scenarios. These obligations require the AMC to be able to exert “Meaningful Human Control” (MHC), a requirement often formulated especially for Autonomous Weapon Systems, but without a clear specification of its preconditions [2]. Verdiesen et al. [3] therefore derived a framework for “comprehensive human oversight over Autonomous Weapon Systems which may ensure solid controllability and accountability for the behaviour [sic] of Autonomous Weapon Systems”. This framework is structured in three layers of control mechanisms (control at the technical, socio-technical and governance level) and three phases of deployment (before, during and after deployment).

The internal (technical) layer describes the necessary requirements for the system from a control system engineering point of view: inputs, robust feedback mechanisms, and a human-understandable output. The socio-technical layer (between the internal and external layers) focusses on the relationship and interactions between the human operators and the autonomous system: the operators shall fully understand the limitations and capabilities of the system and be able to set control measures before deployment. During deployment, they shall know what the autonomous system is doing and have a way to influence its actions. In this article, we use the term “observability” for the capability to know what the system is doing and “controllability” for the capability to influence its actions (both terms are used similarly to their use in system control theory [4]). After deployment, they shall be able to assess the behavior of the system in order to account for its actions. The external control layer (governance) refers to the political and institutional oversight mechanisms that are used to justify the deployment of units (e.g. a government mandate). This layer is of less interest in our study.
Both observability and controllability require a technical way to transmit the respective data, i.e. a working datalink. Datalinks are susceptible to numerous adverse effects, especially in military missions where both transmitting parties (UAV and manned ship) are mobile and operate in an only partially controlled environment. These effects include, but are not limited to, hostile actions (jamming), interruptions of line of sight (e.g. by terrain; applicable only to line of sight datalinks), as well as deliberate radio silence to reduce detectability by avoiding emissions. Operating human-autonomy teams in scenarios with temporarily unavailable datalinks results in a gap in the control mechanism of the socio-technical layer during deployment (specifically during the datalink interruption). In this article we present an approach that compensates for this control gap by extending the control mechanism of the socio-technical layer before deployment (a similar idea is very briefly mentioned in Chapter 6 of [3]). This type of operation also requires the UAV to have enough autonomy to modify its plan, so that it keeps contributing to the overall mission objective when no external control is possible. Therefore, we discuss which role the UAV has to take in the human-autonomy team in these scenarios as well as the workshare between human and autonomous system. Finally, we present an exemplary implementation in our research simulator.
2 Implications of Datalink Loss for Hierarchical Control of UAVs by Humans
In order to better understand the relation of humans working with highly automated mission systems and to derive further options on how the workshare between humans and cognitive agents can be organized, we introduce a descriptive language (details in [5]). This language can be used to state the role of humans and artificial cognitive agents explicitly and abstractly as part of a Work System (WSys). A specific configuration of humans and cognitive agents is referred to as a “design pattern”. We will give an example of a commonly used design pattern for UAV control in HAT and discuss its limitations with regard to datalink interruptions.

2.1 HAT Work System Descriptive Language
HAT work systems consist of different actors: humans, cognitive agents and conventional automation, each drawn with its own symbol in the figures. Actors that pursue the work objective (WObj) of the whole work system (e.g. “fly transport mission safely”) on their own initiative are called “Workers”. Actors that can only receive commands from Workers are called “Tools”. Conventional automation can only act as a Tool and humans can only be Workers, whereas cognitive agents can assume both roles, depending on their configuration (indicated by a “W” or “T” on the symbol). The organizational structure of the different actors can be either heterarchical (cooperation, assistance) between humans and/or cognitive agents, or hierarchical (tasking, delegation) between any actors.

2.2 Commonly Used HAT Work System Design Patterns for UAV Control
Due to technical limitations such as time delay [6], or human factors limitations such as workload [7], UAV control is mostly exerted through a few time-discrete commands to the flight management system or through abstract tasks [7].
If abstract tasks are used to set the goal of an onboard cognitive agent, the human is in “supervisory control” [8], delegating tasks to a cognitive agent, which then interacts with the low-level systems and conventional automation (autopilot etc.) of the UAV. Most currently used work system designs for UAV control by a human can be described as hierarchical, regardless of whether these commands are issued to a single UAV, a team of UAV or a swarm (see Fig. 1, [9]). The delegated tasks are unconditional, i.e. the recipient is not given the option to choose between multiple variations of the same task or to decide whether or not to execute it at all. The cognitive agent acts as a Tool and is prohibited from altering, replacing or disregarding any explicitly given and valid command. As a Tool, the agent has no knowledge of the WObj, which is a privilege of Workers, and it will never take the initiative to pursue the WObj unless explicitly tasked to do so. A Tool relies on its supervising Worker to adjust the plan and any delegated tasks to account for new developments (e.g. retreat on engagement by the enemy, divert due to weather, etc.). These adjustments can only be made if the human Worker is able to observe and control the subordinate Tool, which is impossible while the datalink between them is interrupted.
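The role rules of the descriptive language can be sketched as a small data model. This is only an illustration under our own assumptions, not the notation or software of [5]: the class names, the `takes_initiative` helper and the example actor names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ActorKind(Enum):
    HUMAN = "human"
    COGNITIVE_AGENT = "cognitive agent"
    CONVENTIONAL_AUTOMATION = "conventional automation"


class Role(Enum):
    WORKER = "Worker"  # knows the work objective and pursues it on its own initiative
    TOOL = "Tool"      # only executes the commands it receives


# Roles each kind of actor may assume, as stated in Sect. 2.1.
ALLOWED_ROLES = {
    ActorKind.HUMAN: {Role.WORKER},
    ActorKind.COGNITIVE_AGENT: {Role.WORKER, Role.TOOL},
    ActorKind.CONVENTIONAL_AUTOMATION: {Role.TOOL},
}


@dataclass(frozen=True)
class Actor:
    name: str
    kind: ActorKind
    role: Role

    def __post_init__(self):
        # Reject configurations the descriptive language does not permit.
        if self.role not in ALLOWED_ROLES[self.kind]:
            raise ValueError(f"{self.kind.value} cannot act as {self.role.value}")


def takes_initiative(actor: Actor) -> bool:
    """Only Workers pursue the work objective without being tasked (hypothetical helper)."""
    return actor.role is Role.WORKER


amc = Actor("AMC", ActorKind.HUMAN, Role.WORKER)
uav_agent = Actor("UAV agent", ActorKind.COGNITIVE_AGENT, Role.TOOL)
```

With the commonly used pattern of Fig. 1, `takes_initiative(uav_agent)` is false, which is exactly the limitation that matters once the datalink to the AMC is interrupted.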
Fig. 1. Elementary design pattern for task delegation
3 UAVs as Worker and Compensation of Control Gap
The combination of a UAV unable to pursue the WObj on its own initiative (due to its role as a Tool) and a human Worker unable to control the UAV during datalink interruptions leads us to suggest that the UAV agent adopt the role of a Worker. In this role, it can still be utilized in task-based delegation mode (i.e. supervisory control), but it can also change its plans and tasking by itself, under certain conditions, during a datalink interruption. Additionally, we propose a way to limit the effects of the control gap during the datalink interruption by offering additional control options prior to task execution. To achieve these changes, it is necessary not only to restructure the Work System design (which actor plays which role and interacts with whom) but also to redefine which actor is accountable for a task under which conditions, and how and when this task allocation may change. These assignments are referred to as the Working Agreement (WA; [1] Chap. 4.4). We define the new WA and WSys design based on the two underlying conditions: no active datalink and active datalink.

3.1 No Active Datalink
The cognitive UAV agent is a Worker and can deliberate (see footnote 1) to pursue the overall WObj within human-defined boundaries. The cognitive agent primarily fulfills the tasks delegated by the AMC. If all explicitly given tasks are completed (and the datalink is still unavailable), the cognitive agent takes the necessary actions to pursue the WObj (e.g. it chooses another task that supports the mission or tries to reestablish the datalink). This increase in deliberation capabilities mandates that the cognitive agent be physically located onboard its respective UAV, to ensure that the agent is always capable of controlling the system and adapting to a developing situation without external interaction (see Fig. 2).
The compensation of the control gap during times without datalink can be achieved by giving the human a way to constrain the available planning/deliberation space of the agent. These limitations are formulated as a “behavior” [10]. Behaviors are defined by the controlling human while an active datalink exists, as a set of conditional commands that define the allowed reactions to events that may occur in the future when no interaction is possible. Prominent examples of conditional commands in human-human work relationships include clearances under instrument flight rules, which contain an “Expect…” part telling pilots what routing to expect if they lose communications with air traffic control [11]. In military reconnaissance operations, it is common to issue “reconnaissance guidance” to scouts, defining how to behave, e.g. when an enemy unit is discovered [12].

Footnote 1: Deliberation refers to the resource-bounded generation of plans of action, based on a means-end reasoning process. While the agent is committed to the generated plan, it also frequently reevaluates it [17].
Fig. 2. Work system and working agreement (orange boxes) for potentially unavailable datalink
3.2 Active Datalink
In times of active datalink, the human Worker is in hierarchical control of the cognitive UAV agent, delegates tasks to it, and sets the desired behaviors for times of interrupted datalink (see Fig. 2). These manually adaptable behaviors increase the complexity of tasking a UAV, which is undesirable in an already high-workload environment. Datalink interruptions (especially jamming) can hardly be predicted, but they demand that an acceptable behavior be set quickly. In order to reduce the effort and time necessary to set a behavior, the cognitive agent of the UAV can offer decision support and make behaviors adaptive (i.e. automatically changing according to the overall situation). The adaptivity is achieved by suggesting an adequate behavior to the human, who can accept, reject or adjust that behavior. This results in a cooperative relationship between human and cognitive agent with regard to UAV behaviors. Under certain conditions, the Level of Automation (LOA) [13] at which a suggested behavior is adopted is changed from “adopt only if approved by the human” (LOA 5) to “adopt unless the human disapproves” (LOA 6). This further increases the resilience against a sudden datalink loss that prevents the human from processing the proposed behaviors. In this case, the UAV cognitive agent will adopt a proposed behavior if the datalink loss persists for an extended amount of time and the human did not reject the proposal before the datalink loss occurred.
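The difference between the two adoption modes can be made concrete with a short sketch. It is a minimal illustration, assuming a hypothetical function `adopt_suggestion` and a hypothetical threshold `timeout_s` for what counts as an extended datalink loss; the text does not specify either.

```python
from enum import Enum


class Response(Enum):
    NONE = "no response received"
    APPROVED = "approved"
    REJECTED = "rejected"


LOA_APPROVE = 5  # adopt only if approved by the human
LOA_VETO = 6     # adopt unless the human disapproves


def adopt_suggestion(loa: int, response: Response,
                     link_lost_s: float, timeout_s: float) -> bool:
    """Decide whether the UAV agent adopts its own suggested behavior.

    link_lost_s: how long the datalink has been lost so far (seconds).
    timeout_s:   assumed threshold for an "extended" loss (hypothetical).
    """
    if response is Response.REJECTED:
        return False  # a rejection given before the loss always stands
    if response is Response.APPROVED:
        return True
    # No response from the human so far.
    if loa == LOA_APPROVE:
        return False
    if loa == LOA_VETO:
        return link_lost_s >= timeout_s
    raise ValueError(f"unsupported level of automation: {loa}")
```

For example, with an unanswered suggestion and a datalink lost for 120 s against a 60 s threshold, LOA 5 keeps the previous behavior while LOA 6 adopts the suggestion.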
4 Implementation
We present our implementation of the WSys and WA discussed in the previous section, with regard to the underlying datalink conditions. The implementation is under active development and functional testing in the full mission MUM-T simulator at the Institute of Flight Systems (IFS), Universität der Bundeswehr Munich.
4.1 Active Datalink
With an active datalink, the AMC controls the UAV with task-based guidance [7], issuing abstract tasks that are broken down by onboard automation into a sequence of atomic, executable commands (hereafter: a plan) via a hierarchical task network [14]. Task delegation is done via a map-like user interface by clicking on the desired target of a task and then choosing among the tasks available in that context (left side of Fig. 3). The magnification in the lower right corner of Fig. 3 shows the interface for defining task-specific behaviors. Behaviors are defined as a set of reactions (what to do if a certain event takes place, e.g. try to reestablish the datalink to transmit data) and constraints (how to execute a task, e.g. avoid engagement) [10]. They apply if the datalink is interrupted for a specific reason (line of sight interruption, jamming or commanded radio silence) and if an event occurs (e.g. detection of a unit or completion of a task). Distinguishing the settings by interruption reason is deemed necessary to account for different threat levels, from low (line of sight interruption) to high (active hostile jamming). Next to the matrix of detailed settings, preset buttons are provided for frequently used combinations. The UAV agent constantly evaluates the overall situation against the requirements of a task for the criteria of availability (how often the UAV will try to reconnect and be available for new commands), covertness (how hard it will be to detect or engage the UAV), and plan accordance (how closely the UAV will adhere to the plan that was last transmitted to the AMC). The agent then suggests the most suitable behaviors to the AMC in order to reduce tasking complexity. Figure 3 shows an active suggestion with a dialog at the top to accept or decline the proposal as a whole and, on the right side, a purple highlighting of the detailed settings that are proposed to be changed.
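One way to picture the matrix of behavior settings is as a lookup keyed by interruption reason and event. The sketch below is an assumption about a possible data structure, not the simulator's implementation; the class `Behavior`, the reaction and constraint identifiers, and the `example` preset are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, FrozenSet, Optional, Tuple


class Interruption(Enum):
    LINE_OF_SIGHT = "line of sight interruption"
    JAMMING = "jamming"
    RADIO_SILENCE = "commanded radio silence"


class Event(Enum):
    UNIT_DETECTED = "detection of a unit"
    TASK_COMPLETED = "completion of a task"


@dataclass
class Behavior:
    # Reaction per (interruption reason, event): what to do if the event occurs.
    reactions: Dict[Tuple[Interruption, Event], str] = field(default_factory=dict)
    # Constraints per interruption reason: how tasks have to be executed.
    constraints: Dict[Interruption, FrozenSet[str]] = field(default_factory=dict)

    def reaction_for(self, reason: Interruption, event: Event) -> Optional[str]:
        """Return the predefined reaction, or None if the human set none."""
        return self.reactions.get((reason, event))

    def constraints_for(self, reason: Interruption) -> FrozenSet[str]:
        return self.constraints.get(reason, frozenset())


# Hypothetical preset; the identifiers are illustrative only.
example = Behavior(
    reactions={
        (Interruption.LINE_OF_SIGHT, Event.UNIT_DETECTED): "reestablish_datalink",
        (Interruption.JAMMING, Event.TASK_COMPLETED): "continue_with_next_task",
    },
    constraints={Interruption.JAMMING: frozenset({"avoid_engagement"})},
)
```

Keying the settings by interruption reason mirrors the point made above that the acceptable reaction depends on the threat level implied by the cause of the interruption.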
4.2 No Active Datalink
The increase in onboard deliberation capabilities and control via Behaviors during times without an active datalink is realized in two steps [10]:
1. A layered-control approach [15] continuously monitors the current plan for validity and triggers a replanning if necessary. Alternatively, it inserts new tasks into the existing plan, according to the defined Reactions and if the respective event occurs.
2. Incomplete planning at the time of task delegation by the AMC: certain portions of a plan (e.g. what to do if the last explicitly given task is completed) can only be completely planned shortly before their execution, since their execution greatly depends on the prevailing situation (e.g. which task to choose or how to try to reestablish the datalink).
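The two steps above can be sketched as follows. This is a simplified illustration under our own assumptions, not the layered-control architecture of [15] or the implementation in [10]: the functions `monitor` and `next_command`, the `Deferred` placeholder, and the choice to insert reaction tasks at the front of the plan are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Union


@dataclass(frozen=True)
class Deferred:
    """Placeholder for a plan portion that is planned shortly before execution."""
    purpose: str


Step = Union[str, Deferred]
Plan = List[Step]


def monitor(plan: Plan, event: Optional[str],
            is_valid: Callable[[Plan], bool],
            replan: Callable[[Plan], Plan],
            reactions: Dict[str, str]) -> Plan:
    """Step 1: replan an invalid plan, otherwise insert the reaction task."""
    if not is_valid(plan):
        return replan(plan)
    task = reactions.get(event) if event is not None else None
    if task is not None:
        return [task] + plan  # assumption: reactions run before remaining tasks
    return plan


def next_command(plan: Plan,
                 plan_late: Callable[[str], Plan]) -> Optional[str]:
    """Step 2: pop the next command, expanding deferred portions just in time."""
    while plan:
        head = plan.pop(0)
        if isinstance(head, Deferred):
            plan[:0] = plan_late(head.purpose)  # plan for the prevailing situation
            continue
        return head
    return None
```

A plan such as `["fly_to_A", "recon_A", Deferred("after_last_task")]` then stays executable without a datalink: the explicit tasks run first, and the open portion is only filled in once the situation at that time is known.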
Fig. 3. Human-machine interface at IFS’ full mission simulator with behavior settings
5 Conclusion, Application, and Future Work
Operating UAV under the premise of a temporarily unavailable datalink requires an increase in the onboard deliberation capabilities of the UAV, so that it can react adequately to developing situations when no external inputs are available. Additionally, these operations challenge meaningful human control requirements, since the human can neither observe the UAV nor delegate or alter tasks during times without datalink. These changes call for a discussion and redefinition of the currently established roles and relationships of human and UAV in MUM-T scenarios (WSys configuration) as well as of the allowed interactions and task allocation between human and UAV (WA). The presented approach allows the UAV to independently alter an existing plan during datalink interruptions in order to pursue the overall mission objective. Furthermore, the approach aims to improve controllability prior to datalink interruptions by enabling the supervising human to predefine behaviors, which limit the actions a UAV is allowed to take during the interruption. In order to limit the increase in task delegation complexity due to the new behaviors, the UAV shall assist the human by suggesting suitable behaviors and automatically adopt them in case of a sudden datalink interruption. The presented implementation in a military MUM-T scenario uses behaviors consisting of reactions to specific events and constraints defining how to execute a task, depending on the reason for the datalink interruption. Although the proposed concept is presented in a military HAT context, it could be transferred to UAV integration in non-segregated civilian airspaces or to deep space operations that require energy preservation by limiting data transmissions, similar to Rosetta [16]. Future work will focus on integration testing and human-in-the-loop experiments to evaluate the improvements in controllability achieved by the proposed measures.
References
1. TR-HFM-247: Human-Autonomy Teaming: Supporting Dynamically Adjustable Collaboration. STO/NATO (2020)
2. Ekelhof, M.: Moving beyond semantics on autonomous weapons: meaningful human control in operation. Glob. Policy 10, 343–348 (2019). https://doi.org/10.1111/1758-5899.12665
3. Verdiesen, I., Santoni de Sio, F., Dignum, V.: Accountability and control over autonomous weapon systems: a framework for comprehensive human oversight. In: Minds and Machines (2020). https://doi.org/10.1007/s11023-020-09532-9
4. Kalman, R.E.: On the general theory of control systems. In: Proceedings First International Conference on Automatic Control, Moscow, USSR, pp. 481–492 (1960)
5. Schulte, A., Donath, D.: A design and description method for human-autonomy teaming systems. In: Karwowski, W., Ahram, T. (eds.) IHSI 2018. AISC, vol. 722, pp. 3–9. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73888-8_1
6. Valavanis, K.P., Vachtsevanos, G.J. (eds.): Handbook of Unmanned Aerial Vehicles. Springer Netherlands, Dordrecht (2015). https://doi.org/10.1007/978-90-481-9707-1
7. Uhrmann, J., Strenzke, R., Rauschert, A.: Manned-unmanned teaming: artificial cognition applied to multiple UAV guidance. In: NATO RTO SCI-202 Symposium on Intelligent Uninhabited Vehicle Guidance Systems, pp. 1–16 (2009). https://doi.org/10.14339/RTO-MP-SCI202-11-doc
8. Sheridan, T.: Human supervisory control of robot systems. In: Proceedings. 1986 IEEE International Conference on Robotics and Automation, pp. 808–812. IEEE (1986). https://doi.org/10.1109/ROBOT.1986.1087506
9. Schulte, A., Heilemann, F., Lindner, S., Donath, D.: Tasking, teaming, swarming: design patterns for human delegation of unmanned vehicles. In: Zallio, M. (ed.) AHFE 2020. AISC, vol. 1210, pp. 3–9. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-51758-8_1
10. Meyer, C., Schulte, A.: Operator controlled, reactive UAV behaviors in manned-unmanned teaming scenarios with selective datalink availability. In: 2020 International Conference on Unmanned Aircraft Systems (ICUAS), pp. 1673–1679. IEEE (2020). https://doi.org/10.1109/ICUAS48674.2020.9214018
11. 14 C.F.R. §91.185: IFR operations: Two-way radio communications failure, USA (1989)
12. US Department of the Army: FM 3-98: Reconnaissance and Security Operations, Washington, DC (2015)
13. Sheridan, T.B., Verplank, W.L., Brooks, T.: Human and Computer Control of Undersea Teleoperators. Massachusetts Institute of Technology: Man-Machine Systems Laboratory, Cambridge, Mass (1978)
14. Rudnick, G., Schulte, A.: Implementation of a responsive human automation interaction concept for task-based-guidance systems. In: Harris, D. (ed.) EPCE 2017. LNCS (LNAI), vol. 10275, pp. 394–405. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-58472-0_30
15. Watson, D.P., Scheidt, D.H.: Autonomous systems. Johns Hopkins APL Tech. Digest. 26, 368–376 (2005)
16. West, J.L., Accomazzo, A., Chmielewski, A.B., Ferri, P.: Space mission hibernation mode design: lessons learned from Rosetta and other pathfinding missions using hibernation. In: 2018 IEEE Aerospace Conference, pp. 1–14. IEEE (2018). https://doi.org/10.1109/AERO.2018.8396812
17. Rao, A.S., Georgeff, M.P.: Deliberation and its role in the formation of intentions. In: Uncertainty Proceedings 1991, pp. 300–307. Elsevier (1991)
Lethal Autonomous Weapon Systems: An Advocacy Paper
Guermantes Lailari
Strategic and Innovative Group, LLC, Falls Church, VA 22046, USA
Abstract. Some countries, human rights organizations, artificial intelligence experts and academics have expressed doubts about the moral, ethical, and legal development and use of Lethal Autonomous Weapon Systems (LAWS). The United States, United Kingdom, Israel, Russia and many other countries have disagreed with these concerns. This paper will argue that (1) LAWS already exist and are in use by many countries for both defensive and offensive purposes and (2) LAWS cannot be legislated away; the technology is pervasive from smart cars and autonomous ships to military self-protection systems. As long as countries and their respective militaries follow internationally accepted norms when using LAWS such as the Laws of War, principles of war, and have a systematic legal review process, militaries will have the sufficient and necessary controls to address those who criticize and oppose their development and use.
Keywords: Artificial intelligence · Autonomous weapon · Human-in-the-loop · Human-on-the-loop · Human-out-of-the-loop · Land mine · Laws of war · Lethal autonomous weapon system · Military · Principles of war · Robot · Sea mine · Unmanned aerial vehicle · Unmanned ground system · Unmanned surface vehicle · Unmanned underwater vehicle
In 2012, a major debate started when Human Rights Watch (HRW) and the International Human Rights Clinic (IHRC) at Harvard Law School co-published the first [1] of several papers against the development and use of lethal autonomous weapon systems (LAWS). Their most recent diatribe against LAWS was published late in 2020 [2]. In addition to the HRW and IHRC efforts, the Campaign to Stop Killer Robots has published similar papers and tracked developments against LAWS [3]. In 2014, the United Nations also began to debate LAWS and issued recommendations under the Certain Conventional Weapons (CCW) treaty committee. Several notable lawyers [4–6] have responded to LAWS critics and argued that LAWS should not be banned: as long as they are used in accordance with the Laws of War (LoW), “it is better to develop norms to control these systems than to attempt to ban them outright” [6]. This paper will not enter into a detailed legal assessment of LAWS; many western governments, as well as Russia, are conducting research and development of LAWS based on their understanding of the LoW. As of 2020, 28 UN representatives, mostly from Latin America and Africa, along with Austria, the Holy See, Iraq, Jordan, and Pakistan, supported a preemptive LAWS ban. Eleven countries, mostly European/Western but also including Russia, South Korea, and Turkey, opposed a ban on developing and employing LAWS [7].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 43–51, 2021. https://doi.org/10.1007/978-3-030-79997-7_6
This paper will argue that LAWS are ubiquitous, and case studies will be used to demonstrate the strength of this argument. Furthermore, the paper will contend that it is impractical to legislate LAWS away. If such a ban were successful, it would have deleterious effects on countries that follow the internationally accepted LoW. In accordance with the LoW, a legal weapons review process is required for all new weapon systems and for any major modification of such a system. Given all of these checks and balances along the way, the use of LAWS will be sufficiently restrained and lawfully employed. Next, a few definitions of LAWS are provided.

A Lethal Autonomous Weapon System, sometimes referred to as an autonomous weapon system (lethal is redundant), has several definitions. The International Committee of the Red Cross (ICRC) defines autonomous weapon systems as: “Any weapon system with autonomy in its critical functions. That is, a weapon system that can select (i.e. search for or detect, identify, track, select) and attack (i.e. use force against, neutralize, damage or destroy) targets without human intervention” [27]. The European Union uses the ICRC definition when discussing LAWS. The US Department of Defense (DOD) has three categories of autonomous weapon systems: (1) semi-autonomous weapon system, (2) human-supervised autonomous weapon system, and (3) autonomous weapon system. The definitions are as follows:
– “Semi-autonomous weapon system is a weapon system that, once activated, is intended to only engage individual targets or specific target groups that have been selected by a human operator…
– “Human-supervised autonomous weapon system is an autonomous weapon system that is designed to provide human operators with the ability to intervene and terminate engagements, including in the event of a weapon system failure, before unacceptable levels of damage occur.
– “An autonomous weapon system is a weapon system that, once activated, can select and engage targets without further intervention by a human operator. This includes human-supervised autonomous weapon systems that are designed to allow human operators to override operation of the weapon system, but can select and engage targets without further human input after activation” [9].

Similar to DOD, HRW defines LAWS as a robot that can act on its own. Human involvement is divided into three tiers, which HRW refers to as follows:
– “Human-in-the-Loop Weapons: Robots that can select targets and deliver force only with a human command” [hereafter H-in-TL];
– “Human-on-the-Loop Weapons: Robots that can select targets and deliver force under the oversight of a human operator who can override the robots’ actions” [hereafter H-on-TL]; and
– “Human-out-of-the-Loop Weapons: Robots that are capable of selecting targets and delivering force without any human input or interaction” [hereafter H-out-TL].
“…The term “fully autonomous weapon” refers to both out-of-the-loop weapons and those that allow a human on the loop, but that are effectively out-of-the-loop weapons because the supervision is so limited” [1]. DOD and HRW use similar terminology: H-in-TL is the equivalent of semi-autonomous, H-on-TL of human-supervised and H-out-TL of autonomous. This paper will use the DOD and HRW definitions interchangeably.

LAWS began with the employment of self-guided weapons against targets. Prior to this step, weapon developers applied knowledge about natural phenomena to aim and guide weapons to targets. From arrows and catapults to cannons and guns, mankind has used the scientific method to improve the accuracy of weapons, decrease the suffering of non-combatants as well as combatants, shorten the duration of conflicts, and even prevent conflict. The inclusion of autonomy via algorithms, artificial intelligence and machine learning is analogous.

Let us begin by looking at civilian autonomous systems: cars, airplanes and ships. Cars have autonomous systems built into them as part of their normal safety features, such as the air bag, which saved 50,457 lives in the US from 1987 to 2017 [10]. Other examples include crash imminent braking, which applies the brakes autonomously when the system determines that a forward collision is imminent, and dynamic brake support, which activates if the driver brakes insufficiently to avoid the crash [11]. Large commercial aircraft have many autonomous systems such as the autopilot. In January 2020, Airbus tested an automatic take-off system as a milestone of its Autonomous Taxi, Take Off, and Landing Project, which uses image processing technology and Artificial Intelligence (AI) [8]. Automatic self-protection systems are now used on some commercial airliners to protect them from man-portable air defense systems (MANPADs). Use of these MANPADs (which are LAWS after launch, i.e. H-out-TL) against civilian passenger airliners is a violation of the LoW.
For example, no one knows for sure who fired the SA-16 missiles in the 1994 shoot down of a Rwandan government flight, killing the Presidents of Rwanda and Burundi and sparking the Rwandan genocide where over 800,000 people were killed. If the Rwandan aircraft had a working self-protection system on it, perhaps both disasters might have been avoided. Autonomous self-protection systems in use include BAE’s Jet Eye, Northrup Grumman’s Guardian, Elta’s Flight Guard, and El-Op’s MUSIC. Even for sea-going vessels, both surface and subsurface, commercial shipping companies are exploring autonomous systems. For example, the autonomous shipping market was estimated to be $5.8 billion in 2020 and is projected to reach $14.2 billion by 2030 according to market research [12]. For the US Navy, they have been experimenting with autonomous surface and sub-surface systems such as the joint Navy’s and DOD’s Strategic Capabilities Office’s Ghost Fleet Overlord (GFO) unmanned surface vessel and DARPA’s Sea Hunter and NOMARS (No Manning Required, Ship) [13]. In January 2021, the GFO successfully completed a test of its abilities and traveled over 4,700 nautical miles almost totally autonomously and then participated also nearly autonomously in a naval exercise [14]. With respect to subsurface systems, underwater unmanned vehicles (UUVs) have also proliferated. For example, according to a recent CRS report from December 2020,
the US Navy has 45 UUVs, growing to 57 by the end of 2021, and began investing in the “Orca” Extra-Large Unmanned Undersea Vehicle (XLUUV) in 2019, with five to be built by 2023. UUVs “can be equipped with sensors, weapons, or other payloads, and can be operated remotely, semi-autonomously, or (with technological advancements) autonomously” [15]. China, South Korea and Russia are also aggressively pursuing large unmanned surface and subsurface vehicles.

Boston Dynamics, a robotics company, has developed an autonomous dog-like robot called Au-dog. It teamed with JPL and Caltech to add artificial intelligence capabilities and to compete in, and win, a DARPA subterranean competition with a system called Mars Dog. NASA plans to send Mars Dog and other robots to Mars to explore caves and other subterranean features on the red planet. Additionally, the research team is developing the Mars Dogs to work as a pack so that they can “work together to lower themselves into caves. While inside, the robots could continue to act in synergy. They could also talk to their counterparts, share what they are seeing in real time, charge each other, and help one another store samples” [16]. This capability will have uses in underground facilities such as subways and tunnels, as well as in military or police subterranean operations such as those along the US-Mexican border and the Israeli-Gaza border. Imagine the implications of a pack of LAWS Mars Dogs (with policing or military rules of engagement) inside an illicit tunnel.

All of the autonomous system examples to this point have been presented in a positive light. To provide balance, land mines and booby traps are examples of autonomous weapons that have had deleterious effects on civilians.
As of 2018, 164 countries had signed and ratified the Convention on the Prohibition of the Use, Stockpiling, Production and Transfer of Anti-Personnel Mines and on their Destruction, usually referred to as the Ottawa Convention of 1997 or the Anti-Personnel Mine Ban Treaty [17]. However bad these autonomous systems are, they may still be used and are not banned by the treaty. According to the convention, their use is limited as follows: their location should be confined to a specified and delineated area, or they should deactivate after a specified period of time; they should be detectable by normal minesweeping systems (they must contain metal to be detected); and they cannot be disguised as harmless objects. Additionally, the following key countries had not signed or ratified the convention as of 2018: United States, China, India, Pakistan, and Russia [18]. These countries are the largest producers of mines and hold the largest stockpiles of them.

Sea mines are restricted by the 1907 Hague Convention VIII (Hague VIII), which reflects the technology of its time, by some additional restrictions found in the Ottawa Convention, and by interpretations of the LoW having to do with distinction, proportionality (including precautions), military necessity and humanity/unnecessary suffering. Sea mine restrictions are summarized as follows:
– “mines may only be used to achieve a legitimate military outcome;
– “belligerents must retain some control over mines they have deployed and/or have the ability to render a mine safe if such control is lost;
– “notification of minefields must occur; and
– “the location of a minefield must be recorded so that the area can be cleared once hostilities have ceased” [19].

These restrictions would seem to offer a sensible framework to employ for LAWS.

The first argument against LAWS made by the United Nations (UN) and Human Rights (HR) organizations is that they should not be developed because of the lack of human control. This is not a valid argument. Any weapon system that can adjust itself after being fired is a LAWS. When Surface to Air Missiles (SAMs) are fired at invading aircraft, the SAM, once fired, is not controlled by a human; rather, sensors on-board and off-board feed it information to find its target. Many of these weapons, once activated, cannot be stopped. Therefore, using the above categories, a SAM once launched is a LAWS using human-out-of-the-loop (H-out-TL) algorithms. These systems make decisions so fast that human interaction would be useless, and only machine decision-making will make LAWS effective and efficient.

Cruise missiles such as the Tomahawk, and other weapons such as modern air-to-ground missiles (AGM), air-to-air missiles (AAM), air-to-ship missiles (ASM) and surface to air missiles (SAM), are all H-out-TL once fired. Missile defense systems such as the US PATRIOT, Aegis, Stinger and THAAD, and the Israeli designed and produced Iron Dome, all use their own algorithms to hit the ballistic or maneuvering target. The Iron Dome system, for example, could be considered an H-on-TL system, since a human is required to monitor the system to ensure that it is performing as designed, especially when dozens or more rockets are launched simultaneously in an attack.
Another weapon system that uses algorithms for targeting is the US designed, sea-based Phalanx Close-In Weapon System (CIWS, pronounced “sea-wiz”), which fires up to 4,500 20 mm rounds per minute, is effective at ranges of 1–3+ miles, and can conduct automated search, detection, tracking, engagement and kill confirmation using its own software-controlled radar system. The system is used for autonomous close protection of ships from small boats, surface torpedoes, anti-ship missiles, helicopters and rockets. Russia and China have similar systems. CIWS and its variants, which have been sold to over 40 countries, are H-out-TL once activated.

Other ground-based autonomous systems (H-in-TL and H-on-TL) are various sentry systems. South Korea developed the Samsung SGR-A1, a military robot sentry used along the Korean demilitarized zone (DMZ), which allows the operator to conduct: “…surveillance, tracking, firing and voice-recognition systems built into a single unit…provide suppressive fire with a machine gun… [It] can either sound an alarm, fire rubber bullets or make use of its… machine gun. It can understand the soldier’s arms held high to indicate surrender, and then not fire. Normally the ultimate decision about shooting would be made by a human, not the robot” [20]. Other similar sentry-type systems include the Israel Defense Forces Sentry Tech system Roeh-Yoreh (Sees-Fires), deployed along the Gaza border. For ground-mobile systems, the US Army’s “Wingman” Joint Capability Technology Demonstration autonomous program was recently tested on the M1097 HMMWV “Humvee”, the M113 armored personnel carriers and on other larger combat vehicles
[21]. In summary, all military services are testing, and some have deployed, some kinds of lethal autonomous weapon systems with different levels of human control.

Next, we move to the LoW and principles of war to understand how all weapons, including LAWS, can and cannot be used. In the beginning of this paper, the author chose not to delve deeply into the legal debate but provided references for further reading. However, readers might find it helpful to know which LoW affect or deny the use of LAWS. The key ones are: Hague Conventions, Geneva Conventions, Customary International Humanitarian Law, United Nations Convention on Certain Conventional Weapons (CCW) Treaty, and Law of Armed Conflict (LOAC). In addition to these international agreements, there are national laws and guidance within each country’s military, as in the US, the DOD’s Law of War Manual (see Sect. 6.5.9 Autonomy in Weapon Systems) [22] as well as directives and regulations, such as DOD Directive 3000.09, Autonomy in Weapon Systems. In other words, countries that follow the LoW will conduct due diligence in making sure that the use of LAWS will follow the word and intent of the law. Those countries that don’t follow the LoW, whether the country uses LAWS or other weapons, will be liable for their actions [6].

Whenever military commanders employ military capabilities, including the potential use of LAWS, they also use principles of warfare (based on the LoW) to make those decisions. Others have provided suggestions to enhance the questions asked about a weapon system. For example, three general questions are usually utilized by DOD lawyers when reviewing weapon systems to ensure they comply with the LoW:
“(1) …[Is] a specific rule, whether as a treaty obligation or viewed as customary international law, prohibiting or restricting the use of the weapon?
“(2) …[Is] in its normal or intended circumstances of use, the weapon is of a nature to cause superfluous injury or unnecessary suffering?
“(3) …[Is] the weapon …capable of being used in compliance with the rule of discrimination (or distinction)?” [23]

A legal advisor to the US Department of State suggested adding two more questions:
“(4) Whether the weapon is intended, or may be expected, to cause widespread, long-term and severe damage to the natural environment?”
“(5) Whether there are any likely future developments in the law of armed conflict that may be expected to affect the weapon subject to review?” [23]

Although procedures are in place to ask questions about the weapon system, there is always room to improve the process. However, according to the Stockholm International Peace Research Institute, as of 2015 only 12 to 15 countries were actually known to have a formal weapons review process, yet 174 countries are party to Article 36 of the 1977 Additional Protocol I to the 1949 Geneva Conventions, which requires states to have a weapons review process [25]. This demonstrates a large gap between what should be done and what is actually being done, a gap that should be closed, especially as LAWS become more prevalent.
For the US military, 12 principles of joint operations are promulgated in Joint Publication (JP) 3.0 Joint Operations: objective, offensive, mass, maneuver, economy of force, unity of command, security, surprise, and simplicity, as well as the newly added restraint, perseverance, and legitimacy [24]. Of these, offensive, mass, maneuver and surprise would appear to encourage the use of LAWS in warfighting. However, economy of force, restraint and legitimacy could impede the use of LAWS in warfare.

With respect to LAWS, DOD published in 2012 and updated in 2017 DOD Directive (DODD) 3000.09, Autonomy in Weapon Systems, which “establishes DOD policy and assigns responsibilities for the development and use of autonomous and semiautonomous functions in weapon systems, including manned and unmanned platforms” and “establishes guidelines designed to minimize the probability and consequences of failures in autonomous and semi-autonomous weapon systems that could lead to unintended engagement… or to loss of control of the system to unauthorized parties” [9]. The DODD provides a template that other countries can use to help ensure that they follow the LoW, as well as a rigorous system for legally reviewing and testing LAWS.

Finally, the four judgements and final recommendation in the recently published (January 2021) final draft report of the National Security Commission on Artificial Intelligence further reinforce the above points, not only for LAWS but also for Artificial Intelligence (AI) in general: (1) “[Provided] their use is authorized by a human commander or operator, properly designed and tested AI-enabled and autonomous weapon systems have been and can continue to be used in ways which are consistent with International Humanitarian Law (IHL)”; (2) “existing DoD procedures are capable of ensuring that the United States will field safe and reliable AI-enabled and autonomous weapon systems and use them in a manner that is consistent with IHL…”; (3) “there is little evidence that U.S.
competitors have equivalent rigorous procedures to ensure their AI-enabled and autonomous weapon systems will be responsibly designed and lawfully used…”; (4) “the Commission does not support a global prohibition of AI-enabled and autonomous weapon systems…” [26]. The commission’s final recommendation is that the US “work with allies to develop international standards of practice for the development, testing, and use of AI-enabled and autonomous weapon systems” [26]. These four judgements and the recommendation demonstrate that LAWS development and employment will continue for the foreseeable future and that adequate controls and reviews should suffice to ensure that the use of LAWS in war will follow the LoW.

In conclusion, autonomous systems are ubiquitous. The more they are used in our daily lives, the more we depend on them and trust them. This paper highlighted the current state of LAWS, including systems in development and systems already employed. It showed that LAWS are here to stay: many are being developed and some are already deployed. Within a framework in which LAWS development continues, many safeguards are in place, such as weapons review processes and the LoW, that should allay the concerns people have about these systems. However, this research also indicated that most countries do not have comprehensive weapon review processes in place to ensure that LAWS are created and employed in consideration of the LoW. Additionally, as the restrictions on simple landmines demonstrate, LAWS should have similar constraints, such as use in a defined area and/or for a defined period of time. Furthermore, there should be a built-in design practice for a human to disable or turn off the system either mechanically or remotely in
case the system has an “unintended engagement” or experiences a “loss of control” [9]. As time passes and technology improves, there should be a deliberate process to review the LoW in conjunction with LAWS development and employment (including artificial intelligence and machine learning) and ensure that the legal framework keeps up with these advances and places limitations on these systems as needed.
References
1. Losing Humanity: The Case Against Killer Robots, Human Rights Watch, November 2012
2. New Weapons, Proven Precedent: Elements of and Models for a Treaty on Killer Robots (2020). http://hrp.law.harvard.edu
3. “Publications”, Campaign to Stop Killer Robots. https://www.stopkillerrobots.org/
4. Schmitt, M.N.: Autonomous weapon systems and international humanitarian law: a reply to the critics. Harv. Nat. Secur. J. (2013)
5. Schmitt, M.N., Thurnher, J.: Out of the loop: autonomous weapon systems and the law of armed conflict. Harv. Nat. Secur. J. 231, 4 (2013)
6. Dunlap, C.J.: Accountability and Autonomous Weapons: Much Ado About Nothing? Temple Int. Comp. Law J. 30, 63–76 (2016)
7. International Discussions Concerning Lethal Autonomous Weapon Systems, updated 15 October 2020. https://crsreports.congress.gov
8. “Airbus demonstrates first fully automatic vision-based take-off”. https://www.airbus.com/newsroom/pressreleases/en/2020/01/airbus-demonstrates-first-fully-automatic-visionbasedtakeoff.html
9. Autonomy in Weapon Systems, DOD Directive 3000.09, Incorporating Change 1 on May 8, 2017 (2012)
10. “Air Bags”, National Highway Traffic Safety Administration. https://www.nhtsa.gov/equipment/air-bags
11. “Safety Technologies”, National Highway Traffic Safety Administration. https://www.nhtsa.gov/equipment/safety-technologies
12. “Autonomous Ships Market”. https://www.marketsandmarkets.com/Market-Reports/autonomous-ships-market-267183224.html
13. Larter, D.B.: Unclear on unmanned, Part 3: A New Year’s resolution to slow down (2021). https://www.defensenews.com
14. Todd, L.C.: DOD’s Autonomous Vessel Sails through Transit Test, Participates in Exercise Dawn Blitz (2021). https://www.defense.gov
15. Navy Large Unmanned Surface and Undersea Vehicles: Background and Issues for Congress, Congressional Research Service (2020)
16. Backman, I.: Very good space boys: Robotic dogs may dig into Martian caves. In: Eos, vol. 102 (2021)
17. CCW Amended Protocol II (United Nations). https://www.un.org/
18. The Ottawa Convention: Signatories and States-Parties (2018). https://www.armscontrol.org/factsheets/ottawasigs
19. Letts, D.: Beyond Hague VIII: Other Legal Limits on Naval Mine Warfare, 90 INT’L L. STUD. 446, Stockton Center for the Study of International Law (2014)
20. Pike, J.: Samsung Techwin SGR-A1 Sentry Guard Robot. Global Security (2011)
21. Kimmons, S.: ‘Wingman’ program developing armed robotic vehicles to be controlled by Soldiers, Army News Service (2018)
22. DOD Law of War Manual 2015 (updated December 2016)
23. Noone, G.P., Noone, D.C.: The debate over autonomous weapons systems. 47 Case W. Res. J. Int. L. 25, 235 (2015)
24. Joint Publication 3.0 Joint Operations, 17 January 2017, Change 1
25. Boulanin, V.: Implementing Article 36 Weapon Reviews in the Light of Increasing Autonomy in Weapons Systems, SIPRI Insights On Peace & Security (Stockholm Int’l Peace Research Inst., Solna, Sweden) (2015)
26. Draft - Final Report, National Security Commission on Artificial Intelligence (2021). https://www.nscai.gov
27. Autonomous Weapon Systems: Implications of Increasing Autonomy in the Critical Functions of Weapons. Expert meeting. International Committee of the Red Cross (ICRC), Versoix, Switzerland (2016)
Measuring the Impact of a Navigation Aid in Unmanned Ship Handling via a Shore Control Center
Gökay Yayla, Chris Christofakis, Stijn Storms, Tim Catoor, Paolo Pilozzi, Yogang Singh, Gerben Peeters, Muhammad Raheel Afzal, Senne Van Baelen, Dimiter Holm, Robrecht Louw, and Peter Slaets
Department of Mechanical Engineering, KU Leuven, 3000 Leuven, Belgium
{gokay.yayla,chris.christofakis,stijn.storms,tim.catoor,paolo.pilozzi,yogang.singh,gerben.peeters,raheel.afzal,senne.vanbaelen,dimiter.holm,robrecht.louw,peter.slaets}@kuleuven.be
Abstract. Considering the potential shift in the maritime domain towards shore-based (unmanned) navigation in the future, the role of Shore Control Centers (SCCs) and their instruments to provide the human-vessel interactions will gain in importance. With the advancements in satellite navigation systems and sensor technology, one could argue that electronic instruments can better assist shore-based skippers and improve their situational awareness. However, the correctness or precision of an assistive tool’s outcome does not necessarily mean that the tool is useful for a ship-handler’s cognition. This study aims to provide insights into this validation gap by investigating the impact of a newly developed navigation aid on the accuracy and safety performance of shore-based ship handlers.
Keywords: Navigation aid · Ship handling · Shore Control Center · Unmanned navigation · Unmanned surface vessel
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 52–59, 2021. https://doi.org/10.1007/978-3-030-79997-7_7

1 Introduction
Ship handling involves decision making and maneuvering in complex and dynamic environments. Accurate and safe ship handling skills are traditionally developed through experience and by practicing in different operational conditions. During close-proximity maneuvers, skippers tend to rely more on their sight and the visual information it provides than on the information coming from the onboard electronic instruments. This sight-based maneuvering is sometimes referred to as a part of “ship sense” [1]. However, considering the potential shift towards shore-based (unmanned) navigation in the future, this form of “sense” might need to be changed in favor of electronic instruments. With advancements in satellite navigation and sensor technology in the maritime domain, electronic instruments can better assist shore-based skippers and improve their situational awareness. Nevertheless, the “onboard eyes” might remain of paramount importance for
remote ship handling. Hence, the role of Shore Control Centers (SCCs) will gain in importance as well, since they provide the human-vessel interactions needed for the operator to acquire (remote) ship sense [2].

An important question in this context is how skippers can accurately and safely keep control over the exterior points of the hulls of their vessels while sailing in confined waters. A European Horizon 2020 (H2020) project called Hull-to-Hull (H2H) [3] presents a promising solution: it uses the European Global Navigation Satellite Systems (EGNSS), i.e., Galileo and EGNOS, to develop a navigation aid that assists skippers in their navigational decisions and supports safe navigation, which might increase the level of autonomy. In this solution, referred to as the “H2H system” from here on, skippers can see the relative positioning between two vessels, or between shore and vessel, along with an uncertainty zone that depends on the positioning accuracy (Fig. 1).
Fig. 1. A snapshot from the H2H Viewer software while the vessel is approaching for docking. The green zone shows the uncertainty and the blue lines show the relative distance/speed.
However, when using SCCs, an operator evidently loses direct ship sense, and thus the resulting harmony with the environment, which complicates ship handling [2]. When using an instrument, the precision of its outcome is a contributing factor in its cognitive success. Nevertheless, this does not automatically mean that the instrument is useful in the skipper’s cognitive process of handling a vessel. Therefore, in this study, the authors aimed to quantify this usefulness and to validate the impact of the H2H system on accurate and safe ship handling, specifically for the inland waterways. This paper continues as follows: Sect. 2 elaborates on the experimental design, Sect. 3 presents the results, and Sect. 4 concludes this study.
2 Method
Most of the existing research about unmanned ship-handling focuses on sea-going vessels and stays rather theoretical [4, 5], whereas this study aims at studying the outcome of real human-vessel interactions using sensory information through an SCC. For this study, live experiments were conducted using a remote-controlled research vessel, 4.8 m in length, operated via its SCC (Fig. 2). The research vessel, named Cogge, is a 1/8 scale model of a CEMT-I (Conférence Européenne des Ministres des Transports)
class Watertruck+ vessel [6], and it has an over-actuated propulsion system with a 360-degree-steerable bow thruster and stern thruster [7]. Even though this design increases the maneuverability of the vessel, it adds a challenge for operators who are used to conventional actuation systems (i.e., with propeller(s) and rudder(s)). The SCC, detailed in [2], includes multiple displays for realistic visualization of the camera feeds from the vessel and for H2H system monitoring, and a conning box with a human-machine interface (HMI) screen.
Fig. 2. The Cogge (left), 4.8 m in length, and the representation of its SCC (right). The Cogge has a GNSS receiver with a main (rear) and an auxiliary (front) GNSS antenna, and an Inertial Measurement Unit (IMU) to calculate the navigational states [8].
2.1 Missions
The experiments were conducted at the Leuven-Dijle Canal in Leuven (Belgium). The authors asked the participants to complete two missions representing the two common use cases: (i) straight sailing and (ii) docking (Figs. 3 and 4).
Fig. 3. Sketch of an experiment run: first, the participant starts with a straight sailing mission (blue trajectory), continues with a test docking maneuver (gray trajectory) from the starboard side, and then performs the actual docking (yellow trajectory) from the port side. The distance of a complete run is approximately 1000 m.
Fig. 4. Docking mission: outside view (left), front-port-starboard camera feeds and H2H viewer screen combined (right).
During straight sailing, the participant is asked to pass between the first buoy and the shore to replicate a static object avoidance scenario. The second buoy marks the point where the participant needs to turn around and sail back again on the centerline.

2.2 Configuration
“Configuration” is defined as the explanatory variable of the study with two levels: (i) camera only, referred to as NoH2H, and (ii) camera + H2H, referred to as H2H (Fig. 5). In the base configuration (NoH2H), the participants handled the vessel using the four camera feeds from the vessel that were displayed on the screens in the SCC. In the augmented configuration (H2H), the H2H viewer software was also displayed on another screen in front of the participant. This software is able to show the exact geometric representation of the vessel with accurate positioning (see Figs. 1 and 4). The performance with the NoH2H configuration established the benchmark reference for each participant, and it was compared to the same participant’s performance with the H2H configuration.
Fig. 5. NoH2H configuration (left), H2H configuration (right).
2.3 Participants
Since this study requires ship-handling skills that can easily vary from person to person, the random noise associated with individual differences must be minimized so that it does not mask any real impact. Therefore, a within-subjects design was selected, in which each participant was exposed to both conditions (i.e., H2H and NoH2H). The most important drawback of this design is the learning effect due to repeated measurements using the same subjects. Therefore, the starting
configuration was randomized among the participants to balance out any learning effects across the two configurations. Ideally, the authors would have conducted the experiments with skippers only, as they are the target audience for a navigation aid. However, from a practical standpoint, there was a risk of ending up with too small a sample, given the limited access to skippers willing to participate. Therefore, 25 students were also included in the sample, resulting in 32 participants in total. Even though this improved the statistical relevance of the study, its results need to be interpreted with this broader population in mind.

2.4 Data Processing
Even though it is hard to measure quantitatively and precisely how a skipper performs with and without an aid like the H2H system, Yayla et al. [9] present an approach to tackle this challenge. Following their approach, accuracy and safety were selected as the two metrics to quantify the performance of the participants in this study. Two sensors, the GNSS receiver and the Inertial Measurement Unit (IMU), measured the navigational states of the vessel. A Python script was developed to process the data as explained below and to normalize it using a min-max scaler, so that the impact can be compared on a common scale (i.e., as a percentage). For the straight sailing mission, the accuracy metric is the mean cross-track error (XTE), calculated as the Mean Absolute Error (MAE) of the main GNSS antenna position from the planned path (i.e., the centerline of the canal). For docking, on the other hand, it is the accuracy of the vessel’s final position and orientation. The main goal of navigational safety is to avoid collisions, so we need to measure how close a vessel comes to a possible collision. Therefore, the safety metric for the straight sailing mission is the Closest Point of Approach (CPA) of the hull of the vessel to both the first buoy and the shoreline.
For docking, the safety metric is defined as the vessel’s speed at the moment its hull touches the fenders, which represent the quay. Table 1 shows the basic descriptive statistics of the accuracy and safety metrics after processing the data for all 32 participants.

Table 1. Descriptive statistics of the normalized difference (impact) of the metrics between H2H and NoH2H (i.e., metric_H2H − metric_NoH2H). These values give the first impression that the H2H system might have a positive impact on accurate and safe ship handling.

| Metric | Mean | Std. Dev. | Median | Max | Min |
| Straight sailing accuracy | −16% | 18% | −17% | 40% | −47% |
| Straight sailing safety | −4% | 30% | −2% | 69% | −65% |
| Docking accuracy | −10% | 37% | −7% | 70% | −73% |
| Docking safety | 0% | 30% | −3% | 67% | −80% |
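The accuracy and safety metrics described in Sect. 2.4 can be illustrated with a short sketch. It is not the authors' script: it assumes that positions have already been projected into a local metric frame (x, y in meters), that the planned path is a straight segment, and that the hull is represented by sampled points. All function names are hypothetical, and how the min-max scaling was applied across participants is not specified in the text.

```python
import math


def cross_track_distance(p, a, b):
    """Perpendicular distance (m) from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    return abs(dx * (py - ay) - dy * (px - ax)) / math.hypot(dx, dy)


def mean_cross_track_error(track, a, b):
    """Mean absolute cross-track error of a track relative to the planned path a-b."""
    return sum(cross_track_distance(p, a, b) for p in track) / len(track)


def closest_point_of_approach(hull_points, obstacle):
    """Smallest distance (m) between any sampled hull point and a point obstacle."""
    return min(math.dist(p, obstacle) for p in hull_points)


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    # Illustrative numbers only, not data from the experiments.
    antenna_track = [(0.0, 1.0), (5.0, -1.0), (10.0, 2.0)]
    centerline_start, centerline_end = (0.0, 0.0), (10.0, 0.0)
    print(mean_cross_track_error(antenna_track, centerline_start, centerline_end))
    print(closest_point_of_approach(antenna_track, (5.0, -3.0)))
    print(min_max_scale([2.0, 4.0, 6.0]))
```

A shoreline would need a point-to-segment distance rather than the point-to-point distance used here for the buoy; that extension is left out to keep the sketch short.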
3 Results
To visualize the general distribution of the results and facilitate the first interpretation, the impact for each participant was grouped into categories. The authors considered an impact below 10% neutral and used 20% intervals to define low, moderate, and high impact areas. Figure 6 presents the categorized results for both missions and gives an impression of an accuracy improvement. The direction of the impact depends on the metric defined for each mission (e.g., a decrease in the distance to the buoy/shore, the safety metric during straight sailing, means a negative impact, whereas a decrease in speed at the docking moment means a positive impact). A small number (1 or 2) of observations outside of these bins were treated as outliers.

Then, the results were further investigated to see if the preliminary interpretations in Table 1 and Fig. 6 can be supported statistically. For this, formal hypothesis testing was applied, where the null hypothesis (H0) was defined as: “The H2H system has no (positive) impact on the specific performance metric”. The alternative hypothesis (H1) was defined as one-tailed (improvement), with its direction depending on the metric. Table 2 shows the mathematical representations of these hypotheses along with the test results. As the aim is to compare the performance metrics with the H2H system to the NoH2H metrics, it is appropriate to perform a paired test, where the difference becomes the sample and that sample is compared to zero. Since the sample size is small (n < 50) and the population variance is unknown, the t-test was used.
Fig. 6. The impact of the H2H System: accuracy and safety performance during straight sailing (a), and accuracy and safety performance during docking (b). Green represents a positive impact, and red represents a negative impact.
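The grouping behind Fig. 6 can be sketched as follows. The bin edges are an assumption read from the description in the text (below 10% neutral, then 20% intervals for low, moderate and high, anything beyond treated as an outlier); the exact edges and boundary handling used by the authors are not stated, and the function name is hypothetical.

```python
def categorize_impact(impact_pct, lower_is_better=True):
    """Map a normalized impact (percent, H2H minus NoH2H) to (level, direction).

    Assumed bins: |impact| < 10 neutral, 10-30 low, 30-50 moderate,
    50-70 high, above 70 outlier. `lower_is_better` encodes the metric:
    True for errors or docking speed, False for distance to an obstacle.
    """
    magnitude = abs(impact_pct)
    if magnitude < 10:
        return ("neutral", "none")
    if magnitude < 30:
        level = "low"
    elif magnitude < 50:
        level = "moderate"
    elif magnitude <= 70:
        level = "high"
    else:
        return ("outlier", "none")
    # A negative difference is an improvement only when lower values are better.
    improved = (impact_pct < 0) == lower_is_better
    return (level, "positive" if improved else "negative")


if __name__ == "__main__":
    # Illustrative values only.
    print(categorize_impact(-16))                        # error reduced
    print(categorize_impact(-40, lower_is_better=False)) # distance reduced
```

Passing the metric direction explicitly keeps the sign convention of Table 1 (H2H minus NoH2H) unchanged while still colouring a reduced buoy distance as a negative impact.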
Table 2. Results of the hypothesis tests, where µ is the mean impact. Looking at the values for the straight sailing metrics, the null hypothesis for the accuracy metric can be significantly rejected (p < 0.05). The results are similar for the docking mission, yet less significant (0.05 < p < 0.10).

| Metric | H0 | H1 | p | Reject H0 |
| Straight sailing accuracy | µ ≥ 0 | µ < 0 | 0.00 | Yes |
| Straight sailing safety | µ ≤ 0 | µ > 0 | 0.25 | No |
| Docking accuracy | µ ≥ 0 | µ < 0 | 0.06 | Yes |
| Docking safety | µ ≥ 0 | µ < 0 | 0.47 | No |
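The paired, one-tailed t-test behind Table 2 can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the sample values in the usage example are made up, and the Student's t distribution function is obtained by numerically integrating its density with Simpson's rule rather than by calling a statistics package.

```python
import math
import statistics


def student_t_cdf(t, df, steps=2000):
    """P(T <= t) for Student's t with df degrees of freedom (Simpson's rule)."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # signed step; steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # The density is symmetric, so the CDF is 0.5 plus the integral from 0 to t.
    return 0.5 + total * h / 3


def paired_t_test(with_aid, without_aid, alternative="less"):
    """One-tailed paired t-test on the differences with_aid - without_aid.

    alternative="less" tests H1: mean difference < 0,
    alternative="greater" tests H1: mean difference > 0.
    Returns (t statistic, p-value).
    """
    diffs = [a - b for a, b in zip(with_aid, without_aid)]
    n = len(diffs)
    t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    lower_tail = student_t_cdf(t_stat, n - 1)
    p_value = lower_tail if alternative == "less" else 1.0 - lower_tail
    return t_stat, p_value


if __name__ == "__main__":
    # Made-up normalized errors for five participants, for illustration only.
    h2h = [0.80, 0.90, 0.70, 0.85, 0.75]
    no_h2h = [1.00, 1.00, 1.00, 1.00, 1.00]
    t_stat, p_value = paired_t_test(h2h, no_h2h, alternative="less")
    print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

The `alternative` argument mirrors the two hypothesis directions in Table 2: "less" for metrics where a decrease is an improvement, and "greater" for the straight sailing safety metric, where H1 is µ > 0.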
4 Conclusion and Future Work
This study presents the results of a field study for accuracy and safety evaluation in unmanned ship handling through an SCC. Firstly, the experimental design explained in Sect. 2 offers a reasonable framework to measure the impact of a navigation aid such as the H2H system. Considering the small sample size and the nature of the experiments, it is acceptable to conclude that the H2H system increased the accuracy of participants’ performances according to the findings explained in Sect. 3. For safety, a positive effect seems to be the case, although the results are insufficient to support such a statement given the limitations of the study, that is, a limited sample size, and a lack of familiarity with the propulsion system and the maritime domain in general for many of the participants. Despite these limitations, this study validates the contribution of the H2H system in single-handed sailing and docking in inland waterways (IWW) through an SCC.

Secondly, it is not easy to filter out the impact of the H2H system quantitatively because human attention is hard to quantify: there is neither a physical unit to describe human attention nor a method to infer the value of the specific attention on an object. However, the statistical analysis with the defined performance metrics still allows the contribution of the H2H system to be validated, even though the analysis can be improved by adding other cognitive dimensions, such as eye-tracking, in future work.

Lastly, instead of being limited to experienced sailors, this study included novices (i.e., students), which presents a broader view on the usability question and possibly leads to new insights for analyzing the effect of navigational aids. Therefore, given this fact and the potential shift towards shore-based navigation for the IWW in the foreseeable future, this study could also be considered pioneering work on quantifying the impact of such navigation aids.

Acknowledgement. This study was conducted within the scope of the H2H Project, which has received funding from the European GNSS Agency under the European Union’s Horizon 2020 research and innovation programme, grant agreement No 775998.
References
1. Prison, J., Dahlman, J., Lundh, M.: Ship sense-striving for harmony in ship manoeuvring. WMU J. Marit. Aff. (2013). https://doi.org/10.1007/s13437-013-0038-5
2. Peeters, G., et al.: An Inland shore control centre for monitoring or controlling unmanned inland cargo vessels. J. Mar. Sci. Eng. (2020). https://doi.org/10.3390/jmse8100758
3. The Hull-to-Hull Project (2020). https://www.sintef.no/projectweb/hull-to-hull/
4. Wahlström, M., Hakulinen, J., Karvonen, H., Lindborg, I.: Human factors challenges in unmanned ship operations-insights from other domains. In: 6th International Conference on Applied Human Factors and Ergonomics (2015). https://doi.org/10.1016/j.promfg.2015.07.167
5. Porathe, T., Prison, J., Man, Y.: Situation awareness in remote control centres for unmanned ships. In: Proceedings of the Human Factors in Ship Design & Operation, p. 93 (2014)
6. Watertruck+: The Future of Inland Navigation (2020). http://www.watertruckplus
7. Peeters, G., et al.: An unmanned inland cargo vessel: design, build, and experiments. Ocean Eng. 201, 17 (2020). https://doi.org/10.1016/j.oceaneng.2020.107056
8. Yayla, G., et al.: Accuracy benchmark of Galileo and EGNOS for Inland Waterways. In: International Ship Control Systems Symposium (iSCSS 2020), Netherlands, IMaReST (2020)
9. Yayla, G., et al.: Impact of a navigation aid on unmanned sailing in inland waterways: design and evaluation challenges. In: Proceedings of the IEEE eXpress Conference Publishing, Oceans 2020, Singapore, 15–18 May 2020 (2020)
A Computational Assessment of Ergonomics in an Industrial Human-Robot Collaboration Workplace Using System Dynamics
Guilherme Deola Borges1(B), Rafael Ariente Neto2, Diego Luiz de Mattos1, Eugenio Andres Diaz Merino2, Paula Carneiro1, and Pedro Arezes1
1 ALGORITMI Research Centre, School of Engineering, University of Minho, Guimaraes, Portugal {pcarneiro,parezes}@dps.uminho.pt
2 Design Management Center (NGD), Design and Usability Laboratory (LDU), Production Engineering Department, Federal University of Santa Catarina, Florianopolis, Brazil [email protected]
Abstract. An automotive company in Portugal has a high rate of sick leave due to occupational diseases and plans to introduce an industrial Human-Robotic Collaboration (HRC) system to assist workers’ activities. As this system can result in physical and mental overload, depending on the different Levels of Collaboration (LoC), the aim of this paper is to predict the best working condition between worker and robot. This descriptive research takes a quantitative approach, since it prospects scenarios generated by computer simulation. It explores how the sick leave rate responds to inserting an industrial HRC system in the production line. The main results consist of the scenarios that graphically describe the evolution of the indicators over time. We conclude that counterintuitive effects can occur in these systems, and that a computational simulation is useful to predict working condition scenarios when deciding which human-robot configuration fits better.
Keywords: System dynamics · Human factors · Ergonomics · Non-linear behavior · Human-robot collaboration
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 60–68, 2021. https://doi.org/10.1007/978-3-030-79997-7_8
1 Introduction
An automotive company in Portugal plans to introduce a robotic system that collaborates with a worker, hoping to reduce physical overload. It is expected that an effective collaboration between humans and robots would combine their skills: the precision, speed and fatigue-free operation of the robot with human sensory perception and cognition. As described in [1], when managers tried to increase production expecting a certain productivity, the result was physical overload on workers, consequent sick leave, and knowledge losses, which led to productivity below expected values, a counterintuitive effect. Therefore, this study establishes a preliminary approach to the problem, exploring the basic repercussions of inserting an industrial HRC system. Considering the nonlinear characteristic of the system’s behavior, this work introduces a model based on System Dynamics (SD) representing the feedback mechanisms related to physical and mental overload, as well as the LoC of the robotic system, with a focus on task accomplishment.
There are different LoC between the worker and the robot depending on the technology available and what is needed for completing the task. As presented in Fig. 1, these levels are: Level 0 (cell), Level 1 (coexistence), Level 2 (synchronized), Level 3 (cooperation), and Level 4 (collaboration).
Fig. 1. Levels of collaboration. Source: [2].
The computational model simulates scenarios that graphically describe the behavior of the main indicators. The aim is to draw guidelines that evolve the model into a structure to apply in a practical analysis of a production line.
2 Method
A literature review was carried out to identify the most relevant factors to be considered in an industrial HRC system. Once the variables are defined, an SD approach is designed and run in the Vensim software [3]. The conceptual description and detailed model are complemented by mathematical equations of the relationships between the constants and variables. Finally, the scenarios are prospected for analysis, opening space for evaluations and policies. The sequence of activities that characterizes the research method is shown in Fig. 2.
Fig. 2. Method used in model development. Steps shown: literature review; factors in HRC systems; variables definition for the model; Causal Loop Diagram design and Stock and Flow Map design; equations design; policy design evaluation.
In the following section, the modeling activity is described in detail, showing how the factors were arranged and presenting the respective diagrams.
3 Model
SD modeling consists of a conceptual description based on the factors identified in the literature, and a mathematical description to allow computational simulation.
3.1 Conceptual Description of the System
The Causal Loop Diagram (CLD) describes the relationship between the factors or variables of a system. By organizing the factors and their relationships, the four main feedback loops are obtained, as shown in Fig. 3. The “production self-regulation” cycle describes the cause-and-effect relationships that allow the employees of the production line to adjust the work pace to meet production targets. The “learning by repetition” cycle, on the other hand, describes how daily experience leads to greater skill in executing a task, thus being a reinforcement cycle. According to systemic theory, in real contexts the reinforcement cycles have their action limited by associated control cycles. In the CLD this occurs through the action of the “limitation due to disease incidence” cycle, where disease incidence is a consequence of physical and mental overload. The “mental overload reinforcement” cycle shows how “pressure” increases mental load, which may cause medical leave; this reduces task knowledge and increases cycle time, so the production target becomes harder to meet, leading to more pressure and mental burden.
Fig. 3. Systemic context.
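The polarity of a feedback loop in a CLD follows from the signs of its links: an even number of negative links gives a reinforcing loop, an odd number a balancing (control) loop. The sketch below applies this rule to two of the cycles described above. The link lists and their signs are one reading of the prose, not a transcription of Fig. 3, and all names are illustrative.

```python
from math import prod

# +1: effect moves in the same direction as the cause; -1: opposite direction.
# Links are an interpretation of the text, not taken from the original figure.
MENTAL_OVERLOAD_LOOP = [
    ("pressure", "mental load", +1),
    ("mental load", "medical leave", +1),
    ("medical leave", "task knowledge", -1),
    ("task knowledge", "cycle time", -1),
    ("cycle time", "production rate", -1),
    ("production rate", "pressure", -1),
]

SELF_REGULATION_LOOP = [
    ("pressure", "cycle time", -1),       # workers speed up under pressure
    ("cycle time", "production rate", -1),
    ("production rate", "pressure", -1),  # meeting the target relieves pressure
]


def loop_polarity(links):
    """Return 'reinforcing' or 'balancing' for a closed chain of signed links."""
    # Check that each link starts where the previous one ended (closed loop).
    for (_, target, _), (source, _, _) in zip(links, links[1:] + links[:1]):
        if target != source:
            raise ValueError(f"loop is not closed at {target!r} -> {source!r}")
    sign = prod(s for _, _, s in links)
    return "reinforcing" if sign > 0 else "balancing"


for name, loop in [("mental overload", MENTAL_OVERLOAD_LOOP),
                   ("production self-regulation", SELF_REGULATION_LOOP)]:
    print(name, "->", loop_polarity(loop))
```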
In this work the production target is defined by the tactical level and the described system develops trying to achieve it. Based on this context, the information is converted into a more detailed model, encompassing all the constants and variables necessary for inserting equations, computational simulation and prospecting scenarios.
3.2 Detailed Model
A detailed model greatly expands the understanding established in the conceptual diagram, keeping the indicated information cycles in order to maintain the established concept itself. Therefore, the Stock and Flow Map (SFM) provides concrete details to the previously developed CLD applying a structural and mathematical model. The structure of the subsystem that deals with the dynamics of the production subsystem was adapted from [4] and is shown in Fig. 4. The two stocks are “work for processing” and “work in process - WIP”. “Production rate - Pr” is defined by WIP and the “cycle time” (CT):

\[ Pr = \frac{WIP}{CT} \tag{1} \]

In this mathematical model all WIP is processed, even under extreme theoretical conditions. Therefore, the production rate does not assume negative values:

\[ Pr \in \mathbb{R} \mid Pr \ge 0 \tag{2} \]
Fig. 4. Production subsystem.
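A minimal sketch of how the WIP stock and the production rate interact under Euler integration is given below. It assumes a constant inflow into WIP (`start_rate`), which is a simplification introduced here for illustration; the full structure adapted from [4] is not reproduced, and all parameter values are hypothetical.

```python
def simulate_wip(start_rate, cycle_time, dt, steps, wip0=0.0):
    """Euler integration of a single WIP stock.

    start_rate : constant inflow of work into WIP (assumed, units/hour)
    cycle_time : CT, used as the residence time of work in process (hours)
    Returns a list of (wip, production_rate) pairs, one per step.
    """
    wip = wip0
    history = []
    for _ in range(steps):
        # Production rate = WIP / CT, clamped so it never goes negative.
        production_rate = max(wip / cycle_time, 0.0)
        # Stock update: inflow minus outflow, integrated over one step.
        wip += dt * (start_rate - production_rate)
        history.append((wip, production_rate))
    return history


history = simulate_wip(start_rate=5.0, cycle_time=2.0, dt=0.1, steps=1000)
final_wip, final_rate = history[-1]
print(f"WIP settles near {final_wip:.3f}, production rate near {final_rate:.3f}")
```

With a constant inflow the stock settles where outflow equals inflow, that is, at WIP = start_rate × CT, so a shorter cycle time lowers the work in process needed to sustain the same production rate.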
The structure of the “cycle time” stock and its influences are presented in Fig. 5. In real systems, workers react to possible delays and manage to reduce cycle time. This behavior is inserted by comparing the “desired production rate” with that achieved. This comparison is inserted as a “Pressure index (Pi)” that accelerates or stops the rate at which the cycle time is adjusted. The “perception time of meeting the target - PTMT” is also considered to influence the cycle time adjustment rate, which is conditioned to a “minimum cycle time - CTm” that, in the systemic context, derives from the knowledge of the task. The nonlinear behavior of the cycle time, whose rate of change becomes smaller as it approaches the minimum cycle time, is modeled with an exponential relation, where CTt0 is the cycle time at the initial instant:

\[ CT = (CT_m - CT_{t_0}) \cdot e^{-\frac{1}{PTMT} \cdot t} \cdot Pi \tag{3} \]
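The exponential relation in Eq. (3) can be evaluated directly. The sketch below is a literal transcription, reading the exponent as −t/PTMT; how the resulting value enters the cycle time stock in the Vensim model is not specified in this section, so the function is named as a generic adjustment term, and the parameter values are hypothetical.

```python
import math


def cycle_time_term(t, ct_min, ct_initial, ptmt, pressure_index):
    """Literal evaluation of Eq. (3): (CTm - CTt0) * exp(-t / PTMT) * Pi."""
    if ptmt <= 0:
        raise ValueError("PTMT must be positive")
    return (ct_min - ct_initial) * math.exp(-t / ptmt) * pressure_index


# Hypothetical values: minimum cycle time 1.0 min, initial 1.5 min,
# perception time 2 h, pressure index 1.0.
for hour in (0, 2, 4, 8):
    value = cycle_time_term(hour, ct_min=1.0, ct_initial=1.5,
                            ptmt=2.0, pressure_index=1.0)
    print(f"t = {hour} h: term = {value:.4f}")
```

The magnitude of the term shrinks by a factor of e for every PTMT hours elapsed, which is the "smaller rate as the minimum is approached" behavior described in the text.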
Fig. 5. Cycle time influences.
A structure is included that inserts the behavior related to employee turnover and the task knowledge index (Fig. 6), similar to that used by [1]. The “leave rate - Lr” is formulated considering the “physical load – PhL”, the “mental load - ML”, and the average “time to disease incidence - TDI”:

\[ Lr = \frac{PhL \cdot ML}{TDI} \tag{4} \]

Fig. 6. Workers and knowledge structure.
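Because the leave rate in Eq. (4) is a product of the two loads, lowering one load can be offset by raising the other. A minimal sketch, with hypothetical load values and a hypothetical time to disease incidence:

```python
def leave_rate(physical_load, mental_load, time_to_disease_incidence):
    """Leave rate Lr = (PhL * ML) / TDI, as in Eq. (4)."""
    if time_to_disease_incidence <= 0:
        raise ValueError("TDI must be positive")
    return physical_load * mental_load / time_to_disease_incidence


# Hypothetical comparison: halving the physical load while doubling the
# mental load leaves the leave rate unchanged.
baseline = leave_rate(physical_load=4.0, mental_load=1.0,
                      time_to_disease_incidence=8.0)
shifted = leave_rate(physical_load=2.0, mental_load=2.0,
                     time_to_disease_incidence=8.0)
print(baseline, shifted)
```

This multiplicative form is one way a robot that relieves postural load but adds cognitive demand could fail to reduce sick leave.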
4 Simulation and Prospection
The simulation and prospecting of scenarios was preceded by tests to assure the necessary reliability of the SD model to be considered useful for analysis [5]. The tests verified the structure, dimensional consistency, extreme conditions, and error in steady state and permanent regime. After applying tests to the model, it was configured to prospect the scenarios to be used in the analysis. The “Euler” integration method was used. The integration increment (dt) was set to 1/5 of the value of the smallest time constant. Because the analysis focuses on operational dynamics whose effects occur in a relatively short time, the simulation horizon was defined as one work shift (8 h). The human workload value was weighted at 33% for mental demands and 67% for physical demands. In addition, a very important aspect of this analysis is the insertion of the LoC influences in the simulation model. In this case the insertion was done by defining values on ordinal scales of 5 levels. Thus, the LoC influences mental workload, postural demand and the knowledge necessary to perform the task. The specific values used to configure the model are shown in Table 1.

Table 1. Levels of collaboration influence on the model.

Level of collaboration         0          1           2           3           4
Mental workload                0.5        1           2           3           4
Physical workload (postural)   4          3           2           1           0.5
Knowledge                      1 (100%)   0.9 (90%)   0.7 (70%)   0.5 (50%)   0.3 (30%)
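The configuration described above can be sketched as follows. The Table 1 values and the 33%/67% weights come from the text; combining them as a linear weighted sum is an assumption made here, and the list of time constants is hypothetical (only the rule dt = 1/5 of the smallest time constant and the 8 h horizon are taken from the paper).

```python
# Table 1 values, keyed by level of collaboration (LoC).
LOC_SETTINGS = {
    0: {"mental": 0.5, "physical": 4.0, "knowledge": 1.0},
    1: {"mental": 1.0, "physical": 3.0, "knowledge": 0.9},
    2: {"mental": 2.0, "physical": 2.0, "knowledge": 0.7},
    3: {"mental": 3.0, "physical": 1.0, "knowledge": 0.5},
    4: {"mental": 4.0, "physical": 0.5, "knowledge": 0.3},
}

MENTAL_WEIGHT = 0.33    # share of mental demands in the human workload
PHYSICAL_WEIGHT = 0.67  # share of physical demands


def weighted_workload(level):
    """Weighted workload for one LoC (linear combination is an assumption)."""
    s = LOC_SETTINGS[level]
    return MENTAL_WEIGHT * s["mental"] + PHYSICAL_WEIGHT * s["physical"]


def euler_step_size(time_constants):
    """Integration increment: one fifth of the smallest time constant."""
    return min(time_constants) / 5


def steps_in_horizon(horizon, dt):
    """Number of Euler steps needed to cover the simulation horizon."""
    return round(horizon / dt)


dt = euler_step_size([0.5, 2.0, 8.0])  # hypothetical time constants, in hours
print("dt =", dt, "h; steps per 8 h shift =", steps_in_horizon(8.0, dt))
for level in LOC_SETTINGS:
    print(level, round(weighted_workload(level), 3))
```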
As the computer simulation runs, it is possible to see the effectiveness of the feedback cycles in driving production to the goal (Fig. 7). The pattern of behavior starts below the target, however it grows gradually, representing a situation of adaptation and pace.
Fig. 7. Production rate behavior through a shift time.
In this model, the cycle time was modeled as a variable, subject to the influences already shown in the CLD. In general, the cycle time decreases as the production line is able to adjust production to the target. This is due to increased knowledge in terms of the ability to carry out the task. Figure 8 presents the evolution of the cycle time over the simulation horizon together with a third axis (robotic), which proves to be very useful for the interpretation of the system. Figure 9 shows little improvement in the condition of the production line between level 0 and level 1. Level 2, in terms of operationalization of the production line, is shown to be the best condition, since the cycle time meets the goal more easily. However, levels 3 and 4 showed worse results (in terms of operationalization) compared to level 0.
Fig. 8. Variation of cycle time for different LoC.
As shown in Fig. 9, the most favorable scenario to avoid the incidence of occupational diseases, which leads to employees on leave, corresponds to the highest level of robotic assistance (30.4% lower than level 0). Although this level still yields a greater frequency of cycles (due to the reduced cycle time) and a higher mental workload, the reduction in postural load seems to compensate for these factors. This demonstrates that inserting an HRC system is complex, and that both physical and mental workload may affect the leave rate, which significantly changes the prospective scenarios. It was observed that the existence of a cognitive load can generate counterintuitive effects on the system’s behavior. Thus, expanding the model for its practical application consists of selecting ergonomic tools for the quantification of the levels of physical and mental workload that can be applied in an industrial HRC system.
Fig. 9. Variation of employees on leave for different LoC.
5 Conclusions and Future Work
The simulations showed that the incidence of occupational disease is significantly reduced with the inclusion of an industrial HRC system. In the scenarios prospected by this preliminary model, the reduction was up to 30.4%. On the other hand, this does not mean that a higher LoC always leads to higher levels of productivity and safety. More often than not, an intermediate LoC may be a better fit for a problem regarding HRC systems. A correct understanding of a system is achieved by modeling and prospecting scenarios, especially looking for counterintuitive effects. This model can be applied to a practical case where the scenarios could be used by managers for decision making. However, there are two main points where this work can be further developed in the future: (i) explore how mental load occurs by evolving the structure that inserts the behavior of such load in the simulation model. It is suggested to follow [6], where the studies are developed specifically using SD to assist such exploration; (ii) apply ergonomic tools to quantify the levels of physical and mental workload that can be applied in the practical context. For physical workload, the Rapid Upper Limb Assessment (RULA) [7] method is suggested to assess postures, and for mental overload, the NASA Task Load Index (NASA-TLX) [8] tool, which classifies the load perceived by operators.
Acknowledgement. This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020.
References
1. de Mattos, D.L., Ariente, R., Merino, E.A.D., Forcellini, F.A.: Simulating the influence of physical overload on assembly line performance: a case study in an automotive electrical component plant. Appl. Ergon. 79, 107–121 (2019)
2. Bauer, W., Bender, M., Braun, M., Rally, P., Sholtz, O.: Lightweight robots in manual assembly – best to start simply! Fraunhofer Institute for Industrial Engineering IAO (2016)
3. Ventana Systems. Vensim simulation software. https://vensim.com/
4. Sterman, J.D.: Business Dynamics: Systems Thinking and Modeling for a Complex World. Irwin/McGraw-Hill, Massachusetts Institute of Technology, Sloan School of Management Boston (2000)
5. Forrester, J.W., Senge, P.M.: Tests for building confidence in System Dynamic Models. TIMS Stud. Manag. Sci. 14, 209–228 (1980)
6. Jafari, M.-J., Zaeri, F., Jafari, A.H., Najafabadi, A.T.P., Hassanzadeh-Rangi, N.: Human-based dynamics of mental workload in complicated systems. EXCLI J. 18, 501–512 (2019)
7. Middlesworth, M.: A Step-by-Step Guide Rapid Upper Limb Assessment (RULA) (2019)
8. NASA Task Load Index (TLX) v.1.0. Manual (1986)
A New Modular Intensive Design Solution for ROVs
Qianqian Jing, Jing Luo(B), and Yunhui Li
School of Arts and Design, Shenzhen University, Shenzhen, Guangdong, China [email protected]
Abstract. There are more resources in the ocean than people have imagined. However, due to the complexity of the deep-sea environment, it is impossible to exploit these resources with ordinary terrestrial equipment, which led to the development of the deep-sea remotely operated vehicle (ROV). Observation and interviews show that existing ROVs come in many types, each with a single function and low efficiency, and have other problems such as difficult disassembly and assembly and exposed parts. The modular design concept is used to deconstruct the functions and shapes of ROVs, which greatly improves operating efficiency while the ROV remains under the operator’s control. At the same time, damage to components is reduced to a certain extent, and maintenance costs are greatly reduced. All of these also open up the possibility of commercial development of the deep-sea ROV.
Keywords: ROV · Modular design · Job mode · User experience
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 69–76, 2021. https://doi.org/10.1007/978-3-030-79997-7_9
1 Introduction
The ocean accounts for about 70% of the earth’s surface area, and its marine resources are far more abundant than humans have imagined [1]. However, due to the complexity of the deep-sea environment, ordinary terrestrial equipment cannot explore or exploit these resources in the deep sea, so the deep-sea underwater robot, the ROV, came into being [2]. ROVs have a variety of functions [3], and different types of ROV are used for different tasks [2, 4]. They are widely used in fields such as military, coast guard, maritime affairs, customs, nuclear power, hydropower, offshore oil [7], fishery [5], maritime salvage, pipeline exploration, and marine scientific research [6]. In recent years, the deep-sea strategies of various countries have led to the rapid development of underwater robots. Companies and organizations in various countries work every year to update and iterate the design of ROVs and to improve their performance and efficiency [8].
The CURV-21 is a deep-sea remotely operated vehicle (ROV) capable of diving up to 20,000 feet. It is primarily used for deep-sea salvage purposes by the United States Maritime Department. It has three main advantages. First, its overall size is small compared with its predecessor. Second, the ROV can be combined with other components to form an integrated search and recovery system, switching between scanning sonar function and ROV operation via a fiber-optic umbilical cable and a shared processing system. Third, for some special operations, the CURV-21 can also be fitted with customized kits, including but not limited to special fishing tools, instrument kits or equipment for other tasks [9]. The pre-set toolkit is a prototype of a modular design, with the advantage of being able to perform multiple tasks with the same ROV, increasing efficiency while reducing size.
The Pluto Gabbia dual ROV system is a novel approach to ROV design, where the two ROVs are initially connected by umbilical cords, and once they reach operating depth, they disconnect from each other and operate as separate ROVs. The two ROVs can be controlled completely independently through the umbilical cord connection. The advantages of this approach are obvious: first, it is very flexible, and second, the two ROVs can watch each other, so that when a cable is tangled each ROV can help to untangle the other ROV’s cable. This also indicates that there is more room for ROV innovation and design.
Based on the previous research, we developed a modular conceptual design solution to optimize ROV operations. The goal was to provide a more convenient and engineering-friendly operation mode through the modular design concept, thereby improving ROV efficiency and user experience, and enabling ROV sub-sea missions to be performed more efficiently. First, observation and interviews were used as design methods for user research. Through direct observation of the composition of the ROV and interviews with engineers about the actual operation process, many problems of the ROV during operation were found. On this basis, a design scheme to solve the existing problems is put forward and a conceptual design is made. The application of modular design ideas in the innovative design process of the ROV is introduced in detail in this paper.
2 User Research
The user research comprised two parts: (a) observing and analyzing the basic parts and components of the ROV, and observing the whole process from the preparation work before launching, to the actual operation on the seabed, to the recovery and maintenance of the ROV, in order to look for design opportunities; and (b) interviewing the engineers to verify the design insights from part (a) and to gather additional information for developing the design concept.
2.1 Observation
According to observation, the ROV mainly has the following modules: buoyancy module, lighting module, camera module, manipulator module, frame module, power system module, and control module. The ROV adopts different modes for different operational requirements underwater, mainly a floating mode and a crawling mode. The floating mode does not require power system modules such as tracks, but mainly uses the camera and lighting modules to complete observation tasks. In crawling mode, the power module should be installed in advance before launching, and the underwater camera and lighting modules are used to complete operational tasks. After understanding the composition of the ROV and analyzing video recordings of the ROV during actual operation, the following problems were found. First, the center of gravity is high and the overall shape is too large. Second, scientific methods such as fluid mechanics and bionics are not combined with the modeling design. Third, the track lacks protection. Fourth, most of the internal parts are exposed, which is an obstacle to later commercialization.
2.2 Interview
The engineers operating the ROV were interviewed about their operating experience and the issues they face with existing ROVs. The most common points collected are as follows. First, they want the ROV to be improved from an industrial design perspective so that it can be easily disassembled and assembled onboard before it goes offshore, increasing operational efficiency. Second, because the machine involves various components and functional modules, they hope to arrange modules more quickly according to the different tasks. Finally, they hope that, while efficiency is ensured, the ROV can better match the working experience of the staff.
3 Design Process
The results of the user research provided valid data for the design concept and prototype development. According to the above problems and the modeling characteristics of the ROV, the following reference basis and solutions are proposed for the subsequent ROV design:
1. The center of buoyancy and the center of gravity should lie on a vertical line as far as possible, and the center of buoyancy should be higher than the center of gravity.
2. Whether or not a propeller channel is designed will affect the thrust of the propeller, and thus the traveling speed, direction, and other parameters.
3. The overall shape design should also consider gravity self-balance and resistance balance, which makes control of posture balance easier.
4. In terms of color selection, bright colors such as yellow, orange, and bright green are recommended, because most visible light bands are absorbed underwater and objects appear blue-green on the seabed.
5. In the design of the lamp, the light source should be as far away from the camera as possible, and the direction of the outgoing light should be parallel to the camera, to avoid the reflection of floating dust on the seafloor affecting the image.
6. In the design of buoyancy materials, complex curved surfaces should be avoided because buoyancy materials are mostly machined; molding is possible, but the cost is higher.
Previous studies have found that the preparation time for ROV placement and recovery is long, and the cost of task execution is high. Therefore, the main problems to be solved in this study are to reduce the number of ROV launches, improve its disassembly and assembly capacity, and optimize its task flow.
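The first guideline above (center of buoyancy vertically above the center of gravity) can be checked numerically for a candidate module layout. The sketch below is illustrative only: the module masses, displaced volumes, positions and the 1 cm alignment tolerance are hypothetical values, and the z axis is assumed to point upward.

```python
import math


def weighted_center(items):
    """Centroid of (weight, (x, y, z)) pairs, weighted by the first element."""
    total = sum(w for w, _ in items)
    return tuple(sum(w * pos[i] for w, pos in items) / total for i in range(3))


def is_statically_stable(masses, volumes, max_offset=0.01):
    """True if the center of buoyancy lies above the center of gravity and
    both are aligned on a vertical line within max_offset (meters)."""
    cg = weighted_center(masses)    # weighted by mass (kg)
    cb = weighted_center(volumes)   # weighted by displaced volume (m^3)
    horizontal_offset = math.hypot(cb[0] - cg[0], cb[1] - cg[1])
    return horizontal_offset <= max_offset and cb[2] > cg[2]


# Hypothetical layout: (mass in kg, position in m) and (volume in m^3, position).
masses = [(40.0, (0.0, 0.0, 0.0)),     # frame and electronics
          (30.0, (0.0, 0.0, -0.3)),    # track module, mounted low
          (10.0, (0.0, 0.0, 0.4))]     # buoyancy foam, mounted high
volumes = [(0.06, (0.0, 0.0, 0.4)),    # foam displaces most of the water
           (0.02, (0.0, 0.0, 0.0))]    # frame and housings

print("stable layout:", is_statically_stable(masses, volumes))
```

Swapping modules between floating and crawling configurations changes both lists, so a check of this kind could be rerun for each module combination.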
3.1 Modular Design Idea
This study uses modular design ideas to disassemble the underwater robot system and to classify and sort its parts by function (as shown in Fig. 1). Then, in response to the problem raised in the preliminary interviews that the ROV is difficult to disassemble and maintain, we integrate the thrusters for each direction and the action components into modules. The design of the action module is the main means of addressing the ROV’s high center of gravity, large overall shape, and difficulty of commercialization.
Fig. 1. The underwater robot system is disassembled, classified and sorted according to its functional components.
3.2 Optimize the Disassembly Structure
From the interviews, we learned that when the ROV is performing an observation task, it needs to be in a state of zero buoyancy, so excess structural components that increase the overall load need to be removed. When the ROV performs operational missions, it needs a strong grip and additional load to improve its stability in the deep-sea environment and to meet the requirements for precise operations. Therefore, this study expects that when the ROV needs to perform observational tasks or tasks that do not require detailed operations, the floating mode of the ROV, which uses the portability of wheels in the action module, can be used to meet the needs of the tasks. When the ROV needs to perform tasks requiring detailed operations, the roam mode of the ROV is enabled, that is, the wheels are replaced with tracks in the action module, which not only increases the load and improves stability but also adapts to a variety of complex deep-sea terrain to meet the needs of detailed operations. Therefore, according to the characteristics of these two operation modes, we designed the following concept ROV (as shown in Fig. 2).
Fig. 2. Starting with the ROV operation, the conceptual design is developed from the modular structure.
3.3 Design of Action Module
In the sketch of the concept design, we designed a shuttle-shaped structure that integrates the functional components in addition to the structural modules required for the operation, i.e. the modular design of the active components. As shown in Fig. 3, there are two groups of action modules. One is the power module, which controls the forward and backward motion of the ROV. The other is the direction module, which controls the left and right movement of the ROV. This structure has three advantages. First, it simplifies the disassembly, replacement and maintenance of the action module required by the task. Second, because there is enough space inside the shuttle, the redundant functions of the ROV can be stored in it to improve the storage capacity of the main frame of the ROV. Third, during ROV placement and recovery, the shuttle-shaped design reduces the resistance of the water flow, reduces the loss and time of the ROV in this process, and improves the working efficiency of the ROV.
Fig. 3. The spindle structure.
4 Evaluation of Design
The final model adjustments were made in computer software. The design principles for the adjustment are as follows: first, the top is provided with a groove and bayonet so that it can easily be combined with the repeater for placement and recovery; second, while keeping the overall volume small, the storage space for work samples is increased; third, the angle and position of the camera and light in the viewing system can be controlled remotely; fourth, the frame design should reduce resistance during ROV placement and recovery and keep the weight of the frame as low as possible; fifth, the action module is easy to disassemble and assemble; sixth, the protection and placement of the structural modules are selected. The three-dimensional model is shown in Fig. 4.
Fig. 4. No crawler and installation crawler mode schematic diagram.
Based on the design criteria we set earlier, we finally produced the following design: first, the structural modules were placed in the frame in the same way as the existing ROV, integrating the action module with the thrusters (Fig. 5); Second, the structural modules are integrated into four operational modules to increase ROV payloads. Third, the structural modules are integrated into four buoyancy materials to increase the ROV’s payload. Fourth, the mechanical arm is integrated with the front two columns to expand the operating range of the mechanical arm.
Fig. 5. Display the final design scheme.
In terms of details, the frame adopts a topology-optimized design. After precise calculation, a large amount of material in the original structure is removed while the whole structure maintains high stiffness and strength; that is, the optimal load-bearing structure is achieved at the lightest weight. Abandoning the traditional frame structure, the observation system adopts an embedded rotatable mount to expand the lighting range and visual range, and it blends with the buoyancy material without being obtrusive. Besides, to address the problem of commercialization, we designed variants of each module with different functions, shapes and uses, which can adapt to any scene and task environment and which the user can choose and combine (Fig. 6).
Fig. 6. Module decomposition and composition.
This is the end of the research and exploration. Figure 7 shows the simulated ROV action scene in the deep sea. It is hoped that this study will explore the modular design of ROV and lay the foundation for its commercialization.
Fig. 7. Simulate ROV action in deep sea.
5 Conclusion
This research focuses on improving ROV efficiency through a modular design concept. The goal is to improve ROV efficiency in the deep sea while making it easier for the operator to operate. The design integrates thrusters with operational modules for improved disassembly and assembly efficiency. Replacing and combining different modules not only addresses the problems of existing ROVs (many types, each with a single function and low efficiency) but also improves working efficiency in the deep sea, reduces damage to parts to a certain extent, and greatly reduces maintenance costs. Beyond these benefits, it also opens up the possibility of commercial development of deep-sea work-class ROVs.
References
1. Kaluza, A., Lindow, K., Stark, R.: Investigating challenges of a sustainable use of marine mineral resources. Procedia Manuf. 21, 321–328 (2018)
2. Macreadie, P.I., et al.: Eyes in the sea: unlocking the mysteries of the ocean using industrial, remotely operated vehicles (ROVs). Sci. Total Environ. 634(SEP.1), 1077–1091 (2018)
3. Christ, R.D., Wernli, R.L.: Chapter 1 - The ROV Business. The ROV Manual. Elsevier Ltd. (2014)
4. Romano, C., Gerard, D., Edin, O., Joseph, C., Thomas, N., Daniel, T.: Inspection-class remotely operated vehicles—a review. J. Mar. Sci. Eng. 5(1), 13 (2017)
5. McLean, D.L., Partridge, J.C., Bond, T., Birt, M.J., Bornt, K.R., Langlois, T.J.: Using industry ROV videos to assess fish associations with subsea pipelines. Cont. Shelf Res. 141, 76–97 (2017)
6. Mai, C., Pedersen, S., Hansen, L., Jepsen, K., Yang, Z.: Modeling and control of industrial ROV’s for semi-autonomous subsea maintenance services. IFAC-Papersonline 50(1), 13686–13691 (2017)
7. Khojasteh, D., Kamali, R.: Design and dynamic study of a ROV with application to oil and gas industries of Persian Gulf. Ocean Eng. 136, 18–30 (2017)
8. de Vivero, J.L.S., Mateos, J.C.R.: Ocean governance in a competitive world. The BRIC countries as emerging maritime powers—building new geopolitical scenarios. Mar. Policy 34(5), 967–978 (2010)
9. U.S. deploys CURV-21 in Argentine submarine search. https://defpost.com/us-deploys-curv21-argentine-submarine-search
10. Pluto Gabbia - twin ROV system. http://www.idrobotica.com/pluto-gabbia.php
Drones, Robots and Humanized Behaviors
Designing for the Unknown: Using Structured Analysis and Design Technique (SADT) to Create a Pilot Domain for a Shore Control Centre for Autonomous Ships
Dag Rutledal(B)
Department of Ocean Operations and Civil Engineering, NTNU in Aalesund, Larsgaardveien 2, 6009 Aalesund, Norway
[email protected]
Abstract. Designing decision support systems for domains that do not yet exist is problematic because approaches such as cognitive task and work analysis are difficult to conduct when there are no established domains or users. An example of such a revolutionary domain is a Shore Control Centre (SCC) for autonomous ships. This method paper suggests the use of Structured Analysis and Design Technique (SADT) to create a pilot domain for a SCC. SADT was chosen because it provides a robust, structured method for modelling dynamic systems; for this purpose, it provides an opportunity to define and analyse the prospective activities executed by a SCC. By applying SADT to the Autoferry, a small autonomous passenger ferry expected to be in operation in the not so distant future, the constituent activities of the SCC can be defined in terms of input, control, mechanisms and output.
Keywords: Autonomous ships · Shore control centre · SADT · Revolutionary design
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 79–86, 2021. https://doi.org/10.1007/978-3-030-79997-7_10
1 Introduction
“The development of decision support systems is a complex and challenging task even with the most clearly defined problems and well-established domains” [1]. Subject matter experts, existing systems, and established organizational structures aid the designer of decision support systems. However, when the system designer is tasked with designing a decision support system for a revolutionary domain (a non-existent, proposed domain), typical approaches like cognitive task and work analysis are difficult to implement because no domain exists to model [1]. An example of such a conundrum is the development of autonomous ships: as they do not yet exist, there is naturally no domain from which to gather data and there are no users from whom to elicit knowledge. Before approaches like Cognitive Work Analysis (CWA) can proceed from the more ecological phases to the cognitive ones, a pilot domain must be created in which the assumptions, constraints, and structural relationships developed in the work domain analysis phases can be evaluated.
The introduction of autonomous ships into the maritime domain, and the role expected to be covered by a SCC, is a textbook example of a revolutionary domain. It is therefore crucial that some form of validation, or checks and balances on assumptions, occurs prior to moving forward with the cognitive analysis and preliminary design. To date, autonomous shipping appears to have focused primarily on a technology push rather than on considering and providing sociotechnical solutions, including the introduction of a SCC [2].
2 SADT
Structured Analysis and Design Technique (SADT) was created to describe activities. Douglas T. Ross, a creator of SADT (or IDEF (Integrated DEFinition), as this tool is known among public-domain users), described it as a “graphic language for blueprinting systems” that has its roots in “cell modelling of human-directed activities” [3]. Since its commercial introduction in 1973, SADT has been applied in hundreds of projects in industries as diverse as aerospace, telecommunications and the maritime industry. A broad range of functions is represented in these applications, for example a description of a training system for the US Army [3] and a model of a port logistics process in Busan, Korea [4]. Even though none of these cases represents a revolutionary domain, several themes recur in these applications that make SADT especially effective for the purpose of this method paper. First, SADT focuses on activities: what activities do we foresee that a SCC should execute? SADT makes those activities explicit and will help stakeholders at every level to understand why these activities must be done. Second, SADT structurally provides for attributes that are important for cognitive work, such as:
– Who or what performs the activity (“mechanisms” in SADT terms); and
– What guides or limits the activity (“controls”)
Finally, SADT is valuable in improving internal communication because the model-building process includes a protocol to involve stakeholders at every level, from legislators to laypeople. The resulting dialogue not only clarifies activities and actors’ roles, but also promotes consideration of process redundancies and improvements. The model-building process represents the key to gaining organisational consensus on a process [5].
By applying SADT to the Autoferry, a small autonomous passenger ferry which will soon operate autonomously in a narrow channel in Trondheim, Norway, the constituent activities of the SCC can be defined in terms of input (indicating the start of an activity), control (what guides or limits the activity), mechanisms (who or what performs the activity) and output (the result of the activity). The point of departure for the SADT is the overall goal of the system, that is, what it intends to accomplish. This is also called the mother node, often referred to as A0. This overall goal is then broken down into phases. The operation of the Autoferry is broken down into five phases: passengers boarding, departure, transit, arrival and passengers disembarking. Each of these phases is further broken down into the actions performed within it, and by whom or what, in order to carry out the function.
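The structure described above can be sketched as a small data model. This is only an illustrative sketch in Python, not part of the SADT notation itself: the class name `Activity`, the placeholder title of A0, and the arrow labels attached to the transit phase are assumptions made for the example, and the child numbering (A1 to A5 under A0) follows the order in which the five phases are listed.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Activity:
    """One SADT box: an activity with its four kinds of interface arrows."""
    node_id: str
    name: str
    inputs: List[str] = field(default_factory=list)      # what starts the activity
    controls: List[str] = field(default_factory=list)    # what guides or limits it
    mechanisms: List[str] = field(default_factory=list)  # who or what performs it
    outputs: List[str] = field(default_factory=list)     # the result of the activity
    children: List["Activity"] = field(default_factory=list)

    def decompose(self, names: List[str]) -> List["Activity"]:
        """Break this activity into numbered child activities (A0 -> A1, A2, ...)."""
        prefix = "A" if self.node_id == "A0" else self.node_id
        self.children = [
            Activity(f"{prefix}{i}", name) for i, name in enumerate(names, start=1)
        ]
        return self.children

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Activity"]]:
        """Yield (depth, activity) pairs top-down, parent before children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Hypothetical title for the mother node; the text only says A0 is the overall goal.
a0 = Activity("A0", "Operate the Autoferry (placeholder title)")
phases = a0.decompose([
    "Passengers boarding", "Departure", "Transit", "Arrival", "Passengers disembarking",
])

# Illustrative arrow labels for the transit phase (assumed, not copied from a figure).
transit = phases[2]
transit.controls.append("SOP")
transit.mechanisms.extend(["CDM", "Operator", "Vessel", "Passengers"])

for depth, activity in a0.walk():
    print("  " * depth + f"{activity.node_id} {activity.name}")
```

Keeping the four arrow types as separate fields mirrors the box-and-arrow convention, so each phase can later be decomposed again with the same `decompose` call.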
With SADT, model building starts with simple “box and arrow” graphics, see Fig. 1, showing the functions as a box and the interfaces to and from the function as arrows entering or leaving the box. The graphical language of the SADT methodology can be considered its most important feature, since it produces a modelling method. This involves the structured decomposition shown in Fig. 1 (right), i.e., the orderly breaking down of a complex system into its constituent parts, with extra detail added at each step. Because this method of analysis is top-down, hierarchical and structured, it focuses attention on important features, bringing the correct objectives to the foreground, as opposed to irrelevant ones, which need to be kept in the background, depending on which level is under consideration. The effect of this in describing the SCC’s role within autonomous shipping becomes obvious, since all the parameters involved in any activity within the model can be accounted for. Simultaneously, the relationship between activities can be identified.
Fig. 1. Function box and interface arrows. Right: hierarchy and structured decomposition of SADT methodology. Source: modified from [11]
Fig. 2. The mother node A0 and abbreviations used
The mother node, see Fig. 2, which describes the overall goal or purpose of the system, is further broken down to the next level, which represents the different phases included in the overall goal, see Fig. 3.
Fig. 3. The level below the mother node.
As an example of the final detailing level the node A3 “Transit” was chosen. On this level the actors/artefacts involved in the different tasks this exact node consists of are identified, see Fig. 4.
Fig. 4. The A3 node (transit)
In this figure, the Command and Data Management (CDM) is introduced. The thought behind the CDM is that it will act as a sensor fusion unit: a central unit that collects data from the different sensors involved, such as radar, LIDAR, ECDIS, positional sensors, cameras and so on. The CDM will determine whether the vessel can proceed within the SOP or not. It is beyond the scope of this paper to describe this in detail.
2.1 Agent-/Artefact Orientated Flowcharts
However, SADT is not enough when one is to describe a complete system within a revolutionary domain. It does describe the function of the system in the specified stages, but it does not shed any light on which person or artefact is expected to perform the actual tasks each function calls for. To remedy this, this paper suggests producing, in each of these phases, an agent-/artefact orientated flowchart as described by Magne Aarset [6], see Fig. 5 for an example, focusing on the internal procedures and the interface between the persons (agents) and the hardware (artefacts) that are present. These agents/artefacts will constitute the “mechanisms” in the SADT. Each task done by actor/artefact is given a
unique label, which underlines one of the main strengths of this technique: it makes the tasks explicit and traceable throughout the system. By suggesting these actors/artefacts and their proposed tasks, we can start designing a simulated pilot domain in which an autonomous vessel, in conjunction with a SCC, can operate. In this paper, four different actors/artefacts are suggested: the Command and Data Management (C), the Operator (O), the Vessel (V) and Passengers (P). For each of these, a detailed description is given of the tasks the actor/artefact is expected to perform, which tasks instigate them, and which subsequent tasks by a different actor/artefact they in turn instigate. In Fig. 5, such a description is shown for the operator for the A3 node, transit.
Fig. 5. Agent-/artefact orientated flowchart for the operator in a SCC during transit
2.2 The Operator
One of the main challenges is to define the role of the operator in the SCC. This is one of the key questions that subsequent studies will hopefully answer. Is the operator expected to monitor at all? Does he/she have the capacity to do this? If so, we touch upon the very ironies of automation first described by Bainbridge [7]. As Strauch [8] discusses, Lisanne Bainbridge’s main issues in “Ironies of automation” are still valid more than three decades after publication. “The more advanced a control system is, the more crucial may be the contribution of the human operator” [7]. So, according to Strauch, the human operator is left with an impossible task: “Operators of automated systems will therefore need to monitor systems that were implemented precisely because they can monitor more quickly and reliably than can the operator” [8]. This can be illustrated with an example. In a case where the vessel cannot proceed within the SOP because an obstruction of some kind is in its way, the CDM discovers this and sends a signal to the operator along with a suggested course of action, the C3 task in Fig. 5. This instigates the O2 task in the operator’s list of tasks. Whether it is up to her to approve the course of action suggested by the CDM, or whether the course of action proceeds unless she disapproves or refuses it, is a question still not answered. In either case, if she overturns the suggestion of the CDM, this leads to the question: on what basis did she do this? Was it her experience that led her to recognise clues in the environment that the automation did not, or did the CDM misunderstand the situation
as a whole? Or is the operator’s judgment flawed, in that she did not pick up the clues in the environment that the automation did? Which has the best situational awareness, the operator or the automation? This touches upon several much-debated questions regarding situational awareness. For an extensive insight see, among others, [9]. Whether the operator approves, or chooses not to override the automation, the evasive course of action needs to be effectuated by the vessel itself, the V2 task in Fig. 5. To verify this, the CDM monitors the vessel through its C1 task. If there are any discrepancies, the CDM reports this to the operator and the loop is complete. By analysing possible situations in this way for each of the five phases shown in Fig. 3, a complete overview of the actors involved and the tasks to be done can be obtained.
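The obstruction example can be sketched as a small task graph, which shows how the unique labels make a chain of tasks traceable. This is a hedged sketch only: the labels C3, O2, V2 and C1 and their actors come from the example above, but the task wordings are paraphrases, and the O2 → V2 edge assumes that the operator's decision (approval or non-intervention) is what releases the vessel's action, which is exactly the point the text leaves open.

```python
# Task label -> (actor/artefact, paraphrased task description).
TASKS = {
    "C3": ("CDM", "signal the operator with a suggested course of action"),
    "O2": ("Operator", "review the suggestion from the CDM"),
    "V2": ("Vessel", "effectuate the evasive course of action"),
    "C1": ("CDM", "monitor that the vessel carries out the action"),
}

# Which task instigates which subsequent task(s). The O2 -> V2 edge is an assumption.
INSTIGATES = {"C3": ["O2"], "O2": ["V2"], "V2": ["C1"], "C1": []}


def trace(start):
    """Return task labels in the order they are instigated, starting from `start`."""
    order, queue, seen = [], [start], set()
    while queue:
        task = queue.pop(0)
        if task in seen:
            continue
        seen.add(task)
        order.append(task)
        queue.extend(INSTIGATES.get(task, []))
    return order


def handoffs(chain):
    """Return consecutive task pairs where responsibility passes to another actor."""
    return [(a, b) for a, b in zip(chain, chain[1:]) if TASKS[a][0] != TASKS[b][0]]


chain = trace("C3")
for label in chain:
    actor, description = TASKS[label]
    print(f"{label} ({actor}): {description}")
print("Handoffs between actors:", handoffs(chain))
```

Listing the handoffs separately is one way to surface the interfaces between agents and artefacts that the flowcharts are meant to make explicit.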
3 Case
As the Autoferry is nearing its realization, a monitoring/remote operation facility, a SCC, must be in place. As there is currently no vessel operating on the proposed route soon to be used by the Autoferry, a replacement case had to be chosen: a case where an existing vessel operates and where a pilot domain can be created in which the assumptions, constraints, and structural relationships developed in the work domain analysis phases, represented by the SADT, can be evaluated. As such, a high-speed passenger carrier operating in Ålesund, in the western part of Norway, may serve the purpose. In this case the simulator control room located at NTNU in Ålesund can act as a SCC. This is a highly flexible environment where modifications can be made swiftly if needed. In this setting we could recreate the route of the actual high-speed passenger carrier and simulate its transit. To create realistic scenarios, a monitoring station has been set up to continuously record situations the vessel encounters on its way. The monitoring equipment consists of a radar with Electronic Chart Display and Information System (ECDIS)/Automatic Identification System (AIS) overlay, an AIS receiver and an Axis 360 camera. The Timezero software presents and records the data and allows replay at a later stage. This means that targets without AIS will also be registered and identified. This is significant, as a recent study done in a nearby area shows that only 41% of the targets were equipped with AIS [10]. The result of this monitoring is that it allows us to recreate actual traffic situations in which the high-speed passenger carrier had to deviate from the intended route. This could be due to a number of reasons, among others fishing vessels operating in the area, the arrival or departure of large ships with the assistance of tug vessels, and general commercial/leisure traffic.
Especially interesting situations, such as when the COLREGs are apparently violated, can be recreated in the simulator exactly as they happened in real life. This would give us a unique ability to test different layouts and ways of presenting information for an operator in a SCC. The navigators onboard the monitored vessel have agreed to participate in this work, which could leave us with a really interesting question: will they act differently when operating a simulated ship with the exact same traffic surroundings than they did in real life? An example of the questions this research could lead to is whether a 3D-egocentric view (the viewpoint is set onboard the vessel) is necessary or if a viewpoint from afar (bird’s eye view) would suffice. The former, suggested as the better option in earlier studies though in a different domain [11], would require a great bandwidth to facilitate the
resolution necessary to view the details often sought after when assessing the risk of e.g. close quarter situations. The latter could be provided by onshore monitoring stations strategically located across the area in focus. This could be simulated and tested.
4 Discussion
SADT is arguably reductionist to its core, with an underlying assumption that the tasks done by actors/artefacts would remedy any foreseeable situation an autonomous ship would encounter. Arguably, the development of autonomous ships is more of a journey than a destination. It is situated in a complex system containing, among other things, social acceptance and economic and political influences that might constrain or expedite the development. The influence of these, and others, may only be indirectly connected to the domain of interest, but they nevertheless exert an effect on the system that could not be anticipated in advance. “The problem is that solutions to problems within complex environments are constructed as if they weren’t complex” [12]. Ramalingam elaborates on this when looking at sustainability projects: the system’s pronounced addiction to seeing the world through a classic reductionist lens is not trivial, as such processes lead to problems being defined and solutions chosen prematurely to give a sense of closure and certainty [13]. This is the “empiricist paradigm” characterized by an acceptance of “a unique and universal reality that is alike for everyone, existing independently from the observer” [14]. However, before we can venture into the methods better suited to handle such complex sociotechnical systems, such as CWA [15], we need a starting point. As such, the SADT with its agent/artefact orientated flowcharts will allow us to create a simulated version of a SCC.
5 Conclusion
In this method paper I have analysed the prospective activities executed by a SCC for autonomous ships. As this is a revolutionary domain, methods such as CWA and CTA are difficult to conduct. However, the use of SADT in conjunction with the agent/artefact orientated flowcharts shows great potential and seems well suited for this purpose, as it gives a more holistic view of the tasks the proposed actors/artefacts must perform in order to maintain the goal of autonomous ships, namely that they should be at least as safe as the conventional ships of today [16]. Based on this information, a pilot version of a SCC can be designed. The problems that remain are rooted in the reductionist worldview in which the SADT and the flowcharts are embedded. Even though much effort has been put into trying to make sure that the risks and hazards identified in the preliminary hazard analysis [17] done for the Autoferry are covered, there will most certainly be instances and cases not yet thought of. Hopefully, by creating a pilot domain for the SCC, based on the SADT and the agent/artefact orientated flowcharts, we can start uncovering these weaknesses through simulation of real-life incidents. Simulating realistic real-world scenarios with real-world actors as operators will hopefully give us valuable insights for a continued discussion on operator behavior and control room design.
References
1. Cummings, M.L., Guerlain, S.: The tactical Tomahawk conundrum: designing decision support systems for revolutionary domains. In: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (2003). https://doi.org/10.1109/icsmc.2003.1244638
2. Lutzhoft, M., Hynnekleiv, A., Earthy, J.V., Petersen, E.S.: Human-centred maritime autonomy-an ethnography of the future. J. Phys. Conf. Ser. (2019). https://doi.org/10.1088/1742-6596/1357/1/012032
3. Marca, D., McGowan, C.L.: Structured Analysis and Design Technique. McGraw-Hill (1988)
4. Roh, H.-S., S Lalwani, C., Naim, M.M.: Modelling a port logistics process using the structured analysis and design technique. Int. J. Logist. Res. Appl. (2007). https://doi.org/10.1080/13675560701478240
5. Congram, C., Epelman, M.: How to describe your service: an invitation to the structured analysis and design technique. Int. J. Serv. Ind. Manag. (1995). https://doi.org/10.1108/09564239510084914
6. Aarset, M.: Kriseledelse. Fagbokforlaget, Bergen (2010)
7. Bainbridge, L.: Ironies of automation. Automatica (1983). https://doi.org/10.1016/0005-1098(83)90046-8
8. Strauch, B.: Ironies of automation: still unresolved after all these years. IEEE Trans. Hum. Mach. Syst. (2018). https://doi.org/10.1109/THMS.2017.2732506
9. Flach, J.M.: Situation awareness: context matters! A commentary on Endsley (2015). https://doi.org/10.1177/1555343414561087
10. Rutledal, D., Relling, T., Resnes, T.: It’s not all about the COLREGs: a case-based risk study for autonomous coastal ferries. IOP Conf. Ser. Mater. Sci. Eng. (2020). https://doi.org/10.1088/1757-899x/929/1/012016
11. Porathe, T., Human, M., Group, F., Shipping, D., Technology, M.: Mental rotations and map use: cultural differences. In: Proceedings of Scandinavian Maritime Conference (2012)
12. Burns, D., Worsley, S.: Navigating Complexity in International Development (2015). https://doi.org/10.3362/9781780448510
13. Ramalingam, B.: Aid on the Edge of Chaos (2013). https://doi.org/10.1017/CBO9781107415324.004
14. Ruiz, A.B.: The contribution of Humberto Maturana to the sciences of complexity and psychology. J. Constr. Psychol. 9, 283–302 (1996). https://doi.org/10.1080/10720539608404673
15. Vicente, K.J.: Cognitive Work Analysis. Lawrence Erlbaum Associates Inc., Mahwah (1999)
16. Rolls-Royce: Autonomous ships: the next step. AAWA Position Pap. Roll. Royce Plc., London (2016)
17. Thieme, C.A., Guo, C., Utne, I.B., Haugen, S.: Preliminary hazard analysis of a small harbor passenger ferry-results, challenges and further work. J. Phys: Conf. Ser. (2019). https://doi.org/10.1088/1742-6596/1357/1/012024
Reporting of Ethical Conduct in Human-Robot Interaction Research
Julia Rosén(B), Jessica Lindblom, and Erik Billing
Interaction Lab, University of Skövde, Högskolevägen 1, 541 28 Skövde, Sweden
[email protected]
Abstract. The field of Human-Robot Interaction (HRI) is progressively maturing into a distinct discipline with its own research practices and traditions. Aiming to support this development, we analyzed how ethical conduct was reported and discussed in HRI research involving human participants. A literature study of 73 papers from three major HRI publication outlets was performed. The analysis considered how often the following five principles of ethical conduct were reported: ethical board approval, informed consent, data protection and privacy, deception, and debriefing. These five principles were selected as they belong to all major and relevant ethical guidelines for the HRI field. The results show that overall, ethical conduct is rarely reported, with four out of five principles mentioned in less than one third of all papers. The most frequently mentioned aspect was informed consent, which was reported in 49% of the articles. In this work, we aim to stimulate increased acknowledgment and discussion of ethical conduct reporting within the HRI field.
Keywords: Human-Robot Interaction · Ethics · Methodology
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 87–94, 2021. https://doi.org/10.1007/978-3-030-79997-7_11
1 Introduction
There is an ongoing dialogue on how to shape the future of HRI [1, 2]. Baxter et al. [3] emphasize the importance of keeping the HRI field interdisciplinary while finding a common ground for conducting research. It should be emphasized that a truly interdisciplinary perspective on HRI will require researchers to adopt a wider set of concepts, theories, and methods in their own research, which implies the need to read a broader spectrum of literature as well as to correctly apply the methods therein [2]. Combining several different disciplines and research areas, as is necessary in the case of research within HRI, carries the risk that underlying epistemological, theoretical, and methodological foundations are misinterpreted, because they may not be explicitly articulated among the different disciplines and may be erroneously considered common knowledge within a community. Therefore, such endeavors ought to be discussed from a methodological perspective, including their ethical practices. The focus of this paper is ethical reporting when conducting HRI research with human participants. While there are already well-established protocols concerning the ethical principles, consensus has still not been reached on when and how these guidelines
should be applied within the HRI field. Proper ethical conduct ought to be considered in any research field and is certainly not a controversial, nor new, claim. It is crucial in fields where human participants are involved as the well-being and interests of the participants must be considered. Controversial experiments such as the obedience experiment by Milgram from 1963 [4] and the Stanford prison simulation by Haney et al. [5] created a need to standardize how to protect participants [6]. Of course, these kinds of unethical studies are not being executed in HRI research today; however, it might sometimes be difficult to foresee how the participants will be impacted by an experiment, which makes proper ethical conduct crucial for any research field. There are numerous ethical best practice documents created by e.g. foundations, associations, and lawmakers, to ensure empirical research is done ethically. In this work, we have found three common ethical guidelines that apply to empirical HRI research: Ethical Principles of Psychologists and Code of Conduct by the American Psychological Association (APA) [7], Ethics in Social Science and Humanities by the European Commission (EC) [8], and Declaration of Helsinki by the World Medical Association (WMA) [9]. In short, APA’s set of ethical principles is relevant for any experimental psychology research involving participants, and aims to ensure that the empirical research being done is ethical and well-reasoned. These principles are standardized across the field and are considered crucial when doing any psychology research. EC has several documents for ethical conduct depending on the area of research. For this paper, we have chosen to include their document on Ethics in Social Science and Humanities since much of the empirical HRI research being done today usually falls in this category; any research conducted within the European Union needs to adhere to these guidelines. 
Lastly, WMA created the Declaration of Helsinki with ethical principles for any medical research involving participants. This document is used in many fields that deal with human participants, including the HRI field. Within these guidelines, we have identified five recurring ethical principles. First, ethical board approval refers to a board or committee that is responsible for approving certain empirical research before the study is executed. An appropriate board depends on the regulations and laws that exist in the researchers’ residing country and where the study is intended to take place. Usually, the researcher must provide information to the ethical board about the intended study and how it is intended to be carried out. Second, informed consent is a document provided to the participants covering the nature of the study that they are asked to participate in. The participant should be informed of key elements of their participation, including but not limited to: how long the study takes, that they can end their participation at any point, any limits of confidentiality, and how to get in touch with the researchers if any questions or concerns should arise. Third, data protection and privacy refers to researchers’ responsibility to ensure that the participants’ information and data are kept with integrity and treated as confidential. For example, in studies involving EU citizens, the participants’ data protection and privacy need to be compliant with the European Union’s General Data Protection Regulation [10]. Fourth, deception refers to when the participants are deceived in any way while participating in the study. Deception is sometimes necessary to attain certain results that would not be revealed otherwise [7]. For deception to be justified, the researchers must have ruled out all other options that do not involve deception, agree that the empirical research is
important enough to risk deceiving participants, and they must explicitly explain to the participants afterwards what part of the study was deceptive. Fifth, debriefing refers to a session after the study where participants are made aware of information that is deemed appropriate to disclose. One purpose of debriefing is to correct any misconceptions the participants may have regarding what they experienced during the study, including revealing any forms of deception and why it was necessary. With the aim of stimulating an increased discussion and reporting of ethical principles, a literature study was conducted to gauge how ethical conduct is reported in empirical HRI research. We based our ethical reporting investigation on the five aforementioned ethical principles as these were developed as a way to protect participants and are relevant for HRI research.
2 Method
To obtain an overview of how ethics are reported in the HRI field, especially more recently, three major publication outlets from the HRI field were analyzed: the ACM/IEEE International Conference on Human-Robot Interaction (HRI) [11], the IEEE International Conference on Robot & Human Interactive Communication (RO-MAN) [12], and the ACM Transactions on Human-Robot Interaction (THRI) [13]. The years analyzed were 2018 for the HRI conference, 2019 for the RO-MAN conference, and all articles/papers published in 2018 and 2019 in the THRI journal. For both the HRI conference and the THRI journal, all full-length papers were considered. HRI had 49 full-length papers in total and THRI had 31 full-length papers in total (editorials excluded). For the RO-MAN conference, however, a random selection of 40 papers was used to reduce the number to a similar amount as the two other outlets. In total, 120 papers were considered. The sample used here can be seen as a form of data triangulation. Data triangulation is the usage of a variety of data sources in a study. Through the use of data triangulation, one explores whether the inferences from the empirical data are valid, and estimates consistency. Since the aim was to review how ethics are reported when participants are involved, we applied an inclusion criterion defined as studies that comprised human participants involved in an experimental setting. An experiment is defined as a study where certain variables are manipulated, which “investigates cause and effect relationships, seeking to prove or disprove a causal link between a factor and an observed outcome” [p. 127, 14]. The literature review was an iterative process; after the inclusion criterion was applied, the remaining papers were read in more detail and analyzed in terms of ethical conduct, considering the five common principles of APA [7], EC [8] and WMA [9], specifically: ethical board approval, informed consent, data protection and privacy, deception, and debriefing.
The purpose was to detect any explicit mentions of these five principles in relation to the reported experiments. If an aspect was mentioned, the paper was marked with a yes for that principle, otherwise, it was marked with a no. These principles were reported in numerous ways and to varying degrees. For example, we did not differentiate between a consent form that was reported in close detail (e.g. how the consent was given, what was included in it), and a consent form that was mentioned briefly (sometimes at other places than in the method section). For the scope of our literature study, we decided to not judge
the level of detail for each aspect in these papers, but only to note if there was any mention of them or not. Also, if the article discussed more than one experiment, we assigned a yes if the principle was mentioned in relation to at least one of these experiments. Thus, a no indicates that the principle was never mentioned for any experiment reported in the paper.
Fig. 1. The amount of ethical principles reported in 73 experiments from three major publication outlets in HRI.
Deception and debriefing were coded in the same manner: that is, they were included if the authors explicitly mentioned them. The exception was Wizard of Oz (WoZ) studies, i.e. studies in which robots are remotely controlled by a human, where deception is necessary; these were coded as involving deception unless it was mentioned that the participants were aware of the staging. Any publication that explicitly mentioned how the privacy of the participants was considered was marked with a yes. As the phrasing can vary when it comes to data protection and privacy, a closer analysis was made in order to detect whether this issue was mentioned. It was again mentioned to varying degrees, but we focused broadly and included any mention of it.
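The coding rule described in this section can be summarized in a short sketch. It is illustrative only and is not the instrument used in the study: the function name `code_paper`, the dictionary keys (`mentions`, `wizard_of_oz`, `participants_aware`) and the example paper are all hypothetical.

```python
PRINCIPLES = (
    "ethical board approval",
    "informed consent",
    "data protection and privacy",
    "deception",
    "debriefing",
)


def code_paper(experiments):
    """Code one paper as yes (True) or no (False) for each ethical principle.

    A principle is coded yes if it is explicitly mentioned for at least one
    experiment in the paper, regardless of how much detail is given.
    """
    coded = {principle: False for principle in PRINCIPLES}
    for experiment in experiments:
        mentioned = experiment.get("mentions", ())
        for principle in PRINCIPLES:
            if principle in mentioned:
                coded[principle] = True
        # Wizard of Oz exception: deception is coded yes unless the paper states
        # that the participants were aware of the staging.
        if experiment.get("wizard_of_oz") and not experiment.get("participants_aware"):
            coded["deception"] = True
    return coded


# Hypothetical paper reporting two experiments.
example_paper = [
    {"mentions": ["informed consent"]},
    {"mentions": [], "wizard_of_oz": True, "participants_aware": False},
]
print(code_paper(example_paper))
```

Because the rule records only whether a principle is mentioned, a brief remark and a detailed description of a consent procedure are coded identically.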
3 Results A total of 73 publications were analyzed in relation to ethical principles. We would like to stress that these results cannot say whether the authors had considered these ethical principles in their experiments or not, but rather if the authors explicitly mentioned them in their publication. Below is a summary of the primary findings (Fig. 1). Ethical board approval was explicitly mentioned in 23% of all the publications (7 in HRI, 4 in RO-MAN, and 6 in THRI). Thus a total of 17 out of the 73 publications explicitly mentioned that they had an ethical board approval to conduct the study. Informed consent was explicitly mentioned in 49% of all the publications (21 in HRI, 6 in RO-MAN, and 9 in THRI). Thus, a total of 36 out of 73 publications explicitly mentioned the use of informed consent in their study.
Reporting of Ethical Conduct in Human-Robot Interaction Research
Data protection and privacy was explicitly mentioned in 4% of all the publications. Three publications in HRI mentioned this aspect; however, none of the analyzed papers from RO-MAN and THRI discussed data protection and privacy. Thus, a total of 3 out of 73 publications explicitly mentioned that they considered data protection and privacy in their study. Deception was explicitly mentioned as a method in 22% of all the publications (8 in HRI, 4 in RO-MAN, and 4 in THRI). Thus, a total of 16 out of 73 publications explicitly mentioned that deception was used in their study. Out of these 16 publications, a WoZ setup was used as a method in 50% (8 out of 16 publications). Debriefing was explicitly mentioned in 19% of all publications (11 in HRI, 1 in RO-MAN, and 2 in THRI). Thus, a total of 14 out of 73 publications explicitly mentioned the use of debriefing in their study. As discussed in the background, debriefing is critical in studies involving deception; among the publications that involved deception, 44% mentioned debriefing in relation to deception (7 out of 16 publications). None of the five identified ethical principles were reported in 36% of the papers (26 out of the 73 publications). A follow-up email was sent to the corresponding authors of 2018 HRI conference publications that lacked information about both ethical board approval and informed consent. Out of 16 contacted authors, 7 responded that they did get ethical approval for their study and that all participants signed a consent form before participating in the study. One of these authors responded that the study was only conducted on lab members and therefore did not deem it appropriate to include informed consent and ethical board approval.
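The percentages reported in this section follow directly from the per-outlet counts. The short sketch below recomputes them; the counts are taken from the text, while the variable and function names are illustrative only.

```python
TOTAL_PUBLICATIONS = 73

# Number of publications explicitly mentioning each principle, per outlet.
counts = {
    "Ethical board approval": {"HRI": 7, "RO-MAN": 4, "THRI": 6},
    "Informed consent": {"HRI": 21, "RO-MAN": 6, "THRI": 9},
    "Data protection and privacy": {"HRI": 3, "RO-MAN": 0, "THRI": 0},
    "Deception": {"HRI": 8, "RO-MAN": 4, "THRI": 4},
    "Debriefing": {"HRI": 11, "RO-MAN": 1, "THRI": 2},
}


def percent(part, whole):
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


# Map each principle to (number of publications, percentage of all 73).
summary = {
    principle: (sum(by_outlet.values()),
                percent(sum(by_outlet.values()), TOTAL_PUBLICATIONS))
    for principle, by_outlet in counts.items()
}

for principle, (n, pct) in summary.items():
    print(f"{principle}: {n}/{TOTAL_PUBLICATIONS} ({pct}%)")

# Shares within the 16 publications that mentioned deception.
woz_share = percent(8, 16)         # WoZ setups among deception studies
debriefed_share = percent(7, 16)   # deception studies that mention debriefing
none_share = percent(26, TOTAL_PUBLICATIONS)  # no principle reported at all
```

Keeping the raw counts next to the percentages makes it easy to check that each total (17, 36, 3, 16, and 14) matches the sum over the three outlets.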
4 Discussion The results from our literature study show that ethical conduct was rarely reported in the three publication outlets chosen. THRI reports ethical board approval more frequently; 40% of the analyzed publications mentioned board approval, compared to HRI and RO-MAN where about 20% of the analyzed publications mentioned this aspect. The HRI conference requires that the corresponding author checks a box that the study has been approved by relevant ethics committees if participants were involved. Due to this, it could be argued that it is not necessary to explicitly write that the study has been ethically approved and might be the reason for the low number reported in the reviewed papers. Despite this, it would be of value to list this in the publications, making it more obvious for the reader when, and how, ethical principles were considered. Readers that have not submitted to this conference before would not know that an ethical board approval box was checked before submission. This argument does not, however, explain the low rate of reporting informed consent. Informed consent is the most commonly reported aspect of ethical conduct that we identified; however, it is still missing in half of all papers examined. HRI and THRI report informed consent more frequently than RO-MAN in this regard, but all outlets neglected to mention this aspect in at least 40% of the analyzed papers. As indicated by the responses to the emails sent to corresponding authors of some HRI publications, both board approval and informed consent are probably used more
frequently than reported. Still, proper ethical conduct deserves to be properly emphasized in the literature. Surprisingly, only 4% of the papers in this literature study explicitly mentioned data and privacy. By data and privacy, we mean cases where the authors address how the data are handled and how the privacy of the participants is protected. This issue is relevant to HRI research since some personal data are usually gathered in experimental studies, e.g., through video recordings. As with other ethical principles, data and privacy policies are likely applied more frequently than explicitly reported in the papers. Nevertheless, it is an important ethical concern that deserves more attention. Deception and debriefing are well-established and customary practices in fields like psychology, and similar strategies can be seen in empirical HRI research. Robots presented in HRI studies may appear to be more intelligent than they actually are, oftentimes through intentional deception, e.g. WoZ [15]. In our literature study, there were eight publications that used WoZ, which is a common method used in this field [15]. Other than WoZ, deception was explicitly mentioned in 14 publications. Interestingly, there were 14 publications in total where debriefing was explicitly mentioned, but only 9 of these came from the publications with deception. Our interpretation is that debriefing is also used in studies not including deception to inform the participants about the nature, purpose, and conclusions of the empirical study. This is a positive practice that could perhaps be adopted more frequently. Moreover, we found some themes that could border on deception. For example, two papers from the 2018 HRI conference used a robot with emulated emotions. Although these publications used a consent form, each robot’s capabilities do not seem to be addressed explicitly.
Another publication deceived participants into thinking the robot was making errors in a card game when in fact the robot’s behavior was programmed. It could be argued that studies like these should include debriefing to make sure that participants do not leave the study with any misconceptions. Based on the obtained findings in our literature study, debriefing might not always be considered before sending participants away after the data collection step is conducted. Though it might not always be needed, it could be a very useful tool, not only to fulfill ethical guidelines, but to gain a deeper understanding of participants involved in HRI studies. In addition to this, it is also possible that by not being truthful to participants and by not debriefing them, the view that researchers are not truthful could be more common which could cause undesirable effects on future studies. Although many publications neglected to report some, or all, of the five ethical principles in their papers, there were several authors that pursued proper ethical conduct in their work. These are, of course, worth noting and could perhaps be seen as best practice when conducting empirical HRI research. For example, Oliveira et al. [16] explicitly mentioned informed consent (“After signing an informed consent, participants were asked to provide information regarding their sex and age”), data and privacy (“The anonymity and confidentiality of the individual data was guaranteed”), and debriefing (“At the end, participants were thanked for their collaboration, received a movie ticket for participating in the study, and were debriefed”). Rea and Young [17] explicitly mentioned ethical board approval (“Our university’s research ethics board approved both studies”), informed consent (“Participants were first given a briefing of the experiment and signed an informed consent form”), and deception with debriefing (“after all three conditions,
the participants were debriefed about the deception in the obstacle course room with the robot – it was, in fact, always the exact same robot”). The publications presented in the above paragraph show some of the unique ethical issues the field of HRI is facing, and to our knowledge there are not yet any established guidelines on how this type of deception, involving (human-like) robots, should be treated. Whereas WoZ studies involve more explicit deception, these studies touch on uncharted territory that needs to be addressed further in this field. Future studies could explore this topic further. One possible reason for excluding ethical conduct from the published articles may be space limitation. Like many technical conferences, both HRI and RO-MAN put a hard page limit on published papers. Some conferences, including HRI, already exclude references from this page limit. We suggest that acknowledgments of ethical conduct could be treated in a similar way. If references to board approval, informed consent, and other ethical conduct could be included outside the page limit, it might help shape a practice where ethical conduct is a standard piece included in HRI publications to a larger degree than today. It is worth noting that underreporting of ethical principles is not an issue unique to HRI. A literature study by Schroter et al. [18] found that ethical approval and consent forms are underreported in medical journals, with failure to mention these in 31% and 47% of the papers, respectively. The above authors emphasize the importance of being transparent in publications. We hope that our paper’s contribution can be one step towards a similar discussion in the HRI field. The issue of how to handle ethical issues, like deception, deserves wider acknowledgment and discussion within the HRI field. One concrete example could be the four criteria put forward by Matthias [19], providing a framework for when and why misleading and deceiving robots are morally permissible.
According to the author, deception when using robots in healthcare is only morally acceptable when (1) it is in the patient’s best interest, (2) it results in increased autonomy for the patient (e.g. being able to make more choices and being able to control the robot), (3) it is transparent, or at least suggested, that deception is occurring, and the patient can choose to stop the deception, and (4) no harm could come to the patient, directly or indirectly. The latter also means that if patients rely on a specific service from the robot (e.g. reminders to take medication), they must also be informed of the actual capabilities and services of the robot so that they know what to expect from it. The underreporting of ethical principles in HRI research may have several ethical implications at the societal level. On the one hand, uninformed participants’ expectations of robots may result in misunderstandings of current robots’ capabilities and functionality. On the other hand, HRI researchers often demonstrate their robots to the public with pre-scripted lines and behaviors, where the robot interacts “naturally” with a human, without explaining the robot’s functionality and how the interaction was set up beforehand. Although this is outside the scope of this study, we want to raise the question: what ethical responsibilities do researchers have towards not only participants, but also the public, when presenting robots? We hope that this work helps to stimulate the conversation about proper ethical conduct in the interdisciplinary field of HRI, both methodologically and ethically.
Acknowledgements. Special thanks to Oskar MacGregor for his valuable insight on proper ethical conduct. We would also like to thank Erik Lagerstedt and Kajsa Nalin for their support and help on parts of the analysis of the literature study.
References
1. Dautenhahn, K.: Socially intelligent robots: dimensions of human-robot interaction. Philos. Trans. Roy. Soc. B Biol. Sci. 362(1480), 679–704 (2007)
2. Lindblom, J., Andreasson, R.: Current challenges for UX evaluation of human-robot interaction. In: Schlick, C., Trzcieliński, S. (eds.) Advances in Ergonomics of Manufacturing: Managing the Enterprise of the Future, pp. 267–277. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-41697-7_24
3. Baxter, P., Kennedy, J., Senft, E., Lemaignan, S., Belpaeme, T.: From characterising three years of HRI to methodology and reporting recommendations. In: The Eleventh ACM/IEEE International Conference on Human Robot Interaction, pp. 391–398. IEEE Press (2016)
4. Milgram, S.: Behavioral study of obedience. J. Abnorm. Soc. Psychol. 67(4), 371–378 (1963)
5. Haney, C., Banks, C., Zimbardo, P.: Study of prisoners and guards in a simulated prison. Naval Res. Rev. 26(9), 1–17 (1973)
6. Weiten, W.: Psychology: Themes and Variations. Cengage Learning (2007)
7. American Psychological Association: Ethical Principles of Psychologists and Code of Conduct. American Psychological Association. https://www.apa.org/ethics/code
8. European Commission: Ethics in Social Science and Humanities. European Commission. https://ec.europa.eu/info/sites/info/files/6._h2020_ethics-soc-science-humanities_en.pdf
9. World Medical Association: Declaration of Helsinki. World Medical Association. https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/
10. European Union: General Data Protection Regulation. EU. https://gdpr-info.eu
11. ACM/IEEE International Conference on Human-Robot Interaction. https://dl.acm.org/conference/hri
12. IEEE International Conference on Robot and Human Interactive Communication (RO-MAN). https://www.ieee-ras.org/conferences-workshops/financially-co-sponsored/ro-man
13. ACM Transactions on Human-Robot Interaction. https://dl.acm.org/journal/thri
14. Oates, B.J.: Researching Information Systems and Computing. Sage (2005)
15. Riek, L., Howard, D.: A code of ethics for the human-robot interaction profession. In: Proceedings of We Robot (2014)
16. Oliveira, R., et al.: Friends or foes? Socioemotional support and gaze behaviors in mixed groups of humans and robots. In: Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, pp. 279–288. ACM (2018)
17. Rea, D.J., Young, J.E.: It’s all in your head: using priming to shape an operator’s perceptions and behavior during teleoperation. In: Proceedings of the 2018 ACM/IEEE International Conference on Human Robot Interaction, pp. 32–40. ACM (2018)
18. Schroter, S., Plowman, R., Hutchings, A., Gonzalez, A.: Reporting ethics committee approval and patient consent by study design in five general medical journals. J. Med. Ethics 32(12), 718–723 (2006)
19. Matthias, A.: Robot lies in health care: when is deception morally permissible? Kennedy Inst. Ethics J. 25(2), 169–192 (2015)
Exploratory Analysis of Research Publications on Robotics in Costa Rica Main Public Universities
Juan Pablo Chaverri(B), Adrián Vega, Kryscia Ramírez-Benavides, Ariel Mora, and Luis Guerrero
Universidad de Costa Rica, San José, Costa Rica {juan.chaverri,adrian.vegavega,kryscia.ramirez,ariel.mora,luis.guerreroblanco}@ucr.ac.cr
Abstract. The article presents the main robotic trends linked to the publications of the institutional repositories of the five main Costa Rican public universities: Tecnológico de Costa Rica, University of Costa Rica, National University, State Distance University, and National Technical University. The publications were obtained by using the keyword ‘robotics’ in the search engines of each repository. This procedure generated 241 results, of which only 55 were relevant, according to the application of five selection criteria. The analysis of the publications involved the categorization of nine general variables and the counting of their frequencies. The results obtained indicate that the publications cover a period of 14 years, in which male participation and the presence of the Tecnológico de Costa Rica and the University of Costa Rica predominate. The main robotic fields detected correspond to educational, autonomous, and industrial robotics.
Keywords: Robotics · Costa Rica · Public Universities · Publication trends · Educational robotics · Autonomous robotics · Gender gap
1 Introduction Costa Rica has been characterized by investing, from the last decades of the 19th century to the second decade of the 21st century, a significant percentage of its income in public education [1]. This has led to the development of primary, secondary, and university education in different periods of its history. These investments made public universities the main context for Costa Rican scientific and technological innovation [2]. As a result, the Costa Rican public university context is a reference for the study of technical, cultural, and political-economic phenomena in the country linked to the development of technologies for the different institutions and production systems. Recently, one of the technologies redefining human work and self-perception is robotics [3]. The use of robotics with other computational resources is facilitating the incorporation of various mechatronic applications in contexts where they were previously not viable or were considered science fiction topics, for example domestic environments [4]; hospitals, shopping centers, and retirement homes [5]; or classrooms [6], among others. This is changing the economic, legal, cross-cultural, and geopolitical backgrounds, bringing opportunities and challenges for each country, as well as new forms of socialization.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 95–102, 2021. https://doi.org/10.1007/978-3-030-79997-7_12
For this reason, the present research reviews the publications on robotics produced by the main public universities in Costa Rica, to determine the trends of its development in this context. The universities in question are the following: Tecnológico de Costa Rica (TEC), University of Costa Rica (UCR), National University (UNA), State Distance University (UNED), and National Technical University (UTN).
2 Research Objective The classification of publications related to robotics works as a starting point to differentiate how the work in this field is divided among the five Costa Rican public universities and what trends constitute it. Therefore, this research was oriented to determine the following variables for each publication: 1) the year of publication, 2) the gender of the people involved in the authorship, 3) the type of publication, 4) the publication format, 5) the university affiliation of the authorship, 6) the robotic field covered, 7) the type of robot used, 8) the target audience for the robot, and 9) the role played by the robot. The inclusion of the variable ‘gender’ was carried out since an important gender gap has been reported in the area of STEM in Costa Rica [7]. Thus, it is appropriate to determine how the participation of genders behaves in this field of knowledge to refer to the possible gender symmetries or asymmetries that each university incorporates on this topic.
3 Background Systematic bibliographic reviews, also called “meta-analytic” or “meta-synthetic” reviews [8], constitute a valuable methodological resource for checking the consistency of the conclusions drawn on a research question or for abstracting new theoretical aspects on certain topics from a bibliographic corpus that can sometimes be voluminous. The present investigation is a step toward the future design of such systematic bibliographic reviews, since in the Costa Rican context no investigations have been found that even refer to the objective that guides this investigation, and because it tries to show the trends that define Costa Rican robotics, so that future meta-analytic or meta-synthetic reviews on the characteristics of any of these trends can be elaborated.
4 Methodology The publications analyzed were obtained by consulting the institutional repositories of the universities mentioned. This query was carried out between July 5th, 2020, and January 15th, 2021, using the keyword ‘robotics’ in the search engine of each repository. The results generated by this search correspond to the digital sources available. The repositories consulted are the following: TEC, RepositorioTEC [9]; UCR, Kérwá Repository [10]; UNA, Institutional Academic Repository [11]; UNED, ReUNED [12]; and UTN, UTN Repository [13].
Thus, the search carried out generated a total of 271 results, broken down as follows: TEC Repository (114), Kérwá Repository (79), Institutional Academic Repository (35), ReUNED (43), and UTN Repository (0).
4.1 Selection Criteria The selection of the publications involved the implementation of three criteria. The first distinguishes publications related to mechatronic systems (a0) from those related to “virtual” robotic systems (a1), such as chatbots or software aimed at automating computer procedures. The second distinguishes publications that refer merely to the literal word ‘robotics’ (b0), as a component of the technological transformations of the industry of the current century or in related mentions, from those properly oriented to the design and implementation of mechatronic systems (b1) or software architectures for robots under development (b2). Finally, the third distinguishes publications that make a tangential mention of the word ‘robotics’ (c0), publications of an informative nature on aspects of robotics (c1), and publications linked to the presentation, discussion, or evaluation of educational robotics programs (c2). In summary, the publications selected were those meeting the conditions a0, b1, b2, c0, c1, and c2. Thus, 55 relevant references were identified, which are available at [14–68] for eventual consultation.
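The screening step can be illustrated with a short sketch. The repository counts and the retained condition codes come from the text; everything else is an assumption made for illustration, in particular the idea that each publication is tagged with one or more condition codes and is kept only if all of its codes belong to the retained set. The function and variable names are hypothetical.

```python
# Search results per repository, as listed in the text.
repository_hits = {
    "TEC Repository": 114,
    "Kérwá Repository": 79,
    "Institutional Academic Repository": 35,
    "ReUNED": 43,
    "UTN Repository": 0,
}
total_hits = sum(repository_hits.values())

# Condition codes that the text lists as appropriate for selection.
RETAINED_CONDITIONS = {"a0", "b1", "b2", "c0", "c1", "c2"}


def is_relevant(condition_codes):
    """Assumed rule: keep a publication only if every code assigned to it
    is one of the retained conditions (so a1 or b0 leads to exclusion)."""
    codes = set(condition_codes)
    return bool(codes) and codes <= RETAINED_CONDITIONS


# Hypothetical examples of tagged publications.
examples = {
    "thesis on a mobile robot controller": ["a0", "b1"],
    "report on a chatbot": ["a1"],
    "essay mentioning robotics as an industry trend": ["b0"],
    "evaluation of a school robotics program": ["a0", "c2"],
}
kept = [title for title, codes in examples.items() if is_relevant(codes)]
```

Under this assumed rule the chatbot report and the industry-trend essay are excluded, while the other two examples are kept; the per-repository counts add up to the 271 results stated above.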
5 Findings and Discussion of Results The main findings are shown in Figs. 1, 2, and 3. These are detailed below.
5.1 Distribution of Publications According to Year, Gender of Authorship, University, and Robotic Field Figure 1 indicates that the analyzed publications span a period of 14 years, covering three decades: 2000–2009, 2010–2019, and 2020–2029. However, most of the publications are concentrated in the second half of the 2010–2019 decade. In addition, the large number of publications with multiple authorship forced the creation of the gender categories ‘Mixed, gender parity’, ‘Mixed, female predominance’, and ‘Mixed, male predominance’. The first category applies when a publication has an equal number of authors of each gender. The last two apply when the number of authors of one gender is greater than the number of authors of the other. It can be seen that the male gender predominates over the other gender configurations. Male participation is absent only for the year 2013. Female participation is approximately half that of males and is concentrated in 2019, although it accounts for the largest number of publications that year. If the set of publications with multiple authorship is taken into account, without distinguishing the gender variable, it can be seen that teams working on robotics or robotics-related issues have proliferated in the last eight years. However, when the gender variable is considered in these teams, those with a male majority continue to prevail over those with a female majority. Thus, gender parity is still a pending issue in this scientific and technological field. Regarding the distribution of publications according to university origin, we observe that the TEC reports the highest number of publications, followed by the UCR, with a minimal presence of the UNA. The UNED and the UTN are characterized by the absence of content in this field. On the other hand, the distribution of publications by gender and university indicates that the TEC concentrates the largest number of publications produced by men, which makes it the university with the main gender asymmetry in the studied population. However, the TEC also has the largest number of publications produced by women and the fewest publications with multiple authorship. The UCR stands out for registering a greater number of publications with multiple authorship and for significantly reducing the gender asymmetry shown globally, while UNA shows gender parity. Finally, the main robotics fields dealt with by Costa Rican public universities are the following: educational robotics (17), autonomous (14), industrial (11), and social (4). The main fields treated by TEC are autonomous, industrial, and educational robotics, with TEC predominating globally in the first two. Meanwhile, the UCR concentrates on educational and social robotics, predominating globally in both. UNA covers the field of educational robotics exclusively.
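The gender categories used for publications with multiple authorship can be expressed as a small classification function. This is a sketch for illustration only: the three 'Mixed' labels come from the text, whereas the labels 'Female' and 'Male' for publications whose authors all share one gender, and the function name, are assumptions.

```python
def authorship_category(n_female, n_male):
    """Classify a publication by the gender composition of its authors."""
    if n_female < 0 or n_male < 0 or n_female + n_male == 0:
        raise ValueError("a publication needs at least one author")
    # Assumed labels for publications whose authors all share one gender.
    if n_male == 0:
        return "Female"
    if n_female == 0:
        return "Male"
    # Mixed authorship: compare the number of authors of each gender.
    if n_female == n_male:
        return "Mixed, gender parity"
    if n_female > n_male:
        return "Mixed, female predominance"
    return "Mixed, male predominance"


# Usage with hypothetical author counts (female, male).
labels = [authorship_category(f, m) for f, m in [(1, 0), (2, 2), (1, 3), (3, 1)]]
```

Counting how often each label occurs per year or per university would yield frequency distributions of the kind summarized in Fig. 1.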
Fig. 1. Distribution of publications according to year, gender, university, and robotic field. (The color difference in the year-gender flow indicates the different decades covered by the population. MFP: Mixed, female predominance; MMP: Mixed, male predominance; MGP: Mixed, gender parity.)
5.2 Distribution of Publications According to Gender, Format, Source, and Robotic Field The classification of publications according to reference and publication types makes it possible to distinguish relationships not visible in Fig. 1, displaying the social division of labor among the universities that generate publications specifically oriented to the design, presentation, and evaluation of robotic systems in Costa Rican public education. Also, Fig. 2 shows that robotics publications far exceed the number of robotics-related publications. ‘Thesis’ constitutes the major type of reference for the first type of publication, and ‘Reports’ predominate in the second. In this sense, it can be seen that the distribution of publications between the variables ‘type of publication’ and ‘robotic field’ generates a flow very similar to that shown between the variables ‘university’ and ‘robotic field’ in Fig. 1. However, Fig. 2 allows us to identify that approximately half of the publications related to educational robotics are robotics-related publications and that most robotics-related publications refer to this educational field. If the UCR predominates in the fields of educational and social robotics, as stated based on Fig. 1, the previous observation implies that the UCR stands out specifically in the field of social robotics when it comes to robotics publications, while its predominance in the field of educational robotics is due to robotics-related publications. Therefore, it can be concluded that the TEC is the main Costa Rican university linked to the development of robotic systems, as well as the public university that works in the greatest diversity of robotic fields in the country. Meanwhile, the UCR deals mainly with social aspects related to the application of robotics in the Costa Rican public educational system. Finally, Fig. 2 indicates that the gender asymmetry shown globally in Fig. 1 is maintained for each of the types of references identified, whether considered from the point of view of individual or multiple authorship. However, the gender gap is wider when it comes to thesis production. Female participation is predominantly concentrated in thesis production, which suggests participation more focused on the development of robotic systems.
5.3 Distribution of Publications by Type of Robot, Target Audience, and Robotic Role This subsection records the type of robot, target audience, and robotic role presented in each publication. To this end, those publications that alluded to more than one educational or research program on robotics applications were reviewed to determine the referent of these three variables. The references that made this type of mention were publications on robotics. Therefore, this decision tries to record the aforementioned variables as broadly as the sources consulted allowed. The types of robots most frequently reported in the Costa Rican public university context are the following: educational robotics kits (first place), robotic arms and humanoid robots (second place), terrestrial mobile agents (third place), and rotary-wing drones (fourth place).
Fig. 2. Distribution of publications according to gender, type of reference, type of publication, and robotic field.
The category ‘Does not apply’ covers all those cases in which the target audience and the purpose of the educational program were mentioned, but the information regarding the type of robot used in such programs was omitted. This condition illustrates the limitations that this type of publication entails for the present analysis. Moreover, the main target audiences are the following: Academy-Industry (first place), school (second place), college (third place), people with disabilities and children of preschool age (fourth place), and university and academy students (fifth place). The ‘Academy-Industry’ category reflects a high number of projects in the mechatronics or electronics engineering area that represent a significant bond between academy and industry, sometimes explicit and other times implicit, so that the boundary between one domain and the other was fuzzy or very fuzzy. This justified the use of the combined category, as it reduces the arbitrariness that disaggregation might give rise to. The opposite happened with the categories ‘Academy’ and ‘Industry’, where the link was exclusive to one domain. Finally, the main robotic roles reported in the literature are the following: educational resource (first place), industrial manipulator (second place), and probe (third place). When the robots playing a care role are grouped together, this type of role acquires a relevance similar to that shown by the industrial manipulation of objects.
Fig. 3. Distribution of publications according to robot type (left), target population, and robot role (right). (PLC: Programmable logic controller; MPS: Modular production system; SCARA: Selective compliance assembly robot arm.)
6 Conclusions In short, the research has shown a “photograph” of the main robotic trends registered in the publications of the five university repositories analyzed. It is the image of a recent phenomenon, mainly male, concentrated predominantly on the TEC campuses, that generates robotics applications in the fields of autonomous, industrial, and educational robotics. Supplementary Materials: The complete bibliography of the 55 references analyzed in the full-text review is available at https://citic.ucr.ac.cr/sites/default/files/lr_references_krb.pdf. Acknowledgments. The authors appreciate the timely collaboration of Paula Sibaja Conejo during the preparation of the graphics, as well as Mónica Aguilar Bonilla’s suggestions regarding the inclusion of the five university contexts in the research.
Funding. This research was funded by ECCI and CITIC at the University of Costa Rica, grant number 834-C0-02. Conflicts of Interest. The authors declare no conflict of interest.
References
1. Molina Jiménez, I.: El financiamiento educativo público en Costa Rica a largo plazo (1860–2016). Hist. Mem. 16, 165–198 (2018)
2. Orozco-Barrantes, J., Guillén-Pérez, S.: Objetivos e instrumentos de las políticas de innovación en Costa Rica. Política Económica Para El Desarro. Sosten. 6, 1–24 (2020)
3. Richardson, K.: Challenging Sociality: An Anthropology of Robots, Autism, and Attachment. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-74754-5
4. Ronzhin, A., Rigoll, G., Meshcheryakov, R. (eds.): ICR 2016. LNCS (LNAI), vol. 9812. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-43955-6
5. Salichs, M.A., Ge, S.S., Barakova, E.I., Cabibihan, J.-J., Wagner, A.R., Castro-González, Á., He, H. (eds.): Social Robotics: 11th International Conference. ICSR 2019. LNCS (LNAI), vol. 11876. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-35888-4
6. Daniela, L. (ed.): Smart Learning with Educational Robotics: Using Robots to Scaffold Learning Outcomes. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-19913-5
7. Programa Estado de la Nación: Educación superior en Costa Rica. In: Román, M., Román, I., Vargas, A.J., and Cullell, J.V. (eds.) Informe Estado de la Educación 2019, pp. 151–205. Programa Estado de la Nación, San José, Costa Rica (2019)
8. Siddaway, A.P., Wood, A.M., Hedges, L.V.: How to do a systematic review: a best practice guide for conducting and reporting narrative reviews, meta-analyses, and meta-syntheses. Annu. Rev. Psychol. 70, 747–770 (2019)
9. Tecnológico de Costa Rica: RepositorioTEC. https://repositoriotec.tec.ac.cr/discover. Accessed 10 Feb 2021
10. Universidad de Costa Rica: Repositorio Kérwá. http://kerwa.ucr.ac.cr/. Accessed 10 Feb 2021
11. Universidad Nacional: Repositorio Académico Institucional. https://repositorio.una.ac.cr/discover. Accessed 10 Feb 2021
12. Universidad Estatal a Distancia: ReUNED. https://repositorio.uned.ac.cr/reuned/discover. Accessed 10 Feb 2021
13. Universidad Técnica Nacional: Repositorio UTN. http://repositorio.utn.ac.cr/. Accessed 10 Feb 2021
A Century of Humanoid Robotics in Cinema: A Design-Driven Review
Niccolò Casiddu, Claudia Porfirione, Francesco Burlando(B), and Annapaola Vacanti
Dipartimento Architettura e Design, Università degli Studi di Genova, Stradone Sant’Agostino 37, 16123 Genova, GE, Italy
{niccolo.casiddu,claudia.porfirione,francesco.burlando}@unige.it, [email protected]
Abstract. This research provides a design-driven overview of the fifty main humanoid robots that appeared in science fiction movies from 1915 to 2018. The study compares their principal aesthetic and interaction features in relation to the kind of character each robot plays and the kind of movie in which it appears. As a result, the research defines a user-centered taxonomy of humanoid robotics and provides a graphical display of the data on the aesthetic and interaction features of the analyzed robots.
Keywords: User centered design · Interaction design · Humanoid robotics
1 Introduction
The scope of unmanned systems is constantly expanding and, in the near future, humanoid robots will become part of our everyday life. A user-centered approach to their design is therefore required to ensure proper human-robot interaction in all the situations in which humanoids will be needed. At present, however, humanoid robots are not widespread, and those that are actually on the market are capable of only limited interactions. Our disappointment with such limited features is due, in part, to a bias generated by the movie industry. In fact, when people are asked about the word “robot”, one of the first things they think of is sci-fi cinema [1]. For this reason, our expectations have become too high compared with the technological advances that followed in the real world. Even C-3PO, which was designed in the 1970s, interacts with humans better than real humanoid robots can today. The reason, of course, is that Star Wars is a science fiction movie, but when humanoid robots entered the real market in the 2000s, people expected something similar. And they are still waiting for it.
Attribution of paragraphs. 1: Claudia Porfirione. 2, 5: Francesco Burlando. 3, 4: Annapaola Vacanti. Work carried out under the supervision of Niccolò Casiddu.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 103–109, 2021. https://doi.org/10.1007/978-3-030-79997-7_13
There is evidence that sci-fi movies have influenced the design of some real robots [2], and the robotics industry has in turn inspired certain directors [3]. In light of this mutual exchange, a design-driven analysis of robotic characters from science fiction movies could help in the future design of real-world humanoid robots. We can consider the film director an almighty designer who need not submit to technological limits and can therefore make any design choice with the sole purpose of defining the appearance and behaviour the director wants the character to have. It is thus of particular interest to analyze aesthetic and interaction features: for example, what colour is used for the eyes of an evil character? What sound does a goofy character make? Are smaller robots more likely to be good or evil?
2 Taxonomy
Some previous studies in the literature have analyzed humanoid robotics in sci-fi movies. Merz et al. [2] built a database of 411 robots by searching “Robot + Movie” and “ScienceFiction + Robot” on Google. Mubin et al. [4] included in their research the 12 sci-fi robots of the Carnegie Mellon University Robot Hall of Fame [5] and the top 10, in terms of IMDb rating, of the 28 popular sci-fi robots analyzed by Shin & Jeong [6]. However, those studies selected robots based on their popularity or their interaction skills. The present contribution takes the opposite approach: it selects robots on the basis of their appearance in order to discover whether this element correlates directly with others, such as the ability to interact or the renown of the character. It is therefore necessary to define a taxonomy based on the degree of anthropomorphism, which makes it possible to select the robots to be analyzed. Building on Mori’s well-known uncanny valley theory [7], Eaton’s taxonomy [8] and previous work by the authors [9], we propose a taxonomy of n + 1 separate levels of human-likeness.
0 Replicant. A robot identical to a person, placed on the upper right side of the uncanny valley chart. At the moment, this is a hypothetical level of human-likeness and represents an asymptote for robotics designers.
1 Actroid. A robot with strong visual human-likeness. Judged on appearance alone, an actroid would most certainly be mistaken for a human being. Some actroids are copies of a specific person; these are called geminoids, from the Latin geminus, which means twin. Actroid designers use a silicone-based material to resemble human skin, and these robots usually have fake hair, fingernails, etc. In sci-fi movies, these robots are played by human actors without makeup, or with some makeup or other devices that signal that they are not human beings.
n-6 Half-Actroid. It has the morphology of an actroid in some parts of the body, usually the upper part, but the rest of the body is missing or belongs to the android category. In particular circumstances, it can be mistaken for a human being, but it is a deception doomed to fail.
n-5 Android. A robot with the morphology of a human. It has arms, legs and a head with mouth and eyes. These last two can be motorized or static. However, there is no possibility of mistaking the robot for a human being, since it has no skin but a cover made of plastic or metal.
n-4 Humanoid. It has the broad morphology of a human, but the design of the anthropomorphic features is simplified. Even if it does not resemble a human, almost all the anthropomorphic features are represented. The more elements are missing, the closer the robot is to n-3.
n-3 Inferior Humanoid. An n-4 humanoid with more missing parts. It may have a monitor instead of a face or a wheeled base instead of legs. It is difficult to define a precise boundary between n-4 and n-3, since some robots have very accurate anthropomorphic characteristics in some parts of the body while missing other parts.
n-2 Cartoonized Humanoid. Whereas an n-4 humanoid robot is human-inspired, an n-2 cartoonized humanoid is inspired by comic book or cartoon characters. Like n-4, it has a roughly human morphology, but its exterior appearance recalls a superhero or a manga character.
n-1 Human-Inspired. It has the broad morphology of a human body only in some parts (usually an arm). Therefore, it definitely does not resemble a human. These robots are typically industrial and sit on the lower left side of the uncanny valley chart.
n Human-Centered Designed. Even if it looks nothing like a human, it is designed to interact with humans and to operate in human environments. Therefore, its design is most likely based on anthropometric measures.
3 Selection Criteria
On the basis of the taxonomy above, we selected robots belonging to categories n-2 to n-6, on the following grounds. The present contribution investigates the parallels and deviations between the design of real humanoid robots and that of humanoid robotic characters in sci-fi movies. Categories n-1 and n are therefore not of interest, since their degree of human-likeness is too low. At the same time, category 0 can exist only in sci-fi movies, and the approach to the design of those characters is completely different from that of other robots, since usually a human actor is used with just a few touches from the makeup crew. Category 1 has very little space in sci-fi movies because, as just mentioned, reaching the highest level of human-likeness with a real actor is very easy, so actroids are rarely used. Likewise, all cartoon and animation movies were excluded, since the design process of their characters is too different for them to be included in the same set as other robotic characters. Starting with the Golem from Der Golem, directed by Paul Wegener and Henrik Galeen in 1915, and covering the following one hundred years, fifty humanoid robots that fulfil the criteria were found in movie and TV series productions¹. These characters were analyzed according to the following principles and, as a result, displayed in a summary visualization.
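The taxonomy and the selection rule can be written down as a small data structure. The following is only a minimal sketch, not the authors' tooling: it assumes n = 8 (inferred from the nine named levels, so that n-6 equals 2), and the names `HumanLikeness` and `is_selected` are hypothetical.

```python
from enum import IntEnum


class HumanLikeness(IntEnum):
    """Levels 0..n of the taxonomy; n = 8 is an assumption inferred from the text."""
    REPLICANT = 0
    ACTROID = 1
    HALF_ACTROID = 2             # n-6
    ANDROID = 3                  # n-5
    HUMANOID = 4                 # n-4
    INFERIOR_HUMANOID = 5        # n-3
    CARTOONIZED_HUMANOID = 6     # n-2
    HUMAN_INSPIRED = 7           # n-1
    HUMAN_CENTERED_DESIGNED = 8  # n


N = max(HumanLikeness)  # the lowest degree of human-likeness, level n


def is_selected(level, animated=False):
    """Selection rule of Sect. 3: keep levels n-6..n-2 and exclude animation."""
    return (not animated) and (N - 6 <= level <= N - 2)


# The five levels that pass the filter (live-action productions only).
selected_levels = [lvl.name for lvl in HumanLikeness if is_selected(lvl)]
```

Encoding the levels as integers keeps the ordering by human-likeness explicit, so the range check mirrors the wording of the selection criteria directly.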
4 Visualization
This section provides a graphical analysis of the aesthetic and interaction features of the fifty selected humanoid robots. Robots are divided into groups according to the main scope in which they operate. Caregiving includes all robots that operate specifically for the medical assistance of human characters. Physical Tasks includes all characters intended to carry out bodily work on behalf of a human. Military groups robots that perform tasks of surveillance, civil security or war. Research includes robots that are being designed during the plot for research purposes. Service and Assistance includes the robotic characters that assist humans in everything that does not concern health or physical work. The following features are displayed: year of first release of the movie or TV series, country of production, IMDb rating [10], level in the taxonomy above, estimated height, two main colors, and presence of legs, arms, eyes, nose and mouth. Moreover, the visualization shows whether the character is evil and its supposed gender (Fig. 1).
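The features listed above map naturally onto one record per character. The sketch below is an illustration under stated assumptions, not the dataset behind Fig. 1: the field names, the `Role` enum and the helper `evil_share_by_role` are hypothetical, and the two sample records contain placeholder values rather than data from the study.

```python
from collections import Counter
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional, Tuple


class Role(Enum):
    CAREGIVING = "Caregiving"
    PHYSICAL_TASKS = "Physical Tasks"
    MILITARY = "Military"
    RESEARCH = "Research"
    SERVICE_AND_ASSISTANCE = "Service and Assistance"


@dataclass(frozen=True)
class RobotCharacter:
    name: str
    title: str
    year: int                     # first release of the movie or TV series
    country: str
    imdb_rating: float
    taxonomy_level: int           # level of human-likeness
    height_m: float               # estimated height
    main_colors: Tuple[str, str]
    has_legs: bool
    has_arms: bool
    has_eyes: bool
    has_nose: bool
    has_mouth: bool
    is_evil: bool
    gender: Optional[str]         # supposed gender, None if not assignable
    role: Role


def evil_share_by_role(robots):
    """Fraction of evil characters within each role group."""
    totals, evil = Counter(), Counter()
    for robot in robots:
        totals[robot.role] += 1
        evil[robot.role] += robot.is_evil
    return {role: evil[role] / totals[role] for role in totals}


# Placeholder records with invented values, for illustration only.
example = RobotCharacter(
    name="Placeholder A", title="Placeholder Movie", year=2000, country="N/A",
    imdb_rating=0.0, taxonomy_level=3, height_m=1.8,
    main_colors=("grey", "black"), has_legs=True, has_arms=True,
    has_eyes=True, has_nose=False, has_mouth=False,
    is_evil=False, gender=None, role=Role.MILITARY,
)
evil_twin = replace(example, name="Placeholder B", is_evil=True)
shares = evil_share_by_role([example, evil_twin])
```

A flat record like this makes it straightforward to group characters by role, color or taxonomy level when producing a summary visualization.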
¹ In movies that presented more than one robot, the most representative characters were chosen.
Fig. 1. 1 - Star Wars III: Revenge of the Sith by George Lucas. 2 - Robot & Frank by Jake Schreier. 3 - The Master Mystery by Harry Grossman, Burton L. King. 4 - Der Herr der Welt by Harry Piel. 5 - Tobor the Great by Lee Sholem. 6 - THX 1138 by George Lucas. 7 - Gojira tai Megaro by Jun Fukuda. 8 - Star Wars Episode V: The Empire Strikes Back by Irvin Kershner. 9 - Short Circuit by John Badham. 10 - Star Wars Episode I: The Phantom Menace by George Lucas. 11 - Battlestar Galactica by Ronald D. Moore. 12 - Star Wars III: Revenge of the Sith by George Lucas. 13 - The Day the Earth Stood Still by Scott Derrickson. 14 - Elysium by Neill Blomkamp. 15 - Pacific Rim by Guillermo del Toro. 16 - Le Cinquième Élément by Luc Besson. 17 - Der Golem by Paul Wegener, Henrik Galeen. 18 - The Day the Earth Stood Still by Robert Wise.
Fig. 1. (continued) 19 - Target Earth by Sherman A. Rose. 20 - Iron Man 2 by Jon Favreau. 21 - L’uomo Meccanico by André Deed. 22 - The Wizard of Oz by Victor Fleming, George Cukor, Mervyn LeRoy, Norman Taurog, Richard Thorpe, King Vidor. 23 - Sleeper by Woody Allen. 24 - Logan’s Run by Michael Anderson. 25 - Buck Rogers in the 25th Century by Glen A. Larson. 26 - Total Recall by Paul Verhoeven. 27 - Real Steel by Shawn Levy. 28 - Metropolis by Fritz Lang. 29 - Lost in Space by Irwin Allen. 30 - Star Wars Episode IV: A New Hope by George Lucas. 31 - The Black Hole by Gary Nelson. 32 - Saturn 3 by Stanley Donen. 33 - The Hitchhiker’s Guide to the Galaxy by Alan J.W. Bell, John Lloyd. 34 - Return to Oz by Walter Murch. 35 - Rocky IV by Sylvester Stallone. 36 - Spaceballs by Mel Brooks. 37 - Red Dwarf by Ed Bye. 38 - Le Cinquième Élément by Luc Besson. 39 - Lost in Space by Stephen Hopkins. 40 - I, Robot by Alex Proyas. 41 - The Hitchhiker’s Guide to the Galaxy by Garth Jennings. 42 - Hinokio by Takahiko Akiyama. 43 - Transformers by Michael Bay. 44 - Eva by Kike Maíllo. 45 - Automata by Gabe Ibáñez. 46 - Lost in Space by Irwin Allen. 47 - Forbidden Planet by Fred M. Wilcox. 48 - Sleeper by Woody Allen. 49 - Bicentennial Man by Chris Columbus. 50 - Automata by Gabe Ibáñez. (Burlando Francesco 2021)
5 Findings and Conclusion
The visualizations show that the main role in which robotic characters are used is Service and Assistance, followed by Military. While characters in the first category are usually good, the latter contains a substantial number of evil robots. This may underlie the fact that human beings experience both fascination and fear when they interact with robots [11]. Although research and caregiving are among the most common areas in which humanoid robots are used in real life, very few sci-fi movies explore such scopes. Robots in the Military and Physical Tasks categories are often grey or metallic, whereas Service and Assistance robots use a wider range of colors. This could be due, in part, to the fact that the first two categories include many robots from old movies, which were designed with fewer technological possibilities. There are only four female robot characters, and they appear only in the Research and the Service and Assistance categories. Future contributions will discuss in more depth the evidence just mentioned and other elements of interest, including a comparison between robots from the real world and robotic characters from sci-fi movies.
References
1. Ray, C., Mondada, F., Siegwart, R.: What do people expect from robots? In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3816–3821 (2008). https://doi.org/10.1109/IROS.2008.4650714
2. Merz, N., Huber, M., Bodendorf, F., Franke, J.: Science-fiction movies as an indicator for user acceptance of robots in a non-industrial environment. In: Proceedings of the 2020 on Computers and People Research Conference, pp. 142–143. ACM, Nuremberg (2020). https://doi.org/10.1145/3378539.3393847
3. Schmitz, M., Endres, C., Butz, A.: A Survey of Human-Computer Interaction Design in Science Fiction Movies
4. Mubin, O., Wadibhasme, K., Jordan, P., Obaid, M.: Reflecting on the presence of science fiction robots in computing literature. ACM Trans. Hum.-Robot Interact. 8(1), 1–25 (2019). https://doi.org/10.1145/3303706
5. CMU Hall of Fame: Robot Hall of Fame (2012). http://www.robothalloffame.org/
6. Shin, S., Jeong, J.: Taxonomies of artificial intelligence robot appeared in SF films. In: Proceedings of HCI Korea, HCIK 2016, pp. 449–453. Hanbit Media, Inc., Seoul (2016). https://doi.org/10.17210/hcik.2016.01.449
7. Mori, M., MacDorman, K.F.: The Uncanny Valley [From the Field]. IEEE Robot. Autom. Mag. 19(2), 98–100 (2012). https://doi.org/10.1109/MRA.2012.2192811
8. Eaton, M.: Evolutionary Humanoid Robotics. SpringerBriefs in Intelligent Systems. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-44599-0
9. Zallio, M. (ed.): AHFE 2020. AISC, vol. 1210. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-51758-8
10. IMDb: Ratings, Reviews, and Where to Watch the Best Movies & TV Shows. s.d. IMDb. http://www.imdb.com/
11. Foerst, A.: Cog, a humanoid robot, and the question of the image of god. Zygon® 33(1), 91–111 (1998). https://doi.org/10.1111/0591-2385.1291998129
12. Geraci, R.M.: Robots and the sacred in science and science fiction: theological implications of artificial intelligence. Zygon® 42(4), 961–980 (2007). https://doi.org/10.1111/j.1467-9744.2007.00883.x
RESPONDRONE - A Multi-UAS Platform to Support Situation Assessment and Decision Making for First Responders
Max Friedrich1(B), Satenik Mnatsakanyan2, David Kocharov2, and Joonas Lieb1
1 German Aerospace Center (DLR), Institute of Flight Guidance, Lilienthalplatz 7, 38108 Braunschweig, Germany
[email protected]
2 Akian College of Science and Engineering, American University of Armenia, 40 Baghramyan Avenue, Yerevan 0019, Republic of Armenia
Abstract. The RESPONDRONE project develops and evaluates a multi-UAS platform to accelerate situation assessment and ease decision making during natural disasters. In order to assure that the multi-UAS platform meets the needs of the end-users, first response organizations are part of the project and closely involved in the design, development and evaluation of the platform. As one of the first steps in the project, a mock-up of the multi-UAS platform’s user interface was developed, which consists of two displays, one being a display for an on-site command center and the other a mobile application for the first response units in the field. Both displays were initially evaluated by eight subject matter experts in an online study and received good to excellent ratings regarding usefulness, usability and ease of use of the systems. The results are currently being incorporated into the displays and further evaluations are planned with the improved mock-ups. Keywords: Unmanned aircraft systems · First responders · User interface
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 110–117, 2021. https://doi.org/10.1007/978-3-030-79997-7_14
1 Introduction
Natural disasters pose a continual threat to humanity. In 2018, 61.7 million people worldwide were affected by natural hazards such as earthquakes, floods, landslides or wildfires [1]. Natural disasters claimed more than 1.2 million lives between 2000 and 2018, including 10,373 deaths in 2018 alone. Since climate change is expected to increase the risk of weather-related hazards in the future, two-thirds of the European population could be affected by weather-related disasters by the year 2100 [2]. These figures, in combination with the scientific forecasts on weather-related events, emphasize the need for efficient and resilient disaster response capabilities on the part of local authorities, international organizations and communities of states. As one key element in reducing the number of casualties caused by natural disasters, first responders need to be equipped with technological means that enable them to assess the disaster situation quickly. It is crucial for any first response to a natural catastrophe to assess the degree of the disaster and the affected area immediately and comprehensively. Emergency response teams therefore need to acquire a thorough understanding of the disaster quickly. For example, real-time information on the status of critical structures, power grids and road access, or on the exact location of victims and environmental threats, is pivotal for an efficient and effective response mission. Collecting, processing, sharing and displaying this information is challenging, especially when no reliable communication infrastructure is available or the infrastructure has been destroyed by the disaster. Moreover, deploying and directing first responder teams and monitoring the progress of the response mission is difficult for decision makers and should be supported by an established communication network for coordinating the response missions. A possible solution for quickly gaining situation awareness during a natural disaster with limited or damaged infrastructure is the use of unmanned aircraft systems (UAS) [3, 4]. Hence, UAS can be an enabling technology that lets first responders quickly achieve situation awareness in large-scale natural disasters. The RESPONDRONE project, funded by the European Commission’s H2020 programme, aims to develop and evaluate a multi-UAS platform for first responders in order to accelerate situation assessment, especially during the early stages of a disaster, and to simplify operations for first responders, making first response operations more efficient. Several first response organizations across Europe, Armenia and Israel are part of the project consortium and closely involved in the design, development and evaluation of the multi-UAS platform. It is envisioned that multiple highly automated UAS will provide enhanced capabilities to support assessment missions, search and rescue operations, and forest fire fighting.
These capabilities will be demonstrated at a large-scale exercise at the end of the project. In order to ensure easy operability, the operating procedures of the multi-UAS platform must be compatible with, and integrated into, the operational workflow of first responders. Therefore, end-user needs and requirements for the usage of the multi-UAS platform were obtained from the project’s first responder organizations through structured interviews and a design thinking workshop [5]. In a next step, front-end mock-ups of the user interfaces (UIs) for the multi-UAS platform were developed and evaluated by representatives of the first response organizations of the consortium. This paper presents results from the end-user interviews, the developed mock-up UIs and the results of the evaluation study.
2 End-User Requirements
2.1 End-User Studies
Nine end-user organizations from Europe, Armenia and Israel are part of the RESPONDRONE consortium. One of the main priorities of the RESPONDRONE project is to incorporate end-user input in all steps of the development. The major goal of the research team was to collect and aggregate information on current first response operation practices. The focus was both on the research questions and on answers to questions that were of great interest to the designers of the system requirements and the developers of the testing scenarios.
The research activities were compiled into the following categories:
• First responder operational concepts
• Technological solutions that are currently in use
• Identification of stakeholders of the end-user organizations in the different countries
The Soft Systems Methodology was used to develop the questionnaires for the end-users. The methodology allows a clear distinction between structural aspects of end-user operation, in terms of power hierarchy, reporting and communication patterns, and process aspects, such as decisions to carry out certain activities, executing them and monitoring the results. The Soft Systems Methodology was developed by Peter Checkland [6] and has since been widely used in technology management consultancy, predominantly for the analysis and implementation of technology-driven organizational change management, including requirement gathering for new information and communication technology. The end-user studies showed that one of the greatest expectations end-users have of RESPONDRONE is speed and accuracy of disaster assessment, which can better support timely decision making and help to tackle disasters while they are small. Fast identification of the scale and kinetics of disasters and the ability to search for living organisms were among the capabilities that end-users most often hoped the RESPONDRONE platform would provide. The field study interviews also clearly demonstrated the importance of considering the challenges of UAS usage. Some of these challenges involve technology, specifically the speed and ease of its deployment in emergency situations. Major concerns include the training of crew, planning for UAS use, the introduction of new roles on both tactical and operational levels, information accessibility, and changing decision-making patterns. Many end-users expect RESPONDRONE to address these kinds of concerns.
2.2 Design Thinking Workshop Following the end-user studies, a two-day design workshop was organized that aimed at creating potential mock-ups of the RESPONDRONE platform that specifically consider the ideas and needs of the end-users. Whereas the original design thinking methodology does not foresee the inclusion of potential end-users in the first stages of the process, during the workshop, the end-user partners actively participated in the entire process. The results of the workshop comprised three different mock-ups of the RESPONDRONE platform, which are presented in detail in [5]. Based on the results of the end-user studies and the design thinking workshop, the functional design of the RESPONDRONE platform as well as mock-ups of the system’s UI were created. Hereby, the mock-ups developed during the design thinking workshop were fused into two UIs, presented in detail in Sect. 3.2.
3 The RESPONDRONE Multi-UAS Platform
The end-user studies revealed that UAS are a low-cost and maneuverable solution for first responders in early detection and real-time monitoring missions. Further, improved technological capabilities for sharing information in real time among all involved stakeholders, as well as faster disaster assessment, were mentioned as key functionalities desired of a platform that is intended to ease decision making for first responders. Therefore, the RESPONDRONE multi-UAS platform was designed to provide relevant information in real time to all involved stakeholders using a cloud-based system. Multiple highly automated UAS, capable of automated risk assessment and avoidance (e.g. automated avoidance of a previously undetected hazard such as a bush fire), collect important data from the disaster site and stream this information into a cloud service, which is accessible to all involved stakeholders.
3.1 Multi-UAS Control and Supervision
The UAS used within RESPONDRONE are Alpha 800 and Alpha 900 helicopters, powered by a gasoline engine, with an endurance of 2.5 h and 4.5 h, respectively (cf. Fig. 1). The UAS are highly automated and thus do not need to be flown manually by a remote pilot. For operational reasons, within RESPONDRONE the UAS are commanded by a safety pilot during take-off and landing. During cruise flight, the automatic mode is engaged, meaning that the UAS follow a pre-planned flight route. Risks, such as bush fires, are automatically detected by on-board sensors (e.g. an electro-optical/infrared camera), and their positions are transmitted to a dedicated Traffic & Mission Management (TMM) module. The TMM continuously evaluates the detected risks, computes an alternative flight route around the risks in question (in this case the bush fire) and sends the updated flight route to the multi-UAS Ground Control Station (GCS), depicted in Fig. 1. The TMM always chooses the flight route with the lowest risk and thus the safest flight route for the UAS [5]. The technical sub-systems of each UAS are supervised by the UAS pilot using the multi-UAS GCS.
In case of a critical system failure, the UAS pilot can always command the UAS to land or return to the launch point immediately. It is worth noting that also for the return to launch flight route, the TMM will compute the flight route with the lowest risks for the UAS.
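The idea of choosing the flight route with the lowest risk can be illustrated with a small example. The source does not describe the TMM's actual algorithm, so the following is only a sketch under simplifying assumptions: hazards are modelled as circles in a flat 2D plane, risk is approximated as the route length inside any hazard circle, and all function names (`route_length`, `route_risk`, `safest_route`) and coordinates are hypothetical.

```python
import math


def route_length(route):
    """Total length of a polyline given as a list of (x, y) waypoints."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(route, route[1:]))


def route_risk(route, hazards, step=0.05):
    """Approximate length of the route that lies inside any hazard circle.

    hazards is a list of (x, y, radius) tuples. Each segment is split into
    sub-intervals of at most `step` and tested at the sub-interval midpoints.
    """
    exposed = 0.0
    for (x0, y0), (x1, y1) in zip(route, route[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        samples = max(1, math.ceil(length / step))
        for i in range(samples):
            t = (i + 0.5) / samples
            px, py = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            if any(math.hypot(px - hx, py - hy) <= r for hx, hy, r in hazards):
                exposed += length / samples
    return exposed


def safest_route(candidates, hazards):
    """Pick the candidate with the lowest risk; ties go to the shorter route."""
    return min(candidates,
               key=lambda route: (route_risk(route, hazards), route_length(route)))


hazards = [(5.0, 0.0, 1.0)]                      # hypothetical bush fire
direct = [(0.0, 0.0), (10.0, 0.0)]               # crosses the hazard
detour = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]   # keeps its distance
best = safest_route([direct, detour], hazards)
```

Ranking by risk first and length second reflects the statement that safety takes priority; a real planner would also need to account for altitude, endurance and airspace constraints.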
Fig. 1. Pictures of the Alpha 800 and the ground control station.
3.2 User Interface Mock-Ups
Based on the three prototype mock-ups developed at the design thinking workshop, two functional mock-ups of the multi-UAS platform’s UI were created. Figures 2 and 3 illustrate the designed UIs. Figure 2 shows the UI of an on-site command center (OSCC) located in the vicinity of the disaster area, and Fig. 3 presents the mock-up of a mobile application for the first response units in the disaster area. The OSCC UI in Fig. 2 presents a (moving) map of the disaster area and shows the locations of the deployed UAS as well as of the first responders in the field. Further, detected risks, available resources and detected victims may be shown on the map, as well as live video feeds from cameras mounted on the UAS. If users require more information on a specific point of interest, they select this point by clicking on the map display and send a request to the TMM. The TMM evaluates the risks on the route to the point of interest and suggests a suitable flight route, which needs to be acknowledged by the OSCC before it is sent to the multi-UAS GCS and uplinked. As a result, the UAS flies to the point of interest and provides the requested information, e.g. via camera. The UI also enables the staff in the OSCC to communicate with, and send tasks to, the first response units in the field. The mobile application shown in Fig. 3 is meant for the first responders in the field. The map display has the same functionality as the map display in the OSCC UI. The first responders can also request a UAS to fly to a certain point of interest (to be acknowledged by the OSCC) simply by clicking on the map, and they can access the live video feeds. Further, the UI enables the first responders to send reports to, and receive tasks from, the OSCC.
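The point-of-interest request described above follows a fixed sequence: request, route proposal by the TMM, acknowledgement by the OSCC, and uplink via the GCS. A minimal sketch of that sequence as a state machine is given below. The class and state names are hypothetical, the coordinates are placeholders, and the `REJECTED` state is an added assumption, since the text does not say what happens when the OSCC declines a proposed route.

```python
from enum import Enum, auto


class RequestState(Enum):
    REQUESTED = auto()       # user selected a point of interest on the map
    ROUTE_PROPOSED = auto()  # TMM evaluated the risks and suggested a route
    ACKNOWLEDGED = auto()    # OSCC accepted the suggested route
    UPLINKED = auto()        # route sent to the multi-UAS GCS and uplinked
    REJECTED = auto()        # assumption: OSCC declined the suggested route


ALLOWED = {
    RequestState.REQUESTED: {RequestState.ROUTE_PROPOSED},
    RequestState.ROUTE_PROPOSED: {RequestState.ACKNOWLEDGED, RequestState.REJECTED},
    RequestState.ACKNOWLEDGED: {RequestState.UPLINKED},
    RequestState.UPLINKED: set(),
    RequestState.REJECTED: set(),
}


class PoiRequest:
    """Tracks one point-of-interest request through the workflow."""

    def __init__(self, position, requested_by):
        self.position = position
        self.requested_by = requested_by
        self.state = RequestState.REQUESTED
        self.history = [self.state]

    def advance(self, new_state):
        """Move to new_state, refusing transitions that skip a step."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} is not allowed")
        self.state = new_state
        self.history.append(new_state)
        return self


request = PoiRequest(position=(40.0, 44.5), requested_by="field unit")
request.advance(RequestState.ROUTE_PROPOSED)
request.advance(RequestState.ACKNOWLEDGED)
request.advance(RequestState.UPLINKED)
```

Making the acknowledgement an explicit state keeps the human decision maker in the loop: a route can never reach the uplink step without passing through the OSCC.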
Fig. 2. Mock-up of the OSCC UI.
Fig. 3. Mock-up of the first responder mobile application.
4 First UI Evaluation A first evaluation study of the developed UI mock-ups was conducted. The aim of this first study was to ensure that the UIs indeed meet the expectations of the end-users. For this purpose, the OSCC UI was presented to the end-users in a video (approximately 9 min long) explaining and showing the main features and the functionality of the interface. For the mobile application, an actual mock-up was developed, which allowed the end-users to interact with the UI. The mobile application mock-up was accessible online. 4.1 Method The study was conducted online. After the end-users had watched the video and made themselves familiar with the mobile mock-up, they filled out an online questionnaire. In order to determine the overall perceived usefulness of the UIs, the participants were asked to rate the usefulness of the UI on a 5-point Likert scale from 1 (not useful at all, I would not use the application during a disaster mission) to 5 (very useful, I would definitely use the application during a disaster mission). Further, the participants filled out the System Usability Scale (SUS, [7]) for determining the perceived usability of the UIs. In this questionnaire the participants ranked 10 statements on a 5-point Likert scale (ranging from 1–5), based on how much they agree with the statement. Further, the participants indicated their perception on how much effort would be required to use the UIs. For this purpose, the participants filled out the Perceived Ease of Use questionnaire [8]. This questionnaire consists of six statements that are answered on a 7-point Likert scale, ranging from 0–6 with higher numbers meaning lower effort. At the end of the
116
M. Friedrich et al.
questionnaire, the participants had the possibility to indicate their impressions regarding advantages and disadvantages of the UIs. In total, 8 participants took part in the evaluation study. All participants were members of the end-user partners of the RESPONDRONE consortium. Four participants indicated being high level decision makers (e.g. “deputy head of the rescue service”), three indicated being operation decision makers (such as incident commander) and one participant selected the option “other”.
4.2 Results
The question regarding the usefulness of the UI and the Ease of Use questionnaire were analyzed descriptively on the basis of the raw data. For the SUS questionnaire, an overall SUS score was computed, which can be interpreted as percentages and adjective ratings [9]. Table 1 shows the medians across all participants, maxima and minima for the OSCC UI and the mobile application. For the OSCC UI, the median of the usefulness question and the Ease of Use questionnaire was 5.00, indicating high usefulness and low perceived effort. The median SUS score was 78.75, corresponding to a good to excellent usability. For the mobile application, the median usefulness score was 4.5, the median Ease of Use score was 5.00 and the median SUS score 80.62. Similar to the OSCC UI, these results correspond to a high usefulness, low perceived effort and good to excellent usability. However, in the open questions section the participants indicated that it would take too much time to add information to the system. The participants also criticized that the system does not provide the possibility of showing multiple UAS camera feeds simultaneously. Further, the usefulness of the mobile device under extreme environmental conditions in the disaster area was questioned.

Table 1. Results of the first UI evaluation.

Measure        OSCC UI                    Mobile application
               Mdn      Min      Max      Mdn      Min      Max
Usefulness     5.00     4.00     5.00     4.50     4.00     5.00
Usability      78.75    75.00    92.50    80.62    75.00    90.00
Ease of use    5.00     3.33     6.00     5.08     3.33     6.00
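As an illustration of how an overall SUS score of the kind reported in Table 1 is obtained, the following minimal sketch applies the standard SUS scoring rule: odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is multiplied by 2.5 to give a value between 0 and 100. The function name `sus_score` and the three example participants are hypothetical and are not data from the study.

```python
from statistics import median


def sus_score(responses):
    """Return the SUS score (0-100) for one participant's ten ratings (1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item ratings")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("ratings must lie between 1 and 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - rating
    return total * 2.5


# Hypothetical ratings for illustration only (not the study's raw data).
example_participants = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
scores = [sus_score(p) for p in example_participants]
print("individual scores:", scores)
print("median score:", median(scores))
```

The median is used here because Table 1 reports medians; any other aggregate could be substituted.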
5 Discussion
Both the OSCC UI and the mobile application received good to excellent ratings regarding usefulness, usability and ease of use. Despite the good ratings of the interfaces, several major shortcomings were identified that have to be taken into consideration during the further development of the UIs, such as the time needed for data input and the lack of simultaneous availability of multiple video streams. Further, it needs to be ensured that the mobile device is compatible with potentially extreme environmental conditions such as heat or heavy rain.
To continue and further foster the iterative design process, the results of this evaluation are currently being incorporated into the UIs. The final mock-ups of the UIs will be integrated into a simulation framework to evaluate the UIs a second time. In a last iteration step, the multi-UAS platform will be integrated and tested within the scope of a large-scale demonstration in a real environment, including a third evaluation of the whole system and the UIs.
Acknowledgements. This project is funded by the European Union’s H2020 Research and Innovation Programme and the South Korean Government under Grant Agreement No. 833717.
References
1. United Nations for Disaster Risk Reduction UNISDR: 2018: Extreme weather events affected 60 million people. Press release. Geneva, Switzerland (2019)
2. Forzieri, G., Cescatti, A., Batista e Silva, F., Feyen, L.: Increasing risk over time of weather-related hazards to the European population: a data-driven prognostic study. Lancet Planet. Health 1, 200–208 (2017)
3. Drury, J., Riek, L., Rackliffe, N.: A decomposition of UAV-related situation awareness. In: Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction, pp. 88–94. ACM Press, New York (2006)
4. Doherty, P., Rudol, P.: A UAV search and rescue scenario with human body detection and geolocalization. In: Orgun, M.A., Thornton, J. (eds.) AI 2007. LNCS (LNAI), vol. 4830, pp. 1–13. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-76928-6_1
5. Friedrich, M., Borkowski, M., Lieb, J.: A multi-UAS platform to accelerate situation assessment in first response missions – identification of user needs and system requirements using design thinking. In: Proceedings of the 2020 AIAA/IEEE 39th Digital Avionics Systems Conference (DASC), pp. 1–7. IEEE, New York (2020)
6. Checkland, P., Scholes, J.: Systems Methodology in Action. Wiley, New York (1999)
7. Brooke, J.: SUS: a retrospective. J. Usability Stud. 8, 29–40 (2013)
8. Davis, F.D.: Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q. 13, 319–340 (1989)
9. Bangor, A., Kortum, P., Miller, J.: Determining what individual SUS scores mean: adding an adjective rating scale. J. Usability Stud. 4, 114–123 (2009)
Robotic Systems for Social Interactions
Exploring Resilience and Cohesion in Human-Autonomy Teams: Models and Measurement
Samantha Berg1, Catherine Neubauer2, Christa Robison2, Christopher Kroninger2, Kristin E. Schaefer2, and Andrea Krausman2(B)
1 Oakridge Associated Universities, Aberdeen Proving Ground, MD, USA
2 DEVCOM Army Research Laboratory, Aberdeen Proving Ground, MD, USA
Abstract. Team resilience affects both the cohesion and subsequent performance of that team. For human teams, resilience is tied to team learning, team flexibility, social capital, and collective efficacy. But for human-autonomy teams, resilience also includes cyber resilience and robust and adaptable robotic control. This work builds out the theory associated with resilience in human-autonomy teams, followed by a step-by-step procedure for developing a resilience subscale for measuring human-autonomy team cohesion. Keywords: Human-autonomy teams · Robotics · Team resilience · Team cohesion
1 Introduction
Resilience is fundamental to team cohesion and subsequent success when teams encounter environmental and team stressors. For the scope of this work, human-autonomy teams consist of one or more human members coupled with one or more autonomous assets that are interdependent and collaborate to accomplish a unified goal or mission. As such, multiple facets must be considered in order to fully understand these constructs: a human factors approach, a networked systems or cyber resilience approach, and even the control architecture of the autonomy. This paper briefly outlines each of these facets as well as the process used to develop a scale to measure resilience and cohesion in human-autonomy teams.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 121–127, 2021. https://doi.org/10.1007/978-3-030-79997-7_15
2 Theory: Resilience in Human-Autonomy Teams
Within the teaming context, the terms resilience and cohesion go hand-in-hand. From the human team literature, team cohesion refers to a group’s interpersonal attraction, shared beliefs, and coordinated behavior [1] that is essential to team performance and effectiveness [2–4]. Team resilience, in turn, is a multi-phasic process whereby members collectively apply skills, abilities, and resources to overcome adversity and improve team state [5], or the ability to prepare, absorb, adapt and recover from disruptive events
[6]. Additionally, it can be argued that resilience is a key feature for the development of highly cohesive and trusted human teams [7, 8]. In fact, resilience is sometimes viewed as a combination of team states including collective efficacy, shared mental models, and familiarity [9]. This area of work is particularly relevant for teams in extreme environments, where cohesion is impacted differently than in teams under normal conditions [10]. For example, individuals working in extreme environments tend to exaggerate issues, which may lead to group impairment when increased tension and perception of team problems negatively impact team cohesion [11]. However, military unit cohesion has been shown to counteract these extreme environment stressors [12].
In recent years, there has been a push to integrate robotic systems as team members in military operations in order to increase efficiency and decrease risk to Warfighters [13]. These human-autonomy teams are especially effective for open-ended and complex conditions where aspects of a task are not always mapped or planned out (e.g., combat situations; [14])¹ by aiding in information planning, task planning and allocation, and team operations [15]. However, it is paramount to understand how the incorporation of robotic systems into human teams may disrupt the teams’ homogeneity and subsequent cohesion and resilience [16, 17].
2.1 Team Cohesion and Resilience
Team cohesion is a dynamic state that emerges over time and can be enhanced through prioritizing collaboration and meeting team goals [18]. Theoretically, this may imply that team cohesion improves team performance by fostering a feeling of responsibility to contribute to one’s team and a belief that the team will triumph over adversity [19]. Team resilience, on the other hand, is tied to task interdependence [20] and emotional carrying capacity (i.e., the extent to which team members can constructively express emotions) [21].
In studies on relationships between unit cohesion and resilience in military personnel, researchers found that higher unit cohesion was associated with higher mental health resilience [22, 23], and may be a moderator of unit resilience on team performance [6, 12], as high levels of team cohesion have buffered against the impact of stress [1]. However, this research has primarily been conducted within the context of human teaming, so there is a need to understand the effect of an autonomous agent on the emergence of team cohesion and resilience.
¹ A non-human intelligent agent is typically defined as “an autonomous entity which observes and acts upon an environment entity and directs its activity toward achieving goals” (Russell & Norvig, 2009, pg. 34). Non-human agents should have the following characteristics: autonomy, observation of the environment, action upon the environment, and direction of activity towards achieving certain goals (Chen & Barnes, 2014; Russell & Norvig, 2010).
2.2 Network and Cyber Resilience
The process by which the team relies on transmission of information is subject to network and cyber resilience, or quality of service (i.e., a term describing the uptime and intended functionality that quantify normal operation of the network) [24]. Thus, the resilience associated with the network, hardware and software is a foundation for creating a dependable and trustworthy agent in a battlespace environment. This helps us identify instances of electronic warfare through physical or wireless adversarial attacks resulting in compromised communications and long range attacks. While this is a large field of research, highlighted here are three of the core components that seem to be pertinent to human-autonomy team resilience: intrusion detection, network assurance, and communication authentication.
Intrusion detection is an overall role for a device on a network to observe network traffic and determine if the traffic is from a legitimate source [25]. Some examples of adversarial intrusion include methods for circumventing wireless authentication measures, exploiting vulnerabilities in wireless hardware on a platform, or even physical attacks, such as implanting a foreign device in a platform or disabling a sensor. Common methods (e.g., packet sniffers, log keeping of foreign or unanticipated network traffic, and auditing devices) are used to account for common traffic on a network to provide assurance to the team that information is not a result of intrusion [26]. Network assurance tells the user if the network is working as anticipated, while communication authentication is the gatekeeper of authorized access to platforms that allows authorized information to travel on the network. Together these core areas of resilience affirm the trustworthiness of quality, reliability, and accuracy of the information across the network. More specifically: is the information transmitted accurate, or is it the result of an attack or an error in the system?
2.3 Resilience: Controls Perspective
The transmission of information is only one additional factor in understanding human-autonomy team resilience. The other focuses on the addition of robotic and autonomous assets into team structures, specifically multi-robot teams [27].
While a formal or consensus definition of robot team resilience is lacking, we proffer that two concepts from the communities of control and planning embody resilience. These are the capacity of an agent to robustly execute an assigned task (e.g., follow prescribed trajectories²), coupled with the capacity of the larger team to adapt plans in order to accomplish the mission. Thus, robustness generally defines the performance of each agent, while the capacity to re-plan is a team characteristic, though this delineation would be far from a stricture, particularly within a hierarchically organized team. Stressors that necessitate team resilience are common when confronting a dynamic adversary or operating in a complex environment where complete information cannot possibly be known from the start. This stress can be multiplied when adding robotic assets into the team structure. Therefore, resilience from a robotics controls perspective can be broken down into planning and re-planning. Planning problems can be decomposed into task assignment, sequencing/scheduling, and motion planning [28]. The planning process assigns tasks and schedules to agents predicated on their anticipated capacity to fulfill their individual plans. Robust control [29] seeks to characterize or expand the extent of error (e.g., the difference between a desired and an actual behavior). Controllers are designed explicitly for the case in which models of agent dynamics or disturbance have uncertainty; to the extent that this uncertainty can be bounded, robust control can be achieved.³ It is rare that a practical robot team will be able to explicitly communicate the full state of their collective knowledge into one centralized planner. Thus, each agent has a different and incomplete picture of the planning problem, and planning team action may become a task performed at the agent level through some combination of explicit communication and inference, meaning optimal plans become even more elusive. While these concepts of robustness of control and re-planning are useful for identifying where the component qualities of resilient team performance reside, they offer no useful global metric for resilience. Given the general intractability of planning tasks, relative performance compared to similar teams or against rivals will typically be the most useful measure of resilience.
² Any task or more specifically system with inputs (controls) and outputs (states) can be analyzed as a control system. As motion control is integral to nearly all robot tasks, following a trajectory will very often be a task or component thereof, but a task can be much broader, including for example elements of manipulation.
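The split between planning and re-planning described in Sect. 2.3 can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions, not a method from the paper or from [28]: it covers the task assignment step alone, uses exhaustive search (feasible only for very small teams, in line with the intractability noted above), and all agent names, task names and cost values are hypothetical.

```python
from itertools import permutations


def plan(agents, tasks, cost):
    """Return the cheapest one-task-per-agent assignment by exhaustive search."""
    if len(agents) < len(tasks):
        raise ValueError("not enough agents for the tasks")
    best, best_cost = None, float("inf")
    # Try every ordered choice of len(tasks) distinct agents.
    for chosen in permutations(agents, len(tasks)):
        total = sum(cost[agent][task] for agent, task in zip(chosen, tasks))
        if total < best_cost:
            best, best_cost = dict(zip(tasks, chosen)), total
    return best, best_cost


# Hypothetical cost of each agent performing each task (lower is better).
cost = {
    "uav_1": {"scout": 2, "relay": 6},
    "uav_2": {"scout": 5, "relay": 3},
    "ugv_1": {"scout": 7, "relay": 4},
}
tasks = ["scout", "relay"]

# Initial plan with the full team.
initial_plan, initial_cost = plan(list(cost), tasks, cost)
print("initial plan:", initial_plan, "cost:", initial_cost)

# Re-plan after one agent is lost (e.g., due to a failure or an attack).
remaining = [agent for agent in cost if agent != "uav_2"]
new_plan, new_cost = plan(remaining, tasks, cost)
print("re-plan:", new_plan, "cost:", new_cost)
```

In this toy setting the team still completes both tasks after losing an agent, at a higher cost; comparing such costs across teams is one simple way to think about the relative performance measure mentioned above.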
³ Note that we do not advocate that the explicit use of robust control is a necessary condition for achieving resilience. Many if not all control approaches seek to achieve robustness through more or less formal descriptions.
3 Methods
Metrics associated with human-autonomy team cohesion and resilience are currently being established. Thus, here we have built on the above literature to develop and evaluate a resilience subscale as part of a larger human-autonomy team cohesion scale development effort.
3.1 Initial Item Development
Based on existing scales measuring team resilience in human teams [30] and a theoretical framework of characteristics of resilient elite sports teams [31], an initial pool of 20 team resilience items was developed across four subdimensions: team learning orientation (e.g., Mistakes are openly discussed in the team in order to learn from them), team flexibility (e.g., Team members adjust their approaches to overcome obstacles), shared language (e.g., Both human and agent team members use common terms), and perceived efficacy for collective team action (e.g., When a team member has a problem, the rest of the team is able to assist them).
3.2 Content Validation
Eleven subject matter experts (SMEs) from various academic and government institutions, who were experts in team cohesion or human-autonomy teams, were contacted via email. Upon agreement to participate, they completed content validation procedures for the full team cohesion scale. They were provided with background information and the purpose of the scale and then rated the initial items using a 3-point Likert scale including “extremely important to include in the scale”, “important to include in the scale”, or “should not be included in the scale” according to the procedures outlined in [32, 33]⁴. SMEs could also provide feedback on each item, as well as additional recommendations for the scale design, or suggest items that may be missing from the scale.
4 Content Validation Results
Evaluated items were analyzed using the Content Validity Ratio (CVR) [34]. With a sample of 11, our criterion value was set at 0.59. Items with a CVR below this value were considered for removal. Of the original 20 items, 16 were retained for the resilience subscale to be further analyzed through test-retest reliability assessment and additional validation testing, and 1 item was added from another subdimension. Specifically, three items from the perceived efficacy for collective team action subscale did not meet the CVR criterion and were removed (i.e., ‘This team is able to do things together’, ‘There is open and honest communication between human and non-human team members’, and ‘Thorough preparation in training helped to deal with stressors’). Additionally, based on the SME qualitative feedback and CVR value, 1 item from this subscale was moved to the ‘belongingness’ subscale of social cohesion (i.e., ‘I feel connected to my human and non-human team members’). Finally, 1 item from the task cohesion subdimension (i.e., ‘Team members did not communicate well during the task’) was added to the shared language subdimension of resilience.
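To make the CVR screening step concrete, the following minimal sketch computes Lawshe's ratio, CVR = (n_e − N/2) / (N/2), where N is the number of SMEs and n_e is the number who gave an item the top rating. It assumes that n_e counts the “extremely important” ratings; the item names and rating counts are hypothetical and are not the study's data.

```python
def content_validity_ratio(n_top_rating, n_experts):
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to +1."""
    half = n_experts / 2
    return (n_top_rating - half) / half


N_EXPERTS = 11    # panel size reported in the text
CRITERION = 0.59  # criterion value reported in the text for 11 SMEs

# Hypothetical counts of top ratings per item (illustration only).
hypothetical_items = {"item_a": 11, "item_b": 9, "item_c": 8, "item_d": 5}

for name, count in hypothetical_items.items():
    cvr = content_validity_ratio(count, N_EXPERTS)
    decision = "retain" if cvr >= CRITERION else "consider removing"
    print(f"{name}: CVR = {cvr:+.2f} -> {decision}")
```

With 11 SMEs, nine top ratings give a CVR of about 0.64 and eight give about 0.45, so under this criterion an item needs at least nine of the eleven experts to give it the top rating.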
5 Conclusions and Path Forward
Following SME content validation protocols, the results are promising and indicate that team resilience in human-autonomy teams is a critical part of understanding team cohesion. To further validate this scale, an online study using the revised full cohesion scale will be conducted in collaboration with the United States Military Academy (USMA) to provide a baseline for test-retest reliability assessment, as well as to quantify the range of potential item responses when observing low versus high cohesive human-autonomy teams. Future work will include additional reliability and validity testing to formalize a complete human-autonomy team cohesion scale.
Acknowledgments. Research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-20-2-0250. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. The authors thank the subject matter experts and reviewers for their helpful feedback. The authors also thank the scale development team: Drs. Sean Fitzhugh, Danny Forster, Shan Lakhmani, Erica Rovira and Cadet Jordan Blackmon.
⁴ This is a standard method of scale development and has been used recently in the development of human-autonomy team trust scales (Yagoda & Gillan, 2012; Schaefer, 2016). Lawshe’s protocol recommends 11 SMEs with a criterion set to 0.59 to ensure SME agreement is unlikely due to chance. The formula yields values ranging from +1 to –1, where positive values indicate that more than half of the SMEs rated the item as extremely important.
References
1. Driskell, T., Driskell, J.E., Salas, E.: Mitigating stress effects on team cohesion. In: Team Cohesion: Advances in Psychological Theory, Methods and Practice (Research on Managing Groups and Teams, vol. 17), pp. 247–270. Emerald Group Publishing Limited (2015). https://doi.org/10.1108/S1534-085620150000017010
2. Carron, A.V., Brawley, L.R.: Cohesion. Small Group Res. 31(1), 89–106 (2000). https://doi.org/10.1177/104649640003100105
3. Beal, D.J., Cohen, R.R., Burke, M.J., McLendon, C.L.: Cohesion and performance in groups: a meta-analytic clarification of construct relations. J. Appl. Psychol. 88(6), 989–1004 (2003). https://doi.org/10.1037/0021-9010.88.6.989
4. Mathieu, J.E., Kukenberger, M.R., D’Innocenzo, L., Reilly, G.: Modeling reciprocal team cohesion-performance relationships, as impacted by shared leadership and members’ competence. J. Appl. Psychol. 100(3), 713–734 (2015). https://doi.org/10.1037/a0038898
5. Cato, C.R., Blue, S.N., Boyle, B.: Conceptualizing risk and unit resilience in a military context. In: Trump, B.D., Florin, M.-V., Linkov, I. (eds.) IRGC Resource Guide on Resilience (vol. 2): Domains of Resilience for Complex Interconnected Systems. EPFL International Risk Governance Center, Lausanne (2018). Available on irgc.epfl.ch and irgc.org
6. Zemba, V., Wells, E.M., Wood, M.D., Trump, B.D., Boyle, B., Blue, S., Cato, C., Linkov, I.: Defining, measuring, and enhancing resilience for small groups. Safety Sci. 120, 603–616 (2019). https://doi.org/10.1016/j.ssci.2019.07.042
7. Gittell, J.H., Cameron, K., Lim, S., Rivas, V.: Relationships, layoffs, and organizational resilience: airline industry responses to September 11. J. Appl. Behav. Sci. 42, 300–329 (2006). https://doi.org/10.1177/0021886306286466
8. Norris, F.H., Stevens, S.P., Pfefferbaum, B., Wyche, K.F., Pfefferbaum, R.L.: Community resilience as a metaphor, theory, set of capacities, and strategy for disaster readiness. Am. J. Community Psychol. 41, 127–150 (2008). https://doi.org/10.1007/s10464-007-9156-6
9. Bowers, C., Kreutzer, C., Cannon-Bowers, J., Lamb, J.: Team resilience as a second-order emergent state: a theoretical model and research directions. Front. Psychol. 8 (2017). https://doi.org/10.3389/fpsyg.2017.01360
10. Salas, E., Rico, R., Passmore, J., Vessey, W.B., Landon, L.B.: The Wiley Blackwell Handbook of the Psychology of Team Working and Collaborative Processes (2017)
11. Stuster, J.: Bold Endeavors: Lessons from Polar and Space Exploration. Naval Institute Press, Annapolis (1996)
12. Williams, J., Brown, J.M., Bray, R.M., Anderson Goodell, E.M., Olmsted, K.R., Adler, A.B.: Unit cohesion, resilience, and mental health of soldiers in Basic Combat Training. Military Psychol. 28(4), 241–250 (2016). https://doi.org/10.1037/mil0000120
13. Barnes, M.J., Evans, A.W.: Soldier-robot teams in future battlefields: an overview. In: Human-Robot Interactions in Future Military Operations, pp. 9–29 (2010)
14. Chen, J.Y.C., Barnes, M.J.: Human-agent teaming for multirobot control: a review of human factors issues. IEEE Trans. Hum. Mach. Syst. 44(1), 13–29 (2014). https://doi.org/10.1109/THMS.2013.2293535
15. Sycara, K., Sukthankar, G.: Literature Review of Teamwork Models (2006)
16. O’Reilly III, C.A., Caldwell, D.F., Barnett, W.P.: Work group demography, social integration, and turnover. Adm. Sci. Q. 34, 21–37 (1989)
17. Smith, K.G., Smith, K.A., Olian, J.D., Sims, H.P., O’Bannon, D.P., Scully, J.A.: Top management team demography and process: the role of social integration and communication. Adm. Sci. Q. 39(3), 412 (1994). https://doi.org/10.2307/2393297
18. Widmeyer, W.N., Ducharme, K.: Team building through team goal setting. J. Appl. Sport Psychol. 9(1), 97–113 (1997). https://doi.org/10.1080/10413209708415386
19. Rovio, E., Eskola, J., Kozub, S., Duda, J., Lintunen, T.: Can high group cohesion be harmful? A case study of a junior ice-hockey team. Small Group Res. 40, 421–435 (2009). https://doi.org/10.1177/1046496409334359
20. Stoverink, A.C., Kirkman, B.L., Mistry, S., Rosen, B.: Bouncing back together: toward a theoretical model of work team resilience. Acad. Manag. Rev. 45(2), 395–422 (2020). https://doi.org/10.5465/amr.2017.0005
21. Carmeli, A., Stephens, J.P.: Knowledge creation and project team performance: the role of emotional carrying capacity. Acad. Manag. Proc. 2014(1), 12811 (2014). https://doi.org/10.5465/ambpp.2014.12811
22. Brailey, K., Vasterling, J.J., Proctor, S.P., Constans, J.I., Friedman, M.J.: PTSD symptoms, life events, and unit cohesion in U.S. soldiers: baseline findings from the neurocognition deployment health study. J. Traumatic Stress 20(4), 495–503 (2007). https://doi.org/10.1002/jts.20234
23. Kanesarajah, J., Waller, M., Zheng, W.Y., Dobson, A.J.: Unit cohesion, traumatic exposure and mental health of military personnel. Occup. Med. 66, 308–315 (2016). https://doi.org/10.1093/occmed/kqw009
24. Marcotte, R.J., Wang, X., Mehta, D., Olson, E.: Optimizing multi-robot communication under bandwidth constraints. Auton. Robots 44(1), 43–55 (2019). https://doi.org/10.1007/s10514-019-09849-0
25. Mitra, A., Richards, J.A., Bagchi, S., Sundaram, S.: Resilient distributed state estimation with mobile agents: overcoming Byzantine adversaries, communication losses, and intermittent measurements. Auton. Robots 43(3), 743–768 (2018). https://doi.org/10.1007/s10514-018-9813-7
26. Bhardwaj, A., Avasthi, V., Groundar, S.: Cyber security attacks on robotic platforms. Netw. Secur. 10, 13–19 (2019)
27. The Distributed and Collaborative Intelligent Systems and Technology Collaborative Research Alliance. https://www.dcist.org/
28. Neville, Messing, Ravichandar, Hutchinson, Chernova: An interleaved approach to trait-based task allocation and scheduling. In: ICRA 2020
29. Zhou, K.: Robust and Optimal Control. Prentice-Hall, Inc., Upper Saddle River (1996)
30. Sharma, S., Sharma, S.K.: Team resilience: scale development and validation. Vision 20(1), 37–53 (2016). https://doi.org/10.1177/0972262916628952
31. Morgan, P.B.C., Fletcher, D., Sarkar, M.: Defining and characterizing team resilience in elite sport. Psychol. Sport Exerc. 14(4), 549–559 (2013). https://doi.org/10.1016/j.psychsport.2013.01.004
32. Yagoda, R., Gillan, D.: You want me to trust a ROBOT? The development of a human-robot interaction trust scale. Int. J. Soc. Robot. 4(3) (2012). https://doi.org/10.1007/s12369-012-0144-0
33. Schaefer, K.E.: Measuring trust in human robot interactions: development of the “trust perception scale-HRI”. In: Mittu, R., Sofge, D., Wagner, A., Lawless, W.F. (eds.) Robust Intelligence and Trust in Autonomous Systems, pp. 191–218. Springer, Boston (2016). https://doi.org/10.1007/978-1-4899-7668-0_10
34. Lawshe, C.H.: A quantitative approach to content validity. Pers. Psychol. 28, 563–575 (1975)
Multi-modal Emotion Recognition for User Adaptation in Social Robots
Michael Schiffmann(B), Aniella Thoma(B), and Anja Richert(B)
TH Köln University of Applied Sciences, Betzdorfer Str. 2, 50679 Köln, Germany
{Michael.Schiffmann,Aniella.Thoma,Anja.Richert}@th-koeln.de
Abstract. The interaction of humans and robots in everyday contexts is no longer a vision of the future. This is illustrated, for example, by the increasing use of service robots, e.g., household robots or social robots such as Pepper from the company SoftBank Robotics. The prerequisite for social interaction is the robot’s ability to perceive its counterpart on a social level and, based on this, to output an appropriate reaction in the form of speech, gestures or facial expressions. In this paper, we first present the state of the art for multi-modal emotion recognition and dialog system architectures which utilize emotion recognition. The methods are then discussed in terms of their applicability and robustness. Starting points for improvements are identified and, subsequently, an architecture for the use of multi-modal emotion recognition techniques for further research is proposed.
Keywords: Social robotics · Emotion detection · Chatbots · Service robots · Affective policy
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 128–134, 2021. https://doi.org/10.1007/978-3-030-79997-7_16
1 Introduction
Especially in industrial contexts, robots are often designed to fulfill strongly repetitive or critically dangerous tasks. As robots become more prevalent in the context of healthcare and service, the design of social robots becomes more important. Social robots are defined as humanoid or anthropomorphically realized autonomous machines that interact with humans based on social rules [1]. Thus, the abilities of these robots should no longer be limited to the fulfillment of manual tasks; they also have to be able to interact with human beings. The systems must be oriented to the abilities, preferences, requirements and current needs of their users, adapting to the situation and emotional state of their user [2]. Taking these definitions into account, we consider social robots as systems that can fulfill tasks like answering questions, chatting, giving path guidance or fulfilling advisory tasks. We assume that these systems have audio and video input and output as well as a digital or real anthropomorphic or human-like body with motion capabilities to express gestures, postures and even facial expressions. An essential demand from users is an intuitive interaction with technical systems. A significant part of human interaction is language-based, i.e. it relies on the ability to carry on a conversation. Robotic systems, especially those with a human-like appearance, should therefore be designed as conversational agents. In general, three different types of conversational
agents, or dialog systems, can be distinguished: question-answering systems, goal-oriented systems and chatbots. Most suitable for conversational agents in the context of customer service are goal-oriented systems and chatbots. Goal-oriented dialogue systems are designed to help users complete specific tasks, like buying a ticket for public transport. In contrast, chatbots are developed with the goal of having long conversations about different topics. Both systems can be realized as rule-based systems, as neural/corpus-based systems, or as hybrids of the two [3]. A few years ago, the success of introducing chatbots in the customer service sector was limited because they were unable to provide convincing and engaging interactions [4, 5]. However, their performance has increased in recent years, especially for short conversations with only a single context, which enables them to handle customer service conversations [6]. Furthermore, approaches were developed to mitigate the tendency of chatbots to appear unnatural and impersonal by adding specific linguistic elements to achieve a more engaging communication style [7]. In this context, the emotional understanding that a system must build up with regard to a user in order to fulfill the user’s needs is especially important. The robotic system must be enabled to make users feel understood and create the feeling that they have reached someone they can expect help from. Thus, in addition to analyzing the content of the conversation, the user’s state must also be determined. This can be done via emotion recognition. Based on the information gained, a suitable reaction in the form of speech, gestures or facial expressions can be created as output. The assumed benefit of designing cooperative and social conversational agents lies in the increase of service quality in customer service [8].
Like conversation quality, however, emotion recognition must be robust and reliable; otherwise it will lead to negative user experiences.
2 Emotion Detection and Dialog Systems
The combination of emotion detection and dialog systems allows the adaptation to the user and the realization of an engaging interaction. Thus, researchers have been working on the topic for a long time. There are different approaches available for combining both systems. In the following, we will first discuss the abilities of emotion detection and dialog systems and then focus on the fusion of the two systems. Emotion detection in human-robot interaction can be achieved by utilizing different sources or media, like video, audio or text, that are available during the interaction. Emotional states can be derived from this input. There are other sources for emotion detection, such as blood pressure or heart rate, but those sources will not be considered here because they are not commonly captured by social robots. Detecting emotions from audio and video is a strongly researched area. For multi-modal emotion detection, our focus is on the fusion of different sources of emotions. A state-of-the-art approach to combining audio, video and text-based emotion recognition into one feature set is presented by [13]. The approach is based on the idea of increasing the precision of emotion recognition by fusing different sources of emotion recognition into one output set. The different sources for emotion recognition can be replaced by more accurate algorithms, e.g. for facial expressions [13].
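The idea of fusing several emotion sources into one output can be sketched as a simple late fusion: each modality produces its own probability distribution over emotion labels, and the distributions are combined by a weighted average. This is a generic illustration and not the method of [13]; the label set, the weights and all probability values are assumptions chosen for the example.

```python
EMOTIONS = ("angry", "happy", "neutral", "sad")  # assumed label set


def late_fusion(predictions, weights):
    """Weighted average of per-modality probability distributions."""
    total_weight = sum(weights[modality] for modality in predictions)
    if total_weight <= 0:
        raise ValueError("at least one modality needs a positive weight")
    fused = {}
    for label in EMOTIONS:
        weighted_sum = sum(
            weights[modality] * predictions[modality].get(label, 0.0)
            for modality in predictions
        )
        fused[label] = weighted_sum / total_weight
    return fused


# Hypothetical outputs of three independent recognizers for one user turn.
predictions = {
    "audio": {"angry": 0.6, "happy": 0.1, "neutral": 0.2, "sad": 0.1},
    "video": {"angry": 0.3, "happy": 0.1, "neutral": 0.5, "sad": 0.1},
    "text": {"angry": 0.5, "happy": 0.0, "neutral": 0.4, "sad": 0.1},
}
# Hypothetical trust in each modality; text counts twice as much here.
weights = {"audio": 1.0, "video": 1.0, "text": 2.0}

fused = late_fusion(predictions, weights)
top_label = max(fused, key=fused.get)
print(fused, "->", top_label)
```

Because each recognizer only has to deliver a distribution over the same labels, a single modality can be swapped for a more accurate model, or dropped when its sensor is unavailable, without changing the fusion step.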
In the case of detecting emotions from text it is possible to process the user’s utterance. This is called sentiment analysis and on a given scale a user’s utterance can be classified as positive or negative [3]. But the question arises, especially in the service context, how this information can be used in a meaningful way to adapt the response to the situation and the user. Several architectures have been proposed to combine dialog systems with emotion recognition. They are mainly based on the generalized architecture presented by [9] where it is stated that a goal-oriented dialog system is based on the principal components automatic speech recognition (ASR), spoken language understanding (SLU), dialog state tracking (DST), dialog policy, natural language generation (NLG) and text to speech (TTS). A prominent example for a hybrid approach is the chatbot XiaoIce which is widely used in many messengers in the Asian-pacific area. The goal was to create a chatbot that acts as “an AI companion with an emotional connection to satisfy the human need for communication, affection, and social belonging” [10]. This chatbot has a huge number of users and is extremely powerful to engage users to disclose to it [10]. A different approach on dialog systems which use multi modal emotion recognition is proposed for psychiatric treatment of patients [11]. The approach is more likely to be classified as a multi modal dialog system than a text-based chatbot. It can communicate through text, speech, audio and visual cues and generates its answers based on the age, gender and detected emotions from text, voice and video as well as from bio sensors. The answer generation is done in a hybrid way and consists of a neural decoder which gets information from a language model and a response inference engine which combines rules for psychiatric treatment with the emotional user state. 
To the best of our knowledge, no data is available that would allow a statement about the functionality and suitability of this approach [11].
A state-of-the-art architecture called ADVISER for a multi modal, multi domain and socially engaging dialog system was proposed in [12]. It tackles the challenge of combining emotion detection, processing and user-adaptive response generation in one framework. Features such as user state tracking, backchanneling, affective policies and multi modal emotion detection are of particular interest because they enable user adaptation. User state tracking is used to monitor the user’s engagement level, valence (negative, neutral, positive), arousal (low, medium, high) and emotions (angry, happy, neutral, sad). Backchanneling is a feedback mechanism in conversations in which the listener signals acknowledgment or a reaction to the speaker. The affective policy is a rule-based feature that is used to generate a domain-agnostic emotional output. The multi modal emotion recognition is supposed to be accomplished by models that can predict emotions either in a time-continuous fashion or discretely per conversation turn. So far, the framework has been implemented to provide its core functionality; as a next step, the researchers plan to deliver multi modal speech and vision models. The presented ideas form a very good baseline for the improvements needed to make dialog systems more appealing to users [12].
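The user state and the rule-based affective policy described above can be pictured with a small sketch. The valence, arousal and emotion categories are the ones listed in the text; the class name `UserState`, the boolean `engaged` field, the function `affective_policy`, and the concrete rules and phrases are hypothetical and do not reproduce the ADVISER implementation [12].

```python
from dataclasses import dataclass

VALENCE = ("negative", "neutral", "positive")
AROUSAL = ("low", "medium", "high")
EMOTIONS = ("angry", "happy", "neutral", "sad")


@dataclass(frozen=True)
class UserState:
    """Snapshot of the tracked user state for one conversation turn."""

    engaged: bool  # assumption: engagement level reduced to engaged / not engaged
    valence: str
    arousal: str
    emotion: str

    def __post_init__(self) -> None:
        # Reject values outside the categories named in the text.
        if self.valence not in VALENCE:
            raise ValueError(f"unknown valence: {self.valence}")
        if self.arousal not in AROUSAL:
            raise ValueError(f"unknown arousal: {self.arousal}")
        if self.emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {self.emotion}")


def affective_policy(state: UserState) -> str:
    """Pick a domain-agnostic affective phrase with hand-written rules (illustrative)."""
    if not state.engaged:
        return "Are you still with me?"
    if state.emotion == "angry" or (
        state.valence == "negative" and state.arousal == "high"
    ):
        return "I am sorry for the trouble."
    if state.emotion == "sad":
        return "I understand this is not easy."
    if state.emotion == "happy":
        return "Glad to hear that!"
    return ""  # neutral state: no affective addition
```

The returned phrase would then be combined with the informational answer of the language generation component; the rules are kept independent of any task domain, which is what makes the output domain-agnostic in this sketch.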
Multi-modal Emotion Recognition for User Adaptation in Social Robots
3 Improving Multi Modal Dialog Systems
Architectures for the real-world use of dialog systems embedded in a social robot are available, but these systems have often been tested only in laboratory environments. In contrast to purely text-based solutions, systems based on social robots have not been evaluated with a large number of users. Improving social conversational agents so that they deliver a high quality of service in real-world service tasks is the focus of the project SKILLED. SKILLED aims to develop a socio-empathic, multilingual natural language AI platform. Further, the project aims to investigate the influence of different modalities (e.g., language and interaction behavior across different communication channels), the handling of errors in the interaction with the user, and the design of multi-turn dialogue control, considering language, culture, and bonding.
Fig. 1. Multi Modal Dialog System for service-oriented social robots based on [11, 12].
In order to achieve and assure quality of use for service-oriented social robots that are intended to interact socially like humans, a number of requirements must be fulfilled. We identified the following requirements as drivers of our research for improving such systems: applicability and robustness. Their importance follows from their relation to multi modal emotion recognition and to multi modal response generation, which is realized, among other means, with an affective policy. For applicability in various contexts, we assume freedom from racial and gender bias, diversified answers and the timing of answers to be fundamental requirements. For robustness, fault tolerance and time behavior are the relevant criteria. In detail, our research will focus on latencies for backchanneling and the ability of continuous self-improvement. To achieve these goals, we will use an architecture derived from [11, 12] (see Fig. 1), as it reflects the general composition of modules used to build a goal-oriented dialog system.
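The general module composition of a goal-oriented dialog system (ASR, SLU, DST, dialog policy, NLG, TTS) can be read as a chain of stages that each enrich a shared turn record. The sketch below uses placeholder stages that only manipulate strings; none of them is a real recognizer, tracker or synthesizer, and all function names, intents and templates are assumptions for illustration.

```python
from typing import Any, Callable, Dict, List, Optional

Turn = Dict[str, Any]


def asr(turn: Turn) -> Turn:
    # Placeholder: treat the "audio" field as if it were already a transcript.
    return {**turn, "text": str(turn["audio"]).strip().lower()}


def slu(turn: Turn) -> Turn:
    # Placeholder intent detection by keyword.
    intent = "greet" if "hello" in turn["text"] else "unknown"
    return {**turn, "intent": intent}


def dst(turn: Turn) -> Turn:
    # The dialog state is reduced to the list of intents seen so far.
    return {**turn, "history": turn["history"] + [turn["intent"]]}


def policy(turn: Turn) -> Turn:
    action = "greet_back" if turn["intent"] == "greet" else "ask_repeat"
    return {**turn, "action": action}


def nlg(turn: Turn) -> Turn:
    templates = {
        "greet_back": "Hello! How can I help you?",
        "ask_repeat": "Sorry, could you repeat that?",
    }
    return {**turn, "response": templates[turn["action"]]}


def tts(turn: Turn) -> Turn:
    # Placeholder: wrap the text instead of synthesizing audio.
    return {**turn, "speech": "<speech>" + turn["response"] + "</speech>"}


PIPELINE: List[Callable[[Turn], Turn]] = [asr, slu, dst, policy, nlg, tts]


def run_turn(audio: str, history: Optional[List[str]] = None) -> Turn:
    """Run one user turn through all stages and return the enriched record."""
    turn: Turn = {"audio": audio, "history": list(history or [])}
    for stage in PIPELINE:
        turn = stage(turn)
    return turn
```

Keeping every stage behind the same signature is one way to make individual modules exchangeable, for example to place an affective policy next to the dialog policy without touching the other stages.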
The architecture presented is used to evaluate the adaptive system behavior with respect to the stated quality goals within a user-centered design process. The research on the mentioned improvements will start with evaluations in a laboratory setting and will be extended to real-world testing to obtain continuous feedback from users.
With regard to the continuous self-improvement of the system, we follow the approaches of [14, 15], where direct user feedback is collected by the dialog system to improve the response generation. We will use this approach and collect emotional user feedback in field experiments. In addition, we will observe the experiments and evaluate the users’ perception of the system with questionnaires. We will use the results to gain use-case-specific information on how to improve the system’s behavior for user adaptation.
Furthermore, we will use the approach presented in [13] for the fusion of different sources for emotion recognition. The approach is highly suitable because it allows the prediction models to be exchanged, for example for facial expression detection, which gives us the flexibility to substitute more accurate models. We also want to investigate whether this approach can be improved by fine-tuning the algorithm itself with the data from our user tests.
The affective policy uses the output of the emotion recognition to generate an affective output, which will be combined with the natural language generation component to create a combined affective and informational system answer. For our research to improve the affective policy, we focus on the input, the processing and the output as starting points. The input for the affective policy is usually emotional features extracted from audio or video.
We want to investigate the extent to which demographic information such as age and gender, in addition to emotional features, can improve the generation of affective output without causing bias. Regarding the processing within the affective policy, we aim to explore how the available information has to be processed. In particular, we are interested in whether a certain type of processing, e.g., continuous processing over time, processing per conversation turn, or another combination, is particularly suitable for generating an appropriate emotional output.
An important part of user adaptation is the ability to backchannel during a conversation. The presented system works in conversation turns (see [12]) and only reacts to the user when it is its turn. Previous work has shown that speakers feel uncomfortable when response times are not within an interval of 300 to 700 ms [16]. Appropriate backchanneling could avoid this discomfort, but it has to be done under the existing time constraints. Taking this into account, we will use the approach of [12] for audio analysis and try to generate the backchanneling in real time.
Finally, to realize a user-centered output, we want to explore what kind of adaptation users expect from social robots in certain use cases rather than using an agnostic approach. We assume that affective rules need to be varied according to the use case. For example, the requirements in public spaces differ from those in private spaces. In addition to the application of affective rules, there is also the question of how emotional output is perceived. A suitable design is once again essential to realize a meaningful user adaptation. Depending on the user, the output preference can vary from minor textual adjustments to very strong visual expressions of emotions.
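The timing constraint mentioned above, a response within roughly 300 to 700 ms [16], can be expressed as a simple scheduling check. The sketch assumes that an estimate of the time needed for the full answer is available; the function name `plan_response`, the action labels, and the decision rule are illustrative assumptions, not the behavior of the system in [12].

```python
from typing import List, Tuple

MIN_RESPONSE_MS = 300  # lower bound of the comfortable interval reported in [16]
MAX_RESPONSE_MS = 700  # upper bound of the comfortable interval reported in [16]


def plan_response(estimated_answer_ms: int) -> List[Tuple[int, str]]:
    """Return (delay_ms, action) pairs, measured from the end of the user's turn.

    If the full answer can be ready inside the window, it is delivered there
    (and held back until the lower bound if it is ready too early).
    Otherwise a backchannel is emitted inside the window and the full
    answer follows as soon as it is available.
    """
    if estimated_answer_ms < 0:
        raise ValueError("estimated_answer_ms must not be negative")
    if estimated_answer_ms <= MAX_RESPONSE_MS:
        return [(max(estimated_answer_ms, MIN_RESPONSE_MS), "answer")]
    return [(MIN_RESPONSE_MS, "backchannel"), (estimated_answer_ms, "answer")]
```

For an answer that is expected to take 1500 ms, this plan places a backchannel at 300 ms and the answer at 1500 ms; whether a single backchannel is enough for longer delays is exactly the kind of question the planned user tests would have to answer.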
Thus, the research approach pursued aims at implementing different existing approaches and investigating them in multiple field experiments with respect to applicability and robustness. In a further step, the requirements for affective policy will be explored.
4 Conclusion
In this paper, we presented a brief overview of multi modal emotion detection and dialog systems. With regard to our research focus of improving the user adaptation of multi modal dialog systems, we set up an initial architecture for user evaluations and defined different quality aspects as starting points. We defined robustness and applicability as major aspects to improve service-oriented social robots. In terms of applicability, we want to focus on the affective policy, which is a key element for user adaptation. Regarding robustness, we focus on the ability of continuous self-improvement, time behavior and fault tolerance. Our research plan is to first evaluate the system in a laboratory context and then in user scenarios. During the user scenarios, we will evaluate the potential improvements with suitable measures.
Acknowledgements. The authors acknowledge the financial support by the Federal Ministry of Education and Research of Germany in the framework of FH-Kooperativ 2-2019 (project number 13FH504KX9).
References
1. Ferrari, F., Eyssel, F.: Toward a hybrid society. In: Agah, A., Cabibihan, J.-J., Howard, A.M., Salichs, M.A., He, H. (eds.) ICSR 2016. LNCS (LNAI), vol. 9979, pp. 909–918. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-47437-3_89
2. Meudt, S.: Maschinelle Emotionserkennung in der Mensch-Maschine Interaktion (2019). https://doi.org/10.18725/OPARU-15022. https://oparu.uni-ulm.de/xmlui/handle/123456789/15079
3. Jurafsky, D., Martin, J.H.: Speech and Language Processing (3rd ed. draft). https://web.stanford.edu/~jurafsky/slp3/
4. Schuetzler, R., Grimes, M., Giboney, J., Buckman, J.: Facilitating Natural Conversational Agent Interactions: Lessons from a Deception Experiment (2014)
5. Wannemacher, P.: Bots Aren’t Ready To Be Bankers. https://go.forrester.com/wp-content/uploads/Forrester_Bots_Arent_Ready_To_Be_Bankers.pdf
6. Kasinathan, V., Abd Wahab, M.H., Syed Idrus, S.Z., Mustapha, A., Yuen, K.: AIRA Chatbot for travel: case study of AirAsia. J. Phys. Conf. Ser. 1529, 022101 (2020). https://doi.org/10.1088/1742-6596/1529/2/022101
7. Liebrecht, C., van Hooijdonk, C.: Creating humanlike chatbots: what chatbot developers could learn from webcare employees in adopting a conversational human voice. In: Følstad, A., et al. (eds.) CONVERSATIONS 2019. LNCS, vol. 11970, pp. 51–64. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39540-7_4
8. Gnewuch, U., Morana, S., Maedche, A.: Towards Designing Cooperative and Social Conversational Agents for Customer Service (2017)
9. Williams, J.D., Raux, A., Henderson, M.: The dialog state tracking challenge series: a review. Dialogue & Discourse 7, 4–33 (2016). https://doi.org/10.5087/dad.2016.301
10. Zhou, L., Gao, J., Li, D., Shum, H.-Y.: The design and implementation of XiaoIce, an empathetic social chatbot. arXiv:1812.08989 [cs] (2019)
11. Oh, K.-J., Lee, D., Ko, B., Choi, H.-J.: A chatbot for psychiatric counseling in mental healthcare service based on emotional dialogue analysis and sentence generation. In: 2017 18th IEEE International Conference on Mobile Data Management (MDM), pp. 371–375. IEEE, Daejeon (2017). https://doi.org/10.1109/MDM.2017.64
12. Li, C.-Y., Ortega, D., Väth, D., Lux, F., Vanderlyn, L., Schmidt, M., Neumann, M., Völkel, M., Denisov, P., Jenne, S., Kacarevic, Z., Vu, N.T.: ADVISER: a toolkit for developing multi-modal, multi-domain and socially-engaged conversational agents. arXiv:2005.01777 [cs] (2020)
13. Bagher Zadeh, A., Liang, P.P., Poria, S., Cambria, E., Morency, L.-P.: Multimodal language analysis in the wild: CMU-MOSEI dataset and interpretable dynamic fusion graph. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2236–2246. Association for Computational Linguistics, Melbourne, Australia (2018). https://doi.org/10.18653/v1/P18-1208
14. Lu, Y., Srivastava, M., Kramer, J., Elfardy, H., Kahn, A., Wang, S., Bhardwaj, V.: Goal-oriented end-to-end conversational models with profile features in a real-world setting. In: Proceedings of the 2019 Conference of the North, pp. 48–55. Association for Computational Linguistics, Minneapolis - Minnesota (2019). https://doi.org/10.18653/v1/N19-2007
15. Hancock, B., Bordes, A., Mazaré, P.-E., Weston, J.: Learning from Dialogue after Deployment: Feed Yourself, Chatbot! arXiv:1901.05415 [cs, stat] (2019)
16. Lundholm Fors, K.: Production and Perception of Pauses in Speech (2015). http://hdl.handle.net/2077/39346
Robot Design Needs Users: A Co-design Approach to HRI
Francesco Burlando, Xavier Ferrari Tumay, and Annapaola Vacanti
Dipartimento Architettura e Design, Università di Genova, Genoa, Italy
[email protected], {xavier.ferraritumay, annapaola.vacanti}@edu.unige.it
Abstract. The rapid technological growth that we are facing has highlighted issues in human-computer interaction, specifically within the scope of robotics; these dynamics open up new challenges for UX designers. We refer specifically to the category of social robots, which is a growing sector in the market of humanoid robots. Social robots must provide a good level of interaction with human beings, who form complex and deep relationships with them; they are able to learn from the environment in which they are located as well as from interactions with users, and to develop some degree of decision-making autonomy. Designing a social humanoid robot is not just the development of a consumer product such as a household appliance, nor just the engineering of a body for a voice assistant; it also requires taking into account the psychological impact of the robot and evaluating social dynamics. Therefore, it is fundamental to place the individual at the center of the design process in order to study an acceptable user experience. Co-design methodologies are now considered an essential practice, especially in technological projects, because a high-tech product, system or service is most likely capable of actively interacting with the users for whom it is intended. Designing an interactive and fun co-design experience is fundamental in order to positively engage users and obtain valuable data and insight about their perspective on the project that is being developed. Reflecting on the main themes and issues that must be taken into consideration in the design of a social robot based on UCD principles, the paper aims to propose new challenges and approaches for co-design activities for the robotic project, taking advantage of digital platforms and innovative IT solutions.
Keywords: Human-robot interaction · Co-design methods · UCD approach
1 Scenario
As the technological development of the appliances and services that assist us in our everyday life is not slowing down, but instead growing in intensity and pervasiveness, the designer’s role is becoming more and more central and complex. In particular, Interaction Design and User Experience Design have become fundamental areas of research: technological devices need to be designed by first taking into consideration the ways in which the user is going to interact with them. We do not refer merely to digital interfaces, even though this is the main area of application of the UX approach, but also and especially to high-tech products. This sort of appliance requires a big effort from users in order to learn its functions and interact with it positively; the main goal of UX designers in these matters is to facilitate the learning process and create a seamless and useful experience for users, understanding their perspective, needs and expectations [1].
In this article we mainly refer to the area of robotics, which poses a series of very specific challenges to the designer; in fact, we are undoubtedly facing a future, if not already a present, in which machines are going to gain more and more autonomy (we prefer not to use the term “intelligence”, since the discussion around the theme of AI development is complex and articulated and outside our research area). Autonomous machines capable of relieving humans from dangerous and/or complex tasks, or merely assisting them in workplaces or at home, are indeed machines, but they cannot be designed as mere tools, because we are seeing an increasing trend of anthropomorphic design on the market. This trend is justified by the fact that, especially in the area of social robotics, it makes sense to implement humanlike features in the design in order to adapt the robot to the humanized environment and its spaces and measures; moreover, in order to create an emotional and friendly relationship between the human and the robot, the latter should present, both in its appearance and in its behaviour, some kind of familiarity to humans [2].
While engineers work on solving hardware and software issues, designers should focus their efforts on studying Human-Robot Interaction (HRI), and Human-Humanoid Interaction (HHI) in particular, in order to facilitate the introduction of assistive humanoids into our society.
Attribution of paragraphs. 1, 4: Annapaola Vacanti. 2: Francesco Burlando. 3: Xavier F. Tumay.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 135–142, 2021. https://doi.org/10.1007/978-3-030-79997-7_17
2 UX Design Applied to Robotics
2.1 Understanding the Social Robot
When designing a social robot, several themes need to be considered and reflected upon in order to create a seamless and positive experience for the end user.
• Degree of human likeness and size: for a robot to interact and move inside a human-sized environment, it is more functional for its body to have human dimensions. In particular, social robots tend to be quite small compared to robots devoted to physical tasks in factories and similar environments, because they may need to be used in private houses, where there is very little room for moving around without being in the way. That said, we must not forget that designing a robot that is too similar to a human being may create an uncanny reaction in the user; social robots must be perceived as companions and friends [3]. For this reason, they are often smaller than an adult, and they present cute, somewhat cartoon-like features in the face and/or body shape.
• Materials and colors: when designing the appearance of the robot, we must not forget that users expect a product that is strong, durable and of good quality; we cannot design a robot that feels weak or easily breakable; it must be perceived as an appliance that we can rely on. Choosing materials and colors is not an easy task, because many of these aspects are related, in the users’ imagination, to the idea of robots that they have previously formed, perhaps from watching sci-fi movies, where robots tend to have a very specific appearance in connection with their use and/or personality. It is common for social robots to have a mostly white body, with some vivid color in the details.
• Ability to move and degrees of freedom: even though these themes are the prerogative of the engineer rather than the designer, they must be considered when studying and building the user’s perception of the level of “intelligence” and autonomy of the robot [4]. The more autonomous the robot is in moving around and performing tasks, the more it is perceived as smart and, on the other hand, as potentially scary; for this reason, its aesthetics must create a positive image of friendliness and safety.
• Vocal feedback: the ability to speak and respond to specific commands and/or questions is not a feature of all kinds of robots. As with the ability to move, this characteristic makes a robot feel more or less intelligent. Its voice should be clear and friendly, but perhaps not completely human; it may have some kind of “disturbance” that reinforces the perception of interacting with a technological object rather than a human being.
• Visual feedback: social robots need to have an emotional interaction with their users, thus their “faces” are probably the most important interfaces to be designed, both regarding their aesthetic appearance and their capacity to react to stimuli coming from the environment and/or the user. As said before, it is preferable not to design a 100% humanlike face, but rather to make use of a cartoon or emoticon style, because iconic images are more understandable for a wider public and tend to build empathy rather than uncanniness [5].
2.2 Understanding the End User
We have now reflected on how a social robot should be designed and what the areas of competence of the designer are in these matters. We can see at a glance that most of the themes involve the perception that users have of the robot, or their emotions and expectations regarding their interaction with it. But who is the end user? What are their main characteristics? These questions are the foundations of our project [6].
• Demographics: first of all, we need to understand the background of our users. Age, gender and place of birth/residence are aspects that strongly define an individual’s way of life, habits and approach to technological products [7].
• Living conditions: a person who lives alone has very different needs from those of a big family or those of a frail user who needs constant assistance from humans and/or robots. The configuration of the living environment also makes a huge difference when a social robot is introduced into a specific location; tight spaces, stairs or other obstacles can strongly reduce the robot’s ability to move and to provide support to the user.
• Physical and mental status: social robots are especially useful for assisting frail people, who may have physical or even cognitive disabilities, or simply be elderly or very young. Thus, it is of fundamental importance to understand the impairments that the specific target of the project may present.
• Needs: in strong connection with the previous point, the needs of the specific target users must be thoroughly analyzed and unpacked in order to understand how every feature of the robot can be developed to facilitate a specific task [8].
• Familiarity with technology: people of different ages and academic/work backgrounds have very different levels of familiarity with technological and IT products and may have very different learning curves when it comes to accepting a robotic system and integrating it into their daily lives.
• Expectations from a robot: all of us have expectations and stereotyped images of how a robot looks and behaves; these are mostly generated by sci-fi scenarios and may lead to excessively high expectations about the alleged intelligence of the robotic system, thus creating dissatisfaction in users when they deal with a real robot. The appearance of the robot should therefore suggest its functions and abilities, without going so far as to create a futuristic appearance that is not supported by correspondingly high-level functionalities.
3 Co-designing Robots
In light of the previous reflections, we believe it is of great importance to apply a User Centered Design (UCD) approach to the robotic project. Moreover, it would greatly support the design process to be able to involve users from the preliminary stages, and then test iterations of the project as it proceeds toward its final result. Of course, we can rely on several methodologies that are commonly used in co-design sessions.
3.1 Stages of the Project
Different tools may be used in different stages of the project [9]:
• Definition of the design brief: at this level, we need users’ participation in order to understand what our project should do. We are working at a very high level, without getting deep into functional details. Users can help us understand what they need, which features we should focus on and which features are of secondary importance.
• Design of the prototype: during this stage we get into details; how should our project do what we need it to do? It is a phase of convergence, where the expectations of the users must be converted into real-life solutions. Involving users here can be useful for understanding, from the beginning of the work, which solutions are easier for our target to understand and then put to use.
• Validation of the prototype: this stage is crucial to perfect the design and find incongruences or low affordances in its functioning. Users will test the prototype and may even propose disruptive solutions to replace what has been done. As stated by the Design Thinking approach, this process is iterative, especially concerning the last two stages [10].
The selection of the user sample also follows different rules in every stage of the process: when we are defining the design brief, it may be useful to collect as much data as possible, thus involving a very broad group of participants; when getting into details, this group should be reduced to a smaller, more targeted audience, whose expectations and needs we can understand deeply; finally, we test our prototypes with small groups of users that very closely represent the target sector of people who may find our product suitable when it goes on the market.
3.2 Issues for the Robotic Project
Data collected during co-design sessions is vital to produce widely accepted products that are able to be a consistent support in users’ lives. These data, moreover, do not become obsolete: they provide a useful base for future work and insights for researchers and designers on the matter of study. Although we can retrieve a flourishing literature, especially in recent years, dealing with human perception of robots [11], we often are under the impression that collecting unbiased, accurate data regarding the specific theme of robot design in relation to appearance and interaction features is a challenge that still needs study and development.
Because humanoid robots are still far from being pervasively present in our society and daily activities, it is difficult to ask users for direct answers and opinions regarding such a complex product, with which they often have no direct experience of interaction. Many of the people who may benefit from the assistance of a social robot in the future have never had a real-life encounter with one. Even worse, most of us share a common stereotype regarding robots, favored by the spread of such creatures in sci-fi movies and literature [12]. These media built the misconception that robots are powerful intelligent machines that could at any moment rise up and conquer the planet at the expense of humanity. Even in cases where this dramatic scenario is not taken seriously by users, most of them are still disappointed when they get to interact with a real robot with limited features; reality has not yet reached fantasy, let alone surpassed it.
On the other hand, we must remember that our users do not have a design education and may find it complicated to express their ideas and suggestions through drawing or by directly answering specific questions about the robot’s appearance and features. Moreover, frail users may not be able to explain their needs. In general, it would be very difficult to understand how users may benefit from the presence of a robot in their lives, what use could be best for them and what aspects would make them uncomfortable.
4 Good Practices and Challenges for Collaborative Robot Design
If we feel that classical data collection methods such as questionnaires and interviews may be failing us in the attempt to effectively engage users in discussing matters regarding robot design and behavior in human society, we have several opportunities to put other engaging activities to the test. Most importantly, users need to be engaged on their creative side and to put forward their own ideas, without feeling judged or constrained by their own inability to express themselves creatively in the way that a designer has been trained to do [13].
As recalled above, the design process in general, and that of a humanoid robot in particular, is very articulated and complex. During the early design stages, it is good practice to organize focus groups in which users can observe and comment on one or several mockups of a robot, described by a moderator [14]. Although this activity can prove to be very useful, we have wondered during our research activities whether we could find methodologies that are more engaging both for the users and for the design team; as a result, we came up with various kinds of participatory design games, which will be described in depth in further publications. What follows is a brief review of the main insights gathered while organizing co-design sessions for the robotic project during the last two years of activity of our research group.
• Defining precise scenarios helps users to propose actionable solutions: working within clear boundaries helps to explore and discuss solutions that are actually useful for the project, rather than having a broad but unfocused discussion on the matter of “what a social robot could do for humans”.
• Co-drawing or assisted design activities relieve users from embarrassment: having some pre-designed tools to facilitate users’ creativity is crucial when working with people who are not used to drawing or expressing themselves visually; an experience that has proven to be very successful is having an illustrator draw in real time what the users suggest.
• Parametric tools allow anyone to 3D model and prototype: in order to overcome the need for a moderator or illustrator, parametric tools such as Grasshopper, a well-known plugin for Rhinoceros 3D [15], offer even completely inexperienced users the chance to modify several dimensions of the robot and create unique designs.
• Providing a defined number of variations helps collect insightful data: when the design options of a parametric or assisted drawing tool are defined, it is easier to analyze variations in the meta-designs generated by the users, thus obtaining insightful data on their expectations.
• Both children and elderly users can be engaged through participatory games: usually, people of all ages are happy to have their opinions heard, and after a first moment of shyness they are usually fully committed to the activity, provided that it proves to be sufficiently challenging and fun.
In conclusion, although the design process of a social robot is already long and challenging, designing various co-design activities fit for every stage of the process is not at all a waste of time, but rather a stimulating activity both for the users involved and for the design team. In the scope of social robotics, we are facing completely new and exciting challenges for which we do not yet have much data available, and we are facing a market of users who are not sure they want or need a humanoid creature living beside them in the near future [11]. Taking the design of a humanoid to a more human level means harnessing the creativity of our future clients beyond their misconceptions and biases regarding robots; once negative reactions are outweighed by curiosity and excitement, we actually find that well-designed robots are capable of building an empathic relationship with human users that surpasses that of any other kind of appliance we use in our everyday lives (Fig. 1).
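Returning to the points on parametric tools and on a defined number of variations in the list above, the idea can be illustrated with a small, tool-independent sketch: each design parameter has a fixed set of allowed options, every participant's design is validated against it, and the choices are tallied across participants. The parameter names and options are invented for this example; they are not taken from the Grasshopper definition used in the workshop.

```python
from collections import Counter
from typing import Any, Dict, Iterable

# Hypothetical, deliberately small option space for a robot meta-design.
DESIGN_OPTIONS: Dict[str, tuple] = {
    "height_cm": (60, 90, 120),
    "head_shape": ("round", "square"),
    "accent_color": ("blue", "orange", "green"),
}


def validate_design(design: Dict[str, Any]) -> Dict[str, Any]:
    """Check that a design sets exactly the defined parameters to allowed values."""
    if set(design) != set(DESIGN_OPTIONS):
        raise ValueError("design must set exactly the defined parameters")
    for name, value in design.items():
        if value not in DESIGN_OPTIONS[name]:
            raise ValueError(f"{value!r} is not an allowed option for {name}")
    return dict(design)


def tally_choices(designs: Iterable[Dict[str, Any]]) -> Dict[str, Counter]:
    """Count, per parameter, how often each option was chosen by participants."""
    tallies: Dict[str, Counter] = {name: Counter() for name in DESIGN_OPTIONS}
    for design in designs:
        for name, value in validate_design(design).items():
            tallies[name][value] += 1
    return tallies
```

Because every parameter can only take a few predefined values, the designs of different participants remain directly comparable, and `tallies["head_shape"].most_common(1)` would give the most frequently chosen head shape in a session.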
Fig. 1. A moment of the participatory parametric workshop organized by our research group at the European Maker Faire held in Rome in October 2019.
References
1. Albert, W., Tullis, T.: Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics. Newnes (2013)
2. Zlotowski, J., Proudfoot, D., Bartneck, C.: More human than human: does the uncanny curve really matter? (2013)
3. Kanda, T., et al.: Analysis of humanoid appearances in human–robot interaction. IEEE Trans. Robot. 24(3), 725–735 (2008)
4. Huang, H.-M., et al.: A framework for autonomy levels for unmanned systems (ALFUS). In: Proceedings of the AUVSI’s Unmanned Systems North America, pp. 849–863 (2005)
5. Casiddu, N., Burlando, F., Porfirione, C., Vacanti, A.: Designing synthetic emotions of a robotic system. In: Karwowski, W., Ahram, T., Etinger, D., Tanković, N., Taiar, R. (eds.) IHSED 2020. AISC, vol. 1269, pp. 148–155. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-58282-1_24
6. Abras, C., Maloney-Krichmar, D., Preece, J.: User-centered design. Bainbridge, W. Encycl. Hum. Comput. Interac. 37(4), 445–456 (2004). Sage Publications, Thousand Oaks
7. Ezer, N.: Is a robot an appliance, teammate, or friend? Age-related differences in expectations of and attitudes toward personal home-based robots. Diss. Georgia Institute of Technology (2008)
8. Khalid, H.M.: Embracing diversity in user needs for affective design. Appl. Ergon. 37(4), 409–418 (2006)
9. Rubin, J., Chisnell, D.: Handbook of Usability Testing: How to Plan, Design and Conduct Effective Tests. Wiley, Hoboken (2008)
142
F. Burlando et al.
10. Tschimmel, K.: Design Thinking as an effective Toolkit for Innovation. In: ISPIM Conference Proceedings. The International Society for Professional Innovation Management (ISPIM) (2012) 11. Arras, K.O., Cerqui, D.: Do we want to share our lives and bodies with robots? A 2000 people survey: a 2000-people survey. Technical report 605 (2005) 12. Ackerman, E.: Study: nobody wants social robots that look like humans because they threaten our identity. IEEE Spectr., 1–5 (2016) 13. Sanders, E.B.-N., Jan Stappers, P.: Co-creation and the new landscapes of design. Codesign 4(1), 5–18 (2008) 14. Casiddu, N., Burlando, F., Porfirione, C., Vacanti, A.: Humanoid robotics: guidelines for usability testing. In: Karwowski, W., Ahram, T., Etinger, D., Tankovi´c, N., Taiar, R. (eds.) IHSED 2020. AISC, vol. 1269, pp. 102–109. Springer, Cham (2021). https://doi.org/10.1007/ 978-3-030-58282-1_17 15. Grasshopper 3D. http://www.grasshopper3d.com
Social Robotic Platform to Strengthen Literacy Skills
Mireya Zapata1(B), Jacqueline Gordón2, Andrés Caicedo3, and Jorge Alvarez-Tello4
1 Research Center of Mechatronics and Interactive Systems - MIST, Universidad Tecnológica Indoamérica, Machala y Sabanilla, 172103 Quito, Ecuador
2 Faculty of Human and Health Sciences, Universidad Tecnológica Indoamérica, Machala y Sabanilla, 172103 Quito, Ecuador
3 Faculty of Architecture Arts and Design, Universidad Tecnológica Indoamérica, Machala y Sabanilla, 172103 Quito, Ecuador
4 Center for Technology Transfer and Innovation – CTTI, Universidad Tecnológica Indoamérica, Machala y Sabanilla, 172103 Quito, Ecuador
Abstract. This work presents a robotic platform whose mechanical movements of the head, torso, and arms interact in real time with the “Tití app”, which integrates game sessions to improve literacy skills, giving rise to a new learning scenario. The functionality of the app is based on the reading and writing errors proposed by the T.A.L.E. test. The prototype developed for dynamic interaction with the app is controlled by an Arduino board, which manages the servo-motors and the Bluetooth communication that establishes a wireless connection with the tablet or mobile device running the app. The mechanical design and the control software allow operations to be executed between the robotic prototype Titíbot and the Tití app. Its form, function, and context were considered in order to define each of its functions and to customize the interaction so that it reinforces the scenario and context of application.
Keywords: Social robotics · Social interaction · Reading · Literacy skills
1 Introduction
Mobile robots result from an industrial transformation that responds to human needs in different fields of application, including education. Several studies show that using social robots in educational processes helps reinforce the subject matter and optimizes the learning sessions. When the robotic component is also integrated with learning-oriented game applications [1, 2], the reinforcement is even more significant: it captures the user’s attention in the process of “learning by playing” and reinforces the messages through the mechanics of the prototype and its communication and interaction model [3].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 143–149, 2021. https://doi.org/10.1007/978-3-030-79997-7_18
The robot’s mimicry improves interaction with, and acceptance of, the robot: it helps develop empathy, improves group acceptance, and presents signals and communication modalities that let users interact with it in a familiar way. In this paper, we present an innovative platform called the Titíbot platform, made up of a mobile application (software) that communicates with a mechanical structure (hardware) in order to reinforce the long-term learning process. The Tití app [4] presents exercises adapted to the Ecuadorian reality, both idiomatically, in terms of the Spanish language, and through representations characteristic of the country and the region. The physical structure is intended to be low cost and environmentally friendly. It uses recyclable materials with enough mechanical stability to provide a robust frame for mounting the servo-motors, which produce the movements that facilitate the understanding of the message sent by the app. Finally, integrating physical robots with the app makes it possible to scale the options for customizing the type of robot according to the application context, with the aim of improving quality of life. This represents significant growth for society in the personalization of care services and one-to-one educational guidance.
2 The Robot – Human Interaction
Social robotics has become a channel for introducing social interaction and accompaniment into tools that deliver treatments or assistance to various social segments [5]. In this case, systematized, computerized, and digitized activities are combined with social interaction: the gestures of the robotic platform reinforce the message of the app. According to Ceccarelli [6], “mechatronic robots are systems that have functionalities and perform tasks of actions and mechanical interactions with humans or with other systems.” These actions become social interaction when they involve communication, that is, the sending and receiving of messages. In this sense, technology that is appropriate for the identified context has to match the available technology and the scope of each system to the intended social interaction. According to Chiluiza [7], the various types of robots (anthropomorphic, static, bipedal, mobile, telepresence, etc.) interact with various groups of people. Groups that interact in closed spaces, such as specialty laboratories, can have homogeneous profiles or be similar in age and in acceptance of technology, whereas the profiles of people in uncontrolled environments, open spaces, or public spaces can be diverse. The social robots that accompany a learning process can therefore take different forms, depending on the technological availability and the characterization of the application context.
3 Design and Implementation
3.1 Robotic Platform Structure
The prototype structure of the robot supports the interface and complements the stimulation through the movement of the arms. The structure that performs the communication gestures is composed of balsa wood bars, which makes it low cost and environmentally friendly.
Fig. 1. Front and rear side of the robotic structure
The movements were implemented on the light structure, which supports the 4 servomotors used to move the joints. Their location can be seen in Fig. 1. In addition, the structure was designed to support the tablet on which the application runs, so that the robot, which looks like a Tití monkey (squirrel monkey), appears to hold the mobile device while the child plays and meets the challenges (see Fig. 2).
Fig. 2. Sketch of the social robotic Titíbot
3.2 Actuators and Controller
For the implementation of the structure, Dynamixel XL-320 servomotors were used, with a turning angle of up to 300° and a stall torque of 0.39 N·m, sufficient to support the required payload [8]. Given the position of the motors, the following individual or combined movements can be performed (Table 1). The open-source OpenCM 9.04 Type B controller uses an ARM Cortex-M3 processor. This board is specialized for controlling and programming Dynamixel servos under a 3-pin TTL daisy-chain scheme, in which multiple Dynamixel servos can be wired together in sequence or in a ring (see Fig. 3). Its programming language is compatible with Arduino in C/C++, which was used to send the angles. The following motion sequences were included in the controller programming.

Table 1. Possibilities of social interaction messages

| n° | Movement of body part | Messages |
|----|-----------------------|----------------|
| 1 | Head | Affirmation |
| 2 | Neck | Negation |
| 3 | Right arm up | Question |
| 4 | Left arm up | Failed attempt |
| 5 | Both arms up | Success |
Fig. 3. Daisy chain link. Modified from [9]
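As a rough illustration of how the messages in Table 1 could be turned into servo targets, the sketch below maps each message to a set of joint angles and converts the angles into XL-320 position units. It is written in Python for readability rather than in the Arduino-compatible C/C++ used on the controller. The servo IDs, the angles chosen for each gesture, and the function names are assumptions made for this example and are not taken from the Titíbot firmware; the 0–1023 goal-position range over 0–300° follows the XL-320 e-manual [8].

```python
# Illustrative sketch only: servo IDs, gesture angles and names are assumptions.

XL320_MAX_ANGLE_DEG = 300.0   # turning range stated in the text
XL320_MAX_POSITION = 1023     # 10-bit goal position, per the XL-320 e-manual

# Hypothetical servo IDs on the daisy chain.
HEAD, NECK, RIGHT_ARM, LEFT_ARM = 1, 2, 3, 4

# Table 1 messages mapped to hypothetical target angles (degrees) per servo.
GESTURES = {
    "affirmation": {HEAD: 180.0},
    "negation": {NECK: 120.0},
    "question": {RIGHT_ARM: 240.0},
    "failed_attempt": {LEFT_ARM: 240.0},
    "success": {RIGHT_ARM: 240.0, LEFT_ARM: 240.0},
}


def degrees_to_position(angle_deg: float) -> int:
    """Convert an angle in degrees to an XL-320 goal position (0..1023)."""
    clamped = max(0.0, min(XL320_MAX_ANGLE_DEG, angle_deg))
    return round(clamped / XL320_MAX_ANGLE_DEG * XL320_MAX_POSITION)


def gesture_to_commands(message: str) -> list[tuple[int, int]]:
    """Return (servo_id, goal_position) pairs for a Table 1 message."""
    targets = GESTURES[message]
    return [(servo_id, degrees_to_position(angle))
            for servo_id, angle in sorted(targets.items())]


if __name__ == "__main__":
    for name in GESTURES:
        print(name, gesture_to_commands(name))
```

Keeping the message-to-angle table separate from the conversion function means a gesture can be retuned without touching the code that talks to the servos.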
3.3 Tití App
This is a pedagogical tool designed to support the literacy process of school-age boys and girls in an environment that encourages local cultural education through scenes of Ecuadorian flora and fauna. The challenges presented by the application consist of rescuing 12 animals in danger of extinction, with the monkey “Tití” (known as the squirrel monkey) as the main character interacting with the user (see Fig. 4). The app has four scenarios that correspond to the natural regions of Ecuador: Coast, Sierra, Amazon, and Galapagos. They contain learning exercises in 12 stations with 3 levels of difficulty, using visual and narrative resources. Each time the user completes a station, an animal at risk of extinction is rescued. To stimulate and enhance the development of social and cognitive skills, a human-robot-game interaction platform was implemented and integrated with the Tití app and its educational tasks and challenges. The robotic platform has the appearance of the squirrel monkey and has 3 degrees of freedom, allowing the head, neck, and arms to move in synchrony with the actions executed in the app.
Fig. 4. Graphic interface of the Tití app, with the instruction to find the correct letter.
3.4 The Social Interaction
The social robotic prototype, synchronized with the actions executed in the Tití app, is intended to allow an objective evaluation of its impact and level of usability in the teaching-learning process of school-age children, and of children with special abilities, during their first years of basic education. The application also helps to create environmental awareness: by using emblematic animals of the Ecuadorian fauna in its interface, it motivates children to know and take care of local and world fauna. This objective is pursued through entertaining games and a platform for monitoring their use, which presents evaluations and results in simple language that teachers and parents can interpret. Nayeth Solórzano, director of the research component of this project and professor at the Faculty of Art, Design and Audiovisual Communication (FADCOM), indicates that the use of new technologies and trend studies on forms of learning in recent generations, applied to project components related to MIDI-AM, seeks to create applications that can be used on cell phones or tablets with innovative forms of playful learning. A fundamental characteristic of the Tití app is that it is built from a model based on user needs, comprising five moments: Conceptualization, User, Design (prototype), Development, and Publication. The app is extremely important because it acts at a fundamental stage in the cultural formation of the human being and fosters in children a love for reading, good writing, and caring for wildlife and the environment. The Tití app is designed for boys and girls from second to seventh grade (6–12 years old) who need reinforcement in the reading and writing process (see Fig. 5). It includes new technologies with mobile and fixed devices as resources that enhance learning. More extensive information is available at: http://titiapp.ec/. The game takes place on a map of Ecuador, where the children rescue 12 animals in danger of extinction. The Tití monkey (known as the squirrel monkey) accompanies them on their adventure. The children go through four scenarios (Coast, Sierra, Amazon, and Galapagos) and solve various exercises in 12 stations with three levels of difficulty. Each time a station is completed, an animal at risk of extinction is rescued.
“In Design, we work with students throughout the process of conceptualizing visual and narrative resources, to build the interface that connects the child or the user with all the possibilities that this tool provides,” explained Ing. Andrés Caicedo, multimedia graphic designer in the Graphic Design program (Quito).
Fig. 5. a) Titíbot without Tablet, b) social interaction, c) social interaction, d) social interaction.
4 Conclusions
In this paper, we presented the hardware (Titíbot) and software (Tití app) used to build a robotic platform that implements social interaction to strengthen children’s literacy skills. The platform is an innovative solution that integrates new technologies; it not only helps teachers and parents, but also helps to improve social interaction between children and their handling of non-traditional interfaces assisted by ICTs. The learning processes are reinforced by interactive games on the tablet, with sound and image effects accompanied by the movements of the robot, which attract the attention of children and thus facilitate teaching and learning in a playful way. The platform is not limited to simple tasks: it is designed for progressive advancement and also allows the progress of its users to be monitored. Interaction with the Tití app and Titíbot draws the attention of children, which leads to higher levels of participation and commitment, resulting in new models of learning and participation processes.
References
1. Paillacho Chiluiza, D.F., Solorzano Alcivar, N.I., Paillacho Corredores, J.S.: LOLY 1.0: a proposed human-robot-game platform architecture for the engagement of children with autism in the learning process. In: Botto-Tobar, M., Zamora, W., Larrea Plúa, J., Bazurto Roldan, J., Santamaría Philco, A. (eds.) ICCIS 2020. AISC, vol. 1273, pp. 225–238. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-59194-6_19
2. Jadán-Guerrero, J., Guerrero, L., López, G., Cáliz, D., Bravo, J.: Creating TUIs using RFID sensors: a case study based on the literacy process of children with down syndrome. Sensors 15(7), 14845–14863 (2015). https://doi.org/10.3390/s150714845
3. Park, H.W., Grover, I., Spaulding, S., Gomez, L., Breazeal, C.: A model-free affective reinforcement learning approach to personalization of an autonomous social robot companion for early literacy education. In: Proceedings of the AAAI Conference on Artificial Intelligence, 01 July 2019, vol. 33, pp. 687–694 (2019). https://doi.org/10.1609/aaai.v33i01.3301687
4. Gordón, J., et al.: Psycho-pedagogical recovery tool based on the user-centered design. In: Basantes-Andrade, A., Naranjo-Toro, M., Zambrano Vizuete, M., Botto-Tobar, M. (eds.) TSIE 2019. AISC, vol. 1110, pp. 36–44. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-37221-7_4
5. Park, H.W., Grover, I., Spaulding, S., Gomez, L., Breazeal, C.: A model-free affective reinforcement learning approach to personalization of an autonomous social robot companion for early literacy education. In: Proceedings of the AAAI Conference on Artificial Intelligence, 01 July 2019, vol. 33, pp. 687–694 (2019). https://doi.org/10.1609/aaai.v33i01.3301687
6. Ceccarelli, M.: Advances in the mechanical design of robots. Inventions 3(1), 3010010 (2018). https://doi.org/10.3390/inventions3010010
7. Paillacho, D.: Designing a robot to evaluate group formations. Ph.D. [Dissertation]. Universidad Politécnica de Cataluña, Barcelona (2019). https://upcommons.upc.edu/handle/2117/134749?show=full
8. ROBOTIS e-Manual: Servo ROBOTIS DYNAMIXEL XL-320, China. https://emanual.robotis.com/docs/en/dxl/x/xl320/
9. Robotclub Malaysia. Dynamixel Servo Actuator. http://site.robotclub.com.my/main/3150/index.asp?pageid=186221&t=dynamixel-servo-actuator
Structural Bionic Method of Climbing Robot Based on Video Key Frame Extraction
Xinxiong Liu and Yue Sun(B)
Industrial Design Department, School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
{xxliu,m201970393}@hust.edu.cn
Abstract. This paper analyzes the current state of research on bionic climbing robots and the methods for collecting and processing animal climbing video in the bionic design process. Taking the sloth as the bionic research object and a video of it climbing a tree as experimental material, we split the video into a sequence of images and extract the key frame images that represent the sloth’s key postures. We then analyze its motion state and the different stages of climbing the tree, and combine them with a simplified sloth skeleton structure to carry out structural bionic research, so as to provide a reference for the structural bionic design of related products.
Keywords: Bionic climbing robot · Key frame extraction · Structural bionics
1 Introduction
With the rapid economic development of society, people increasingly hope that robots can replace humans in dangerous work and repetitive tasks. Organisms have undergone long-term evolution through natural selection, and their unique characteristics provide many useful references for robot design. Compared with other robots, bionic robots are highly flexible and adaptable, and can complete specific tasks more efficiently. Some locations are difficult to reach, and work in complex environments is difficult and dangerous; bionic climbing robots can replace people in tasks at height, such as tree climbing, fruit picking, military detection, and building quality inspection. In the existing bionic design process of climbing robots, motion capture and X-ray imaging are often used to collect animal motion data. Although these techniques are accurate, they are complicated to operate and not universally applicable, which makes it difficult to collect data on undomesticated wild animals. There is a great need for a simple and efficient bionic design method for collecting and analyzing animal motion data. Therefore, we use the more easily available animal climbing video as the research object and extract its key frame images as reference material for motion state analysis and structural bionics.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 150–157, 2021. https://doi.org/10.1007/978-3-030-79997-7_19
2 Theory of Bionic Climbing Robot and Key Frame Extraction
2.1 Bionic Climbing Robot
Bionic robots are an important research branch in the fields of bionic design and robotics. Climbing robots are special industrial robots developed over the past thirty years. They can be used for many tasks that involve safety risks for human operators. Climbing robots designed with bionic methods generally focus on the grasping methods and motion principles of animals in order to optimize the mechanical structure of the robot. According to the bionic object, robots that can climb vertical walls or pole-shaped objects fall mainly into four categories. The first is the gecko-like climbing robot [1–3]. Because the gecko has feet with an excellent adhesive structure and a flat, slender body, it can overcome gravity and move freely and quickly on smooth surfaces such as walls and ceilings; robots modeled on it generally perform well. The second is the snake-like climbing robot [4]. The climbing principle of snakes is quite different from that of other animals. After winding themselves around an object, they rely on the friction between the inner side of the body and the object to stabilize themselves, and push themselves upward in a spring-like manner. Snake-like robots have many degrees of freedom, which makes them difficult to control; as they begin to be used in real environments, more complex three-dimensional motions have been studied. The third is the ape-like climbing robot [5–7]. Primates are the closest creatures to humans. Their movement methods include bipedal walking, quadrupedal walking, and the more complex arm-swinging (brachiation) and quadrupedal climbing motions. The motion of apes is an important and difficult problem in climbing robot research, and it also provides a reference for research on the more difficult humanoid robots. The fourth is the insect-like climbing robot [8–11].
Related research mostly studies the adhesion principle between insects and the surfaces of objects. Almost all insects have claw structures at the ends of their feet. These differ from the claws of cats: the tiny and numerous claws do not need to penetrate the surface of the object, but only need to hook onto its roughness to stabilize the insect, which makes them more adaptable for climbing on natural surfaces.
2.2 Key Frame Extraction
To conduct bionic research on animals, their motion and structural characteristics must be applied to the design of climbing robots. However, the motion recorded in video and motion capture data is raw data that lacks directly usable information. A frame is the smallest unit of a video or animation, a single image. A key frame is a frame that represents a key posture in the process of motion. Since key frames summarize the original motion process, focusing on them reduces the complexity of motion analysis, and key frame extraction is therefore widely used in fields such as content-based video retrieval, motion synthesis and editing, and animation production. To extract key frames from a video, the video must first be split into a sequence of images. The video sequence images are projections of the actual objects onto a two-dimensional plane, and the motion characteristics of moving objects in the scene are obtained by analyzing them. Depending on whether the camera moves relative to the scene, video images are divided into static scene and dynamic scene images. In a static scene the camera remains relatively still and the video records a fixed scene. A dynamic scene involves relative movement between the camera and the target. Although there has been much research on target detection and background subtraction with fixed cameras, effective methods for moving cameras are still lacking. In 2018, Yazdi surveyed the latest methods of moving target detection in video sequence images captured by moving cameras [12]. Key frame extraction methods can be divided into equal-interval extraction and adaptive extraction. Because the rhythm and speed of a motion are never exactly uniform, equal-interval extraction often fails to select the most representative key frames. If the video content moves at a constant speed, this method can be considered, since it saves a lot of time. Adaptive key frame extraction can be divided into three types. The first is based on content analysis [13]: it studies changes in image information such as color and texture and can extract key frames that represent important content, but it tends to select too many key frames when the camera shot changes. The second is based on motion analysis [12, 14]: it extracts key frames according to the motion state of the camera shot or the object and can analyze the motion state of different objects, but a single viewpoint represents 3D motion poorly. The third is based on cluster analysis [15, 16]: it groups similar frames and extracts a key frame from each class, which avoids repetition between similar frames, but it tends to ignore the relationship between content and time in the video.
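To make the content-analysis family of methods more concrete, the following minimal Python sketch selects a new key frame whenever the intensity histogram of a frame differs sufficiently from that of the last selected key frame. It is not the method of any of the cited works: the tiny list-of-lists frame representation, the number of histogram bins, and the threshold are all assumptions chosen to keep the example self-contained.

```python
# Illustrative sketch: frames are small grayscale images given as lists of rows
# with pixel values in 0..255. Bin count and threshold are arbitrary choices.

def histogram(frame, bins=8):
    """Normalized intensity histogram of one grayscale frame."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for pixel in row:
            counts[min(pixel * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 = identical, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def content_key_frames(frames, threshold=0.5):
    """Return 1-based indices of frames whose histogram differs from the
    last selected key frame by more than `threshold`."""
    if not frames:
        return []
    keys = [1]
    reference = histogram(frames[0])
    for index, frame in enumerate(frames[1:], start=2):
        current = histogram(frame)
        if histogram_distance(reference, current) > threshold:
            keys.append(index)
            reference = current
    return keys


if __name__ == "__main__":
    dark = [[10, 20], [15, 25]]
    bright = [[200, 220], [210, 230]]
    print(content_key_frames([dark, dark, bright, bright, dark]))
```

The drawback noted above is visible in this sketch: any abrupt change of shot alters the histogram and triggers a new key frame, whether or not the animal’s posture has changed.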
3 Key Frame Images Extraction
Before key frame images can be extracted from the video, the bionic object must be selected. Because the sloth usually hangs from trees and climbs slowly, images in a recorded tree-climbing video are clearer than those of fast-climbing animals such as cats, so it is an appropriate preliminary bionic object. We searched the Internet for videos of sloths climbing trees and selected a 9.22 s video in which the sloth’s back faces the camera, the camera shakes little, and the image is sharp. Because the overall movement of a sloth climbing a tree is relatively slow, the video was split in Premiere at 24 frames per second, the basic frame rate of animation production, into 233 frames, each 600 × 700 pixels in size. Key frame images are selected at the turning points of the sloth’s motion state within the 233 frames. Take frames 1–41 of the video sequence as an example. The body of the sloth moves to the left throughout frames 1–24 and reaches its leftmost point in frame 24. In frames 24–29, the sloth gradually loosens the grip of its right hind claw on the trunk, releasing it completely in frame 29. In frames 29–32 the sloth gradually straightens its right hind leg, which is fully straight in frame 32. In frames 32–41, the sloth moves its right hind leg upward, reaching the highest point in frame 41. With this method, frames 1, 24, 29, 32, 41, 80, 85, 94, 105, 149, 153, 158, 165, 195, 201, 207 and 220 are extracted as the key frame images of the video sequence, 17 key frames in total (Fig. 1).
Fig. 1. Key frame images selected from 1–41 video sequence images.
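In the paper the turning points are identified by observing the image sequence. If a coordinate of one body part were tracked in every frame, the same idea could be expressed as a change of motion state between consecutive steps, as in the hedged Python sketch below. The tracked coordinate, the tolerance, and the function name are assumptions for illustration only.

```python
# Illustrative sketch: `positions` holds one tracked coordinate per frame
# (for example the horizontal position of the trunk). The tolerance and the
# sample data are made up for this example.

def turning_points(positions, tolerance=0.5):
    """Return 1-based frame numbers where the motion state changes.

    The motion state of each step is -1 (decreasing), 0 (still) or
    +1 (increasing); a frame is a turning point when the state of the
    step leaving it differs from the state of the step arriving at it.
    The first and last frames are always included.
    """
    if len(positions) < 2:
        return list(range(1, len(positions) + 1))

    def state(delta):
        if delta > tolerance:
            return 1
        if delta < -tolerance:
            return -1
        return 0

    states = [state(b - a) for a, b in zip(positions, positions[1:])]
    keys = [1]
    for i in range(1, len(states)):
        if states[i] != states[i - 1]:
            keys.append(i + 1)  # frame shared by step i-1 and step i
    keys.append(len(positions))
    return keys


if __name__ == "__main__":
    # Moves left, pauses, then moves right again.
    print(turning_points([10, 8, 6, 4, 4, 4, 6, 8]))
```

A real sequence would need one such signal per body part (trunk, each limb), with the key frames taken as the union of their turning points.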
When the differences in frame number between adjacent key frames are calculated, the intervals 1–24, 41–80, 105–149 and 165–195 turn out to be larger than those between the other adjacent key frames, and the sloth is moving its body during these intervals. The selected key frame images can therefore be divided into four motion stages. The first stage has a total of 40 frames, about 1.67 s: the sloth’s body moves to the left in frames 1–24, and the right hind leg moves upward in frames 29–41. The second stage has a total of 64 frames, about 2.67 s: the sloth’s body moves up in frames 41–80, and the right front leg moves up in frames 85–105. The third stage has a total of 60 frames, about 2.5 s: the sloth’s body moves to the right in frames 105–149, and the left hind leg moves up in frames 153–165. The fourth stage, frames 165–220, has a total of 55 frames, about 2.29 s. The average of these four stages is 54.75 frames per stage, about 2.28 s, and together they constitute one period of the sloth’s tree-climbing motion (Fig. 2).
Fig. 2. Sloth motion state analysis on the extracted 17 key frame images.
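The stage lengths and their average can be checked with a few lines of arithmetic. The Python sketch below uses the stage boundary frames 1, 41, 105, 165 and 220 and the 24 frames per second given in the text; the function names are illustrative. With these boundaries the fourth stage comes out at 55 frames, which is consistent with the stated average of 54.75 frames.

```python
FPS = 24  # frame rate used when splitting the video

# First and last key frame of each motion stage, as listed in the text.
STAGE_BOUNDARIES = [1, 41, 105, 165, 220]


def stage_lengths(boundaries):
    """Number of frames between consecutive stage boundaries."""
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


def to_seconds(frames, fps=FPS):
    """Convert a frame count to seconds at the given frame rate."""
    return frames / fps


if __name__ == "__main__":
    lengths = stage_lengths(STAGE_BOUNDARIES)
    for number, frames in enumerate(lengths, start=1):
        print(f"stage {number}: {frames} frames, {to_seconds(frames):.2f} s")
    mean_frames = sum(lengths) / len(lengths)
    print(f"mean: {mean_frames} frames, {to_seconds(mean_frames):.2f} s")
```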
If key frame images are instead extracted at equal intervals, one image is selected every 13 frames from the 233 video sequence images, namely frames 1, 14, 27, 40, 53, 66, 79, 92, 105, 118, 131, 144, 157, 170, 183, 196, 209 and 222, for 18 key frames in total. Comparing the first five key frame images extracted by the two methods shows that the equal-interval key frames 14 and 27, and 40 and 53, are very similar, whereas the sloth’s motion state changes gradually across the key frames extracted at motion turning points. Although equal-interval extraction selects one more key frame image, it summarizes the sloth’s tree-climbing motion less well than extraction at motion turning points (Fig. 3).
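The equal-interval selection is simple enough to reproduce directly. The short Python sketch below regenerates the 18 equally spaced frame numbers and, for contrast, measures how unevenly the turning-point key frames listed earlier are spaced; the helper names are illustrative.

```python
TOTAL_FRAMES = 233
INTERVAL = 13

# Key frames selected at motion turning points, as listed in the text.
TURNING_POINT_KEYS = [1, 24, 29, 32, 41, 80, 85, 94, 105,
                      149, 153, 158, 165, 195, 201, 207, 220]


def equal_interval_keys(total_frames, interval, first=1):
    """1-based frame numbers sampled every `interval` frames."""
    return list(range(first, total_frames + 1, interval))


def gaps(keys):
    """Number of frames between each pair of consecutive key frames."""
    return [b - a for a, b in zip(keys, keys[1:])]


if __name__ == "__main__":
    equal = equal_interval_keys(TOTAL_FRAMES, INTERVAL)
    print(len(equal), equal)
    turning = gaps(TURNING_POINT_KEYS)
    print(min(turning), max(turning))
```

The turning-point gaps range from 3 to 44 frames, which is why a fixed step of 13 frames oversamples the slow body shifts and can miss the short limb movements.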
Fig. 3. The first five key frame images extracted at equal intervals and at motion turning point.
4 Structure Model Extraction and Transformation
Before extracting the motion structure model, we need to know the skeleton structure of the sloth. The sloth’s skeleton is similar to that of most mammals. Its trunk is mainly composed of the vertebral column, thoracic cavity, and abdominal cavity, and its limbs are mainly composed of upper arms, forearms, and palms. We simplified it into a structural model in which bones are replaced by black straight lines, joints are replaced by black dots, and the trunk is marked in gray. Only the spine is retained in the trunk, and palms and claws are omitted from the limbs. This simplification facilitates both the extraction of the structural model from key frame images and the structural bionics, in which it will be transformed into a mechanical structural model (Fig. 4).
Fig. 4. The skeleton structure of the sloth (left) and the simplified structure model (right).
When extracting the structure from the key frame images, a stationary point in the video scene must be found in order to re-position the key frame images, because the camera shakes slightly. Taking the 6th key frame image as the demarcation point, the earlier key frames use the small tree in the background, marked with yellow strips in Fig. 5, and the subsequent key frames use the large spots on the trunk, marked with blue squares, so that each key frame image is re-positioned relative to these marks. Comparing the positions of the sloth in the first and last key frames against the red horizontal dotted line shows that the sloth has moved up nearly half the length of its body in one period of tree-climbing motion. After the key frame images are re-positioned, the structure is extracted on the basis of the simplified structure of the sloth, as shown in Fig. 5.
Fig. 5. Structure is extracted from key frame images based on the simplified structure of sloth.
After the structure has been extracted from the key frame images, it needs to be transformed into a mechanical structure, and a simplified mechanical structure model needs to be constructed before the complex robot design. We use Creo to model it: the gray oval represents the simplified body, the black straight rods represent the limb segments, and the white cylinders replace the joints. Some clearance is left at the connection between each black straight rod and the gray block to allow the rod to rotate. Since the key frame images are extracted from a single viewpoint only, we assume that the model climbs only in a two-dimensional plane. The range of motion of its right fore limb and right hind limb is the red area shown in Fig. 6. Animal joints generally have a limited range of rotation, so when building the structural model it is necessary to impose a rotation range constraint on each component.
Fig. 6. Simplified mechanical structure model (left) and range of limb motion (right).
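The re-positioning of key frame images against a stationary landmark, described earlier in this section, amounts to a translation of each frame’s coordinates. The Python sketch below shows one way this could be written, assuming the landmark and the joint positions have already been marked in pixel coordinates for every key frame; the data layout and function names are hypothetical and the numbers are invented.

```python
# Illustrative sketch: coordinates are (x, y) pixels; all values are made up.

def reposition(points, reference, target):
    """Shift `points` so that `reference` lands on `target`.

    `reference` is the pixel position of the stationary landmark in this
    key frame, `target` is where that landmark sits in the chosen base frame.
    """
    dx = target[0] - reference[0]
    dy = target[1] - reference[1]
    return [(x + dx, y + dy) for x, y in points]


def align_key_frames(key_frames, base_index=0):
    """Re-position every key frame onto the landmark of the base frame.

    Each key frame is a dict with a 'landmark' point and a list of 'joints'.
    Frames must share the same landmark for the result to be meaningful.
    """
    target = key_frames[base_index]["landmark"]
    return [reposition(frame["joints"], frame["landmark"], target)
            for frame in key_frames]


if __name__ == "__main__":
    frames = [
        {"landmark": (100, 50), "joints": [(120, 80)]},
        {"landmark": (103, 48), "joints": [(125, 70)]},  # camera shifted
    ]
    print(align_key_frames(frames))
```

Because the paper switches landmarks at the 6th key frame, chaining the two groups this way would require at least one key frame in which both landmarks are marked; the sketch handles a single landmark and corrects translation only, not rotation or zoom.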
Based on the structures transformed from the key frame images, the first and last frames of the four motion stages, frames 1, 41, 105, 165 and 220, are selected as the main reference for the subsequent structural bionics. In the Creo assembly, pin constraints are applied at the white shaft parts, so that the rotation axes coincide and the parts lie in the same plane. The position of each part can then be changed with reference to the structures extracted above. These simplified structures show that the sloth first moves the center of gravity of its body to the left, and then the right limbs start to move upward, first the right hind leg and then the right fore leg. Similarly, to move the left limbs, the center of gravity must first be moved to the right (Fig. 7).
Fig. 7. Extracted and transformed structure model.
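The planar rod-and-pin model with rotation range constraints can also be explored numerically before it is built in CAD. The Python sketch below computes the joint positions of a limb made of rigid rods connected by pin joints, clamping each joint angle to an allowed range first. The link lengths, angle limits, and function names are assumptions for illustration and do not come from the Creo model.

```python
import math

# Illustrative sketch: link lengths and joint limits are made-up values.

def clamp(angle_deg, lower_deg, upper_deg):
    """Restrict a joint angle to its allowed rotation range."""
    return max(lower_deg, min(upper_deg, angle_deg))


def limb_points(base, lengths, angles_deg, limits_deg):
    """Planar forward kinematics of a chain of rods joined by pin joints.

    `angles_deg[i]` is the rotation of rod i relative to the previous rod
    (the first is relative to the x axis); each is clamped to `limits_deg[i]`
    before use. Returns the joint positions from the base to the tip.
    """
    x, y = base
    heading = 0.0
    points = [(x, y)]
    for length, angle, (lower, upper) in zip(lengths, angles_deg, limits_deg):
        heading += math.radians(clamp(angle, lower, upper))
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points


if __name__ == "__main__":
    limits = [(-90, 90), (0, 120)]
    # Requested angles exceed the limits and are clamped to 90 and 0 degrees.
    print(limb_points((0.0, 0.0), [1.0, 1.0], [200, -30], limits))
```

Sweeping the two angles over their allowed ranges and collecting the tip positions would trace out a reachable region of the kind shown in red in Fig. 6.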
5 Conclusion
In this paper we extracted the motion key frames from a video of a sloth climbing a tree, completed a preliminary structure extraction and transformation, and reached the following conclusions. The first concerns the motion video of the bionic object. Whether existing footage is used or the researchers record it themselves, the video should, as far as possible, have high definition, a high frame rate, and a fixed camera position. It should also be taken from the angle at which the motion state of the bionic object is most clearly observed, generally the back or the side, with the camera directly facing the object. If the camera shakes, a stable object in the video scene must be found and used as a reference to re-position the key frame images. Second, because animal climbing is mostly a regular, periodic motion, we use the turning points of the bionic object’s motion state as the selection criterion for key frames, that is, we observe when a certain part of the animal’s body reaches a limit position. Although equal-interval extraction yields more key frames, they represent the sloth’s tree-climbing motion less well than the key frames extracted at motion turning points, because the actions in an animal’s motion do not all have the same duration. Third, owing to the similarities between the animal skeleton model and the mechanical model, joints can be replaced by rotating parts and bones by connecting rods. Before extracting the motion structure from the key frame images, it is necessary to know the animal’s skeleton structure and to simplify it in order to carry out preliminary structural bionics. Our research still has many shortcomings; once the motion characteristics are fully understood, the structure of each part will be gradually improved to realize the final bionic climbing robot design.
Prototype System for Control the ScorBot ER-4U Robotic Arm Using Free Tools
Elizabeth Chávez-Chica1, Jorge Buele2(B), Franklin W. Salazar1, and José Varela-Aldás2
1 Universidad Técnica de Ambato, Ambato 180103, Ecuador
{jchavez5315,fw.salazar}@uta.edu.ec
2 SISAu Research Group, Facultad de Ingeniería y Tecnologías de la Información y la Comunicación, Universidad Tecnológica Indoamérica, Ambato 180212, Ecuador
{jorgebuele,josevarela}@uti.edu.ec
Abstract. Controllers offered by manufacturers are costly for the user and limit further development. In this work we therefore propose an electronic controller for operating the ScorBot ER-4U robotic arm. The programs, written in C in the ESP-IDF development environment, allow the arm to be manipulated and controlled. The result is open-license software whose code and circuits can be modified, which supports the growth of robotics knowledge in the research field. The tests carried out validate the proposal, both in operation and in the response times obtained.
Keywords: ScorBot · MQTT · Servo DC motors · Free software
1 Introduction
Research in automation and robotics has enabled the growth of many companies dedicated to manufacturing electrical and electronic equipment [1, 2]. Executing processes automatically or under computer control requires integrating knowledge from physics, mechanics, electronics and computer science [3]. The industrial sector develops robotic equipment for a wide range of applications; however, the software needed to operate it is licensed and therefore comes at a cost. Developing automation and robotics applications on platforms that rely on paid proprietary software limits research on, and free modification of, the robots' components. The use of free software in academia and research centers enables collaborative projects and promotes applications in various fields of science. In robotics, for example, robotic arms are common in industrial environments because they perform well on large-scale production lines [4]. The ScorBot ER-4U robotic arm is one of the most widely used, although it is designed to be handled only by its proprietary software (ScorBase or RoboCell) [5].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 158–165, 2021. https://doi.org/10.1007/978-3-030-79997-7_20
This situation limits the range of possible applications and their features [6]. For example, to integrate programs that use RS232, TCP/IP, RS485 or other network- and application-layer communication protocols of the OSI model with the robot arm, ScorBase is required as the interface between the host and the arm's controller [7]. The mandatory use of proprietary software makes the response times of motion commands issued by third-party programs dependent on that software [8]. Several investigations describe projects specific to ScorBot robots. In [9], the robot arm is analyzed with a focus on its cost-effectiveness through reconditioning and programming methods. The position graphs obtained with RoboCell show that the robot works correctly and accurately, although a runtime error occurs during operation. In [7], the kinematic model of the ScorBot 4PC manipulator, a robot with five degrees of freedom, is developed and implemented in MATLAB, and the models are processed and presented in a graphical interface built with GUIDE. In [10], a programming method for the remote operation of a ScorBot ER-4U is proposed that allows the human operator to create routines for the robot. Two interfaces, client and server, were developed in LabVIEW and exchange information over the TCP/IP communication protocol. The free hardware and software controller described in the present project removes these design barriers and opens the way to new applications that manage network protocols directly, using only one electronic board.
The project implements a controller that allows the arm to be operated either independently, from routines held in its own memory, or from an external host that sends the routines to be executed. This document consists of five sections: Sect. 1 is the introduction, Sects. 2 and 3 describe the hardware and the software, Sect. 4 shows the results obtained from the research and Sect. 5 sets out the respective conclusions.
2 Hardware
The robot arm controller is designed according to the system architecture shown in the block diagram in Fig. 1.
2.1 Electronic Control Interface
The electronic control interface is the group of wired connections required between the controller and the electrical and electronic devices integrated into the robot arm. The interface is based on the DD50 connector, the male terminal of a cable that gathers the connections of every electronic element in the robotic arm. The control module is designed with a female DD50 port that allows direct connection to the arm, using the same interface designed by the manufacturer, Intelitek.
Fig. 1. Control environment architecture.
2.2 Electronic Module
The electronic module is the set of analog and digital components that manages the exchange of information between the robotic arm and the control terminal and that executes the positioning and motor protection processes using microcontrollers. The core of the electronic module requires 39 control signals, including those for the WiFi communication modules and for the current sensors on the printed circuit, which monitor the behavior of the motors and detect impacts. The control module used in this research is the ESP32 microcontroller, which is the core for data processing and for handling the peripherals integrated in the robot arm. Because a single device does not have enough pins, two microcontrollers are used, a master and a slave, intercommunicated through the RS232 protocol. The slave controller generates the electrical signals required to move the ScorBot motors, processes the analog signals from the current sensors, detects impacts and manages the transmission of alerts when the motors are shut down in an emergency.
2.3 Primary Board
The primary board is the printed circuit that contains all the electronics of the MQTT module except the DD50 interface for connection to the ScorBot. The circuit is built on a square 6-layer PCB with a total area of 100 cm². The primary board is powered through a two-pin WJ142V connector, for which an external 24 V DC source must be used. The designed module has three input and output interfaces. Port 6 runs in parallel with port 13 and allows test terminals to be connected for measurements on the pins of the DD50 connector. Port 13 consists of three horizontal male headers that accept the angular interface, a secondary electronic board that connects to the ScorBot.
Port 7 is a 12-pin female header arranged in two rows, each row connecting to 1 ground pin and 4 pins of the master and slave controllers. This connection provides communication ports to the ESP32 microcontrollers and is used to program them.
2.4 Angular Interface
The angular interface is a printed circuit board used as an adapter to connect the primary board PCB to the male DD50 connector of the ScorBot ER-4U. The board is rectangular, 4 cm high by 10 cm wide (40 cm²), and is connected to the primary board along its longest side. The final design of the device is shown in Fig. 2, which shows the distribution of the elements that make up the control module.
Fig. 2. Final design of the prototype developed.
3 Software
3.1 Communication Interface
The communication interface is the group of protocols and transmission media used for the bidirectional transfer of data between the computer or host and the controller card. Selecting WiFi as the transmission medium makes it possible to build systems around the robotic arm that use application-layer protocols of the OSI model, in this case the MQTT protocol. This communication guarantees the transmission of information between multiple nodes under certain parameters. MQTT allows the integration of the SSL or TLS protocol to establish secure communication, and it requires less bandwidth than WebSockets.
3.2 Control Firmware
The control firmware is the software created to control the peripherals that make up the robotic arm. The software is divided into two sections: the first controls the peripherals connected to the master and the second those connected to the slave. The programs are written in C in the ESP-IDF development environment. The master's firmware manages the communication with the MQTT broker and internally tracks the position of the arm in order to generate the required movements in the robot axes. The slave firmware converts the analog signals from the current sensors into digital signals, switches the motors on and off and controls their direction of rotation.
3.3 Broker
The broker is a service installed in an operating system that receives the messages sent by MQTT clients and distributes them among those clients according to configuration rules, in a publish and subscribe scheme. The broker service is installed on a Linux operating system, which involves no license costs and allows the source code to be modified according to the needs of the project. The selected broker is EMQ X, a server that allows rules and applications to be configured through a web interface.
3.4 Control Host
The control host is an external device able to act as an MQTT client over the TLS v1.2 protocol. This terminal can be implemented on a microcontroller, a single-board computer or a personal computer. In the present investigation, the same computer that hosts the MQTT broker is used, since this reduces operating costs and guarantees adequate processing capacity. The programming environment selected for developing the interface is Node-RED, a web programming tool that runs on a Node.js server and uses a hybrid programming language based on JSON. The designed interface (HMI) is accessible through a web browser at the address http://IP_SERVER:1880/ui/. It is divided into three sections: movement control, function control and positioning control. The movement control consists of an indicator LED and two control buttons for each axis of the arm, one for the clockwise and one for the counter-clockwise direction. The indicator lights take different colors according to the working state of the assigned motor.
The positioning control panel consists of seven numeric inputs: six, one assigned to each axis, and a seventh used to set the travel speed. It also has a stop button used as an emergency stop, as shown in Fig. 3. Each numeric input consists of a keyboard entry field and two buttons that increase or decrease the position by one unit. The function control corresponds to the function panel, which can be used to program movement routines using memories stored in a database.
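As a rough illustration of what the positioning panel could hand to the broker, the sketch below builds one JSON message carrying six axis targets and a travel speed. The paper does not give its topic names or payload layout, so the topic string, the field names and the function itself are assumptions; publishing the message would additionally need an MQTT client library, which is not shown.

```python
import json
from typing import Dict, Tuple

# Axis names follow the columns of Table 1; the lowercase keys are an assumption.
AXES = ("base", "shoulder", "elbow", "pitch", "roll", "gripper")


def build_position_command(targets: Dict[str, int], speed: int) -> Tuple[str, str]:
    """Return a (topic, JSON payload) pair for one positioning request."""
    missing = [axis for axis in AXES if axis not in targets]
    if missing:
        raise ValueError(f"missing axis targets: {missing}")
    if speed <= 0:
        raise ValueError("speed must be positive")
    payload = {
        "positions": {axis: int(targets[axis]) for axis in AXES},
        "speed": int(speed),
    }
    topic = "scorbot/positioning/set"  # hypothetical topic name
    return topic, json.dumps(payload, sort_keys=True)
```

Keeping all seven values in one message mirrors the seven numeric inputs of the panel; a per-axis topic layout would be an equally plausible design.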
Fig. 3. Positioning section of the HMI Demo interface.
4 Experimental Results
The final result of the operation of the system is shown in Fig. 4. On the left is the graphic interface that controls the movements of the robotic arm, and on the right is the image from a camera that captures those movements. The camera image shows that the size ratio between the robotic arm and the MQTT controller is ergonomic. The electronic module, encapsulated in a blue ABS housing, is connected directly to the 110 VAC power supply and to the robot arm via the DD50 terminal on the white cable. The connection between the computer and the electronic module is made over a wireless network using the 802.11n protocol. The computer shows the demonstration graphic interface, built on a Node.js server using the Node-RED tool. The control application allows movements, displacements and positioning to be executed on the axes of the arm.
Fig. 4. Operation of ScorBot control system by MQTT.
During the operating tests, 400 displacement requests were made for each of the axes over a period of one week. Table 1 tabulates the response times in milliseconds measured for each of the requests. In total, 2400 tests were performed to analyze the system response times. The results show a mode of less than 200 ms for the response time and a peak of 1 s caused by an error in the data network. The general summary is shown in Fig. 5.
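Table 1. Response times for requests to start and stop motors.

| Time [ms] | Base | Shoulder | Elbow | Pitch | Roll | Gripper | Speed |
|-----------|------|----------|-------|-------|------|---------|-------|
| 1–100     | 139  | 87       | 1     | 0     | 0    | 0       | 273   |
| 101–200   | 170  | 181      | 231   | 157   | 91   | 0       | 114   |
| 201–300   | 67   | 82       | 112   | 142   | 188  | 210     | 13    |
| 301–400   | 22   | 41       | 37    | 73    | 83   | 112     | 0     |
| 401–500   | 2    | 9        | 18    | 22    | 27   | 53      | 0     |
| 501–600   | 0    | 0        | 0     | 5     | 11   | 24      | 0     |
| 601–700   | 0    | 0        | 0     | 0     | 0    | 1       | 0     |
| 701–800   | 0    | 0        | 0     | 0     | 0    | 0       | 0     |
| 801–900   | 0    | 0        | 0     | 0     | 0    | 0       | 0     |
| 901–1000  | 0    | 0        | 0     | 0     | 0    | 0       | 0     |
| Total     | 400  | 400      | 400   | 400   | 400  | 400     | 400   |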
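The summary statistics quoted for Table 1 can be recomputed from the binned counts. The sketch below is a minimal illustration that transcribes the counts from the table; the helper names are not from the paper, and the share is computed against the sum of the tabulated counts for each column.

```python
from typing import Dict, List, Tuple

# Ten response-time classes of 100 ms each: (1, 100), (101, 200), ..., (901, 1000).
BINS: List[Tuple[int, int]] = [(1 + 100 * i, 100 * (i + 1)) for i in range(10)]

# Counts per class, transcribed column by column from Table 1.
COUNTS: Dict[str, List[int]] = {
    "Base":     [139, 170, 67, 22, 2, 0, 0, 0, 0, 0],
    "Shoulder": [87, 181, 82, 41, 9, 0, 0, 0, 0, 0],
    "Elbow":    [1, 231, 112, 37, 18, 0, 0, 0, 0, 0],
    "Pitch":    [0, 157, 142, 73, 22, 5, 0, 0, 0, 0],
    "Roll":     [0, 91, 188, 83, 27, 11, 0, 0, 0, 0],
    "Gripper":  [0, 0, 210, 112, 53, 24, 1, 0, 0, 0],
    "Speed":    [273, 114, 13, 0, 0, 0, 0, 0, 0, 0],
}


def modal_bin(counts: List[int]) -> Tuple[int, int]:
    """Return the response-time class (in ms) with the most requests."""
    return BINS[counts.index(max(counts))]


def share_at_or_below(counts: List[int], upper_ms: int) -> float:
    """Fraction of tabulated requests whose class ends at or below upper_ms."""
    fast = sum(c for (_, high), c in zip(BINS, counts) if high <= upper_ms)
    return fast / sum(counts)
```

Calling `modal_bin(COUNTS[name])` for each column shows which class dominates per axis, and `share_at_or_below(COUNTS[name], 600)` gives the fraction of requests answered within 600 ms.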
Fig. 5. Overview of response time statistics.
5 Conclusions
The present proposal has been designed as a low-cost alternative that avoids the use of the controller offered by the manufacturer. The electronic control module, based on the MQTT protocol, is developed using free hardware and software. The controller module is built around a dual core of ESP32-WROOM-32 microcontrollers, which allows a direct connection between peripherals through the MQTT protocol using the wireless interface of the 802.11n standard. The controller is not limited to manipulating the peripherals of the ER-4U robot arm: it can control the speed of DC motors and the position of geared motors operating at up to 24 VDC with a current consumption of 1 A. It can also read 5 VDC digital signals from sensors and connect external communication modules using the RS232 or SPI protocols. The equipment can be used freely in educational and industrial applications, and its design can be modified to evolve or to adapt to new communication protocols. The communication between the nodes takes place without bandwidth problems. The response times measured when requesting movements through the MQTT protocol, from sending a packet at the client to receiving the reply, are less than 600 ms, with a mode of 200 ms. As future work, implementation in an industrial environment is planned.
References
1. de Sá Carvalho, E.dosS., Filho, N.F.D.: Proposal for a mobile learning system focusing on the characteristics and applications practical of industry 4.0. RISTI Rev. Iber. Sist. Tecnol. Inf. 27, 36–51 (2018). https://doi.org/10.17013/risti.27.36-51
2. Luo, S., Bimbo, J., Dahiya, R., Liu, H.: Robotic tactile perception of object properties: a review. Mechatronics 48. https://doi.org/10.1016/j.mechatronics.2017.11.002
3. Gattullo, M., Scurati, G.W., Fiorentino, M., Uva, A.E., Ferrise, F., Bordegoni, M.: Towards augmented reality manuals for industry 4.0: a methodology. Robot. Comput. Integr. Manuf. (2019). https://doi.org/10.1016/j.rcim.2018.10.001
4. Buele, J., Espinoza, J., Bonilla, R., Edison, S.C., Vinicio, P.L., Franklin, S.L.: Cooperative control of robotic spheres for tracking trajectories with visual feedback. RISTI Rev. Iber. Sist. Tecnol. Inf. E19, 134–145 (2019)
5. Jha, A., Chiddarwar, S.S., Alakshendra, V., Andulkar, M.V.: Kinematics-based approach for robot programming via human arm motion. J. Braz. Soc. Mech. Sci. Eng. 39(7), 2659–2675 (2016). https://doi.org/10.1007/s40430-016-0662-z
6. Ochoa, L.J.D., Rudolf, T.M.: Flexible production cell applying artificial vision concepts and open source CNCs. In: ACM International Conference Proceeding Series, pp. 51–55 (2019). https://doi.org/10.1145/3354142.3354152
7. Aroca Trujillo, J.L., Serrezuela, R.R., Azhmyakov, V., Zamora, R.S.: Kinematic model of the scorbot 4PC manipulator implemented in Matlab’s Guide. Contemp. Eng. Sci. 11, 183–199 (2018). https://doi.org/10.12988/ces.2018.8112
8. Szuster, M., Gierlak, P.: Approximate dynamic programming in tracking control of a robotic manipulator. Int. J. Adv. Robot. Syst. 13 (2016). https://doi.org/10.5772/62129
9. Patil, P.V., Ohol, S.S.: Performance analysis of SCORBOT ER 4U robot arm. Int. J. Mater. Sci. Eng. 2, 72–75 (2014). https://doi.org/10.12720/ijmse.2.1.72-75
10. Franklin Salazar, L., Buele, J., Velasteguí, H.J., Soria, A., Núñez, E.E.T., Benítez, C.S., Gabriela Orejuela, T.: Teleoperation and remote monitoring of a ScorBot ER-4U robotic arm in an academic environment. Int. J. Innov. Technol. Explor. Eng. 8, 40–45 (2019)
Human Factors in Cybersecurity
Detecting Cyberattacks Using Linguistic Analysis
Wayne Patterson(B)
Patterson and Associates, 201 Massachusetts Avenue NE, Suite 316, Washington, DC 20002, USA
Abstract. Increasingly, computer users throughout the world are faced with various types of cyberattacks that might originate anywhere. In many cases, for example with ransomware, the unsuspecting user might receive a message requesting a payment of funds lest some dire consequence occur. The nature of such a message may lead to an analysis of the origin of the attack, based on the use of language in the demand. For example, if the targeted user is English-speaking, the language in the attack might have originated with a speaker of another language, who might use human translation or machine translation in the attack. In this paper, we explore the possibility of determining in part the source of an attack through linguistic analysis of sample test messages that might have been translated by human or machine methods, and from one of a range of human languages.
Keywords: Cybersecurity · Cyberattacks · Machine translation · Human translation · Human factors
1 Introduction
In [1], we considered the problem of analyzing potential cyberattacks, such as ransomware, through an analysis of the external messaging that may indicate an attack. We proposed a defense mechanism against such attacks coming from a different language base, chose a number of examples of quotations in English literature and popular culture, and devised a procedure to see if our test subjects could determine whether a perceived attack came from a human writer with a partial knowledge of English or from a machine translation. We chose Google Translate (GT) since it is universally accessible. We have developed an index which could assist the cyberdefender in classifying potential human or machine cyberattacks where the attack message might not have been written originally in English. To conduct this analysis, we use GT on text originally written in English, translated into another language and then back into English (we refer to this process as ABA). We also chose several widely used languages (French, Spanish, German, Russian, Chinese, Hindi) to determine the effectiveness of GT. The six test languages, when English is added, are the primary languages of approximately 30.8% of the world’s population of 7.874 billion, that is, of 2.425 billion persons [2] (Table 1).
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 169–175, 2021. https://doi.org/10.1007/978-3-030-79997-7_21
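Table 1. Prevalence of languages considered in this study.

| Language | Rank of prevalence among world languages | Approximate number of users in millions (M) |
|----------|------------------------------------------|---------------------------------------------|
| Chinese  | 1                                        | 918 M                                       |
| Spanish  | 2                                        | 480 M                                       |
| English  | 3                                        | 379 M                                       |
| Hindi    | 4                                        | 341 M                                       |
| Russian  | 7                                        | 154 M                                       |
| French   | 16                                       | 77.2 M                                      |
| German   | 17                                       | 76.1 M                                      |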
2 Levenshtein Distance
In information theory, Levenshtein distance [3] is a string metric for measuring the difference between two sequences. The Levenshtein distance between two words is the minimum number of single-character edits required to change one into the other. We have modified the Levenshtein distance (MLD) to account for the fact that, in an ABA translation, some languages interchange the position of parts of speech. We match the two strings character by character until the characters no longer match, then advance, counting the number of mismatched characters, until a match is found again. Processing continues in this way until the end of the example. The MLD is the sum of the counts for the mismatched segments.
Example:

| Original        | Of all the gin | joints | in all the | towns in all | the world, she | walks into | mine. |
|-----------------|----------------|--------|------------|--------------|----------------|------------|-------|
| After ABA       | Of all the gin | stores | in all the | cities of    | the world, she | enters     | mine. |
| Mismatch count  |                | 6      |            | 10           |                | 9          |       |

Thus MLD = 6 + 10 + 9 = 25.
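For reference, the unmodified Levenshtein distance that the MLD builds on can be computed with the usual dynamic-programming recurrence. The sketch below implements only that standard metric; it is not the author's MLD, whose segment-counting rule is described above only informally, and the function name is illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]
```

Only two rows of the distance table are kept at any time, so memory use grows with the length of the shorter string rather than with the product of both lengths.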
3 Research Approach
The purpose of this research has been to develop a metric to assist in determining the likelihood that a text observed from an external source may be an indicator of a cyberattack, either human- or machine-generated. To develop a profile against which an unrecognized text can be tested, we capture the text and submit it to the ABA test described above in an attempt to determine the original language of the possible cyberattack. We developed a series of 20 sentences in English, half appearing as quotations in English literature (Q), and the other half from English popular culture, film in this instance (F). Each sample of text was subjected to the ABA process in each of the six languages described above (Table 2).
Table 2. 20 test quotations from English literature and film scripts.

| Number | Category | Quotation | Source (author or film character) |
|--------|----------|-----------|-----------------------------------|
| T1 | F | I’m as mad as hell, and I’m not going to take this anymore! | NETWORK |
| T2 | Q | When a person suffers from delirium, we speak of madness. When many people are delirious, we talk about religion | Robert Pirsig (1948-) |
| T3 | F | Of all the gin joints in all the towns in all the world, she walks into mine | CASABLANCA |
| T4 | F | Open the pod bay doors, please, HAL | 2001: A SPACE ODYSSEY |
| T5 | F | Mrs. Robinson, you’re trying to seduce me. Aren’t you? | THE GRADUATE |
| T6 | F | Keep your friends close, but your enemies closer | THE GODFATHER PART II |
| T7 | F | If you build it, he will come | FIELD OF DREAMS |
| T8 | Q | A lie gets halfway around the earth before the truth has a chance to get its pants on | Sir Winston Churchill (1874–1965) |
| T9 | F | I have always depended on the kindness of strangers | A STREETCAR NAMED DESIRE |
| T10 | Q | Sex and divinity are closer to each other than either might prefer | Saint Thomas More (1478–1535) |
| T11 | Q | Political correctness is despotism with manners | Charlton Heston (1924–2008) |
| T12 | Q | The only way to get rid of a desire is to yield to it | Oscar Wilde (1854–1900) |
| T13 | Q | Whether you think that you can, or that you can’t, you are usually right | Henry Ford (1863–1947) |
| T14 | Q | There are no facts, only connotations | Friedrich Nietzsche (1844–1900) |
| T15 | Q | I’m living so far beyond my income that we may almost be said to be living apart | e e cummings (1894–1962) |
| T16 | Q | People demand freedom of speech to make up for the freedom of conviction which they avoid | Soren Aabye Kierkegaard (1813–1855) |
| T17 | F | Tell’em to go out there with all they got and win just one for the Gipper | KNUTE ROCKNE, ALL AMERICAN |
| T18 | F | Round up the usual suspects | CASABLANCA |
| T19 | F | Love means never having to say you’re sorry | LOVE STORY |
| T20 | Q | The greatest glory in living lies not in never falling, but in rising every time we fall | Nelson Mandela (1918–2013) |
Table 3 gives the MLD values averaged over all test sentences, each of which is translated first into one of the chosen languages and then back into English.

Table 3. Value of modified Levenshtein distance (MLD) for language pairs.

| ABA example             | Code for translation | MLD value averaged over all test entries |
|-------------------------|----------------------|------------------------------------------|
| English-French-English  | EFE                  | 15.5%                                    |
| English-Spanish-English | ESE                  | 17.0%                                    |
| English-German-English  | EGE                  | 20.7%                                    |
| English-Russian-English | ERE                  | 32.6%                                    |
| English-Chinese-English | ECE                  | 35.5%                                    |
| English-Hindi-English   | EHE                  | 30.0%                                    |
This test is designed to determine the original source language of the cyberattack. If it were possible to limit the other language to the six examples, it would allow the attacked party to narrow the range of potential attack sources. We will continue with several other analyses to assist in determining the legitimacy of the translated material.
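One way to picture how the averages in Table 3 might be used as a profile is a nearest-average ranking. The paper does not state a decision rule, so the rule below is purely an assumed illustration; the function name is hypothetical, and several averages (for example EFE and ESE) lie too close together for a single message to separate them reliably.

```python
from typing import Dict, List

# Average MLD per round-trip translation, copied from Table 3 (in percent).
AVERAGE_MLD: Dict[str, float] = {
    "EFE": 15.5,
    "ESE": 17.0,
    "EGE": 20.7,
    "ERE": 32.6,
    "ECE": 35.5,
    "EHE": 30.0,
}


def rank_language_pairs(observed_mld: float) -> List[str]:
    """Order the translation codes by how close their Table 3 average is
    to the MLD observed for a suspect message (assumed decision rule)."""
    return sorted(AVERAGE_MLD, key=lambda code: abs(AVERAGE_MLD[code] - observed_mld))
```

A ranking rather than a single answer seems the safer output here, since it lets an analyst see how close the runner-up candidates are.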
4 Reverse Translation
To provide some ground truth on the consistency of the translation, reverse translation (designated BAB) is performed with a number of the examples. Since Table 3 shows that the MLD values are lowest for EFE, ESE and EGE, we perform this analysis only for these sets of cases. The average over the tests T1–T20 is used for each language pair (Table 4).
Table 4. Value of modified Levenshtein distance (MLD) for language pairs.

| Initial translation ABA | MLD value | Reverse translation BAB | MLD value |
|-------------------------|-----------|-------------------------|-----------|
| EFE                     | 15.5      | FEF                     | 11.6      |
| ESE                     | 17.0      | SES                     | 8.2       |
| EGE                     | 20.7      | GEG                     | 8.5       |
5 Comparing Literary Quotes (Q) and Film Dialog (F)
Half of the test quotes were drawn from Q examples and half from F examples, on the assumption that most writers quoted in literature would adhere to strict grammatical standards, whereas script writers for film might be more inclined to depart from them for dramatic reasons. We therefore compared the two subsets, Q and F, to see whether the translation systems were more accurate on one of them in terms of MLD (Table 5).

Table 5. MLD scores on film dialogue (F) and literary quotes (Q)

| Language pair     | MLD score on F | MLD score on Q | Total MLD score | % Difference (F, Q) |
|-------------------|----------------|----------------|-----------------|---------------------|
| English – French  | 15.2           | 15.7           | 15.5            | 0.3                 |
| English – Spanish | 17.0           | 17.1           | 17.0            | 0.0                 |
| English – German  | 23.4           | 18.9           | 20.7            | 2.7                 |
| English – Russian | 34.9           | 31.1           | 32.6            | 2.3                 |
| English – Chinese | 38.0           | 33.8           | 35.5            | 2.5                 |
| English – Hindi   | 32.1           | 28.6           | 30.0            | 2.1                 |
These results seem to indicate that the translation software performs similarly in each set of cases independently of the type of translation case.
6 Comparing the Most and Least Accurate MLD Measures
For the test bed of 20 items, T1–T20, the MLD was compared for the translation into each language. In terms of accuracy, the six languages considered essentially fell into two categories of three languages each: {French, German, Spanish} and {Chinese, Hindi, Russian}.
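Over the entire range of quotes, those with the minimal and maximal MLD values are listed in Table 6.

Table 6. Quotations among T1–T20 with minimal and maximal MLD values.

| Minimal MLD case | MLD value | Maximal MLD case | MLD value |
|------------------|-----------|------------------|-----------|
| T7               | 6.11      | T15              | 48.15     |
| T6               | 8.16      | T3               | 43.29     |
| T12              | 9.26      | T18              | 36.90     |
| T14              | 10.53     | T17              | 32.44     |
| T11              | 13.19     | T8               | 32.36     |

Table 7. Quotations among T1–T20 with exact translations by translation type

| Translation type | Number of exact translations (of 20) | Test quotation (F or Q) | Number of exact translations |
|------------------|--------------------------------------|-------------------------|------------------------------|
| EFE              | 6                                    | T11 (Q), T14 (Q)        | 4                            |
| ESE              | 5                                    | T6 (F), T7 (F)          | 3                            |
| EGE              | 4                                    | T4 (F)                  | 2                            |
| ECE              | 2                                    |                         |                              |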
The identification of the particular test items and the values of MLD are useful in refining the type of test bed that will give a greater definition of the test item. For example, the T7 test item translation is 94% correct over all the chosen ABA language translations. Thus, the T7 is probably not a good candidate to distinguish a potential hacker’s source language. Table 7 also indicates the utility of translation type as well as the test quote. For example, in six of the test cases T1–T20, the MLD is zero, thus these items would not be useful in detecting malicious attacks. This would also be true with T11 and T14 as test quotations, which would constitute perfect translations for four of our six language pairs.
7 Conclusions
Although the study is limited in the information gained from the given data set of quotations and language translation methods, it may be a useful approach in real-time detection of ransomware and other cyberattacks, in terms of narrowing the field of suspects of an attack. For example, applying the methods described in this paper to the language used in a suspected attack could provide clues as to the attacker’s language of origin. Further research could be even more useful in this regard.
References
1. Patterson, W., Murray, A., Fleming, L.: Distinguishing a human or machine cyberattacker. In: Ahram, T., Karwowski, W., Vergnano, A., Leali, F., Taiar, R. (eds.) IHSI 2020. AISC, vol. 1131, pp. 335–340. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39512-4_53
2. https://en.m.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers
3. https://en.wikipedia.org/wiki/Levenshtein_distance
Digital Image Forensics Using Hexadecimal Image Analysis
Gina Fossati, Anmol Agarwal, and Ebru Celikel Cankaya
Department of Computer Science, The University of Texas at Dallas, Richardson, TX, USA
{Gina.Fossati,Anmol.Agarwal,Ebru.Cankaya}@utdallas.edu
Abstract. Digital forensics is gaining increasing momentum today thanks to rapid developments in data editing technologies. We propose and implement a novel image forensics technique that incorporates hexadecimal image analysis to detect forgery in still images. The simple and effective algorithm we developed yields promising results identifying the tool used for forgery with zero false positives. Moreover, it is comparable to other known image forgery detection algorithms with respect to runtime performance. Keywords: Forgery detection · Image manipulation
1 Introduction
Detecting image forgery is a common problem in digital forensics that has been researched at great length. The typical approach involves analyzing the pixels that make up an image. This technique can detect many types of image forgery, including cut-and-paste editing operations in which a portion of one picture is superimposed on another. A wide variety of approaches has been explored. Much recent work [1–5, 8, 9] proposes training convolutional neural networks to determine whether an image was manipulated. Another approach [7] analyzes quality artifacts and texture features: an image is converted into its YCbCr color model [6], and then texture descriptions, histograms, and image quality artifacts are used to detect forged images.
A limitation of existing approaches is that none of them determines which editor was used to manipulate an image. Editor information could be useful for digital forensics analysis. Therefore, in this work, we propose a method to determine which editor was used to manipulate an image. We analyze the bytes of the image file instead of its pixels. This technique works because many editors leave a signature in the bytes of a modified file. Before our research, there was no method to gain editor information from a modified image file.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 176–183, 2021. https://doi.org/10.1007/978-3-030-79997-7_22
The rest of our paper is organized as follows: Sect. 2 discusses background and related work, Sect. 3 discusses experimental work and our proposed method, Sect. 4 discusses results, and finally, Sect. 5 discusses the conclusion and possible future work.
2 Background and Related Work
Digital image forgery detection has been a popular subject in recent years. With the increase in image sharing and access to image editing tools, such as GIMP and MS Paint, image manipulation has become more common. Many research tools [1–13] have been proposed that aim to detect forged images using a wide variety of approaches. Unlike our approach, these techniques detect image forgery by studying the attributes of images, such as an image’s pixels. The first group of approaches employs machine learning techniques for image forgery detection. Nguyen, Yamagishi, and Echizen [8] proposed using capsule networks to detect forged images. In their work, three capsules were used for image classification and two other capsules were used to output both forged images and authentic images. The capsules were CNNs (Convolutional Neural Networks) trained to detect forged images. Softmax and the cross-entropy loss function were used for classification [8]. At first, the model was trained using the capsule network approach, in which the output of the three capsules is dynamically routed to determine whether an image is forged or authentic. This model achieved 96.75% accuracy on a patch of a dataset when distinguishing CGI images from photographic images and 99.72% accuracy on the full dataset. Gaussian random noise and a squash [8] were then applied to improve the model’s accuracy. With this method, the model achieved 97% accuracy on a patch of a dataset with CGI and photographic images and 100% accuracy on the full dataset. Elsharkawy et al. [5] use homomorphic image processing to detect forged images. In this approach, the illumination component of images is studied. Forged images have a different illumination component compared to unmodified images. Support Vector Machine (SVM) and Neural Network (NN) classifiers are used to classify images based on the illumination component. Using SVM, this approach has 97.2% accuracy [5].
Abd El-Latif, Taha, and Zayed [4] detect image splicing, a type of digital image forgery, using deep learning and the Haar wavelet transform. In this work, images are fed into a Convolutional Neural Network (CNN) and then the Haar wavelet transform is applied. Next, the algorithm uses SVM to detect whether images are forged or authentic. Chen, Kang, Shi, and Wang [2] also propose using convolutional networks to classify forged and original images. In their method, all layers of the network are densely connected to allow the model to adapt as images change. Similar work [3] has also been done using convolutional neural networks with an AlexNet model. Many methods have been proposed to detect copy-move and copy-paste image forgeries [1, 9, 11]. Nguyen and Cao [9] use matrix decomposition and frequency transforms. They focus on detecting copy-paste manipulations of an image and use the QR and SVD methods of matrix decomposition to determine whether an image was forged. Ardakan, Yerokh, and Saffar [1] propose another method to detect copy-move image forgeries, in which a Gabor filter is applied to each block of the image and similar blocks are then found. Uneven areas of the forged image will have similar
properties, so if similar blocks are found in a certain region, those blocks could be part of the area that was copied and moved to another part of the image [1]. The other group of image forgery detection approaches focuses on quality artifacts and texture features. Manu and Mehtre [7] first convert an image into its YCbCr color model [6] and then introduce two different methods to detect manipulated images. The first method uses texture descriptions and histograms to detect forged images, and the second method uses image quality artifacts, entropy histograms and texture descriptors. Unlike what researchers have proposed so far, our approach uses image hex values to detect forgery in images. Our work seeks to provide another way to detect forged images by looking at the raw hex of images. We look for certain patterns in the hex that indicate the image was modified by image editing software. Our approach is more general in nature. In contrast to the previous works discussed, we focus on applying our detection techniques to images captured from different cameras and use an approach that, to the best of our knowledge, has not been previously explored. We extend the work of Schetinger et al. [10]. In that work, the authors provide an overview of image forensics (analysis) and image composition (modification) techniques. A forgery detection scale, FD, is introduced. FD rates techniques on their ability to provide image history and is used when examining image traces. For example, a technique that produces binary results (the image either has or has not been modified) is at FD1. More detail is introduced when moving up the scale, such as the possible location of modification in FD3 or a link to an image editor in FD4. FD describes methods that examine the pixels of an image in search of modifications such as splicing and cut-outs.
The paper goes on to discuss image composition, focusing on five main techniques: object transferring, object insertion and manipulation, lighting, erasing, and image enhancement and tweaking. It concludes that “current state-of-the-art forensics has all the basic tools it needs to be able to detect most forgeries” [10]. We took this conclusion as a challenge to approach image forgery detection from a new angle. Our method differs from those analyzed in the paper in that we examine the bytes of files, not pixels.
3 Experimental Work After viewing the hex of images modified by certain editors compared to their originals, we noticed patterns. Many editors add a specific set of bytes (a signature or “match string”) near the beginning of the image files they modify. For example, the popular photo editor GIMP inserts the string ‘GIMP’ among other bytes. We implement an image forgery detector tool using Python 3 on a Windows 10 64-bit machine; however, any machine with Python 3 should be compatible. Our tool takes as input a directory containing image files and outputs whether each supported editor was detected in each image. Additionally, our tool offers support to check just one image in the default DATAPATH (‘./Dataset/’) by adding it as a command line argument. For example, to analyze one image use ‘python ForgeryDetector.py imagefilename’. Similarly, to run on the entire DATAPATH directory, use ‘python ForgeryDetector.py’ instead. Furthermore, to evaluate the accuracy of our detection method, the script displays
output when errors (false positives or false negatives) are detected, together with the total number of images analyzed.

Table 1. Camera models and editors used in test data.
Cameras: CanonPowerShotSD1200IS, CanonPowerShotSX10IS, Samsung Galaxy S4/S5 Mini/S7/S8/S10e, iPhone 6s
Editors: MS Paint, GIMP [16], BeFunky [15], Fotor [14], Pixlr [13]
The cameras and editors used in our implementation are listed in Table 1. We used 80 original images – 10 from each of the 8 cameras. We then modified each of the 80 originals with all 5 of our supported editors. This made a total of 480 images in our dataset (80 originals plus 400 modified copies). Once the data was created, we were able to run our ForgeryDetector program to detect forgery. The algorithm we used in ForgeryDetector is simple. It starts by crawling through the entire ‘DATAPATH’ directory, separating the images by the editor used for modification. This is possible because of our naming convention, ‘imagename-editorname.extension’. We then run the byte analysis on each group of pictures by reading the raw bytes of the image. For higher resolution images (most modern images), the byte string of the entire file is very long. This makes it impractical to search images in their entirety. Fortunately, during our testing phase we discovered that most editor modifications (signatures) occur in the first 2000 characters of the file (the byte string we read). By only looking at the first 2000 characters, our algorithm finishes in reasonable time. Each 2000-character byte string is checked against each supported editor’s unique signature(s) or “match string(s).” We were careful in choosing these match strings for two reasons. First, if the string is too short, there is a risk of false positives, i.e. detecting editor modification that does not exist. Choosing a match string that is just a few bytes long makes it likely that those few bytes also appear in another image – original or modified. This means that those few bytes (our short match string) are not a good indicator that one specific editor was used, as they appear in many types of images. Therefore, match strings must be long enough that they are unique to one editor. Second, if the string is very long, it may be specific to only a small subset of images modified by editor x, resulting in false negatives, i.e.
not detecting modification in images that have been modified. We remedy this problem by having multiple match strings for a single editor. Some editors require more than one match string. This is because not all editors use the same signature every time. In fact, some editors (MS Paint and BeFunky) share signatures, which results in false negatives – images modified by MS Paint or BeFunky were not detected as modified images. Because MS Paint and BeFunky use an identical signature, it is impossible to differentiate one from the other based on byte analysis. When this happens, the best our algorithm can do is give a list of possible editors, which is still better than previous algorithms that do not give any editor information.
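The scanning step described above can be sketched as follows. This is a minimal illustration, not the ForgeryDetector source: only the ‘GIMP’ signature is taken from the text, the second editor and its match strings are made-up placeholders, and treating the “first 2000 characters” as the first 2000 bytes of the file is an assumption.

```python
from pathlib import Path

HEADER_LENGTH = 2000  # assumption: "2000 characters" read as 2000 bytes

# Only the GIMP entry comes from the paper; "ExampleEditor" is a placeholder
# showing how one editor can carry several match strings.
MATCH_STRINGS = {
    "GIMP": [b"GIMP"],
    "ExampleEditor": [b"EXAMPLE-EDITOR-1", b"EXAMPLE-EDITOR-2"],
}


def detect_editors(header: bytes, match_strings=MATCH_STRINGS) -> list:
    """Return every editor with at least one match string in the header."""
    window = header[:HEADER_LENGTH]
    return [editor for editor, signatures in match_strings.items()
            if any(signature in window for signature in signatures)]


def detect_editors_in_file(path) -> list:
    """Read only the first HEADER_LENGTH bytes of a file and scan them."""
    with Path(path).open("rb") as handle:
        return detect_editors(handle.read(HEADER_LENGTH))
```

Returning a list rather than a single name matches the shared-signature case: if two editors were given the same match string, both would be reported as possible editors.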
Fig. 1. ForgeryDetector development algorithm
The difficult part, as we discussed, is choosing the match strings. Figure 1 diagrams our process. We first look at the bytes of a modified image and make an educated guess at a good match string. This is done by comparing the modified image bytes to the bytes of the original image. Then we run ForgeryDetector with our chosen match string. If we get false positives (unmodified images and images modified by other editors detected as modified), we increase the length of the match string. If we get false negatives, we decrease the length of the match string. Although it is not outlined in Fig. 1, as we mentioned earlier, it is necessary to have multiple match strings for some editors. We only add a new match string when we get stuck in a loop where decreasing the length of the match string results in false positives and increasing the length results in false negatives. By adding a second match string, we avoid the false positives that a shorter match string gives while also avoiding false negatives, because the additional signature that the editor uses now has its own match string. Finally, if the match string results in no false positives or negatives, we are finished and have chosen a match string. Otherwise we repeat, continuing to modify the match string until there are no detection errors for any of the images. As mentioned above, MS Paint and BeFunky share a signature, making it impossible to distinguish the two. We used a method of choosing match strings that is very easy for a human. However, we believe that a machine learning algorithm will be able to distinguish unique patterns that humans cannot by using more flexible criteria. Because our match string is very strict – one continuous string of bytes (no wildcard characters in the string) that does not consider the location of the match string in the file – it may be missing an essential signature left by the editor.
We were still able to choose good match strings using strict criteria but running the algorithm with more flexible match string criteria would likely improve accuracy.
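One way to picture a single pass of this manual loop is the sketch below, which counts the errors a candidate match string produces on labelled headers and suggests the next adjustment. The function names are hypothetical, original images are assumed to carry no ‘-editorname’ suffix, and detecting the “stuck in a loop” case that calls for a second match string is left to the person running it.

```python
def editor_from_filename(filename: str) -> str:
    """Parse 'imagename-editorname.extension'; no dash means unedited ('None')."""
    stem = filename.rsplit(".", 1)[0]
    if "-" not in stem:
        return "None"
    return stem.rsplit("-", 1)[1]


def evaluate_match_string(candidate: bytes, editor: str, headers: dict) -> tuple:
    """Return (false_positives, false_negatives) for one candidate match string.

    headers maps each filename to the leading bytes of that file.
    """
    false_positives = false_negatives = 0
    for filename, header in headers.items():
        detected = candidate in header
        modified_by_editor = editor_from_filename(filename) == editor
        if detected and not modified_by_editor:
            false_positives += 1
        elif modified_by_editor and not detected:
            false_negatives += 1
    return false_positives, false_negatives


def next_step(false_positives: int, false_negatives: int) -> str:
    """Lengthen on false positives (checked first), shorten on false negatives."""
    if false_positives:
        return "lengthen"
    if false_negatives:
        return "shorten"
    return "done"
```

Checking false positives first reflects the preference stated in the results section for avoiding false positives over false negatives.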
4 Results and Comparison We used all 480 images from our dataset (outlined in Table 1) to test ForgeryDetector, and the results are promising (Table 2). In presenting the results, we group by editor
because our match strings are based on the editors – when there are false positives or negatives for a specific editor, our algorithm re-evaluates the chosen match string for that editor. As we discussed in the previous section, MS Paint and BeFunky resulted in false negatives. We will first revisit the significance of false positives and negatives. A false positive means that ForgeryDetector detected an editor that was not used. Our algorithm favors minimization of false positives over false negatives as seen in Table 2, i.e. we will not detect an editor to avoid detecting an incorrect editor. In terms of match strings, this means increasing the length of shorter match strings that result in false positives even if this results in false negatives. In the case of MS Paint and BeFunky, we were unable to add enough match strings to break out of the loop (switching between shortening and lengthening match strings infinitely) as we mentioned in the previous section.

Table 2. ForgeryDetector results
Editor     # Photos   False Pos.   False Neg.   # Correct   % Correct
None       80         0            0            80          100.00
MS Paint   80         0            50           30          37.50
GIMP       80         0            0            80          100.00
BeFunky    80         0            50           30          37.50
Fotor      80         0            0            80          100.00
Pixlr      80         0            0            80          100.00
We have chosen to minimize false positives because our purpose is academic, and we do not want to detect something that is not there. In a different setting, it may be more valuable to minimize false negatives. For example, a security-focused application may want to detect all forged images. In that case, the logic is similar to that of an Intrusion Detection System (IDS): it is bad if a forged image is not detected, so such an application prefers no (or minimal) false negatives even if that increases the false positives. To implement this scenario, a similar approach could be taken that minimizes false negatives rather than false positives: merely shorten longer match strings or, in the case of MS Paint and BeFunky, share the same match string so that both MS Paint and BeFunky are reported for images containing the shared signature. In terms of runtime, creating the match strings is very time consuming because of the human element required. As we mentioned, a machine learning algorithm may be more accurate, and it would definitely be more efficient than our “by-hand” strategy. The runtime of the ForgeryDetector tool is O(NM), where N is the total number of images in the data set and M is the number of supported editors: for each editor, we check against each match string for that editor. However, M will be small compared to a very large N and constant for any dataset once the match strings are chosen, so the runtime is essentially O(N). Our results are comparable and, in some ways, favorable to other image forgery detection algorithms. We have achieved an overall accuracy of 79.17% of predictions
being correct. More notably, we have no false positives. Most importantly, we provide the ability to detect which specific editor was used for modification. This has not been done by any prior work because it is infeasible using pixel analysis techniques. Theoretically, any editor can create an identical change on the pixels. Although more advanced editors have settings to simplify or automate editing techniques, at the base level all editors have the same function: changing the value of individual pixels. While it may be more difficult to make the same edit with “Tool A” than with “Tool B”, it is still possible to produce identical forged images with both. We must note two limitations of our project. First, our testing was on a relatively small data set, which may have introduced bias by producing match strings that are less accurate on a dataset that was not used for choosing them. However, our accuracy suggests that this method of forgery detection is viable for some editors. Second, it is possible to delete a small number of bytes from the match string without changing the appearance of the modified image, i.e. the image can still be viewed with an image viewing program. This countermeasure renders our tool useless, but the average person does not have the knowledge or skill to perform such measures. This is clearly a limitation, but ForgeryDetector is still applicable to images found in the wild, as many of them are created by people without technical knowledge of image forensics. Additionally, a machine learning algorithm may be able to detect certain patterns that cannot be removed from the image file, but this is a subject for future research.
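The overall accuracy reported in this section can be reproduced from the per-editor counts in Table 2. The short sketch below does the arithmetic; the figures are copied from the table and the function names are illustrative only.

```python
# (photos, false positives, false negatives) per editor, copied from Table 2.
RESULTS = {
    "None": (80, 0, 0),
    "MS Paint": (80, 0, 50),
    "GIMP": (80, 0, 0),
    "BeFunky": (80, 0, 50),
    "Fotor": (80, 0, 0),
    "Pixlr": (80, 0, 0),
}


def percent_correct(photos: int, false_pos: int, false_neg: int) -> float:
    """Share of photos in one group with neither kind of error."""
    return 100.0 * (photos - false_pos - false_neg) / photos


def overall_accuracy(results=RESULTS) -> float:
    """Correct predictions over all photos, as a percentage."""
    total = sum(photos for photos, _, _ in results.values())
    errors = sum(fp + fn for _, fp, fn in results.values())
    return 100.0 * (total - errors) / total
```

With 380 of the 480 images classified correctly, the overall figure is 380/480, which rounds to 79.17%.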
5 Conclusion and Future Work Image forgery detection has traditionally focused on pixel analysis. Our research took a new approach using byte analysis. We successfully created a tool to detect not only when images have been modified but also the editor used to modify them which has never been done before this research. Our forgery detector supports three editors with 100% accuracy. We were unable to achieve 100% accuracy with the other two editors because of the identical signatures used by BeFunky and MS Paint. As we noted, this problem can likely be solved by relaxing the match string criteria and with the application of machine learning to our algorithm. The results of this project present exciting possibilities for future research as there are several possible improvements and extensions. We were only able to create match strings for five editors, but given more time and resources, additional editor support would make the tool more comprehensive when testing images in the wild. A practical direction for editor choice is iOS and Android image editors. Social media is one of the most common places for people to post photos, and much of social media activity, especially image uploads, happens on smartphones. Images are often edited in some way before being posted, making mobile apps a prime target for future work. Increasing the number of images in the test dataset would be another improvement. A larger set of images from a more diverse background of cameras and editors would allow us to better test as well as improve accuracy. Finally, the application of machine learning provides a nice extension of our tool. By feeding a large dataset to a neural network, we expect performance increases in both speed and accuracy. Machine learning may also be able to overcome the failed detection of forgery when small modifications to the bytes of the
file are made as it is much more efficient at noticing patterns than a human with a hex editor.
Acknowledgments. The authors would like to thank Danny Choi, Zijia Ding, and Brandon Lam for helping contribute to the initial prototype that inspired us to pursue this work.
References
1. Mokhtari Ardakan, M., Yerokh, M., Akhavan Saffar, M.: A new method to copy-move forgery detection in digital images using Gabor filter. In: Montaser Kouhsari, S. (ed.) Fundamental Research in Electrical Engineering. LNEE, vol. 480, pp. 115–134. Springer, Singapore (2019). https://doi.org/10.1007/978-981-10-8672-4_9
2. Chen, Y., Kang, X., Shi, Y.Q., Wang, Z.J.: A multi-purpose image forensic method using densely connected convolutional neural networks. J. Real-Time Image Proc. 16(3), 725–740 (2019). https://doi.org/10.1007/s11554-019-00866-x
3. Doegar, A., Dutta, M., Gaurav, K.: CNN based image forgery detection using pre-trained AlexNet model. Int. J. Comput. Intell. IoT 2(1) (2019). SSRN: https://ssrn.com/abstract=3355402. (March 19, 2019)
4. Eman, I., El-Latif, A., Taha, A., Zayed, H.H.: A passive approach for detecting image splicing using deep learning and Haar wavelet transform. Int. J. Comput. Netw. Inf. Secur. 11(5), 28–35 (2019)
5. Elsharkawy, Z.F., Abdelwahab, S.A.S., Abd El-Samie, F.E., Dessouky, M., Elaraby, S.: New and efficient blind detection algorithm for digital image forgery using homomorphic image processing. Multimed. Tools Appl. 78(15), 21585–21611 (2019). https://doi.org/10.1007/s11042-019-7206-3
6. Ibraheem, N.A., Hasan, M.M., Khan, R.Z., Mishra, P.K.: Understanding color models: a review. APRN J. Sci. Technol. 2(3), 265–275 (2012)
7. Manu, V.T., Mehtre, B.M.: Tamper detection of social media images using quality artifacts and texture features. Forensic Sci. Int. 295, 100–112 (2019)
8. Nguyen, H., Yamagishi, J., Echizen, I.: Capsule-forensics: using capsule networks to detect forged images and videos (2019)
9. Nguyen, H.C., Cao, T.L.: Using matrix decomposition and frequency transforms to detect forgeries in digital images. In: 2019 IEEE-RIVF International Conference on Computing and Communication Technologies (RIVF). IEEE, 16 May 2019
10. Schetinger, V., Iuliani, M., Piva, A., Oliveira, M.M.: Image forgery detection confronts image composition. Comput. Graph. 68, 152–163 (2017)
11. Al_azrak, F.M., Elsharkawy, Z.F., Elkorany, A.S., El Banby, G.M., Dessowky, M.I., Abd ElSamie, F.E.: Copy-move forgery detection based on discrete and SURF transforms. Wirel. Pers. Commun. 110(1), 503–530 (2019). https://doi.org/10.1007/s11277-019-06739-7
12. Oyiza, A.H., Maarof, M.A.: An improved discrete cosine transformation block based scheme for copy-move image forgery detection. Int. J. Innov. Comput. 9(2) (2019). https://doi.org/10.11113/ijic.v9n2.194
13. Pixlr: Online Photo Editor (2019). https://pixlr.com/editor/
14. Fotor: Online Photo Editor (2019). https://www.fotor.com/
15. BeFunky: Photo Editor (2019). https://www.befunky.com/
16. GIMP: GNU Image Manipulation Program (2019). https://www.gimp.org/
Identifying Soft Biometric Features from a Combination of Keystroke and Mouse Dynamics
Sally Earl, James Campbell, and Oliver Buckley
School of Computing Science, University of East Anglia, Norwich NR4 7TJ, UK
{s.earl,j.campbell1,o.buckley}@uea.ac.uk
Abstract. In this preliminary paper, we investigate the use of keystroke and mouse dynamics as a means of identifying soft biometric features. We present evidence that combining features from both provides a more accurate means of identifying all of the soft biometric traits investigated, regardless of the machine learning method used. The data presented in this paper gives a thorough breakdown of accuracy scores from multiple machine learning methods and numbers of features used. Keywords: Soft biometrics · Keystroke dynamics · Mouse dynamics
1 Introduction
Whilst biometrics are becoming more prevalent as alternatives to conventional username and password systems, a key source of data which is often overlooked is the mouse. When we take into account the sheer number of times the average user moves or clicks a mouse per day, there is potentially a vast amount of data that goes unused. The predictive power of soft biometric features could be significantly improved when mouse data is used in parallel with keystroke dynamics. In this paper we present a fusion approach, which brings together keystroke and mouse data. Previous work, such as [1], has found that mouse dynamics directly complement keystroke dynamics, resulting in a higher accuracy level when data is analysed. Combining mouse dynamics with keystroke dynamics could therefore be a way to leverage certain parts of the user’s interaction with the mouse in a way that directly assists in the authentication and identification of certain soft biometrics of a user. In this study we aim to utilise keystroke and mouse dynamics as a method of predicting soft biometric features, such as gender and handedness. Furthermore, we aim to understand whether mouse dynamics can provide an uplift to the accuracy of prediction when these two widely used technologies are combined.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 184–190, 2021. https://doi.org/10.1007/978-3-030-79997-7_23
2 Literature Review
Biometrics can be broadly split into two categories, physical and behavioural. Physical biometrics are physical features of a person which can be analysed to uniquely identify
them, such as fingerprints, whilst behavioural biometrics are concerned with dynamic actions and usage as a unique characteristic, such as keystroke dynamics [2]. “Soft” biometrics are features which are not uniquely identifying on their own, but can aid in identification [3]. Keystroke dynamics are one of the most widely accepted forms of behavioural biometric currently in use [4]. They describe the unique typing pattern of a user and are collected from keystroke data, which yields two measurements: dwell time (the time between key press and key release) and flight time (the time between the initial key press and pressing the subsequent key) [5, 6]. The interaction of these two values forms the user’s typing ‘rhythm’. A key consideration when using keystroke dynamics is how the letters or text are split. The units are n-grams, which can be defined as contiguous sequences of n items. The most common n-grams are uni-grams, bi-grams and tri-grams, with lengths of 1, 2, and 3 letters respectively. Bi-grams are often seen as a more robust method of identification as they are less affected by environmental factors [7], and as such are the most commonly used. Mouse dynamics describes the way in which a user interacts with their system through the mouse. As with keystroke dynamics, there are common measurements which are extracted, based around the xy co-ordinates and time [8]. Research into using keystroke dynamics to determine soft biometric features was first conducted by Giot and Rosenberger [9], who were able to predict gender with 91% accuracy. Research conducted by Idrus et al. [10] then found accuracy scores of between 80–90% for recognising number of hands used, handedness, age and gender when measuring participants typing a passphrase multiple times. Soft biometrics are therefore a viable inclusion, and one which can be predicted by utilising keystroke data.
This was improved upon by the same authors in the following year [11]. With regards to discovering soft biometrics through mouse dynamics, research has found the gender of a user of a web page when collecting xy co-ordinates, time, and event (click or move) with a success rate of 70%. Additionally, the age of a user (either older or younger than 33) was identified with a 90% success rate [12]. Yamauchi and Bowman [13] investigated the identification of gender and emotions of a user while using mouse dynamics. When analysing mouse trajectories, they achieved a 61–65% accurate gender prediction. The combination of keystroke dynamics and mouse dynamics is a relatively novel technique. Pentel’s [14] findings show that mouse was the most accurate compared to keyboard, with scores on identifying gender at 73% for keyboard and 94% for mouse. A combination of keystroke and mouse has also been found to provide the best results when attempting continuous authentication, as it proved difficult to spoof both measurements simultaneously [8]. The studies above highlight the promising potential accuracy of keystroke and mouse dynamics when combined; and also their individual ability to predict some soft biometric features. We present in this paper our preliminary investigation into this concept.
3 Methodology
For this study, we devised a platform which would allow us to gather keystroke and mouse dynamic data through a series of tasks. For the mouse data, participants were asked to click the centre of a single ‘crosshairs’. When they clicked within 100px of the centre of the target, the next was shown. This task was similar to that devised by Van et al. [15], chosen because it provides a user-friendly means of collecting data in a laboratory setting. Additionally, mouse data was gathered at the point that we collected demographic information. For the keystroke tasks, participants were first asked to copy a passage from Bram Stoker’s Dracula, and then asked to describe the plot of their favourite film. These provided some data with the same expected results for all participants (the fixed text and the mouse tasks), and some which simulated more realistic use (the free prompt and the mouse data gathered elsewhere). The free collection is crucial should keystroke or mouse dynamics be integrated into a continuous authentication system, as these can never be done under controlled conditions [16].
3.1 Mouse Dynamics
We captured the mouse data using the p5.js library [17], which allowed us to capture more real-time information in a more intuitive format than basic JavaScript mouse event listeners allowed. The xy co-ordinates and time were logged whenever the mouse moved, the left click button was pressed, or when the right click button was released. Additionally, it was logged whether the clicks were on or off target. The data was used to extract a number of features, including the speed of the mouse, the number of errors, and the co-ordinates of each click. When we combined these features within all of the relevant tasks, this created a total of 55 mouse features. This occurred because the features for all mouse data had 4 instances per participant, and the features for mouse tasks only had 3 instances per participant.
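To make the mouse features more concrete, the sketch below shows how two of them (mean speed and number of off-target clicks) could be derived from a log of xy co-ordinates and timestamps. The log layout, units and function names are assumptions for illustration; the study’s own feature extraction is not reproduced here.

```python
import math


def mean_speed(samples) -> float:
    """Mean pointer speed in pixels per millisecond.

    samples is a list of (x, y, t_ms) tuples ordered by time.
    """
    distance = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(samples, samples[1:]):
        distance += math.hypot(x1 - x0, y1 - y0)
    elapsed = samples[-1][2] - samples[0][2] if len(samples) > 1 else 0
    return distance / elapsed if elapsed > 0 else 0.0


def click_errors(clicks, target, radius=100) -> int:
    """Count clicks landing more than `radius` pixels from the target centre."""
    tx, ty = target
    return sum(1 for x, y in clicks if math.hypot(x - tx, y - ty) > radius)
```

The default radius of 100 follows the 100px tolerance of the crosshairs task; treating a click at exactly 100px as on target is a choice made for this sketch.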
3.2 Keystroke Dynamics
Keystroke data was captured using JavaScript keyUp and keyDown event listeners. When a listener was triggered, the timecode, keycode, and type of event (up or down) were logged. This data was then processed into keystroke events (Key, Time of Press, Time of Release). From these events we extracted a number of features, chosen from the literature, including the dwell time for each bigram and the error rate. Combining these features across all of the relevant tasks produced a total of 186 keystroke features.

3.3 Machine Learning
We used machine learning to classify our data, as it has generally been found to be more accurate than statistical methods [18]. For this we used the open-source Python library scikit-learn [19]. In order to examine whether combining the two biometrics increased accuracy, we used 5 classifiers: Decision Trees, Random Forest, Gaussian Naive-Bayes, SVM, and K Nearest Neighbours. We have included the results for 2 different sets of parameters for Decision Trees within this paper.
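To make the keystroke processing of Sect. 3.2 more concrete, the following sketch pairs raw up/down events into (Key, Time of Press, Time of Release) tuples and then derives a per-bigram timing. The log layout, the function names, and in particular the definition used here for the dwell time of a bigram (from the press of the first key to the release of the second) are assumptions; the paper does not state which definition was used.

```python
from collections import defaultdict

# Hypothetical listener log: (time in ms, key, event type).
raw = [
    (0, "t", "down"), (90, "t", "up"),
    (150, "h", "down"), (230, "h", "up"),
    (300, "e", "down"), (370, "e", "up"),
]

def to_keystrokes(raw):
    """Pair each key-down with the next key-up of the same key."""
    pending = {}
    strokes = []
    for t, key, kind in sorted(raw):
        if kind == "down":
            pending.setdefault(key, t)  # keep the first press, ignore auto-repeat
        elif key in pending:
            strokes.append((key, pending.pop(key), t))
    return sorted(strokes, key=lambda s: s[1])  # order by time of press

def bigram_dwell(strokes):
    """Mean time from first-key press to second-key release, per bigram."""
    spans = defaultdict(list)
    for (k1, press1, _), (k2, _, release2) in zip(strokes, strokes[1:]):
        spans[k1 + k2].append(release2 - press1)
    return {bigram: sum(v) / len(v) for bigram, v in spans.items()}

strokes = to_keystrokes(raw)
dwell = bigram_dwell(strokes)
```

Averaging per bigram keeps the feature vector a fixed length regardless of how much text a participant types, which matters for the free-text task.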
We randomly split our data with 95% in the training set and the remaining 5% in the test set, chosen as a split to help with our imbalanced dataset. The data was then resampled 100 times. We additionally performed random undersampling, using the imbalanced-learn library [20]. In addition to undersampling, we also attempted feature selection to improve the accuracy. To do this we used the scikit-learn ‘SelectKBest’ function, to select the ‘k’ best features within the dataset, with k set to all, 100, and 150, to examine how reducing the number of features might improve the accuracy. These values for k were selected in order to present a broad range of potential feature numbers, without removing so many features that the models would become imprecise. Due to the smaller number of features, we only used k = all for the purely mouse dynamics experiments.
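The resampling procedure described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the study used scikit-learn and imbalanced-learn, whereas the helper names `undersample` and `repeated_holdout`, the `evaluate` callback, the choice to undersample before each split, and rounding the test share up are all assumptions made here.

```python
import math
import random
from collections import defaultdict

def undersample(labels, rng):
    """Return indices with every class cut down to the size of the rarest class."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    n = min(len(ix) for ix in by_class.values())
    kept = []
    for ix in by_class.values():
        kept.extend(rng.sample(ix, n))
    return sorted(kept)

def repeated_holdout(labels, evaluate, repeats=100, test_fraction=0.05, seed=0):
    """Average evaluate(train_idx, test_idx) over repeated random 95/5 splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        idx = undersample(labels, rng)
        rng.shuffle(idx)
        n_test = max(1, math.ceil(test_fraction * len(idx)))
        scores.append(evaluate(idx[n_test:], idx[:n_test]))
    return sum(scores) / len(scores)

# Toy run with the handedness counts reported in Sect. 4 (225 right, 15 left).
labels = ["right"] * 225 + ["left"] * 15
sizes = []

def record_sizes(train_idx, test_idx):
    # Stand-in for "fit a classifier on train, return accuracy on test".
    sizes.append((len(train_idx), len(test_idx)))
    return 1.0

mean_score = repeated_holdout(labels, record_sizes)
```

In practice the callback would fit one of the classifiers on the training indices (optionally after SelectKBest feature selection) and return its accuracy on the held-out indices.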
4 Results
We recruited 240 participants who completed our study. Of these, 225 were right-handed and 15 were left-handed; 120 identified as female, 119 as male, and 1 as ‘other’. There was an even distribution of ages, which we split into 6 bins. We removed the participant who identified as ‘other’ from the sample when examining gender, due to the severe class imbalance. Additionally, data was collected about the number of hours that the participants spent on electronic devices in a day, which was again split into 6 bins. As can be seen in Table 1, the results we achieved varied in accuracy between both the different demographics and the different classifiers. However, we found that a combination of mouse and keystroke data consistently provided the most accurate results across all of the classifiers. The least consistently accurate classifier was SVM, whilst the most consistently accurate classifier was Random Forest.

4.1 Gender
For gender, we achieved a maximum accuracy of 68.33%, using the Random Forest classifier. Gender was the most consistently accurately predicted demographic, with accuracy only falling below 50% on 4 occasions. Our accuracy was lower than that of some previous studies (such as those in [10] and [13]). This is not surprising, as our data collection was designed to mimic an authentication system with a single data capture, whilst many previous studies used more continuous data collection in a more natural setting and gathered more data.

4.2 Handedness
With handedness we achieved our highest accuracy of 80.50%, using Decision Trees as the classifier. In stark contrast to gender, handedness had the largest range of accuracy scores, with the lowest at 26%. With the results for handedness, it is worth noting that the sample size (after undersampling) was significantly smaller than for the other demographics tested (30 total), owing to the small sample of left-handed people. This is likely to be a large factor in the range of results, as one failed prediction can significantly alter the overall accuracy score.
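The sensitivity noted above can be made concrete with a small calculation. The sketch assumes that undersampling leaves 30 handedness samples (as stated) and 238 gender samples (an inference from the reported 119 male participants), and that the 5% test share is rounded up to a whole participant; the paper does not state the rounding rule, so the exact figures are indicative only.

```python
import math

def accuracy_step(n_samples, test_fraction=0.05):
    """Return (test set size, accuracy change in points caused by one prediction)."""
    n_test = math.ceil(test_fraction * n_samples)
    return n_test, 100.0 / n_test

handedness = accuracy_step(30)   # 15 left-handed + 15 right-handed
gender = accuracy_step(238)      # 119 female + 119 male (assumed)
```

Under these assumptions a single handedness split holds only 2 test participants, so one wrong prediction moves that split's accuracy by 50 points, compared with roughly 8 points for gender.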
Table 1. Accuracy scores for all classifiers
4.3 Age
When predicting age, the most accurate classifier (Gaussian Naive-Bayes with 100 features) achieved an accuracy score of 27.64%. This is only modestly better than random chance (about 16.7% for 6 evenly populated bins). Other studies frequently consider age in a binary form (often under or over 30). We chose to additionally complete classification in this form. This significantly increased our accuracy, with the highest accuracy being 69.22% using a combination of keystroke and mouse dynamics with the Random Forest classifier.

4.4 Hours Spent on Electronic Devices
As with age, our accuracy for hours spent on electronic devices was also low, with a maximum accuracy of 22%. The majority of classifiers produced exceptionally low accuracy scores. This suggests that prediction of time spent on such devices is difficult from keystroke and mouse dynamics, potentially because there is no link between the two; that is, we theorise that the hours spent on a computer do not correlate with keystroke timings and mouse usage.

4.5 Feature Selection
In addition to the above analysis, we also considered which features had been removed during our feature selection stage, to determine if keystroke or mouse dynamics had any meaningful weight on the accuracy. As can be seen in Table 2, the mouse dynamic features always made up a smaller percentage of the selected features, as is to be expected given that they make up just over 22% of the total features possible. These data suggest that mouse features are more discriminatory, as they always made up a larger percentage of the selected features when 100 were selected, with the percentage reducing when the number of features
increases to 150. This suggests strongly that the most discriminative features in the dataset are those arising from mouse dynamics. It is also important to note that we collected mouse dynamic data in 4 separate parts of the study, compared to the 2 where we collected keystroke data, and this may affect how influential the features are.
Table 2. Percentage of used features for each demographic tested
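As a point of reference for reading the percentages in Table 2, the baseline share of mouse features follows directly from the feature counts given in Sect. 3 (55 mouse and 186 keystroke features). The helper name `expected_mouse_count` and the idea of comparing against a selection that ignores modality are assumptions introduced for this sketch, not part of the reported analysis.

```python
MOUSE_FEATURES = 55
KEYSTROKE_FEATURES = 186

total = MOUSE_FEATURES + KEYSTROKE_FEATURES
baseline_share = MOUSE_FEATURES / total  # share of mouse features before selection

def expected_mouse_count(k):
    """Mouse features expected among k selected if selection ignored modality."""
    return k * baseline_share

expected_at_100 = expected_mouse_count(100)
expected_at_150 = expected_mouse_count(150)
```

A mouse share above the baseline of roughly 22.8% for a given k would therefore indicate that mouse features are over-represented among the selected features.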
5 Conclusion
The preliminary results of our study show that using a combination of mouse and keystroke dynamic features is more effective at predicting soft biometric features than using either in isolation (see Table 1). As can be seen from the table, the combined accuracy scores were better than the individual (mouse or keystroke in isolation) accuracy scores, for each machine learning method. In addition to the results in Table 1, we can see from Table 2 that both mouse dynamic and keystroke features were selected as the most influential, showing the value of using them in combination. In order to further develop our approach, future work will in the first instance focus on understanding the discriminative power of the features collected. As a result of this we will look to undertake further, more targeted experiments to properly leverage the key features. With this in mind, we will further refine our machine learning approach to optimise the hyperparameters for the features.
References
1. Bhatnagar, M., Jain, R.K., Khairnar, N.S.: A survey on behavioral biometric techniques: mouse vs keyboard dynamics. Int. J. Comput. Appl. 975, 8887 (2013)
2. Jain, A.K., Flynn, P., Ross, A.A. (eds.): Handbook of Biometrics. Springer, Boston (2008). https://doi.org/10.1007/978-0-387-71041-9
3. Jain, A.K., Dass, S.C., Nandakumar, K.: Soft biometric traits for personal recognition systems. In: Zhang, D., Jain, A.K. (eds.) ICBA 2004. LNCS, vol. 3072, pp. 731–738. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25948-0_99
4. Yampolskiy, R.V., Govindaraju, V.: Taxonomy of behavioural biometrics. In: Behavioral Biometrics for Human Identification: Intelligent Applications, pp. 1–43. IGI Global (2010)
5. Giot, R., El-Abed, M., Rosenberger, C.: Keystroke dynamics overview. In: Biometrics, pp. 157–182. InTech (2011)
6. Moskovitch, R., et al.: Identity theft, computers and behavioral biometrics. In: 2009 IEEE International Conference on Intelligence and Security Informatics, pp. 155–160. IEEE (2009)
7. Bergadano, F., Gunetti, D., Picardi, C.: User authentication through keystroke dynamics. ACM Trans. Inf. Syst. Secur. (TISSEC) 5(4), 367–397 (2002)
8. Mondal, S., Bours, P.: A study on continuous authentication using a combination of keystroke and mouse biometrics. Neurocomputing 230, 1–22 (2017)
9. Giot, R., Rosenberger, C.: A new soft biometric approach for keystroke dynamics based on gender recognition. Int. J. Inf. Technol. Manag. 11(1–2), 35–49 (2012)
10. Idrus, S.Z.S., Cherrier, E., Rosenberger, C., Bours, P.: Soft biometrics for keystroke dynamics. In: Kamel, M., Campilho, A. (eds.) ICIAR 2013. LNCS, vol. 7950, pp. 11–18. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39094-4_2
11. Idrus, S.Z.S., Cherrier, E., Rosenberger, C., Bours, P.: Soft biometrics for keystroke dynamics: profiling individuals while typing passwords. Comput. Secur. 45, 147–155 (2014)
12. Kratky, P., Chuda, D.: Estimating gender and age of web page visitors from the way they use their mouse. In: Proceedings of the 25th International Conference Companion on World Wide Web, pp. 61–62 (2016)
13. Yamauchi, T., Bowman, C.: Mining cursor motions to find the gender, experience, and feelings of computer users. In: 2014 IEEE International Conference on Data Mining Workshop, pp. 221–230. IEEE (2014)
14. Pentel, A.: Predicting age and gender by keystroke dynamics and mouse patterns. In: Adjunct Publication of the 25th Conference on User Modeling, Adaptation and Personalization, pp. 381–385 (2017)
15. Van Balen, N., Ball, C., Wang, H.: Analysis of targeted mouse movements for gender classification. EAI Endorsed Trans. Secur. Saf. 4(11), e3 (2017)
16. Sim, T., Janakiraman, R.: Are digraphs good for free-text keystroke dynamics? In: 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–6. IEEE (2007)
17. p5.js: p5js.org (2014)
18. Giot, R., El-Abed, M., Rosenberger, C.: Keystroke dynamics with low constraints SVM based passphrase enrollment. In: 2009 IEEE 3rd International Conference on Biometrics: Theory, Applications, and Systems, BTAS 2009, pp. 1–6. IEEE (2009)
19. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
20. Lemaître, G., Nogueira, F., Aridas, C.K.: Imbalanced-learn: a Python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 18(17), 1–5 (2017). http://jmlr.org/papers/v18/16-365.html
CyberSecurity Privacy Risks
Linda R. Wilbanks
Towson University, Towson, MD 21252, USA
Abstract. Privacy is one of the greatest cybersecurity risks due to the extensive reliance on the internet. With the increase in work, social interactions, and financial and medical transactions accomplished almost totally online during the pandemic, businesses’ and users’ critical data is at increased risk of exposure or theft. This stolen data can be used to steal corporate secrets, to commit espionage or fraud, or to steal a person’s identity and finances. Cyber risks cannot be eliminated; they must be managed. The vulnerabilities that allow threats access to private information must be identified so that mitigations can be put in place to reduce the probability of the loss of information and privacy.
Keywords: Privacy · CyberSecurity · Risk management
1 Introduction
The internet is increasingly becoming a conduit for individuals’ personal and professional lives worldwide. The digital ecosystem often requires the transmission of personal information across secure and insecure networks utilizing a complex chain of custody for personally identifiable information (PII), thus introducing privacy issues. The internet exposes PII to an increasing number of threats. As organizations collect, process and transmit more PII, they become more attractive targets for cyber criminals. The massive consumer base of some online service providers presents an unprecedented opportunity for malicious parties to perpetrate large-scale security breaches. For example, Equifax acknowledged a breach potentially affecting more than 100 million people in the US. Yet people continue online transactions with companies that have lost clients’, customers’, and employees’ PII [1]. In 2019, 86% of breaches were financially motivated; personal data, especially medical data, can be sold. According to 2019 Corporate Insurance Experts: Cyber Risk, the loss of data is the highest or second-highest perceived business risk; digital assets now represent over 85% of an organization’s value; and 99% of new information is stored digitally. At least one in every 10 Fortune 1000 companies has experienced a publicly disclosed breach. The threat to privacy is real [1]. See examples in Table 1. Any company or person utilizing the internet is subject to a cyber breach and the loss of personal information. The economic costs of cyber-attacks exceed those associated with natural disasters. The costs of data breaches, including loss of personal information, were expected to amount to US $2.1 trillion globally by 2019. It is estimated that cybercrime cost the world US $3 trillion in 2015, and that figure is expected to increase to US $6 trillion annually by 2021 [3].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 191–198, 2021. https://doi.org/10.1007/978-3-030-79997-7_24
Table 1. Examples of data breaches resulting in loss of privacy for millions of users [2].
When | Company | #Records lost | Cause
Feb 2020 | Pakistan Mobile Operators | 115M | Customer personal information for sale for $2.1M bitcoin (1bitcoin–$32,5000USD)
Jan 2020 | Microsoft | 250M | Customer support records for 14 years exposed
Nov 2019 | OxyData | 380M | Data aggregation firm data exposed (big data)
Sept 2019 | Facebook | 420M | User’s names, phone numbers, gender, location exposed
July 2019 | Capital One | 100M | Personal information stolen from credit card applications, employee pro
2 Privacy
Privacy is the right to be free from being observed or disturbed by other people, and the right not to have one’s personal information shared with others. Privacy is the right to control, edit, manage and delete information about oneself and to decide when, how, and to what extent information is communicated to others. Privacy is about protecting oneself from the outside world: the ability of individuals to determine for themselves when, how and to what extent information about themselves is communicated to others [4]. Every individual should be able to manage and protect information about themselves, individually and independently from others. This is unfortunately not always possible in the electronic world. The privacy of individuals depends not only on their own actions and data, but may also be affected by the privacy decisions of, and the data shared by, other individuals. Privacy erosion occurs slowly, over time, which has limited awareness of the resulting risks. In the age of surveillance by states and private internet communication companies, it is becoming almost impossible for individuals to protect their privacy or to remain obscure. A renewed emphasis on the right to privacy has emerged in direct response to the aftermath of many breaches and losses of personal information, as shown in Table 1. Personally identifiable information (PII) is defined as “(1) Any information that can be used to distinguish or trace an individual’s identity, such as name, social security number, date and place of birth, mother’s maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial and employment information” [5]. When a person’s PII is lost, stolen or shared with unauthorized users, it is a loss of their privacy.
3 CyberSecurity Risks
The cyber risk to privacy is very real, as demonstrated by the data breaches that occur in many companies and result in the loss of private information; see the diversity of companies where data was lost in Table 1. Cyber security risk is the risk to an organization’s operations, mission, reputation, assets, and individuals due to the potential for unauthorized access, use, disclosure, modification or destruction of information and/or information systems. Cyber risks are those that arise through the loss of confidentiality, integrity and/or availability of the data and/or networks. The risks also relate to the authentication required to access or view the information. Cyber risks, like any other type of risk, cannot be eliminated; they must be managed.
4 Privacy Risks
Privacy risks can be found anywhere PII is used or stored, and these locations are very diverse. PII is used in smart devices and wearable devices. It is also used in medical and financial systems. Big Data, where data from multiple sources is compiled, creates new privacy risks.

4.1 Smart Devices and the Internet of Things (IoT)
This is a digital era, a time of Alexa and Google Home, of Facebook and Fitbit, of Twitter and Tik-Tok. Smart devices are becoming ubiquitous and are increasing rapidly in number and in applications such as digital assistants, smart home security systems, remote educational labs, and wearable medical devices [6]. Smart technologies offer remarkable opportunities in the home: communicating from home with schools, businesses and friends, and deliveries of groceries or anything else users want. But there are privacy risks associated with all of these, especially when they are used in homes and businesses [7]. The growing popularity of smart devices indicates that Internet of Things (IoT) technology is becoming totally integrated into our everyday life. Research indicated that 20.4 billion IoT devices would be deployed by 2020. For example, wearable devices used at work could tell when and where employees were most active and productive, but could also easily compromise employees’ privacy. IoT-related companies use the data they collect about customers in order to understand consumers’ behavior and spending. Public and private organizations store, track and use massive amounts of data produced by IoT devices [6]. As shown in Table 1, all companies are vulnerable to loss of data. The massive amount of data collected through smart devices and shared widely would, if lost through a breach, impact millions of users’ privacy.

4.2 Wearables
Wearable wireless devices such as wireless earbuds, smartwatches, and virtual reality headsets continually collect information about the user: what they like, what music they purchase, what type of games they use, how they use the games, and how often they
play or listen. These devices form an Internet of Bodies (IoB): individual components of a single human’s wireless network, using Bluetooth to communicate, which allows an attacker to snoop on or attack one of the devices. As a mitigation, new research at Purdue University proposes a secure IoB communications system that includes encryption and non-broadcasting to prevent eavesdropping or device compromise [8]. Although this information may seem harmless, user information should not be exposed or used without approval; privacy should be a right, not a privilege, regardless of the financial gain for manufacturers.

4.3 Social Media
Social media is an internet tool often used by businesses to communicate with customers and clients. It is used for financial and medical transactions with customers and to pay employees, using customers’ and employees’ PII. Social media is fraught with privacy risks: users transmit their personal information in many forms, from paying bills and internet buying to postings on Facebook, Twitter, LinkedIn, and YouTube. Associated with these sites are personal profiles (where users went to school, where they live and work, relationship status, family members, pets, etc.), all personal information that is available to most people. Facebook collects an average of 15 TB daily. Many users assume they own the information they contribute to social media sites; the social media providers make an entirely different assumption: that once users contribute information to a social media site, the information belongs to the social media site. This results in users’ privacy becoming secondary to the commercial possibilities for users’ information [1]. As shown in Table 1, in September 2019, Facebook exposed 420 million users’ names, phone numbers, Facebook ID numbers, genders and country locations. Earlier, in 2016, Cambridge Analytica, a political consulting firm, harvested the data of over 400 million Facebook users to psychologically profile voters during the 2016 election. The incident resulted in an executive hearing where senators raised issues concerning users’ privacy and the company’s mishandling of data, making the cyber risks associated with user and customer privacy visible and public. In 2017 Yahoo revealed that personal information on all of its 3 billion users had been stolen [2].

4.4 Medical
Personal medical devices that collect medical data about the user have become very common. People have watches and phones that collect heart rate, blood pressure and possibly other information; they have devices such as heart monitors, insulin monitors and pumps inserted in their bodies, all collecting personal medical information. These connected medical devices are vulnerable to releasing private information. In June 2020 a model of insulin pump was recalled because it was transmitting sensitive information without encryption, making the data accessible to anyone nearby who might want to listen in [8]. Medical records retained by providers (physicians’ offices, hospitals, pharmacies) contain personal information about health issues and should remain private, but often do not. In December 2020, a large cosmetic surgery chain, The Hospital Group, was hacked and over 900 GB of plastic surgery photographs were stolen. Quest Diagnostics was hacked
and provided a link to a billing services company where payment information was stolen over a period of 8 months [2].

4.5 Financial
Cryptocurrencies are often used on the dark web by cyber criminals, but they also have significant privacy costs. An adversary has many capabilities to identify the actual users associated with a cryptocurrency account. Transparency is a major factor driving the use of applications such as cryptocurrencies, but it does not provide reasonable privacy protection. The transparent nature of the ledger gives other users access to the details of conducted transactions. If an individual uses Bitcoin to pay for goods or services, the party with whom the transaction is being made can know exactly how much money the individual has, possibly increasing the threat to personal safety. Bitcoin transactions are difficult to track but they are not completely anonymous; all transactions are recorded in a permanent public ledger [9]. The legal banking and credit card industries also have privacy issues. Most individuals have at least one bank account and one credit card. Financial service providers’ comprehensive records of their clients’ financial histories are also vulnerable to exposure. JPMorgan, the largest US bank, was hacked in October 2014, exposing all data for 76M customers. Equifax, in October 2017, exposed names, social security numbers, birthdates, addresses, driver license numbers, credit reports and more for 143 million customers in the United States, England and Canada. Table 1 shows that Capital One exposed the social security numbers and bank account numbers of its credit card customers in July 2019 [2]. These are only a few examples demonstrating cybersecurity privacy risks.

4.6 Big Data
The US Government, and the governments of many countries, are collecting vast amounts of information from these diverse sources in the interest of counterterrorism. They are monitoring all email and internet traffic and tracking the purchases of goods and services. Telephone information is being recorded in volume, the caller ID information specifically. Vehicles can be tracked by toll transponders, GPS, and imaging devices. Much of this information is being kept (zettabytes of it) and can be mined for transgressions beyond terrorist-threat indicators. This combination of data from many sources is commonly known as “Big Data”. This type of big data analysis is also used outside of government agencies. Auto insurance companies can determine if a subscriber has traffic tickets or accidents and decide whether to raise rates or cancel clients. Amazon Prime stores users’ credit card information (as do many companies), knows what a customer purchased previously, and recommends products based on buying patterns. All of this data, collected, stored and analyzed, poses risks to an individual’s privacy [10]. There is a tradeoff between security and personal privacy, and we generally accept some invasions into our personal privacy in exchange for security. We must continue to accept this intrusion for a certain level of security, but at what point is the invasion of our privacy no longer acceptable? In some cases, through our actions online, such as using FaceTime, Facebook, Twitter, Zoom, Alexa, Amazon Prime, etc., we voluntarily give up our privacy for convenience. In the COVID-19 world, we could argue we had no choice,
our world, work and personal, had to revolve around electronics, but we need to do so with the recognition of the change in the state of privacy.
5 Privacy Mitigations

5.1 Legal Privacy Protection
There are many laws that were designed to protect people’s personal information and their privacy. These laws place the burden of protection on the companies that store the data. The Health Insurance Portability and Accountability Act (HIPAA), passed in 1996, ensures that health information data and privacy are protected. HIPAA applies not only to health services industries, but also to organizations or businesses that handle health insurance. HIPAA defines health information as any data that is created or received by health care providers, health plans, public health authorities, employers, life insurers, schools or universities. With respect to the privacy of an individual, HIPAA clearly states that health data must not be shared with anyone without the express consent of the patient [11]. The Family Educational Rights and Privacy Act (FERPA) protects the privacy rights of students after the age of 18, stating that all personally identifiable information (PII) about the student, including educational data and health data, must be protected and released only with the permission of the student [11]. The Children’s Online Privacy Protection Act (COPPA) addresses the concern over the lack of privacy for children on the internet. COPPA limits the options for gathering information from children and created warning labels for cases where potentially harmful information or content is presented. In 2000, the Children’s Internet Protection Act (CIPA) was developed to implement safe internet policies, such as rules and filtering software, and to limit access to offensive content from school and library computers. This law includes requirements for protecting against the unauthorized use of minors’ personal information [11]. Beyond the law, most companies have a computer usage policy that includes Rules of Behavior for employees using the company’s computer network. One required element in this document addresses the privacy of users, or lack of privacy. Most organizations stress that users have no expectation of privacy when using employer resources and are subject to monitoring. Data can be viewed at any time, including data and email files and the employee’s internet history and activities.

5.2 Computer and Applications Privacy Settings
The users of online systems, applications and games must take responsibility for securing their privacy; they cannot rely solely on the developers. Privacy settings are the part of a social networking website, internet browser, piece of software, etc. that allows the user to control who sees information about them. Several social networking websites try to protect the personal information of their subscribers, as well as provide warnings through a privacy and terms agreement. On Facebook, for
example, privacy settings are available to all registered users: they can block certain individuals from seeing their profile, they can choose their “friends”, and they can limit who has access to their pictures and videos. Default settings can be chosen for convenience, but users should consider personalizing their privacy settings. All websites receive, and many track, the IP address used by a visitor’s computer. Companies match data over time to associate name, address, and other information with the IP address. There is ambiguity about how private IP addresses are and whether they need to be treated as personally identifiable information. Users need to be aware of this privacy risk, as it is difficult to mask their IP address. An HTTP cookie is data stored on a user’s computer that assists in automated access to websites or web features and may also be used for user tracking. Cookies are a common concern in the field of cyber privacy. Cookies are advertisers’ main way of targeting potential customers, but customers can delete the cookies. Some sites display a warning banner stating that they use cookies and asking if the user wants to proceed.
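To illustrate how a cookie can support user tracking, the following minimal sketch uses Python's standard `http.cookies` module to build the header a site might send and to parse what a browser would return on a later visit. The cookie name `visitor_id`, its value, and the one-year lifetime are hypothetical choices for this example and do not describe any particular site.

```python
from http.cookies import SimpleCookie

# What a server might send: an identifier the browser stores and returns later.
outgoing = SimpleCookie()
outgoing["visitor_id"] = "abc123"                       # hypothetical identifier
outgoing["visitor_id"]["max-age"] = 60 * 60 * 24 * 365  # keep for about one year
outgoing["visitor_id"]["path"] = "/"                    # send on every page of the site
set_cookie_header = outgoing.output(header="Set-Cookie:")

# What the browser sends back with a later request to the same site.
incoming = SimpleCookie()
incoming.load("visitor_id=abc123; theme=dark")
returned_id = incoming["visitor_id"].value
```

Because the same identifier comes back with every request, the site can link visits over time; deleting the cookie breaks that link until a new identifier is issued.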
6 Conclusion
All electronic information can ultimately be used to uniquely identify the user, thus creating privacy risks. Every person and every company is susceptible to a data breach and the loss of personally identifiable information, impacting users’ privacy. The hackers are here to stay: selling PII is a profitable business, and the hackers have evolved the sophistication of their techniques. Insider threats, such as employees accidentally releasing or exposing PII, are just as dangerous as outsiders when it comes to invading and impacting privacy. Cyber risks, like any other type of risk, cannot be eliminated; they must be managed, and the vulnerabilities that allow threats access to private information must be identified. Adequate cybersecurity to identify and mitigate privacy risks requires the active participation of everyone within a company and all users of the internet. Is privacy dead with our extensive use of the internet? That is the question every user must ask when sharing information: are they willing to accept the loss of privacy?
References
1. Wilbanks, L.R.: Cyber security risks in social media. In: Meiselwitz, G. (ed.) HCII 2020. LNCS, vol. 12194, pp. 393–406. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-49570-1_27
2. World’s Biggest Data Breaches & Hacks: Information is Beautiful (2021)
3. Humbert, M., Trubert, B.: A survey of interdependent privacy. ACM Comput. Surv. 52(6), 1–122 (2019). Article 122
4. Humble, K.: Human rights, international law and the right to privacy. J. Internet Law 23(12), 1–24 (2020)
5. Burns, A., Johnson, E.: The evolving cyberthreat to privacy. IT Prof. 20(3), 64–68 (2018)
6. Ahmad, N., Laplante, P., DeFranco, J.: Life, IoT, and the pursuit of happiness. IT Prof. 4–10 (2019)
7. Sullivan, L., Reiner, P.: Ethics in the digital era: nothing new? IT Prof. 39–42 (2020)
8. Sen, S., Maity, S., Das, D.: The body is the network. IEEE Spectr. 45–49 (2020)
9. Kshetri, N.: Cryptocurrencies: transparency versus privacy. IT Prof. 27–31 (2020)
10. Laplante, P.: Who’s afraid of big data? IT Prof. 15(5), 6–7 (2013)
11. Easttom, C.: Computer Security Fundamentals, 3rd edn, pp. 59–63. Pearson Education, Inc. (2016)
Sharing Photos on Social Media: Visual Attention Affects Real-World Decision Making
Shawn E. Fagan1, Lauren Wade2, Kurt Hugenberg1, Apu Kapadia1, and Bennett I. Bertenthal1
1 Indiana University Bloomington, 107 S Indiana Avenue, Bloomington, IN, USA
{faganse,khugenb,kapadia,bbertent}@iu.edu
2 Michigan State University, 426 Auditorium Road, East Lansing, MI, USA
Abstract. This study tested the effect of visual attention on decision-making in digital environments. Fifty-nine individuals were asked how likely they would be to share 40 memes (photos with superimposed captions) on social media while their eye movements were tracked. The likelihood of sharing memes increased as attention to the text of the meme increased; conversely, the likelihood of sharing decreased as attention to the image of the meme increased. In addition, increased trait levels of agreeableness predicted a greater likelihood of sharing memes. These results indicate that individual differences in personality and eye movements predict the likelihood of sharing photo-memes on social media platforms.
Keywords: Decision-making · Visual attention · Eye tracking · Social media · Privacy
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 199–206, 2021. https://doi.org/10.1007/978-3-030-79997-7_25
1 Introduction
Of the multitude of items that can be shared digitally, the meme is one of the most popular. The term "meme," coined by Richard Dawkins, describes a unit of culture that is transmitted and replicated, undergoing change and evolving in the process [1]. Sharing a meme is therefore a communicative process whereby one participates in "digital culture" [2]. The goal of the present study was to better understand the cognitive processes governing whether memes are shared. While there are many studies on meme-sharing in the context of political messaging [3], rhetorical expression [4], hashtag activism [5], and beyond, few studies have examined what psychological mechanisms influence sharing memes.
In the age of the virtual echo-chamber, behavioral research affirms that people share online content they agree with. For example, Macskassy and Michelson (2011) reported that among Twitter users, the model best suited to predicting retweeting behavior (i.e., reposting content shared by another user) was based on profile similarity [6]. Individuals were likely to repost information shared by similar others. A meta-review of survey information and peer-reviewed studies similarly found that compared to journalists and media personnel, most lay users shared information that they agreed with and endorsed [7].
Relatedly, Amon and colleagues (2019) found that participants reported sharing memes if they found them funny or the content personally relatable [8, 9]. In general, the "message" of the meme plays a primary role when deciding to re-share digital content.
A potential unexplored mechanism underlying meme-sharing is visual attention. A growing body of evidence links visual attention to decision-making. One well-replicated finding is that increased attention to an option boosts the likelihood of preferentially selecting that option [10, 11]. Similar phenomena are seen in studies with social stimuli. For example, looking longer at a stimulus (e.g., a face) predicts an increased likelihood of rating that stimulus as more attractive [12]. This finding extends to abstract objects as well. Yet there is almost no research that examines the relationship between visual attention and decisions to share content online, though many studies use eye-tracking to explore online shopping behavior [13], online advertisement efficacy [14], the processing of images that accompany online articles [15], and engagement with (though not sharing of) certain kinds of memes [16].
In general, privacy considerations factor prominently in an individual's decision to share content online, particularly personal photos. Young adults are aware of the privacy violations and perils that come with photo sharing, and they engage in self-preservation strategies to avoid becoming "the next meme" [17]. This appears acutely true for younger individuals (under 35 years old) [18]. Young adults also advocate seeking consent from individuals in a photo before sharing it [17]; they will consider their own privacy when sharing photos of themselves and the privacy of their friends when sharing a group photo [19, 20]. However, once a photo transforms into a meme and is circulated widely, concerns about others' privacy may dissolve.
Amon and colleagues (2019) prompted participants to think about the privacy of a meme's focal subject in a photo-sharing paradigm [8]. They found that while participants reported considering the privacy of the meme's focal individual, they still rejected the premise that sharing a meme impinged on another's privacy.
The current investigation was designed to address two questions. First, to what extent does visual attention predict the likelihood that someone will share a meme? Second, do individual characteristics and personal privacy preferences predict meme-sharing behavior? To investigate these questions, we invited undergraduate students to the lab to view a set of photo-memes and rate the likelihood that they would share those memes on their preferred social media platform while we recorded their eye movements. Specifically, we were interested in the relationship between the distribution of their visual fixations and their sharing decisions. We hypothesized that greater attention to the meme text would be associated with greater semantic engagement with the meaning of the text [21], and less concern with the privacy of the individual depicted in the photo. Accordingly, we predicted that these individuals would be more likely to share the meme. By contrast, we predicted that participants attending more to the meme image would be less likely to share that image. Here, we hypothesized that focusing more on the photo would be associated with humanizing the person or persons in the image and would prompt the participant to consider that these individuals may not have given their consent to share their image [22]. In testing how personality might affect the likelihood of sharing photo-memes, our goals were more exploratory, and thus there were no specific hypotheses.
2 Method
2.1 Participants
Fifty-nine undergraduate students recruited through a database at a large midwestern university took part in the study for course credit. Participants were 56% female, aged 18–26 years (M_age = 19 years), and 78% White, 13% Asian, 7% Latinx, and 2% Black. All students had normal or corrected-to-normal vision. Students were ineligible to participate if they had taken medication that could have impaired cognitive function within 24 h of participation.
2.2 Procedure
The university IRB approved all study procedures. Participants were tested individually. Upon arrival, a trained research assistant guided the participant through the consent process. The research assistant then explained to the participant that they would view a series of images and rate how likely they would be to share those images online, after which they would complete a short series of questionnaires. The participant sat in front of a computer monitor and rested their head on a chinrest to minimize movement. The research assistant guided the participant through calibration of the eye tracker, and then instructed them to focus on a fixation cursor for one minute to record baseline pupil information. The photo-meme sharing task started promptly after this rest period, and upon completion, participants filled out two surveys on the computer. The entire procedure took approximately 45 min.
2.3 Photo Sharing Task
The photo sharing task was created using Tobii Pro Lab with E-Prime 3.0 integration. Participants viewed 40 memes pre-tested to range in valence across four levels: very positive, positive, negative, and very negative [8]. Each meme remained onscreen for 8 s (image slide), after which a 7-point Likert scale appeared below the meme, instructing participants to rate the likelihood that they would share the meme on social media from 1-Extremely Unlikely to 7-Extremely Likely (Likert slide). Participants responded via keystroke. They had unlimited time to decide if they were likely to share the meme.
2.4 Personality Measures
Big Five Inventory - 10 Item Short Form (BFI-10) [23]. The BFI-10 is an abbreviated 10-item version of the 44-item scale that reduces personality to five dimensions: neuroticism, agreeableness, extraversion, conscientiousness, and openness. Participants were asked, "How well do the following statements describe your personality?" and responded on a 5-point scale ranging from Disagree Strongly to Agree Strongly. Whereas previous studies validated the reliability of the shorter 10-item version of the BFI [24, 25], reliability for our sample was low (Cronbach's alpha = .50).
Privacy Questionnaire [8, 26]. A single-item privacy question asked participants, "Are you a private person who keeps to yourself or an open person who enjoys sharing with others?" using a 7-point scale ranging from 1-Very Private to 7-Very Open.
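The reliability figure reported above is Cronbach's alpha. As a minimal sketch of how such a coefficient is computed, the function below applies the standard formula, alpha = k/(k − 1) · (1 − Σ item variances / variance of total scores), to a respondents-by-items table. The function name and the example responses are hypothetical, and the text does not state which items were pooled when alpha was computed for this sample, so the sketch makes no claim about that.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a table of responses.

    responses: list of respondents, each a list of k item scores
    (hypothetical layout; any reverse-keyed items are assumed to be
    recoded before this function is called).
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents (sample variance).
    item_columns = list(zip(*responses))
    item_variances = [variance(col) for col in item_columns]
    # Variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Made-up 5-point responses from three respondents on two items.
    example = [[1, 2], [2, 1], [3, 3]]
    print(round(cronbach_alpha(example), 3))
```

Sample variances are used for both the items and the totals; because the same denominator appears in both, using population variances instead would give the same alpha.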
2.5 Data Acquisition and Processing
Eye gaze data were collected using a Tobii TX300 Eye Tracker (Tobii Pro AB, Stockholm, Sweden) with an integrated 23-inch monitor (1920 × 1080 pixels) at a frequency of 60 Hz. Participants sat 60–70 cm from the monitor and completed a 5-point calibration with validation. Participants rested their heads in a chinrest to minimize movement. We used Tobii Pro Lab for offline analyses of gaze data. Areas of interest (AOIs) were drawn around the borders of faces, bodies, objects, the full meme image, lines of text, and the Likert scale that appeared below the meme during the Likert slide. The minimum threshold for a fixation was 100 ms. We excluded gaze data from one participant who had only 19% usable recorded data.
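The AOI bookkeeping described above can be illustrated with a short sketch that sums fixation time per AOI while discarding fixations shorter than the 100 ms threshold. This is not the Tobii Pro Lab implementation or its API; the function name, the (AOI label, duration) input format, and the example values are assumptions made for illustration only.

```python
from collections import defaultdict

MIN_FIXATION_MS = 100  # minimum fixation duration reported in the text


def total_fixation_duration(fixations, min_ms=MIN_FIXATION_MS):
    """Sum fixation time per AOI, in seconds.

    fixations: iterable of (aoi_label, duration_ms) pairs
    (hypothetical export format). Fixations shorter than min_ms
    are ignored.
    """
    totals = defaultdict(float)
    for aoi, duration_ms in fixations:
        if duration_ms >= min_ms:
            totals[aoi] += duration_ms / 1000.0
    return dict(totals)


if __name__ == "__main__":
    # Made-up fixations from a single trial.
    trial = [("text", 250), ("image", 400), ("text", 80), ("image", 120)]
    print(total_fixation_duration(trial))
```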
3 Results
We used SPSS 26.0 to conduct all statistical analyses. The dependent measure was total fixation duration to three specific AOIs: meme text, meme image, and the Likert scale (during the Likert slide only). Total fixation duration represents the cumulative duration of all fixations within an AOI.
3.1 Descriptive Statistics
First, we examined the difference in attention to the components of the image and Likert slides. During the image slide presentation, participants looked more at the meme image (M = 3.95 s, SD = .60) than at the text (M = 2.19 s, SD = .51), t(58) = 14.21, p < .001, d = 1.87. While the Likert slide was on the screen, participants looked the longest at the Likert scale (M = 1.08 s, SD = .45), then the meme image (M = .79 s, SD = .43), and finally the meme text (M = .29 s, SD = .19), F(2,114) = 81.38, p < .001, ηp² = .588. Average likelihood to share was M = 3.42, SD = .81, and ranged from 1.62 to 5.04 (Fig. 1).
3.2 Correlations
We computed Pearson's correlations between total fixation duration to meme text, meme image, and Likert scale; task performance measures (likelihood of sharing and reaction time to the sharing decision); demographics (age, gender); and personality traits (privacy and BFI). Correlations revealed that the mean likelihood of sharing (Likert scale score) was positively associated with mean response time, r(58) = .28, p < .05, and with total fixation duration to the meme text during the image slide, r(58) = .29, p < .05; the association with total fixation duration to the meme text during the Likert slide was marginal, r(58) = .25, p = .055. The decision to share was negatively associated with total fixation duration to the meme image during the image slide, r(58) = −.29, p < .05. There was no relationship between fixation duration to the Likert scale and likelihood of sharing, r(58) = .11, p = .400. There was a significant positive relationship between sharing and BFI agreeableness, r(59) = .38, p < .05.
There were no other relationships between personality measures and likelihood of sharing.
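For readers who want to reproduce this kind of analysis outside SPSS, the following is a minimal sketch of a Pearson correlation and the t statistic conventionally used to test it, t = r·sqrt((n − 2)/(1 − r²)). The function names and the example vectors are hypothetical; the study's per-participant data are not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length, n >= 3")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Made-up per-participant means: text fixation time (s) and sharing rating.
    text_fixation = [1.0, 2.0, 3.0, 4.0, 5.0]
    sharing = [2.0, 1.0, 4.0, 3.0, 5.0]
    r = pearson_r(text_fixation, sharing)
    print(r, t_from_r(r, len(sharing)))
```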
3.3 Effects of Looking on Likelihood to Share Memes
To further discern the influence of attention on the likelihood of sharing memes, we conducted two stepwise hierarchical linear regressions. First, we calculated attention ratio scores for both the image slide and the Likert slide by dividing the total fixation duration to meme text by the total fixation duration to meme image. Ratio scores above 1.0 indicated greater looking to the text than the image, and ratios below 1.0 indicated greater looking to the image than the text. The mean ratio for the image slide was .55 and ranged from .26 to .99; the mean ratio for the Likert slide was .37 and ranged from .09 to .86. Step 1 of each model contained the fixation ratio score (from either the image or Likert slide). Because correlations showed a significant relationship between BFI agreeableness and sharing, we added that trait to Step 2 of the model to see if it explained additional variance.
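The ratio score and the two-step model can be sketched as follows. This is a plain ordinary-least-squares illustration of entering predictors in blocks and comparing R² between steps; it is not the SPSS procedure used in the study, and the helper names and example data are hypothetical.

```python
def attention_ratio(text_seconds, image_seconds):
    """Text-to-image ratio: > 1 means more looking at text, < 1 at image."""
    if image_seconds <= 0:
        raise ValueError("image fixation duration must be positive")
    return text_seconds / image_seconds


def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of an OLS fit with intercept; predictors is a list of columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = _solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up data for six participants.
    text = [2.0, 1.5, 2.5, 1.0, 3.0, 2.2]
    image = [4.0, 4.5, 3.5, 5.0, 3.2, 3.8]
    agreeableness = [3.0, 2.0, 4.0, 2.0, 5.0, 3.0]
    sharing = [3.5, 2.8, 4.1, 2.0, 4.9, 3.6]
    ratios = [attention_ratio(t, i) for t, i in zip(text, image)]
    step1 = r_squared([ratios], sharing)                 # Step 1: ratio only
    step2 = r_squared([ratios, agreeableness], sharing)  # Step 2: add trait
    print(step1, step2, step2 - step1)
```

The difference between the two R² values is the additional variance explained by the predictor entered at Step 2; a significance test of that change would require an F test, which is omitted here.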
Fig. 1. Image slide (left). Likert slide (right). Heat map of gaze data averaged over all participants during a single trial. Attention is distributed between the lines of text and meme subject. Darker (red) colors indicate areas with greater total fixation time during the trial. (Color figure online)
Image Slide. The text-to-image ratio during the image slide and BFI agreeableness significantly predicted likelihood to share, F(2,55) = 4.63, p < .01, R² = .15. BFI agreeableness specifically predicted increased sharing, β = .31, t(57) = 2.31, p < .05. Ratio score also predicted increased sharing, β = 1.35, t(57) = 1.99, p = .055. Thus, as attention to text increased, the likelihood of sharing increased.
Likert Slide. The text-to-image ratio during the Likert slide and BFI agreeableness significantly predicted sharing behavior for memes, F(2,55) = 4.97, p < .05, R² = .15. Ratio score marginally predicted increased sharing, β = 1.01, t(57) = 1.71, p = .093; BFI agreeableness also predicted increased sharing, β = .33, t(57) = 2.58, p < .05.
The correlational results suggested that participants were less likely to share memes as attention to the meme image increased. To better understand what feature or features of the meme image were driving this pattern, we calculated change scores for each feature; e.g., we subtracted total fixation duration to the face of the meme subject from total fixation duration to the overall meme image. Findings revealed that only total fixation duration to the face correlated with likelihood of sharing, r(58) = −.27, p < .05.
As in the previous regression analyses, we computed a new attention ratio score, dividing fixation duration to the meme text by fixation duration to any faces of individuals in the meme image. Both BFI agreeableness and attention ratio score during the image slide significantly predicted likelihood to share, F(2,55) = 4.63, p < .01, R² = .15. BFI agreeableness, β = .31, t(57) = 2.35, p < .05, and the fixation ratio of text to face, β = .60, t(57) = 2.08, p < .05, both predicted increased sharing. Given that this finding is based only on attention to the face, it suggests that as attention to any faces in the image decreased (relative to attention to the text), sharing likelihood increased.
4 Discussion
This is one of the first studies to examine the role of visual attention in the sharing of digital content. We found that as attention to the text of a meme increased, so did the likelihood of sharing that meme (though over the course of a trial, attention to the image portion of the meme was greater than attention to the text or caption). Understanding a meme, specifically one that has a photo with a caption, requires attention to the caption itself. The caption guides the viewer's interpretation of the photo; the viewer can then evaluate whether the caption suits the photo in the context of giving the meme a specific meaning and/or whether it makes the meme humorous. We also know that more attention is often associated with greater favorability, as seen in forced choice paradigms in behavioral economics [10] and marketing studies of successful advertisements [27]. Therefore, we believe that increased attention to the caption reflected a more favorable reading of the meme, which supports past work showing that people are more likely to share online content that they feel they relate to and find amusing [8].
We asked several post-experiment survey questions, one of which asked why participants did or did not share memes. The overwhelming response was that the meme was either funny or relatable, or both. This finding supports previous studies showing that users like and share content which they feel represents their own interests or an idea that they endorse. Thus, it is reasonable to infer that the more time participants allocate to the text, the more they engage with the meaning or messaging of the meme.
Conversely, attending more to the meme image rather than the caption was associated with a reduced likelihood to share. Our results showed that attention to the meme image was driven largely by increased fixation time to the face of the person or persons in the image.
This reduction in sharing was perhaps a result of the automatic mentalizing that occurs when one looks at a human face [28]. Thus, the attention to faces could have resulted in the "humanization" of the meme subjects and increased concern for their privacy. However, attention to faces could also represent a disconnect from the meme's "message." Otherwise bored perceivers may have oriented to the most salient object on the screen, a human face, which preferentially captures attention [29]. Future analyses should examine the temporal distribution of gaze patterns during the image presentation to see if more time spent at the end of the trial on meme subjects and/or their faces affects sharing decisions.
Interestingly, we found a relationship between sharing likelihood and agreeableness, but not between sharing likelihood and any other personality trait or personal privacy preference. Generally, people with high levels of agreeableness tend to seek social acceptance, show genuine sympathy with others, and have high levels of prosocial motivation [30]. For these individuals, sharing memes may be a means of demonstrating sociability. A study on social attention found that high levels of agreeableness were associated with increased fixation time to the eyes of social stimuli [31]. This finding would appear to run counter to our gaze findings, however: attention to social stimuli (meme target faces, specifically) was negatively related to sharing, and we saw no relationship between agreeableness and total fixation duration to the meme image. Given the low reliability of the BFI measure, we believe these results should be interpreted with caution.
This is the first study of its kind to examine how visual attention affects photo-meme sharing behavior. Future studies are necessary to replicate our results; namely, that engagement with a meme's caption drives decisions to share content, and that focusing on the people in a photo-meme may have a deterrent effect. Nevertheless, this is an important step in understanding how our cognitive and perceptual processes influence decision-making in real-world digital environments.
Acknowledgments. This material is based upon work supported in part by the National Science Foundation under grant CNS-1814476.
References
1. Dawkins, R.: The Selfish Gene. Oxford University Press, Oxford (2006)
2. Wiggins, B.E., Bowers, G.B.: Memes as genre: a structurational analysis of the memescape. New Media Soc. 17, 1886–1906 (2015)
3. Ross, A.S., Rivers, D.J.: Digital cultures of political participation: internet memes and the discursive delegitimization of the 2016 U.S. Presidential candidates. Discourse Context Media 16, 1–11 (2017)
4. Jenkins, E.S.: The modes of visual rhetoric: circulating memes as expressions. Q. J. Speech 100, 442–466 (2014)
5. Thrift, S.C.: #YesAllWomen as feminist meme event. Feminist Media Stud. 14, 1090–1092 (2014)
6. Macskassy, S.A., Michelson, M.: Why do people retweet? Anti-homophily wins the day! In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media (2011)
7. Metaxas, P.T., Mustafaraj, E., Wong, K., Zeng, L., O'Keefe, M., Finn, S.: What do retweets indicate? Results from user survey and meta-review of research, 4 (2015)
8. Amon, M.J., Hasan, R., Hugenberg, K., Bertenthal, B.I., Kapadia, A.: Influencing photo sharing decisions on social media: a case of paradoxical findings. In: 2020 IEEE Symposium on Security and Privacy (SP), pp. 1350–1366 (2020)
9. Hasan, R., Bertenthal, B.I., Hugenberg, K., Kapadia, A.: Your photo is so funny that I don't mind violating your privacy by sharing it: effects of individual humor styles and photo-sharing behaviors. In: The Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, Yokohama, Japan (2021)
10. Krajbich, I., Armel, C., Rangel, A.: Visual fixations and the computation and comparison of value in simple choice. Nat. Neurosci. 13, 1292–1298 (2010)
11. Smith, S.M., Krajbich, I.: Gaze amplifies value in decision making. Psychol. Sci. 30, 116–128 (2019)
12. Shimojo, S., Simion, C., Shimojo, E., Scheier, C.: Gaze bias both reflects and influences preference. Nat. Neurosci. 6, 1317–1322 (2003)
13. Ho, H.-F.: The effects of controlling visual attention to handbags for women in online shops: evidence from eye movements. Comput. Hum. Behav. 30, 146–152 (2014)
14. Kaspar, K., Weber, S.L., Wilbers, A.-K.: Personally relevant online advertisements: effects of demographic targeting on visual attention and brand evaluation. PLoS ONE 14, e0212419 (2019)
15. Geise, S., Heck, A., Panke, D.: The effects of digital media images on political participation online: results of an eye-tracking experiment integrating individual perceptions of "photo news factors." Policy Internet 13(1), 54–85 (2020)
16. Akram, U., et al.: Eye tracking and attentional bias for depressive internet memes in depression. Exp. Brain Res. 239(2), 575–581 (2020). https://doi.org/10.1007/s00221-020-06001-8
17. Rashidi, Y., et al.: "You don't want to be the next meme": college students' workarounds to manage privacy in the era of pervasive photography. In: Proceedings of the Fourteenth USENIX Conference on Usable Privacy and Security, pp. 143–157. USENIX Association (2018)
18. Malik, A., Hiekkanen, K., Nieminen, M.: Privacy and trust in Facebook photo sharing: age and gender differences. Program 50, 462–480 (2016)
19. Besmer, A., Richter Lipford, H.: Moving beyond untagging: photo privacy in a tagged world. In: Proceedings of the 28th International Conference on Human Factors in Computing Systems - CHI 2010, p. 1563. ACM Press, Atlanta (2010)
20. Zemmels, D.R., Khey, D.N.: Sharing of digital visual media: privacy concerns and trust among young people. Am. J. Crim. Justice 40(2), 285–302 (2014). https://doi.org/10.1007/s12103-014-9245-7
21. Zenner, E., Geeraerts, D.: One does not simply process memes: image macros as multimodal constructions. In: Cultures and Traditions of Wordplay and Wordplay Research, pp. 167–194 (2018)
22. Todd, A.R., Bodenhausen, G.V., Richeson, J.A., Galinsky, A.D.: Perspective taking combats automatic expressions of racial bias. J. Pers. Soc. Psychol. 100, 1027–1042 (2011)
23. Rammstedt, B., John, O.P.: Measuring personality in one minute or less: a 10-item short version of the Big Five Inventory in English and German. J. Res. Pers. 41, 203–212 (2007)
24. Hahn, E., Gottschling, J., Spinath, F.M.: Short measurements of personality – validity and reliability of the GSOEP Big Five Inventory (BFI-S). J. Res. Pers. 46, 355–359 (2012)
25. Rammstedt, B., Kemper, C.J., Klein, M.C., Beierlein, C., Kovaleva, A.: A short scale for assessing the big five dimensions of personality: 10 item Big Five Inventory (BFI-10). Methods Data Anal. 7, 17 (2013)
26. Hoyle, R., Stark, L., Ismail, Q., Crandall, D., Kapadia, A., Anthony, D.: Privacy norms and preferences for photos posted online. ACM Trans. Comput.-Hum. Interact. 27, 30:1–30:27 (2020)
27. Maughan, L., Gutnikov, S., Stevens, R.: Like more, look more. Look more, like more: the evidence from eye-tracking. J. Brand Manag. 14(4), 335–342 (2007). https://doi.org/10.1057/palgrave.bm.2550074
28. Looser, C.E., Wheatley, T.: The tipping point of animacy. How, when, and where we perceive life in a face. Psychol. Sci. 21, 1854–1862 (2010)
29. Langton, S.R.H., Law, A.S., Burton, A.M., Schweinberger, S.R.: Attention capture by faces. Cognition 107, 330–342 (2008)
30. Graziano, W.G., Habashi, M.M., Sheese, B.E., Tobin, R.M.: Agreeableness, empathy, and helping: a person × situation perspective. J. Pers. Soc. Psychol. 93, 583–599 (2007)
31. Wu, D., Bischof, W., Anderson, N., Jakobsen, T., Kingstone, A.: The influence of personality on social attention. Pers. Individ. Differ. 60, 25–29 (2014)
Prosthetic Face Makeups and Detection
Yang Cai(✉)
Carnegie Mellon University, Pittsburgh, PA 15213, USA
[email protected]
Abstract. Prosthetic makeup can be used to alter the appearance of a human face drastically through molded facial parts, skin-like silicone masks, and plastic surgery. Compared to conventional disguise methods such as glasses, a fake mustache, or a wig, it is more realistic and more difficult to detect. We experimented with prosthetic makeup using an affordable near-infrared (NIR) camera and a portable thermal camera for smartphones. We found that spectral signatures can reveal features in certain facial areas, such as the eye sockets, nose bridge, and forehead, that are almost invisible to humans and computers under visual inspection. NIR reflectance depends on the distance to the lighting source, the surface material properties, and the face pose. Thermal signatures also depend on the breathing flow, facial temperature, and thermal properties of the prosthetics, as well as the air gap between the prosthetics and the skin. We found that location-based spectral features and multiple bands can potentially increase detection accuracy and explainability in terms of location, material, and makeup process.
Keywords: Prosthetic makeup · Deep fake · Face recognition · Face detection · Pattern recognition · Forensics · Computer vision · Thermal imaging
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 207–214, 2021. https://doi.org/10.1007/978-3-030-79997-7_26
1 Introduction
Prosthetic makeup is a set of disguise techniques that conceal or change a person's physical appearance through molded parts, skin-like silicone masks, and even plastic surgery. In contrast to conventional disguise methods such as rigid facial masks, fake mustaches, fake beards, wigs, glasses, and costumes, prosthetic makeup is more realistic, elaborate, custom-made, and difficult to detect.
Professional facial prosthetics can be traced back to WWI, about 100 years ago, when American sculptor Anna Coleman Ladd hand-crafted facial prosthetics for disfigured soldiers [1]. She had a cast made of their faces and constructed each prosthetic from very thin copper. She then painted the prosthetic parts to resemble the soldiers' skin color. Early facial prosthetics brought humanity and dignity back to the wounded soldiers.
Facial prosthetics have been widely used in Special Effects (SFX) in filmmaking. Professional makeup artists can transform an actor's appearance drastically. Films such as The Lord of the Rings, Mrs. Doubtfire, and Tootsie have used prosthetic makeup intensively. Some makeup may take as long as 8 h, including elaborate casting, taping, painting, retouching, and adjusting. One of the outstanding makeup masters is the two-time Academy Award winner Kazu Hiro [2]. In an interview, he reviewed the prosthetic makeup process for the film Bombshell [3], which used prosthetic sculpting, molding, and casting techniques to create advanced cosmetic effects. To prepare the facial prosthetics, a mold of the desired part of the body is first cast using ingredients such as alginate and plaster of Paris. The desired shape of the prosthetic is then achieved by using clay to sculpt over the mold. Curing this mold with a prosthetic material such as latex or silicone is the final step in making the prosthetic.
Facial prosthetics have also been used in the intelligence community to enable spies to disguise themselves and change their appearance at close range. The CIA has a special unit for disguise technologies. Jonna Mendez, the former Chief of Disguise at the CIA, once demonstrated facial prosthetics on herself to President George H. W. Bush at the White House. In the documentary video [4], she explained how to use prosthetic makeup to convincingly change the structure of the cheekbones, the shape of the nose, and so on. It is not hyperbole to infer that such techniques might be used by hostile groups to undermine national security.
Today, altering a face digitally is trivial. There are plenty of downloadable apps for all sorts of filters and makeup. However, altering a face in the physical world so that it can be viewed or captured at close range still requires a certain level of training and expertise. Materials to make a prosthetic limb are available on the market for anyone to purchase, and the techniques to make one can be learned through the Internet. This makes it difficult to detect faces disguised with prosthetic makeup and has serious security implications. On the other hand, facial prosthetics also help individuals protect themselves and defeat mass surveillance in public.
In this study, we explore the spectral signatures of common prosthetic materials such as silicone masks and rubber foams with near-infrared and thermal imaging devices. We focus on semantically articulated anomalous features of those prosthetic materials that could inspire future computational detection techniques.
2 Related Work
Many prior studies have focused on simple disguises and makeup that obscure the face without altering its structure, such as eyeglasses, goggles, a mustache, a beard, or a wig, but that can still be adversarial to facial recognition systems. Researchers in [5] collected 2,460 images from 410 different subjects to experimentally study human facial detection while wearing disguise or makeup. Their work showed that the performance of two commercial matchers fell drastically when used to automatically recognize faces. The Disguised Face Database [6] contains over 100,000 high-quality images of 325 individual people. Subjects were captured from 5 different viewpoints while wearing 28 different disguises, including a mask, mustache, eyebrows, face mask, eyepatch, hat, sunglasses, and their combinations.
Near-infrared (NIR) spectrum imaging has been used for face detection, where it outperforms state-of-the-art face detection methods in stable as well as variable lighting conditions [8]. NIR imaging has several advantages: it is affordable to average users, it can sense human skin details such as veins, and it can see through glass, which thermal imaging cannot. Skin detection is critical for distinguishing non-skin attachments from skin based on reflectance. Human skin has a change in reflectance at around 1.4 µm, allowing for highly accurate skin mapping by considering the weighted difference between the lower-band near-IR image and the upper-band near-IR image, where the weight is the ratio of luminance in the lower-band near-IR (0.8–1.4 µm) to the upper-band near-IR (1.4–2.2 µm). NIR texture patches are also used in machine learning to detect authentic or disguised faces [9], but these methods are limited by the training samples and their environments. The study in [11] demonstrates that the upper band of the near-infrared (1.4–2.4 µm) is particularly advantageous for disguise detection purposes. This is attributable to the unique and universal properties of human skin in this sub-band.
Thermal imaging is a more expensive approach, but it provides unique advantages in revealing hidden features, such as parts implanted under the skin, based on their thermal signatures. It can be used to see through a prosthetic facial mask to reveal the structure behind it. In contrast to NIR images, thermal images of a face contain drastically different signatures, which are also sensitive to facial and ambient temperatures. The study in [10] presents technology for detecting faces in thermal images. Texture analysis has been studied to distinguish authentic from fake facial patches based on the pixel intensity variances in visible and thermal images [7]. The study in [11] proposes four modalities (visible, lower NIR, upper NIR, and thermal infrared) for disguise detection.
However, there have been fewer studies on prosthetic facial makeup. The authors of the study [11] demonstrated the appearance of an individual wearing a fake nose in the visible and upper NIR bands but did not show substantial signatures in the spectrum. Furthermore, the existing studies have not covered new prosthetic makeup materials, such as elastically articulated silicone facial masks and rubber foams, that are available in online shops.
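The weighted-difference skin mapping summarized above can be sketched in a few lines. The text specifies only that the weight is the ratio of lower-band to upper-band luminance; the use of the global mean as the luminance measure, the sign convention (lower band minus weighted upper band), the threshold, and the tiny example images are all assumptions made for illustration, not the procedure of the cited studies.

```python
def mean_luminance(image):
    """Mean pixel value of a 2-D image given as a list of rows."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def weighted_difference(lower_band, upper_band):
    """Per-pixel lower-band minus weighted upper-band NIR image.

    The weight is the ratio of mean luminance in the lower band to that
    in the upper band (assumed here to be a single global value).
    """
    upper_mean = mean_luminance(upper_band)
    if upper_mean == 0:
        raise ValueError("upper-band image has zero mean luminance")
    weight = mean_luminance(lower_band) / upper_mean
    return [
        [lo - weight * up for lo, up in zip(row_lo, row_up)]
        for row_lo, row_up in zip(lower_band, upper_band)
    ]


def skin_mask(lower_band, upper_band, threshold=0.0):
    """True where the weighted difference exceeds a (hypothetical) threshold."""
    diff = weighted_difference(lower_band, upper_band)
    return [[value > threshold for value in row] for row in diff]


if __name__ == "__main__":
    # Made-up reflectances: top row behaves like skin (bright in the lower
    # band, dark in the upper band); bottom row like a non-skin attachment.
    lower = [[0.8, 0.8], [0.5, 0.5]]
    upper = [[0.1, 0.1], [0.5, 0.5]]
    print(skin_mask(lower, upper))
```

A material whose reflectance is similar in both bands produces a small or negative weighted difference, which is why a non-skin attachment would drop out of the mask under these assumptions.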
3 Prosthetic Face Makeup Experiments
In this study, we experiment with readily available prosthetic makeup materials, including a silicone face mask and rubber foam structural attachments. Figure 1 shows an elastic mask that is available from Amazon.com. It is made of thin silicone with openings for the eyes and mouth, and it comes with realistic hair and wrinkled skin. In contrast to conventional masks, which are rigid and plastic-like, this mask is much more realistic, with a translucent skin color and a tight fit to the head that lets the eyes and mouth move. In live webcam video or on the street, the mask is hard to detect. The author once wore the mask to a webinar with over 190 attendees, and no one raised a flag. Figure 1 also shows two prosthetic rubber foams for altering the shape of the nose and chin. These products can be ordered online from specialty prosthetic makeup shops. We then test them with available near-infrared and thermal cameras to reveal the underlying spectral signatures for feature representation and pattern recognition.
Fig. 1. Silicone face mask (left), the prosthetic nose (center) and chin (right)
4 Thermal Signatures
We use the compact Seek Thermal™ camera, which has a spectral range of 7.5–14 µm, a pixel pitch of 12 µm, 206 × 156 resolution, a 36° field of view, a 9 Hz frame rate, a detection distance of up to 1,000 ft, and a temperature range from −40 °F to 626 °F; see the product specification in [13]. Figure 2 shows the thermal signatures of a normal face and the face with the silicone mask on. From the thermal intensity distribution, we found that the mask alters thermal signatures drastically. For example, the areas around the eye sockets of the mask appear colder, with sharp edges. The area around the mouth also has a sharp intensity gradient. This is due to the small air gaps between the mask and the skin. The thermal reflectances of the mask and the skin are different, creating abnormal thermal intensity distributions. In addition, the prosthetic mask blocks the heat from the hotspots on the face, such as the forehead and eyebrows. The thermal features of the face are replaced with the signatures of the prosthetic mask. For example, the T-shape below the forehead, the two eyebrows, and the lips disappear. These semantic patterns can be used for future detection at particular locations of a face. Figure 3 shows the thermal signatures of the prosthetic nose and chin. Similarly, due to the differences in material properties, the prosthetic foams create anomalous patterns that stand out from the normal distribution. The two patches appear colder than skin temperature. In addition to the thermal properties of the material, the glue between the prosthetic foam and the skin may also play a role here. If the glue is not thermally conductive, or if there is an air gap between the two surfaces, this will show up in the thermography of the face. Thermography is affected by body temperature, breathing patterns, and the duration of the measurement. The thermal contrast of a prosthetic attachment may also be affected by its location.
For example, the prosthetic patch on the bridge of the nose in Fig. 3 (right) may have a similar temperature as the skin because normally the nose is the coldest place on a face.
Fig. 2. Thermal signatures (7.5–14 µm) of the bare face (left) and with the mask (right)
Fig. 3. Thermal signatures (7.5–14 µm) of the prosthetic nose (left) and chin (right)
5 Near-Infrared Signatures
We experimented with the affordable WyzeCam home security webcam, which has an F2.0 aperture and four 850 nm infrared LEDs that deliver clarity up to 30 feet away in total darkness [12]. To block visible light during the daytime, a piece of black plastic film cut from old X-ray film is placed in front of the webcam. Figure 4 shows the near-infrared signature of the silicone mask in two poses. Under 850 nm LED lighting, we can see that the reflectance from the skin in the eye sockets is
significantly different from the mask around them. The areas above the lips are also different, which may be due to the mask surface angle or the translucent property of the mask material. The two poses show the area features are consistent. The intensity gradients around the eye sockets provide a clue for detecting prosthetics. Figure 5 shows the near-infrared signatures of the prosthetic nose in two poses. The front view reveals the nose patch has a strong reflectance compared to the surrounding area. However, the reflectance depends on the poses and distance from the NIR light source. The right image of Fig. 5 shows that the NIR light was saturated so that the reflectance from the prosthetic nose and the skin is indistinguishable.
Fig. 4. Near-infrared signatures (850 nm) of the silicone mask in two poses
Fig. 5. Near-infrared signatures (850 nm) of the prosthetic nose in two poses
Compared to thermal signatures, NIR signature distributions are more or less close to the signatures in visible lighting. Figure 6 shows the NIR signatures of the prosthetic chin in two poses. We can observe the clear edges around the prosthetic chin, which
provides a clue for detecting prosthetics. As NIR LEDs are inexpensive and widely available, the detection technology can be optimized by combining multiple NIR bands to reveal significant signatures of prosthetics.
Fig. 6. Near-infrared signatures (850 nm) of the prosthetic chin in two poses
6 Toward Explainable Detection
Prosthetic makeups are advanced disguises that are difficult for humans and computers to detect visually. Spectral signatures of prosthetics and authentic faces in NIR and thermal imaging can be learned from training datasets (e.g., by texture analysis) [7]. However, machine learning algorithms need substantial amounts of data covering different poses, lighting, skin colors, and ambient environments. They provide neither an explanation of the signatures nor the flexibility to handle new prosthetics. Our experiments show that multiband spectral gradient signatures at particular facial areas, such as the eye sockets, chin, forehead, and nose, can help to detect prosthetics. To achieve this, we would first detect facial landmarks in NIR or thermal images [7–10] and then examine the spectral signatures around those areas. This would add to the explainability of the detection, for example by indicating where the prosthetics are. Multiple spectral bands would also reveal more spectral signatures and material properties. For example, the rubber foam-based prosthetic nose and chin have a strong reflectance in NIR and colder signatures in the thermal images due to air gaps between the prosthetics and the skin. These add physical explainability to the detection models.
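The idea of checking gradient signatures at particular facial areas can be sketched as below. This is a minimal illustration, not the detection pipeline of this study: it assumes the facial regions have already been located by some landmark detector, and the region names, box coordinates, threshold, and the functions `mean_gradient` and `flag_regions` are all hypothetical. The image is a single-band (NIR or thermal) intensity grid given as a 2-D list.

```python
def mean_gradient(image, box):
    """Mean absolute intensity gradient inside box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    total, count = 0.0, 0
    for r in range(top, bottom):
        for c in range(left, right):
            # Forward differences; treated as zero at the image border.
            gx = image[r][c + 1] - image[r][c] if c + 1 < len(image[r]) else 0
            gy = image[r + 1][c] - image[r][c] if r + 1 < len(image) else 0
            total += abs(gx) + abs(gy)
            count += 1
    return total / count if count else 0.0


def flag_regions(image, regions, threshold):
    """Return the names of regions whose gradient score exceeds the threshold."""
    return [name for name, box in regions.items()
            if mean_gradient(image, box) > threshold]


# Toy 4 x 4 intensity grid with one sharp edge near the lower right.
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 50, 10],
    [10, 10, 10, 10],
]
regions = {"forehead": (0, 0, 1, 4), "eye_socket": (1, 1, 4, 4)}  # hypothetical boxes
flagged = flag_regions(image, regions, threshold=5.0)
```

Because the output is a list of named regions rather than a single score, the result directly indicates where an anomaly was found, which is the kind of location-based explanation discussed above.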
7 Conclusions
We experimented with prosthetic makeups (a flexible, articulated silicone mask and a prosthetic rubber foam nose and chin) using an affordable near-infrared camera and a portable thermal camera for smartphones. We found that spectral signatures can reveal features at certain facial areas, such as the eye sockets, nose bridge, and forehead, that are almost invisible to humans and computers by visual inspection. NIR reflectance depends on the distance to the lighting source, the surface material properties, and the face pose.
Thermal signatures also depend on the breathing flow, facial temperature, and thermal properties of the prosthetics, as well as the air gap between the prosthetics and the skin. We found that location-based spectral features and multiple bands can potentially increase detection accuracy and explainability in terms of location, material, and makeup process.
Acknowledgment. The author would like to thank Scott Ledgerwood, Neta Ezer, Nick Molino, Justin Viverito, Dennis Fortner, Likhitha Chintareddy, and Mel Siegel for their discussions. The author is grateful for the support from NIST, PSCR, PSIAP, Northrop Grumman Corporation Cyber Security Consortium, and SOTERIA AI Program.
References
1. Davies, G.: American sculptor built facial prosthetics for disfigured WWI soldiers, World War I Centennial. worldwar1centennial.org. https://www.worldwar1centennial.org/index.php/communicate/press-media/wwi-centennial-news/3222-american-born-sculptor-built-facial-prosthetics-for-wwi-soldiers.html
2. Kazu Hiro: Wikipedia. https://en.wikipedia.org/wiki/Kazu_Hiro
3. Kazuhiro Tsuji on Bombshell’s Prosthetic Makeup. https://www.youtube.com/watch?v=NO06357ggB4
4. Mendez, J.: Former CIA Operative Explains How Spies Use Disguises: WIRED. Youtube.com. https://www.youtube.com/watch?v=JASUsVY5YJ8
5. Wang, T.Y., Kumar, A.: Recognizing Human Faces under Disguise and Makeup. ISBA (2016). http://www4.comp.polyu.edu.hk/~csajaykr/myhome/papers/ISBA2016.pdf
6. The Disguised Face Database. https://www.cis.upenn.edu/~dfaced/index.html
7. Dhamecha, T.I., Nigam, A., Singh, R., Vatsa, M.: Disguise Detection and Face Recognition in Visible and Thermal Spectrums. http://www.iab-rubric.org/papers/disguise-ICB13.pdf
8. Dowdall, J., Pavlidis, I., Bebis, G.: Face detection in the near-IR spectrum. Image Vis. Comput. 21, 565–578 (2003). https://www.cse.unr.edu/~bebis/IVC03.pdf
9. Ghoneim, A., Youssif, A.: Visible/infrared face spoofing detection using texture descriptors. In: CSCC 2019 (2019). https://doi.org/10.1051/matecconf/201929204006. https://www.matec-conferences.org/articles/matecconf/pdf/2019/41/matecconf_cscc2019_04006.pdf
10. Prokoski, F., Riedel, R., Coffin, F.: Identification of individuals by means of facial thermography, CH3119-5/92. IEEE Xplore (1992)
11. Pavlidis, I., Symosek, P.: The imaging issue in an automatic face/disguise detection system. In: CVBV 2000 (2000). http://www.cpl.uh.edu/publication_files/C13.pdf
12. WYZE product website. Captured on 13 January 2021. https://wyzelabs.zendesk.com/hc/en-us/articles/360030119511-Night-Vision
13. Seek Thermal Specifications. https://www.thermal.com/uploads/1/0/1/3/101388544/compact-sellsheet-usa_web.pdf
Analysis of Risks to Data Privacy for Family Units in Many Countries
Wayne Patterson
Patterson and Associates, 201 Massachusetts Avenue NE, Suite 316, Washington, DC 20002, USA
Abstract. Ground-breaking research by Sweeney a number of years ago demonstrated the vulnerability of personal information that can be relatively easily discovered, allowing a malicious attacker or hacker to recover sensitive information about an individual. In particular, the Sweeney research showed that close to 90% of the individuals in the United States can be identified uniquely using only three easily discoverable data points: postal code, gender, and birthdate including year. Our current research has shown that in most United Nations member countries, including the United States, almost 90% of family units at the same residence can be identified uniquely using only two easily found data points: postal code and the birthdate, including year, of one family member.
Keywords: Data privacy · Sweeney research · Family units · Malicious attacks · Hackers · Postal codes
1 Introduction
In previous work, several authors, beginning with Sweeney, have demonstrated the vulnerability of many computer users to the discovery of their personal information given only three pieces of easily obtainable personal information: exact date of birth (day, month, year), gender, and residential postal code (or ZIP code in United States Postal Service terminology). In her original study [1], Sweeney demonstrated that approximately 87% of the persons in the United States of America could be uniquely identified by only the three data points indicated above. Using 2020 population data, this corresponds to approximately 288 million of the nation’s 331 million residents. It is not necessary to indicate how simple it is for a hacker to identify those three data points for any person. For a hacker in the physical vicinity of the victim, “dumpster diving” is likely to succeed, whereas perusing the Internet will usually provide these data points for a remote attacker. Patterson and Winston-Proctor [2] extended the results published by Sweeney to a comparison study of 30 countries (mostly United Nations members) selected according to population, internet, and computer usage. The current research extends these results to the 193 United Nations member countries.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 215–222, 2021. https://doi.org/10.1007/978-3-030-79997-7_27
In general, these studies have concluded that in most countries, it is possible to reveal the identity of individuals by discovering only the three data points indicated above, which we identify in what follows as the triad T = (PC, G, B), where PC = postal code, G = gender, and B = birthdate including year. Considering the skill set of potential hackers, we change the problem somewhat: rather than examining the difficulty of discovering the data points that determine an individual uniquely, we examine the related question of uniquely identifying a family unit living together. The underlying assumption is that it might serve a hacker’s purposes just to identify the family unit at a given address, for example if the hacker’s malicious purpose is to inflict some damage (theft or otherwise) at this family’s location. This paper will show that in a large percentage of United Nations member countries with sufficient published data, identifying the family unit has approximately the same level of success with only two data points rather than three (in this case, postal code and the birthdate, including year, of one family member). We will also note that in the vast majority of countries, the problem of identifying the individual is similar to the problem of identifying the family unit (again with one less data point). Nine exceptional countries will be examined in greater detail. Sweeney demonstrated [1] that, using only public database information, a very high percentage (~87%) of United States residents can be identified.
This could expose the overwhelming majority of United States residents to the capture of such personal information by hackers or other malevolent actors, through techniques we now refer to as “dumpster diving.” In particular, her research demonstrated that approximately 87% of the United States population can be identified uniquely using only the United States’ five-digit postal code, date of birth (including year), and gender. Although this result has held up over time, Sweeney made no attempt to develop similar estimates for other countries. In this paper, we use Sweeney’s techniques [1] to provide estimates of the ability of similar demographics to provide the same type of data for all United Nations member countries. Through this mechanism, we attempt to determine the susceptibility to data privacy attacks of a substantial portion of the world’s population. Throughout the world, there is a rapid increase in the reported number of incidents of vital personal information stored electronically being captured by malicious actors. The degree of the threat is further complicated by world-wide concern about COVID-19 and the question of the validity of (or “fake”) information sources regarding the pandemic. Two factors have heightened public concern about the security of personal information: first, the exponential growth of cyberattacks in virtually every computing environment; and second, public awareness of vulnerability to attacks aimed directly at the individual or at an organization that maintains widespread data on the entire population.
As mentioned above, Dr. Latanya Sweeney demonstrated the vulnerability of most persons in the United States to easily available demographic data: “It was found that 87% (216 million of 248 million) of the population of the United States had reported characteristics that likely made them unique based only on {5-digit ZIP, gender, date of birth}” [1]. However conclusive her research was concerning most United States persons and their susceptibility to easily found demographic information, her work did not address similar issues in other countries. Nevertheless, she provided a template for developing similar estimates for other countries throughout the world. For our purposes, we will attempt to analyze the vulnerability of persons to having key personal data elements obtained by malicious actors. The current membership of the United Nations (UN) consists of 193 countries, so our analysis will consider all of these countries. Following the Sweeney approach, we will explore, for each United Nations member country, the value of the triad T = (PC, G, B), and thus establish the vulnerability of almost all persons to cyberattack. Sweeney’s research has been critical in alerting US residents as to how easily they can be identified individually with techniques such as “social engineering” or “dumpster diving.” Since approximately 87% of the US population can be identified uniquely by the value of T, the identity of an individual can easily be compromised by persons seeking that information in publicly available databases. In prior work [1, 2], comparable analyses were done for a subset of the UN countries considered here. For each country, we have estimated the percentage of the population that can be uniquely identified through analysis of the possible values of the triad T.
What we will determine in this paper is that we may achieve very comparable results by relaxing one of the three values in T to form a new triad T′, which reduces the challenge to a potential attacker, who then has to discover only two pieces of information rather than three.
2 Selection of Countries for Analysis
We have included all 193 UN countries. For the purposes of this analysis, we divide them into four groups: (1) countries without standardized postal code systems (44); (2) countries for which the absolute difference between T and T′ is less than 1% (123); (3) countries for which the absolute difference between T and T′ is more than 1% but less than 10% (17); (4) countries for which the Uniform Distribution or the Normal Distribution indicates that more than 10% but less than 90% of the population can be uniquely identified (9).
3 Postal Code Systems
Our approach has been to use the Sweeney methodology for all appropriate United Nations countries. Doing this requires the total population, life expectancy, and postal code system of each country. Total population [3] and life expectancy [5] are easily found and have a high degree of accuracy. Postal code systems exist in most UN countries
but not all. Postal code data is of a different nature, since the information that is easily available is the potential range of values for postal codes in each of our selected countries. Throughout the world, most postal codes use a combination of numerals {0, 1, 2, …, 9}, which we describe as ‘N’, and letters of the relevant alphabet. In the Roman alphabet (mostly in uppercase), we have {A, B, C, …, Z}, which we designate as ‘A’. With this notation, it is possible to determine the total range of possible postal codes. For example, the older United States five-digit ZIP code system has the syntax NNNNN, which allows for a maximum of $10^5$ = 100,000 possible postal codes. It is necessary to be able to estimate the key statistics Sweeney used. Population data is easily available for all UN countries [3], as are mortality rates or life expectancy rates, to develop data as in Sweeney’s paper. The third statistic used by Sweeney is, for the United States, the 5-digit form of postal code, called in the US the “ZIP code”. As a comparison, the United Kingdom postal code system is (for almost all cases) AANA NAA, therefore $26^4 \times 10^2$ = 45,697,600. However, the number of allowable postal codes is better approximated by 817,960, since many letter combinations are not used. We have found the syntax for postal codes used in 149 of the UN countries [6]; the remaining 44 do not use a national postal code (see Table 1). In order to list all UN countries in the available space, we use the International Organization for Standardization (ISO) three-letter country code system, which can be found at [7].
Table 1. 44 UN countries without standardized postal code systems.
ISO country code AGO ARE ATG BFA BHS BLZ BOL BTN BWA CAF COD COG COM CIV DJI DMA ERI FJI GAB GMB GNQ GRD GUY KIR MRT MWI PAN PRK QAT RWA SLB SLE SSD STP SUR SYC SYR TGO TON TUV UGA VUT YEM ZWE
We will proceed without further consideration of these 44 countries. Not all syntactically allowable postal codes will be in use. For example, for the United States, the five-digit ZIP would allow for $10^5$ = 100,000 codes. However, the current number of codes in use is 40,933, or about 41% of the allowable values. In order to develop an estimate for all of our target countries, we need to begin with the allowable postal code values and the life expectancy by country. A few examples of the syntax for postal codes in our target countries include: Canada (ANA NAN); Czech Republic (NNNNNNN); Kazakhstan (NNNNNN); Netherlands (NNNN AA); Philippines (NNNNN); Switzerland (NNNN). The first estimate of the number of postal codes per country (PC) is determined by the syntax and the potential number of occurrences for each character in the string representing the code. In a number of cases, it is possible to determine if a country uses all of the possibilities for codes under its coding system. But in most countries,
not all possibilities are used—only a certain fraction of the eligible set of codes are actually in use. We first calculate the likelihood of identifying individuals uniquely if we assume uniform distribution of persons for each country. But we must make assumptions about postal code distributions in each country; thus, we can only assume that uniform distribution is reasonable as a first approximation to the results obtained for the United States. Using uniform distribution, we can accurately calculate the total number of “pigeonholes” for most of the identified UN countries, and then the uniform distribution by dividing the population by the number of pigeonholes.
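The syntactic count described in this section can be sketched in a few lines. This is an illustrative sketch rather than the procedure used to build the paper's data: the function names are hypothetical, 'A' is assumed to range over the 26 letters of the Roman alphabet, and separators such as spaces are assumed to add no choices. The US figure of 40,933 codes in use is the one quoted in the text.

```python
def syntactic_code_count(pattern: str) -> int:
    """Count the codes a syntax allows: 'N' = digit (10), 'A' = letter (26)."""
    count = 1
    for symbol in pattern:
        if symbol == "N":
            count *= 10
        elif symbol == "A":
            count *= 26
        # Spaces and other separators are fixed, so they add no choices.
    return count


def share_in_use(codes_in_use: int, pattern: str) -> float:
    """Fraction of syntactically possible codes that are actually assigned."""
    return codes_in_use / syntactic_code_count(pattern)


us_possible = syntactic_code_count("NNNNN")    # United States five-digit ZIP
ca_possible = syntactic_code_count("ANA NAN")  # Canada
us_share = share_in_use(40_933, "NNNNN")       # about 41% of allowable values
```

The gap between the syntactic maximum and the codes actually in use is why the syntactic count is treated only as a first estimate of PC.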
4 “Pigeonholes”
As we described in [2], we need to assess the number of individuals that can fit into each combination of categories; these combinations are often called “pigeonholes”. We can thus rephrase Sweeney’s conclusions by restating that, in her study, approximately 87% of individuals occupy a pigeonhole to which no other data point (or pigeon) is assigned. Another way of describing the problem is through the terms “bits” and “buckets”: just as we describe assigning the characteristics of individuals to “pigeonholes”, we can describe the same problem as assigning “bits” to “buckets”. For the United States, the number of pigeonholes is the product of the numbers of possible values of birthdate with year, gender, and (five-digit) ZIP code. The contribution to this number related to gender is 2, say $p_g = 2$. For birthdate, we approximate the number of values using $p_d = 365$ for days of the year (a slight simplification ignoring leap years), multiplied by the number of years, estimated by the country’s life expectancy in years. Call this $p_{le}$. The final relevant factor in estimating the number of pigeonholes is the number of potential postal codes, PC [6]. Then the total number of pigeonholes PH is
$$PH = p_g \times p_d \times p_{le} \times PC = (2 \times 365) \times p_{le} \times PC$$
Among the available data for the UN countries studied, the value PC is often not made public. As a first-level analysis, we determine, country by country for the applicable UN members, the components necessary to perform the Sweeney-like analysis, ignoring the 44 countries listed in Table 1. These components are, for each person in any of the countries studied, gender; birthdate including month, day, and year; and postal code of residence, together with life expectancy by country. What is not known about the components leading to the determination of the number of people in each pigeonhole is the distribution function.
In the original Sweeney paper, she had additional information furnished by the US postal system that allowed her to give a reasonable estimate of the distribution function. In our case, the immediate estimate of the population distribution would be to assume the uniform distribution—in other words by dividing the population by the number of pigeonholes. This analysis does allow us to at least compare the distributions by country. We will further refer to this analysis as the Uniform Distribution. Beyond the Uniform Distribution, it is more likely that the actual distribution would approach a normal distribution (or a bell-shaped curve). In our earlier paper [2], we
attempted to determine both the Uniform Distribution and the Normal Distribution for the allocation of the population to pigeonholes. In this paper, we will only refer to the Uniform Distribution as we will attempt to compare results of this analysis to analysis assuming the challenge to an attacker is not the identification of an individual, but rather the identification of a family or household unit, also assuming a Uniform Distribution.
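As a worked illustration of the pigeonhole count and the Uniform Distribution, the sketch below evaluates PH = (2 × 365) × p_le × PC and then divides the population by PH. The function names are hypothetical. The US population (331 million) and the number of ZIP codes in use (40,933) are the figures quoted in the text, while the life expectancy of 79 years is an assumed round value chosen for illustration, not a figure from the paper.

```python
DAYS_PER_YEAR = 365  # leap years ignored, as in the text
GENDERS = 2


def pigeonholes(postal_codes: int, life_expectancy_years: int) -> int:
    """PH = p_g * p_d * p_le * PC."""
    return GENDERS * DAYS_PER_YEAR * life_expectancy_years * postal_codes


def uniform_occupancy(population: int, holes: int) -> float:
    """Average number of persons per pigeonhole under the Uniform Distribution."""
    return population / holes


# United States example; the life expectancy value is an assumption.
us_holes = pigeonholes(postal_codes=40_933, life_expectancy_years=79)
us_occupancy = uniform_occupancy(331_000_000, us_holes)
print(us_holes, round(us_occupancy, 3))
```

An average occupancy well below one person per pigeonhole is what makes unique identification likely; the actual share of unique individuals still depends on how unevenly people are spread across pigeonholes.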
5 Second Approach
Using the Uniform Distribution helps because it is much easier computationally, but it is less likely to model real-world conditions. For example, a postal code containing a 100-unit apartment building and a nearby code containing only single-family dwellings will hold very different numbers of persons. Thus we use a second model to determine the likelihood of multiple persons with identical coordinates, in this case altering the three data points required by basing the analysis on family units rather than individuals. The assumption must be made in both cases that the distribution of “pigeonhole” data can be modeled by a normal distribution. This is not an unreasonable assumption, given that the data from the original Sweeney paper demonstrates a similar model.
6 Family Unit Definition
In order to conduct the analysis of the threat to family units rather than individuals, we need to use the modified triad T′ = (PC, F, B), as opposed to T = (PC, G, B), where PC and B are defined as before and G (gender) is replaced by F for family unit. For all UN countries, this data is available through the United Nations Children’s Fund (UNICEF) [4]. As noted in the Introduction, we consider not the difficulty of discovering the data points that determine an individual uniquely, but rather the related question of uniquely identifying a family unit. The relationship between T and T′, for the countries for which this data is available, is:
$$T' = T \times H/G = T \times H/2$$
where H = average household size, as found in [4]. Using the different triads T and T′ for all UN countries for which the Uniform Distribution can be calculated, the value of the uniform distribution varies by less than 6% in 136 UN countries. In only 9 countries do these values vary by more than 6%, and we consider these countries individually in Sect. 7. In order to find an appropriate normal distribution, we use the data previously calculated for the quotient used in the uniform distribution, and use a Monte Carlo approach to estimate a peak value for a normal distribution to be fitted to the known data. Tables 2 through 4 show the percentages of uniquely identifiable combinations of T = (PC, G, B) and T′ = (PC, F, B) for each country. In the one exceptional case, the T value for South Africa (ZAF) is 86%. Although it is a reasonable assumption to look for a distribution of the data points, or “bits”, to be assigned to “buckets” beyond the uniform distribution described above, there is no overwhelming reason to use a normal distribution rather than any other distribution model, such as binomial, Poisson, Student’s t, or other.
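One reading under which the stated relation between T and T′ holds is to interpret each value as the number of pigeonholes available per unit: per person for T, and per household for T′, where dropping gender halves the pigeonholes and grouping by household divides the number of units by H. This interpretation is an assumption made here for illustration; the function names and the sample country figures below are hypothetical and not taken from the paper's data.

```python
import math


def holes_per_person(population, postal_codes, life_expectancy_years):
    """Pigeonholes per individual using (postal code, gender, birthdate)."""
    return 2 * 365 * life_expectancy_years * postal_codes / population


def holes_per_household(population, household_size, postal_codes,
                        life_expectancy_years):
    """Pigeonholes per household using (postal code, one birthdate)."""
    households = population / household_size
    return 365 * life_expectancy_years * postal_codes / households


# Hypothetical country: 10 million people, H = 4.5, 10,000 codes, 70 years.
t = holes_per_person(10_000_000, 10_000, 70)
t_prime = holes_per_household(10_000_000, 4.5, 10_000, 70)
ratio = t_prime / t  # equals H / 2 under this reading
```

Under this reading, any country with an average household size above 2 has more pigeonholes per household than per individual, which is consistent with the family unit being about as identifiable from two data points as the individual is from three.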
Table 2. Countries for which the absolute difference between T and T′ is less than 1% (123) and both T and T′ > 90%.
ISO country code ALB AND ARG ARM AUS AUT BEL BGR BHR BIN BLR BRA BRB BRN CAN CHE CHL CHN COL CPV CRI CUB CYP CZE DEU DNK DOM DZA ECU ESP EST FIN FRA FSM GBR GEO GHA GIN GNB GRC GTM HND HRV HUN IRL IRN IRQ ISL ISR ITA JAM JOR JPN KAZ KEN KGT KHM KNA KOR KWT LAO LBN LBR LBY LCA LIE LKA LTU LUX LVA MAR MCO MDA MDV MHL MKD MLI MLT MMR MNE MNG MUS MYS NAM NGA NIC NLD NOR NPL NZL PER PLW POL PRT PRY ROU RUS SAU SDN SEN SGP SLV SMR SOM SRB SVK SVN SWE SWZ THA TJK TKM TLS TTO TUR TZA UKR URY UZB VCT VNM WSM ZMB
Table 3. Countries for which the absolute difference between T and T′ is less than 10% (17) and, in all but one case, T and T′ > 90%.
ISO country code AFG AZE BDI EGY HTI IDN IND LSO MEX MOZ NER OMN PAK TUN USA VEN ZAF
Table 4. Countries for which the uniform distribution or the normal distribution indicates more than 10% of the population but less than 90% can be uniquely identified (9) ISO country code BEN BGD CMR ETH MDG NRU PHL PNG TCD
7 Extending to a Global Model
The combined population of the 193 UN countries is, at the time of this writing, 7.577 billion individuals, or approximately 96.8% of the world population of 7.826 billion. Our research has estimated that the 123 countries listed in Table 2 have a combined population of 4,121,859,746, which is approximately 54.4% of the earth’s population, and that over 99% of that population can be identified uniquely by determining the triads T and T′ described in the Introduction. Considering further the combination of the 123 countries from Table 2 and the 17 countries from Table 3, we can see that they represent 6,767,575,199 persons, or 89.3% of the global population. Furthermore, finding the triads T and T′ for Tables 3 and 4 would determine a specific individual over 95% of the time. Three of the countries (Bangladesh BGD, Ethiopia ETH, and the Philippines PHL) have a significant difference between the uniform distribution values calculated with triads T and T′. In all countries, the values of G = 2 and H (per country) give the ratio of T and T′; that ratio has an impact on the final calculation of the difference between T and T′ when considering the individual country population. It is also the case that the
average family size for these countries (between 4.5 and 4.7, above the world average of 4.0) contributes to the greater values for T′ over T. These are relatively large countries in population (between 8th and 15th worldwide). All three of these countries have a similar postal code syntax (NNNN), thus a maximum of 10,000 postal codes. The average populations per postal code are: Bangladesh (8th largest), 16,958; Ethiopia (15th largest), 10,083; and Philippines (12th largest), 10,938. Benin BEN, Cameroon CMR, Chad TCD, and Madagascar MDG: these four African countries are medium-sized in population, but their number of available postal codes is much smaller, so the difference between T and T′ is largely a function of the larger population assigned to a smaller number of postal codes. The population per individual code is: Benin, 22,187; Cameroon, 73,118; Chad, 46,680; Madagascar, 26,251. Nauru (NRU) is a special case in that it has only one postal code assigned to it. Papua New Guinea has a similar profile to the three countries Bangladesh, Ethiopia, and the Philippines, with roughly a ten-to-one ratio both in terms of population and postal codes. For these two countries, the difference between T and T′ is 16.8% and 12.1%, respectively.
8 Conclusions
Sweeney and others have shown that only three data points can identify an individual uniquely in many countries with a high probability of success. We have shown that relaxing the requirement for three data points to two will permit an attacker to identify a family unit uniquely with approximately the same effort as with the three data points. In the 149 countries analyzed, in essence most United Nations countries, sufficient data is publicly available to allow this analysis to be conducted. As described above, the only data required to conduct this analysis are the number of the country’s postal codes, where these exist (PC); the population by country (P); the average size of household units (F); and the country life expectancy rates (p_le). As we have seen above, anyone attempting to identify an individual using only the T data points could succeed in doing this over 95% of the time for at least 149 of the UN countries. With sufficient Internet searching, it is possible to find Internet databases that return the identifying data points described as the triad T.
References
1. Sweeney, L.: Simple demographics often identify people uniquely. Data Privacy Working Paper 3. Carnegie Mellon University, Pittsburgh (2000)
2. Patterson, W., Winston-Proctor, C.E.: An international extension of Sweeney’s data privacy research. In: Ahram, T., Karwowski, W. (eds.) Advances in Human Factors in Cybersecurity. Advances in Intelligent Systems and Computing, vol. 960, pp. 28–37. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-20488-4_3
3. United Nations. https://population.un.org/wpp/Download/Standard/Population/
4. Mean International Wealth Index (IWI) score of region - Area Database - Global Data Lab. Globaldatalab.org
5. Wikipedia. https://en.wikipedia.org/wiki/List_of_countries_by_life_expectancy
6. Wikipedia. https://en.wikipedia.org/wiki/List_of_postal_codes
7. International Standards Organization (ISO). https://www.iso.org/iso-3166-country-codes.html
Exploring Understanding and Usage of Two-Factor Authentication Account Recovery
Jeremiah D. Still and Lauren N. Tiller
Department of Psychology, Old Dominion University, Norfolk, USA
{jstill,ltill002}@odu.edu
Abstract. Users employ strategies that make passwords weaker than they appear. Companies have started requiring users to adopt two-factor authentication (2FA) to increase security, which can make regaining account access difficult. A potential solution is to provide users with an account recovery method used as a failsafe for 2FA in the event of a lost, broken, or stolen second factor. However, limited research evaluates users’ employment of different types of 2FA account recovery methods. Our exploratory study surveyed 103 students. We found only 40% of the sample have more than one 2FA device enrolled per personal account. The majority opt to use 2FA devices that are executed using their phone: 81% use a mobile app, and 51% receive an SMS text message. Our findings suggest regaining account access could be challenging for users when their phone becomes unavailable. If companies do not adequately prepare for such occurrences, it will become costly and disruptive.
Keywords: Human factors · Survey · Two-factor authentication
1 Introduction
When it comes to knowledge-based authentication, users bear the responsibility of creating strong passwords to ensure the security of their online accounts [1]. The security requirements for creating a strong password are cumbersome [2–5]. To overcome these cognitive burdens, users often produce passwords that reflect common patterns or strategies that are easy to recall, which may reduce their account’s resistance to cyber-attacks. To increase account security and compensate for the insecure account protection provided by traditional alphanumeric passwords, some companies (e.g., Microsoft, Google, and Facebook) have started to offer or require that their users adopt two-factor authentication [6]. Two-Factor (or Multi-Factor) Authentication (2FA) is a layered authentication process requiring the user to couple their password with another authentication method. In 2018, federal agencies that use dot-gov domains, such as the Department of Justice, began to prompt officials to add the two-factor security feature to increase the system’s intruder attack resistance [7]. However, if a second-factor device fails, regaining account access can be problematic for the authorized user [8]. Essentially, when the authentication process requires more information to prove identity, regaining account access often becomes more difficult [8].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 223–231, 2021. https://doi.org/10.1007/978-3-030-79997-7_28
The password reset procedure for systems that only use a single-factor password is different from the 2FA account recovery processes. Renaud (2007) noted that systems using single-factor password authentication fulfill password reset requests by either asking the user to answer personal questions, e-mailing the user their forgotten password, or e-mailing the user a secure link that obliges the user to create a new password [9]. The account recovery process for systems that implement 2FA is more complicated. Even though passwords may be involved, account recovery is not the same as a basic password reset [10]. Systems implementing 2FA require extra steps to ensure that an account is recovered by its rightful owner. Account recovery procedures essentially bypass the system’s main security protocols, which requires systems to treat account recovery as an alternative form of authentication. The purpose of an account recovery method is to maintain high cyber-attack resistance while still allowing the authorized user account access. There is limited research that evaluates different types of 2FA account recovery methods. We aimed to further understand the end user’s knowledge and usage of 2FA account recovery.
1.1 User Studies Surrounding 2FA
Das et al. (2018) conducted research comparing user acceptance of various USB Yubico keys in a two-part study [11]. They focused on collecting usability and acceptability data. Despite the Yubico design improvements, they showed that participants continued to express their belief in password strength alone. Notably, they stated, “Even the best-designed hardware will not be used if the benefits are not apparent” [11, p. 15]. More closely aligned with our research, within a university setting, Colnago et al. (2018) explored the behaviors and opinions of 2FA adoption at Carnegie Mellon University (CMU) [12]. They found that users believed 2FA provided their account with more security, and it was reasonably easy to use. 
However, many found the requirement of 2FA annoying. Interestingly, their results indicated that users commonly reported problems such as forgetting their second factor, having it too far away, losing their phone, having a dead phone battery, having no data connection, and the hardware token desynchronizing. When these problems occurred, users reported consequences such as not being able to do homework or participate in class, not having access to e-mail or a computer system, and not having access to a dorm or office.
1.2 2FA Account Recovery
An account recovery authentication option is an account feature that some systems with 2FA make available for users to set up before losing a second-factor device. Large tech companies use several 2FA recovery options. Loveless (2018) conducted an informal exploratory evaluation of authentication practices for several websites (e.g., Facebook, Amazon, Apple ID, GitHub, Reddit, Yahoo, Twitter, LinkedIn, Gmail, Kraken, Live, and Coinbase) [10]. The evaluation revealed that organizations employ several recovery options: end users can opt to receive a set of downloadable recovery codes, or they can set up a backup e-mail, phone number, or device. None of the evaluated companies provided users with all these options, but a fair degree of flexibility was common.
Organizations with 2FA (e.g., Reddit, GitHub, and Google) are currently offering precautionary account recovery options [13, 14]. Other websites such as Apple, Evernote, Twitter, and Coinbase inform account holders that in the event of a lost second factor, it may take several business days to regain account access [15, 16]. LinkedIn users are required to complete a multi-part form and submit a copy of a government-issued ID when their second factor is unavailable [10]. Asking users to select the best account recovery option and understand their actions from a cybersecurity perspective requires expertise.
1.3 Cyber Hygiene: Training and Expertise
Asking users to make good security decisions relating to authentication often falls under the heading of cyber hygiene. Cain, Edwards, and Still (2018) conducted an extensive study to evaluate users’ cyber hygiene knowledge of threats, concepts, and behaviors by examining cyber topics such as authentication, security software, social networking, web browsing, USB drive use, phishing scams, and Wi-Fi hotspot usage [17]. Their results indicated that people 45 years of age and older generally practice more secure cyber behaviors. Cyber hygiene knowledge did not differ by age. Another finding of their study suggested that users who were victims of past cyber-attacks reported behaviors and knowledge that did not differ from users who had not been subjected to a cyber-attack. Interestingly, the survey results showed that participants who indicated they had received past cybersecurity training had less knowledge and more risky behaviors than users who reported they had not received training. They found that 81% of their participants (n = 144) had received some form of cybersecurity training. Other research studies that evaluated the proportion of self-identified cybersecurity trained participants found much lower results (19% for college-age students [18]; 43% for adults [19]). End users are asked to set up 2FA accounts. 
The registration process requires them to make decisions beyond the typical creation of a secure password. Now users are being asked to select a 2FA recovery method from a set of options. We aimed to reveal college users’ 2FA account recovery choices. We also explored the impact that self-reported cybersecurity training and experience with a previous attack had on general authentication knowledge.
2 Method
These data were extracted from a more extensive survey [20] examining other factors not presented in this article (e.g., Berlin Numeracy Test). We focus on two question sections within the survey: demographics and general knowledge of authentication.
2.1 Participants
A total of 113 undergraduate students (females = 78) were recruited through the SONA Experiment Management System and were compensated for participation. Participants had to answer two of the three attention check questions correctly and complete the
security ranking questions to be included in the study. Data from 10 participants were omitted, resulting in 103 participants (females = 73). Ages ranged from 18 to 50 years (M = 21.50, SD = 6.10), and participants reported heavy daily computer usage (M = 8.35, SD = 3.99). The number of 2FA devices participants registered to any given account enrolled in 2FA ranged from 1 to 5 (M = 1.58, SD = 0.92).
2.2 Materials and Procedure
This research used a 42-question survey that took participants approximately 35 min to complete. Previous research has noted that self-report is a valid measure for the topics covered by our survey. According to Russell et al. (2017), when users do not behave securely, their recount of non-secure behaviors still results in honest reporting [21].
2.3 Knowledge of Authentication
To develop the construct for general authentication knowledge, we chose seven questions (two general and five threats) from the Cain et al. (2018) article that related to authentication practices [17], and one new question was created to target specific authentication understanding. Participants were asked to respond to each question with Strongly Agree, Agree, Neither Agree nor Disagree, Disagree, or Strongly Disagree. Overall knowledge scores reflect a combination of the general and threat authentication knowledge scores (Cronbach α of .60; an acceptable level of reliability). The knowledge of general authentication consisted of three questions about authentication security concepts, including the new question, which assessed whether users understand the operational definition of authentication. These questions focused on capturing the participants’ knowledge of common authentication terminology. The knowledge of threats questions focused on capturing the participants’ knowledge regarding common threats, behaviors, or outcomes associated with secure authentication practices. An example of a threats statement is, “It is safe to share a password with others”.
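The reliability coefficient mentioned above can be illustrated with a short calculation. The sketch below computes Cronbach's α as k/(k − 1) · (1 − Σ item variances / variance of the total scores). The function name `cronbach_alpha` and the toy response matrices are assumptions made for illustration; they are not the study's data, and how the survey responses were scored is not stated in this section.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents,
    each given as a list of k numeric item scores."""
    k = len(scores[0])
    items = list(zip(*scores))  # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Toy data (hypothetical): 3 respondents x 2 items that agree perfectly,
# so the items are maximally consistent.
consistent = [[1, 1], [2, 2], [3, 3]]
print(cronbach_alpha(consistent))

# Toy data (hypothetical): 4 respondents x 3 items on a 1-5 agreement scale.
mixed = [[4, 5, 3], [2, 3, 3], [5, 4, 4], [1, 2, 3]]
print(cronbach_alpha(mixed))
```

Values near 1 indicate that the items vary together; the α of .60 reported for the eight knowledge questions sits at the lower end of what is usually treated as acceptable.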
3 Results
3.1 Demographics and Usage
Twenty percent of participants indicated they had previous exposure to some type of educational cybersecurity material (see Table 1). When the survey data were collected, our sample needed to have their accounts enrolled in 2FA. However, only 89% of participants indicated that they used 2FA to protect any personal accounts, suggesting that the conceptual meaning of 2FA is not apparent to some users. Eighty-one percent of participants indicated they use a smartphone or tablet app as their second factor, and 60% of participants only have one 2FA device enrolled per account (M = 1.58, SD = 0.92). For more reports on 2FA familiarity, see Table 1 and Figs. 1 and 2.
Table 1. Frequency table for cybersecurity familiarity

Variable                                      n      %
Received cybersecurity training
  Yes                                        11   10.7
  No                                         92   89.3
Training location
  Work                                        8    7.5
  School                                      5    4.7
  Online                                      2    1.9
  Other – military                            2    1.9
  N/A                                        89   84.0
Taken a class with cybersecurity topics
  Yes                                        16   15.5
  No                                         87   84.5
Cybersecurity expert
  Yes                                         1    1.0
  No                                        102   99.0
Target of a cybersecurity attack
  Yes                                        17   16.5
  No                                         86   83.5

Note. N = 103.
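The percentages in Table 1 can be reproduced from the counts. The sketch below is an illustration only; the helper `percent` is a hypothetical name. The training-location counts sum to 106 rather than 103, and the reported percentages match a denominator of 106, which is what one would expect if participants could select more than one location. That reading is an inference from the numbers, not something stated in the text.

```python
def percent(count, total):
    """Percentage rounded to one decimal place, as in Table 1."""
    return round(100 * count / total, 1)


N = 103  # number of participants

training = {"Yes": 11, "No": 92}
location = {"Work": 8, "School": 5, "Online": 2,
            "Other - military": 2, "N/A": 89}

# Single-choice item: percentages are taken over the N = 103 participants.
training_pct = {k: percent(v, N) for k, v in training.items()}

# Training location: counts sum to 106, so percentages are taken over
# the number of responses (assumed multi-select item).
location_total = sum(location.values())
location_pct = {k: percent(v, location_total) for k, v in location.items()}

print(training_pct)
print(location_total, location_pct)
```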
3.2 Overall Authentication Knowledge: Cybersecurity Training and Experience with an Attack The results revealed that participants answered more than half of the authentication knowledge questions correctly (M = 6.91, SD = 1.70). However, a low number of participants indicated that they had “… received training in cybersecurity…” (N = 11). To make group sizes more equal, the “yes” cybersecurity training group was expanded to include participants who indicated they are a cybersecurity expert or have taken classes that covered cybersecurity topics. Specifically, we added the participants that selected “yes” to the questions, “Have you taken classes covering the topic of cybersecurity in the past?” or “Do you consider yourself an expert in cybersecurity?” (N = 21). An independent samples t-test was used to explore the relationship between overall authentication knowledge and cybersecurity training. Overall authentication knowledge scores for participants who had not received any form of cybersecurity training (M = 6.83, SD = 1.71, N = 82) were not significantly different from the scores of participants who had received cybersecurity training (M = 7.24, SD = 1.67, N = 21), t(101) = − 0.98, p = 0.165, d = −0.240.
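The test statistic and effect size reported above can be recovered from the summary statistics alone. The following is a minimal sketch of a pooled-variance (Student) independent samples t-test and Cohen's d; the function names are hypothetical, and the use of the pooled-variance form is an assumption that is consistent with the reported degrees of freedom of 101.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def t_and_d(m1, sd1, n1, m2, sd2, n2):
    """Student's t, Cohen's d and degrees of freedom from summary statistics."""
    sp = pooled_sd(sd1, n1, sd2, n2)
    t = (m1 - m2) / (sp * sqrt(1 / n1 + 1 / n2))
    d = (m1 - m2) / sp
    return t, d, n1 + n2 - 2


# No training (M = 6.83, SD = 1.71, N = 82) vs. training (M = 7.24, SD = 1.67, N = 21)
t, d, df = t_and_d(6.83, 1.71, 82, 7.24, 1.67, 21)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")  # t(101) = -0.98, d = -0.24
```

The p-value is omitted because it requires the t distribution, which the Python standard library does not provide.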
Fig. 1. 2FA devices used by type. [Bar chart; vertical axis: percentage of participants, 0–100%; categories: Smartphone or Tablet App, SMS Text Message, Hardware, Software, Security Key USB Drive, Other.] Note: the 89 participants who reported Yes to using 2FA were asked to indicate the type of 2FA devices they had used. Participants had the option to select more than one kind of 2FA device, which resulted in N = 144 responses. The percentage represents the proportion of participants out of the 89 participants who reported they use 2FA.
Fig. 2. The number of 2FA devices enrolled per account. [Bar chart; vertical axis: percentage of participants, 0–100%; categories: 1, 2, 3, 4, 5 or More.] Note: the percentage represents the proportion of participants out of the 89 participants who reported they use 2FA.
An independent samples t-test showed the authentication knowledge scores for participants who have not been a cyberattack victim (M = 6.84, SD = 1.71, N = 86) were not significantly different from participants who have been a cyberattack victim (M = 7.29, SD = 1.65, N = 17), t(101) = −1.01, p = 0.315, d = −0.268. However, we cautiously present this finding as the groups are far from equal (i.e., cyberattack victim, No: 86 versus Yes: 17).
4 Discussion
Systems implementing 2FA give users the option to enroll as many 2FA devices to their accounts as they desire. However, we found that 60% of participants indicated they typically have only one 2FA device enrolled per personal account (M = 1.58, Md = 1.00, SD = 0.92). This finding is aligned with Colnago et al.’s (2018) previous findings, which suggest users employ an average of 1.3 (Md = 1.00) 2FA devices [12]. Clearly, most users would not have another authentication option to gain account access if their primary 2FA device became unavailable. We found that 81% of participants used a mobile or tablet app as their 2FA device, and 51% indicated that they received an SMS text message. Only 10% of participants stated that they used either a hardware or a software device, and only 8% used a USB security key (see Fig. 1). Colnago et al. (2018) found that as the frequency of experiencing 2FA problems increases, users’ perceptions are negatively impacted, as well as the usability and security constructs [12]. Future training material should consider ways to convince users that it is necessary to enroll additional 2FA devices, other than just their phone. Even though our results suggested that cybersecurity training or experience with an attack did not impact users’ authentication knowledge, it is important to highlight that most of the sample indicated they had no prior experience with either. Further, our exploration of previous training is limited because we did not inquire about the specific topics or the training’s breadth. Presumably, account recovery occurs intermittently; therefore, it is critical that the 2FA recovery option is easy to access and remember. Mobile phones are commonly employed as a recovery option, probably due to their easy access. A USB key is much less likely to be available. A knowledge-based authentication would need to be easy to retrieve from memory. 
A typical pattern across memorability studies suggests that graphical authentication schemes allow users to easily retrieve their passcode from memory (i.e., by employing both images and a recognition task; see [1, 22]). From a practitioner’s perspective, users need to be strongly encouraged to enroll at least two different 2FA devices (e.g., mobile app and USB key). We found an overreliance on mobile phones. Therefore, we recommend that designers provide a narrative highlighting the danger associated with using the same device for both authentication and recovery. This is similar to previous research suggesting that a company’s communication should lead the user away from making risky security decisions [23]. Some companies warn employees that 2FA account recovery might take several business days [15, 16]. This awareness will encourage users to set up a 2FA recovery method. However, they may not be selecting an actual recovery option (i.e., their phones might be unavailable). If companies do not adequately prepare for such occurrences, it could be costly and disruptive as employees are prevented from accessing critical services needed to perform their duties.
References
1. Cain, A.A., Still, J.D.: Usability comparison of over-the-shoulder attack resistant authentication schemes. J. Usability Stud. 13, 196–219 (2018)
2. Ashford, W.: Millions of web users at risk from weak passwords [Web blog post]. http://www.computerweekly.com/Articles/2009/09/07/237569/Millions-of-web-users-at-riskfrom-weak-passwords.htm. Accessed 7 Sept 2009
3. Barton, B.F., Barton, M.S.: User-friendly password methods for computer-mediated information systems. J. Comput. Secur. 3, 186–195 (1984)
4. Hoonakker, P., Bornoe, N., Carayon, P.: Password authentication from a human factors perspective: results of a survey among end-users. In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 53, pp. 459–463 (2009)
5. Labuschagne, W.A., Veerasamy, N., Burke, I., Eloff, M.M.: Design of cyber security awareness game utilizing a social media framework. In: Proceedings of Information Security South Africa, pp. 1–9 (2011)
6. Reese, K.R.: Evaluating the usability of two-factor authentication. Published Master’s thesis. All Theses and Dissertations Database (UMI No. 6869) (2018)
7. Shaban, H.: The government is rolling out 2-factor authentication for federal agency dot-gov domains. The Washington Post. https://www.washingtonpost.com/technology/2018/10/08/government-is-rolling-out-factor-authentication-federal-agency-gov-domains/. Accessed 8 Oct 2018
8. Tellini, N., Vargas, F.: Two-Factor Authentication: Selecting and implementing a two-factor authentication method for a digital assessment platform. Unpublished Bachelor’s Thesis, KTH Royal Institute of Technology (2017)
9. Renaud, K.: A process for supporting risk-aware web authentication mechanism choice. J. Reliab. Eng. Syst. Saf. 92, 1204–1217 (2007)
10. Loveless, M.: How popular web services handle account recovery. Duo Security. https://duo.com/decipher/reality-of-online-account-recovery. Accessed 6 Mar 2018
11. Das, S., Dingman, A., Camp, L.J.: Why Johnny doesn’t use two factor: a two-phase usability study of the FIDO U2F security key. In: Proceedings of the International Conference on Financial Cryptography and Data Security (2018)
12. Colnago, J., et al.: “It’s not actually that horrible”: exploring adoption of two-factor authentication at a university. In: CHI Conference on Human Factors in Computing Systems, pp. 1–11 (2018)
13. Prins, C.W.: 2-Factor authentication recovery codes [Web blog post]. 4me. https://www.4me.com/blog/two-factor-authentication/two-factor-authentication-recovery-codes/. Accessed 5 Apr 2018
14. Wallen, J.: How to retrieve your Google 2FA backup codes (and make more) [Web blog post]. TechRepublic. https://www.techrepublic.com/article/how-to-retrieve-your-google-2fabackup-codes-and-make-more/. Accessed 7 Aug 2018
15. Afonin, O.: The ugly side of two-factor authentication [Web blog post]. https://blog.elcomsoft.com/2016/12/the-ugly-side-of-two-factor-authentication/. Accessed 20 Dec 2016
16. Ravenscraft, E.: What happens if I use two-factor authentication and lose my phone? [Web blog post]. lifehacker. https://lifehacker.com/what-do-i-do-if-i-use-two-factor-authentication-andlos-1668727532. Accessed 09 Dec 2014
17. Cain, A.A., Edwards, M.E., Still, J.D.: An exploratory study of cyber hygiene behaviors and knowledge. J. Inf. Secur. Appl. 42, 36–45 (2018)
18. Aytes, K., Conolly, T.: A research model for investigating human behavior related to computer security. In: Proceedings of the Americas Conference on Information Systems, vol. 9, pp. 2027–2031 (2003)
19. National Cyber Security Alliance: NCSA/Norton by Symantec Online Safety Study (2010). http://www.staysafeonline.org/download/datasets/2064/FINAL+NCSA+Full+Online+Safety+Study+2010%5B1%5D.pdf
20. Tiller, L.N.: Account recovery methods for two-factor authentication (2FA): an exploratory study. Unpublished Master’s Thesis, Old Dominion University (2020)
21. Russell, J.D., Weems, C.F., Ahmed, I., Richard III, G.G.: Self-reported secure and insecure cyber behaviour: factor structure and associations with personality factors. J. Cyber Secur. Technol. 1, 163–174 (2017)
22. Wiedenbeck, S., Waters, J., Sobrado, L., Birget, J.C.: Design and evaluation of a shoulder-surfing resistant graphical password scheme. In: Proceedings of the Working Conference on Advanced Visual Interfaces, pp. 177–184 (2006)
23. Nurse, J.R., Creese, S., Goldsmith, M., Lamberts, K.: Trustworthy and effective communication of cybersecurity risks: a review. In: Proceedings of the 1st Workshop on STAST, pp. 60–68 (2011)
Strategies of Naive Software Reverse Engineering: A Qualitative Analysis
Salsabil Hamadache (1,2), Markus Krause (1,2), and Malte Elson (1,2)
(1) Psychology of Human Technology Interaction Group, Faculty of Psychology, Ruhr University Bochum, Universitaetsstrasse 150, 44801 Bochum, Germany
{salsabil.hamadache,markus.krause,malte.elson}@rub.de
(2) Horst-Görtz Institute for IT Security, Ruhr University Bochum, Bochum, Germany
Abstract. A considerable amount of research into the cognitive processes of code reading and understanding has been conducted. However, few studies have considered code comprehension with a specific focus on protected code. This research effort aims to contribute to filling this gap by describing strategies of software reverse engineering, the process of making sense of protected software code. Our study presented two Java programs, one available in clear code and one in obfuscated code. For each program, participants worked on two tasks that required them to understand the program logic and how the code produces program behaviour. By means of a codebook, similarities and discrepancies between participants’ strategies are outlined in this paper.
Keywords: Reverse engineering · Human factors · Code obfuscation · Problem solving
1 Introduction
Complex problem solving consists of several steps, each of which consists of several substeps. It requires planning, keeping the plan in mind while implementing it, adjusting to new information, and evaluating one’s progress. The complexity of a problem is determined by its size, the interconnectedness of its components and its dynamics [2]. Understanding the strategies and cognitive processes that enable problem solving in the domain of IT security can yield useful implications for assessment and training purposes, and it can also inform the sabotaging of security breaches such as reverse engineering, a process that has been conceptualized as complex problem solving before [1]. Reverse engineering of software code aims at finding vulnerabilities in it, stealing the intellectual property implemented in it, or creating illegal copies. Because reverse engineering is an effortful task that takes considerable time, reverse engineers carefully assess whether their expected reward is worth the investment. This is why code developers often protect valuable software by means of obfuscation techniques that increase the cognitive demand and the duration of the reverse engineering process. To derive more sophisticated protection methods, it is helpful to gain as much information on the strategies and processes of attackers as possible. For example, a model of the cognitive functions underlying reverse engineering might help establish how these cognitive functions can be sabotaged.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 232–237, 2021. https://doi.org/10.1007/978-3-030-79997-7_29
2 Past Research
Comprehension of (Obfuscated) Code. The comprehension of software code has been studied extensively for decades (e.g. [3]), both by computer scientists and psychologists. However, nearly all studies used software code in its common appearance, without considering commonly implemented obfuscation methods. Moreover, these studies usually aimed at understanding how programming education can be improved or at the facilitation of code reading rather than its sabotage, given that the model of individuals reading and comprehending code has long been one of an ethical programmer trying to reconstruct code written by earlier colleagues with no illegitimate intentions at all. This process differs significantly from reverse engineering, since there are various ways to obtain information about the program and, usually, code is prepared in ways that are meant to facilitate reading it as much as possible. Best practices require extensive use of comments, tabulators, and sensible class structures such that program logic is easily detected. Recently, publications considering protected code have emerged (e.g. [4] and [5]). Studies employing human participants have however been rare because both obfuscation and deobfuscation are often automatic, algorithmic processes. Moreover, many papers are written by developers of obfuscation methods who might not be interested in testing their actual effectiveness. Thus, it remained unproven for quite a long time whether obfuscation techniques sabotage human reading and sense making of software code at all [6]. This gap was first filled by a series of experiments with human participants aiming at assessing the effectiveness of obfuscation techniques [6], and further research has considered to what extent reverse engineering behaviours differ as a function of experience, and of the presence of obfuscation compared to its absence [7]. 
In these studies, one commonly used obfuscation method (opaque predicates) has been shown to neither hinder nor slow down reverse engineering, while another (identifier renaming) seemed quite effective in doing so.
Qualitative Analysis. Qualitative research methods aim at deriving detailed descriptions instead of quantifiable effects and are particularly useful when previous theoretical work in a given field is lacking. Since our study’s purpose is to derive strategies and cognitive processes of software engineers, we chose a grounded-theory-based coding approach, in which behaviours are meticulously noted and then grouped and structured according to a codebook which ideally reflects cognitive theories or previous assumptions about the underlying process. Before we could transcribe the processes of our participants, we needed to find research we could base our codebook on. This research will be briefly covered next.
A Model of Software Reverse Engineering. In [13], a model of software reverse engineering was suggested based on the processes of a professional reverse engineer (the author himself) during a realistic task on the one hand, and cognitive theories on the other.
The publication suggests that reverse engineers typically go through the following steps: (1) goal construction/planning, (2) carrying out a plan, (3) generating hypotheses or questions, (4) determining needed information, (5) experimentation to seek data, (6) instrumentation to isolate unavailable data, (7) evaluating and integrating, and (8) updating the mental model. In this account, the process of reverse engineering thus mainly consists of an interplay between gathering information and applying the retrieved information to the system. Our codebook contains these categories, as we want to see whether behaviours performed by our participants, mainly students of IT-related subjects inexperienced in reverse engineering, can also be categorized by this framework, which was based on expert performance. While coding the first participant, we missed one behaviour, which we added to the codebook: (9) identifying something relevant/useful.
Concrete Behaviours in the Task at Hand. As mentioned, the effect of experience and obfuscation on reverse engineering behaviours of a student population has been studied before [7]. For instance, participants’ code analysis behaviour differed as a function of obfuscation: when code was obfuscated, they opened files more frequently, used more advanced commands, ran the program more often and spent more time in the debugging mode. Experienced participants differed from beginners mostly in that they used advanced commands and the debugging mode more often. Interestingly, the difference between time spent to solve the tasks on obfuscated vs. clear code was more pronounced for experienced participants, who were much faster than beginners when solving tasks on clear code, but almost as slow when working on the obfuscated program. This was in part because they had to spend significantly more time reading code. 
We thus also include the following behaviours in our codebook to see whether we find similar trends: (10) run program, (11) use debugging mode/set breakpoint, (12) use advanced command, (13) read code, (14) open/switch file. Compared with the codes above, these are noticeably more concrete: while codes (1) through (9) are latent behaviours reflecting cognitive processes, the behaviours coded in (10) through (14) are observable. This paper thus also functions as an integration of these very different, yet in their own ways informative, sources. For the sake of completeness, we added one code because we were certain we would also observe this behaviour: (15) change code to observe the effect (e.g. by running the program). After coding the first participant, we noted that we wanted to mark the difference between reading code strategically, that is, in search of something in particular or following a plan, and reading code just to explore before developing concrete plans. We thus added the code (16) read code to explore, which turns (13) into read code strategically. Lastly, the authors of [7] hypothesized that many participants first had to undo the obfuscation before they could proceed to answer the questions at hand. This is why we included the final behaviour: (17) reverse obfuscation. Our final codebook was thus:
(1) goal construction/planning
(2) carrying out a plan
(3) generating hypotheses or questions
(4) determining needed information
(5) experimentation to seek data
(6) instrumentation to isolate unavailable data
(7) evaluating and integrating
(8) updating the mental model
(9) identifying something relevant/useful
(10) running the program
(11) using the debugging mode/set breakpoint
(12) using an advanced command or the search function
(13) reading code strategically
(14) opening/switching open file
(15) changing code to observe effect
(16) reading code to explore
(17) reversing obfuscation
Strategies of Naive Software Reverse Engineering
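The codebook above can be held in a small data structure so that coded transcript segments can be counted per condition. The following is a minimal sketch, not the authors' tooling: the enum member names, the `level` helper, and the example segments are hypothetical. Because the text classifies only codes (1) through (14) as latent or observable, codes (15) through (17) are left unclassified here.

```python
from collections import Counter
from enum import IntEnum


class Code(IntEnum):
    """The 17 codes of the final codebook, numbered as in the text."""
    GOAL_CONSTRUCTION = 1
    CARRYING_OUT_PLAN = 2
    GENERATING_HYPOTHESES = 3
    DETERMINING_NEEDED_INFORMATION = 4
    EXPERIMENTATION = 5
    INSTRUMENTATION = 6
    EVALUATING_INTEGRATING = 7
    UPDATING_MENTAL_MODEL = 8
    IDENTIFYING_RELEVANT = 9
    RUNNING_PROGRAM = 10
    USING_DEBUGGER = 11
    USING_ADVANCED_COMMAND = 12
    READING_STRATEGICALLY = 13
    SWITCHING_FILE = 14
    CHANGING_CODE = 15
    READING_TO_EXPLORE = 16
    REVERSING_OBFUSCATION = 17


def level(code: Code) -> str:
    """Return the level the text assigns to a code.

    Codes 1-9 are described as latent, 10-14 as observable; the text
    does not classify 15-17, so they are reported as 'unclassified'.
    """
    if code <= 9:
        return "latent"
    if code <= 14:
        return "observable"
    return "unclassified"


def tally_by_condition(segments):
    """Count how often each code was assigned in each condition.

    `segments` is an iterable of (participant, condition, code) tuples,
    a hypothetical export format for the coded transcripts.
    """
    counts = Counter()
    for _participant, condition, code in segments:
        counts[(condition, code)] += 1
    return counts


# Hypothetical coded segments, for illustration only.
example_segments = [
    ("P01", "clear", Code.RUNNING_PROGRAM),
    ("P01", "clear", Code.READING_STRATEGICALLY),
    ("P01", "obfuscated", Code.SWITCHING_FILE),
    ("P01", "obfuscated", Code.SWITCHING_FILE),
    ("P02", "obfuscated", Code.READING_TO_EXPLORE),
]
counts = tally_by_condition(example_segments)
```

Keeping the code numbers as enum values means a tally can be reported in the same order as the printed codebook.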
3 Method and Materials
Study Design. Participants recruited within a related research project (https://osf.io/4xud8/) worked on the same two small Java programs that were used by [5] and [6]. Of the four reverse engineering tasks for each of the two programs used by [5], we chose to use only two per program, in line with [6], to minimize the time and effort participants had to invest. Participants were randomly assigned to receive one of the programs in clear code and the other in obfuscated code. To be able to perform an in-depth analysis of their reverse engineering behaviour and to detect to what extent obfuscations sabotaged reverse engineering, the screen was recorded, and participants were asked to think aloud, i.e. to verbalize their thoughts while working on the problem.
Procedure. Participants were informed about the study objective, data protection, and their rights as participants. They confirmed their voluntary participation and filled out a questionnaire on demographics. Then, they received the four reverse engineering tasks relating to the two Java programs, along with a small amount of information about the programs (see OSF for the exact information they could access). In two of the tasks, they were asked to indicate the line of code in which a particular action is implemented, while in the two others they were asked for an integer value that defines an aspect of the program. Participants had 60 min and were free to choose the order in which they solved the tasks. They were instructed to work as fast as possible to resemble real-life reverse engineering situations, in which the goal is not only to reverse engineer the program, but to do so while minimizing time and resources. Microphones were used to record their voices and the program Captura [9] was used to record the screen. Participants were left alone in a room so that they would feel less inhibited about thinking aloud.
The use of Eclipse functions or other ways to solve the problems was explicitly encouraged, although it was possible to solve the tasks without experience in advanced functions of Eclipse or Java. After data collection, the screen recordings together with the voice recordings were transcribed using the tool MAXQDA [10]. One of the authors noted every visible behaviour. In a next step, another author coded the behaviours according to the codebook. As mentioned before, the codebook was updated after coding the first participant. This participant was therefore re-coded after the update.
4 Results and Discussion
All codes were assigned to behaviours of at least one participant. The most common behaviours were reading code strategically, opening/switching files, identifying relevant or useful information, and evaluating and updating. On the latent variable level, it was apparent that participants who easily solved tasks had an adequate mental model of Java programs in general to start with. They were thus able to construct goals and plans, while others started by exploring. They needed to open files less frequently because they had an idea of where relevant functions ought to be. Even within classes, well-performing participants spent less time reading code, as they easily detected the program logic. This led to more pronounced differences between their behaviour when dealing with clear vs. obfuscated code: when class names lacked semantic meaning, they had to use functions to navigate from one class to another and had to partly guess in which classes to begin their search. No apparent trend emerged in the number of program executions as an effect of experience. Almost all participants ran the programs at the beginning (especially the Car Racing program) to get a first idea, and during the reverse engineering process to see how their input or their code changes affected the execution. For example, most participants tried to send messages via the Chat program to observe the behaviour asked for in the tasks. It appeared that the more participants changed the code and then observed the effect when executing the program, the easier it was for them to integrate and evaluate their ideas. This fits the “experimentation to seek data” code based on [8] and constitutes an essential step in reverse engineering. Participants regularly evaluated their progress, for example by stating that they had no idea how to continue, or by confirming their answers in various ways.
Another remarkable observation was that participants who were unsure how to solve the tasks very often switched between working on the programs and re-reading the task description. Surprisingly, only one participant undid an obfuscation and thereby rendered the code more readable. This might, however, be because the code was not heavily and consistently obfuscated, which follows from the design of the study from which these data are taken (see OSF link above). Overall, the model developed in [8] seems to be confirmed by this study: successful reverse engineering did involve goal construction for each of the tasks, followed by carrying out the plans, which mostly consisted of generating hypotheses or questions about the programs and lines of code, experimenting to find needed information, evaluating and integrating, and updating one's mental model. It was thus indeed an interplay between planning and carrying out the plan, and mostly consisted of information retrieval, information manipulation, and evaluation of these two. However, we observed that some participants struggled to follow such a systematic approach and instead started out by reading the code in an exploratory way, identified code portions relevant to the questions, and slowly made their way towards the correct solution. This often led to much slower problem solving and meant that some participants could solve only three items within the time limit. As mentioned before, this is exactly the point of obfuscating code: even though they probably would have solved all tasks if given infinite time, a large enough delay often suffices to prevent reverse engineering. Less experienced participants also navigated from class to class by opening the respective files, while experienced participants moved via functions, following the program logic that was apparently part of their mental model. We could replicate the finding that identifier renaming sabotaged even experienced participants by slowing them down and generating confusion [6]. In future research, we will transcribe and code more data in order to quantitatively assess the significance and size of the differences described above. For example, we will be able to test whether obfuscation indeed leads to significantly more frequent use of advanced Eclipse functions, or to significantly longer code reading times.
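Because each participant worked on one program in clear code and the other in obfuscated code, one possible way to carry out such a quantitative comparison is a paired test on per-participant counts. The text does not name a test, so the following is only a sketch of one option, an exact sign-flip permutation test. The function name and the counts are hypothetical, and the sketch ignores that the two conditions also involve different programs.

```python
from itertools import product


def paired_permutation_p(clear, obfuscated):
    """Two-sided exact sign-flip permutation test for paired counts.

    `clear[i]` and `obfuscated[i]` are counts of one behaviour (for
    example, uses of advanced commands) for participant i. Under the
    null hypothesis the sign of each paired difference is arbitrary,
    so every sign assignment is enumerated.
    """
    diffs = [o - c for c, o in zip(clear, obfuscated)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical counts for four participants, for illustration only.
clear_counts = [3, 5, 4, 6]
obfuscated_counts = [7, 9, 6, 10]
p_value = paired_permutation_p(clear_counts, obfuscated_counts)
```

Full enumeration is feasible only for small samples, since the number of sign assignments doubles with every participant. For larger samples a random subset of assignments would be drawn instead.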
References
1. Fyrbiak, M., et al.: Hardware reverse engineering: overview and open challenges. In: 2017 IEEE 2nd International Verification and Security Workshop (IVSW), pp. 88–94. IEEE, July 2017
2. Dörner, D., Funke, J.: Complex problem solving: what it is and what it is not. Front. Psychol. 8, 1153 (2017)
3. Hendrix, D., Cross, J.H., Maghsoodloo, S.: The effectiveness of control structure diagrams in source code comprehension activities. IEEE Trans. Softw. Eng. 28(5), 463–477 (2002)
4. Schrittwieser, S., et al.: Covert computation: hiding code in code for obfuscation purposes. In: Proceedings of the 8th ACM SIGSAC Symposium on Information, Computer and Communications Security, pp. 529–534. ACM, May 2013. https://doi.org/10.1145/2484313.2484384
5. Park, J., Kim, H., Jeong, Y., Cho, S.J., Ha, S., Park, M.: Effects of code obfuscation on android app similarity analysis. J. Wirel. Mob. Netw. Ubiquitous Comput. Depend. Appl. 6(4), 86–98 (2015)
6. Ceccato, M., Di Penta, M., Falcarin, P., Ricca, F., Torchiano, M., Tonella, P.: A family of experiments to assess the effectiveness and efficiency of source code obfuscation techniques. Empir. Softw. Eng. 19(4), 1040–1074 (2013). https://doi.org/10.1007/s10664-013-9248-x
7. Hänsch, N., Schankin, A., Protsenko, M., Freiling, F., Benenson, Z.: Programming experience might not help in comprehending obfuscated source code efficiently. In: Fourteenth Symposium on Usable Privacy and Security (SOUPS 2018), pp. 341–356 (2018)
8. Bryant, A.R., Mills, R.F., Peterson, G.L., Grimaila, M.R.: Software reverse engineering as a sensemaking task. J. Inf. Assur. Secur. 6(6) (2011)
9. Sachin, M.: Captura [computer software] (2019)
10. VERBI Software: MAXQDA 2020 [computer software], Berlin, Germany (2019)
How Safely Do We Behave Online? An Explanatory Study into the Cybersecurity Behaviors of Dutch Citizens
Rick van der Kleij1,2(B), Susanne van ’t Hoff-De Goede1, Steve van de Weijer3, and Rutger Leukfeldt1,3
1 The Hague University of Applied Sciences (THUAS), The Hague, The Netherlands {R.vanderkleij,m.s.vanthoff-degoede}@hhs.nl
2 The Netherlands Organisation for Applied Scientific Research (TNO), The Hague, The Netherlands [email protected]
3 Netherlands Institute for the Study of Crime and Law Enforcement (NSCR), Amsterdam, The Netherlands {svandeweijer,rleukfeldt}@nscr.nl
Abstract. The Capability-Opportunity-Motivation-Behavior (COM-B) model states that people’s behavior can be explained by their capabilities, opportunity, motivation, and the interaction between these components. This research focuses on applying the COM-B model to Dutch citizens’ cybersecurity behavior. Data are used from a Dutch cross-sectional online survey (N = 2,426). Multivariate analysis was performed with multivariate linear regression models. The results suggest, in line with the literature, that people that have relevant knowledge, and are motivated to protect themselves, report more cyber secure behavior. The opportunity people have to protect themselves was found to be only partly related to self-reported cybersecurity behavior. Explanations are provided for these findings, as well as opportunities for future research. Keywords: Precautionary behavior · Cyber hygiene · Human vulnerability · Information assurance
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 238–246, 2021. https://doi.org/10.1007/978-3-030-79997-7_30
1 Introduction
Cyberattacks often require human interaction to succeed, resulting in malware installation, online fraud, accidental data disclosures, and more. Victims of online bank fraud, for example, often appear to have unintentionally given their personal information to fraudsters by clicking on a hyperlink in a phishing e-mail or entering login details on a phishing website [1]. An important condition for decreasing an individual's risk of cybercrime victimization is end-user cybersecurity behavior [2]. In the literature this is also described as online safety, precautionary, protective, or cyber hygiene behavior (see, for example, [3–6]). People who behave in a more cyber secure, or cyber hygienic, manner abide by “golden” rules (i.e. best practices). For example, they avoid unsafe
websites, do not click on unreliable hyperlinks, use strong passwords, and keep their technical security measures up to date [3, 7, 8]. The aim of this explanatory research is to investigate how cyber secure the behavior of Dutch people is and to empirically explain the determinants of this behavior. More specifically, we strive to explain cybersecurity behavior based on the capabilities and opportunity that people have to act more securely, and the extent to which they are motivated to behave in a more cyber secure manner. This knowledge can be a stepping stone towards developing interventions that make people behave more securely. In the remainder of this introduction, we discuss a generic behavior system in which capability, opportunity, and motivation interact to generate cybersecurity behavior.
1.1 The Capability Opportunity Motivation-Behavior Model
The Capability Opportunity Motivation-Behavior model (COM-B) states that people's behavior can be explained by their capabilities, opportunity, and motivation [9]. Capabilities are defined as the psychological and physical capacity of the individual to exhibit specific behavior, including having the necessary knowledge and skills. Opportunity is defined as all factors that lie outside the individual and that make behavior possible or prevent it, such as the influence that our environment and the people around us can have on our behavior. Motivation is defined as all the brain processes that activate and direct behavior. This includes, according to Michie, Van Stralen, and West [9], not only goals that people set themselves and conscious decision-making processes, but also automatic and reflexive actions that are the result of personal aptitude, associative learning processes, and emotions. These three components interact to generate behavior. More precisely, capability and opportunity can influence motivation, and capability, opportunity, and motivation can each influence behavior, which in turn influences these components. Hence, enacting a behavior can alter capability, motivation, and opportunity. Insight into the extent to which these components play a role in citizens' cyber behavior is needed to be able to influence behavior effectively at a later stage through interventions that change these components [9]. To date, COM-B has mainly been used as a behavioral change model in marketing and in the health domain, for example to design interventions that help people quit smoking (see, for example, [10]). With few exceptions, measurement models are therefore primarily focused on mapping consumer behavior and unhealthy behavior (see, for example, [11]).
2 Materials and Method
2.1 Sample and Procedure
To explain cyber security behavior on the basis of the capabilities and opportunity that people have to act more securely, and the extent to which they are motivated to behave in a more cyber secure manner, we used data from a Dutch cross-sectional online survey (N = 2,426 Dutch citizens) that was launched in May 2019 (see [12]). A total of 12,114 people were invited by a professional panel agency to complete the survey. Participation was voluntary. In total, 6,982 people did not respond to the
invitation and 2,069 people clicked on the hyperlink in the invitation but did not complete the questionnaire. Subsequently, 1,859 of the non-responding panel members younger than 40 received a second invitation to increase representativeness in this age category. A total of 3,117 people (25.7%) completed the questionnaire between 6 May and 26 May 2019. To be able to correctly perform the measurements and manipulations included in the survey, respondents had to complete the survey in one session. When people needed more than 60 min to complete the questionnaire, we took this as evidence that this requirement had not been met. For this reason, 596 people were removed from the data. In the same manner, 95 people who completed the questionnaire within 20 min, which was in our opinion too little time to complete the questionnaire with appropriate attention, were removed from the data. The final sample consists of 2,426 Dutch internet users. Respondents' characteristics were compared with those of non-respondents (N = 9,688) based on registered panel agency data. Compared with non-respondents, respondents were more often male (53% versus 47%, p < .001) and on average older (57.9 versus 54.3 years, p < .001), but did not differ significantly in level of education or the province in which they live. Respondents' characteristics were also compared with the general Dutch population [13]. The sample of the current study is representative of the Dutch population with regard to gender, employment status, and residential area (i.e., the North, South, West, or East of the Netherlands), but respondents are more often highly educated (50.0% versus 30.0%) and less often younger than 39 (13.8% versus 29.4%) than the general population.
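The sample figures reported above can be cross-checked with a few lines of arithmetic. This is a minimal sketch using only numbers stated in the text. The variable names are hypothetical, and the non-respondent count is assumed to be everyone invited who is not in the final sample, which reproduces the reported N = 9,688.

```python
# Figures taken from the text.
invited = 12_114
completed = 3_117
removed_too_slow = 596   # needed more than 60 min
removed_too_fast = 95    # finished within 20 min

# Share of invited panel members who completed the questionnaire.
completion_rate = completed / invited

# Final analytic sample after both exclusions.
final_sample = completed - removed_too_slow - removed_too_fast

# Assumption: "non-respondents" are all invitees outside the final sample.
non_respondents = invited - final_sample

print(f"completion rate: {completion_rate:.1%}")
print(f"final sample: {final_sample}")
print(f"non-respondents: {non_respondents}")
```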
The fact that the sample is not representative on all factors, however, may have only limited consequences for the validity of the survey, as was shown in earlier research in which convenience samples (such as Amazon's Mechanical Turk) were compared with representative samples [14].
2.2 Measures
Dependent Variable: Cyber Security Behavior. This variable was measured by presenting 18 statements to our respondents related to several cyber security behavioral topics (Cronbach's α = .747). Behavioral topics and underlying specific behaviors were first identified through a review of the literature, resulting in 18 behaviors in seven behavioral clusters: (1) Use of passwords; (2) Backing up important files; (3) Installing updates; (4) Use of security software; (5) Online security alertness; (6) Online disclosure of personal information; and (7) Dealing with attachments and hyperlinks. An example statement for measuring Use of passwords is: “I use the same password for different applications and services, such as social media, online banking and web shops”. Answer options for respondents were: always, often, sometimes, rarely, or never (5-point Likert-type scale) or not applicable. The highest score (5) was assigned to the safest answer option for each statement. Hence, the dependent variable ‘cyber security behavior’ is the average of these 18 statements.
Independent Variables: Capability, Opportunity, and Motivation. To measure capability, a knowledge test was conducted consisting of 19 multiple choice questions about online safety and related topics (see also [3]). An example question is: “Which statement
is correct? A firewall is a system that…; a. … is used to filter and block unwanted e-mail from the Inbox; b. … can protect a network or computer against outside abuse; c. … is also known as IDS (Intrusion Detection System); d. All the above statements are correct; e. I do not know”. Hence, the independent variable ‘capability’ concerns the number of correct answers from respondents in this knowledge test. The opportunity that people have for safe online behavior was measured with two statements that were adapted from the Theoretical Domains Framework-based questionnaire developed by Huijg, Gebhardt, Crone, Dusseldorp and Presseau [15]: one statement regarding the demands that come from people's social environment (“People around me (family / friends / acquaintances) believe online safety is important”) and one statement regarding people's financial resources for protection (“Our household has sufficient financial resources to purchase security resources, such as a virus scanner, VPN or cloud service”). A five-point Likert type scale was used for both items, ranging from “completely disagree” to “completely agree”. The overarching measuring scale, however, had insufficient internal consistency. Therefore, both items are treated as separate independent variables. Hence, we continue this research with two variables: social environment and financial resources. The higher respondents scored on both variables, the more opportunity they indicate having for cyber secure behavior. In order to measure motivation, a three-item scale was developed, adapted from Herath and Rao [16]. Three items were measured on a five-point Likert type scale ranging from “totally disagree” to “totally agree”. The independent variable motivation concerns the average of these items (Cronbach's α = .61). An example item is: “I want to do everything I can to protect myself against cybercrime”.
Table 1. Descriptive statistics and Pearson zero-order correlations among the study variables.
| Variable | Mean | SD | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|---|
| 1. Cyber security behavior | 3.81 | 0.50 | 1 | | | | |
| 2. Capability | 12.24 | 3.69 | 0.374** | 1 | | | |
| 3. Financial resources | 3.85 | 0.86 | 0.148** | 0.177** | 1 | | |
| 4. Social environment | 3.83 | 0.67 | 0.111** | −0.084** | 0.100** | 1 | |
| 5. Motivation | 3.97 | 0.58 | 0.250** | 0.015 | 0.238** | 0.238** | 1 |
Notes. N = 2,419. **Correlation is significant at the 0.001 level (2-tailed).
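The scale scores summarized in Table 1 rest on two computations described in Sect. 2.2: recoding each answer so that 5 is the safest option before averaging, and Cronbach's α as a measure of internal consistency. The following is a minimal sketch of both. The function names and the numeric answer mapping are hypothetical, and the handling of “not applicable” (averaging over the answered items only) is an assumption, since the text does not state how such answers were treated.

```python
from statistics import mean, variance

# Assumed numeric mapping of the answer options.
ANSWERS = {"never": 1, "rarely": 2, "sometimes": 3, "often": 4, "always": 5}


def score_item(answer, safest):
    """Return a 1-5 score where 5 is the safest answer, or None for N/A.

    `safest` is "always" for statements describing safe behavior and
    "never" for statements describing unsafe behavior (such as reusing
    the same password), which are therefore reverse-scored.
    """
    if answer == "not applicable":
        return None
    raw = ANSWERS[answer]
    return raw if safest == "always" else 6 - raw


def scale_score(item_scores):
    """Average the answered items (assumption: N/A items are skipped)."""
    return mean(s for s in item_scores if s is not None)


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    `rows` holds one list of k item scores per respondent:
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals)).
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

For example, answering “never” to the password reuse statement gives `score_item("never", safest="never") == 5`, the safest score.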
Control Variables. Given that socio-economic status has been shown to relate to cybercrime victimization, and, to a large extent, cybercrime victimization can be traced back to the behavior of people, we controlled for educational level, having children below the age of 16, and employment status. Further, respondents were asked if they have a partner with whom they have been together for at least three months. To explicitly consider the possible effects of gender and age, we also controlled for these variables in all analyses.
3 Results
3.1 Descriptive Statistics and Bivariate Analyses
Descriptive statistics were computed using frequencies and crosstabs. It appears that security behavior is highly prevalent amongst Dutch citizens. More specifically, citizens report that they do not share passwords with others (M = 4.68; SD = 0.59), delete e-mails that they do not trust (M = 4.74; SD = 1.02), and almost never open attachments in e-mails from unknown senders (M = 4.62; SD = 0.59). Of all 18 behavioral items, respondents report on average the least secure behavior regarding encrypting personal information (M = 2.33; SD = 1.29) and verifying the authenticity of a suspicious e-mail by contacting the sender (M = 2.46; SD = 1.20). Means, standard deviations, and Pearson zero-order correlations between all variables are presented in Table 1. As could be expected based on the theoretical work of Michie et al. [9], all correlations with behavior were positive and significant. As can be seen in Table 1, the correlations between financial resources, social environment, and motivation were also significant, as expected. However, contrary to theoretical predictions, the relation between capability and motivation was not significant, while the correlation between capability and social environment was significant.
Table 2. Results of regression analysis for cyber security behavior.
| Model | Variable | Model 1 SE | Model 1 B | Model 2 SE | Model 2 B |
|---|---|---|---|---|---|
| 1. Control variables | Gender (male) | 0.072 | 0.113*** | 0.019 | 0.018 |
| | Educational level | 0.007 | 0.082*** | 0.006 | −0.046* |
| | Age (years) | 0.001 | 0.087** | 0.001 | 0.164*** |
| | Employed (yes) | 0.023 | 0.009 | 0.021 | −0.013 |
| | Living together (yes) | 0.022 | 0.019 | 0.020 | 0.013 |
| | Child living at home (< 16 year) (yes) | 0.016 | 0.013 | 0.014 | 0.031 |
| 2. Main effects | Capability | | | 0.003 | 0.442*** |
| | Social Env | | | 0.014 | 0.071*** |
| | Fin. Resources | | | 0.011 | 0.011 |
| | Motivation | | | 0.016 | 0.185*** |
| | F | | 11.526*** | | 73.852*** |
| | R2 | | 0.028 | | 0.235 |
| | delta R2 | | | | 0.207*** |
Notes. N = 2,426. * p < .05, **p < .01, ***p < .001. B = unstandardized regression coefficient; SE = standard error.
3.2 Multivariate Analysis
The multivariate analysis was performed with multivariate linear regression models. We tested two models with cyber security behavior as the dependent variable to isolate the contribution of different terms. The first model tested the contributions of our control variables. In the second model, capability, opportunity, and motivation were added to the regression model. The results of both models are presented in Table 2. The control variables in Model 1 significantly predict cyber security behavior (F = 11.526; p < .001) and explain 2.8% of the variance. The regression coefficients show that gender, educational level, and age are significantly associated with cyber security behavior, while employment, cohabitation, and parenthood are not. The higher the age, the safer the reported behavior. Also, males reported behaving in a more cyber secure manner. Further, the higher the educational level, the safer the reported behavior. The addition of capability, opportunity, and motivation in Model 2 significantly increased the explanatory power of the model (delta R2 = 0.207, p < .001). The positive and significant coefficients for capability, social environment, and motivation indicate that respondents with more knowledge, more opportunity in their social environment, and more motivation are more likely to behave in a cyber secure manner. In total, Model 2 explains 23.5% of the variance in cyber security behavior. Collinearity tolerance values for Model 2 were all well above 0.10, indicating that multicollinearity was not a problem in the regression equations [17].
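The F statistics of such a two-step regression follow directly from the explained variance, the number of predictors, and the sample size, so the reported values can be checked for internal consistency. The sketch below uses the standard formulas for the overall F test and the F test of the change in R2. It assumes that all 2,426 respondents entered both models and that each control variable counts as one predictor (6 in Model 1, 10 in Model 2); neither assumption is stated in the text.

```python
def overall_f(r2, k, n):
    """F statistic for a model with k predictors, n cases, and fit r2."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def f_change(r2_full, r2_reduced, k_full, k_added, n):
    """F statistic for the R2 gained by adding k_added predictors."""
    gained = (r2_full - r2_reduced) / k_added
    residual = (1 - r2_full) / (n - k_full - 1)
    return gained / residual


N = 2426               # final sample size reported in the text
K1, K2 = 6, 10         # assumed number of predictors in Models 1 and 2
R2_1, R2_2 = 0.028, 0.235

f_model_1 = overall_f(R2_1, K1, N)
f_model_2 = overall_f(R2_2, K2, N)
f_delta = f_change(R2_2, R2_1, K2, K2 - K1, N)

print(f"Model 1 F: {f_model_1:.2f}")
print(f"Model 2 F: {f_model_2:.2f}")
print(f"F change:  {f_delta:.2f}")
```

Under these assumptions the computed values land close to the reported F = 11.526 and F = 73.852; small gaps are to be expected because the R2 values are rounded to three decimals.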
4 Discussion
This explanatory study, using a large representative data set, sought to understand the self-reported cybersecurity behavior of Dutch people. This knowledge can be a stepping stone towards developing interventions that make people behave in a more cyber secure manner. The predictive factors for cybersecurity behavior included in this study were capability, opportunity, and motivation. According to Michie et al. [9], capability, opportunity, and motivation interact to generate behavior. More specifically, opportunity can influence motivation, as can capability. This study empirically demonstrated the predicted relation between opportunity and motivation, but not between capability and motivation. It appears that, contrary to predictions by Michie et al. [9], capability, as measured in this research via knowledge, is not related to the motivation to take protective measures. Further, our study demonstrated a relation between opportunity and capability that is currently not present in the model of Michie et al. [9]. Both the social environment and financial resources are related to capability. Interestingly, the relation between the social environment and capability is a negative one. Perhaps the perceived need to acquire knowledge on cybersecurity is lowered when people know that the people in their direct social environment are knowledgeable on the subject. Hence, the larger the social group people are part of, the greater the chance that someone else has already acquired knowledge on cybersecurity, diminishing the need for others in that social group to do so as well. When we look more closely at how behavior is generated, we find that both capability and motivation play an important role. This finding is in line with previous
research, which found that people who have relevant knowledge and are motivated to protect themselves have the intention to act more securely [2, 18–22]. The current study also demonstrated a positive link between the social environment and cybersecurity behavior. It is plausible to conclude that the social environment of citizens has a positive influence on their intention to behave in a more cyber secure manner. However, no relationship was found between security behavior and the financial resources one has to protect oneself. This may be explained by the fact that opportunities for safe online behavior, such as anti-virus software, firewalls, and browser plugins, are often open source and available free of charge from the internet, or are already integrated in the preinstalled operating systems on our devices. Hence, while physical security costs money, and better security costs more money, physical opportunity for better cyber security is often one mouse click away, for free, and readily integrated in the digital services or the operating system one uses. An important weakness of the current study is that we only looked at self-reported behaviors. We investigated how people say they typically behave online or would behave in a hypothetical situation. Evidence is accumulating, however, that people's self-reported behavior does not always correspond to their actual behavior [23]. In the realm of cybersecurity, this may be because people do not want to disclose what they do online, have forgotten it, or give socially desirable answers. It is also plausible that not everyone realizes that they are behaving unsafely while online. For instance, people do not always notice that they have clicked on a hyperlink in a phishing e-mail, leaked personal information through online services, or downloaded malware onto their system.
When research focuses solely on self-reported online behavior, it may therefore result in an incorrect picture of how people actually behave online. Within the domain of cybersecurity, however, previous studies in which actual behavior was measured are scarce. The studies that have been done mostly focus on phishing victimization. Such studies often use phishing tests to objectively measure people's susceptibility to phishing, that is, to test their resistance to phishing attacks (see, for example, [3], for an overview). A study that extends beyond reactions to phishing e-mails is underway at The Hague University of Applied Sciences to assess how various variables affect the actual behavior of Dutch citizens [24].
Funding and Acknowledgments. This work was supported by the WODC (Research and Documentation Centre) of the Ministry of Justice and Security.
References
1. Jansen, J., Leukfeldt, R.: Phishing and malware attacks on online banking customers in the Netherlands: a qualitative analysis of factors leading to victimization. Int. J. Cyber Criminol. 10(1), 79–91 (2016). https://doi.org/10.5281/zenodo.58523
2. Van der Kleij, R., Wijn, R., Hof, T.: An application and empirical test of the Capability Opportunity Motivation-Behaviour model to data leakage prevention in financial organizations. Comput. Secur. 97, 101970 (2020)
3. Cain, A.A., Edwards, M.E., Still, J.D.: An exploratory study of cyber hygiene behaviors and knowledge. J. Inf. Secur. Appl. 42, 36–45 (2018). https://doi.org/10.1016/j.jisa.2018.08.002
4. Crossler, R.E., Bélanger, F.: Why would i use location-protective settings on my smartphone? Motivating protective behaviors and the existence of the privacy knowledge-belief gap. Inf. Syst. Res. 30(3), 995–1006 (2019)
5. Stanton, J.M., Stam, K.R., Mastrangelo, P., Jolton, J.: Analysis of end user security behaviors. Comput. Secur. 24(2), 124–133 (2005)
6. Padayachee, K.: Taxonomy of compliant information security behavior. Comput. Secur. 31(5), 673–680 (2012)
7. Crossler, R.E., Bélanger, F., Ormond, D.: The quest for complete security: an empirical analysis of users' multi-layered protection from security threats. Inf. Syst. Front. 21(2), 343–357 (2017). https://doi.org/10.1007/s10796-017-9755-1
8. Symantec: Security Center White Papers (2018). https://www.symantec.com/security-center/white-papers
9. Michie, S., Stralen, M.M. Van, West, R.: The behaviour change wheel: a new method for characterising and designing behaviour change interventions. Implement. Sci. 42 (6) (2011). https://doi.org/10.1186/1748-5908-6-42
10. Brown, J., Kotz, D., Michie, S., Stapleton, J., Walmsley, M., West, R.: How effective and cost-effective was the national mass media smoking cessation campaign ‘Stoptober’? Drug Alcohol Depend. 135, 52–58 (2014)
11. Atkins, L., Michie, S.: Changing eating behaviour: what can we learn from behavioural science? Nutr. Bull. 38(1), 30–35 (2013)
12. Van ’t Hoff-de Goede, S., Van der Kleij, R., Van de Weijer, S., Leukfeldt, R.: Hoe veilig gedragen wij ons online? Een studie naar de samenhang tussen kennis, gelegenheid, motivatie en online gedrag van Nederlanders. WODC report (2019). https://www.wodc.nl/binaries/2975_Volledige_Tekst_tcm28-421151.pdf
13. CBS: Bevolking; geslacht, leeftijd en burgerlijke staat (2019). https://opendata.cbs.nl/statline/#/CBS/nl/dataset/7461BEV/table?fromstatweb
14. Mullinix, K.J., Leeper, T.J., Druckman, J.N., Freese, J.: The generalizability of survey experiments. J. Exp. Polit. Sci. 2(2), 109–138 (2015)
15. Huijg, J.M., Gebhardt, W.A., Crone, M.R., Dusseldorp, E., Presseau, J.: Discriminant content validity of a theoretical domains framework questionnaire for use in implementation research. Implement. Sci. 9 (11) (2014). https://doi.org/10.1186/1748-5908-9-11
16. Herath, T., Rao, H.R.: Protection motivation and deterrence: a framework for security policy compliance in organisations. Eur. J. Inf. Syst. 18(2), 106–125 (2009). https://doi.org/10.1057/ejis.2009.6
17. Cohen, P., West, S.G., Aiken, L.S.: Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Psychology Press, New York (2014)
18. Arachchilage, N.A.G., Love, S.: Security awareness of computer users: a phishing threat avoidance perspective. Comput. Hum. Behav. 38, 304–312 (2014). https://doi.org/10.1016/j.chb.2014.05.046
19. Downs, J.S., Holbrook, M., Cranor, L.F.: Behavioral response to phishing risk. In: Proceedings of the Anti-Phishing Working Groups - 2nd Annual eCrime Researchers Summit, pp. 37–44. New York, NY, USA: ACM Press (2007). https://doi.org/10.1145/1299015.1299019
20. Holt, T.J., Bossler, A.M.: Examining the relationship between routine activities and malware infection indicators. J. Contemp. Crim. Justice 29(4), 420–436 (2013). https://doi.org/10.1177/1043986213507401
21. Parsons, K., McCormac, A., Butavicius, M., Pattinson, M., Jerram, C.: Determining employee awareness using the Human Aspects of Information Security Questionnaire (HAIS-Q). Comput. Secur. 42, 165–176 (2014). https://doi.org/10.1016/j.cose.2013.12.003
246
R. van der Kleij et al.
22. Shillair, R., Cotten, S.R., Tsai, H.Y.S., Alhabash, S., Larose, R., Rifon, N.J.: Online safety begins with you and me: convincing Internet users to protect themselves. Comput. Hum. Behav. 48, 199–207 (2015). https://doi.org/10.1016/j.chb.2015.01.046 23. Ellis, D.A.: Are smartphones really that bad? Improving the psychological measurement of technology-related behaviors. Comput. Hum. Behav. 97, 60–66 (2019) 24. Van ’t Hoff-de Goede, S., Leukfeldt, E., Van der Kleij, R., Van de Weijer, S.: The online behaviour and victimization study: the development of an experimental research instrument for measuring and explaining online behaviour and cybercrime victimization. In: Weulen Kranenbarg, M., Leukfeldt, R. (eds.) Cybercrime in Context, Crime and Justice in Digital Society, vol. I, pp. 21–41. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-605 27-8_3
Evaluating the BOLT Application: Supporting Human Observation, Metrics, and Cognitive Work Analysis
Aryn Pyke1(B), Ben Barone2, Blaine Hoffman3, Michael Kozak2, and Norbou Buchler3
1 United States Military Academy, West Point, NY, USA
[email protected]
2 Lockheed Martin Advanced Technology Laboratories, Arlington, VA, USA
{ben.a.barone,michael.kozak}@lmco.com
3 Army Combat Capabilities Development Command Data & Analysis Center, Aberdeen, MD, USA
{blaine.e.hoffman.civ,norbou.buchler.civ}@mail.mil
Abstract. To evaluate a product (e.g., intrusion detection software) it is useful to track the workflow of the test user (analyst). The Behavioral Observations Logging Toolkit (BOLT) application allows observers to efficiently log data about analysts’ actions and affect as they engage with a product. BOLT’s backend supports data aggregation and visualization. In contrast, pen and paper observations are time-consuming to digitize, aggregate, and analyze, as they are often unstructured, have inconsistent vocabulary, and lack precise temporal information. BOLT provides observers with a common (though customizable) ontology for observation types and observed subtasks. BOLT timestamps observations to support workflow visualizations (Gantt charts) and reveal the observed analyst’s time to complete subtasks (informative when comparing products). BOLT was applied at product evaluation events. We discuss how metrics like observation density vary with: i) the number of analysts per observer; ii) BOLT usability; and iii) usability of the product being evaluated by analysts.
Keywords: Human observation metrics · Digital data collection application · Cognitive work analysis · Cyber product evaluation
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 247–255, 2021. https://doi.org/10.1007/978-3-030-79997-7_31
1 Motivation
To evaluate the usability and effectiveness of a product (e.g., intrusion detection software for cybersecurity) it is valuable to track the workflow, cognitive load, and affect of a person using it (e.g., cyber analysts). The Behavioral Observations Logging Toolkit (BOLT) application allows human observers to do just that. Historically, to gather data on workflow and affect, product users (analysts) would be asked to self-report their thoughts, actions, and/or feelings during either task execution (verbal protocols) or post-hoc interviews and surveys (e.g., the System Usability Scale, [1]). Such self-report methods have several disadvantages, however. Verbal protocols require a person to multitask and may alter the execution of the task [2], are often incomplete or ambiguous, and are time-consuming to code/analyze. Post-hoc reports (like After Action Reviews), although they do not interfere with task execution, are subject to the fallibility of memory and do not provide as fine-grained data about the time course of workflow and affect [3]. Having a third-party observer track and log the analyst’s actions (continuous time-motion observation [4]) avoids many of these issues.
In cyber product evaluations, observations are often logged via pen and paper and later manually entered into a computer. This process is time-consuming, as is the need to code, aggregate, and analyze the data, which may be largely unstructured, involve inconsistent vocabulary, and lack precise temporal information. In contrast, using an application can improve efficiency relative to manual logging (e.g., application to log observations of hand washing behavior, [5]).
We developed the general purpose BOLT application to support flexible and efficient logging of observation and workflow data. Beyond saving the time/cost of digital reentry of paper logs, BOLT also associates each observation with an exact timestamp. Timestamps facilitate the accurate representation of a user’s workflow (time needed to complete phases of a task within the product), an important consideration for product comparison. Timestamped data also enables workflow visualizations (e.g., Gantt charts). BOLT also provides observers with a common structured ontology for the types of observations and observed subtasks. This ontology, customizable via settings in the application configuration, supports different events and task contexts while facilitating the interpretation and compilation/comparison of data across observers within a single event.
Observer load is also lightened as observers can log an analyst’s transition between subtasks with the touch of a button and then easily add additional notes or details as necessary. The touch-capable interface aims to reduce head-down time in BOLT for more observation opportunities. For the current project, observers used BOLT (on Microsoft® Surface devices) to facilitate the logging of observation data of cybersecurity analysts assessing cyber security software tools at two events.
2 Using the BOLT Interface
BOLT has two primary components: a touchscreen-compatible front-end User Interface (UI) for observers and a backend server infrastructure for data aggregation and visualization. We will primarily discuss the UI, developed and matured over the time BOLT was used to collect the data analyzed in this paper. Below we will outline BOLT’s initial deployed prototype, Version 1 (V1), and a subsequent improved interface, Version 3 (V3), that evolved from feedback and design evaluations [6]. We will not discuss in detail the Study Details page, which allows an event organizer to specify details about a study event or scenario (e.g., how many observers, the set of subtasks available, etc.).
For our use case, observers mainly interacted with the Scenario page, the primary interface for observation logging during operation. It consists of the following three sections: the Task Selection Panel, the Observation Log, and Active Tasks. The configuration of the event scenario will determine task groupings and labels in the Task Selection Panel. Each of these can be clicked or tapped to start tracking the workflow activity for an observed analyst, which adds a copy of the activity to both the Active Tasks and the Observation Log. The observer can right-click the entry or use the edit icon to access a pop-up panel to add or edit notes, enter affect responses (i.e., likability of or frustration with product related to product-acceptance), and take pictures (if enabled). Double-clicking or tapping an item in the Active Tasks section will add a stop event for the activity in the log and remove it from that section.
Fig. 1. BOLT version 1 (V1) interface - Scenario page (labeled regions: Observation Log, Task Selection Panel, Active Tasks)
Figure 1 shows the V1 interface as it was used at the first evaluation event. The Observation Log, which contains a tabular, sequential representation of recorded data, dominates the interface. The Active Tasks section tracks the observed analysts’ ongoing tasks. When an analyst began a new task, an observer had to click the task in the Task Selection Panel, then select the relevant analyst from a list. The tabular view focus in V1 was useful for analysis and after-action review, but it was not ideal for supporting real-time human observation. After the event, feedback from observers and interface testing conducted by the collaboration of the Army Data Analysis Center, JAIC, and Lockheed Martin teams pushed towards a newer interface layout [6].
Figure 2 shows BOLT V3, used at the second product evaluation. While the three sections remain, their size and locations were altered to better support the human observer collecting and logging data in real time. The largest focus is given to the Active Tasks section to help the observer quickly see what is ongoing for each analyst. The tabular Observation Log is still present to provide time-ordered records. Either section can be used to edit and add information to an entry. The Active Tasks section is also split into dashboards to support the observation of multiple analysts. Clicking/tapping in an analyst’s dashboard marks that analyst active. The active analyst’s box is highlighted, and any tasks clicked in the Task Selection Panel will be assumed to reflect the workflow of that analyst, eliminating the need for a user selection list. Lastly, the dialogue box for adding notes was modified to provide a view of previous notes, making it easier for an observer to review and re-use relevant phrases and comments.
Fig. 2. BOLT version 3 (V3) interface - Scenario page (labeled regions: Task Selection Panel, Active Tasks, Observation Log)
3 Use Case: BOLT to Support Cyber Product Evaluations
As a use case, BOLT was deployed to support two commercial product evaluation events. During the evaluation scenarios, observers using BOLT logged network incident response analysts’ tasks as the analysts followed their assessment workflows while operating new cyber software products. Types of observations that observers could log included: notes, task starts/stops, analyst affect (good, neutral, bad), analyst requests for support, and observer strikeouts (deleting a logged observation). The total number of observations is the sum of all of the above types. Since the events varied in duration, data were normalized to obtain observation density metrics (dividing the number of each type of observation for each observer by the session duration). For product A’s evaluation, 8 observers using BOLT V1 each observed two analysts. For product B’s evaluation, 8 observers using BOLT V3 each observed one analyst. We explored how data metrics varied with the following: i) BOLT usability; ii) number of analysts per observer; and iii) the usability of the product being evaluated. We also demonstrate the utility of BOLT’s precise timestamps (for start and end times of analyst tasks) to allow for workflow visualization and comparison across analysts.
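The density normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not BOLT's own code: the type labels, the function name `observation_density`, and the sample log are all assumptions made for the example.

```python
from collections import Counter

# Observation types named in the text; the string labels are illustrative only.
OBSERVATION_TYPES = (
    "note", "task_start", "task_stop", "affect", "support_request", "strikeout",
)


def observation_density(observation_types, session_hours):
    """Return observations per hour, by type and in total, for one observer."""
    if session_hours <= 0:
        raise ValueError("session_hours must be positive")
    counts = Counter(observation_types)
    density = {kind: counts.get(kind, 0) / session_hours for kind in OBSERVATION_TYPES}
    # The total is the sum over all logged observation types.
    density["total"] = sum(counts.values()) / session_hours
    return density


# Hypothetical log for a 2-hour session (values are made up for illustration).
log = ["note", "task_start", "note", "affect", "task_stop", "strikeout"]
print(observation_density(log, session_hours=2.0))
```

Dividing by session duration is what makes sessions of different lengths comparable; how the per-analyst figures were derived when one observer watched two analysts is not specified here and is left out of the sketch.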
3.1 Observational Data Metrics
Table 1. Observational data metrics from the two product evaluation events including a statistical comparison across the events of observations logged per analyst.

Observational metric        Product A    Product A        Product B    t           p
                            (Overall)    (Per Analyst)    (Overall)    (df = 14)   (1-tail)
1. Observations/hour        26.8         13.6             51.7         −2.48       .021
2. Notes/hour               13.6         6.8              23.7         −2.12       .035
3. Task Started/hour        7.4          3.7              14.9         −3.13       .008
4. Good/Bad Affect/hour     4.2          2.1              3.4          −0.67       .257
5. Support Requests/hour    0.26         0.75             0.75         −1.71       .059
6. Strikeouts/hour          1.1          0.5              1.8          −2.29       .026
7. Strikeouts (%)           4.6%         4.6%             5.1%         0.16        .876
8. Note length (Median)     57.4         57.4             55.5         −0.15       .883
3.2 Impact of BOLT Usability on Observations
We hypothesized that the more usable BOLT became, the more easily observers could log observations, and thus the higher the density of observations expected. The observers who rated the usability of BOLT V3 were the same set of observers who used it to log observations on analysts trying Product B. In this sample, we tested whether individual usability ratings of V3 correlated with differences in observation metrics across observers. We expected that ratings on SUS questions framed positively (odd questions, higher ratings mean more usable) would correlate positively with observational data density, while SUS questions framed negatively (even questions, higher ratings mean less usable) would correlate negatively with data density. For these and other tests of directional hypotheses, we used 1-tailed tests. We report significant (p < .05) and marginal (p < .10) results.
The more an observer rated BOLT as complex (SUS Q2) or cumbersome (Q8), the fewer observations per hour were logged (Q2: r = −.546, p = .081; Q8: r = −.511, p = .098), and in particular the fewer notes per hour (Q2: r = −.533, p = .087; Q8: r = −.587, p = .063). Higher cumbersome ratings also led to fewer logs of analyst affect (r = −.672, p = .034). Conversely, observers reporting that they would like to use BOLT frequently (Q1) logged more analyst affect entries (r = .972, p < .001). Higher ratings of BOLT’s ease of use (Q3) and of people’s ability to learn BOLT quickly (Q7) predicted that logged notes would have longer median length (Q3: r = .510, p = .099; Q7: r = .510, p = .099). Thus, individual differences in the usability ratings of the observational data logging application can impact observational data metrics.
An unexpected trend was discovered for strikeout frequencies. The strikeout feature of BOLT lets observers delete a logged observation. In the observation table, it is not actually deleted, but is assigned a “strikeout” status. This feature might be used, for example, if the observer accidentally logged that the analyst had started a new task when they had not. Since the need to strikeout an observation reflects an input error, and since more usable applications should reduce error, we expected BOLT usability would be negatively correlated with strikeout density and the proportion of strikeouts (#strikeouts/#observations). However, higher ratings of BOLT’s ease of use (Q3) and of people’s ability to learn BOLT quickly (Q7) predicted a higher proportion of strikeouts (Q3: r = .575, p = .068; Q7: r = .575, p = .068), as did BOLT’s overall SUS rating (r = .568, p = .071). Conversely, higher ratings of BOLT’s complexity predicted fewer strikeouts per hour (Q2: r = −.866, p < .001). A possible explanation is that an observer finding BOLT complex may have found it difficult to locate or understand the strikeout feature, leading to fewer strikeouts and potentially data that should have been struck out but was not. Likewise, an observer who found BOLT less usable may have been pre-occupied with entering and managing observations and neglected strikeout actions. Conversely, an observer who rated BOLT well may have had no problem understanding and using strikeouts as part of their workflow. These data generally suggest that the frequency of use of a function like strikeout (to undo or delete an action) may not indicate that the application is not user-friendly, and may indicate the opposite.
This pattern of logging more observations and performing more strikeouts when observers found BOLT more usable was replicated in a secondary analysis: a between-groups comparison of data collected using BOLT V1 versus V3. BOLT V1 was rated significantly less usable (SUS = 29.4, SD = 13.4, N = 7) than BOLT V3 (SUS = 86.9, SD = 9.3, N = 8), t(13) = −9.752, p < .001. Using the more usable V3, observers logged more observations, t(14) = −1.49, p = .087, specifically more analyst tasks, t(14) = −2.07, p = .037. Interestingly, the length of the notes taken did not change across versions.
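For readers unfamiliar with how an overall SUS rating is formed from the odd (positively framed) and even (negatively framed) questions, the following Python sketch applies the standard scoring rule from Brooke [1]. The text does not state the exact scoring procedure used for the values above, so treating it as the standard one is an assumption; the function name `sus_score` and the sample responses are illustrative.

```python
def sus_score(ratings):
    """Standard SUS score (0-100) from ten ratings on a 1-5 scale, Q1 first."""
    if len(ratings) != 10 or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("expected ten ratings between 1 and 5")
    total = 0
    for question, rating in enumerate(ratings, start=1):
        if question % 2 == 1:
            total += rating - 1   # odd item: higher rating means more usable
        else:
            total += 5 - rating   # even item: higher rating means less usable
    return total * 2.5


# Hypothetical responses from one observer (made up for illustration).
print(sus_score([5, 1, 5, 2, 4, 1, 5, 1, 4, 2]))
```

Reversing the even items before summing is what lets a single number summarize both framings, which is why the per-question correlations above are expected to have opposite signs for odd and even questions.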
We suggest that (i) the shift in focus of the BOLT interface from the Observation Log to the Active Tasks section and (ii) the slight reduction in the number of actions required to add an item and edit notes from V1 to V3 contributed to the increase in SUS score, reflecting V3 allowing easier and quicker logging of observations. This aligns with the density metrics.
3.3 Impacts of Observer Expertise on BOLT Usability Ratings and Observations
A factor that may be related to observers’ BOLT usability ratings and their data logging patterns is observer expertise. Observer expertise was expected to lead to both higher BOLT usability ratings and observation densities. For observers using BOLT V3, we had three expertise measures: i) whether or not they had prior experience using BOLT (yes/no); ii) domain expertise in the domain relevant to the evaluated products (A&B) (here, cybersecurity: 0–3 months, 3–12 months, 1–2 years, 2–5 years, over 5 years); and iii) familiarity with the tasks that would be performed by the analysts being observed (1 = not familiar at all to 5 = very familiar). Prior experience with BOLT yielded higher responses (only) to SUS Q1 asserting that the observer would like to use BOLT frequently (r = .745, p = .017). Prior BOLT use also yielded longer median note lengths (72 vs. 46 characters), t(6) = −1.94, p = .050. Experience with BOLT might enable speedier operation of the application, allowing extra time to log more detailed notes.
Unsurprisingly, observer domain experience and familiarity with the analyst tasks were correlated (r = .872, p = .002). We expected that such knowledge would leave the observer with more cognitive resources (‘cycles’) for operating BOLT, reducing the likelihood of frustration with the application (i.e., yield higher usability ratings). However, these expertise measures were not correlated with BOLT usability ratings. We also expected such expertise would leave the observer more cycles for finer grained observations. However, domain expertise was related to fewer observations per hour (r = −.698, p = .027), specifically task starts (r = −.800, p = .009). Observer domain expertise and task familiarity also tended to predict fewer strikeouts made by the observers (domain expertise: r = −.589, p = .062; task familiarity: r = −.571, p = .069). Observers familiar with the domain and observed task would presumably be less likely to initially miscategorize which task an analyst had commenced (reducing tasks logged), and thus would be less likely to need to delete (strikeout) logged observations.
Fig. 3. Workflow visualizations for an analyst for whom Product B afforded high productivity (left) and an analyst for whom Product B afforded low productivity (right).
3.4 Impact of Number of Observed Analysts Per Observer
We also explored how the number of analysts an observer watched simultaneously (1 or 2) might affect the nature and density of the observational data logged per analyst. The null hypothesis would be that the same density of observations would be collected per analyst, but we predicted that the grain size of observations would be coarser (lower density) when an observer had to log data about more than one analyst. We did independent sample t-tests for each metric of interest (based on Levene’s Test, we used the significance tests for equal variances not assumed). When observers only had to log data for one analyst the observation density was higher for several metrics (see Table 1). An issue with this analysis is a confound between number of analysts observed and the usability of the BOLT version used (V1 used to observe two analysts at once was less usable).
We suspected that two analysts per observer and lower BOLT usability both contributed to lower data density. To try to control for usability statistically, we did a follow-up analysis using a usability measure as a covariate (SUS Q2 re: unnecessary complexity). These data were available for V3 observers (watching one analyst each) and were predictive of data densities (Sect. 3.2). Q2 values for V1 observers were estimated using a regression relationship found for V3 (StrikeOutsPerHour = 5.567−2.494 * SUS_Q2, r = −.866, p = .005). With this usability covariate in a between-groups ANOVA (1 vs. 2 analysts observed), there remained effects of number of analysts per observer on both observations per hour per analyst, F(1,13) = 3.62, p = .079 (p = .040 1-tail), and specifically on task starts/stops per hour per analyst, F(1,13) = 6.73, p = .022. If observing two analysts simultaneously, an observer may not notice and log all task transitions made by observed analysts.
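One plausible reading of the estimation step above is that the reported V3 regression line was inverted to recover a Q2 value from each V1 observer's strikeout rate. The Python sketch below shows that inversion. The coefficients come from the text; the inversion itself, the clamping to the 1-5 rating scale, and the function name `estimate_sus_q2` are assumptions made for illustration.

```python
# Coefficients reported in the text for V3 observers:
#   StrikeOutsPerHour = 5.567 - 2.494 * SUS_Q2
INTERCEPT = 5.567
SLOPE = -2.494


def estimate_sus_q2(strikeouts_per_hour, clamp=True):
    """Estimate a SUS Q2 rating by inverting the reported regression line.

    Clamping to the 1-5 rating scale is an assumption of this sketch,
    not something stated in the text.
    """
    q2 = (strikeouts_per_hour - INTERCEPT) / SLOPE
    if clamp:
        q2 = min(5.0, max(1.0, q2))
    return q2


# Hypothetical strikeout rate for one V1 observer (made up for illustration).
print(round(estimate_sus_q2(0.5), 2))
```

Because the slope is negative, lower strikeout rates map to higher estimated complexity ratings, which matches the direction of the correlation reported in Sect. 3.2.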
The data are compatible with our suggestion that observing more than one analyst can lead to coarser-grained observations.
3.5 Impact of the Usability of the Product Being Evaluated
Analyst behavior presumably varies based on the usability or usefulness of the product they are evaluating. Thus, these factors may systematically impact observational data. The higher the analyst’s rating of the usefulness of the product they were evaluating, the more observations (r = .588, p = .083), and specifically task starts/stops (r = .691, p = .043), were logged by their observer. Task density logged by the observer was also related to analysts’ usability ratings of the evaluated product (r = .559, p = .096). It seems plausible that the more useful/usable a product, the better analysts could progress through their workflow, and thus the more tasks they could accomplish in a given time.
3.6 Utility of BOLT’s Time-Stamped Data for Investigating Workflow
BOLT automatically timestamps observations to support workflow visualizations and track the observed analyst’s time to complete tasks. Figure 3 shows workflows (Gantt Charts) of two observed analysts using Product B. The workflow starts with task 1 and ends with task 6 (documenting the work, i.e., reports on network incidents). Analyst 2 quickly and serially did tasks 1–3, and then transitioned back and forth among tasks 4–6. Analyst 9 did a lot of task switching for the early tasks, took longer overall, and invested less time in documenting their work. For Analyst 2, product B afforded greater productivity than it did for Analyst 9 (event organizers’ productivity metrics). Thus, workflow patterns from BOLT timestamps can offer insight in relation to other metrics.
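To make the link between timestamped start/stop events and a Gantt-style workflow concrete, here is a minimal Python sketch that pairs events into task intervals and totals the time per task. The event format, the function names, and the sample data are hypothetical; BOLT's actual log schema is not described in the text.

```python
from datetime import datetime


def task_intervals(events):
    """Pair start/stop events into (task, start, stop, minutes) tuples.

    `events` is a chronological iterable of (timestamp, task, kind) tuples,
    where kind is "start" or "stop". This format is an assumption.
    """
    open_tasks = {}
    intervals = []
    for timestamp, task, kind in events:
        if kind == "start":
            open_tasks[task] = timestamp
        elif kind == "stop" and task in open_tasks:
            start = open_tasks.pop(task)
            minutes = (timestamp - start).total_seconds() / 60
            intervals.append((task, start, timestamp, minutes))
    return intervals


def time_per_task(intervals):
    """Sum the minutes spent on each task across all of its intervals."""
    totals = {}
    for task, _start, _stop, minutes in intervals:
        totals[task] = totals.get(task, 0.0) + minutes
    return totals


# Hypothetical events for one analyst who returns to task 1 after task 2.
t = datetime.fromisoformat
events = [
    (t("2021-01-01T09:00:00"), "task 1", "start"),
    (t("2021-01-01T09:10:00"), "task 1", "stop"),
    (t("2021-01-01T09:10:00"), "task 2", "start"),
    (t("2021-01-01T09:25:00"), "task 2", "stop"),
    (t("2021-01-01T09:25:00"), "task 1", "start"),
    (t("2021-01-01T09:30:00"), "task 1", "stop"),
]
print(time_per_task(task_intervals(events)))
```

Each interval corresponds to one bar in a Gantt chart, so the number of intervals per task gives a rough measure of task switching, while the totals give time on task.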
4 Conclusions
Observing two analysts at once reduced observation densities, including the number of notes taken per observed analyst. How usable individual observers found BOLT also impacted their observation density. Higher ratings of BOLT’s usability and prior experience using BOLT were also associated with longer note lengths. If free-form notes are a data type of particular interest, it is important to ensure both that i) observers are well trained on the logging tool and ii) observers are only assigned one analyst to observe at a time. While V3 made it easier for observers to log data, further improvements are possible. For example, while prior notes are available to cut and paste, a dictionary of common phrases to add as note content could further reduce data entry time.
Acknowledgments. We thank the Joint Artificial Intelligence Center and Capt Luis Cintron, U.S. Air Force, for collaborating with us to use BOLT in support of their evaluation events, and DreamPort for hosting the events. We would also like to thank MIT Lincoln Labs for initial software development that served as a foundation for BOLT V1. The Office of the Under Secretary of Defense (Research & Engineering Cyber Technologies Program) provided project funding support. This research was supported by the U.S. Army Development Command (DEVCOM) Data & Analysis Center (DAC). The paper content does not necessarily reflect the position or the policy of the Government and no official endorsement should be inferred.
References
1. Brooke, J.: SUS-a quick and dirty usability scale. Usability Eval. Ind. 189(194), 1–7 (1996)
2. Cooney, J.B., Ladd, S.F.: The influence of verbal protocol methods on children’s mental computation. Learn. Individ. Differ. 4(3), 237–257 (1992)
3. Kuusela, H., Paul, P.: A comparison of concurrent and retrospective verbal protocol analysis. Am. J. Psychol. 113(3), 387–404 (2000)
4. Keohane, C.A., et al.: Quantifying nursing workflow in medication administration. J. Nurs. Admin. 38(1), 19–26 (2008)
5. Viswanath, S.K., Jie, L., Meng, Q.S., Yuen, C., Tan, T.Y.: An android app for recording hand hygiene observation data. J. Hosp. Infect. 92(4), 344–345 (2016)
6. Garneau, C., Hoffman, B., Buchler, B.: Behavioral Observations Logging Toolkit (BOLT): Initial Deployed Prototypes and Usability Evaluations. Technical Report, Defense Technical Information Center (2020). https://apps.dtic.mil/sti/citations/AD1099977
Vulnerability Analysis Through Ethical Hacking Techniques
Ángel Rolando Delgado-Pilozo1(B), Viviana Belen Demera-Centeno2, and Elba Tatiana Zambrano-Solorzano2
1 Technical University of Manabi, Postgraduate Institute, Portoviejo, Manabi, Ecuador
[email protected]
2 Faculty of Computer Science, Technical University of Manabi, Portoviejo, Manabi, Ecuador
{viviana.demera,tatiana.zambrano}@utm.edu.ec
Abstract. This article presents a vulnerability analysis using the ethical hacking modalities classified as External Hacking and Internal Hacking, as applied in the information-gathering phase. The study was carried out on the network of the Decentralized Municipal Government of Olmedo County, Manabí, and different techniques from each modality were used to obtain information about this institution. The results obtained, such as the severity level of the vulnerabilities, the most common vulnerability types, and the categories to which they belong, are important inputs for executing the next phases of ethical hacking and determining the real level of risk of the vulnerabilities present.
Keywords: Vulnerabilities · Ethical hacking · Information security · Network attack · Computer risks
1 Introduction
Nowadays, given the rapid development of information and communication technologies, companies and institutions in both the public and private sectors make ample use of computer networks to optimize their communications, but they must contend with one of the greatest dangers: constant exposure to cyber-attacks. The objective of this research is to identify and analyze the vulnerabilities present in the network of the Municipal Government of the Canton of Olmedo, using ethical hacking techniques to determine which actions or events could compromise the security of its information. The problem is motivated by the current dependence on technology: institutions and companies cannot afford unprotected access to their networks and systems, which puts their information, operations, and reputation at risk. These institutions need protection measures in place because a few minutes of downtime can cause massive damage, ranging from a general loss of reputation and credibility for the company or organization to large monetary losses. These attacks also compromise the highly classified information of both the affected company and its employees.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 256–264, 2021. https://doi.org/10.1007/978-3-030-79997-7_32
When we talk about ethical hacking we refer to performing controlled intrusion tests on computer systems; that is, the consultant or pentester acts from the point of view of a cracker (hacker) to find exploitable vulnerabilities in the audited equipment, in some cases even gaining access to the affected system, but always in a supervised environment in which the operability of the client organization’s computer services is not put at risk [1]. The term “hacking” describes the activity of professionals with a defined set of skills who gain access to computer systems in an authorized or unauthorized manner. They achieve this by exploiting vulnerabilities or security flaws. When access is unauthorized, it is called cracking, which is an act motivated by malice, mischief, or activism by people involved in political or social movements. Malicious hacking is becoming commonplace with the growing popularity of the Internet, E-Commerce, and the Internet of Things [2].
2 Theoretical Framework
2.1 Importance of Ethical Hacking
With the growth of the Internet, computer security has become a major concern for businesses and governments. In their search for a way to address the problem, organizations realized that one of the best ways to assess the intruder’s threat to their interests would be to have independent computer security professionals attempt to break into their computer systems. The “ethical hackers” would employ the same tools and techniques as intruders, but would neither damage the target systems nor steal information. Instead, they would assess the security of the target systems and report to the owners the vulnerabilities found, with instructions on how to remediate them [3].
2.2 Ethical Hacker
An ethical hacker is a white hat hacker who hacks for a good cause (such as securing an organization). Ethical hackers have legal authorization to hack into other people’s systems. They scan ports and websites and find the vulnerabilities through which a cracker could attack [4].
2.3 Network Vulnerability Exploration
Vulnerability scanning is the process of using a computer to look for weaknesses in a system; it can also be used to determine vulnerabilities in a network. Security experts can use vulnerability scanning to find weaknesses and fix them in order to protect systems. On the other hand, intruders can also use it to attack and damage a system [5].
2.4 Pentesting (Penetration Testing)
When there is uncertainty about whether security mechanisms such as firewall controls, intrusion detection systems, file integrity monitoring, etc. are effective, it is best to perform a full penetration test. One of the most common techniques for verifying their level of effectiveness is a penetration test, also known as pentesting, which is a vulnerability analysis that locates the individual weaknesses of the system [6].
3 Methodology
This research used a three-phase methodology: information gathering, scanning, and vulnerability analysis. Automatic and manual tools were used to obtain information, evade security, and carry out intrusion tests and computer security studies, so that each stage was completed without affecting the real operation of the institution. In more detail, the methodology involved running tools designed to cause problems in networks and information systems; in addition, different types of scans were performed to identify, in a practical way, the security holes that provided the information needed to find the computer security vulnerabilities. This methodology is more practical, since it focuses its analysis on the use of the same tools with which a given attack is carried out [7].
3.1 Information Gathering Phase
In this first stage, the targets to be attacked are identified based on two possible scenarios:
• Blind testing, i.e., no information is available from the client.
• Tests with information, i.e., the client provides certain information to the auditor [8].
To gather information, visits were made to the facilities of the Autonomous Decentralized Municipal Government of Canton Olmedo (GAD Olmedo), where communication was established with the Administrative Director and the staff of the Technology area. They provided information for this research and granted authorization to analyze the vulnerabilities in the institution’s network through ethical hacking techniques, subject to a commitment to keep the information confidential. The first technique used was Footprinting with Google, which consists of collecting information about the organization using a web browser and the Google search portal. The query “GAD Olmedo” returned about 95,900 results.
To narrow the search, we used Google's search operators. The "site" operator shows whether the institution has other web servers with a designated DNS name within the domain https://www.olmedo.gob.ec/, and the "inurl" operator restricts results to URLs containing the word "Manabi". Combining the two reduced the search to 4 results, from which we can deduce that there are no other web servers within the institution's main domain. Continuing the reconnaissance, we performed DNS footprinting with nslookup. A name query verified that the site has an IPv4 address. Further information about the target was obtained with the set type option, which sets the query type: the NS type returned the name servers for the GAD Olmedo domain, and the MX type returned the mail servers for the domain https://www.olmedo.gob.ec/.
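The combination of search operators described above can be sketched in a few lines of Python. This is only an illustration: the helper names build_dork and search_url are hypothetical, the host olmedo.gob.ec is taken from the domain named in the text, and the exact query string the authors typed is not given in the paper.

```python
from urllib.parse import quote_plus


def build_dork(site: str, inurl: str = "") -> str:
    """Combine the Google `site:` and `inurl:` operators into one query."""
    parts = [f"site:{site}"]
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


def search_url(query: str) -> str:
    """URL-encode the query so it can be pasted into a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Assumed reconstruction of the narrowed search described in the text.
query = build_dork("olmedo.gob.ec", inurl="Manabi")
print(query)
print(search_url(query))
```

Keeping the query construction separate from the URL encoding makes it easy to log the exact search terms used during an authorized engagement.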
Vulnerability Analysis Through Ethical Hacking Techniques
To confirm where the web server is hosted, we used the WHOIS directory, a public list containing domain names and the contact information of the persons or organizations associated with each one. The record confirmed that the domain is in delegated status to the Association of Municipalities of Ecuador (AME). The DNSEnum tool was then used to perform DNS footprinting on the domain www.olmedo.gob.ec, with the --noreverse option to skip reverse DNS queries. This revealed not only the IP address of the server hosting the institutional web page of GAD Olmedo, but also the IP addresses of the name servers and the mail server related to the institutional domain.

3.2 Scanning Phase

This phase consists of determining which ports exist on the active hosts, their status, the protocols they use, and the services running on them and their versions. It began with scanning techniques that identify the open ports on the active hosts and detect the operating system and the services associated with those ports. Using the NMAP port scanner, we first ran a stealth scan against the target hosts with the -sS option (TCP SYN scan). This scan is called half-open because it does not complete a TCP connection: a SYN packet is sent, as if a real connection were to be opened, and a response is awaited. A SYN|ACK response indicates that the port is listening. We then ran a deeper scan in "connect" mode with the -sT option (TCP connect scan). This type of scan is more accurate than the half-open scan, but because it completes the TCP three-way handshake it can leave traces of the connection in the logs. It provided very important information in cases where other factors are involved, such as a firewall, the security implemented on the host, or an endpoint protection tool.
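To make the "connect" mode concrete, the following minimal sketch reproduces its core idea with Python's standard socket module. It is not NMAP and not the authors' tooling: the function name connect_scan is hypothetical, and the demonstration only touches a throwaway listener on the local machine. Such code should be run only against hosts for which authorization has been granted.

```python
import socket


def connect_scan(host: str, ports, timeout: float = 0.5) -> dict:
    """Classify each port by attempting a full TCP connection to it."""
    results = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 when the three-way handshake completes.
            reached = sock.connect_ex((host, port)) == 0
        results[port] = "open" if reached else "closed/filtered"
    return results


if __name__ == "__main__":
    # Demonstration against a temporary listener on the local machine only.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        demo_port = listener.getsockname()[1]
        print(connect_scan("127.0.0.1", [demo_port]))
```

Because every probe completes the handshake, the target's operating system sees an ordinary connection, which is why this mode tends to appear in logs while a half-open SYN scan may not.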
An important piece of information in any scan is the operating system the target host is running and its version. The -O option of NMAP provided this information together with the scan in "connect" mode. With the scanning techniques described above, some active hosts may show no open ports, or may yield false positives with ports reported as open or filtered. In these cases, advanced scanning techniques are needed to confirm the validity of the results. These include firewall evasion, for example a FIN scan against a stateless firewall or bypassing an ipfilter firewall by setting a source port; source routing, which allows the sender of a packet to specify partially or completely the route the packet takes through the network; fragmenting packets or changing the MTU; decoys, where the addresses of other hosts are used as the source of the probes to mask the scanner's own IP address and make the true source difficult to trace; and the zombie scan, a sophisticated port scanning method that completely hides the source IP address of the scan. Figure 1 shows the difference in the results of a scan.
Fig. 1. Scan using decoys (-D) and a scan evading firewalls with a FIN scan (-sF). Source: Own elaboration.
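The scan options discussed in Sect. 3.2 can be collected in a small helper that assembles an NMAP command line without executing it. This is a sketch under stated assumptions: the function nmap_command is hypothetical, the decoy address 192.168.1.50 is invented for illustration, and the paper does not list the exact commands that were run. The flags themselves (-sS, -sT, -sF, -O, -D, --source-port) are documented NMAP options.

```python
import shlex

# Scan types named in the text, mapped to their NMAP flags.
SCAN_FLAGS = {
    "syn": "-sS",      # half-open TCP SYN scan
    "connect": "-sT",  # full TCP connect scan
    "fin": "-sF",      # FIN scan, used here for firewall evasion
}


def nmap_command(target, scan="syn", os_detect=False, decoys=None, source_port=None):
    """Build an argument list for NMAP; nothing is executed here."""
    argv = ["nmap", SCAN_FLAGS[scan]]
    if os_detect:
        argv.append("-O")                      # operating system detection
    if decoys:
        argv += ["-D", ",".join(decoys)]       # decoy source addresses
    if source_port is not None:
        argv += ["--source-port", str(source_port)]
    argv.append(target)
    return argv


# Connect scan with OS detection, as described in the text.
print(shlex.join(nmap_command("192.168.1.2", scan="connect", os_detect=True)))
# FIN scan with a hypothetical decoy; "ME" marks the scanner's own position.
print(shlex.join(nmap_command("192.168.1.2", scan="fin", decoys=["192.168.1.50", "ME"])))
```

Building the argument list separately from running it lets the auditor review and record each command before it touches the network. SYN scans and OS detection normally require elevated privileges.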
3.3 Vulnerability Analysis Phase

This phase consists of identifying security problems in the targets. It can be done manually or automatically by means of auditing software; once the vulnerabilities are detected, the strategy for the attack is defined. The protection and reliability of the data and information found in the analyzed applications, devices, and services must be guaranteed. Products in the vulnerability analysis (VA) market provide capabilities to identify, categorize, and manage vulnerabilities. Gartner Inc., a leading global IT research and consulting firm, synthesizes Gartner Peer Insights reviews of vulnerability assessment solutions in its 2020 "Voice of the Customer" report, in which Tenable, with its Nessus tool, has the highest rating of all vendors. Nessus analyzes vulnerabilities and applies the vulnerability priority rating (VPR), which is the output of Tenable Predictive Prioritization. VPR rates vulnerabilities by severity level (Critical, High, Medium, and Low), determined by two components: technical impact and threat [9]. The vulnerability analysis was carried out in two groups on the main equipment of the telecommunications network and the servers of GAD Olmedo. Before starting, the advanced performance options ("Custom" option) were configured: simultaneous scans were reduced from 30 hosts to 1, and checks per host from 4 to 2. This was done for two main reasons: to avoid detection by a perimeter protection device, and to avoid congesting the remote computers. A vulnerability scan can take minutes, hours, or even days, depending on the number of hosts scanned.
4 Analysis of Results

This research followed a methodology that is directly linked to the phases of ethical hacking. The results obtained in each phase of the vulnerability analysis of the GAD Olmedo network are presented below.

4.1 Results of the Information Gathering Phase

To obtain the results in this phase we worked in both external and internal hacking modalities, using reconnaissance and information gathering techniques such as:
Footprinting with Google, Nslookup, WHOIS directories, web repositories, DNSEnum, and collection with Maltego. These techniques yielded relevant information about the domain of the institutional web page: the IP addresses of the servers hosting the web portal, the mail server, and the related name servers, all of which are located outside the institution's LAN, as well as the IP addresses of the main telecommunications equipment and servers on the GAD Olmedo LAN.
Fig. 2. Results of the information gathering phase. Source: Own elaboration
Figure 2 shows the findings graphically; the only technique that yielded information on every type of finding was searching web repositories. The information gathering technique with Maltego discovered a high number of IP addresses because a ping sweep was performed on the main networking equipment and servers of the GAD Olmedo LAN.

4.2 Results of the Scanning Phase

In this phase, the open ports on the active hosts were identified, together with the operating systems and the services associated with those ports. The scanning techniques used were the following: SYN or half-open scan; full or connect scan; UDP scan; special scans (an RST response means closed), namely the Null scan (all flags off), FIN scan (FIN on), and Xmas scan (FIN + URG + PSH); and ACK scan (an RST response means unfiltered). In some cases advanced scanning techniques were needed to obtain reliable results, because of the security implemented in the network, such as firewalls, IDS, IPS, host security, and endpoint protection tools. With the results obtained from scanning each active host, false positives could be discarded. Table 1 presents the results of the scanning phase. The hosts with the most open ports are those assigned the IP addresses 192.168.1.2 and 192.168.1.12; in addition, the versions of some services using open ports could not be determined. The average is applied as a measure of central tendency; it is also necessary to know what
the mode is, in order to have a clearer view of the type of service that will be tested for vulnerabilities. In addition, the standard deviation is used to show how dispersed the data are around the mean; data grouped close to the average raise the degree of confidence.

Table 1. Results of the scanning phase. Source: Own elaboration

| Scanned host | Ports | Services | Version | Operating system | MAC address |
|---|---|---|---|---|---|
| 192.168.1.1 | 1 | 1 | 0 | 1 | 1 |
| 192.168.1.2 | 6 | 6 | 6 | 1 | 1 |
| 192.168.1.3 | 3 | 2 | 2 | 1 | 1 |
| 192.168.1.4 | 3 | 3 | 2 | 1 | 1 |
| 192.168.1.5 | 3 | 3 | 2 | 1 | 1 |
| 192.168.1.6 | 2 | 2 | 2 | 0 | 1 |
| 192.168.1.7 | 2 | 2 | 2 | 0 | 1 |
| 192.168.1.8 | 2 | 2 | 1 | 1 | 1 |
| 192.168.1.9 | 1 | 1 | 1 | 1 | 1 |
| 192.168.1.11 | 5 | 5 | 3 | 1 | 1 |
| 192.168.1.12 | 10 | 10 | 9 | 1 | 1 |
| Average | 3.45 | 3.36 | 2.73 | 0.82 | 1.00 |
| Mode | 3 | 2 | 2 | 1 | 1 |
| Standard Dev. | 2.66 | 2.69 | 2.57 | 0.40 | 0.00 |
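The summary rows of Table 1 can be checked with Python's standard statistics module. The sketch below recomputes them for the Ports column only; the list is copied from the table, and the use of the sample standard deviation (divisor n - 1) is an assumption that appears consistent with the reported value.

```python
from statistics import mean, multimode, stdev

# Open ports per scanned host, copied from the "Ports" column of Table 1.
ports = [1, 6, 3, 3, 3, 2, 2, 2, 1, 5, 10]

print(f"mean  = {mean(ports):.2f}")
print(f"modes = {multimode(ports)}")   # every value tied for most frequent
print(f"stdev = {stdev(ports):.2f}")   # sample standard deviation (n - 1)
```

Under these assumptions the mean and standard deviation agree with the Average and Standard Dev. rows. For this column the values 2 and 3 each occur three times, so the mode is not unique; the table reports 3.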
4.3 Results of the Vulnerability Analysis Phase

The scanning stage provided important information about open ports, services, versions, operating systems, and so on. The next step is to check which vulnerabilities are present and determine the level of risk they pose. This information is very important because the higher the risk level of a vulnerability, the higher the probability of exploitation. This phase can be carried out with different tools, manual or automatic. One of the most important factors is to configure the chosen tool properly in order to obtain reliable results, such as the severity level, the most common types of vulnerabilities, and the categories to which they belong. To obtain the results in this phase we used the Nessus tool, which is considered a world leader in this area. The analysis was performed on two groups of hosts because the research was conducted with the network in normal activity, and we wanted to avoid affecting the operation of GAD Olmedo and its information and communication technology services. The results for the two groups of hosts are summarized as a frequency distribution in Table 2, which lists the values of the variable, that is, the classification of the impact level of the vulnerabilities.

Table 2. Results of the vulnerability analysis phase. Source: Own elaboration

| Vulnerability classification | Frequency | Percentage | Cumulative percentage |
|---|---|---|---|
| Critical | 1 | 2.38% | 2.38% |
| Mixed | 2 | 4.76% | 7.14% |
| High | 1 | 2.38% | 9.52% |
| Medium | 8 | 19.05% | 28.57% |
| Low | 2 | 4.76% | 33.33% |
| Info | 28 | 66.67% | 100.00% |
| Total | 42 | 100.00% | |
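The percentage and cumulative percentage columns of Table 2 follow directly from the frequencies. A minimal sketch of that computation is shown below; the dictionary is copied from the table, and the variable names are illustrative only.

```python
from itertools import accumulate

# Findings per severity class, copied from the Frequency column of Table 2.
frequencies = {"Critical": 1, "Mixed": 2, "High": 1, "Medium": 8, "Low": 2, "Info": 28}

total = sum(frequencies.values())
running = accumulate(frequencies.values())  # cumulative counts, in table order

rows = [
    (label, count, 100 * count / total, 100 * cum / total)
    for (label, count), cum in zip(frequencies.items(), running)
]

for label, count, pct, cum_pct in rows:
    print(f"{label:<9}{count:>3}{pct:>8.2f}%{cum_pct:>9.2f}%")
```

Read this way, the four actionable classes above Low (Critical, Mixed, High, Medium) account for 12 of the 42 findings, while two thirds of the findings are informational.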
5 Conclusions
• Using several tools in both internal and external hacking yields more reliable results, which can be cross-checked against each other to reduce false positives in the different phases of the vulnerability analysis.
• Advanced scanning techniques make it possible to bypass the security implemented in the network (firewalls, IDS, IPS, host security, endpoint protection tools, etc.) and so obtain reliable results on the status of ports, services, versions, etc.
• Dividing the network into smaller groups for vulnerability discovery reduces the risk of degrading network performance when the analysis is performed during normal activity or in production.
• The results presented are based not only on statistics generated by tools, but also on the results of the ethical hacking carried out in each phase of this research.
• As future work, we propose incorporating machine learning tools for advanced security evasion and for the discovery and exploitation of vulnerabilities under ethical and legal conditions.
References
1. Karina, A.: Hacking ético 101. Babelcube Inc. (2017)
2. Parrales, W.M.A., et al.: La ciberseguridad práctica aplicada a las redes, servidores y navegadores web. 3Ciencias (2019). https://www.researchgate.net/publication/337853806_La_ciberseguridad_practica_aplicada_a_las_redes_servidores_y_navegadores_web
3. Brijesh, K.P., Alok, S., Lovely, B.: Ethical Hacking (Tools, Techniques and Approaches) (2015). Available at https://www.researchgate.net/publication/271079090_ETHICAL_HACKING_Tools_Techniques_and_Approaches
4. Patil, S., Jangra, A., Bhale, M., Raina, A., Kulkarni, P.: Ethical hacking: the need for cyber security. In: 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), pp. 1602–1606. IEEE (2017)
5. Wang, Y., Yang, J.: Ethical hacking and network defense: choose your best network vulnerability scanning tool. In: 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 110–113. IEEE (2017)
6. Santos, Y.: Introducción a las pruebas de penetración. ResearchGate (2015). https://www.researchgate.net/publication/313839026_Introduccion_a_las_pruebas_de_penetracion
7. Muñoz, P., Camilo, C.: Análisis de metodologías de Ethical hacking para la detección de vulnerabilidades en las Pymes (2019). http://repository.unad.edu.co/handle/10596/30302
8. Gudiño, L., Wladimir, M.: Auditoría de seguridad informática en la red interna de la Universidad Técnica del Norte según la metodología Offensive Security Professional Training and Tools For Security Specialists y planteamiento de políticas de seguridad basadas en la norma ISO/IEC 27001 (2017). http://repositorio.utn.edu.ec/handle/123456789/6975
9. Gartner Information Technology Research. https://www.gartner.com/en/documents/3979309
User Perceptions of Phishing Consequence Severity and Likelihood, and Implications for Warning Message Design Eleanor K. Foster1(B) , Keith S. Jones1 , Miriam E. Armstrong1 , and Akbar S. Namin2 1 Department of Psychological Sciences, Texas Tech University, Lubbock, TX, USA
{eleanor.foster,keith.s.jones,miriam.armstrong}@ttu.edu 2 Department of Computer Science, Texas Tech University, Lubbock, TX, USA [email protected]
Abstract. To combat phishing, system messages warn users of suspected phishing attacks. However, users do not always comply with warning messages. One reason for non-compliance is that warning messages contradict how users think about phishing threats. To increase compliance, warning messages should align with user perceptions of phishing threat risks. How users think about phishing threats is not yet known. To identify how users perceive phishing threats, participants were surveyed about their perceptions of the severity and likelihood of 9 phishing consequences. Results revealed perceived severity and likelihood levels for each consequence, as well as relative differences between consequences. Concrete examples of warning messages that reflect these findings are provided. Keywords: Human factors · Cybersecurity · Social engineering · Phishing · Consequence · Severity · Likelihood · Risk · Warning · Message
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 265–273, 2021. https://doi.org/10.1007/978-3-030-79997-7_33

1 Introduction

Phishing occurs when someone attempts to obtain sensitive information through email. One strategy to help thwart phishing attacks is to warn users when an attack is suspected. However, users do not always comply with warnings [e.g., 1]. Thus, researchers have studied how to design cybersecurity warnings to increase compliance. That literature has produced three key recommendations. First, warnings should describe attack consequences [e.g., 2, 3]. For example, a message in response to an email asking the user to provide credit card information could warn that providing the requested information would enable the recipient to freely use their credit card to make any number of purchases and for any amount. Second, warnings should convey attack risk [e.g., 2, 4], which is a function of two factors: 1) the severity of the attack, and 2) the likelihood the user will experience the attack [2]. Continuing the previous example, the warning could convey that doing what the sender asked would be high risk because: 1) it would take a lot of time and effort to work with the credit card company to deal with fraudulent purchases (severity), and 2) it is very likely the email is a phishing attempt
(likelihood). Third, warning messages should align with how users think about attacks [e.g., 5, 6]; otherwise, users will not trust the message [3, 7]. Continuing the previous example, the warning message could merely be descriptive if the user thinks attack risk is low, e.g., “it will be necessary to disavow any fraudulent purchases”. Alternatively, the warning message could be more strongly worded if the user thinks attack risk is high, e.g., “it will be necessary to disavow however many fraudulent purchases the recipient makes, which could require a lot of time and effort”. To create a phishing warning message that accounts for all three recommendations, one would need to describe the risk associated with the phishing attack’s consequences, and in a way that aligns with how users think about that risk. Research has investigated how users think about topics related to cybersecurity [e.g., 8], but not phishing attack consequence risk. Additional research is necessary if we are to design phishing warning messages that comply with all three of the design recommendations described above. 1.1 The Present Study We investigated how users think about risks associated with phishing attack consequences. To do so, users rated the severity and likelihood of phishing attack consequences. We then analyzed those ratings to understand a) the level of perceived severity and likelihood for each consequence, and b) whether perceived severity, likelihood, or both varies across consequences. Finally, we offered concrete recommendations regarding how to apply our findings to the design of phishing warning messages.
2 Method 2.1 Survey The survey consisted of a brief definition of phishing and two sets of questions. One set concerned the severity of 9 common phishing attack consequences (C1 through C9), each rated via a 7-point response item (1 = Not Severe, 7 = Severe); the other set concerned the likelihood of those consequences, each rated via a 7-point response item (1 = Not at all likely, 7 = Very likely). Table 1 provides descriptions for each consequence. C1 and C2 concern situations in which users do what the phisher asked, but are not aware of consequences other than perhaps being tricked. Questions about C1 and C2 allowed us to assess how users perceive the risk of clicking a phishing link or providing personal information per se. C3 through C9 concern situations in which users experience consequences beyond being tricked. Questions about C3 through C9 allowed us to assess how users perceive the risk associated with each of those consequences. 2.2 Procedure Each participant 1) provided informed consent, 2) completed demographic questions, 3) completed the survey, which was embedded within a larger survey and administered online, and 4) received partial course credit. The research complied with the APA Code of Ethics, and was approved by the Texas Tech Institutional Review Board.
Table 1. The 9 common phishing attack consequences investigated.

| Consequence | Description |
|---|---|
| C1 | Phished because clicked phishing link |
| C2 | Phished because gave phisher personal information via email |
| C3 | Phisher gains your username & password for Web site |
| C4 | Phisher performs actions on a Web site as if they were you |
| C5 | Phisher accesses your information stored in a Web site |
| C6 | Phisher deletes your information stored in a Web site |
| C7 | Phisher modifies your information stored in a Web site |
| C8 | Phisher prevents you from logging into a Web site |
| C9 | Phisher takes control over one of your financial accounts |
2.3 Participants In total, 1,649 students in an Introduction to Psychology course completed the study. Their data were examined for missing responses, careless responding, and outlier response patterns. Cases missing responses to more than 10% of our survey items were identified and removed. One hundred and twelve participants were removed for missing data. We then employed a long strings evaluation [9] to identify and remove participants from the data set who rated all response items the same. Five hundred two participants were removed for careless responding. Last, to identify participants whose response patterns were particularly unusual (chi-square p < .001), we computed Mahalanobis Distance [11] for the severity and susceptibility response items separately. One hundred twenty-one cases were removed from the data set for being outliers. After completing these steps, 914 participants (659 females, 252 males, 3 other) remained in the data set. Their ages ranged from 16–49 years (M = 19.03, SD = 2.52).
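Two of the screening steps described above, removal for excessive missing data and removal for identical ratings on every item, can be sketched as simple predicates. This is an illustration only: the function names and the three example cases are hypothetical, the 18-item layout (9 severity plus 9 likelihood ratings) is an assumption based on Sect. 2.1, and the Mahalanobis Distance step is not reproduced here.

```python
def missing_share(responses):
    """Fraction of items left unanswered (None marks a missing response)."""
    return sum(r is None for r in responses) / len(responses)


def is_straightlined(responses):
    """True when every answered item received the same rating."""
    answered = [r for r in responses if r is not None]
    return len(answered) > 1 and len(set(answered)) == 1


def keep_case(responses, max_missing=0.10):
    """Retain a case unless it misses more than 10% of items or is straightlined."""
    return missing_share(responses) <= max_missing and not is_straightlined(responses)


# Hypothetical cases on a 7-point scale, for illustration only.
varied = [4, 5, 6, 6, 6, 5, 6, 6, 7, 3, 2, 3, 3, 3, 3, 3, 3, 3]
flat = [4] * 18
sparse = [None, None, None] + varied[3:]

print([keep_case(case) for case in (varied, flat, sparse)])

# Bookkeeping check on the exclusions reported in the text.
print(1649 - 112 - 502 - 121)
```

The final line confirms that the three reported exclusion counts leave the 914 participants stated in the text.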
3 Results

3.1 Replacing Small Amounts of Missing Data

We retained cases missing fewer than 10% of response items. For those cases, we employed 2 methods to replace the missing data. First, we used hot decking [10] (via the SPSS macro) to replace missing data with values from another participant whose responses were identical to the participant's non-missing data. Second, if no donor case was found, we replaced the missing data point with the mean for that item. Twenty-one data points, about 0.1% of the data set, were replaced using these methods.

3.2 What Were the Perceived Severity and Likelihood Levels?

We used bootstrapping [12] (1000 samples; sampled with replacement; sample size = 914) to compute a mean and confidence interval for perceived severity and likelihood for
each consequence. We did so, rather than computing those statistics for the sample as a whole, to provide the best possible estimate of the population mean for each consequence. Table 2 provides the resultant means and confidence intervals. Inspection of Table 2 suggests perceived severity ranged from moderate to severe, with all consequences except C1 (Phished because clicked phishing link) falling in the fairly severe (above the mid-point but below the top-end) to severe range. In contrast, perceived likelihood fell in the somewhat likely range (above the bottom-end but below the mid-point); no consequences were rated as moderately likely to occur or greater.

Table 2. Perceived severity and likelihood rating means and confidence interval estimates

| Consequence | Mean severity | 95% CI severity | Mean likelihood | 95% CI likelihood |
|---|---|---|---|---|
| C1 | 3.87 | [3.80, 4.00] | 3.18 | [3.08, 3.32] |
| C2 | 5.40 | [5.29, 5.51] | 2.61 | [2.47, 2.73] |
| C3 | 5.69 | [5.60, 5.80] | 3.18 | [3.08, 3.32] |
| C4 | 6.16 | [6.11, 6.29] | 3.15 | [3.07, 3.33] |
| C5 | 6.01 | [5.91, 6.09] | 3.24 | [2.97, 3.23] |
| C6 | 5.47 | [5.40, 5.60] | 3.01 | [3.08, 3.32] |
| C7 | 5.92 | [5.81, 5.99] | 3.12 | [2.88, 3.12] |
| C8 | 5.66 | [5.60, 5.80] | 3.24 | [2.98, 3.22] |
| C9 | 6.68 | [6.63, 6.77] | 3.18 | [3.06, 3.34] |
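The bootstrap procedure behind Table 2 can be illustrated with the standard library alone. The sketch below uses a percentile interval, which is one common choice; the paper does not state which interval method was used, the function name bootstrap_mean_ci is hypothetical, and the ratings are invented for illustration rather than taken from the study.

```python
import random
from statistics import mean


def bootstrap_mean_ci(ratings, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap estimate of the mean and its confidence interval."""
    rng = random.Random(seed)
    n = len(ratings)
    # Resample with replacement, keeping each resample the same size as the data.
    boot_means = sorted(mean(rng.choices(ratings, k=n)) for _ in range(n_boot))
    tail = (1 - level) / 2
    lower = boot_means[int(tail * n_boot)]
    upper = boot_means[int((1 - tail) * n_boot) - 1]
    return mean(boot_means), (lower, upper)


# Hypothetical 7-point ratings for one consequence (not the study's data).
ratings = [7, 6, 7, 5, 7, 6, 7, 7, 4, 7, 6, 7]
estimate, (lo, hi) = bootstrap_mean_ci(ratings)
print(f"mean = {estimate:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

Fixing the seed makes the interval reproducible; in the study each resample would contain 914 ratings rather than the 12 used here.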
3.3 Do Perceived Severity or Likelihood Ratings Vary Across Consequences? We examined differences in perceived risk between the nine consequences to investigate whether any consequences were perceived as more severe or more likely than others. We focused on differences because ratings were not independent [13]. To guard against Type I error, we randomly divided the data set into two sub-sets of 457 participants (Split 1 and Split 2) and performed this analysis on each sub-set. For each participant within each split, we then computed difference scores for each consequence pair (e.g., C1–C2), separately for severity and likelihood. To obtain estimates for each difference score pair, we ran a bootstrap with replacement using 1000 replications, sample sizes of 914, and a corrected alpha of .001. We employed a stringent 99.9% confidence interval because we considered the 36 differences associated with each rating type (severity or likelihood) to be a family and aimed to maintain family-wise error at .05 (alpha = .05/36 = .001). In the following paragraphs, we interpret only effects observed in both splits. 3.3.1 Perceived Severity for Pairs of Consequences Table 3 provides confidence intervals for the 30 severity difference scores that were statistically significant in Split 1 and 2. Table 3 reveals 1) C9 (Phisher takes control
over one of your financial accounts) was rated as significantly more severe than all other consequences, 2) C4 (Phisher performs actions on a Web site as if they were you), C7 (Phisher modifies your information stored in a Web site), and C5 (Phisher accesses your information stored in a Web site) were rated as significantly more severe than C2 (Phished because gave phisher personal information via email), and 3) all consequences were rated as more severe than C1 (Phished because clicked phishing link) (Table 4).
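The arithmetic behind the pairwise analysis in Sect. 3.3 can be made explicit with a short sketch. Nine consequences give 36 unordered pairs, and dividing .05 by 36 yields the per-comparison alpha that the text rounds to .001. The helper names are hypothetical, and treating an interval that excludes zero as significant is an assumption about how the intervals were read; the second example interval is invented.

```python
from itertools import combinations

consequences = [f"C{i}" for i in range(1, 10)]   # C1 ... C9
pairs = list(combinations(consequences, 2))       # unordered consequence pairs
alpha_per_test = 0.05 / len(pairs)                # family-wise .05 split evenly

print(len(pairs), round(alpha_per_test, 4))


def difference_scores(ratings_a, ratings_b):
    """Per-participant difference between ratings of two consequences."""
    return [a - b for a, b in zip(ratings_a, ratings_b)]


def significant(ci):
    """Assumed rule: a difference is significant when its interval excludes zero."""
    lower, upper = ci
    return lower > 0 or upper < 0


# First interval is the Split 1 severity result for C1-C2; second is hypothetical.
print(significant((-1.76, -1.24)), significant((-0.10, 0.20)))
```

A negative interval for a pair such as C1-C2 means the second consequence was rated higher than the first.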
4 Discussion

We had 2 goals: to determine 1) the levels of perceived severity and likelihood for each phishing consequence, and 2) whether individuals rated some consequences as more or less severe, and more or less likely to occur, than others. Our findings related to each are described in the following sub-sections, followed by concrete examples of how our findings can be applied to warning message design.

4.1 Perceived Severity

Consequences fell into one of 3 groups: 1) severe, 2) fairly severe, and 3) moderately severe. Perceptions of severity appear to reflect the extent to which a consequence is contextualized and concrete. The first group comprised a single consequence, i.e., C9 (Phisher takes control over one of your financial accounts). Mean perceived severity for C9 (6.68) approached the top-end of the severity scale (7). Further, perceived severity for C9 was significantly greater than that for all other consequences. To describe the risk associated with C9 in a way that aligns with how users think about C9, one should describe the severity of C9 with very strong language, and that language should be stronger than the language used to describe other consequences. The second group comprised 7 consequences, i.e., C2 through C8. Mean perceived severity for this group ranged from 5.40 to 6.16, which means these consequences were perceived as fairly severe. Perceived severity for consequences at the high end of that range differed significantly from that for the consequence on the low end of that range; however, each of those consequences was also not significantly different from other consequences in this group. That suggests perceived severity for consequences in this group was more homogeneous than not. To describe the risk associated with C2 through C8 in a way that aligns with how users think about those consequences, one should describe severity with fairly strong language, but that language should not be as strong as the language used to describe C9.
The third group was comprised of a single consequence, i.e., C1 (Phished because clicked phishing link). Mean perceived severity for C1 (3.87) approached the severity scale’s mid-point (4); thus, C1 was perceived as moderately severe. Further, perceived severity for C1 was significantly less than that for all other consequences. This presents a challenge for describing the risk associated with C1. Security personnel do not want users to click links in phishing emails. Hoping to discourage users from doing so, they may create strongly worded messages warning users that something extremely bad could happen if they click a link. However, our results suggest such warnings will likely
Table 3. Significant 99.9% confidence intervals for severity difference scores.

| Consequence pair | Split 1 (n = 457) | Split 2 (n = 457) |
|---|---|---|
| C1–C2 | [−1.76, −1.24] | [−1.75, −1.25] |
| C1–C3 | [−2.09, −1.51] | [−2.07, −1.53] |
| C1–C4 | [−2.58, −2.02] | [−2.58, −2.02] |
| C1–C5 | [−2.38, −1.82] | [−2.49, −1.91] |
| C1–C6 | [−1.92, −1.28] | [−1.92, −1.28] |
| C1–C7 | [−2.29, −1.71] | [−2.40, −1.80] |
| C1–C8 | [−2.11, −1.49] | [−2.02, −1.38] |
| C1–C9 | [−3.09, −2.51] | [−3.07, −2.53] |
| C2–C3 | [−.51, −.09] | [−.51, −.09] |
| C2–C4 | [−.90, −.50] | [−1.01, −.59] |
| C2–C5 | [−.83, −.37] | [−.81, −.39] |
| C2–C7 | [−.74, −.26] | [−.83, −.37] |
| C2–C9 | [−1.52, −1.08] | [−1.53, −1.07] |
| C3–C4 | [−.56, −.24] | [−.65, −.35] |
| C3–C5 | [−.46, −.14] | [−.46, −.14] |
| C3–C7 | [−.38, −.02] | [−.47, −.13] |
| C3–C9 | [−1.20, −.80] | [−1.19, −.81] |
| C4–C5 | [.06, .34] | [.06, .34] |
| C4–C6 | [.50, .90] | [.51, .89] |
| C4–C7 | [.05, .35] | [.05, .35] |
| C4–C8 | [.21, .59] | [.41, .79] |
| C4–C9 | [−.65, −.35] | [−.63, −.37] |
| C5–C6 | [.31, .69] | [.30, .70] |
| C5–C8 | [.12, .48] | [.21, .59] |
| C5–C9 | [−.87, −.53] | [−.75, −.45] |
| C6–C7 | [−.65, −.35] | [−.64, −.36] |
| C6–C9 | [−1.42, −.98] | [−1.41, −.99] |
| C7–C8 | [.04, .36] | [.14, .46] |
| C7–C9 | [−.97, −.63] | [−.86, −.54] |
| C8–C9 | [−1.20, −.80] | [−1.30, −.90] |
engender distrust because they will not align with how users think about clicking a potential phishing link [e.g., 3]. Alternatively, one could describe the severity of C1 with moderately strong language that is less strong than the language used to describe all other consequences, which would align with how users think about C1. That should
Table 4. Significant 99.9% confidence intervals for likelihood difference scores.

| Consequence pair | Split 1 (n = 457) | Split 2 (n = 457) |
|---|---|---|
| C1–C2 | [.25, .75] | [.35, .85] |
| C2–C3 | [−.71, −.29] | [−.82, −.38] |
| C2–C4 | [−.61, −.19] | [−.83, −.37] |
| C2–C5 | [−.82, −.38] | [−.94, −.46] |
| C2–C6 | [−.62, −.18] | [−.62, −.18] |
| C2–C7 | [−.72, −.28] | [−.73, −.27] |
| C2–C8 | [−.84, −.36] | [−.93, −.47] |
| C2–C9 | [−.74, −.26] | [−.85, −.35] |
| C3–C6 | [.04, .36] | [.04, .36] |
| C5–C6 | [.06, .34] | [.15, .45] |
| C6–C8 | [−.35, −.05] | [−.36, −.04] |
increase the likelihood that users will trust the message [3, 5]. However, that increase in trust may not translate into compliance if the consequences of not clicking the link are perceived as more severe than the potential consequences of clicking it [5]. In such cases, it may be best to focus the wording of the warning message, not on clicking the link, but rather on what could happen if they do what the phisher asked. For example, one could word the warning message to convey that doing what the phisher asked could allow them to gain your username and password, which users perceive as fairly severe. Doing so would allow for the use of stronger language, which hopefully will convince users that the potential consequences of not doing what the phisher asked are less severe than the potential consequences of doing what they asked. 4.2 Perceived Likelihood Certain consequences differed from one another, but all fell below the mid-point of the scale. Thus, all consequences were perceived as being only somewhat likely to occur. This presents another challenge for describing the risks associated with these consequences. Specifically, to describe risk in a way that aligns with how users think about those consequences, one should convey that these consequences are only somewhat likely to occur (regardless of the actual likelihood that the user will experience those consequences). Doing so should increase users’ trust in the warning message [e.g., 3]. However, it will also probably decrease users’ motivation to do what is required to prevent the attack [14]. Alternatively, one could ignore the recommendation to describe risk in a way that aligns with how users think about those consequences, and instead describe the actual likelihood that the user will experience those consequences [2, 5]. However, as noted earlier, that should decrease users’ trust in the warning message [3] when attack likelihood is moderate to high. 
Accordingly, either of those approaches may do more harm than good. Therefore, it may be best to simply not describe likelihood in warning messages [cf. 2, 5]. That would avoid decreasing (a) users’ motivation to do what is required to prevent the attack or (b) their trust in the warning message. As such, not describing consequence likelihood may be the lesser of the evils.
4.3 Concrete Examples of Warning Messages that Reflect Our Findings
Figure 1 provides concrete examples of messages that reflect our recommendations for warning users about C9 (Phisher takes control over one of your financial accounts), C3 (Phisher gains your username & password for Web site), and C1 (Phished because clicked phishing link).
Fig. 1. Example warning messages reflecting our design recommendations.
Acknowledgements. This research was supported in part by the U.S. National Science Foundation (Awards #1564293 and #1723765). Opinions, findings, and conclusions are those of the authors and do not necessarily reflect the views of the NSF.
References
1. Bravo-Lillo, C., Cranor, L.F., Downs, J., Komanduri, S.: Bridging the gap in computer security warnings: a mental model approach. IEEE Sec. Priv. 9, 18–26 (2011)
2. Hardee, J.B., West, R., Mayhorn, C.B.: To download or not to download: an examination of computer security decision making. Interactions 13, 32–37 (2006)
3. Bartsch, S., Volkamer, M., Theuerling, H., Karayumak, F.: Contextualized web warnings, and how they cause distrust. In: Huth, M., Asokan, N., Čapkun, S., Flechais, I., Coles-Kemp, L. (eds.) Trust 2013. LNCS, vol. 7904, pp. 205–222. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38908-5_16
4. Bauer, L., Bravo-Lillo, C., Cranor, L.F., Fragkaki, E.: Warning design guidelines. CMU-CyLab 13, 1–27 (2013)
5. Bartsch, S., Volkamer, M.: Effectively communicate risks for diverse users: a mental-models approach for individualized security interventions. In: GI-Jahrestagung, pp. 1971–1984 (2013)
6. Blythe, J., Camp, L.J.: Implementing mental models. In: 2012 IEEE Symposium on Security and Privacy Workshops, pp. 86–90. IEEE Press, San Francisco (2012)
7. Ibrahim, T., Furnell, S.M., Papadaki, M., Clarke, N.L.: Assessing the usability of end-user security software. In: Katsikas, S., Lopez, J., Soriano, M. (eds.) TrustBus 2010. LNCS, vol. 6264, pp. 177–189. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15152-1_16
8. Wash, R.: Folk models of home computer security. In: Proceedings of the Sixth Symposium on Usable Privacy and Security, pp. 1–16. ACM, New York (2010)
9. Johnson, J.A.: Ascertaining the validity of individual protocols from web-based personality inventories. J. Res. Pers. 39, 103–129 (2005)
10. Andridge, R.R., Little, R.J.A.: A review of hot deck imputation for survey non-response. Int. Stat. Rev. 78, 40–64 (2010)
11. Meade, A.W., Craig, S.B.: Identifying careless responses in survey data. Psychol. Methods 17, 437–455 (2012)
12. Tabachnick, B.G., Fidell, L.S.: Using Multivariate Statistics, 6th edn. Pearson, Boston (2013)
13. Cumming, G., Finch, S.: Inference by eye: confidence intervals and how to read pictures of data. Am. Psychol. 60, 170–180 (2005)
14. Krol, K., Moroz, M., Sasse, M.A.: Don’t work. Can’t work? Why it’s time to rethink security warnings. In: 7th International Conference on CRiSIS, pp. 1–8. IEEE Press, Cork (2012)
Author Index
A
Afzal, Muhammad Raheel, 52
Agarwal, Anmol, 176
Alvarez-Tello, Jorge, 143
Arezes, Pedro, 60
Ariente Neto, Rafael, 60
Armstrong, Miriam E., 265
Arvola, Mattias, 10
B
Barone, Ben, 247
Berg, Samantha, 121
Bertenthal, Bennett I., 199
Billing, Erik, 87
Bjurling, Oscar, 10
Borges, Guilherme Deola, 60
Buchler, Norbou, 247
Buckley, Oliver, 184
Buele, Jorge, 158
Burlando, Francesco, 103, 135
C
Cai, Yang, 207
Caicedo, Andrés, 143
Campbell, James, 184
Cankaya, Ebru Celikel, 176
Carneiro, Paula, 60
Casiddu, Niccolò, 103
Catoor, Tim, 52
Chaverri, Juan Pablo, 95
Chávez-Chica, Elizabeth, 158
Christofakis, Chris, 52
D
de Mattos, Diego Luiz, 60
Delgado-Pilozo, Ángel Rolando, 256
Demera-Centeno, Viviana Belen, 256
E
Earl, Sally, 184
Elson, Malte, 232
F
Fagan, Shawn E., 199
Ferrari Tumay, Xavier, 135
Fossati, Gina, 176
Foster, Eleanor K., 265
Friedrich, Max, 110
G
Gordón, Jacqueline, 143
Guerrero, Luis, 95
H
Hamadache, Salsabil, 232
Hoffman, Blaine, 247
Holm, Dimiter, 52
Hugenberg, Kurt, 199
J
Jing, Qianqian, 69
Jones, Keith S., 265
K
Kapadia, Apu, 199
Kocharov, David, 110
Kozak, Michael, 247
Krause, Markus, 232
Krausman, Andrea, 121
Kroninger, Christopher, 121
L
Lailari, Guermantes, 43
Leukfeldt, Rutger, 238
Li, Yunhui, 69
Lieb, Joonas, 110
Lindblom, Jessica, 87
Liu, Xinxiong, 150
Louw, Robrecht, 52
Luo, Jing, 69
M
Maier, Siegfried, 3
Merino, Eugenio Andres Diaz, 60
Meyer, Carsten, 35
Mnatsakanyan, Satenik, 110
Mora, Ariel, 95
N
Namin, Akbar S., 265
Neubauer, Catherine, 121
P
Patterson, Wayne, 169, 215
Peeters, Gerben, 52
Pilozzi, Paolo, 52
Porfirione, Claudia, 103
Pyke, Aryn, 247
R
Ramírez-Benavides, Kryscia, 95
Richert, Anja, 128
Robison, Christa, 121
Rosén, Julia, 87
Roth, Gunar, 25
Rutledal, Dag, 79
S
Salazar, Franklin W., 158
Schaefer, Kristin E., 121
Schelle, Alexander, 16
Schiffmann, Michael, 128
Schulte, Axel, 3, 25, 35
Singh, Yogang, 52
Slaets, Peter, 52
Still, Jeremiah D., 223
Storms, Stijn, 52
Stütz, Peter, 16
Sun, Yue, 150
T
Thoma, Aniella, 128
Tiller, Lauren N., 223
V
Vacanti, Annapaola, 103, 135
van ’t Hoff-De Goede, Susanne, 238
Van Baelen, Senne, 52
van de Weijer, Steve, 238
van der Kleij, Rick, 238
Varela-Aldás, José, 158
Vega, Adrián, 95
W
Wade, Lauren, 199
Wilbanks, Linda R., 191
Y
Yayla, Gökay, 52
Z
Zambrano-Solorzano, Elba Tatiana, 256
Zapata, Mireya, 143
Ziemke, Tom, 10
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Zallio et al. (Eds.): AHFE 2021, LNNS 268, pp. 275–276, 2021. https://doi.org/10.1007/978-3-030-79997-7