Human Factors in Simulation and Training Human Factors in Simulation and Training: Application and Practice covers the latest applications and practical implementations of advanced technologies in the field of simulation and training. The text focuses on descriptions and discussions of current applications and the use of the latest technological advances in simulation and training. It covers topics including space adaptation syndrome and perceptual training, simulation for battle-ready command and control, healthcare simulation and training, human factors aspects of cybersecurity training and testing, design and development of algorithms for gesture-based control of semi-autonomous vehicles, and advances in the after-action review process for defence training. The text is an ideal read for professionals and graduate students in the fields of ergonomics, human factors, computer engineering, aerospace engineering, occupational health, and safety.
Human Factors in Simulation and Training Application and Practice Second Edition
Edited by
Dennis Vincenzi, Mustapha Mouloua, P. A. Hancock, James A. Pharmer, and James C. Ferraro
Front cover image: Nadezda Murmakova/Shutterstock Second edition published 2024 by CRC Press 2385 NW Executive Center Drive, Suite 320 Boca Raton, FL 33431 and by CRC Press 4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN CRC Press is an imprint of Taylor & Francis Group, LLC © 2024 selection and editorial matter, Dennis Vincenzi, Mustapha Mouloua, Peter A. Hancock, James A. Pharmer, and James C. Ferraro; individual chapters, the contributors First edition published by CRC Press 2019 Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint. Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact [email protected] Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe. 
ISBN: 978-1-032-51249-5 (hbk) ISBN: 978-1-032-51250-1 (pbk) ISBN: 978-1-003-40135-3 (ebk) DOI: 10.1201/9781003401353 Typeset in Times by Deanta Global Publishing Services, Chennai, India
Contents
Preface
Editors
Contributors
Chapter 1 Controls and Displays for Aviation Research Simulation: A Historical Review
    Gloria L. Calhoun and Kristen K. Liggett
Chapter 2 Augmented Reality as a Means of Job Task Training in Aviation
    Dan Macchiarella, Jiahao Yu, Dahai Liu, and Dennis A. Vincenzi
Chapter 3 Civil Aviation: Flight Simulators and Training
    Ronald J. Lofaro and Kevin M. Smith
Chapter 4 Integrating Effective Training and Research Objectives: Lessons from the Black Skies Series of Exercises
    Christopher Best, Gregory Funke, Winston Bennett, Michael Tolston, Simon Hosking, and Robert Bolia
Chapter 5 Extended Reality in Training Environments: A Human Factors Trend Analysis
    Salim A. Mouloua, Gerald Matthews, John French, and Mustapha Mouloua
Chapter 6 Mitigation of Motion Sickness Symptoms by Adaptive Perceptual Learning: Implications for Space and Cyber Environments
    Mustapha Mouloua, John French, Janan A. Smither, and Robert S. Kennedy
Chapter 7 Decision-Making under Crisis Conditions: A Training and Simulation Perspective
    Jiahao Yu, Tiffany Nickens, Dahai Liu, and Dennis A. Vincenzi
Chapter 8 Healthcare Simulation and Training
    Sarah A. Powers and Mark W. Scerbo
Chapter 9 Best Practices in Surgical Simulation
    Dominique Doster, Christopher Thomas, and Dimitrios Stefanidis
Chapter 10 Healthcare Simulation Methods: A Multifaceted Approach
    Amy L. Hanson and Aaron W. Calhoun
Chapter 11 Design and Development of Algorithms for Gesture-Based Control of Semi-Autonomous Vehicles
    Brian Sanders, Yuzhong Shen, and Dennis Vincenzi
Chapter 12 The Influence of New Realities: How Virtual, Augmented, and Mixed Reality Advance Training Methods in Aviation
    Graham King, Kendall Carmody, and John Deaton
Chapter 13 Training, Stress, Time Pressure, and Surprise: An Accident Case Study
    Julianne M. Fox and Mustapha Mouloua
Index
Preface As we look toward the future, we find that it is neither totally random nor is it totally predictable. This is a truth that persists despite the years that have passed since the publication of the previous edition of this book. We maintain that, if the future were completely predictable, there would be no point looking forward because we would already know what was to come. If it were completely random, we would not bother because we could not know anything systematic about forthcoming events. That life lies between these two polar extremes gives us both the motivation to try to understand the future and the belief that we can do so, at least to a useful degree. Indeed, the triumphs of science encourage us to believe that we are making “progress” in so far as our predictions of the future. At least in relation to many physical processes, these are growing more accurate as the years progress. And, of course, the more we can know about the future, the more we can generate rational courses of action based upon this understanding. This respective confluence of ideas encourages us to develop theories, models, methodologies, and other such instruments to continue to improve our predictive capabilities. However, although certain forms of prediction work well for some of the simpler physical processes, there are many forms of complex interaction in which our predictive capacities are at present rudimentary at best. Unfortunately, many of these complex processes—global warming, for example—may prove so dangerous to our species that we cannot afford to assert predictions that are radically incorrect. Flawed prediction here can spell our end. As a consequence, we are in ever greater need of technologies that allow us to generate and refine predictions as well as exploring alternative potentialities found to be countertheses and antitheses to these various propositions. One such technology is simulation. As a tool, simulation is an aid to the imagination. 
It allows us to create, populate, and activate possible futures and explore the ramifications of these developed scenarios. However, in common with all tools, it performs its task only to the degree that it is open to facile interaction with the user. One can imagine that on many occasions, a poor simulation with its impoverished or wildly inaccurate outcomes might be of even more harm than good. Thus, as with all tools and technologies, we certainly need the application of the branch of science that turns user–machine antagonism into user–machine synergy. That branch of science is human factors. Hence, the focus of this present work is on human factors issues as they pertain to simulation in support of training and predicting humans to do certain tasks. In general, these issues revolve around two central themes, represented uniquely across two books. The theme of this book is pragmatic and utilitarian in nature because it reflexively asks how simulation itself can help address human factors issues. These include training, design, evaluation, testing, certification, and visualization. In this way, human factors seek to both refine and improve the technology of simulation and then, in turn, benefit from those very improvements. The chapters presented in the following text reflect these general concerns, as well as other emerging and critical
issues. The chapters in this book pertaining to application and practice will expand on concepts of interest to the simulation and training communities regarding simulator usage, particularly with respect to the validity and functionality of simulators as training devices. These chapters provide context to the theory surrounding the use of simulation for the purposes of training or evaluating performance. Topics include controls in aviation research, simulation in surgery, and guidance for applications supporting semi-autonomous vehicle operation. Enveloping both theory and application, this book series will address in detail numerous issues and concepts pertaining to human factors in simulation, gathering this important information into two comprehensive volumes. Dennis A. Vincenzi Mustapha Mouloua P. A. Hancock James A. Pharmer James C. Ferraro
Editors Dennis A. Vincenzi earned his doctoral degree in 1998 from the University of Central Florida in Human Factors and Applied Experimental Psychology and has over 25 years of experience as a Human Factors researcher. He was employed by Embry-Riddle Aeronautical University from 1999 to 2004, where he held the position of Assistant Professor in the Department of Human Factors and Systems in Daytona Beach. In 2004, Dr. Vincenzi left Embry-Riddle to work for the United States Navy as a Senior Human Factors Engineer at the Naval Air Warfare Center Training Systems Division (NAWCTSD) in Orlando, FL. His duties included performing Human Factors research involving simulation and training system development for a variety of Navy sea and air platforms, including the F/A-18 Hornet and Super Hornet, F-35 JSF, Los Angeles, Ohio, and Virginia class submarines, and Littoral Combat Ship (LCS). He was also heavily involved in research involving pilot selection, human performance, and ground control station design for a number of Navy, Marine Corps, and Special Operations Command Unmanned Aerial Systems (UAS). Since returning to Embry-Riddle Aeronautical University in 2012, Dr. Vincenzi has been involved in research related to UAS and regulatory requirements within the NAS and has been heavily involved in the development of an experimental gesture-based interface used for investigating user preference, usability, and functionality issues related to interface design in virtual environments. Dr. Vincenzi is currently the Program Chair for the Master of Science in Human Factors program at Embry-Riddle Aeronautical University. Mustapha Mouloua is Professor of Psychology and Director of the Transportation Research Group at the University of Central Florida (UCF), Orlando, FL. He earned his Ph.D. (1992) and M.A. (1986) degrees in Applied/Experimental Psychology from the Catholic University of America, Washington, DC. 
Before joining the faculty at UCF in 1994, he was Postdoctoral Fellow at the Cognitive Sciences Laboratory of the Catholic University of America from 1992 to 1994, where he studied and researched several aspects of human–automation interaction topics sponsored by NASA and the Office of Naval Research (ONR). He has over 30 years of experience in teaching and research related to complex human–machine systems. His research interests include vigilance and sustained attention, cognitive aging, human performance assessment, human–automation interaction, pilot–alerting systems interaction, automation and workload in aviation systems, simulation, and training in transportation systems. Dr. Mouloua made over 300 conference presentations with his undergraduate and graduate students, as well as his professional colleagues. He also has about 200 research publications and scientific reports published in journals and proceedings, such as Experimental Aging Research, Human Factors, Ergonomics, Perception and Psychophysics, Journal of Experimental Psychology: Human Perception and Performance, International Journal of Aviation Psychology, Journal of Cognitive Engineering and Decision Making, Proceedings of the Human
Factors and Ergonomics Society, Applied Ergonomics, Transportation Research Part F: Traffic Psychology and Behaviour, Ergonomics in Design, Transportation Research Record, and International Journal of Occupational Safety and Ergonomics. Together with his colleagues Raja Parasuraman and Robert Molloy, he was the winner of the Jerome Hirsch Ely Award of the Human Factors and Ergonomics Society in 1997. He was previously Director of the Applied/Experimental and Human Factors Psychology doctoral program (2008–2017). At UCF, Dr. Mouloua earned eight prestigious Teaching and Research Awards and was inducted into the UCF College of Sciences Millionaire Club for procuring over $1 million in research funds. He was awarded a UCF “Twenty Years’ Service” award in 2014, was awarded the UCF International Golden Key and Honorary member status in 2011, and his research was selected to be among the top 30 best published research articles in the last 50 years by the Human Factors and Ergonomics Society in 2008. P. A. Hancock, D.Sc., Ph.D., is Provost Distinguished Research Professor in the Department of Psychology and the Institute for Simulation and Training, as well as at the Department of Civil and Environmental Engineering and the Department of Industrial Engineering and Management Systems at the University of Central Florida (UCF). At UCF in 2009, he was created the 16th ever University Pegasus Professor (the Institution’s highest honor) and in 2012 was named 6th ever University Trustee Chair. He directs the MIT2 Research Laboratories. He is the author of over 1,100 refereed scientific articles, chapters, and reports as well as writing and editing more than 25 books. He has been continuously funded by extramural sources for every one of the forty years of his professional career. This includes support from NASA, NSF, NIH, NIA, FAA, FHWA, NRC, NHTSA, DARPA, NIMH, and all of the branches of the US Armed Forces. 
He has presented or been an author on over 1,200 scientific presentations. In association with his colleagues Raja Parasuraman and Anthony Masalonis, he was the winner of the Jerome Hirsch Ely Award of the Human Factors and Ergonomics Society for 2001, the same year in which he was elected a Fellow of the International Ergonomics Association. In 2006, he won the Norbert Wiener Award of the Systems, Man and Cybernetics Society of the Institute of Electrical and Electronics Engineers (IEEE), being the highest award that Society gives for scientific attainment. He is a Fellow and past President of the Human Factors and Ergonomics Society and a Fellow and twice past President of the Society of Engineering Psychologists as well as being a former Chair of the Board of the Society for Human Performance in Extreme Environments. Most recently he has been elected a Fellow of the Royal Aeronautical Society (RAeS) and in 2016 was named the 30th Honorary Member of the Institute of Industrial and Systems Engineers (IISE). He currently serves as a member of the United States Air Force, Scientific Advisory Board (SAB), and has also served on the US Army Science Board (ASB). He is also a Fellow of AAAS and IEEE. James Pharmer is the Chief Scientist for the Research, Development, Test, and Evaluation (RDT&E) Department and the Head of the Experimental and Applied Human Performance Research and Development (R&D) Division at the Naval Air
Warfare Center Training Systems Division (NAWCTSD) in Orlando, Florida. He is a Naval Air Warfare Center Aviation Division (NAWCAD) Fellow and has over 20 years of experience in training and human performance R&D for advanced military systems across a variety of warfare domains. His work includes conducting R&D and direct participation on systems acquisition teams to support human systems integration (HSI) implementation for Navy ships, aircraft, and systems. He chairs multiple working groups to develop HSI policy, processes, and education. He holds a doctoral degree in Applied Experimental Human Factors Psychology from the University of Central Florida and a master’s degree in Engineering Psychology from the Florida Institute of Technology. James C. Ferraro is a human factors research scientist specializing in simulation and game-based assessment of human performance in complex systems. He earned his Ph.D. in Human Factors and Cognitive Psychology from the University of Central Florida (UCF) in 2022 and his M.A. in Applied Experimental and Human Factors Psychology from UCF in 2019. Dr. Ferraro has led and contributed to a number of research efforts in support of government-sponsored (NAVAIR, USAF) projects to improve training and selection of personnel in various occupations. Areas include air traffic control, tactical urban warfare, explosive ordnance disposal, special forces rotary wing operations, and unmanned aircraft operations. His research on topics such as pilot/operator attentional strategies, trust in automated systems, and predictors of individual performance has been presented at local, regional, and international conferences (Human Factors and Ergonomics Society Annual Meeting, International Symposium on Aviation Psychology, Conference on Applied Human Factors and Ergonomics) and published in multiple academic journals (Ergonomics, Applied Ergonomics). 
He is the technical editor of the two-volume book set Human Performance in Automated and Autonomous Systems (2019) and the co-author of published book chapters pertaining to human monitoring of automated systems and the role of trust in unmanned vehicle operations. Dr. Ferraro is currently a Senior Research Scientist with Adaptive Immersion Technologies, based in Tampa, FL.
Contributors
Winston Bennett, Air Force Research Laboratory, Warfighter Interactions and Readiness Division, Wright-Patterson Air Force Base, OH
Christopher Best, Human and Decision Sciences Division, Defence Science and Technology Group, Melbourne, Australia
Robert Bolia, Air and Space Division, Defence Science and Technology Group, Melbourne, Australia
Aaron W. Calhoun, University of Louisville School of Medicine, Louisville, KY
Gloria L. Calhoun, Air Force Research Laboratory (retired), Wright-Patterson Air Force Base, OH
Kendall Carmody, Florida Institute of Technology, Melbourne, FL
John Deaton, Florida Institute of Technology, Melbourne, FL
Dominique Doster, Indiana University School of Medicine, Indianapolis, IN
Julianne M. Fox, Decision Speed, Inc., Larkspur, CA
John French, Embry-Riddle Aeronautical University, Daytona Beach, FL
Gregory Funke, Air Force Research Laboratory, Wright-Patterson Air Force Base, OH
Amy L. Hanson, University of Louisville School of Medicine, Louisville, KY
Simon Hosking, Human and Decision Sciences Division, Defence Science and Technology Group, Melbourne, Australia
Robert S. Kennedy*
Graham King, Florida Institute of Technology, Melbourne, FL
Kristen K. Liggett, Air Force Research Laboratory (retired), Wright-Patterson Air Force Base, OH
Dahai Liu, Embry-Riddle Aeronautical University, Daytona Beach, FL
Ronald J. Lofaro*, Federal Aviation Administration (retired)
Nickolas D. Macchiarella, Embry-Riddle Aeronautical University, Daytona Beach, FL
Gerald Matthews, George Mason University, Fairfax, VA
Mustapha Mouloua, University of Central Florida, Orlando, FL
Salim A. Mouloua, George Mason University, Fairfax, VA
Tiffany Nickens, National Aeronautics and Space Administration (NASA), Washington DC
Sarah A. Powers, Old Dominion University, Norfolk, VA
Brian Sanders, Embry-Riddle Aeronautical University, Daytona Beach, FL
Mark W. Scerbo, Old Dominion University, Norfolk, VA
Yuzhong Shen, Old Dominion University, Norfolk, VA
Kevin M. Smith, United States Navy (retired), Mesquite, NV
Janan A. Smither, University of Central Florida, Orlando, FL
Dimitrios Stefanidis, Indiana University School of Medicine, Indianapolis, IN
Christopher Thomas, Indiana University School of Medicine, Indianapolis, IN
Michael Tolston, Air Force Research Laboratory, Wright-Patterson Air Force Base, OH
Dennis A. Vincenzi, Embry-Riddle Aeronautical University, Daytona Beach, FL
Jiahao Yu, Embry-Riddle Aeronautical University, Daytona Beach, FL
* The editors would like to pay their respects to Dr Robert S. Kennedy and Dr Ronald J. Lofaro, who sadly passed away prior to publication of this book project. We are very grateful for their contributions and dedication to the field of human factors in simulation and training.
1
Controls and Displays for Aviation Research Simulation A Historical Review Gloria L. Calhoun and Kristen K. Liggett
CONTENTS
Disclaimer
Introduction
Fixed-Based Simulators
    Integrated Information Presentation and Control System Study (IIPACSS) Simulator, Boeing, Seattle, WA
        Display Technology
        Control Technology
        Representative Research
        Impact
    Digital Synthesis (DIGISYN) Simulator
        Display Technology
        Control Technology
        Representative Research
        Impact
    Microprocessor Applications for Graphics and Interactive Communication (MAGIC) Simulator
        Display Technology
        Control Technology
        Representative Research
        Impact
    Panoramic Cockpit Control and Display System (PCCADS) Simulator
        Display Technology
        Control Technology
        Representative Research
        Impact
    Helmet-Mounted Oculometer Facility (HMOF) Simulator
        Display Technology
        Control Technology
        Eye and Head Monitor
        Representative Research
        Impact
    Synthetic Interface Research for UAV Systems (SIRUS) Simulator
        Display Technology
        Control Technology
        Representative Research
        Impact
    Vigilant Spirit Control Station (VSCS) Simulator
        Display Technology
        Controls Technology
        Representative Research
            Sense and Avoid (SAA) Display Symbology Evaluation
            Cyber Threat Information Requirements Investigation for UAV Crews
        Impact
    Intelligent Multi-Unmanned Vehicle Planner with Adaptive Collaborative/Control Technologies (IMPACT) Simulator
        Display Technology
        Control Technology
        Representative Research
        Impact
Motion-Based Simulators
    Dynamic Environmental Simulator (DES)
        Display Technology
        Control Technology
        Representative Research
        Impact
    Disorientation Research Device
        Display Technology
        Control Technology
        Representative Research
        Impact
In-Flight Simulators
    NASA’s OV-10
        Representative Research
        Impact
    Total In-Flight Simulator (TIFS) NC-131H Transport Aircraft
        Representative Research
        Impact
    Variable In-Flight Stability Test Aircraft (VISTA) Lockheed NF-16D Fighter Aircraft
        Representative Research
        Impact
    University of Iowa Operator Performance Laboratory Aero L-29 Delfin Jet
        Representative Research
        Impact
Multisensory Displays and Controls
Displays and Controls to Support Human–Machine Teaming
Summary
Acknowledgment
References
Acronyms/Abbreviations
DOI: 10.1201/9781003401353-1
DISCLAIMER The views expressed are those of the authors and do not reflect the official guidance or position of the United States Government, the Department of Defense, or of the United States Air Force. Statement from DoD: The appearance of external hyperlinks does not constitute endorsement by the United States Department of Defense (DoD) of the linked websites, or the information, products, or services contained therein. The DoD does not exercise any editorial, security, or other control over the information you may find at these locations.
INTRODUCTION This chapter will trace how controls and displays used in research simulators changed from 1970 through the present to effectively evaluate new crew station technologies for Air Force combat aircraft systems. The early 1970s marked the dawn of the electro-optical (E-O) era in aviation simulators. Actually, there were investigations utilizing E-O instruments as early as the 1930s. For example, in 1937, a cathode ray tube (CRT) based E-O display called the Sperry Flightray was evaluated on a United Airlines’ Flight Research Boeing (Bassett & Lyman, 1940). Over the next several decades, E-O displays were slowly integrated into predominately electromechanical (E-M) designs, such that pilots (private, commercial and military) were flying cockpits that incorporated a mix of E-M and E-O instruments. Thus, the time boundaries between the E-M and E-O approaches are very vague, even though the design boundaries are clear (Nicklas, 1958). By the early 1970s, although the majority of operational aircraft contained cockpits based on E-M instruments, cockpit designers were seriously considering the design of cockpits based primarily on E-O displays. Their research during this time frame had a definite influence on aircraft. For instance, the US Navy’s F-18 aircraft introduced in 1983 made extensive use of multifunction CRT displays. As part of this chapter, several research simulators will be described to illustrate the evolution of control and display technology; also, some lessons learned from the experiments carried out in these simulators will be cited. Unless otherwise stated, all of these simulators were (or are currently) at Wright-Patterson Air Force Base, Ohio.
Finally, the chapter will present changes that are anticipated in control and display technology for future simulations.
FIXED-BASED SIMULATORS
Integrated Information Presentation and Control System Study (IIPACSS) Simulator, Boeing, Seattle, WA
At the start of the 1970s, the state of the art in cockpit instrumentation was exemplified by the F-4 Phantom. The Phantom’s two crew stations were composed of all E-M instruments, with the exception of a few single-function CRTs. However, advances in avionics were enabling the inclusion of an increasing number of computer-based functions within aircraft. If all of these additional computer-based functions had to be accessed through single-function E-M instruments, there would not be enough room in the crew station to accommodate all of the required controls and displays. Moreover, it was likely that locations outside of the pilot’s primary reach and vision envelope would have to be used. Thus, a new approach to the design of fighter crew stations was clearly needed.
One means of preventing the pilot from becoming overloaded was to restrict the information by “time sharing” controls and displays so that only the information relevant to the pilot’s current task was available. This restriction led to a change in design requirements. “The requirement exists to develop an integrated control and display system that will present only essential information in a format that can be translated easily by its user into direct control inputs” (Zipoy et al., 1970, p. 1). The answer was to substitute multifunction E-O displays for single-function E-M instruments. However, the question was, “How well can the operators use these new types of displays and their associated controls?” Research in the IIPACSS simulator (Figure 1.1) arose from a need to verify that the new approach (utilizing E-O displays that combined many of the functions of separate E-M instruments) would not degrade operator performance.
FIGURE 1.1 Integrated Information Presentation and Control System Study (IIPACSS) simulator, circa 1970. (Boeing photo produced under US Air Force Contract # F33615-73-C-1201.)
Controls and Displays for Aviation Research Simulation
Display Technology
This simulator contained one color and six monochrome CRTs that were multifunctional in the sense that numerous menus and pictorial formats could be presented on the same device at different phases of the mission. Although an out-the-window scene was available during experiments, it was not the computer projection that we know today. It was a film taken from a camera in an aircraft that flew a particular route. The film was run backward to give the illusion that the aircraft was indeed flying the route that was preprogrammed. However, about three years later, a terrain board coupled with a camera that flew over the board provided the out-the-window scene for the pilot.
Control Technology
Besides the required throttle and flight controls, the simulator contained a great number of switches. If you examine Figure 1.1 carefully, you will see that there are 141 push-button switches in this simulator. In the early 1970s, multifunction switch technology was not yet available.
Representative Research
Introduction: Based on an initial evaluation of the cockpit seen in Figure 1.1, it was clear that work was needed on the design of multifunction keyboards, as well as other aspects of the cockpit, such as intuitive display formats. This study (Willich & Edwards, 1975) addressed these issues. This study also incorporated the functions of the A-7D operational crew station because it had the more sophisticated avionics systems at that time. After a detailed functional analysis was performed on the A-7D, whose crew station contained primarily E-M instruments, those functions were then assigned to various multifunction displays and multifunction keyboards. The cockpit in Figure 1.1 was modified to incorporate the functions of the A-7D as well as lessons learned from the initial pilot evaluations conducted with this older, unmodified simulator. An outline of the front instrument panel of the modified cockpit appears in Figure 1.2.
The objective of this experiment was to evaluate a head-up display (HUD) format, different display formats for the horizontal situation display, and a newly designed multifunction keyboard. In this section, we will discuss only the results of the multifunction keyboard evaluation. (For other aspects of the study, see Willich and Edwards, 1975.) The detailed objectives relative to the keyboard were to “evaluate the utility of the multifunction keyboard in terms of matrix size, number of integrated functions, logic indenture levels, and operational suitability in accomplishing the mini mission scenarios” (Willich & Edwards, 1975, p. 67). A mini mission is a flight phase, for example, air-to-ground. To understand the detailed objectives, some explanation of how the multifunction keyboard was constructed is required.
Multifunction Keyboard: At the time of this study, bezel-mounted switches with legends that appeared on the display surface (as in most automatic teller machine [ATM] designs) had not yet been envisioned. However, the mechanism used by ATMs to obtain a cash advance, in which one proceeds through multiple levels of menu logic, was employed in the following manner. To organize the more than 100 switches in Figure 1.1, switches were created that had a limited number of legends, 12 legends per button in this case. A 4 × 6 matrix of these switches was created, thereby allowing a total of 288 switch legends (24 switches × 12 legends each). Two identical matrix-style keyboards were placed to the left and to the right of the horizontal situation display (bottom-large CRT) to allow operation by either hand (Figure 1.2).
The keyboard worked in the following manner: As power was applied to the multifunction keyboard, the button legends on the top row showed the names of the major systems onboard the aircraft, such as communication, navigation, sensors, etc. When one of the top row buttons was pushed, for example the communication (COMM) button, the various types of radios (ultrahigh frequency [UHF], very high frequency [VHF], identify friend or foe [IFF], etc.) would then appear as legends on the keyboard, and the previous legends would disappear. The pilot would then be at the second logic level. If the pilot then pushed the UHF button, the sub-functions of the UHF would be shown (third logic level), and, as before, the previous legends would disappear. Successive changes of legends on the buttons allowed the pilot to proceed through various keyboard logic levels.
FIGURE 1.2 Modified Integrated Information Presentation and Control System Study (IIPACSS) simulator. (Boeing photo produced under US Air Force Contract # F33615-73-C-1201.)
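The legend-swapping logic described above can be illustrated with a short Python sketch. It is a minimal model under stated assumptions, not the simulator's actual software: the class and function names are hypothetical, the UHF sub-function legends (CHANNEL, FREQ) are placeholders, and only the matrix size, the number of legends per switch, and the COMM-to-UHF example come from the text.

```python
from __future__ import annotations
from dataclasses import dataclass, field

ROWS, COLS = 4, 6                 # matrix size given in the text
LEGENDS_PER_SWITCH = 12           # legends each switch could display
TOTAL_LEGENDS = ROWS * COLS * LEGENDS_PER_SWITCH   # 24 switches x 12 = 288


@dataclass
class MenuNode:
    """A legend; its children are the legends shown after it is pushed."""
    legend: str
    children: list[MenuNode] = field(default_factory=list)


class MultifunctionKeyboard:
    """Tracks which legends are lit and the current logic level."""

    def __init__(self, root: MenuNode) -> None:
        self.path = [root]        # root represents the power-on state

    @property
    def logic_level(self) -> int:
        return len(self.path)     # 1 = major-system legends

    def legends(self) -> list[str]:
        return [child.legend for child in self.path[-1].children]

    def press(self, legend: str) -> None:
        """Replace the current legends with those of the pushed switch."""
        for child in self.path[-1].children:
            if child.legend == legend:
                if child.children:          # a leaf legend would execute a function
                    self.path.append(child)
                return
        raise KeyError(f"{legend!r} is not currently displayed")


def build_demo_tree() -> MenuNode:
    # Hypothetical fragment of the menu tree; sub-function names are placeholders.
    uhf = MenuNode("UHF", [MenuNode("CHANNEL"), MenuNode("FREQ")])
    comm = MenuNode("COMM", [uhf, MenuNode("VHF"), MenuNode("IFF")])
    return MenuNode("POWER ON", [comm, MenuNode("NAV"), MenuNode("SENSORS")])


keyboard = MultifunctionKeyboard(build_demo_tree())
keyboard.press("COMM")            # second logic level: radio types
keyboard.press("UHF")             # third logic level: UHF sub-functions
print(keyboard.logic_level, keyboard.legends())
```

Modeling the logic as a tree with a selection path makes the cost of the design visible: each additional logic level is one more button press before a function can be reached.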
However, the status of the radios (e.g., the current tuned frequency of the radio) did not appear on the keyboard; the radio status was shown on the small CRTs located at the top portion of the instrument panel.
Test Procedures: All eight pilot participants were experienced in fighter or attack aircraft. Each flew three mini missions: air-to-air, air-to-ground, and instrument landing. During these mini missions, the pilots used the multifunction keyboard in normal and degraded modes (e.g., a CRT failure) to perform tasks involving the communications, navigation, sensors, and aircraft subsystem functions.
Results: An examination of the three mini missions found that there was no significant difference between normal and degraded mode performance in the air-to-air and air-to-ground mini missions. In the landing mini mission, it took significantly longer to enter a radio channel change in the degraded mode. The pilots also filled out a questionnaire to obtain opinions on the utility of the multifunction keyboard and associated multifunction displays. Pilots felt the multifunction keyboard was very easy to operate, failures were easy to correct, and it was equally suitable for day and night use. However, they were evenly split as to the efficiency of the keyboard. Those who liked the keyboard were especially fond of its compact nature (combining several functions into fewer switches). However, those pilots who did not like the keyboard felt they could access single-function switches more quickly than going through the four levels of menu logic required with the multifunction keyboard to access some functions.
Conclusions: There were two basic conclusions from the research performed in this experiment: (1) the functions of a state-of-the-art aircraft (at the time) could be successfully incorporated into a multifunction crew station and (2) the multifunction keyboard, coupled with its corresponding CRT status displays, was a viable means of performing tasks needed to accomplish the functions. However, the optimization of the relationships between the keyboard and the corresponding CRT status displays had not yet been achieved. The pilots manipulated the switch legends on the keyboard matrix (see Figure 1.2), but the changed functions appeared on the CRTs located at the top portion of the instrument panel. Because the CRTs were a considerable distance from the multifunction keyboard, pilots needed additional scanning time to verify that the correct task had been performed.
Impact
This simulation was conducted as part of the US Air Force’s Digital Avionics Information System Program. The US Navy had a similar program called the Advanced Integrated Display System Program. The research conducted by these two programs served as the basis for the E-O crew stations we see today in modern aircraft.
Digital Synthesis (DIGISYN) Simulator Display Technology This simulator contained a HUD (but no external visual scene) and from four to six multifunction head-down CRTs, depending on the evaluation. The two CRTs in the center front panel (vertical situation display and horizontal situation display) were
color, as well as the left upper CRT in some studies. The other head-down CRTs were monochrome. For some experiments, a cluster of E-M engine instruments on the upper right front panel was employed. Besides these few E-M displays, the majority of displays were E-O, which offered a great deal of flexibility in crew station design. First, a particular format could be presented on any of the CRTs, and one evaluation focused on this advantage. Pilot performance was examined with eight arrangements of display formats depicting vertical situation, horizontal situation, and status information. The results failed to show a performance decrement across arrangements, demonstrating that this is a viable option, should one of the E-O displays fail during flight (Calhoun et al., 1980). With the flexibility afforded by computer-driven displays, the available graphics capability could also be exploited, rather than just transferring dedicated E-M display formats onto E-O surfaces. Moreover, the formats could be designed to integrate information from several dedicated E-M displays onto a single E-O display. However, to ensure that the resultant format provided information in a manner that the pilot could quickly assimilate and respond to, extensive research was required to determine which type of format and level of abstraction (e.g., alphanumeric, graphic, schematic, or pictorial) was best for the pilot’s specific task. Research was also required to examine whether color should be employed in computer-generated imagery, beyond the conventional sky and ground coding of the attitude director indicator (ADI) sphere and colors (green/amber/red) used in the aircraft advisory system. For several years, DIGISYN supported such display format evaluations. The section “Representative Research” provides a summary of a few studies examining the use of color coding. 
Control Technology Besides a joystick and throttle for flight control, the simulator utilized a combination of single-function controls (e.g., a telephone-style keyboard, forward of throttle on left console) as well as the multifunction control (lower-left front panel). Each switch of a multifunction control addressed logic that both determined the function of the switches and initiated the execution of those functions when the switches were selected. Obviously, if the function of a switch changed, it was important that its current function be displayed. To reflect what operation they controlled, multifunction switch legends changed, using one of two technologies available at that time: projection switches (Figure 1.3a) and CRT-based bezel-mounted switches (Figure 1.3b). Projection switches contained a filmstrip with a series of light bulbs behind the strip. Based on the legend desired, the computer sent a signal to the appropriate light bulb, thereby lighting up the correct legend. A limitation of this technology was that only ten legends would fit on the filmstrip that was below the switch surface. Further, if a different legend was desired other than the current ten, a new filmstrip had to be created. In CRT-based multifunction controls, the switches are adjacent to the bezel of a CRT. Thus, the switches could have as many legends as the CRT could generate and changing a legend only involved a software modification. However, this technology
FIGURE 1.3 (a) Digital synthesis (DIGISYN) simulator with projection switches, circa 1976. (US Air Force photo.) (b) Digital synthesis (DIGISYN) simulator with CRT-based bezel-mounted switches, circa 1976. (U.S. Air Force photo.)
also had limitations. Because of the switch depth, switches could not be mounted on the bezel itself, but rather had to be mounted outboard. The distance between the switch and its corresponding legend could result in parallax problems at certain viewing angles or seat adjustments, making the association of a switch to a displayed legend ambiguous and not immediately apparent.
Several investigations examining how best to implement a multifunction control were conducted with this simulator. More specifically, this research:
• Compared projection switch-type multifunction control to CRT bezel-mounted switches, and evaluated their location in the cockpit (Reising, 1977);
• Compared two logic design implementations, that is, branching logic for each individual aircraft system versus tailored logic that presents the options most likely to be needed in the current flight phase (Herron, 1978); and
• Generated design criteria for multifunction controls (e.g., how to label switches, implement switching logic, maximize the accessibility of frequently used functions, optimize switch and function assignment, and minimize hand motion; Calhoun, 1978; Calhoun & Herron, 1982).
Representative Research
Introduction: Prior to the availability of the DIGISYN simulator, the majority of research examining the utility of color coding used participants who devoted their full attention to the color display and performed single, relatively simple tasks (Christ, 1976). This research also showed that the impact of color coding is highly situation specific and depends on a number of diverse factors such as operator task, display medium, and display environment (Krebs et al., 1978). DIGISYN, with color E-O displays, was an ideal platform to examine the utility of color coding on formats that were used in a somewhat peripheral manner as the highly loaded pilot also performed multiple complex tasks.
Test Procedures: Similar procedures were used in three separate experiments to examine the utility of color coding. At least 16 A-7D pilots participated in each experiment. After training, pilots flew one or more flights with each of the conditions being examined in the respective experiment. The mission tasking was designed to represent the workload present in operational flights. Pilots were required to maintain flight parameters (using the HUD as the primary flight display) as well as complete communications, navigation, and weapons tasks using a multifunction control and keypad. Also, pilots had to respond to information retrieval questions that required them to utilize the display format under evaluation. With the number of ongoing and intermittent tasks, pilots only had time to quickly glance at the format under evaluation to retrieve requested information. Performance on all tasks was recorded. Subjective comments were also obtained with questionnaires. Color Formats: Three different display formats were evaluated in separate experiments examining the utility of color coding: threat format (Kopala, 1979), engine format (Calhoun & Herron, 1981), and weapons format (Aretz & Calhoun, 1982). Threat Format: This format appeared on the color CRT directly below the HUD. Besides navigation information, symbology was presented to denote locations of aircraft (symbol “.”), surface-to-air missiles (“S”) and anti-aircraft artillery (“A”). Each symbol was augmented with a state designator, one of three shapes to denote friendly, unknown, or hostile. These states were color coded in one condition: green, yellow, and red respectively. The two coding conditions (shape-coded symbology versus shape- and color-coded symbology) were tested under three different symbol density levels: 10, 20, and 30 symbols. Engine Format: On the upper-left CRT, each of eight engine parameters was represented by a box that contained the current parameter value. 
Vertical rectangular bars extended from the top or bottom of the boxes, as the corresponding parameter deviated from the normal operating range midpoint. All parameters were normalized to the same range for easier interpretation. Normal, cautionary, and emergency states were indicated by shade and flash codes on the monochrome format (unfilled bar/white bar/flashing white bar) and by color codes on the color format (green/yellow/red). Performance on retrieval of engine information was recorded for the two CRT engine formats (monochrome and color), as well as a cluster of conventional E-M instruments (fuel flow, turbine outlet temperature, RPM, oil pressure, oil quantity, and three hydraulic pressure indicators) on the right front instrument panel. These E-M instruments operated as in conventional cockpits, with colored tape to denote operating ranges. For all three format conditions, the simulation included implementation of the master caution indicator and corresponding messages on failed parameters.
Weapons Format: In three of four experimental conditions evaluated, the upper-left CRT presented information pertaining to all the weapons onboard the aircraft, as well as information pertaining to the weapon option selected. The format consisted of a white planform against a darker background. Shapes on the planform presented the weapons onboard, with a different shape used for each of the six weapon types. The station from which the selected weapons would be delivered was indicated by the location of the symbols on the planform.
Line/flash (monochrome) or shade (color) coding were used to code the status of each selected weapon. This included weapons selection status, master arm switch activation, drop mode, interval, weapon fuzing, release status, and presence of a hung bomb. Besides the monochrome-coded and the color-coded pictorial formats, an alphanumeric format was also evaluated that presented information on the CRT used in the multifunction control. In a fourth condition, both the alphanumeric and color pictorial formats were presented. Results: In the experiment that utilized a threat format, the results showed a 40% increase in time to identify friendly, unknown, and hostile symbols when monochrome shape coding was used, compared to redundant color-coding. The effectiveness of redundant color coding became more pronounced as the symbol density of the threat format increased. Performance with the monochrome-coded pictorial weapons format was also found to be significantly worse than the color pictorial format. Moreover, the monochrome format was also worse than the alphanumeric format and the combined alphanumeric and color pictorial format. The subjective data showed pilot preference for the combined format. One pilot commented that the pictorial format helped one acquire situation awareness with a quick glance, with the alphanumeric information as a backup if there was any confusion. Different results were obtained with the engine format. There were no significant performance differences between the monochrome and color CRT formats. With regard to having the engine information integrated onto a single format versus the conventional array of E-M instruments, both CRT formats were superior as measured by pilots’ speed and accuracy in identifying failed engine parameters. Conclusions: The results from these experiments concur with the literature review provided by Reising and Calhoun (1982). 
Color coding resulted in performance improvements when the format was unformatted, highly dense, involved a search for relevant information, and had a logical relationship between color and the tasks. Both the threat and weapons formats can be viewed as dense and unformatted (e.g., 30 threat symbols at some levels and weapons information changed depending on weapon option). Both formats also involved an active search for information, either to find a particular threat or determine a parameter of a weapon store. The color coding also had a relationship to the task (e.g., red for hostile threat and hung bomb). The engine format, in contrast, was a simple display with the information clearly shown in histograms. The location of a specific parameter was constant: the corresponding box at the center of the display. Additionally, the master caution alerting system served as an additional cue of abnormal states. Thus, monochrome codes were sufficient for the E-O format presenting engine information; color coding did not show a payoff. Impact The DIGISYN can be viewed as one of the earliest test-beds primarily based on multifunction technology; the cockpit featured multifunction control and the majority of displays were E-O. Thus, this simulator was ideal for research focused on exploiting the advantages of computer-based controls and displays. As a result of the numerous experiments that were conducted over many years, the utility of multifunction
controls and integrated formats on multifunction displays was demonstrated. Many design guidelines were identified in the process as well. Without question, the research conducted with the DIGISYN was a strong contributor to the glass-cockpit crew station designs operational today.
Microprocessor Applications for Graphics and Interactive Communication (MAGIC) Simulator The MAGIC simulator was employed to conduct part-task pilot-in-the-loop research studies investigating pilot–vehicle interfaces for cockpit applications (Figure 1.4). The simulator was a single-seat fighter shell. Six computers were used to support the simulation: three personal computers (PCs) and three graphics workstations. The fact that MAGIC relied on PCs for its operation demonstrates the low-cost aspect of this type of simulator. The cockpit was outfitted with various off-the-shelf products over the years for the purpose of comparing different controls and displays. Studies included the use of various HUD symbology sets to recover from unusual attitudes (Reising et al., 1988), and pathway-in-the-sky HUD symbology for complex, curved approaches and landings (Reising et al., 1995). Additional studies (see the section “Representative Research”) compared the use of three-dimensional (3-D) joysticks, touch screens, and speech recognition to designate targets. Display Technology MAGIC contained five color CRTs that provided dynamic graphics capability. Typical displays for the head-down CRTs were system status formats, computerized checklists, radar sensor displays, and digital images from laser disks. There was no
FIGURE 1.4 Microprocessor Applications for Graphics and Interactive Communication (MAGIC) simulator, circa 1985. (US Air Force photo.)
out-the-window visual scene, so subjects used the topmost monitor to view HUD symbology. The center CRT, typically containing a moving map display, could be exchanged with a 3-D display that pilots could use to view images with liquid crystal display (LCD) shutter glasses.
Control Technology
An F-16 side-mounted limited-displacement control stick was used to fly the F-16 aeromodel for the flight tasks. The stick also contained a weapon-release button, a trigger, and a pitch-trim switch. An A-7 aircraft throttle was also employed and included speed brakes and communications switches. The cockpit also contained three banks of four programmable display push buttons each. These were pixel-addressable light-emitting diode displays capable of displaying alphanumeric or pictorial information. There was also a bank of four multicolor switches below the topmost monitor. Three of the CRTs housed touch screen overlays that were used as control interfaces to change the graphics on the screen. MAGIC also contained various speech recognition systems, again used as a control interface to change displays for the pilot. Other control devices included a magnetic tracker and an ultrasonic tracker attached to the pilot’s glove to manipulate cursor control in 3-D space.
Voice Systems: Over the years, MAGIC hosted different speech recognition systems, each with its own strengths and weaknesses. Observing several systems side by side led to a unique study aimed at increasing the recognition accuracy of the state-of-the-art speech systems of that time (Barry et al., 1992). The idea was to combine the strengths of each of three individual systems (two working in isolated mode and one in connected-speech mode) to increase recognition accuracy in the following manner. When a person spoke a word, all three systems reported a best-guess word, a second-choice word, and a distance score for each of the two words reported.
A “majority rules” algorithm was implemented that determined the word recognized as a best guess by the majority of the three systems. If there was no majority, the second-choice words were added to the set of words, and a majority was looked for again. Finally, if there was still no majority, the response with the lowest distance score was reported. Using this algorithm, word recognition accuracy increased from 92.99% (the average of the three systems’ individual accuracies) to 99.43% (the accuracy using the “majority rules” algorithm). Representative Research Introduction: One series of experiments conducted in MAGIC evaluated methods for designating targets residing in a stereoscopic 3-D volume. This research was published in several articles (Barthelemy et al., 1991; Liggett et al., 1993; Reising et al., 1992; and Solz et al., 1994). Three different cursor-control devices were used to designate targets. These included a three-axis joystick, an ultrasonic tracking device, and a voice control system. The ultrasonic tracking device was attached to a glove, and participants moved the cursor using this device by pointing to the target
of interest. Because this type of task requires both gross and precise positioning, two aiding techniques were implemented. One was simple in that a color change of the target was instituted when the cursor penetrated the target area, informing the participants that the cursor was indeed in the same physical space as the target. The other aiding algorithm, referred to as enhanced aiding, used a mathematical algorithm to compute the distance from the cursor to the target closest to it as it traversed the 3-D space. Once this distance was computed, the closest target was highlighted, thus eliminating the need for precise positioning (Osga, 1991). The algorithm continuously computed the distance between the cursor and all targets in the depth volume. The 3-D volume within which targets and the cursor interacted extended from 7 in. in front of the physical display surface to 15 in. behind the physical display surface. Participant performance differences based on two target densities were also investigated. Results: Results showed that the hand tracker provided the best performance with respect to total target designation time. Both target designation time and accuracy were improved with the enhanced aiding approach. A speed–accuracy trade-off was observed when the density variable was analyzed; the low-density condition provided faster total target designation times, but the high-density condition had fewer errors. Impact This cockpit simulation provided a consistent, uniform experimental environment for conducting a number of part-task evaluations. Consistency is especially important in stereographic evaluations as the distance between a participant’s eyes and the display affects image disparity and, therefore, perceived stereographic effect. Also, the versatility of the cockpit supported the easy integration of the various control devices. 
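The two aiding techniques described under Representative Research above can be sketched in a few lines of Python. The source says only that a "mathematical algorithm" computed the cursor-to-target distance, so Euclidean distance is an assumption here, as are the spherical target region, the coordinate convention (inches, with positive z in front of the screen), and all names and target positions.

```python
import math

Point = tuple[float, float, float]   # (x, y, z) in inches; z > 0 is in front of the screen

# Depth limits from the text: 7 in. in front of, and 15 in. behind, the display surface.
Z_FRONT, Z_BEHIND = 7.0, -15.0


def simple_aiding(cursor: Point, target: Point, radius: float) -> bool:
    """Simple aiding: True (change the target color) once the cursor is inside the target."""
    return math.dist(cursor, target) <= radius


def enhanced_aiding(cursor: Point, targets: dict[str, Point]) -> str:
    """Enhanced aiding: name of the target closest to the cursor, to be highlighted."""
    return min(targets, key=lambda name: math.dist(cursor, targets[name]))


# Hypothetical target layout inside the depth volume.
targets = {"T1": (2.0, 1.0, 5.0), "T2": (-1.0, 0.5, -3.0), "T3": (4.0, -2.0, -12.0)}
cursor = (0.0, 0.0, -1.0)
print(enhanced_aiding(cursor, targets))
```

Re-evaluating `enhanced_aiding` on every cursor update mirrors the continuous distance computation described in the text, and it shows why precise positioning becomes unnecessary: the cursor only has to be nearer to the intended target than to any other.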
The evaluations supported by this simulation also enabled the investigation of alternatives to traditional control and display devices for cockpit tasks that were becoming more challenging as information being presented on the traditional two-dimensional (2-D) displays increased. As such, pilots’ visual processing capabilities were being overloaded, and 3-D displays offered a potential solution to this problem. However, introducing this type of display also introduces control challenges. This simulator facilitated the evaluation of numerous control techniques that may compensate for new control issues associated with the incorporation of 3-D displays in future cockpits.
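The "majority rules" voting scheme described under Voice Systems above can be sketched as follows. This is an illustrative reading, not the implementation of Barry et al. (1992): the names are hypothetical, a "majority" is assumed to mean a word reported by at least two of the three systems, a tie in the second pass is assumed to fall through to the distance-score step, and the final step is assumed to consider both choices from every system.

```python
from __future__ import annotations
from collections import Counter
from typing import NamedTuple


class Report(NamedTuple):
    """What one recognizer returns for a spoken word."""
    best: str
    best_dist: float
    second: str
    second_dist: float


def unique_majority(counts: Counter, threshold: int) -> str | None:
    """Return the single word that reaches the threshold, or None."""
    winners = [word for word, n in counts.items() if n >= threshold]
    return winners[0] if len(winners) == 1 else None


def majority_rules(reports: list[Report]) -> str:
    threshold = len(reports) // 2 + 1            # 2 of 3 systems

    # Step 1: majority among the best guesses.
    word = unique_majority(Counter(r.best for r in reports), threshold)
    if word is not None:
        return word

    # Step 2: add the second choices; count each system once per word.
    counts: Counter = Counter()
    for r in reports:
        counts.update({r.best, r.second})
    word = unique_majority(counts, threshold)
    if word is not None:
        return word

    # Step 3: fall back to the response with the lowest distance score.
    candidates = [(r.best_dist, r.best) for r in reports]
    candidates += [(r.second_dist, r.second) for r in reports]
    return min(candidates)[1]


# Hypothetical reports from the three systems for one spoken word.
reports = [
    Report("alpha", 0.30, "bravo", 0.50),
    Report("bravo", 0.20, "delta", 0.60),
    Report("charlie", 0.40, "echo", 0.70),
]
print(majority_rules(reports))
```

The ordering of the three steps reflects the idea in the text: agreement between independent recognizers is trusted first, and a single system's confidence is used only when no agreement exists.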
Panoramic Cockpit Control and Display System (PCCADS) Simulator
During the second half of the 1980s, the continued maturation of larger flat-panel displays (e.g., LCDs) started researchers thinking about the design of a cockpit in which the whole front panel would be a single display. As with any new technology, many questions arose, such as: “What is the best way to optimize the design of the crew station when operators can place display formats wherever they wish?” In addition, new display formats could extend across either the entire display or a part of it, such as a half, a third, etc. Also, improvements in helmet-mounted displays (HMDs) warranted investigation of how they would interact with the new displays available with the single instrument panel. The PCCADS research simulator was developed to evaluate the potential improvements in mission effectiveness by providing a large color display area and the effectiveness of including an HMD and a helmet-mounted sight (HMS) in the cockpit (Figure 1.5).
FIGURE 1.5 Panoramic Cockpit Control and Display System (PCCADS) simulator, circa 1988. (US Air Force photo.)
Display Technology
The HMD provided airspeed, altitude, attitude, heading, and weapon status cues. It also portrayed a line of sight (LOS) for radar, and one for a weapon seeker. This was accomplished with the use of a magnetic head-tracker. The tracker’s head-position data and status information were used to point the simulated radar antenna or the simulated weapon seeker. The head-down portion of the simulator was unique in that it was one large display (18 × 24 in.) containing an integrated picture of mission-essential information. This also allowed for rapid reconfiguration among numerous head-down layouts. For example, via software, the head-down display could be configured to look like an F-15E or an F-16 cockpit instrument panel. PCCADS employed a projection system to show a realistic out-the-window scene (37 degrees horizontal by 27 degrees vertical field-of-view [FOV]), driven by a state-of-the-art graphics generator.
Control Technology
The simulator employed a touch-sensitive overlay to manipulate displayed switches, as well as to present formats showing aircraft attitude and flight status in
various locations on the head-down display. Speech recognition and control were other options for interaction. An F-15E stick and throttle, along with their additional switches and buttons, provided “HOTAS” functionality, that is, function selection while keeping the pilots’ “hands on throttle and stick.”
Representative Research
Introduction: The research discussed in this section dealt with one of the basic aspects of flying – maintaining flight safety when there is no dedicated head-down primary attitude indicator (AI). At the time this research was conducted, there was a definite desire to provide the pilot with as much mission-related information as possible. There was a second idea that the HUD could be used as the primary flight reference display and be substituted for a head-down primary AI. However, there was a concern that loss of attitude awareness (a potential flight safety problem) could result.
The Evolution of the Background Attitude Indicator: With limited panel space, one design solution was to decrease the size of the ADI and move it out of the primary viewing area, with pilots employing the HUD as the primary flight display. Researchers at Lockheed, Ft. Worth (Spengler, 1988) designed an alternate approach. They created a background attitude indicator (BAI) format designed with the goal of replacing the conventional dedicated head-down ADI while maintaining flight safety. The BAI uses a ¾ in. electronic border around the outer edge of a head-down display format. The evolution of this concept is illustrated in Figure 1.6 and its implementation is shown in Figure 1.7.
FIGURE 1.6 Evolution from attitude director indicator (ADI) to background attitude indicator (BAI). (Figure labels: fuselage dot, horizon line, aircraft wings; typical attitude indicator and digital readouts; extend horizon line and wings; digital readouts; overlay tactical format.)
FIGURE 1.7 Spengler background attitude indicator. (Figure labels: horizon line, sky, ground, tactical formats.) (Adapted from Spengler, R.P. 1988. Advanced Fighter Cockpit (Tech. Rep. ERR-FW-2936), Fort Worth, TX: General Dynamics.)
In Figure 1.7, three display formats are shown on a front panel, the central rectangular portion of each presenting mission-related information. The background border extended across all three displays and presented a single attitude format. The attitude information, in essence, framed the mission-essential display format and acted as one large AI. The BAI consisted of a white horizon line with blue above it to represent positive pitch, and brown below it to represent negative pitch. This display worked very well for detecting deviations in roll, but was less successful in showing deviations in pitch because, once the horizon line left the pilot’s field-of-view (FOV), the only attitude information present in the BAI was solid blue (sky) or brown (ground). Because the concept was effective in showing roll deviations but lacking in the pitch axis, enhancing the pitch axis became the focus of the research using this simulator.
PCCADS BAI Research – Part 1: The initial work began by enhancing the pitch cues for a BAI that framed one display format only (as opposed to framing three display formats as in the original Lockheed work; Liggett et al., 1992). Eight variations of the BAI were evaluated, and they each contained the following common elements (Figure 1.8):
1. Digital readouts of airspeed, altitude, and heading
2. Wing reference lines to provide an attitude reference (extensions of the normal miniature aircraft wings)
3. Ghost horizon (a dashed white line that appeared when the true horizon left the pilot’s FOV, and that indicated the direction of the true horizon)
FIGURE 1.8 Digital readouts, wings, and ghost horizon (plane is in a 45 degree roll, negative pitch). (Figure labels: ghost horizon, aircraft wings, digital readouts.)
This configuration was tested alone, as well as with the additions of color shading (the lightest shade of blue or brown appeared at the horizon and became gradually darker as positive or negative pitch increased to 90 degrees), color patterns (a vertical wedge with the thinnest portion at the horizon and the thickest portion at the zenith or nadir), and pitch lines with numbers. These design features were compared individually, in combinations of two, and with all three present. To determine if effective pitch information was being portrayed, the PCCADS study simulated the task of recovering from unusual attitudes. This task is often used to determine if adequate pitch information is present, as it is a key factor in a successful recovery. Results of BAI – Part 1: Results showed that the combination of color shading and color patterns was the format that had the quickest initial stick input time. When using this format, the pilots moved the control stick to begin their recoveries more quickly than when using any other format. This measure of initial stick input time related to the interpretability of the format because the pilots looked at the format, determined their attitude via the cues on the BAI, and began their recovery as quickly as possible. PCCADS BAI Research – Part 2: Follow-on research (Reising et al., 1995) was conducted to evaluate the use of color shading and patterns to portray pitch information when the BAI extended across three horizontally adjacent head-down
formats (the display configuration employed by Lockheed). The procedures and pilot tasking were similar to the first study. There were, however, two different mechanizations of the BAI: Triplets and Global. The Triplets format consisted of each of the three displays presenting individual, identical attitude information. Each display acted as a single, independent AI. Because the pilot could be focusing on the information from any of the three display formats at a given time, it was thought that being able to interpret the aircraft’s attitude using just the information from that specific BAI might be beneficial. The Global format consisted of all three horizontally adjacent BAIs acting as one large AI as in the original Lockheed study. It was anticipated that using the global BAI would be similar to seeing the outside world in its entirety and thus provide a benefit to the pilot. The Triplets and the Global formats had the same common elements of digital readouts, wing reference lines, a ghost horizon, and sky pointers. The pitch cues used were of two styles: (1) color shading and color patterns (the best format from the previous research) and (2) color shading, color patterns, and pitch lines with numbers. Although the second format was not considered the most beneficial from the previous research, the pilots expressed a unanimous preference for the BAI format that included pitch lines and numbers.
Results of BAI – Part 2: Objective results were inconclusive; however, subjective results revealed that the pilots highly favored the Global format that provided color shading, color patterns, and pitch lines with number references. Thirteen of 16 subjects ranked this type of format highest. The global aspect tended to give the pilots excellent peripheral bank cues, and the combination of shaded patterns and pitch lines with numbers gave both qualitative and quantitative pitch reference, as well as pitch rate information.
The Triplets format was rated low because the individual formats tended to distract the pilot with each BAI moving separately and displaying identical yet independent attitude information. The pilots were inclined to use only the center display for attitude information and completely ignore the two outboard displays.
Conclusions: Based on the results of these simulation studies, BAIs appear to be a viable means of enabling the pilot to recover from unusual attitudes. Single BAIs work best with visual cues (such as color shading and color patterns) that create a flow pattern to facilitate pilot detection of motion while not requiring the pilot to focus on a specific readout. When using multiple BAIs, pilots preferred having a global BAI that uses a combination of shaded patterns and pitch lines. The pitch lines with numbers allowed the pilot to make an exact, quantitative assessment of attitude, and the color shades and wedge width gave the pilot “quick glance” qualitative orientation information.
Impact
This research demonstrated the feasibility of a new and innovative display format that would not be possible without the inclusion of a large CRT in the cockpit. Because of the CRT’s large surface area, the display formats can be configured in non-traditional ways. The trend of duplicating the E-M instrumentation with an E-O
display format may finally disappear as a paradigm shift takes place, and the full potential of large E-O displays becomes apparent.
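The pitch cues described for the BAI (color shading that is lightest at the horizon and darkens toward 90 degrees, and a ghost horizon that appears once the true horizon leaves the display) can be illustrated with a minimal Python sketch. This is not the PCCADS implementation: the function names, the linear brightness ramp, the 0-to-1 brightness scale, and the decision to ignore roll are all assumptions made for illustration.

```python
from typing import Optional


def shade_brightness(elevation_deg: float) -> float:
    """Brightness of the sky/ground shading at a given elevation.

    1.0 is the lightest shade (at the horizon) and 0.0 the darkest
    (at the zenith or nadir). The linear ramp is an assumption; the
    text only says the shade becomes gradually darker toward 90 degrees.
    """
    if not -90.0 <= elevation_deg <= 90.0:
        raise ValueError("elevation must lie between -90 and 90 degrees")
    return 1.0 - abs(elevation_deg) / 90.0


def shade_hue(elevation_deg: float) -> str:
    """Blue above the horizon (sky), brown below it (ground)."""
    return "blue" if elevation_deg >= 0 else "brown"


def ghost_horizon_direction(pitch_deg: float,
                            display_vfov_deg: float) -> Optional[str]:
    """Direction of the true horizon once it has left the display.

    Returns None while the horizon is still visible. Roll is ignored
    (simplifying assumption), so the horizon leaves the display when
    the pitch magnitude exceeds half of the display's vertical extent.
    """
    if abs(pitch_deg) <= display_vfov_deg / 2.0:
        return None
    # Nose down: the display shows ground and the horizon lies above it.
    return "up" if pitch_deg < 0 else "down"
```

For example, with a hypothetical 30-degree vertical display extent, a 40-degree nose-down attitude would fill the border with brown and place the ghost horizon cue toward the top of the display.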
Helmet-Mounted Oculometer Facility (HMOF) Simulator
The HMOF simulator was established to capitalize on the unique capability for unobtrusive and accurate monitoring of eye and helmet positions using Honeywell’s oculometer. This oculometer system was incorporated into a single-seat simulator (A-7 geometry) that contained various controls and displays to support research that was more basic in nature. Test participants were not pilots and the tasks they performed were designed to represent cockpit workload demands, rather than simulate actual piloting tasks.
One line of research in this simulator focused on determining whether eye and head measures were valuable objective indicators of the effectiveness of attention cues and control and display design. Parameters of eye and head movements (e.g., sequence and latencies) were examined in comparison to the conventional performance index, manual reaction time, as a function of several factors: attention cue modality, tasks, attention allocation between tasks, and information location (Calhoun et al., 1985). One of the cue modalities evaluated was the application of 3-D auditory signals (Calhoun & Janson, 1990). Results suggested that these relatively unobtrusive measures may be valuable indices for evaluating candidate crew station designs by detecting a pilot’s awareness of cues and changes in information presented.
Another line of research evaluated the application of the operator’s LOS as an alternative control. With such control, the computer initiated a predefined action once it received an input based on the operator’s point-of-gaze. Use of eye control eliminated the need for a selective manual response by substituting the natural movement of the eye that was inherent to the visual task. Thus, in cockpit applications, pilots would be afforded a useful hands-free, head-up control mechanism.
Research in this facility examined the spatial and temporal parameters for implementing the eye-control algorithm and quantified the efficiency of eye control compared to other control mechanisms (Calhoun et al., 1986).
Display Technology
In its basic configuration, the simulator contained two monitors. The upper centrally located monochrome monitor (approximately 10 × 12 cm) presented symbology for a pursuit tracking task that could be varied in difficulty level. A color monitor (approximately 20 × 30 cm) was located below the front switch panel. This simulator had no external visual scene. During testing, the cockpit was darkened by a light-tight curtain that surrounded the simulator.
Control Technology
The right-console joystick was used for the participants’ inputs to the tracking task. The stick was fitted with four switches, two of which were thumb-actuated
pushbuttons. A pressure-sensitive 12.5 × 12.5 mm switch plate was mounted on the left console. The front switch panel contained seven dedicated switches. These momentary switches measured 14 × 20 mm. The middle switch subtended a visual angle of 1.2 × 1.7 degrees. The switches were labeled with black numerals. For some experiments, control based on eye LOS was activated.
Eye and Head Monitor
The participant’s eye was illuminated by a halogen lamp filtered to pass near-infrared light. This light was collimated and reflected from a small coating on a parabolic helmet visor into the right eye. Some light was reflected from the cornea and a portion of the light that entered the pupil was reflected by the retina, passed out of the eye through the pupil, and was scanned by a miniature charge-coupled device (CCD) video camera. As the eye rotated about its center of rotation to look around the visual field, the corneal reflection moved differentially with respect to the pupil. Thus, eye direction could be determined from the relative positions of the center of the pupil and the center of the corneal reflection. At extreme angles of fixation, eye direction was determined from the shape of the pupil. A magnetic HMS provided accurate helmet position and attitude determination in six degrees of freedom with respect to a fixed coordinate system. The HMS utilized a transmitter mounted behind and above the helmet to create a magnetic field around the cockpit and a helmet-mounted receiver that responded to movement through the field with varying output voltages. A computer calculated helmet position and rotation based on these voltages. These data were combined with eye-angle data to determine eye LOS with respect to a fixed coordinate system.
Representative Research
Introduction: Because the visual system is the primary channel for pilots to acquire information, and eye muscles are extremely fast, it is advantageous to have the direction of eye gaze also serve as a control input.
In other words, if the pilot is looking at a target or button, it is more efficient to use the pilot’s gaze to aim a weapon or select a switch, as shown in Figure 1.9. One approach to implementing gaze-based control is to combine LOS data with LOS dwell-time criteria. The operator selects an item on a display simply by looking at it for the criterion time. Using dwell time to initiate the control action is particularly useful if the operator’s gaze is only being utilized to call up additional data. In this manner, the operator’s sequential review of a series of icons can be made more rapidly, with detailed information popping up as the gaze briefly pauses on each icon. Typically, required dwell times ranged from 30 ms to 250 ms. Longer dwell times tend to erode the speed advantage of gaze-based control, whereas shorter dwell times increase the likelihood of a Midas touch, with commands activating wherever the operator gazes. One solution was to require a consent response such that gaze-based control was similar to the operation of a computer mouse and button press. The gaze (or mouse) indicated the response option on a display, and the consent (or button press)
FIGURE 1.9 Illustration of cockpit application of eye control. (US Air Force graphic.)
triggered the control action. This mechanism was evaluated in an experiment in which the participants selected discrete switches on the simulator’s front panel while manually tracking a target (Calhoun et al., 1986). In two of the three control methods, participants directed their gaze at the switch indicated by an auditory cue and then made a consent input (either a manual response via a joystick button or a verbal response). In a third condition, participants selected the switches with their left hand. Procedures: Six participants were randomly assigned to a sequence of the three switching methods. The order of the switching methods was such that, across participants, each method was preceded equally often by each of the other methods. In the conventional manual method, participants selected the cued switch with their left hand. The switch was illuminated during switch closure. Between switch selections, participants were required to keep their left hand on the left console switch plate, and the position of this switch was recorded continuously throughout the run. In the two eye-control methods, participants directed their gaze at the cued switch. The participants’ resulting eye LOS was computed at a rate of 60 Hz. When the system detected an eye LOS within 2.54 cm of the center of a switch for two of three consecutive samples (at least 33.4 ms), that switch was illuminated as feedback to the participant. The switch remained illuminated until (1) another switch was selected, (2) a five-second time-out interval had expired, or (3) a consent response was made. Thus, the operator could make the consent response while not looking at the switch (e.g., return attention to the tracking task). In one eye-control method, participants manually closed a push-button on the joystick for the consent response. In the second eye-control method, the consent consisted of uttering the word “Go” into the microphone. 
The participant then heard either the word “Go” or a beep through the intercom, to provide feedback as to whether the speech system successfully recognized the utterance or not.
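The selection logic described in the Procedures above (a 60 Hz LOS stream, a 2.54 cm capture radius, a two-of-three-samples criterion, a five-second time-out, and a separate consent input) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the HMOF software: the class name, the panel coordinate system, the nearest-switch rule, and the choice to clear the sample history after consent are all hypothetical.

```python
from collections import deque
from math import hypot

SAMPLE_RATE_HZ = 60   # LOS update rate reported in the text
RADIUS_CM = 2.54      # capture radius around a switch center
WINDOW = 3            # consecutive samples examined
HITS_NEEDED = 2       # samples that must fall on the same switch
TIMEOUT_S = 5.0       # highlight time-out reported in the text


class GazeSelector:
    """Highlights a switch from gaze samples; activation needs consent."""

    def __init__(self, switch_centers):
        # Hypothetical layout: switch id -> (x_cm, y_cm) on the front panel.
        self.switch_centers = dict(switch_centers)
        self.recent = deque(maxlen=WINDOW)
        self.highlighted = None
        self.age_samples = 0

    def _switch_under_gaze(self, x, y):
        """Nearest switch within the capture radius, else None."""
        best = min(self.switch_centers.items(),
                   key=lambda kv: hypot(x - kv[1][0], y - kv[1][1]),
                   default=None)
        if best is None:
            return None
        sid, (cx, cy) = best
        return sid if hypot(x - cx, y - cy) <= RADIUS_CM else None

    def update(self, x, y):
        """Feed one LOS sample; return the highlighted switch or None."""
        self.recent.append(self._switch_under_gaze(x, y))
        self.age_samples += 1
        timed_out = self.age_samples >= TIMEOUT_S * SAMPLE_RATE_HZ
        if self.highlighted is not None and timed_out:
            self.highlighted = None
        samples = list(self.recent)
        for sid in set(samples) - {None}:
            if samples.count(sid) >= HITS_NEEDED and sid != self.highlighted:
                self.highlighted = sid      # feedback: illuminate the switch
                self.age_samples = 0
        return self.highlighted

    def consent(self):
        """Consent input (button or voice); returns the switch to activate."""
        selected, self.highlighted = self.highlighted, None
        self.recent.clear()
        return selected
```

Because the highlight persists until a time-out, a new selection, or consent, the operator in this sketch can look away (for example, back to the tracking task) before giving consent, which mirrors the behavior described above.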
In each five-minute run, an auditory cue (“one”, “two”, … , “six”) corresponding to the switches numbered 1 through 6 was presented 42 times while the participant was completing a tracking task (manual inputs on a joystick to overlay a dot on a continuously moving cursor). Eight five-minute runs constituted a session. Sessions were conducted with each method until tracking error and switching time performance met training criteria. Switching time and accuracy data from the final four runs for each of six participants per switching method were analyzed (over 3,000 switching trials).
Results: Switch activations that were not completed (i.e., switch not selected or consent not made) or completed incorrectly were dropped. The remaining data for accurate switch selections (96% of the trials) showed that it took less than a tenth of a second longer for the participants to select these switches with their eye LOS and push a button on the joystick (manual consent) than when using their left hand. This lack of a significant difference in average selection time indicates that eye control is a practical method for activating switches mounted in the central FOV, and that eye control switching is a feasible alternative to manual switching, especially when it is desirable to keep the hands on the left- and right-console controls. The results also showed that average switching time was significantly longer with the eye and voice consent method (2.83 seconds) than with the eye and manual consent method (1.78 seconds) or the manual method (1.72 seconds). It is important to note that, for the eye and voice consent method, the time required for the voice system to recognize an utterance and transmit the results to the computer (0.92 seconds) made up a component of the total switching time.
Subtraction of the equipment-induced response lag from each eye and voice consent switching time and examination of these data indicated that the differences in mean times for the three switching methods were not significant.
Conclusions: The very small difference in selection time between the eye and manual consent method and the manual method indicated that eye-controlled switching is a feasible alternative to manual switching. The longer switching time with eye control and voice consent illustrated the importance of examining the total switching time, from the beginning of the switching task to the closing of the consent switch, when comparing control mechanisms. The delay introduced by the equipment components and by the duration of the utterance resulted in a corresponding inflation in overall switching time.
Impact
The HMOF was unique, as all the research conducted on this simulator was geared toward exploiting the capabilities afforded by a single technology – LOS tracking. Additionally, the research conducted using this simulator illustrates how a representative scenario and task environment can be utilized iteratively to identify optimal settings for the numerous parameters involved in a new concept. Ideally, the implementation of any new control and display approach should be fine-tuned with such a research test bed, before evaluation in a higher-fidelity simulation. Research with this simulator also marked a significant change in control technology, switching
from comparing alternative candidate approaches (e.g., manual versus speech) to an approach that integrates two or more technologies such that they are used together to perform a task. In this instance, the controls were mapped to different subcomponents of a task. The operator used eye gaze to designate a desired function and either a generic button or voice command for a consent response, commanding the system to execute the designated function. The use of both technologies capitalizes on the ability of eye gaze to rapidly designate a position on 2-D surfaces and a button press or voice command to quickly initiate an action.
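Several of the numbers reported for the HMOF study can be cross-checked with simple arithmetic. The short Python sketch below does so; it is an illustration only. In particular, treating the 0.92-second voice-system delay as a constant that can be subtracted from the mean, and inferring a viewing distance from the reported switch size and visual angle, are assumptions that the text does not state.

```python
from math import atan, degrees, radians, tan

# Values reported in the text.
CUES_PER_RUN = 42
RUNS_ANALYZED = 4
PARTICIPANTS = 6
METHODS = 3
MEAN_TIME_S = {"manual": 1.72, "eye_manual": 1.78, "eye_voice": 2.83}
VOICE_LAG_S = 0.92

# Cues presented in the analyzed runs ("over 3,000 switching trials").
trials_presented = CUES_PER_RUN * RUNS_ANALYZED * PARTICIPANTS * METHODS

# Mean eye-and-voice time with the equipment lag removed
# (assumes the lag is the same on every trial).
eye_voice_corrected_s = MEAN_TIME_S["eye_voice"] - VOICE_LAG_S


def visual_angle_deg(size_mm: float, distance_mm: float) -> float:
    """Visual angle subtended by an object of a given size and distance."""
    return degrees(2 * atan(size_mm / (2 * distance_mm)))


def implied_distance_mm(size_mm: float, angle_deg: float) -> float:
    """Viewing distance at which an object subtends the given angle."""
    return size_mm / (2 * tan(radians(angle_deg) / 2))


if __name__ == "__main__":
    print(trials_presented)
    print(round(eye_voice_corrected_s, 2))
    print(round(implied_distance_mm(14, 1.2)), round(implied_distance_mm(20, 1.7)))
```

Under these assumptions, the corrected eye-and-voice mean falls within about 0.2 seconds of the other two methods, and the 14 × 20 mm switch subtending 1.2 × 1.7 degrees implies a viewing distance of roughly 67 cm.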
Synthetic Interface Research for UAV Systems (SIRUS) Simulator
Unmanned aerial vehicles (UAVs) have become key to aerospace intelligence, surveillance, and reconnaissance operations. More recently, their role has been expanded into search and rescue, chemical and biological detection, communication relays, and various combat operations. Many UAVs are remotely operated as multiple-task teleoperated control systems via stick-and-throttle manipulations. The physical separation of the crew from the aircraft makes this control challenging, as ground-based operators do not receive the rich stream of multisensory information that onboard pilots receive regarding the surrounding environment, and the information that is received is often delayed or degraded due to communication limitations. The SIRUS ground control station simulator (Figure 1.10) was established in the late 1990s to support research that evaluated the potential value of multisensory interfaces for improving control station operations where the UAV is under direct telerobotic control.
FIGURE 1.10 Synthetic Interface Research for UAV Systems (SIRUS) simulator, circa 1998. (US Air Force photo.)
This simulator consisted of two operator stations. The Air Vehicle Operator (AVO) sat at the left workstation to control UAV flight, manage subsystems, and handle external communications. From the right workstation, the sensor operator (SO) was responsible for locating and identifying targets by controlling cameras mounted on the UAV. Using this simulator, the validity of novel concepts was tested by having participants employ the technology while completing representative control operations. For example, research addressed the utility of head-coupled head-mounted display applications (Draper et al., 2002), joystick haptic vibration alerts (Draper et al., 2000), wrist tactile alerts (Calhoun et al., 2004), and voice-based control (Draper et al., 2003). A series of experiments specifically addressed visual display enhancements ranging from adding a simple symbology element (Draper et al., 2000) to overlaying more detailed symbology from synthetic vision systems that highlight, in real time, key information elements of interest directly on the camera video image. The latter technology proved especially useful for improving sensor operator performance on several types of tasks and virtually expanding the field-of-view to increase operator situation awareness and improve task performance (e.g., Calhoun & Draper, 2010; Calhoun et al., 2006; Draper et al., 2006).
Display Technology
Each operator station had an upper and a head-level 17 in. color CRT display, as well as two 10 in. head-down color displays. The upper CRT of both stations generally displayed a bird’s-eye area map (fixed, north up) with overlaid symbology identifying such things as current UAV location, mission waypoints, and current sensor footprint. The head-level CRTs (i.e., camera display) presented simulated video imagery from the cameras mounted on the UAV. HUD symbology was overlaid on the AVO’s camera display, whereas sensor-specific data were overlaid on the SO’s camera display.
The four smaller head-down displays presented detailed subsystem and communication information. This simulator had no external visual scene.
Control Technology
Both stations had right-hand and left-hand joysticks, as well as two left-hand levers. At the AVO’s station, the joystick and throttle were used to control the UAV’s flight path and speed. At the SO’s station, the right-hand joystick controlled the gimbaled camera position and the left-hand joystick controlled camera zoom factor. Each station also had a trackball and a QWERTY-type alphanumeric keyboard with a horizontal row of function keys on top.
Representative Research
Introduction: Current UAV missions require a high degree of crew coordination to successfully locate and identify ground targets. For example, the AVO’s camera display can be configured to look at large FOV imagery from the gimbaled camera controlled by the SO, whereas the SO views higher-resolution (smaller-FOV) imagery from the same camera to facilitate individual target identification. Thus, while the SO is zoomed in on a particular area, the AVO can spot potential targets that lie outside the SO’s instantaneous FOV. Typically, this target information is communicated
verbally between the operators, but this is complicated because each operator uses a different frame of reference (earth- versus sensor-referenced). The AVO views the target in cardinal directions: north, south, east, and west. The SO then has to map these directions to where the camera is currently pointing with respect to the direction the UAV is flying. A common frame of reference would help communicate target information. This study (Draper et al., 2000) evaluated the following four display concepts (Figure 1.11) superimposed on the SO’s camera-view display:
• Baseline: No additional symbology provided.
• Floating Compass Rose: This provided a constant reference to real-world cardinal headings (N, S, E, W), regardless of air vehicle or camera orientation.
• Locator Line/Telestrator: Via a cursor/track ball, the AVO designated a target location on the AVO’s (10 degrees FOV) display, resulting in a locator line being presented on the SO’s (1 degree FOV) display, indicating direction and angular distance the camera’s LOS should traverse in order to overlay it on the target.
• Combined Locator Line/Telestrator and Floating Compass Rose (N, S, E, W).
Procedures: Twelve participants acted as SOs, and four rated pilots were trained to serve as AVOs. The AVO directed the SO to a ground-target area. The SO’s task was to maneuver the camera aimsight reticle onto the target and designate it. Targets initially appeared either within the AVO head-level display (near condition: 5 degrees radial distance from center) or outside the display (far condition: 20 degrees radial distance). The far condition required the AVO to initially utilize the upper (map) display to instruct the SO to maneuver the camera to the local area.
Results: The results indicated that target designation time was significantly reduced for conditions that utilized the locator line (alone or with the Compass Rose) for both near and far targets. Time to designate targets was reduced by an average
FIGURE 1.11 Three display concepts for target search and localization by the SO of a UAV ground control station.
of almost 50%. There was also less verbal communication when the locator line was used, freeing the audio channel for other tasks.
Conclusions: The locator line expedited transfer of target location information between the UAV operators (AVO and SO). The locator line concept was based on the effective use of similar symbology on aircraft HUDs and HMDs. Thus, an interface concept that was found useful for manned crew stations was found to be useful also for unmanned aircraft control. For both applications, the locator line concept may have additional utility for potential targets identified by sources external to the crew.
Impact
Research with this simulator demonstrated the potential for reducing workload and improving operator situation awareness and task performance. However, it has also shown that technology proven useful for other complex control applications may not be useful for UAV control. For instance, it was thought that HMD technology would enhance the SO’s wide-area searches and spatial orientation, as it has for some manned aircraft applications. However, the results of a series of studies showed that there must be a fundamental limitation of head-coupled control for performing teleoperated search tasks. Use of the joystick and workstation display resulted in better performance on all measures compared to several HMD configurations evaluated (Calhoun et al., 2003). These findings illustrate that it is critical to test candidate interfaces for UAV control in representative ground control station simulators.
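The locator line described above indicates the direction and angular distance that the camera’s LOS should traverse to reach the designated target. One way to compute those two quantities is sketched below in Python. This is an illustrative assumption, not the SIRUS implementation: it supposes that both the camera LOS and the target are expressed as azimuth and elevation angles in a shared earth-referenced frame, that the camera image has no roll, and it uses the standard spherical-trigonometry formulas for angular separation and initial bearing. All function and parameter names are hypothetical.

```python
from math import acos, atan2, cos, degrees, radians, sin


def wrap180(angle_deg: float) -> float:
    """Wrap an angle into the range [-180, 180) degrees."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def locator_line(cam_az: float, cam_el: float,
                 tgt_az: float, tgt_el: float):
    """Return (direction_deg, distance_deg) from the camera LOS to the target.

    direction_deg is measured clockwise from image "up" (increasing
    elevation), so 90 means the target lies to the right. distance_deg
    is the angular separation between the LOS and the target.
    """
    d_az = radians(wrap180(tgt_az - cam_az))
    e1, e2 = radians(cam_el), radians(tgt_el)

    # Angular separation (spherical law of cosines), clamped for safety.
    cos_sep = sin(e1) * sin(e2) + cos(e1) * cos(e2) * cos(d_az)
    distance = degrees(acos(max(-1.0, min(1.0, cos_sep))))

    # Initial great-circle bearing from the LOS toward the target.
    direction = degrees(atan2(
        sin(d_az) * cos(e2),
        cos(e1) * sin(e2) - sin(e1) * cos(e2) * cos(d_az),
    )) % 360.0
    return direction, distance
```

In the study’s near condition, for instance, a target 5 degrees from the display center would yield a distance of 5 degrees, several times the width of the SO’s 1-degree FOV, which is why a pointer drawn at the image edge is helpful.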
Vigilant Spirit Control Station (VSCS) Simulator
As the role of UAVs increases in military and civilian operations, there is a growing need for research simulators to address the unique challenges these vehicles pose to the operators. Examples include how to: identify approaches to improve the operator’s presence in remote environments; design controls and displays that will facilitate the transition of the operator’s role from directly flying a vehicle to managing multiple UAVs; and inform the UAV operator of the detection of cyber activity on their vehicles and provide recommended actions. The VSCS (see Figure 1.12) is a mature, open-architecture, Windows PC-based multi-UAV testbed that has been widely used for over 20 years to develop and evaluate new autonomous capabilities as well as associated operator interfaces, in both high-fidelity simulations and DoD and NASA flight tests (Feitshans et al., 2008; Rowe et al., 2009). Furthermore, it has been used to examine prototype controls and displays for future applications; two of these experiments will be summarized here.
Display Technology
Typically, two large monitors present multiple display panels. A Tactical Situation Display (TSD) provides ownship/route information on a moving map. The health and status panel includes telemetry data, a chat room, communications panel, electronic
FIGURE 1.12 Vigilant Spirit Control Station (VSCS) Simulator. (US Air Force photo.)
checklists, and a subsystem annunciator panel. Other display panels provide information pertinent to the objective of the specific research.
Controls Technology
The TSD serves as the pilot’s primary control interface via UAV symbol selection and inputs on pull-down menus and text boxes via the mouse and keyboard.
Representative Research
Two specific experiments will be summarized to illustrate the flexibility with which VSCS can support research. The first experiment addresses display symbology to aid in the operation of UAVs in the National Airspace System (NAS). The other examines how information regarding the cyber security of their vehicle should be presented to UAV operators.
Sense and Avoid (SAA) Display Symbology Evaluation
Introduction: This experiment evaluated how information from a Sense and Avoid (SAA) maneuver decision aid (Jointly Optimal Conflict Avoidance [JOCA] algorithm developed by Bihrle Applied Research, Inc.; Graham et al., 2011) should be presented to aid UAV pilots. Three stand-alone SAA Displays were evaluated, differing in how the information portrayed the ranges of potential heading and altitude maneuver changes to avoid collisions (Bartik et al., 2017). Color coding was used in all three displays but the symbology varied: arcs in the “Banding Display,” dots and a scale with interactive functionality in the “Probing Display,” and a square indicating combinations of maneuvers in the “Dual Perspective Display.” Two automation thresholds were also evaluated that differed in terms of the degree of separation maintained between aircraft (“Well Clear” larger than “Near Mid-Air Collision” (NMAC)). These thresholds, in turn, determined when the automation would take over and initiate a collision avoidance maneuver.
Procedure: Each of 22 pilots (16 unmanned and 8 manned) completed six trials: two trials with each automation threshold (Well Clear and NMAC) with each of
three stand-alone SAA Displays (Banding, Probing, and Dual Perspective). Pilots in each trial responded to scripted health and status tasks while operating a simulated UAV along a flight path, maintaining safe separation as if they were operating in the NAS. This involved monitoring a SAA display populated with traffic information in proximity to ownship driven by the VSCS generation of six simulated unique traffic encounters per trial.

Results: Pilot performance with the baseline Banding Display was just as good as with the two novel display concepts. These results suggest the additional algorithm transparency features included in the Probing and Dual Perspective Displays are not critical for UAV operations in the NAS. SAA violations were also consistent across both automation thresholds, but the Well Clear threshold paired with the algorithm resulted in less time spent in violation of Well Clear. Lastly, despite the algorithm maneuvering far later than the operators preferred, the type of maneuvers performed by the algorithm was quite similar to the participants’ maneuvers, in terms of their directions and magnitudes.

Conclusion: This study demonstrated the display and performance implications of integrating an advanced SAA algorithm. Further research is needed, though, to evaluate whether increased transparency may be of use for more complex engagements and help tease apart any differences across SAA Displays. Incorporation of adjustable automation thresholds rather than one or the other may prove to be the best solution for more complex environments.

Cyber Threat Information Requirements Investigation for UAV Crews
Introduction: This experiment was designed to determine the level of awareness UAV crews had of cyber threats to their vehicles, and to determine the information requirements for designing displays to help them understand and resolve threatening cyber activity.
In addition to traditional threats to their aircraft, UAV crews will have to deal with future threats, such as cyberattacks on their vehicles. To combat this threat, a Cyber Security Module (CSM) consisting of technologies to detect and defend UAVs from cyberattacks is being developed. In order to determine the best way to integrate this new technology and the information it could provide the UAV crews, a study was conducted using the VSCS (Liggett et al., 2017). The VSCS simulation component allows researchers to create ecologically valid and repeatable mission scenarios. For this study, the basic scenario for data collection was developed such that, if there was a cyberattack on the vehicle, the crew could not successfully complete their mission. Two different cyberattacks were simulated in VSCS to imitate a low-sophistication cyberattack (loss of the sensor ball feed) and a high-sophistication cyberattack (gradual drift of the sensor’s global positioning system coordinates). To simulate the CSM in VSCS, alerts and checklists were provided to the crews during scenarios in which cyber threats were present and the CSM was active.

Procedure: Five two-person crews (pilot and sensor operator) participated in the study. There were two independent variables: two levels of CSM (active and not active) and two levels of cyberattack (low- and high-sophistication) for a total of
four conditions. The dependent measure was the percent of mission tasks accurately completed.

Results: When no cyberattack was present, crews, on average, were able to accurately perform 95% of the tasks in the mission, but when a cyberattack was present without the CSM, crews completed only 25% of mission tasks. However, adding the CSM brought task completion back up to 83%.

Conclusion: This study provided a significant first step in understanding how UAV operators need to receive information about cyberattacks in order to maintain mission effectiveness. Clearly, the best situation is to have a CSM that can detect the type of threat and provide information on how best to respond. Integrating cyber security alerts and checklists into the standard format for mechanical and electrical alerts and checklists provides a sense of familiarity for the operators when dealing with these new types of threats. However, in the future, new interfaces will need to be designed so operators can cross-check multiple sources of information quickly and efficiently so they can also detect and appropriately respond to cyberattacks that have gone undetected by the CSM.

Impact
The reported research illustrates the utility of the VSCS simulation to explore how best to provide information from decision-aiding technologies. For example, it is effective in simulating cyberattacks on UAVs and showing the advantage of presenting information from a cyber security module to crews to maintain mission effectiveness. Because this government-developed extensible system utilizes a plug-in architecture without proprietary software, VSCS can be easily modified to simulate different UAV platforms and support a range of mission roles. It can also serve as a useful simulation to evaluate prototype human–machine teaming concepts under development, as exemplified by the incorporation of the commercially developed JOCA SAA algorithm into the reported experiment.
Intelligent Multi-Unmanned Vehicle Planner with Adaptive Collaborative/Control Technologies (IMPACT) Simulator
Besides enabling an operator to manage multiple UAVs, there is also interest in the ability to employ a heterogeneous mix of unmanned vehicles (UVs) to provide synergetic, multi-domain strategic capabilities for complex, unpredictable situations. This will require more advanced intelligent-agent support, typically referred to as “autonomy”, which has the capability to achieve goals independently, without intervention. An entirely new controls/display interface approach is also required to support single operator management of multiple heterogeneous UVs that provides the operator transparency into the supporting autonomy and supports bi-directional collaboration and high-level tasking between the operator and autonomy. Also, for joint human–autonomy teaming, the operator must maintain overall situation awareness of the status of the autonomy’s processing and the rationale for its recommendations, the basis for a shared mental model of who is doing what (as well as when and why). To ensure agility, the interface must support a range of control options whereby the operator
can, depending on mission demands, be “on the loop” (supervising the autonomy), as well as being “in the loop” (exercising teleoperation to precisely control a particular vehicle/sensor temporarily). In response to this need, a tri-service team designed and implemented the IMPACT simulator to enable command and control of 12 UVs (4 air, 4 ground, and 4 sea surface vehicles) in a simulated mission to defend a military base perimeter by responding to multiple unexpected events (Draper et al., 2018).

Display Technology
Figure 1.13 illustrates IMPACT’s typical configuration featuring four displays. The top TSD provides a map and current information such as pertinent mission locations, UVs and their associated routes, as well as a vehicle panel showing a summary of each UV’s status. The left monitor provides system information and the right monitor provides a detailed dashboard for multiple UVs that includes status information and each UV’s sensor feed. The bottom monitor provides a “sandbox” display that mirrors the TSD but also allows the operator to create “what-if” scenarios by generating and comparing possible UV plans before implementing them. The sandbox is also where most of the interfaces are located that support operator management of numerous UVs. Given the large amount of information to present in the control station, the displays employ video game-inspired pictorial icons that present information in a concise, integrated manner to facilitate retrieval of the states/goals/progress for multiple systems and support direct perception and manipulation principles.

Control Technology
IMPACT employs a “playbook” delegation approach that enables seamless transition between control states (from manual to fully autonomous). With this adaptable automation scheme, the operator retains authority and decision-making responsibilities
FIGURE 1.13 Intelligent Multi-Unmanned Vehicle Planner with Adaptive Collaborative/Control Technologies (IMPACT) Simulator. (US Air Force photo.)
that help avoid “automation surprises.” By supporting flexible operator–autonomy teamwork, agility is enabled to better respond in dynamic mission environments. At one extreme, the operator can manually control UV movement or build plays from the ground up, specifying detailed parameters. At the other extreme, the operator can quickly task one or more UVs by only specifying play type and location with an intelligent agent determining and executing all other parameters. For example, when an IMPACT operator calls a play to achieve air surveillance on a building, the intelligent agent recommends a UV to use (based on estimated time en route, fuel use, environmental conditions, etc.), a cooperative control algorithm provides the shortest route to get to the building (taking into account no-fly zones, etc.), and an autonomics framework monitors the play’s ongoing status (e.g., alerting if the UVs won’t arrive at the building on time). IMPACT’s play calling interfaces also facilitate operator–autonomy communication on mission details key to optimizing play parameters (e.g., target size and current visibility) as well as supporting operator–autonomy shared awareness (e.g., a display showing the tradeoffs of multiple autonomy-generated courses of action [COAs] across mission parameters). Play progress is depicted in a matrix display reflecting autonomics monitoring, as well as a tabular interface that aids play management (e.g., allocation of assets across plays). Figure 1.14 illustrates the interfaces on the sandbox display to call plays, tweak plays via a workbook, monitor active plays, and manage chat communications. Each UV symbol and its respective route is presented in a unique color. Additional details are available (Calhoun et al., 2018; Frost et al., 2019).
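The asset-recommendation step in the play-calling example above can be illustrated with a small sketch. This is not IMPACT's algorithm, which the chapter does not specify; it is a hypothetical weighted-cost ranking, assuming that each vehicle reports an estimated time en route and a remaining fuel fraction. The `Vehicle` class, the `recommend` function, the weights, and the fleet values are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    """Hypothetical vehicle record; fields are illustrative assumptions."""
    name: str
    domain: str            # "air", "ground", or "sea"
    eta_min: float         # estimated time en route, in minutes
    fuel_fraction: float   # remaining fuel, 0.0 (empty) to 1.0 (full)
    busy: bool = False     # already committed to another play


def recommend(vehicles, required_domain, w_eta=1.0, w_fuel=10.0):
    """Return the free vehicle of the required domain with the lowest cost.

    Cost is an assumed weighted sum: a longer time en route and a lower
    fuel reserve both make a vehicle less attractive. Returns None when
    no suitable vehicle is free.
    """
    candidates = [
        v for v in vehicles
        if v.domain == required_domain and not v.busy
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda v: w_eta * v.eta_min + w_fuel * (1.0 - v.fuel_fraction),
    )


# Made-up fleet used only to exercise the function.
fleet = [
    Vehicle("UAV-1", "air", eta_min=6.0, fuel_fraction=0.9),
    Vehicle("UAV-2", "air", eta_min=4.0, fuel_fraction=0.3),
    Vehicle("UGV-1", "ground", eta_min=2.0, fuel_fraction=0.8),
    Vehicle("UAV-3", "air", eta_min=3.0, fuel_fraction=0.95, busy=True),
]

choice = recommend(fleet, "air")
print(choice.name if choice else "no vehicle available")
```

With these made-up values the slower but better-fueled UAV-1 is preferred over UAV-2, and the busy UAV-3 is excluded. In a playbook scheme the operator would remain free to accept or override such a recommendation.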
Besides determining the degree to which the autonomy assists with UV control, IMPACT’s interfaces were initially designed to also provide the operator flexibility in terms of which control modality could be employed to make inputs. Specifically, plays could be called or edited (1) via mouse/click inputs, (2) by using a touchscreen monitor, or (3) via speech commands (Calhoun et al., 2017). Furthermore, the operator could flexibly employ any of the three control modalities. In other words, the interfaces were designed to support all three modalities for each step in utilizing the play-based interfaces.

Representative Research
Introduction: The research summarized here was one of the earlier human-in-the-loop experiments using the IMPACT simulation. Its objective was to compare the IMPACT prototype to a baseline UV control system that represented the state-of-the-art for that timeframe (Behymer et al., 2017). The baseline system included a subset of IMPACT’s capabilities including the route planner and its associated interface, in that route planners were operational at the time. However, the baseline system lacked autonomy assistance in terms of UV asset recommendations, plan monitoring, and speech-based control.

Procedures: The experimental design was a 2 (Baseline, IMPACT) × 2 (low, high mission complexity) within-participant design with the order of conditions blocked by system (half of the participants used IMPACT first, the other half used
FIGURE 1.14 IMPACT simulator’s sandbox map showing six unmanned vehicles on patrol and three ongoing plays: ground UV inspecting point with an air UV providing communication relay support, air UV and two ground UVs escorting a ground entity, and one sea UV inspecting a water threat. Dashed symbology shows a proposed plan for an air sector search at a point. (US Air Force photo.)
the baseline first) and counterbalanced across task complexity. Mission complexity was manipulated by varying the number and timing of tasks. Each of the eight participants familiar with base defense and/or UV operations performed four 60-minute base defense missions. Participants completed a variety of defense mission-related tasks in each mission involving 12 simulated heterogeneous UVs.

Results: Participants’ task performance was better on multiple mission performance metrics with the IMPACT system in comparison to the baseline system. Participants were also able to execute plays using significantly fewer mouse clicks with IMPACT as compared to baseline. The overall usability of each system was assessed using the System Usability Scale (SUS; Brooke, 1996). Participants rated IMPACT higher than the baseline on all ten SUS items, and IMPACT’s overall SUS score was significantly higher than the baseline’s overall SUS score. Participants also subjectively rated IMPACT significantly better than the baseline in terms of its perceived value to future multi-UV operations as well as its ability to manage workload. In fact, every participant gave IMPACT the highest possible score for potential value, and all but one participant gave IMPACT the highest possible score for its ability to aid workload. Subjective data with respect to the speech and touch input modalities, however, was not as positive. Instead, participants favored the mouse input modality and their inputs were also faster and more accurate with mouse-based input compared to touch or speech input (Calhoun et al., 2017).

Conclusions: Evaluation results indicated that the play-based innovative control and display-based approach supports operator–autonomy teaming for effective management of a dozen simulated vehicles performing base defense tasks (Behymer et al., 2017).
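For readers unfamiliar with how an overall SUS score is obtained, the standard scoring rule (Brooke, 1996) is as follows: each of the ten items is rated from 1 to 5, odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is multiplied by 2.5 to give a score from 0 to 100. A minimal sketch of that rule is given below. The function name and the example ratings are illustrative only and are not data from the IMPACT evaluation.

```python
def sus_score(ratings):
    """Compute a System Usability Scale score from ten 1-5 ratings.

    Items are taken in questionnaire order (item 1 first). Odd-numbered
    items are positively worded and even-numbered items are negatively
    worded, so the two groups are rescaled in opposite directions.
    """
    if len(ratings) != 10:
        raise ValueError("SUS requires exactly ten item ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must be between 1 and 5")

    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:
            total += rating - 1      # odd item: higher rating is better
        else:
            total += 5 - rating      # even item: lower rating is better
    return total * 2.5


# Illustrative ratings only (not study data).
best_case = [5, 1, 5, 1, 5, 1, 5, 1, 5, 1]
neutral = [3] * 10
print(sus_score(best_case), sus_score(neutral))
```

The most favorable response pattern yields 100 and a uniformly neutral pattern yields 50, which is why per-item comparisons and overall scores can both be reported for two systems.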
Impact
The IMPACT simulation has successfully supported a series of laboratory experiments and live tests including the Autonomy Strategic Challenge Warrior live exercise supported by The Technical Cooperative Program (TTCP) – a five-nation collaborative project (Bartik et al., 2020). Results using this novel instantiation of adaptable automation have been very positive. This proven play-based approach, employing concise icons that facilitate direct, intuitive, and efficient task workflow, also informs other Air Force systems/programs. Examples include how it can support operational efforts involving small UAVs, as well as provide human–automation teaming interfaces for the Advanced Battle Management Systems, currently under development. Research with IMPACT has also identified needed improvements for effective human–autonomy teaming, such as in the bi-directional human–autonomy communications methods and aids that support collaborative problem-solving (e.g., naturalistic dialogue and sketch interactions). Examinations of a variety of team structures on overall human–autonomy teaming are needed as well as mechanisms that improve management of temporal constraints. Additionally, interest in shifting to larger numbers of collaborating systems to provide joint all-domain command
and control (JADC2) strategic capabilities (integrated air, land, maritime, cyber, and space capabilities) has prompted ongoing IMPACT research threads. One effort aims to enable distributed collaborative support between an IMPACT operator and ground-dismounted soldiers equipped with an Android-Based Tactical Awareness Kit (ATAK). Another is exploring how cyber-related effects can be initiated with the play-based approach, either separately or in conjunction with plays involving UVs.
MOTION-BASED SIMULATORS
The simulators discussed thus far are all fixed-based simulators, that is, they do not include simulated realistic platform motion. However, as some aircraft become more and more agile, there are many human factors issues that need to be considered for cockpit design. Because of this, motion-based simulators provide a means for exploring human-related issues prior to more costly flight test options. Primarily, motion-based simulators are used for demonstration and training but some, such as centrifuges, are used for gravity (G)-tolerance testing, and others are used to study the incidence and effects of pilot spatial disorientation (a pilot’s misperception of the attitude, position, or motion of his/her aircraft). In contrast to fixed-based simulators, there has been little control or display research conducted in motion-based simulators to date. This is unfortunate as motion environments have numerous physiological and psychological consequences that could impact the utility of controls and displays. For instance, under high acceleration it is difficult to move the arm or hand to select functions on the front panel. Hence, the effects of acceleration on the utility of eye gaze and speech-based control would be of interest to see if these are viable alternative controls. High acceleration is also known to affect color vision. In fact, research was conducted in one motion-based simulator to evaluate this effect. This simulator and research will be described in the next section.
Dynamic Environmental Simulator (DES)
The DES centrifuge provided multi-axis G exposures in a gimbaled cab (Figure 1.15). It exposed participants up to a maximum of 9Gs, and could combine accelerations (Gx + Gy + Gz). A variety of physiological and experimental measurements could be collected during simulation testing. These included heart rate, skin response, eye blink rate, blood flow, head movement, and G-exposure data. There was also closed-circuit video and audio available for data recording purposes. The DES was used to test personal protective equipment, helmet-mounted systems, and cockpit systems. It supported both sustained acceleration and spatial disorientation research. After modification, the DES allowed closed-loop motion-based studies. So, in addition to the previously mentioned collectible data parameters, pilot flight performance metrics could be collected and analyzed.
FIGURE 1.15 Dynamic Environmental Simulator (DES), circa 1969. (US Air Force photo.)
Display Technology
The cab contained a domed visual scene provided by a front projection system. It displayed a 180 degrees horizontal by 160 degrees vertical out-the-window FOV. In addition, HUD symbology could be projected onto the out-the-window scene for HUD symbology evaluations. The head-down display contained one 23 in. diagonal LCD, and the format presented on this display could be changed via software to represent a number of head-down instrument panels (e.g., an F-18 head-down suite).

Control Technology
The cab contained an F-16 control stick and throttle that was used by participants to fly an F-16 aeromodel. As with many of the simulators described, this system could be switched with other aeromodels, sticks, and throttles to represent a variety of aircraft cockpits.

Representative Research
Introduction: One of the biggest advances in head-down information presentation was the addition of color to the displays in the 1970s. Pilots could now rely on known color schemes to determine the meaning of specific objects on a display (e.g., green = good, red = bad) and could learn new color schemes for display use. Color has been shown to improve pilot performance and reduce workload. However, when pilots are pulling high Gs or sustaining acceleration, the blood pressure in the eye is reduced, and this may cause changes in color vision. Four experiments were conducted in the DES to help determine the effects of sustained acceleration on color vision (Chelette et al., 1999). A few will be summarized here.

Background: To get an idea of what actually happens to color vision under high Gs, a preliminary study was conducted in which participants with normal color
vision viewed a color map as the G profile ramped up from the baseline (1.4Gs) at a rate of 0.1Gs per second until the participants experienced almost complete blackout. Participants were to report what they saw during this process. A number of participants said that the river (a cyan color) faded away first, then the yellow and green of the terrain faded together, and finally the red and dark blues faded to black. This observation led to numerous studies conducted in the DES to explore visual contrast sensitivity, night vision, and visual acuity under Gs.

Luminance Study: Four colors (red, green, blue, and yellow) at various luminance levels were tested in the following manner. A grid display was developed that contained four rows of digits in various colors; one color per row. The columns had varying luminance contrast ratios. For instance, each column had the same luminance contrast ratio with its background, regardless of the digit color. Participants were subjected to a G profile that progressed from the baseline to near blackout at a slow onset rate. Results showed that digits with greater luminance contrast ratios with their backgrounds remained recognizable longer than digits with lesser luminance contrast ratios.

Color Identification Study: The objective of this study was to determine if participants could identify colors at high Gs. The study employed five colors (red, green, blue, yellow, and gray), at three contrast ratios to represent dark viewing, daylight viewing, and twilight conditions. Six G levels from 1G to 9Gs were tested (1.0, 7.0, 7.5, 8.0, 8.5, and 9.0). Results showed that there were no significant differences in terms of reaction time and accuracy between the colors, the contrast ratios, or the G levels. However, one participant had a large number of errors with the color yellow.
Therefore, even though it does not appear that most participants have a hard time distinguishing colors under G, this study showed how an undetected color perception deficit of a particular participant may become evident at high Gs.

Color Discrimination Study: The objective was to determine if colors could be discriminated under high Gs when the task involved mathematical judgment and choice. This task was more representative of a pilot task in that the display contained seven targets, four of one color and three of another. The participants’ task was to simply press a button that indicated the color of the greater number of targets. For this study, contrast ratio was held constant, and four G levels were used (1.0, 7.0, 8.0, and 9.0). Although the overall error rate was below 10%, trends showed that there were more errors in this task than in the previous ones, and the majority of errors occurred at 9Gs. The most common errors encountered were not being able to discriminate between yellow and green (yellow was commonly mistaken for green) and not being able to discriminate between gray and blue (gray was commonly mistaken for blue).

Conclusions: On the basis of results of the reported studies, colors with similar luminance contrast ratios should not be used on the same display because they may fade together during high-G maneuvering. These types of studies may be instrumental in detecting color deficiencies for pilots who intend to fly high-G aircraft. Also, under high-G conditions, pilots may be prone to not being able to discriminate between yellow and green. Given that green is commonly used to represent friendly objects, this may compromise pilot performance if they cannot discriminate between friendly and unknown entities.
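To give a sense of the time scale of the slow-onset profile used in the preliminary study (a 1.4G baseline increasing at 0.1G per second), the sketch below computes how long a linear ramp takes to reach a given G level. It assumes a perfectly linear ramp. The target levels are borrowed from the later identification and discrimination studies purely as reference points; the chapter states only that the preliminary ramp continued until near blackout, not that it reached these values. The function name is hypothetical.

```python
def seconds_to_reach(target_g, baseline_g=1.4, onset_rate_g_per_s=0.1):
    """Time for a linear G ramp to climb from baseline_g to target_g.

    Defaults follow the preliminary study's reported baseline and onset
    rate; the linear-ramp assumption is ours.
    """
    if onset_rate_g_per_s <= 0:
        raise ValueError("onset rate must be positive")
    if target_g < baseline_g:
        raise ValueError("target must not be below the baseline")
    return (target_g - baseline_g) / onset_rate_g_per_s


# Reference G levels taken from the later DES studies (assumption: the
# same ramp is applied, which the chapter does not state).
for g in (7.0, 8.0, 9.0):
    print(f"{g:.1f} G reached after about {seconds_to_reach(g):.0f} s")
```

Under these assumptions a ramp to 9Gs would take roughly 76 seconds, which illustrates why participants had time to report the order in which colors faded.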
Impact
A motion-based simulator like the DES is well suited to conducting many types of control and display research in that pilots can fly high-G profiles in a realistic manner with closed-loop control. For example, studies like these can produce display design guidance for color displays in high-G aircraft and can help determine color vision screening recommendations.
Disorientation Research Device
One of the newer motion-based devices built to explore challenges associated with pilot spatial disorientation (SD) is the Disorientation Research Device (DRD), housed at the Naval Medical Research Unit – Dayton (NAMRU-D) at Wright-Patterson Air Force Base (see Figure 1.16). This six-degrees-of-freedom device (roll, pitch, yaw plus planetary motion) can replicate angular and linear flight accelerations with up to 3 Gs of sustained force. It was also designed to mimic some SD vestibular illusions. The system rests on a 35-foot-diameter platform which can turn either clockwise or counter-clockwise at up to 150 degrees/sec. Linear motion capabilities include 33 feet of horizontal translation and six feet of vertical motion. One of the objectives for building the DRD was to create a motion-based simulator capable of inducing pilot SD to determine the best methods for combating these situations in flight. Therefore, one of the first studies conducted in the DRD was to compare fixed-based and motion-based simulation capabilities (the DRD with and without motion activated) on their effectiveness in inducing SD (Williams et al., 2021).

Display Technology
The cockpit instrument panel in the DRD is a wide-field high-resolution visual display system (26 in. diagonal monitor; 1,366 × 768 pixel resolution) with touchscreen
FIGURE 1.16 Disorientation Research Device (DRD): The Kraken™. (US Navy photo.)
capability. The out-the-window (OTW) scene is displayed on a 65 in. diagonal super ultra-high definition flat panel display that provides an 83 degrees horizontal × 53 degrees vertical field-of-view. Both the instrument panel and the OTW graphics are generated with X-Plane flight simulation software (Version 11.41).

Control Technology
As mentioned previously, the instrument panel monitor has touchscreen capability for pilot/study participant interaction with the display. Engine power is controlled with a Thrustmaster Warthog throttle, and pitch and roll are controlled with a Flightlink G-Stick III joystick. Yaw is controlled with adjustable rudder pedals. The cockpit is fully networked to an external control station with full monitoring and recording capabilities. The DRD can be operated in pilot-in-the-loop or preprogrammed modes. Research data recording capabilities include two-way cockpit voice communications, a 3-camera video system for visual recordings, eye and head tracking, physiological monitoring, and full “flight” data recording.

Representative Research
To date, the limited research conducted in the DRD has not been focused on controls and displays. However, a forthcoming study will compare simulation approaches (in-flight simulation and motion-based simulation) on the evaluation of candidate technologies (visual symbology sets presented on a helmet-mounted display and spatialized auditory displays) for SD prevention (Geiselman et al., 2017). More information is presented in the next section on in-flight simulators as the preliminary study was conducted in the University of Iowa Operator Performance Laboratory Aero L-29 Delfin Jet and will be repeated in the DRD for comparison of results. This study will also serve to verify the SD illusions created in the motion-based DRD against those of real flight.

Impact
Pilot SD continues to be a problem for both military and civilian pilots and is the leading cause of aviation mishap fatalities.
On the military side, it is one of the most costly challenges measured by both life and equipment lost. The ability to accurately replicate SD conditions/illusions is paramount to conducting effective research on ways to combat this threat with effective control and display design. The DRD is a state-of-the-art motion-based simulator specifically designed to study and address pilot SD and will contribute to meaningful research in this area for years to come.
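The OTW display figures quoted above for the DRD (a 65 in. diagonal flat panel giving an 83 degrees × 53 degrees field-of-view) can be related to each other with basic trigonometry: a flat panel of width w viewed head-on from distance d subtends a horizontal angle of 2·atan(w / 2d). The sketch below works backward from each angle to the implied eye distance. It assumes a 16:9 panel and an eye point centered on and perpendicular to the screen, neither of which is stated in the chapter; the function names are hypothetical.

```python
import math


def panel_size(diagonal_in, aspect_w=16, aspect_h=9):
    """Width and height (inches) of a flat panel from its diagonal.

    The 16:9 aspect ratio is an assumption, not a documented DRD value.
    """
    scale = diagonal_in / math.hypot(aspect_w, aspect_h)
    return aspect_w * scale, aspect_h * scale


def eye_distance(extent_in, fov_deg):
    """Viewing distance at which a centered flat extent subtends fov_deg."""
    return (extent_in / 2) / math.tan(math.radians(fov_deg / 2))


width, height = panel_size(65)
d_horizontal = eye_distance(width, 83)   # from the horizontal FOV
d_vertical = eye_distance(height, 53)    # from the vertical FOV

print(f"panel: {width:.1f} in. wide by {height:.1f} in. high")
print(f"implied eye distance: {d_horizontal:.1f} in. (horizontal), "
      f"{d_vertical:.1f} in. (vertical)")
```

Under these assumptions both angles imply nearly the same eye distance of about 32 in., so the two published FOV values are mutually consistent for a single design eye point.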
IN-FLIGHT SIMULATORS
Although the motion-based simulators described in the previous section allow for testing in some aspects of the flight regime, there are many other aspects that can only be examined in flight. The purpose of in-flight simulators is to get as close as possible to the environment in which the crew station technology will ultimately be employed. A few will be discussed in the following sections.
NASA’s OV-10
Although not designed initially to be an in-flight simulator, the OV-10 has been used extensively to test spatial audio and speech recognition technologies. The OV-10A aircraft is a twin-engine, two-crew-member, tandem-seating turboprop aircraft (Figure 1.17). The displays in the rear seat were dependent on the research taking place at the time of flight. For instance, during a speech recognition study, a monochrome monitor was installed that displayed words the participants were to say into a microphone. There was a keyboard in the backseat, as well as a push-to-talk switch, an acoustic microphone, and a noise-canceling boom microphone for the speech recognition studies.

Representative Research
Introduction: Speech recognition has long been advocated as a natural and intuitive method by which humans could potentially communicate with complex systems. Research in the area of robust speech recognition, in addition to advances in computational speed and signal processing techniques, has resulted in significant increases in recognition accuracy, key to users accepting the technology. Speech recognition systems have advanced to the point where applications of this technology have significantly increased. The demands on military pilots are extremely high because of the very dynamic environment within which they operate. The pilot has only limited capability to effectively manage available onboard and off-board information sources using just hands and eyes. Because workload is high, and the ability to maintain situational awareness is imperative for mission success, voice control is ideal for military cockpit applications. For these reasons, and because recognition accuracy rates were approaching acceptable levels for cockpit applications, research began to evaluate the potential use of automated speech recognition technology as a natural, alternative method for the management of aircraft subsystems. The key objective was to
FIGURE 1.17 OV-10A Aircraft. (US Air Force photo.)
confirm that performance would not deteriorate in the operational flight environment due to high noise, acceleration, or vibration. Williamson et al. (1996) conducted a study to measure word-recognition accuracy of the ITT VRS-1290 speech recognition system in an OV-10A test aircraft both on the ground and in 1G and 3G flight conditions. A secondary objective of this study was to compile a speech database that could be used to test other speech recognition systems.

Test Procedures: Sixteen participants were involved in this study. All participants were tested in the laboratory, in the hangar (sitting in the aircraft cockpit with no engines running), and in flight. During flight, participants experienced a 1G data collection session (referred to as 1G1), followed by a 3G data collection session, and then another 1G data collection session (referred to as 1G2) to test for possible fatigue effects. The study was divided into two separate sessions. The first session consisted of generating the participants’ speech templates (samples of them saying each word in the vocabulary to be tested) in a laboratory setting and collecting some baseline performance data. Participants were briefed on the nature of the experiment, and template enrollment was performed. A system identical to the one in the aircraft was used as the ground support system for template generation. The participants used the same helmet and boom-mounted microphone that was used in the aircraft. Template training involved the participants’ speaking a number of sample utterances. Once template generation was completed, a recognition test followed that consisted of reciting the utterances to collect baseline recognition data. The first aircraft test session was performed in the hangar to provide a baseline (the aircraft in quiet conditions). This consisted of each participant speaking the 91 test utterances twice, for a total of 182 utterances.
During both ground and airborne testing, participants needed little or no assistance from the pilot of the aircraft. The participants sat in the rear seat of the OV-10A and were prompted with a number of phrases to speak. All prompts appeared on a 5 × 7 in. monochromatic LCD in the instrument panel directly in front of the participants. Their only cockpit task was to reply to the prompts. Close coordination was required, however, between the pilot and participants while the 3G maneuvers were being performed as the pilot had to execute a specific maneuver in order to keep the aircraft in a 3G state. Results: Three comparisons of word-recognition accuracy were of primary interest:
1. Ground (lab + hangar) versus air (1G1 + 3G + 1G2)
2. 1G (1G1 + 1G2) versus 3G
3. 1G1 versus 1G2
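As a rough illustration of how three planned comparisons of this kind can be encoded, the sketch below writes each one as a set of contrast weights over the five test conditions and checks that the set is mutually orthogonal. The weights, the function names, and the example means are assumptions chosen for illustration; they are not taken from Williamson et al. (1996), and the dot-product check for orthogonality assumes equal numbers of observations per condition.

```python
# Order of the five test conditions used throughout this sketch.
CONDITIONS = ["lab", "hangar", "1G1", "3G", "1G2"]

# Hypothetical contrast weights (illustrative only, not from the study).
CONTRASTS = {
    "ground_vs_air": [3, 3, -2, -2, -2],  # (lab + hangar) vs (1G1 + 3G + 1G2)
    "1G_vs_3G": [0, 0, 1, -2, 1],         # (1G1 + 1G2) vs 3G
    "1G1_vs_1G2": [0, 0, 1, 0, -1],       # first vs second 1G session
}


def dot(a, b):
    """Sum of element-wise products of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def is_orthogonal_set(contrasts):
    """True if each contrast sums to zero and every pair has a zero dot product."""
    vectors = list(contrasts.values())
    sums_ok = all(sum(v) == 0 for v in vectors)
    pairs_ok = all(
        dot(vectors[i], vectors[j]) == 0
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
    )
    return sums_ok and pairs_ok


def contrast_estimate(weights, condition_means):
    """Weighted sum of condition means for one contrast."""
    return dot(weights, condition_means)


if __name__ == "__main__":
    print(is_orthogonal_set(CONTRASTS))
    # Made-up accuracy means (percent), in CONDITIONS order.
    print(contrast_estimate(CONTRASTS["1G1_vs_1G2"], [97, 97, 98, 97, 96]))
```

With five conditions, a complete orthogonal set would contain four contrasts; the remaining one (lab versus hangar) is omitted here because it was not among the comparisons of primary interest.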
Orthogonal comparisons were performed for each of these scenarios; no significant differences were found (Figure 1.18). Conclusions: Results showed that the ITT VRS-1290 Voice Recognizer/Synthesizer system performed very well, achieving over 97% accuracy over all flight
FIGURE 1.18 Mean word accuracy for each test condition. (From Williamson, D. T., Barry, T. P., and Liggett, K. K. 1996. Flight test performance optimization of ITT VRS-1290 speech recognition system. Audio Effectiveness in Aviation: Proceedings of the Aerospace Medical Panel Symposium.)
conditions. The concept of speech recognition in the fighter cockpit is very promising. Any technology that enables a pilot to stay head-up and hands-on will greatly improve flight safety and situational awareness.
Impact
This flight test represented one of the most extensive in-flight evaluations of a speech recognition system ever performed. Over 5,100 utterances comprising more than 25,000 words or phrases were spoken by the 12 participants during the flight (the flight test data from 4 of the 16 participants were not usable). This number combined with the two ground conditions resulted in a test of over 51,000 words and phrases. The audio database of digital audio tape (DAT) recordings was transferred onto CD-ROM and was used to facilitate laboratory testing of other speech recognition systems. It was also made available for distribution to the speech recognition community. The DAT recordings proved to be extremely valuable because many new voice recognition systems were produced after this study was conducted. With this database, new systems could be tested against speech recorded in an extremely harsh environment (the participants’ crew station was directly in line with the noisy engines) without requiring additional flight tests. Finally, the example study illustrates the importance of flight-testing controls and displays in the environment in which they will be used. The OV-10, as well as the in-flight simulators discussed in this section, are invaluable as risk-reduction vehicles for ensuring that the control and display technology can be integrated into the airborne environment.
Total In-Flight Simulator (TIFS) NC-131H Transport Aircraft
The purpose of TIFS (Figure 1.19) was to perform airborne simulations of existing or new aircraft to evaluate flying qualities, flight control characteristics, human factors concerns, or other issues of interest. This simulator allowed for the variation of numerous parameters in flight, such as aircraft flight characteristics, controller feel characteristics, and HUD formats. In addition, the cockpit could be reconfigured to represent other aircraft. The front seat served as the evaluation cockpit and the rear seat served as the safety cockpit. The HUD and head-down displays were driven by a programmable display generator that allowed for quick changes to the display formats. The method for aircraft control was also variable; control sticks, wheels, and throttles were easily interchanged in the evaluation cockpit.
Representative Research
The TIFS aircraft has been used to evaluate a number of control and display configurations for different aircraft, for example, the “glass” version of the Air Force’s C-141 transport aircraft, called the Control/Display System (CDS). The objective of this study was to test the adequacy of the LCDs that replaced the E-M primary flight displays (ADI and horizontal situation indicator). The pilots performed tasks such as unusual attitude recovery and instrument landing system approaches. The objective and subjective flight test data showed that performance, pilot workload, spatial orientation, and air crew acceptance using the proposed CDS display format were improved over, or at least commensurate with, those obtained using the current C-141 instrument format in almost every instance of analysis (Gawron & Bailey, 1995, p. 84).
FIGURE 1.19 Total In-Flight Simulator (TIFS). (US Air Force photo.)
The aircraft has also been used by the National Aeronautics and Space Administration (NASA) to test the Synthetic Vision System (SVS) that was a key part of the High-Speed Research Program. “The SVS project develops synthetic vision technologies with practical applications to eliminate low visibility conditions as a causal factor in civil aircraft accidents” (Willshire et al., 2000, p. 376). As part of the flight testing of the SVS technologies, the TIFS crew station was configured with displays showing different views of the outside world that would aid the pilot in landing the aircraft during poor visibility conditions. The results showed that, whereas some features (such as ground texturing) were beneficial, others (such as minimized display formats) did not aid in the landing task.
Impact
The two studies just discussed serve to illustrate the versatility of the TIFS aircraft in evaluating different control and display configurations. The real impact of TIFS was its ability to be reconfigured to evaluate everything from transport cockpit displays, such as in the C-141, to handling qualities of the space shuttle. After supporting over 2,500 research flights, this aircraft was retired to the USAF Museum.
Variable In-Flight Stability Test Aircraft (VISTA) Lockheed NF-16D Fighter Aircraft
VISTA (Figure 1.20) was an F-16D flight test vehicle in which the front seat served as an evaluation cockpit, and the rear seat served as a safety cockpit. It was used to conduct a variety of airborne simulations to evaluate flying qualities, flight controls, and control and display issues. One of these simulations was to further evaluate voice recognition, a follow-on to the OV-10 evaluation reported earlier in this chapter
FIGURE 1.20 Variable In-Flight Stability Test Aircraft (VISTA). (US Air Force photo.)
(Williamson et al., 1996). Briefly, this flight test verified recognition engine parameters, identified microphone audio issues, and provided additional samples, including non-native English speakers, to supplement the database available to the speech recognition research community (Barry et al., 2006). To further illustrate the use of VISTA, a study that evaluated HMD symbology is described next.
Representative Research
Introduction: HMDs can provide an important function in aircraft: off-boresight targeting. This capability highlights a major difference between HMDs and HUDs. HUDs are mounted to the instrument panel of the cockpit and can provide on-boresight information only. On-boresight refers to the visual area the pilot sees when looking down the longitudinal axis of the aircraft (i.e., looking straight ahead). HMDs are mounted to the pilot’s head and can provide on-boresight as well as off-boresight information. Off-boresight refers to all other visual areas the pilot views (i.e., not looking straight ahead). Some of the challenges with integrating HMDs in the cockpit are determining how to present information, and what information needs to be presented to the pilot not only at various phases of the mission but also at various head positions. This study (Jenkins et al., 2002) focused on determining the best off-boresight HMD symbology for targeting, as well as attitude maintenance during realistic air-to-air and air-to-ground target acquisition scenarios. Unusual attitude recovery performance was also examined to determine whether the HMD symbology helped or hindered the pilots’ ability to recover from challenging, SD-like situations. Symbology Sets: The off-boresight symbology sets tested included a non-distributed flight reference (NDFR) format and a visually coupled acquisition and targeting system (VCATS) format. The NDFR allowed ownship status information to always be available on the HMD regardless of where the pilot was looking.
It included both digital and analog information. The digital information provided airspeed, altitude, and heading. The analog information was portrayed with the arced portion of the display. Attitude was interpreted by comparing the position and length of the arc with the aircraft symbol. Four examples are shown in Figure 1.21: straight and level, 180 degrees roll, 30 degrees right roll, 135 degrees left roll. VCATS (Figure 1.22) was designed as a high-altitude ownship attitude reference. It included a horizon line split by an aircraft symbol representing climb/dive angle. The horizon line on this symbology set rotated as does the standard ADI and HUD horizon line. The VCATS horizon line also changed shape as climb/dive angle increased or decreased. For instance, when climb/dive angle was negative, the line became dashed and portrayed a chevron-type symbol. Digital readouts of airspeed, heading, and altitude were presented around the outer boundaries of the HMD FOV. The military standard 1787 HUD (MIL-STD HUD) symbology was present on the HUD for the air-to-air and air-to-ground tasks, and on the HMD in virtual HUD mode during the unusual attitude recovery tasks when pilots looked on-boresight. The reason the on-boresight symbology was presented on the HMD for this task was that the pilot-subjects wore leather visors to prevent themselves from seeing the outside world as the evaluation pilot flew the aircraft into the unusual attitude. These
FIGURE 1.21 Non-distributed flight reference (NDFR) symbology.
visors also prevented them from seeing the HUD. Thus, the virtual HUD on the HMD was utilized for this task. This symbology included a climb/dive ladder, moving horizon line, fixed aircraft reference, bank scale, clocks to represent airspeed and altitude, and a heading tape (Figure 1.23). Procedures: Participants evaluated the various HMD formats for a total of 11.7 flight hours. Test points were flown from the front cockpit by the evaluation pilot. The safety pilot in the rear cockpit set up the HMD configurations, performed routine F-16 flight procedures, and monitored the safety of the flights. Pilots became familiar with the symbology while sitting in the cockpit of the VISTA test aircraft, which essentially functioned as a ground simulator. For the air-to-air and air-to-ground tasks, pilots flew with the standard HUD symbology on the HUD when they were looking on-boresight, and either nothing, NDFR, or VCATS symbology on the HMD when they looked off-boresight. For the unusual attitude recovery task, pilots were looking off-boresight when the task began, and they flew with the standard HUD symbology on the HMD when looking on-boresight, and either nothing, the NDFR, or VCATS symbology on the HMD when looking off-boresight. Results: For the unusual attitude recovery task, pilots performed 37% faster in initiating a correct input with the NDFR format than with the MIL-STD HUD format.
FIGURE 1.22 Visually coupled acquisition and targeting system (VCATS) symbology.
Pilots also performed 18% faster with the NDFR than with VCATS for the same measure. For the air-to-ground task, both the NDFR and the MIL-STD HUD provided an adequate or desired amount of off-boresight search time, while the NDFR allowed the highest percentage (longer search times mean that the off-boresight symbology provided adequate information for the pilots to maintain safe flight without having to return attention to the on-boresight flight display). For the air-to-air task, the NDFR format allowed pilots to achieve the highest percentage of off-boresight search time while still maintaining aircraft parameters. Although the VCATS symbology also provided an adequate amount of off-boresight search time, one of the primary task performance metrics was degraded when pilots used VCATS for off-boresight searching. Conclusions: This study shows the advantages of using off-boresight attitude symbology not only for air-to-air and air-to-ground target acquisition tasks but also for recovering from unusual attitudes.
Impact
All new technologies that are incorporated into the cockpit need to be thoroughly tested to determine the applicability of the technology, not only for its intended
FIGURE 1.23 Standard head-up display (HUD) symbology.
purpose but for all aspects of flight. Using an in-flight simulator like VISTA provided a realistic environment to conduct such technology testing. This type of simulation can lower the risk of integrating new technologies by conducting proof-of-concept testing and solving integration issues. Research such as that reported here helped accelerate the application of HMDs, both for retrofitting older aircraft and for incorporation into newer fighter aircraft, to enhance mission performance.
University of Iowa Operator Performance Laboratory Aero L-29 Delfin Jet
To support both civilian and military aircraft cockpit research, the University of Iowa Operator Performance Laboratory (OPL) has two L-29 single-engine, tandem-seat jet trainers (Figure 1.24), as well as Mi-2 helicopters. The Aero L-29 Delfín is a military jet trainer developed and manufactured by the Czechoslovakian manufacturer Aero Vodochody. The jets are equipped with oxygen systems, g suits, and pressurization, making them capable of performing highly dynamic maneuvers. The front seat is used by the safety pilot, with the evaluation pilot in the rear seat. Typically, the rear seat is covered with an opaque cloth hood, occluding the outside view to create highly realistic nighttime conditions. With this configuration, flight in the L-29’s backseat simulates single-seat 5th-generation flight with an HMD. The aircraft can also flexibly serve as a Live, Virtual, Constructive (LVC) simulator in the hangar, which is particularly useful for training in preparation for flight tests. The aircraft
FIGURE 1.24 University of Iowa Operator Performance Laboratory Aero L-29 Delfin Jet. (Photo courtesy of University of Iowa Operator Performance Laboratory.)
and supporting systems support both air-to-air and air-to-ground simulations. The aircraft are also equipped for human performance state assessment, including the collection of physiological data. The rear seat is equipped with a high-brightness, 21 in. touch screen display, as well as an F-35 cueing helmet that includes a head tracker and a binocular eye tracker. The technology is state-of-the-art, including a synthetic vision system. Controls include a stick and throttle with numerous system switches to support HOTAS.
Representative Research
Objective: Although HMD technology has been shown to be advantageous in that it presents critical visual information (such as primary flight reference symbology, targeting information, and imagery) regardless of whether the pilot is looking directly forward (on-axis) or elsewhere (off-axis or off-boresight [OBS]) (see VISTA Representative Research above), it has also been shown to cause pilots to experience spatial disorientation when switching between on- and off-axis views. This research aimed to aid pilots in maintaining attitude awareness, especially when looking away from the HUD to acquire information off-axis, for example by making ownship attitude information accessible within the HMD field-of-view. Three different OBS HMD visual symbology sets were compared using operationally representative scenarios to determine which best prevents pilot loss of spatial orientation when off-axis tasks are performed (Schnell et al., 2017). Symbology Sets: All three symbology sets included a heading tape, but differed in how OBS HMD symbology was presented. The HMD’s Current Display Format (CDF; Figure 1.25) was one of the three sets. It did not provide attitude (climb/dive/roll) information; only speed, head heading, and altitude were shown.
To obtain aircraft attitude information, pilots were required to either interpret the rate of change in speed, heading, and altitude readouts or look in the forward direction at the HMD combiner to employ a virtual HUD, because there was not a separate HUD display
FIGURE 1.25 HMD’s Current Display Format (CDF). (US Air Force graphic and photo)
FIGURE 1.26 HMD Distributed Flight Reference (DFR) Format.
surface. Thus, the “vHUD” symbology was rendered in the HMD whenever the pilot was looking forward. When the head tracker sensed the pilot was looking off boresight, the vHUD symbology disappeared. The second symbology set, HMD Distributed Flight Reference (DFR; Figure 1.26), added aircraft attitude information in the upper right corner of the HMD FOV. It included a forward-referenced HMD stabilized aircraft symbol with a movable earth reference circle that rotated around the symbol center reflecting bank angle. It was either “opened” or “closed” with regard to flightpath angle (climb/dive vs. pitch). The earth reference circle had two end-tick marks that referenced the nearest horizon on each side. Thus, for a flight path that pointed straight up, or nearly so, the
earth circle perimeter was fully “open” to denote the absence of ground and thus was not drawn, although the end-tick marks remained. This indicated that the aircraft was in a climb attitude. For a flight path that pointed more straight down, the earth reference was closer to a full circle, indicating that there was no sky left around the forward direction (only ground) and that the aircraft was in a dive attitude. To summarize, the end-tick marks indicated the direction of the nearest horizon. In a level flightpath attitude, the earth reference was a semi-circle (the upper or sky half “open” and ground half drawn). A full description of the symbology mechanization is available in Geiselman (1999). The third symbology set for this study was the same symbology set that provided pilots with the best performance in the VISTA flight test reported previously: the non-distributed flight reference (NDFR; see Figure 1.27). Information included two-digit aircraft heading in the center circle and airspeed and altitude on the left and right wing, respectively, of the aircraft symbol. Procedures: Ten evaluation pilots viewed the image on the HMD’s combiner as a binocular, fully overlapped monochrome green picture of 1,280 × 1,024 pixels on a 40 (horizontal) × 30 (vertical) degree FOV. The OBS symbology sets were shown when the HMD rotated more than 15 degrees laterally or tilted more than 25 degrees vertically from the aircraft forward center line. To determine the benefits of candidate symbology sets to prevent loss of spatial orientation, three sorties were developed that were representative of SD-prone conditions identified in prior 5th-generation fighter aircraft incidents. During the three sorties, each evaluation pilot followed instructions of an experimenter playing the role of the Joint Terminal Attack Controller (JTAC). The HMD presented a world-stabilized diamond symbol as well as an azimuth steering line superimposed over the general target area.
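The head-tracker switching rule described above (OBS symbology beyond 15 degrees laterally or 25 degrees vertically, virtual HUD otherwise) can be sketched as a simple threshold test. This is a minimal illustration, not the mechanization used in the study: the function names are hypothetical, and it assumes that head angles are measured relative to the aircraft forward center line and that the limits apply symmetrically to left/right and up/down.

```python
# Thresholds quoted in the text, in degrees from the aircraft forward center line.
LATERAL_LIMIT_DEG = 15.0
VERTICAL_LIMIT_DEG = 25.0


def is_off_boresight(head_azimuth_deg, head_elevation_deg):
    """True when the head is rotated beyond either limit.

    Assumption: limits are symmetric, so only the magnitude of each angle matters.
    """
    return (
        abs(head_azimuth_deg) > LATERAL_LIMIT_DEG
        or abs(head_elevation_deg) > VERTICAL_LIMIT_DEG
    )


def select_symbology(head_azimuth_deg, head_elevation_deg, obs_format="NDFR"):
    """Return the OBS format under test when off-boresight, else the virtual HUD."""
    if is_off_boresight(head_azimuth_deg, head_elevation_deg):
        return obs_format  # e.g., "CDF", "DFR", or "NDFR"
    return "vHUD"


if __name__ == "__main__":
    print(select_symbology(0.0, 0.0))           # looking forward
    print(select_symbology(-40.0, 5.0, "DFR"))  # looking well to the left
```

A fielded implementation would likely add hysteresis or filtering around the thresholds so that the symbology does not flicker when the head hovers near a limit; the source does not say whether the study did so.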
By employing a “talk-on” between the JTAC and pilot, target information was provided followed by a standardized nine item (nine-line) brief that the pilot noted. During the talk-on, references were made to features available on an
FIGURE 1.27 HMD Non-Distributed Flight Reference (NDFR) Format.
onboard map, starting with more prominent visible features (rivers and highways) and progressing to more detailed items. The pilot was also given altitude block assignments that became increasingly restrictive across scenarios. This “talk-on” procedure required the pilot to look OBS for long and frequent periods of time and was designed to require the pilot to divide attention between airmanship and weaponeering, visually identifying target features in the virtual world simulated with a Distributed Aperture System (DAS) using a head-tracked graphics processor. The pilots’ tasks were to maneuver to visually acquire each target, identify it with the HMD DAS, and employ an Mk-82 bomb using a Continuous Computed Impact Point (CCIP) delivery method. The pilots also provided immediate time-on-target for either a show-of-force or a bomb-on-target delivery and provided subjective workload and situation awareness ratings. Results: Detailed results are available in Schnell et al. (2017). One key result pertains to mean talk-on duration (talk-on start to talk-on complete), since the less time spent during the talk-on process, the sooner the weapon can be deployed. Also, recall that the talk-on process requires a significant amount of OBS time. Therefore, this metric tests the efficacy of the OBS symbology for completing the task of delivering a weapon to a visually acquired target while not contributing to pilot SD. Results showed that the DFR and the NDFR symbology enabled study participants to accomplish the talk-on process with fewer OBS head movements of longer duration when compared to the CDF. This was also evident in the heat-maps of head tracker data. Conclusion: The DFR and the NDFR symbology sets both allowed pilots to successfully perform their targeting task and maintain attitude awareness.
Based on this study and the previous study describing additional testing of the NDFR, the NDFR has been shown to be an effective symbology set for pilots using HMDs to perform a variety of tasks and should be considered as a candidate for transition to operational use.
Impact
The use of an in-flight simulator was very effective in evaluating the different symbology sets in a more realistic, demanding flight environment, ensuring that they were not also causing new challenges for pilots (e.g., SD). It also provided guidance for subsequent research. For example, follow-on research is planned to examine whether a Spatial Audio Horizon Cueing system would augment or improve upon different candidate HMD symbology sets (Geiselman et al., 2017; Schnell, 2019). In-flight simulators can also inform the design of motion-based simulators, as mentioned under the DRD simulator description in the “Motion-Based Simulators” section. In turn, data collected in the L-29 can be used to validate the DRD’s ability to accurately and economically reproduce the forces of flight in a highly controlled laboratory setting.
MULTISENSORY DISPLAYS AND CONTROLS
As the preceding sections illustrate, both display and control technologies available for use in simulation research have significantly advanced over the past 50
years. This is illustrated, for example, by the shift from single-function to multifunction displays and controls and the shift from HUDs to HMDs. Research using simulators has also provided valuable data on whether advanced/novel technologies are applicable for aviation. For example, the use of stereoscopic three-dimensional displays did not result in improved flight performance (McIntire et al., 2014). Similarly, simulations have shown that many novel multisensory display and control approaches, popular in the mid-1990s, need further technological maturity before being considered candidates for aviation application. Some are briefly described here. Multisensory displays can convey important information without redirecting the operators’ gaze point and provide relief when operators are visually saturated. One alternative display is the use of localized audio (commonly referred to as 3-D audio) that consists of tones or cues presented via headphones so that they appear at fixed positions in the external environment, regardless of the listener’s head position. The tone placement can vary in azimuth, elevation, and range. Research in the DES (see “Motion-Based Simulators” section) showed that pilots’ ability to localize virtual auditory tones is relatively unchanged up to approximately 5.5 +Gz, but begins to deteriorate at 7.0 +Gz (Nelson et al., 1998). Candidate aviation applications include enhancing pilots’ spatial orientation (Endsley & Rosiles, 1995; Geiselman et al., 2017), redirecting gaze (Perrott et al., 1996), reducing target search and detection time (Simpson et al., 2002), and flying in degraded visual environments such as fog (Milam et al., 2019). Tactile displays are another novel display example (van Erp, 2002). Wrist-worn tactors can alert pilots of automation interventions (Sarter, 2000), cue system faults (Calhoun et al., 2002), and alert pilots to weather severity (Rodriguez-Paras et al., 2021).
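To make the idea of a world-fixed localized audio cue more concrete, the sketch below computes where a tone must be rendered relative to the listener’s head so that it appears to stay at a fixed azimuth in the external environment as the head turns. It is a deliberately simplified illustration under stated assumptions: only head yaw is considered (pitch, roll, elevation, and range are ignored), angles are in degrees, and the function names are hypothetical rather than drawn from any of the cited systems.

```python
def wrap_degrees(angle_deg):
    """Wrap an angle to the interval [-180, 180) degrees."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def head_relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Azimuth at which to render a world-fixed cue, relative to the nose of the head.

    Positive values are to the listener's right. Assumption: the source azimuth
    and head yaw share the same reference (e.g., both relative to north).
    """
    return wrap_degrees(source_azimuth_deg - head_yaw_deg)


if __name__ == "__main__":
    # A cue fixed at 90 degrees stays put in the world as the head turns toward it.
    for yaw in (0.0, 45.0, 90.0):
        print(yaw, head_relative_azimuth(90.0, yaw))
```

The wrap step matters near the 0/360 boundary: a cue at 10 degrees heard by a listener facing 350 degrees should be rendered 20 degrees to the right, not 340 degrees to the left.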
A thigh-mounted vibrotactile display can provide critical directional orientation in the vertical plane (Salzer et al., 2011). Torso vests containing arrays of tactors that vibrate to convey various types of information have been tested (e.g., to show attitude information; Rupert et al., 2016). Also, a combination of multisensory approaches can be more effective than singular modalities, such as using both 3-D audio and tactile cues to help operators counter spatial disorientation (Brill et al., 2015). The interest in multisensory controls for aviation stems from the concern that head-down glances can cause disorientation and vertigo, as well as distract pilots from attending to primary flight tasks. Several hands-free, head-up control technologies have been evaluated, two of which have merited aviation applications. Control inputs based on head-aiming are one example (Highland et al., 2021), primarily used in conjunction with head-mounted displays (see “SIRUS” and “University of Iowa Operator Performance Laboratory” sections). Also, speech-based input (see sections on “MAGIC”, NASA OV-10, and “VISTA”) more recently was enabled in the F-35, but its use has been problematic due to faulty input recognition in extreme flight conditions (Hush-kit aviation magazine, 2021). Examples of other multisensory controls include inputs based on (1) eye line of sight (complicated by varying lighting conditions and extreme look angles; Calhoun & Janson, 1991), (2) lip movement gestures measured with ultrasonic sensors (Jennings & Ruck, 1995), (3) eyebrow or clenched jaw inputs detected with electromyographic signals (Junker et al., 1995), and (4)
brain electrical activity (e.g., electroencephalographic signals detecting luminance modulation of different control station options; Middendorf et al., 1999; Nasman et al., 1997). Provided that the technological limitations of such novel approaches can be overcome, their application is likely to involve multimodal control. For instance, speech recognition may improve if ultrasonic lip motion is also measured (especially for noisy environments; Jennings & Ruck, 1995) or if eye line of sight is tracked to focus the vocabulary search on the most likely commands associated with the gaze point.
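The gaze-focused vocabulary search mentioned above can be sketched as a filter applied to a recognizer’s candidate scores. Everything in this example is hypothetical: the gaze regions, the command lists, and the scores are invented for illustration, and no particular speech recognition engine or eye tracker is assumed.

```python
# Hypothetical mapping from gaze regions to the commands that make sense there.
COMMANDS_BY_REGION = {
    "radio_panel": ["tune", "squelch", "select channel"],
    "map_display": ["zoom in", "zoom out", "center map"],
}


def recognize(scores, gaze_region=None):
    """Pick the best-scoring command, restricted to the gazed region when possible.

    `scores` maps each candidate command to a recognizer confidence (assumed to be
    supplied by some upstream speech recognizer). If the gaze region is unknown, or
    none of its commands were scored, the full candidate set is used as a fallback.
    """
    allowed = COMMANDS_BY_REGION.get(gaze_region)
    if allowed:
        restricted = {cmd: s for cmd, s in scores.items() if cmd in allowed}
        if restricted:
            scores = restricted
    return max(scores, key=scores.get)


if __name__ == "__main__":
    noisy_scores = {"zoom in": 0.48, "tune": 0.52}  # made-up confidences
    print(recognize(noisy_scores))                  # speech alone
    print(recognize(noisy_scores, "map_display"))   # speech plus gaze
```

In this made-up case, speech alone would select the wrong command by a narrow margin, whereas knowing that the operator is looking at the map display resolves the ambiguity.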
DISPLAYS AND CONTROLS TO SUPPORT HUMAN–MACHINE TEAMING
As noted earlier, there have been a number of changes in controls and displays over the past 50 years in aviation research simulators, primarily reflecting advancements in specific interface technologies. More recently, interface designs are addressing advances in intelligent-agent support technologies (e.g., enabled with artificial intelligence, machine learning, etc.). This chapter, for example, describes control/display interface designs that team pilots with intelligent agents to aid decision-making and task completion for Sense and Avoid de-conflictions and cyberattacks (see section on “VSCS”), as well as resource allocation, course of action analysis, and operation assessment (see section on “IMPACT”). Human–machine teaming, in fact, is now considered essential for dynamic, complex, and uncertain military operations because the operator’s decision-making is limited by available information, cognitive processing power, and available time (Chen et al., 2018; Voshell et al., 2016; USAF, 2019). By augmenting the human’s pattern recognition ability and intuition/flexibility with intelligent systems’ computation/reasoning, the unique capabilities of both human and machine team members can be harnessed for more informed, robust, and timely operations. Taylor (1993) referred to this teaming as “cooperative functioning” in that the human and machine members should interact at multiple levels and on all tasks. Calhoun et al. (2016) illustrated this goal in two scenarios that contrast control and display approaches that are collaborative versus supervisory (when humans have authority over the machine). Details are also included on design challenges for supporting collaboration and how to provide intent/decision support and information fusion required for human–machine interactions.
Requirements have also been identified by examining the capabilities and limitations of intelligent agents from the standpoint of human teams (Joe et al., 2014) and pinpointing challenges to establishing common ground and coordination across team players (Klein et al., 2004). For instance, human–machine teams need bi-directional communication that enables the operator to convey mission objectives and constraints to the machine agents and enables the intelligent agent(s) to explain the basis and details of their decision guidance and actions (i.e., provide transparency on their reasoning at an appropriate level of detail; Chen et al., 2017). Ideally, a give-and-take human–machine dialogue can be mechanized that is characteristic of human–human teams (Bradshaw et al., 2004).
Driven by advancements in artificial intelligence, more and more systems will include some form of intelligent aiding. This will result in changes to the role and tasking of the human operators. Research simulators will be a critical component for evaluating new human–machine teaming solutions with candidate control and display technologies to determine their effectiveness in providing mutual support and assistance. A potential source to inform the design of controls and displays in aviation research simulators is a review of test environments and experimental designs addressing human–machine teaming (O’Neill et al., 2020). Ideally, the simulators should support a multi-task environment by which the level of automation for each task can be dynamic and context dependent. In this manner, adaptable automation (Calhoun, 2021) can be applied. This will enable the evaluation of candidate control and display interfaces on their effectiveness in establishing and updating working agreements that define each human and machine member’s responsibilities for completing task-related functions, as well as coordinate courses of action, communicate pertinent information, track task completion/system status, and support shared situation awareness.
SUMMARY
Prior to the 1970s, controls and displays in aircraft research simulators tended to replicate the operational approach; each control and display device performed a dedicated function. However, with the advent of E-O devices and computer-based systems, a crew station revolution began and interfaces were implemented that integrated multiple functions onto fewer control and display devices. Multifunction displays presented information that used to be on one or more analog displays. Adding switches to the joystick and throttle enabled pilots to control multiple functions while keeping their hands on the throttle and stick. Moreover, the number of functions that could be controlled by a single multifunction keyboard was only limited by the workload involved in navigating multiple pages to access functions several levels deep in the menu structure. Still, these controls and displays required some “head-down” time in the cockpit. In the 1980s and 1990s, research simulation shifted to exploiting this digital technology and exploring how it could be used in innovative ways. Rather than merely duplicating conventional formats on multifunction displays, researchers explored the use of color, pictorial representation, and 3-D displays for facilitating information acquisition. In addition, head-mounted displays driven by head position had the advantage of allowing pilots to control the aircraft while keeping their “eyes out of the cockpit.” Simpler, less-cumbersome control technologies were also evaluated, including the use of speech commands and head and/or eye gaze. More recently, research started to evaluate controls and displays that will tap into the pilots’ biosignals to an even greater extent. Displays considered included tactile displays and spatial audio displays. Novel controls included interfaces based on gesture input and biopotential signals from muscles or the brain. Each of these technologies promoted head-up, hands-free control and display. However, aside from
56
Human Factors in Simulation and Training
voice-based systems, these technologies are still immature and have either limited bandwidth or integration hurdles to be solved with further development and simulation evaluation. The 21st century marked a shift to a human-centered control/display design approach. The pilot’s requirements were paramount, and as a result, attention shifted to what control/display options best fulfilled them. Multimodal approaches were also stressed, in order not to overload any single modality of the pilot. Along with this increased focus on the pilot, the role that artificial intelligence advancements can best assist the pilot and respective operations is now a key focus, examining, for example, dynamic function allocation across human and machine team members and what authorization procedures should be in effect. Although this chapter’s treatment of the evolution of controls and displays focused on research simulators for Air Force cockpits and UAV operator consoles, similar trends are evident in other domains such as medicine, power plants, and transportation in terms of improving how information is controlled and displayed in any complex workstation. Additionally, this chapter illustrates the importance of research simulation. Iterative use of a representative task environment in research should be employed to optimize the implementation parameters of candidate controls and displays. However, it is important to also evaluate candidate interfaces in highfidelity simulators to be more similar to operational environments. Ultimately, testing in the actual application environment is ideal because that enables true validation of interface designs. Regardless of the testing environment, simulators will play a key role in the advancement of workstation control and display technologies for efficient operator–autonomy teams.
ACKNOWLEDGMENT
The authors would like to acknowledge Dr. John M. Reising, who spearheaded much of the fixed-base simulator research reported herein. It is due to his expertise in this area that this chapter was initiated, and we are grateful for both his significant contributions to it and his valuable insight.
REFERENCES
Aretz, A. J., & Calhoun, G. L. (1982). Computer generated pictorial stores management displays for fighter aircraft. Proceedings of the Human Factors Society 26th Annual Meeting, 455–459.
Barry, T. P., Liggett, K. K., Williamson, D. T., & Reising, J. M. (1992). Enhanced recognition accuracy with the simultaneous use of three automated speech recognition systems. Proceedings of the Human Factors Society 36th Annual Meeting, 288–292.
Barry, T. P., Williamson, D. T., & Snyder, R. A. (2006). Evaluating the Dynaspeak Speech Recognition System for Use in the Joint Strike Fighter (JSF) Using VISAT Flight Test Recordings. Technical Report. Available at 711 Human Performance Wing, 45433–7022. Wright-Patterson Air Force Base, Dayton, OH: Human Effectiveness Directorate.
Barthelemy, K. K., Reising, J. M., & Hartsock, D. C. (1991). Target designation in a perspective view, 3-D map using a joystick, hand tracker, or voice. Proceedings of the Human Factors Society 35th Annual Meeting, 97–101.
Bartik, J., Darrah, S., Moulton, S., & Lemasters, L. (2017). Detect and Avoid (DAA) Automation Maneuver Study (AFRL-RH-WP-TR-2017-0018). Wright-Patterson Air Force Base, Dayton, OH: Air Force Research Laboratory.
Bartik, J., Rowe, A., Draper, M., Frost, E., et al. (2020). Autonomy Strategic Challenge (ASC) Allied IMPACT Final Report. TTCP Technical Report (ASC-01-2020).
Bassett, P., & Lyman, J. (1940, July). The flightray, a multiple indicator. Sperryscope, 9(3), 10.
Behymer, K., Rothwell, C., Ruff, H., Patzek, M., Calhoun, G., Draper, M., Douglass, S., Kingston, D., & Lange, D. (2017). Initial Evaluation of the Intelligent Multi-UxV Planner with Adaptive Collaborative/Control Technologies (IMPACT). Technical Report AFRL-RH-WP-TR-2017-0011. Wright-Patterson Air Force Base, Dayton, OH: Air Force Research Laboratory.
Bradshaw, J. M., Acquisti, A., Allen, J., Breedy, M. R., Bunch, L., Chambers, N., Galescu, L., et al. (2004). Teamwork-centered autonomy for extended human-agent interaction in space applications. Artificial Intelligence and Human-Robot Interaction (AAAI) Spring Symposium Series, 136–140.
Brill, J., Lawson, B. D., & Rupert, A. H. (2015). Audiotactile aids for improving pilot situation awareness. 18th International Symposium on Aviation Psychology, 13–18. https://corescholar.libraries.wright.edu/isap_2015/105
Brooke, J. (1996). SUS: A ‘quick and dirty’ usability scale. In P. Jordan, B. Thomas, I. McClelland, & B. Weerdmeester (Eds.), Usability Evaluation in Industry. Bristol, PA: Taylor & Francis Group (pp. 4–7).
Calhoun, G. L. (1978). Control logic design criteria for multifunction switching devices. Proceedings of the Human Factors Society 22nd Annual Meeting, 383–387.
Calhoun, G. (2021). Adaptable (not adaptive) automation: The forefront of human–automation teaming. Human Factors. https://doi.org/10.1177/00187208211037457
Calhoun, G. L., Arbak, C. J., & Janson, W. P. (1985). Eye and head response to an attention cue in a dual task paradigm. Proceedings of the Human Factors Society 29th Annual Meeting, 1125–1129.
Calhoun, G. L., & Draper, M. H. (2010). Unmanned aerial vehicles: Enhancing video display utility with synthetic vision technology. In M. Barnes & F. Jentsch (Eds.), Human-Robot Interactions in Future Military Operations (Chapter 13, pp. 229–248). London, UK: Ashgate.
Calhoun, G. L., Draper, M. H., & Ruff, H. A. (2003). Multi-sensory interface concepts for teleoperated unmanned air vehicle (UAV) systems. Proceedings of the 12th International Symposium on Aviation Psychology, 190–195.
Calhoun, G. L., Draper, M. H., Ruff, H. A., & Fontejon, J. V. (2002). Utility of a tactile display for cueing faults. Proceedings of the Human Factors and Ergonomics Society 46th Annual Meeting, 2144–2148.
Calhoun, G. L., Draper, M. H., Ruff, H. A., Nelson, J., & Lefebvre, A. (2006). Simulation assessment of synthetic vision system concepts for UAV operations. SPIE Defense & Security Symposium: Enhanced & Synthetic Vision (Vol. 6226, 62260E-1–62260E-12). Orlando, FL.
Calhoun, G. L., Fontejon, J. V., Draper, M. H., Ruff, H. A., & Guilfoos, B. (2004). Tactile versus aural redundant alert cues for UAV control applications. Proceedings of the Human Factors and Ergonomic Society 48th Annual Meeting (Vol. 48, No. 1, 137–142).
Calhoun, G. L., Goodrich, M. A., Dougherty, J. R., & Adams, J. A. (2016). Human-autonomy collaboration and coordination toward multi-RPA missions. In N. Cooke, L. Rowe, & W. Bennett (Eds.), Remotely Piloted Aircraft: A Human Systems Integration Perspective (Chapter 5, pp. 101–136). Hoboken, NJ: Wiley.
Calhoun, G. L., & Herron, E. L. (1981). Computer generated cockpit engine displays. Proceedings of the Human Factors Society 25th Annual Meeting, 127–131.
Calhoun, G. L., & Herron, E. L. (1982). Pilot-machine interface considerations for advanced avionics systems. AGARD 43rd Symposium of the Avionics Panel on Advanced Avionics and the Military Aircraft Man/Machine Interface (Vol. 24, 1–7).
Calhoun, G. L., Herron, E. L., Reising, J. M., & Bateman, R. P. (1980). Evaluation of Factors Unique to Multifunction Controls/Displays Devices. Report No. AFWAL-TR-80-3131. Wright-Patterson Air Force Base, Dayton, OH: Air Force Wright Aeronautical Laboratories.
Calhoun, G. L., & Janson, W. P. (1990). Eye and head response as indicators of attention cue effectiveness. Proceedings of the Human Factors Society 34th Annual Meeting, 1–5.
Calhoun, G. L., & Janson, W. P. (1991). Eye control interface considerations for aircrew station design. Sixth European Conference on Eye Movements. Leuven, Belgium: University of Leuven.
Calhoun, G. L., Janson, W. P., & Arbak, C. J. (1986). Use of eye control to select switches. Proceedings of the Human Factors Society 30th Annual Meeting, 154–158.
Calhoun, G. L., Ruff, H. A., Behymer, K. J., & Frost, E. M. (2018). Human-autonomy teaming interface design considerations for multi-unmanned vehicle control. Theoretical Issues in Ergonomics Science, 19(3), 321–352.
Calhoun, G. L., Ruff, H. A., Behymer, K. J., & Rothwell, C. D. (2017). Evaluation of interface modality for control of multiple unmanned vehicles. Human Computer Interaction International (HCII), International Conference on Engineering Psychology and Cognitive Ergonomics, 15–34. Cham: Springer.
Chelette, T., Allnutt, R., Tripp, L., Esken, R., Bolia, S., & Post, D. (1999). Polychromatic percepts during hypergravity. Journal of Gravitational Physiology, 6(1), 13–4.
Chen, J. Y. C. (2018). Human-autonomy teaming in military settings. Theoretical Issues in Ergonomics Science, 19, 255–258. https://doi.org/10.1080/1463922X.2017.1397229
Chen, J. Y. C., Lakhmani, S. G., Stowers, K., Selkowitz, A. R., Wright, J. L., & Barnes, M. (2017). Situation awareness-based agent transparency and human-autonomy teaming effectiveness. Theoretical Issues in Ergonomics Science, 19(3), 259–282.
Christ, R. E. (1976). Analysis of color and its effectiveness. Paper Presented at the Naval Air Test Center Third Advanced Aircrew Display Symposium. Patuxent River, MD.
Draper, M., Calhoun, G., Nelson, J., & Ruff, H. (2006). Evaluation of Synthetic Vision Overlay Concepts for UAV Sensor Operations: Landmark cues and Picture-in-Picture. Technical Report AFRL-HE-WP-TR-2006-0038. Wright-Patterson Air Force Base, Dayton, OH: Air Force Research Laboratory.
Draper, M. H., Calhoun, G. L., Ruff, H. A., Williamson, D. T., & Barry, T. P. (2003). Manual versus speech input for unmanned aerial vehicle control station operations. Proceedings of the Human Factors and Ergonomics Society 47th Annual Meeting, 109–113.
Draper, M. H., Geiselman, E. E., Lu, L. G., Roe, M. M., & Haas, M. W. (2000). Display concepts supporting crew communication of target location in unmanned air vehicles. Proceedings of the Human Factors and Ergonomics Society 44th Annual Meeting, 385–388.
Draper, M., Rowe, A., Douglass, S., Calhoun, G., Spriggs, S., Kingston, D., ... & Reeder, J. (2018). Realizing Autonomy via Intelligent Hybrid Control: Adaptable Autonomy for Achieving UxV RSTA Team Decision Superiority (Also Known as Intelligent Multi-UxV Planner with Adaptive Collaborative/Control Technologies (IMPACT)) (AFRL-RH-WP-TR-2018-0005). Wright-Patterson Air Force Base, Dayton, OH: Air Force Research Laboratory.
Draper, M. H., Ruff, H. A., Fontejon, J. V., & Napier, S. (2002). The effects of head-coupled control and head-mounted displays (HMDs) on large-area search tasks. Proceedings of the Human Factors and Ergonomics Society 46th Annual Meeting, 2139–2143.
Draper, M. H., Ruff, H. A., Repperger, D. W., & Lu, L. G. (2000). Multi-sensory interface concepts supporting turbulence detection by UAV controllers. Proceedings of the Human Performance, Situation Awareness and Automation Conference, 107–112.
Endsley, M. R., & Rosiles, S. A. (1995). Auditory localization for spatial orientation. Journal of Vestibular Research, 5(6), 473–485.
Feitshans, G. L., Rowe, A. J., Davis, J. E., Holland, M., & Berger, L. (2008). Vigilant Spirit Control Station (VSCS): The face of COUNTER. Proceedings from American Institute of Aeronautics and Astronautics (AIAA). AFRL-RH-WP-TR-2012-0015. Wright-Patterson Air Force Base, Dayton, OH: Air Force Research Laboratory.
Frost, E., Calhoun, G., Ruff, H., Bartik, J., Behymer, K., Spriggs, S., & Buchanan, A. (2019). Collaboration interface supporting human-autonomy teaming for unmanned vehicle management. International Symposium of Aviation Psychology.
Gawron, V. J., & Bailey, R. E. (1995). In-flight evaluation of the C-141 all-glass cockpit. Proceedings of the 8th International Symposium on Aviation Psychology, 80–85.
Geiselman, E. (1999). Practical considerations for fixed wing helmet-mounted display symbology design. Proceedings of the Human Factors and Ergonomics Society 43rd Annual Meeting. Santa Monica, CA.
Geiselman, E. E., Williams, H. P., & Schnell, T. (2017). Use of a live, virtual, constructive simulation approach to evaluate visual symbology on a helmet-mounted display for spatial disorientation prevention. Proceedings of IMAGE Society 2017 Conference.
Graham, S., Chen, W., De Luca, J., Kay, J., Deschenes, M., Weingarten, N., Raska, V., & Lee, X. (2011). Multiple intruder autonomous avoidance flight test. Infotech@Aerospace 2011. St. Louis, MO, March 29–31.
Herron, E. L. (1978). Two types of system control with an interactive display device. Proceedings of the 1978 International Symposium and Exhibition of the Society for Information Display, 84–85.
Highland, P., Harp, D., Schnell, T., Geiselman, E., & Havig, P. (2021). The customer is always right… Towards rhino pointing and eye tracking interfaces for combat aviators. 48th International Symposium on Aviation Psychology, 86.
Hush-kit Aviation magazine. (2021). https://hushkit.net/2021/01/21/what-is-good-and-bad-about-the-f-35-cockpit-a-panthers-pilots-guide-to-modern-cockpits/
Jenkins, J. C., Thurling, A. J., Havig, P. R., & Geiselman, E. E. (2002). Flight test evaluation of the non-distributed flight reference off-boresight helmet-mounted display symbology. In R. J. Lewandowski, L. A. Haworth, & H. J. Girolamo (Eds.), Proceedings of SPIE Helmet-Mounted Displays VII (Vol. 4711, pp. 341–355). Los Angeles, CA: SAGE Publishing.
Jennings, D. L., & Ruck, D. W. (1995). Enhancing automatic speech recognition with an ultrasonic lip motion detector. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 868–871.
Joe, J., O’Hara, J., Medema, H., & Oxstrand, J. (2014). Identifying requirements for effective human-automation teamwork. In Proceedings of the 12th International Conference on Probabilistic Safety Assessment and Management (PSAM 12, Paper# 371), (INL/CON-14-31340).
Junker, A., Berg, C., Schneider, P., & McMillan, G. (1995). Evaluation of the Cyberlink Interface as an Alternative Human Operator Controller. US Air Force Technical Report AL/CF-TR-1995-0011. Wright-Patterson Air Force Base, Dayton, OH: Armstrong Laboratory.
Klein, G., Woods, D. D., Bradshaw, J. M., Hoffman, R. R., & Feltovich, P. J. (2004). Ten challenges for making automation a “team player” in joint human-agent activity. IEEE Intelligent Systems, 19(6), 91–95.
Kopala, C. J. (1979). The use of color-coded symbols in a highly dense situation display. Proceedings of the 23rd Human Factors Society Annual Meeting, 397–401.
Krebs, M. J., Wolf, J. D., & Sandvig, J. H. (1978). Color Display Design Guide. Report ONR-CR213-136-2F. Minneapolis, MN: Office of Naval Research.
Liggett, K. K., Reising, J. M., Beam, D. J., & Hartsock, D. C. (1993). The use of aiding techniques and continuous cursor controllers to designate targets in 3-D space. Proceedings of the Human Factors Society 37th Annual Meeting, 11–15.
Liggett, K. K., Reising, J. M., & Hartsock, D. C. (1992). The use of a background attitude indicator to recover from unusual attitudes. Proceedings of the Human Factors Society 36th Annual Meeting, 43–47.
Liggett, K., Venero, P., & Thomas, G. (2017). An Investigation into the information requirements for remotely piloted aircraft crew when dealing with cyber threats. Proceedings of the 19th International Symposium on Aviation Psychology, 317–322.
McIntire, J. P., Havig, P. R., & Geiselman, E. E. (2014). Stereoscopic 3D displays and human performance: A comprehensive review. Displays, 35(1), 18–26.
Middendorf, M. S., McMillan, G. R., Calhoun, G. L., & Jones, K. S. (1999). EEG-based control of virtual buttons. Proceedings of the 43rd Annual Meeting of the Human Factors and Ergonomics Society, 942–946.
Milam, L., Akins, E., Simpson, B., Williams, H., & Jones, H. (2019). Techniques to Explore Spatial Audio Cues for Aiding Helicopter Navigation in Degraded Visual Environments. Army Aeromedical Research Lab. Fort Rucker, United States.
Nasman, V. T., Calhoun, G. L., & McMillan, G. R. (1997). Brain-actuated control and HMDs. In J. Melzer & K. Moffitt (Eds.), Head-Mounted Displays: Designing for the User (285–310). New York: McGraw-Hill.
Nelson, W. T., Bolia, R. S., McKinley, R. L., Chelette, T. L., Tripp, L. D., & Esken, R. L. (1998). Localization of virtual auditory cues in a high +Gz environment. Proceedings of the Human Factors and Ergonomics Society 42nd Annual Meeting, 97–101.
Nicklas, D. (1958). A History of Aircraft Cockpit Instrumentation 1903–1946. Technical Report No. 57-301. Wright-Patterson Air Force Base, Dayton, OH: Wright Air Development Center.
O’Neill, T., McNeese, N., Barron, A., & Schelble, B. (2020). Human–autonomy teaming: A review and analysis of the empirical literature. Human Factors. https://doi.org/10.1177/0018720820960865
Osga, G. A. (1991). Using enlarged target area and constant visual feedback to aid cursor pointing tasks. Proceedings of the Human Factors Society 35th Annual Meeting, 369–373.
Perrott, D. R., Cisneros, J., McKinley, R. L., & D’Angelo, W. R. (1996). Aurally aided visual search under virtual and free-field listening conditions. Human Factors, 38, 702–715.
Reising, J. M. (1977). Multifunction keyboard configurations for single-seat, air-to-ground fighter cockpits. Proceedings of the Human Factors Society 21st Annual Meeting, 363–366.
Reising, J. M., & Calhoun, G. L. (1982). Color display formats in the cockpit: Who needs them? Proceedings of the Human Factors Society 26th Annual Meeting, 446–450.
Reising, J. M., Liggett, K. K., & Hartsock, D. C. (1995). New flight display formats. Proceedings of the 8th International Symposium on Aviation Psychology, 86–91.
Reising, J. M., Liggett, K. K., Rate, C., & Hartsock, D. C. (1992). Three-dimensional designation using two control devices and an aiding technique. Proceedings of SPIE Electronics Imaging Symposium (Vol. 1669, 146–154).
Reising, J. M., Liggett, K. K., Solz, T. J., & Hartsock, D. C. (1995). A comparison of two head up display formats used to fly curved instrument approaches. Proceedings of the Human Factors and Ergonomics Society 39th Annual Meeting, 1–5.
Reising, J. M., Zenyuh, J. P., & Barthelemy, K. K. (1988). Head-up display symbology for unusual attitude recovery. Proceedings of the National Aerospace and Electronics Conference, 926–930.
Rodriguez-Paras, C., McKenzie, J. T., Choterungruengkorn, P., & Ferris, T. K. (2021). Severity-mapped vibrotactile cues to support interruption management with weather messaging in the general aviation cockpit. Atmosphere, 12, 341. https://doi.org/10.3390/atmos12030341
Rowe, A. J., Liggett, K., & Davis, J. E. (2009). Vigilant spirit control station: A research testbed for multi-UAS supervisory control interfaces. Proceedings of the Fifteenth International Symposium on Aviation Psychology, 287–292.
Rupert, A. H., Woo, G., Brill, J. C., & Lawson, B. (2016). Countermeasures for loss of situation awareness: Spatial orientation modeling to reduce mishaps. 2016 IEEE Aerospace Conference, 1–9. https://doi.org/10.1109/AERO.2016.7500725
Salzer, Y., Oron-Gilad, T., Ronen, A., & Parmet, Y. (2011). Vibrotactile “on-thigh” alerting system in the cockpit. Human Factors, 53(2), 118–131.
Sarter, N. B. (2000). The need for multisensory interfaces in support of effective attention allocation in highly dynamic event-driven domains: The case of cockpit automation. The International Journal of Aviation Psychology, 10(3), 231–245.
Schnell, T., Geiselman, E., Simpson, B., & Williams, H. (2019). Helmet mounted display format and spatial audio cueing flight test. 20th International Symposium on Aviation Psychology, 241.
Schnell, T., Reichlen, C., Reuter, C., Geiselman, E., Knox, J., & Williams, H. (2017). A comparison of helmet-mounted display symbologies during live flight operational tasks. 19th International Symposium on Aviation Psychology, 485.
Simpson, B. D., Bolia, R. S., McKinley, R. L., & Brungart, D. S. (2002). Sound localization with hearing protectors: Performance and head motion analysis in visual search task. Proceedings of the Human Factors and Ergonomics Society 46th Annual Meeting, 1618–1622.
Solz, T. J., Reising, J. M., Liggett, K. K., Lohmeyer, T., & Hartsock, D. C. (1994). The use of aiding techniques and varying depth volumes to designate targets in 3-D space. Proceedings of the Human Factors Society 38th Annual Meeting, 1–5.
Spengler, R. P. (1988). Advanced Fighter Cockpit. Technical Report ERR-FW-2936. Fort Worth, TX: General Dynamics.
Taylor, R. (1993). Human factors of mission planning systems: Theory and concepts. AGARD-LS-192 New Advances in Mission Planning and Rehearsal Systems, 2-1–2-22.
USAF Science and Technology Strategy: Strengthening USAF Science and Technology for 2030 and Beyond. (2019). Available at: https://www.airforcemag.com/PDF/DocumentFile/Documents/2019/USAF%20Science%20and%20Technology%20Strategy.pdf
van Erp, J. B. (2002). Guidelines for the use of vibro-tactile displays in human computer interaction. EuroHaptics 2002, 18–22.
Voshell, M., Tittle, J., & Roth, E. (2016, March). Multi-level human-autonomy teams for distributed mission management. 2016 AAAI Spring Symposium Series.
Williams, H. P., Horning, D. S., Etgen, C., & Powell, C. R. (2021). Effects of Cockpit Workload and Motion on Incidence of Spatial Disorientation in Simulated Flight. Technical Report NAMRU-D-21-034. Wright-Patterson Air Force Base, Dayton, OH: Naval Medical Research Unit-Dayton.
Williamson, D. T., Barry, T. P., & Liggett, K. K. (1996). Flight test performance optimization of ITT VRS-1290 speech recognition system. Audio Effectiveness in Aviation: Proceedings of the Aerospace Medical Panel Symposium.
Willich, W., & Edwards, R. E. (1975). Analysis and Flight Simulator Evaluation of an Advanced Fighter Cockpit Configuration. Technical Report AFAL-TR-75-36. Wright-Patterson Air Force Base, Dayton, OH: Air Force Avionics Laboratory.
Willshire, K. F., Latorella, K. A., & Glaab, L. J. (2000). NASA Langley crew systems contributions to aviation safety technology: Results of studies to date. Proceedings of the IEA 2000/HFES 2000 Congress, 4-376–4-379.
Zipoy, D. R., Premselaar, S. J., Gargett, R. E., Belyea, I. L., & Hall, H. J. (1970). Integrated Information Presentation and Control Systems Study, Vol. 1, System Development Concepts. Technical Report AFFDL-TR-70-79, Vol. 1. Wright-Patterson Air Force Base, Dayton, OH: Air Force Flight Dynamics Laboratory.
ACRONYMS/ABBREVIATIONS
2-D: two-dimensional
3-D: three-dimensional
ADI: attitude director indicator
AI: attitude indicator
ATAK: Android-Based Tactical Awareness Kits
ATM: automatic teller machine
AVO: Air Vehicle Operator
BAI: Background Attitude Indicator
CCD: charge-coupled device
CCIP: Continuous Computed Impact Point
CDF: Current Display Format
CDS: Control/Display System
COMM: communication
CRT: Cathode Ray Tube
CSM: Cyber Security Module
DAS: Distributed Aperture System
DAT: digital audio tape
DES: Dynamic Environmental Simulator
DFR: Distributed Flight Reference
DIGISYN: Digital Synthesis Simulator
DoD: Department of Defense
DRD: Disorientation Research Device
E-M: electro-mechanical
E-O: electro-optical
FOV: field-of-view
G: gravitational force
HMD: helmet-mounted display
HMOF: Helmet-Mounted Oculometer Facility Simulator
HMS: helmet-mounted sight
HOTAS: hands on throttle and stick
HUD: head-up display
IFF: identify friend or foe
IIPACSS: Integrated Information Presentation and Control System Study
IMPACT: Intelligent Multi-Unmanned Vehicle Planner with Adaptive Collaborative/Control Technologies Simulator
JADC2: joint all-domain command and control
JOCA: Jointly Optimal Conflict Avoidance
JTAC: Joint Terminal Attack Controller
LCD: liquid crystal display
LOS: line of sight
LVC: Live, Virtual, Constructive
MAGIC: Microprocessor Applications for Graphics and Interactive Communication
MIL-STD: military standard
NAMRU-D: Naval Medical Research Unit – Dayton
NAS: National Airspace System
NASA: National Aeronautics and Space Administration
NDFR: non-distributed flight reference
NMAC: Near Mid-Air Collision
OBS: off-boresight
OPL: Operator Performance Laboratory
OTW: out-the-window
PC: personal computer
PCCADS: Panoramic Cockpit Control and Display System Simulator
SAA: Sense and Avoid
SD: spatial disorientation
SIRUS: Synthetic Interface Research for UAV Systems Simulator
SO: sensor operator
SUS: System Usability Scale
SVS: Synthetic Vision System
TIFS: Total In-flight Simulator
TSD: Tactical Situation Display
TTCP: The Technical Cooperative Program
UAV: Unmanned aerial vehicles
UHF: ultra-high frequency
UV: unmanned vehicles
VCATS: Visually Coupled Acquisition and Targeting System
VHF: very high frequency
vHUD: virtual head-up display
VISTA: Variable In-Flight Stability Test Aircraft
VSCS: Vigilant Spirit Control Station
2
Augmented Reality as a Means of Job Task Training in Aviation
Dan Macchiarella, Jiahao Yu, Dahai Liu, and Dennis A. Vincenzi
CONTENTS
Augmented Reality (AR)
Historical Overview: AR and Training
Cognition and AR
    Elaboration and Recall
    Spatial Relations
    Memory Channels and AR
Knowledge Development and Training Transfer
What Is the Future of Job Training – Training on the Job Literally?
Conclusion
References

Historically, the aviation industry expends a significant amount of time and resources training and retraining its workforce to perform the psychomotor and cognitive maintenance tasks necessary to keep aircraft safely flying (Ott, 1995). The industry continues to dedicate a substantial amount of its effort and capital to ensuring that its workforce is prepared to maintain modern and complex aircraft systems. Despite the rapid advances in computer-based training technologies (e.g., augmented reality [AR]), aviation maintenance workers presently participate in job task training in traditional face-to-face settings that would be familiar to aviation maintenance workers from generations past.
Changing the manner in which aviation maintenance workers are trained, with the goal of capturing the positive effects associated with computer-based training technologies, has the potential to optimize training. Airframe and Powerplant (A&P) certified mechanics serve as the primary workers in the nation’s aviation industry. The United States General Accounting Office (2003) completed a study that highlights the need for curriculum reform by the Federal Aviation Administration (FAA) for the training and certification of A&P mechanics. A relatively large number of workers in the aviation maintenance field possess an A&P license. The number of A&P mechanics in the US labor market was not forecast to meet the industry’s needs (US General Accounting Office, 2003).
DOI: 10.1201/9781003401353-2
In the Boeing Pilot and Technician Outlook 2020–2039, new personnel demand was calculated based on a 20-year fleet forecast, aircraft utilization, attrition rate, and regional differences. There will be a gap of 192,000 mechanics in North America, while 739,000 new technicians will be needed to maintain the global fleet in the next 20 years (Boeing, 2020). Even though COVID-19 caused a temporary oversupply in the aviation industry, the long-term demand is still robust because mechanics are still retiring faster than they are replaced (Aviation Technician Education Council, 2020). The average age of an FAA mechanic is 52, and 33% of mechanics are over 60, while new mechanics make up only 2% of the mechanic population annually (Aviation Technician Education Council, 2020). A panel convened by the US General Accounting Office (2003) cited the current curriculum as being “obsolete geared to smaller less complex aircraft” (p. 1). Within the next several years, institutions training future aviation maintenance workers will receive a new curriculum for training A&P mechanics. This new curriculum will address the modern complexities of systems and materials being used in aircraft. This change of curriculum, combined with the significant cost in resources and time necessary to train and retrain aviation maintenance workers, creates an opportunity to change the fundamental nature of the instructional delivery systems (IDS) being used in the aviation maintenance training field. AR has the potential to help the aviation industry meet its training needs due to its visual-spatial dynamic, which is analogous to a spatial graphical user interface (GUI; Kaplan et al., 2021; Majoros & Boyle, 1997; Majoros & Neumann, 2001; Neumann & Majoros, 1998).
Several key factors are associated with training aviation workers: aviation maintenance work tasks require a high level of knowledge in the field, from entry level (i.e., novice) to the highly skilled level (i.e., expert); the FAA rigidly regulates the training curriculum and certification of workers; workers perform work tasks at irregular intervals (e.g., replacing an oil pump on a turbine engine may only occur once in 5 years); and when workers fail to perform work tasks properly, the consequences could be dire. Flight-related accidents, and aviation accidents in general, often result in the loss of human life and large-scale destruction of property. Highlighting the consequences of improper maintenance, the National Transportation Safety Board (NTSB) determined that the crash of Alaska Airlines Flight 261 in January 2000 was due to maintenance irregularities (US General Accounting Office, 2003). The FAA licenses and regulates aviation maintenance workers as part of its effort to ensure safe aviation operations and protect the public in general. The FAA originally developed its core curriculum for repairing and maintaining aircraft over 50 years ago (US General Accounting Office, 2003). Aviation maintenance workers inspect and repair engines, landing gear, instruments, pressurized sections, and other parts of the aircraft. Additionally, they conduct routine maintenance and replacement of parts; repair surfaces for both sheet metal and composite materials; and inspect for corrosion, distortion, and cracks in the fuselage, wings, and tail. While performing maintenance, A&P mechanics test parts and equipment to ensure that they are working properly and then certify that the aircraft is ready to return to service. Aviation maintenance workers often work under time pressure to maintain flight schedules. The majority of them obtain an A&P license through
certification by the FAA. Those who do not possess an A&P license can only perform maintenance tasks under the direct supervision of an A&P-licensed mechanic. There are 181 active schools that hold FAA certificates issued under Title 14 of the Code of Federal Regulations part 147 (Aviation Technician Education Council, 2020). Candidates for the A&P license must successfully complete a minimum of 1900 hours of classroom instruction at any of these FAA-approved aviation maintenance technician schools, or acquire documented evidence that they have at least 30 months of on-the-job training (e.g., service as an aviation mechanic in the military), or show evidence detailing work experience with aircraft engines and bodies. After meeting the requisites for licensing, A&P candidates must pass written and oral tests and demonstrate through a practical test that they can perform maintenance tasks (US General Accounting Office, 2003). Any instructional delivery system or new learning paradigm that has a significant positive effect on aviation maintenance workers during their initial training, or during retraining after receiving an A&P license, could reduce training time and costs, helping to meet the industry’s need for trained workers.
AUGMENTED REALITY (AR)
AR presents a visual-spatial dynamic that may elicit efficiencies during aviation maintenance training (Macchiarella, 2004; Neumann & Majoros, 1998; Valimont, 2002). AR applications that deliver composite virtual and real scenes during aviation maintenance training are analogous to the spatial GUI that now dominates human–computer interaction and may aid attention, memory, and recall (Neumann & Majoros, 1998). However, AR is an emerging technology, and very little research regarding its effectiveness as a training paradigm has been conducted. At the same time, computer-based systems have low fidelity: they are not realistic enough to completely replace conventional face-to-face training, especially in maintenance, which involves complex hands-on experiences (Gonzalez-Franco et al., 2017). Nevertheless, new research is constantly expanding the body of knowledge of the technologies necessary to bring AR into the real world for application (Azuma, 2004). A distinctive feature of AR is that computer-generated virtual imagery can be overlaid onto a live direct or indirect view of a real-world setting (Lee, 2012); in doing so, “it positions the learner within a real-world physical and social context while guiding, scaffolding and facilitating participatory and metacognitive learning process” (Dunleavy & Dede, 2014).
HISTORICAL OVERVIEW: AR AND TRAINING
Essential to understanding the concept of AR is the need to distinguish between real objects, virtual objects, and objects that display characteristics of both reality and virtuality. Milgram and Kishino (1994) effectively defined AR and placed it into a mixed-reality continuum (see Figure 2.1). Milgram’s virtuality continuum is useful for categorizing surroundings as perceived by the human mind. On one end of the continuum is the real environment. It comprises real objects that have an actual
FIGURE 2.1 Milgram’s reality–virtuality continuum. (Adapted from Milgram, P., and Kishino, F. 1994. A taxonomy of mixed reality visual displays. IEICE Transactions Information Systems, E77-D(12), 1321–1329.)
existence. The virtual environment at the other end of the continuum comprises objects that exist in essence or effect but not in a formal or actual state of being. Between these two ends lies the world of mixed reality (Azuma, 1997; Azuma et al., 2001; Billinghurst et al., 2001; Milgram & Kishino, 1994). The distinction between varying degrees of reality and virtuality is not significant in terms of human interaction with the mixed-reality world. However, from the technical perspective of creating a mixed-reality world, varying degrees of reality and virtuality are significant. It is more difficult to bring virtual elements into real environmental settings (viz., outside a laboratory setting) than it is to bring a real environment object into a computer-generated virtual environment scene (e.g., using one’s own hand, fitted with a haptic input device, to grasp a virtual object). Effectively, AR is any scene or case in which the real environment is supplemented by using computer-generated graphics. In a broader perspective, extended reality (XR) is used as an umbrella term for virtual reality, augmented reality, and mixed reality. Kaplan et al. (2021) report that training in XR does not produce a different outcome than training in a non-simulated control environment, which means the effects may be equal. Azuma’s (1997) monograph defines AR as a variation of virtual environments (VE) and provides detailed information on all key aspects of AR-based systems. VEs are more commonly referred to as virtual reality (VR). Users of VE technologies are fully immersed in a synthetic environment. An AR system supplements the real world with virtual (i.e., computer-generated) objects that appear to coexist in the same space as the real world. Azuma et al. (2001) define AR systems as having the following properties: combine real and virtual objects in a real environment, run interactively, run in real time, and register (i.e., align) real and virtual objects with each other.
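The registration property in Azuma et al.’s (2001) definition can be made concrete with a small geometric sketch. Assuming a tracked camera pose and an idealized pinhole camera (both are illustrative assumptions, not details given by the cited authors), a virtual label anchored to a point on a real workpiece must be redrawn at the pixel where that point currently appears. The names `PinholeCamera`, `world_to_camera`, `project`, and `register_label` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PinholeCamera:
    """Idealized camera: focal length and principal point in pixels (assumed values)."""
    focal_px: float
    cx: float
    cy: float


def world_to_camera(point: Vec3, cam_pos: Vec3, rot: Sequence[Sequence[float]]) -> Vec3:
    """Express a world point in camera coordinates.

    rot is a 3x3 world-to-camera rotation matrix, as a tracker might report it.
    """
    d = [p - c for p, c in zip(point, cam_pos)]
    return tuple(sum(rot[i][j] * d[j] for j in range(3)) for i in range(3))


def project(cam: PinholeCamera, p_cam: Vec3) -> Optional[Tuple[float, float]]:
    """Perspective projection to pixel coordinates; None if behind the camera."""
    x, y, z = p_cam
    if z <= 0:
        return None
    return (cam.focal_px * x / z + cam.cx, cam.focal_px * y / z + cam.cy)


def register_label(cam, cam_pos, rot, anchor_world):
    """Pixel position at which a virtual label must be drawn to stay aligned."""
    return project(cam, world_to_camera(anchor_world, cam_pos, rot))


if __name__ == "__main__":
    cam = PinholeCamera(focal_px=800.0, cx=320.0, cy=240.0)
    identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    anchor = (0.5, 0.0, 2.0)  # point on the real workpiece, in metres
    print(register_label(cam, (0.0, 0.0, 0.0), identity, anchor))
    # After the viewer moves sideways, the label must move on screen as well.
    print(register_label(cam, (0.5, 0.0, 0.0), identity, anchor))
```

Running this projection once per frame with the latest tracked pose is what “run in real time” and “register” amount to in this simplified setting; errors in the tracked pose show up directly as misaligned labels.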
AR is a machine vision and computer graphics technology that merges real and virtual objects into unified, spatially integrated scenes. Azuma (Azuma, 1997; Azuma et al., 2001) deconstructs all AR systems into three subsystems: scene generator, display device, and tracking–sensing device. He clearly defines AR as its own field of study due to AR’s unique blending of computer-generated worlds and the real world to form a new world for humans to function within. AR systems fall into one of two categories (Feiner et al., 1997; Kalawsky et al., 2000): optical-based technologies (see Figure 2.2) and video-based technologies (see Figure 2.3). Optical-based systems typically employ a
FIGURE 2.2 Simple schematic of an optical see-through HMD AR system. (Adapted from Azuma, 1997.)
FIGURE 2.3 Simple schematic of a monitor-based AR system. (Adapted from Azuma, 1997.)
head-mounted display (HMD) comprising see-through lenses that enable the user to see the real world with the virtual world projected on combiner lenses positioned in front of the eye. The combiner lenses are partially transmissive, enabling the user to look directly through them to see the real world. The user sees the virtual world superimposed over the physical view of the real world. Video-based systems use video cameras that provide the user a view of the real world. Video from these cameras is combined with the graphic images created by a scene-generating computer to blend the real and virtual worlds. The result is sent to the monitors in front of the user’s eyes in a closed-view HMD or to a traditional computer monitor. Fishkin et al. (2000) propose that AR-like systems have the potential to transform human–computer interaction as drastically as the GUI transformed computing. They state that “the physical configuration of computational devices is a major determinant of their usability” (p. 75). The authors highlight that traditional physical interaction with computers is limited. Humans primarily interact with computers through a pointing device, display, buttons, or keys. This means that the human–computer
interaction is identified as the windows, icons, menus, and pointing devices (WIMP) approach (Shneiderman, 1998). The authors use a piece of paper as a metaphor for human–computer interaction: humans use paper in numerous and varying ways while recording data, including writing, flipping, thumbing, bending, and creasing. Humans have developed dexterity, skills, and practices that are not brought fully to bear on computational device interfaces; human interaction with paper is more varied than typical human–computer interaction. Billinghurst and Kato (2002) provide an overview of the technologies associated with creating AR and some of the possible applications for enhancing collaborative work in educational settings. The authors use scenes from the movie Star Wars as a metaphor. In Star Wars, characters communicate with each other, across great distances, while observing computer-generated and projected three-dimensional (3-D) life-size virtual images. These images are superimposed on the real world. The authors cite these scenes as foreshadowing collaborative AR. They state that the long-term goal of AR research is to make it possible for the real world and virtual world to blend seamlessly together; real and virtual worlds would become indistinguishable from one another. Billinghurst et al. (2000) discuss a technology and its implications for collaborative learning through the use of AR. The authors developed “The MagicBook” to explore the use of AR to bring text-based books to life with virtual animations. The reader, or readers when used in a collaborative learning environment, read the book while looking through a handheld see-through display. The handheld see-through display is similar to a heads-up display in a fighter aircraft. As the reader observes pages, virtual 3-D avatars and images appear on the book page and act out scenes that are described in the text.
This article illustrates the stunning technology available to transform two-dimensional (2-D) books into the “third dimension.” Neumann and Majoros (1998) provide a review of cognitive studies and analyses relating to how AR interacts with human abilities. They describe how these AR-enhanced human abilities may benefit manufacturing and maintenance tasks. The authors describe possible applications for AR and a prototype system designed to facilitate aviation worker training and performance of aviation maintenance tasks. They state that AR has a considerable effect on recall by establishing to-be-recalled items in a highly memorable framework; by using AR to develop scenes in an easy-to-remember framework, AR can complement human information processing. This complement can reveal itself in training efficiency applicable to a wide variety of maintenance tasks. The authors provide a list of potential AR uses and state that the possible applications of AR are nearly limitless. Majoros and Neumann (2001) propose that AR can complement human information processing during the performance of aviation maintenance tasks (e.g., on-orbit maintenance procedures). They provide analysis of cognitive models that suggest that scenes merging real and synthetic features (i.e., AR) will complement human information processing by controlling attention, supporting short- and long-term memory, and aiding information integration. They state that applications of AR enable immediate access to information; immediate access to information is akin to an expert’s retrieval from short-term memory or well-encoded long-term memory.
Easy interaction with the design interface should allow rehearsals and stable links between graphics and the real world. Yeh and Wickens (2001) report their findings regarding an application of AR as a means of providing “intelligent cueing.” Intelligent cueing is the application of AR to a scene assumed to be important by a computer-based optical searching assistant. In their experiment, the authors used 16 participants actively serving in the US Army or US Marine Corps. Participants were presented with a high-definition virtual-reality scene of a desert environment. Virtual targets were placed into the scene and were observable by the participant and the computer-based optical searching assistant. Cue reliability was manipulated (100% versus 75%) to help the authors develop inferences regarding cue reliability and detection behavior (i.e., detection distance and accuracy). The researchers defined reliability as the degree of accuracy the cue provided to the participant as it pointed to the virtual object in the desert scene (e.g., cueing that is 75% reliable accurately points to the virtual target three out of four times). Unreliable cueing was found to induce the cognitive response of disuse of the cue. Reliable cueing was found to induce user reliance on the cue, or in some cases, overuse. Kalawsky et al. (1999, 2000) provide a brief background of AR to define terms and provide information on psychological and physiological aspects associated with AR. They highlight that AR does not have to be a purely visual augmentation; it may also encompass the use of other sensory modalities. One of the sensory modalities highlighted is the use of 3-D sound to provide enhanced spatial awareness. The authors make a key point that AR is not widely used due to technical problems associated with registering the virtual world to the real world. Registration is the process of creating one coherent scene.
It is a difficult process outside static settings such as those found in laboratories. Poupyrev et al. (2002) report on their development of a “tile” system approach to implementing an AR environment. Each tile has a unique marker that a computer-based AR system can recognize and then use to render a virtual image as an overlay on a real-world scene. The authors positioned the tiles on a magnetic whiteboard to demonstrate an application for the rapid prototyping of an aircraft instrument panel. In addition to tiles replicating aircraft instruments, the authors included tiles with the functionalities of delete, copy, and help. These “functionality” tiles enabled the user to manipulate the AR environment in a manner similar to the way icons interact with the common GUI found on personal computers. Cooper et al. (2016) compared the performance of three groups: regular training, virtual training, and virtual training with augmented cues. Participants reacted faster in virtual training, but augmented cues did not make a significant difference. In the scenario with augmented cues, however, participants performed the task with fewer errors than participants in the minimal-cues training group. At the same time, some systematic differences in subjective ratings that reflected objective performance were also observed (Cooper et al., 2016). Several studies identify that the development of AR environments for training purposes is an inherently interdisciplinary pursuit (Macchiarella, 2004, 2005; Macchiarella et al., 2005a, b; Vincenzi et al., 2003). The design of an effective AR environment entails incorporating theories of
computer design, empirical research in several fields, the nature of human perceptual and cognitive systems, reasoning with diverse forms of information, human learning under varying situations, technology for presenting information to the human user, and getting information to and from the user and the computer in an effective manner. Developing an understanding of human abilities and complementary applications of AR to create mixed-reality worlds is an essential element in the design of any AR learning environment that complements human cognitive activity. Vincenzi and Gangadharan (2001) identify distinctive human abilities as being able to:
• Detect meaningful stimuli and patterns;
• Integrate information within and between sensory modalities (e.g., sight, sound, and smell as indicators of condition);
• Compare information/events to standards; and
• Perform qualitative judgments.
They identify complementary applications of AR annotations as follows:
• Tethering virtual annotations to real-world workpieces minimizes the need to search for information.
• Virtual images can provide examples of correct conditions.
• Markers or flags can direct attention to specific real-world workpiece features.
• Virtual annotations can influence the users’ anticipation (e.g., knowledge of possible defects with the real-world workpiece).
• With input options, users can obtain the desired level of information detail for the work task.
• Virtual objects offer an easy-to-use interface for recording work task steps.
In a meta-analysis conducted by Santos et al. (2013), AR learning experiences had a widely variable effect on student performance, ranging from a small negative effect to a large positive one depending on the device, method, and scenario. Overall, AR offers three fundamental advantages: real-world annotation, contextual visualization, and vision-haptic visualization (Santos et al., 2013).
AR is a relatively new field within computer science, and its nature is inherently interdisciplinary. The concept of augmenting an observer’s perception of reality has age-old roots (Stapleton et al., 2002). Reality alteration or augmentation was, and still is, used by magicians and entertainers in the form of illusions and other gimmicks to bewilder, amaze, and entertain. The desired goal is to make people perceive ordinary objects in extraordinary ways. The modern development of computer-based AR has the same ability to bewilder, amaze, and entertain. However, commercial applications of AR designed with the goal of improving education, training, and work task performance can create a new mixed-reality world inconceivable just a few decades ago. AR requires connecting reality with imagination to make people perceive ordinary objects in extraordinary ways. Integrating AR technologies into the specific training environment
will require a significant investment in the development of new training content, improved processes, and procedures using these new digital capabilities (Osborne & Mavers, 2019).
COGNITION AND AR
Elaboration and Recall
Ormrod (1999) and Haberlandt (1997) identify key aspects of elaboration and recall. The manner in which information is encoded and retained determines how easy it will be to retrieve for future use. Cues can be used to aid retrieval from memory, both immediately and over the long term. Although not yet thoroughly tested, researchers have theorized that AR-based learning may inherently possess a great potential for facilitating retention of learned material to be retrieved later for real-world application during work tasks (Macchiarella, 2004; Majoros & Neumann, 2001; Valimont, 2002). AR-based learning can affect many more modalities of human senses than present learning paradigms. AR complements human associative information processing and aids information integration through multimodal sensory elaboration, drawing on visual-spatial, verbal, proprioceptive, and tactile memory while the learner is performing a knowledge acquisition task. In this way, AR can enable increased elaboration during the time the learner participates in an AR-based learning environment (Bjork & Bjork, 1996; Majoros & Neumann, 2001; Vincenzi et al., 2003). The hypothesis that text labels in AR scenes serve as cues for retrieval is consistent with the Tulving and Osler (1968) study. The study found that, when subjects studied a list of words with an associated mnemonic aid (i.e., cue word), they had a significantly higher level of recall as compared to a group that did not use a cue word. Applying the same principle to the AR environment, virtual text labels appearing on real-world objects serve as a word cue, or mnemonic, for the object. However, even though augmented cues have been shown to enhance performance and satisfaction in the transfer of virtual training, the effects may not be significant (Cooper et al., 2016).
Elaboration is the process by which one expands upon new information by creating multiple associations among the incoming information. Stein et al. (1984) conducted research to determine the effectiveness of elaboration on immediate recall. They found that, when the elaborative cue was closely related to the to-be-recalled material (e.g., information to be recalled, the strong man read a book; the cue, about weight lifting), the learners displayed a significantly higher level of recall. With regard to educational practice and cues, Reigeluth (1999) defines four key elements of elaboration: selection, sequencing, synthesizing, and summarizing of the subject-matter content. Elaboration draws from different sensory inputs and past information already held in long-term memory. In terms of learning procedural tasks, the learner focuses on sequential steps to help select and sequence acquisition of knowledge in a manner that will optimize attainment. Elaboration has been shown to greatly improve the encoding and retention of newly acquired information. When precise cues are
applied in AR scenes (i.e., virtual text annotations naming functions and components of a real-world object), higher levels of recall can be anticipated. Mayer (1992) provides a brief monograph of psychology theory and research. He begins with E. L. Thorndike’s work (1905) and concludes by citing contemporary authors who address the application of cognitive psychology in educational practice. He ties together developments in the fields of psychology and learning theory to show the origins of recent educational practice. The author concludes that the behaviorist influences in educational practice are waning, and educational practice based on cognitive psychology is prevailing. Research has shown that retrieval and recall of learned information is most effective when the similarities between the learning environment and the task environment are maximized. As participants acquire knowledge from AR training, actions and responses need to align with their expectations, and information should be presented in a manner that is specific and realistic to the working scenario (Petersen & Stricker, 2015). Meaningful learning occurs when the learner has relevant prior knowledge to form a frame of reference from which to draw (Bjork & Bjork, 1996; Knowles, 1984; Stein et al., 1984). Elaboration, within domains of knowledge the users are familiar with, may be one of the key strengths associated with using AR in learning settings. In terms of elaboration and recall, AR may have the ability to facilitate the sequencing of ideas that will assist in learning cognitive and complex psychomotor tasks (e.g., isolating a fault in an aircraft electrical system).
Spatial Relations
Spatial cognition (i.e., the cognitive functioning that enables one to process and interpret visual information regarding the location of objects in an environment, often referred to as visual-spatial ability) relates to the representation of spatial information (e.g., location) in memory. Spatial information has been found to be an extremely powerful form of elaboration for establishing associations in memory that facilitate recall. Researchers have found that spatial information is automatically processed when visual scenes are encoded into long-term memory (Lovelace & Southall, 1983; Neumann & Majoros, 1998; Pezdek & Evans, 1979). Pezdek and Evans (1979) conducted four experiments to assess the role of verbal and visual processing in memory for aspects of a simulated, real-world spatial display. Participants viewed a 3-D model of a city with 16 buildings that were placed on the display. The buildings were represented on the model with, or without, an accompanying name label on each building. The participants studied the display and subsequently were tested on recall and recognition of the building names, picture recognition of the buildings, and spatial memory for where the buildings were located within the model. Overall, picture recognition accuracy was low, and the presence of a name label on each building significantly reduced picture recognition accuracy but improved location recognition accuracy. The authors concluded that spatial location information was not encoded independently of verbal and visual identity information. In this study, labeling facilitated location identification accuracy. It did not significantly
affect visual recognition. The authors’ real-world spatial display (i.e., 3-D model) in several ways is comparable to the AR environment. AR environments are inherently 3-D in nature. The real-world objects occupy three dimensions of space, and the virtual component of the scene can be rendered to present a 3-D appearance. Saariluoma and Kalakoski (1997) conducted four experiments to test the effects of imagery on the long-term working memory of skilled chess players. The purpose of their experimentation was to gain insights on effects of visual and auditory inputs on game play. The authors hypothesized that the visual modality would have the most significant effect on how chess players form mental images of game play. The authors concluded that skilled imagery is built on long-term working-memory retrieval structures and that effective transformation of information between these retrieval structures and visual working memory is required to construct complex mental images. Expert chess players are better able to construct complex mental images of task-specific materials than less skilled chess players. Regardless of modality of information transmission, the chess move is transformed into visuospatial code and stored as such by the chess player. Participants in AR learning environments view scenes that contain both the real-world object being studied and the corresponding virtual overlay. It is reasonable to believe that these mixed-reality scenes are encoded into long-term memory as one integrated scene and, when the scene is recalled, a transformation to long-term working memory is required to construct mental images in the visual working memory. Should this effect prove true, participants in AR-based learning would demonstrate significantly higher levels of recall when compared to participants using traditional forms of learning. Nakamura (1994) describes research conducted to measure the effect, on recall, of different types of spatial relations. 
The spatial relations are grouped into three categories: scene-expected, scene-unexpected, and scene-irrelevant. The author’s findings contribute to the body of knowledge dealing with spatial relations with regard to attention and recall. When spatial scenes incorporate elements that are not naturally associated with the scene, viewer attention is drawn to the scene. When the scene contains multiple surprising but naturally occurring elements, the viewers demonstrate higher levels of recall. Application of these findings can facilitate learner recall in various training and educational settings. Phaf and Wolters (1993) report on four experiments they conducted to examine the processes that determine the effectiveness of rehearsal on long-term memory. They cite previous research that divided rehearsals into one of two categories: maintenance rehearsal and elaborative rehearsal. Maintenance rehearsal involves rote repetition of an item’s auditory representation. Elaborative rehearsal involves deep semantic processing of to-be-remembered items, resulting in the production of durable memories. The authors’ experiments led them to several conclusions. First, the effectiveness of a rehearsal depends on the degree of attentional processing applied to the material being rehearsed. Second, an important criterion for attentional processing seems to be the “novelty” of stimuli being rehearsed. Third, attention may result in faster learning because novel patterns may enable the development of new associations. These findings may affect instructional design; increasing attention during rehearsals could lead to higher levels of recall.
Pham and Venkataramani (1997) report on their investigation of the processes of source identification and its effect on effectual communication. The authors propose a framework that identifies four types of source identification processes: semantic cued retrieval, memory-trace refreshment, schematic inferencing, and pure guessing. They hypothesize that these processes are sequential in nature. The authors report on two experiments. They support their position that these processes occur in a contingent manner; their experimental cases all supported this position and were statistically significant. Additionally, the authors hypothesized that cued retrieval was the dominant process. Moreno and Mayer (1999) review previous research and report on their research regarding the learning effects associated with multimedia instructional designs that employ varying combinations of text, narration, images, and animation. They elaborate upon the contiguity principle that states: “The effectiveness of multimedia instruction increases when words and pictures are presented contiguously in time and space” (Moreno and Mayer, 1999, p. 385). The authors refine the contiguity principle into the temporal-contiguity effect and spatial-contiguity effect. The spatial-contiguity effect occurs when text and images are integrated into one visual scene. The temporal-contiguity effect occurs when visual and spoken materials are temporally synchronized. They conclude that, when learners are presented with a visual presentation that incorporates text or narration, narration has a more significant effect on the learner. The authors qualified their findings by calling for more research. In this experiment, they did not factor in individual differences in spatial ability, coordination ability, and experience. Waller (2000) conducted a multivariate study of relationships between several factors and the ability to acquire and transfer spatial knowledge from a VE. 
The author bisects spatial ability into related dimensions: spatial visualization and spatial orientation. Spatial visualization is the ability to manipulate figures mentally. Spatial orientation is the ability to account for changes in viewpoint. When both factors are psychometrically assessed as being higher in an individual, that individual demonstrated an increased ability to acquire spatial information from a VE. Additionally, proficiency with the VE’s interface was found to significantly affect performance measures of spatial knowledge. The author postulates that a likely explanation for this finding centers on user attention while engaged in spatial learning (i.e., effortful processing of the interface interferes with the user’s ability to learn in the environment). Waller’s research empirically demonstrates that measured spatial abilities correlate to the ability to learn from a VE, and additionally, that the degree of attention or level of difficulty associated with the user interface detracts from one’s ability to learn spatially. Replication of this study with an AR environment would quantitatively substantiate the position that AR inherently leads to efficiencies while learning due to its low-effort interface and attentional nature that creates spatial scenes for learning. Several studies have found that gender affects spatial ability, and males tend to have higher levels of spatial ability (Cutmore et al., 2000; Czerwinski, Tan, & Robertson, 2002; Hamilton, 1995; Waller, 2000). Cutmore et al. (2000) conducted research into cognitive factors affecting virtual navigation performance, while navigating within
a desktop computer-generated VE, and clearly describe the differences in spatial ability between males and females. Various cues were used as treatments to experimental groups (e.g., compass pointers, icons for association with locations, and icons for association with landmarks). Males acquired route knowledge from landmarks more quickly than females. The specific cause of this difference is speculative. However, multiple studies substantiate its existence. Cutmore et al. (2000) make an important point that gender should be a factor when designing VE training environments. Further research into gender differences with regard to spatial ability is necessary for mixed-reality worlds. However, postulating that it exists is prudent. By its inherent nature, AR presents a visual-spatial dynamic that can be expected to enable learning advantages associated with spatial cognition, which helps encode information effectively into memory and facilitates recall. This is extremely important for the aviation industry, where spatial information is vital (Kaufmann et al., 2003; Kaplan et al., 2021). Virtual text labels, or virtual overlays in general, become associated with the real-world object and encoded into memory as one visual image. Spatial cognition is an integral element of AR and human learning (Majoros & Neumann, 2001).
Memory Channels and AR
AR interfaces affect more modalities of human senses than present learning paradigms (Bjork & Bjork, 1996; Macchiarella, 2004; Neumann & Majoros, 1998). AR is believed to complement human associative information processing by aiding information integration through multimodal sensory elaboration. Multimodal sensory elaboration occurs by utilizing visual-spatial, verbal, proprioceptive, audio, and tactile memory while the learner is encoding the information into long-term memory. This elaboration on the subject material may occur due to an increase of memory channels, enabling a greater chance for information to be encoded properly and retained in long-term memory. Effective encoding is key to the learner’s ability to recall information for application in a real-world environment. Mania and Chalmers (2001) studied the effects of immersion in a VE on recall and spatial perception. Several of their findings were inconclusive, but they did find a significant correlation between recall and environments that presented multimodal sensory elaboration as found in three different environments with corresponding inherent levels of immersion. The three environments were the real world; a virtual world in which the subjects were fully immersed; and a virtual world created with a desktop computer, in which subjects were partially immersed. Their research found overall that relevant multimodal stimuli enhanced recall. Gamberini (2000) studied groups of subjects who were exposed to a fully immersive VE or a nonimmersive VE (i.e., a virtual world depicted within a real-world setting on a desktop computer). The researcher found that subjects in the nonimmersive group scored higher in the areas of recall for spatial and visual memories. He postulated that several factors affected this outcome.
His key factor for consideration was that the nonimmersive environment is more familiar to subjects because they see both real-world and virtual-world objects. In an AR learning environment,
real-world objects (e.g., turbine engine aircraft oil pump) are presented to learners, and the learners can engage in learning in a multimodal sensory fashion. Multimodal sensory elaboration can create a framework of associations that aid recall and learning (Majoros & Neumann, 2001; Neumann & Majoros, 1998). Each association of a virtual object (e.g., virtual text label) with a real-world object serving as a workpiece is the basis for a link in memory that might not otherwise exist. Together these links (e.g., a visual arrangement of text callouts in an AR workpiece scene) may form a framework like that created when subjects use a classic mnemonic technique to remember a list of items. With this method, a subject associates items to be remembered with invented places or landmarks on an imaginary path (Neumann & Majoros, 1998; Yates, 1984). During recall, the subject “mentally walks” on the path, encounters a mental landmark, visualizes the item associated with the landmark (e.g., to-be-recalled item on a real-world workpiece), and then processes the to-be-recalled item into working memory. AR has the potential to expand these mental landmarks to include multimodal sensory input that establishes multiple channels to the memory. Users of AR are provided a framework (i.e., the real world) that holds the items that will be recalled. This association and multimodal elaboration does not necessarily happen intentionally; it can occur as a by-product of the use of enhanced workpiece scenes (Neumann & Majoros, 1998).
KNOWLEDGE DEVELOPMENT AND TRAINING TRANSFER
Reduced costs and increasing capabilities of computer-based technologies have initiated dramatic increases in the application of computer-delivered instruction such as computer-based training, web-based training, multimedia learning environments (Brown, 2001), virtual reality (Stone, 2001), and augmented reality (Majoros & Neumann, 2001). Computer-based training has become ubiquitous throughout the government, military, and commercial training associated with the aviation field. It typically gives the learner the locus of control over instruction. Learner-controlled environments offer learners choices regarding practice level, time on task, and attention. However, computer-based training systems usually incur fidelity problems arising from the differences between reality and the simulated environment (Gonzalez-Franco et al., 2017), which makes the transfer of training important.
Transfer of training refers to how well learned skills and information can be applied in a different work setting. In the case of AR-based training, skills first acquired in a mixed-reality work setting would serve as training for subsequent skill application in the real world. Application of these skills could involve cognitive or psychomotor work tasks. In the future, the new mixed-reality world may redefine how workers are trained (Kalawsky, Stedmon, Hill, & Cook, 2000; Majoros & Neumann, 2001). The traditional training paradigm employs some form of training (e.g., computer-based tutorials, face-to-face instruction, self-study with printed manuals) prior to licensing or assignment to a work task. In this future mixed-reality world, AR may make some forms of training unnecessary or at least reduced in time and scope (Macchiarella, 2005; Macchiarella et al., 2005a, b; Majoros & Neumann, 2001; Vincenzi et al., 2003). Cognitive tasks normally associated with training could be performed for
the human by the AR system. This characteristic of AR may enable just-in-time training functions that occur simultaneously with work task performance. As an example, AR could provide scenes that are annotated with types of information that are customarily learned through training. This presentation of information could support humans in inspection tasks or enable them to perform work tasks that are rarely encountered and with little prior training.
AR scenes, in the same manner as VR scenes, have the ability to direct learner attention and facilitate the acquisition of spatial knowledge regarding a real world or virtual world (Witmer et al., 1996). Virtual environments provide what symbolic media (e.g., a map or photograph) cannot. Witmer et al. (1996) conducted a study using undergraduate students at the University of Central Florida in conjunction with the US Army. Selected test participants rehearsed navigation through a building using either a VE or photographs and maps. The participants using the VE rehearsal were significantly more accurate in their navigation of the real building. Additionally, the authors postulated that additional VR cues, tactile or aural, would enhance the participants’ gained knowledge and improve navigation through the building. The creation of an AR-based mixed-reality world, where positive transfer of information occurs for users, could enhance training environments.
Waller et al. (2001) conducted research involving the effects of visual fidelity and individual differences on the ability to learn in a virtual environment and subsequently transfer the learned knowledge to real-world use. They found that the fidelity of the VE is less important when used to train tasks that do not require higher-level cognitive processes. 
Additionally, the authors found that individual differences, such as cognitive abilities and level of computer-use experience, affected the transfer of training for virtual-to-real and real-to-virtual environments more than the fidelity of the simulation did. Two possible positive effects can be inferred regarding AR and this research. First, AR can be designed to deliver information that normally is obtained through training, in effect reducing cognitive load and helping to mitigate differences in cognitive abilities while training. Second, the AR interface is intuitive and typically does not require an interface device (e.g., trackball or joystick). The intuitive interface of AR may help mitigate differences in levels of computer-use skills. The users of AR look at a real-world object, and virtual scenes of information are automatically presented for use.
Self-efficacy (i.e., people’s judgments of their capabilities to organize and execute courses of action necessary to attain designated types or levels of performances) (Bandura, 1986, 1997) is central to the success or failure that learners experience as they engage in the tasks necessary to attain knowledge in a given field. High self-efficacy helps create feelings of serenity or “peace of mind” as learners approach difficult tasks and activities that comprise decision-making and complex work tasks. Majoros and Neumann (2001) postulate that AR scenes may support self-efficacy by creating an environment where the learner, or user of AR, has the locus of control over their learning environment. A high level of individualized control over the learning situation has a positive effect on learning (Ormrod, 1999); for example, users could invoke an AR scene with virtual “paste and copy” to keep information accessible while conducting a real-world work task.
With regard to concurrent training and performance, AR enables learning experiences where users train for tasks in a manner that identically replicates performance of the task in the real-world environment; this type of “real-world” training environment has been shown to provide advantages regarding transfer of knowledge and training (Majoros & Neumann, 2001). Rose et al. (2000) empirically ascertained that VEs do transfer training as effectively as real-world training. They also highlight three main factors that influence interference between concurrent task learning: task similarity, practice, and task difficulty. Regarding task similarity, the authors concluded that the extent of interference between two separate tasks depends on the degree to which they share a stimulus modality (e.g., visual, auditory, and tactile) and whether they rely on the same memory coding processes (e.g., verbal and visual). Rose et al. (2000) cite research by Sullivan (1976) as corroborating their position that concurrent tasks are impaired when the difficulty of the tasks is increased. They differentiate between performance that is resource limited (i.e., dependent on the mental processing resources available to devote to the task) and data limited (i.e., dependent on external stimulus quality, such as instructions, notes, and cues). In both cases, resource limited and data limited, AR has the potential to enhance concurrent training by delivering annotated work scenes that reduce mental workload through virtual text callouts, equipment diagrams, and instructions with step-by-step sequencing.
As AR training environments mature, creation of just-in-time or concurrent training may be feasible (Majoros & Neumann, 2001). One objective of future applications of AR may be to provide annotated visual scenes that supplant the need for certain aspects of training. 
This substitution for training would occur by providing AR-delivered information to the user, during work task performance, in lieu of the user recalling work task steps from long-term or working memory.
WHAT IS THE FUTURE OF JOB TRAINING – TRAINING ON THE JOB LITERALLY?
Applications of AR can enable learning environments embedded in the real world and make the real world part of the computer interface (see Figure 2.4). Future applications of virtual environments can take the form and function of a mixed-reality world with hypertext linking to vast resources of information and instructional content. The visual nature of the AR scenes is, in many ways, analogous to a GUI in the mind’s eye. AR may have a positive effect on recall by enticing elaboration through the creation of multiple associations between the real-world object being studied and the to-be-learned virtual information (Macchiarella, 2004; Valimont, 2002). In this new mixed-reality world, multimodal sensory elaboration can create a framework of associations that aid recall and learning. Each association of a virtual object (e.g., virtual text label) with a real-world object could serve as a basis for a link in memory that might not otherwise exist. Together these links (e.g., a visual arrangement of text callouts in an AR workpiece scene) may form a framework like that created when students use a mnemonic technique to remember a list
FIGURE 2.4 AR-aided inspection of an aircraft elevator.
FIGURE 2.5 AR scene with instructions for servicing a turbine engine oil pump.
of items (see Figure 2.5). With this method, in the mind’s eye, a student would associate items to be remembered with places or landmarks after viewing mixed-reality images of the studied item. During recall, the student “mentally walks” on the path, encounters a mental landmark, visualizes the item associated with the landmark (e.g., to-be-recalled aspect of a real-world workpiece), and then processes the to-be-recalled item into working memory. AR has the potential to expand these mental landmarks to include multimodal sensory input that establishes multiple channels to the memory. Users of AR are provided a framework (i.e., the real world) that holds the items that will be recalled. This association and multimodal elaboration does not
necessarily happen intentionally; it can occur as a by-product of the use of enhanced workpiece scenes.
Another facet is the application of mobile augmented reality learning environments as people spend more time on their mobile devices. The technological, theoretical, and assessment challenges for mobile-based AR need to be addressed for mobile augmented reality learning environments to fulfill their potential (Ifenthaler & Eseryel, 2013).
Transfer of training refers to how effectively learned skills and information can be applied in a work setting. In the case of AR-based training, skills first acquired in a mixed-reality work setting would serve as training for subsequent skill application in the real world. Application of these skills could involve cognitive or psychomotor work tasks. In the future, the new mixed-reality world may redefine how workers are trained (Kalawsky et al., 2000; Majoros & Neumann, 2001). The traditional training paradigm employs some form of training (e.g., computer-based tutorials, face-to-face instruction, self-study with printed manuals) prior to licensing or assignment to a job task. In this future mixed-reality world, AR may make some forms of training unnecessary or at least reduced in time and scope (Macchiarella et al., 2005a, b; Macchiarella & Haritos, 2005). Cognitive tasks normally associated with training could be performed for the human by the AR system. This characteristic of AR may enable just-in-time training functions that occur simultaneously with job task performance. As an example, AR could provide scenes that are annotated with types of information that are customarily learned through training. This presentation of information could support humans in inspection tasks or enable them to perform job tasks that are rarely encountered and with little prior training.
AR has the potential to transform computing as drastically as the GUI transformed computing (Fishkin et al., 2000; Vincenzi et al., 2003). 
Physical configuration of computational devices is a major determinant of their usability. Despite the rapid advances in computing technology afforded by exponential increases in computational power, humans still interact with computers in a very limited manner. The mode of interaction available for humans with computers primarily consists of a keyboard and a pointing device. In most cases, the pointing device is a mouse, and humans are limited to pointing, dragging, and drawing. When contrasting the various ways humans interact with each other (e.g., speaking—actual meaning of words; speaking—use of tone, listening, touching, gesturing, etc.), human interaction with computers is relatively simple (Alessi & Trollip, 2001). As researchers, computer scientists, and practitioners of AR solve the technological issues associated with using AR in real time and in the real world, AR-based human and computer interaction can become more like human-to-human interaction and engage more human modalities. The movie Minority Report (Frank & Cohen, 2001) portends human interactions with computers in an insightful and powerful way. The film depicts numerous applications of AR. Police officers interact with computer-generated images from human minds through a wall-sized interface device they manipulate with speech and touch. The officers can tear virtual media from the display, move media around, change view aspects, and generally use the virtual media in the real world as if it were a real-world object. Another interesting application of AR in the movie is
for marketing and sales purposes. Pedestrians walk past scanners and receive an iris scan that positively identifies them. This application of biometric identification enables a computer to generate a holographic 3-D salesperson that is implanted into the real world as an AR feature. The 3-D salesperson makes a personalized sales presentation to the pedestrian. The movie presents many other innovative examples for applications of AR.
CONCLUSION
As the computational power of computers continues its rapid advance, as prophesied by Moore (1965), developers of AR-based training, during the upcoming decades, will have the opportunity to create AR workstations that are portable and powerful. These portable and powerful workstations can enable AR in the real-world work settings of the aerospace industry. AR has the potential to positively affect training by enabling higher levels of recall and just-in-time training functions. Training could occur in the actual work setting and at times simultaneously with job task performance. The net positive effect resulting from the use of AR as a learning medium may derive from the learners’ ability to mentally match augmented information directly with the workpiece in front of them; future research is required to fully ascertain these effects on the cognitive activities associated with job tasks.
REFERENCES
Alessi, S. M., & Trollip, S. R. 2001. Multimedia for Learning: Methods and Development (3rd ed.). Boston: Allyn and Bacon.
Aviation Technician Education Council. 2020. Pipeline Report & Aviation Maintenance School Directory. https://www.atec-amt.org/uploads/1/0/7/5/10756256/atec-pipelinereport-truncated-20200416.pdf
Azuma, R. T. 1997. A Survey of Augmented Reality. Presence, 6(4), 355–385.
Azuma, R. T. 2004. Overview of Augmented Reality. Proceedings of the Conference on SIGGRAPH 2004. Los Angeles, CA.
Azuma, R. T., Baillot, Y., Behringer, R., Feiner, S., Julier, S., & MacIntyre, B. 2001. Recent Advances in Augmented Reality. IEEE Computer Graphics and Applications, 21(6), 34–47.
Bandura, A. 1986. Social Foundations of Thought and Action: A Social Cognitive Theory. Englewood Cliffs, NJ: Prentice Hall.
Bandura, A. 1997. Self-Efficacy: The Exercise of Control. New York: Freeman.
Billinghurst, M., & Kato, H. 2002. Collaborative Augmented Reality. Communications of the ACM, 45, 64–70.
Billinghurst, M., Kato, H., & Poupyrev, I. 2000. ARToolKit User’s Manual. Seattle, WA: University of Washington.
Billinghurst, M., Kato, H., & Poupyrev, I. 2001. The MagicBook: Moving Seamlessly Between Reality and Virtuality. Computer Graphics and Applications, 21(3), 2–4.
Bjork, R. A., & Bjork, E. L. (Eds.). 1996. Memory. San Diego, CA: Academic Press.
Boeing. 2020. Pilot and Technician Outlook 2020–2039. https://www.boeing.com/resources/boeingdotcom/market/assets/downloads/2020_PTO_PDF_Download.pdf
Brown, K. G. 2001. Using Computers to Deliver Training: Which Employees Learn and Why. Personnel Psychology, 54(2), 271–296.
Cooper, N., Milella, F., Cant, I., Pinto, C., White, M., & Meyer, G. 2016, September. Augmented Cues Facilitate Learning Transfer from Virtual to Real Environments. In 2016 IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct) (pp. 194–198). IEEE.
Cutmore, T. R. H., Hine, T. J., Maberly, K. J., Langford, N. M., & Hawgood, G. 2000. Cognitive and Gender Factors Influencing Navigation in a Virtual Environment. International Journal of Human-Computer Studies, 53, 223–249.
Czerwinski, M., Tan, D., & Robertson, G. 2002. Women Take a Wider View. Paper presented at the ACM, SIGCHI, Conference on Human Factors and Computing Systems [Spatial Cognition], Minneapolis, MN.
Dunleavy, M., & Dede, C. 2014. Augmented Reality Teaching and Learning. In J. Spector, M. Merrill, J. Elen, & M. Bishop (Eds.), Handbook of Research on Educational Communications and Technology, 735–745. New York: Springer.
Feiner, S., MacIntyre, B., & Hollerer, T. 1997. A Touring Machine: Prototyping 3D Mobile Augmented Reality Systems for Exploring the Urban Environment. Proceedings of the International Symposium on Wearable Computing, 74–81.
Fishkin, P., Gujar, A., Harrison, B., Moran, T., & Want, R. 2000. Embodied User Interfaces for Really Direct Manipulation. Communications of the ACM, 43(9), 75–80.
Frank, S., & Cohen, J. (Writer). 2001. Minority Report. Hollywood, CA: Twentieth Century Fox and DreamWorks LLC.
Gamberini, L. 2000. Virtual Reality as a New Research Tool for the Study of Human Memory. CyberPsychology and Behavior, 3(3), 337–342.
Gonzalez-Franco, M., Pizarro, R., Cermeron, J., Li, K., Thorn, J., Hutabarat, W., Tiwari, A., & Bermell-Garcia, P. 2017. Immersive Mixed Reality for Manufacturing Training. Frontiers in Robotics and AI, 4, 3.
Haberlandt, K. 1997. Cognitive Psychology (2nd ed.). Needham Heights, MA: Allyn and Bacon.
Hamilton, C. 1995. Beyond Sex Differences in Visuo-Spatial Processing: The Impact of Gender Trait Possession. British Journal of Psychology, 86(1), 1–20.
Ifenthaler, D., & Eseryel, D. 2013. Facilitating Complex Learning by Mobile Augmented Reality Learning Environments. In R. Huang, J. M. Spector, & Kinshuk (Eds.), Reshaping Learning (pp. 415–438). Berlin, Heidelberg: Springer.
Kalawsky, R., Stedmon, A. W., Hill, K., & Cook, C. 2000. A Taxonomy of Technology: Defining Augmented Reality. Paper presented at the Human Factors and Ergonomics Society Annual Meeting, Santa Monica, CA.
Kaplan, A. D., Cruit, J., Endsley, M., Beers, S. M., Sawyer, B. D., & Hancock, P. A. 2021. The Effects of Virtual Reality, Augmented Reality, and Mixed Reality as Training Enhancement Methods: A Meta-Analysis. Human Factors, 63(4), 706–726.
Kaufmann, H. 2003. Collaborative Augmented Reality in Education. Institute of Software Technology and Interactive Systems, Vienna University of Technology.
Knowles, M. 1984. The Adult Learner: A Neglected Species (3rd ed.). Houston: Gulf Port Publishing.
Lee, K. 2012. Augmented Reality in Education and Training. TechTrends, 56(2), 13–21.
Lovelace, E. A., & Southall, S. D. 1983. Memory for Words in Prose and Their Locations on the Page. Memory and Cognition, 11(5), 429–434.
Macchiarella, N. D. 2004. Effectiveness of Video-Based Augmented Reality as a Learning Paradigm for Aerospace Maintenance Training. Dissertation Abstracts International, 65(09), 3347A, (UMI No. 3148420).
Macchiarella, N. D. 2005. Augmenting Reality as a Medium for Job Task Training. Journal of Instruction Delivery Systems, 19(1), 21–24.
Macchiarella, N. D., Gangadharan, S. N., Liu, D., Majoros, A. E., & Vincenzi, D. A. 2005a. Application of Augmented Reality for Aerospace Maintenance Training. Proceedings of the 11th International Conference of Human Computer Interaction. Las Vegas, NV, CD-ROM, 1–5.
Macchiarella, N. D., Gangadharan, S. N., Liu, D., Majoros, A. E., & Vincenzi, D. A. 2005b. Augmenting Reality as a Training Medium for Aviation/Aerospace Applications. Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting. Orlando, FL, 2174–2178.
Macchiarella, N. D., & Haritos, T. 2005. A Mobile Application of Augmented Reality for Aerospace Maintenance Training. Proceedings of the 24th Digital Avionics Systems Conference, Avionics in a Changing Market Place: Safe and Secure. Washington, DC, 5.B.3-1 to 5.B.3-9.
Majoros, A., & Boyle, E. 1997. Maintainability. In G. Salvendy (Ed.), Handbook of Human Factors and Ergonomics (2nd ed., pp. 1569–1592). New York: John Wiley.
Majoros, A., & Neumann, U. 2001. Support of Crew Problem-Solving and Performance with Augmented Reality. Galveston, TX: Bioastronautics Investigators’ Workshop.
Mania, K., & Chalmers, A. 2001. The Effects of Levels of Immersion on Memory and Presence in Virtual Environments: A Reality Centered Approach. CyberPsychology and Behavior, 4(2), 247–264.
Mayer, R. 1992. Cognition and Instruction: Their Historic Meeting Within Educational Psychology. Journal of Educational Psychology, 84(4), 405–412.
Milgram, P., & Kishino, F. 1994. A Taxonomy of Mixed Reality Visual Displays. IEICE Transactions on Information and Systems, E77-D(12), 1321–1329.
Moore, G. E. 1965. Cramming More Components Onto Integrated Circuits. Electronics, 38(8), 1–4.
Moreno, R., & Mayer, R. 1999. Cognitive Principles of Multimedia Learning: The Role of Modality and Contiguity. Journal of Educational Psychology, 91(2), 358–368.
Nakamura, G. 1994. Scene Schemata in Memory for Spatial Relations. American Journal of Psychology, 107(4), 481–497.
Neumann, U., & Majoros, A. 1998. Cognitive, Performance, and System Issues for Augmented Reality Applications in Manufacturing and Maintenance. Proceedings of the IEEE Virtual Reality Annual International Symposium (VRAIS), 4–11.
Ormrod, J. 1999. Human Learning (3rd ed.). Upper Saddle River, NJ: Prentice-Hall.
Osborne, M., & Mavers, S. 2019, October. Integrating Augmented Reality in Training and Industrial Applications. In 2019 Eighth International Conference on Educational Innovation through Technology (EITT) (pp. 142–146). IEEE.
Ott, J. 1995. Maintenance Executives Seek Greater Efficiency. Aviation Week and Space Technology, 142, 2.
Petersen, N., & Stricker, D. 2015. Cognitive Augmented Reality. Computers & Graphics, 53, 82–91.
Pezdek, K., & Evans, G. W. 1979. Visual and Verbal Memory for Objects and Their Spatial Locations. Journal of Experimental Psychology: Human Learning and Memory, 5(4), 360–373.
Phaf, R., & Wolters, G. 1993. Attentional Shifts in Maintenance Rehearsal. American Journal of Psychology, 106(3), 353–382.
Pham, M., & Venkataramani, J. 1997. Contingent Processes of Source Identification. Journal of Consumer Research, 24(3), 249–266.
Poupyrev, I., Tan, D., Billinghurst, M., Kato, H., Regenbrecht, H., & Tetsutani, N. 2002. Developing a Generic Augmented-Reality Interface. Computer Magazine, 35(3), 44–50.
Reigeluth, C. M. 1999. The Elaboration Theory: Guidance for Scope and Sequence Decisions. In C. M. Reigeluth (Ed.), Instructional-Design Theories and Models: A New Paradigm of Instructional Theory (Vol. II). Hillsdale, NJ: Lawrence Erlbaum Assoc.
Rose, F. D., Attree, B. M., Brooks, D. M., Parslow, D. M., Penn, P. R., & Ambihaipahan, N. 2000. Training in Virtual Environments: Transfer to Real World Tasks and Equivalence to Real Task Training. Ergonomics, 43(4), 494–511.
Saariluoma, P., & Kalakoski, V. 1997. Skilled Imagery and Long-Term Working Memory. American Journal of Psychology, 110(2), 177–202.
Santos, M. E. C., Chen, A., Taketomi, T., Yamamoto, G., Miyazaki, J., & Kato, H. 2013. Augmented Reality Learning Experiences: Survey of Prototype Design and Evaluation. IEEE Transactions on Learning Technologies, 7(1), 38–56.
Shneiderman, B. 1998. Designing the User Interface, Strategies for Effective Human-Computer Design (3rd ed.). Reading, MA: Addison-Wesley.
Stapleton, C., Hughes, C., Moshell, M., Micikevicius, P., & Altman, M. 2002. Applying Mixed Reality to Entertainment. Computer, 35(12), 122–124.
Stedmon, A. W., Hill, K., Kalawsky, R. S., & Cook, C. A. 1999. Old Theories, New Technologies: Comprehension and Retention Issues in Augmented Reality Systems. Proceedings of the 43rd Annual Meeting of the Human Factors and Ergonomics Society. Santa Monica, CA, 1050–1054.
Stein, B., Littlefield, J., Bransford, J., & Persampieri, M. 1984. Elaboration and Knowledge Acquisition. Memory and Cognition, 12(5), 522–529.
Stone, R. 2001. Virtual Reality for Interactive Training: An Industrial Practitioner’s Viewpoint. International Journal of Human–Computer Studies, 55(4), 699–711.
Sullivan, L. 1976. Selective Attention and Secondary Message Analysis: A Reconsideration of Broadbent’s Filter Model of Selective Attention. Quarterly Journal of Experimental Psychology, 28, 167–178.
Thorndike, E. L. 1905. The Elements of Psychology. London: Routledge and Kegan.
Tulving, E., & Osler, S. 1968. Effectiveness of Retrieval Cues in Memory for Words. Journal of Experimental Psychology, 77(4), 593–601.
US General Accounting Office. 2003. FAA Needs to Update the Curriculum and Certification Requirements for Aviation Mechanics. Washington, DC: United States General Accounting Office.
Valimont, B. 2002. The Effectiveness of an Augmented Reality Learning Paradigm. Daytona Beach, FL: Embry-Riddle Aeronautical University.
Vincenzi, D., & Gangadharan, S. 2001. Project Proposal Collaborative Research on Augmented Reality. Daytona Beach, FL: Embry-Riddle Aeronautical University.
Vincenzi, D. A., Valimont, B., Macchiarella, N. D., Opalenik, C., Gangadharan, S., & Majoros, A. 2003. The Effectiveness of Cognitive Elaboration Using Augmented Reality as a Training and Learning Paradigm. Proceedings of the Human Factors and Ergonomics Society 47th Annual Meeting. Denver, CO, 2054–2058.
Waller, D. 2000. Individual Differences in Spatial Learning from Computer-Simulated Environments. Journal of Experimental Psychology, 6(4), 307–321.
Waller, D., Knapp, D., & Hunt, E. 2001. Spatial Representations of Virtual Mazes: The Role of Visual Fidelity and Individual Difference. Human Factors, 43(1), 147–158.
Witmer, B., Baily, J., & Knerr, B. 1996. Virtual Spaces and Real World Places: Transfer of Route Knowledge. International Journal of Human–Computer Studies, 45(4), 413–428.
Yates, F. A. 1984. The Art of Memory. London: Routledge and Kegan Paul.
Yeh, M., & Wickens, C. 2001. Display Signaling in Augmented Reality: Effects of Cue Reliability and Image Realism on Attention Allocation and Trust Calibration. Human Factors, 43(3), 355–365.
3 Flight Simulators Civil Aviation and Training
Ronald J. Lofaro and Kevin M. Smith
CONTENTS
Introduction
An Overview of Civil Aviation Training, Flight Simulators, and the Human Factors Therein
  Introduction
  Flight Simulator Basics
  Human Factors and Flight Simulators
  Major Drivers in Civil Aviation Pilot/Crew Training
Flight Simulators and Flight Training Devices
  Overview
  Flight Simulators and Training
  Flight Training Devices and Training
  Flight Simulator and FTD Assessment
SFAR 58, AND AQP
  History
  SFAR/AQP: Overview and Synopsis
  SFAR 58
  AQP, LOS/LOFT, and Simulators
Line-Oriented Flight Training (LOFT)
  Background
  Maximizing LOFT: The Mission Performance Model and the Operational Decision-Making Paradigm
  LOFT: Current and Future
  Risk Identification and Management: Training and Evaluation with MPM and ODM Paradigm
  Mission Performance Model
  From CRM/MPM to ODM
Operational Risk Management and Decision-Making
  Optimizing Performance during Complex Operations
  Introduction
  The Overall Mission Continuation Decision
  Determining Risk
DOI: 10.1201/9781003401353-3
Operational Envelope
The Unstable, Missed Approach Decision
Bayesian Probability
Problem-Solving under Conditions of Uncertainty
The Takeoff “Go/No-Go” Decision
Unexpected Operational Difficulties
Optimizing the Decision Function
Operational Analysis of the Takeoff Decision
Conclusion
ODM and MPM in LOFT Design, Development, and Evaluation
Introduction
LOF: DOM and MRM
LOFT Design: Another Approach
Training
References
Federal Aviation Administration Advisory Circulars (AC) and Regulations
INTRODUCTION

The use of flight simulators for civil pilot/crew training and performance evaluation has evolved to the point where they have become key and indispensable tools for air carrier education, and they will continue to be so. This is due to a conjunction of factors: safety, cost, simulator fidelity, and fairly recent changes and additions to the federal aviation regulations (FARs).

A short explanation of the Federal Aviation Administration (FAA) and the FARs follows. The FAA regulates aviation in the US, from air traffic to civil aviation security, the operation of air carriers, pilot training, and more. The federal regulations covering aviation are all found in the Code of Federal Regulations (CFR), Title 14, Aeronautics and Space; these are commonly referred to as the FARs. The FARs are divided into parts (1 through 199); each part has a descriptive title and is a specific and detailed regulation. For example, CFR 14, Part 121 is the FAR titled Operating Requirements: Domestic, Flag, and Supplemental Operations. The FARs are usually referred to by part number; this one would just be called Part 121.

As an aid to aviation, the FAA publishes advisory circulars (ACs) that relate to specific FARs. The ACs are designed to provide assistance and guidelines in complying with the FARs. In fact, these ACs, which are often many times longer than the FAR itself, are written in meticulous detail and serve as a “how to” template. The ACs are titled, numbered, and grouped by such areas as air traffic control, civil and general aviation security, pilot training, and so on. For example, AC 120-45A is the AC titled Airplane Flight Training Device Qualification. The 120 specifies the general area of air carrier and commercial operations and helicopters, whereas the 45 indicates that this is the 45th AC under the 120 area. The “A” means it is the first revision.
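The AC numbering convention just described (subject area, sequence number within that area, revision letter) can be illustrated with a minimal Python sketch. The names `ACDesignation`, `parse_ac`, and `revision_number` are hypothetical and introduced only for this illustration; the pattern assumes the simple form "AC 120-45A" used in this chapter and is not an official FAA format definition.

```python
import re
from typing import NamedTuple


class ACDesignation(NamedTuple):
    """Hypothetical container for the three parts of an AC label."""
    subject: int    # general subject area, e.g. 120
    sequence: int   # running number within that area, e.g. 45
    revision: str   # "" for the original issue, "A" for the first revision


# Assumed label shape: "AC", subject number, hyphen, sequence number,
# optional single revision letter (e.g. "AC 120-45A").
_AC_PATTERN = re.compile(r"^AC\s*(\d+)-(\d+)([A-Z]?)$")


def parse_ac(label: str) -> ACDesignation:
    """Split a label such as 'AC 120-45A' into its parts."""
    match = _AC_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a recognized AC designation: {label!r}")
    subject, sequence, revision = match.groups()
    return ACDesignation(int(subject), int(sequence), revision)


def revision_number(designation: ACDesignation) -> int:
    """Return 0 for the original issue, 1 for revision A, 2 for B, and so on."""
    if not designation.revision:
        return 0
    return ord(designation.revision) - ord("A") + 1
```

Under these assumptions, `parse_ac("AC 120-45A")` yields subject 120, sequence 45, and revision "A", which `revision_number` reports as the first revision. Labels that do not follow the numeric pattern, such as "AC 120-AQP", are rejected rather than guessed at.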
For the remainder of the chapter, we will refer to the FARs by part number only, and the same for ACs. A caveat: We will cite somewhat extensively from FARs and ACs. As these are large documents, and we have space limitations, we will sometimes omit sentences within a relevant portion of a citation. Occasionally, rather than use a long citation, we may synopsize what we are referring to. (For the interested reader, the complete FARs and ACs can be found on the FAA website: http://www.faa.gov)

We will begin this chapter with an overview of flight simulators (FS) and their use in the air carrier arena. From there, this chapter will proceed to a brief look at the human factors issues and problems in civil aviation’s use of flight simulators. These are, in the main, the same issues and problems of flight simulator use in any environment: fidelity, part-task trainers, transfer of training, motion axes, transition training, and the like. In addressing the human factors of civil aviation, we recognize that the hexapod axial motion bases and the veridicality of the FS to the actual flying environment are what have led to the use of the FS as THE major pilot/crew training, evaluation, and certification tool. As one example, flight simulators today have assumed such an important place in air carrier training that an aircraft-type rating can be obtained (almost) entirely in a flight simulator. Advances in the capabilities of this generation of flight simulators have led to modified FAA classification schemas (levels) for airplane flight simulators and flight training devices (FTDs) and to the first new pilot/crew training effort in over 30 years, the Advanced Qualification Program (AQP). Our HF focus will be on training and on the models for training and evaluating the skills we see as paramount for the pilot/crew in accomplishing risk identification and risk management.
Thus, the chapter will deal extensively with the use and maximization of line-oriented flight training (LOFT). The Air Transport Association of America (ATA) and the FAA have worked on the best ways to design, develop, and implement LOFT and line-oriented evaluation (LOE) in flight simulators, as shown by the relevant FARs and ACs. The result is that LOFT, which is done entirely in a flight simulator, has become, over the past 15–17 years, what could be called the “crown jewel” of air carrier pilot/crew training. LOFT and LOE received initial impetus from their use in crew resource management (CRM) training. However, in the mid-1990s CRM encountered problems; one result was that the “Big Three” US flag air carriers (American, Delta, and United) almost completely revamped and renamed their CRM programs (Aviation Week and Space Technology, 1996). LOFT, always recognized as the key element in CRM, has thus become more and more recognized as the independent and indispensable training and evaluation tool for pilot/crew performance. The FAA’s emphasis on air carriers’ going to a new training and certification paradigm, the above-mentioned Advanced Qualification Program as spelled out in Special FAR (SFAR) 58, has further enhanced the role of LOFT and LOE. Lastly, this chapter will deal with LOFT design, development, and use in crew performance and evaluation, using the mission performance model and the operational decision-making (ODM) paradigm. We will lay out the use of LOFT for
learning/practice in operational decision-making, which results in risk assessment and reduction. The authors have long considered these to be the key functions for any airline captain and crew.
AN OVERVIEW OF CIVIL AVIATION TRAINING, FLIGHT SIMULATORS, AND THE HUMAN FACTORS THEREIN

Introduction

There are three somewhat obvious benefits from the use of a flight simulator in training. These are the underpinning of the now-extensive use of flight simulators in general aviation, civil aviation, and military aviation. Briefly put, they are:

1. Cost reduction and increased efficiency by replacing the real system, the plane, with the flight simulator.
2. Reduction in the hazards of training in the plane. The loss of life and injuries that result from training accidents and incidents are well-documented.
3. The ability to train skills and performances that cannot be trained in the aircraft, such as malfunctions and adverse conditions and, more important, missions/tasks that may never be performed in real-world operations but are essential components of the operational mission profile of the aircraft (cf. Flexman & Stark, 1987).

The extensive use of flight simulators in the civil aviation world has resulted from both recognizing these benefits and a confluence of other causes. The first, as said, is that the flight simulator is a safe environment, putting neither crew nor planes at risk. However, although crew and aircraft safety is foremost, looking further we see the following set of converging vectors. In the formative years, and continuing well into the late 1980s and early 1990s, both civil aviation and the FAA relied on ex-military personnel (mainly active-duty personnel who left after fulfilling their commitment, but also some retirees in their 40s) as a source of experienced manpower. In civil aviation, these former military personnel included pilots and aviation maintenance technicians (AMTs). The same was true in the FAA, where there was, perhaps, a higher emphasis placed on the ex-military pilot, who could be placed in the flight standards, aircraft, and pilot certification areas without missing a beat.
The rationale for the aviation industry seeking (and welcoming) military personnel is both obvious and subtle. It is obvious that the military was a source of highly trained and qualified personnel. Plus, their training was both extensive and standardized. It is also true that there is a “brotherhood of airmen,” where inclusion is highly dependent on an airman’s background, experiences, training, and even common acquaintances. Add to these factors a common “language,” one that is technical, acronym-laden, and replete with idiomatic expressions. All of the above are still active (albeit to a lesser degree) today. One result of the influx of military personnel was the use of the flight simulator as the major training tool for aircrew.
In passing, it must be remembered that the earliest flight simulator, with replicated cockpit instrumentation, controls, and most aspects of flight, built by Edward A. Link in 1929, was a generic flight simulator for general aviation (GA) use. This Link trainer, whose picture we have all seen (it looks like a child’s drawing of a plane mounted on a base), soon evolved into an instrument flight trainer as a result of World War II. From there, the great advances in flight simulators had to do with their use in military training. Early flight simulator development, in both capability and capacity for training, was aimed at training military pilots/air crew. It was natural, as many of the early air carrier and FAA personnel had military backgrounds, that the use of the flight simulator in training and certification became paramount.

Add to this a piece of reality that is often overlooked: The air carriers simply do not have enough planes to take significant, or even small, numbers out of service for pilot training. As one example, the largest US flag carriers own/lease more than 300 planes each. Unlike the military, which has large numbers of aircraft that are dedicated only to training pilots, the air carriers must use their planes, in the main, to generate income. Considering that the air carriers run 24/7 schedules, it is apparent that the flight simulator is, must be, and will continue to be the training and certification tool in civil aviation, as the planes owned or leased by the carriers are used, and must be used, to generate revenue.

To resume: This military use of the flight simulator drove much of early flight simulator technology as aircraft became more complex, more automated, and achieved higher performance. On the civil aviation front, the air carriers began to insist that the delivery of a flight simulator for a new aircraft type be simultaneous with, or even precede, the entry of that aircraft type into everyday service.
Similarly, the civil side of aviation demanded more and more simulator capability (fidelity and veridicality). The major air carriers used their pilot and crew training facilities to house a growing number of flight simulators. Companies that made flight simulators worked closely with both the airframe manufacturers and the air carriers to design and deliver flight simulators that met the changing needs of pilot training and certification. Today, we have civil aviation flight simulators for aircraft such as the B-757/767 and 777 that cost upwards of $35 million and cost thousands of dollars per hour to operate. These operational costs include maintenance, the simulator operator, and the air conditioning needed to keep the temperature in the flight simulator facility at a level that does not impact the highly sophisticated computers that drive the flight simulator.

The final vector, or piece, of the equation has been the technological advances in flight simulator capability during the 1980s and 1990s. The advances in the fidelity of the visual scene presented to the crew, as well as the fidelity of the response of flight simulator instruments (be they “glass” or “steam”) to inputs by the crew, are outstanding examples. In summation, through a convergence of causal vectors, the period from approximately 1970 to today has seen the emergence of the flight simulator as the essential tool for pilot training and certification in the civil area.
Flight Simulator Basics

The primary functions of any flight simulator (i.e., the functional definitions, as it were) are:

1. To present information that the real system would present for the purpose of training
2. To provide a practice environment that facilitates and enhances the skills and knowledge of the pilot and thus provides learning that enhances performance in the real system, the airplane

Put into other terms, a flight simulator is a system designed to “imitate” the functions of another system (a plane) in a real operational environment and to be a realistic substitute that responds realistically to flight crew inputs. The key here is that a flight simulator can be programmed to offer varied experiences to a flight crew, but experiences that are safe, in that if you “crash” in a simulator, there is no injury, save to your pride. (We will return to the role of experience in flight training and flying later in this chapter.) Basically, a flight simulator is a training device that is safer, less expensive, capable of quick modification, and can operate in all weather and for all or any part of a 24-hour day. The flight simulator presents accurate cockpit displays to the flight crew and accurately (except for complete motion capability) responds to control and avionics inputs, all the while processing and storing data on the crew’s control inputs, etc. The characteristics of a flight simulator are that it:

1. Stores data on crew input/crew response that can be replayed and analyzed
2. Stores data that can be used to generate a realistic “environment”/mission or portion thereof, as well as control and other responses to crew inputs
3. Displays such data/information both to the flight crew and to the flight simulator operator
4. Responds to crew inputs accurately as to their effects on both system and environment
5. As would the actual plane, has accurate/valid displays of the status of on-board systems/components (e.g., EPRs) that are so vital for the crew to see and monitor
6. Provides two-way training interfaces for the flight simulator instructor and the flight crew being trained

There are research applications of different types of flight simulators. The engineering development flight simulator is used in the cockpit/flight control systems design phase. This flight simulator makes system design easier because it is quickly reconfigurable, so that one can conduct system experiments on changes and reconfigurations without having to build or tear down a real system. The research/engineering simulator is a system used to help with basic R&D and applied research on system operations (to look at differences in various flight-control
systems; e.g., fly by wire vs. manual vs. fly by light). This flight simulator has many of the capabilities mentioned for the engineering development flight simulator. It can also evaluate human performance limits in the system and evaluate system interaction with other systems (data-link, dispatch, ATC). Lastly, the research/engineering flight simulator can be used to train personnel in the operation of the system.

Although, as mentioned, there are several types of flight simulators, the focus in this chapter will be what is often termed the “full-up” flight simulator. Simply put, a “full-up” flight simulator has:

1. A hexapod motion base with six degrees of freedom: rotation in pitch, roll, and yaw, plus translation along the three axes
2. Computer-generated graphic displays for the “out the windows” visual scene. These include most types of weather and environments, as well as night scenarios
3. A complete cockpit that is the same as in the aircraft type that the flight simulator models
4. Tremendous computer memory, which allows for superb capability in realistic flight simulation

In short, the flight simulator has ultra-realism, except for damping in the motion bases. These flight simulator capabilities are usually ranked in terms of fidelity: the closeness with which the flight simulator mimics actual flight. What may be of more import is the flight simulator’s veridicality (see below).
Human Factors and Flight Simulators

The human factors training functions that a flight simulator can address are many and varied. We will simply list many of them. Such a list must include briefings and demonstrations, practice, performance analysis, learning enhancement, providing knowledge of results of actions, providing supplementary cues to the flight crew, building cognitive structures, performance assessment and, finally, providing a safe environment for introducing adverse conditions, malfunctions, and outright mechanical and other failures (cf. Flexman & Stark, 1987).

Human factors training as a major component of pilot training became possible with the advent of AQP. This was the first time that a complete review of traditional training became possible. Air carriers were no longer tied to the existing FARs but were allowed to build programs to meet more specifically targeted needs. This training usually included CRM but, more accurately, allowed curriculum development that addressed complex problem-solving events that a flight crew would encounter operating under adverse conditions. Thus, one saw an explosion of “line-oriented” training and certification events as part of this new emphasis on human factors.

There are many issues, human factors being a major one, in the building and use of flight simulators for the training and evaluation of pilot/crew performance. We have briefly mentioned some, although we did not identify them as issues, in the preceding subsection. These issues cluster around the level of flight simulator fidelity
necessary for training and/or evaluating a pilot/crew task. This is not to say that other concerns do not exist. But, in the main, the use of a flight simulator in training and evaluation deals with what is called fidelity (of the flight simulator to the aircraft) in terms of (a) the motion bases, (b) the out-the-windscreen/window visuals, and (c) the simulator response to control inputs. All these issues can be subsumed under the more general question of “How veridical to actual flight does a flight simulator have to be in order to ensure training that fully prepares a pilot/crew for actually flying the plane?”

Veridicality is the closeness of the correspondence of the knowledge structures formed by using the flight simulator (learning and using controls/input responses/instrument responses/visual scene/motion, etc.) to the information environment it represents, that is, the actual aircraft type. Because the flight simulator is used in training to build knowledge structures in the crew that will be used in actual flight, it is obvious that veridicality is the primary factor in flight simulator design and use. The higher the fidelity of a flight simulator, the more veridical the knowledge structures built in the flight simulator are, making the flight crew optimally prepared for actual flight. To be sure, we do not want to give the impression that other issues do not exist; these could include incorrect control inputs, incorrect sequencing, poor or incorrect decision-making, and more. However, these are beyond the scope of this chapter.

We will, however, look at one basic human factors/training assumption: The use of any flight simulator for training tasks and skills results in the skills gained transferring to the actual cockpit, called “transfer of training.” The assumption is that the knowledge structures and information acquired previously on one task affect (neutrally) the ability to be trained on another/other tasks.
Although we are talking of tasks here, the intent is that the learning of tasks trained in the flight simulator will aid the learning (performance) of a new task in the plane. Confusing as this may sound, the task learned in the flight simulator, when performed in the plane, is referred to as a new task. Why? Because the task in the plane, even though it is the same as the task in the flight simulator, is now being done in the plane; therefore, it is called a new task. Note: Transfer of training also occurs from plane to flight simulator.

There are two types of training transfer. The desired one is called positive transfer, where previous training/experience aids “learning” of new tasks. However, there also exists negative transfer, where previous experience interferes with learning new tasks or performing the trained, “old” task in a new environment. It is important again to note that positive and negative transfer can occur in either direction: flight simulator to plane or plane to flight simulator. There are some examples of skills gained in flying that are not transferable to the cockpit. This would seem to have to do with the veridicality of the flight simulator to the aircraft and can present some difficult problems. It would also seem to have to do with how the flight simulator was certified for use.

As one component of transfer of training, there has been great emphasis on the need for and value of the flight simulator having the capability for hexapod axial motion. At this writing (2006), the FAA, NASA, and the DOT/Volpe Transportation
Center were conducting sophisticated experiments on the benefits of and need for hexapod motion for the platform on which the airline flight simulator is mounted. The ultimate goals of these experiments (Burki-Cohen et al., 2003; Go et al., 2003) are to provide information for a possible FAA AC and to develop information for a possible FAA policy on flight simulator motion requirements in airline pilot training and evaluation. A brief overview of the research findings to date is that hexapod platform motion has a significant positive effect for flight crew evaluation, but no significant benefit for training. Further, certain enhancements to the motion washout filters (lateral side force and heave motion) seem to be beneficial in all cases. However, for more complete information, the reader is referred to the works cited above, as it is not our purpose here to engage in a lengthy discussion of flight simulator motion issues and research.

Another problem is that of “simulator sickness”: a phenomenon in which a pilot becomes sick (vertigo, nausea) in the flight simulator. This has been handled in several ways. It now seems clear that it is caused by a complex interaction that involves conflicting visual and kinesthetic cues. Another part of the interaction seems to be the time that elapses between flying and using the flight simulator. It would appear that, for some individuals, use of the flight simulator soon after having actually flown can cause simulator sickness. This is (usually) easily remedied by specifying a minimum time that must pass between flight and flight simulator use. In some cases, there has been a reverse “simulator sickness” reaction whereby the pilot becomes sick in the aircraft. Obviously, this is dangerous, as well as possibly career ending. To the authors’ knowledge, this has been handled by specifying (as before) a minimum time between flight simulator use and flight.
Although medication, ranging from over-the-counter motion sickness pills to prescription antivertigo drugs, can be used, it would seem to be only a one-time or short-term remedy. The rationale here seems obvious. We will now leave the above issues, as we are quite sure that they are well covered in other chapters of this text, and the issues/problems that we mention in passing are not the focus of this chapter.
Major Drivers in Civil Aviation Pilot/Crew Training

Air crew training for air carrier pilots (Part 121) has evolved over the years due to five main drivers. They are:

1. Technical advancements in aircraft systems and simulation realism
2. Engine out operations
3. Mission-critical alerts and warnings
4. Adverse condition operations
5. Human factors
Technical advancements in aircraft systems have, for obvious reasons, driven pilot training programs. The features of a new system and how it should be
used operationally have always been built into the curriculum. A good example of this is the traffic alert and collision avoidance system, version II (TCAS II), a hardware/software system that provides the aircrew with alerts for avoiding collisions with other aircraft. Although TCAS I was initially introduced with a part-task trainer, it is now an important feature of full-mission simulators, and collision avoidance training is now possible.

Engine out operations as a major driver of pilot training are not quite so obvious. With the advent of multiengine aircraft, engine-out training had always been important to pilot certification. However, with the advent of swept-wing turbojet aircraft, this training took center stage. This was due to the unique aerodynamic properties of swept-wing aircraft, more specifically, asymmetrical thrust and axis coupling. During asymmetrical thrust operations, the swept-wing turbojet aircraft experiences pronounced axis coupling, manifested in a rapid roll off or wing drop along with an equally pronounced yaw. Increased pilot skill was and is the only counter-tactic to this potentially fatal condition. In 1968, a DC-8 training accident involving asymmetrical thrust prompted the effort to conduct all such training in the simulator. Such an effort was successful, prompting, among other things, advancements in simulator realism.

Although often overlooked, the third major pilot training and certification driver has been and continues to be mission-critical alerts and warnings. This area includes such maneuvers as stalls and steep turns, wind shear recovery, and ground proximity warning recovery. Recent additions include CAT III auto land system failure recovery maneuvers. It is important to note that all of these recovery maneuvers require the aircraft to be “hand-flown” by the pilot.
This last statement brings to the surface what we think is the major challenge facing training managers today: the tension line between increasingly sophisticated autopilot systems and the continuing and pressing need for a high degree of basic “stick and rudder” pilot skills. Pilots therefore need to demonstrate proficiency in (1) adverse conditions (which include engine out operations), (2) low visibility operations, (3) mission-critical alerts and warnings, and (4) system and human limitations. We shall later show how these four conditions form the boundaries for the pilot’s worldview and how they can be incorporated into a training/operation model to manage/reduce risk.
FLIGHT SIMULATORS AND FLIGHT TRAINING DEVICES

Before we begin this and the following sections, we again feel constrained to make the following caveat. Much of the following is from FAA documents: SFAR 58 and its accompanying AC (AC 120-AQP); AC 120-40B; AC 120-45A; AC 120-35B; and AC 120-45B. For brevity, “ease of flow,” and resultant clarity, we have condensed some of the material in these; we omitted portions not pertinent to this chapter and often deleted references to other FARs and ACs; on some occasions, there is paraphrasing. As has been said, the complete documents are online if the reader wants to see the entirety of any of the FARs and ACs cited.
Overview

The availability of advanced technology has permitted greater use of flight simulators for training and checking of flight crewmembers. The complexity, costs, and operating environment of modern aircraft also have encouraged broader use of advanced simulation. Simulators can provide more in-depth training than can be accomplished in airplanes and provide a very high transfer of learning and behavior from the simulator to the airplane. The use of simulators in lieu of airplanes results in safer flight training and cost reductions for the operators. It also achieves fuel conservation and a reduction in adverse environmental effects.

As technology progressed and the capabilities of flight simulation were recognized, FAR revisions were made to permit the increased use of simulators in approved training programs. Simulators have been used in training and some checking programs since the middle 1950s. Various FAR amendments gradually permitted additional simulator use in training and checking aircrews. A significant recognition of simulator capability has occurred since the early 1970s. In December 1973, FAR Amendments 61-62 and 121-108 permitted additional use of visual simulators. In the early 1990s, various ACs and SFAR 58 further recognized simulator capability and use in training and evaluating flight crews.

Of importance is the fact that the FAA makes a distinction between an airplane (flight) simulator and an airplane flight training device. The FAA AC that deals with airplane simulators is AC 120-40B, and AC 120-45B deals with FTDs. The term FTD covers everything from a PC with training-specific software, to a mock-up of an instrument panel, to a complete cockpit. However, what air carrier pilots/crew use for our focal point, LOFT, is a full-up, hexapod axial motion-based, high-quality visual scene flight simulator, often called the “box” or the “sim.” Although FTDs are used as part-task trainers, it is the sim alone that is used for LOFT.
The box has full mission capabilities to include ATC chatter/instructions as well as day/night and various weather and wind conditions. The simulator operator can program a mission (a flight from point A to point B) that introduces the full spectrum of conditions and problems gleaned from the experiences and reports of other pilots who have flown that particular route. The mission simulation also introduces conditions and problems that have been encountered or reported on other flights/routes.
Flight Simulators and Training

An airplane simulator (commonly called a flight simulator) is a full-size replica of an airplane’s instruments, equipment, panels, and controls in an open flight deck area or an enclosed airplane cockpit, including the assemblage of equipment and computer software programs necessary to represent the airplane in ground and flight operations; a visual system providing an out-of-the-cockpit view; and a force (motion) cueing system that provides cues at least equivalent to those of a three-degree-of-freedom motion system. It is also in compliance with the minimum standards for a Level A simulator specified in AC 120-40, as amended.
The airplane simulators are placed, graded as it were, into four levels, A through D; the FTDs are similarly ordered, except that a classification scheme of Levels 1 through 7 is used. In both cases, the levels refer to the capabilities and complexities (hardware and software) of the training equipment. In both cases, all equipment is placed in a matrix, by level, that indicates what flight tasks can be trained at each level. The new designations and their relationships with the simulator definitions used in FAR Part 121, Appendix H are:

Level A: Visual
Level B: Phase I
Level C: Phase II
Level D: Phase III

While trying not to oversimplify this distinction, the main difference is that a “full-up” airplane simulator has axial motion capability whereas an FTD does not. This will become clearer below as we give the FAA definitions of both types of flight training equipment.
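The correspondence between the lettered levels and the earlier Appendix H names can be held in a small lookup table. The following Python sketch is illustrative only: the names `LEGACY_DESIGNATION`, `legacy_designation`, and `meets_level` are hypothetical, and the ordering check simply assumes that A through D run from least to most capable, as the text describes.

```python
# Mapping from the current simulator level letters to the earlier
# designations used in FAR Part 121, Appendix H (as listed in the text).
LEGACY_DESIGNATION = {
    "A": "Visual",
    "B": "Phase I",
    "C": "Phase II",
    "D": "Phase III",
}

# Level letters in ascending order of capability (assumed from the text).
_LEVEL_ORDER = "ABCD"


def legacy_designation(level: str) -> str:
    """Return the older Appendix H name for a simulator level letter."""
    key = level.strip().upper()
    if key not in LEGACY_DESIGNATION:
        raise ValueError(f"unknown simulator level: {level!r}")
    return LEGACY_DESIGNATION[key]


def meets_level(actual: str, required: str) -> bool:
    """True if a simulator of level `actual` is at or above `required`."""
    return _LEVEL_ORDER.index(actual.upper()) >= _LEVEL_ORDER.index(required.upper())
```

For instance, `legacy_designation("C")` returns "Phase II", and `meets_level("D", "B")` is true because Level D sits above Level B in the assumed ordering. Which tasks may actually be trained or checked at each level is set by the FAA matrices, which this sketch does not attempt to reproduce.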
Flight Training Devices and Training
An airplane flight training device is a full-scale replica of an airplane’s instruments, equipment, panels, and controls in an open flight deck area or an enclosed airplane cockpit, including the assemblage of equipment and computer software programs necessary to represent the airplane in ground and flight conditions to the extent of the systems installed in the device. An FTD does not require a force (motion) cueing or visual system and meets the criteria outlined in the AC for a specific flight training device level. In an FTD, any flight training event or flight checking event can be accomplished. Nonvisual simulators are now grouped with Level 6 training devices, but must meet the requirements, except for visual, of a Level A simulator. There is no other change in their characteristics or description; only their name has changed. Alphabetic designations were chosen for simulators to maintain a distinction from the numerically designated training devices. In coordination with a broad cross section of the aviation industry, the FAA has defined seven levels of flight training devices as Level 1 through Level 7. Level 1 is currently reserved. Levels 2 and 3 are generic in that they are representative of no specific airplane cockpit and do not require reference to a specific airplane. Levels 4 through 7 represent a specific cockpit for the airplane represented. Within the generic or specific category, every higher level of flight training device is progressively more complex. Because of the increase in complexity and more demanding standards when progressing from Level 2 to Level 7, there is a continuum of technical definition across those levels. (Note: For complete matrices of flight simulator and FTD levels and the tasks that can be trained and/or checked in each device, see AC 120-40B and AC 120-AQP.)
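The level scheme described above can be restated as a small lookup, which some readers may find easier to scan than the prose. The following Python sketch is illustrative only: the names SIMULATOR_LEVELS and ftd_category are our own and appear in no FAA document, and the sketch encodes only the distinctions stated in the text (simulator Levels A through D with their former designations, and FTD Levels 1 through 7 as reserved, generic, or specific). It does not reproduce the task matrices in the ACs.

```python
# Illustrative restatement of the level scheme described in the text.
# All identifiers here are hypothetical; they are not FAA terminology.

# Simulator levels and the former designations they replaced.
SIMULATOR_LEVELS = {
    "A": "Visual",
    "B": "Phase I",
    "C": "Phase II",
    "D": "Phase III",
}


def ftd_category(level: int) -> str:
    """Return the category of a flight training device level (1-7)."""
    if level == 1:
        return "reserved"   # Level 1 is currently reserved
    if level in (2, 3):
        return "generic"    # representative of no specific airplane cockpit
    if 4 <= level <= 7:
        return "specific"   # represents a specific airplane cockpit
    raise ValueError(f"FTD levels run from 1 to 7, got {level}")


if __name__ == "__main__":
    for lvl in range(1, 8):
        print(f"FTD Level {lvl}: {ftd_category(lvl)}")
    for letter, former in SIMULATOR_LEVELS.items():
        print(f"Simulator Level {letter}: formerly {former}")
```

The two schemes are kept in separate structures on purpose, mirroring the FAA's choice of alphabetic designations for simulators and numeric designations for training devices.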
Flight Simulator and FTD Assessment
Standard flight simulator and FTD assessment and qualification criteria were necessitated by the use of simulators for training and checking. The evolution of simulator technology and the concomitant increase in permitted use have required a similar evolution of the criteria for simulator qualification. A listing of known simulator criteria should, therefore, be informative. The qualification basis for a given simulator may be any of the past criteria, depending on when the simulator was first approved or last upgraded. The training and checking credits for nonvisual and visual simulators were delineated in FAR Part 61, Appendix A, and FAR Part 121, Appendices E and F. Four levels of simulators were addressed: Basic (nonvisual and visual simulators), Phase I, Phase II, and Phase III. (These designations have since been replaced by levels A through D as seen in subsection A.) Each of the four levels is progressively more complex than the preceding level and each contains all the features of preceding levels plus the requirements for the designated level. As the technology has advanced, so has the qualification guidance. Efforts to keep the criteria updated are, therefore, ongoing with active participation from both industry and government resources. Any FTD or airplane flight simulator must be assessed in those areas that are essential to accomplishing airman training and checking events. The assessment requirements and guidelines are, essentially, the same for both FTD and flight simulator. This includes climb, cruise, descent, approach, and landing phases of flight. Crewmember station checks, instructor station functions, checks, and certain additional requirements depending on the complexity of the device (e.g., touch-activated cathode ray tube instructor controls; automatic lesson plan operation; selected mode of operation for “fly-by-wire” airplanes) must be thoroughly assessed.
Should a motion system or visual system be contemplated for installation on any level of flight training device, the operator or the manufacturer should contact the NSPM for information regarding an acceptable method for measuring motion and/or visual system operation and applicable tolerances. The motion and visual systems, if installed, will be evaluated to ensure their proper operation. The FAA’s intent is to evaluate flight simulators and FTDs as objectively as possible. Pilot acceptance, however, is also an important consideration. Therefore, the flight simulator or FTD will be subjected to the validation tests listed in the relevant ACs. These tests include a qualitative assessment by an FAA pilot who is qualified in the respective airplane or set of airplanes. Validation tests are used to compare flight simulator or FTD data objectively with airplane data (or other approved reference data) to assure that they agree within a specified tolerance. Functions tests provide a basis for evaluating the capability of the flight simulator or FTD to perform over a typical training period and for verifying correct operation of the controls, instruments, and systems. The above subsections should suffice as an introduction to the FARs and ACs as they apply to defining flight simulators and FTDs, as well as to the concept of “levels” of flight simulators and FTDs. When we deal with LOFT later in this chapter, it will be LOFT as done in a Level D flight simulator.
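The idea behind an objective validation test, comparing device data with reference data and checking agreement within a specified tolerance, can be sketched in a few lines. The Python example below is a minimal illustration under our own assumptions: the function name out_of_tolerance, the sample values, and the tolerance of 3.0 are hypothetical and are not taken from any AC, which specify their own parameters, conditions, and tolerances.

```python
from typing import List, Sequence


def out_of_tolerance(
    device: Sequence[float],
    reference: Sequence[float],
    tolerance: float,
) -> List[int]:
    """Return the indices of samples where the device (simulator or FTD)
    value differs from the reference (airplane) value by more than the
    tolerance. An empty list means the two data sets agree throughout."""
    if len(device) != len(reference):
        raise ValueError("device and reference data must have equal length")
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return [
        i
        for i, (d, r) in enumerate(zip(device, reference))
        if abs(d - r) > tolerance
    ]


if __name__ == "__main__":
    # Hypothetical samples of one recorded parameter, in arbitrary units.
    simulator_data = [100.0, 102.5, 110.0]
    airplane_data = [100.0, 101.0, 104.0]
    failures = out_of_tolerance(simulator_data, airplane_data, tolerance=3.0)
    print(failures)  # [2]: only the third sample differs by more than 3.0
```

A check of this kind addresses only the objective comparison; the qualitative assessment by a qualified FAA pilot, also described above, is a separate matter.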
SFAR 58, AND AQP
History
In 1975, the FAA began to deal with two issues: hardware requirements needed for total flight simulation and the redesign of training programs to deal with increasingly complex human factors problems. At the urging of the air transportation industry, the FAA addressed the hardware issue first. This effort culminated in 1980 in the development of the Advanced Simulation Program. Since then, the FAA has continued to pursue approaches for the redesign of training programs to increase the benefits of advanced simulation and to deal with the increasing complexity of cockpit human factors. A joint government–industry task force was formed on flight crew performance issues. On September 10, 1987, the task force met at the Air Transport Association’s headquarters to identify and discuss flight crewmember performance issues. Working groups in three major areas were formed, and the recommendations to the joint task force were presented to the FAA administrator. Some of the substantive recommendations to the FAA administrator from the flight crew member training group were the following:
a. Provide for a Special Federal Aviation Regulation (SFAR) and Advisory Circular to permit development of innovative training (SFAR 58)
b. Require all training to be accomplished through a certificate holder’s training program
c. Provide for approval of training programs based on course content and training aids rather than specified programmed hours (SFAR 58)
d. Require Cockpit Resource Management (121.404, SFAR 58) training and encourage greater use of line-oriented flight training
SFAR/AQP: Overview and Synopsis
In this subsection, we will show how the relevant FAR (SFAR 58) on AQP came into being, the portions of it that directly impact the use of flight simulators, and finally, why and how SFAR 58 and the accompanying extensive AC have changed civil pilot training (the most significant change being the enhanced role of flight simulators and LOFT). We will now present a brief look at the aspects of SFAR 58 that pertain to training and to the use of flight simulators and other training devices. Note: Any Special FAR (SFAR) expires within 5 years unless extended or made into an FAR. SFAR 58 would have expired in late 1995, but has been extended until 2 October, 2005. It is interesting to note that the original AQP AC accompanying SFAR 58 was published in 1990. It has been updated and is in the process of finalization to be reissued in its newest version. It was expected that this would occur in early 2004, some 14 years after the original AC was published.
In response to the recommendations from the joint government–industry task force and from the National Transportation Safety Board (NTSB), the FAA put forward SFAR 58, Advanced Qualification Program, in October 1990. AQP was also established to permit a greater degree of regulatory flexibility in the approval of innovative pilot training programs. Based on a documented analysis of operational requirements, an airline (FAA certificate-holder) under AQP may propose to depart from traditional training practices and requirements for pilot/crew with respect to what, how, when, and where training and testing are conducted. This is subject to FAA approval of the specific content of each proposed program. SFAR 58 requires that all departures from traditional regulatory requirements be documented and based upon an approved continuing data collection process sufficient to establish at least an equivalent level of safety. AQP provides a systematic basis for matching technology to training requirements and for approving a training program with content based on relevance to operational performance.
SFAR 58
SFAR 58 provides for approval of an alternate method, Advanced Qualification Program, for qualifying, training, certifying, and otherwise ensuring competency of crew members, aircraft dispatchers, other operations personnel, instructors, and evaluators who are required to be trained or qualified under parts 121 and 135 of the FAR or under this SFAR. For pilots in command, seconds in command, and flight engineers, a proficiency evaluation (a portion of which may be conducted in an aircraft, flight simulator, or flight training device as approved in the certificate holder’s curriculum) must be completed during each evaluation period. Each AQP qualification and continuing qualification curriculum must include approved training on and evaluation of skills and proficiency of each person being trained under an AQP to use their cockpit resource management skills and their technical (piloting or other) skills in an actual or simulated operational scenario. (The integrated assessment of CRM and technical flight skills will be discussed later.) For flight crew members this training and evaluation must be conducted in an approved flight training device or flight simulator. A person enrolled in an AQP is eligible to receive a commercial or airline transport pilot, flight engineer, or aircraft dispatcher certificate or appropriate rating based on the successful completion of training and evaluation events accomplished under that program if the applicant shows competence in required technical knowledge and skills (e.g., piloting) and cockpit resource management knowledge and skills in scenarios that test both types of knowledge and skills together. (Note: There are other requirements, but, as said, we are focusing on the flight simulator in AQP.)
As has been said, any flight simulator or FTD that will be used in an AQP for one of the following purposes must be evaluated by the FAA administrator for assignment of a flight training device or flight simulator qualification level:
(i) Required evaluation of individual or crew proficiency
(ii) Training activities that determine if an individual or crew is ready for a proficiency evaluation
(iii) Activities used to meet requirements for recent experience
(iv) Line operational simulations (including LOFT)
AQP, LOS/LOFT, and Simulators
The capabilities and use of simulators and other computer-based training devices in training and qualification activities have changed dramatically. SFAR 58 and AC 120-AQP allow certificate holders that are subject to the training and evaluation requirements of Part 121 and Part 135 to develop innovative training and qualification programs that incorporate the most recent advances in training methods and techniques. SFAR 58 and the AC also apply to training centers under Part 142 that intend to provide training for eligible certificate holders. AQP emphasizes crew-oriented training and evaluation. These training and evaluation applications are now grouped under the general term of line operational simulations (LOS), including LOFT, special purpose operational training, and line operational evaluation. Due to the role of crew resource management issues in fatal accidents, it has become evident that LOS is the most appropriate environment to train and evaluate both technical and CRM skills. Consequently, a structured LOS design process is necessary to specify and integrate the required CRM and technical skills into line-oriented LOS scenarios. These should provide the opportunity for training or evaluation, as appropriate, in accordance with approved AQP qualification standards. All of the above can be done in an FAA-approved flight simulator.
LINE-ORIENTED FLIGHT TRAINING (LOFT)
Background
LOFT emphasizes an orientation on events that could be encountered in line operations (“flying the line”). Thus, mission realism (making the LOFT session correspond as closely as possible to event sets that could or would be encountered in flying one or more point A to point B legs) becomes the major driver in LOFT design. In other words, the events that make up a LOFT scenario should pass the test of mission realism: it must be reasonable to assume that they “could” happen in the real world. The use of flight training devices and flight simulators has become increasingly important in training flight crew members. As the level of sophistication in simulators has increased, air carriers have come to rely on simulators for part or all of their flight training programs. Since the mid-1970s, some FAR Part 121 and Part 135 operators have implemented alternative simulator training (now LOFT) to train crewmembers. LOFT is training in a simulator with a complete crew using representative flight segments that contain normal, abnormal, and emergency procedures that may be expected in line operations. This FAA AC specifies the multiple types of line operational simulations, of which LOFT is one. The AC also specifies the types of LOFT and LOE. In this AC, the FAA provides guidelines for LOFT content, LOFT use, LOE use, and LOFT/LOS instructor qualifications. We will briefly show some relevant portions of this AC because LOFT and LOS are done in a flight simulator and because LOFT is the vital venue for pilot training and evaluation. (Excerpt from FAA AC 120-35B 58, Line Operational Simulations/LOS, with our usual caveat.)
LOFT is a useful training method because it gives crewmembers the opportunity to practice line operations (e.g., maneuvers, operating skills, systems operations, and the operator’s procedures) with a full crew in a realistic environment. Crewmembers learn to handle a variety of scripted real-time scenarios, which include routine, abnormal, and emergency situations. They also learn and practice cockpit resource management skills, including crew coordination, judgment, decision-making, and communication skills. The overall objective of LOFT is to improve total flight crew performance, thereby preventing incidents and accidents during operational flying. The types of LOFT are:
1. Qualification LOFT: An approved flight simulator course of LOFT to facilitate transition from training using flight simulation to operational flying. Qualification LOFT meets other requirements of FAR Part 121, Appendix H.
2. Recurrent LOFT: An approved flight simulator course of LOFT which may be used to meet (yearly) recurrent flight training requirements and to substitute for alternate proficiency checks.
3. Line Operational Evaluation: An evaluation of crewmembers and crews in a flight training device or flight simulator during real-time Line Operational Simulations. LOE is primarily designed for crewmember evaluation under an AQP. LOE is conducted in a flight simulator or flight training device and is designed to check for both individual and crew competence. [Authors: Such competencies should be demonstrated in a mission-realistic environment.] LOE may also be used to evaluate a specific training objective. Operators conducting LOE may be approved to use any level of flight simulator or flight training device, depending on the objective of the evaluation and the capability of the device. The level of the flight simulator or flight training device required to support evaluation in LOE will depend upon the evaluation objectives and the device’s capability to support the objectives.
Special purpose operational training (SPOT) is an approved course of operationally oriented flight training, conducted in a flight simulator or flight training device, which may be used to learn, practice, and accomplish specific training objectives, for example, training in variant aircraft or special aircraft equipment. LOFT is “no-jeopardy” training, that is, the instructor does not issue a passing or failing grade to a participating crewmember. As a LOFT scenario progresses, it is allowed to continue without interruption so crewmembers may learn by experiencing the results of their decisions. Decisions which produce unwanted results do not indicate a training failure but serve as a learning experience. If the LOFT instructor identifies crew member performance deficiencies, additional training or instruction will be provided. This training or instruction may be in any form, including additional LOFT. Before the crew member may return to line operations, the performance deficiencies will be corrected and the instructor will document the training as satisfactorily completed. The “no-jeopardy” concept allows crew members to use their full resources and creativity without instructor interference. At the end of a
LOFT session and after debriefing, the instructor certifies that the training has been completed. (We will return to jeopardy versus nonjeopardy in LOFT later; it has both a history and problematic aspects.)
To reiterate: Each AQP qualification and continuing qualification curriculum must include approved training on and evaluation of skills and proficiency of each person being trained under an AQP to use their cockpit resource management skills and their technical (piloting or other) skills in an actual or simulated operations scenario. For flight crewmembers, this training and evaluation must be conducted in an approved flight training device or flight simulator.
The reader may feel, at this point, that what has been presented is an overabundance of FAA definitions, regulations, policies, and guidance. This is only somewhat true, and although the preceding material may have been dry or tedious, one point bears repeating: all of civil aviation’s activities come under the purview of the FAA. It is not possible to understand completely or clearly the role and functions of flight simulation (whether in a flight simulator or an FTD) in civil aviation without the information presented so far.
MAXIMIZING LOFT: THE MISSION PERFORMANCE MODEL AND THE OPERATIONAL DECISION-MAKING PARADIGM
LOFT: Current and Future
We have described the initial development of LOFT, its current form, and its content. We have stated that LOFT is the major training and check tool in an AQP. LOFT and LOE, as performed in the flight simulator, are, simply put, both the optimal training/testing environment and the “court of last resort,” as it were. Upon successful completion of LOFT/LOE, the pilot crew have earned new ratings or certifications or are “good to go” for another year. However, the current LOFTs and LOEs need to be strengthened for exactly the reasons cited above: they are the best and safest methods for cutting-edge, realistic training and evaluation, and they provide a final stamp of approval in an AQP, as well as in a more traditional Part 121-based training program. We have set the stage to present how our earlier statements about the tremendous potential and existing use-value of LOFT can be merged and realized via the MPM and the ODM models.
Risk Identification and Management: Training and Evaluation with MPM and ODM Paradigm
The end result of all civil pilot training should be to prepare a pilot to identify, assess, and manage risk. The primary role of the pilot as a risk manager has been emphasized multiple times over the past 10 years by the authors (Lofaro & Smith, 2003, 2001, 2000, 1999, 1998, 1993). LOFT is simply the preeminent tool, as well as test situation, for training and evaluating civil captains/crews. Over the years, two major models have been developed by which LOFTs can be designed and crew performance enhanced as well as evaluated. The first is the mission performance model (MPM), developed by Captains Kevin Smith and William Hamman of UAL, with some input from Jan DeMuth and Ron Lofaro of the FAA. The MPM came from the recognition that the CRM skills must be integrated with a corresponding set of technical skills (flight control skills) in an interactive matrix in order to fully evaluate overall crew proficiency. Further, such an integrated CRM approach would serve as a training tool, both in LOFT design and in specifying where the CRM/flight control skill linkages existed. An approach to integrated CRM, along with both human factors and flight control/technical skill evaluation scales, was partially developed during an FAA-hosted workshop in 1992. Dr. Lofaro was the designer and facilitator of this workshop and Captain Smith, along with several training captains from NW, DL, United Airlines, the chief pilot for Boeing, and others were the participants. The results of that workshop are in Report DOT/FAA/RD-92/5: Workshop on Integrated Crew Resource Management (Lofaro 1992). The integration and assessment of CRM and flight control skills received considerable attention, and a fair share of concern and skepticism, in the 1980s and early 1990s. As one response, the ATA formed a joint air carrier/FAA/academic working group to deal with this and other CRM issues in 1990; both Kevin Smith and Ron Lofaro were on that group. Dr. Robert Helmreich, in conjunction with several major air carriers, developed a complete set of flight crew CRM performance markers (he termed them “CRM behavioral markers”) with behaviorally-anchored rating scales. In a NASA/FAA/University of Texas project, Helmreich worked with several air carriers on research that involved the use of these markers in LOFT.
In 1991, Captain Kevin Smith (United Air Lines) and Jan DeMuth (FAA Flight Standards) developed an initial set of performance markers for the technical/flight control skills. Both the CRM and the technical sets of markers were used in the next step of CRM integration: the attempt at developing an analytic paradigm. Kevin Smith created the framework for a model that demonstrated that the CRM human factors skills and the technical/flight control skills are interrelated, interdependent, and often simultaneous in execution—that, for safe and efficient flight, CRM can sometimes be integral to flight control, and vice versa. This model is called the mission performance model. Captain Smith worked with Captain Hamman and others to develop exemplars of the application of the MPM to actual flight maneuvers, such as an engine-out at V1, with a turn procedure required by the terrain.
Mission Performance Model
The model is based on these concepts:
1. Flying is an integrated, mission-oriented activity and must be evaluated as such.
2. The crew’s performance is not adequately captured by totaling the sum of the component tasks/subtasks/elements. The focus must be on crew function, usually at the task and critical subtask levels.
3. Flight proficiency skills/knowledge are interwoven, interdependent, and necessarily interact with the CRM skills/knowledge differentially across tasks and conditions. These interactions can be identified/specified by a matrix-type crew mission performance model using the tasks which comprise a mission/flight leg. (This is what we term integrated CRM.)
4. The model can capture these interactions and can be sensitive to changes in both task and mission; for example, it can show that, for different tasks and conditions, the technical/flight proficiency skills, the CRM skills, and their interactions will vary. This is an indication that the model has a measure of discriminatory power or “sensitivity” to changes in task and conditions.
5. Helmreich’s behavioral markers can adequately delineate CRM skills and provide one basis for the (flight crew) mission performance model, just as the technical markers can capture the flight control skills and form the other MPM basis (see below).
Finally, the bases for the technical proficiency evaluation currently exist in a behavioral marker-type format with scales. Both the markers and their scales can be validated/modified for evaluation of all these proficiencies, which will be called “crew performance markers, technical factors.” This arena focuses on the crew as a unit and how well they discharge the technical aspects of the mission. It specifically addresses precision maneuvers across these areas:
1. Flight maneuvers and attitude control
2. Propulsion/lift/drag control
3. System operations
4. Malfunction warning and reconfiguration
5. Energy management
Another rationale for the MPM, and later, ODM, is that pilot/crew performance has often been seen as a series of discrete tasks, where each task was further decomposed to reveal a set of subtasks combined with the requisite knowledge and skills necessary for subtask completion. For many applications, such as aircrew training, this produces a large collection of task, knowledge and skill data. In most traditional pilot or crew training programs, these are taught individually as isolated knowledge components. Consequently, the trainee is left with the responsibility of combining these isolated knowledge components into integrated wholes (Merrill & Li, 1989). However, the linear decomposition of individual tasks does not address integrated functioning nor does it reveal how tightly coupled teams (flight crews) perform, thus an analytical process other than the traditional task analysis approach is considered necessary. Therefore, the MPM uses the functional modeling approach. The mission performance model has embedded within it the concept of functions. It is proposed that the model, as constructed, represents all significant functions necessary for the successful completion of an air transport mission. This model views crew performance as consisting of system-level functions that represent the mechanisms used to perform a mission activity. The importance of a model that is founded
on a set of systems-level functions cannot be overstated. Moreover, the model delineates crew performance at a level of abstraction that is significantly different than the current descriptions of individual performance. The MPM consists of a set of functions that can be activated by inserting an instance/example—in other words, asking the function to specify/describe a particular activity or situation in the mission. If a particular function such as workload management was asked to “spin out” the components of a particular mission activity, such as takeoff with an engine failure at V1, then the function should be able to organize, sequence, distribute, and coordinate key crew actions so that a successful outcome could be assured. This workload management function, then, can be viewed as a generic performance statement that:
a. Can be applied to many mission activities/situations, and
b. Can be activated for the application to, and specification of, any one of these activities/situations.
The mission performance model specifies the components of flight crew “effectiveness” (effective performance). That the model represents effectiveness is important to understand since, if the crew is really engaging in the set of functions that are both germane and linked to the problem at hand, and if these functions are the prerequisites for a successful outcome, then effectiveness has been demonstrated. Similarly, the model is prescriptive; it prescribes what needs to be accomplished for the crew to perform effectively. For example, we can specify, during the LOFT design process, what are very likely to be the necessary crew behaviors. In summary, in the MPM, human factors as well as technical performance clusters are specified along with the applicable markers under each cluster. For example, under workload management and situational awareness, key markers include preparation, planning, vigilance, workload distribution, and distraction avoidance. Similarly, under the cluster entitled “propulsion/lift/drag control,” the key markers include instrument interpretation, energy management, power control, lift control, and drag control. When all these markers are combined into a matrix array with their various categories, the MPM emerges.
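To make the notion of a “matrix array” of clusters and markers more concrete, the following Python sketch shows one possible way such a structure might be represented during LOFT design. It is not the MPM as published: the class name MissionPerformanceMatrix and its methods are our own hypothetical constructs, only the two clusters and their markers named in the text are included, and the particular markers assigned to the example task are chosen purely for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# The two clusters and their markers as named in the text. A full model
# would carry additional human factors and technical clusters.
CLUSTERS: Dict[str, List[str]] = {
    "workload management and situational awareness": [
        "preparation", "planning", "vigilance",
        "workload distribution", "distraction avoidance",
    ],
    "propulsion/lift/drag control": [
        "instrument interpretation", "energy management",
        "power control", "lift control", "drag control",
    ],
}


class MissionPerformanceMatrix:
    """Hypothetical task-by-marker matrix; a sketch, not the published MPM."""

    def __init__(self, clusters: Dict[str, List[str]]) -> None:
        self.clusters = clusters
        # Maps each task to the (cluster, marker) cells flagged as relevant.
        self._cells: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def mark(self, task: str, cluster: str, marker: str) -> None:
        """Flag a marker as relevant to a task, rejecting unknown names."""
        if marker not in self.clusters.get(cluster, []):
            raise KeyError(f"{marker!r} is not a marker of cluster {cluster!r}")
        self._cells[task].add((cluster, marker))

    def markers_for(self, task: str) -> List[Tuple[str, str]]:
        """Return the (cluster, marker) pairs flagged for a task, sorted."""
        return sorted(self._cells.get(task, set()))

    def clusters_for(self, task: str) -> List[str]:
        """Return the clusters engaged by a task, sorted."""
        return sorted({c for c, _ in self._cells.get(task, set())})


if __name__ == "__main__":
    mpm = MissionPerformanceMatrix(CLUSTERS)
    task = "takeoff with engine failure at V1"
    # Illustrative assignments only; a LOFT designer would decide these.
    mpm.mark(task, "workload management and situational awareness",
             "workload distribution")
    mpm.mark(task, "propulsion/lift/drag control", "power control")
    print(mpm.clusters_for(task))
```

Recording which cells a task engages, rather than summing scores across cells, is a deliberate choice here, in keeping with the premise that crew performance is not adequately captured by totaling component tasks.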
From CRM/MPM to ODM
Upon completion of the 1992 Integrated CRM Workshop, a new set of issues and concerns became apparent to Smith and Lofaro. The integrated CRM concept and the MPM were well received by the workshop participants. However, due to many factors (such as a lack of FAA interest in follow-on efforts and a CRM “establishment” that was not open to taking CRM to either another level or in new directions, along with the jeopardy issue), it was clear that integrated CRM and the MPM had become dead issues. Of much more import was the realization that CRM was not the human factors silver bullet. Captain Hal Sprogis asked, “Is the Aviation Industry Experiencing CRM Failure?” (Sprogis, 1997). Captain Daniel Maurino had written “Crew Resource Management: A Time for Reflection” (Maurino, 1999). Both indicate that we may have expected too much from CRM; that the relationship between CRM and safety, which was and is the prime rationale offered for teaching CRM, has not been proven; that CRM is a process, not an outcome, and certain efforts to assess outcomes (i.e., individual performance) may be misguided. American Airlines, in July of 1996, set aside much of CRM as they were doing it. Their reason was that their flight crews have valid objections to, and concerns about, CRM:
CRM is too often viewed as a number of interpersonal issues that simply do not define the problems that we face in aviation … CRM training will most likely always be defined and suffer in terms of the first generation of courses … which were seen as “touchy-feely,” “getting along,” and “managing human relations or resolving personality conflicts” rather than dealing with truly important concerns. (Ewell & Chidester, 1996)
American’s new focus is on preparing flight crews for the daily challenges of normal and non-normal operations encountered flying the line. Delta Airlines, in the same timeframe, revamped their “CRM for New Captains” course and now calls it “In Command.” As with American, Delta emphasized leadership, responsibility, and performance. So, in 1996–1997, we see these two major carriers eschewing overemphasis on communication and interpersonal relations in their CRM training. Lastly, United Airlines’ version of CRM was and is called C-L-R, where the C is for “command” and the L is “leadership,” indicating that United wanted to bypass the interpersonal with C-L-R and move on to the performance issues. Yet, even United changed aspects of their CRM in 1997. Further, “common wisdom” was that pilots made good decisions easily and almost naturally, aided by (some) increase in experience. The facile assumption that additional experience will teach pilots to make better decisions has proven to be a dangerous fallacy. Experience can be a nasty teacher, often giving the test before, or without, giving the lessons and materials needed for the test. Experience can also reinforce poor decisions and behaviors that seemingly “worked” in the past (the “not your day to die” phenomenon). There was also the commonly accepted view that decision-making is but one of the components of CRM. This was, and is, a gross error. CRM, with its emphases on communications and team function, is but one enabler of good decisions. As such, it is a part of decision-making, not vice versa. Decision-making is the primary tool to be used by the pilot and crew in their primary functions: risk identification and risk reduction. In short, risk management.
And, it became apparent that aeronautical decision-making was greatly different than decision-making on the ground, and that a new paradigm was necessary that both articulated the differences and had a new set of decision-making (DM) techniques specific to what pilots and crew encounter. Another realization was the primacy of LOFT in pilot training. As one result of this, Smith et al. wrote an interrelated set of papers on LOFT design and delivery that later formed a session
at the 1993 International Symposium of Aviation Psychology biennial meeting in Columbus, Ohio. As another result, Lofaro designed and held an FAA/Industry/DoD/Academe Workshop in Denver (1992) which had some of the CRM workshop participants and added others from the decision-making world. The two-volume FAA report on this workshop (DOT/FAA/RD-92, Vols. I, II; Lofaro and Adams) initiated the efforts for what has become the operational decision-making model of Kevin Smith and Ron Lofaro (Lofaro & Smith, 2001, 2003).
OPERATIONAL RISK MANAGEMENT AND DECISION-MAKING
Optimizing Performance during Complex Operations
Managing risk, thinking critically, and making sound decisions when performing complex operations are our greatest challenges. When a problem arises during a complex business or military operation, non-linearity becomes a reality: the operation continues while the problem is being addressed. And importantly, the problem-solving team is almost always the same as those conducting the operation. This dual track immediately puts stress on the human-machine system, resulting in a mission-critical situation where high levels of complexity and uncertainty prevail. This section addresses this reality by characterizing the critical operational concerns when addressing risk management and decision-making during the course of performing an intense complex operation. We present an understanding of the unique characteristics of an operational decision, the decision analytic structure that contains a robust risk management and decision processing algorithm, and importantly the analytics that can be employed when uncertainty prevails. A rigorous analytic process that can be employed when operating under conditions of complexity and uncertainty is presented herein.
Introduction
As a general class of phenomena, complex environments contain complex situations and complex systems. Complex environments are among the most challenging to consider, in large measure because of our inability to understand and predict them; they can be fraught with uncertainty. If one is planning to operate in a complex environment by employing large-scale dynamic systems, conventional reasoning, especially determinism, cannot be used. Complex entities are non-deterministic by nature because complexity theory informs us that complex systems exhibit novel behavior and emergent properties, rendering these entities and phenomena into a class by themselves residing outside of conventional wisdom. Tackling the decision problem for large-scale dynamic systems utilized in the field of aviation is of immediate importance yet is arguably the most difficult. This is because very little is understood with respect to optimizing the performance of such systems, and previous attempts have not considered the levels of uncertainty
associated with such systems. This chapter adds some measure of analytic rigor to the discussion.
The Overall Mission Continuation Decision
Operational decision theory was created to support operational decision-making. Specifically, this body of knowledge helps identify and optimize operational decisions. Operational decisions are singular among all other classes of decisions and represent the most important command activity. Importantly, ODM provides for the broad situation awareness needed to identify risk and the structural mechanisms necessary to manage a rising risk profile.
In an effort to redesign pilot training from the ground up, the Advanced Qualification Program set out to understand specific pilot activities, with the objective of directly attacking the causes of controlled flight into terrain (CFIT). They defined observable mission-related activities and attempted to integrate these with team-related or interpersonal activities (crew resource management). Through this and other studies, they realized that various mission tasks were not performed in a linear sequence, but were done selectively and differentially. Furthermore, simulator studies by Smith revealed more astounding results. High-performing crews did something unexpected: they prioritized their tasks. So while the CRM (the management of human resources) deconstructed task activities to understand the pilots’ and crews’ tasks, it revealed something else entirely. But what was it?
The answer came from Dr. Robert J. Sternberg (1985) and his triarchic theory of intelligence. He proposed a cognitive superstructure that informed and triggered selective activities according to some rule as yet unidentified. If we understood the characteristics of this superstructure, then we had a chance to understand how and why such crews did so well. In the pilot’s task universe, “mission activities” existed side by side with other “task organizing” activities. High-performing pilots could differentiate and prioritize, thereby optimizing mission outcome. But what exactly is going on?
Smith and Larrieu proposed a radical idea: How humans perform in groups (as a crew) is beside the point. What is critical for mission success is how well flight crews, and ultimately the captain, solve problems in a complex environment. Thus in subsequent work by Smith and Larrieu, non-linear problem-solving took center stage.
This breakthrough came with the following insights:
1. All air carrier mission activities are highly planned, often using sophisticated planning tools.
2. While all activities are planned, excellent pilots do not plan real-time activities. Some are discarded altogether.
3. These pilots prioritize and select tasks using some kind of decision-making process to optimize mission outcome.
This decision-making process gained definition after Keeney and Raiffa (1976) invented a branch of mathematics that dealt with the numerical weighting of multiple
attributes, the multi-attribute utility theory (MAUT). This defined operational decisions, identified key decisions, and specified triggers that activate certain decision pathways. High-performing pilots were selecting optimum pathways, but this had yet to be understood.
An operational decision for pilots is now defined by Smith and Hastie (1992) as containing three unique components:
1. It must often be performed using incomplete information.
2. Once airborne, it is always performed under increased time compression.
3. Consequences of poor decisions are often catastrophic, placing the aircraft, crew, passengers, and the corporation in jeopardy.
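The numerical weighting of multiple attributes mentioned above can be illustrated with a minimal sketch of an additive multi-attribute utility. The attribute names, weights, and scores below are hypothetical and chosen only for illustration; the chapter does not specify them, and the additive form is one common simplification rather than the authors' stated model.

```python
def additive_utility(weights, scores):
    """Weighted average of single-attribute scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Hypothetical attributes and weights (not taken from the chapter).
weights = {"safety": 0.6, "schedule": 0.25, "fuel": 0.15}

# Hypothetical scores for two candidate courses of action.
options = {
    "continue": {"safety": 0.4, "schedule": 1.0, "fuel": 0.9},
    "divert": {"safety": 0.9, "schedule": 0.3, "fuel": 0.5},
}

# Rank the options from highest to lowest utility.
ranked = sorted(
    options,
    key=lambda name: additive_utility(weights, options[name]),
    reverse=True,
)
print(ranked)
```

With these illustrative numbers the heavier weight on safety places "divert" ahead of "continue", even though "continue" scores better on schedule and fuel.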
Determining Risk
The operational decision for the air transport mission is a four-branched network that captures the planning nature of the activity and the need to prioritize to optimize the outcome, and that conforms to the following rules:
1. If the risk to the completion of the mission is low, then continue with the original mission plan.
2. If the risk to the mission is moderate, then modify the mission to either reduce or stop the risk from rising.
3. If the risk is high, abandon the mission plan and activate available alternatives. We refer to this as “divert, reject, abandon.”
This breakthrough concept was presented at numerous symposia and dramatically changed the dialogue so that more and more aviation professionals were willing to discuss decision-making and prioritizing. See Figure 3.1.
In order for Figure 3.1 to be effective, we ask: What is the nature of risk? How can we identify and quantify it? Moreover, how can we trigger a particular decision path of the four-branched structure to ensure an optimum outcome? The nature of risk means we must deal with it or it can get worse; it will be a rising risk. In aviation systems, unless decisive action is taken during critical events, risk will continue to rise to a point beyond which one experiences a catastrophic mission failure. Such a point is called the critical event horizon. Rising risk can be explained by using the risk continuum, as is shown in Figure 3.2a.
The risk continuum is organized into three zones. When risk rises it passes through zone one, where the risk is judged to be low, to zone two, where the risk is moderate. If the encountered event is critical enough, or if risk has not been mitigated, it will likely become high risk, zone three. Each zone determines certain action. For low risk, continue with the mission plan. For moderate risk, modify the mission plan to arrest the rise or lower the risk. For high risk, where catastrophic failure is probable, abandon the mission plan and immediately implement survival measures. The course the risk takes is determined by critical events, almost as if the critical events enter the operational environment acting like a hostile agent. The result is
FIGURE 3.1 (a) Four-branch decision analytic structure depicting basic concept concerning risk management. (b) Depiction of the risk management algorithm.
an attack on the mission system where degraded functions are possible, likely up to and including total system failure. This hostile agent invades the operational system or mission space. Arguably, the operator’s most critical decision is to determine if catastrophe is imminent and if so to take decisive action. The catastrophe avoidance algorithm is depicted in Figure 3.2b.
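The three-zone rule set described above (low risk: continue; moderate risk: modify; high risk: abandon) can be sketched as a simple mapping. This is a minimal illustration only: the 0-to-1 risk scale and the two zone boundaries are assumptions made for the example, since the chapter does not quantify where the zones begin and end.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue with the mission plan"
    MODIFY = "modify the mission plan"
    ABANDON = "abandon the mission plan"


# Hypothetical zone boundaries on an assumed 0-1 risk scale.
MODERATE_THRESHOLD = 0.33
HIGH_THRESHOLD = 0.66


def select_action(risk: float) -> Action:
    """Map a position on the risk continuum to the action for its zone."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must lie between 0 and 1")
    if risk >= HIGH_THRESHOLD:
        return Action.ABANDON   # zone three: high risk
    if risk >= MODERATE_THRESHOLD:
        return Action.MODIFY    # zone two: moderate risk
    return Action.CONTINUE      # zone one: low risk


for level in (0.1, 0.5, 0.9):
    print(level, select_action(level).value)
```

Checking the high-risk zone first reflects the priority given in the text to determining whether catastrophe is imminent before anything else.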
Operational Envelope
The mission space is a reality with explicit boundaries. These boundaries define when it is acceptable to operate and when it is not. For example, it is not acceptable to operate when a hostile agent such as a “microburst alert” enters the mission space for an airplane. Nor is it acceptable to continue to fly to the intended destination when a power plant is degraded. Hostile agents come from four general directions. These are:
1. Any adverse condition, such as adverse wind, freezing precipitation, and so forth.
FIGURE 3.2 (a) The risk continuum. (b) Catastrophe avoidance algorithm.
2. Restricted visibility. This can often limit the ability to land at a particular airport, causing great concern if insufficient fuel remains to proceed to an alternate airport.
3. Mission-critical alerts and warnings. This could be such things as terrain alert, traffic alert, or thunderstorm detection.
4. Human and system limitations. System limitations could be speed or altitude, whereas human limitations could be fatigue, task overload, or inexperience.
These four hostile agents create boundary conditions. The boundary conditions are the edges of an operational envelope. See Figure 3.3. Low risk resides within the envelope. Risk factors that impact the mission but do not place the aircraft outside the boundaries should be considered as moderate. The operational strategy would be to modify but not abandon the mission.
Some agents are more dangerous than others. For example, a microburst alert is more dangerous than a wind shear alert. Thus we can say that some agents are highly energetic (like the microburst alert) and others are moderately energetic (like the wind shear alert). Some agents, regardless of their energy type, have one other rather important characteristic: they can bind with another agent. This
FIGURE 3.3 Operational envelope.
FIGURE 3.4 The cumulative effect of two risk factors.
phenomenon is called the “cumulative effect.” This can be a dangerous situation because it can go undetected and produce a situation so dire that catastrophic mission failure is imminent. For example, a combined agent could be low visibility at the destination airport combined with strong crosswind, which could produce an untenable situation. These are two hostile agents coming from two different sides of the operational envelope: adverse conditions and restricted visibility, as shown in Figure 3.4. This figure shows that the combined vector is the resolved hypotenuse of the triangle formed. The vector travels to the corner of the mission space; this represents a rising risk situation that must be addressed immediately.
Let us look at an example of risk, the operational envelope, and the cumulative effect. On December 8, 2005, Southwest Airlines Flight 1248 crashed while attempting to land at Chicago Midway airport (National Transportation Safety Board 2007). At the time a significant Midwest snowstorm made the weather exceptionally poor. The flight crew had to deal with four hostile agents that had entered the mission space:
1. Braking action advisories in effect with fair-to-poor braking action reported.
2. Short runway with no overrun.
3. Adverse wind, with an 8-knot tailwind reported.
4. Low visibility and approaching landing minimums.
Under the adverse conditions category, there are three mission-critical impact areas: braking action advisories, a short runway, and adverse wind. Under the restricted visibility category, the mission-critical impact was that the plane was approaching CAT I minimums. While it can be argued that any single situation does not represent a “show stopper,” in the words of Senior Captain and Safety Manager Captain Bill Yantiss, “Being legal according to the book does not always make it safe.” By referring to the operational envelope, we can see that the mission should be immediately abandoned, and the approach and landing should not be attempted. To attempt the approach and landing leaves the rising risk unchecked, and it will pass beyond the critical event horizon, resulting in catastrophic mission failure. In this case, this is precisely what happened because a meaningful risk assessment was not performed by the flight crew.
Mission performance is optimized by first understanding the prevailing risk and then knowing what to do about it. When risk begins to rise, the flight crew must prioritize or discard activities to manage a rising risk profile. If risk mitigation measures are not effective, and the risk is high or projected to go to high, then the mission plan must be abandoned, and survival measures must be taken.
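The cumulative effect, in which two hostile agents from different sides of the envelope combine along the hypotenuse of the triangle they form, can be sketched numerically. The 0-to-10 scoring scale and the limit value below are assumptions made for illustration; the chapter gives no numeric scale for risk factors.

```python
import math

# Hypothetical scale: each risk factor is scored from 0 (absent) to 10.
ABANDON_LIMIT = 9.0  # assumed limit, for illustration only


def combined_risk(adverse_conditions: float, restricted_visibility: float) -> float:
    """Magnitude of two risk factors acting at right angles (the hypotenuse)."""
    return math.hypot(adverse_conditions, restricted_visibility)


def exceeds_limit(adverse_conditions: float, restricted_visibility: float,
                  limit: float = ABANDON_LIMIT) -> bool:
    """True when the combined vector reaches the assumed limit."""
    return combined_risk(adverse_conditions, restricted_visibility) >= limit


crosswind, low_visibility = 6.0, 8.0
print(combined_risk(crosswind, low_visibility))   # 10.0
print(exceeds_limit(crosswind, 0.0))              # False: crosswind alone
print(exceeds_limit(0.0, low_visibility))         # False: visibility alone
print(exceeds_limit(crosswind, low_visibility))   # True: combined
```

Neither factor reaches the assumed limit on its own, yet the combination does, which is the sense in which the cumulative effect can go undetected when each agent is assessed separately.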
THE UNSTABLE, MISSED APPROACH DECISION
Bayesian Probability
Most probability theories deal with the here and now; Bayesian probability theory does not. Bayesian probability is based on the concept that the likelihood of an event can be understood in terms of a moving dynamic, which in turn acquires additional relevant information over time. In an operational environment where motion and stability are of major concern, such as with an aircraft, Bayesian probability helps flight crew members determine the level of risk and uncertainty caused by certain events.
A change in the operational environment causes the emergence of an event. At this point it is not known if such an event is critical to the mission or not. (Most operations that are non-trivial contain a mission statement and thus they can be referred to as a mission.) The event’s impact on the mission is considered. This impact is stated in terms of the likelihood that the mission may or may not continue unimpeded. These impediments can be referred to as risk, which is the level of uncertainty that the mission will succeed. Low risk means that mission success is essentially assured, while high risk may signify mission failure is most likely. This level of uncertainty is the problem space where the initial projection is expressed as a hypothesis P(A) that the condition will deteriorate.
Meanwhile, the mission continues and it follows a trajectory through time and space. In a dynamic environment, things change rapidly. After the initial assessment is made, additional evidence is most likely encountered or added. This additional evidence may likely emanate from a completely different source than the initial, triggering event. It resides along the planned trajectory, sometime after P(A) was encountered, and is expressed as P(B).
In the next step, since both P(A) and P(B) may be stand-alone events, we attempt to correlate both into a single metric. This will also serve to update P(B) with respect to P(A). Thus, we will update the figure of merit for P(B) given that P(A) is true. This is expressed as P(B/A). So far, we have:
1. P(A) is the probability of encountering a mission-critical event and its impact on the mission.
2. P(B) is additional evidence that has been encountered, expressed in probabilistic terms.
3. P(B/A) is an updated figure of merit for B given that A is true.
4. P(A/B) is the unknown that we wish to determine. It is the updated level of uncertainty.
Problem-Solving under Conditions of Uncertainty
Our case study involves an operation where it is critically important that the onset of instability be determined notwithstanding the uncertainty associated with other events.
• At point A on the trajectory, an operational parameter has been exceeded. The likelihood that this out-of-tolerance condition will result in the onset of instability at the approaching point D is key operational knowledge for maintaining the integrity of the operation. This is represented by P(A).
• Additional evidence is obtained at point B and is represented by P(B). This evidence may be germane to the operation, and it could influence the determination of the onset of instability.
• At the conditional point along the trajectory, identified as point C, P(B) is assessed. Given that A is true, P(B) is updated and given a value commensurate with this updated evidence. This probability is represented by P(B/A).
• It is necessary to determine at point D on the trajectory whether the onset of instability will occur. If it is highly probable that instability will occur, then best practices dictate that the mission should be abandoned prior to reaching this limit. This probability is the determined value of P(A/B). This is the unknown that we wish to discover or, put another way, the unknown in the mathematical equation that relates the three previous quantities.
The mathematical equation is represented in Figure 3.5. The case study is shown in Figure 3.6.
FIGURE 3.5 Bayes’s theorem.
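For reference, the standard statement of Bayes's theorem in terms of the four quantities defined above is given here. The chapter writes the conditional probability of A given B as P(A/B); the block uses the more common vertical-bar notation for the same quantity.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}, \qquad P(B) > 0
```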
FIGURE 3.6 Bayesian algorithmic reasoning.
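The update from P(A), P(B), and P(B/A) to the unknown P(A/B) can be sketched as a small function. The numeric values in the usage example are hypothetical and are not drawn from the case study; they only show the arithmetic.

```python
def posterior(p_a: float, p_b_given_a: float, p_b: float) -> float:
    """Return P(A/B) from P(A), P(B/A), and P(B) using Bayes's theorem."""
    for name, value in (("P(A)", p_a), ("P(B/A)", p_b_given_a), ("P(B)", p_b)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    if p_b == 0.0:
        raise ValueError("P(B) must be greater than 0")
    result = p_b_given_a * p_a / p_b
    if result > 1.0 + 1e-12:
        # P(B) can never be smaller than P(B/A) * P(A).
        raise ValueError("inputs are mutually inconsistent")
    return min(result, 1.0)


# Hypothetical values: initial projection 0.2, evidence 0.3, updated merit 0.9.
updated_uncertainty = posterior(p_a=0.2, p_b_given_a=0.9, p_b=0.3)
print(round(updated_uncertainty, 3))
```

With these illustrative inputs the updated level of uncertainty is approximately 0.6, three times the initial projection, which shows how additional evidence along the trajectory can sharply raise the assessed likelihood of instability.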
The Takeoff “Go/No-Go” Decision
At the center of our image of an operational decision-maker is an actor (in the form of a crew member) who receives information about the current state and “encodes” this information as a value along a unitary risk dimension. Consider the example of takeoff operations in large transport aircraft, where the crew is constantly monitoring sources of information about the environment and the condition of the aircraft to assess the emergence of any operational risk that could impact the operation. High risk implies a significant danger to the operation. Low risk, on the other hand, implies a continuation of the takeoff is appropriate. However, this message is often not perfect and ambiguity may very well occur. The critical aspect is whether the information received came from the danger (high risk) distribution of all possible events or the no danger (low risk) distribution of events. When the value is either high or low, the decision is relatively straightforward. However, when the value is in the midrange, the decision becomes
FIGURE 3.7 Probabilistic distribution of the go and no-go takeoff decision
difficult because the discrimination is not clear. This region is called the zone of ambiguity. Figure 3.7 shows the probabilistic distribution of the go or no-go situation for the takeoff operation. The distribution on the left indicates the probabilities of receiving event messages at various locations on the risk continuum when the crew should decide to continue to “go.” The distribution on the right is the analogous probability density function when, in fact, the decision should be “no go” (abort the takeoff). In most cases in commercial flight operations, the information received by the crew indicates the takeoff operation is advised.
Unexpected Operational Difficulties
In rare cases, the information received indicates high risk and thus a no-go signal is received. When there is a criterion located on the risk continuum and when the information available identifies a value above that threshold, the decision maker decides to abort the takeoff. Following the conventions of signal detection theory, Xc labels the decision criterion that is used to make the optimum decision (Figure 3.8). This situation is more complex than first realized when the area surrounding Xc is examined more closely. This is shown in Figure 3.9. In the first case, the crew should have continued but rejected (Case A). The second case is where the crew should have rejected but continued (Case B). Numerous studies have shown that ambiguity prevails in the actual operational environment. In this ambiguous area, messages from both the continue and reject cases overlap, and the implications of the actual perception of risk are ambiguous and confusing.
FIGURE 3.8 Eliminate ambiguity and confusion.
FIGURE 3.9 The takeoff decision.
Optimizing the Decision Function
Ideally, we want to minimize both Case A and Case B error as much as possible. Here are two approaches. The first approach to improving takeoff decision performance is the plausible approach. The second is the analytical approach.
In the plausible approach, crews can be instructed to set the Xc decision criterion in such a way that the opportunity for error is minimized. Any procedural setting of Xc inevitably involves trade-offs, where fewer rejections when the pilot should have continued mean more takeoffs when the pilot should have rejected, and vice versa. The plausible location for the criterion is between the two distributions, at the point where their density function curves cross. Such a location would appear exactly balanced and would minimize the two types of errors. However plausible, this is completely inappropriate because the loss incurred in a takeoff under reject conditions is significantly greater than the loss incurred in a rejection when the pilot should have continued the takeoff. However, if policymakers and operational managers could set values on the two relevant errors, it would be possible to set criteria for crews to optimize the decision outcome.
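The effect of unequal losses on the placement of Xc can be illustrated with a minimal sketch. It assumes that both the go and no-go distributions are normal with equal spread, that the risk scale is arbitrary, and that the two error costs are known; none of these numbers comes from the chapter, and all parameter names are hypothetical.

```python
import math


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def expected_loss(xc, go_mean, nogo_mean, sd, p_nogo, cost_a, cost_b):
    """Expected loss when the crew rejects for any message above xc."""
    # Case A: should have continued but rejected (go message above xc).
    p_case_a = 1.0 - normal_cdf(xc, go_mean, sd)
    # Case B: should have rejected but continued (no-go message below xc).
    p_case_b = normal_cdf(xc, nogo_mean, sd)
    return (1.0 - p_nogo) * cost_a * p_case_a + p_nogo * cost_b * p_case_b


def best_criterion(go_mean, nogo_mean, sd, p_nogo, cost_a, cost_b, steps=2001):
    """Grid search for the criterion that minimizes expected loss."""
    low, high = go_mean - 4.0 * sd, nogo_mean + 4.0 * sd
    candidates = [low + i * (high - low) / (steps - 1) for i in range(steps)]
    return min(
        candidates,
        key=lambda xc: expected_loss(xc, go_mean, nogo_mean, sd,
                                     p_nogo, cost_a, cost_b),
    )


# Equal costs: the criterion sits where the two density curves cross.
balanced = best_criterion(0.0, 2.0, 1.0, p_nogo=0.5, cost_a=1.0, cost_b=1.0)
# Case B assumed ten times as costly as Case A: the criterion moves lower.
weighted = best_criterion(0.0, 2.0, 1.0, p_nogo=0.5, cost_a=1.0, cost_b=10.0)
print(round(balanced, 2), round(weighted, 2))
```

Under these assumptions the balanced criterion falls at the crossing point of the two curves, while the weighted criterion shifts toward the go distribution, accepting more unnecessary rejections in exchange for fewer continued takeoffs under reject conditions.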
The analytical approach focuses on improving the discrimination between the two states (danger—should reject; low risk—can continue). This should be the primary goal of improvements in technology. But improvements in “discriminatory training” are also necessary. This corresponds to changing the relationship between the distributions by moving the distributions farther apart; Figure 3.9 shows how a more accurate decision can be realized.
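Moving the distributions farther apart can be quantified with a short sketch. It assumes two normal distributions of equal spread and a criterion placed midway between them, which are illustrative assumptions rather than conditions stated in the chapter. The separation divided by the spread is the discriminability index often written d' in signal detection theory.

```python
import math


def error_rate_at_midpoint(separation: float, sd: float = 1.0) -> float:
    """Probability of each error type when the criterion sits midway
    between two equal-spread normal distributions."""
    d_prime = separation / sd        # discriminability index
    z = -d_prime / 2.0               # distance from either mean to the midpoint
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for separation in (1.0, 2.0, 3.0):
    print(separation, round(error_rate_at_midpoint(separation), 4))
```

Each added unit of separation shrinks the zone of ambiguity, so both Case A and Case B errors fall together, which no repositioning of the criterion alone can achieve.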
Operational Analysis of the Takeoff Decision
In this section, we will discuss the decision analytic structure and use the takeoff operation to further explain its properties. The decision structure is represented in Figure 3.9. The key choice points are depicted. They involve the choice of continuing takeoff as planned, continuing takeoff with modifications to the operational plan, or aborting takeoff due to significant danger. In this example, the key choice points involve the primary choice to continue the takeoff as planned or reevaluate the takeoff plan. The secondary choice is contingent on the first: either to modify the operation to accommodate a rising risk or to abort the takeoff due to excessive risk. The key activities associated with each choice point are the mechanism by which the decision is executed. It is important to realize that an optimum decision selection criterion, called an alternative, is the ability to select the most accurate decision path with respect to the prevailing risk at the time.
Choice Point A represents the primary binary decision. Notice that this entails the evaluation of risk. This is important for many reasons. Risk analysis is critical in selecting the correct path. While execution of the prescribed maneuver is important, it occurs at Choice Point B, after the risk is examined. Many studies as well as documented operational experience have suggested that the takeoff accident rate is excessive. But while this insight is important, these studies focus mostly on the maneuver execution phase rather than first examining the higher-order skill requirements involving the optimization of the operational decision.
Conclusion
A serious challenge facing aircrews is maintaining an acceptable level of risk while performing a mission. Key to their success is to determine with accuracy and clarity if a low-risk situation prevails or is anticipated. If so, then crews can continue with the mission as planned. If a moderate risk posture is evident, then crews must modify the mission plan accordingly. If the risk posture is judged to be high, then crews must discontinue the current plan. The effective management of risk involves the optimum placement of the decision criterion, which we have labeled Xc, along the risk dimension. It also involves the reduction of the ambiguity zone through discrimination methods. Current data show that the probability that crews will not make the correct abort decision with respect to accurate assessment is 54 percent. Flight crews are incorrectly assessing risk and aborting takeoffs at an alarming rate.
FIGURE 3.10 The takeoff decision.
Figure 3.10 summarizes this situation in decision analytic terms. The abort decision column shows that 54 percent were incorrect (cell B) while 46 percent were correct (cell D). Among the many solutions that have been proposed to reduce takeoff accidents, several have involved moving the decision criterion, Xc. However, such an administrative adjustment of the decision criterion should not be undertaken without a careful analytical study of how to improve the discrimination of prevailing risk.
ODM AND MPM IN LOFT DESIGN, DEVELOPMENT, AND EVALUATION
Introduction
We must deal with major issues before we go into LOFT design using ODM and MPM. The first is that CRM training was first developed in the late 1970s/early 1980s after a series of disastrous and fatal air carrier accidents, accidents where perfectly functioning planes crashed. Human-factor errors by pilot/crew were seen as the cause of these accidents. As a result, the FAA wanted the air carriers to implement new, human-factors-oriented training (for example, CRM). Rather than amend existing FARs to make CRM mandatory, the FAA chose another path. (Of interest here is that in SFAR 58, the FAA decided, after many years, to make CRM mandatory in AQP training.) To return to the situation at hand, the FAA, in order to make CRM training costs palatable to air carriers, offered to waive some hours of pilot recurrency training in lieu of CRM training. As training is a “big buck” item for air carriers, this allowed the carriers to save (not spend additional funds) and give CRM. It also allowed the FAA to ensure the CRM training would, to a great extent, be given, thus silencing some critics who, understandably, wanted new HF training to counter the rash of accidents.
However, there was a sticking point: jeopardy. Simply put: As a further inducement to carriers to give CRM, the FAA and ALPA agreed that CRM would be “no-jeopardy” training. When CRM skills were evaluated in a LOFT, neither pilot nor crew could fail or be given a “down,” which requires additional training and checking. Therefore, the evaluation of a LOFT, such as one for CRM, consisted of videotaping the
LOFT and a critique/debrief given to the crew upon completion of the LOFT. The videotape is then erased.
Our view is that LOFTs built around the MPM and ODM must be evaluated as a jeopardy LOFT session. The MPM has technical markers that encompass actions and skills normally evaluated in a check ride (flight simulator or actual flight) and which can be failed. These include attitude management, course deviation, power management, etc. In short, the full panoply of flight skills that are usually evaluated in sessions/flights can be included, so it would seem that failure should be an option. Added to this, we would have LOFT scenarios that enable ODM, and both pilot and crew must be allowed to succeed or fail.
We have postulated ODM as the primary tool for training pilots and crew to do risk identification and management, and that risk identification and management are the primary functions of a flight crew. The common fallacies about decision-making were discussed early in the chapter. We add this: Author Captain Smith functioned as a “line check airman” during the late 1990s. He was checking fairly senior pilots transitioning to a large (250-plus passenger capacity), highly automated aircraft. He was saddened (and both angry and frightened) to see the number of times pilots did not recognize a decision point in time to stay ahead of the power curve, did not recognize a decision and action point at all, or made poor decisions, decisions which raised the risk to a safe and successful flight. In conversations, the authors became even more convinced of the absolute need for ODM.
What LOFT not only lends itself to evaluation but is actually designed for evaluation? A LOFT that uses event sets with embedded decision points carefully designed to force decisions on a risk continuum, and a LOFT that is designed with event sets partitioned into the MPM matrix, thus showing both CRM and flight skills.
The evaluations of these carry with them the possibility of failure, without which they are meaningless.
LOFT: ODM and MPM
In designing and developing LOFT scenarios, the basic unit, as proposed in 1993 (viz., Hamman, Smith, Lofaro, and Seamster, 1992), is the event set. The LOFT scenario is a set of event sets selected from real-world ops reports, made from amalgams of events and incidents as reported, or put together from the experiences of the LOFT design team. (An aside: NTSB accident reports can also be used.)
It should be clear that superior LOFT design is a team effort, and the team should be carefully selected. The LOFT design team must consist of senior pilots with extensive flight time in the aircraft type the LOFT scenario is being created for. These pilots should also have experience in the air carrier’s training complex. It is also somewhat desirable to have a person from the training department with ISD credentials, as will be shown later. One LOFT design team member should be a flight simulator operator to ensure that the event sets selected can be replicated in the flight simulator. Finally, it is truly preferable if one or more LOFT design team members have been line check airmen.
Having selected the team, the next step would be to lay out an overview of the LOFT mission/flight leg(s). This overview would include the basics, such as Wx, time-of-year ops (i.e., winter), and departure and destination airports, as well as alternatives. Into this skeletal framework, the team will select the event sets for each phase of flight (takeoff, cruise, descent, landing) as well as any pre-takeoff event sets that may impact the flight leg.
The next steps are the crucial ones: Carefully select the problems that you want the flight crew to encounter: mechanical, system malfunctions, etc. Then, plan the sequence into which you want to embed the problems. Remember that the overall goal is not to create the fabled “LOFT from hell,” one which cannot be successfully flown but must result in a loss of flight control. In both the sequencing of the event sets and the selection of the problems to be embedded in the LOFT, the MPM and the ODM are used as the structural underpinnings. It is done in this manner:
1. Upon selection of the problems and the phases of flight in which these problems are to occur, the ODM is used to build a sequence that results in a rising risk. The decision points are identified. (A “decision point” is a point in the flight where, if no decision and resultant action is taken, or if a wrong decision is made, the risk rises from low to moderate or from moderate to high.) Upon identifying the decision points, the basic sequence is modified to add the various outcomes from no-decision/wrong decision; that is to say, the sequence now contains branches that are dependent upon the decisions made/unmade. Each branch or node will also need to have any changes in conditions and systems (again: Wx en route or at destination, systems malfunctions, etc.) built in.
2. Next, the MPM is integrated with the ODM. The concept here is to make the consequences of following a no-decision/wrong-decision model such that the risk continues to rise until it reaches the high level and crew action must be taken in order to regain any possibility of successful flight completion. The “successful completion” may involve ATB or diversion to the/an alternate airfield, where “success” simply means landing the plane. This integration is a two-step process. The first involves taking the selected event sets and identifying the critical tasks that are to be performed during those times. These high-level critical tasks (e.g., V1 “cut” on takeoff) are then decomposed, using the ISD process, into the complete list of subtasks involved. The MPM is then used to further identify which of the critical tasks track to which of the relevant CRM and flight control functions necessary for successful task performance. The set of MPM functions will organize, sequence, distribute, and coordinate the actions key to successful performance. Looked at another way: On the V1 “cut” at takeoff, as an exemplar, we find that the needed CRM function is workload management. The MPM, with ISD decomposition, will spin out the specifics of the critical actions and flight control skills embedded in the workload function. The flight control tasks for this example include
propulsion/lift/drag, operational integrity, and altitude control, with such subtasks as disconnecting the auto throttle at 400’ AGL, setting airspeed to xyz knots, checking flap setting, and so on. The MPM will also spool out the crew performance markers for each subtask, both the CRM and flight technical markers. Not only that, but the functions and actions of both the pilot flying (PF) and the pilot not flying (PNF) will be clearly spelled out. These, as said, will be spelled out at the subtask level. In fact, this is true CRM integration: the place where both CRM and flight control actions are presented as a unified whole. However, space and scope preclude further explication, and there is a CRM integration document (Lofaro, 1992). It is also clear that, because the necessary performances are specified, the performance markers can be used not only to track the crew’s actions but, if desired, to evaluate them. This evaluation can be done simultaneously using a flight simulator operator and a check airman, or done post hoc, using the videotapes that are normally part of LOFT sessions. Again, scope precludes going further into the evaluation area. In summary, we see that the LOFT design has been driven by using the ODM to do initial event set selection and sequence design. Then, the MPM was used to generate the task and subtask breakouts for selected events within the event sets. The MPM further sequenced the events selected (as an example, CRM and flight control integration was performed at the subtask level, with optional evaluation procedures). The LOFT design can be seen as embedding event sets into the LOFT scenario that can take the crew and plane into the moderate and even the high-risk areas of the rising risk continuum, thusly:
1. If the conditions causing the risks are either not identified or their interactions are not recognized.
2. If the decision points are either missed or result in an incorrect decision(s).
3. As a result of 1 and/or 2, no actions are taken or incorrect actions are taken.
So far, we have shown the LOFT design process as one where the event sets, as well as initial and changing conditions, are used to generate decision points. The decision points, if missed or responded to incorrectly, cause a rise in the mission risk. The MPM is overlaid to give a level of detail whereby an analysis will determine where the errors were made: in flight control, in CRM, or in CRM and flight control. The MPM also offers an evaluative framework. However, LOFT design could be done in the other direction.
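The branching structure described above (decision points whose missed or wrong decisions raise the risk one step) can be illustrated with a small data-structure sketch. It is not part of the ODM or MPM as published: the class names, the three-level risk scale as an enumeration, the outcome labels, and the example event sets are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Risk(IntEnum):
    """Hypothetical three-step version of the rising risk continuum."""
    LOW = 0
    MODERATE = 1
    HIGH = 2


@dataclass
class DecisionPoint:
    """One scenario node: an event set that demands a crew decision."""
    event_set: str
    risk: Risk
    # Maps a crew outcome ("correct", "wrong", "none") to the next node.
    branches: dict = field(default_factory=dict)


def escalate(risk: Risk) -> Risk:
    """A missed or wrong decision raises risk one step, capped at HIGH."""
    return Risk(min(risk + 1, Risk.HIGH))


def build_example() -> DecisionPoint:
    """Assemble an illustrative (invented) two-decision-point scenario."""
    root = DecisionPoint("destination Wx deteriorating en route", Risk.LOW)
    replanned = DecisionPoint("alternate planned, fuel checked", Risk.LOW)
    advisory = DecisionPoint("braking advisory at destination", escalate(root.risk))
    diverted = DecisionPoint("diversion to alternate", advisory.risk)
    crosswind = DecisionPoint("crosswind landing, poor braking", escalate(advisory.risk))
    root.branches = {"correct": replanned, "wrong": advisory, "none": advisory}
    advisory.branches = {"correct": diverted, "wrong": crosswind, "none": crosswind}
    return root


def fly(node: DecisionPoint, outcomes: list) -> list:
    """Walk the scenario with a list of crew outcomes; return the risk trace."""
    trace = [node.risk]
    for outcome in outcomes:
        nxt = node.branches.get(outcome)
        if nxt is None:  # leaf reached or unknown outcome label
            break
        node = nxt
        trace.append(node.risk)
    return trace
```

Calling `fly(build_example(), ["none", "wrong"])` traces a crew that misses the first decision point and errs at the second, so the risk steps from low through moderate to high. Enumerating the branches in advance like this is also what allows the flight simulator to be pre-programmed for each contingency.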
LOFT Design: Another Approach We have discussed selecting event sets that have built-in, as it were, decision points. Put another way, events/event sets can be selected that require decisions (and actions) to prevent risk from rising—to prevent the aircraft’s position in the ops envelope from approaching a corner or a boundary. As an example, the event set could include deteriorating Wx en route or at the destination airport, perhaps with braking advisories or crosswinds on approach/landing. From there, the MPM would be used to
develop the flight crew tasks and functions for PF and PNF. An initial bifurcation could be made, one path of event sets following the correct identification of rising risk and attendant risk reduction actions; the second path based on nonidentification of rising risk. Now, there would also be subpaths, for example, showing the correct identification of rising risk but incorrect response(s). The branching process can be repeated as needed. Thus, the ODM is the driver and the MPM is the method used to develop functionality. However, LOFT can be developed in a different way, still using the MPM and ODM. A series of event sets (based on incident reports, “hangar talk,” experiences, etc.) can be selected and linked, and the PF/PNF functions identified. These event sets we will term “expanded event sets” or “fully articulated event sets.” By analyzing these sets, the decision points can be identified. In fact, “identified” is not the exact term; “selected” is more appropriate. This is because it seems clear that, in any flight, new conditions or changes in conditions (Wx, flight system problems, etc.) will result in changes in the aircraft’s position both in the ops envelope and on the rising risk continuum. With the initial set of fully articulated functions and actions developed via the MPM, changes are introduced using the ODM’s boundary conditions as guidelines. That is to say, an initial set of boundary conditions will be specified and used as a basis for carefully selecting changes to them that result, if left unidentified and/or unchecked, in additive or cumulative interaction, such interactions driving the aircraft toward a corner or side in the ops envelope. Of course, this means that the risk has risen to moderate or high-moderate, even to high. The changes to the boundary conditions should be introduced at different points in flight so that the risk does not rise suddenly.
The rationale here is that one goal of the LOFT is to keep situation awareness high by the introduction of an ongoing series of changes, rather than a compressed set of events that lead to an immediate abnormal ops or emergency, with limited options for the flight crew. If the changes in the boundary conditions are introduced over the first 1–1½ h of the LOFT session, their additive and/or cumulative interactions and impact will be sequenced so that the flight crew’s ODM skills are tested. Thus, ODM skills are tested rather than skills at handling an overt and immediately apparent abnormal or emergency situation, which are often trained in other venues. An aside: This is not to say that missed decision points as well as incorrect decisions and actions may not lead to an abnormal or emergency situation. If that occurs, then the LOFT can also demonstrate flight crew skills in the emergency arena. However, as said, the training of flight crews, to include recurrent or special-item training, does provide for certain emergency training. As one example, recently some air carriers have instituted upset recovery training (recovering the aircraft from unusual or abnormal attitudes). To resume: By carefully introducing boundary condition changes into the event sets, the risk can be caused to rise from additive or cumulative interactions. As before, when we indicated how to use the ODM to MPM LOFT design methodology, there will be a branching effect, contingent on decisions (made, unmade, correct, incorrect) and resultant actions (taken, not taken). The LOFT scenario must be designed
to include the various pathways, so that the flight simulator can be pre-programmed for the contingencies. It would seem that, optimally, the ODM to MPM and the MPM to ODM methods would operate simultaneously or in an intertwined manner. It may be fairly said that the use of the MPM and the ODM is actually a necessary and sufficient condition of effective LOFT design. At this point in time, we believe that we have presented the ODM in sufficient detail and with useful examples. The same can be said for the MPM. We have given references for the reader who wants more information and exposition of either model. We have presented the framework for developing ODM/MPM-based LOFT scenario(s). The evaluation of the flight crew in the LOFT training session has been discussed. It is of import to now clearly state that neither the ODM, the MPM, nor any LOFT developed using them need have an evaluative aspect. Further, if evaluation is to be an aspect of the LOFT session, it need not be a jeopardy situation. However, we still hold to our original view that LOFT should have a jeopardy component.
Training Although we have not mentioned or emphasized the training aspects of the MPM or the ODM, it is clear that there are necessary training considerations for both. However, the MPM needs little, if any, training in terms of flight crew. The reason is that the CRM components are already included as part of either initial or recurrent training. The flight control maneuver components are all included in flight training/type training, and many of the flight control aspects are used in recurrent training. Additionally, these flight control tasks/subtasks are all part of the handbooks used by pilots for each type of aircraft. Put another way, from the task through the subtask level of flight control, pilots are familiar with and have been trained in all of it. Of more importance, the performance of these flight control tasks, and the FAA and carrier standards to which they must be performed, are already known to the flight crew; they have learned and been tested in them on the ground and been evaluated on their ability to perform to the standard in the air. Therefore, for any critical task decomposition used in a LOFT, the flight crew is well aware of the subtasks required to perform the task. Where, then, is any training needed for understanding and use of the MPM? It would seem that a single presentation and explanation of the MPM would suffice. There are the CRM and flight control behavior markers to consider. However, these are only of concern if the LOFT is to be evaluated for “jeopardy.” If not, the markers and scoring scales can be distributed and explained; this process could be incorporated into the presentation and explanation of the MPM. Two hours would suffice. Such is not the case with the ODM. This model would require dedicated training time. Again, there is a “however”: The boundary conditions are well known to pilots.
Although it is true that the flight crew may never have seen the way ODM structures the boundaries, no training time is really needed for that aspect. The rising risk continuum, the concepts of interaction among boundary conditions/functions (with
resultants that can exceed the impact of the single factors involved): all this can be easily trained in a 2–4-hour class, with pencil and paper exercises. At this point in time, the optimum use of the ODM for safety would be to automate some or all of it, and make it a call-up part of a display. Perhaps the best concept would be to have the display come up when two or more boundary conditions, from either the same boundary side or contiguous sides, had become active. Such an endeavor, or any discussion of it, is far beyond the scope of this chapter.
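Although a full treatment is out of scope, the trigger rule just stated is simple enough to sketch. The following is only an illustration under loud assumptions: it supposes the ops envelope is drawn as a four-sided box whose sides are numbered 0 to 3 around the perimeter, so that "contiguous" means sharing a corner. Neither that numbering nor the function names come from the ODM itself.

```python
from itertools import combinations

N_SIDES = 4  # assumption: the envelope is a four-sided box, sides numbered 0..3


def contiguous(a: int, b: int, n_sides: int = N_SIDES) -> bool:
    """True if two sides are the same side or neighbours sharing a corner."""
    gap = (a - b) % n_sides
    return gap in (0, 1, n_sides - 1)


def should_display(active_sides: list) -> bool:
    """Apply the stated rule to the sides of all currently active conditions.

    `active_sides` holds one entry per active boundary condition, giving the
    side of the envelope on which that condition sits. The display is called
    up when any two active conditions lie on the same or contiguous sides.
    """
    return any(contiguous(a, b) for a, b in combinations(active_sides, 2))
```

For instance, `should_display([0, 1])` is true because sides 0 and 1 share a corner, while `should_display([0, 2])` is false because, under the four-sided assumption, those sides are opposite one another.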
REFERENCES
Burki-Cohen, J. and Go, T. H. et al. 2003. Simulator fidelity requirements for airline pilots training and evaluation continued: An update on motion requirements research. Proceedings of the Twelfth Annual International Symposium on Aviation Psychology, Dayton, OH.
Ewell, C. D. and Chidester, T. 1996. American Airlines converts CRM in favor of human factors and safety training, the flightdeck, July/August, 1996. Flight Department, American Airlines; DFW Airport. Aviation Week and Space Technology article, September 6, 1996, p. 15.
Flexman, R. H. and Stark, E. A. 1987. Training simulators. In Handbook of Human Factors, G. Salvendy, Ed. New York: John Wiley & Sons, pp. 1012–1037.
Go, T. H. and Burki-Cohen, J. et al. 2003. The effects of enhanced hexapod motion on airline pilot recurrent training and evaluation, AIAA-2003-5678.
Hamman, W. R., Seamster, T. L., Lofaro, R. J. and Smith, K. M. 1992. The future of LOFT scenario design and validation. Proceedings of the Seventh International Symposium of Aviation Psychologists. R. S. Jensen, Ed. Columbus, OH.
Keeney, R. L. and Raiffa, H. 1976. Decisions with Multiple Objectives: Preferences and Value Tradeoffs. Hoboken, NJ: John Wiley & Sons.
Lofaro, R. J. and Smith, K. M. 2003. The finalized paradigm for operational decision-making (ODM) paradigm: Components and placement. Proceedings of the 12th International Symposium on Aviation Psychology. Dayton, OH.
Lofaro, R. J. and Smith, K. M. 2001. Operational decision making: Integrating new concepts into the paradigm. Proceedings of Eleventh International Symposium on Aviation Psychology. R. S. Jensen, Ed. Columbus, OH.
Lofaro, R. J. and Smith, K. M. 2001. A Paradigm for developing operational decision-making (ODM). Proceedings of 2001 SAE World Aviation Congress (WAC) Conference.
Various articles in Aviation Week and Space Technology, Vol. 15, No. 3, July 17, 2000, pp. 58–63.
Lofaro, R. J. and Smith, K. M. 1999. Operational decision-making (ODM) and risk management: Rising risk, the critical mission factors and training. Proceedings of the Tenth International Symposium of Aviation Psychologists. R. S. Jensen, Ed. Columbus, OH.
Lofaro, R. J. and Smith, K. A. 1998. Rising risk? Rising safety? The millennium and air travel special issue of the Transportation Law Journal Vol. 25, No. 2, University of Denver Press, Denver, CO.
Lofaro, R. J. and Smith, K. M. 1993. The role of LOFT in CRM integration. Proceedings of the Seventh International Symposium of Aviation Psychologists. R. S. Jensen, Ed. Columbus, OH.
Lofaro, R. J., Adams, R. J. and C. N. 1992. Workshop on Aeronautical Decision-Making (ADM) DOT/FAA/RD-92/14; Vol. I, II. National Technical Information Service: Springfield, VA. 22161.
Maurino, D. 1999. Crew resource management: A time for reflection. Chapter in Handbook of Aviation Human Factors. Daniel Garland, John Wise, and David V. Hopkin, Eds.
National Transportation Safety Board. 2007. Aviation Accident Report AAR-07/06. Retrieved January 3, 2015, from http://www.ntsb.gov/investigations/AccidentReports/Reports/AAR0706.pdf
Smith, K. M. and Hastie, R. 1992. Airworthiness as a design strategy. Proceedings of the Flight Simulator Air Safety Symposium, San Diego, CA.
Sprogis, H. 1997. Is the aviation industry expressing CRM failure? Proceedings of Ninth International Symposium on Aviation Psychology. J. Rakovan and R. S. Jensen, Eds. Columbus, OH.
Sternberg, R. J. 1985. Implicit theories of intelligence, creativity, and wisdom. Journal of Personality and Social Psychology, Vol. 49, No. 3, 607–627. https://doi.org/10.1037/0022-3514.49.3.607
FEDERAL AVIATION ADMINISTRATION ADVISORY CIRCULARS (AC) AND REGULATIONS
AC 120-35B: Line Operation Simulations (9/6/91)
AC 120-40B: Airplane Simulator Qualification (7/29/91)
AC 120-45A: Airplane Flight Training Device Qualification (2/5/92)
AC 120-AQP: Advanced Qualification Program (8/9/91). Note: This AC has been revised and will be reissued in early 2004.
SFAR 58: Advanced Qualification Program (5/29/03). Note: This is an extension of the original SFAR of 1991.
4
Integrating Effective Training and Research Objectives: Lessons from the Black Skies Series of Exercises
Christopher Best, Gregory Funke, Winston Bennett, Michael Tolston, Simon Hosking, and Robert Bolia
CONTENTS
Introduction
The Black Skies Exercises
Outcomes for Military Operators
    Performance Benefits and Transfer
    Auxiliary Benefits
Research on Team Coordination Dynamics
    Communication Dynamics
    Physiological Dynamics
    Combining Communication with Physiological Data
Operator Acceptance of Physiological Monitoring
Evaluation of Training Capability
Summary and Conclusion
Acknowledgements and Dedication
References
INTRODUCTION At the height of the COVID-19 pandemic in 2020, the Royal Australian Air Force (RAAF) cancelled its premier live air-combat exercise, known as Exercise Pitch Black. Despite this, a subset of the operators who would have taken part in that exercise were still able to undertake high-end training as part of a synthetic counterpart activity, known as Exercise Virtual Pitch Black. This synthetic exercise brought together air mobility and fast-jet elements with airborne and ground-based
DOI: 10.1201/9781003401353-4
command-and-control via distributed simulation to conduct integrated planning and execution of complex mission scenarios (Hartigan, 2020). The ability to undertake complex training despite the cancellation of the live exercise represented a significant benefit arising from over a decade of research, development, and investment by RAAF, Australia’s Defence Science and Technology Group (DSTG) and their international partners in the US Air Force Research Laboratory (AFRL) into the tools and methods of synthetic training. In this chapter, we describe the programme of research and development that laid the foundations for Virtual Pitch Black. In particular, we focus on a key component of that programme; the Black Skies series of research exercises (e.g., Shanahan, Best, Finch, Tracey, Vince, Hasenbosch, & Stott, 2009; Stephens, Crone, Temby, Best, & Simpkin, 2011; Best, Jia, & Simpkin, 2013; Francis, Best, & Yildiz, 2016). The Black Skies exercises provided a means by which RAAF operators could learn to work more effectively with others to achieve mission objectives and by which the RAAF as an organization could learn how to build and employ distributed simulation systems to augment live training. In addition, these exercises provided a rich and ecologically valid environment within which DSTG and AFRL scientists could undertake research to inform future capability development. These exercises led to a greater understanding of team learning and performance in operationally realistic settings, to the establishment of new training capability for the RAAF, and to the identification of requirements for future training capability development. The experience of the Black Skies series demonstrated that scientific and operational training objectives need not be mutually exclusive and provided a model for how the judicious integration of these objectives could serve to drive innovation in training.
THE BLACK SKIES EXERCISES Once every two years, between 2008 and 2016, the Aerospace Division of DSTG hosted Exercise Black Skies (EBS) in the weeks preceding Exercise Pitch Black (PB); the latter of which is a biennial, multinational, live-flying exercise conducted over Australia’s Northern Territory. EBS had three overarching objectives, which were captured in the motto: “Prepare, Evaluate, Demonstrate”. The first objective of EBS was to provide an opportunity for RAAF participants to prepare for PB. This was achieved by replicating many of the characteristics of PB during simulated EBS missions (e.g., airspace, order of battle, mission types, unit role assignments). The second objective was to use the realistic EBS mission scenarios to evaluate the impact of the tools and methods of synthetic training in an ecologically valid way. To achieve this, EBS incorporated investigations into various aspects of team performance as well as investigations of the transfer of performance benefits from the simulators to live PB missions. The third objective was to demonstrate the potential of synthetic training for air combat to RAAF. This was achieved partly by hosting visits by senior leaders and decision-makers responsible for RAAF training and capability development to observe the exercise and partly by leveraging the outcomes from
training transfer evaluations to communicate the real-world impact of the exercise on operator performance. Each iteration of EBS ran for five consecutive days, with briefings and familiarization on the first day, followed by four days of mission scenarios. Mission scenarios generally lasted for around 90 minutes. Each mission was preceded by planning and briefing sessions and followed by facilitated after-action reviews (AARs). Simulation systems for EBS consisted of research simulators which, though high in functional fidelity (i.e., the simulator behaved in a manner that faithfully replicated operational systems) were generally of only moderate physical fidelity (i.e., the simulator hardware differed in some ways from that of operational equipment). Specific attention was paid to ensuring that the EBS training environment was relatively high in psychological fidelity (Kozlowski & DeShon, 2004) by ensuring that scenarios, work processes, and team structures were representative of the operational environment. An important contributor to this was the exercise planning process, the beginning of which typically coincided with the Australian International Airshow, over 12 months prior to each iteration of EBS. Planning for EBS was centred on a series of three conferences, which brought together a diverse range of stakeholders, including representatives from the participating units, scientists and engineers from DSTG and their partner organizations, as well as contracted support staff including professional military training subject-matter experts (see McIntyre & Smith, 2013, for a description of the importance of this role) and software engineers. 
The goals of these conferences were to build shared understanding of the training objectives of the participating units and the research objectives of the scientists and engineering staff and to work collaboratively towards a design and narrative structure for the exercise scenarios that enabled both sets of objectives to be achieved to the greatest extent possible. Over the course of the EBS series, military participants included both airborne and ground-based air-battle managers (ABMs), joint terminal attack controllers (JTACs), and fast-jet aircrew. The exercises delivered measurable performance benefits as well as valuable auxiliary benefits for these military participants. In addition, EBS delivered research outcomes to help inform and de-risk the development of future training capability. In the following sections, we describe how the activity led to positive outcomes of both kinds.
OUTCOMES FOR MILITARY OPERATORS The utility of EBS for the military participants can be understood in terms of the positive impact the exercise had on participant performance and in terms of a number of auxiliary benefits that were observed. These two categories of outcomes are considered in turn below.
Performance Benefits and Transfer Despite the fact that EBS was led by researchers and conducted within research facilities, the activity consistently delivered measurable performance benefits for the
operators who took part; both in terms of their performance during the simulated exercise itself and during the subsequent live exercise. Performance in EBS was operationalized as expert assessor ratings against two sets of criteria. The first was a set of behaviourally anchored rating scales relating to the teamwork dimensions of Communication, Information Exchange, Leadership/Initiative, and Supporting Behaviour (Smith-Jentsch, Zeisig, Acton, & McPherson, 1998). The second was a set of role-specific mission objectives. During EBS missions, subject-matter expert assessors rated the performance of the participating teams against both sets of criteria. After EBS, follow-up evaluations of the performance of a subset of the participating teams were conducted during the live exercise PB. To provide a basis for comparison, observers at the live exercise also evaluated the performance of similarly experienced “control” teams who had not taken part in EBS. Difference scores were calculated to capture change in ratings of mission effectiveness and team coordination behaviours between the beginning and end of EBS and between the EBS teams and the control teams during live PB missions. These differences were then expressed in terms of percentage of scale maximum, with positive values indicating performance improvements and negative values indicating performance decrements. These scores are summarized in Figure 4.1. The first column of Figure 4.1 shows the mean percentage difference of team coordination for all teams (n = 8) across the EBS series (error bars represent the standard deviation). The data show that there was an average increase in performance of around 20% on team coordination processes from the first to the last Black Skies
FIGURE 4.1 Mean difference scores for Black Skies (synthetic) and Pitch Black (live) expressed as a percentage of scale maximum. Note: nEBS = number of EBS teams; nPB = number of PB teams.
mission. The second column of Figure 4.1 shows the mean percentage difference of team mission effectiveness across all teams in the EBS series. These data show that there was an increase in performance, on average, of around 10% on mission effectiveness from the first to the last Black Skies mission. The third and fourth columns of Figure 4.1 show the mean percentage difference score on team coordination and mission effectiveness between the EBS teams and the matched control teams across all observed live PB missions (n = 4). These data show that on average the teams that participated in EBS outperformed the control teams that did not by around 20% of scale maximum on team coordination and around 10% of scale maximum on mission effectiveness. These data make it clear that participating teams benefited from EBS in terms of improvements in performance and that these improvements persisted during subsequent missions undertaken in the live environment.
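The difference-score calculation described above can be made concrete with a short sketch. The ratings below are invented for illustration and are not the EBS data; the five-point scale maximum is likewise an assumption, since the scale range is not given here, and the function names are hypothetical.

```python
from statistics import mean, stdev


def pct_of_scale_max(before: float, after: float, scale_max: float) -> float:
    """Difference score as a percentage of the scale maximum.

    Positive values indicate improvement; negative values, a decrement.
    """
    return 100.0 * (after - before) / scale_max


def summarize(pairs: list, scale_max: float) -> tuple:
    """Mean and standard deviation of difference scores across teams."""
    diffs = [pct_of_scale_max(b, a, scale_max) for b, a in pairs]
    return mean(diffs), stdev(diffs)


# Invented (first mission, last mission) ratings for four teams, 5-point scale.
ratings = [(2.0, 3.0), (3.0, 3.5), (2.5, 4.0), (3.0, 4.0)]
avg, sd = summarize(ratings, scale_max=5.0)
```

The same function covers the live-exercise comparison if `before` is taken as the control team's rating and `after` as the EBS team's rating for the matched mission.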
Auxiliary Benefits As well as the consistent performance benefits described above, EBS provided a number of auxiliary benefits for participants in the form of opportunities for augmenting standard training, refining team processes, and making the most of the subsequent live exercise. One such outcome was that the teams who participated in EBS typically departed the exercise with updated plans for how they would execute their missions during the subsequent live exercise. By providing an environment within which coordination processes could be worked out and plans for the live exercise could be tested and refined, EBS added value to the live training experience. Evidence of this benefit was reflected both in the feedback of participants and that of their senior leaders. For example, free-text feedback obtained from EBS participants at the conclusion of the exercise included comments such as: “We have refined our team dynamics and bonded during this exercise”, “I found being able to learn from mistakes through actively applying fixes invaluable”, and “The ability to try and test different ways of tackling problems provided good learning for all”. These sentiments were reinforced during one iteration of PB, when a senior Air Force officer remarked that the performance of a team that had recently participated in EBS represented “the best Day 1 of Pitch Black” he had ever seen. In addition to providing participating teams with the ability to test and refine their own plans and processes, observations during EBS often led to suggested changes being fed back into planning for the live exercise as a whole. These suggestions typically related to characteristics such as airspace boundaries, the placement of tanker or Airborne-Early Warning (AEW) orbits or combat-air-patrol (CAP) points or to the timing of scenario events. 
Because of this feedback, it is likely that EBS had a positive impact on the quality of the live training experience for many more operators than just those who participated in EBS itself. Large, complex exercises such as PB are often used as the context for conducting operator performance assessments that contribute to the award of advanced qualifications. On their own initiative during the third iteration of EBS, participating units began conducting such observations during the simulated missions of EBS. This was
achieved by having the operators who were under assessment observed in a one-on-one fashion by a qualified trainer during their missions (a practice that is referred to as “back-seating”). This represented yet another practical benefit for the operators and units that were involved in EBS and also provided the research team with increased confidence in the ecological validity of the EBS research environment. A final practical benefit for the operators was observed during EBS 2016, when DSTG and AFRL researchers worked together to develop and provide advanced, semi-automated performance assessment and after-action review (AAR) systems for use by assessors during the exercise. On three occasions during that exercise, assessors were alerted by the prototype systems to events within the scenario that provided important learning opportunities and that they subsequently chose to focus on during the post-mission debrief, but which they stated they would have missed were it not for the automated alerts. This outcome provided good evidence for the utility of such systems for supporting complex training. But of more practical significance for the participating operators, it served to ensure that some important lessons that would not have been captured in other circumstances were identified, communicated, and learnt. When combined with the performance benefits described earlier, it is clear that EBS delivered valuable outcomes for the military operators who took part. In addition to that, EBS served as a rich and ecologically valid context for undertaking research on training and team performance. In the sections that follow, we highlight some of the outcomes from that research and describe how they will inform the development of future training capability.
RESEARCH ON TEAM COORDINATION DYNAMICS
One of the key lessons learnt during the EBS series was that it takes a significant amount of effort on the part of a large number of expert staff to plan and execute complex training events using current tools and methods. There is therefore great potential benefit to be gained from research into technologies that reduce the time and the number of expert staff required to plan and generate exercise scenarios, adapt training experiences to trainee needs, observe and evaluate trainee performance, and provide feedback that leads to measurable performance improvements. A primary goal of EBS was to develop and evaluate novel tools and techniques to address these needs. The focus of this effort was on near-real-time operator and team state monitoring, and in particular on non-linear methods for assessing changes in team state by analyzing dynamic patterns in data (Riley & Van Orden, 2005). From the dynamical systems perspective, teams are continuously evolving and highly interdependent sets of nested components whose dynamics are shaped by fluid constraints that couple individuals and result in similarities in their multimodal responses (Gorman, Dunbar, Grimm, & Gipson, 2017). Put another way, patterns observed in physiological responses such as heart rate and in behaviours such as communication, when teams interact to resolve task demands, form dynamical trajectories that reflect a confluence of constraints, both internal and external. The influence of these constraints, which manifest as restrictions on degrees of freedom arising from
interactions in the form of dynamic dependencies, confines the temporal evolution – the dynamics – of the system and is reflected as changes in individual- and team-level physiological responses as well as overt behaviour. For example, during the complex scenarios often observed in military settings, teams must cycle between regular, practised behaviours and unplanned, adaptive responses to unforeseen perturbations (Ishak & Ballard, 2011). Regular behaviours occur when executing procedures that are well trained and integrated into team functioning, such as a leader’s pushing information about objectives and progress during typical mission phases like marshalling and check-in, in the air-battle management domain. These behaviours tend to be relatively highly patterned. In this case, team dynamics are stable and predictable, and thus have a low entropy (a measure of complexity and predictability; Pincus & Singer, 1996; Richman & Moorman, 2000). This situation can be contrasted with adaptive responses to unforeseen events in which typical patterns may break down. For instance, during an aircraft emergency (e.g., an engine failure), the amount of communication traffic can quickly increase and unused communication pathways may be employed to convey the context-specific information needed to resolve an evolving set of problems. In this case, team dynamics are unplanned and emergent, which means they can fluctuate to provide local variation to meet global task demands (cf. Gorman et al., 2017). These fluctuations are breakdowns in the constraints that restrict the complexity of the system, and they occur in order to meet changing objectives. This reduction in constraints allows spontaneous reorganization of system dynamics to meet ongoing task demands, which increases adaptability and leads to less stability and predictability, thus resulting in higher entropy (Stephen, Boncoddo, Magnuson, & Dixon, 2009).
In sum, the responses of teams to changing task demands and unforeseen events are expected to lead to measurable changes in the complexity of multimodal physiological and behavioural data collected from the teams, which in turn can be used to identify learning opportunities. In EBS, we sought to leverage unobtrusive team-level responses in physiological and behavioural data to assess changes in team state with the goals of identifying potential learning opportunities and improving after-action reviews. We used several types of non-linear analyses to assess the temporal complexity of time series data observed from teams of ABMs, including complexity analysis using sample entropy (Richman & Moorman, 2000; Strang et al., 2012) and measures of dynamic stability from recurrence quantification analysis (RQA; Marwan, Romano, Thiel, & Kurths, 2007; Webber & Zbilut, 1994). In the following sections, we highlight results from our efforts in terms of communication and physiological signals.
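To make the complexity measure mentioned above concrete, the following is a minimal Python sketch of sample entropy in the sense of Richman and Moorman (2000). It is offered only as an illustration and is not the implementation used in EBS. The function name, the defaults `m=2` and `r=0.2`, the use of `<=` for the tolerance comparison, and the assumption that the input is a numeric series are all choices made here for illustration.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy SampEn(m, r) of a numeric sequence.

    r is a tolerance given as a fraction of the standard deviation
    of the series (an assumed, commonly used convention).
    """
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    tol = r * sd

    def count_matches(length):
        # The same n - m template start points are used for both
        # lengths, and a template is never compared with itself.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(series[i + k] - series[j + k]) <= tol
                       for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)      # template pairs that match over m points
    a = count_matches(m + 1)  # pairs that still match over m + 1 points
    if a == 0 or b == 0:
        return float("inf")   # undefined; reported here as infinity
    return -math.log(a / b)
```

Under these assumptions, a strictly alternating series such as `[1, 2] * 20` is perfectly predictable and yields an entropy of zero, whereas irregular series yield larger values.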
Communication Dynamics
Team communication is a direct indicator of team cognition and reflects the manner in which teams are organized (Cooke, Gorman, Myers, & Duran, 2013). Importantly, team communication, represented as categorical time series indicating who is speaking and on what channel, has been shown to be sensitive to cognitive workload in teams performing an ABM simulation (Strang et al., 2012). In EBS 2014, a prototype tool
called the “Dynamic, Real-time Analysis of Distributed Interactive Simulation packets tool” (DRADIS) was debuted to analyse real-time communication patterns in categorical time series data for change points (i.e., times during which the parameters that specify the mean and variance in the data change) without disrupting normal team communication (see Figure 4.2). DRADIS analyzed the temporal complexity of voice transmissions sent by ABMs during EBS missions in real time and sent alerts via a visual interface whenever entropy fell outside of 90% bootstrapped confidence intervals. The results were used to support subject-matter expert evaluation of team functioning during after-action reviews and users reported that DRADIS provided valuable information and insight into team dynamics (Dukes, Funke, Strang, & Best, 2015).
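The chapter does not describe how DRADIS derived its confidence intervals. As a hedged illustration of the general idea only, the sketch below builds a 90% percentile bootstrap interval for the mean of a baseline set of entropy values and flags a new value that falls outside it. The function names, the choice of the mean as the bootstrapped statistic, the number of resamples, and the example values are all assumptions rather than features of DRADIS.

```python
import random


def bootstrap_interval(baseline, level=0.90, n_boot=2000, seed=0):
    """Percentile bootstrap interval for the mean of `baseline`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(baseline)
    # Resample the baseline with replacement and record each mean.
    means = sorted(sum(rng.choices(baseline, k=n)) / n
                   for _ in range(n_boot))
    tail = (1.0 - level) / 2.0
    lower = means[round(tail * n_boot)]
    upper = means[round((1.0 - tail) * n_boot) - 1]
    return lower, upper


def entropy_alert(baseline, new_value, level=0.90):
    """Return True when `new_value` lies outside the bootstrap interval."""
    lower, upper = bootstrap_interval(baseline, level=level)
    return new_value < lower or new_value > upper


# Hypothetical entropy values from earlier analysis windows.
baseline = [0.50, 0.52, 0.48, 0.51, 0.49, 0.53, 0.47, 0.50]
print(entropy_alert(baseline, 0.90))  # a value far above the baseline
print(entropy_alert(baseline, 0.50))  # a value at the baseline mean
```

In a moving-window setting, `baseline` would be refreshed as new windows arrive; how often to refresh it, and which statistic to bootstrap, are design choices that this sketch leaves open.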
Physiological Dynamics
In addition to communications, patterns of physiological data are also informative of team state and dynamics. For instance, changes in heart rate can reflect degrees of excitement and investment in a task (Wright & Gendolla, 2012) and similarity in the patterning of changes in heart rate is related to mutual investment in shared
FIGURE 4.2 DRADIS display, showing different computed complexities (sample entropies), number of utterances, and summary from a 30-minute moving window.
FIGURE 4.3 Heart rates from all air-battle managers (ABMs; left) are simultaneously assessed for dynamic patterns using multivariate recurrence quantification analysis (MVRQA). %Determinism, an MVRQA metric related to the degree of randomness in the multivariate system, is then indexed for sudden changes using a change-point detection algorithm. A large change point can be seen immediately around the planned training event (the “inject”). %Determinism is also clearly higher for the team of ABMs than it is for the average of surrogate systems.
outcomes and trust between individuals (Konvalinka et al., 2011; Mitkidis, McGraw, Roepstorff, & Wallot, 2015; Tolston et al., 2018). Heart rate is also associated with different emotional states (Cacioppo, Berntson, Larsen, Poehlmann, & Ito, 2000) and with cognitive workload (Charles & Nixon, 2019). Nevertheless, mapping heart rate to task performance outcomes can be difficult, even in the best-case scenario of a single participant in a tightly controlled experiment. With large teams of military operators performing in highly complex environments, calculating useful heart rate metrics becomes extremely challenging. To address this, we sought to identify metrics of team-level cohesion using multivariate recurrence quantification analysis (MVRQA, a non-linear tool for assessing structure in complex signals; Cao, Mees, & Judd, 1998; Proulx, Côté, & Parrott, 2009; Wallot, Roepstorff, & Mønster, 2016). MVRQA considers the heart rates of all team members as a single system and thus provides metrics of complexity and stability of dynamics of the team as a whole. Previous work has shown that MVRQA can be used to assess team physiological–behavioural coupling (Tolston et al., 2018), which in turn has been shown to index cohesion, workload, and stress (e.g., Strang, Funke, Russell, Dukes, & Middendorf, 2014).
In our analyses of the data from EBS 2014, we investigated whether MVRQA can scale to large teams (i.e., teams of up to 13 members) in highly complex situations (Tolston, Best, Funke, Menke, & Dukes, 2016). Thirteen RAAF military members took part in six complex air-combat missions in a DSTG simulation facility in Melbourne, Australia over the course of four days. Each mission featured an unexpected event – called a “scenario inject” – to test team effectiveness. Heart rate data collected from team members were simultaneously assessed for evidence of physiological coupling using MVRQA within each mission.
MVRQA measures were evaluated against synthesized surrogate signals for greater-than-chance level coupling; each surrogate signal was generated to have a fractal structure similar
to that of the heart rate of an individual during the mission in question (cf. Strang et al., 2014). MVRQA measures were evaluated for sudden changes in estimated distribution parameters using an online Bayesian change point (BCP) detection algorithm (Adams & MacKay, 2007) and distances from the beginning of the inject to the nearest estimated change point were compared to chance levels. The results showed that the mean value of MVRQA %Determinism – a measure that is inversely related to the degree of randomness in data – was significantly higher for real teams than for surrogate signals, meaning that there were dynamical dependencies in the data that were revealed by MVRQA. Additionally, following initiation of the scenario inject, BCP analyses of %Determinism detected a significant variation in coupling between team members within an average of less than two minutes, a value significantly lower than values obtained from surrogate analyses. This indicates that MVRQA of team-level heart rate was linked to environmental factors. Importantly, the number of change points did not differ between teams and surrogates, meaning that the surrogate analyses succeeded in replicating the degree of non-stationarity in the data, thereby providing supporting evidence that the distribution of change points in the team data was linked temporally to the onset of the perturbations. These results show that interactions and dependencies of large teams in complex tasks can be meaningfully summarized by sensitive, low-dimensional values: a single metric indexing the degree of physiological coupling between up to 13 teammates showed sensitivity to disruptive mission events. This provides evidence for the utility of MVRQA in the assessment of coupling in even a large number of signals and demonstrates the applied potential of this technique for supporting team performance assessment and monitoring changes in team coordination processes in realistic settings.
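The %Determinism metric discussed above can be illustrated with a short Python sketch. It is a simplified reading of multivariate recurrence quantification, not the analysis pipeline used in the study: each time step is treated as one state vector containing every team member's value, two time steps are counted as recurrent when their vectors lie within a fixed Euclidean radius, and %Determinism is the share of recurrent points that fall on diagonal lines of at least `min_line` points. The function name, the absence of time-delay embedding, the fixed radius, and the example data are assumptions made for illustration.

```python
import math


def percent_determinism(signals, radius, min_line=2):
    """%DET for a list of equal-length series, one per team member."""
    n = len(signals[0])
    points = list(zip(*signals))  # one multivariate state per time step

    def recurrent(i, j):
        return math.dist(points[i], points[j]) <= radius

    total = 0     # recurrent points above the main diagonal
    on_lines = 0  # recurrent points on diagonal lines >= min_line long
    for offset in range(1, n):  # walk each diagonal above the main one
        run = 0
        for i in range(n - offset):
            if recurrent(i, i + offset):
                total += 1
                run += 1
            else:
                if run >= min_line:
                    on_lines += run
                run = 0
        if run >= min_line:  # close a line that reaches the diagonal's end
            on_lines += run
    return 100.0 * on_lines / total if total else 0.0


# Two hypothetical team members following the same repeating pattern.
pattern = [0, 1, 2, 3] * 5
print(percent_determinism([pattern, pattern], radius=0.1))
```

Because the recurrence plot is symmetric, only the diagonals above the main diagonal are scanned. Comparing the value for a real team with values from surrogate signals, as described above, would require a surrogate generator that this sketch does not attempt to provide.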
Combining Communication with Physiological Data
The analyses described above demonstrated that MVRQA can scale to large teams (i.e., teams of up to 13 members) in highly complex situations (Tolston et al., 2016) and that interactions and dependencies of large teams in complex tasks can be meaningfully summarized by sensitive, low-dimensional values. They also showed that changes in team communication patterns in complex training environments vary systematically around perturbations (Dukes et al., 2015; cf. Wiltshire, Butner, & Fiore, 2018). Together, these outcomes highlight the potential utility of univariate signals for summarizing the complex dynamics of physiological and behavioural responses of military teams and thereby detecting meaningful changes in team coordination dynamics. However, since these metrics relied on summary statistics of the entire system separated by modality, the approaches taken did not provide diagnostic information regarding which subcomponents of the multivariate system were most responsive to events (i.e., which signals and team members were most responsive to changes in task demands). In more recent efforts, we have combined non-linear heart rate variability metrics from individuals and the whole team with communication patterns into a single
feature space and conducted multivariate analyses to determine which signals are likely to be most responsive to unanticipated events. As with the univariate analyses described above, we expected that there would be meaningful patterns that emerge in multivariate characterizations of team responses that map on to critical training events. Further, we expected that multivariate analyses would lend insight into how the teams adapted to those events. Our analyses showed that multivariate change-point analysis of MVRQA of physiological data and communication complexity in data from large teams of ABMs in complex scenarios can uncover patterns that vary in regular ways around critical mission events (Tolston et al., 2019). Importantly, the multivariate approach allowed detailed diagnostics of team responses to particular mission events. To follow up on this analysis, we are currently evaluating how topological data analysis (TDA, a powerful way of identifying patterns in data; Carlsson, 2009; Singh, Mémoli, & Carlsson, 2007) can be used to assess multivariate team data to identify regular patterns in team coordination and physiological states. Our preliminary findings support the proposition that teams enter into stable trajectories that correspond to dynamical attractors and that teams under perturbation re-organize to process the perturbation and return to stable attractors, which forms cycles in the data (Tolston et al., 2019). These results provide evidence for the utility of combining MVRQA and TDA in the assessment of complex high-dimensional data in high-fidelity training environments. The research described in this section demonstrates that real-time assessment of communications can aid subject-matter experts during after-action reviews and that assessment of heart rate data from large teams provides useful indices of team state. 
By focusing on changes in dynamical systems, we have shown that it is possible to detect when teams transition into a new state, which often happens around critical periods of the mission (cf. de Mooij et al., 2020). Being able to draw the attention of instructors and assessors to changes in team state or to drive changes in the behaviour of adaptive training systems on the basis of such information provides significant opportunities for the development of training capability.
For future work, we believe that identifying low-dimensional variables to quantify dynamics of the task environment, a critical source of constraints and perturbations for the team, is an important next step in both improving automated team interventions and identifying adaptive training opportunities. For instance, longitudinal metrics characterizing the spatial relations between aircraft could be combined with low-dimensional MVRQA metrics to create a time series indicator of environment-team coupling. In this case, a strongly coupled environment-team system would imply that the environment and team are evolving together, indicating a strong mutual influence that could point to effective or responsive teamwork. Alternatively, a weakly coupled or uncoupled environment-team system could mean that there are changes in the environment that the team has not yet responded to, or that there are changes in the dynamics of the team that reflect internal reorganizations or perturbations in the team itself. We believe that being able to measure such symmetric and asymmetric changes in dynamics of a team and its environment will be critical for developing smart, real-time interventions from automated aids.
OPERATOR ACCEPTANCE OF PHYSIOLOGICAL MONITORING
The potential applications of near-real-time monitoring of operator and team states such as workload, situation awareness, and fatigue are many, both in operational and training settings. In the training domain, such information could be used to inform performance evaluation, help tailor the behaviour of adaptive training systems, alert instructors to potentially important learning points, or identify critical events for feedback during after-action review. Research of the kind described in the preceding section is needed to develop and validate the approaches to data analysis that underpin such applications. However, the effective acquisition, interpretation, and utilization of the information arising from operator state monitoring technologies in applied settings also depend upon the acceptance of these technologies, both by those who are being monitored (e.g., trainees) and by those doing the monitoring (e.g., instructors, curriculum designers). Given this, it is important to understand the tendency of operators to either accept or reject such technologies as well as the factors underlying this.
Factors that may either increase or decrease acceptance of monitoring technologies have been identified in the research literature. Examples of the former include perceptions that the benefits of the technology outweigh any risks (Moran et al., 2013) or the belief that performance monitoring makes available data that individuals can use for their own purposes, such as keeping track of their own physical fitness (e.g., Heron & Smyth, 2010). Examples of the latter include fear of unwanted disclosure of health-related information (e.g., Ahamed, Talukder, & Kameas, 2007) and feelings of discomfort or anxiety associated with being the subject of evaluation by a superior, colleague, or even the monitoring system itself (e.g., Zeidner & Matthews, 2005).
During EBS14 and EBS16, four teams of C2 operators (27 operators in total) were asked to wear a physiological monitoring device for the duration of the exercise scenarios. Two of these teams were from the ground-based C2 environment and two were from the airborne environment. The device used was the Zephyr BioHarness 3 (Medtronic Zephyr, Boulder CO, USA). The Zephyr BioHarness is a lightweight physiological sensor designed to be worn against the wearer’s chest by means of a flexible synthetic strap. It records electrocardiographic (ECG), respiration, and accelerometry data at rates of 250, 100, and 25 Hz respectively, as well as summary statistics at a rate of 1 Hz. Data were recorded throughout each exercise scenario to the onboard memory of the recording module. At the end of each scenario, data were downloaded from each operator’s module to a central database for subsequent analysis. At the conclusion of each exercise, participants were asked to complete a Device Comfort Questionnaire (DCQ). The first version of this instrument, comprising five subscales, was used during EBS14. Items on the first subscale, device ergonomics, related to fit factors (e.g., ease of donning/doffing). The second subscale, acceptance of physiological monitoring, included items related to operators’ comfort with being monitored. The last three subscales, namely endorsement of use during simulation training exercises, live training exercises, and real operations, asked respondents to evaluate the degree to which they would accept the use of monitoring technologies
for a variety of purposes (e.g., performance assessment, adjustment of task difficulty) across those three different contexts (see Menke et al., 2015, for a detailed description of this instrument).
An updated version of the DCQ comprising nine subscales was used in EBS16. The first subscale, physical and psychological comfort, combined the device ergonomics and acceptance of physiological monitoring subscales from the previous version of the DCQ. The next six subscales were constructed to distinguish between different contexts for using monitoring technologies as well as between contexts in which data are interpreted and used by a human decision-maker (e.g., an instructor) versus a machine, such as an adaptive training system. These subscales were: (1) endorsement of use during simulation training exercises to support human decision-makers, (2) endorsement of use during simulation training exercises to support adaptive automation, (3) endorsement of use during live training exercises to support human decision-makers, (4) endorsement of use during live training exercises to support adaptive automation, (5) endorsement of use during real operations to support human decision-makers, and (6) endorsement of use during real operations to support adaptive automation. Items on the next subscale, comfort with ubiquitous monitoring, asked participants to indicate how comfortable they would be wearing physiological monitoring devices during off-duty activities, such as during sleep and leisure activities. The final subscale, comfort with assessment consequences, included items related to potentially unanticipated consequences of physiobehavioural monitoring, such as disclosure of medical information (see Funke et al., 2017, for details of this instrument).
Analyses conducted on the data from the first administration of the DCQ revealed surprisingly strong acceptance of the monitoring devices and their various use cases.
This appeared to indicate that, from the operators’ perspective, the perceived benefits of monitoring outweigh any perceived risks. This was a surprising outcome, given that there remain significant limitations on the extent to which such devices and the data they make available can be considered consistently reliable and valid across a wide range of operational contexts and proposed applications (e.g., Christensen, Estepp, Wilson, & Russell, 2012; Hancock & Matthews, 2019; Matthews, Reinerman-Jones, Barber, & Abich, 2015). One possible implication of this outcome is that operators may be unfamiliar with the capabilities and limitations of current technologies. Based on this, the authors concluded that it would behove those working in the area to ensure that expectations are appropriately calibrated against the actual capabilities of the systems under development. Failure to do so could result in violated expectations, mistrust, and disuse of future capabilities (Menke et al., 2015). Some potential applications of monitoring technologies involve providing information to human decision-makers, while others involve informing the behaviour of intelligent, adaptive systems. For its second administration, the structure of the DCQ was refined in order to investigate whether acceptance of monitoring technologies varied systematically between these two use cases. Analyses conducted on the data from the second administration of the DCQ suggested that this factor did indeed have an impact. While the operators’ responses were not as overwhelmingly
positive in the second administration as the first, the subscale scores that departed from the scale midpoint in a statistically significant way (physical and psychological comfort, endorsement of use during simulation training exercises to support human decision-makers, and endorsement of use during live training exercises to support human decision-makers) did so universally in the positive direction. Furthermore, there was a statistically significant difference across subscales, depending upon who was named as being the user of the data. Specifically, operator ratings were significantly more positive on average for subscales that named a human decision-maker than for those that named an intelligent, adaptive system as the user of the data.
This difference in acceptance could have significant implications for the use of monitoring technologies to enable the provision of large-scale, complex training experiences in Australia and other countries with similarly sized militaries. As described above, a key constraint on the provision of such training is the availability of expert trainers to plan and oversee the execution of these activities and of operational personnel to generate scale and realistic learning opportunities by participating in exercise scenarios. Monitoring technologies can go some way towards extracting additional learning benefits from complex training events by, for example, drawing the attention of the relatively small number of available instructors to potentially important learning points that might otherwise be missed (indeed, specific examples of this were observed during EBS 2016). However, if these technologies are to support a significant increase in the frequency and regularity with which complex training can be provided, without a concomitant increase in the staff and other resources required, the development and broad adoption of intelligent, adaptive training systems is likely to play a role.
These outcomes from EBS, therefore, serve to highlight an area in which further research is required to understand and overcome potential barriers to the adoption of new training technologies.
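The comparison of subscale scores against the scale midpoint reported above can be illustrated with a small Python sketch. The chapter does not state which statistical test was applied, so the exact two-sided sign test used here is a stand-in chosen for its simplicity, and the 5-point scale, the midpoint of 3, and the ratings are hypothetical values rather than EBS data.

```python
from math import comb


def sign_test_vs_midpoint(scores, midpoint):
    """Exact two-sided sign test of `scores` against `midpoint`."""
    above = sum(1 for s in scores if s > midpoint)
    below = sum(1 for s in scores if s < midpoint)
    n = above + below  # ratings equal to the midpoint are dropped
    if n == 0:
        return 1.0
    k = min(above, below)
    # Probability of k or fewer ratings on the rarer side when both
    # sides are equally likely, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical ratings on a 5-point scale with a midpoint of 3.
ratings = [4, 4, 5, 4, 3, 5, 4, 4, 5, 4]
print(sign_test_vs_midpoint(ratings, midpoint=3))  # 2 / 2**9 = 0.00390625
```

A small p-value here would indicate that ratings fall on one side of the midpoint more often than chance would suggest; the direction of the departure is read from whether most ratings lie above or below the midpoint.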
EVALUATION OF TRAINING CAPABILITY
Networked simulators and associated support systems of the kind developed, demonstrated, and transitioned to RAAF during the EBS series have introduced the opportunity to more closely replicate conditions encountered in combat and to record performance parameters for analysis and feedback. Further, they hold the promise of gaining efficiencies in training through a more effective blending of simulation and live training. However, it is unlikely that the promise of these technologies can be fully realized unless their design, use, and evolution are informed by a thorough understanding of combat-mission training requirements. Within the United States, a suite of tools and methods collectively known as the Mission Essential Competencies (MEC) approach has been successfully applied to define and track training requirements across a range of capabilities. MECs, along with associated supporting competencies, knowledge, skills, and developmental experiences, have been defined for every tactical platform in the United States Air Force (USAF), command-and-control platforms in both the USAF and the United States Navy (USN), as well as intelligence, surveillance, and reconnaissance (ground and airborne), close air support (air and ground operations), and multinational
mission sets in air-to-air, air-to-ground, command and control, peacekeeping support, and joint terminal attack control. Given the success of the implementation of MECs, many efforts have been refreshed as the capabilities of the systems or mission requirements have changed over time and the results have been incorporated in training systems (see Bennett, Alliger, Colegrove, Garrity, & Beard, 2013, for a detailed description of the MEC processes and products). The final exercise to carry the Black Skies name, EBS 2016, represented a transition of capability from DSTG to RAAF and marked the initial operating capability (IOC) for a facility that was then known as the Joint Air Warfare Battle Laboratory (JAWBL; currently the RAAF Distributed Training Centre) at RAAF Base Williamtown. The purpose of the JAWBL was to inform and de-risk requirements for more permanent simulation and experimentation capabilities that would become part of the RAAF Air Warfare Centre. The baseline systems installed into the JAWBL and the planning, staffing, training design, assessment, and feedback methods used in EBS16 were the products of iterative development and evaluation that took place within DSTG’s Melbourne facilities throughout the preceding decade (e.g., Crane et al., 2006). To achieve the objectives of JAWBL, it was necessary to build upon this foundation by (1) codifying RAAF combat-mission training requirements, (2) developing and trialling systems and methods for addressing those requirements, (3) evaluating the relative strengths and weaknesses of those systems and methods, and (4) tracking progress over time to target resources effectively. To support this effort, DSTG and AFRL researchers partnered with RAAF to conduct a MEC-based analysis of JAWBL capability. As this would be the first time the MEC approach had been applied to a specifically Australian platform, a secondary objective was to evaluate the utility of the approach in the Australian context. 
The capability chosen for this analysis was the Air Defence Ground Environment (ADGE), operated by RAAF Number 41 Wing. The ADGE is a ground-based, tactical air-battle management (ABM) capability. ADGE operators had taken part in every iteration of EBS prior to 2016 and, as a result, their systems, missions, and roles were reasonably well understood. The ADGE MEC effort led to the identification of 9 ADGE mission essential competencies, 24 supporting competencies, 63 knowledge elements, 60 skills, and 73 developmental experiences. These materials were used during EBS16 to characterize the status of JAWBL training systems at the time of IOC. One aspect of the MEC approach that sets it apart from many other approaches to training analysis is its inclusion of “Developmental Experiences” as a driver of system requirements, scenario design, and training effectiveness evaluation. Experiences are defined within the MECs framework as developmental events during training and/or career activities that are necessary to learn knowledge, skills, or supporting competencies under operational conditions (Bennett et al., 2013). MEC experiences can be particularly useful in the context of capability development because they help to define what kinds of scenarios trainees should be exposed to – and therefore what kinds of stimuli and responses training systems should be able to support. They also support training evaluation by providing criteria that are straightforward for participants to understand and for decision-makers to interpret.
A “Training Experiences Survey” was constructed from the list of 73 ADGE mission-essential developmental experiences. Near the conclusion of EBS16, ADGE operators were asked to rate, based on what they had seen during the exercise, the extent to which they felt the existing JAWBL systems could support the effective provision of each experience. They were also asked to rate the importance of each experience for becoming fully combat-mission ready. Analysis of these data showed that participants judged that JAWBL capabilities, as demonstrated during EBS16, supported the provision of a majority of developmental experiences to at least some extent. Summing across participants and items, more than three-quarters (77.6%) of effectiveness ratings indicated that JAWBL was “Somewhat Effective” or better. Almost two-thirds (62.3%) of these ratings indicated that JAWBL was “Quite Effective” or better. While these outcomes were broadly positive, it was also apparent from the existence of a number of ratings in the “Not at all Effective” and “Slightly Effective” categories that there was room for improvement. A closer examination of these responses was undertaken to identify opportunities for capability development. Of the 73 ADGE developmental experiences, there were a total of 15 that received a median rating less than “Somewhat Effective”. Additionally, there were a total of 26 experiences for which two or more ADGE participants agreed that existing JAWBL capabilities were less than “Somewhat Effective”. There was substantial overlap between these two lists of experiences, such that the former was a subset of the latter, with the exception of just one experience. Examination of this list revealed three possible reasons why participants might have evaluated the JAWBL as relatively ineffective for providing some developmental experiences. 
First, the list contained a subset of experiences that are, by their nature, only marginally relevant for exercises of the kind represented by EBS16. Examples included “Operate in a deployed or field environment”, “Familiarization visits”, and “Being an exchange officer”. It was concluded that little could or should be done within JAWBL to address these particular shortcomings. Of the experiences that appeared to be better suited to EBS and JAWBL, two further sub-categories could be discerned. Some experiences were not nominated as training objectives by participating units during planning for EBS16 but nevertheless could have been provided if required. Examples included “Observe and respond to a civilian emergency”, “Operate under extreme fatigue, long hours, stress”, and “Experience a lost or unaccounted for aircraft”. Finally, the list contained experiences that, while well-suited to training environments like EBS in general terms, were indeed not possible within the JAWBL at the time the exercise was conducted due to the status of those particular simulation systems. Examples included “Operate in a GPS denied environment” and “Operations during degraded comms”. To help interpret the findings presented above, the underlying reasons for participant evaluations were represented diagrammatically in the form of a “MEC Experience Stack”, as shown in Figure 4.4. This model depicts the categories of developmental experiences described above as well as a path for experiences to move between categories over time as training capability develops, matures, and is demonstrated to be effective. The top layer of the stack (i.e., above the dashed
line representing the effectiveness threshold) contains those experiences for which high effectiveness ratings have been obtained (Category A). The next layer down (i.e., immediately below the dashed line representing the effectiveness threshold) divides low-effectiveness experiences into those that could reasonably be considered within-scope for the training environment under consideration and those that should be considered out-of-scope (Category B). The bottom layer further differentiates amongst within-scope experiences, categorizing them as either supportable given current system status (Category C) or not currently supportable (Category D).

FIGURE 4.4 A MEC-based model to guide training design and system development. Mission-essential developmental experiences are categorized according to whether they could, in principle, be provided within a given training environment, and if so, whether there is evidence to indicate that they are provided effectively given current system status. Arrows indicate a developmental trajectory.

A developmental trajectory for JAWBL or similar facilities is represented by the arrows in Figure 4.4. According to this model, progress is represented by the achievement of outcomes that serve to migrate learning experiences from Category D, through Category C, and eventually into Category A. Building on this model and its underlying logic, a set of metrics was derived by which the status and progress of the JAWBL could be tracked. The first metric was a measure of the total possible training scope of the capability, defined as the proportion of the entire list of mission-essential learning experiences for a given training audience that could, in principle, be addressed – whether or not they are supported by existing systems. This measure helps to quantify the ideal of what can be expected from the training environment. It is a representation of what proportion of training objectives could be achieved for a given training audience within a facility of this kind in a so-called “perfect world”. The second metric was an estimate of the proportion of mission-essential developmental experiences that could currently be provided to a reasonable level of effectiveness, given the existing state of the capability.
Unlike the previous metric, this is not an operationalization of some ideal state, but rather, a measure of current status. There are two categories of experiences in
the model that could in principle represent current capability: one for which positive effectiveness ratings have been obtained (Category A) and one for which such ratings have not yet been obtained, but which nevertheless could be supported if they were written into mission scenarios (Category C). A more conservative approach to characterizing existing capability, which yields an alternative version of this metric, takes into account only those experiences for which supporting data have been obtained and excludes those experiences that could be supported but have not yet been demonstrated.

The MEC products and processes, along with the model and metrics described above, provided a relatively straightforward way of operationalizing, understanding, tracking, and communicating JAWBL status, as well as focusing capability development and future training and research efforts. In this way, these tools and methods hold significant promise for ensuring that participating operators gain benefits from activities such as EBS and that resources directed at expanding and evolving training capabilities are targeted effectively.
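As a rough illustration of how the metrics described above could be computed once each developmental experience has been assigned to a category of the MEC Experience Stack, consider the sketch below. It reflects one reading of the text: training scope is taken as every experience that is not out-of-scope (Categories A, C, and D), current capability as Categories A and C, and the conservative variant as Category A alone. The function name, the dictionary keys, and the example counts are hypothetical and are not taken from the EBS16 data.

```python
from collections import Counter

# Category labels follow the description of Figure 4.4 in the text:
# A = rated effective, B = out of scope, C = in scope and supportable but
# not yet demonstrated, D = in scope but not currently supportable.
VALID = {"A", "B", "C", "D"}
IN_SCOPE = {"A", "C", "D"}  # assumption: everything except Category B


def stack_metrics(categories):
    """Compute the proportions described in the text.

    categories maps each developmental experience to its category label.
    """
    unknown = set(categories.values()) - VALID
    if unknown:
        raise ValueError(f"unknown category labels: {sorted(unknown)}")
    counts = Counter(categories.values())
    total = sum(counts.values())
    return {
        # Metric 1: what could, in principle, be addressed.
        "training_scope": sum(counts[c] for c in IN_SCOPE) / total,
        # Metric 2: what can currently be provided (demonstrated or not).
        "current_capability": (counts["A"] + counts["C"]) / total,
        # Conservative variant: only experiences backed by rating data.
        "demonstrated_capability": counts["A"] / total,
    }


# Ten invented experiences: five in A, one in B, two in C, two in D.
labels = "AAAAABCCDD"
example = {f"experience_{i:02d}": c for i, c in enumerate(labels, start=1)}
metrics = stack_metrics(example)
```

Tracking these three proportions across successive exercises would show experiences migrating from Category D through Category C into Category A, which is the developmental trajectory the model describes.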
SUMMARY AND CONCLUSION
The Black Skies series of exercises provided demonstrably effective training as well as a range of valuable auxiliary benefits for the military operators who participated. By providing concrete outcomes for the personnel involved, these exercises served to foster engagement by decision-makers at all levels within RAAF and, over time, helped build the case for investment in new training capabilities. In addition, the realistic combat missions simulated during EBS provided an ecologically valid context within which to investigate team learning and performance and to inform the development of emerging training tools and methods. The EBS motto of “Prepare, Evaluate, Demonstrate” served to characterize the multi-faceted objectives of the exercises and to focus exercise planning and execution on balancing the needs of a variety of stakeholders. Through thorough planning, deep stakeholder engagement, and close collaboration between experts with a diverse range of skillsets, the EBS series demonstrated that it is possible – and indeed beneficial – to pursue scientific and operational training objectives in tandem. In doing so, we believe these exercises serve as a model for how the judicious integration of operational and research objectives can drive innovation in training.
ACKNOWLEDGEMENTS AND DEDICATION
The authors acknowledge the significant contributions made to the EBS series by the participating members of the Royal Australian Air Force and the Australian Army; researchers from DSTG Aerospace Division’s Human Factors and Air Operations Simulation Groups, Defence Technology Agency New Zealand, and Defence Science and Technology Laboratory UK; and industry partners from Aptima, Ball Aerospace, Milskil, and Simulation Solutions Australasia.

In memory of Michael Skinner and Ron Best, whose enthusiasm for the work described here was a source of great support.
REFERENCES
Adams, R. P., & MacKay, D. J. (2007). Bayesian online changepoint detection. arXiv Preprint.
Ahamed, S. I., Talukder, N., & Kameas, A. D. (2007, September). Towards privacy protection in pervasive healthcare. Paper presented at the 3rd IET International Conference on Intelligent Environments (IE 07), Ulm, Germany.
Bennett, W., Alliger, G. M., Colegrove, C. M., Garrity, M. J., & Beard, R. M. (2013). Mission essential competencies: A novel approach to proficiency-based live, virtual, and constructive readiness training and assessment. In C. Best, G. Galanis, J. Kerry, & R. Sottilare (Eds.), Fundamental Issues in Defense Training and Simulation (pp. 47–62). Aldershot: Ashgate.
Best, C., Jia, D., & Simpkin, G. (2013). Air force synthetic training effectiveness research in the Australian context. Proceedings of the NATO STO MSG 111 Multi-Workshop, Sydney, October, 2013.
Cacioppo, J. T., Berntson, G. G., Larsen, J. T., Poehlmann, K. M., & Ito, T. A. (2000). The psychophysiology of emotion. In M. Lewis & J. M. Haviland-Jones (Eds.), Handbook of Emotions (2nd ed., pp. 173–191). New York: Guilford Press.
Cao, L., Mees, A., & Judd, K. (1998). Dynamics from multivariate time series. Physica D: Nonlinear Phenomena, 121(1), 75–88.
Carlsson, G. (2009). Topology and data. Bulletin of the American Mathematical Society, 46(2), 255–308.
Charles, R. L., & Nixon, J. (2019). Measuring mental workload using physiological measures: A systematic review. Applied Ergonomics, 74, 221–232.
Christensen, J. C., Estepp, J. R., Wilson, G. F., & Russell, C. A. (2012). The effects of day-to-day variability of physiological data on operator functional state classification. Neuroimage, 59, 57–63.
Cooke, N. J., Gorman, J. C., Myers, C. W., & Duran, J. L. (2013). Interactive team cognition. Cognitive Science, 37(2), 255–285.
Crane, P., Skinner, M., Best, C., Burchat, E., Gehr, S. E., Grabovac, M., Pongracic, H., Robbie, A., & Zamba, M. (2006). Exercise Pacific link: Coalition distributed mission training using low-cost communications. Proceedings of the 2006 Australasian Simulation Technology and Training Conference (SimTecT), Melbourne, May 2006.
De Mooij, S. M. M., Blanken, T. F., Grasman, R. P. P. P., Ramautar, J. R., Van Someren, E. J. W., & van der Maas, H. L. J. (2020). Dynamics of sleep: Exploring critical transitions and early warning signals. Computer Methods and Programs in Biomedicine, 193, 105448. https://doi.org/10/ggsptj
Dukes, A. W., Funke, G. J., Strang, A. J., & Best, C. J. (2015). DRADIS – Real Time Communication Analysis. Poster presented at the 2015 International Symposium on Aviation Psychology, Dayton, OH.
Francis, C., Best, C., & Yildiz, J. (2016). Improving air force operator performance through synthetic mission rehearsal. Proceedings of the SISO Simulation Innovation Workshop, Orlando, FL, September 2016 [Invited Paper].
Funke, G., Best, C., Menke, L., & Strang, A. J. (2017). Warfighter acceptance of future physio-behavioral monitoring and augmentation: Update. Proceedings of the Annual Meeting of the Human Factors and Ergonomics Society, 61, 1151–1155.
Gorman, J. C., Dunbar, T. A., Grimm, D., & Gipson, C. L. (2017). Understanding and modeling teams as dynamical systems. Frontiers in Psychology, 8, 1053.
Hancock, P. A., & Matthews, G. (2019). Workload and performance: Associations, insensitivities, and dissociations. Human Factors, 61, 374–392.
Hartigan, B. (2020, July 16). Exercise Virtual Pitch Black delivers complex training [Article]. Contact Magazine. Retrieved from: https://www.contactairlandandsea.com/2020/07/16/exercise-virtual-pitch-black-delivers-complex-training/
Heron, K. E., & Smyth, J. M. (2010). Ecological momentary interventions: Incorporating mobile technology into psychosocial and health behaviour treatments. British Journal of Health Psychology, 15, 1–39.
Ishak, A. W., & Ballard, D. I. (2011). Time to re-group: A typology and nested phase model for action teams. Small Group Research, 43(1), 3–29.
Konvalinka, I., Xygalatas, D., Bulbulia, J., Schjødt, U., Jegindø, E.-M., Wallot, S., Van Orden, G., & Roepstorff, A. (2011). Synchronized arousal between performers and related spectators in a fire-walking ritual. Proceedings of the National Academy of Sciences, 108(20), 8514–8519.
Kozlowski, S. W., & DeShon, R. P. (2004). A psychological fidelity approach to simulation-based training: Theory, research and principles. In S. G. Schiflett, L. R. Elliott, E. Salas, & M. D. Coovert (Eds.), Scaled Worlds: Development, Validation and Applications (pp. 75–99). Aldershot: Ashgate.
McIntyre, H. M., & Smith, E. (2013). Key tenets of collective training. In C. Best, G. Galanis, J. Kerry, & R. Sottilare (Eds.), Fundamental Issues in Defense Training and Simulation (pp. 125–134). Aldershot: Ashgate.
Marwan, N., Romano, M. C., Thiel, M., & Kurths, J. (2007). Recurrence plots for the analysis of complex systems. Physics Reports, 438(5), 237–329. https://doi.org/10.1016/j.physrep.2006.11.001
Matthews, G., Reinerman-Jones, L. E., Barber, D. J., & Abich, J. (2015). The psychometrics of mental workload: Multiple measures are sensitive but divergent. Human Factors, 57, 125–143.
Menke, L., Best, C., Funke, G., & Strang, A. (2015). Warfighter acceptance of future physiological monitoring and augmentation: A coalition study. Proceedings of the Annual Meeting of the Human Factors and Ergonomics Society, 59, 125–129.
Mitkidis, P., McGraw, J. J., Roepstorff, A., & Wallot, S. (2015). Building trust: Heart rate synchrony and arousal during joint action increased by public goods game. Physiology & Behavior, 149, 101–106.
Moran, S., Jaeger, N., Schnädelbach, H., & Glover, K. (2013, June). Using adaptive architecture to probe attitudes towards ubiquitous monitoring. Proceedings of the IEEE International Symposium on Technology and Society (ISTAS), Toronto, ON, Canada.
Pincus, S., & Singer, B. H. (1996). Randomness and degrees of irregularity. Proceedings of the National Academy of Sciences, 93(5), 2083–2088.
Proulx, R., Côté, P., & Parrott, L. (2009). Multivariate recurrence plots for visualizing and quantifying the dynamics of spatially extended ecosystems. Ecological Complexity, 6(1), 37–47.
Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039–H2049.
Riley, M. A., & Van Orden, G. C. (2005). Tutorials in Contemporary Nonlinear Methods. National Science Foundation. https://www.nsf.gov/pubs/2005/nsf05057/nmbs/nmbs.pdf
Shanahan, C., Best, C., Finch, M., Tracey, E., Vince, J., Hasenbosch, S., & Stott, A. (2009). Exercise Black Skies 2008: Enhancing Live Training Through Virtual Preparation. Part One: An Evaluation of Training Effectiveness. DSTO-RR-0344. Melbourne: Defence Science and Technology Organisation.
Singh, G., Mémoli, F., & Carlsson, G. (2007). Topological methods for the analysis of high dimensional data sets and 3d object recognition. Eurographics Symposium on Point-Based Graphics, 22, 91–100.
Smith-Jentsch, K. A., Zeisig, R., Acton, B., & McPherson, J. (1998). Team dimensional training: A strategy for guided team self-correction. In J. A. Cannon-Bowers & E. Salas (Eds.), Making Decisions Under Stress: Implications for Individual and Team Training (pp. 271–297). Washington, DC: APA.
Stephen, D. G., Boncoddo, R. A., Magnuson, J. S., & Dixon, J. A. (2009). The dynamics of insight: Mathematical discovery as a phase transition. Memory & Cognition, 37(8), 1132–1149.
Stephens, A., Crone, D., Temby, P., Best, C., & Simpkin, G. (2011). Using synthetic environments to enhance close-air support training. Proceedings of the Simulation Technology & Training Conference (SimTecT), Melbourne, May, 2011.
Strang, A. J., Funke, G. J., Russell, S. M., Dukes, A. W., & Middendorf, M. S. (2014). Physio-behavioral coupling in a cooperative team task: Contributors and relations. Journal of Experimental Psychology: Human Perception and Performance, 40(1), 145–158.
Strang, A. J., Horwood, S., Best, C., Funke, G. J., Knott, B. A., & Russell, S. M. (2012). Examining temporal regularity in categorical team communication using sample entropy. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 56(1), 473–477.
Tolston, M. T., Best, C. J., Funke, G. J., Menke, L., & Dukes, A. W. (2016). Monitoring Large Teams for Changes in Physiological Coupling Using Multivariate Recurrence Quantification Analysis. Poster presented at 8th International Conference on Applied Human Factors and Ergonomics, Orlando, FL.
Tolston, M. T., Best, C. J., Miller, B., Rice, B., Francis, C., & Funke, G. J. (2019). Responses of Teams of Air Battle Managers to Perturbations in High-Fidelity Training Scenarios. Talk presented at the 20th International Symposium on Aviation Psychology, Dayton, OH.
Tolston, M. T., Funke, G. J., Alarcon, G. M., Miller, B., Bowers, M. A., Gruenwald, C., & Capiola, A. (2018). Have a heart: Predictability of trust in an autonomous agent teammate through team-level measures of heart rate synchrony and arousal. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 62, 714–715.
Wallot, S., Roepstorff, A., & Mønster, D. (2016). Multidimensional Recurrence Quantification Analysis (MdRQA) for the analysis of multidimensional time-series: A software implementation in MATLAB and its application to group-level data in joint action. Frontiers in Psychology, 7, 1835.
Webber, C., & Zbilut, J. P. (1994). Dynamical assessment of physiological systems and states using recurrence plot strategies. Journal of Applied Physiology, 76(2), 965–973.
Wiltshire, T. J., Butner, J. E., & Fiore, S. M. (2018). Problem-solving phase transitions during team collaboration. Cognitive Science, 42(1), 129–167.
Wright, R. A., & Gendolla, G. H. E. (2012). How Motivation Affects Cardiovascular Response: Mechanisms and Applications. https://doi.org/10.1037/13090-000
Zeidner, M., & Matthews, M. (2005). Evaluation anxiety: Current theory and research. In A. J. Elliot & C. S. Dweck (Eds.), Handbook of Competence and Motivation (pp. 141–163). New York: The Guilford Press.
5
Extended Reality in Training Environments
A Human Factors Trend Analysis
Salim A. Mouloua, Gerald Matthews, John French, and Mustapha Mouloua
CONTENTS
Background
Methodological Approach
Results
Study Designs over Time
Simulation Domains over Time
Simulation Cluster Areas over Time
    Military Simulation
    Aerospace Simulation
    Driving Simulation
    Healthcare Simulation
    Manufacturing Simulation
    Entertainment Simulation
    Simulation Methods and Design
    General Simulation Areas
Military Cluster Areas over Time
Author Affiliations over Time
Military Author Affiliations over Time
Funding Agencies over Time
Military and Government Funding over Time
XR System Types over Time
XR Systems’ Breakdown over Time
Physiological Recording Systems over Time
Conclusions
    Major Conclusions and Recommendations
References
DOI: 10.1201/9781003401353-5
BACKGROUND
The proliferation of wearable technologies, including virtual reality (VR), augmented reality (AR), and mixed reality (MR), has gained considerable attention in recent years. VR involves total visual immersion in a representation of the real world, whereas AR involves overlaying graphics and text on visuals of the real world. MR is a combination of both, where real and virtual worlds can interact. Much of this work was started in the early 1960s and 1970s, with a main focus on gaming, followed by training in domains such as aerospace, military, manufacturing, and academia. Recent reports have estimated that the global market for AR and VR technologies will reach $209.2 billion by the end of 2022, and global VR video gaming revenues were estimated at $22.9 billion in 2022 (Petrov, 2022). These numbers are forecast to continue to grow in the next few years due to several factors such as price affordability, hardware size, internet speed, technology acceptance, and experience (Insider Intelligence, 2016). The market for AR devices continues to grow about four times faster than the VR market (Market Analysis Report, 2021). MR technologies, such as Microsoft’s HoloLens, are just beginning to emerge but are expected to grow similarly in the coming years. The MR experience, for example, makes it possible to visit a virtual museum many miles distant that actually exists – but is also on virtual display.

Interest in and use of these technologies reached record peaks in 2020–2022, likely due, in part, to the recent Covid-19 pandemic. The pandemic has provided both opportunities and challenges for VR designers to think about extending the reality of our daily work tasks and activities to remote and virtual platforms. These included Zoom and other internet meeting platforms, social media, education, healthcare, real estate, manufacturing, online shopping, entertainment, and training. The adoption of virtual reality, in particular, was greater during the lockdown period (Shirer & Soohoo, 2020). The Covid-19 pandemic has significantly accelerated the adoption of virtual reality and augmented reality technologies as businesses have turned to remote work (Vardomatski, 2021). Similarly, it is estimated that 25% of internet users (70.2 million people) will be VR users by 2023 (Petrov, 2022). As a result, these users will be afforded various opportunities in their professional, social, work, and entertainment environments, as well as in education and training and online shopping, to name a few. However, VR and AR technology adoption may also lead to some human factors and engineering issues (e.g., motion sickness, lack of specific content, hardware and software compatibility, user-interface design, technology acceptance and user experience, price affordability, and safety) (Petrov, 2022). One particular technology that has recently gained momentum among these virtual reality technologies is extended reality, which is also the main focus of our current research. AR is generally believed to evoke lower levels of bodily and perceptual discomfort than VR, but direct evidence is lacking (Descheneaux et al., 2020). Extended reality (XR) is an umbrella term that combines immersive technologies like virtual reality with technologies that expand our physical world. This is enabled by adding virtual elements to it
such as augmented reality or mixed reality information (Kiger, 2020). All of these technologies can be discussed as a single entity using the term XR. The XR market is expected to reach $209 billion by 2022 (Marr, 2019). This expansion makes XR development attractive to a large number of technological applications and means that the work, training, shopping, and entertainment (gaming, movies, etc.) industries could be vastly different from today within a brief period of time. Industries utilizing XR should closely ally themselves with the Human Factors sciences in order to streamline their development and to build greater usability and effectiveness into their target applications. Interestingly, it is rare that engineering technologies fully mature without such close proximity to ergonomic technologies.

The role of XR devices in simulation and training is becoming critical to a host of technical areas because they are far less expensive and less dangerous than real-world exposure and training (Allen et al., 1998a, 1998b; Howard & Gutworth, 2020). For example, telesurgery, remote learning, military training, driving simulation, aerospace, manufacturing, and other research-related and even clinical treatment disciplines benefit from XR. Persons who have experienced or suffered from posttraumatic stress disorder (PTSD) due to their exposure on the battlefield can now be treated using VR technologies (Bedwell et al., 2018; Wong & Beidel, 2013; Bohil et al., 2011). Such technologies can also be used for training in search and rescue missions, emergency responses to catastrophic disasters, and reducing the symptoms of simulation sickness in motion-inducing environments (Allen et al., 1998a, 1998b; Mouloua et al., 2004, 2005a, 2005b, 2009; Smither et al., 2008). Similarly, physicians and surgeons can now utilize these devices in order to teach medical students skills including various types of surgery (Vincenzi et al., 2009; Scerbo et al., 2012, 2013; Atallah, 2021; Velazquez-Pimentel et al., 2021).

Previous research has shown that XR systems can be used to deliver effective training for developing emotional, cognitive, and physical skills (Howard, 2018; Irish, 2013; Jensen & Koradsen, 2018). For example, some studies have reported that VR training reduced social anxiety (Anderson et al., 2013; Parsons & Rizzo, 2008). However, the findings from a recent meta-analysis on virtual training programs for social skill development also highlight the challenges facing these programs (Howard & Gutworth, 2020). The results of the meta-analysis confirmed that VR training programs are, on average, more successful than alternative methods for enhancing social skills. However, factors believed to support the effectiveness of training, such as immersion and gamification, did not, in fact, appear to be beneficial. These findings point to the need for further theory development to guide training system development and evaluation (Howard & Gutworth, 2020). Simulating human interaction with virtual agents also introduces unique challenges that are not present in human–machine interaction (Matthews et al., 2021a) or in real human–human interaction.

Healthcare facilities, school systems, and various industries have in large part transitioned to virtual work through distributed online meetings in order to continue their operations and enhance team engagement. This is certainly a new “extension of reality,” but it is outside the scope of the present chapter. We mention these emerging technologies briefly, as they have provided several opportunities and benefits during the confinement period and up to the writing of this chapter.
Specifically, if virtual meetings constitute low-fidelity virtual environments, we expect that their explosive proliferation will eclipse that of XR technologies going forward. In commercial settings, the growth of XR has taken exactly this direction – games and applications to meet with others in a “social” setting, such as VR Chat. Furthermore, with increasing connectedness between various types of virtual “environments” (whether planes or webpages, virtual meetings, or virtual environments), we expect a gravitational pull toward “virtuality” in various industries that might form a network reminiscent of, or beyond the scope of, the present-day internet. For example, Meta (Facebook) sees extensive commercial and leisure opportunities for its VR Metaverse. This frontier may very well be an amalgamation of 2- and 3-dimensional cyberworlds, and it remains to be seen how few and far between the transition points among those likely proprietary virtual realities will be.

The goal of the present research was to present the historical trend of the development of such technologies, as they apply to a wide variety of human-machine systems. In addition, this chapter examines some of the driving forces behind the surge of these technologies and their expansion to other domains of applications.
METHODOLOGICAL APPROACH
In order to elucidate the temporal trends in XR research, we conducted a trend analysis of simulation-based articles by evaluating the frequency of their appearance in the Proceedings of the Human Factors and Ergonomics Society (HFES) Annual Meeting over the last 16 years (2005–2020). The goal of this study was to identify the research gaps and human factors issues associated with the use of XR methods reported in the HFES Proceedings, as well as differences between the methods in their quantities and trajectories. The reason for choosing the HFES Proceedings is twofold. The first reason is to observe trends in Human Factors subfields (e.g., medical human factors or aviation human factors) over time, such as the distribution of funding sources or research cluster areas in articles. This allows for charting increases and decreases in military funding (as opposed to industry funding), and in specific research cluster areas (such as simulator sickness, or surgeon training) in XR research as a whole. Furthermore, we can determine “gaps” in research focus at a macro-scale, as well as who is funding the broader literature at any point in time. The second reason is to contribute to a broader systematic examination of the research vectors in human factors simulation studies. Toward that goal, this same approach has been used in the assessment of the subfields of unmanned aerial vehicles (Mouloua et al., 2018), cybersecurity (Mouloua et al., 2019), healthcare (Descheneaux et al., 2011; Stowers & Mouloua, 2018; Mouloua et al., 2021), and aviation systems (Ludvigsen et al., 2015). The present study focuses on trends in XR methods across broader human factors subfields in order to address which methods in simulation are becoming popular and which methods are becoming outdated. We define a factor here as any qualitative nominal variable with varying characteristics that could differentiate one article from another. For example, an article on pilot performance could be defined in terms of its type of design (experimental versus theoretical), domain (aerospace versus surface transportation), cluster area
(pilot training, unmanned aerial vehicles, and situation awareness), funding source (military versus industry funding), and system type (head-mounted versus desktop simulator, VR versus AR versus MR). Furthermore, we can qualitatively compare levels of a factor (e.g., AR vs MR) at the population scale (tallies across all groups of articles). One limitation of this approach is our focus on the “big picture,” as we were unable to separate out certain differences in factors such as system type at the sub-population level (e.g., between driving and aerospace). As our focus was XR in general and within the military setting, we sought to elaborate more on military research than on the other sub-populations. Table 5.1 defines the acronyms used for some of the XR technologies.

Our mission was to classify articles using a standardized and transparent approach. To this end, we used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) method developed by Moher and colleagues (2009) to detail our inspection and inclusion of articles in this trend analysis. An outline of PRISMA is shown in Figure 5.1 below. In the PRISMA method, records are (1) initially identified, (2) screened, (3) assessed for eligibility, and (4) included in the final analysis. In this process, an article could be excluded at the second or third step, with the former due to face value (topical irrelevance) and the latter due to more nuanced criteria (e.g., elongated abstract instead of article, too general in scope to encompass XR). Thus, this process funnels a large number of articles into a smaller and highly relevant subset of articles. Articles were obtained from Sage Journals’ website, the publisher of the HFES Proceedings. Search boundaries were implemented in order to determine what articles would be included in the results.
Our search within the HFES Proceedings was restricted to three keywords, with a mandate that these keywords be present somewhere in the article for initial identification. The three specific keywords for our investigation included “virtual reality,” “augmented reality,” and “mixed reality.” While queries can be limited to article title or abstract, we focused on conducting a comprehensive search of all extended reality articles within the HFES Proceedings from 2005 to 2020. Importantly, the keyword “extended reality” was not included because a priori searches determined it is very new compared to
TABLE 5.1 Definitions of Acronyms
Acronym    Definition
XR         Extended reality
VR         Virtual reality
AR         Augmented reality
VE         Virtual environment
MR         Mixed reality
HUD        Head-up display
HMD        Head-mounted display
FIGURE 5.1 PRISMA methodology for present trend analysis (initial n = 1012, final n = 530).
the others (unfitting for multi-year trend analysis) and XR articles delineate the specific technology being employed. Importantly, augmented reality and mixed reality research sprouted from efforts in virtual reality across decades. Therefore, there is little orthogonality between the keywords in practice (i.e., it is arguably both rare and theoretically alarming when commentary on virtual reality is not present in an augmented reality article). We visually inspected all articles and manually rejected irrelevant articles from our sample, as opposed to relying on automated search criteria alone. The keyword method provided a high probability of identifying studies for inclusion (“hits”), but also carried the risk of “false positives”; e.g., admitting studies that mentioned virtual reality but did not actually investigate it. As such, we used manual inspection in order to filter out those false positives. We chose this
procedure in order to analyze the entire population of XR articles in the Proceedings across these years and avoid sampling error to the best of our ability. We used a year-by-year search of the three aforementioned keywords, which initially yielded 1012 articles (see Figure 5.1). Of these articles, 881 were identified with the keyword “virtual reality,” while 131 were identified through the keywords “augmented reality” and “mixed reality.” We manually removed 125 duplicate articles across the keyword searches, leaving 887 articles in total. We then screened those articles for topical relevance and excluded 342 articles that were not relevant to the scope of XR. This left 545 articles, which we assessed for eligibility in the final trend analysis. Upon manual inspection, 15 were excluded for the following reasons: too general (3), panel discussion (5), or poorly defined methods (7). The final number of articles included in the trend analysis was 530, the findings of which we outline and discuss in the following sections.
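The arithmetic of this funnel can be checked with a short script. The following is a minimal sketch: the counts are those reported in the text, while the variable and function names are illustrative assumptions rather than part of the original analysis.

```python
# Counts come from the text above; all names here are illustrative.
identified_by_keyword = {
    "virtual reality": 881,
    "augmented reality / mixed reality": 131,
}

# Each stage removes records from the running total.
exclusions = [
    ("after duplicate removal", 125),
    ("after relevance screening", 342),
    # Eligibility exclusions: too general (3), panel discussion (5),
    # poorly defined methods (7).
    ("after eligibility assessment", 3 + 5 + 7),
]


def prisma_funnel(identified, removed_per_stage):
    """Return (stage label, records remaining) for each PRISMA stage."""
    remaining = sum(identified.values())
    stages = [("identified", remaining)]
    for label, n_removed in removed_per_stage:
        remaining -= n_removed
        stages.append((label, remaining))
    return stages


stages = prisma_funnel(identified_by_keyword, exclusions)
for label, remaining in stages:
    print(f"{label}: {remaining}")
```

Run as written, the stage totals reproduce the figures in the text: 1012 identified, 887 after duplicate removal, 545 after screening, and 530 in the final analysis.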
RESULTS
We tabulated our 530 eligible items in a keyword-by-year table and organized them in two different combinations (see Figures 5.2–5.3). We did this to delineate differences in trends based on XR as a whole, VR versus AR versus MR, as well as general differences in XR technologies based on display type (HMDs versus desktop/handheld simulators). In Figure 5.2, VR was collapsed to emphasize different types of XR. Figure 5.2 shows that virtual reality receives the most research attention compared to mixed and augmented reality. In Figure 5.3, VR was split into head-mounted displays (HMDs) and desktop and handheld simulators to account for different types of systems. Figure 5.3 shows that the VR research in Figure 5.2 mostly involves simulators and other types of displays rather than HMDs. However, in the last four years, the amount of research reported on
FIGURE 5.2 HFES articles by keyword over time.
FIGURE 5.3 HFES articles by keyword over time. N = 530.
HMDs has grown dramatically, even exceeding the level of interest in simulators in the past two years. Furthermore, we see that the popularity of augmented reality actually eclipses that of traditional VR simulators in the past year.
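The two views of the keyword-by-year table (VR collapsed versus VR split by display type) can be illustrated with a small tally. This is only a sketch under stated assumptions: the records below are hypothetical placeholders, not data from the trend analysis, and the labels and function name are invented for illustration.

```python
from collections import Counter

# Hypothetical (year, system label) records; NOT data from the study.
articles = [
    (2018, "VR HMD"),
    (2018, "VR simulator"),
    (2019, "VR HMD"),
    (2019, "AR"),
    (2019, "VR HMD"),
]


def tabulate(records, collapse_vr=False):
    """Count articles per (label, year) cell.

    With collapse_vr=True, every label starting with "VR" is merged into
    a single "VR" category; otherwise HMDs and simulators stay separate.
    """
    counts = Counter()
    for year, label in records:
        if collapse_vr and label.startswith("VR"):
            label = "VR"
        counts[(label, year)] += 1
    return counts


collapsed = tabulate(articles, collapse_vr=True)  # VR as one category
split = tabulate(articles)                        # VR split by display type
```

Both tables count the same articles; only the grouping of the VR rows differs, which is why the collapsed view can mask a shift from simulators to HMDs.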
STUDY DESIGNS OVER TIME
Articles were categorized as one of five types of study designs: experimental, meta-/trend analysis, literature review, translational, and theoretical studies. Articles were deemed experimental if an experiment took place using actual manipulations of controlled conditions and were assessed based on the reported criteria in the methods section. If a methods section did not exist, the article was excluded due to poor standards. Meta-analyses and trend analyses were articles directly comparing results of multiple studies, tabulating results or criteria from different studies, or qualitatively indicating directional trends in the literature. Literature reviews comprised articles commenting on the state of the literature with minimal to no theoretical contributions. Translational articles were those focusing on the development of methods, examining usability with participants but not possessing a true experimental paradigm (e.g., pilot and product testing), and articles with both theoretical and practical contributions that did not squarely fit into either theoretical or experimental articles. Articles were deemed theoretical if they proposed new theories, suggested methods not currently used or modified those methods, and were not principally experimental or translational in nature. It is important to note that applied articles were mostly experimental, and so were tabulated in that category. However, as we did not uniquely assess the “applied” component of articles alone, it is difficult to draw conclusions on whether earlier lab-based HF research truly supported the development of reality-based applications. For example, experimental articles may not necessarily be directed toward practical applications or operational
FIGURE 5.4 HFES articles by study design over time. N = 530.
samples. As the number of “applied” articles dwarfs that of theoretical articles, we are unsure whether the technology is truly mature enough or supported by initial research efforts. The breakdown of articles over time is shown in Figure 5.4. Based on the figure above, we can see that experimental research in XR peaked in 2006 and again in 2019, largely fluctuating between those years. This study design type comprised the vast majority of XR articles and many of the dips in research are matched by increases in translational research. Translational research in XR was most popular between 2010 and 2014 and has remained fairly low, although it is making a return in the past few years. Together, experimental and translational research largely define the state of simulation in human factors – practical and empirically driven. There are very few meta-analyses and trend analyses on XR, and theoretical studies and literature reviews have neither increased nor decreased.
SIMULATION DOMAINS OVER TIME
While XR comprises traditional simulation methods such as virtual environments, it also incorporates methods such as AR that can involve little to no simulation. To this end, AR might not necessarily equate to “simulation” as we perceive it, but rather a simple enhancement of one’s environment. However, we noticed a “slippery slope” in delineating AR and even some MR articles as non-simulations. For comparison’s sake, we categorized all these articles under the umbrella of “simulation,” ranging from full simulations of the environment (VR), to real and virtual environments combined (MR), to “simulated” additions to the real environment (AR). Therefore, some of the AR research is naturally not simulation per se – but removing those articles would ablate the “real” portion of the real-virtual continuum entirely, and we wanted to preserve this for a richer population of what constitutes XR. Ultimately, reality itself is the control to XR, no matter the method employed – and gradual virtualizations of that reality are ultimately simulations of some kind.
FIGURE 5.5 HFES articles by simulation research domains over time. N = 530.
We organized articles into seven broad domains of simulation research (see Figure 5.5), indicating the subfield within human factors of which the article was part. To this end, articles were categorized into either “military,” “aerospace,” “driving,” “healthcare,” “manufacturing,” “entertainment,” or “methods & design.” These domains were chosen in order to encompass a more general scope of simulation in human factors – that is, what simulation can be applied to. Military simulation articles were those relating to military endeavors in army, naval, and air force operations, as well as command and control, and studies related to combat or soldier, officer, or team training. Aerospace simulation articles included simulations in aviation related to unmanned aerial vehicles (UAVs), pilot training, commercial flight, spaceflight, and any articles specifically outside of the scope of military aviation. Driving simulation articles were those related to either driving simulators or unmanned ground vehicles (UGVs) specifically outside of the scope of military driving. Healthcare simulation articles involved those pertaining to the training of physicians, nurses, and surgeons using manikin and simulation technologies. Manufacturing simulation articles involved industrial inspection, process control, and microworld simulation as well as training in industrial settings. Entertainment simulation articles were related to gaming (whether recreational or serious), media, and consumer-oriented prospects. Simulation methods and design articles were articles specifically focused on simulators’ development, as well as the design and methods behind simulation at a broader scale. Surprisingly, a great many articles fell into this group (i.e., were not principally focused on healthcare, aerospace, etc.).
However, articles that focused on any of the prior content areas and were also methods-oriented were placed in their specific content domain (e.g., aerospace domain with methods & design clusters). We further broke down simulation domains into specific cluster areas within those domains in the following section.
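The assignment rules described above imply a precedence among domains. The sketch below is one possible reading of those rules, not the authors' procedure (articles were classified manually): military content takes priority, any other content domain takes priority over methods and design, and methods and design is the fallback. The function name, the topic labels, and the tie-breaking order among non-military content domains are assumptions made for illustration.

```python
# Non-military content domains; the order used to break ties between them
# is an assumption, since the text does not specify one.
CONTENT_DOMAINS = [
    "aerospace",
    "driving",
    "healthcare",
    "manufacturing",
    "entertainment",
]


def assign_domain(topics):
    """Assign a single domain to an article from its set of topic labels."""
    topics = set(topics)
    # Aerospace and driving are defined as outside the military scope,
    # so military content is checked first.
    if "military" in topics:
        return "military"
    # A methods-oriented article with a content focus goes to that domain.
    for domain in CONTENT_DOMAINS:
        if domain in topics:
            return domain
    # No specific content area: basic simulation methods and design.
    return "methods & design"


print(assign_domain({"aerospace", "methods & design"}))
```

Under this reading, an aviation article with military content is counted as military rather than aerospace, and an article lands in methods and design only when no content domain applies.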
Examination of Figure 5.5 indicates that simulation methods and design research was the most populous XR domain within HFES, with articles peaking in 2006 and 2019. It is important to elaborate again that methods and design articles did not focus on a specific content area (such as driving or healthcare) and so were specifically related to basic simulation development and methods alone. Interestingly, we can see that these articles appear to match the trends in experimental research previously discussed (see Figure 5.4). This shows that the main driving force behind the trends in experimental XR studies is actually research focusing on the development of simulation techniques and design criteria. Furthermore, military simulation is the second largest domain in XR and peaked between the years 2008 and 2013. Since then, military XR research has dropped steadily and is at a lower point than in 2005 – perhaps indicating decreased interest in HFES by authors in this domain. The third largest simulation domain was driving research, which peaked in 2006 and has since remained mostly consistent (neither decreasing nor increasing very much). The domain of healthcare simulation has remained stable over this time course and has slightly grown. Interestingly, aerospace simulation research has been steadily declining since its peak in 2005. Articles in entertainment simulation have been steadily increasing and peaked in 2019. As for manufacturing simulation, articles peaked in 2014 and have remained about the same.
SIMULATION CLUSTER AREAS OVER TIME
In order to specify areas of interest within our simulation domains, we generated 39 cluster areas comprising topics within and between those domains. This was done in order to capture the full scope of research in XR, and these cluster areas revealed trends in more specific research concepts within our subfields. All articles were categorized into 3–5 cluster areas that defined their approach used (e.g., soldier training, situation awareness, simulation design and development), much like specific keywords are used to highlight the areas of interest within a given article. To this end, we manually placed all articles within such areas in order to understand the trends in more specific topics in HFES, as well as demonstrate their relative presence compared to one another. This is shown in Figure 5.6. As the data are highly dimensional and non-unique across articles, we collapsed them across time for ease of interpretation. We also recommend the reader not consider this as an absolute indicator of XR topic areas, but as a general relative level of authorial interest in the literature for these research topics.
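Because each article carries several cluster areas, the tallies are counts of cluster assignments rather than counts of articles. A minimal sketch of such a multi-label tally, collapsed across time, follows; the article identifiers and tag assignments are hypothetical, and only the cluster names are taken from the text.

```python
from collections import Counter

# Hypothetical articles and cluster tags; NOT data from the study.
article_clusters = {
    "article_a": ["soldier training", "situation awareness",
                  "simulation design and development"],
    "article_b": ["pilot training", "workload assessment",
                  "simulation fidelity and presence"],
    "article_c": ["soldier training", "team training",
                  "simulation design and development"],
}


def tally_clusters(tagged_articles):
    """Count, for each cluster area, how many articles carry that tag."""
    counts = Counter()
    for tags in tagged_articles.values():
        counts.update(set(tags))  # each cluster counted once per article
    return counts


cluster_counts = tally_clusters(article_clusters)
total_assignments = sum(cluster_counts.values())
print(cluster_counts.most_common(2))
print(total_assignments, "assignments across", len(article_clusters), "articles")
```

The total number of assignments exceeds the number of articles, which is presumably why the N reported for the cluster figure is larger than the 530 articles in the sample.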
Military Simulation
Within military simulation, Figure 5.6 shows that the most popular research areas were workload assessment, team training, and soldier training. The trends in workload assessment show a peak in 2010 alongside a slow decline in the number of articles published. Team training and soldier training also peaked in 2010 and have since declined.
FIGURE 5.6 Number of simulation cluster areas collapsed across time. N = 1434
Aerospace Simulation
The most popular cluster areas in aerospace simulation research were pilot training and UAVs (see Figure 5.6), while research on air traffic control and spaceflight and habitats was scarce. Research in pilot training has been most stable, while research on UAVs peaked between 2009 and 2011 and has fallen since.
Driving Simulation
Within driving simulations, the most researched area was driver perception and behavior, while driver distractions and unmanned ground vehicles have trailed behind. To date, there are few articles on the vigilance of drivers within the context of XR, compared to other research areas. Thus, research has not realized the potential of XR to investigate pressing contemporary safety issues including distraction, impacts of automation and assistive devices, and operation of connected vehicles.
Healthcare Simulation
The most researched areas in healthcare simulation were surgeon training and assistive technologies, with surgeon training peaking in 2007 and decreasing to almost no articles in recent years. However, assistive technologies have been slowly growing over the years, and the past three years have seen an all-time high – with a similar trend for physician training. Research on telesurgery over the same time, in comparison, was very limited. The most appreciably growing area out of these four clusters is assistive technologies, ranging from assistive tools used by doctors (such as prosthetics or image enhancements) to those used by special populations such as the elderly and disabled groups (such as hearing aids or closed captioning).
Manufacturing Simulation
The most researched cluster area for this topic was manufacturing simulation (as it relates to assembly and product development specifically), followed by industrial inspection and search and network simulation. Manufacturing simulation studies peaked in 2014 and again in the last year. Both network simulation and industrial inspection and search do not appear to have appreciable trends, and the number of articles is quite low.
Entertainment Simulation
The most researched area in entertainment simulation was recreational gaming and media, closely trailed by serious gaming. Interest in recreational gaming and media has oscillated somewhat over the years but reached a high point over the past two years. Serious gaming involves those articles specifically pushing the utility of gaming environments for purposes such as training and has remained relatively consistent compared to recreational gaming and media.
Simulation Methods and Design
By far, the most researched area in simulation methods and design has been the actual design and development of simulations, followed by research in simulation fidelity and presence. Simulation design and development peaked in the last year and has grown robustly in popularity. Additionally, it consistently dwarfs the research in all other cluster areas on an annual basis as well as in the larger picture. This is the most popular area of XR-related research identified through our trend analysis. The second most researched cluster area is simulation fidelity and presence, and these articles also peaked in the last year. Device usability and testing peaked in 2010 and, even more so, in the last two years. Research in simulator sickness peaked in 2005 and has been dropping off since. The education and learning cluster area peaked in 2010 and 2020, showing a rhythmic decrease and increase in popularity over the years. Research on virtual agents peaked in 2012 and has mostly stayed consistent since. Macroworld simulations were researched up until 2014 and have not seen any further research since.
General Simulation Areas
Perception and error judgment was the most researched cluster area within this domain and peaked in the last two years. Research on navigation and wayfinding has remained consistent. The cluster area of individual differences peaked in 2015 and has also remained stable. Likewise, simulation research on stress, resources, and fatigue has remained consistent. Simulation research on telerobotics has remained low, and adaptive automation in a simulation context has remained a scarce research topic over the years.
MILITARY CLUSTER AREAS OVER TIME
Cluster areas related to military simulation showed a lot of research activity. The cluster areas from the military simulation domain were “threat detection,” “soldier training,” “team training,” “search and rescue,” “situation awareness,” “workload assessment,” “vigilance,” and “unmanned underwater vehicles.” We also incorporated cluster areas from other domains, counted solely within those military simulation articles (but outside of our initial cluster classification of being military-specific): “pilot training,” “navigation and wayfinding,” “stress, resources and fatigue,” “individual differences,” “unmanned ground vehicles,” “unmanned aerial vehicles,” “simulation design and development,” and “simulation fidelity and presence.” This was done because we wanted to capture broader trends within military simulation research itself, and because those articles in fact frequently focused on cluster areas that were non-specific to the military simulation domain. Within our broader military cluster areas, the most researched area was simulation design and development, as seen in Figure 5.7, with research peaking in 2019. The trends show that simulation design and development within military research has been growing steadily over the years. The second most researched area was soldier training, and articles peaked in 2010 and 2013. Since then, simulation articles on soldier training have dropped drastically. Team training research has been rising and falling over the years, with no clear trends. However, threat detection research rose sharply in 2019. Situation awareness in military simulation research has remained consistent over the years. With respect to military research on simulation fidelity and presence, articles peaked in 2010 and have remained very low over the past few years.
Furthermore, military simulation research on unmanned aerial vehicles peaked in 2009 and has dropped in a similar fashion. Articles focusing on pilot training peaked from 2011 to 2012 and have also declined since. Navigation and wayfinding, search and rescue, individual differences, vigilance, unmanned ground vehicles, and unmanned underwater vehicles were the least researched cluster areas and do not demonstrate noticeable trends over the years. This is intriguing as these areas are some of the most active in general human factors research – yet they do not frequently appear in military simulation articles in the HFES Proceedings. We also observed that the largest peak for military simulation research was in 2013 – and since then, the diversity of articles as well as overall interest in research has dropped.
FIGURE 5.7 Number of simulation cluster areas collapsed across time. N = 370.
AUTHOR AFFILIATIONS OVER TIME
We identified authors’ affiliations within the articles in terms of their sector of employment. In order to do this, we did not count each author’s affiliation individually, but rather the dichotomized presence of given employment sectors within each article. Thus, if an article had three academics and two industrial employees, we counted academia once and industry once as the article’s author affiliations. We chose this approach in order to focus less on the individual authors and more on their employment sector’s contribution to the articles. To this end, these affiliations are not demographic indicators of the authors themselves, but rather indicators of which sectors contributed to each article. The affiliations we categorized articles by were academia, military, government, and industry. As shown in Figure 5.8, academics have been the largest contributors and driving force behind XR research, and their contribution grew rapidly but seems to have peaked in 2019. Industry was the second largest contributor to XR research, peaking in 2011. Interestingly, military researchers were third in contribution to XR research, with research peaking in 2010 and arguably remaining stable or decreasing in the past ten years. Lastly, contributions from government employees show no clear-cut trends but clearly lagged behind the other sectors.
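The dichotomized counting rule can be made concrete with the example from the text (three academics and two industry employees on one article). The sketch below is illustrative only; the function names are assumptions, and the raw per-author count is shown purely for contrast.

```python
from collections import Counter


def sectors_present(author_sectors):
    """Reduce a per-author sector list to the set of sectors present."""
    return set(author_sectors)


def tally_by_presence(articles):
    """Count each sector at most once per article (dichotomized presence)."""
    counts = Counter()
    for author_sectors in articles:
        counts.update(sectors_present(author_sectors))
    return counts


# The example from the text: three academics and two industry employees.
example_article = ["academia"] * 3 + ["industry"] * 2

presence = tally_by_presence([example_article])
raw = Counter(example_article)  # per-author count, for contrast only

print(dict(presence))
print(dict(raw))
```

Counting presence rather than authors keeps a single article with a large author list from dominating a sector's total, at the cost of discarding information about team composition.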
FIGURE 5.8 HFES Author Sector Affiliations in XR research over Time. Evaluated as sectors’ presence in article (see text).
The trends in Figure 5.8 indicate an increasing academic interest in XR while the military, government, and industry sectors have begun lagging behind. This is surprising given the military’s long-known focus on simulation, but below we have broken down trends in military author affiliations by branch.
MILITARY AUTHOR AFFILIATIONS OVER TIME
While we know about military contributions to XR research generally, we wanted to elucidate how each branch contributes and their trends over time. The largest contributor to XR research in the military is the Army, followed by the Air Force, the Navy, and international and other contributors such as foreign defense agencies and domestic military groups outside of the main branches. While we did assess trends in military author affiliations, the sample sizes were deemed too small to reveal meaningful variations over time. From our analyses, Army research in XR peaked in 2010 but has steadily dropped since. Air Force research also peaked in 2010 and has slowly declined as well. These observations line up with the increases in attention to soldier training, simulation fidelity and presence, and military aviation research in 2010 reported above. The Navy’s research efforts in XR peaked between 2010 and 2011 and have fallen since then. The same trend is present with international and other military organization authors, with research peaking in 2010 and dropping in the following years.
FUNDING AGENCIES OVER TIME
We categorized articles that displayed funding sources according to the funding agencies’ sectors. To this end, we tabulated articles into the same sectors present in our author affiliation process – academia, military, government, and industry funding
FIGURE 5.9 HFES funding sources for XR research over time. N = 272.
sources. For this analysis, we counted the raw distribution of funding sources present in articles. This was highly revealing of economic contributions and agencies’ interests in conducting research. Funding for XR articles, for example, is most driven by contributions from government sources, closely followed by military agencies, as shown in Figure 5.9. This is interesting given the breakdown of author affiliations above – largely indicating most articles are actually published by academics but funded by government and military agencies. The trends in government-funded research shown in Figure 5.9 reveal a peak in 2010 followed by a steady decline until 2020, when government funding of XR research skyrocketed. Furthermore, military-funded research in XR showed a large peak in 2011, followed by declines until another smaller peak in 2019. We can see that military funding synchronizes quite well with our previously mentioned peaks in military author affiliations across all agencies from 2010 to 2011 – which marks the peak years of military investment in XR in terms of publication and economic interest. Industry funding in XR research has mostly remained low and without clear trends, with interest slightly increasing in 2020. With regard to funding in academia, research peaked in 2008 and has since slightly declined. These trends demonstrate that the money in XR research is coming from military and government agencies, as opposed to internal academic or corporate-funded sources. It would be intriguing to see exactly what kinds of efforts government and military agencies are investing in within XR research, but this is outside the scope of the present trend analysis. However, given the surge of interest in augmented reality and virtual reality HMDs in 2020, we suspect that government and military sources are prioritizing these specific avenues of XR going forward (see Descheneaux et al., 2020). 
Given that these articles are mostly driven by peaks in simulation methods and design in the past two years, it is likely that these agencies are precisely funding XR HMDs’ development and methodological enhancements. In addition, those specific research efforts are being conducted by academics for the
most part. However, it remains to be seen whether this developmental boom surrounding second generation HMDs will persist, or lead the way to the emergence of third-wave XR headsets we are not yet privy to, as in the case of the approximate five-year life cycle of first generation VR/AR/MR HMDs in the literature. If the latter ends up being the case, we speculate those third generation HMDs might emerge in the HFES literature by 2025. This would be contingent on the continual engineering of these XR systems in tandem with feedback from experts in the Human Factors sciences that will guide more usable, efficient, and effective systems for the academic and the consumer going forward.
MILITARY AND GOVERNMENT FUNDING OVER TIME
We further broke down funding agencies’ contributions into military and government sources, as these were our two largest sectors of XR research funding. Elucidating these trends shows which military branches and government agencies contribute the most to funding – as well as where the contributions are lacking. However, as with military author affiliations, we elected not to include graphical trends here due to the small sample sizes. In accordance with our expectations from author affiliations noted above, military funding in XR was most driven by contributions from the US Army, closely followed by the US Navy. Examining the distributions over time shows Army funding peaked in 2009 and 2019 and has remained mostly stable over time. Intriguingly, both Navy and Air Force funding peaked massively in 2011 and have both since dropped to virtually no funding in the past five years. It is odd that contributions from these branches to XR research leveled off so much, given the wealth of diversity in relevant military cluster areas being studied by (largely academic) researchers at this point.
XR SYSTEM TYPES OVER TIME
Our search on extended reality systems involved virtual reality, augmented reality, and mixed reality. We wanted to profile the differences in the trends of each system type over time, as we could then elucidate which systems are becoming more popular by their relative growth compared to each other (as seen in Figure 5.10). Given how closely tied these terms are to each other, we expected to see either (I) similar trends in their growth or (II) compensatory trends, such that declines in one term would be offset by growth in another term. The data appear to show both patterns, as we can see (I) concurrent peaks in 2009 and 2019 for both VR head-mounted displays and VR simulators (non-head-mounted displays). Furthermore, we can also see that (II) the largest peaks in VR simulator research (2010–2012) are juxtaposed with relatively low amounts of VR HMD research. Overall, VR simulators were the largest contributing XR systems researched in the literature, peaking in 2008, from 2010 to 2012, and again in 2017 and 2019. Figure 5.10 shows that research on XR simulators steadily rose until the mid-2010s, when it dropped sharply to less than half of its peak.
FIGURE 5.10 HFES XR system types in research over time. N = 689.
However, after a couple of years, XR simulators rebounded and have consistently remained the main system used for XR research, and more specifically VR research as well. This is in contrast to the VR HMD “boom” that much of the literature postulated would completely overtake these VR simulator systems. In essence, VR simulators still remain the conventional method of examining virtual environments, even with the rise in wearable VR technology. However, research on VR HMDs has indeed risen immensely over the years – more so than any other system type. We can also project that this might rise even higher in the future. The trends show that VR HMD research rose to the same level as VR simulator research in 2018, and grew to a peak in 2019 that nearly matches that of VR simulators. Interestingly, we see that the first peaks in VR HMD research were from 2005 to 2006, as well as 2008. Following these years, research plummeted to a lull until it began to rise in 2015, reaching its maximum in 2019. These trends indicate that research on VR HMDs is growing rapidly and is far more predictable than the other types of systems. Additionally, we see that research on AR and MR simulators peaked in 2006 (at 3 and 5 articles, respectively) and has since hovered around 0–2 articles per year. However, for AR/MR HMDs specifically, the past two years have shown growth past that of simulators. Via inspection of the articles, there seems to be a bottleneck in the transition of theory to application in XR. Broadly, the theory oversteps what can actually be done, and this is reflected in the keyword search (what is spoken of) and the system types (what is actually examined). We can see that AR/MR research has indeed risen in recent years (system types), but how much so? Even though its popularity in the conversation (keyword search) has risen noticeably, it does not appear yet to be operationally meaningful. 
Based on these findings, it seems that the “XR revolution” is at present still primarily concerned with VR and is more of a buzzword in practice.
XR SYSTEMS’ BREAKDOWN OVER TIME
We also broke down these XR system types into every generic model used across all articles in this trend analysis. Observing the breakdown in XR systems over time reveals some very intriguing trends, though their depiction is convoluted. As pictured below (see Figure 5.11), we can see the types of VR HMD systems being used begin to differ starting in 2015 (from different HMD systems being used up until 2013). Re-referencing this data to Figure 5.10 paints a clearer picture – the general shifting of emphasis from VR HMDs to VR simulators and back to VR HMDs. Broadly, this suggests the transition of first generation VR HMDs (such as Rockwell Collins, Kaiser ProView, Virtual Research Systems, and custom/ambiguous HMDs) into second generation VR HMDs (such as the Oculus Rift and HTC Vive). Since XR simulator research begins to decline in 2013 and second generation VR HMD research begins to ramp up, we suggest that the increase in VR simulator system
FIGURE 5.11 Evolution of first and second generation XR systems over time in research use. N = 689. Red consists of AR/MR systems, while blue consists solely of VR systems. Gray consists of generation-invariant systems that could not be distinguished. Critically, darker colors are first generation systems, and lighter colors are second generation systems.
research was driven by the limitations of first generation HMDs. Perhaps those systems did not offer promising fidelity (whether physical or psychological), flexibility of analytical techniques, or usability, but confirming this is beyond the technical scope of this trend analysis. If this is indeed the case, it would explain the diminishing use of first generation systems beginning in 2009, as well as the uptick in articles on second generation VR HMDs in the late 2010s (beyond the popularity first generation systems achieved). Furthermore, the latter came only after the steep decline in the use of VR simulator systems in the early to mid-2010s. However, it is important to note that in this case we might expect to see more lab-based research, such as transitional and developmental work, increasing, and those trends are not present here. It is likely that we were unable to discriminate this using the methodology we employed, which could be of notable importance. Upon inspection of Figure 5.11, we can see that within HFES the usage of second generation XR systems began in 2015 and has risen noticeably since then. The decline of first generation systems is also very evident, and from 2012 to 2014 we see early AR/MR systems fade out entirely. Likewise, early VR systems reached their minima between 2013 and 2015, when the rise of second generation VR systems became evident. Essentially, these trends appear to indicate the decline of first generation VR HMD systems alongside relative improvement in VR simulator systems, followed by the decline of those VR simulator systems alongside relative improvements in second generation VR HMD systems. One point is clear, however: research output on VR HMD and simulator systems is currently at a high, barring the last year of limited research in the Covid-19 era. We close by asking the reader two questions:
• What will third generation VR HMDs look like?
• How soon will they be here?
PHYSIOLOGICAL RECORDING SYSTEMS OVER TIME
In our trend analysis, we also highlighted the use of various systems for physiological data recording in XR research (see Figure 5.12), in order to capture which psychobiological methods are being used in the Human Factors literature. We excluded systems that were noted as physically incompatible with VR headsets, as well as desk-mounted systems that captured open-face eye gaze activity. Importantly, some newer VR HMD systems have integrated eye-tracking capabilities, whereas standard eye-trackers have traditionally been incompatible with closed-face VR HMDs. Head-tracking systems appear to be the most widely used tools in XR research, peaking in 2006, 2008, and 2019. Their use also dips in the years in between, which matches the trends in VR HMDs discussed previously and indicates that the use of these systems depends on having HMDs in the first place. Indeed,
FIGURE 5.12 Physiological measures used in HFES XR research. N = 225.
head-tracking systems are often based on head-mounted gyroscopes integrated into these HMD systems, and we observed exactly that in our analysis of these articles. Body-tracking follows behind, peaking in 2013 and, more drastically, in 2019. Unlike the head-tracking systems normally present in VR HMDs, these body-tracking systems can involve video, patch-light, or motion-tracking from cameras or sensors completely detached from the head. Interestingly, usage of both systems spikes in 2019, and the literature shows that many researchers were using the two methods in tandem. Eye-tracking systems follow closely behind, peaking in 2014 with low but stable trends otherwise. The use of neuroimaging systems with XR techniques has remained fairly low, with electroencephalography (EEG), functional near-infrared spectroscopy (fNIRS), and transcranial Doppler sonography (TCD) together comprising only 18 articles out of the entire XR sample investigated. Likewise, neural stimulation methods such as transcranial direct/alternating current stimulation (tDCS/tACS) represent a virtually unresearched topic in XR. On the more peripheral physiological side, electromyography (EMG), galvanic skin response (GSR), and electrocardiography (ECG) account for 33 articles in total, nearly double the interest shown in central neurophysiological measures. One point is clear here: there appears to be minimal growth in the use of these neurophysiological techniques in combination with XR research. Given the numerous studies on the effects of XR on the body and on perceptual differences from reality, the scarcity of neurophysiological research in general should prompt researchers to investigate this nexus further. A practical deterrent may be the challenge of psychophysiological recording when the person is physically moving; researchers have yet to capitalize on specialized systems for ambulatory measurement.
CONCLUSIONS
The present study aimed to systematically examine trends in the population of extended reality research published in the Proceedings of the Human Factors and Ergonomics Society’s Annual Meeting from 2005 to 2020. We collected a total of 530 unique articles, identified using keywords relevant to XR. Results were tabulated, analyzed, and graphed by study design, simulation domain, simulation cluster area, military cluster area, authorial affiliation, funding agency, country of research, XR system type, XR system breakdown, and physiological systems. Our findings indicated that XR research has been fluctuating but slowly growing, and that system types display different trends. Diagnosing the current state of VR and extended reality research in Human Factors reveals an intriguing future for researchers to tackle. From the relative presence of research topics, funding agencies, authorial affiliations, system types, and other qualitative characteristics, we have illustrated the current gaps in the XR literature as it pertains to the field of Human Factors. We have presented here the competitors in XR research, as well as the lack of competition in the authorship and funding arenas by certain sectors. We hope that the current trend analysis and findings are useful to XR system researchers and developers across application areas, as well as to the Human Factors research community. The goal of this research was to highlight the benefits of these systems for human performance and system assessment, simulation and training, and safety.
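The tabulation step described above (counting articles by year and by category before graphing) can be illustrated with a minimal sketch. It is not the authors' actual procedure: the record layout, the function names `tabulate_by_year` and `yearly_series`, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical coded records: (publication year, XR system type).
# These are made-up placeholders, NOT data from the chapter.
records = [
    (2018, "VR HMD"),
    (2018, "VR simulator"),
    (2019, "VR HMD"),
    (2019, "VR HMD"),
    (2019, "AR/MR HMD"),
]


def tabulate_by_year(coded_records):
    """Count articles for each (year, system type) pair."""
    return Counter(coded_records)


def yearly_series(counts, system_type, years):
    """Return one system type's counts across years, using 0 for empty years."""
    return [counts.get((year, system_type), 0) for year in years]


counts = tabulate_by_year(records)
years = range(2018, 2021)

# One series per system type is what a trend plot would draw as one line.
vr_hmd = yearly_series(counts, "VR HMD", years)
ar_mr_hmd = yearly_series(counts, "AR/MR HMD", years)
```

Filling empty years with zero keeps every series the same length, so years with no articles appear as dips in a plot rather than as missing points.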
Major Conclusions and Recommendations
The following recommendations for developing and promoting the use of VR/XR technologies are a result and logical outgrowth of the data collected, analyzed, and presented in this chapter. They may serve to educate non-HFE professionals from other fields, who may find them useful for adopting emerging VR devices and technologies and for using them efficiently and safely.
1. While XR research represents a global effort, the US provides 82% of the publications in HFES. Canada supplies 5% of those articles, and Sweden is third with 2% of the articles. By this token, researchers in the US should look to collaborate more with international researchers, as a large XR hub seems to be present within the country that can be generalized further.
2. HMDs comprise only 25% of the publications on XR systems. The spikes in research seem to align with the popularity of first and second generation systems, with a noticeable limitation in the years between. Have we truly addressed the problems brought by first generation HMDs, or have we glossed over them with rosy expectations of XR’s future?
3. Because there is a limited amount of physiological research relating to XR, researchers should consider employing physiological methods to validate within-system and between-system comparisons.
4. Of the three military branches, the Army contributes the most to funding XR research; for the Navy and Air Force, funding has decreased in recent years. This finding is surprising given the increasing sophistication of the control interfaces for both unmanned systems and manned aircraft such as the F-35, and the consequent scope for VR/AR training (e.g., Pawlyk, 2021).
5. Academic studies make up the vast majority of XR research. Military and industrial research may be less readily available to the public (e.g., classified and proprietary research), but as it stands, authors in this category could contribute more to the broader scientific literature. This review also points to the need for more theory-based research. XR provides a novel arena for human interactions with technology and people, and human factors theory development is needed to guide further research and application.
6. Simulator research is on the decline, and HMD research is on the rise. However, those numbers overwhelmingly reflect VR studies.
7. Individual difference factors in XR have been neglected. Novel personality and ability factors are emerging as predictors of behavioral and cognitive-affective responses to new technology, including XR (Matthews et al., 2021b). However, systematic research is lacking, and the small Ns of many studies militate against identifying reliable individual differences. Research is needed to predict who will engage with and utilize XR systems most effectively, and to develop methods for personalizing training and operational environments.
8. When we manually inspected articles for system types, we lost much of the AR and MR research that was supposedly being conducted recently (according to the keyword search). Thus, much of the research mentioning AR and MR actually consists of VR studies. The concepts of AR and MR are being overstated relative to the actual literature on them, and this is a serious problem for the future of XR as a whole.
9. Based on this trend analysis, research on augmented and mixed reality is not increasing as rapidly as the conversation surrounding it. Perhaps it is time to re-evaluate whether sufficient effort is being put into funding and conducting these types of studies, as opposed to traditional VR studies that reference AR and MR in a low-effort fashion.
10. There might be an impending replication crisis surrounding simulator sickness. With the rise of HMDs come problems with fully immersive and mobile systems. However, simulator sickness research has been scarce in recent years. Critically, what differences do second generation systems confer as compared to first generation systems? To understand the XR technologies of today, we must empirically address this specific issue.
11. In order for XR to grow in the near future, HF/E practitioners must investigate AR and MR in their own context, with a greater priority than VR or separately from it. Currently, AR and MR cannot stand on their own two legs, and we must prop them up if we wish to understand the potential benefits they may confer in the field.
In summary, this chapter presented a preliminary snapshot of the various HF/E aspects of VR/XR/MR technologies using a time series design, focusing on content relevance and domains of application, to name a few. Although these technological innovations have revolutionized progress in learning, military and medical applications, patient healthcare and clinical intervention, entertainment, process control, and manufacturing, they are still not readily and widely used in other domains. This stems from human factors issues related to user experience, training requirements, system design, performance and workload, safety, and cost-effectiveness. Furthermore, it is noteworthy that these technologies will continue to grow at a faster speed in the educational, learning, and entertainment domains. This forecast growth parallels the increase in inflation rates and the socioeconomic disturbances created by increased gas prices, geopolitical conflicts, wars, and shrinking economies. These pressures may lead students, commuters, and teachers to stay at home and select remote options for learning and work activities, cultural and entertainment environments, sporting events, real estate viewing, shopping, and healthcare. Similarly, as VR systems continue to be used for training and rehabilitation, the scope for remote clinical applications will increase. Examples include therapy for gait and balance for patients who suffer from Parkinson’s disease (Canning et al., 2020), clinical intervention and training for individuals who may suffer from various social phobias or clinical disorders, such as fear of flying (Maltby et al., 2002), trauma management with VR therapy (Owens & Beidel, 2015; Beidel et al., 2019), and neuroendocrine or vascular procedures for trainees, among others.
Finally, we hope that this chapter will serve as a research guide and resource for students, researchers, and practitioners who have an interest in the wide application of these technologies. We also hope that our current database of articles will continue to grow, so that we can examine other emerging trends in the near future. In addition, we strive to make this database widely available to students, researchers, practitioners, and developers for their access and use. Similarly, we hope that other researchers may wish to extend this research by including other publication outlets where VR and XR research is also published. This research was limited to the HFES Proceedings, as a flagship publication of the Human Factors and Ergonomics Society. Our choice of this outlet was motivated by the cross-disciplinary nature of HF/E topics, as well as the sheer number of technical groups and diverse domains of application that exist within the Human Factors and Ergonomics Society.
REFERENCES
Allen, R., Hitt, J., Zavod, M., Bowen, S., Guest, M., & Mouloua, M. (1998a). Observations and recommendations of a simulated airport emergency. Proceedings of the 42nd Annual Meeting of the Human Factors and Ergonomics Society. Santa Monica, CA: Human Factors and Ergonomics Society.
Allen, R., Guest, M., Bowen, S., Hitt, J., Zavod, M., & Mouloua, M. (1998b). Airport emergency response crew training: A virtual solution. Proceedings of the 42nd Annual Meeting of the Human Factors and Ergonomics Society. Santa Monica, CA: Human Factors and Ergonomics Society.
Atallah, S. (Ed.). (2021). Digital Surgery. Cham: Springer. https://doi.org/10.1007/978-3-030-49100-0_14
Anderson, P. L., Price, M., Edwards, S. M., Obasaju, M. A., Schmertz, S. K., Zimand, E., & Calamaras, M. R. (2013). Virtual reality exposure therapy for social anxiety disorder: A randomized controlled trial. Journal of Consulting and Clinical Psychology, 81(5), 751. https://doi.org/10.1037/a0033559
Bedwell, J. S., Bohil, C. J., Neider, M. B., Gramlich, M. A., Neer, S. M., O’Donnell, J. P., & Beidel, D. C. (2018). Neurophysiological response to olfactory stimuli in Combat Veterans with posttraumatic stress disorder. The Journal of Nervous and Mental Disease, 206(6), 423–428. https://doi.org/10.1097/NMD.0000000000000818
Beidel, D. C., Frueh, B. C., Neer, S. M., Bowers, C. A., Trachik, B., Uhde, T. W., & Grubaugh, A. (2019). Trauma management therapy with virtual-reality augmented exposure therapy for combat-related PTSD: A randomized controlled trial. Journal of Anxiety Disorders, 61, 64–74.
Bohil, C. J., Alicea, B., & Biocca, F. A. (2011). Virtual reality in neuroscience research and therapy. Nature Reviews Neuroscience, 12(12), 752.
Canning, C. G., Allen, N. E., Nackaerts, E., et al. (2020). Virtual reality in research and rehabilitation of gait and balance in Parkinson disease. Nature Reviews Neurology, 16, 409–425. https://doi.org/10.1038/s41582-020-0370-2
Descheneaux, C., Mike McNeil, M., Mouloua, M., & Alicia, T. (2011). Diagnosing HCI trends of the last decade in the medical community. Proceedings of the 55th Annual Meeting of the Human Factors and Ergonomics Society, 55, 1985–1989. https://doi.org/10.1177/1071181311551414
Descheneaux, C., Wohleber, R., Harris, S., Matthews, G., Boland, W., Maraj, C., Moss, J., & Krum, D. (2020). Implementation and Assessment Challenges for Virtual and Augmented Reality Displays within the Army Synthetic Training Environment. Report for United States Army Futures Command (AFC), Combat Capabilities Development Center (CCDC), Simulation and Training Technology Center (STTC), Orlando, FL.
Howard, M. C. (2018). Virtual reality interventions for personal development: A meta-analysis of hardware and software. Human-Computer Interaction, 34(3), 1–35.
Howard, M. C., & Gutworth, M. D. (2020, January). A meta-analysis of virtual reality training programs for social skill development. Computers & Education, 144, 103707.
Insider Intelligence. (2016, August 22). The Virtual and Augmented Reality Market Will Reach $162 Billion By 2020. Business Insider. Retrieved from https://www.businessinsider.com/virtual-and-augmented-reality-markets-will-reach-162-billion-by-2020-2016-8?utm_source=reddit.%20com
Irish, J. E. (2013). Can I sit here? A review of the literature supporting the use of single-user virtual environments to help adolescents with autism learn appropriate social communication skills. Computers in Human Behavior, 29(5), A17–A24.
Jensen, L., & Konradsen, F. (2018). A review of the use of virtual reality head-mounted displays in education and training. Education and Information Technologies, 23(4), 1515–1529.
Kiger, P. J. (2020, January 6). What is extended reality? The Franklin Institute. https://www.fi.edu/tech/what-is-extended-reality
Market Analysis Report. (2021). Virtual reality market size, share & trends analysis (2021–2028). GVR-1–68038-831-2. https://www.grandviewresearch.com/industry-analysis/virtual-reality-vr-market
Ludvigsen, J., Mouloua, M., & Hancock, P. (2015). Human factors/ergonomics contributions to aerospace systems, 1980–2012. Ergonomics in Design: The Quarterly of Human Factors Applications, 23(4), 20–22.
Maltby, N., Kirsch, I., Mayers, M., & Allen, G. J. (2002). Virtual reality exposure therapy for the treatment of fear of flying: A controlled investigation. Journal of Consulting and Clinical Psychology, 70(5), 1112–1118. https://doi.org/10.1037/0022-006X.70.5.1112
Marr, B. (2019, August 12). What is extended reality technology? A simple explanation for anyone. Forbes, Enterprise Tech. https://www.forbes.com/sites/bernardmarr/2019/08/12/what-is-extended-reality-technology-a-simple-explanation-for-anyone/?sh=22441de47249
Matthews, G., Panganiban, A. R., Lin, J., Long, M., & Schwing, M. (2021a). Super-machines or sub-humans: Mental models and trust in intelligent autonomous systems. In C. S. Nam & J. B. Lyons (Eds.), Trust in Human-Robot Interaction (pp. 59–82). Cambridge, MA: Academic Press.
Matthews, G., Hancock, P. A., Lin, J., Panganiban, A. R., Reinerman-Jones, L. E., Szalma, J. L., & Wohleber, R. W. (2021b). Evolution and revolution: Personality research for the coming world of robots, artificial intelligence, and autonomous systems. Personality and Individual Differences, 169, 109969.
Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Annals of Internal Medicine, 151(4), 264–269.
Mouloua, M., Smither, J., Kennedy, R. C., Kennedy, R. S., Compton, D., & Drexler, J. (2004). Visually-induced motion sickness: An experimental investigation. Proceedings of the Human Factors and Ergonomics Society 48th Annual Meeting, 48, 2623–2626. https://doi.org/10.1177/154193120404802304
Mouloua, M., Smither, J., Kennedy, R. C., Drexler, J., Compton, D., & Kennedy, R. S. (2005a). Visually-induced motion sickness: Effects of adaptation training. Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting, 49, 2263–2267. https://doi.org/10.1177/154193120504902610
Mouloua, M., Smither, J., Kennedy, R. C., Kennedy, R. S., Compton, D., & Drexler, J. (2005b). Training effects in a sickness inducing environment. Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting, 49, 2206–2210. https://doi.org/10.1177/154193120504902519
Mouloua, M., Smither, J., & Kennedy, R. S. (2009). Space adaptation syndrome and perceptual training. In D. Vincenzi, J. Wise, M. Mouloua, & P. A. Hancock (Eds.), Human Factors in Simulation and Training (pp. 239–255). Boca Raton, FL: CRC Press (Taylor & Francis Group).
Mouloua, S. A., Ball, R. V., Ferraro, J. C., & Mouloua, M. (2021). The history of human factors in healthcare: From its emergence 50 years ago to COVID-19. Proceedings of the 10th International Symposium on Human Factors and Ergonomics in Health Care, 10(1), 165–169.
Mouloua, S. A., Ferraro, J., Mouloua, M., Matthews, G., & Copeland, R. (2019). Trend analysis of cybersecurity research published in HFES proceedings from 1980 to 2018. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 63(1), 1600–1604.
Mouloua, S. A., Ferraro, J., Mouloua, M., & Hancock, P. A. (2018). Trend analysis of Unmanned Aerial Vehicles (UAVs) research published in the HFES proceedings. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 62(1), 1067–1071.
Owens, M. E., & Beidel, D. C. (2015). Can virtual reality effectively elicit distress associated with social anxiety disorder? Journal of Psychopathology and Behavioral Assessment, 37, 296–305.
Parsons, T. D., & Rizzo, A. A. (2008). Affective outcomes of virtual reality exposure therapy for anxiety and specific phobias: A meta-analysis. Journal of Behavior Therapy and Experimental Psychiatry, 39(3), 250–261.
Pawlyk, O. (2021). The air force’s virtual reality fighter training is working best for 5th-gen pilots. Retrieved from https://www.military.com/daily-news/2021/03/26/air-forces-virtual-reality-fighter-training-working-best-5th-gen-pilots.html
Petrov, C. (2022, June 3). 45 Virtual Reality Statistics That Will Rock the Market in 2022. TechJury. Retrieved from https://techjury.net/blog/virtual-reality-statistics/#gref
Scerbo, M. W., Stefanidis, D., Britt, R. C., Davis, S. S., & Stefanidis, D. (2013). A spatial secondary task for measuring laparoscopic mental workload. Simulation in Health Care, 7 558.
Scerbo, M. W., Kennedy, R. A., Montano, M., Britt, R. C., & Davis, S. S. (2012). A spatial secondary task for measuring laparoscopic mental workload: Differences in surgical experience. Proceedings of the Human Factors and Ergonomics Society, 57(1), 728–732.
Shirer, M., & Soohoo, S. (2020). Worldwide spending on augmented and virtual reality forecast to deliver strong growth through 2024, according to a new IDC spending guide. IDC. (Accessed 12 January 2021).
Smither, J. A., Mouloua, M., & Kennedy, R. S. (2008). Reducing symptoms of visually induced motion sickness through perceptual training. The International Journal of Aviation Psychology, 18, 326–339.
Stowers, K., & Mouloua, M. (2018). Human computer interaction trends in healthcare: An update. https://doi.org/10.1177/2327857918071019
Vardomatski, S. (2021, September 14). Augmented and Virtual Reality After Covid-19. Forbes Technology Council. Retrieved from https://www.forbes.com/sites/forbestechcouncil/2021/09/14/augmented-and-virtual-reality-after-covid-19/?sh=6b755aa82d97
Velazquez-Pimentel, D., Hurkxkens, T., & Nehme, J. (2021). A virtual reality for the digital surgeon. In S. Atallah (Ed.), Digital Surgery. Cham: Springer. https://doi.org/10.1007/978-3-030-49100-0_14
Vincenzi, D., Wise, J., Mouloua, M., & Hancock, P. A. (Eds.). (2009). Human Factors in Simulation and Training. Boca Raton, FL: CRC Press (Taylor & Francis Group).
Wong, N., & Beidel, D. C. (2013). Virtual environments in clinical psychology research. In J. S. Comer & P. C. Kendall (Eds.), The Oxford Handbook of Research Strategies for Clinical Psychology (pp. 87–100). Oxford: Oxford University Press.
6
Mitigation of Motion Sickness Symptoms by Adaptive Perceptual Learning
Implications for Space and Cyber Environments
Mustapha Mouloua, John French, Janan A. Smither, and Robert S. Kennedy
CONTENTS
Motion Symptoms
Mitigation
Individual Differences in Adaptation
The Long-Term Retention, Conditioning, and Transfer of Adaptation
Long-Lasting Adaptation
Generalizability
Adaptive Perceptual Learning (APL) Training
Pre-Adaptation on VR
Pre-Adaptation on Vection Drum
Pre-Adaptation on VIMS
Conclusion
Acknowledgments
In Honorem
References
DOI: 10.1201/9781003401353-6
MOTION SYMPTOMS
Motion sickness is an odd term for a constellation of symptoms that result from a reflexive autonomic phenomenon (Muth, 2006). For example, it is not a true sickness but a normal response to unusual motion (Money, 1970). Nor does motion sickness require physical motion to produce symptoms, as in visually induced motion
sickness or VIMS (Kennedy et al., 2010). It is not a prolonged illness but ends shortly after the unusual motion has ended, and it seems to serve no homeostatic, biological purpose (Reason, 1978). The principal and most easily recognized symptoms are nausea and emesis, but a set of 16 common symptoms has been identified (Kennedy et al., 1993) in a standard test of motion sickness symptoms called the simulator sickness questionnaire (SSQ). Included in the SSQ, and often overlooked, is a prolonged fatigue so common that it has been named the sopite syndrome (Graybiel & Knepton, 1976); it is sometimes the only symptom associated with unusual motion. Symptoms of motion sickness are elicited by the unique visual-vestibular perspectives that always accompany new forms of locomotion, from sea travel to camels, automobiles to airplanes, and now spacecraft and virtual motion through cyber environments (Money, 1972; Benson, 1978). One of the earliest references to motion sickness is evidenced by the name of its primary symptom, nausea, a word derived from a combination of Greek words, “nau” for navigation and sea, hence a symptom arising from traveling at sea, often attributed to Hippocrates (Golding, 2016). Many military campaigns throughout history have been won or lost because motion symptoms reduced the effectiveness of the sailors or troops. Napoleon, for example, was thought to have given up his plans for the conquest of North Africa when his troops became “camel sick” from loping along on these ships of the desert (Huppert et al., 2017). Similarly, the defeat of the Spanish Armada in 1588 is often attributed to the seasickness of the Spanish commander and, doubtless, many of his troops. There has been renewed interest in space research in recent years with the advent of commercial space operations, numerous countries sending probes to Mars, and the likelihood of permanent colonies on the moon within a decade.
A challenge to the efficiency and safety of space operations is space motion sickness reported to afflict about one-half of all astronauts during the initial 24–72 hours of orbital flight (Homick et al., 1984; Ishii, 1993; Nguyen, 1996; Reschke et al., 1998; Thornton et al., 1987). On returning to earth, the astronauts must readapt to terrestrial conditions. Parker et al. (1985) were able to document this readaptation and to link it with altered eye movements. These results, both in space and after return to earth, involve learned perceptual eye movements to compensate for altered inertial environments, providing relief through adaptation. Space motion sickness (SMS) symptoms resemble those from other forms of motion sickness (Money et al., 1984), particularly those which are reported in visual rearrangement studies (Kottenhoff, 1957; Welch, 1978, 2000a), and in ground-based flight simulators (Kennedy et al., 1984b). Cue conflict or neural mismatch theory (Reason, 1970) suggests that this constellation of symptoms is triggered by an uncoupling of expected sensory stimuli (Kennedy et al., 1984a; Oman, 1991; Parker et al., 1985). In other words, the disparity between and within vision, vestibular, and somatic messages can lead to conflict and conflict leads to symptoms (Guedry, 1965; Benson, 1978). Thus, as one initially moves about in the weightless environment, the sensory channels provide atypical information about spatial orientation and bodily movement, and this sensory conflict leads to nausea and motion sickness
(Ishii, 1993). For example, under normal terrestrial conditions, tilting the head to one side causes the otolith to roll (shear) sideways, a stimulus which is interpreted as “head tilt.” Rolling of the otolith to the front or back, in this case, occurs only under linear acceleration in normogravity and is so interpreted. Under orbital conditions, in microgravity, however, head tilt does not produce a sideways rolling of the otolith and can lead to sensory conflict and subsequent nausea. This otolith tilt-translation hypothesis describes a likely source of SMS. Another new form of motion sickness results from head-mounted displays, particularly those used in virtual reality environments (VE), and is termed cybersickness, a form of VIMS. Cybersickness has become much more prevalent as the use of VE has increased in military training, where it can reduce the costs and dangers of real-world training in military preparedness. VR environments also open new ways to examine medical or biological phenomena, for example by traversing from the macroscopic to the microscopic or by following cells as they course throughout the organism. The risk of cybersickness seems to be a use-limiting factor in the rapid acceptance of this new type of training (Stanney, 2002). Cybersickness is also thought to result from sensory conflict, a mismatch between the visual perception of virtual motion and the absence of motion cues from the other senses. This perceived self-motion in VE is a neurovestibular illusion called vection (Fischer & Kornmüller, 1930; Nooij et al., 2017). Many studies, over many decades, have explored vection and its relationship with VIMS in experiments using a slowly rotating drum that contains a pattern of stripes. The subject sits stationary at the center of the drum and observes the moving pattern.
The viewing of this moving pattern elicits eye movement consisting of a motion to track the stimulus and a return saccade, which (together) is referred to as optokinetic nystagmus (OKN) (Bender & Shanzer, 1983; Kovalev et al., 2020; see also Hu et al., 1997; Hu et al., 1989). Generating motion sickness without physical motion, only illusory motion, in the OKN drum, was first demonstrated by Bárány in 1907 (as cited in Bender & Shanzer, 1983). It is still a research tool much in use today to study VIMS and vection and the relationship between the two. The OKN drum vection effect has recently been demonstrated in a similar VR cyber vection environment (French et al., 2023).
MITIGATION
The deeper one looks at the biological explanations of motion sickness, the more confusing it seems to get. There is no firm theoretical foundation for research, as there are at least three to five competing theories about the causes of motion sickness, with fervent proponents on all sides (Bertolini & Straumann, 2016; Bos et al., 2008; Warwick-Evans et al., 1998). Another oddity about motion sickness is that there is no mitigation strategy that is effective against all symptoms for all individuals (Golding & Gresty, 2015), although this has been the goal of centuries of motion countermeasures. Mitigation strategies have proliferated throughout the centuries. Remedies like ginger root and acupressure are ancient (White, 2007; Bertolucci & DiDario, 1995) and are of dubious effectiveness beyond a placebo effect. More modern but
also purported treatments involve vibrotactile stimulation of the vestibular “area” or transcranial electrical stimulation, both of which are intended to “rearrange” or “overwhelm” the symptoms producing nausea (Heaney et al., 2018; Weech et al., 2020). Pharmaceuticals such as the anticholinergics are effective but often produce debilitating side effects including amnesia, drowsiness, and blurred vision (Dahl et al., 1984; Leung & Hon, 2019). Clearly, a more effective countermeasure, devoid of deleterious side effects, is needed. While motion sickness and anxiety are two different phenomena, they share some symptoms. For example, sickening situations can elicit anxiety. This may explain why some useful anti-nausea medications are also anxiolytics, such as the phenothiazines (Thorazine, Compazine, Phenergan), or sedating, such as the antihistamines. Some more complex behavioral mitigation strategies for motion sickness also seek to decrease stress or anxiety as the basis of the treatment. Individuals might be aware of their sensitivity to motion and become anxious or embarrassed about suffering the symptoms. A classically conditioned anticipatory response to the sickening situation may develop that includes anxiety, and anticipatory symptoms could be produced at the mere sight or smell of the anticipated event. Counterconditioning methods have been used successfully as complex behavioral treatments for anxieties and phobias for over 40 years (Avila et al., 1999; Lane, 2009) and also for airsickness (Cheung & Hofer, 2005). These techniques first train the individual in muscle relaxation and regular, deep breathing techniques with pleasant visual imagery (Sang et al., 2005). Then they increasingly approximate the anxiety-producing stimulus. 
Over several days of substituting less stressful behaviors for the usual responses to the stressful stimuli, the symptoms can be counter-conditioned or replaced with relaxation (Yen et al., 2005; Dobie, 2019; Koch et al., 2018). In another study, desensitization, biofeedback training, and confidence-building exercises over 10 sessions in a VIMS environment were shown to produce significant resistance to a subsequent VIMS exposure (Dobie et al., 1987). The US military has a successful desensitization program involving a variety of motion stimuli that employs this kind of counterconditioning. The US Navy treatment, which includes biofeedback, relaxation, exposure to incremental Coriolis forces, and flying, has reduced air sickness by 85% (Rogers & Van Syoc, 2011). Military services worldwide have developed extensive programs that attempt to desensitize motion sickness responses in susceptible personnel, with success rates reported to be greater than 85% (Benson, 1999; Rogers & Van Syoc, 2011; Lucertini et al., 2013). However, these can take many weeks to complete and involve complicated combinations of deep breathing, muscle relaxation training, and biofeedback (Banks et al., 1992; Jones et al., 1985; Giles & Lochridge, 1985; Mert & Bles, 2007). A newer form of counterconditioning is Cognitive Behavioral Therapy (CBT). It is distinguished from the traditional forms in that, rather than pleasant visual imagery, other thoughts, such as numbers or poetry, are used to distract the individual from the stressful stimuli. The success of these counterconditioning treatments in stress reduction implies that they could be successful in mitigating the stress or anticipation of sickness which then may reduce
these secondary symptoms. We were not able to find evidence that CBT has been tried with motion sickness, but at least one study discussed the means to do such studies (Dobie & May, 1994). One of the most effective mitigation strategies for motion sickness is the same as it was in ancient times: adaptation through repeated exposure to the sickness-producing stimulus (Golding, 2016). The simplest and most time-tested countermeasure for most of those afflicted by motion sickness symptoms is a gradual adaptation to the new demands on perception, as one gets one's “sea legs” for the new motion. Reason (1978) called this the “vis medicatrix naturae” or the healing power of nature. On ships, in space, and in other constant exposure situations, adaptation and relief of symptoms can occur within a few days in most individuals (Golding, 2006; Heer & Paloski, 2006). However, the “treatment” requires much discomfort for many hours while adaptation occurs. It implies that something is being learned, consciously or unconsciously, to reduce the discomfort over time with no escape from the exposure. Habituation and adaptation are similar terms that describe voluntary or involuntary mechanisms that allow us to pay less attention, and to respond less neurophysiologically, to distracting stimuli in our environment (Baker et al., 2010). There are subtle differences among the theories, but habituation and adaptation refer to central and peripheral mechanisms, respectively, by which we physically respond less and less to distracting stimuli. Adaptation to motion environments may parallel some early experiments in optically adapting to distortions in the visual field (Mack, 1967; Redding, 1973a; Welch, 1978). 
It has been well established since the late 1960s that wearing prismatic or mirrored goggles (for example, continuously for days) elicits a remarkable adaptation in which visually-based motion errors decline continuously and behavior returns to normal (Redding, 1973a, 1973b; Welch et al, 1974). People have ridden bicycles or even flown airplanes successfully after a few days of wearing prismatic goggles. One conclusion from these and other such experiments is that the visual-vestibular and kinesthetic senses are able to quickly recalibrate when actively interacting with a visually disturbed environment so long as the environment remains constant over time (Welch, 2000b). This conclusion has been made for virtual environments as well (Stanney et al., 1998; Welch, 2000a). Likewise, virtually all space travelers adapt to the unusual motion conditions that produce motion sickness during the flight, although this adaptation may not be complete for several days (Ishii, 1993; Reschke et al., 1998; Welch, 2000a). Reports of astronauts and mission specialists with more than one flight suggest that adaptation also occurs across flights, whereby symptoms experienced in subsequent missions appear to be less severe than on the first flight (Parker et al., 1985). Procedures for speeding up the rate of adaptation would be useful. Pre-adapting space travelers to sensory conflicts before embarkation to immunize them against space adaptation syndrome might be even better. Vanderploeg, Stewart, and Davis (1985) have shown that of 22 space travelers who had an opportunity for more than one flight, 11 were sick in various degrees on their first flight, and 11 were not. Of the 11 who were not sick, all were symptom-free on their second exposure. Of the 11 who were sick on their first flight, 9 experienced symptoms on their second flight (although to a lesser
extent) and 2 did not. Even though the time between space flights was protracted, the adaptation obtained on the first flight appeared to carry over to the second. The data from these 22 repeated-measures subjects imply high reliability for adaptation to space sickness. The calculated reliability for this outcome is greater than r = 0.82. Considering this level of criterion reliability, the inability to predict who will become sick and who will not (Money et al., 1984) suggests that not all of the relevant factors are included in the prediction. The mitigation strategy discussed in this chapter, Adaptive Perceptual Learning (APL) training, assumes that adapting, even incompletely, to one form of motion event produces a substantial transfer of training that will aid in adapting to another; the more similar the environments are, the more successful this strategy should be. This approach assumes that all forms of motion sickness represent a disorientation of the visual-vestibular perception pathways induced by unusual perceived motions, in real or virtual environments, with or without physical movement. Therefore, the countermeasures for one should apply to the other. This could mean that pre-adapting to a less stressful stimulus might make it easier and faster to adapt to a more stressful situation in the near future. Pre-adapting astronauts to the visual/vestibular conflicts before embarkation might immunize them against space sickness. In cyberworlds, proceeding slowly through nauseogenic virtual landscapes, or skipping over them at first, may make it easier to face those landscapes the next time. Examining the evidence for this proposal is the subject of this chapter.
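The reliability figure quoted for the 22 repeat flyers can be checked against the counts given in the text. The following is a minimal sketch, assuming the value was obtained as a phi coefficient (the Pearson correlation between two binary variables) on the 2 x 2 table of first-flight by second-flight sickness; the chapter does not state which statistic was used, and the function name is illustrative only.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2x2 table of binary outcomes.

    a: sick on both flights        b: sick on first flight only
    c: sick on second flight only  d: sick on neither flight
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Counts reported for the 22 space travelers with more than one flight:
# 11 sick on flight 1 (9 sick again, 2 not); 11 not sick (all 11 symptom-free again).
phi = phi_coefficient(a=9, b=2, c=0, d=11)
print(f"phi = {phi:.3f}")
```

Under this assumption the result is consistent with the "greater than r = 0.82" reported above.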
INDIVIDUAL DIFFERENCES IN ADAPTATION
It is probable that different individuals adapt at different rates (Welch, 2000a). Evidence for this view is clear-cut in the motor learning literature (Jones, 1970; Kennedy et al., 1980), but it is less obvious in the perceptual learning literature. What we do not know are the rules of how adaptation transfers over environments and whether some individuals possess more of this ability than others do. The work of Graybiel and Lackner (1983) comes closest to what we mean by transfer of adaptation. They showed that (a) individuals adapt similarly to three different provocative measures of motion sickness (Lackner & Graybiel, 1984) and (b) individuals’ rates of acquiring and losing adaptation are consistent across different situations (Graybiel & Lackner, 1983). Human beings vary in both the speed and magnitude of their adaptation to a given perceptually distorted environment (Welch, 1978, 2000a, 2000b). Furthermore, although there are very few studies relevant to this issue, it appears that these individual differences are reliable over time. Test-retest correlations average about .75 for adaptation to prismatic displacement (Redding, 1973a; Welch et al., 1974) and optical tilt (Redding, 1973b; Mack, 1967), whereas a split-half reliability coefficient of .83 has been obtained for adaptation to head movement-induced illusory motion of the visual field while wearing right–left reversing goggles (Kottenhoff, 1957). Crawshaw and Craske (1976) correlated prism adaptation in two different experimental situations. Their study may be flawed in that they used the terminal level of performance as an index of adaptation. Their correlation values between the two
conditions were .17 and .19. One might argue that a better index would be the rate of acquisition (the slope of the acquisition curve) in the two conditions. Conceivably, rapid adapters would adapt more quickly in both, regardless of where their terminal level performances were. Terminal level is only, to some extent, an index of adaptation. A second issue is that the reliability of the measure of adaptation is itself likely to be imperfect, which would also reduce the overall correlations. It is recognized that several other perceptual events and situations result in adaptation (for example, delayed auditory feedback; Katz & Lackner, 1977) and that similar principles would apply from a perceptual integration standpoint. Thus, there appear to be “quick adapters” and “slow adapters” (and those in between), at least in Welch’s (1978) data. A question of interest is what, if any, personal characteristics correlate with (and therefore are predictive of) these different “adaptive styles.” In brief, the answer is that very few of the more obvious characteristics have been found to predict adaptability. Gender and age do not seem to be related to adaptation rate (Welch, 1978). Neither do many of the well-known “paper-and-pencil test” measures of personality. For example, Welch (1978) reported a study in which level of adaptation (as measured in several different ways) to prism-displaced vision failed to correlate with scores on the California Psychological Inventory, the Trait Anxiety Scale, the Achievement Anxiety Test, the Tennessee Self-Concept Scale, the Internal–External Locus of Reinforcement Test, and the Extroversion Scale. On the other hand, Kottenhoff (1957) did obtain a correlation of .72 between degree of introversion/extroversion and level of adaptation to loss of visual position constancy while wearing right–left reversing goggles. 
Introverted subjects experienced an increase in illusory motion, whereas the more extroverted subjects showed either no change or a decrease (i.e., adaptation). It is of more than passing interest that extroversion as measured by forms of the Maudsley Personality Inventory has a test-retest reliability of only r = .60 (Kennedy, 1972), so that if one were to correct Kottenhoff’s correlation for attenuation (Guilford, 1954), more than 95% of the variance would be accounted for by his predictor. With the exception of Kottenhoff’s (1957) experiment, adaptability to distorted environments has not been predictable based on general personality characteristics. A more fruitful strategy for detecting such correlates may be to perform a microanalysis of the specific perceptual and perceptual-motor behaviors required by a particular distorted environment. Thus, for example, the common laboratory situation of reaching for targets while wearing light-displacing prism goggles involves, among other things, the ability to accurately fixate the visual target (i.e., where it appears to be through the goggles), to accurately guide the hand to the target, to accurately gauge the initial prism-induced reaching error, and to correct the error. It has been shown (Warren & Platt, 1974; Welch, unpublished data) that people differ reliably in their ability to do each of these subtasks and, more importantly, that these differences are correlated with subsequent adaptation. For example, Warren and Platt (1974) found that people who have good control of their eyes (i.e., are able to very accurately fixate the visual target) but relatively poor control over their reaching responses, reveal little visual adaptation to prismatic displacement and
commensurately greater proprioceptive adaptation. Just the reverse proportions of these two types of adaptation were obtained for subjects with poor eye control, but good hand control. Both Welch (unpublished data) and Warren and Platt (1974) have reported that the instruction to point at a prism-displaced target is not interpreted in the same way by all subjects. Specifically, some subjects take this to mean that, after the initial prism-induced error, they should quite deliberately point to where they know the target to be physically located. Frequently, these so-called “object pointers” aim for a location that actually appears to them to be off to one side of the displaced image and, by so doing, very quickly succeed in pointing accurately, sometimes as soon as the second trial. Other subjects continue to point to where the target appears to be located, only gradually correcting their errors. It is perhaps significant that these “image pointers” also fail to show as large a post-exposure negative aftereffect as the “object pointers,” indicating that they have achieved less substantial adaptation.
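The correction applied to Kottenhoff's correlation earlier in this section is the classical correction for unreliable measurement, which divides the observed correlation by the square root of the product of the two reliabilities. The sketch below is illustrative only: the chapter does not say which reliabilities entered the calculation, so pairing the .60 test-retest reliability of the extroversion measure with the .83 split-half reliability reported for the reversing-goggle adaptation measure is an assumption, and the function name is hypothetical.

```python
import math

def disattenuate(r_xy, r_xx, r_yy=1.0):
    """Estimate the correlation between true scores from an observed
    correlation r_xy and the reliabilities r_xx and r_yy of the two measures."""
    return r_xy / math.sqrt(r_xx * r_yy)

r_observed = 0.72     # introversion/extroversion vs. adaptation (Kottenhoff, 1957)
rel_predictor = 0.60  # test-retest reliability of the extroversion measure
rel_criterion = 0.83  # split-half reliability of the adaptation measure (assumed pairing)

# Correcting for unreliability in the predictor only.
r_predictor_only = disattenuate(r_observed, rel_predictor)

# Correcting for unreliability in both predictor and criterion.
r_both = disattenuate(r_observed, rel_predictor, rel_criterion)

print(f"predictor only: r = {r_predictor_only:.3f}, r^2 = {r_predictor_only ** 2:.3f}")
print(f"both measures:  r = {r_both:.3f}")
```

A corrected value can exceed 1.0 when the reliability estimates are themselves subject to sampling error, so the result is best read as "at or near the ceiling" rather than as a literal proportion of variance.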
THE LONG-TERM RETENTION, CONDITIONING, AND TRANSFER OF ADAPTATION
A number of studies have demonstrated a close relationship between adaptation to perceptual rearrangement and traditional situational learning (Welch, 1978). Of present interest is the evidence concerning the degree to which adaptation (a) exhibits transfer to new situations (stimulus and/or response generalization), (b) is retained for relatively long periods of time, (c) reveals “savings” on subsequent “relearning” sessions, (d) is subject to discriminative conditioning, and (e) can be maintained for two (or more) different distorted environments at the same time. Although we are aware that the adaptive responses measured in these studies may not be identical to those occurring in space sickness, we believe that the research discussed in this section will suggest the issues, kinds of tasks, training regimes, and measures that will prove useful in providing astronauts and other space travelers with some degree of generalized “inoculation” against the perceptual and perceptual-motor disruptions caused by the micro- and macrogravity environments to which they are exposed. In principle, adaptation, like learning, can be measured immediately or some time afterward. It can also be measured by means of the same tasks and stimulus conditions in which it was acquired or by other situations to which it may (or may not) generalize. The two present concerns are (a) evidence for the existence of long-term adaptation effects and (b) an assessment of the degree to which adaptation generalizes to other tasks and types of perceptual rearrangement.
LONG-LASTING ADAPTATION
The traditional assumption in experiments on adaptation to visual rearrangement has been that, because such adaptation involves the contradiction of a lifetime of normal visual and visuomotor experience, it must necessarily be fragile, short-lived, and easily abolished by the reinstitution of undistorted vision. This assumption has
led most investigators to test for adaptation as quickly as possible after the exposure to visual rearrangement has occurred. There are, however, a number of observations that belie the notion that adaptation is a short-lived phenomenon. First, it has been casually observed in a number of experiments involving two or more adaptation sessions spread over a period of time that, on sessions subsequent to the first one, subjects will manifest some initial (albeit partial) adaptation as soon as the distorting goggles are in place. This observation has been reported for adaptation to prismatic displacement in terms of (a) visuomotor aftereffects and reduction of effects (Hein, 1972; Klapp et al., 1974; Lackner & Lobovits, 1977; Welch et al., 1974; Wooster, 1923), (b) shifts in visual direction (Welch et al., 1974), and (c) modifications of felt eye position (McLaughlin & Webster, 1967). It has also been reported for visual adaptation to optical curvature (Festinger et al., 1967; Slotnick, 1969). In every instance, the effect was unexpected and only mentioned as a secondary finding of the experiment. Similar to the preceding phenomenon is the fact that in several experiments, adaptation has been found to increase in strength over a series of adaptation sessions, each of which is separated by an extended period of normal vision (Kinney et al., 1970a; McGonigle & Flook, 1978; Peterson & Peterson, 1938; Snyder & Snyder, 1957). Observations of a more controlled nature concerning the ease with which some people are able to shift from one perceptual environment to another come from studies of underwater perception. Luria and Kinney (1970) and Luria et al. (1967) have shown that professional divers experience much less initial face mask-induced visual distortion when entering the water and less visual aftereffect when leaving the water than is true for inexperienced divers. The most extensive study of long-term adaptation has been carried out by Jones and Holding (1975). 
They used pattern contingent color aftereffects and showed that adaptation magnitude declines only with testing. By using a series of post-adaptation time delays, they were able to show that significant adaptation effects could be observed for months after a single 15-minute adaptation period. Harris (1980) has suggested that other types of adaptation may also last for extended periods. Savreau (1979) has demonstrated that motion contingent color effects last at least a week, and Wolfe (1985) has demonstrated that over 4 minutes of adaptation leads to a long-lasting tilt aftereffect. Boynton and Das (1966) report a related event. The presence of partial adaptation many months after the original exposure to the visual rearrangement and the ability to maintain adaptation to two different perceptual situations simultaneously may be interpreted in several ways. Some investigators (e.g., Held, 1968; Klapp et al., 1974) have assumed that these effects represent the persistence of adaptation (i.e., incomplete decay). This seems unlikely, however, since the interpolated normal visual and visuomotor experience is typically more than sufficient to completely abolish the adaptive shift. One likely alternative is that adaptation (partial or complete) can be conditioned to the situational cues associated with it and is elicited whenever the observer is once again in the presence of these cues. A second, although not mutually exclusive, possibility is that after extensive experience subjects develop a perceptual or perceptual-motor flexibility by which
they can easily shift from an un-adapted to adapted state, and vice versa, or between two different states of adaptation as soon as they identify the particular environment in which they have been placed. In short, they have learned to adapt, an ability that might be referred to as an “adaptation set,” analogous to the more familiar but almost forgotten “learning sets” (e.g., Harlow, 1959). Harlow referred to learning sets as a way in which primates and other highly intelligent organisms “learned to learn” (Harlow, 1949). Intelligent creatures, he argued, had the ability to pick out patterns or procedures, consciously or not, that improved the ability to recognize or respond to similar patterns in the future.
GENERALIZABILITY
Traditional studies of adaptation to perceptual rearrangement have used tasks that are rather similar, if not identical, to those practiced during the adaptation period (Welch, 1978). Consequently, little is known about the degree to which adaptation might transfer to tasks that are very different from those encountered during adaptation. Likewise, there is scant information concerning whether one’s adaptation to one form of perceptual rearrangement will either transfer to, or predict, one’s adaptability to another form of rearrangement. Melamed et al. (1979) discuss the fact that “the total prism shift in target pointing is equal to the algebraic sum of the shifts in the other two measures” (i.e., visual shift and proprioceptive shift, the equation being TP = VS + PS). Although a two-component linear additive model of prism adaptation is attractive, one wonders whether the spacing of the subject’s responses during the exposure period may also be a factor. Regarding the first of these two aspects of generalization, Kinney et al. (1970b), using prismatic displacement, combined three exposure activities with the same three tasks used as pre- and posttests of adaptation. The tasks were (a) placing a small chess piece marker on a square within a checkerboard grid, (b) reaching under a transparent table for a target, and (c) rapidly spearing a bull’s-eye with a wooden dowel. Every subject was measured on all three tasks during the pre- and post-exposure periods, but engaged in only one of them during 5 minutes of prismatic exposure. The greatest amount of adaptation (about 65% of the total possible compensation) occurred for the trained task. The nontrained tasks for a given exposure condition revealed generalized adaptation, but with some decrement. Redding (1973a, 1973b, 1975a, 1975b) has examined the second issue of generalization: whether adaptation to one form of distortion will influence adaptation to another form. 
He found that when subjects were confronted (in a single session) with a visual field that was, simultaneously, prismatically displaced and tilted, adaptation to each of these distortions occurred at the same rate as when subjects were adapted to each separately. Furthermore, the magnitude of subjects’ adaptation to one distortion was not correlated with the magnitude of their adaptation to the other. Thus, it would appear that displacement and tilt adaptation are independent processes that do not transfer to one another. Because the perception of visual location and orientation may be based on qualitatively different processes, the preceding failure of transfer may not be too surprising. Alternatively, perhaps for transfer to occur from one type
of adaptation to another it is necessary to implement a much more extensive training regime on each, perhaps alternating between the two types of distortion. Jell et al. (1985) reported a reduction in human optokinetic after nystagmus in one direction or another, depending on the exposure history of the subject. It is well-known that optokinetic after nystagmus can be reduced due to damage or destruction of the labyrinth and by lesions in parts of the parahypoglossal nuclei or pretectum. The 1985 Jell, Ireland, and LaFortune study also revealed changes, but the authors do not comment on whether this change is due to lowered arousal or a mere drop-off in the values of cumulative eye displacement, duration, or slow phase nystagmus. The authors conclude that this is simple habituation and used cumulative displacement as their most sensitive parameter. The authors suggest that “psychological habituation” (Collins, 1974) may have been a factor. In a 1987 study (Kennedy et al.), subjects were adapted to Purkinje stimulation (Benson & Bodin, 1966) involving approximately 0.5 minutes of bodily rotation, followed by a head turn about an axis orthogonal to that of the preceding rotation. This situation produces dizziness, illusory visual motion, and difficulty walking. The experience is similar to the effects of Coriolis stimulation, except that with the latter the head movements take place during rotation rather than afterward. It was first hypothesized that repeated exposure would cause a decline in the experience of the effects from the Purkinje stimulation. The question was then whether this adaptation would transfer to a situation of so-called pseudo-Coriolis (Dichgans & Brandt, 1973) stimulation in which, instead of the subject being rotated, the surrounding visual field is turned and the subject moves his head. The Kennedy et al. 
(1987) study was designed to evaluate whether adaptation acquired in one stimulus condition involving unusual vestibular stimulation would transfer to another condition where similar, but not identical, conflicting inputs were presented. The amount of transfer was significant, and somewhat unexpected, because in previous studies the hallmark had been the specificity of adaptation (Guedry, 1965). The training condition entailed bizarre stimulation of the cupula-endolymph system from the post-rotatory effects (the Purkinje stimulus). The adaptation to this bizarre stimulation transferred to a condition in which the stimuli to the canals and otoliths are the same as would occur with no physical rotation present. This fact implies that the transferred adaptation was not merely some form of suppression or fatigue at the sensory level but a higher-order modification within the central nervous system. Possibly this is the source of its generalizability. Dobie and May (1990) and Dobie et al. (1990) successfully replicated the Kennedy study, finding that subjects exposed to bodily rotation exhibited increased tolerance to visually induced self-vection (VISV). However, exposure to VISV did not result in greater tolerance to bodily rotation. Harm and Parker (1994) examined the relationship between perceptual reports obtained during a space mission and in preflight adaptation trainer (PAT) devices. Perceptual reports from the astronauts indicated that the PAT device had features similar to those encountered in microgravity. The reports also suggested that these similarities reduced some of the symptoms of space motion sickness during space flight. Welch et al. (1998) examined the possibility that the human vestibulo-ocular reflex (VOR) is subject to dual
adaptation (the ability to adapt more completely after repeated exposure to sensory rearrangement) and adaptive generalization (the ability to adapt more easily to new sensory rearrangement because of prior dual adaptation training). These researchers showed both adaptation and dual adaptation of the VOR, but no adaptive generalization when tested with a target/head gain of 1.0. Clearly, there is little research concerning the generalizability of adaptation to perceptual rearrangement as it applies to space motion sickness. More studies are needed on this issue, particularly (given the present concern) with long-term generalization.
ADAPTIVE PERCEPTUAL LEARNING (APL) TRAINING
The plasticity of the central nervous system permits humans to adapt to temporary ecological changes. These short-term accommodations may be considered under the general rubric of “adaptation to the environment.” Welch (2000b) concludes that
Human beings (and perhaps mammals in general) are able to adjust their behavior, and to a much lesser extent their visual perception, to any sensory rearrangement to which they are actively exposed, given that this rearrangement remains essentially constant over time.
Several texts (Welch, 1978; Dolezal, 1982) and reviews (Harris, 1965; Kennedy, 1970; Lackner & Dizio, 1998; Held, 1965; Welch, 2000a, 2000b) make important points, but are silent concerning the implications for adapting to space sickness. A workshop (McCauley, 1984) and other literature (Kennedy & Frank, 1986; Kennedy et al., 2001) have found this line of investigation useful in understanding simulator sickness, and the same point has been made for virtual environment technology (Stanney et al., 1998; Welch, 2000a). For space sickness, it is important to know whether any studies have shown transfer of adaptation from one environment to another, not merely adaptation to one environment. The literature studying the transfer of adaptation between two conditions is scant (Welch, 2000a, 2000b), but some studies are available (Dobie & May, 1990; Dobie et al., 1987; Dobie et al., 1990; Fineberg, 1977; Fried, 1962; Fregly & Kennedy, 1965; Goodenough & Tinker, 1931; Graybiel & Lackner, 1983; Harm & Parker, 1993, 1994; Taub & Goldberg, 1973; Welch et al., 1998). We would argue that since humans are adaptable, the effects of almost any environmental stressor on performance physiology will change over time, and adaptation will ensue and follow certain rules. If these rules were known, predictions could be made. Space sickness develops in conditions in which nauseogenic stimuli are present for a long period. The perceptual situation of an astronaut or pilot exposed to unusual gravitational inertial forces (including zero and subzero gravity) for some period has been compared in many ways to that found in experiments involving perceptual rearrangement, such as optically induced displacement, curvature, tilt, or right–left reversal (Welch, 1978, 2000a). In both instances, the observer is confronted with a variety of inter- and intrasensory conflicts that initially disrupt perception and
behavior, and may cause nausea (Dolezal, 1982). Likewise, in both situations people reveal an ability to adapt to these imposed conflicts, as manifested in a reduction or elimination of the initial disruptive responses. Thus, overcoming motion sickness, correcting performance, and regaining normal perception when one is subjected to unusual gravitational forces may involve many of the same processes as adaptation to perceptual rearrangement in general. The similarity between the processes of overcoming space sickness and adapting to experimentally imposed perceptual rearrangement provides the motivation for the present approach to perceptual learning.
Using software we developed to rapidly reconfigure virtual reality (VR) devices in our laboratory (Kennedy et al., 2001), we have obtained evidence that changing specific aspects of the VR device (gain, polarity, head tracking, phase relation, and transport delay) produces systematic and replicable changes in the incidence and severity of motion sickness symptomatology. In other words, through software modification we have been able to use VR to study the perceptual rearrangement problem and to develop quantifiable dose–response relationships that elicit graded motion sickness responses among our participants. The present study used this software to rapidly reconfigure a VR device in order to develop a paradigm for reducing the symptoms of space motion sickness through perceptual training. Either graded motion sickness was induced through the systematic distortion of the relevant characteristics of the VR device, or repeated exposure to self-propelled rotation trials was used until adaptation was attained. The generalization of this adaptation was then tested with an optokinetic nystagmus (OKN) drum. More specifically, we created a pseudo-Coriolis condition in the VR device by reversing head-tracking polarity.
This was done with an adaptation protocol in which symptoms were kept at a manageable level. Subsequently, we attempted to transfer this adaptation to a pseudo-Coriolis condition induced by a vection drum rotating at 120° per second. Through this process, we set out to demonstrate the feasibility of the goal of this Phase I research: to transfer perceptual adaptation acquired in one environment so as to relieve symptoms in environments not yet experienced.
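The device manipulations named above (gain, polarity, and transport delay) can be pictured with a minimal sketch. This is not the laboratory software referred to in the text: the names `TrackingConfig` and `rendered_yaw`, the sample-based delay, and the restriction to a single yaw axis are all assumptions made for illustration, and phase relation is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingConfig:
    """Hypothetical head-tracking settings for a VR display."""
    gain: float = 1.0        # scene rotation per unit of head rotation
    polarity: int = 1        # +1 = normal tracking, -1 = reversed tracking
    delay_samples: int = 0   # transport delay, in tracker samples


def rendered_yaw(head_yaw_deg, cfg):
    """Map tracked head yaw angles (degrees) to the yaw used for rendering."""
    out = []
    for i in range(len(head_yaw_deg)):
        # Until delayed data exist, hold the first tracked sample.
        j = max(i - cfg.delay_samples, 0)
        out.append(cfg.polarity * cfg.gain * head_yaw_deg[j])
    return out


# Reversed polarity: a head turn to +20 degrees is rendered as -20 degrees.
reversed_view = rendered_yaw([0.0, 10.0, 20.0], TrackingConfig(polarity=-1))
```

Under these assumptions, setting `polarity=-1` makes the rendered scene move opposite to the head, which is one plausible reading of "reversing head-tracking polarity"; the actual mapping used in the study may differ.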
PRE-ADAPTATION ON VR
Twenty adults (10 males and 10 females) between the ages of 18 and 34 were tested on the effect of pre-adaptation training on virtual reality and vection drum exposure. The training consisted of a simulated rotary stimulation (SRS) procedure in which participants were asked to raise their right hands above their heads and grasp their right earlobe with their left hand, bend at the waist, and spin in a clockwise direction under a self-propelled condition. The participants spun 10 times in 30 seconds (10 RPMs), and this constituted a trial. One or more moderators were always available to support unsteady performers. After standing, participants were asked to rate their dizziness and walk a seven-foot line on the floor. The steps taken were counted until the participant stepped away from the line. The Simulator Sickness Questionnaire (SSQ) is a self-report checklist consisting of 27 symptoms that are rated by the participant in terms of degree of severity on a 4-point Likert-type scale (Kennedy
et al., 1993). Participants were asked to complete an SSQ before exposure to the virtual reality device. The SSQ was then administered following VE exposure and following OKN drum exposure. For the experimental group, the study was conducted in five sessions over 5 days. On the first 4 days, participants in the experimental group experienced five trials of the SRS in a session lasting about 2 hours. In the fifth session on the final day (the only day for the control group), control and experimental subjects were exposed to the VE and to pseudo-Coriolis stimulation in the rotating OKN drum (control subjects experienced one SRS to establish their baseline). Following each task in the study (SRS, VR, and OKN drum), the participants were given an hour of post-testing (SSQ, past pointing, and posture tasks) at 0-, 30-, and 60-minute intervals.
Following completion of the informed consent, each participant was given three questionnaires to evaluate his or her eligibility to participate in the research: (1) Research Participant Information Questionnaire, (2) Simulator Sickness Questionnaire, and (3) Motion History Questionnaire. Once participants were deemed eligible to participate, they performed the pre-exposure tests: postural stability, past pointing, and the vestibulo-ocular reflex (VOR) test. (N.B., the apparatus-based tests [posture, past pointing, and VOR] and the paper-and-pencil questionnaires are not taxing to the participant, nor conducive to discomfort, and reports of their usage abound in the scientific literature. Norms are available for these tests from >1600 cases.) After participants exited the virtual environment, they were asked to complete the post-exposure SSQ, which served as our main dependent variable, followed by the more objective (posture and past pointing) tests.
Participants were required to remain at the test site for at least 60 minutes following the virtual reality exposure to ensure that any effects of the exposure had dissipated. During this time, the SSQ was administered at 15-minute intervals, and the posture and past pointing tasks were administered at 30 minutes and 60 minutes following VE exposure. The postural stability and past pointing performances were compared to scores before VE exposure to verify that they were not noticeably different. Additionally, subjects were asked about their physical condition. If they asked to remain, they were allowed to stay at the experimental site until adverse feelings subsided. If the researcher determined that they might need further time to recuperate, participants were advised to remain at the experimental site until the symptoms subsided, or else they had to have a means of transportation away from the site other than driving themselves. Before being permitted to leave the experimental site, however, participants could not be experiencing any characteristic symptoms of motion sickness (as reported on the post-SSQ) or postural disequilibrium. Participants remained in the laboratory until all symptoms had subsided.
In session two, participants from both groups were asked to enter the OKN drum and to be seated facing forward. They were then instructed on how and when to use the response key in the drum. Participants were then asked to close their eyes until the experiment began. The participants were then told to open their eyes and gaze directly at the rotating inner surface of the drum until a perception of circular
self-motion (circular vection, CV) was experienced. Once CV was experienced, participants signaled its presence by pressing the handheld button. Next, while the drum continued to turn, participants were asked to tilt their head 45° toward the left shoulder and to rate their dizziness. This pseudo-Coriolis stimulation has been shown to induce motion sickness (Dichgans & Brandt, 1973). Each participant then turned his or her head upright and made another rating. This procedure was repeated for the right shoulder and again upright before the drum was stopped (total time about 2 minutes). The drum rotated at a velocity of 120° per second, a rate we knew would produce a substantial pseudo-Coriolis experience with a 1-second head tilt to 45° followed by a 1-second return to upright. This procedure allowed us to repeat the sequence enough times in a 30-minute session for adaptation to ensue without losing a substantial number of subjects to emesis.
After participants exited the OKN drum, they were asked to complete the post-SSQ, which served as our main dependent variable, followed by the more objective (posture, past pointing, and VOR) tests. Participants were required to remain at the test site for at least 60 minutes following the OKN drum exposure to ensure that any effects of the exposure had dissipated. During this time, the SSQ was administered at 15-minute intervals, and the posture, past pointing, and VOR tests were administered at 30 minutes and 60 minutes following the OKN drum exposure. If necessary, additional tasks (games involving eye–hand coordination) were given to help the participant readapt to the natural environment. The postural stability and past pointing performances were compared to the scores obtained before exposure in order to verify that they were not noticeably different. Subjects were tested prior to being released to go home.
As Figure 6.1 shows, dizziness ratings were higher in the control group than in the experimental group, showing transfer of adaptation into the
FIGURE 6.1 Mean dizziness post-VR exposure.
FIGURE 6.2 Mean SSQ score post-VE exposure.
virtual reality condition as a function of prior simulated self-propelled rotary stimulation exposure. The analysis showed the same effect of adaptation training on the Simulator Sickness Questionnaire (SSQ) scores reported following VR exposure. As Figure 6.2 shows, higher simulation sickness ratings were reported by the control group (mean = 44.25) than by the experimental group (mean = 9.72), and these values compare favorably with scores from subjects exposed to space and sea sickness, where similarly high values are obtained. The experimental group in this study exhibited scores that resemble or are lower than the scores of experimental pilots exposed to flight simulation, whereas control subjects exhibited higher scores than that group.
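For readers unfamiliar with how SSQ values on this scale arise, the following is a minimal sketch of the conventional weighted scoring. The chapter does not state the scoring procedure; the weights below are the ones commonly attributed to Kennedy et al. (1993) and should be treated as an assumption to be checked against that source. The function name `ssq_scores`, the 0 to 3 coding of the 4-point ratings, and the example raw sums are hypothetical, and the assignment of individual symptoms to subscales is not reproduced here.

```python
# Conventional SSQ weights (assumed from Kennedy et al., 1993; not given in this chapter).
NAUSEA_W = 9.54
OCULOMOTOR_W = 7.58
DISORIENTATION_W = 13.92
TOTAL_W = 3.74


def ssq_scores(raw_nausea, raw_oculomotor, raw_disorientation):
    """Convert raw subscale sums (sums of 0-3 item ratings) to weighted SSQ scores."""
    raw_total = raw_nausea + raw_oculomotor + raw_disorientation
    return {
        "nausea": raw_nausea * NAUSEA_W,
        "oculomotor": raw_oculomotor * OCULOMOTOR_W,
        "disorientation": raw_disorientation * DISORIENTATION_W,
        "total": raw_total * TOTAL_W,
    }


# Hypothetical participant with raw subscale sums of 2, 3, and 1.
example = ssq_scores(2, 3, 1)
```

Because of the weighting, a handful of mild symptom ratings already yields a Total score in the twenties, which helps put group means such as 9.72 and 44.25 in perspective.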
PRE-ADAPTATION ON VECTION DRUM
The MANOVA showed a significant effect of adaptation training on vection drum exposure: the experimental group, who had prior training with simulated rotary stimulation and VE exposure, reported lower dizziness ratings (mean = 1.63) than the control group, who did not experience simulated rotary stimulation and VE exposure (mean = 3.92). As Figure 6.3 shows, dizziness ratings were higher in the control group than in the experimental group, showing adaptation in the vection drum (OKN) condition as a function of simulated rotary stimulation and VE exposure. Similarly, a MANOVA yielded a significant effect of adaptation training on simulation sickness (SSQ) following the vection (OKN) drum exposure. As Figure 6.4 shows, higher simulation sickness ratings were reported by the control group (mean = 61.71) than by the experimental group (mean = 17.20).
FIGURE 6.3 Mean dizziness rating postvection exposure.
FIGURE 6.4 Mean SSQ score postvection exposure.
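The size of the group differences reported above can be expressed as a percentage reduction relative to the control mean. The short calculation below uses only the means given in the text; the helper name `percent_reduction` is introduced here for illustration, and the calculation is descriptive arithmetic, not a substitute for the MANOVA.

```python
def percent_reduction(control_mean, experimental_mean):
    """Reduction of the experimental mean relative to the control mean, in percent."""
    return 100.0 * (control_mean - experimental_mean) / control_mean


# (control mean, experimental mean) as reported in the text.
reported_means = {
    "SSQ after VR": (44.25, 9.72),
    "Dizziness after vection drum": (3.92, 1.63),
    "SSQ after vection drum": (61.71, 17.20),
}

for label, (control, experimental) in reported_means.items():
    reduction = percent_reduction(control, experimental)
    print(f"{label}: {reduction:.1f}% lower in the experimental group")
```

This gives reductions of about 78% (SSQ after VR), 58% (dizziness after the drum), and 72% (SSQ after the drum).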
PRE-ADAPTATION ON VIMS
These studies indicate that pre-exposure to milder conditions that can elicit motion sickness can reduce the symptoms experienced on a second, more intense exposure. Others have found similar effects. For example, pre-exposure to a shorter-duration OKN event with fewer degrees per second (dps) reduces symptoms on a subsequent, longer exposure at greater dps (Hu et al., 1991b). Clément et al. (2007) required provocative pitch forward and back movements during exposure to a constant velocity
rotational environment over 5 days. They found that symptoms were reduced with each exposure. In another study, exposure to a rotating optokinetic drum reduced subjective and physiological measures on subsequent exposures (Hu et al., 1991a). Following training on a novel visual-spatial task, Smyth et al. (2021) were able to reduce the incidence of carsickness in a driving task. Finally, pre-exposure to a variety of visual-spatial orientations in a VR-based space navigation task, compared with a non-varying orientation, improved orientation in a subsequent VR environment (Stroud et al., 2005).
Data collected for a separate study (French et al., 2023) were re-examined to determine whether APL training had occurred. In that study, a VR version of the black-and-white stripe pattern of a traditional OKN drum was compared with a traditional drum for differences in SSQ symptoms. There were no differences between VR OKN and drum OKN, but both showed significant effects compared to pre-OKN SSQ scores. This indicates that VR OKN produces the same symptom severity as traditional drum OKN and allows VR OKN to be used as a replacement for drum OKN. This is beneficial because more people can be tested at a time with VR OKN than with drum OKN, and VR is more mobile and less expensive to use. Figure 6.5 shows the two OKN induction methods, VR and drum.
FIGURE 6.5 (a) Participant wearing electro-oculogram (EOG) electrodes sitting in the drum OKN device while the black-and-white stripes rotate around them. (b) The image on the screen is a 2D representation of the 3D HMD
FIGURE 6.6 SSQ results for the four dimensions of the SSQ. Box plot shows median scores +/− max and min score after 10 minutes in the OKN device the first and second time.
To prevent order effects, half the participants in this study received VR OKN first and the other half received drum OKN first. Three days later, the conditions were reversed: those who had received VR first got drum OKN, and those who had received drum first got VR OKN. This allowed us to treat the first OKN exposure, whether VR or drum, as the pre-exposure training of APL. The second OKN exposure would indicate whether APL worked (the SSQ scores should decrease on the second OKN) or not (the SSQ scores would remain the same, as they did in the comparison between VR and drum OKN, or increase because of the novel OKN exposure). Figure 6.6 shows box plots (median ± min and max SSQ) for all four SSQ dimensions. The Nausea and Total dimensions were significantly decreased (p < 0.032) on the second OKN exposure according to a one-tailed Wilcoxon matched-pairs signed-rank test. The other two dimensions, Disorientation and Oculomotor, can be seen in the figure to change in the same direction, but not significantly, from the first OKN to the second. These results are comparable to the SRS, VR, and drum results described earlier and support an APL interpretation of the results.
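The one-tailed Wilcoxon matched-pairs signed-rank test mentioned above can be illustrated with a small, self-contained sketch. It computes an exact p-value by enumerating every sign pattern, which is practical only for small samples. The function names and the paired scores are made up for illustration; they are not the study data, and the authors' software and handling of ties or zero differences are not described in the text.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_one_tailed(first, second):
    """Exact one-tailed test that scores decreased from first to second exposure.

    Returns (W_plus, p_value). Pairs with zero difference are dropped.
    """
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    # Under the null hypothesis, every pattern of signs is equally likely.
    as_extreme = 0
    patterns = 0
    for signs in product((0, 1), repeat=len(ranks)):
        patterns += 1
        if sum(r for s, r in zip(signs, ranks) if s) >= w_plus:
            as_extreme += 1
    return w_plus, as_extreme / patterns


# Made-up SSQ-like scores for eight hypothetical participants (not study data).
first_okn = [30, 45, 22, 60, 15, 38, 52, 27]
second_okn = [18, 40, 25, 41, 10, 30, 35, 20]
w_plus, p_value = wilcoxon_one_tailed(first_okn, second_okn)
```

A one-tailed test is appropriate here only because the direction of the effect (lower scores on the second exposure) was predicted in advance by the APL hypothesis.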
CONCLUSION
These kinds of results have been emerging sporadically for many years, but no formal description of a technique like Adaptive Perceptual Learning training has been put forward. We propose the term “APL” to help organize and focus the approach to studying motion sickness mitigation methods. More studies need to explore this phenomenon, which may break the logjam blocking progress in understanding what causes motion sickness in general and what applied treatments are available. APL represents a simple pre-treatment with less severe environments to help reduce symptoms experienced in subsequent, more severe environments. It should be remembered that this is not a new idea. The term “perceptual learning” was originally defined by Eleanor Gibson (1963). She identified three requirements for perceptual learning to have occurred:
the learning must be perceptual in nature rather than, say, consciously learned; it must be long lasting; and it must be the result of practice or prior experience (Gibson, 1963). Rather, we are suggesting that the lack of coordinated research efforts into the technique may be due to the fact that the procedure has not had a name. Without one, scientists could not discuss implications and tests of the idea with a clear sense of what body of literature, like that presented in this chapter, was being referenced.
Unlike previous findings by Guedry (1965), the present study reconfirmed our previous findings that adaptation transfer is not task specific and can be extended to tasks that are not identical. Moreover, the present findings are consistent with previous results by Dobie et al. (1990), who reported increased tolerance to visually induced self-vection as a function of bodily rotation exposure. Although it is clear from the present findings that exposure to bodily rotation is beneficial for reducing motion-related dizziness symptoms, it is not well understood whether this adaptation phenomenon can manifest itself in both directions. Previous studies have not extensively examined the bidirectional effect of perceptual learning and adaptation in distorted environments. One reason may be that there is more “reafference” (von Holst, 1968) in the self-propelled condition and less in the more passive VR and vection conditions. This difference should be examined in future research. Similarly, the study data reported above partially support the results of Harm and Parker (1994) and Welch et al. (1998), who previously reported both adaptation and dual adaptation of the VOR but failed to obtain adaptive generalization. Our findings suggest that the transferred adaptation may be a higher-order modification within the central nervous system, which in turn may account for its generalizability.
The results also point to the need to further examine individual differences in the rate of adaptation. Some people may be more prone to simulation sickness than others; therefore, identifying the traits for “adaptability” would have several practical implications for adapting to space sickness and other situations entailing perceptual adjustment. Notably, an adaptation training program in the form of virtual environment or vection drum exposure may help alleviate several of the motion symptoms found in other visual-vestibular conflict environments. For example, Vanderploeg, Stewart, and Davis (1985) reported that of the 22 space travelers who had had an opportunity for more than one flight, 11 were sick in various degrees on their first flight and 11 were not, and subsequently were symptom-free on their second exposure. These findings clearly indicate the need for pre-adaptation training of those who are prone to simulation sickness-type symptoms for a variety of applications, including NASA astronauts. These findings are consistent with previous findings by Kennedy et al. (1987), who similarly reported that prior adaptation to simulated rotary or Purkinje stimulation transferred to pseudo-Coriolis stimulation, as was demonstrated by the large difference in reported dizziness between the control and experimental conditions. In that study, Kennedy and his associates used only a self-propelled turning test and transferred to a vection drum condition. These findings suggest that training in the form of simulated self-propelled rotary stimulation and virtual environment exposure helps reduce the level of sensory rearrangement found or experienced in certain simulation sickness-related tasks.
The visual and behavioral adaptation to a distorted environment described above, where one can adapt to prismatic glasses or rearranged visual fields, is remarkable in that it occurs quickly, resetting a lifetime of normal visual experience in just a few days. Many people experiencing the discomfort of cybersickness terminate their experience without trying to adapt to it slowly. The evidence shows that the adaptation, once established, is a long-term phenomenon (Klapp et al., 1974; Lackner & Lobovits, 1977) and may strengthen the longer it is continued (McGonigle & Flook, 1978). There is good evidence of cross-adaptation, in that exposure to one visual condition significantly reduced the adaptation rate in another, similar condition (Kennedy et al., 1987). The implications are that (a) habituation to nauseogenic stimuli is possible and rapid, although with a strong individual component, and (b) habituation may generalize to a similar but not identical environment. With regard to the last point, one could conceivably train repeatedly on a milder version of the virtual stimulus that is causing the nausea and rapidly adapt to a comparable but more difficult stimulus.
Although provocative tests of motion sickness reveal a generally positive correlation with “Zero G”-induced sickness from parabolic flights (Reschke et al., 1984), there is a substantial amount of unexplained variance. Kennedy (1970) and Reason and Graybiel (1972) suggest that adaptability is a strong predictor of susceptibility to motion sickness. Specifically, rather than looking for overriding personality characteristics as potential correlates of individual differences in motion sickness, emphasis could be placed on a careful assessment of people’s “perceptual adaptation traits,” performances, and idiosyncratic behavioral tendencies.
For example, the number and/or extent of incidental head movements that a person makes during a baseline (non-motion) period might correlate with the amount of motion sickness experienced in a subsequent motion environment. Another potential correlate of motion sickness response might be the observer’s characteristic absolute and difference thresholds for visual and vestibular motion, each measured separately in a non-motion environment. Perhaps people who are especially sensitive to unconflicted visual and/or vestibular motion will respond more dramatically or quickly and/or adapt more gradually to a motion environment in which these two senses are placed into conflict than will individuals with higher thresholds. The provocative vestibular tests employed by various scientists (Lackner & Graybiel, 1984; Lentz, 1984; Lentz & Guedry, 1978; Oman et al., 1984; Reschke et al., 1984) generally entail the assessment of motion sickness symptomatology, including vomiting, following a strong, abrupt (usually less than 30 minutes), and relatively unpracticed stimulus. Yet, in their current form, these tests do not assess after-reactions, adaptive capacity, or adaptive retention (Lentz & Guedry, 1978). However, we know that adaptation occurs, and if there are individual differences in adaptability, perhaps the combination of both provocative and adaptation testing would improve our ability to predict who will develop symptoms in space and in cyber applications. This should be the future direction for research that seeks a simple, effective, non-pharmaceutical approach to motion sickness mitigation. In summary, our results showed that APL pre-adaptation training in the form of simulated rotary Purkinje stimulation produces reduced levels of simulation
sickness in both the virtual and vection (OKN) drum environments. The significant differences in dizziness, nausea, oculomotor, and other related simulation sickness symptoms found between the control and experimental groups are a clear indication of perceptual adaptation. These results are consistent with the relatively enduring adaptation to prism-displaced vision demonstrated even months after the initial prism exposure. It is not well understood whether the effects of adaptation training can be sustained over a prolonged period of time. In addition, the transfer of adaptation from one situation of visual-vestibular conflict to another warrants further investigation. Also, the dichotomy of the simulated rotary stimulation task (Coriolis versus pseudo-Coriolis) is an important dimension in examining individual differences in adaptation to perceptual rearrangement and space sickness. Adaptive learning effects were also shown in that exposure to one form of OKN-VIMS induction was protective against OKN-VIMS in another form. For astronauts and others exposed to unusual motion environments, these results argue that pretraining with milder forms of expected motion anomalies would offer protection against other, more severe forms in the near future. Further research is needed to address these issues through a series of interlocking empirical experiments. We hope that collecting these procedures under the name of APL training will serve as an organizing principle.
ACKNOWLEDGMENTS
We would like to thank Paul Huchens, Dan Compton, and Cecelia Grizzard for data collection and technical assistance. We would also like to thank Drs. Norm Lane and Bob Jones for their insightful comments during the course of this research. This research was supported by NASA contract NAS2-02016. Charles DaRoshia was the technical monitor.
IN HONOREM
This chapter is dedicated in loving memory to Robert S. Kennedy. One look at the number of citations bearing his name in this chapter, and in virtually any work on the neurovestibular effects of motion sickness, clearly reveals the debt the entire field owes to RSK. He will be missed, but his contributions should inspire similar virtues in those he trained. We will always be grateful for his kindness, his patience, and his willingness to mentor his own generation and the next generation of scientists interested in the phenomenon of VIMS. The number of us whom he touched with his insight and guidance is worthy of this thank you: “gratias ago tibi carus amicus.”
REFERENCES
Avila, C., Antònia, M., Generós, P., & Ignacio Ibáñez-Ribes, O. M. (1999). Anxiety and counter-conditioning: The role of the behavioral inhibition system in the ability to associate aversive stimuli with future rewards. Personality and Individual Differences, 27(6), 1167–1179.
Baker, A., Mystkowski, J., Culver, N., Yi, R., Mortazavi, A., & Craske, M. G. (2010). Does habituation matter? Emotional processing theory and exposure therapy for acrophobia. Behaviour Research and Therapy, 48(11), 1139–1143. https://doi.org/10.1016/j.brat.2010.07.009
Banks, R. D., Salisbury, D. A., & Ceresia, P. J. (1992). The Canadian forces airsickness rehabilitation program, 1981–1991. Aviation, Space, and Environmental Medicine, 63(12), 1098–1101. PMID: 1360796.
Bender, M. B., & Shanzer, S. (1983). History of optokinetic nystagmus. Neuro-ophthalmology, 3(2), 73–88.
Benson, A. J. (1978). Motion sickness. In Dhenin, G. & Ernsting, J. (Eds.), Aviation Medicine: Physiology and Human Factors (pp. 468–493). London: British Crown Copyright.
Benson, A. J. (1999). Spatial disorientation: Common illusions. In Ernsting, J., Nicholson, A. N., & Rainford, D. J. (Eds.), Aviation Medicine (3rd ed., p. 445). London: Butterworths.
Benson, A. J., & Bodin, M. A. (1966). Interaction of linear and angular accelerations on vestibular receptors in man. Aerospace Medicine, 37, 144–154.
Bertolini, G., & Straumann, D. (2016). Moving in a moving world: A review on vestibular motion sickness. Frontiers in Neurology, 7. https://www.frontiersin.org/article/10.3389/fneur.2016.00014
Bertolucci, L. E., & DiDario, B. (1995). Efficacy of a portable acustimulation device in controlling seasickness. Aviation, Space, and Environmental Medicine, 66(12), 1155–1158.
Bos, J. E., Bles, W., & Groen, E. L. (2008). A theory on visually induced motion sickness. Displays, 29, 47–57.
Boynton, R. M., & Das, S. R. (1966). Visual adaptation: Increased efficiency resulting from spectrally distributed mixtures of stimuli. Science, 154, 1581–1583.
Cheung, B., & Hofer, K. (2005). Desensitization to strong vestibular stimuli improves tolerance to simulated aircraft motion. Aviation, Space, and Environmental Medicine, 76(12), 1099–1104.
Clément, G., Deguine, O., Bourg, M., & Pavy-LeTraon, A. (2007). Effects of vestibular training on motion sickness, nystagmus, and subjective vertical. Journal of Vestibular Research, 17(5–6), 227–237. PMID: 18626134.
Collins, E. (1974). Habituation of vestibular responses and visual stimulation. In Kornhuber, H. H. (Ed.), Handbook of Sensory Physiology (Vol. VI/2, pp. 369–386), Vestibular Systems. Berlin, Heidelberg, NY: Springer-Verlag.
Crawshaw, M., & Craske, B. (1976). Oculomotor adaptation to prisms: Complete transfer between eyes. British Journal of Psychology, 67(4), 475–478.
Dahl, E., Offer-Ohlsen, D., Lillevold, P. E., & Sandvik, L. (1984). Transdermal scopolamine, oral meclizine, and placebo in motion sickness. Clinical Pharmacology & Therapeutics, 36(1), 116–120.
Dichgans, J., & Brandt, T. (1973). Optokinetic motion sickness as pseudo-Coriolis effects induced by moving visual stimuli. Acta Otolaryngologica, 76, 339–348.
Dobie, T. (2019). Motion Sickness: A Motion Adaptation Syndrome (Chapter 6). Springer International Publishing. https://doi.org/10.1007/978-3-319-97493-4
Dobie, T. G., & May, J. G. (1990). Generalization of tolerance to motion environments. Aviation, Space, and Environmental Medicine, 61(8), 707–711.
Dobie, T. G., & May, J. G. (1994). Cognitive-behavioral management of motion sickness. Aviation, Space, and Environmental Medicine, 65(10 Pt 2), C1–C2 (ISSN: 0095-6562).
Dobie, T. G., May, J. G., Fischer, W. D., & Elder, S. T. (1987). A comparison of two methods of training resistance to visually-induced motion sickness. Aviation, Space, and Environmental Medicine, 58(9, Sect 2), 34–41.
Dobie, T. G., May, J. G., Gutierrez, C., & Heller, S. S. (1990). The transfer of adaptation between actual and simulated rotary stimulation. Aviation, Space, and Environmental Medicine, 61(12), 1085–1091.
Dolezal, H. F. (1982). Living in a World Transformed. New York: Academic Press.
Festinger, L., Burnham, C. A., Ono, H., & Bamber, D. (1967). Efference and the conscious experience of perception. Journal of Experimental Psychology Monograph, 74(4), 1–36.
Fineberg, M. L. (1977). The effects of previous learning on the visual perception of velocity. Human Factors, 19, 157–162.
Fischer, M. H., & Kornmüller, A. E. (1930). Vertigo. In Bethe, A., von Bergmann, G., Embden, G., & Ellinger, A. (Eds.), Handbook of Normal and Pathological Physiology (pp. 442–494). Berlin, Heidelberg: Springer.
Fregly, A. R., & Kennedy, R. S. (1965). Comparative effects of prolonged rotation at 10 RPM on postural equilibrium in vestibular normal and vestibular defective human subjects. Aerospace Medicine, 36(12), 1160–1167.
French, J., Vuillemot, F., & Bush, D. (2023). Comparison of traditional OKN Drum with VR-OKN drum on subjective symptoms and cortisol. In preparation.
Fried, C. (1962). Studies on the Perceptual Threshold for Motion. II. Effects of Induced Motion on Threshold Velocity (Technical Memorandum No. 18–62). Aberdeen Proving Ground, MD: Army Human Engineering Laboratories.
Gibson, E. J. (1963). Perceptual learning. Annual Review of Psychology, 14, 29–56. https://doi.org/10.1146/annurev.ps.14.020163.000333
Giles, D. A., & Lochridge, G. K. (1985). Behavioral airsickness management program for student pilots. Aviation, Space, and Environmental Medicine, 56(10), 991–994. PMID: 3904710.
Golding, J. F. (2006). Motion sickness susceptibility. Autonomic Neuroscience, 129(1–2), 67–76.
Golding, J. F. (2016). Motion sickness. In Furman, J. M. & Lempert, T. (Eds.), Handbook of Clinical Neurology (Vol. 137, pp. 371–390). Elsevier. ISSN 0072-9752, ISBN 9780444634375. https://doi.org/10.1016/B978-0-444-63437-5.00027-3
Golding, J. F., & Gresty, M. A. (2015). Pathophysiology and treatment of motion sickness. Current Opinion in Neurology, 28(1), 83–88. https://doi.org/10.1097/WCO.0000000000000163
Goodenough, F. L., & Tinker, M. A. (1931). The retention of mirror reading ability after two years. Journal of Educational Psychology, 22, 503–504.
Graybiel, A., & Knepton, J. (1976). Sopite syndrome: A sometimes sole manifestation of motion sickness. Aviation, Space, and Environmental Medicine, 47(8), 873–882.
Graybiel, A., & Lackner, J. R. (1983). Motion sickness acquisition and retention of adaptation effects compared in three motion environments. Aviation, Space, and Environmental Medicine, 54, 307–311.
Guedry, F. E. Jr. (1965). Habituation to complex vestibular stimulation in man: Transfer and retention of effects from twelve days of rotation at 10 RPM. Perceptual and Motor Skills, 21, 459–481.
Guilford, J. P. (1954). Psychometric Methods. New York: McGraw-Hill Book Company.
Harm, D. L., & Parker, D. E. (1993). Perceived self-orientation and self-motion in microgravity, after landing and during preflight adaptation training. Journal of Vestibular Research: Equilibrium and Orientation, 3(3), 297–305.
Harm, D. L., & Parker, D. E. (1994). Preflight adaptation training for spatial orientation and space motion sickness. Journal of Clinical Pharmacology, 34(6), 618–627.
Harlow, H. F. (1959). Learning set and error factor theory. In Koch, S. (Ed.), Psychology: A Study of a Science (Vol. 2, pp. 492–537). New York: McGraw-Hill.
Mitigation of Motion Sickness Symptoms
203
Harlow, H. F. (1949). The formation of learning sets. Psychological Review, 56(1), 51–65. https://doi.org/10.1037/ h0062474 Harris, C. S. (1965). Perceptual adaptation to inverted, reversed, and displaced vision. Psychological Review, 72, 419–444. Harris, C. S. (1980). Insight or out of sight? Two examples of perceptual plasticity in the human adult. In Harris, C. S. (Ed.), Visual Coding and Adaptability (pp. 105–160). Hillsdale, NJ: Lawrence Erlbaum Associates. Heaney, D., Jagneaux, D., & Baker, H. (2018). New device might have solved VR locomotion sickness. Retrieved from https://uploadvr.com /ototech-vibrating-headband-vr-sickness/ Hein, A. (1972). Acquiring components of visually guided behavior. In Pick, A. D. (Ed.), Minnesota Symposia on Child Psychology (pp. 53–68). Minneapolis, MN: University of Minneapolis Press. Held, R. (1965). Plasticity in sensory–motor systems. Scientific American, 213, 84–91. Held, R. (1968). Dissociation of visual functions by deprivation and rearrangement. Psychologische Forschung, 31, 338–348. Heer, M., & Paloski, W. H. (2006). Space motion sickness: Incidence, etiology, and countermeasures. Autonomic Neuroscience, 129(1–2), 77–79. Homick, J. L., Reschke, M. F., & Vanderploeg, J. M. (1984). Space adaptation syndrome: Incidence and operational implications for the space transportation system program. Proceedings of AGARD Conference, Motion Sickness: MECHANISMS, Prediction, Prevention and Treatment (AGARD–CP–372). Neuilly– Sur–Seine, France: Advisory Group for Aerospace Research and Development. Hu, S., Stern, R. M., Vasey, M. W., & Koch, K. L. (1989). Motion sickness and gastric myoelectric activity as a function of speed of rotation of a circular vection drum. Aviation, Space, and Environmental Medicine, 60(5), 411–414. Hu, S., Grant, W. F., Stern, R. M., & Koch, K. L. (1991a). Motion sickness severity and physiological correlates during repeated exposures to a rotating optokinetic drum. 
Aviation, Space, and Environmental Medicine, Apr; 62(4), 308–314. PMID: 2031631. Hu, S., Stern, R. M., & Koch, K. L. (1991b). Effects of pre-exposures to a rotating optokinetic drum on adaptation to motion sickness. Aviation, Space, and Environmental Medicine, Jan; 62, 53–56. ISSN: 0095-6562. Hu, S., Davis, M. S., Klose, A. H., Zabinsky, E. M., Meux, S. P., & Jacobsen, H. A. (1997). Effects of spatial frequency of a vertically striped rotating drum on vection-induced motion sickness. Aviation, Space and Environmental Medicine, 68, 306–311. Huppert, D., Benson, J., & Brandt, T. (2017). A historical view of motion sickness—A plague at sea and on land, also with military impact. Frontiers in Neurology, 8, 114. https:// www.frontiersin.org/article/10.3389/fneur.2017.00114, https://doi.org/10.3389/fneur .2017.00114. ISSN=1664-2295. Ishii, M. (1993). Space and vertigo: In relation to space motion sickness. Japanese Journal of Aerospace and Environmental Medicine, 30(1), 41–45. Jell, R. M., Ireland, D. J., & La Fortune, S. (1985). Human optokinetic after nystagmus. Acta Otolaryngology, 99, 95–101. Jones, M. B. (1970). A two–process theory of individual differences in motor learning. Psychological Review, 77(4), 353–360. Jones, P., & Holding, D. (1975). Extremely long–term persistence of the McCullough effect. Journal of Experimental Psychology: Human Perception and Performance, 4, 323–332. Jones, D. R., Levy, R. A., Gardner, L., Marsh, R. W., & Patterson, J. C. (1985). Self-control of psychophysiologic response to motion stress: Using biofeedback to treat airsickness. Aviation, Space, and Environmental Medicine, Dec; 56(12), 1152–1157. PMID: 3910020.
204
Human Factors in Simulation and Training
Katz, D. I., & Lackner, J. R. (1977). Adaptation to delayed auditory feedback. Perception and Psychophysics, 22(5), 476–486. Kennedy, R. S. (1970). Visual Distortion: A Point of View (Monograph No. 15). Pensacola, FL: Naval Aerospace Medical Institute. Kennedy, R. S. (1972). The Relationship Between Habituation to Vestibular Stimulation and Vigilance: Individual Differences and Subsidiary Problems. Doctoral dissertation, University of Rochester, NY (Also NAMRL Monograph No. 20, Naval Aerospace Medical Research Laboratory, Pensacola, FL.). Kennedy, R. S., & Frank, L. H. (1986). A review of motion sickness with special reference to simulator sickness. Paper presented at the 65th Annual Meeting of the Transportation Research Board, Washington, DC. Kennedy, R. S., Jones, M. B., & Harbeson, M. M. (1980). Assessing productivity and well– being in Navy workplaces. Proceedings of the 13th Annual Meeting of the Human Factors Association of Canada (pp. 108–113). Rexdale, Ontario, Canada: Human Factors Association of Canada. Also, Naval Biodynamics Laboratory, New Orleans, LA: November 1981, pp. 8–13. (Research Report No. NBDL–82R004). (NTIS No. AD A111180). Kennedy, R. S., Berbaum, K. S., & Frank, L. H. (1984a). Visual distortion: The correlation model. Proceedings of the SAE Aerospace Congress and Exhibition (Paper No. 841595). Long Beach, CA: Society of Automotive Engineers. Kennedy, R. S., Lilienthal, M. G., Dutton, B., Ricard, G. L., & Frank, L. H. (1984b). December. Simulator sickness: Incidence of simulator aftereffects in Navy flight trainers. Proceedings of the SAFE Symposium (pp. 299–302). Las Vegas, NV. Kennedy, R. S., Berbaum, K. S., Williams, M. C., Brannan, J., & Welch, R. B. (1987). Transfer of perceptual–motor training and the space adaptation syndrome. Aviation, Space, and Environmental Medicine, 58(9 Suppl.), A29–A33. Kennedy, R. S., Lane, N. E., Berbaum, K. S., & Lilienthal, M. G. (1993). 
Simulator Sickness Questionnaire (SSQ): A new method for quantifying simulator sickness. International Journal of Aviation Psychology, 3(3), 203–220. Kennedy, R. S., Stanney, K. M., & Rolland, J. (2001). Optokinetic studies of the relationship between vection and cybersickness (Report No. N61339-00-C-0054). Orlando, FL: Naval Air Warfare Center Training Systems Division. Kennedy, R. S., Drexler, J., & Kennedy, R. C. (2010). Research in visually induced motion sickness. Applied Ergonomics, 41(4), 494–503. ISSN 0003-6870, https://doi.org/10 .1016/j.apergo.2009.11.006 Kinney, J. A. S., Luria, S. M., Weitzman, D. O., & Markowitz, H. (1970a). Effects of Diving Experience on Visual Perception Under Water (NSMRL Report No. 612). Groton, CT: U.S. Naval Submarine Medical Center. Kinney, J. A. S., McKay, C. L., Luria, S. M., & Gratto, C. L. (1970b). The Improvement of Divers’ Compensation For Underwater Distortions (NSMRL Report No. 633). Groton, CT: U.S. Naval Submarine Medical Center. Klapp, S. T., Nordell, S. A., Hoekenga, K. C., & Patton, C. B. (1974). Long–lasting aftereffect of brief prism exposure. Perception and Psychophysics, 15, 399–400. Koch, A., Cascorbi, I., Westhofen, M., Dafotakis, M., Klapa, S., & Kuhtz-Buschbeck, J. P. (2018). The neurophysiology and treatment of motion sickness. Deutsches Ärzteblatt International, 115, 687–696, 687. Kottenhoff, H. (1957). Situational and personal influences on space perception with experimental spectacles. Acta Psychologica, 12, 79–87. Kovalev, A., Klimova, O., Klimova, M., & Drozhdev, A. (2020). The effects of optokinetic nystagmus on vection and simulator sickness. Procedia Computer Science, 176, 2832–2839.
Mitigation of Motion Sickness Symptoms
205
Lackner, J. R., & DiZio, P. (1998). Adaptation in a rotating artificial gravity environment. Brain Research Review, 28(1–2), 194–202. Lackner, J. R., & Graybiel, A. (1984). Influence of gravitoinertial force level on apparent magnitude of Coriolis cross–coupled angular accelerations and motion sickness. Proceedings of AGARD Conference, Motion Sickness: Mechanisms, Prediction, Prevention and Treatment (AGARD–CP–372). Neuilly–Sur–Seine, France: Advisory Group for Aerospace Research and Development. Lackner, J. R., & Lobovits, D. (1977). Adaptation to displaced vision: Evidence for prolonged aftereffects. Quarterly Journal of Experimental Psychology, 29, 65–69. Lane, J. (2009). The neurochemistry of counterconditioning: Acupressure desensitization in psychotherapy. Energy Psychology: Theory, Research, and Treatment, 1(1), 31–44. Lentz, J. M. (1984). Laboratory tests of motion sickness susceptibility. Proceedings of AGARD Conference, Motion Sickness: Mechanisms, Prediction, Prevention and Treatment (AGARD–CP–372). Neuilly–Sur– Seine, France: Advisory Group for Aerospace Research and Development. Lentz, M., & Guedry, F. E. (1978). Motion sickness susceptibility: A comparison of laboratory tests. Aviation, Space, and Environmental Medicine, 49, 1281–1288. Leung, A. K., & Hon, K. L. (2019). Motion sickness: An overview. Drugs Context, 8, 2019-94. https://doi.org/10.7573/dic.2019-9-4 Lucertini, M., Verde, P., & Trivelloni, P. (2013). Rehabilitation from airsickness in military pilots: Long-Term treatment effectiveness. Aviation, Space, and Environmental Medicine, Nov; 84(11), 1196–1200. Luria, S. M., & Kinney, J. A. S. (1970). Underwater vision. Science, 167, 1454–1461. Luria, S. M., Kinney, J. A., & Weissman, S. (1967). Estimates of size and distance underwater. American Journal of Psychology, 80, 282–286. Mack, A. (1967). The role of movement in perceptual adaptation to a tilted retinal image. Perception and Psychophysics, 2, 65–68. McCauley, M. E. (Ed.). (1984). 
Simulator sickness: Proceedings of a workshop. National Academy of Sciences/National Research Council/National Academy of Sciences. Washington, DC: Committee on Human Factors. McGonigle, B. O., & Flook, J. (1978). Long–term retention of single and multistate prismatic adaptation by humans. Nature, 272, 364–366. McLaughlin, S. C., & Webster, R. C. (1967). Changes in straight–ahead eye position during adaptation to wedge prisms. Perception and Psychophysics, 2, 37–44. Melamed, L. E., Beckett, P. A., & Halay, M. (1979). Individual differences in the visual component of prism adaptation. Perception, 8, 699–706. Mert, A., & Bles, W. (2007). Hyperventilation in a motion sickness desensitization program. Aviation, Space, and Environmental Medicine, 78(4), 505–509. Money, K. E. (1970). Motion Sickness. Physiological Reviews, Jan; 50(1). https://doi.org/10 .1152/physrev.1970.50.1.1 Money, K. E. (1972). Measurement of susceptibility to motion sickness. In Lansberg, M.P. (Ed.), AGARD Conference Proceedings No. 109: Predictability of Motion Sickness in the Selection of Pilots (pp. B2-1–B2-4). Nueilly-sur-Seine, France: Advisory Group for Aerospace Research and Development. Money, K. E., Watt, D. G., & Oman, C. M. (1984). Preflight and postflight motion sickness testing of the spacelab I crew. Proceedings of AGARD Conference, Motion Sickness: Mechanisms, Prediction, Prevention and Treatment (AGARD–CP–372). Neuilly–Sur– Seine, France: Advisory Group for Aerospace Research and Development. Nguyen, T. (1996). Space sickness. Proceedings of the 5th International Conference on Space ‘96 (Vol. 2). Albuquerque, NM, June 1–6.
206
Human Factors in Simulation and Training
Muth, E. R. (2006). Motion and space sickness: Intestinal and autonomic correlates. Autonomic Neuroscience, 129(1–2), 58–66. ISSN 1566-0702. Nooij, S. A. E., Pretto, P., Oberfeld, D., Hecht, H., & Bülthoff, H. H. (2017). Vection is the main contributor to motion sickness induced by visual yaw rotation: Implications for conflict and eye movement theories. PLoS One, 12(4), e0175305. Oman, C. M. (1991). Sensory conflict in motion sickness: An observer theory approach. In Ellis, S. R., Kaiser, M., & Grunwald, A. (Eds.), Pictorial Communication in Virtual and Real Environments (pp. 362–376). London: Taylor and Francis. Oman, C. D., Lichtenberg, B. K., & Money, K. E. (1984). Space motion sickness monitoring experiment: Spacelab 1. Proceedings of AGARD Conference, Motion Sickness: Mechanisms, Prediction, Prevention and Treatment (AGARD–CP–372). Neuilly–Sur– Seine, France: Advisory Group for Aerospace Research and Development. Parker, D. E, Reschke, M. F., Arrott, A. P., Homick, J. L., & Lichtenberg, B. V. (1985). Otolith tilt–translation reinterpretation following prolonged weightlessness: Implications for preflight training. Aviation, Space, and Environmental Medicine, 56, 601–606. Peterson, J., & Peterson, J. K. (1938). Does practice with inverting lenses make vision normal? Psychological Monographs, 225, 12–37. Reason, J. T. (1970). Motion sickness: A special case of sensory rearrangement. Advanced Science, 26, 386–393. Reason, J. (1978). Motion sickness: Some theoretical and practical considerations. Applied Ergonomics, 9(3), 163–167. ISSN 0003-6870. https://doi.org/10.1016/0003 -6870(78)90008-X. Reason, J. T., & Graybiel, A. (1972). Factors contributing to motion sickness susceptibility: Adaptability and receptivity. Proceedings of AGARD Conference: Predictability of Motion Sickness in the Selection of Pilots (AGARD–CP–109). Neuilly–Sur–Seine, France: Advisory Group for Aerospace Research and Development. Redding, G. M. (1973a). 
Simultaneous visual adaptation to tilt and displacement: A test of independent processes. Bulletin of Psychonomic Society, 2, 41–42. Redding, G. M. (1973b). Visual adaptation to tilt and displacement: Same or different processes? Perception and Psychophysics, 14, 193–200. Redding, G. M. (1975a). Simultaneous visuomotor adaptation to optical tilt and displacement. Perception and Psychophysics, 17, 97–100. Reschke, M. F., Homick, J. L., Ryan, P., & Mosely, E. C. (1984). Prediction of the space adaptation syndrome. Proceedings of AGARD Conference, Motion Sickness: Mechanisms, Prediction, Prevention and Treatment (AGARD–CP–372). Neuilly–Sur– Seine, France: Advisory Group for Aerospace Research and Development. Reschke, M. F., Bloomberg, J. J., Harm, D. L., Paloskli, W. H., Layne, C., & McDonald, V. (1998). Posture, locomotion, spatial orientation, and motion sickness as a function of one space flight. Brain Research Review, 28, 102–117. Rogers, D., & Van Syoc, D. (2011). Clinical Practice Guideline for Motion Sickness. American Society of Aerospace Medicine Specialists.Virginia: Aerospace Medical Association, November 14. Sang, F. Y. P., Billar, J., Gresty, M. A., & Golding, J. F. (2005). Effect of a novel motion desensitization training regime and controlled breathing on habituation to motion sickness. Perceptual and Motor Skills, 101(1), 244–256. Savreau, D. (1979). Persistence of simple and contingent motion aftereffects. Perception and Psychophysics, 26(3), 187–194. Slotnick, R. S. (1969). Adaptation to curvature distortion. Journal of Experimental Psychology, 81, 441–448. Snyder, F. W., & Snyder, C. W. (1957). Vision with spatial inversion: A follow–up study. Psychological Record, 7, 20–30.
Mitigation of Motion Sickness Symptoms
207
Smyth, J., Jennings, P., Bennett, P., & Birrell, S. (2021). A novel method for reducing motion sickness susceptibility through training visuospatial ability – A two-part study. Applied Ergonomics, 90, 103264. ISSN 0003-6870. Stanney, K. M. (Ed.). (2002). Handbook of Virtual Environments: Design, Implementation, and Applications. Mahwah, NJ: Lawrence Erlbaum Associates, Inc. Stanney, K. M., Mourant, R. R., & Kennedy, R. S. (1998). Human factors issues in virtual environments: A review of the literature. Presence, 7(4), 327–351. Stroud, K. J., Harm, D. L., & Klaus, D. M. (2005). Preflight virtual reality training as a countermeasure for space motion sickness and disorientation. Aviation, Space, and Environmental Medicine, 76, 352–356. Taub, E., & Goldberg, I. A. (1973). Prism adaptation: Control of intermanual transfer by distribution of practice. Science, 180, 755–757. Thornton, W. E., Pool, S. L., Moore, T., & Vanderploeg, J. (1987). Clinical characterization and etiology of space motion sickness. Aviation, Space, and Environmental Medicine, 58(9, Suppl.), A1–A8. Vanderploeg, J. M., Stewart, D. F., & Davis, J. R. (1985). Space Motion Sickness. Houston, TX: NASA (NASA Report NASA–S–85–02963). von Holst, E. (1968). Relations between the central nervous system and the peripheral organs. In Haber, R. N. (Ed.), Contemporary Theory and Research in Visual Perception (pp. 497–503). New York: Holt, Rinehart, and Winston. Warren, D. H., & Platt, B. B. (1974). The subjects: A neglected factor in recombination research. Perception, 3, 421–438. Warwick-Evans, L. A., Symons, N., Fitch, T., & Burrows, L. (1998). Evaluating sensory conflict and postural instability: Theories of motion sickness. Brain Research Bulletin, 47(5), 465–469. ISSN 0361-9230. Welch, R. B. (1978). Perceptual Modification: Adapting to Altered Sensory Environments. New York: Academic Press. Welch, R. B. (2000a). Adapting to virtual environments. In Stanney, K. M. 
(Ed.), Handbook of Virtual Environments: Design, Implementation, and Applications. Mayhaw, NJ: Lawrence Erlbaum Associates, Publishers. Welch, R. B. (2000b). Adapting to telesystems. In Hettinger, L. & Haas, M. (Eds.), Psychological Issues in the Design and Use of Virtual and Adaptive Environments. Mayhaw, NJ: Lawrence Erlbaum Associates, Publishers. Welch, R. B., Choe, C. S., & Heinrich, D. R. (1974). Evidence for a three–component model of prism adaptation. Journal of Experimental Psychology, 103, 700–705. Welch, R. B., Bridgeman, B., Williams, J. A., & Semmler, R. (1998). Dual adaptation and adaptive generalization of the human vestibulo–ocular reflex. Perception and Psychophysics, 60(8), 1415–1425. Weech, S., Wall, T., & Barnett-Cowan, M. (2020). Reduction of cybersickness during and immediately following noisy galvanic vestibular stimulation. Experimental Brain Research, 238, 427–437. White, B. (2007). Ginger: An overview. American Family Physician, Jun 1; 75(11), 1689–1691. Wolfe, J. (1985). Fatigue and structural change: Two consequences of visual pattern adaptation. Investigative Ophthalmology and Visual Science (Supplement), 24, 215. Wooster, M. (1923). Certain factors in the development of a new spatial coordination. Psychological Monographs, 32(4), 1–96. Yen Pik, S. F., Billar, J., Gresty, M. A., & Golding, J. F. (2005). Effect of a novel motion desensitization training regime and controlled breathing on habituation to motion sickness. Perceptual and Motor Skills, 101, 244–256.
7
Decision-Making under Crisis Conditions: A Training and Simulation Perspective
Jiahao Yu, Tiffany Nickens, Dahai Liu, and Dennis A. Vincenzi
CONTENTS
Introduction
Effects of Time Stress and Uncertainty on Decision-Making
Other Effects on Human Decision Makers under Crisis Conditions
Decision-Making Theories
Decision-Making Performance Measures
Crisis Decision-Making Training
General and Stress Training
Simulation
Microworld
Conclusion
References
DOI: 10.1201/9781003401353-7

INTRODUCTION
Humans make decisions every day. These range from life-planning decisions, such as whether to take a job after college or go to graduate school, to quotidian decisions about what to eat for lunch. Decision-making is a task in which “a person must select one option from a number of alternatives” with “some amount of information available” and under the influence of “time frame” and context uncertainty (Wickens, Lee, Liu, & Becker, 2004). For decisions such as what to eat for lunch, humans can take enough time to consider all the available options, and even if a bad decision is made, the consequence is not significant. Unfortunately, this is not the case when making a decision during a crisis. Decision-making in a crisis situation involves time stress and uncertain information and can be an issue of life or death, such as navigating an airplane through severe weather or deciding when to deploy a parachute if the airplane malfunctions in such a situation. Sometimes, people even need to make
critical decisions in a stressful setting while solving another task simultaneously (Gathmann et al., 2014). Indeed, as Orasanu and Connolly (1993) noted, for crisis decision-making, “the stakes are often high and the effects on lives are likely to be significant.” For public leaders facing uncertainty, such as Covid-19, crisis decision-making becomes an adaptive process with four fundamental functions: cognition, communication, coordination, and control (Comfort et al., 2020).

A crisis can best be described as a “rare and unique” event (Sniezek, Wilkins, & Wadlington, 2001), bringing with it an “unexpected, life-threatening, and time-compressed” (McKinney, 1993) sequence of events. The characteristics of a crisis include the following (Sniezek et al., 2001):
• Uncertainty: Not understanding enough about the event or situation to know how to carry out an appropriate action or to know what the corresponding outcome(s) of that action would be.
• Threat to property/life: A chance that possessions and/or human life “could be lost, or soon will be” (Sniezek et al., 2001).
• Quick occurrence: The effects of a crisis quickly spread to other areas (e.g., panic, supply shortages, potential for violence). Immediate actions are critical in restricting the magnitude of damage.
• Uncontrollability: Many of the outcomes of a crisis can be “partially influenced” (Sniezek et al., 2001), but not completely controlled.

These characteristics make coherent decision-making under crisis nearly impossible, especially given the lack of evidence available to what should be an evidence-informed process (Khalid et al., 2019). Nevertheless, even during this time of uncertainty, high stress levels, and time pressure, the individual or team knows that “not making a decision is not an option” (Flin & Arbuthnot, 2002). One individual or all people involved must take control of the situation and not only prevent it from getting worse but also not “fan the flames.” This set of actions is known as crisis management.
As Sniezek et al. (2001) explain, crisis management can be compared to risk management, only the situation is “real, not potential.” Good and accurate decision-making under time pressure and uncertainty is what makes crisis management effective. In this chapter, we will summarize the research findings in this area. Firstly, we will briefly highlight some of the background information on time stress and uncertainty, particularly the effect they have on decision makers. This will be followed by decision-making theories as related to crisis decision-making. In the final section, we will discuss the training and simulation issues for crisis decision-making.
EFFECTS OF TIME STRESS AND UNCERTAINTY ON DECISION-MAKING
Research has shown that time pressure can reduce the quality of decision-making because limited time is available for thinking through various possible actions (Edland & Svenson, 1993; Maule, Hockey, & Bdzola, 2000). Time pressure as a task characteristic has different meanings for different tasks. Some researchers use
the terms “time urgency” or “time stress” (Rastegary & Landy, 1993) and “time window” (Rothrock, 2001). Time urgency refers to an accelerated pace of activities that results from striving to finish more and more tasks in a decreasing period of time, whereas time pressure is defined as “the difference between the amount of time available and the amount of the time required to solve the task” (Rastegary & Landy, 1993). In most cases, crisis decision-making requires humans to respond within an appropriate time interval. For example, pilots must decide whether to proceed or turn back when encountering severe weather conditions. Such decisions should be made within the appropriate time window, neither too early nor too late (Rothrock, 2001). As a result, the decision-maker must decide what actions to take within a finite amount of time, as well as determine when to implement the chosen actions (Brehmer, 1992). Even when the time available is relatively adequate, the chronic stress brought on by an approaching crisis can still impair decision-making and bias choices toward immediate rewards over long-term payoffs (Morgado et al., 2015; Mudra & Tong, 2020).

As for uncertainty, many definitions exist. Related terms include vagueness, incompleteness, ambiguity, conflict, and randomness (Davis & Hall, 2003). Definitions can be classified into two categories: (1) the variability of a given situation and (2) the characteristics of information regarding the situation (Rastegary & Landy, 1993). Thus, in human decision-making, uncertainty can be characterized as the unknown probability (or likelihood) of a possible outcome (Busemeyer, 1985) or the lack of complete information on which to base a decision (Kuipers, Moskowitz, & Kassirer, 1988), for example, where an incoming hurricane is likely to make landfall and whether certain cities should be evacuated. Brecke and Garcia (1995) classified uncertainty in a decision problem into “primary” and “secondary” uncertainties.
Primary uncertainty is the action uncertainty of the decision, whereas secondary uncertainty includes situation uncertainty, goal uncertainty, and option uncertainty. The information required to end primary uncertainty pertains to secondary uncertainty. Different levels of uncertainty, namely those emanating from the environment, the organization, or individuals, can interact to make the uncertain situation more complex (Rastegary & Landy, 1993). Although many variations of the term persist, what can be agreed upon is that uncertainty plays a significant role in increasing stress in decision-making processes and affects performance (Rastegary & Landy, 1993). Beyond variation in the term itself, context also matters: everyday decisions under uncertainty are typically made within a social context, and decisions under uncertainty in non-social domains may differ (FeldmanHall et al., 2015). FeldmanHall et al. (2015) found that “acute stress dampens an individual’s likelihood of making ambiguously uncertain decisions in social contexts but heightens how often they engage in ambiguously uncertain decisions in non-social contexts.”
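The definitions of time pressure and time window quoted in this section can be written down as a small sketch. The class and function names, the sign convention, and all numbers below are assumptions made for illustration; they do not come from Rastegary and Landy (1993) or Rothrock (2001).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeWindow:
    """Interval (in seconds) during which acting on a decision is appropriate."""
    opens_at: float
    closes_at: float

    def contains(self, t: float) -> bool:
        # Acting before the window opens is too early; after it closes, too late.
        return self.opens_at <= t <= self.closes_at


def time_pressure(time_available: float, time_required: float) -> float:
    """Difference between the time available and the time required.

    Sign convention is an assumption of this sketch: a negative value means
    the task needs more time than the situation allows.
    """
    return time_available - time_required


# Hypothetical numbers for a weather-diversion decision.
window = TimeWindow(opens_at=60.0, closes_at=180.0)
slack = time_pressure(time_available=120.0, time_required=150.0)
print(slack)                   # -30.0
print(window.contains(200.0))  # False
```

In this reading, a negative difference signals that the decision-maker cannot complete the task analytically in the time allowed, and an action taken outside the window is mistimed even if it is the right action.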
OTHER EFFECTS ON HUMAN DECISION MAKERS UNDER CRISIS CONDITIONS
As a result of the many stressors placed on the human during an emergency situation, many physiological and psychological effects can occur and limit cognitive resources
(Mandler, J. M., 1979; Mandler, G., 1982). Under normal circumstances, the average person can simultaneously process roughly five to nine concepts (Miller, 1956); however, when a high volume of stimuli is suddenly experienced, thought capacity decreases to two concepts (Waugh & Norman, 1965). Because of the overwhelming amount of stimuli and stressors encountered in a crisis situation, a short-term memory deficit will undoubtedly ensue (Mandler, 1979; Hockey, 1986). As in a domino effect, communication between teammates and/or the command base is often lacking because of short-term memory failure, thereby creating further problems (Stuster, 1996; Wickens, 2005). Communication is a critical link not only among team members but also between the team and the command base, and it is essential for receiving correct and complete information on the crisis incident to achieve a better situational understanding. However, it has also been found that psychological stress combined with a parallel executive task can keep decision-making performance from declining (Comfort et al., 2020; Pabst et al., 2013).

Further problems include attention tunneling, in which attention is devoted solely to one dilemma at a time, by way of prioritization. Moray and Rotenburg (1989) described this as “cognitive lock-up.” Because more than one problem is likely to surface, the ability to limit the extent of the damage is greatly compromised by the shortage of available cognitive resources. Confusion can also be induced in high-stress environments. This is due to the sudden influx of information and the human attempting to receive and process this information as quickly as possible (Horvitz & Barry, 1995). With working memory degrading as a result of the situation, giving each piece of incoming information a sufficient amount of time and thought is extremely difficult.
In a time of emergency, there is moreover a need to make a decision as quickly as possible, whether or not all alternatives have been evaluated. Hockey (1986) observed this when participants were placed under noise and anxiety stress, applied singly or in combination, a situation not unlike that seen in crises. Unfortunately, premature decision-making can be counterproductive and lead to additional problems, especially if the situation was not understood correctly to begin with. From an evolutionary perspective, human decision-making was shaped by the costs and benefits of the ancestral past, and some aggressive mechanisms may be evoked under crisis conditions regardless of the costs and benefits. Even when people can evaluate the situation rationally, judgment and decision-making are often subconscious and biased (Johnson et al., 2012).

Crisis situations can have many stressors associated with them that will limit the human’s decision-making ability. To further understand decision-making under crisis conditions, special decision-making theories are needed. In the next section, we will discuss some of the theories associated with decision-making and how these theories address the effects produced by time stress and uncertainty.
DECISION-MAKING THEORIES
To improve human decision-making skills in crisis situations, researchers have strived to find the best strategies and models to describe, predict, and aid human
decision-making processes. Sometimes, different models, such as the organizational culture model and the bureaucratic politics model, need to be used together to understand decision-making in a crisis (Monten & Bennett, 2010). Early efforts, now known as the “classical” normative model, utilized probability and statistical theories. These included subjective expected utility (SEU) theory and multi-attribute utility (MAU) theory (Wickens et al., 2004). These theories assumed every decision-maker to be a completely rational entity. Combining this assumption with probability models (such as Bayesian models and the Markovian model), researchers argued that optimal behavior could be expressed by using quantitative measures. This normative model is based on the assumptions that (1) the human decision-maker has complete and valid information available for each decision and (2) the decision-maker has unlimited time to put all the information into the normative model, compare the outcomes of each alternative, and make the right decision. The decision-making process is assumed to proceed through a linear series of steps, as in the DECIDE model (Jensen, 1988).

Despite the wide application of the normative model, the assumptions underlying these rational models are often violated in crisis situations. Consider an aeronautical weather-related decision-making (WRDM) scenario (Wiggins & O’Hare, 1995). When approaching a weather condition, a pilot must decide whether to continue proceeding to the original destination, to divert the plane to a new destination, or to cancel the flight and return to the departure point. The pilot’s decision, which has to be made within the appropriate and very short time frame (neither too early nor too late), is based on limited and uncertain information such as the present state of the aircraft, the current performance characteristics of the aircraft, meteorological conditions, aerodrome specifications, and topological maps.
Each of these pieces of information has unquantifiable effects on future events. That is, according to the rational normative model, the pilot has neither the time nor the information to decide what to do. Instead, the pilot must decide what to do in a more intuitive manner. The pilot’s decision will probably be based on his or her prior experiences. In an emergency situation, humans typically make decisions without formally quantifying each information cue and all outcome alternatives into probabilities (Beach & Lipshitz, 1993). Therefore, researchers have argued that during a crisis, humans tend to make decisions more “naturally” or “intuitively” rather than completely “rationally” (Klein, 1997, 2000; Orasanu & Connolly, 1993; Zsambok, 1997), as in the ancestral past mentioned above (Johnson et al., 2012).
In naturalistic descriptive models, from Klein’s Recognition-Primed Decision-Making (RPD; Klein, 1989) to Rasmussen’s Skill-, Rule-, and Knowledge-based (SRK) behavior in decision-making (Rasmussen, 1983), the belief that experience and extensive practice within a particular domain are the only way to improve decision-making skills has become widely accepted. According to Klein’s (1989) RPD model, decision-making is primed by the decision-maker’s recognition of the situation based on his or her experience with the task-specific domain. If an unfamiliar situation is encountered, the RPD model will not work (Orasanu, 1997); a person must have some previous experience or familiarity with a situation to develop any sort of valid hypothesis about the current situation.
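To make concrete what the classical normative model would ask of the pilot in the weather scenario above, the following minimal sketch computes a subjective expected utility (the sum of subjective probability times utility over an option's outcomes) for the three options. Every probability and utility value here is invented purely for illustration, and the function and variable names are hypothetical; the chapter's point is that a pilot in a crisis rarely has such numbers, or the time to compute with them.

```python
def subjective_expected_utility(outcomes):
    """Return sum(p * u) over (probability, utility) pairs for one option."""
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("subjective probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


def best_option(options):
    """Score every option and return (name of highest-SEU option, all scores)."""
    scores = {name: subjective_expected_utility(o) for name, o in options.items()}
    return max(scores, key=scores.get), scores


# Hypothetical numbers: (subjective probability, utility) for a good and a bad outcome.
OPTIONS = {
    "continue": [(0.70, 100.0), (0.30, -500.0)],
    "divert": [(0.95, 60.0), (0.05, -500.0)],
    "return": [(0.99, 20.0), (0.01, -500.0)],
}

choice, scores = best_option(OPTIONS)
print(choice, scores)
```

Even this toy calculation needs a full set of outcome probabilities and utilities for every alternative, which is exactly the information and time that the two normative assumptions take for granted.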
Take for example the difference in decision-making between a novice and an expert. At the novice level, decision-making is slow, analytical, unreliable, effortful, and disjointed (Brecke & Garcia, 1995). At the expert level, decision-making is intuitive, fast, reliable, effortless, and parallel. Researchers who identified these characteristics in decision-making tasks include Deitch (2001), Kirlik, Fisk, Walker, and Rothrock (1998), Klein (2000), Means, Salas, Crandall, and Jacobs (1993), Mosier (1997), Shanteau (1995), and Wiggins and O’Hare (1995).
In Varma’s (2019) axiomatic model of cognitive decision-making during a crisis, the strategy process is affected by several factors including politicization, formalization of the decision-making process, financial report, the impact of the crisis, etc. In the model, variables can be put on a scale with proscriptive variables at one end and supportive variables at the other. This scale is cross-joined with the advocacy/accommodation continuum to yield a Cartesian product of communication options (Varma, 2019).
DECISION-MAKING PERFORMANCE MEASURES
Decision-making in dynamic environments such as crises lacks a single valid performance measure to assess its effectiveness. Decision-making differs from other skills such as perceptual skills or psychomotor control skills. The latter two can be objectively measured and the measurement outcomes directly reflect skill level. One major problem with measuring decision-making is that the outcomes from decision-making in uncertain environments are relatively random. That is, even if the decision-maker applied perfect decision-making strategies, the outcome might not be successful due to unpredictable interventions of chance, whereas conversely, a lucky guess could produce a successful outcome. Although basing performance on the outcome does greatly simplify things, it does not identify “training needs or provide trainees with feedback” (Johnston, Cannon-Bowers, & Smith-Jentsch, 1995; Johnston, Smith-Jentsch, & Cannon-Bowers, 1997). This implies that for assessing decision-making efficiencies under uncertainty, an individual outcome is not a direct measure. Therefore, “measures of performance” (or processes) and “measures of effectiveness” (or outcomes) (Smith-Jentsch, Johnston, & Payne, 1998) are needed to understand the whole picture of decision-making effectiveness.
Examples of measuring both decision-making processes and outcomes exist in the literature. For example, Cohen, Freeman, and Thompson (1997) used several different measures to assess decision-making efficiency, including the number of issues considered, amount of evidence identified, number of explanations of conflict generated, number of alternatives generated, accuracy of assessment, consensus and confidence in assessment, and frequency of contingency planning. Johnston et al. (1997) developed a framework to measure outcomes and processes at both the individual and team level.
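The claim above that a single outcome is not a direct measure of decision quality can be illustrated with a small probability calculation. This is a sketch under stated assumptions, not a method from the cited studies: it supposes a sound decision process that succeeds with probability 0.8 and a guess that succeeds with probability 0.4 (both values hypothetical), treats trials as independent, and asks how often outcome counts alone would rank the two correctly.

```python
from math import comb


def p_misleading_single(p_good, p_poor):
    """Chance that, on one trial each, the poor process succeeds and the good one fails."""
    return (1.0 - p_good) * p_poor


def p_good_scores_higher(p_good, p_poor, n):
    """Exact probability that the good process has strictly more successes
    than the poor one when each is observed on n independent trials."""
    def pmf(k, p):
        # Binomial probability of exactly k successes in n trials.
        return comb(n, k) * p**k * (1.0 - p) ** (n - k)

    return sum(
        pmf(i, p_good) * pmf(j, p_poor)
        for i in range(n + 1)
        for j in range(i)  # j < i: poor process has fewer successes
    )


# Hypothetical success rates for a sound process and a lucky-guess strategy.
single = p_misleading_single(0.8, 0.4)
one_trial = p_good_scores_higher(0.8, 0.4, 1)
twenty_trials = p_good_scores_higher(0.8, 0.4, 20)
print(single, one_trial, twenty_trials)
```

Under these assumed rates, one observed outcome per trainee separates the two strategies less than half the time, which is one way to see why process measures are needed alongside outcome measures.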
In attempting to find evidence of internal thought processes, Woods (1993) applied a “process-tracing” or “protocol analysis” methodology. Verbal protocols, behavioral protocols, walkthroughs, and interviews are the most common techniques for process tracing. In naturalistic decision-making studies, retrospective
self-reports such as these as well as other interview techniques are widely used (Boreham, 1989; Doherty, 1993). One drawback of these approaches is the reliance on human recall of past events, which can substantially limit the reliability and validity of the measures. Other research has used regression techniques, such as the lens model, to measure relationships between environmental cues and human decisions (Bisantz, Kirlik, Gay, Phipps, Walker, & Fisk, 2000; Hammond, 1993; Jha & Bisantz, 2001; Rothrock & Kirlik, 2003). Although the research on measuring decision-making processes and outcomes has been enlightening, much work remains to be done.
With the measurement of decision-making effectiveness, the efficacy of decision-making training can be determined. Traditional measures of training effectiveness apply a transfer-of-training paradigm (Liu & Vincenzi, 2004; Liu, Blickensderfer, Vincenzi, & Macchiarella, 2006). Transfer of training can be measured in many ways, either as an outcome or as a process. For example, as one of the most popular process measures, transfer of training can be measured by comparing the durations of training needed to perform a task at a certain skill level (time to standard; Liu et al., 2006). Learning curve techniques have been used in some transfer-of-training studies (Damos, 1991; Liu et al., 2006; Spears, 1985; Taylor, Lintern, Hulin, Talleur, Emanuel, & Phillips, 1999) and may be useful in assessing decision-making skill development. Raw data obtained by decision-making performance measures (e.g., accuracy of assessment) can be used in developing a learning curve and determining just how effective the training program is. Learning-curve-fitting methods provide a much more detailed analysis of the data, and the three aspects of performance, i.e., beginning level, asymptotic level, and rate of improvement, can be examined separately (Liu, Nickens, & Wang, 2006).
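As an illustration of how the three aspects of performance named above might be separated, the sketch below fits a simple exponential learning curve, y(t) = asymptote + (beginning - asymptote) * exp(-rate * t), to per-trial scores. The exponential form, the grid search over candidate rates, the function name, and the synthetic data are all assumptions made for this example; the cited studies may use different curve families and fitting procedures.

```python
import math


def fit_learning_curve(trials, scores, rates=None):
    """Fit y = asymptote + (beginning - asymptote) * exp(-rate * t).

    For each candidate rate the model is linear in x = exp(-rate * t),
    so ordinary least squares gives the other two parameters; the rate
    with the smallest squared error wins. Returns (beginning, asymptote, rate).
    """
    if len(trials) != len(scores) or len(trials) < 3:
        raise ValueError("need at least three (trial, score) pairs")
    if rates is None:
        rates = [i / 100 for i in range(1, 201)]  # 0.01 .. 2.00
    n = len(trials)
    mean_y = sum(scores) / n
    best = None
    for r in rates:
        x = [math.exp(-r * t) for t in trials]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # degenerate: all trials at the same time point
        slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, scores)) / sxx
        asymptote = mean_y - slope * mean_x
        sse = sum((asymptote + slope * xi - yi) ** 2 for xi, yi in zip(x, scores))
        if best is None or sse < best[0]:
            best = (sse, asymptote + slope, asymptote, r)  # y(0) = asymptote + slope
    if best is None:
        raise ValueError("trials must span more than one time point")
    return best[1], best[2], best[3]


# Synthetic, noise-free accuracy scores: start at 0.4, level off at 0.9, rate 0.3.
example_trials = list(range(10))
example_scores = [0.9 - 0.5 * math.exp(-0.3 * t) for t in example_trials]
beginning, asymptote, rate = fit_learning_curve(example_trials, example_scores)
print(beginning, asymptote, rate)
```

Fitting the same curve to a trained group and a control group would allow their beginning levels, asymptotes, and rates to be compared separately rather than through a single summary score.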
CRISIS DECISION-MAKING TRAINING
The next question is, exactly how do we train for a rare and abrupt situation occurring in a crisis environment that requires humans to decide on efficacious actions in a short span of time with ambiguous information? Due to the dynamic nature of crises, which makes training according to fixed protocols extremely difficult (Cesta et al., 2014), little research has been conducted in this area. In this section, we will discuss what needs to be involved in the training program, the problems experienced with traditional training processes, and what unconventional training processes could be beneficial in ensuring a successful crisis management training program.
GENERAL AND STRESS TRAINING
One current method of training for high-stress environments involves two separate aspects: general training and stress training. General training ensures that “required knowledge, skills, and abilities” are acquired by means of classroom training or simulation under predictable conditions (Driskell & Johnston, 1998). This training content should extensively cover, from beginning to end, all mission goals, depending
on the particular mission and domain. Even with crises being as unique as they are, set procedures should be learned and exercised on a multitude of possible scenarios and system malfunctions. Research has shown that as long as the individual understands the relationships between symptoms and causes (Dienes & Fahey, 1995) and the “dependencies between all system components” (Kerstholt, 1997), control of the situation can be obtained. If specific information is needed that has not been made available, the individual must be able to use his or her knowledge of the system to find this information (Gonzalez, Vanyukov, & Martin, 2005).
One consequence of working with complex system interdependencies, beyond the sheer knowledge required, is that the individual will undoubtedly face “unintended consequences” of their decisions (Gonzalez et al., 2005). This may result from hasty or forced decision-making, lack of complete information, or even from the dynamic environment itself. Another consequence is that of goal conflicts (Gonzalez et al., 2005). In many instances, the available resources (e.g., humanpower, time, supplies) are simply not enough to sustain the situation. A decision must be made that will prioritize these needs and determine what resources will be focused where. The trainee also needs to learn what side effects will be produced as a result of his or her decisions in this dynamic system and how to make trade-offs when certain goals are threatened.
One increasingly popular method of learning the intricate system relationships and how the individual’s decisions affect the system as a whole is through microworld (or scaled-world) simulations (e.g., Controller Teamwork Evaluation and Assessment Methodology [CTEAM] or Networked Fire Chief [NFC]). This type of simulation is discussed later in the chapter.
Therefore, and perhaps not surprisingly, it is imperative that each individual be exposed to nearly every possible situation, whether it is a planned part of the mission or something outside of that, and learn how to respond accordingly (Cohen, Freeman, & Thompson, 1997), thereby rendering multiple tasks less novel.
Stress training, on the other hand, is used solely to prepare someone to respond cognitively and behaviorally in a high-stress environment. This means that the majority of the training is performed outside of the classroom, without “normal” or expected conditions (Driskell & Johnston, 1998). These stress-training tasks involve uncertain cues and time pressure, which are critical in ensuring transfer of training, so that when the real event happens, effective actions occur “naturally.” Much research has been conducted on the viability of exposing trainees to stress and how it later affects task performance (Ivancevich et al., 1990; Johnston & Cannon-Bowers, 1996; Meichenbaum, 1985; Novaco, Cook, & Sarason, 1983; Smith, 1980; Zakay & Wooler, 1984). For example, one such stress exposure training (SET) program (Driskell & Johnston, 1998) follows these steps:
1. Information provision: An introduction to the symptoms of stress and how stress influences performance. Allows the trainee to become familiar with sensory information, procedural information, and instrumental information associated with a stressful environment, giving them a sense of greater control over the situation.
2. Skills acquisition: Provides exposure to “attentional focus, overlearning, and decision-making skills” training.
3. Application and practice: The application of knowledge and critical-thinking skills obtained by the effects of stress to scenarios similar to those that could probably be experienced, with the stress level being gradually increased over time.
Stress training provides a number of advantages applicable to a high-stress situation that traditional or general training cannot, the first of which is that it gives a better understanding of stressful environments (Driskell & Johnston, 1998; Johnston & Cannon-Bowers, 1996). This allows the trainee to learn to “form accurate expectations” concerning crisis situations, thereby allowing for better “predictability” (Driskell & Johnston, 1998). Furthermore, skills are acquired to overcome anxiety and other stress effects produced by high-stress levels that hinder performance. When trained on what to expect and how to respond, individuals will be skilled in acknowledging and then “cognitively controlling” (Driskell & Johnston, 1998) or suppressing these stress effects to perform appropriately and efficiently. Lastly, this type of training builds performance confidence (Driskell & Johnston, 1998; Johnston & Cannon-Bowers, 1996). Those who learn to approach tasks in a positive or confident manner are found to be less likely to become distracted by extraneous variables in the environment and focus instead on the task at hand.
SIMULATION
Crisis situations are nearly impossible to replicate in training and it is not safe to expose trainees to them. Traditional training (i.e., instructor, classroom, etc.) has typically been deemed insufficient for all spheres of crisis management training. Sniezek et al. (2001) identified the following issues in developing a crisis management training program through traditional training methods: expert selection and recruitment, determining training content, effectiveness assessment, feedback, interactions with trainer, scheduling, cost, realism, and transfer of training. As a result, the best way to overcome these traditional training concerns would be through the use of simulation, a training method that produces a wide range of scenarios, with an “immersive interface,” complete experimental control, and a performance feedback system (Sniezek et al., 2001).
There are many advantages to using simulation over traditional training techniques. According to Sniezek et al. (2001), for effective crisis management, humans need to train “under acute stress” or at least under a combination of “arousal, time pressure, and anxiety”; conventional training methods, unlike simulation, simply cannot provide this. If a simulated training program can successfully produce these results by accurately replicating the natural environment with sequences likely to be experienced, trainees will become “immersed” and approach each scenario as though it were real, as opposed to only “managing a simulated event” (Crego & Spinks, 1997). The level of realism produced is an important key in promoting
transfer of training and, if deficient in any way, can greatly hinder training transfer (Zakay & Wooler, 1984).
Other advantages include the ability of the trainee to understand the effects and side effects of his or her chosen actions. As mentioned previously, it is extremely important for the trainee to learn and understand inputs and outputs of the system to know what decisions to make and to begin to gain control of the situation. Simulation incorporates these complex interdependent relationships of the system into the training. Other benefits include the training of multiple trainees at any given time. This is especially beneficial in situations where individuals may be inactive for periods of time before the action (e.g., military personnel being transported over long distances to a war zone) and can use the system for refresher or recurrent training. Trainees will also be able to interact with one another on the same task, even if they exist in different domains (e.g., air traffic control [ATC] trainees communicating with student pilots in separate simulated environments), an essential element of team decision-making training.
The automated feedback system in simulation programs also assists in satisfying a few of the traditional training issues addressed previously. As Kirlik et al. (1998) noted, there are four areas that would strengthen the individual’s training experience if implemented in the feedback system: timeliness, standardization, diagnostic precision, and presentation mode.
Timeliness: An automated feedback system allows trainees to receive instant information at the end of the trial, or even during the trial if requested. Often, feedback is obtained too late to be of any use to the trainee (Kirlik et al., 1998). Habit breaking can also be achieved if the system has been programmed to intervene when the trainee commits an error during the simulation (Sniezek et al., 2001).
Standardization: Although expert trainers can provide individualized performance feedback to the trainee, it is labor intensive (Sniezek et al., 2001) and can be “highly idiosyncratic” (Kirlik et al., 1998). The trainer must be able to identify the processes used by the trainee to achieve the outcome. Unfortunately, this is not always possible or feasible. As has been already established, the process is as important as the outcome in identifying where improvements are needed. Additionally, trainers tend to have their own preferences in training, and what is deemed important by one may differ from other trainers’ viewpoints, or even differ from the training program itself. With numerous trainers involved in the program, each trainee could receive variations in training; this in turn could hinder future team interactions.
Diagnostic precision: Following a training session, the trainee must be informed as to where and how errors occurred, not just that x number of errors were committed. An automated feedback system would be able to diagnose the exact failure and provide an explanation about what went wrong, as well as offer suggestions on how to improve or prevent it from happening again. Explanations are essential in ensuring that the trainees not only understand the feedback (Sniezek et al., 2001), but that they do not “attribute the error to the particular events in the scenario” and instead take away from it a more generalized lesson (Kirlik et al., 1998).
Presentation mode: Kirlik et al. (1998) found that verbal feedback from trainers during the training session resulted in more interference by creating a “secondary task.” The presentation mode used in their study implemented a text-based “real-time, embedded feedback” system. In a study by Passenier and Kerstholt (1996), participants were supplied with an additional computer screen containing information about the system and the relationships between subsystems; 20% more problems were solved by participants who used this technique than those who did not.
Currently, simulations are being used widely in decision-making training, but more specifically are being used for dynamic decision-making. As will be described next, one form of simulation gaining popularity in dynamic decision-making research is microworld simulation.
MICROWORLD
Although training in the field does provide the highest level of fidelity and allows trainees to replicate real-world tasks, it is extremely difficult to manipulate and control training scenarios, especially when incorporating time stress and uncertainty. Microworld (or scaled-world) simulations, on the other hand, offer a “compromise between experimental control and realism” (Gonzalez et al., 2005). This type of simulation has become an increasingly useful educational and dynamic decision-making research tool over the past three decades (Granlund, Johansson, Persson, Artman, & Mattson, 2001). It enables trainees to operate in a “scaled” version of the environment, thereby giving the users a top-down view of how their decisions and actions made in real time affect the system as a whole. Microworld simulations show the system/environment as it changes autonomously and when each decision or action is enforced. If the user hesitates in decision-making or makes no decision at all, the simulation incorporates this time of inactivity into the current scenario. In addition, this type of simulation gives the researcher the capability to shape the training session to precisely meet the needs of the trainee and the researcher. It ensures that trainees receive a “deeper and more integrated understanding” of the system and environment in which they are immersed, especially of the “environmental inputs and behavioral outputs” (Ehret, 1998).
Some of the current microworld simulation programs that have been evaluated and shown to incorporate relatively high dynamics and complexity are NEWFIRE, Fire Chief, Duress II, Moro, and Water Production Plant (Gonzalez et al., 2005). Although this method of training has been deemed useful in dynamic decision-making studies, further research is needed on its true advantages and disadvantages when applied in crisis training.
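The two defining behaviors described above, an environment that changes autonomously and a scenario that absorbs the trainee's inactivity, can be sketched in a few lines. The toy model below is hypothetical and is not based on NEWFIRE, Fire Chief, or any other named program: the class name, the single "suppress" action, and all numeric parameters are assumptions chosen only to make the mechanics visible.

```python
from dataclasses import dataclass, field


@dataclass
class FireMicroworld:
    """Toy scaled world: one fire whose intensity grows on its own every tick."""
    intensity: float = 10.0    # hypothetical starting intensity
    growth: float = 1.2        # autonomous growth factor applied each tick
    suppression: float = 6.0   # intensity removed by one "suppress" action
    tick: int = 0
    log: list = field(default_factory=list)

    def step(self, action=None):
        """Advance one time step. action=None models hesitation: time still passes."""
        if action == "suppress":
            self.intensity = max(0.0, self.intensity - self.suppression)
        elif action is not None:
            raise ValueError(f"unknown action: {action!r}")
        self.intensity *= self.growth  # the environment changes regardless of the trainee
        self.tick += 1
        self.log.append((self.tick, action, self.intensity))
        return self.intensity


def run(actions):
    """Play a sequence of actions (None = no decision) and return the final world."""
    world = FireMicroworld()
    for action in actions:
        world.step(action)
    return world


prompt = run(["suppress", "suppress"])
delayed = run([None, None, "suppress", "suppress"])
print(prompt.intensity, delayed.intensity)
```

Because every tick is logged together with the action taken (or not taken), the same record can serve both as an outcome measure (final intensity) and as a process trace of when the trainee hesitated.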
Overall, although the benefits of simulation far outweigh those of conventional training methods in crisis training, a major barrier to implementing simulation in the training program is the initial cost. This cost must cover the “research and development costs” associated with a system such as this (Sniezek et al., 2001). The simulation end results are only as good as the model; therefore, an extensive amount of time and effort must be committed to the development of the design.
CONCLUSION
In this chapter, characteristics of a crisis and the effect they have on the human decision-maker have been discussed, as well as problems associated with relying solely on traditional training methods to develop effective decision-making skills during a crisis. Although traditional training methods are adequate for acquiring general domain knowledge and skills, their use otherwise is relatively limited. Crisis training involves much more complicated requirements. During a crisis, individuals face time pressure, high risk, and ambiguous information in a dynamic environment. Research has shown that SET can assist in mitigating many of these effects. Therefore, the crisis training program relies heavily on simulation to meet the needs not satisfied by traditional methods.
REFERENCES
Beach, L. R., & Lipshitz, R. 1993. Why classical decision theory is an inappropriate standard for evaluating and aiding most human decision making. In Klein, G. et al. (Eds.), Decision Making in Action: Models and Methods (pp. 21–36). Norwood, NJ: Ablex Publishing Corp.
Bisantz, A. M., Kirlik, A., Gay, P., Phipps, D. A., Walker, N., & Fisk, A. D. 2000. Modeling and analysis of dynamic judgment tasks using a lens model approach. IEEE Transactions on Systems, Man, and Cybernetics, 30(6), 605–616.
Boreham, N. C. 1989. Modeling medical decision-making under uncertainty. British Journal of Educational Psychology, 59, 187–199.
Brecke, F. H., & Garcia, S. K. 1995. Training methodology for logistic decision making, USAF-AMRL-Technical-Report (Brooks). October 1995; AL/HR-TR 1995-0098: iii–vii, 1–94.
Brehmer, B. 1992. Dynamic decision making: Human control of complex systems. Acta Psychologica, 81, 211–241.
Busemeyer, J. R. 1985. Decision making under uncertainty: A comparison of simple scalability, fixed-sample, and sequential-sampling models. Journal of Experimental Psychology: Learning, Memory and Cognition, 11(3), 538–564.
Cesta, A., Cortellessa, G., & De Benedictis, R. 2014. Training for crisis decision making – An approach based on plan adaptation. Knowledge-Based Systems, 58, 98–112. https://doi.org/10.1016/j.knosys.2013.11.011
Cohen, M. S., Freeman, J. T., & Thompson, B. T. 1997. Integrated critical thinking training and decision support for tactical anti-air warfare. 3rd International Command and Control Research and Technology Symposium Proceedings.
Comfort, L. K., Kapucu, N., Ko, K., Menoni, S., & Siciliano, M. 2020. Crisis decision-making on a global scale: Transition from cognition to collective action under threat of COVID-19. Public Administration Review, 80(4), 616–622. https://doi.org/10.1111/puar.13252
Crego, J., & Spinks, T. 1997. Critical incident management simulation. In Flin, R., Salas, E., Strub, M., & Martin, L. (Eds.), Decision Making Under Stress: Emerging Themes and Applications (pp. 85–94). Burlington, VT: Ashgate Publishing Company.
Damos, D. L. 1991. Examining transfer of training using curve fitting: A second look. The International Journal of Aviation Psychology, 1(1), 73–85.
Davis, J. P., & Hall, J. W. 2003. A software-supported process for assembling evidence and handling uncertainty in decision-making. Decision Support Systems, 35, 415–433.
Deitch, E. 2001. Learning to land: A qualitative examination of pre-flight and in-flight decision-making processes in expert and novice aviators. Dissertation, Virginia Polytechnic Institute and State University.
Dienes, Z., & Fahey, F. 1995. Role of specific instances in controlling a dynamic system. Journal of Experimental Psychology: Learning, Memory and Cognition, 21(4), 848–862.
Doherty, M. E. 1993. A laboratory scientist’s view of naturalistic decision making. In Klein, G. et al. (Eds.), Decision Making in Action: Models and Methods (pp. 362–389). Norwood, NJ: Ablex Publishing Corp.
Driskell, J. E., & Johnston, J. H. 1998. Stress exposure training. In Cannon-Bowers, J. A., & Salas, E. (Eds.), Making Decisions Under Stress: Implications for Individual and Team Training (pp. 191–217). Washington, DC: American Psychological Association.
Edland, A., & Svenson, O. 1993. Judgment and decision making under time pressure: Studies and findings. In Svenson, O., & Maule, A. J. (Eds.), Time Pressure and Stress in Human Judgment and Decision Making (pp. 27–40). New York and London: Plenum Press.
Ehret, B. D. 1998. Scaled worlds as research tools: A demonstration. Human Factors and Ergonomics Society 42nd Annual Meeting (p. 1157). Santa Monica, CA: Human Factors and Ergonomics Society.
FeldmanHall, O., Raio, C. M., Kubota, J. T., Seiler, M. G., & Phelps, E. A. 2015. The effects of social context and acute stress on decision making under uncertainty. Psychological Science, 26(12), 1918–1926. https://doi.org/10.1177/0956797615605807
Flin, R., & Arbuthnot, K. 2002. Incident Command: Tales from the Hot Seat. Aldershot: Ashgate.
Gathmann, B., Schulte, F. P., Maderwald, S., Pawlikowski, M., Starcke, K., Schäfer, L. C., Schöler, T., Wolf, O. T., & Brand, M. 2014. Stress and decision making: Neural correlates of the interaction between stress, executive functions, and decision making under risk. Experimental Brain Research, 232(3), 957–973. https://doi.org/10.1007/s00221-013-3808-6
Gonzalez, C., Vanyukov, P., & Martin, M. K. 2005. The use of microworlds to study dynamic decision making. Computers in Human Behavior, 21, 273–286.
Granlund, R., Johansson, B., Persson, M., Artman, H., & Mattson, P. 2001. Exploration of Methodological Issues in Micro-world Research: Experiences from Research in Team Decision Making. Presented at a workshop on the use of micro-worlds in research. Granada, Spain. Retrieved online at http://www.nada.kth.se/~artman/Articles/Misc/MIKRO_GRANADWORKSHOP.pdf
Hammond, K. R. 1993. Naturalistic decision making from a Brunswikian viewpoint: Its past, present, future. In Klein, G. et al. (Eds.), Decision Making in Action: Models and Methods (pp. 205–228). Norwood, NJ: Ablex Publishing Corp.
Hockey, G. R. J. 1986. Changes in operator efficiency as a function of environmental stress, fatigue and circadian rhythms. In Boff, K. R., Kaufman, L., & Thomas, J. P. (Eds.), Handbook of Perception and Human Performance (pp. 1–49). New York: Wiley.
Horvitz, E., & Barry, M. 1995. Display of information for time-critical decision making. Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence, San Francisco, CA: Morgan Kaufmann Publishers.
Ivancevich, J., Matteson, M., Freedman, S., & Philips, J. 1990. Worksite stress management interventions. American Psychologist, 45, 252–261.
Jha, P., & Bisantz, A. M. 2001. Modeling fault diagnosis in a dynamic process control task using a multivariate lens model. Proceedings of the Human Factors and Ergonomics Society 45th Annual Meeting, Minneapolis/St. Paul, MN.
Jensen, R. S. 1988. Creating a ‘1000 Hour’ pilot in 300 hours through judgment training. Proceedings of the Workshop on Aviation Psychology, Newcastle, Australia: Institute of Aviation, University of Newcastle.
Johnson, D. D. P., McDermott, R., Cowden, J., & Tingley, D. 2012. Dead certain: Confidence and conservatism predict aggression in simulated international crisis decision-making. Human Nature (Hawthorne, N.Y.), 23(1), 98–126. https://doi.org/10.1007/s12110-012-9134-z
Johnston, J., & Cannon-Bowers, J. A. 1996. Training for stress exposure. In Driskell, J. E., & Salas, E. (Eds.), Stress and Human Performance (pp. 223–256). Mahwah, NJ: Lawrence Erlbaum.
Johnston, J. H., Cannon-Bowers, J. A., & Smith-Jentsch, K. A. 1995. Event-based performance measurement system for shipboard command teams. Proceedings of the First International Symposium on Command and Control Research and Technology (pp. 274–276). Washington, DC: Institute for National Strategic Studies.
Johnston, J., Smith-Jentsch, K. A., & Cannon-Bowers, J. A. 1997. Performance measurement tools for enhancing team decision-making training. In Brannick, M. T., Salas, E., & Prince, C. (Eds.), Team Performance Assessment and Measurement: Theory, Methods, and Applications (pp. 311–327). Mahwah, NJ: Lawrence Erlbaum Associates.
Kerstholt, J. H. 1997. Dynamic decision making in non-routine situations. In Flin, R., Salas, E., Strub, M., & Martin, L. (Eds.), Decision Making Under Stress: Emerging Themes and Applications (pp. 185–192). Burlington, VT: Ashgate Publishing Company.
Khalid, A. F., Lavis, J. N., El-Jardali, F., & Vanstone, M. 2019. Stakeholders’ experiences with the evidence aid website to support ‘real-time’ use of research evidence to inform decision-making in crisis zones: A user testing study. Health Research Policy and Systems, 17(1), 106–106. https://doi.org/10.1186/s12961-019-0498-y
Kirlik, A., Fisk, A. D., Walker, N., & Rothrock, L. 1998. Feedback augmentation and part-task practice in training dynamic decision-making skills. In Cannon-Bowers, J. A., & Salas, E. (Eds.), Making Decisions Under Stress (pp. 91–113). Washington, DC: American Psychological Association.
Klein, G. A. 1989. Recognition-primed decisions. In Rouse, W. (Ed.), Advances in Man-Machine Systems Research (pp. 47–92). Greenwich, CT: JAI Press.
Klein, G. 1997. An overview of naturalistic decision-making applications. In Zsambok, C., & Klein, G. (Eds.), Naturalistic Decision Making (pp. 48–61). Mahwah, NJ: Lawrence Erlbaum Associates.
Klein, G. 2000. How can we train pilots to make better decisions? In O’Neil, H., & Andrews, D. (Eds.), Aircrew Training and Assessment (pp. 165–194). Mahwah, NJ: Lawrence Erlbaum Associates.
Kuipers, B., Moskowitz, A. J., & Kassirer, J. P. 1988. Critical decision under uncertainty: Representation and structure. Cognitive Science, 12, 177–210.
Liu, D., Blickensderfer, E., Vincenzi, D., & Macchiarella, N. 2006. Transfer of training. In Vincenzi, D., & Wise, J. (Eds.), Human Factors and Simulation (pp. 49–60).
Liu, D., Nickens, T., & Wang, Y. 2006. Modeling Decision-Making Learning Process Under Crisis Situations. Paper presented at the 10th Annual Fall Simulation Interoperability Workshop, Orlando, FL.
Liu, D., & Vincenzi, D. 2004. Measuring simulation fidelity: A conceptual study. Proceedings of Human Performance, Situation Awareness and Automation Conference, Daytona Beach, FL.
Mandler, J. M. 1979. Categorical and schematic organization in memory. In Puff, C. R. (Ed.), Memory Organization and Structure (pp. 259–299). New York: Academic Press.
Mandler, G. 1982. Stress and thought processes. In Goldberger, L., & Breznitz, S. (Eds.), Handbook of Stress: Theoretical and Clinical Aspects (pp. 88–104). New York: Free Press.
Decision-Making under Crisis Conditions
223
Maule, A. J., Hockey, G. R., & Bdzola, L. 2000. Effects of time-pressure on decision-making under uncertainty: Changes in affective state and information processing strategy. Acta Psychologica, 104, 283–301. McKinney, E. H. 1993. Flight leads and crisis decision making. Aviation, Space, and Environmental Medicine, 64, 359–362. Means, B., Salas, E., Crandall, B., & Jacobs, T. O. 1993. Training decision makers for the real world. In Klein, G. et al. (Eds.), Decision Making in Action: Models and Methods (pp. 306–327). Norwood, NJ: Ablex Publishing Corp. Meichenbaum, D. 1985. Teaching thinking: A cognitive-behavioral perspective. In Segal, J. W., Chipman, S. F., & Glaser, R. (Eds.), Thinking and Learning Skills, 2: Research and Open Questions. London: Lawrence Erlbaum Associates. Miller, G. 1956. The magical number seven plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81–97. Monten, J., & Bennett, A. 2010. Models of crisis decision making and the 1990–91 gulf war. Security Studies, 19(3), 486–520. https://doi.org/10.1080/09636412.2010.505129 Moray, N., & Rotenberg, I. 1989. Fault management in process control: Eye movements and action. Ergonomics, 32, 1319–1342. Morgado, P., Sousa, N., & Cerqueira, J. J. 2015. The impact of stress in decision making in the context of uncertainty. Journal of Neuroscience Research, 93(6), 839–847. https:// doi.org/10.1002/jnr.23521 Mosier, K. 1997. Myths of expert decision making and automated decision aid. In Zsambok, C., & Klein, G. (Eds.), Naturalistic Decision Making (pp. 319–331). Mahwah, NJ: Lawrence Erlbaum Associates. Mudra, R. A., & Tong, M. T. 2020. Making “good” choices: Social isolation in mice exacerbates the effects of chronic stress on decision making. Frontiers in Behavioral Neuroscience, 14, 81–81. https://doi.org/10.3389/fnbeh.2020.00081 Novaco, R., Cook, T., & Sarason, I. 1983. Military recruit training: An arena for stress-coping skills. 
In Meichenbaum, D., & Jaremko, M. (Eds.), Stress Reduction and Prevention (pp. 377–418). New York: Plenum. Orasanu, J. 1997. Stress and naturalistic decision making: Strengthening the weak links. In Flin, R., Salas, E., Strub, M., & Martin, L. (Eds.), Decision Making Under Stress: Emerging Themes and Applications (pp. 43–66). Burlington, VT: Ashgate publishing Company. Orasanu, J., & Connolly, T. 1993. The reinvention of decision making. In Klein, G., Orasanu, J., Calderwood, R., & Zsambok, C. (Eds.), Decision Making in Action: Models and Methods (pp. 3–20). Norwood, NJ: Ablex. Pabst, S., Schoofs, D., Pawlikowski, M., Brand, M., & Wolf, O. T. 2013. Paradoxical effects of stress and an executive task on decisions under risk. Behavioral Neuroscience, 127(3), 369–379. https://doi.org/10.1037/a0032334 Passenier, P. O., & Kerstholt, J. H. 1996. Design and evaluation of a decision support system for integrated bridge operation. Report TNO TM-1996 C064. Soesterberg: Human Factors Research Institute. Rasmussen, J. 1983. Skills, rules, and knowledge; Signals, signs, and other distinctions in human performance models. IEEE Transactions on Systems, Man, and Cybernetics, 13, 257–266. Rastegary, H., & Landy, F. 1993. The interaction among time urgency, uncertainty and time pressure. In Svenson, O., & Maule, A. (Eds.), Time Pressure and Stress in Human Judgment and Decision Making. New York: Plenum Press. Rothrock, L. 2001. Using the time windows to evaluate operator performance. International Journal of Cognitive Ergonomics, 5(1), 1–21. Rothrock, L., & Kirlik, A. 2003. Inferring rule-based strategies in dynamic judgment tasks: Toward a noncompensatory formulation of the lens model. IEEE Transactions on Systems, Man, and Cybernetics, Part A, 33(1), 58–72.
224
Human Factors in Simulation and Training
Shanteau, J. 1995. Expert judgment and financial decision making. In Green, B. (Ed.), Risky Business: Risk Behavior and Risk Management. Stockholm University. Retrieved July 2005 online from http://www.ksu.edu/psych/cws/pdf/financial_experts95.PDF Smith, R. E. 1980. A cognitive/affective approach to stress management training for athletes. In Nadeau, C. H., Halliwell, W. R., Newell, K. M., & Roberts, G. C. (Eds.), Psychology of Motor Behavior and Sport—1979 (pp. 54–73). Champaign, IL: Human Kinetics. Smith-Jentsch, K. A., Johnston, J. H., & Payne, S. C. 1998. Measuring team-related expertise in complex environments. In Cannon-Bowers, J. A., & Salas, E. (Eds.), Decision Making Under Stress: Implications for Training and Simulation (pp. 61–87). Washington, DC: American Psychological Association. Sniezek, J. A., Wilkins, D. C., & Wadlington, P. L. 2001. Advanced training for crisis decision making: Simulation, critiquing, and immersive interfaces. Proceedings of the 34th Hawaii International Conference on System Sciences, Maui, HI: IEEE Computer Society. Spears, W. 1985. Measuring of learning and transfer using curve fitting. Human Factors, 27, 251–266. Stuster, J. 1996. Bold Endeavors: Lessons from Polar and Space Exploration. Annapolis, ML: Naval Institute Press. Taylor, H. L., Lintern, G., Hulin, C. L., Talleur, D. A., Emanuel, T. W. Jr., & Phillips, S. I. 1999. Transfer of training effectiveness of a personal computer aviation training device. International Journal of Aviation Psychology, 9(4), 319–335. Varma, T. 2019. Understanding decision making during a crisis: An axiomatic model of cognitive decision choices. International Journal of Business Communication (Thousand Oaks, Calif.), 56(2), 233–248. https://doi.org/10.1177/2329488415612477 Waugh, N., & Norman, D. 1965. Primary memory. Psychological Review, 72, 89–104. Wickens, C. D. 2005. Attentional tunneling and task management. 
Proceedings of the 13th International Symposium on Aviation Psychology, Dayton, OH: Wright-Patterson AFB. Wickens, C., Lee, J. D., Liu Y., & Becker S. 2004. An Introduction to Human Factors Engineering. Hoboken, NJ: Prentice Hall. Wiggins, M., & O’Hare, D. 1995. Expertise in aeronautical weather-related decisionmaking: A cross-sectional analysis of general aviation pilots. Journal of Experimental Psychology: Applied, 1(4), 305–320. Woods, D. D. 1993. Process-tracing methods for the study of cognition outside of the experimental psychology laboratory. In Klein, G. et al. (Eds.), Decision Making in Action: Models and Methods (pp. 228–252). Norwood, NJ: Ablex Publishing Corp. Zakay, D., & Wooler, S. 1984. Time pressure, training and decision effectiveness. Ergonomics, 27, 273–284. Zsambok, C. E. 1997. Naturalistic decision making: Where are we now? In Zsambok, C., & Klein, G. (Eds.), Naturalistic Decision Making (pp. 3–16). Mahwah, NJ: Lawrence Erlbaum Associates.
8
Healthcare Simulation and Training
Sarah A. Powers and Mark W. Scerbo
CONTENTS
Introduction
  Benefits of Simulation in Healthcare
  Drawbacks of Simulation in Healthcare
  Dimensions of Simulation
History of Medical Simulation
  Mannequins
  Virtual Reality Systems
    Surgical Systems
    Collaborative and Immersive Training Systems
  Standardized Patients
  Hybrid Systems
Training
Team Training
  Benefits
  History and Scope of Team Training in Healthcare
Training Transfer
Healthcare Simulation and the Pandemic
Conclusion
References
DOI: 10.1201/9781003401353-8
INTRODUCTION
The use of simulation in high-risk fields such as aviation, aerospace, and the military has been widespread for nearly a century (Rosen, 2008; Weinger & Gaba, 2014). By comparison, the integration of simulation in healthcare is relatively new. Simulation has been defined as “a technique – not a technology – to replace or amplify real experiences with guided experiences that evoke or replicate substantial aspects of the real world in a fully interactive manner” (Gaba, 2004, p. 126). The need for methods to enhance training in healthcare became clear with the publication of the Institute of Medicine’s (IOM) report, To Err Is Human, in 2000 (Kohn et al., 2000). This report estimated that as many as 98,000 people die each year from medical errors. Several of the recommendations offered in this report to increase safety in the US healthcare system stressed the use of simulation for training.
In addition to the ultimate goal of enhancing patient safety, there are several other driving forces of simulation in healthcare. Perhaps the biggest factor has been an increase in the capabilities and availability of technology for simulation (Gaba, 2004; Issenberg & Scalese, 2008; Nestel & Kelly, 2018; Satava, 2001). Another is the growing size of the healthcare workforce combined with the reduction in time available for training (Bradley, 2006; Issenberg & Scalese, 2008; Nestel & Kelly, 2018; Sinz, 2007). There has been a shift in healthcare training, with a greater focus on streamlined, shorter, and more efficient training (Bradley, 2006). However, a consequence of this emphasis has been that many students are less prepared for clinical practice (Cartwright et al., 2005; Perlmen et al., 2017). Simulation can supplement or even replace current training requiring patient contact hours. There is a growing trend to count time spent training in simulation as a suitable substitute for clinical time in healthcare curricula (Accreditation Council for Graduate Medical Education [ACGME], 2020; Nestel & Kelly, 2018; Sinz, 2007).
Benefits of Simulation in Healthcare
There are numerous benefits of simulation in healthcare, many of which are related to enhancing training and safety. First, simulation can be made easily adaptable to users at all levels for skill acquisition, assessment, and retention (Bradley, 2006; Gordon, 1974; Jones et al., 2015; Yunoki & Sakai, 2018). This adaptability enables individualized training and can accommodate various learning styles, which can also help to increase student engagement compared to more traditional forms of instruction. Unlike real patients, simulators do not get tired or worried, and they are always available (Gordon, 1974). Additionally, simulators provide an ethical substitute for patients when trainees are practicing dangerous, invasive, or rare treatments (Buck, 1991; Gaba, 2004; Gordon, 1974; Ziv et al., 2003). Simulation provides novice trainees more opportunities to practice and refine their skills before performing any procedures on an actual patient. From the trainees’ perspective, simulation also provides a less stressful environment because they are less likely to feel pressure or embarrassment compared to practicing on a real patient (Forrest, 2019) and they know that any mistake will not harm an actual patient (Bradley, 2006). Ultimately, these benefits translate into improved patient safety (Buck, 1991; Bradley, 2006; Gaba, 2004; Nestel & Kelly, 2018). Despite these numerous benefits of simulation in healthcare, there are potential drawbacks that must also be considered.
Drawbacks of Simulation in Healthcare
Several barriers have slowed the widespread implementation of simulation in healthcare. Potential drawbacks of simulation in healthcare include cost, zero or negative transfer, technology limitations, and fidelity. One of the most frequently cited drawbacks is the cost associated with simulation (Buck, 1991; Gordon, 1974; Rosen, 2008; Satava, 2001; Sinz, 2007). Although cost can be a concern, it is important
to note that cost varies widely depending on many different aspects of a simulation (Gaba, 2004). High-fidelity simulation systems and activities will likely incur a high cost, whereas low-fidelity simulations can be quite inexpensive. For example, the high-fidelity SimMan costs upward of $68,000 (Laerdal Medical Corp, 2014), which does not include other yearly operational costs associated with maintenance, repairs, supporting information technology, and maintaining instructional support staff (Walsh & Jaye, 2012). Additionally, time spent in simulation can incur a higher cost if it takes time away from clinical services. However, when simulation training can replace clinical training, the relative cost can be lower. Thus, there are tradeoffs when considering how the implementation of simulation training in healthcare will impact overall cost. Another concern relates to the reliability of the technology used for simulation. Specifically, when the cost of a simulator is high, there are concerns regarding whether it or other related technology may fail during training or practice (Buck, 1991). A final concern is whether simulators sufficiently, and more importantly accurately, represent the things they purport to simulate (Gordon, 1974). If a simulator does not adequately represent the genuine clinical activity, the resulting inaccuracies are likely to transfer into practice and can have ramifications for patient care and outcomes.
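The cost tradeoff described above can be made concrete with a rough calculation. The following is only a minimal sketch: the $68,000 purchase price is the figure quoted in the text, while the operating cost, service life, and hours of use are hypothetical placeholders chosen for illustration, and the class and method names are our own.

```python
from dataclasses import dataclass


@dataclass
class SimulatorCost:
    """Illustrative cost model; every field is a planning assumption."""
    purchase_price: float           # one-time acquisition cost (USD)
    annual_operating_cost: float    # maintenance, repairs, IT, support staff (USD per year)
    service_life_years: float       # years over which the purchase is spread
    training_hours_per_year: float  # hours the simulator is actually used

    def cost_per_training_hour(self) -> float:
        """Annualized purchase plus operating cost, divided by annual hours of use."""
        annualized_purchase = self.purchase_price / self.service_life_years
        return (annualized_purchase + self.annual_operating_cost) / self.training_hours_per_year


# 68,000 USD is the price quoted in the text; the other three values are hypothetical.
high_fidelity = SimulatorCost(
    purchase_price=68_000,
    annual_operating_cost=20_000,
    service_life_years=5,
    training_hours_per_year=400,
)
print(round(high_fidelity.cost_per_training_hour(), 2))
```

Under these assumed inputs, the hourly figure is driven more by operating cost and utilization than by the purchase price, which is one reason the purchase price alone can be a misleading basis for comparison.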
Dimensions of Simulation
An overview of the diversity of simulation applications was provided by Gaba (2004), who categorized simulation into 11 dimensions. Later, Scerbo and Anderson (2012) organized these dimensions into three higher-level categories: goals, trainee or user characteristics, and method of implementation. Goals include the purpose and aims of the simulation activity. Broadly, these include education, training, performance assessment, and research. They also include the domain (e.g., primary care, procedural/surgery, or high hazard/emergency medicine) and the target of the simulation (e.g., motor skills, knowledge, attitudes, teamwork), as well as the age of the simulated participant(s), which can range from the neonate to the elderly. User characteristics include the unit of participation (i.e., individuals, teams, and even entire organizations), the experience level of the participants from novice to expert, and the discipline of the participants, which could be physicians, nurses, or even management. Last, the method of implementation addresses the type of technology. At the most basic level, simulation can be conducted without any technology, such as verbal role-playing, while at the most complex level, simulation may require sophisticated technology, such as virtual patients embedded in a virtual replication of a clinical site. The method also includes the site for the simulation activity, such as a simulation center or the actual work environment (in situ), as well as the nature of participation, which can range from observing the activity to being an active participant. Finally, the method also includes how feedback is provided, i.e., from an instructor, debriefing personnel, or simulator-generated performance metrics. 
Together these dimensions may not fully capture the breadth of current simulation applications, but they do provide a classification scheme to help researchers and others conceptualize the various elements across the diversity of simulation applications.
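One way to see how the 11 dimensions fall into the three higher-level categories is to write them out as a simple record structure. The sketch below is illustrative only: the class names, field names, and the example activity are our own assumptions and are not taken from Gaba (2004) or Scerbo and Anderson (2012).

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Goals:
    purpose: str                # education, training, performance assessment, or research
    domain: str                 # e.g., primary care, procedural/surgery, emergency medicine
    target: str                 # e.g., motor skills, knowledge, attitudes, teamwork
    simulated_age: str          # from neonate to elderly


@dataclass(frozen=True)
class UserCharacteristics:
    unit_of_participation: str  # individual, team, or organization
    experience_level: str       # novice through expert
    discipline: str             # e.g., physicians, nurses, management


@dataclass(frozen=True)
class Method:
    technology: str             # from verbal role-play to virtual patients
    site: str                   # simulation center or in situ
    participation: str          # observer through active participant
    feedback: str               # instructor, debriefing personnel, or simulator metrics


@dataclass(frozen=True)
class SimulationActivity:
    name: str
    goals: Goals
    users: UserCharacteristics
    method: Method


def dimension_count() -> int:
    """Count the dimensions captured across the three categories."""
    return sum(len(fields(category)) for category in (Goals, UserCharacteristics, Method))


# Hypothetical tagging of a mannequin-based team drill (illustrative values only).
drill = SimulationActivity(
    name="resuscitation team drill",
    goals=Goals("training", "high hazard/emergency medicine", "teamwork", "adult"),
    users=UserCharacteristics("team", "novice", "nurses"),
    method=Method("mannequin", "in situ", "active participant", "debriefing personnel"),
)
print(dimension_count())
```

Describing an activity in this way makes it straightforward to compare two simulations dimension by dimension, for example to check whether they differ only in site or also in the form of feedback.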
HISTORY OF MEDICAL SIMULATION
Healthcare-related simulation can be traced back to the early 4th century (Owen, 2012). Very early simulators were typically anatomical models made from elements such as clay, stone, bronze, wood, ivory, and even wax (Jones et al., 2015; Owen, 2012). Documented use of simulation for the purpose of medical education is much more recent, having been first cited in the 17th and 18th centuries (Buck, 1991; Owen, 2012). More modern forms of simulation were introduced in the 20th century and have continued to evolve (Cooper & Taqueti, 2008). In this chapter, we will focus on a subset of the most common types of simulation. These include mannequins, virtual reality systems, immersive VR collaborative training systems, standardized patients, and hybrid systems.
Mannequins
Some of the earliest simulators used for medical education were mannequins (Buck, 1991). These are life-sized physical reproductions of the human body with components that replicate critical systems (e.g., heart, lungs, digestive tract). More sophisticated mannequins also replicate human physiology and response to pharmacological agents (Gaba et al., 2001). One of the first mannequins was developed by Gregoire the Younger in the 18th century (Buck, 1991; Owen, 2012, 2018). This early mannequin simulated a woman’s abdomen and pelvis for the purpose of training midwives how to handle a variety of birthing complications. Although there are no known photos of this simulator, written accounts of its use questioned whether training would transfer to actual practice given the simulator’s low fidelity. In response to these concerns, several other slightly higher fidelity simulators were developed throughout the 19th century. One of the first modern mannequins is Resusci®-Anne, developed by the doll and toymaker A.S. Laerdal in the early 1960s for resuscitation training (Cooper & Taqueti, 2008; Rosen, 2008). Simulators for resuscitation training are important because they reduce the risks of injuring healthy people during practice and of spreading contagious diseases (Buck, 1991). Resusci®-Anne is a life-sized mannequin that represents a young adult female. Early versions of this mannequin were developed solely to teach mouth-to-mouth resuscitation via a simulated airway and lungs capable of inflating and deflating (Buck, 1991; Cooper & Taqueti, 2008; Rosen, 2008). Later versions incorporated a spring and piston mechanism to simulate the resistance of the breastbone and ribcage during chest compressions. The Laerdal company also created child and adolescent versions (Buck, 1991). By the late 1960s, more complex simulators with integrated computer technology began to emerge. 
The earliest known simulator capable of replicating functions of human patients via computerized technology was Sim One (Abrahamson et al., 1969). Developed in the mid-1960s, Sim One was a full-body mannequin used for training endotracheal intubation in anesthesiology (Cooper & Taqueti, 2008; Gordon, 1974; Rosen, 2008). Sim One could generate a heartbeat, pulse, and blood pressure and was capable of breathing, opening and closing its mouth and eyes, exhibiting pupillary changes, and reproducing responses to anesthesia injections (Buck, 1991). Using specialized input devices known as subcutaneous sensors, Sim One was able to interpret a learner’s actions and react accordingly. Although many felt Sim One was ahead of its time, it was not adopted because of its high cost and a reluctance among medical educators to adopt this newer method of instruction (Buck, 1991; Bradley, 2006; Cooper & Taqueti, 2008). It wasn’t until 1986 that another mannequin incorporating computer technology was introduced (Cooper & Taqueti, 2008; Rosen, 2008). Developed by Michael Gordon, the Harvey® Cardiology Patient Simulator was designed to train learners how to recognize common auscultatory cardiac findings (Gordon, 1974). Harvey® was capable of simulating a variety of conditions and symptoms associated with cardiac disorders (see Figure 8.1). Some notable features included chest wall movement, murmurs, breath sounds, pupillary responses, and cyanosis (Buck, 1991; Cooper & Taqueti, 2008; Gordon, 1974). Harvey® is also considered to be the earliest example of a part-task trainer (Cooper & Taqueti, 2008), in that it represents only the portions of the body needed for specific training applications (Bradley, 2006). Compared to other simulators, Harvey® has undergone some of the most extensive testing regarding its educational efficacy (Cooper & Taqueti, 2008), demonstrating enhanced performance in clinical practice for students who trained with this system compared to those who trained with traditional methods of instruction (Ewy et al., 1987). At present, there are numerous mannequins available from many vendors that address cardiopulmonary functioning for adults, children, and neonates. Mannequins are also available now for training procedures in emergency medicine, labor and delivery, nursing, respiratory therapy, and trauma, as well as basic anatomy.
FIGURE 8.1 The Harvey® cardiopulmonary patient simulator.
Virtual Reality Systems
Another form of simulation involves the use of virtual reality (VR), which has been described as computer-generated recreations of environments, objects, and people represented by avatars (Bradley, 2006). Early versions of VR systems for healthcare began to emerge in the 1990s. Joseph Rosen and Scott Delp created the first virtual representation of a lower limb used to practice tendon transplants (Delp et
al., 1990; Satava, 2001). Shortly thereafter, Dr. Richard Satava and Jaron Lanier created a virtual representation of organs in the upper abdomen (Satava, 1993). An important development in VR representations occurred in 1994, when researchers at the National Library of Medicine developed the Visible Human, an interactive and searchable database of a male and a female body created from CT, MRI, and phototomography scans of cadavers (Ackerman, 1998). This project is thought to be the basis for modern VR systems including much of the seminal work in surgical simulation (Satava, 2001). From the Visible Human Project data, Scott Delp created the Limb Trauma Simulator (Delp & Zajac, 1992; Satava, 2001).
Surgical Systems
The evolution of VR simulators in surgery was facilitated further by widespread adoption of the laparoscopic or minimally invasive method in the late 1980s. Unlike traditional open procedures, laparoscopic procedures are performed from outside the body. A miniature video camera and surgical instruments are inserted into the body through several small incisions. Laparoscopic procedures are particularly amenable to VR because the surgeon views the patient’s body cavity on a video display. Thus, developers could readily create computer-generated representations of anatomy, tissue, and laparoscopic instruments. One of the first VR systems for laparoscopic surgery was MIST VR (Sutton et al., 1997), which integrated a laparoscopic interface with a graphical 3D representation of laparoscopic instruments interacting with geometric shapes. Trainees performed basic psychomotor tasks representing the fundamental eye-hand coordinated movements needed for many laparoscopic manipulations (e.g., target acquisition, target transfer, traversal, target diathermy). Today, VR systems for laparoscopy often include similar psychomotor skill-building activities as well as whole-task training activities. 
For example, the LapSim® system from Surgical Science includes modules for appendectomy, hysterectomy, and laparoscopic cholecystectomy (gallbladder removal), among others. VR surgical training systems offer many advantages over other simulation techniques. First, they allow trainees to practice procedures repeatedly without cutting, altering, or destroying physical components, so they do not require replaceable parts. VR systems can also increase user engagement through immersion and presence (Heinrichs et al., 2013; Heinrichs et al., 2010). Second, they can provide real-time performance-based feedback tailored to a particular individual (Gaba, 2004; Satava, 2001). Third, the simulation software can be easily updated to mirror changes in healthcare policies and procedures (Dev, 2016). Last, many VR surgical simulators incorporate haptic force feedback systems that give users the sensation of probing or pulling on virtual tissue with simulated instruments. Laparoscopic and endoscopic procedures are good candidates for haptic force feedback because there is no direct contact with the patient’s organs and tissue. Instead, forces are felt indirectly through the laparoscopic instruments. This form of feedback is important for surgical simulation training, particularly among novices, when learning to apply the proper amount of force to manipulate but not damage tissue.
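To illustrate the idea of force feedback in general terms, a common simplification in haptic rendering treats virtual tissue as a linear spring: the deeper the instrument tip presses into the tissue surface, the larger the resisting force returned to the user's hand. The sketch below assumes that simple model. It is not the method used by LapSim® or any other system named in this chapter, and the stiffness, device limit, and damage threshold are hypothetical values.

```python
def tissue_reaction_force(penetration_mm: float,
                          stiffness_n_per_mm: float = 0.5,
                          max_device_force_n: float = 5.0) -> float:
    """Return the resisting force (N) for a given instrument penetration (mm).

    Linear spring model: force grows in proportion to penetration depth and is
    capped at the largest force the (hypothetical) haptic device can render.
    """
    if penetration_mm <= 0.0:
        return 0.0  # the instrument is not in contact with the tissue
    return min(stiffness_n_per_mm * penetration_mm, max_device_force_n)


def exceeds_safe_force(force_n: float, damage_threshold_n: float = 3.0) -> bool:
    """Flag forces above an assumed threshold at which tissue would be damaged."""
    return force_n > damage_threshold_n


# Example: no contact, a light probe, and a push that is too forceful.
for depth_mm in (0.0, 2.0, 8.0):
    force = tissue_reaction_force(depth_mm)
    print(depth_mm, force, exceeds_safe_force(force))
```

A simulator built along these lines could log each time the threshold is exceeded and report the count to the trainee as one form of performance-based feedback.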
Collaborative and Immersive Training Systems
Outside of surgery, VR has also been used for other forms of healthcare simulation. Many current systems now use immersive VR headsets and support wireless movement within patient rooms, hospital environments, or onsite first-responder care. Another form of simulation uses VR for collaborative training. These systems have been used to facilitate training by enabling individuals to connect from remote sites and interact in a virtual world that mimics a clinical environment (Gaba, 2004; Heinrichs et al., 2013; Heinrichs et al., 2010; Issenberg & Scalese, 2008; Liaw et al., 2021; Rosen, 2008). One early system used Second Life. Originally developed in 2003, Second Life allows users to create an avatar and interact with others via the Internet (Beard et al., 2009; Rosen, 2008). Medical school instructors became interested in Second Life as a training tool for students to collaboratively practice history-taking and other clinical skills (Beard et al., 2009; Singh et al., 2013). Beard and colleagues (2009) used a medical application in Second Life and obtained some evidence to suggest that knowledge and skills gained with this format transferred to clinical practice. Since Second Life, more advanced three-dimensional virtual worlds (3DVW) specific to healthcare are being used for simulation-based interprofessional collaborative training (Liaw et al., 2021). For example, simulation of an interprofessional clinical encounter in a 3DVW known as Create Real-time Experience and Teamwork in a Virtual Environment (CREATIVE) was shown to improve team performance and interprofessional attitudes, as well as foster a mutual understanding of patient-centered care and team members’ interprofessional roles (Liaw et al., 2019, 2020). Another promising VW being investigated for use in medical education, training, surgical simulation, and more is Facebook’s Metaverse (Mozumder et al., 2022). 
The Metaverse goes beyond traditional VR technologies by incorporating augmented reality (AR) and a hybrid of VR and AR, known as mixed reality (MR). Although it is still in its very early stages of use, the Metaverse is a promising approach to immersive and collaborative training. Researchers have also studied virtual human technology as a way to train students in diagnostic, reasoning, and communication skills before they interact with patients (Kleinsmith et al., 2015; Lane et al., 2013). For example, Kron et al. (2017) developed the MPathic-VR system using virtual humans to address intercultural and interprofessional communication (see Figure 8.2). In one scenario, the patient, a young woman, has been diagnosed with leukemia. The learner has to break the bad news to the patient’s mother, a woman from El Salvador whose cultural values differ from her daughter’s. In a follow-up scenario, the learner must engage in conflict resolution with the patient’s nurse who is upset because the learner failed to include her in the family meeting with the patient’s mother. Students who used MPathic-VR performed the communication scenarios, received feedback on their performance, and then repeated the scenarios to improve their scores. They were then given a transfer test where they applied their newly acquired skills in an objective structured clinical exam. The investigators showed that the students who trained with MPathic-VR scored higher than students trained with a conventional computer-based learning method. These findings suggest that training with virtual humans may provide a viable means to acquire communication and reasoning skills that transfer beyond the learning environment.
FIGURE 8.2 User listening to the patient and her mother in the MPathic-VR system.
On a grander scale, Scerbo and his colleagues (2007) at Old Dominion University and Eastern Virginia Medical School developed a fully immersive virtual operating room (VOR) to examine surgical team decision-making. The VOR environment and equipment were modeled on a standard OR configured for laparoscopic procedures and rendered in a 10 ft by 10 ft Cave automatic virtual environment (CAVE). They embedded a part-body mannequin and laparoscope to allow trainees to perform a simulated laparoscopic cholecystectomy. The VOR included a virtual attending surgeon, anesthetist, and circulating nurse. The virtual teammates were designed to interact with the trainee through speech scripts and movements based on the knowledge, personalities, and activities of genuine surgical team members. Figure 8.3 shows a trainee with a human scrub technician who hands the surgeon the operating instruments. In one scenario, in the middle of the procedure, the anesthetist alerts the trainee that the patient’s oxygen saturation level has dropped, and the trainee must troubleshoot the problem and decide whether to continue or abort the procedure. Thus, the VOR incorporates several forms of simulators and provides a more comprehensive OR environment to train procedural, communication, and social skills among surgical team members. Of course, one of the challenges with embedding humans in immersive virtual simulation is the division between physical and virtual elements. It is not possible to transition instruments from virtual to physical forms. However, Daher and colleagues (2020) developed a clever way to combine a human physical form with a virtual patient. They used a rear-projection AR system to represent the patient on a physical human shell. The patient lies on a table, but underneath is a set of shelves that house projectors, speakers, haptics, and heaters. 
Their physical-virtual patient can present multisensory symptoms such as changes in skin appearance, pulse, and temperature. Trainees can touch the physical-virtual patient, which can then initiate changes in facial expressions, speech, and underlying physiology (e.g., changes in eye movements and pupil size). The physical shell can also be changed to represent an adult or a child, male or female. The developers tested the system and found that most users were impressed with the fidelity and satisfied that critical cues needed for diagnoses were reasonably well rendered. Although the system is still constrained by numerous technical challenges, it does represent an important step forward in blurring the line between physical and virtual simulation.
FIGURE 8.3 Surgical resident and scrub technician interacting with the virtual attending surgeon in the virtual operating room with an embedded laparoscopic cholecystectomy simulator.
Standardized Patients
The most life-like form of simulation is the standardized patient (SP), sometimes referred to as a simulated patient. These are individuals trained to portray a patient in a standardized manner (Nestel et al., 2018). SPs are also used to perform two other important roles: assessing physician-patient interactions according to standardized criteria and providing immediate feedback to trainees about the quality of their interactions. The use of SPs was introduced in the US in 1964 by Howard Barrows for educating students about neurological examinations (Barrows, 1993; Rosen, 2008). SPs are now widely used for medical student assessments during their Objective Structured Clinical Examinations (OSCE; Nestel et al., 2018) and in the National Board of Medical Examiners United States Medical Licensing Examination Step 2 Clinical Skills for assessing clinical competency (Boulet et al., 2009). SPs bring a humanistic quality to simulation that is missing in many other forms of simulation (Nestel et al., 2018). However, there are also some important considerations and limitations that must be acknowledged. First, SPs are human and are susceptible to limitations of attention and working memory (Newlin-Canzone et al., 2013). Although SPs receive formal training to portray patients and provide feedback in a standardized manner, there can still be issues with the quality of the role-playing
or the feedback provided (Nestel et al., 2018). Second, because they are typically normal healthy individuals, they cannot simulate many underlying pathologies or conditions when examined physically. However, these limitations can be overcome to some extent with the use of moulage or appliances (see below). There are also ethical considerations regarding SPs themselves. Unlike devices, SPs are humans and come to the job with their own histories and experiences. For example, some simulations require SPs to partake in roles that emulate trauma (e.g., rape victim, alcoholic, etc.), which can elicit strong emotional reactions that can persist for several days (Nestel et al., 2018; Woodward, 1998). In other instances, SP roles may be played by a staff member of a medical institution which can elicit negative stereotypes among colleagues who associate that staff member with that role outside of the simulation and potentially harm his or her reputation (Nestel et al., 2018). Fortunately, there are methods that can help mitigate these concerns. The SP administrator should discuss the role requirements with the SP ahead of time to minimize the chance that the role will bring up any past trauma. The facilitator should encourage SPs to “de-role” before leaving the simulation site. There has been a growing movement among many in healthcare simulation to adopt policies for creating a psychologically safe environment for learning (a safe container) for trainees and staff (Rudolph et al., 2014).
Hybrid Systems
Hybrid systems incorporate at least two types of simulation, typically a physical and a virtual component (Satava, 2001). One of the first commercial examples is the VIST system (Mentice, Inc.) developed by Dawson and his colleagues (2000) for training interventional cardiology procedures. This system combined a physical representation of the body with simulated fluoroscopy and an interactive anatomical display. Trainees can choose among different catheters and guide wires, place them into the simulator, and monitor a virtual representation of the images used in interventional procedures. Some of the original hybrid systems augmented standardized patients with part-task trainers, such as a partial mannequin, to enable simultaneous practice of technical and nontechnical skills (Bradley, 2006). For example, a female standardized patient could place a pelvic trainer between her legs to train labor and delivery procedures. The virtual operating room described above integrated physical simulators into a virtual environment populated with virtual humans. Figure 8.4 shows another example of a hybrid system that uses a digital display, a physical model/appliance, and a human simulated patient for training ultrasonography. The learner places the sonography probe on any of the key locations on the abdominal appliance. As the learner moves the probe, fetal images derived from that location are updated on the display to indicate the correct or incorrect positioning and underlying anatomy. The simulated patient can ask questions of the learner (e.g., Can you tell if my baby is OK?). Thus, hybrid systems have the potential to expand training beyond specific procedures to incorporate a more holistic provider–patient experience.
FIGURE 8.4 A hybrid sonography system using a physical appliance, sonography display, and live model.
TRAINING
Methods for training with simulation have been well documented in the literature and in this volume (Farmer et al., 2003; Roscoe & Williges, 1980; Swezey & Andrews, 2001; Vincenzi et al., 2009). In many high-risk occupations (e.g., aviation, military operations, nuclear power plant operations, etc.), computer-based simulators have been a historical and fundamental component of training. However, adoption of simulation-based training in healthcare is a much more recent phenomenon. Historically, medical education and the allied health professions followed an apprenticeship model in which procedures were learned by the “see one, do one, teach one” approach. In fact, Dawson (2002) noted that this approach to medical education had not changed since ancient Egyptian times. As noted above, publication of the To Err Is Human report (Kohn et al., 2000) focused attention on a health system that was rife with error. McGaghie et al. (2020) have indicated several other important factors that have led to a shift in the landscape for health professions training. First, the traditional apprenticeship method produces different and often insufficient educational opportunities. In surgery, for example, expectations for competency across many different procedures were rarely achieved by the end of residency (Bell et al., 2009). Second, methods for evaluating learner performance are incomplete and unreliable, and the feedback provided does not always offer actionable and verifiable information for improvement. Third, a body of evidence shows that the traditional method of experience-based clinical education is not producing graduates who are adequately prepared for subsequent training in clinical practice. In light of recent concerns about the efficacy of the traditional approach to health professions training, McGaghie and his colleagues (2020) have advocated for an alternative that emphasizes simulation-based mastery learning. 
Their approach has its theoretical foundations in behavioral, constructivist, and social cognitive learning theory traditions. Fundamentally, mastery learning is a competency-based approach
to skill acquisition. Although there are variations, McGaghie (2020) has described seven critical features:
1) Begin by establishing baseline measures of a learner’s knowledge, skills, and abilities
2) Set and communicate measurable educational objectives in sequenced instructional units that progress in increasing levels of complexity
3) Prepare educational activities that meet the educational objectives
4) Establish minimal criteria for advancement and passing all educational units
5) Provide formative assessments, cognitive engagement, coaching, and feedback aimed at helping learners meet the educational objectives
6) Use evidence-based objective measures of knowledge, skills, and abilities for advancement through the instructional units
7) Engage in individualized, continual practice and assessment until mastery criteria are met
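The practice-until-mastery cycle in the features listed above can be illustrated with a minimal Python sketch. It is not drawn from McGaghie's work or from any cited study: the function names, the checklist scoring rule, the passing standard, and the example scores are all hypothetical and serve only to show the logic of a baseline assessment followed by repeated practice and reassessment until a minimum passing standard is met.

```python
from typing import Callable, List


def checklist_score(items_passed: List[bool]) -> float:
    """Percentage of checklist items performed correctly (hypothetical rule)."""
    if not items_passed:
        raise ValueError("checklist must contain at least one item")
    return 100.0 * sum(items_passed) / len(items_passed)


def train_to_mastery(
    assess: Callable[[int], float],
    passing_standard: float,
    max_rounds: int = 20,
) -> List[float]:
    """Return the score history from baseline until the standard is met.

    assess(0) is the baseline assessment; assess(n) for n >= 1 is the
    assessment that follows the n-th round of coached practice.
    Training stops once a score meets the passing standard or after
    max_rounds practice rounds, whichever comes first.
    """
    scores = [assess(0)]  # baseline measure before any practice
    while scores[-1] < passing_standard and len(scores) <= max_rounds:
        scores.append(assess(len(scores)))  # practice, then reassess
    return scores


# Hypothetical learner whose score improves with each practice round.
simulated_scores = [45.0, 70.0, 85.0]
history = train_to_mastery(lambda n: simulated_scores[n], passing_standard=80.0)
```

In this sketch the learner advances only when the most recent score reaches the standard, so the number of practice rounds varies by learner while the exit criterion is the same for everyone.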
The simulation-based mastery learning approach is beginning to be adopted in different areas of health professions training with promising results. Ahn and colleagues (2016) adopted this approach for training video laryngoscopy (i.e., intubating patients with a breathing tube) and found that once mastery criteria were reached, skills were retained over a 6-month interval. Schroedl et al. (2020) found that the final level of skill attained by residents trained with the mastery method for managing the mechanical ventilation of patients exceeded that of those trained under the standard approach by over 50%. Reed et al. (2016) trained medical students on 6 basic procedures using the simulation-based mastery learning approach and found that 98% of the students retained knowledge of those procedures over a 1–9-month interval at a level that met or exceeded the minimal passing standards. Collectively, these studies show that the simulation-based mastery learning approach results in better, more consistent, and longer retained levels of performance than the traditional apprenticeship approach.
TEAM TRAINING
The provision of healthcare is inherently a team-based activity. A variety of providers need to coordinate care to treat the same patient through the course of a single illness or even a straightforward appointment (Baker et al., 2005; Owen, 2016, 2018; Weller & Civil, 2018). However, the portion of medical or nursing curricula dedicated to team-based care is a fraction of what is spent on technical skills, if it is available at all (Baker et al., 2005; Moorthy et al., 2005; Weller & Civil, 2018; Yunoki & Sakai, 2018). The low priority given to team training is ironic when one considers that failures of technical skill are not necessarily the primary cause of errors (Chopra et al., 1992). Instead, a large number of preventable adverse events are attributed to failures of team-based care including teamwork and communication (Leape et al., 1991). Therefore, there is a need for more interdisciplinary and
multidisciplinary team training (Owen, 2016) and simulation has been instrumental in facilitating these efforts.
Benefits
Several benefits are seen when simulation-based team training is implemented in healthcare. An obvious benefit is that it can enhance teamwork and communication skills (Salas et al., 2006; Weller & Civil, 2018). Specifically, training with individuals from other disciplines can help team members to understand other members’ perspectives, contributing to a shared perspective among all team members (Gaba et al., 2001). Further, training as a team rather than as individuals is associated with a reduction in workload and improvement in performance on team tasks (Prichard et al., 2011). Additionally, studies that have explored the effectiveness of simulation for team training have found improvements in many areas including simulated case performance, teamwork in real clinical environments, attitudes toward safety, perceptions of clinical decision-making, and patient outcomes, among others (Weaver et al., 2014; Weller & Civil, 2018). These enhancements contribute further to a reduction in both patient morbidity and mortality, ultimately increasing patient safety (Salas et al., 2006).
History and Scope of Team Training in Healthcare
The origin of simulation-based team training in healthcare can be traced back to the Comprehensive Anesthesia Simulation Environment (CASE) developed by David Gaba and his colleagues in the 1980s (Gaba & DeAnda, 1988). These investigators took the principles of crew resource management (CRM), used in aviation to train pilots to improve their teamwork and reduce the likelihood of critical events (Fritz et al., 2008; Salas et al., 2005), and applied them to anesthesiology, creating the anesthesia crisis resource management (ACRM) program (Howard et al., 1992). The goal of ACRM was to improve clinical team-based training by teaching anesthetists and other team members effective communication, positive group dynamics, and personnel and resource utilization. Robert Helmreich, who worked with Gaba’s team, also helped generate a similar program for the operating room, Team Oriented Medical Simulation (Helmreich & Schaffer, 1994). Soon simulation-based team training began to emerge in other areas of healthcare including intensive care, pediatrics, emergency medicine, cardiology, labor and delivery, neonatology, and radiology (Fritz et al., 2008; Gaba, 2010; Rosen, 2008). Several teaching institutions have now adopted the ACRM curriculum and require both trainees and experienced providers to undergo yearly training (Gaba et al., 2001). Healthcare team training expanded rapidly with the introduction of TeamSTEPPS™ in the mid-2000s. This program was developed through a collaboration of the US Department of Defense and the Agency for Healthcare Research and Quality (AHRQ), an agency within the US Department of Health and Human Services. TeamSTEPPS™ is a set of evidence-based tools used to improve healthcare providers’ teamwork skills emphasizing leadership, situation monitoring, mutual support,
and communication. The program focuses on empowering healthcare professionals, patient families, and patients themselves to speak up whenever they have a significant concern to ensure the best possible quality of care. Research on the effectiveness of simulation for teaching TeamSTEPPS™ has primarily focused on interdisciplinary teams; however, it has also been independently applied to teams in obstetrics, pediatrics, intensive care, anesthesiology, nursing, surgery, and emergency medicine (AHRQ, 2015). Although most research has shown positive benefits of TeamSTEPPS™ training on team skills, a recent review indicates that the program has also led to improvements in patient safety, a reduction in medical errors, and increased patient satisfaction (Parker et al., 2019). Simulation-based team training has now become essential for several specialty areas of healthcare. One of the first areas outside of anesthesiology and surgery to begin using team training was emergency medicine. In this area, training is often focused on developing and enhancing general teamwork skills applicable to the variety of unique procedures and tasks these teams must perform (Weile et al., 2021). For example, it has been used to train effective team communication skills necessary for procedures such as resuscitation (Issenberg & Scalese, 2008) and to respond to trauma and cardiac arrest events (Weile et al., 2021). In addition, this method has been used to provide emergency medicine teams the opportunity to practice team skills that are critical for managing rare or unlikely situations, such as a mass casualty incident (Bracq et al., 2019; Fritz et al., 2008; Heinrichs et al., 2010). Simulation is also used in pediatric emergency medicine to train team skills. As in emergency medicine, this type of training is critical for trauma and cardiopulmonary resuscitation (Grant et al., 2016). 
Simulation-based training has been used to train pediatric teams in patient and family-centered care (e.g., breaking bad news and patient safety). Pediatric emergency medicine teams also undergo a type of training known as just-in-time (JIT) training, which provides pediatric teams the opportunity to rehearse a procedure via simulation before performing it on the patient. Another specialty area that utilizes simulation for team training is labor and delivery. Teams are trained to manage complications to the mother and the baby, such as shoulder dystocia (Crofts et al., 2016; Draycott et al., 2006; Maslovitz et al., 2007), postpartum hemorrhage (Birch et al., 2007; Draycott et al., 2006; Maslovitz et al., 2007; Riley et al., 2011), eclampsia (Draycott et al., 2006; Ellis et al., 2008), and breech delivery (Draycott et al., 2006; Maslovitz et al., 2007).
TRAINING TRANSFER
The incorporation of simulation as a major training method for healthcare providers has increased dramatically since the early 2000s and numerous studies have been published touting the benefits of this approach. Ultimately, however, it is important to establish the effectiveness of simulation-based training. In this regard, many researchers have turned to the Kirkpatrick (1994) model. According to Kirkpatrick, training can be evaluated at four levels that transition from the training event itself to its impact on the operational environment. At Level 1, a learner’s attitudes and opinions regarding the training event are measured. At Level 2, interest lies in measuring
the knowledge and skills a learner acquires from the training event. Level 3 targets transfer of training, measuring how knowledge and skills acquired from training affect performance back on the job. Last, at Level 4, interest lies with measuring specific work-related outcomes affected by the training. Another way to view the different levels is through the lens of translational science in a biomedical context (Dougherty & Conway, 2008; McGaghie et al., 2011a) in which treatments and solutions are evaluated first in the laboratory, then with patients, and finally within society as a whole. At present, much of the evidence supporting the effectiveness of simulation-based training is limited to Kirkpatrick levels 1 and 2, trainee attitudes and knowledge or skills measured in the simulated environment (Cook et al., 2011; McGaghie et al., 2011b; Paige et al., 2020). Initially, many studies of simulators were focused on student interest, enthusiasm, and confidence to garner growing instructional support for this alternative method of training (Cooper & Taqueti, 2008). Fourteen years later, however, Yunoki and Sakai (2018) concluded that simulation training in healthcare has helped increase learner confidence, but that whether it improves patient outcomes is still largely unknown. There continues to be a great need for research that goes beyond the lower Kirkpatrick levels and demonstrates the benefits of simulation-based training in clinical settings, on patient outcomes, and at the higher levels of organizational and public policy (McGaghie et al., 2011a; Palaganas et al., 2016). However, some encouraging examples do exist. One of the seminal studies to demonstrate the benefits of simulation training with genuine patients was conducted by Seymour and his colleagues (2002). These investigators sought to determine whether laparoscopic surgical skills acquired on the MIST VR system would transfer to genuine laparoscopic procedures performed in the OR. 
They compared the performance of residents assigned to the standard “apprenticeship” training condition and those who had practiced on MIST VR for 3–8 one-hour training sessions. Following training, all residents performed a procedure under the supervision of a surgeon and had videos of their performance recorded and assessed by surgeons who were blind to the conditions. The investigators found that residents who trained on MIST VR completed their surgeries in 29% less time and committed fewer errors than their counterparts in the standard training condition. Ultimately, this study showed that skills acquired by training on a laparoscopic VR simulator had positive benefits when transferred to the operating room. In a later study, Scerbo and colleagues (2006) compared two forms of simulation for training phlebotomy (i.e., drawing blood): a VR simulator and the more traditional approach using simulated limbs. They trained 20 medical students under one of the two methods and measured their performance with a 28-item checklist. The investigators found that performance improvements were limited to those who trained with the simulated limbs and not the VR system. A detailed comparison of the functional and physical characteristics of each simulation system revealed important differences which ultimately led the researchers to conclude that training with both systems might provide complementary benefits. One of the more compelling examples of successful transfer concerns simulation-based training for central venous catheter (CVC) insertion. A CVC is typically placed
into a large vein (e.g., internal jugular or subclavian vein) when larger volumes of fluid need to be infused than can be accommodated by smaller needles (e.g., hemodialysis). If not done properly, it can cause damage to the central veins, pulmonary or cardiac complications, and central line-associated bloodstream infections (which have an estimated mortality rate of 12% to 25%; Patel et al., 2019). Barsuk and his colleagues (2009) developed a training program for CVC placement following the mastery training model described above. In an initial study, a sample of residents was given a pretest, training, and a posttest, and performance was assessed with a 27-item checklist. The pretest results showed that mean performance scores fell under 50%. The residents were then given intensive training tailored to their specific needs/learning objectives, were coached, and were required to engage in repetitive simulation-based practice until they met minimum passing standards (80%). Posttest results showed that mean performance scores exceeded the minimum criterion. These investigators conducted a follow-up study in which they compared the performance of residents trained with the standard apprenticeship model to another group trained with the mastery approach (Barsuk et al., 2009). Both groups of residents were then monitored when they attempted CVC placements with genuine patients. The results showed that those who had received simulation-based mastery training needed significantly fewer attempts for successful placement and were less likely to fail in their attempts or require their catheters to be readjusted after placement. More important, in another study, Barsuk and his colleagues (2009) showed that patients who received CVCs from simulation-based mastery-trained residents were significantly less likely to develop central line-associated bloodstream infections. 
Moreover, a recent meta-analysis indicates that simulation-based training for CVC placement resulted in higher levels of procedural success (Madenci et al., 2014). Other evidence for transfer has been demonstrated with team training. In pediatric emergency medicine, Falcone et al. (2008) showed that simulated pediatric trauma team training with an emphasis on teamwork and communication resulted in improvements in trauma care related to initial assessments, airway management, cervical spine care, and pelvic fracture recognition and management. In another study, Andreatta et al. (2011) found a significant positive correlation between simulation-based mock code team training and pediatric cardiopulmonary arrest survival rates. Survival rates increased 33% after the initial training and nearly 50% one year after routine training had been implemented. Additionally, a recent review of the effectiveness of simulation-based neonatal and pediatric resuscitation team training found evidence that it improves team and technical performance for at least 6 months post-training (Lindhard et al., 2021). Benefits of simulation-based team training have also been observed in the area of labor and delivery with promising evidence for improved patient outcomes. Phipps et al. (2012) prospectively evaluated patient outcomes after labor and delivery teams received simulation team training based on principles of CRM and reported that the labor and delivery unit’s adverse outcome index (AOI) decreased significantly after the training intervention. Another prospective study by Riley et al. (2011) found a 37% decrease in perinatal mortality after hospital staff received team training
on simulated obstetrical emergency scenarios. Some researchers have found that simulation-based team training resulted in a decrease in neonatal injuries at birth (Draycott et al., 2008; Crofts et al., 2016) or a reduction in the number of infants born with low 5-minute Apgar scores of 6 or less and hypoxic-ischemic encephalopathy (HIE) following the training. In a similar study, Siassakos et al. (2009) reviewed patient outcomes before and after staff completed simulation-based team training and found improved efficiency and a significant increase in the number of maneuvers teams used to alleviate cord compression. Fransen and colleagues (2017) also found that team training reduced neonatal shoulder injury and increased invasive treatment for severe postpartum hemorrhage. Further, in a recent review, Yucel et al. (2020) reported that simulation-based team training in obstetrics reduced neonatal injuries, cesarean sections, and transfusions and increased the use of maneuvers for managing shoulder dystocia and cord compression. However, others have reported that the benefits are not necessarily observed much beyond the immediate period after training (van de Ven et al., 2017). Collectively, the work of Barsuk, his colleagues, and others shows that skills acquired from a simulation-based curriculum following the mastery training model can transfer from the training environment (Kirkpatrick levels 1 and 2) to improved patient care (Kirkpatrick level 3) and to a reduction in adverse outcomes, i.e., infections (Kirkpatrick level 4). Although the results of studies such as these are encouraging, measures of transfer used in other disciplines are not often seen in healthcare. For example, in the aviation community, there is a long history of measuring the amount of transfer from training to actual practice (Roscoe, 1980). 
Transfer is measured by calculating the difference in time between training under normal conditions and training with a new technique (i.e., simulation). Transfer is said to be positive if simulation training is more efficient than training under standard conditions. On the other hand, negative transfer would indicate that simulation training is less efficient than the standard. Povenmire and Roscoe (1973) suggested that transfer be measured with the transfer effectiveness ratio (TER), a ratio of the time saved in training to the time spent in simulation. TER values greater than 1 indicate that simulation training is efficient while values less than 1 show that simulation training introduces inefficiency. There have not been many studies reporting TERs for simulation in healthcare. Aggarwal and his colleagues (2007) reported the first TER for a laparoscopic VR simulator. They compared novice surgeons who practiced with the VR simulator to a control group that did not practice. Both groups were then assessed on their ability to perform a laparoscopic cholecystectomy on a set of 5 pig cadavers. The researchers estimated the TER to be 2.28, showing that their VR simulation training was very efficient. More recently, Lohre and colleagues (2020) determined the TER for training with a VR system addressing an orthopedic procedure (reverse shoulder arthroplasty) for a rotator cuff tear. Residents who used the VR system were compared to those who viewed a video. Post-training performance was measured on a cadaver. The investigators reported that VR training resulted in a TER that approached 1 (TER = 0.79) and determined that 1 hour of simulation training was equivalent to 48 minutes of real-world training. It should be noted that TER values
less than 1 do not necessarily mean that training with the simulator is inappropriate. In healthcare, TER values less than 1 may be entirely acceptable given safety concerns when transitioning to patient care. Although there continues to be a need for more research demonstrating the effectiveness of simulation training, particularly at the higher Kirkpatrick levels, the lack of evidence at those levels must be considered in the context of some real challenges investigators face when gathering data from genuine patient care settings. First, it is difficult to control for the wide variety of individual differences among patients and all of their potential comorbidities. Thus, treatments can vary considerably for patients who present with the same complaint/condition. Second, there are federal laws that govern patient information. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) was established to protect against the disclosure of sensitive patient health information without a patient’s consent. Thus, data gathered from genuine patients must comply with human subjects/ethical review and HIPAA rules. Third, providers treat patients in many places where training and education are not a primary emphasis. Fourth, even in facilities where training is an important goal, it takes a significant administrative effort to observe and evaluate trainees under rigorous experimental conditions (e.g., holding constant opportunities to perform a procedure, the evaluators, patient conditions, case load, shift time, fatigue, etc.). Last, in a recent review, Paige et al. (2020) noted that many educators who use simulation are themselves focused more on outcomes measured in the training environment than in clinical practice.
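As a minimal sketch of the transfer effectiveness ratio discussed earlier in this section, the Python fragment below computes the TER as the time saved on the real task divided by the time spent in the simulator. The function names and the example figures are hypothetical; they are not taken from any of the studies cited here, and the sketch assumes that all times are expressed in the same unit.

```python
def transfer_effectiveness_ratio(control_time, trained_time, simulator_time):
    """TER = (time saved on the real task) / (time spent in the simulator).

    control_time:   real-task training time needed without simulation
    trained_time:   real-task training time still needed after simulation
    simulator_time: time spent training in the simulator
    All three values must use the same unit (e.g., hours).
    """
    if simulator_time <= 0:
        raise ValueError("simulator_time must be positive")
    return (control_time - trained_time) / simulator_time


def real_task_time_saved(ter, simulator_time):
    """Real-task training time replaced by a given amount of simulator time."""
    return ter * simulator_time


# Hypothetical example: 10 h to reach criterion without simulation,
# 6 h to reach criterion after 5 h of practice in the simulator.
example_ter = transfer_effectiveness_ratio(10.0, 6.0, 5.0)
```

Under these assumed figures the TER is 0.8, so each hour in the simulator replaces about 48 minutes of training on the real task. A negative value would arise if the simulator-trained group needed more real-task time than the control group.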
HEALTHCARE SIMULATION AND THE PANDEMIC
On March 11, 2020, the World Health Organization declared Covid-19 a pandemic. The virus spread rapidly from country to country with rising infection rates and loss of life. The need to respond to overwhelming numbers of infected patients and keep frontline providers safe became paramount in all hospitals and healthcare facilities. Perhaps no other event in history generated a greater demand for healthcare simulation methods, centers, and personnel. The rapid rise in cases and hospitalizations placed a tremendous strain on the healthcare system, requiring the deployment of new personnel into acute care settings, development of new treatment spaces, and creative methods of dealing with equipment shortages. One of the immediate needs was to train staff to handle large numbers of Covid-19 patients while minimizing the risk of exposure and spread within hospitals. Simulation was utilized initially to familiarize personnel in critical care units (intensive care units, emergency rooms) with basic Covid-19 care principles and then to address non-critical care staff from other specialties being deployed in Covid-19 care units. Mannequin simulators were used to train for respiratory failure, circulatory failure, bedside procedures, and Covid-19 patient care. In many instances, training was carried out on a massive scale. Delamarre et al. (2022) describe their training efforts to use in situ simulation and peer-teaching for how to don and doff
personal protective equipment and manage airways in simulated infected patients. They report training over 1,600 healthcare workers in 99 sessions over 11 days! Zucco and colleagues (in press) also used just-in-time, in situ simulation training aimed at workflow changes among anesthesia, nursing, and surgery staff in their healthcare network over 3 weeks. They evaluated the impact of their training by measuring compliance with the new Covid-19 workflows for cases of confirmed or suspected Covid-19. Their results showed 95% compliance with new Covid-19 workflow protocols accompanied by lower-than-expected positive test rates among staff. Other institutions relied on VR simulation and immersive training on smartphones or VR headsets to teach Covid-related knowledge and skills, such as proper PPE use, infection control measures, how to use ventilators, and approaches to end-of-life conversations (Liaw et al., 2021; Young & Aquilina, 2021). The ability to provide this training remotely in addition to its scalability enabled a variety of individuals to participate, including practicing professionals, retirees, medical students, and volunteers, all while protected from potential exposure to Covid-19. In a randomized controlled trial, individuals who underwent this training demonstrated significant improvements in their Covid-related knowledge as well as a reduction in Covid-related anxiety and stress compared with a control group that received traditional training methods. Collectively, the results from these studies highlight the feasibility of using a variety of forms of simulation to quickly train large numbers of employees to care for Covid-19 patients with minimal risk of exposure. Other researchers report using in situ simulation to identify latent safety threats (LSTs). Sharara-Chami et al. 
(2020) conducted 15 simulations over 2 weeks and uncovered LSTs tied to inadequate preparedness for infection control, uncertainty about procedural guidelines and protocols, and poor communication. Balmaks and colleagues (2021) also used in situ simulation, together with healthcare failure mode and effects analysis, to develop an action plan for mitigating occupational hazards and spread of the virus. They uncovered several organizational, individual, and environmental issues related to a lack of clear guidelines and policies, noncompliance with policies and procedures, the flow of patient traffic, as well as maintenance, cleaning, and information availability for staff and patients. The analysis enabled them to take action to control many of the failure modes identified and develop a nationally approved set of recommendations for donning and doffing.
In the spring of 2020, the pandemic nearly exhausted medical equipment, supplies, and even oxygen. Among the more serious problems faced by many hospital systems was a shortage of ventilators. Burnett et al. (2021) turned to high-fidelity simulation to evaluate potential solutions to the problem of ventilator shortages by testing designs for safely splitting ventilators, repurposing non-invasive ventilators as invasive ventilators, and expanding the ability to test new ventilator designs. Simulation facilitated all of this testing of devices and protocols before there was any need to use them on actual patients.
Another area where simulation proved invaluable was studying the potential risk of exposure to the virus by staff who needed to perform aerosol-generating procedures (e.g., endotracheal intubation) in Covid-19 patients. Using simulation, Shavit and colleagues (2020) developed a method to evaluate the suitability and
acceptability of an alternative biological isolation gown to be used by emergency department staff. They used a non-visible fluorescent marker as an indicator of contamination and had participants perform airway management procedures following Covid-19 protocols. They then examined the garments after doffing with an ultraviolet light to visualize potential contamination and found that no markers were visualized on any of the participants, suggesting that the gown was a suitable alternative. Oman et al. (2023) used simulation to evaluate devices designed to mitigate the spread of aerosol and droplet-sized particles. These researchers discovered that an acrylic box and a plastic drape designed as barriers did not mitigate particle spread, but that the drape increased the time needed to perform the procedure. In another unique demonstration, Lampotang and his colleagues (2022) used simulation to evaluate a method used in low-resource areas to conserve oxygen. They simulated an adult Covid-19 patient with hypoxemia and examined whether caregivers could interrupt the flow of oxygen by manually crimping the oxygen tubing with pliers during exhalation. They discovered that pinching the tubing reduced oxygen use by over 50% without a significant drop in simulated oxygen saturation. These studies show the benefit of using simulation to examine novel ideas prior to implementation with patients.
CONCLUSION
In 2004, Gaba offered some predictions for simulation in healthcare. He described a future where simulation training is not only a requirement but is the driving force behind changes to healthcare curricula, where patients demand a level of safety comparable to aviation, where simulation-based standards for training are required by regulatory agencies, and where evidence for the effectiveness of medical devices is gathered in trials using simulation. Where are we today? Although we have not yet realized Gaba’s future, much progress has been made. In the last 20 years, simulation has gone from being a novelty to an accepted, and even expected and necessary, method of training and education in healthcare. One of the first specialty areas to adopt and promote simulation was anesthesiology, and the importance of simulation was underscored when the American Board of Anesthesiology began requiring providers to demonstrate communication and technical skills in simulation scenarios as part of their Maintenance of Certification in Anesthesiology program (American Board of Anesthesiology, 2021) (https://theaba.org/staged%20exams.html). Professional societies have also emerged that bring together educators, practitioners, and researchers concerned with simulation (e.g., the Society in Europe for Simulation Applied to Medicine and the Society for Simulation in Healthcare in the US, which in 2006 established Simulation in Healthcare, the first peer-reviewed journal dedicated solely to simulation). The International Nursing Association for Clinical Simulation and Learning published standards for simulation performance in 2010 (INACSL, 2021). Centers dedicated solely to simulation-based training and education are now commonplace. The Society for Simulation in Healthcare offers accreditation of simulation center programs, with over 100 centers in 10 countries receiving full accreditation since 2010 (SSH, 2021). Moreover, the
technology supporting training and education in healthcare simulation has become more complex and diverse to address a wider range of specialty areas, clinical procedures and activities, as well as differences among patients.
However, there are still many challenges that lie ahead for simulation to reach the vision that Gaba described. More research is needed to show the benefits of simulation training at the highest Kirkpatrick levels. Although simulation-based mastery training described by McGaghie, Barsuk, and Wayne (2020) shows much promise for improving learner performance and safety for patients, it is not the norm. There is also a need to establish standards for training transfer like those used in aviation and other high-risk disciplines that rely on simulation training. Last, there are costs to consider. Acquiring simulation equipment, creating a center and maintaining a staff, and pulling providers from clinical duties to engage in training require significant ongoing investment and administrative support.
In spite of these challenges, simulation has clearly had a transformational effect on the training and education of healthcare providers. For the first time in history, it has afforded the opportunity for objective performance assessment with standardized metrics in a safe and controlled environment. Simulation has been shown to increase proficiency and reduce the performance variance and errors of providers. Moreover, lessons learned from the pandemic showed that simulation could train and prepare legions of healthcare providers to treat Covid-19 patients while minimizing their own risk of exposure. Finally, data have begun to show that patients treated by simulation-trained providers may be at a lower risk for harm. The recommendation made 20 years ago by the US Institute of Medicine (Kohn et al., 2000) to use simulation for training has indeed begun to make the US healthcare system safer.
REFERENCES
Abrahamson, S., Denson, J. S., & Wolf, R. M. (1969). Effectiveness of a simulator in training anesthesiology residents. Journal of Medical Education, 44(6), 515–519.
Accreditation Council for Graduate Medical Education (ACGME). (2020). ACGME Program Requirements for Graduate Medical Education in Anesthesiology. Retrieved from https://www.acgme.org/Portals/0/PFAssets/ProgramRequirements/040_Anesthesiology_2020.pdf?ver=2020-06-18-132902-423
Ackerman, M. J. (1998). The visible human project. Proceedings of the IEEE, 86(3), 504–511. http://doi.org/10.1109/5.662875
Agency for Healthcare Research and Quality (AHRQ). (2015). TeamSTEPPS®: Research/Evidence Base. Retrieved from https://www.ahrq.gov/teamstepps/evidence-base/simulation.html
Aggarwal, R., Ward, J., Balasundaram, I., Sains, P., Athanasiou, T., & Darzi, A. (2007). Proving the effectiveness of virtual reality simulation for training in laparoscopic surgery. Annals of Surgery, 246, 771–779. http://doi.org/10.1097/SLA.0b013e3180f61b09
Ahn, J., Yashar, M. D., Novack, J., Davidson, J., Lapin, B., Ocampo, J., & Wang, E. (2016). Mastery learning of video laryngoscopy using the glidescope in the emergency department. Simulation in Healthcare, 11(5), 309–315. http://doi.org/10.1097/SIH.0000000000000164
American Board of Anesthesiology. Retrieved on April 28, 2021 from https://theaba.org/staged%20exams.html
Andreatta, P., Saxton, E., Thompson, M., & Annich, G. (2011). Simulation-based mock codes significantly correlate with improved pediatric patient cardiopulmonary arrest survival rates. Pediatric Critical Care Medicine, 12(1), 33–38. http://doi.org/10.1097/PCC.0b013e3181e89270
Baker, D. P., Gustafson, S., Beaubien, J. M., Salas, E., & Barach, P. (2005). Medical team training programs in health care. In K. Henriksen, J. B. Battles, E. S. Marks, & D. I. Lewin (Eds.), Advances in Patient Safety: From Research to Implementation, 4 (pp. 253–267). Rockville, MD: AHRQ Publication.
Balmaks, R., Grāmatniece, A., Aija, V., et al. (2021). A simulation-based failure mode analysis of SARS-CoV-2 infection control and prevention in emergency departments. Simulation in Healthcare, 16, 386–391.
Barrows, H. S. (1993). An overview of the uses of standardized patients for teaching and evaluating clinical skills. Academic Medicine, 68(6), 443–453. http://doi.org/10.1097/00001888-199306000-00002
Barsuk, J. H., Cohen, E. R., Feinglass, J., McGaghie, W. C., & Wayne, D. B. (2009). Use of simulation-based education to reduce catheter-related bloodstream infections. Archives of Internal Medicine, 169(15), 1420–1423. http://doi.org/10.1001/archinternmed.2009.215
Barsuk, J. H., McGaghie, W. C., Cohen, E. R., Balachandran, J. S., & Wayne, D. B. (2009). Use of simulation-based mastery learning to improve the quality of central venous catheter placement in a medical intensive care unit. Journal of Hospital Medicine, 4(7), 397–403. http://doi.org/10.1002/jhm.468
Barsuk, J. H., McGaghie, W. C., Cohen, E. R., O’Leary, K. J., & Wayne, D. B. (2009). Simulation-based mastery learning reduces complications during central venous catheter insertion in a medical intensive care unit. Critical Care Medicine, 37(10), 2697–2701. http://doi.org/10.1097/CCM.0b013e3181a57bc1
Beard, L., Wilson, K., Morra, D., & Keelan, J. (2009). A survey of health-related activities on second life. Journal of Medical Internet Research, 11(2), e17. 1–19. http://doi.org/10.2196/jmir.1192
Bell, R. H., Biester, T. W., Tabuenca, A., Rhodes, R. S., Cofer, J. B., Britt, L. D., & Lewis, F. R. (2009). Operative experience of residents in US general surgery programs: A gap between expectation and experience. Annals of Surgery, 249(5), 719–724. http://doi.org/10.1097/SLA.0b013e3181a38e59
Birch, L., Jones, N., Doyle, P. M., Green, P., McLaughlin, A., Champney, C., … Taylor, K. (2007). Obstetric skills drills: Evaluation of teaching methods. Nurse Education Today, 27, 915–922. https://doi.org/10.1016/j.nedt.2007.01.006
Boulet, J. R., Smee, S. M., Dillon, G. F., & Gimpel, J. R. (2009). The use of standardized patient assessments for certification and licensure decisions. Simulation in Healthcare, 4(1), 35–42. http://doi.org/10.1097/SIH.0b013e318182fc6c
Bracq, M. S., Michinov, E., & Jannin, P. (2019). Virtual reality simulation in nontechnical skills training for healthcare professionals: A systematic review. Simulation in Healthcare, 14(3), 188–194. http://doi.org/10.1097/SIH.0000000000000347
Bradley, P. (2006). The history of simulation in medical education and possible future directions. Medical Education, 40(3), 254–262. http://doi.org/10.1111/j.1365-2929.2006.02394.x
Buck, G. H. (1991). Development of simulators in medical education. Gesnerus, 48(1), 7–28. https://doi.org/10.1163/22977953-04801002
Burnett, G., Shah, A., Fried, E. A., et al. (2021). Using simulation to develop solutions for ventilator shortages from the epicenter. Simulation in Healthcare, 16, 78–79.
Cartwright, M. S., Reynolds, P. S., Rodriguez, Z. M., Breyer, W. A., & Cruz, J. M. (2005). Lumbar puncture experience among medical school graduates: The need for formal
procedural skills training. Medical Education, 39(4), 437–437. http://doi.org/10.1111/j.1365-2929.2005.02118.x
Chopra, V., Bovill, J. G., Spierdijk, J., & Koornneef, F. (1992). Reported significant observations during anaesthesia: A prospective analysis over an 18-month period. British Journal of Anaesthesia, 68(1), 13–17. https://doi.org/10.1093/bja/68.1.13
Cook, D. A., Hatala, R., Brydges, R., et al. (2011). Technology-enhanced simulation for health professions education: A systematic review and meta-analysis. JAMA, 306(9), 978–988. http://doi.org/10.1001/jama.2011.1234
Cooper, J. B., & Taqueti, V. (2008). A brief history of the development of mannequin simulators for clinical education and training. Postgraduate Medical Journal, 84(997), 563–570. http://doi.org/10.1136/qshc.2004.009886
Crofts, J. F., Lenguerrand, E., Bentham, G. L., et al. (2016). Prevention of brachial plexus injury-12 years of shoulder dystocia training: An interrupted time-series study. British Journal of Obstetrics & Gynecology, 123(1), 111–118. http://doi.org/10.1111/1471-0528.13302
Daher, S., Hochreiter, J., Schubert, R., Gonzalez, L., Cendan, J., Anderson, M., Diaz, D. A., & Welch, G. F. (2020). The physical-virtual patient simulator: A physical human form with virtual appearance and behavior. Simulation in Healthcare, 15(2), 115–121. http://doi.org/10.1097/SIH.0000000000000409
Dawson, S. L. (2002). A critical approach to medical simulation. Bulletin of the American College of Surgeons, 87(11), 12–18.
Dawson, S. L., Cotin, S., Meglan, D., Shaffer, D., & Ferrell, M. (2000). Designing a computer-based simulator for interventional cardiology training (with editorial comment). Catheterization and Cardiovascular Interventions, 51(4), 522–528. http://doi.org/10.1002/1522-726x(200012)51:43.0.co;2-7
Delamarre, L., Couarraze, S., Vardon-Bounes, F., et al. (2022). Mass training in situ during COVID-19 pandemic: Enhancing efficiency and minimizing sick leaves. Simulation in Healthcare, 17, 42–48.
Delp, S. L., Loan, J. P., Hoy, M. G., Zajac, F. E., Topp, E. L., & Rosen, J. M. (1990). An interactive graphics-based model of the lower extremity to study orthopaedic surgical procedures. IEEE Transactions on Biomedical Engineering, 37(8), 757–767. http://doi.org/10.1109/10.102791
Delp, S. L., & Zajac, F. E. (1992). Force- and moment-generating capacity of lower-extremity muscles before and after tendon lengthening. Clinical Orthopaedics and Related Research, 284, 247–259.
Dev, P. (2016). Simulation: A view into the future of education. In C. A. Weaver, M. J. Ball, G. R. Kim, & J. M. Kiel (Eds.), Healthcare Information Management Systems (pp. 317–329). http://doi.org/10.1007/978-3-319-20765-0
Dougherty, D., & Conway, P. H. (2008). The “3T’s” road map to transform US health care: The “how” of high-quality care. JAMA, 299(19), 2319–2321. http://doi.org/10.1001/jama.299.19.2319
Draycott, T. J., Crofts, J. F., Ash, J. P., et al. (2008). Improving neonatal outcome through practical shoulder dystocia training. Obstetrics & Gynecology, 112(1), 14–20. http://doi.org/10.1097/AOG.0b013e31817bbc61
Draycott, T., Sibanda, T., Owen, L., Akande, V., Winter, C., Reading, S., & Whitelaw, A. (2006). Does training in obstetric emergencies improve neonatal outcome? BJOG: An International Journal of Obstetrics & Gynaecology, 113(2), 177–182. https://doi.org/10.1111/j.1471-0528.2006.00800.x
Ellis, D., Crofts, J. F., Hunt, L. P., Read, M., Fox, R., & James, M. (2008). Hospital, simulation center, and teamwork training for eclampsia management: A randomized
controlled trial. Obstetrics & Gynecology, 111(3), 723–731. http://doi.org/10.1097/AOG.0b013e3181637a82
Ewy, G. A., Felner, J. M., Juul, D., Mayer, J. W., Sajid, A. W., & Waugh, R. A. (1987). Test of a cardiology patient simulator with students in fourth-year electives. Journal of Medical Education, 62(9), 738–743.
Falcone Jr., R. A., Daugherty, M., Schweer, L., Patterson, M., Brown, R. L., & Garcia, V. F. (2008). Multidisciplinary pediatric trauma team training using high-fidelity trauma simulation. Journal of Pediatric Surgery, 43(6), 1065–1071. https://doi.org/10.1016/j.jpedsurg.2008.02.033
Farmer, E., van Rooij, J., Riemersma, J., Jorna, P., & Moraal, J. (2003). Handbook of Simulation-Based Training. Burlington, VT: Ashgate.
Forrest, K. (2019). What is simulation education. In K. Forrest & J. McKimm (Eds.), Healthcare Simulation at a Glance (pp. 4–5). Hoboken, NJ: John Wiley & Sons Ltd.
Fransen, A. F., van de Ven, J., Schuit, E., van Tetering, A., Mol, B. W., & Oei, S. G. (2017). Simulation-based team training for multi-professional obstetric care teams to improve patient outcome: A multicentre, cluster randomized controlled trial. British Journal of Obstetrics & Gynecology, 124(4), 641–650. http://doi.org/10.1111/1471-0528.14369
Fritz, P. Z., Gray, T., & Flanagan, B. (2008). Review of mannequin-based high-fidelity simulation in emergency medicine. Emergency Medicine Australasia, 20(1), 1–9. https://doi.org/10.1111/j.1742-6723.2007.01022.x
Gaba, D. M. (2004). The future vision of simulation in healthcare. Quality and Safety in Health Care, 13(Suppl 1), i2–i10. http://doi.org/10.1097/01.SIH.0000258411.38212.32
Gaba, D. M. (2010). Crisis resource management and teamwork training in anaesthesia. British Journal of Anaesthesia, 105, 3–6. https://doi.org/10.1093/bja/aeq124
Gaba, D. M., & DeAnda, A. (1988). A comprehensive anesthesia simulation environment: Re-creating the operating room for research and training. Anesthesiology, 69(3), 387–394.
Gaba, D. M., Howard, S. K., Fish, K. J., Smith, B. E., & Sowb, Y. A. (2001). Simulation-based training in anesthesia crisis resource management (ACRM): A decade of experience. Simulation & Gaming, 32(2), 175–193. https://doi.org/10.1177/104687810103200206
Gordon, M. S. (1974). Cardiology patient simulator: Development of an animated manikin to teach cardiovascular disease. The American Journal of Cardiology, 34(3), 350–355. https://doi.org/10.1016/0002-9149(74)90038-1
Grant, V. J., Wolff, M., & Adler, M. (2016). The past, present, and future of simulation-based education for pediatric emergency medicine. Clinical Pediatric Emergency Medicine, 17(3), 159–168. https://doi.org/10.1016/j.cpem.2016.05.005
Heinrichs, L., Fellander-Tsai, L., & Davies, D. (2013). Clinical virtual worlds: The wider implications for professionals. In K. Bredl & W. Bösche (Eds.), Serious Games and Virtual Worlds in Education, Professional Development, and Healthcare (pp. 817–836). Hershey, PA: IGI Global.
Heinrichs, W. L., Youngblood, P., Harter, P., Kusumoto, L., & Dev, P. (2010). Training healthcare personnel for mass-casualty incidents in a virtual emergency department: VED II. Prehospital and Disaster Medicine, 25(5), 424–432. http://doi.org/10.1017/S1049023X00008505
Helmreich, R. L., & Schaefer, H. G. (1994). Team performance in the operating room. In M. S. Bogner (Ed.), Human Error in Medicine (pp. 225–253). Boca Raton: CRC Press.
Howard, S. K., Gaba, D. M., Fish, K. J., Yang, G., & Sarnquist, F. H. (1992). Anesthesia crisis resource management training: Teaching anesthesiologists to handle critical incidents. Aviation, Space, and Environmental Medicine, 63(9), 763–770.
INACSL Standards of Best Practice: Simulation. Retrieved on April 28, 2021 from https://www.inacsl.org/inacsl-standards-of-best-practice-simulation/history-of-the-inacsl-standards-of-best-practice-simulation/
Issenberg, S. B., & Scalese, R. J. (2008). Simulation in health care education. Perspectives in Biology and Medicine, 51(1), 31–46. http://doi.org/10.1353/pbm.2008.0004
Jones, F., Passos-Neto, C. E., & Braghiroli, O. F. M. (2015). Simulation in medical education: Brief history and methodology. Principles and Practice of Clinical Research, 1(2), 46–54. http://doi.org/10.21801/ppcrj.2015.12.8
Kirkpatrick, D. (1994). Evaluating Training Programmes: The Four Levels. San Francisco, CA: Berrett-Koehler Publishers.
Kleinsmith, A., Rivera Gutierrez, D., Finney, G., Cendan, J., & Lok, B. (2015). Understanding empathy training with virtual patients. Computers in Human Behavior, 52, 151–158. https://doi.org/10.1016/j.chb.2015.05.033
Kohn, L., Corrigan, J., & Donaldson, M. (2000). To Err is Human: Building a Safer Health System. Washington, DC: National Academy Press.
Kron, F. W., Fetters, M. D., Scerbo, M. W., White, C. B., Lypson, M. L., Padilla, M. A., … Becker, D. M. (2017). Using a computer simulation for teaching communication skills: A blinded multisite mixed methods randomized controlled trial. Patient Education and Counseling, 100(4), 748–759. https://doi.org/10.1016/j.pec.2016.10.024
Laerdal Medical Corp. (2014). Pricelist (p. 20). Retrieved from https://www.ogs.state.ny.us/purchase/spg/pdfdocs/3823219745PL_Laerdal.pdf
Lampotang, S., DeStephens, A., Zarour, I., et al. (2022). Manual conservation of supplemental oxygen in low-resource settings during the COVID-19 pandemic. Simulation in Healthcare, 17, 95–97.
Lane, H. C., Hays, M. J., Core, M. G., & Auerbach, D. (2013). Learning intercultural communication skills with virtual humans: Feedback and fidelity. Journal of Educational Psychology, 105(4), 1026–1035. https://doi.org/10.1037/a0031506
Leape, L. L., Brennan, T. A., Laird, N., Lawthers, A. G., Localio, A. R., Barnes, B. A., … Hiatt, H. (1991). The nature of adverse events in hospitalized patients: Results of the Harvard medical practice study II. New England Journal of Medicine, 324(6), 377–384. http://doi.org/10.1056/NEJM199102073240605
Liaw, S. Y., Choo, T., Wu, L. T., Lim, W. S., Choo, H., Lim, S. M., … Lau, T. C. (2021). “Wow, woo, win”-Healthcare students’ and facilitators’ experiences of interprofessional simulation in three-dimensional virtual world: A qualitative evaluation study. Nurse Education Today, 105, 1–6. http://doi.org/10.1016/j.nedt.2021.105018
Liaw, S. Y., Wu, L. T., Soh, S. L. H., Ringsted, C., Lau, T. C., & Lim, W. S. (2020). Virtual reality simulation in interprofessional round training for health care students: A qualitative evaluation study. Clinical Simulation in Nursing, 45, 42–46. http://doi.org/10.1016/j.ecns.2020.03.013
Liaw, S. Y., Wu, L. T., Wong, L. F., Soh, S. L. H., Chow, Y. L., Ringsted, C., … Lim, W. S. (2019). “Getting everyone on the same page”: Interprofessional team training to develop shared mental models on interprofessional rounds. Journal of General Internal Medicine, 34(12), 2912–2917. http://doi.org/10.1007/s11606-019-05320-z
Lindhard, M. S., Thim, S., Laursen, H. S., Schram, A. W., Paltved, C., & Henriksen, T. B. (2021). Simulation-based neonatal resuscitation team training: A systematic review. Pediatrics, 147(4). https://doi.org/10.1542/peds.2020-042010
Lohre, R., Bois, A. J., Pollock, J. W., et al. (2020). Effectiveness of immersive virtual reality on orthopedic surgical skills and knowledge acquisition among senior surgical residents: A randomized clinical trial. JAMA Network Open, 1–12. http://doi.org/10.1001/jamanetworkopen.2020.31217
Madenci, A. L., Solis, C. V., & de Moya, M. A. (2014). Central venous access by trainees: A systematic review and meta-analysis of the use of simulation to improve success rate on patients. Simulation in Healthcare, 9(1), 7–14. http://doi.org/10.1097/SIH.0b013e3182a3df26
Maslovitz, S., Barkai, G., Lessing, J. B., Ziv, A., & Many, A. (2007). Recurrent obstetric management mistakes identified by simulation. Obstetrics & Gynecology, 109(6), 1295–1300. http://doi.org/10.1097/01.AOG.0000265208.16659.c9
McGaghie, W. C. (2020). Mastery learning: Origins, features, and evidence from health professions. In W. C. McGaghie, J. H. Barsuk, & D. B. Wayne (Eds.), Comprehensive Healthcare Simulation: Mastery Learning in Health Professions Education (pp. 27–46). Cham, Switzerland: Springer.
McGaghie, W. C., Barsuk, J. H., & Wayne, D. B. (2020). Comprehensive Healthcare Simulation: Mastery Learning in Health Professions Education. Cham, Switzerland: Springer.
McGaghie, W. C., Draycott, T. J., Dunn, W. F., Lopez, C. M., & Stefanidis, D. (2011a). Evaluating the impact of simulation on translational patient outcomes. Simulation in Healthcare, 6(Suppl), S42–S47. http://doi.org/10.1097/SIH.0b013e318222fde9
McGaghie, W. C., Issenberg, S. B., Cohen, E. R., Barsuk, J. H., & Wayne, D. B. (2011b). Does simulation-based medical education with deliberate practice yield better results than traditional clinical education? A meta-analytic comparative review of the evidence. Academic Medicine, 86(6), 706–711. https://doi.org/10.1097/ACM.0b013e318217e119
Moorthy, K., Munz, Y., Adams, S., Pandey, V., & Darzi, A. (2005). A human factors analysis of technical and team skills among surgical trainees during procedural simulations in a simulated operating theatre. Annals of Surgery, 242(5), 631–639. http://doi.org/10.1097/01.sla.0000186298.79308.a8
Mozumder, M. A. I., Sheeraz, M. M., Athar, A., Aich, S., & Kim, H. C. (2022). Overview: Technology roadmap of the future trend of metaverse based on IOT, blockchain, AI technique, and medical domain metaverse activity. In 2022 24th International Conference on Advanced Communication Technology (ICACT) (pp. 256–261). IEEE. https://doi.org/10.23919/ICACT53585.2022.9728808
Nestel, D., & Kelly, M. (2018). An introduction to healthcare simulation. In D. Nestel, M. Kelly, B. Jolly, & M. Watson (Eds.), Healthcare Simulation Education: Evidence, Theory and Practice (pp. 1–6). West Sussex: John Wiley & Sons, Ltd.
Nestel, D., Sanko, J., & McNaughton, N. (2018). Simulated participant methodologies: Maintaining humanism in practice. In D. Nestel, M. Kelly, B. Jolly, & M. Watson (Eds.), Healthcare Simulation Education: Evidence, Theory and Practice (pp. 45–53). West Sussex: John Wiley & Sons, Ltd.
Newlin-Canzone, E. T., Scerbo, M. W., Gliva-McConvey, G., & Wallace, A. M. (2013). The cognitive demands of standardized patients: Understanding limitations in attention and working memory with the decoding of nonverbal behavior during improvisations. Simulation in Healthcare, 8(4), 207–214. http://doi.org/10.1097/SIH.0b013e31828b419e
Oman, S. P., Sanghavi, D. K., Helgeson, S. A., et al. (2023). Simulation method for testing aerosol mitigation strategies, an observational study. Simulation in Healthcare, 18(1), 8–15.
Owen, H. (2012). Early use of simulation in medical education. Simulation in Healthcare, 7(2), 102–116. http://doi.org/10.1097/SIH.0b013e3182415a91
Owen, H. (2016). Simulation and teaching in resuscitation and trauma management. In H. Owen (Ed.), Simulation in Healthcare Education: An Extensive History (pp. 417–430). Cham, Switzerland: Springer.
Owen, H. (2018). Historical practices in healthcare simulation: What we still have to learn. In D. Nestel, M. Kelly, B. Jolly, & M. Watson (Eds.), Healthcare Simulation Education: Evidence, Theory and Practice (pp. 16–22). West Sussex: John Wiley & Sons, Ltd.
Paige, J. B., Graham, L. L., & Sittner, B. (2020). Formal training efforts to develop simulation educators: An integrative review. Simulation in Healthcare, 15(4), 271–281. http://doi.org/10.1097/SIH.0000000000000424
Palaganas, J. C., Brunette, V., & Winslow, B. (2016). Prelicensure simulation-enhanced interprofessional education: A critical review of the research literature. Simulation in Healthcare, 11(6), 404–418. http://doi.org/10.1097/SIH.0000000000000175
Parker, A. L., Forsythe, L. L., & Kohlmorgen, I. K. (2019). TeamSTEPPS®: An evidence-based approach to reduce clinical errors threatening safety in outpatient settings: An integrative review. Journal of Healthcare Risk Management, 38(4), 19–31. https://doi.org/10.1002/jhrm.21352
Patel, A. R., Patel, A. R., Singh, S., & Khawaja, I. (2019). Central line catheters and associated complications: A review. Cureus, 11(5), e4717. http://doi.org/10.7759/cureus.4717
Perlman, R. E., Pawelczak, M., Yacht, A. C., et al. (2017). Program director perceptions of proficiency of the core entrustable professional activities. Journal of Graduate Medical Education, 9(5), 588–592. http://doi.org/10.4300/JGME-D-16-00864.1
Phipps, M. G., Lindquist, D. G., McConaughey, E., O'Brien, J. A., Raker, C. A., & Paglia, M. J. (2012). Outcomes from a labor and delivery team training program with simulation component. American Journal of Obstetrics and Gynecology, 206(1), 3–9. https://doi.org/10.1016/j.ajog.2011.06.046
Povenmire, H. K., & Roscoe, S. N. (1973). Incremental transfer effectiveness of a ground-based general aviation trainer. Human Factors, 15, 534–542. https://doi.org/10.1177/001872087301500605
Prichard, J. S., Bizo, L. A., & Stratford, R. J. (2011). Evaluating the effects of team-skills training on subjective workload. Learning and Instruction, 21(3), 429–440. https://doi.org/10.1016/j.learninstruc.2010.06.003
Reed, T., Pirotte, M., McHugh, M., et al. (2016). Simulation-based mastery learning improves medical student performance and retention of core clinical skills. Simulation in Healthcare, 11(3), 173–180. http://doi.org/10.1097/SIH.0000000000000154
Rudolph, J. W., Raemer, D. B., & Simon, R. (2014). Establishing a safe container for learning in simulation: The role of the presimulation briefing. Simulation in Healthcare, 9(6), 339–349. http://doi.org/10.1097/SIH.0000000000000047
Riley, W., Davis, S., Miller, K., Hansen, H., Sainfort, F., & Sweet, R. (2011). Didactic and simulation nontechnical skills team training to improve perinatal patient outcomes in a community hospital. The Joint Commission Journal on Quality and Patient Safety, 37(8), 357–364. http://doi.org/10.1016/S1553-7250(11)37046-8
Roscoe, S. N. (1980). Aviation Psychology. Ames, IA: Iowa University Press.
Roscoe, S. N., & Williges, B. H. (1980). Measurement of transfer of training. In S. N. Roscoe (Ed.), Aviation Psychology (pp. 182–193). Ames, IA: Iowa University Press.
Rosen, K. R. (2008). The history of medical simulation. Journal of Critical Care, 23(2), 157–166. https://doi.org/10.1016/j.jcrc.2007.12.004
Salas, E., Sims, D. E., & Burke, C. S. (2005). Is there a “big five” in teamwork? Small Group Research, 36(5), 555–599. https://doi.org/10.1177/1046496405277134
Salas, E., Wilson, K. A., Burke, C. S., Wightman, D. C., & Howse, W. R. (2006). A checklist for crew resource management training. Ergonomics in Design, 14(2), 6–15. https://doi.org/10.1177/106480460601400204
Satava, R. M. (1993). Virtual reality surgical simulator. Surgical Endoscopy, 7(3), 203–205.
Satava, R. M. (2001). Accomplishments and challenges of surgical simulation. Surgical Endoscopy, 15(3), 232–241. http://doi.org/10.1007/s004640000369
Scerbo, M. W., & Anderson, B. L. (2012). Medical simulation. In P. Carayon (Ed.), Handbook of Human Factors and Ergonomics in Health Care and Patient Safety (2nd Ed., pp. 557–571). Boca Raton, FL: CRC Press.
Scerbo, M. W., Belfore, L. A., Garcia, H. M., et al. (2007). A virtual operating room for context-relevant training. Proceedings of the Human Factors & Ergonomics Society 51st Annual Meeting, 507–511. Santa Monica, CA: Human Factors & Ergonomics Society.
Scerbo, M. W., Bliss, J. P., Schmidt, E. A., & Thompson, S. N. (2006). The efficacy of a medical virtual reality simulator for training phlebotomy. Human Factors, 48(1), 72–84. http://doi.org/10.1518/001872006776412171
Schroedl, C. J., Frogameni, A., Barsuk, J. H., Cohen, E. R., Sivarajan, L., & Wayne, D. B. (2020). Impact of simulation-based mastery learning on resident skill managing mechanical ventilators. ATS Scholar, 2, 34–48. http://doi.org/10.34197/ats-scholar.2020-0023OC
Seymour, N. E., Gallagher, A. G., Roman, S. A., O’Brien, M. K., Bansal, V. K., Andersen, D. K., & Satava, R. M. (2002). Virtual reality training improves operating room performance. Annals of Surgery, 236(4), 458–464. http://doi.org/10.1097/00000658-200210000-00008
Sharara-Chami, R., Sabouneh, R., Zeineddine, R., et al. (2020). In situ simulation: An essential tool for safe preparedness for the COVID-19 pandemic. Simulation in Healthcare, 15, 303–309.
Shavit, D., Feldman, O., Hussein, K., et al. (2020). Assessment of alternative personal protective equipment by emergency department personnel during the SARS-CoV-2 pandemic: A simulation-based pilot study. Simulation in Healthcare, 15, 445–446.
Siassakos, D., Hasafa, Z., Sibanda, T., Fox, R., Donald, F., Winter, C., & Draycott, T. (2009). Retrospective cohort study of diagnosis–delivery interval with umbilical cord prolapse: The effect of team training. BJOG: An International Journal of Obstetrics & Gynaecology, 116(8), 1089–1096. http://doi.org/10.1111/j.1471-0528.2009.02179.x
Singh, H., Kalani, M., Acosta-Torres, S., El Ahmadieh, T. Y., Loya, J., & Ganju, A. (2013). History of simulation in medicine: From Resusci Annie to the Ann Myers medical center. Neurosurgery, 73(suppl_1), S9–S14. https://doi.org/10.1093/neurosurgery/73.suppl_1.S9
Sinz, E. H. (2007). Anesthesiology national CME program and ASA activities in simulation. Anesthesiology Clinics, 25(2), 209–223. https://doi.org/10.1016/j.anclin.2007.03.012
Society for Simulation in Healthcare. Retrieved on April 28, 2021 from https://www.ssih.org/Credentialing/Accreditation
Sutton, C., McCloy, R., Middlebrook, A., Chater, P., Wilson, M., & Stone, R. (1997). MIST VR. A laparoscopic surgery procedures trainer and evaluator. Studies in Health Technology and Informatics, 39, 598–607.
Swezey, R. W., & Andrews, D. H. (Eds.). (2001). Readings in Training and Simulation: A 30-Year Perspective. Santa Monica, CA: Human Factors and Ergonomics Society.
van de Ven, J., Fransen, A. F., Schuit, E., van Runnard Heimel, P. J., Mol, B. W., & Oei, S. G. (2017). Does the effect of one-day simulation team training in obstetric emergencies decline within one year? A post-hoc analysis of a multicentre cluster randomised controlled trial. European Journal of Obstetrics & Gynecology and Reproductive Biology, 216, 79–84. http://doi.org/10.1016/j.ejogrb.2017.07.020
Vincenzi, D. A., Wise, J. A., Mouloua, M., & Hancock, P. A. (Eds.). (2009). Human Factors in Simulation and Training. Boca Raton, FL: CRC Press.
Walsh, K., & Jaye, P. (2012). The relationship between fidelity and cost in simulation. Medical Education, 46(12), 1226–1228. http://doi.org/10.1111/j.1365-2923.2012.04352.x
Weaver, S. J., Dy, S. M., & Rosen, M. A. (2014). Team-training in healthcare: A narrative synthesis of the literature. BMJ Quality & Safety, 23(5), 359–372. http://doi.org/10.1136/bmjqs-2013-001848
Weile, J., Nebsbjerg, M. A., Ovesen, S. H., Paltved, C., & Ingeman, M. L. (2021). Simulation-based team training in time-critical clinical presentations in emergency medicine and critical care: A review of the literature. Advances in Simulation, 6(1), 1–12. http://doi.org/10.1186/s41077-021-00154-4
Weinger, M. B., & Gaba, D. M. (2014). Human factors engineering in patient safety. Anesthesiology, 120(4), 801–806. https://doi.org/10.1097/ALN.0000000000000144
Weller, J., & Civil, I. (2018). Teamwork and healthcare simulation. In D. Nestel, M. Kelly, B. Jolly, & M. Watson (Eds.), Healthcare Simulation Education: Evidence, Theory and Practice (pp. 127–134). West Sussex: John Wiley & Sons, Ltd.
Woodward, C. (1998). Standardized patients: A fixed role therapy experience in normal individuals. Journal of Constructivist Psychology, 11(2), 133–48. https://doi.org/10.1080/10720539808404645
Young, A., & Aquilina, A. (2021). Use of virtual reality to support rapid upskilling of healthcare professionals during COVID-19 pandemic. In XR Case Studies (pp. 137–145). Cham: Springer. http://doi.org/10.1007/978-3-030-72781-9_17
Yucel, C., Hawley, G., Terzioglu, F., & Bogossian, F. (2020). The effectiveness of simulation-based team training in obstetrics emergencies for improving technical skills: A systematic review. Simulation in Healthcare, 15(2), 98–105. http://doi.org/10.1097/SIH.0000000000000416
Yunoki, K., & Sakai, T. (2018). The role of simulation training in anesthesiology resident education. Journal of Anesthesia, 32(3), 425–433. http://doi.org/10.1007/s00540-018-2483-y
Ziv, A., Wolpe, P. R., Small, S. D., & Glick, S. (2003). Simulation-based medical education: An ethical imperative. Academic Medicine, 78(8), 783–788. http://doi.org/10.1097/01.SIH.0000242724.08501.63
Zucco, L., Chen, M. J., Levy, N., et al. (2023). Just-in-time in-situ simulation training as a preparedness measure for the perioperative care of COVID-19 patients. Simulation in Healthcare, 18(2), 90–99.
9
Best Practices in Surgical Simulation
Dominique Doster, Christopher Thomas, and Dimitrios Stefanidis
CONTENTS
Introduction
Current Technologies Used in Surgical Simulation for Skills Training
    Technical Skills Simulation
        Low-Cost Simulators
        High-Cost Simulators
    Nontechnical Skills Simulation
        The Objective Structured Clinical Examination (OSCE)
        Team-Based Training
Application of Simulation across the Continuum of Surgical Training
    Undergraduate Medical Education
    Graduate Medical Education
    Postgraduate Training of Practicing Surgeons
Human Factors and Surgical Simulation
    Equipment Design and Ergonomics
    Performance Optimization
        Mental Skills Optimization and Coaching
        Developing Expertise
    Surgical Team Dynamics
    Utilizing Simulation to Minimize Subjectivity in Assessment
Future of Simulation and Human Factors in Surgery
Conclusion
References
DOI: 10.1201/9781003401353-9
INTRODUCTION
The apprenticeship model of surgical training was pioneered in 1892 by William Halsted. It traditionally involved subjective observation of trainee performance under the supervision of a skilled surgical teacher, with graded responsibility and increasing independence until proficiency was reached (Wright Jr. & Schachar, 2020). The earliest evidence of surgical simulation in the modern model of surgical residency was Dr. Halsted’s use of dog labs to teach procedural and team-based skills. While early attempts at simulation-based surgical training relied heavily on animal models and cadavers for practicing technical skills, the field of surgical simulation did not take off until the emerging technologies of the early 1970s sparked the imagination of surgical educators. The ability to objectively track, measure, and assess performance using virtual reality technology inspired surgical educators to embrace simulation and continues to motivate its integration into surgical training programs today.
The goal of simulation is to give surgical trainees opportunities to hone their technical and nontechnical skills outside the operating room and clinical environment, and to provide a more objective method of skill assessment than traditional methods that rely purely on human raters. In the operating room, tension and stress can run high, and mistakes can lead to devastating consequences for patients. Such a high-stress, high-risk environment is hardly the place for new learners to acquire new skills. To address this important limitation of the traditional training paradigm, simulators allow practice of the same technical and nontechnical skills in a more controlled, low-stress environment. Simulators provide a variety of benefits, including the ability to optimize trainee performance through deliberate practice and to remove assessment bias through the incorporation of objective performance metrics, which ultimately decrease risk to patients (Stefanidis et al., 2019). These benefits have driven the Residency Review Committee (RRC) for Surgery at the Accreditation Council for Graduate Medical Education (ACGME) to recommend the integration of simulation as a means to teach technical and nontechnical skills to surgical residents.
The American Board of Surgery now requires certification in two key simulation-based assessments of technical skill before a candidate is eligible to register for the certifying examination: the Fundamentals of Laparoscopic Surgery (FLS) and the Fundamentals of Endoscopic Surgery (FES). The American College of Surgeons has also created a network of accredited simulation centers to provide training opportunities to surgical trainees and practicing surgeons and to push the field of surgical simulation forward through the exchange of ideas and best practices.
While workplace-based assessments remain the gold standard for the assessment of learners (Norcini & Burch, 2007; Holmboe et al., 2010), simulation-based assessments can contribute significantly to the training of surgeons, as they permit the focused and recurrent evaluation of technical and nontechnical skills in a safe learning environment (Ziv et al., 2003). However, care must be taken to ensure that validity evidence supports simulation-based assessments as a reliable evaluation of skill (Cook & Hatala, 2016). This can be difficult to establish, as technical and nontechnical perioperative encounters are challenging to replicate.
Furthermore, incorporating simulation into surgical skills curricula requires a well-thought-out approach that promotes deliberate practice and optimizes skill acquisition by the learner. The use of immediate feedback and/or debriefing helps learners solidify skills and techniques practiced in the simulation environment. Deliberate and repetitive practice is foundational to mastery learning (McGaghie et al., 2010). Ideally, simulation sessions would occur before the skill is performed in the clinical environment to maximize the yield of training. The challenge arises when residency programs face time constraints due to duty hour restrictions and when training sessions conflict with clinical practice (Stefanidis et al., 2015). Nevertheless, optimizing the integration of simulation into surgical training curricula is becoming more of an expectation and less of an option.
The purpose of this chapter is to elaborate on a number of aspects pertaining to best practices in surgical simulation. We seek to provide an appraisal of simulation modalities and their relevant uses in surgical training. Furthermore, we review the use of simulation as a means to optimize the influence of human factors in the field of surgery.
CURRENT TECHNOLOGIES USED IN SURGICAL SIMULATION FOR SKILLS TRAINING
Technical Skills Simulation
Low-Cost Simulators
Providing easy access for learners to hone basic technical skills, low-cost simulators encompass a wide range of platforms, starting with simple knot-tying boards and suturing pads. Nearly every physician can recall learning how to suture and tie knots on such low-cost simulators during their surgery rotation (Gomez et al., 2014; see Figures 9.1a and 9.1b). While often referred to as low-fidelity simulators, these low-cost platforms for learning basic procedural skills can replicate simple suturing and knot-tying techniques quite closely. Laparoscopic box trainers also fall under the category of relatively low-cost simulators and provide a platform for practicing the basic laparoscopic surgical techniques of passing items between graspers, intracorporeal suturing, knot tying, and cutting. These box trainers are available at most training institutions and are used in the Fundamentals of Laparoscopic Surgery technical assessment (Zendejas et al., 2016).
FIGURE 9.1 Examples of low-cost suturing pad (a) and knot-tying board (b).
While the previously mentioned simulators are platforms to practice generalizable surgical skills, other low-cost, procedure-specific simulators exist. These include rudimentary models made of affordable materials or 3D-printed forms that seek to recreate true-to-life anatomy for procedures like chest tube placement using butchered pork ribs, or cricothyroidotomy using ventilator tubing (see Figures 9.2a, 9.2b, 9.3, 9.4a, and 9.4b). Despite their low cost and ease of assembly, the haptic feedback can be surprisingly accurate.
FIGURE 9.2 Example of a low-cost chest tube simulator using cadaveric bovine ribs.
FIGURE 9.3 Central line procedural simulator for practicing internal jugular and subclavian central line placement.
FIGURE 9.4 Low-cost bowel anastomosis model using simulated bowel. Can be used to practice hand-sewn and stapled bowel anastomoses.
The obvious strength of these simulators is their accessibility to trainees, which follows from their low cost. Students can purchase knot-tying boards and suture pads off the internet, and laparoscopic box trainers can be assembled at home. Importantly, deliberate practice is repetitive by nature, requiring multiple practice sessions and repetition of specific skills to reach the desired level of performance; the low cost of use therefore makes these simulators highly valuable even for established simulation centers. Hence, such simulators are among the most widely used by education institutes of the American College of Surgeons (Korndorffer et al., 2006). Nevertheless, the simplistic nature of these models makes them less useful as trainees progress to learning more advanced skills and procedures (Korndorffer et al., 2013). This has driven the adoption of both low- and high-cost simulators in the ACS/APDS skills curriculum for surgical trainees (Scott & Dunnington, 2008).
High-Cost Simulators
Virtual Reality (VR) simulators are most often used in the context of laparoscopic, endoscopic, and robotic simulation of tasks and procedures (see Figures 9.5a and 9.5b). Such simulators provide a number of benefits for procedural training: they allow repetitive practice of procedures at no additional cost, provide multiple objective metrics for trainee performance assessment, can provide virtual feedback and coaching on trainee performance, and can include multiple procedures on the same platform. Nevertheless, they carry an upfront cost that is often over $100,000, with yearly maintenance contracts costing more than $20,000 (Parham et al., 2019). Such costs can make these VR simulators unattainable for many small to medium-sized training programs.
FIGURE 9.5 High-cost Simbionix laparoscopic simulator (a) and da Vinci robotic simulator (b). Both platforms incorporate virtual reality technical skills tasks and specific operations.
Prior to the technological advances that have taken place since the 1970s, cadavers and animal models were the primary platform for training outside the operating room, for both technical and nontechnical surgical skills. Cadavers have traditionally been used in either a fresh frozen or embalmed format. Though cadavers provide the benefit of true-to-life anatomic layers and tissue planes, their usable lifespan is relatively short. While the cost of obtaining a cadaver ($1,500–$3,000 depending on source) may pale in comparison to that of VR simulators, the ventilation systems and infrastructure needed to support cadaver-based simulation are not widely available.
Animal models were introduced into surgical simulation to provide physiology similar to that of an operative patient, with a beating heart and active bleeding. Advanced Trauma Operative Management (ATOM) is a course dedicated to teaching the skills needed to manage traumatic injuries and was developed around a porcine model. However, the use of animal models for educational simulation has met notable setbacks over the last several years due to ethical concerns, prompting many international trainees to travel outside their home country to seek out these educational opportunities (Gala et al., 2012).
While these various higher-fidelity simulators often provide the benefit of greater complexity, improved haptic feedback, and truer-to-life tissue handling, their greatest drawbacks are cost and accessibility. The use of high-cost models to train students who do not plan to pursue surgical specialization is unnecessary. However, as surgical trainees progress and are entrusted with more technically complex skills and procedures in the operating theater, the benefits of the added complexity of high-fidelity models balance their cost (Johnston et al., 2016).
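The cost figures quoted in this section can be put side by side with a little arithmetic. The sketch below is illustrative only: it takes the lower-bound VR figures from the text ($100,000 upfront, $20,000 per year in maintenance) and the quoted cadaver price range, while the five-year horizon, the number of practice sessions per year, and the number of sessions one cadaver supports are hypothetical assumptions rather than data from the chapter. Facility costs for cadaver labs are left out because the text gives no figure for them.

```python
# Illustrative cost arithmetic; not a validated cost model.
# From the text: VR upfront cost (> $100,000), yearly maintenance (> $20,000),
# cadaver price ($1,500 to $3,000). Everything else is a hypothetical assumption.

def vr_cumulative_cost(upfront: float, annual_maintenance: float, years: int) -> float:
    """Total cost of owning a VR simulator for a given number of years."""
    return upfront + annual_maintenance * years


def cost_per_session(total_cost: float, sessions: int) -> float:
    """Average cost of one practice session."""
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return total_cost / sessions


YEARS = 5                    # assumption: planning horizon
VR_SESSIONS_PER_YEAR = 400   # assumption: program-wide simulator usage
SESSIONS_PER_CADAVER = 10    # assumption: sessions one cadaver can support

vr_total = vr_cumulative_cost(100_000, 20_000, YEARS)
vr_per_session = cost_per_session(vr_total, VR_SESSIONS_PER_YEAR * YEARS)

cadaver_low = cost_per_session(1_500, SESSIONS_PER_CADAVER)
cadaver_high = cost_per_session(3_000, SESSIONS_PER_CADAVER)

print(f"VR simulator, {YEARS}-year total: ${vr_total:,.0f}")
print(f"VR simulator, per session: ${vr_per_session:,.2f}")
print(f"Cadaver, per session (specimen only): ${cadaver_low:,.2f} to ${cadaver_high:,.2f}")
```

Under these assumed usage numbers, the per-session cost of the VR platform falls as utilization rises, which is consistent with the point that repetitive practice adds no further cost once the simulator has been purchased.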
Nontechnical Skills Simulation
The Objective Structured Clinical Examination (OSCE)
OSCEs have been an integral part of clinical training and learner assessment for over three decades (Sloan et al., 1993). These examinations utilize standardized patients or mannequins in simulated clinical arenas to present scenarios similar to situations surgical trainees experience on the job. They are designed to assess clinical decision-making. Learners often report feeling stress similar to that of actual clinical encounters, as the simulation environment closely replicates actual clinical areas and the patients are often human actors (Pena et al., 2015). These encounters are recorded and reviewed during debrief sessions, giving students and residents a valuable opportunity to watch actions they may not realize they perform (Pucher et al., 2014). OSCE scenarios often present common perioperative complications and provide an assessment of students’ and residents’ clinical management and bedside manner (Sudan et al., 2014).
Team-Based Training
Nontechnical simulation has been an integral asset to the development and training of surgeon-led provider teams both inside the operating room and in other clinical arenas. Given the complexity and dynamic nature of the operating room, surgeons are charged with the responsibility to minimize human error and maximize efficiency during cases. Surgical team training strategies can transform dysfunctional OR teams into highly reliable and effective OR teams (Sudan et al., 2014; Robertson et al., 2017). Simulation-based trauma team training includes multidisciplinary teams of surgeons, emergency medicine physicians, and nurses who work together on scenarios commonly seen in Emergency Department trauma bays. These programs are associated with improved team dynamics, task performance quality and speed, knowledge, and provider satisfaction (McLaughlin et al., 2019). The ACS/APDS has found these team-based scenarios vital to the curriculum for surgical trainees and has adopted 10 of these scenarios as part of Phase 3 of their curriculum.
However, given the cost and complexity of organizing multidisciplinary team training sessions, this phase of the curriculum has seen the slowest adoption rate (Korndorffer et al., 2013).
APPLICATION OF SIMULATION ACROSS THE CONTINUUM OF SURGICAL TRAINING
In the following paragraphs, we address the application of surgical simulation in the training of medical students, surgical residents, and practicing surgeons.
Undergraduate Medical Education
Basic Life Support (BLS) and Advanced Cardiac Life Support (ACLS) are two simulation-based certifications that are required of all advanced hospital personnel. Medical students are expected to pass and maintain renewed certification when they begin their clinical rotations. The courses include various combinations of in-person and online learning in which certified instructors provide a universal framework to guide students through the basics of life support, CPR, rescue breathing, and, for ACLS, the interpretation and treatment of life-threatening heart rhythms. Implementation of BLS and ACLS guidelines has shown improved outcomes for cardiac arrest patients in and out of the hospital (Honarmand et al., 2018; Kleinman et al., 2017). The goal of these programs is to eliminate discrepancies in the quality of care of critically ill patients by providing healthcare workers with a universal framework that can be utilized in high-stakes, high-stress situations.
While medical students are expected to possess the medical knowledge required to safely navigate the wards with attending supervision, their role in the clinical setting with respect to procedural skills is less defined. Though their goal is to be ready to enter specialty training at the time of graduation, the reality is that medical students are often excluded from performing common procedures on live patients, such as IV placement, laceration repair, and splinting of fractures, that students routinely performed in the past (Tan & Skye, 2009). This is often due to medical-legal, patient safety, and patient consent concerns. Additionally, there is great variability in medical student teaching and autonomy between different medical schools and even within an individual medical school (Naeem et al., 2018). These hurdles create an excellent role for simulation in filling the ever-growing gap in hands-on education.
The use of technical and nontechnical simulators to fill this gap has become widespread across US medical schools (Stefanidis et al., 2019). Because simulators are so widely available, students can practice skills on both low- and high-cost simulators, resulting in increased proficiency (Stefanidis et al., 2019; Borggreve et al., 2017; Okuda et al., 2009; Olasky et al., 2019; Yeh et al., 2017). In response to patient safety initiatives and a need for standardization of medical student skill acquisition, the American College of Surgeons (ACS) and the Association for Surgical Education (ASE) created a simulation-based skills curriculum with the aim of improving medical student procedural skills.
Evidence has shown success in this curriculum and it has been praised for its cost-effectiveness and accessibility (Olasky et al., 2019). Furthermore, there has been an increase in the popularity of surgical residency “boot camps” over the last 10 years. These simulation-based electives for students and pre-residents allow trainees to hone their psychomotor, cognitive, and technical skills to best prepare them for the start of surgical internship (Yeh et al., 2017; Hudson, 2018a).
Graduate Medical Education
Residency is the time when previously undifferentiated doctors with a broad knowledge base in medicine are honed into specialty providers. Those in the surgical specialties must be trained from novices to full-fledged surgeons in 4–7 years, depending on the specific subspecialty. Simulation allows these trainees to learn and practice operative tasks without posing risk to a live patient. A variety of simulation-based curricula and certifications have been established to ensure competency in tasks relevant to each subspecialty.
The ACS and APDS developed the Surgery Resident Skills Curriculum with the goal of providing a structured, longitudinal, three-phase program that targets both technical and nontechnical skill development. The curriculum includes various modules ranging from beginner bedside procedures to more advanced operating room techniques (Bartlett et al., 2017). Phase 1, “Core Skills,” is designed to teach residents in the early stages of their training the 16 essential skills required to operate. Phase 2, “Advanced Procedures,” is designed to teach mid- and senior-level residents 15 common surgical procedures using cadavers, animal models, and virtual reality. Lastly, Phase 3, “Team-Based Skills,” focuses on the development of the nontechnical skills required to be an effective surgeon-leader through 10 simulated scenarios (Hudson, 2018b). The scenarios presented are common dilemmas encountered in the surgical management of patients pre-, intra-, and post-operatively (Bartlett et al., 2017).
The Fundamentals of Laparoscopic Surgery (FLS), Fundamentals of Endoscopic Surgery (FES), and Fundamentals of Robotic Surgery (FRS) are three simulation-based certification programs used to train residents in the technology and basic skills required to perform common laparoscopic, endoscopic, and robotic operations. Each course uses specific simulator models as the main teaching platform, allowing for hands-on experience in a risk-free environment. The technical skills assessed on each simulator mimic the foundational maneuvers required to be competent on the actual platforms of laparoscopy, endoscopy, and robotic surgery (see Figures 9.6a and 9.6b). FLS and FES are now required for certification by the American Board of Surgery, and many practicing surgeons are advocating for the completion of FRS prior to certification as well (Fundamentals of Laparoscopic Surgery, 2021a, 2021b, 2021c).
FIGURE 9.6 Laparoscopic box trainer (a) used for Fundamentals of Laparoscopic Surgery (FLS) skills practice and assessment. (b) GI mentor used for Fundamentals of Endoscopic Surgery skills practice and assessment.
Advanced Trauma Life Support (ATLS), similar to BLS and ACLS, is a program designed to provide a standardized framework for the clinical evaluation and initial procedural management of the undifferentiated trauma patient. It involves a combination of online and in-person instruction aimed at familiarizing trauma providers with the algorithms that prioritize life-threatening injuries and their management, followed by identification of non-life-threatening injuries. The in-person class relies heavily on simulation, with mannequins and standardized patients used for both technical and scenario-based simulations. ATLS has been shown to improve outcomes in trauma and has become ubiquitous in trauma education (Mohammad & Abu-Zidan, 2014).
Advanced Trauma Operative Management is another simulation-based trauma certification program, one that focuses more on the operative management of traumatic injuries to the chest and abdomen. Developed in Hartford, Connecticut in 1998, ATOM uses an animal model to train and assess a learner’s ability to identify and repair life-threatening traumatic injuries. The benefit of this simulation model is that it can reproduce physiologic processes that are not feasible in cadaveric models, such as bleeding, bile leakage, urine production, and breathing (Advanced Trauma Operative Management, 2021). Studies have shown improvements in residents’ trauma knowledge and technical skills upon completion of ATOM (Ali et al., 2008).
While such skills curricula have been implemented alongside the main surgery resident curriculum, some programs have dedicated an entire month of residency training to simulated procedural skills training. The University of Miami Department of Surgery has created a “Technical Skills Rotation” for general surgery residents, in which residents spend a month practicing simulated skills on VR simulators, laparoscopic trainers, and scenario simulations. The residents in the study group reported this rotation to be a positive experience overall (Gonzalez et al., 2010).
Furthermore, some surgical programs in the US have integrated procedural simulation training as a prerequisite to clinical training. At Indiana University, the novel “Laparoscopic Cholecystectomy” rotation is one such experience.
It utilizes the Simbionix LAP Mentor simulator to train residents in the fundamentals of laparoscopic cholecystectomy over a series of VR modules. Upon completion of the simulator modules, the residents travel to various sites and perform only laparoscopic cholecystectomies with supervising faculty for the remainder of the rotation. They record each procedure and review their performance with a faculty member. At the end of the rotation, there is a post-test on critical steps of the procedure, relevant anatomy, and perioperative management of patients with cholecystitis. Faculty evaluations of residents performing laparoscopic cholecystectomy demonstrated an improvement in technical proficiency among residents who completed the rotation compared to those who did not have this rotation in their curriculum (Huffman et al., 2021).
A similar model has been implemented in surgical training programs for endoscopy, not only at Indiana University (Mizota et al., 2020) but also at other institutions across the country. The University of Michigan Otolaryngology residency has a similar rotation in which residents perform simulated tasks on mannequins and VR models to develop skills in airway management. These simulation tasks are coupled with a dedicated anesthesia rotation, with the goal of maximizing endoscopic skill. Faculty and residents noticed an improvement in residents’ second-year preparedness and procedural skills (Kovatch et al., 2019).
Postgraduate Training of Practicing Surgeons
The average length of practice for a surgeon after the completion of training is 32 years (Jonasson & Kwakawa, 1996). Given the rate at which new medical devices and technologies are developed, it is impossible for every practicing surgeon to have received formal residency training on devices that did not yet exist. For this reason, many medical device companies produce practice devices that can be used to orient practicing surgeons to the new technology in a simulation environment. While companies use these simulation sessions as marketing tools to encourage the incorporation of their new technology into surgical practice, the sessions also provide a safe environment for post-graduate surgeons to develop new skills. However, given the wide range of surgeon age and the variability in practice patterns despite evidence for best practice, one could argue that simulation is currently underutilized. The ACS–AEI consortium has attempted to bridge this gap by affording training opportunities to practicing surgeons at participating sites.
HUMAN FACTORS AND SURGICAL SIMULATION
The application of human factors to surgery is multifaceted. While there is no field with as great a breadth of research pertaining to human factors as aviation, surgery draws the most notable parallel. Like pilots, surgeons often function on “autopilot,” with most cases following a predictable pattern. Even when things do not go as planned, surgeons are trained to adapt and adjust the “flight” plan, so to speak, as needed. Furthermore, in surgery the stakes are equally high and the cost of error is more than monetary: it is life or death, or a complication-associated decrease in patient quality of life. Many aspects of human factors relate directly to both the practice of surgery and the training of surgeons. These include surgical equipment design and ergonomics, performance optimization, surgical team dynamics, and subjectivity of trainee performance assessment. This intersection of biomedical engineering and performance psychology makes the field of human factors in surgery very relevant today.
Equipment Design and Ergonomics
Simulation provides the optimal setting for integrating human factors concepts pertaining to device design and operative environment layout. One such example of device design and evidence-based revision involved the BD Odon Device used in the assistance of vaginal delivery in simulated operative births (Obrien et al., 2017). After multiple simulated assessments and user feedback sessions, biomedical engineers were able to redesign the BD Odon Device in a way that increased the percentage of practitioners able to successfully perform an operative vaginal delivery. Similarly, several new surgical instruments and equipment can be tested in the simulated environment and perfected based on surgeon feedback prior to their implementation in the high-stakes clinical environment.
Instrument design and its interplay with anthropometry and movement science not only impact surgeon lifestyle and physical well-being but also drive surgical technology innovation and acceptance. Work-related musculoskeletal disorders are prevalent among surgeons and often drive practice modification (Catanzarite et al., 2018). Surgeons and surgical trainees are at increased ergonomic risk because they operate for hours on end, often standing in suboptimal positions due to difficult exposure, strenuous retraction, and/or use of loupes, headlamps, or microscopes (Athanasiadis et al., 2021). Methods that help identify surgeons at ergonomic risk and programs that help mitigate this risk are highly desirable (Park et al., 2017). To that end, many surgeons are opting to integrate the robotic surgery platform into their practice because of its enhanced ergonomic profile compared to laparoscopic and open surgery (Wee et al., 2020). However, more human factors work and device redesign need to be done, as trunk, wrist, and finger strain are still notable across all surgical platforms (Catanzarite et al., 2018).
Performance Optimization
Optimizing performance involves defining a set proficiency goal and utilizing strategies to achieve consistent performance at that level (Wulf & Lewthwaite, 2016). The value of deliberate practice is undeniable (Macnamara et al., 2014). The pathway to developing expert surgical performance is a demanding process. It involves completing consecutive complex tasks with precision and accuracy, all the while maintaining clear and concise communication with members of the surgical team in high-pressure situations (Spanager et al., 2015).
Mental Skills Optimization and Coaching
Feedback comes in two different forms: intrinsic, which is feedback that arises from self-reflection, and augmented, which is feedback that comes from external sources, such as a coach. It may come as no surprise that beginner trainees benefit more from augmented feedback, while experts rely more on intrinsic feedback to improve performance. It is up to the coach to provide this augmented feedback in an individualized, applicable, and encouraging way (Stefanidis et al., 2019). When feedback and coaching are optimized, surgeon performance has been shown to improve (Greenberg et al., 2015, 2016). Additionally, beginners who lack adequate coping mechanisms often suffer a deterioration of skills when placed in high-pressure situations. The skills that allow a performer to maintain skillful performance under stress have been termed cognitive or mental skills, and such skills can be taught by a trained coach. These skills are used in professional athletics, the military, and aviation to help participants perform their best in all situations (Deshauer et al., 2019). Mental skills curricula have already found their way into surgical training and have shown positive outcomes in helping trainees mitigate the impact of stress on surgical performance (Anton et al., 2017, 2019).
The body of evidence showing improvement in operating room performance with simulation dates back to 2002 and has been growing rapidly since (Seymour et al., 2002; Sroka et al., 2010). In such a structured coaching environment, quantitative
assessment and repetition allow trainees to recognize their common mistakes and create personalized training regimens and mental maps to aid success.
Developing Expertise
Educational psychologists often reference the Dreyfus model of skill acquisition as the conceptual framework for developing expertise (Dreyfus & Dreyfus, 1986). This framework details the progression through five stages of performance, beginning with novice and progressing to advanced beginner, competent, proficient, and finally expert performer. One prominent difference between proficient and expert performance lies in the ability to problem-solve based on prior experience. An expert surgeon recognizes potential errors that may arise due to the subtle complexities of the case at hand and utilizes a deeper understanding of surgery as a whole, developed through prior experience, to arrive at a novel solution that avoids the error altogether (Bereiter & Scardamalia, 1993). While the value of simulation in teaching and assessing clinical skills through the first four stages has been clearly established, human factors pertaining to error recognition, distraction avoidance, and stress mitigation are an integral part of the transition from proficiency to expertise, and recent work has started to look at how to use simulation to explore these areas.
Simulation-based error management training (EMT) has been promoted across multiple surgical disciplines (Franklin et al., 2021; Sternbach et al., 2017). These mastery learning-based EMT curricula utilize various low- and high-fidelity simulators to demonstrate pre-completed procedures containing a variable number of procedural errors. Trainee identification of errors is the crux of EMT learning objectives and implies an ability not only to do the procedure correctly but also to identify when things are not progressing in the appropriate manner. This skill also relies heavily on the ability to minimize distractions and focus on the task at hand.
Stress mitigation intraoperatively is another human factor that distinguishes expert surgeons. While stress can enhance performance by improving concentration, alertness, and economy of motion, excessive levels of stress have been shown to impair judgment, decision-making, and communication intraoperatively (Wetzel et al., 2005). Furthermore, ineffective stress-coping strategies in surgical trainees correlate with poor performance on virtual-reality laparoscopic simulators (Hassan et al., 2006). Inspired by the success of stress-management programs in aviation, the military, and professional sports, surgical training programs have started to utilize simulation as a means to provide stress-management skills training and assess trainees in high-stress trauma simulations (Goldberg et al., 2018). In training residents to adopt stress-management strategies early on in their technical careers, the hope is that the path to achieving surgical expertise is expedited and patient outcomes are improved.
Surgical Team Dynamics
The human factors that affect surgical teams include not only communication style and situational awareness but also leadership dynamics and task coordination. Surgeons do not operate in isolation. Therefore, human factors pertaining to device
ergonomics and performance optimization mean nothing if not understood within the context of the surgical team (Flin et al., 2015). Most surgical cases involve an anesthesiologist who is responsible for the patient’s airway and sedation, a first assistant who can be a resident physician or physician assistant, a surgical technologist who passes surgical instruments directly into the hands of the surgeons, and a circulating nurse who is responsible for making sure all of the necessary supplies are in the operating room. Healthy interactions, modeled by the surgeon, are vital not only to the health and well-being of the patient but also to the members of operative and trauma teams. Multi-institutional work has combined scenario-based and procedural-based training for trauma teams using the Advanced Modular Manikin (AMM), a novel simulation platform (Stefanidis et al., 2021). Simulators such as these create the optimal environment for team training and allow for the study of human factors in team dynamics.
Utilizing Simulation to Minimize Subjectivity in Assessment
Simulation provides a unique platform in its ability to quantitate performance and provide objective feedback and coaching during repetitive practice (Okuda et al., 2009). Assessing surgical trainees in a quantitative manner is a fairly recent paradigm shift in surgical training. The traditional Halsted model of training has senior faculty assessing trainees only through direct observation inside and outside the OR. This model is fraught with bias and subjectivity and does not allow for objective comparisons between performances or performers. This subjectivity extends beyond any individual faculty member and is further illustrated by the diminished inter-rater reliability between faculty assessing the same performance or performer (Gawad et al., 2019; Andersen et al., 2021). These differing assessments have been shown to be universal and are due to a variety of factors, such as the underlying expertise and predetermined expectations of the observer. Poor inter-rater reliability can in some cases be improved through rater training; however, an objective assessment system would bypass this need altogether (Pradarelli et al., 2020). Simulation technology enables the quantification of new, objective, and potentially more robust performance metrics such as grip strength, excess motion, and gaze direction (Stefanidis et al., 2019). Further, newer non-traditional performance metrics that assess learner tissue handling during simulation may provide additional benefits to trainees as they acquire surgical skills (Witthaus et al., 2020; Huffman et al., 2020). These quantitative assessments inform and supplement feedback and have become crucial in the training of surgeons (Vaidya et al., 2020). As these quantifiable metrics increase in number and quality, they will provide a more discrete target to work toward, forcing the art of surgical education to evolve into the science of surgical education.
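To make the notion of inter-rater reliability concrete, the following is a minimal sketch of one common agreement statistic, Cohen's kappa, computed for two raters who score the same set of performances. It is offered only as an illustration: the chapter does not specify which statistic the cited studies used (Andersen et al., 2021, for example, review generalizability theory), and the function name, rater labels, and pass/fail ratings below are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same performances.

    Kappa compares the observed agreement with the agreement expected
    by chance given each rater's own rating frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set")
    n = len(ratings_a)

    # Observed agreement: fraction of performances given identical ratings.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the raters' marginal frequencies,
    # summed over every rating category.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category for every performance.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical pass/fail ratings from two faculty observers (illustrative only).
faculty_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "fail", "pass"]
faculty_2 = ["pass", "fail", "fail", "pass", "pass", "pass", "fail", "pass"]
kappa = cohen_kappa(faculty_1, faculty_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this made-up example the two observers agree on six of eight performances, yet kappa is well below the raw agreement of 0.75 because much of that agreement would be expected by chance. This is the kind of gap that motivates the objective, simulator-derived metrics described above.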
FUTURE OF SIMULATION AND HUMAN FACTORS IN SURGERY
In order to discuss the future of simulation, one must understand its current utility and present trends. Simulation has clearly become the gold standard in fields
where safety is a top priority. This explains the early integration of simulation into military and aviation training and the resultant exemplar status of these fields (Aebersold, 2016). Using these fields as examples, one may safely assume that the future of simulation in surgery contains higher-fidelity simulators that integrate higher-level human factors into performance assessment. Despite its relative infancy, simulation in surgery offers many undeniable advantages. As machine learning and artificial intelligence (AI) technology progresses, the opportunities for more objective, automated performance assessment are vast. Furthermore, the quantification of skill performance and the ability to practice complex tasks without risk to human lives have been revolutionized by simulation. As such, the next logical step was to create a variety of tests and certifications that serve as proof of proficiency in many important tasks before a trainee gains the privilege of performing the task on actual patients. This trend is likely to continue and may even expand to the realm of accreditation and recertification. In the medical student world, some predict that surgery “aptitude tests” may gain popularity and help students struggling with their decision to pursue surgery as a career. With regard to the future of human factors research pertaining to surgery, platforms involving wearable sensors that provide an assessment of surgeon ergonomic risk are currently being developed with the goal of mitigating that risk. As simulator technology and AI progress, tools that identify and mitigate surgical errors, minimize OR distractions, and integrate virtual coaching platforms into simulation training could be developed to address human factors in surgery and enhance technical and team-based training for surgeons. Lastly, it is expected that simulation will aid in the field of outcomes research. Outcomes research has already studied a variety of personal variables and their impact on outcomes in surgery.
Examples include surgeon age and years out of training, among others. It is not unfeasible that, as quantitative proficiency data becomes available, surgical outcomes will be studied using this variable (Aebersold, 2016; Stefanidis et al., 2019). Using clinical performance and patient outcomes for practicing surgeons to inform simulation training and recertification, and vice versa, is vital to quality improvement in the field of surgery moving forward.
CONCLUSION
Simulation has clearly brought a variety of advantages to surgical training. Among these are the ability to practice complex surgical skills without risking harm to living people, the ability to quantitate performance and therefore optimize performance and reduce bias, the ability to hone skills in a low-stress environment, and the ability to gain competency via standardized curricula. Simulation in surgery is in its growing stages and still has room to advance and become an even more integral part of surgical education. While limitations exist in the fidelity of simulators due to the inherent complexity of simulating living systems reliably, these limitations are constantly being overcome. With this in mind, we may soon enter an era where the greatest limitation of a simulator is the imagination to create it.
REFERENCES
Advanced Trauma Operative Management. 2021; Available from: https://www.facs.org/quality-programs/trauma/education/atom. Aebersold, M., The history of simulation and its impact on the future. AACN Adv Crit Care, 2016. 27(1): p. 56–61. Ali, J., et al., The advanced trauma operative management course in a Canadian residency program. Can J Surg, 2008. 51(3): p. 185–189. Andersen, S.A.W., et al., Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills: A systematic review and meta-analysis. Acad Med, 2021. 96(11): p. 1609–1619. Anton, N.E., et al., Application of mental skills training in surgery: A review of its effectiveness and proposed next steps. J Laparoendosc Adv Surg Tech A, 2017. 27(5): p. 459–469. Anton, N.E., et al., Mental skills training limits the decay in operative technical skill under stressful conditions: Results of a multisite, randomized controlled study. Surgery, 2019. 165(6): p. 1059–1064. Athanasiadis, D.I., et al., An analysis of the ergonomic risk of surgical trainees and experienced surgeons during laparoscopic procedures. Surgery, 2021. 169(3): p. 496–501. Bartlett, J., et al., ACS/APDS surgery resident skills curriculum. 2017 [cited 2021 4/14/21]; Available from: https://www.facs.org/education/program/resident-skills. Bereiter, C. and M. Scardamalia, Surpassing Ourselves: An Inquiry into the Nature and Implications of Expertise. 1993. Chicago: Open Court. Borggreve, A.S., et al., Simulation-based trauma education for medical students: A review of literature. Med Teach, 2017. 39(6): p. 631–638. Catanzarite, T., et al., Ergonomics in surgery: A review. Female Pelvic Med Reconstr Surg, 2018. 24(1): p. 1–12. Cook, D.A. and R. Hatala, Validation of educational assessments: A primer for simulation and beyond. Advances in Simulation, 2016. 1(1): p. 31. Deshauer, S., et al., Mental skills in surgery: Lessons learned from virtuosos, Olympians, and Navy SEALs. Ann Surg, 2019. 274(1): p. 195–198.
Dreyfus, H.L. and S. Dreyfus. Mind Over Machine: The Power of Human Intuition and Expertise in the Era of the Computer. 1986. New York: Free Press. p. 250. Flin, R., G.G. Youngson, and S. Yule, Enhancing Surgical Performance: A Primer in Nontechnical Skills. 2015. Boca Raton, FL: CRC Press. Franklin, B.R., et al., Piloting the FIRE: A novel error management training simulation curriculum for fasciotomy instruction. J Surg Educ, 2021. 78(2): p. 655–664. Fundamentals of Laparoscopic Surgery. 2021a [cited 2021 4/14/21]; Available from: https://www.flsprogram.org/. Fundamentals of Robotic Surgery. 2021c [cited 2021 4/14/21]; Available from: https://frsurgery.org/. Gala, S.G., et al., Use of animals by NATO countries in military medical training exercises: An international survey. Mil Med, 2012. 177(8): p. 907–10. Gawad, N., et al., The inter-rater reliability of technical skills assessment and retention of rater training. J Surg Educ, 2019. 76(4): p. 1088–1093. Goldberg, M.B., et al., Optimizing performance through stress training - An educational strategy for surgical residents. Am J Surg, 2018. 216(3): p. 618–623. Gomez, P.P., et al., External validation and evaluation of an intermediate proficiency-based knot-tying and suturing curriculum. J Surg Educ, 2014. 71(6): p. 839–845. Gonzalez, R.I., et al., Technical skills rotation for general surgery residents. J Surg Res, 2010. 161(2): p. 179–82.
Greenberg, C.C., et al., Surgical coaching for individual performance improvement. Ann Surg, 2015. 261(1): p. 32–4. Greenberg, C.C., J. Dombrowski, and J.B. Dimick, Video-based surgical coaching: An emerging approach to performance improvement. JAMA Surg, 2016. 151(3): p. 282–283. Hassan, I., et al., Negative stress-coping strategies among novices in surgery correlate with poor virtual laparoscopic performance. Br J Surg, 2006. 93(12): p. 1554–1559. Holmboe, E.S., et al., The role of assessment in competency-based medical education. Med Teach, 2010. 32(8): p. 676–82. Honarmand, K., et al., Adherence to advanced cardiovascular life support (ACLS) guidelines during in-hospital cardiac arrest is associated with improved outcomes. Resuscitation, 2018. 129: p. 76–81. Hudson, K. Prepare Your Graduating Students for Their New Responsibilities in Surgical Care. ACS/APDS/ASE Resident Prep Curriculum 2018a [cited 2021 September 3]; Available from: https://www.facs.org/education/program/resident-prep. Hudson, K. ACS/APDS Surgery Resident Skills Curriculum. 2018b [cited 2021 September 3]; Available from: https://www.facs.org/education/program/resident-skills. Huffman, E., et al., Optimizing assessment of surgical knot tying skill. J Surg Educ, 2020. 77(6): p. 1577–1582. Huffman, E.M., et al., A competency-based laparoscopic cholecystectomy curriculum significantly improves general surgery residents’ operative performance and decreases skill variability: Cohort study. Ann Surg, 2021. 276(6): e1083–e1088. Johnston, M.J., et al., An overview of research priorities in surgical simulation: What the literature shows has been achieved during the 21st century and what remains. Am J Surg, 2016. 211(1): p. 214–225. Jonasson, O. and F. Kwakawa, Retirement age and the work force in general surgery. Ann Surg, 1996. 224(4): p. 574–582. 
Kleinman, M.E., et al., 2017 American heart association focused update on adult basic life support and cardiopulmonary resuscitation quality: An update to the American heart association guidelines for cardiopulmonary resuscitation and emergency cardiovascular care. Circulation, 2018. 137(1): p. e7–e13. Korndorffer, J.R. Jr., D. Stefanidis, and D.J. Scott, Laparoscopic skills laboratories: Current assessment and a call for resident training standards. Am J Surg, 2006. 191(1): p. 17–22. Korndorffer, J.R. Jr., et al., The American college of surgeons/association of program directors in surgery national skills curriculum: Adoption rate, challenges and strategies for effective implementation into surgical residency programs. Surgery, 2013. 154(1): p. 13–20. Kovatch, K.J., et al., Integrated otolaryngology-anesthesiology clinical skills and simulation rotation: A novel 1-month intern curriculum. Ann Otol Rhinol Laryngol, 2019. 128(8): p. 715–720. Macnamara, B.N., D.Z. Hambrick, and F.L. Oswald, Deliberate practice and performance in music, games, sports, education, and professions: A meta-analysis. Psychol Sci, 2014. 25(8): p. 1608–1618. McGaghie, W.C., et al., A critical review of simulation-based medical education research: 2003–2009. Med Educ, 2010. 44(1): p. 50–63. McLaughlin, C., et al., Multidisciplinary simulation-based team training for trauma resuscitation: A scoping review. J Surg Educ, 2019. 76(6): p. 1669–1680. Mizota, T., et al., Development of a fundamentals of endoscopic surgery proficiency-based skills curriculum for general surgery residents. Surg Endosc, 2020. 34(2): p. 771–778. Mohammad, A., F. Branicki, and F.M. Abu-Zidan, Educational and clinical impact of Advanced Trauma Life Support (ATLS) courses: A systematic review. World J Surg, 2014. 38(2): p. 322–329.
Naeem, N., et al., Exploring variability of teaching & supervision at clinical clerkship teaching sites. Pak J Med Sci, 2018. 34(2): p. 368–373. Norcini, J. and V. Burch, Workplace-based assessment as an educational tool: AMEE Guide No. 31. Med Teach, 2007. 29(9): p. 855–871. O’Brien, S.M., et al., Design and development of the BD Odon Device(TM): A human factors evaluation process. Bjog, 2017. 124(Suppl 4): p. 35–43. Okuda, Y., et al., The utility of simulation in medical education: What is the evidence? Mt Sinai J Med, 2009. 76(4): p. 330–343. Olasky, J., et al., ACS/ASE medical student simulation-based skills curriculum study: Implementation phase. J Surg Educ, 2019. 76(4): p. 962–969. Parham, G., et al., Creating a low-cost virtual reality surgical simulation to increase surgical oncology capacity and capability. Ecancermedicalscience, 2019. 13: p. 910. Park, A.E., et al., Intraoperative “micro breaks” with targeted stretching enhance surgeon physical function and mental focus: A multicenter cohort study. Ann Surg, 2017. 265(2): p. 340–346. Pena, G., et al., Nontechnical skills training for the operating room: A prospective study using simulation and didactic workshop. Surgery, 2015. 158(1): p. 300–309. Pradarelli, J.C., et al., Assessment of the Non-Technical Skills for Surgeons (NOTSS) framework in the USA. Br J Surg, 2020. 107(9): p. 1137–1144. Pucher, P.H., et al., Ward simulation to improve surgical ward round performance: A randomized controlled trial of a simulation-based curriculum. Ann Surg, 2014. 260(2): p. 236–243. Robertson, J.M., et al., Operating room team training with simulation: A systematic review. J Laparoendosc Adv Surg Tech A, 2017. 27(5): p. 475–480. Scott, D.J. and G.L. Dunnington, The new ACS/APDS skills curriculum: Moving the learning curve out of the operating room. J Gastrointest Surg, 2008. 12(2): p. 213–221. 
Seymour, N.E., et al., Virtual reality training improves operating room performance: Results of a randomized, double-blinded study. Ann Surg, 2002. 236(4): p. 458–463; discussion 463–464. Sloan, D.A., et al., Use of an Objective Structured Clinical Examination (OSCE) to measure improvement in clinical competence during the surgical internship. Surgery, 1993. 114(2): p. 343–350; discussion 350–351. Spanager, L., et al., Comprehensive feedback on trainee surgeons’ non-technical skills. Int J Med Educ, 2015. 6: p. 4–11. Sroka, G., et al., Fundamentals of laparoscopic surgery simulator training to proficiency improves laparoscopic performance in the operating room-a randomized controlled trial. Am J Surg, 2010. 199(1): p. 115–120. Stefanidis, D., et al., Simulation in surgery: What’s needed next? Ann Surg, 2015. 261(5): p. 846–853. Stefanidis, D., J.R. Korndorffer, and R. Sweet, Comprehensive Healthcare Simulation: Surgery and Surgical Subspecialties. 2019. Cham, Switzerland: Springer. Stefanidis, D., et al., Advanced modular manikin and surgical team experience during a trauma simulation: Results of a single-blinded randomized trial. J Am Coll Surg, 2021. 233(2): p. 249–260.e2. Sternbach, J.M., et al., Measuring error identification and recovery skills in surgical residents. Ann Thorac Surg, 2017. 103(2): p. 663–669. Sudan, R., et al., American college of surgeons resident objective structured clinical examination: A national program to assess clinical readiness of entering postgraduate year 1 surgery residents. Ann Surg, 2014. 260(1): p. 65–71. Tang, T.S. and E.P. Skye, When patients decline medical student participation: The preceptors’ perspective. Adv Health Sci Educ Theory Pract, 2009. 14(5): p. 645–653.
Fundamentals of Endoscopic Surgery. 2021b [cited 2021 4/14/21]; Available from: https://www.fesprogram.org/. Vaidya, A., et al., Current status of technical skills assessment tools in surgery: A systematic review. J Surg Res, 2020. 246: p. 342–378. Wee, I.J.Y., L.J. Kuo, and J.C. Ngu, A systematic review of the true benefit of robotic surgery: Ergonomics. Int J Med Robot, 2020. 16(4): p. e2113. Wetzel, C.M., et al., The effects of stress on surgical performance. Am J Surg, 2005. 191(1): p. 5–10. Witthaus, M.W., et al., Incorporation and validation of clinically relevant performance metrics of simulation (CRPMS) into a novel full-immersion simulation platform for nerve-sparing robot-assisted radical prostatectomy (NS-RARP) utilizing three-dimensional printing and hydrogel casting technology. BJU Int, 2020. 125(2): p. 322–332. Wright, Jr, J.R. and N.S. Schachar, Necessity is the mother of invention: William Stewart Halsted’s addiction and its influence on the development of residency training in North America. Can J Surg, 2020. 63(1): p. E13–E19. Wulf, G. and R. Lewthwaite, Optimizing performance through intrinsic motivation and attention for learning: The optimal theory of motor learning. Psychon Bull Rev, 2016. 23(5): p. 1382–1414. Yeh, D.H., K. Fung, and S. Malekzadeh, Boot camps: Preparing for residency. Otolaryngol Clin North Am, 2017. 50(5): p. 1003–1013. Zendejas, B., R.K. Ruparel, and D.A. Cook, Validity evidence for the Fundamentals of Laparoscopic Surgery (FLS) program as an assessment tool: A systematic review. Surg Endosc, 2016. 30(2): p. 512–520. Ziv, A., et al., Simulation-based medical education: An ethical imperative. Acad Med, 2003. 78(8): p. 783–788.
10
Healthcare Simulation Methods
A Multifaceted Approach
Amy L. Hanson and Aaron W. Calhoun
CONTENTS
Introduction
  What is Healthcare Simulation?
Concepts in Healthcare Simulation
  Simulation Fidelity
  Psychological Safety
Locations and Modes of Healthcare Simulation
Applications of Healthcare Simulation
  Simulation for Practice and Learning
    Simulations with Reflective Debriefing
    Rapid Cycle Deliberate Practice
    Mastery Learning
    Just-in-Time Training
  Choice of Training Type
  Simulation for Evaluation and Testing
  Systems Testing
Conclusion and Future Directions
References
DOI: 10.1201/9781003401353-10
INTRODUCTION
Since ancient times, the art and practice of medicine has been grounded in the concept of primum non nocere: “first, do no harm” (Aggarwal et al., 2010). This phrase enjoins practitioners to weigh the possible harm of any intervention against its potential benefits. Despite the general acceptance of this moral adage, in 2000 the Institute of Medicine reported that up to 98,000 patient deaths per year in US hospitals were due to medical error (Kohn et al., 2000). The “see one, do one” approach to medical training that so long formed the core of educational practice in medicine was called into question by this data, forcing a reexamination of the degree to which inexperienced healthcare trainees practiced their craft directly on patients (Aggarwal et al., 2010). Here, simulation-based training entered as a logical solution, enabling
trainees to gain needed skills while reducing the exposure of patients to risk and preventable harm. In fact, the idea of practicing medical procedures on inanimate objects before caring for human beings dates back thousands of years (Owen, 2012). During the Song dynasty in China in 1027, the imperial physician, Wang Wei-Yi, created bronze statues covered with small holes in the surface that were used to teach students of acupuncture about surface anatomy (Owen, 2012). The statues were then covered in wax and filled with liquid so that a drip off the end of a removed needle could help confirm appropriate anatomical placement of the acupuncture needle (Owen, 2012). In the late seventeenth century, fueled by sufficient concern over infant and maternal childbirth mortality rates and a need for increased training among midwives, Drs. Grégoires, a father and son team, produced a manikin of the pregnant female abdomen, then referred to as a “phantom” (Buck, 1991). Made of basket-weave and covered with oil-skin and cloths, the simulator was used to train midwives on birthing techniques after lecture education alone failed to translate to improved childbirth outcomes (Buck, 1991; Gardner & Raemer, 2008). Concern over the risk of harm and prioritization of patient safety sparked a resurgence of this practice in modern times. As this transition accelerated, changes also took place in our understanding of adult learning theory (Kolb, 1984). These theoretical developments argued that lectures (i.e., “telling people what they should know”) were not robust as an educational method in this population. Instead, the adult learner needs immersive and experiential learning that correlates with relevant real-world problems they have been charged to address. 
In 1984, Kolb, an American educational theorist, detailed how adult learning is an “experiential, conflict-filled process out of which the development of insight, understanding and skills come.” Kolb described a recurring cycle of learning that begins with the learner first having a new, concrete “real-world” experience that they wish to learn more about. This naturally leads to a second phase in which the experience is reviewed and reflected upon. Kolb suggests that this period of reflection is critical to learning, as it is here where learner “breakthroughs” can occur and mindsets can change. In the third phase, the learner generates new concepts and conclusions based on this reflection, solidifying what they have learned. Finally, in the fourth phase, the learner actively experiments with these conclusions, applying them via active experimentation. This, in turn, generates new experiences and observations, and the cycle is repeated in an ongoing fashion. Simulation was quickly recognized by the healthcare community as a training modality that naturally fit into this approach and provided needed opportunities for experiential learning (Aggarwal et al., 2010). By providing a simulated experience followed by a facilitated debriefing, learners are exposed to new situations (Phase 1 of the Kolb Cycle) and actively encouraged to reflect on them (Phase 2 of the Kolb Cycle; Kolb, 1984). Thus, learners can more carefully examine their own actions in a guided way and think critically about what was done well and how their actions may have differed from what needed to be done. Simulation also provides an environment in which active experimentation (Phase 4 of the Kolb Cycle) can occur at no risk to patients. By directly facilitating these aspects of the adult learning process, simulation has the potential to enhance learning far beyond what can be accomplished via
traditional didactic approaches. The uptake of healthcare simulation has been extensive, and there are now more than 825 healthcare institutions formally providing simulation-based training worldwide (ssih.org). Technological advancements also formed a crucial part of this developmental process (Aggarwal et al., 2010). The earliest full-body manikin simulators were used in the 1960s for anesthesia (SimOne) and cardiology (Harvey) and for cardiopulmonary resuscitation and mouth-to-mouth training (Resusci-Anne; Cooper & Taqueti, 2008). In the 1980s, personal computer technology became more affordable and software more accessible to industries like aviation, space, military, and nuclear power. This paved the way for the extensive development of the techniques that later took place. In the 1990s, Dr. David Gaba, an important initial pioneer in the field, and colleagues created the first comprehensive anesthesia simulation modules known as CASE 1.2 (Comprehensive Anesthesia Simulation Environment). With this platform, what is now commonly referred to as high-fidelity healthcare simulation entered the field of medicine. As the field of anesthesia, and later other medical subspecialties, adopted simulation, “crew resource management” training concepts that had long been used in aviation were adapted to the healthcare environment. Renamed “crisis resource management” to better fit the healthcare environment, this educational approach was used in concert with high-fidelity simulation to improve teamwork during medical emergencies by providing practitioners with a series of conceptual steps that could be used to maximize teamwork in response to a simulated crisis. Since that time, healthcare simulation has further grown and evolved, and now embraces a wide array of technology and techniques that address almost every aspect of healthcare education.
What is Healthcare Simulation?
While many definitions of healthcare simulation have been developed and/or used over the past decades, the Society for Simulation in Healthcare (SSH) defines healthcare simulation as “a technique that creates a situation or environment to allow persons to experience a representation of a real health care event for the purpose of practice, learning, evaluation, testing, or to gain understanding of systems or human actions” (Lioce et al., 2020). This generally agreed-upon definition is purposefully broad, deliberately reflecting the now widespread use of simulation-based methods in healthcare. The remainder of this chapter is structured to provide an overview of the concepts of fidelity and psychological safety, which are critical aspects of healthcare simulation, followed by a review of both the physical characteristics of healthcare simulation environments and how simulation is specifically applied in the healthcare environment for both teaching and assessment.
CONCEPTS IN HEALTHCARE SIMULATION
Simulation Fidelity
Fidelity refers to “the degree to which a simulation replicates the real-event and/or workplace” and is a universal aspect of all simulated experiences (Lioce et al., 2020).
As a construct, fidelity can be further divided into physical, cognitive and emotional elements of realism (Lioce et al., 2020). Physical realism refers to the physical properties of the manikin or equipment itself. For example, the manikin will have better physical realism if its weight is similar to that of a similarly sized human being (Dieckmann et al., 2007). Likewise, other physical aspects of the manikin, such as the force generated by moving its chest wall during compressions, the appearance of the materials composing its airway or the haptics (i.e. tactile perceptions) felt when placing a breathing tube into the trachea help to determine the physical realism for the learner. Despite their human shape and ever-improving technologies, existing manikins still possess unrealistic qualities – such as breath sounds that are still distinguishable from actual breath sounds and the obviously synthetic materials that comprise its “skin.” Physical realism also applies to the environment in which the simulation is conducted – whether in an actual medical workplace or a simulation lab – and how well the physical space replicates the actual work environment. If a simulation takes place in an emergency room but in an infrequently used back room as opposed to the actual resuscitation bay where people resuscitate actual patients, physical realism suffers to some extent. Cognitive realism concerns concepts, care decisions, and their relationships within the simulation (Dieckmann et al., 2007). For example, if severe hemorrhage occurs in a living patient, then a high heart rate and low blood pressure will follow over a fairly well-defined time course. If this anticipated time course is not roughly followed during a simulation, learners will perceive a gap in realism. It is important to note that the actual delivery mechanism of this information is irrelevant to cognitive realism. 
This means that the mode by which the team learns of the hemorrhage – whether through a labor-intensive process of concocting and applying simulated blood to the manikin versus viewing an image of a bleeding patient, or simply being told by the facilitator – has little bearing on the cognitive realism of the simulation. As long as the manikin’s response as it relates to vital signs, physical findings, or behavior follows what would physiologically occur in response to provided interventions, cognitive realism will be preserved. Emotional realism refers to the emotions, beliefs and self-awareness that participants directly experience during the simulation (Dieckmann et al., 2007). It represents the degree to which the simulation evokes the feelings or emotions that learners would expect to experience in a real situation (Rudolph et al., 2014). This aspect of fidelity is largely independent of the knowledge content of the case but can have significant impact on the learning that occurs. For this reason, it is often addressed upfront during debriefing in what is often termed a “reactions” phase, which explores the emotional experiences and reactions of participants (Eppich et al., 2015). During the initial uptake of simulation by the healthcare field, the fidelity of a simulation was felt to unilaterally correspond to the overall “realism” of a case and was seen as one of the most critical elements for effective learning. Under this assumption, healthcare simulation used the close replication of reality as a gold standard (Dieckmann et al., 2007). This view has been repeatedly and consistently challenged, however, over the intervening decades (PW, 1973; Hays & Singer, 1989; Jentsch et al., 2011). Studies have consistently failed to show a benefit to higher
fidelity in terms of training outcomes (Issenberg et al., 2005). Furthermore, data suggest that relatively low-fidelity simulations can also lead to effective learning (Beaubien & Baker, 2004; Salas & Burke, 2002). It has been noted that participants experience a simulated scenario both as a complex real-time situation in which they interact with specific equipment, human actors and aspects of the environment, and as an educational event intended to approximate an actual clinical encounter. If participants appreciate how the simulated scenarios apply to clinical practice, they are likely to accept lower physical, cognitive, and emotional fidelity while still deriving educational benefit from the experience. Furthermore, it appears that the success of a simulation also depends on a host of other factors that extend past the fidelity or realism of the simulator or simulation event. The “social practice” of simulation, for example, plays an integral role in the process (Dieckmann, 2020). Social practice is defined as a “contextual event in time and space, conducted for one or more purposes, in which people interact in a goal-oriented fashion with each other, with technical artifacts (the simulator) and with the environment (and relevant devices).” In terms of healthcare simulation, this term does not refer to team interactions within the simulation, but rather the larger scale interactions between learners and the entire simulation process. This includes the explicit learning objectives for the simulation as well as the learners’ understanding of them and affects how they choose to interpret what has transpired during the session (Dieckmann et al., 2007). 
As an example, imagine that prior to a simulation one learner is overheard saying to another, “if we just get the patient intubated and don’t screw it up, maybe they’ll be satisfied.” In this context, the learner is aiming to please the simulation facilitator and is underappreciating the true goal of the exercise, which is their own professional learning and development. Simulation is truly a complex social endeavor (Dieckmann, 2020).
Psychological Safety
Necessary to the appropriate social functioning of healthcare simulation is the establishment of a safe learning environment prior to engagement in the simulation exercise. Historically, most simulations address this in a “pre-briefing,” which is an “orientation session held prior to the start of a simulation activity in which instructions and preparatory information are given to the participants” (Lioce et al., 2020). During the pre-briefing, the stage is set for the learning experience by clarifying the goals and expectations for the session and attending to logistical details (Rudolph et al., 2014). The pre-brief may also include an explanation of the strengths and weaknesses of simulation and what participants can do to get the most out of the simulated clinical experiences. This commonly involves invoking a “fiction contract,” whereby participants are asked to behave as if the situation is real (Gardner & Raemer, 2008; Rudolph et al., 2014). In doing so, the merits of simulation may be more fully realized. While these are some well-attested ways of addressing these simulation issues in pre-briefing, other practical approaches do exist. In the final portion of the pre-briefing (or, in some approaches, just prior to the debriefing), the facilitator commits to respecting the learners and their psychological
safety and confidentiality (Rudolph et al., 2014). Here the participants are encouraged to share their thoughts and questions about the simulation and debriefing and are reassured that they will not be chastened or humiliated in the process. This sets up the team for risk-taking in the name of learning, which is vital given that their professional skills are in many ways on display during the event. The facilitator will often also concede that the simulation can only mimic reality to a certain point and acknowledge the limitations of the simulation modality being used, which can help normalize any fidelity-related issues that may arise. Clearly voicing a commitment to the participants can also help curb counterproductive defense mechanisms, such as blaming perceived lapses in simulation realism for poor performance. When participants perceive that the ground rules are fair, they are often more willing both to engage with the learning objectives and to critically reflect on their own performance (Calhoun et al., 2020a, 2020b). The dual issues of fidelity and psychological safety can intersect in complex ways when simulations containing emotionally difficult subject matter, in particular patient death, are in view (Calhoun et al., 2015; Truog & Meyer, 2013). Leighton et al. propose a helpful heuristic that distinguishes between situations in which manikin death is both planned by facilitators and explicitly revealed to learners prior to the case, situations in which the death is planned but learners are unaware of this, and situations in which manikin death is not initially planned but emerges instead as a consequence of learner action or inaction (Leighton, 2009). Each situation requires a different approach to assure psychological safety. Manikin death due to learner action or inaction remains somewhat controversial. 
Much of the current literature on this subject focuses on the interaction between the stress of the event and eventual knowledge and skill retention, with mixed results (Heller et al., 2016; Fraser et al., 2014; Bryson & Levine, 2008; Demaria et al., 2010; Lizotte et al., 2015; Phrampus & Cole, 2005). Some of this literature, noting that learning seems in some cases to improve when a close relationship exists between learner actions and outcome, further suggests that manikin death due to learner actions may enhance both fidelity and the learner’s sense of agency within the simulation (Goldberg et al., 2017; Calhoun & Gaba, 2017). Learner experience also appears to play a role in the appropriateness of the technique. A number of cognitive models have recently been proposed that attempt to more comprehensively represent both learning and emotional stress under these conditions (Tripathy et al., 2016; McBride et al., 2017).
LOCATIONS AND MODES OF HEALTHCARE SIMULATION
Simulation centers became popular and attracted substantial investment starting in the 1990s as “schools” that offer simulation training as the primary teaching modality (Lateef et al., 2021). These centers may be particularly useful in the training and testing of newer trainees. With time, however, the benefits of in-situ simulation, or “simulations that take place in the actual patient care settings/environment in an effort to achieve a high level of fidelity and realism” (Lioce et al., 2020) have been increasingly appreciated. In-situ simulation is also valuable when the goal is not primarily
to educate, but instead to assess, troubleshoot, or develop new systems processes, as the actual work environment is on display. This offers the ability to detect latent safety threats and alter logistical details or system operations. Medicine makes use of a variety of types of simulator modalities, including manikin-based, task trainers, standardized patients, screen-based simulation (i.e., distance simulation), and virtual reality. Manikin-based simulation was the earliest adopted way of conducting healthcare simulation. Low-fidelity manikins are static replicas of the human form that are capable of only the most basic movements, such as chest rise when ventilated, or chest recoil during cardiopulmonary resuscitation. Despite this simplicity, they can be quite effective tools on which to practice rescue breathing, bag-valve mask ventilation, or cardiopulmonary resuscitation. Higher-fidelity manikins, in contrast, contain computer-driven mechanisms that can dynamically display heart sounds, lung sounds, pulses, skin perfusion assessments, pupil responsiveness, and electrocardiographic waveforms. Newer manikins can even speak with increasing realism, exhibit varied facial expressions, cry, mimic stroke symptoms, and be realistically intubated. An additional simulator type that is closely related to the manikin is the task trainer. These replicate only one part of the body (such as an arm, wrist, or spinal column) and are typically used for procedural skills training (surgical technique, lumbar puncture, central line placement, etc.). Task trainers are often simple in construction and rely on specialized materials to replicate the feel and pliability of specific human tissues. As point-of-care ultrasound has grown in popularity, many task trainers also incorporate materials capable of transmitting ultrasound waves to allow practice of this diagnostic modality as well. A third modality is the standardized patient (SP). 
SP simulations are simulations “using a person or persons trained to portray a patient scenario or actual patient(s) for health care education” (Lioce et al., 2020). An SP may also be called upon to act as the parent of a patient during a simulation. SPs, as living humans, can provide direct feedback to learners in a way that artificial manikins cannot, especially regarding relational skills such as history-taking and physical examination. SPs receive training to assure that this feedback is focused and of high quality, and their commentary can thus significantly surpass in quality the feedback given during face-to-face interaction with an actual patient. SPs have been shown to be effective in the rehearsal of difficult conversations such as disclosing a medical error or breaking bad news (Borghi et al., 2021; Bell et al., 2014; Meyer et al., 2009; Peterson et al., 2012, 2021). Other simulator modalities include screen-based forms of healthcare simulation such as serious games and virtual hospitals (Bracq et al., 2019). Virtual reality makes use of specialized goggles and haptic feedback devices that allow the learner to engage in a fully virtual setting where medical care or procedures can be performed. While virtual reality can take significant initial investment, ongoing costs are relatively low when compared to manikin-based simulation, and it is thus increasingly seen as a lower-cost alternative. Medicine is also making use of augmented reality, where virtual constructs are visually overlaid on a real environment to assist in the educational process. An example of this is a recently developed birthing simulator manikin which can overlay virtual reality
components onto the manikin using virtual reality goggles. This allows the provider to “see inside” the pregnant woman’s abdomen and directly observe the effects of the mother’s condition, and their actions, on the baby within (CAE, 2021). Finally, the Covid-19 pandemic has posed significant challenges to current simulation practices, which had previously required people to gather in close groups for education. New social distancing requirements drove a proliferation of tele-simulation, or distance simulation techniques, in which groups of learners and facilitators were brought together using video conferencing platforms to focus on the care of a simulated patient (Wagner et al., 2020). This is typically accomplished by either streaming live, on-site manikin interactions to remote participants, or by creating an entirely computer-based educational environment via photographs, videos, and vital sign display software that can be displayed using remote conferencing software (Gross et al., 2020). The lessons learned from this have opened up an array of new educational possibilities, especially for remote or low-resource environments, or for situations where physical presence or space is not necessary for learning to occur (Patel et al., 2020). One notable challenge involved in carrying out tele-simulation is the constraint that is imposed on learner engagement by physical separation (Cheng et al., 2020). Debriefing, in particular, relies heavily on the ability to form and maintain relationships, which can be difficult to do under these circumstances. Debriefing successfully under these conditions requires the establishment of a culture of safety, confidentiality, and openness to self-reflection within the learner group. The educational theories and approaches that can better define the best practices involved in doing tele-simulation optimally are actively being studied and tested.
APPLICATIONS OF HEALTHCARE SIMULATION
Healthcare simulation is typically conducted either for practice and learning or for evaluation and assessment. Simulation for practice and learning embraces four basic educational approaches: Reflective Debriefing (i.e., a standard sequence of pre-briefing, simulation and debriefing), Rapid Cycle Deliberate Practice, Mastery Learning, and Just-in-Time training. Key approaches to simulation for evaluation and assessment focus primarily on the object of assessment: the evaluation and assessment of learners or the evaluation/assessment of system operations or processes. The remainder of the chapter will address each of these in turn.
Simulation for Practice and Learning
Simulations with Reflective Debriefing
The majority of this chapter thus far has focused on “traditional” simulations that begin with a pre-brief phase, proceed to an immersive simulated patient event, and end with a subsequent facilitated team debriefing. Given that a great deal of this has already been discussed, we will focus here on the debriefing process. Debriefing is a critical component of this approach, as it allows for the clarification and consolidation of insights and lessons learned from the simulation. Debriefing is a “conversation between two or more people to review a simulated event or activity
in which participants explore, analyze, and synthesize their actions and thought processes, emotional states and other information to improve performance in real situations” (Brett-Fleegler et al., 2012). Using this approach, debriefing is held in a separate space from the simulated patient event itself to allow for adequate psychological distance from the event. Participants are made to feel comfortable and psychologically safe to share their observations and personal analysis, and to give and receive feedback. High participant engagement is necessary for transfer of knowledge and skills to the real clinical setting, and the facilitator plays a key role in assuring this. A well-conducted debriefing directs learners to consider the frames, or “mental models,” that shape their actions. For example, if an anesthesiologist holds the frame that “I must have both a bag-valve mask and oxygen source to ventilate this patient” (the standard way of providing assisted respirations in a hospital environment), they may, when presented with a scenario where a patient stops breathing and there is no oxygen source readily available, search relentlessly for an oxygen hookup or tank, during which time the patient may experience severe hypoxemia and eventual cardiac arrest. Exploring the mental model that led to this action can help uncover opportunities for changing practice and facilitate troubleshooting in real time. In this example, that might include expanding this frame to explicitly include providing bag-valve mask ventilation with room air (a physiologically viable option) while another provider locates oxygen for the patient. This may also bring up opportunities to discuss what to do in the event that no bag-valve mask is available, such as providing mouth-to-mask ventilation or passive oxygenation, thereby expanding the learners’ frames of reference even further. Various debriefing tools have been developed to aid the facilitator in hosting a successful debriefing. 
The PEARLS tool is one such tool that focuses on a few main components (Bajaj et al., 2018). The initial reactions phase begins by allowing learners to express their emotions and feelings regarding the case. This both initiates the conversation and provides facilitators with information about what should be debriefed and how it might best be approached. Then comes a descriptive phase in which facts of the case are clarified and the team develops a shared understanding of the diagnosis. This is particularly important as it sets the “ground truth” regarding the actual physiologic process that the simulator’s actions were intended to represent. The debriefing next proceeds to the analysis phase, in which the team discusses aspects of the case that were managed well and aspects of the case the team might want to change, along with accompanying rationales (the “plus/delta” approach). Learning points and takeaways are then summarized in the final application phase. Advocacy/Inquiry is another key technique used in the analysis phase. This approach uses probing questions to encourage learners to move beyond performance assessment to a deeper consideration of the overall frames of reference that contributed to their actions during the simulation. For example, the facilitator may say, “I noticed there was a 3-minute delay in recognizing that the patient had a shockable rhythm. To me it seemed like the team’s attention was focused solely on intubating the patient and it was not clear that anyone noticed the rhythm change. How did the team see it?” By doing this, learners are encouraged to examine their overall approach to care, a process that can lead to ongoing reflection in real practice (Rudolph, 2014).
Rapid Cycle Deliberate Practice
Rapid Cycle Deliberate Practice (RCDP) is an alternative educational approach that builds on the concept of “deliberate practice” described by Ericsson et al. (Perretta et al., 2020; Hunt et al., 2014; Ericsson, 2006). In medicine, physicians aspire to be “experts” in their chosen field, which Ericsson defines as someone “able to perform at virtually any time with relatively limited preparation.” Deliberate Practice is a “systematically designed activity that has been created specifically to improve an individual’s performance in a given domain” (Ericsson & Harwell, 2019). Traditional notions of professional expertise presupposed that it was directly linked to length of experience, reputation, and perceived mastery (Ericsson et al., 1993). Research, however, has subsequently demonstrated that only weak relationships exist between these factors and actual, observed performance and skill (Ericsson, 2008). While deliberate practice was initially studied with regard to the purposeful, intentional training and coaching of musicians, athletes and chess players, the merits of the technique translate well to many other areas, including medicine. Hunt et al. coined the term RCDP in 2014, describing it as a “learner-centered simulation instructional strategy that identifies performance gaps and targets feedback to improve individual or team deficiencies” (Perretta et al., 2020; Hunt et al., 2014). The first principle of RCDP is to maximize the time learners spend deliberately practicing an activity or procedure, which allows for the timely correction of “bad habits/mistakes.” Participants are given multiple opportunities to “do it right,” through repeated rehearsals and overlearning to create procedural memories (i.e., “muscle memory”) of what things feel like when performed correctly. 
The second principle of RCDP is the opportunity for problem-solving and evaluation, aided by the instructors providing specific evidence-based or expert-derived solutions for common problems. New information is presented in smaller, bite-sized chunks for incorporation and reinforcement of learning in real time. The third principle involves fostering psychological safety so learners can embrace direct feedback and incorporate it into the next iteration of practice while refining their skill and avoiding defensiveness that can limit their growth and learning (Hunt et al., 2014). Confidence is instilled in a safe learning environment where anxiety is reduced and open communication is promoted. Elements of Deliberate Practice are listed in Table 10.1 (McGaghie et al., 2011). In healthcare, RCDP has been successfully used to improve procedural competence, educate learners in crisis resource management principles, and teach the skills needed to manage complex medical and traumatic emergencies (Perretta et al., 2020). RCDP now forms the conceptual backbone of many simulation-based learning curricula in emergency medicine, pediatric and adult medicine, neonatology, and critical care. RCDP has also been shown to measurably improve the quality and timeliness of cardiopulmonary resuscitation (CPR) and defibrillation, novice providers’ adherence to an intubation checklist, and the ability of emergency physicians to compassionately engage in difficult conversations. Table 10.2 lists situations in which RCDP may be of particular utility (Perretta et al., 2020).
TABLE 10.1 The Elements of Deliberate Practice
1. Highly motivated learners
2. Well-defined learning objectives
3. Appropriate level of difficulty
4. Focused, repetitive practice
5. Rigorous measurements
6. Informative feedback
7. Monitoring, error correction and more deliberate practice
8. Performance evaluation toward mastery standards
9. Advancement toward the next task
Source: Barsness, K.A. Achieving expert performance through simulation-based education and application of mastery learning principles. (2020). Seminars in Pediatric Surgery, 29(2), 150904.
TABLE 10.2 Circumstances in Which RCDP May Have a Favorable Impact on Clinical Performance
1. Existing, well-established performance guidelines: The Institute of Medicine recommends establishing performance standards to minimize risk of harm to patients. These can originate from national guidelines, institutional protocols, or expert consensus. Using standards provides an objective way to measure learner performance and an ability to build prescriptive feedback.
2. A need for learners to master key behaviors: RCDP continuously provides formative testing of learner performance against established standards. Instructors will not advance to the next learning objective until learners achieve the current objective. This describes several of the complementary features of mastery learning incorporated into RCDP.
3. Limited teaching time: Published RCDP studies show that it allows learners to master a large amount of content within one standard time frame, making it a good option when learners have a short time to master a topic, or if it is a stand-alone or self-sustaining course.
4. Low-volume, high-risk, time-sensitive events: RCDP has been associated with improved performance during simulated low-volume, high-risk, time-sensitive events.
5. Team situations requiring or benefiting from specific scripting and/or choreography: RCDP can facilitate utilization of shared mental models for patient assessment, explicit choreography for patient management, role delineation, shared language, and interdisciplinary procedural training.
Source: Perretta, J. S., Duval-Arnould, J., Poling, S., Sullivan, N., Jeffers, J. M., Farrow, L., Shilkofski, N. A., Brown, K. M., & Hunt, E. A. (2020). Best practices and theoretical foundations for simulation instruction using Rapid-Cycle Deliberate Practice. Simul Healthc, 15(5), 356–362.
A commonly used strategy in RCDP is to present an initial scenario that unfolds in an uninterrupted manner (Perretta et al., 2020). This allows facilitators to identify performance gaps that they can then focus on during subsequent parts of the session. For example, in the case of a sudden cardiac arrest, the team can be assessed on their ability to assess for a pulse, initiate CPR, place pads, and defibrillate for a shockable rhythm during an initial simulation. This data can then be used to identify gaps in clinical skill that can shape the focus of subsequent iterative simulations conducted during the same session.
Mastery Learning
Mastery learning is founded on John Carroll’s 1963 work on educational theory (Carroll, 1963). Mastery learning is a curriculum style grounded on two fundamental beliefs (Eppich, 2015). First, all learners can and will achieve a uniform performance goal. Second, it will take some learners longer than others to achieve these goals. A mastery learning approach holds that, given enough time, every learner can achieve proficiency standards and competence in any given skill. While one learner may achieve these objectives in 20 minutes, it may take another learner many hours to accomplish the same task. Both, however, can and will reach this endpoint given sufficient opportunity. Mastery learning is thus a logical development of the deliberate practice approach, as it pairs repetitive deliberate practice and robust feedback that compares learner performance with an externally determined standard (Block & Airasian, 1971). Here the curriculum and learner performance are fixed and time to achieving mastery performance is variable. This can be challenging to accommodate in the field of medicine, however, as educational sessions often have durations that are predetermined by the ability of the instructors and learners to free themselves from clinical duties. Table 10.3 lists key elements of mastery learning (McGaghie et al., 2015). 
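For readers who find a procedural view helpful, the fixed-standard, variable-time logic of mastery learning described above can be sketched in a few lines of Python. This is only an illustrative sketch and not a validated curriculum tool: the `Unit` class, the `run_mastery_curriculum` and `make_learner` functions, the unit names, the 0 to 100 score scale, and the linear improvement model are all assumptions introduced here for illustration and do not come from the cited literature.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Unit:
    """One educational unit with a fixed (hypothetical) mastery standard."""
    name: str
    passing_score: int  # assumed 0-100 scale, chosen only for illustration


def run_mastery_curriculum(
    units: List[Unit],
    attempt: Callable[[Unit, int], int],
    max_attempts_per_unit: int = 50,
) -> Dict[str, int]:
    """Return how many attempts each unit took to reach its standard.

    The standard is fixed for every learner; only the number of
    practice-and-feedback cycles (a stand-in for time) varies.
    """
    attempts_needed: Dict[str, int] = {}
    for unit in units:  # units are taken in their sequenced order
        for attempt_number in range(1, max_attempts_per_unit + 1):
            score = attempt(unit, attempt_number)  # formative test
            if score >= unit.passing_score:
                # Advance only once the mastery standard is met.
                attempts_needed[unit.name] = attempt_number
                break
        else:
            raise RuntimeError(f"standard not reached for {unit.name}")
    return attempts_needed


def make_learner(gain_per_attempt: int) -> Callable[[Unit, int], int]:
    """Toy learner whose score rises linearly with each attempt (assumed)."""
    def attempt(unit: Unit, attempt_number: int) -> int:
        return min(100, gain_per_attempt * attempt_number)
    return attempt


# Hypothetical two-unit curriculum and two learners with different pacing.
units = [Unit("pulse check", 80), Unit("defibrillation", 90)]
fast_learner = run_mastery_curriculum(units, make_learner(50))
slow_learner = run_mastery_curriculum(units, make_learner(10))
```

In this toy run both learners end up meeting the same standards on both units; the faster learner needs two attempts per unit, while the slower learner needs eight and nine. The scheduling difficulty noted above corresponds to the `max_attempts_per_unit` cap, which a fixed-length teaching session would impose in practice.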
TABLE 10.3 Elements of Mastery Learning
1. Baseline testing/assessments
2. Clear learning objectives, sequenced as units of increasing difficulty
3. Engagement in educational activities that are focused on the learning objectives
4. A set of proficiency standards for each educational unit
5. Formative testing with feedback to assess completion of each unit
6. Advancement to the next educational unit once mastery standard achieved in current unit
7. Continued practice until mastery standards achieved for all units
Source: McGaghie, W.C. (2015). Mastery learning: It is time for medical education to join the 21st century. Acad Med, 90(11), 1438–1441.

Paramount to achieving mastery learning goals is understanding how to structure the feedback and debriefing that the learner receives (Eppich & Cheng, 2015). For mastery learning to work, the feedback and debriefing given to learners need to be interspersed throughout the educational session so that they can inform practice and guide ongoing performance improvement efforts. While traditionally structured simulation tends to make use of longer debriefing sessions, mastery learning instead focuses on “micro-debriefings” either within the event (reflecting in-action) or after a short event (reflecting on-action). This approach allows for quick diagnosis of skills that require improvement so that they can be highlighted and focused on in subsequent simulation cycles.

There is increasing evidence that mastery learning can improve clinician performance (Barsuk et al., 2016; Eppich et al., 2015). It has shown benefit in advanced cardiac life support training, in procedural skills training for procedures such as central line placement, and in boot camps for newly graduated intern physicians (Cook et al., 2013; McGaghie et al., 2014; Wayne et al., 2006; Barsuk et al., 2009a, 2009b; Cohen et al., 2013). Mastery learning lends itself well to all types of procedures, such as suturing and surgical technique, intubation, endoscopy, and lumbar puncture, among others (Ritter et al., 2018; Gabrysz-Forget et al., 2020; Franklin et al., 2018; Dyke et al., 2021; Barsuk et al., 2012). The uptake of mastery learning techniques is likely to grow, as the approach fits well with competency-based medical education.

Mastery learning also offers opportunities for stress inoculation training. In stress inoculation training, learners are exposed to graded levels of challenge and stress that mimic a real environment (Chang et al., 2020). This concept was borrowed from military training and is based on the idea that repeated exposure at progressively higher levels of stress and fidelity, growing closer at each phase to the level of stress that might be experienced in reality, can improve downstream performance (Lauria et al., 2017).
For example, suturing silicone skin in a quiet classroom is quite different from repairing a lip laceration on a wiggling, crying, biting toddler. By providing sequential simulations of the laceration repair process with ever-escalating levels of environmental stress, however, the clinician will ultimately be better able to care for such a child in a busy emergency department with parents and other medical staff observing.

Just-in-Time Training
Just-in-time training (JITT) is “a method of training that is conducted directly prior to a potential intervention” (Kamdar et al., 2013). The basic concept is that by providing simulation-based practice immediately prior to the performance of a necessary procedure, the ultimate success of that procedure can be enhanced. This approach has been shown to be successful in such disparate procedures as CPR and lumbar puncture (Niles et al., 2017; Niles et al., 2009; Kessler et al., 2015). The benefits of JITT include a review of appropriate anatomic landmarks, an opportunity to rehearse the procedure, and a chance to ask questions. While this approach can occasionally be difficult to implement given the time constraints often present in healthcare environments, its popularity continues to grow. During the Covid-19 pandemic, JITT was successfully used to guide workflow design for difficult airway management for experienced providers, and it has also been paired with RCDP to improve pediatric intensive care unit team proficiency in identifying and managing postoperative shock in congenital heart disease (Daly Guris et al., 2020; Brown et al., 2021).
Choice of Training Type
Given these options, simulation-based medical educators are then faced with the decision of which to use, and when. While the right choice will be highly dependent on local circumstances, a consideration of the learning objectives, available time, and needed level of post-simulation skill can be helpful. For simulations that focus primarily on critical thinking during complex situations or on organizing large, functional teams, the reflective debriefing approach may be best given its focus on uncovering deep frames of reference. When, on the other hand, procedural skills or more straightforward psychomotor tasks (such as cardiopulmonary resuscitation) are in view, some combination of RCDP and mastery learning may be best. The degree to which the mastery learning paradigm can be fully implemented will also depend, however, on the available time. Finally, JITT is best chosen when there is a need for swift education of immediate clinical relevance (such as in the hour before the arrival of a postoperative cardiac patient with a high risk for physiologic decompensation).
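The rule of thumb above can be restated as a small lookup function. This is a deliberately simplified sketch: the function name, the `focus` labels, and the `immediate_clinical_need` flag are hypothetical, and the sketch does not model available time or local circumstances, both of which the text identifies as important.

```python
def suggest_training_type(focus: str, immediate_clinical_need: bool = False) -> str:
    """Return the training approach suggested by the heuristic in the text.

    `focus` is a hypothetical label for the primary learning objective.
    """
    if immediate_clinical_need:
        # Swift education tied to an imminent clinical event.
        return "just-in-time training"
    if focus in {"critical thinking", "team organization"}:
        return "reflective debriefing"
    if focus in {"procedural skill", "psychomotor task"}:
        return "RCDP and/or mastery learning"
    raise ValueError(f"unrecognized focus: {focus!r}")


print(suggest_training_type("critical thinking"))
print(suggest_training_type("psychomotor task"))
print(suggest_training_type("procedural skill", immediate_clinical_need=True))
```

In practice the decision is a judgment call rather than a lookup; the function only records the default pairing of objectives and approaches described in this section.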
Simulation for Evaluation and Testing
Before simulation became readily available, new techniques and procedures were studied and then adopted and performed using consenting patients in a specialized testing environment (Aggarwal et al., 2010). An example of this occurred during the rapid uptake of laparoscopic surgery for cholecystectomy. Despite virtually no prior training in laparoscopic technique, many surgeons performed this new surgical approach directly on their patients, leading to an unintended increase in morbidity and mortality. The use of simulated environments now provides an opportunity, however, to evaluate and test learners on their competencies before they apply them to real patients.

Simulation-based learner assessment, like all forms of evaluation, can be divided into two basic subtypes: formative and summative (Calhoun et al., 2016; Watling, 2019; Boulet, 2008). Formative assessment often occurs during a course of study and enables facilitators to provide focused feedback that learners can use to alter their future practice. Summative assessment, on the other hand, typically occurs after a course of study and is used to determine whether a learner is capable of practicing a certain skill independently or requires remediation. A growing number of institutions are now using simulation to credential providers in procedures that are rarely encountered in clinical practice but that they must be able to perform should the need arise. In addition, the United States Medical Licensing Examination included, until recently, a simulation-based summative assessment intended to evaluate doctor-patient interactions (Scott et al., 2019; Ali, 2020).

Finally, it is important to note that simulation-based learner assessment, like all forms of learner assessment, must be shown to be both valid and reliable for the decision it is intended to assist.
A number of validity frameworks currently exist that can assist in this process, but their exploration is beyond the scope of this chapter (Downing, 2003; Cook et al., 2015; Tavares et al., 2018; Calhoun & Scerbo, 2022).
Systems Testing
In addition to assessing individual providers, simulation can be used for the evaluation of systems of care. The Institute of Medicine has emphasized that many medical errors are system-related and not attributable to individual negligence or misconduct. Systems testing using simulation-based evaluative methods has been shown to reduce error rates, improve the quality of care provided, and drive changes in the healthcare environment that have tangible effects on patient outcomes (Imach et al., 2020). Given that this approach evaluates a clinical environment in real time, the simulation most often takes place in situ (i.e., within an actual patient care area). This enables simulation facilitators to directly assess how provider teams and systems of care interact.

Applications of simulation for systems testing include the use of simulation to diagnose potential safety issues in a new hospital or medical facility prior to opening, or to identify latent safety issues in currently operational units and correct them before they have an impact on patient care (Hebbar et al., 2018). As an example, consider an in situ code blue simulation conducted in a new hospital ward that is about to open. Prior to engaging in the simulation, the care team is instructed to utilize all systems of care, as they would in an actual emergency, in order to be sure that the space is safe to receive patients. During the case, a nurse discovers that a button on the wall used to activate the code blue process is not working correctly. Because the fault is identified before the ward opens, it can be corrected before it affects an actual patient.

Systems-testing simulations can also be used to establish the point prevalence of care patterns across institutions or to trial new care processes (Maa et al., 2020).
One recent study utilized this methodology to assess the rates of common errors in anaphylaxis management across an array of pediatric institutions, while others have described the use of this approach to prepare institutions to care for patients with highly infectious conditions such as Ebola and, more recently, Covid-19 (Sharara-Chami et al., 2020; Gaba, 2014; Biddell et al., 2016; Phrampus et al., 2016; Lie et al., 2020). Finally, this modality can also be used to evaluate protocols and procedures for disaster preparedness and mass casualty incidents (Gardner et al., 2016; Jung et al., 2016; Castoldi et al., 2020; Jorm et al., 2016). The use of simulation as a means of diagnosing systemic issues in healthcare practice has a wide range of potential applications and is in a state of rapid growth. In the future, we expect this approach to have a broad, positive impact on patient safety efforts worldwide.
CONCLUSION AND FUTURE DIRECTIONS
Once a relatively limited niche within healthcare education, simulation has now become an integral part of the modern healthcare professional’s educational and patient safety armamentarium. Significant evidence demonstrates that its use can help achieve higher provider competence and safer care. The recent Covid-19 pandemic has also spurred a rapid shift to distance simulation modalities, and the healthcare simulation community is currently working to better understand how this family of approaches functions both as an educational tool and as an assessment tool. Finally, the advent of (relatively) lower-cost virtual and augmented reality approaches promises to radically transform the ways in which we integrate simulation-based approaches into healthcare education and practice.
REFERENCES
Aggarwal, R., Mytton, O. T., Derbrew, M., Hananel, D., Heydenburg, M., Issenberg, B., MacAulay, C., Mancini, M. E., Morimoto, T., Soper, N., Ziv, A., & Reznick, R. (2010). Training and simulation for patient safety. Qual Saf Health Care, 19 Suppl 2, i34–i43. https://doi.org/10.1136/qshc.2009.038562
Ali, J. M. (2020). The USMLE step 2 clinical skills exam: A model for OSCE examinations? Acad Med, 95(5), 667. https://doi.org/10.1097/ACM.0000000000003183
Bajaj, K., Meguerdichian, M., Thoma, B., Huang, S., Eppich, W., & Cheng, A. (2018). The PEARLS healthcare debriefing tool. Acad Med, 93(2), 336. https://doi.org/10.1097/ACM.0000000000002035
Barsuk, J. H., McGaghie, W. C., Cohen, E. R., Balachandran, J. S., & Wayne, D. B. (2009a). Use of simulation-based mastery learning to improve the quality of central venous catheter placement in a medical intensive care unit. J Hosp Med, 4(7), 397–403. https://doi.org/10.1002/jhm.468
Barsuk, J. H., McGaghie, W. C., Cohen, E. R., O’Leary, K. J., & Wayne, D. B. (2009b). Simulation-based mastery learning reduces complications during central venous catheter insertion in a medical intensive care unit. Crit Care Med, 37(10), 2697–2701. https://www.ncbi.nlm.nih.gov/pubmed/19885989
Barsuk, J. H., Cohen, E. R., Caprio, T., McGaghie, W. C., Simuni, T., & Wayne, D. B. (2012). Simulation-based education with mastery learning improves residents’ lumbar puncture skills. Neurology, 79(2), 132–137. https://doi.org/10.1212/WNL.0b013e31825dd39d
Barsuk, J. H., Cohen, E. R., Wayne, D. B., Siddall, V. J., & McGaghie, W. C. (2016). Developing a simulation-based mastery learning curriculum: Lessons from 11 years of advanced cardiac life support. Simul Healthc, 11(1), 52–59. https://doi.org/10.1097/SIH.0000000000000120
Beaubien, J. M., & Baker, D. P. (2004). The use of simulation for training teamwork skills in health care: How low can you go? Qual Saf Health Care, 13 Suppl 1, i51–56. https://doi.org/10.1136/qhc.13.suppl_1.i51
Bell, S. K., Pascucci, R., Fancy, K., Coleman, K., Zurakowski, D., & Meyer, E. C. (2014). The educational value of improvisational actors to teach communication and relational skills: Perspectives of interprofessional learners, faculty, and actors. Patient Educ Couns, 96(3), 381–388. https://doi.org/10.1016/j.pec.2014.07.001
Biddell, E. A., Vandersall, B. L., Bailes, S. A., Estephan, S. A., Ferrara, L. A., Nagy, K. M., O’Connell, J. L., & Patterson, M. D. (2016). Use of simulation to gauge preparedness for Ebola at a free-standing children’s hospital. Simul Healthc, 11(2), 94–99. https://doi.org/10.1097/SIH.0000000000000134
Block, J. H., & Airasian, P. W. (1971). Mastery learning: Theory and practice. Holt.
Borghi, L., Meyer, E. C., Vegni, E., Oteri, R., Almagioni, P., & Lamiani, G. (2021). Twelve years of the Italian Program to Enhance Relational and Communication Skills (PERCS). Int J Environ Res Public Health, 18(2). https://doi.org/10.3390/ijerph18020439
Boulet, J. R. (2008). Summative assessment in medicine: The promise of simulation for high-stakes evaluation. Acad Emerg Med, 15(11), 1017–1024. https://doi.org/10.1111/j.1553-2712.2008.00228.x
Bracq, M.-S., Michinov, E., & Jannin, P. (2019). Virtual reality simulation in nontechnical skills training for healthcare professionals: A systematic review. Simulation in Healthcare: The Journal of the Society for Simulation in Healthcare, 14(3), 188–194. https://doi.org/10.1097/SIH.0000000000000347
Brett-Fleegler, M., Rudolph, J., Eppich, W., Monuteaux, M., Fleegler, E., Cheng, A., & Simon, R. (2012). Debriefing assessment for simulation in healthcare: Development and psychometric properties. Simul Healthc, 7(5), 288–294. https://doi.org/10.1097/SIH.0b013e3182620228
Brown, K. M., Mudd, S. S., Perretta, J. S., Dodson, A., Hunt, E. A., & McMillan, K. N. (2021). Rapid cycle deliberate practice to facilitate “Nano” in situ simulation: An interprofessional approach to just-in-time training. Crit Care Nurse, 41(1), e1–e8. https://doi.org/10.4037/ccn2021552
Bryson, E. O., & Levine, A. I. (2008). The simulation theater: A theoretical discussion of concepts and constructs that enhance learning. J Crit Care, 23(2), 185–187. https://doi.org/10.1016/j.jcrc.2007.12.003
Buck, G. H. (1991). Development of simulators in medical education. Gesnerus, 48(1), 7–28. https://www.ncbi.nlm.nih.gov/pubmed/1855669
CAE Healthcare. (2021). CAE Lucina Validated High-Fidelity Maternal/Fetal Training. https://www.caehealthcare.com/patient-simulation/lucina/
Calhoun, A., Bhanji, F., Sherbino, J., & Hatala, R. (2016). Simulation for high-stakes assessment in pediatric emergency medicine. Clin Pediatr Emerg Med, 13(September), 212–223.
Calhoun, A. W., & Gaba, D. M. (2017). Live or let die: New developments in the ongoing debate over mannequin death. Simul Healthc, 12(5), 279–281. https://doi.org/10.1097/SIH.0000000000000256
Calhoun, A. W., Pian-Smith, M., Shah, A., Levine, A., Gaba, D., DeMaria, S., Goldberg, A., & Meyer, E. C. (2020b). Guidelines for the responsible use of deception in simulation: Ethical and educational considerations. Simul Healthc, 15(4), 282–288. https://doi.org/10.1097/SIH.0000000000000440
Calhoun, A. W., Pian-Smith, M. C., Truog, R. D., Gaba, D. M., & Meyer, E. C. (2015). The importance of deception in simulation: A response. Simul Healthc, 10(6), 387–390. https://doi.org/10.1097/SIH.0000000000000127
Calhoun, A. W., Pian-Smith, M. C., Truog, R. D., Gaba, D. M., & Meyer, E. C. (2015). Deception and simulation education: Issues, concepts, and commentary. Simul Healthc, 10(3), 163–169. https://doi.org/10.1097/SIH.0000000000000086
Calhoun, A. W., & Scerbo, M. W. (2022). Preparing and presenting validation studies: A guide for the perplexed. Simul Healthc, 17(6), 357–365.
Calhoun, A., Pian-Smith, M., Shah, A., Levine, A., Gaba, D., DeMaria, S., Goldberg, A., & Meyer, E. (2020a). Exploring the boundaries of deception in simulation: A mixed methods study. Clin Simul Nurs, 40(March), 7–16.
Caro, P. W. (1973). Aircraft simulators and pilot training. Hum Factors, 15, 502–509.
Carroll, J. B. (1963). A model of school learning. Teach Coll Rec, 64, 723–733.
Castoldi, L., Greco, M., Carlucci, M., Lennquist Montan, K., & Faccincani, R. (2020). Mass Casualty Incident (MCI) training in a metropolitan university hospital: Short-term experience with MAss Casualty SIMulation system MACSIM®. Eur J Trauma Emerg Surg. https://doi.org/10.1007/s00068-020-01541-8
Chang, T. P., Hollinger, T., Dolby, T., & Sherman, J. M. (2020). Development and considerations for virtual reality simulations for resuscitation training and stress inoculation. Simul Healthc. https://doi.org/10.1097/SIH.0000000000000521
Cheng, A., Kolbe, M., Grant, V., Eller, S., Hales, R., Symon, B., Griswold, S., & Eppich, W. (2020). A practical guide to virtual debriefings: Communities of inquiry perspective. Adv Simul (Lond), 5, 18. https://doi.org/10.1186/s41077-020-00141-1
Cohen, E. R., Barsuk, J. H., Moazed, F., Caprio, T., Didwania, A., McGaghie, W. C., & Wayne, D. B. (2013). Making July safer: Simulation-based mastery learning during intern boot camp. Acad Med, 88(2), 233–239. https://doi.org/10.1097/ACM.0b013e31827bfc0a
Cook, D. A., Brydges, R., Ginsburg, S., & Hatala, R. (2015). A contemporary approach to validity arguments: A practical guide to Kane’s framework. Med Educ, 49(6), 560–575. https://doi.org/10.1111/medu.12678
Cook, D. A., Brydges, R., Zendejas, B., Hamstra, S. J., & Hatala, R. (2013). Mastery learning for health professionals using technology-enhanced simulation: A systematic review and meta-analysis. Acad Med, 88(8), 1178–1186. https://doi.org/10.1097/ACM.0b013e31829a365d
Cooper, J. B., & Taqueti, V. R. (2008). A brief history of the development of mannequin simulators for clinical education and training. Postgrad Med J, 84(997), 563–570. https://doi.org/10.1136/qshc.2004.009886
Daly Guris, R. J., Doshi, A., Boyer, D. L., Good, G., Gurnaney, H. G., Rosenblatt, S., McGowan, N., Widmeier, K., Kishida, M., Nadkarni, V., Nishisaki, A., & Wolfe, H. A. (2020). Just-in-Time simulation to guide workflow design for coronavirus disease 2019 difficult airway management. Pediatr Crit Care Med, 21(8), e485–e490. https://doi.org/10.1097/PCC.0000000000002435
Demaria, S. Jr., Bryson, E. O., Mooney, T. J., Silverstein, J. H., Reich, D. L., Bodian, C., & Levine, A. I. (2010). Adding emotional stressors to training in simulated cardiopulmonary arrest enhances participant performance. Med Educ, 44(10), 1006–1015. https://doi.org/10.1111/j.1365-2923.2010.03775.x
Dieckmann, P. (2020). The unexpected and the non-fitting - Considering the edges of simulation as social practice. Adv Simul (Lond), 5, 2. https://doi.org/10.1186/s41077-020-0120-y
Dieckmann, P., Gaba, D., & Rall, M. (2007). Deepening the theoretical foundations of patient simulation as social practice. Simul Healthc, 2(3), 183–193. https://doi.org/10.1097/SIH.0b013e3180f637f5
Downing, S. M. (2003). Validity: On meaningful interpretation of assessment data. Med Educ, 37(9), 830–837. https://doi.org/10.1046/j.1365-2923.2003.01594.x
Dyke, C., Franklin, B. R., Sweeney, W. B., & Ritter, E. M. (2021). Early implementation of fundamentals of endoscopic surgery training using a simulation-based mastery learning curriculum. Surgery, 169(5), 1228–1233. https://doi.org/10.1016/j.surg.2020.12.005
Eppich, W., & Cheng, A. (2015). Promoting Excellence and Reflective Learning in Simulation (PEARLS): Development and rationale for a blended approach to health care simulation debriefing. Simul Healthc, 10(2), 106–115. https://doi.org/10.1097/SIH.0000000000000072
Eppich, W. J., Hunt, E. A., Duval-Arnould, J. M., Siddall, V. J., & Cheng, A. (2015). Structuring feedback and debriefing to achieve mastery learning goals. Acad Med, 90(11), 1501–1508. https://doi.org/10.1097/ACM.0000000000000934
Ericsson, K. A. (2006). The Cambridge handbook of expertise and expert performance. Cambridge University Press.
Ericsson, K. A. (2008). Deliberate practice and acquisition of expert performance: A general overview. Acad Emerg Med, 15(11), 988–994. https://doi.org/10.1111/j.1553-2712.2008.00227.x
Ericsson, K. A., & Harwell, K. W. (2019). Deliberate practice and proposed limits on the effects of practice on the acquisition of expert performance: Why the original definition matters and recommendations for future research. Front Psychol, 10, 2396. https://doi.org/10.3389/fpsyg.2019.02396
Ericsson, K. A., Krampe, R. T., & Tesch-Romer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), 363–406.
Franklin, B. R., Placek, S. B., Gardner, A. K., Korndorffer, J. R. Jr., Wagner, M. D., Pearl, J. P., & Ritter, E. M. (2018). Preparing for the American board of surgery flexible endoscopy curriculum: Development of multi-institutional proficiency-based training standards and pilot testing of a simulation-based mastery learning curriculum for the endoscopy training system. Am J Surg, 216(1), 167–173. https://doi.org/10.1016/j.amjsurg.2017.09.010
Fraser, K., Huffman, J., Ma, I., Sobczak, M., McIlwrick, J., Wright, B., & McLaughlin, K. (2014). The emotional and cognitive impact of unexpected simulated patient death: A randomized controlled trial. Chest, 145(5), 958–963. https://doi.org/10.1378/chest.13-0987
Gaba, D. M. (2014). Simulation as a critical resource in the response to Ebola virus disease. Simul Healthc, 9(6), 337–338. https://doi.org/10.1097/SIH.0000000000000068
Gabrysz-Forget, F., Bonds, M., Lovett, M., Alseidi, A., Ghaderi, I., & Nepomnayshy, D. (2020). Practicing on the Advanced Training in Laparoscopic Suturing Curriculum (ATLAS): Is mastery learning in residency feasible to achieve expert-level performance in laparoscopic suturing? J Surg Educ, 77(5), 1138–1145. https://doi.org/10.1016/j.jsurg.2020.02.026
Gardner, A. K., DeMoya, M. A., Tinkoff, G. H., Brown, K. M., Garcia, G. D., Miller, G. T., Zaidel, B. W., Korndorffer, J. R. Jr., Scott, D. J., & Sachdeva, A. K. (2016). Using simulation for disaster preparedness. Surgery, 160(3), 565–570. https://doi.org/10.1016/j.surg.2016.03.027
Gardner, R., & Raemer, D. B. (2008). Simulation in obstetrics and gynecology. Obstet Gynecol Clin North Am, 35(1), 97–127, ix. https://doi.org/10.1016/j.ogc.2007.12.008
Goldberg, A., Samuelson, S., Khelemsky, Y., Katz, D., Weinberg, A., Levine, A., & Demaria, S. (2017). Exposure to simulated mortality affects resident performance during assessment scenarios. Simul Healthc, 12(5), 282–288. https://doi.org/10.1097/SIH.0000000000000257
Gross, I. T., Whitfill, T., Redmond, B., Couturier, K., Bhatnagar, A., Joseph, M., Joseph, D., Ray, J., Wagner, M., & Auerbach, M. (2020). Comparison of two telemedicine delivery modes for neonatal resuscitation support: A simulation-based randomized trial. Neonatology, 117(2), 159–166. https://doi.org/10.1159/000504853
Hays, R. T., & Singer, M. J. (1989). Simulation fidelity in training system design: Bridging the gap between reality and training. Springer-Verlag.
Hebbar, K. B., Colman, N., Williams, L., Pina, J., Davis, L., Bost, J. E., Jones, H., & Frank, G. (2018). A quality initiative: A system-wide reduction in serious medication events through targeted simulation training. Simul Healthc, 13(5), 324–330. https://doi.org/10.1097/SIH.0000000000000321
Heller, B. J., DeMaria, S., Katz, D., Heller, J. A., & Goldberg, A. T. (2016). Death during simulation: A literature review. J Contin Educ Health Prof, 36(4), 316–322. https://doi.org/10.1097/CEH.0000000000000116
Hunt, E. A., Duval-Arnould, J. M., Nelson-McMillan, K. L., Bradshaw, J. H., Diener-West, M., Perretta, J. S., & Shilkofski, N. A. (2014). Pediatric resident resuscitation skills improve after “rapid cycle deliberate practice” training. Resuscitation, 85(7), 945–951. https://doi.org/10.1016/j.resuscitation.2014.02.025
Imach, S., Eppich, W., Zech, A., Kohlmann, T., Pruckner, S., & Trentzsch, H. (2020). Applying principles from aviation safety investigations to root cause analysis of a critical incident during a simulated emergency. Simul Healthc, 15(3), 193–198. https://doi.org/10.1097/SIH.0000000000000457
Issenberg, S. B., McGaghie, W. C., Petrusa, E. R., Lee Gordon, D., & Scalese, R. J. (2005). Features and uses of high-fidelity medical simulations that lead to effective learning: A BEME systematic review. Med Teach, 27(1), 10–28. https://doi.org/10.1080/01421590500046924
Jentsch, F., Curtis, M., & Salas, E. (2011). Simulation in aviation training. Ashgate Pub.
Jorm, C., Roberts, C., Lim, R., Roper, J., Skinner, C., Robertson, J., Gentilcore, S., & Osomanski, A. (2016). A large-scale mass casualty simulation to develop the nontechnical skills medical students require for collaborative teamwork. BMC Med Educ, 16, 83. https://doi.org/10.1186/s12909-016-0588-2
Jung, D., Carman, M., Aga, R., & Burnett, A. (2016). Disaster preparedness in the emergency department using in situ simulation. Adv Emerg Nurs J, 38(1), 56–68. https://doi.org/10.1097/TME.0000000000000091
Kamdar, G., Kessler, D. O., Tilt, L., Srivastava, G., Khanna, K., Chang, T. P., Balmer, D., & Auerbach, M. (2013). Qualitative evaluation of just-in-time simulation-based learning: The learners' perspective. Simul Healthc, 8(1), 43–48. https://doi.org/10.1097/SIH.0b013e31827861e8
Kessler, D., Pusic, M., Chang, T. P., Fein, D. M., Grossman, D., Mehta, R., White, M., Jang, J., Whitfill, T., Auerbach, M., & Investigators, I. L. (2015). Impact of just-in-time and just-in-place simulation on intern success with infant lumbar puncture. Pediatrics, 135(5), e1237–1246. https://doi.org/10.1542/peds.2014-1911
Kohn, L. T., Corrigan, J., & Donaldson, M. S. (2000). To err is human: Building a safer health system. National Academy Press.
Kolb, D. A. (1984). Experiential learning: Experience as the source of learning and development. Prentice-Hall.
Lateef, F., Suppiah, M., Chandra, S., Yi, T. X., Darmawan, W., Peckler, B., Tucci, V., Tirado, A., Mendez, L., Moreno, L., & Galwankar, S. (2021). Simulation centers and simulation-based education during the time of COVID-19: A multi-center best practice position paper by the world academic council of emergency medicine. J Emerg Trauma Shock, 14(1), 3–13. https://doi.org/10.4103/JETS.JETS_185_20
Lauria, M. J., Gallo, I. A., Rush, S., Brooks, J., Spiegel, R., & Weingart, S. D. (2017). Psychological skills to improve emergency care providers’ performance under stress. Ann Emerg Med, 70(6), 884–890. https://doi.org/10.1016/j.annemergmed.2017.03.018
Leighton, K. (2009). Death of a simulator. Clin Simul Nurs, 5(2), E59–E62.
Lie, S. A., Wong, L. T., Chee, M., & Chong, S. Y. (2020). Process-oriented in situ simulation is a valuable tool to rapidly ensure operating room preparedness for COVID-19 outbreak. Simul Healthc, 15(4), 225–233. https://doi.org/10.1097/SIH.0000000000000478
Lioce, L. (Ed.), Lopreiato, J. (Founding Ed.), Downing, D., Chang, T. P., Robertson, J. M., Anderson, M., Diaz, D. A., & Spain, A. E. (Assoc. Eds.) and the Terminology and Concepts Working Group. (2020). Healthcare simulation dictionary – Second Edition. https://doi.org/10.23970/simulationv2
Lizotte, M. H., Latraverse, V., Moussa, A., Lachance, C., Barrington, K., & Janvier, A. (2015). Trainee perspectives on Manikin death during mock codes. Pediatrics, 136(1), e93–e98. https://doi.org/10.1542/peds.2014-3910
Maa, T., Scherzer, D. J., Harwayne-Gidansky, I., Capua, T., Kessler, D. O., Trainor, J. L., Jani, P., Damazo, B., Abulebda, K., Diaz, M. C. G., Sharara-Chami, R., Srinivasan, S., Zurca, A. D., Deutsch, E. S., Hunt, E. A., & Auerbach, M., Peak investigators of the International Network for Simulation-based Pediatric Innovation, R., & Education. (2020). Prevalence of Errors in Anaphylaxis in Kids (PEAK): A multicenter simulation-based study. J Allergy Clin Immunol Pract, 8(4), 1239–1246 e1233. https://doi.org/10.1016/j.jaip.2019.11.013
McBride, M. E., Schinasi, D. A., Moga, M. A., Tripathy, S., & Calhoun, A. (2017). Death of a simulated pediatric patient: Toward a more robust theoretical framework. Simul Healthc, 12(6), 393–401. https://doi.org/10.1097/SIH.0000000000000265
McGaghie, W. C., Siddall, V. J., Mazmanian, P. E., & Myers, J. (2009). Lessons for continuing medical education from simulation research in undergraduate and graduate medical education. Effectiveness of continuing medical education: American College of Chest Physicians evidence-based educational guidelines. Chest, 135 Suppl, 62S–68S.
McGaghie, W. C., Issenberg, S. B., Cohen, E. R., Barsuk, J. H., & Wayne, D. B. (2011). Does simulation-based medical education with deliberate practice yield better results than traditional clinical education? A meta-analytic comparative review of the evidence. Acad Med, 86(6), 706–711.
McGaghie, W. C., Issenberg, S. B., Barsuk, J. H., & Wayne, D. B. (2014). A critical review of simulation-based mastery learning with translational outcomes. Med Educ, 48(4), 375–385. https://doi.org/10.1111/medu.12391
Meyer, E. C., Sellers, D. E., Browning, D. M., McGuffie, K., Solomon, M. Z., & Truog, R. D. (2009). Difficult conversations: Improving communication skills and relational abilities in health care. Pediatr Crit Care Med, 10(3), 352–359. https://doi.org/10.1097/PCC.0b013e3181a3183a
Niles, D., Sutton, R. M., Donoghue, A., Kalsi, M. S., Roberts, K., Boyle, L., Nishisaki, A., Arbogast, K. B., Helfaer, M., & Nadkarni, V. (2009). “Rolling refreshers”: A novel approach to maintain CPR psychomotor skill competence. Resuscitation, 80(8), 909–912. https://doi.org/10.1016/j.resuscitation.2009.04.021
Niles, D. E., Nishisaki, A., Sutton, R. M., Elci, O. U., Meaney, P. A., O’Connor, K. A., Leffelman, J., Kramer-Johansen, J., Berg, R. A., & Nadkarni, V. (2017). Improved retention of chest compression psychomotor skills with brief “rolling refresher” training. Simul Healthc, 12(4), 213–219. https://doi.org/10.1097/SIH.0000000000000228
Owen, H. (2012). Early use of simulation in medical education. Simul Healthc, 7(2), 102–116. https://doi.org/10.1097/SIH.0b013e3182415a91
Patel, S. M., Miller, C. R., Schiavi, A., Toy, S., & Schwengel, D. A. (2020). The sim must go on: Adapting resident education to the COVID-19 pandemic using telesimulation. Adv Simul (Lond), 5, 26. https://doi.org/10.1186/s41077-020-00146-w
Perretta, J. S., Duval-Arnould, J., Poling, S., Sullivan, N., Jeffers, J. M., Farrow, L., Shilkofski, N. A., Brown, K. M., & Hunt, E. A. (2020). Best practices and theoretical foundations for simulation instruction using rapid-cycle deliberate practice. Simul Healthc, 15(5), 356–362. https://doi.org/10.1097/SIH.0000000000000433
Peterson, E., Morgan, R., & Calhoun, A. (2021). Improving patient- and family-centered communication in pediatrics: A review of simulation-based learning. Pediatr Ann, 50(1), e32–e38. https://doi.org/10.3928/19382359-20201211-02
Peterson, E. B., Porter, M. B., & Calhoun, A. W. (2012). A simulation-based curriculum to address relational crises in medicine. J Grad Med Educ, 4(3), 351–356. https://doi.org/10.4300/JGME-D-11-00204
Phrampus, P. E., & Cole, J. (2005). Death during simulation training: Feedback from trainees. International Meeting on Medical Simulation.
Phrampus, P. E., O’Donnell, J. M., Farkas, D., Abernethy, D., Brownlee, K., Dongilli, T., & Martin, S. (2016). Rapid development and deployment of Ebola readiness training across an academic health system: The critical role of simulation education, consulting, and systems integration. Simul Healthc, 11(2), 82–88. https://doi.org/10.1097/SIH.0000000000000137
Ritter, E. M., Lineberry, M., Hashimoto, D. A., Gee, D., Guzzetta, A. A., Scott, D. J., & Gardner, A. K. (2018). Simulation-based mastery learning significantly reduces gender differences on the fundamentals of endoscopic surgery performance exam. Surg Endosc, 32(12), 5006–5011. https://doi.org/10.1007/s00464-018-6313-y
296
Human Factors in Simulation and Training
Ritter, E. M., Taylor, Z. A., Wolf, K. R., Franklin, B. R., Placek, S. B., Korndorffer, J. R. Jr., & Gardner, A. K. (2018). Simulation-based mastery learning for endoscopy using the endoscopy training system: A strategy to improve endoscopic skills and prepare for the fundamentals of endoscopic surgery (FES) manual skills exam. Surg Endosc, 32(1), 413–420. https://doi.org/10.1007/s00464- 017-5697-4 Rudolph, J. W., Raemer, D. B., & Simon, R. (2014). Establishing a safe container for learning in simulation: The role of the presimulation briefing. Simul Healthc, 9(6), 339–349. https://doi.org/10.1097/SIH.0000000000000047 Salas, E., & Burke, C. S. (2002). Simulation for training is effective when. Qual Saf Health Care, 11(2), 119–120. https://doi.org/10.1136/qhc.11.2.119 Scott, S., Hearns, V., & Barker, M. A. (2019). Testing clinical skills: A look at the OSCE and USMLE clinical skills exams. S D Med, 72(10), 451–453. https://www.ncbi.nlm.nih.gov /pubmed/31816205 Sharara-Chami, R., Sabouneh, R., Zeineddine, R., Banat, R., Fayad, J., & Lakissian, Z. (2020). In situ simulation: An essential tool for safe preparedness for the COVID-19 pandemic. Simul Healthc, 15(5), 303–309. https://doi.org/10.1097/SIH.0000000000000504 ssih.org. Sim Center Directory. https://www.ssih.org/ Home/SIM-Center-Directory Tavares, W., Brydges, R., Myre, P., Prpic, J., Turner, L., Yelle, R., & Huiskamp, M. (2018). Applying Kane’s validity framework to a simulation based assessment of clinical competence. Adv Health Sci Educ Theory Pract, 23(2), 323–338. https://doi.org/10 .1007/s10459- 017-9800-3 Tripathy, S., Miller, K. H., Berkenbosch, J. W., McKinley, T. F., Boland, K. A., Brown, S. A., & Calhoun, A. W. (2016). When the mannequin dies, creation and exploration of a theoretical framework using a mixed methods approach. Simul Healthc, 11(3), 149– 156. https://doi.org/10.1097/SIH.0000000000000138 Truog, R. D., & Meyer, E. C. (2013). Deception and death in medical simulation. 
Simul Healthc, 8(1), 1–3. https://doi.org/10.1097/SIH.0b013e3182869fc2 Wagner, M., Jaki, C., Lollgen, R. M., Mileder, L., Eibensteiner, F., Ritschl, V., Steinbauer, P., Gottstein, M., Abulebda, K., Calhoun, A., & Gross, I. T. (2020). Readiness for and response to coronavirus disease 2019 among pediatric healthcare providers: The role of simulation for pandemics and other disasters. Pediatr Crit Care Med, Publish. Ahead of Print. https://doi.org/10.1097/ PCC.0000000000002649 Watling, C. J., & Ginsburg, S. (2019). Assessment, feedback and the alchemy of learning. Med Educ, 53(1), 76–85. https://doi.org/10.1111/medu.13645 Wayne, D. B., Butter, J., Siddall, V. J., Fudala, M. J., Wade, L. D., Feinglass, J., & McGaghie, W. C. (2006). Mastery learning of advanced cardiac life support skills by internal medicine residents using simulation technology and deliberate practice. J Gen Intern Med, 21(3), 251–256. https://doi.org/10.1111/j.1525-1497.2006.00341.x
11
Design and Development of Algorithms for Gesture-Based Control of Semi-Autonomous Vehicles
Brian Sanders, Yuzhong Shen, and Dennis Vincenzi
CONTENTS
Introduction
Background
    Gestures and Gesture Capture Technology
    Cognitive Loading Considerations
Approach, Implementation, and Results
    Approach
        Developing Proper Training System Familiarization as a Prelude to Training
    Implementation and Results: Phase 1
        Gestures and LMC Measurements
        Virtual Environment Development
        User Testing
        Algorithm Redesign
    Implementation and Results: Phase 2
        Physical Demonstration Setup
        User Testing
Summary and Conclusions
References
DOI: 10.1201/9781003401353-11

INTRODUCTION
Drones (also known as unmanned aerial systems, or UAS) are used for a variety of purposes such as aerial videography, photography, and surveillance. Successful accomplishment of these tasks requires the execution of a series of basic maneuvering functions (e.g., takeoff, acceleration, point-to-point navigation) that, when combined, contribute
to a mission-capable system. Commercially available small unmanned aerial systems (sUAS) have traditionally been designed and controlled using legacy interface approaches. These traditional control interfaces are typically one-dimensional (1D) or two-dimensional (2D) devices that allow the user to interact with a system in a limited manner (Balog et al., 2016). For example, keyboards are 1D input devices that allow for text input and activation of preprogrammed functions via a sequence of key/text inputs. Mice have expanded input capabilities into a 2D framework, but input is still limited to menu item selection or “hotspots” on a graphical user interface. Both of these control devices, while functional and useful, are limited and not very intuitive in terms of control movement, input, and function. They are also often slow, because reaching the desired end state typically requires a sequence of inputs. Other legacy control devices, such as those that are joystick based, are better but still attempt to translate 2D input into movement through a three-dimensional (3D) space or environment. Integration with touch-sensitive devices such as phones and tablets is emerging on the market to replace or augment discrete physical controls and information displays (Balog et al., 2016). However, these devices are, in many cases, simply electronic or digital versions of the same 2D legacy control devices. They typically combine electronic visual displays with touch input, and sometimes electronic input (GPS, accelerometers, and automation, for example). An alternative to these traditional command and control approaches is the use of gestures. A gesture-based approach to sUAS operation may be a viable option for command and control interfaces built on technology that is designed to recognize gestures.
A gesture-based approach can free the operator from having to hold and operate a multi-joystick, multi-button controller by correlating vehicle operations to a set of fluid, intuitive, natural, and accepted hand gestures. This, in combination with new visual displays, can create an entirely new command and control interface structure. Design of these systems will require careful investigation of human factors issues to populate gesture libraries that are natural and intuitive, as well as attention to cognitive loading, given the easy availability of a vast amount of visual information. The remainder of this chapter is organized as follows. The next section touches on related research on gestures and cognitive load considerations. The section after that illustrates a model research approach suited to the multidisciplinary nature of this effort, describes the implementation of that model, and discusses some design guidelines obtained from simulations and physical demonstrations. The final section summarizes the findings and conclusions.
BACKGROUND
Gestures and Gesture Capture Technology
Gesture-based control, like traditional control technology, poses a unique challenge to remote operations of unmanned vehicles. To begin with, the term “unmanned system” is a misnomer at this point in time; since there is a human operator present in
the system, the system will always be “manned” in some way. The only difference in the case of unmanned systems is that the operator is not collocated with the vehicle. This places the operator in a unique position and provides a different operational perspective, since many of the environmental cues normally present in manned scenarios are no longer available to the human operator. Research has suggested that gestures can help an operator who is separated from the vehicle connect with it mentally. Cauchard et al. (2015) conducted a study of how people naturally interact with flying robots (i.e., drones). Results show strong agreement between participants for many interaction techniques, such as when gesturing for the drone to stop. They discovered that people interact with drones as with a person or a pet, using interpersonal gestures such as beckoning the drone closer. Some previous related research centered on the development of computer algorithms that would allow robotic systems to recognize gesture commands in the field as part of military teams. Other research has focused on virtual reality environments integrated with optical sensors to recognize and measure the movement, velocity, and patterns of movement of fingers and hands, and then translate those gestures into commands. Hamilton et al. (2016) conducted research that focused on developing the ability of robotic systems to understand military squad commands. The long-term goal was to develop the capability to integrate robots with ground forces as seamless teammates in combat operations. Their research focused on creating a recognition model that understands 12 squad-level commands, such as rally, listen, stop, and come here. The input to the model was collected using the Microsoft Kinect’s skeletal model and processed with a logistic regression activation function to identify the gesture.
The logistic model showed an overall 97% effectiveness when discriminating whether the datasets are from a given member set. The decision model was 90% effective in determining the gesture class a given dataset represents (Hamilton et al., 2016). Lampton et al. (2002) investigated a gesture recognition system integrated with a virtual environment. Their goal was to measure the accuracy and effectiveness of a VR-based gesture recognition system. The system consisted of two video cameras, software to track the positions of the gesturer’s hands, and software to recognize gestures by analyzing the position and movement of the hands. The researchers selected 14 basic and accepted hand gestures commonly used in the field by US Army personnel. In general, the results were mixed in terms of recognition and accuracy. Many of the gestures were problematic in terms of tracking, recognition, or both. Recent advancements in hardware and software processing have resulted in the ability to accurately capture gestures electronically. As mentioned above, the Microsoft Kinect is one example. Another is the Leap Motion Controller (LMC) (Leap Motion, 2019), a relatively recent technology that can capture and track hand motion with a sensor just slightly bigger than a standard USB flash drive. These devices have millimeter position accuracies (see, for example, Weichert et al., 2013; Guna et al., 2014) and are able to capture a range of hand motions and gripping processes (Smeragliuilo et al., 2016; Staretu & Moldovan, 2016). It has been suggested that these devices are better suited for 3D environments, such as that which a
sUAS operates in, than 2D devices such as the joystick and mouse (Scicali & Bischof, 2015). There have been a few documented efforts to control drones with gestures and multimodal approaches. Sarkar et al. (2016) used the LMC to control an off-the-shelf quadcopter via simple human gestures and ran some basic tests to document the feasibility of the LMC-based system for controlling vehicle motion. Chandarana et al. (2017) explored a multimodal natural language interface that uses a combination of speech and gesture input modalities to build complex UAV flight paths by defining trajectory segment primitives. Gesture inputs (measured with the LMC) were used to define the general shape of a segment, while speech inputs provided the additional geometric information needed to fully characterize a trajectory segment. They observed that the interface was intuitive, but the gesture module was more difficult to learn than the speech module. These studies, together with those cited above, highlight the possibilities of alternative command and control approaches with the emerging technology.
Cognitive Loading Considerations
When combined with a head-mounted device (HMD), gesture capture technology can be used to develop an alternative command and control system. In addition to identifying a gesture library, careful consideration should be given to cognitive loading, which can result from the physical hand motions and from the potential for information overload given the amount of information that can be presented in an HMD. An example of what is possible, and of methods to reduce cognitive loading, was discussed by Zollman et al. (2014). They investigated the application of micro aerial vehicles (MAVs) equipped with high-resolution cameras to create aerial reconstructions of selected locations. They note that this workflow raises several cognitive loading issues, such as the need to mentally transfer the aerial vehicle’s position between 2D map positions and the physical 3D environment, and the difficulty of perceiving the depth of objects flying in the distance. They presented AR-supported navigation and flight planning for micro aerial vehicles that augments the user’s view with relevant information for flight planning and live feedback for flight supervision. Additionally, they introduced depth hints that help the user understand the spatial relationship of virtual waypoints in the physical world, and investigated the effect of these visualization techniques on spatial understanding. The investigation highlighted the possibilities of an AR component of a command and control system and specific challenges related to cognitive processing. Zollman et al. (2014) highlighted only a few of the cognitive loading issues related to the design of an AR-based command and control system. There are several more that need to be considered (Givens et al., 1998; Rorie & Fern, 2014). For example, Dodd et al.
(2014) investigated touch screen capability in aircraft cockpits and stated that as elements and workload increase in number and complexity, increased cognitive loading will follow. For the current effort, this concept drove the design in terms of the number and complexity of gestures a participant was expected to initiate for controlling
the vehicle. As the research progresses beyond the flat screen, additional factors come into play. As AR capability is added, the issue of switching views between the operator’s real-world view and a virtual framework needs to be considered. Recent evidence indicates that very different brain processes are involved in comprehending meaning from these sources (Ravassard et al., 2013). The above discussion highlights some of the complexities that can quickly emerge in a command and control system, so careful consideration must be given to this design aspect as this approach is developed.
APPROACH, IMPLEMENTATION, AND RESULTS
Approach
There were two research objectives for this demonstration. One was related to the human factors aspect, while the second addressed the suitability of the proposed simulated environment as an assessment and training tool. They are stated as follows:
1. Investigate the application of gesture-based control of semi-autonomous systems to identify capability, challenges, and limitations to assess the feasibility (can you do it) and viability (does it add value) of the approach.
2. Assess suitability of a commercially available simulation environment to (1) support assessment of human performance and interface preferences for vehicle control and (2) provide a training environment for transition to a real-world system.
A two-phase approach was taken to address the objectives. Phase 1 involved simulation only. It centered on observing a user’s ability to control a recreational quadcopter. This phase started with the identification of hand gestures to control the vehicle and the selection and utility validation of a representative gesture capture technology. With these fundamental building blocks in place, the simulation development followed an evolutionary approach in which participants were brought in periodically to exercise the simulation and provide feedback on the basic gesture-based concept and simulation features. Modifications and additions were made after each of these events. Although the sample of participants was not large, it provided much-needed insight into the major design features needed to minimize mental and physical fatigue and achieve stable vehicle control. The objective of Phase 2 was to add validity to the findings from the purely simulated environment. In this phase a small ground vehicle was selected as the control model. This phase included a virtual reality simulation to train and familiarize the participant with the controls and vehicle performance.
This was followed by a physical navigation demonstration to support the findings from the simulation tests.
Developing Proper Training System Familiarization as a Prelude to Training
Training methodologies and programs sometimes need to be extended with training that goes beyond the basic scope needed to become proficient enough to
operate the system. This may be necessary in two specific instances. The first instance may be encountered when training systems are deficient and the trainee encounters unanticipated confusion due to a mismatch in design between the training system and the real-world system (i.e., the training system is poorly designed and does not operate in the same manner as the real-world system in terms of displays, controls, and visuals). The second instance may be encountered when dealing with new, unique, and unfamiliar technology. Training to use the technology properly is an important first step that may be needed before learning to operate the real-world system. In the case of systems that rely on gesture-based controls and gesture recognition, it is essential to learn which gestures are used to operate the system, and which features may be incorporated to assist in interpreting and operating the system smoothly and efficiently, before learning how to operate the system itself. The complete system currently being researched and developed consists of multiple components: a set of gestures, a gesture recognition system, and software algorithms designed to produce smooth and efficient vehicle operation based on the command gesture produced by the trainee and recognized by the gesture recognition system. The gesture recognition system consists of the Leap Motion Controller, which produces a “control bubble” that detects and recognizes a predetermined set of gestures. Once a person places their hands inside the recognition area, the LMC will recognize and interpret the gesture for translation into movement of the vehicle. The software algorithms then enhance the interpretation of that gesture to assist in smoother, better-defined movement in the simulation or real-world vehicle operation.
This type of control interface (a gesture-based recognition system) is at a great disadvantage when compared to a control interface such as a joystick, which has been in use for decades and is very familiar to almost anyone who has played a video game or purchased a remote-control plane or car for entertainment. The most common control interface device for video games and remote-control vehicles today is a joystick-based hand controller. Almost all joystick interfaces share a number of common features, and the functions of those control inputs are universal in nature. A gesture-based control system is unfamiliar to most individuals and must be clearly defined before use. The gestures used for recognition by the LMC are natural, intuitive, and somewhat familiar in a general sense, but the type of control interface is novel and unique. In other words, the gestures taken by themselves are familiar to any user in a general sense, but using those gestures in an interface to control a vehicle is foreign and unfamiliar. Therefore, training the user on the gestures and their use in controlling the vehicle is essential before any system training can take place. Failure to do so will result in longer training times and poorer performance compared to training a user to control a vehicle with a familiar device such as a joystick-based hand controller.
Implementation and Results: Phase 1
Gestures and LMC Measurements
The first step toward developing the capability was to conduct a task breakdown and gesture matching exercise to identify the functions associated with flying and operating a representative recreational hovercraft (i.e., a quadcopter). This task breakdown is shown in the first column of Table 11.1. It was partitioned into categories of flight control and camera control. Seven potential actions in the flight control category, related to the movement of the vehicle in the airspace, were identified. Three camera actions were identified. They refer to the camera view (i.e., a first-person view from the operator or vehicle) and direction (pitch and yaw). The description of the flight control is an abstraction that describes what the operator wants to make happen rather than how the vehicle does it. For example, the desired action is for the vehicle to climb or descend, translate in a horizontal plane, or yaw around its vertical axis. This motion is enabled through the application of forces and torques on the vehicle. These forces and torques are determined by the internal control logic of the air vehicle based on user commands. In this investigation, one of the algorithms that drive the simulation translates the operator input to forces and torques on the vehicle. The Leap Motion Controller was selected as the representative technology with which to capture the gestures identified in Table 11.1. Previous investigations (Weichert et al., 2013) have demonstrated that the LMC has submillimeter accuracy. It was desired to build on this finding and assess how accurately the human-performed gestures described in Table 11.1 are captured by the LMC. A C# script developed using the LMC application programming interface (API) was used to capture this data. Figure 11.1 is a representative example of the hand angle vs. sample
TABLE 11.1 Quadcopter Control Actions and Corresponding Gestures

Vehicle Action              Gesture
Flight Control
  Climb/Descend             Left Hand Pitch
  Translate Left/Right      Right Hand Roll
  Translate Forward/Aft     Right Hand Pitch
  Yaw                       Left Hand Yaw or Roll
  Increase/Decrease Speed   Controlled by Vehicle Pitch and Roll
  Stop                      Fist/Remove Hands from Control Environment
  Control Initiation        Open Hand
Camera Control
  Switch View               Tapping Motion
  Pitch                     Right Hand Pitch
  Yaw                       Left Hand Yaw
FIGURE 11.1 Representative gesture capture using LMC – rolling right hand.
number captured by the LMC, in this case for a left-right rotation of the hand. It was produced by performing the gesture with the right hand at a natural speed, neither excessively slow nor fast. It can be observed that the LMC captured the gesture with a high degree of fidelity. The slope variations are a result of minor changes in the rotational speed of the hand, indicating again the highly accurate nature of this sensor. This exercise demonstrated the precise results produced by the LMC algorithms used to process the captured images. It also illustrates the variations possible with a human-performed motion, indicating the necessity of data smoothing in the gesture interpretation algorithms used in the simulation. The implementation of this is described in the section “Algorithm Redesign.”
Virtual Environment Development
The description of the simulation development begins with the visual component of the virtual environment (VE). This includes basic scene setting, information displayed, and vehicle control mechanisms. It then explores details of key components that make the simulation functional. The approach taken in this investigation is to minimize the load on working memory. This limits the information transmitted to the user to basic vehicle status (i.e., speed, altitude) and visual information to improve perception and vehicle component control. Taking this approach keeps the focus on the gesture control aspect and the suitability of the basic simulation environment. Figure 11.2 shows a screen capture of the initial virtual environment. The drone is a generic representation of a recreational quadcopter. It models a 1 kg drone with nominal dimensions of 30 cm × 30 cm × 10 cm and has red lights indicating the
FIGURE 11.2 Initial simulation screen design.
forward part of the drone and blinking green lights in the rear of the unit. The arrow in the left-hand corner serves as an orientation aid for users to determine the vehicle direction when it is too far away to clearly distinguish the lights. This is best understood by rotating the arrow 90° so it is on a plane parallel to the vehicle. For the case shown in Figure 11.2, the arrow indicates that the vehicle is coming at the user from the right. The vehicle information displayed is altitude, speed, and range to vehicle, and it is shown in the top left of the figure. An alternative concept for the vehicle data is to have it follow the vehicle in a fixed position, such as off to the right. However, this would tax working memory unnecessarily and so was not implemented, since it was not a focus of the demonstration at this point in time. Therefore, the side position was selected so the user could quickly glance at the data when needed. A dynamic user interface (UI) was used to switch the camera view using gestures. It is a capability available in the Orion version of the LMC API (Leap Motion, 2019). In this case, a dynamic UI is attached to the left hand and is visible when that hand is rotated toward the user, as shown in Figure 11.3. It contains two buttons that enable the user to switch between the operator and vehicle camera views. This type of dynamic UI is an attractive feature for the proposed system. It has the potential to lower working memory load since it is not always in the field of view. Now that the visual component of the VE has been described, we turn to some of the mechanics that made it work, starting with how the vehicle motion was controlled. Fourteen individual C# scripts were written to control the simulation, covering everything from visual features, such as tracking and displaying vehicle information, to capturing gestures and controlling the vehicle. Two scripts central to vehicle control are the GestureListener and UAVController scripts.
The former captures the gestures of each hand (i.e., hand orientation, fist), such as that shown in Figure 11.2. The latter interprets the gestures to control vehicle operations, such as setting forces and torques on the vehicle and controlling the camera.
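As a rough illustration of this division of labor, the sketch below separates a per-frame hand reading from the step that interprets it. It is written in Python for brevity rather than the C# used in the simulation, and it is not the authors' code: the HandState fields, the normalize and interpret functions, the command names, and the sign conventions are all hypothetical. The hand-to-action pairing follows Table 11.1, and the linear, saturating mapping with a 30° limit follows the description of the UAVController script given later in this section. Treating a fist on either hand as a zero command is an assumption.

```python
from dataclasses import dataclass

# Saturation angle in degrees; the chapter cites 30 degrees for wrist rotation.
MAX_HAND_ANGLE_DEG = 30.0


@dataclass
class HandState:
    """Hypothetical per-frame hand reading (not the LMC API)."""
    pitch_deg: float
    roll_deg: float
    yaw_deg: float
    is_fist: bool = False


def normalize(angle_deg: float, max_angle_deg: float = MAX_HAND_ANGLE_DEG) -> float:
    """Linearly map a hand angle to [-1, 1], saturating beyond the limit."""
    return max(-1.0, min(1.0, angle_deg / max_angle_deg))


def interpret(left: HandState, right: HandState) -> dict:
    """Turn two hand readings into normalized commands, following Table 11.1."""
    if left.is_fist or right.is_fist:
        # Assumption: a fist on either hand sends a zero command on every axis.
        return {"climb": 0.0, "yaw": 0.0, "lateral": 0.0, "forward": 0.0}
    return {
        "climb": normalize(left.pitch_deg),     # left hand pitch
        "yaw": normalize(left.yaw_deg),         # left hand yaw
        "lateral": normalize(right.roll_deg),   # right hand roll
        "forward": normalize(right.pitch_deg),  # right hand pitch
    }
```

Under these assumptions, a left hand pitched 15° yields a climb command of 0.5, and any rotation beyond 30° saturates at a magnitude of 1.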
FIGURE 11.3 Dynamic UI to control view perspective.
FIGURE 11.4 Modeling the vehicle forces and torques – actual (left) – in Unity (right).
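The reduction pictured in Figure 11.4, from four propeller forces to one force and one torque, can be illustrated with a small rigid-body calculation. The Python sketch below is not taken from the simulation: the plus-shaped propeller layout, the 0.15 m arm length (half the nominal 30 cm width), the right-handed body frame, and the function name are assumptions, and the yaw torque produced by rotor drag is left out.

```python
ARM_LENGTH_M = 0.15  # assumed: half of the 30 cm nominal vehicle width

# Assumed "plus" layout in a right-handed body frame: x right, y up, z forward.
PROPELLER_POSITIONS = {
    "front": (0.0, 0.0, ARM_LENGTH_M),
    "back": (0.0, 0.0, -ARM_LENGTH_M),
    "right": (ARM_LENGTH_M, 0.0, 0.0),
    "left": (-ARM_LENGTH_M, 0.0, 0.0),
}


def net_force_and_torque(thrusts_n):
    """Collapse four upward propeller thrusts (newtons) into one force and one torque.

    Returns (force_xyz, torque_xyz) expressed in the body frame.
    """
    total_thrust = sum(thrusts_n.values())
    torque_x = 0.0
    torque_z = 0.0
    for name, thrust in thrusts_n.items():
        x, _, z = PROPELLER_POSITIONS[name]
        # r x F with r = (x, 0, z) and F = (0, thrust, 0) is (-z*thrust, 0, x*thrust).
        torque_x += -z * thrust
        torque_z += x * thrust
    # The y component of torque stays zero because rotor drag is not modeled here.
    return (0.0, total_thrust, 0.0), (torque_x, 0.0, torque_z)
```

With equal thrusts the torque vanishes and only the net force along the vehicle's y-axis remains; unequal front and back thrusts add a torque about the x-axis.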
Unity provides a physics engine to apply forces and torques to an object via its Rigidbody class, which controls the object’s linear motion via forces and angular motion via torques. Figure 11.4 shows two free-body diagrams of the model vehicle. The left one shows the four forces produced by the propellers. By adjusting individual propeller forces, a force–torque combination is applied to the actual vehicle to produce the desired flight behavior. For this simulation, the vehicle was modeled with a Rigidbody component attached to it. This enables the application of a single 3D force vector and a single 3D torque vector to the vehicle. For the simulated vehicle, the four propeller forces are then modeled as a single force in the y-direction relative to the orientation of the vehicle (i.e., perpendicular to it) and a single torque vector, as shown in the free-body diagram on the right. Maximum forces and torques
values applied to the vehicle were adjusted so that the simulated vehicle performance closely approximated that of the real vehicle. A linear relationship is used in the UAVController script to interpret the gesture and transform it into an applied force or torque. The control input is a normalized value between −1 and 1 based on a prescribed maximum hand rotation. The limit on hand rotation was based on the observations of the range of motion of natural hand gestures discussed previously. For example, the maximum wrist rotation was set to 30°. Even though the user may rotate the hand to a larger angle, the control input saturates at this value.
User Testing
The purpose of the first round of tests was to make a comparative assessment between a joystick/button device (the Xbox 360 controller) and gesture-based control. Four participants took part in the testing. Each participant engaged in two scenarios with each control approach. The first scenario was “play time,” and the second was a search mission. In play time, the users were not asked to do anything specific; it was meant to give them time to explore the response of the vehicle to the flight control inputs via the two techniques and to become familiar with the operation of the dynamic UI for controlling the camera view. In the second scenario, they were asked to locate and navigate the vehicle to a specific location in the scene. There was no prescribed path at this point, just a destination. After this, the participants engaged in a short post-test interview. The test and post-test interview together typically took just under an hour per participant. In general, the participants preferred the Xbox controller over the gesture-based control system. Several observations and comments support this position. For example, on average twice as much time (11 min vs. 22 min) was spent in play mode with the gesture-based system.
This is an indication that the users felt more comfortable and familiar with the Xbox controller than with the gesture-based system. A typical user’s ability to control the vehicle improved significantly over the play period, but at the end of the play session they still did not feel as comfortable with the gesture system as with the joystick device. Finally, mission times when using the Xbox controller were on the order of three minutes, while the missions using gesture control were rarely completed due to fatigue and frustration with the system. In the post-test interviews, participants reported feeling fatigued, mostly due to using the gesture system. This is most likely from a combination of physical and mental fatigue. Even though only minimal hand movement is required to control the vehicle, it was observed that the participants used large hand gestures requiring more energy compared to the small thumb motions that can be used with the joystick. Also, the vehicle did not respond as accurately to these gestures, since they did not fall into the detection region (i.e., the green box) and were not the subtle motions expected by the processing algorithm. These observations, coupled with the consideration that gesture control is a new approach, probably led to a higher level of mental engagement and thus fatigue. For the most part, the visual content was satisfactory for the participants. The location and amount of the textual information were sufficient, and the users’ responses
did not indicate they were overly taxed with processing that information. In fact, they were typically so focused on the vehicle that they needed to be reminded this information was available. The exception to this was the virtual hands which they found distracting. Other comments and observations centered around the use of the dynamic UI and visual aids. Participants could not consistently produce the menu shown in Figure 11.3 and often could not make the selection once the menu was available. Restricting the region where the vehicle control was activated received unfavorable comments too. The control box made them feel constricted, and it led to lack of control because they frequently had to check where they were in the field of view. Finally, they had difficulty processing 3D vehicle orientation using the 2D arrow. One final observation that all of the participants made was that they liked how the gesture-based system made them feel more connected to the vehicle response. The comments and observations from this set of tests led to several modifications of the simulation. First, the idea of introducing an unconstrained play environment did not result in effective conditions for the participants to learn the new gesture interface. A building block training environment was implemented to address this shortfall. Second, participants had a difficult time processing the correlation of the vehicle orientation with the 2-D direction indicator. In the updated version of the simulation, a 3D representation of the vehicle was included. This is shown in the bottom of Figure 11.5 as a semitransparent sphere containing a small-scale version of the drone model. This drone matches the pitch and roll orientation as well as the direction the vehicle is flying. 
It was anticipated that this would reduce the cognitive load, and thus fatigue, since it is a more direct representation of the vehicle’s orientation and requires minimal processing to understand. Components of the user interface were also updated. A neutral command was programmed into the simulation: if a hand was detected to be in the shape of a fist, then no control command would be transmitted to the vehicle. Also, the virtual
FIGURE 11.5 Modified simulation screen design.
hands were made out of clear material, so they were less distracting to the user but still available for reference. Next, the dynamic UI was hard for participants to control, so it was replaced by a simple gesture in which the user appears to touch the vehicle to change the camera view. When viewing from the camera, a small semitransparent square in front of the viewer is the target interface. In addition to being a bit more intuitive, it is also a simpler technique. A command (rotating the index finger) was also added to rotate the camera pitch angle 90°. This let the users switch between a view parallel to the flight path and one straight down, which was useful for searching an area and landing. To assess the effectiveness of these modifications, two participants from the previous test were brought back. It was conceded that the joystick approach far exceeded the gesture-based control at this time, so the users were asked only to engage in the gesture-based control approach. Each participant was first led through the training environment. As anticipated, this helped them develop a feel for the limited range of motion required to control the vehicle. Then they again went into the play and mission scenarios. In general, the feedback from the users was much more positive, and it was observed that they had better control of the vehicle and were able to complete the requested missions and switch camera views. They also demonstrated a lower level of fatigue and frustration.

Algorithm Redesign
The previous implementation of the proportional-integral-derivative (PID) controller was limited to transitioning to hover mode and ensuring the vehicle did not exceed its maximum rotation angle in pitch and roll. This approach was expanded to include more control setpoints. These setpoints include the vertical climb rate, yaw rate, and the pitch and roll angles.
Having this structure results in the hand gestures determining the setpoint and then the PID controller determines the required force and torque vector to maintain the vehicle in this condition until an additional command is given. So, it is still a kinetic-based simulation. The setpoint is determined based on a cubic relationship using the normalized change in hand orientation. This approach can be clarified by studying Figure 11.6. This figure illustrates a cubic relationship between the normalized gesture command and a control parameter. For this illustration, the maximum value of the control parameter is set to three. Assume that the vehicle is on the ground waiting for takeoff. This condition then defines the initial setpoint shown in the figure. A change in hand orientation from the reference orientation, such as positive pitch rotation, is then normalized and the new setpoint is determined based on a cubic function. Once this command is set the user can then return their hands to the neutral (e.g., resting) position and the vehicle will continue to follow that last input command by virtue of the PID controller. Note that returning the hands to the reference orientation does not affect the setpoint. Incremental changes to this updated setpoint are made again following the cubic function, so a small change in hand orientation will result in a small change in the setpoint while a larger change will increase it more but not beyond its maximum. Finally, an additional state was added to the system, so in this version there were three: active control, hover, and cruise. Switching between the
FIGURE 11.6 Redesigned gesture interpretation algorithm.
states was achieved by touching the thumb and index finger. After a change in state, the reference hand position can be reset based on the user’s preference. Finally, data smoothing was implemented to remove the jitter resulting from the captured hand gestures. This approach provides the operator with a wide range of control and flexibility anywhere in the flight envelope. Several tests by the developer and a complete novice demonstrated that these changes resulted in a significantly improved command and control system. First, the vehicle is easier to control and is more stable in flight. The new control algorithm also enables more precise control of performance parameters and vehicle positioning. Further, the updated gesture interpretation algorithm combined with the implementation of the state machine allowed the user to keep a lower level of physical stress on the hands and wrist. This is a positive consequence of not requiring the user to maintain off-neutral, fixed-hand positions for the vehicle to maintain its current flight trajectory.
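The interplay between the cubic gesture interpretation, the incremental setpoint, and the PID controller can be illustrated with a short sketch. It is written in Python rather than as a Unity script and is not the chapter's implementation: the cubic form (maximum value times the cube of the normalized gesture), the function and class names, the PID gains, and the unit-mass toy vehicle are all assumptions made for illustration. Only the 30° gesture limit and the illustrative maximum of three come from the text.

```python
from dataclasses import dataclass


def normalize(angle_deg: float, max_angle_deg: float = 30.0) -> float:
    """Map a hand rotation to [-1, 1], saturating at the prescribed maximum."""
    return max(-1.0, min(1.0, angle_deg / max_angle_deg))


def update_setpoint(setpoint: float, angle_deg: float, max_value: float = 3.0) -> float:
    """Add a cubic increment to the current setpoint, never exceeding the maximum.

    Assumed cubic form: increment = max_value * u**3, with u the normalized gesture.
    A neutral hand (u = 0) leaves the setpoint unchanged.
    """
    increment = max_value * normalize(angle_deg) ** 3
    return max(-max_value, min(max_value, setpoint + increment))


@dataclass
class PID:
    """Textbook PID controller; the gains used below are hypothetical."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint: float, measured: float, dt: float) -> float:
        error = setpoint - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Climb-rate example: a half-range gesture sets the setpoint, then the hand returns to neutral.
setpoint = update_setpoint(0.0, 15.0)                    # 3 * 0.5**3 = 0.375
setpoint = update_setpoint(setpoint, 0.0)                # neutral hand: setpoint is held
setpoint_after_large = update_setpoint(setpoint, 45.0)   # beyond 30 deg: clamps at the maximum

# The PID controller holds the commanded climb rate on a unit-mass, gravity-free toy vehicle.
pid = PID(kp=2.0, ki=0.5, kd=0.0)
climb_rate, dt = 0.0, 0.02
for _ in range(2000):
    force = pid.step(setpoint, climb_rate, dt)
    climb_rate += force * dt
```

With these assumed values, a small gesture changes the setpoint only slightly, a large gesture changes it more but never past the maximum, and the controller keeps the vehicle at the last commanded condition after the hand relaxes.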
Implementation and Results: Phase 2

Physical Demonstration Setup
The purpose of Phase 2 was to see if the observations and algorithms developed using the simulated environment transfer to an actual physical demonstration. A successful transfer adds validity to the lessons learned and observations from Phase 1 and supports the realism of the simulation, the latter being one of the two objectives of the project. This was accomplished via the use of a small ground vehicle.
As discussed previously, it is a good idea to provide a preliminary training environment so that an operator can become familiar with the required hand gestures and vehicle response. For this situation, an immersive VR simulation was developed. The VR environment was designed to represent the geometry of a room with dimensions of 5 m × 10 m × 3 m. Figure 11.7 shows the actual and simulated environments with the model car. The virtual environment contained a few obstacles, such as pillars and tables, that served as targets for test participants to navigate around. Hardware used in this experiment includes the LMC and the Oculus Rift headset. The LMC generates the operational gesture recognition environment while the Oculus Rift provides an immersive display environment. Figure 11.8 shows the virtual car in addition to two visual aids. One is the virtual trackball concept introduced in Phase 1 (a nearly transparent sphere) and the other is a crossbar control indicator. The virtual trackball and the crossbar work in a coordinated manner. The trackball itself has a diameter of 0.1 m, which is about the size of a softball. The intent is to provide the user with an anchor for the hand to rotate
FIGURE 11.7 Actual (top) and virtual (bottom) environments with model car.
FIGURE 11.8 Trackball and crossbar control indicator.
around as it hovers over the LMC. Erratic readings can result from the LMC if the hand gets too close to it, which occurs at approximately 2.5 cm above the device. An alert range is conservatively set to 5 cm. At this point, the trackball turns from nearly transparent to yellow. The control indicator is composed of a crossbar, with each bar carrying a disk that moves either vertically or horizontally. The vertical component is tied to the pitch of the right hand and the forward and backward motion of the vehicle. The horizontal motion is tied to the roll of the right hand, which is used to control the steering angle. The maximum motion of these gestures is set to 30°. Control of the vehicle based on these gestures is made through the use of a wheel collider and will be discussed a little later. To further support muscle memory training, users are positioned in a chair with the LMC just below and in front of the right armrest. While not practical for an actual application, it helps to provide an anchor for the arm. This in turn lets the user focus on the small hand motions required for vehicle control. The basic algorithms used in the virtual environments were implemented to control the model car system shown in Figure 11.9. It has two components: the transmitter (left) and the model car (right). A schematic of the processing and control algorithm is shown in Figure 11.10. As shown in the left circle, Unity captures and interprets gestures from the LMC and sends the result to the transmitter via the computer USB port. The message is a string containing rear wheel motor power and turning commands. The transmitter then sends the signal to the car. The Arduino board on the car then interprets and executes the commands as indicated in the circle on the right. Minor modifications to the original software that came with the system were required to enable this capability.

User Testing
Two rounds of testing were conducted.
In the first round, the initial linear control algorithm mentioned earlier was implemented, so this was direct control of the vehicle motor torque and steering using a linear gesture interpretation algorithm. The second round of testing implemented most but not all of the modified control
FIGURE 11.9 Adeept controller and smart car (Adeept, 2019).
FIGURE 11.10 Schematic representation of Unity to car control scheme.
algorithm. This is because feedback parameters (e.g., vehicle speed) are not always available. For example, vehicle speed is available in the VR simulation but not in the actual vehicle in its current configuration. The following features are included: cubic interpretation of gestures, incremental command inputs, and commands based on a neutral reference position. So, this still captures key foundational elements of the control approach related to lowering the physical stress on the user. Two sessions were conducted in the first round: one was an informal activity with a high participation count (around 15 participants), and the other was more formal with a lower participation count (2 participants). In each case, there was a training period followed by a play time in the VR environment, followed by an event where the user controlled the model vehicle. The initial training mode involved no vehicle movement, but the indicator was free to move. This enabled the user to become familiar with the hand position and the small range of motion required for vehicle control. After that, forward and backward motion was enabled to allow the user to become familiar with the visual effect of the moving car. Finally, the car steering was enabled. This step-by-step training process was inspired by the findings of Phase 1. Informal observations of approximately 15 people took place. During these engagements, it was observed that within about 20 minutes the majority of the participants were able to reasonably control the vehicle in both virtual and real-life scenarios (10 minutes in each environment). Further, the large gestures from previous testing were not observed and the participants used small, relaxed gestures. So, it appears that the combination of the virtual visual aids and anchoring of the arm produced the desired results. One issue that was observed is that the turning performance was a little unstable.
It looked more like someone riding a wobbly bicycle than smooth, consistent motion. Two additional, more formal tests were conducted. In this case, each participant was taken through the same rigorous training process before the play mode was enabled. Observations similar to those from the informal test described above were made
about hand motions and vehicle control. It was further observed that users became more confident in their ability to control the car in around 10 minutes. As in the informal test session, speed control was smooth but the turning was still a bit unstable. This is something that the latest control algorithm corrected. Participation in the second round of tests was limited to the developer. In this round, the developer implemented and exercised the new control scheme in both the VR environment and with the remote-control car. The main difference in implementation between the car-based scenarios and the UAV simulation is in the selection and implementation of setpoints. For the VR simulations, vehicle speed and wheel angle were the setpoints used with the PID controller scheme. These setpoints are not available with the remote-control car because that would require additional vehicle sensors to provide feedback, so the active PID controller is not implemented. Other features, such as the cubic gesture interpretations and data smoothing, are integrated into the control methods. After some initial testing, it was decided to slightly modify the control algorithm to more smoothly control the car. In the UAV control, a performance parameter such as climb rate or desired roll angle is set to achieve the desired speed. Then the PID controller maintains that condition. In the case of the car controller, this worked well for the speed control. In the VR simulation, the user could adjust the desired speed and then the PID controller would determine the required torque to apply to the wheel. In the remote-control car case, the commanded torque is directly linked to the cubic gesture function. In each of these scenarios, the user can still return their hand to the neutral position and the car will continue at that speed. Stabilizing the steering required returning to a more rudimentary approach.
This is because steering, especially in confined venues, is a more dynamic event requiring constant adjustments. It was found that the best way to steer the car was to maintain the hand in a rotated position while turning but release it once the target direction was achieved. The wheel position would then return to a zero angle. It was also decided to implement a 3:1 steering ratio. For example, the maximum recognized hand rotation angle was set to ±30° while the maximum steering angle of the car was set to ±10°. This is another feature that translates the less accurate human performance to more precise control of the car. These adjustments made executing basic maneuvers such as ovals and figure 8s more manageable. This final exercise illustrates the complexity of the control process and several of the features that need to be considered in the design of such a system.
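The contrast between the two car commands described above, direct spring-back steering with a 3:1 ratio and a speed command that persists once the hand returns to neutral, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the chapter's Unity or Arduino code: the function names are hypothetical, and the normalized motor command range of −1 to 1 and the exact cubic increment are assumed. The ±30° and ±10° limits are the values given in the text.

```python
MAX_HAND_ANGLE_DEG = 30.0    # maximum recognized hand rotation (from the text)
MAX_STEER_ANGLE_DEG = 10.0   # maximum steering angle of the car (from the text)


def steering_angle(hand_roll_deg: float) -> float:
    """Direct 3:1 mapping from hand roll to wheel angle.

    The wheel follows the hand, so releasing the hand to neutral (0 deg)
    returns the wheels to a zero angle.
    """
    clamped = max(-MAX_HAND_ANGLE_DEG, min(MAX_HAND_ANGLE_DEG, hand_roll_deg))
    return clamped * MAX_STEER_ANGLE_DEG / MAX_HAND_ANGLE_DEG


def motor_command(current: float, hand_pitch_deg: float) -> float:
    """Incremental motor command in the assumed range [-1, 1].

    A cubic function of the normalized pitch is added to the current command,
    so a neutral hand leaves the command, and hence the speed, unchanged.
    """
    u = max(-1.0, min(1.0, hand_pitch_deg / MAX_HAND_ANGLE_DEG))
    return max(-1.0, min(1.0, current + u ** 3))


# Half-range pitch gesture, then the hand returns to neutral while steering briefly.
power = motor_command(0.0, 15.0)       # 0.5**3 = 0.125
power = motor_command(power, 0.0)      # neutral hand: the command is held
turn = steering_angle(-15.0)           # hand held at -15 deg gives -5 deg at the wheels
released = steering_angle(0.0)         # hand released: wheels return to zero
```

The ratio scales a coarse 30° hand motion down to a fine 10° wheel motion, which is one way imprecise human input can be translated into steadier control of the car.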
SUMMARY AND CONCLUSIONS
The ability to control vehicles via gesture-based control is achievable. Additionally, the emergence of head-mounted augmented reality technology may make it preferable. However, at this point, there is still a strong preference for the joystick approach. This may be the result of a combination of familiarization and maturity of the technology. The joystick-based controller has been around for a number of years; its basic design is well tailored and its functions are well developed. The gesture-based system is still new and can be intuitive, but it is not something that
individuals are very familiar with and comfortable using at this point in time. While care was taken in this effort to implement natural gestures, they were still new ideas. However, participants learned the new system quickly (on the order of minutes) to achieve a moderate level of vehicle control and stated they felt more connected to the vehicle using this approach. Another conclusion is that the available gesture capture systems are highly accurate and capable of detecting a wide range of hand motions. These hand motions can subsequently be transformed into control commands. However, the human is not as precise. To make these systems more usable, data processing and control systems need to be implemented that smooth out the variations in human performance and thus stabilize the vehicle control. Also, being a new interface will require training environments to familiarize the user with the range of motion required since it is basically unlimited by any mechanical constraints. For example, a joystick is a mechanical-based system and it has motion limits. A hands-free gesture system is wide open and limited only by the ergonomic boundaries of the operator. Establishing proper training environments showed that this motion can be easily learned if natural, accepted, and intuitive movements are considered in the design. Further, the research has revealed other subtleties on motion that were not originally considered. For example, the original gesture concept was to simply rotate flat hands via a pitch and roll motion of the wrist. Observations showed the preferred neutral position of a hand was slightly offset and semi-rounded, making it more suitable for a virtual trackball concept. Testing on a larger scale is required to further investigate human performance and preferences for this control idea. The system developed in this research is now set up to conduct these larger-scale tests. 
As these new technologies and types of interfaces become more commonplace and more familiar to the user, research and development in these areas will expand. The incorporation of new technology such as gesture recognition, combined with various software solutions, will become as commonplace and accepted as the joystick technology and control interfaces of today.
REFERENCES
A. Givens, M. E. Smith, H. Leong, L. McEvoy, S. Whitefield, R. Du & G. Rush, “Monitoring Working Memory Load During Computer-Based Tasks with EEG Pattern Recognition Methods,” Human Factors: The Journal of the Human Factors and Ergonomics Society, vol. 40, no. 1, pp. 79–91, 1998.
A. H. Smeragliuilo, N. J. Hill, L. Disla & D. Putrino, “Validation of the Leap Motion Controller Using Markered Motion Capture Technology,” Journal of Biomechanics, vol. 49, no. 9, pp. 1742–1750, 2016.
A. Sarkar, R. K. Ganesh Ram, K. A. Patel & G. K. Capoor, “Gesture Control of Drone Using a Motion Controller,” in International Conference on Industrial Informatics and Computer Systems (CIICS), pp. 1–5, Sharjah, 2016.
A. Scicali & H. Bischof, “Useability Study of Leap Motion Controller,” in Proceedings of the International Conference on Modeling, Simulation and Visualization Methods (MSV), Athens, Greece, 2015.
Adeept. “Adeept,” [Online]. Available: https://www.adeept.com/. [Accessed 18 September 2019].
C. Balog, B. Terwillinger, D. Vincenzi & D. Ison, “Examining Human Factors Challenges of Sustainable Small Unmanned Aircraft Systems (sUAS),” in Advances in Human Factors in Robots and Unmanned Systems. Vol 499 of the series Advances in Intelligent Systems and Computing, New York, NY, Springer International Publishing, 2016, pp. 61–73.
C. Rorie & L. Fern, “UAS Measured Response: The Effect of GCS Control Model Interfaces on Pilot Ability to Comply with ATC Clearances,” in Proceedings of the Human Factors Ergonomics Society 58th Annual Meeting, 2014.
D. R. Lampton, B. Knerr, B. R. Clark, G. Martin & D. A. Washburn, “ARI Research Note 2306-6 - Gesture Recognition System for Hand and Arm Signals,” United States Army Research Institute for Behavioral Sciences, Alexandria, VA, 2002.
F. Weichert, D. Bachmann, B. Rudak & D. Fissler, “Analysis of the Accuracy and Robustness of the Leap Motion Controller,” Sensors, vol. 13, no. 5, pp. 6380–6393, 2013.
I. Staretu & C. Moldovan, “Leap Motion Device Used to Control a Real Anthropomorphic Device,” International Journal of Advanced Robotic Systems, vol. 13, no. 113, 2016.
J. Guna, G. Jakus, M. Pogacnik, S. Tomazic & J. Sodnik, “An Analysis of the Precision and Reliability of the Leap Motion Sensor and Its Suitability for Static and Dynamic Tracking,” Sensors, vol. 14, no. 2, pp. 3702–3720, 2014.
J. R. Cauchard, L. E. Jane, K. Y. Zhai & J. A. Landay, “Drone & me: An exploration into natural human-drone interaction,” in Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, Osaka, Japan, 2015.
Leap Motion. “Leap Motion,” [Online]. Available: https://www.leapmotion.com/. [Accessed 16 September 2019].
M. Chandarana, E. L. Meszaros, A. Trujillo & B. D. Allen, “Natural Language Based Multimodal Interface for UAV Mission Planning,” in Proceedings of the Human Factors and Ergonomics Society 2017 Annual Meeting, Los Angeles, CA, 2017.
M. S. Hamilton, P. Mead, M. Kozub & A. Field, “Gesture Recognition Model for Robotic Systems of Military Squad Commands,” in Interservice/Industry, Training, Simulation and Education Conference, Orlando, FL, 2016.
P. Ravassard, A. Kees, B. Willers, D. Ho, D. Aharoni, J. Cushman, Z. M. Aghajan & M. R. Mehta, “Multisensory Control of Hippocampal Spatiotemporal Selectivity,” Science, vol. 340, no. 6138, pp. 1342–1346, 2013.
S. Dodd, J. Lancaster, A. Miranda, S. Grothe, B. DeMers & B. Rogers, “Touch Screens on the Flight Deck: The Impact of Touch Target Size, Spacing, Touch Technology and Turbulence on Pilot Performance,” in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Chicago, IL, 2014.
S. Zollman, C. Hoppe, T. Langlotz & G. Reitmayr, “FlyAR: Augmented Reality Supported Micro Aerial Vehicle Navigation,” IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 4, pp. 560–568, 2014.
12
The Influence of New Realities
How Virtual, Augmented, and Mixed Reality Advance Training Methods in Aviation
Graham King, Kendall Carmody, and John Deaton
CONTENTS
Introduction
Virtual, Mixed, and Augmented Reality as We See It … or Don’t
The Growth of Technology in Aviation Training
Use of AR, MR, and VR in the Aviation Field
The Shortage and the Future
Benefits, Drawbacks, and Resolutions
Conclusion
References
DOI: 10.1201/9781003401353-12

INTRODUCTION
As times change and technology grows, the expectations of humankind grow in tandem: businesses desire processes that are simpler and more efficient, the public requires goods and services that are more advanced and easier to obtain, and industries look to faster training methods. As these expectations become reality and the world becomes more automated with each industrial revolution, it has become increasingly apparent that virtual, augmented, and mixed reality may be a significant factor in the present transformation. Despite their unfamiliarity to many, these new realities have been all around us, advancing industries since their introduction. The rapid growth of this immersive technology has demonstrated its commercial capabilities in recent years, prompting many companies to jump at the opportunity to maximize their efficiency and effectiveness through the implementation of such devices. Several big names like Microsoft, Facebook, and VIVE have been relentlessly developing this technologically advanced equipment
for over a decade. With the current growth in aviation and shortage of pilots, maintenance personnel, and flight crew, the Federal Aviation Administration’s hesitation to recognize and implement virtual, mixed, and augmented reality for training in its regulations may soon end. Recognized aviation schools have already begun to test out virtual reality headsets in their training. Maintenance, repair, and overhaul (MRO) aircraft mechanics have turned to augmented reality for training and operations in order to decrease the time of task completion. By simplifying movements, providing workers with a step-by-step guide as they perform tasks, and reducing the risk involved with rarely-before-seen repairs, these aircraft mechanics hope to use augmented and mixed reality to streamline MRO. American Airlines gave its new flight attendants a chance to practice and train using a virtual reality lab to reduce error and the cost of running a simulator for longer. Even branches of the military are using state-of-the-art virtual reality technology to train their classes of pilots. Virtual, mixed, and augmented reality products have some drawbacks, although minor, that affect user comfort and the way in which the technology can be used. However, researchers and programmers are already finding ways to counteract these drawbacks, and the potential for a paradigm shift within the field of aviation makes implementing these devices a necessity.
VIRTUAL, MIXED, AND AUGMENTED REALITY AS WE SEE IT … OR DON’T
Virtual reality (VR), typically the most recognizable of the three reality-changing technologies, is a system which eliminates the user’s current reality and transports them to a simulated world, using a medium, such as a headset, to stimulate their brain and perceptions with the created virtual elements. As for mixed and augmented reality, previous research has had difficulties in defining the terms, as their meaning is debated among scholars, engineers, and researchers (Yung & Khoo-Lattimore, 2017). However, a relatively simple version defined by Intel Corporation states that augmented reality overlays simulated/digital information on the real world, while mixed reality combines real-world and simulated/digital elements, meaning one manipulates both the simulated elements and the real world together (Intel, 2019). Microsoft, viewed as a leader in the mixed-reality network, has been heading up its efforts in the creation of the HoloLens, a mixed-reality headset designed to create easy solutions for business training by projecting a 3D model into real-world elements (Foundry4, 2019). Both types of simulated reality (augmented (AR) and mixed (MR)) are becoming more popular in the gaming industry as well. Google’s spinoff company Niantic implemented AR into one of the most popular games on the market: Pokémon GO (Foundry4, 2019). Game players can download the app on their smartphones and spend countless hours walking around outside in the “real world” collecting various Pokémon and battling with trainers they find mixed into their surroundings. It is only a matter of time before children walking the streets equipped with augmented reality devices is a “norm.”
Altering what users see in real life does not come without risk, however. Many players of the aforementioned game Pokémon GO for example were involved in accidents attributed to distractions while playing. The Journal of the American Medical Association conducted a study analyzing keywords in tweets and news articles from July 10 through 19, 2016 (Ayers et al., 2016). Taking a random sample of 4,000 tweets, they found that 33% indicated that a driver, passenger, or pedestrian was distracted by Pokémon GO (Ayers et al., 2016). The study found that around 18% of the 4,000 tweets suggested a person was playing and driving (Ayers et al., 2016). In the study, they also found that there were 14 car crashes as a result of the game in news reports during the same time period (Ayers et al., 2016). VR sickness, disorientation, seizures, and a lack of direct transfer of learning can also be inherent problems people find when using VR, MR, and AR devices in excess; however, with all disadvantages considered, the ease of learning, cost effectiveness, decreased risk of harm in training, and faster training can make the use of this technology inherently beneficial. While the risks should be strongly analyzed, they should not deter the use or implementation of such devices; rather users should heed caution and use them in a proper manner: one that enhances training and is in accordance with the manufacturer’s suggestions.
THE GROWTH OF TECHNOLOGY IN AVIATION TRAINING
Overall, high demand for these systems and the need to innovate within certain industries has led many companies to devote much of their time and resources to developing fully functioning augmented, mixed, and virtual reality systems, which are now constantly being incorporated into various forms of training, sometimes before they are even fully developed. The aviation industry in particular is always looking for new strategies to modernize and restructure its practices and procedures to maintain the safest skies for everyone in the most efficient way possible. For years, airlines, flight schools, and various other training centers have used and will continue to use simulators, flight training devices (FTDs), and aviation training devices (ATDs) in their airman certification training, testing, and checking tasks. As technology flourished, the Federal Aviation Administration (FAA) and Department of Transportation (DOT) adapted their regulations and standards to employ these now-certified methods of training. In the FAA’s advisory circular AC 120-40B, they specifically state that “as technology progressed and the capabilities of flight simulation were recognized, FAR revisions were made to permit the increased use of simulators in approved training programs” (United States Department of Transportation, 1991, p. 2). On February 2, 1970 greater use of FAA-approved airplane simulators was permitted when training airline crews; to avoid congestion in the air, effective August 1st, 1996, the FAA issued a final rule that permits the use of flight simulators and flight training devices for most airman certification training, testing, and the checks themselves (Federal Aviation Administration, 1996; Federal Register, 1996).
The FAA document made these changes to be “consistent with a state-of-the-art training concept and recognizes industry recommendations for the expanded use of sophisticated flight simulation” (Federal Register, 1996, p. 1). They have also stated that
simulators provide more “in depth training than can be accomplished in airplanes and provide a very high transfer of learning and behavior from the simulator to the airplane” (United States Department of Transportation, 1991, p. 2). Given how rapidly the use of simulators in training has grown since they were first introduced in the 1950s, some experts believe that VR, as it develops, will follow a similar if not faster track. Tom Hoffman, the managing editor of FAA Safety Briefing, indicated in an article that VR is an up-and-coming area for use with the ATD and the flight simulation training device (FSTD), especially with it boasting “broader visuals and 3-D imaging” (Federal Aviation Administration, 2017, p. 13). It has been demonstrated by trainers and researchers at several conferences such as NTAS and, as Hoffman mentions, at big events like FlightSimCon (Federal Aviation Administration, 2017). Aviation regulations regarding VR, AR, and MR are likely to be implemented sooner rather than later as the FAA becomes more confident in the technology and its capabilities, given its lower cost and real-world feel.
USE OF AR, MR, AND VR IN THE AVIATION FIELD

As virtual reality headsets become more readily available for industry and general purposes alike, research on their effects has grown. There is an extensive body of research across many industries on VR and performance, and previous work focuses mainly on using VR to enhance training performance. Such cutting-edge technology is being swiftly implemented in training and education for the aviation industry. Understanding the benefits and limitations of these training methods is crucial as the technology becomes more mainstream.

Using VR devices to enhance visual inspection training is a prominent area of research for the aviation industry and one that has yielded some notable results. A study on this subject was conducted by Vora et al. (2002) in the Department of Industrial Engineering at Clemson University. The aim of the study was to compare subjects’ perceptions of two different systems against an actual aft-cargo bay environment to identify which system had the most support for future training. The systems were ASSIST, a computer-based training program, and VR, an SGI Onyx2 VR system (Vora et al., 2002). Fourteen subjects were randomly selected from a population of graduate and undergraduate students at Clemson University (Vora et al., 2002). All subjects used both systems, with the treatment factor being the type of system used. To cancel out any order effects, the order in which the subjects used the systems was counterbalanced (Vora et al., 2002). An immersive tendency questionnaire and a presence questionnaire were also administered (Vora et al., 2002). The results indicated that the VR system was better for the inspection task and was preferred over the PC-based aircraft inspection simulator (Vora et al., 2002).
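The counterbalancing described above can be illustrated with a short sketch. The source reports only that the order was counterbalanced across the 14 subjects; the even 7/7 split, the random allocation, and the names `counterbalance`, `subject_ids`, and `seed` are assumptions made here for illustration and are not taken from Vora et al. (2002).

```python
import random


def counterbalance(subject_ids, conditions=("ASSIST", "VR"), seed=0):
    """Split subjects evenly between the two possible presentation orders.

    Hypothetical helper: the exact allocation scheme used in the study
    is not reported in the text.
    """
    rng = random.Random(seed)       # fixed seed keeps the assignment reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)           # random allocation of subjects to orders
    half = len(shuffled) // 2
    forward = tuple(conditions)
    backward = tuple(reversed(conditions))
    # First half uses ASSIST then VR; second half uses VR then ASSIST.
    return {
        sid: (forward if i < half else backward)
        for i, sid in enumerate(shuffled)
    }


orders = counterbalance(range(1, 15))   # 14 subjects, as in the study
for sid in sorted(orders):
    print(sid, " -> ".join(orders[sid]))
```

Because every subject experiences both systems and each order is used equally often, any advantage gained simply from doing the task a second time is spread evenly across ASSIST and VR.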
Areas of the aviation industry are beginning to take this advanced technology more seriously and conduct increased research on the impact of implementing it into everyday training, especially with respect to maintenance and visual inspection. While research with VR, AR, and MR continues to be conducted and methods improved upon, some organizations and training hubs are beginning to give
The Influence of New Realities
the current technology a trial run. At aviation-related universities with flight programs in particular, such as the University of North Dakota (UND), the transition to VR and MR devices has become necessary. Following the lead of commercial flight training devices, UND’s John D. Odegard School of Aerospace Sciences recently incorporated VR head-mounted devices (HMDs) in its all-new VR lab (Weirauch, 2020). Using the HTC VIVE VR headset together with yoke, throttle, and rudder pedal controls, students new to aviation can begin to develop situational awareness and basic piloting skills at the stations (Murphy, 2020). Far from being imaginary or fake, the simulator bay makes use of photorealistic images with a 360-degree full-cockpit view of one of the school’s Piper Archer single-engine aircraft (Murphy, 2020). For the school, the lab serves as an extra training tool when the simulators are being used for checks and more advanced training. In addition, because VR headsets have a reputation for being fun and easy to use, the lab helps to draw in students who may have an interest in becoming commercial pilots, especially since each VR station is on wheels and can easily be transported around the school. Schools like the University of North Dakota want to train their students in the most thorough yet cost-effective way possible, while upholding their reputation as premier schools for future aviators. While higher-end VR headsets like the new HTC VIVE Cosmos Elite can cost around $900 in total, FRASCA flight training devices, which are some of the most common simulators used in training, start around $200,000 per unit before any upgrades (VIVE, 2021; Frasca Flight Simulation, 2021). With repair costs and labor added on top of that, the cost of even the cheapest simulators can add up quickly.
As VR develops and becomes more trusted by the FAA and by school flight training programs, it could potentially save thousands of dollars by requiring less actual time in the aircraft for pilots-in-training and less use of costly simulators. Beth Bjerke, Associate Dean of Aerospace Sciences at the university, indicated that UND plans to partner with other universities and organizations to gather data and build a case to present to the FAA on the legitimacy of virtual reality in flight training (Murphy, 2020).

MR and AR have been an exciting advantage in the aviation industry, making flight simulators and other sources of training all the more realistic. Many flight schools and universities have implemented some sort of AR or MR as a training medium. One of the many reasons it has proven so useful is the spatial memorability it affords students, trainees, and employees; that is, AR, VR, and MR all affect long-term memory by enhancing memory recall (Macchiarella et al., 2005). Muscle memory (e.g., head movements during a task) is a large factor in this enhanced memory recall (Macchiarella et al., 2005). Much like driving to the same destination repeatedly becomes so ingrained in our brains that on occasion we may forget driving there, repeated movements that produce a positive end result when using these devices may also become encoded and automatically processed. When these head-mounted devices bring the brain to a different or enhanced reality, they can keep the focus on the task and eliminate distractions that may detract from learning and encoding.

These devices are being used not only on the civil side of the aviation industry; the military has also been employing virtual reality in much of its training to shorten training time. Through their Pilot Training Next (PTN) program,
student pilots working toward their wings in the United States Air Force could put on the HTC VIVE Pro headset and enter the cockpit of their T-6 trainer, or any aircraft, with little to no effort, hearing the real-time sounds of the engine and noises from the cockpit along with the sound of their instructor’s voice, as in the real aircraft (Losey, 2018). To avoid wasting time, instructors can start the student midair so they can practice loops, barrel rolls, and any other maneuvers necessary for training that day, without using up the resources of the plane or taking on the danger of a real military jet (Losey, 2018). Using biometrics and artificial intelligence, the instructors can monitor their students, checking pulse rate, heartbeat, blood pressure, and stress levels to see how engaged they are and whether the lesson is too stressful, and adjust the lessons accordingly (Losey, 2018). Compared with legacy simulators, which use a static screen, this technology is an immersive experience, similar to an IMAX movie, in which the virtual environment surrounds the student and tricks their brain into thinking they are actually in the cockpit (Losey, 2018). When they are dismissed back to their dorms for the day, students can practice or fine-tune their skills using their own setup, which they share with a roommate, so they can progress faster (Losey, 2018).

The time saved is exactly what the Air Force is looking for and is a major reason to use this kind of technology. It usually takes a year to graduate from the pilot training program, but students of PTN were able to graduate just four months after starting (Losey, 2018). Another major reason the Air Force developed this program is cost. Legacy simulators for class and training can cost $4.5 million each, while the headsets run for less than $1,000 (Losey, 2018). With the controls and other equipment added, the VR headset and devices would cost around $15,000 (Losey, 2018).
Because of the low cost and the small footprint involved, the Air Force could build 20 simulators (Losey, 2018). Since there are so many students, these devices let them make productive use of their time rather than waiting for the one legacy simulator, for good weather, or for the aircraft they need to become available.
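The cost figures reported by Losey (2018) can be put side by side with a few lines of arithmetic. The sketch below uses only the two purchase prices quoted above; it deliberately ignores maintenance, software, staffing, and facility costs, which the source does not itemize, and the function names are hypothetical.

```python
LEGACY_SIMULATOR_COST = 4_500_000   # USD per legacy simulator (Losey, 2018)
VR_STATION_COST = 15_000            # USD per VR station with controls (Losey, 2018)


def stations_per_legacy_simulator(legacy_cost=LEGACY_SIMULATOR_COST,
                                  station_cost=VR_STATION_COST):
    """How many VR stations the purchase price of one legacy simulator buys."""
    return legacy_cost // station_cost


def fleet_cost(n_stations, station_cost=VR_STATION_COST):
    """Purchase price of a group of VR stations."""
    return n_stations * station_cost


stations = stations_per_legacy_simulator()
twenty_station_cost = fleet_cost(20)
share_of_legacy = twenty_station_cost / LEGACY_SIMULATOR_COST

print(f"VR stations per legacy simulator: {stations}")
print(f"Cost of 20 VR stations: ${twenty_station_cost:,}")
print(f"Share of one legacy simulator's price: {share_of_legacy:.1%}")
```

On purchase price alone, the 20 stations mentioned in the text would cost a small fraction of a single legacy simulator, which is consistent with the cost argument made for the PTN program.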
THE SHORTAGE AND THE FUTURE

Boeing forecasts that “763,000 new civil aviation pilots, 739,000 new maintenance technicians and 903,000 new cabin crew members will be needed to fly and maintain the global fleet over the next 20 years,” and although the COVID-19 pandemic has decreased commercial air travel dramatically, the industry plans to recover. The number of outgoing flights is expected to rise again, and a large portion of the workforce is predicted to reach mandatory retirement age at around the same time (Boeing, 2020, p. 2). However, because of the decrease in commercial travel and cuts to their budgets, airlines have had to furlough nonessential maintenance workers, cabin crew, and pilots despite the need to maintain aircraft in storage. Because a shortage in aircraft maintenance, repair, and overhaul (MRO) personnel was already in effect, completing training in the minimum amount of time and keeping costs low will be crucial to keeping the industry stable.

VR has shown repeatedly that because students can learn independently and on their own time, they can work at their own pace and push themselves to finish while feeling fully comfortable. However, in VR there is a disconnect between a
user’s current reality and their simulated one, which can make interacting with and using the surroundings tricky. Because of the disconnect from reality that VR creates, certain tasks are better suited to AR or MR. In AR, the link between the real world and the simulated medium exists because AR is simply a projection of symbols, words, and computer-aided design elements onto what the user presently sees (Ceruti et al., 2019). Having a virtual aide to help students or maintenance personnel is crucial for properly handling demanding and complex maintenance tasks where high risk is involved. An article in the Journal of Computational Design and Engineering thoroughly discusses the use of AR in aeronautical maintenance in an Industry 4.0 context, that is, the fourth industrial revolution (Ceruti et al., 2019). This next revolution encompasses smart factories, where data are shared through connected machines and devices in the Internet of Things (IoT); completely autonomous systems that need no human intervention on complex issues; and machine learning (Ceruti et al., 2019). AR systems could be as simple as see-through glasses with a camera and small projector, or even a smartphone or tablet where the camera captures the external environment and the screen is the output (Ceruti et al., 2019). In both situations, when the user moves the camera, the symbols or virtual projections would not change in reference to the external world, but they would move with it or stay in place according to where they were calibrated; in other words, they change position in the video output to stay aligned with the surroundings (Ceruti et al., 2019).
In a study on wearable technology, 15 mechanics employed by GE Aviation were asked to perform a complex maintenance task and a simpler maintenance task twice: once in the traditional manner, with paper manual instructions and normal tools, and once using smart glasses and a Wi-Fi-enabled torque wrench (Robertson et al., 2018). Over the course of three weeks, one maintenance professional each day would perform the tasks, working on the Variable Geometry Actuator (VGA) as the simpler component and a Main Fuel Pump on the CF34-8C engine as the complex component (Robertson et al., 2018). The team chose Google Glass, and participants were given a Wi-Fi-enabled Atlas-Copco Saltus MWR85TA torque wrench that could connect with the technology and sense their movements during the second portion of the study (Robertson et al., 2018). At the conclusion of the study, it was found that the wearable technology with augmented reality-like components reduced completion time for both tasks: by 7.7% for the VGA and by 11.6% for the main fuel pump (Robertson et al., 2018). With the glasses, the participants were able to complete the tasks more quickly because all the necessary information was overlaid in their view, so they did not have to go back and forth, sometimes up and down a ladder, between where they were working and the manuals. Being able to stay in one place saved a lot of time, and many stated that once they are more familiar with the wearable technology, certain tasks may be completed even faster because they will know how to use it properly (Robertson et al., 2018). In the survey portion of the study, 60% of the participants preferred the wearable technology over the more traditional method (Robertson et al., 2018).
Another study, a collaboration between a design engineer at Boeing and Iowa State University, tested individuals who used three different methods to assemble the wings of an aircraft, a challenging process requiring over 50 steps with nearly 30 parts (Augmented Reality for Enterprise Alliance, 2015). The three methods were a stationary desktop computer with a work instruction PDF file, a mobile tablet with the work instruction PDF file, and a mobile tablet equipped with AR software that could show guided steps for task completion using graphical overlays (Augmented Reality for Enterprise Alliance, 2015). The study found that when the users had the tablets in the AR mode, on average, they made zero errors (Augmented Reality for Enterprise Alliance, 2015). The group using AR also completed the tasks faster the first time than the groups using the other methods (Augmented Reality for Enterprise Alliance, 2015). The results also indicated that AR reduced the time to complete the wing by about 30% and gave a 90% increase in first-build quality for the AR mode in comparison with the desktop (Augmented Reality for Enterprise Alliance, 2015).

When it comes to aircraft and the safety of pilots and passengers, maintenance, repair, and overhaul is of utmost importance. The FAA and other governing bodies recognize this fact, which is why mechanics must go through an arduous certification process and obtain years of experience and training. As technology grows and aircraft systems become more electronic and complex, maintenance has become more specialized, and the training of maintenance personnel must be technologically adapted and regulated to avoid accidents like those involving the MCAS technology on the Boeing 737 MAX 8.
By incorporating AR into aircraft manuals, which can often cover hundreds of parts, the operator gains the ability to see the position of the part he or she is looking for on the actual aircraft itself (Ceruti et al., 2019). One of the main challenges facing the implementation of this technology is time: time to input the manuals and add the animations and projections. However, if that work is done beforehand in another location by the aircraft manufacturer and loaded onto the device to be used when needed, the problem could be solved simply (Ceruti et al., 2019). On the training side, AR can make complex maintenance tasks and situations much easier because trainees do not need physical parts that facilities may not have or that are hard to come by. Augmented reality can also reduce the time, cost, and resource waste of printing a new manual every time a maintenance update comes out. It also allows more than one aviation professional to use the manual when only a limited number of copies exist. AR permits the employee, trainee, or mechanic to access the required information on his or her device with just a few taps.

The development and employment of AR in an industry context has had a long journey: first developed in 1992, it has reached a level of maturity where it could satisfactorily be made available to various factories and aviation facets, such as aircraft maintenance (Ceruti et al., 2019). Unfortunately, just as with virtual reality flight training at aviation-associated universities and the airlines, there are currently no legislation or certification processes in place, and this lack of legislation limits the widespread application of AR technology. However, if the market continues to employ VR, AR, and MR capabilities at the current rapid rate, governing bodies can be
pushed to develop proper rules and this technology could be in place all around us within the next couple of years (Ceruti et al., 2019). Beyond maintenance, many creators of VR and AR programs and trainers in different sectors of aviation are taking this to heart as they work toward a more self-directed training method, in which students can work on the parts they do not feel comfortable with or knowledgeable about and skip through the parts on which they already feel competent. In this way, employee training can be faster and more streamlined to help with the shortage.

In a presentation given at the World Aviation Training Summit in 2019, instructor Roger Lowe of American Airlines and David Jones, President and CEO of Quantified Design Solutions, described a case study they conducted from late 2017 to March 2018 to maximize the efficiency and level of training for the cabin crew of American Airlines using VR. American Airlines wanted a smoother way of teaching the aircraft, so that the training could be processed and absorbed in the best way, and a trainer that accurately represented its fleet, so it had Quantified Design develop a twelve-room VR training lab to introduce the crew to new aircraft and let them train before their certified checkout (Jones & Lowe, 2019). After a 20-minute familiarization briefing each Sunday before the start of the week, trainees or new hires who want training on one of American Airlines’ aircraft types (A321, 777, or 787), a quick refresher before a work trip, or a brush-up prior to an evaluation can use one of the rooms to practice door operations, the location of emergency equipment, and the various preflight checks that are required by the FAA (Jones & Lowe, 2019). Many VR devices and programs can track performance and give instructors statistics on what the students may need to work on and what makes sense or comes naturally to them, and the VR trainer used in this study is a prime example of that.
Students are monitored by a staff member in another room as they work during the week, and their performance in the training sections they complete is recorded under their accounts so they can keep moving forward and progress as they return (Jones & Lowe, 2019). This adaptive measure not only helps individuals home in on what they might need to work on and move to more advanced procedures when they are ready, but it also helps American Airlines’ training development team identify errors made by many students and focus on problem areas before students get to further VR training and the sims. Of the 50 students in the February class, the share reporting high self-efficacy for opening an A321 door rose from 20% of students (before the virtual reality training was introduced) to 68% (post training), which means students felt more confident and comfortable after the addition of the training (Jones & Lowe, 2019). Adding VR reduced the required use of simulators, and thus the operating costs, because the share of students required to repeat evaluations on a physical simulator went from 25% to 2% from January to March (Jones & Lowe, 2019); student performance also improved drastically, with the percentage of error-free qualifying events moving from 34% to 82% during the January to March time frame (Jones & Lowe, 2019). Considering that the cost of running a door trainer, including the instruction time, maintenance, and operation of the simulator, can reach upwards of $51 an hour, and multiplying that by all the repeats, the error evaluations, the debriefs required, and the time it takes to complete all of these, American Airlines and any other airline that may decide to make use of
this could potentially save $207,809 for the three-month new hire training (Jones & Lowe, 2019). Multiplied by three for nine months of new hire training, the savings could be upwards of $623,429, not to mention the avoided cost of a flight attendant accidentally deploying a slide or forgetting material when only a simulator is used (Jones & Lowe, 2019).
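The extrapolation above, and the before-and-after percentages reported for the American Airlines case study, can be checked with simple arithmetic. The sketch below uses only figures quoted in the text; the function names are hypothetical. Note that a straight multiplication of the three-month figure gives $623,427, two dollars below the reported $623,429. The explanation offered here, that the reported total was computed from unrounded underlying values, is an assumption and is not stated by Jones and Lowe (2019).

```python
QUARTER_SAVINGS = 207_809        # USD, three-month new hire training (Jones & Lowe, 2019)
REPORTED_NINE_MONTH = 623_429    # USD, nine-month figure quoted in the text


def extrapolate(savings_per_period, periods):
    """Scale a per-period saving linearly across several periods."""
    return savings_per_period * periods


def percentage_point_change(before_pct, after_pct):
    """Difference between two percentages, in percentage points."""
    return after_pct - before_pct


nine_month = extrapolate(QUARTER_SAVINGS, 3)
gap = REPORTED_NINE_MONTH - nine_month

# January-to-March changes reported in the case study.
repeat_change = percentage_point_change(25, 2)        # repeat evaluations
error_free_change = percentage_point_change(34, 82)   # error-free qualifying events

print(f"Straight extrapolation: ${nine_month:,} (reported: ${REPORTED_NINE_MONTH:,}, gap ${gap})")
print(f"Repeat evaluations: {repeat_change:+d} percentage points")
print(f"Error-free qualifying events: {error_free_change:+d} percentage points")
```

Linear scaling also assumes that class sizes and repeat rates stay constant across the nine months, which the presentation does not guarantee.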
BENEFITS, DRAWBACKS, AND RESOLUTIONS

In this ever-changing world, the expansion of technology has pushed for more modern and relatable ways of learning, not only for students but also for the educators, mentors, parents, and military personnel who are tasked with teaching the next generation using tools they may be unfamiliar with. For example, during the COVID-19 pandemic, when in-person classes and gatherings were not possible, many universities were forced to train their students and staff to use Zoom or other programs that allow video communication between students and their teachers, video chatting, screen sharing, and collaborative learning; even job interviews have been conducted with applications such as Zoom. Using technology to advantage has become paramount for successful teaching and learning, so that students can still attend school and learn no matter what the circumstances. Advances in available resources have allowed classrooms to shift in ways past generations would never have dreamed of: from chalkboards to tablets and computers, from textbooks to ebooks and podcasts, from analog projectors to state-of-the-art ceiling-mounted projectors incorporating Apple TVs, and from speech class and speech therapy to VR-based exposure therapy.

Virtual reality has many important advantages that enhance the learning experience and make it easier, especially in the aviation field. It can be extremely interactive as users work through the given program, and they can practice the skills they are taught in a stress-free manner; any mistake can quickly be reset with the press of a button, and students can learn what they did wrong and fix it. By practicing what they learned repeatedly and quickly without having to find new resources, users can build the confidence they need to feel proficient.
While this can be extremely beneficial, one of the main concerns aviation experts have about implementing this technology is the difficulty of convincing students’ brains that they are in a real-life flight situation when they subconsciously know they can just tap the stop or pause button, get the simulator to reset, and have another attempt if things go badly; there is no fear factor or pressure to get it right if there is no real feel (Ellis, 2019). NASA’s solution to this missing element of realism is its “Fused Reality” technology (Conner, 2015). It combines computer-generated scenes and environments with real-world video by means of a flight helmet with a special optical system that can overlay the graphics of another plane, an airfield, or a potentially dangerous situation onto the outside camera view (Conner, 2015). This technique allows for an extremely immersive way of training to prepare pilots for all situations in a safe manner. Challenging tasks such as aerial refueling, formation flying, or high-speed landings can be practiced without the huge risk, but because the training is done in the aircraft with the feel and sound of being in the air, it is as real as it gets.
The pilots are essentially able to take the simulator into the air with them, seeing real elements, such as the mountains and clouds, while having the simulation right there with them to practice more difficult tasks (Conner, 2015).

In the aforementioned American Airlines study, the virtual reality lab’s training system offers many advantages; however, it can still only be used as a supplement to the simulators and to training on the aircraft itself. Virtual reality methods are not yet certified by the FAA, so students and trainees still have to get qualified and take examinations on a physical simulator, and training may not transfer cleanly because the virtual environment can differ markedly from a simulator. Designers can make the simulated programs as sound and smooth as possible to instill good procedures, but subconsciously the student will not have the stress of a real situation, and they may not receive the same tactile and auditory feedback or stimulation as onboard the aircraft.

Many health hazards exist with using VR, MR, and AR as well. Prolonged exposure to the virtual reality world can cause extreme disorientation; the user may experience dizziness, nausea, or headaches (Chang et al., 2020). A study examining virtual reality sickness found that among the 858 survey participants involved, “48.6–52.8% of participants had experienced VR sickness regardless of age” (Lim et al., 2021, Section 3.1). In addition to the survey, when looking at brain waves, the study found that “across two repeated tests, all waves showed significant changes when EEG baseline and EEG sickness were compared” (Lim et al., 2021, Section 3.2).
When exposed to virtual reality, many of the participants experienced sickness and hit the VR sickness button. Although many designers build in mitigating elements and safeguards, including higher-fidelity technology, it remains quite possible that users will get sick, especially from long-term exposure in training, where their brains are already being worked hard. Although extreme and unusual, some users (about 1 in 4,000) will experience dizziness, seizures, or blackouts as a result of VR, among other things (Lim et al., 2021). These are mysterious and disconcerting effects, especially if the technology is not used properly; however, for the majority of people this is not a concern if use is limited and conducted in an appropriate manner. The guidebooks to VR devices like the Oculus Rift even suggest taking breaks in the VR experience or reminding users how long they have been using it (Chang et al., 2020).
CONCLUSION

The immeasurable growth in technology gives people the opportunity to use an extensive array of tools to make life easier and more enjoyable, both in their personal lives, with tools such as a Roomba for cleaning floors, an Oculus Rift for playing virtual reality games, a smart TV that users can speak to in order to watch what they want, or smart lights that turn off simply by asking Alexa, and across entire industries. The use of VR and AR in the medical field, for example, has allowed substantial growth and opened new horizons for what can be accomplished: dentists can use AR projections of patient data to help them build more precisely calculated caps or crowns, nurses can work through many examples of different patients using an AR-equipped tablet, and VA-ST SmartSpecs
can allow the legally blind or critically visually impaired to recognize faces, see their environment, and find lost items (Alliance of Advanced BioMedical Engineering, 2017). Many enthusiasts and experts hope that aviation will be the next industry influenced and expanded by this revolutionary technology; for this to be possible, the FAA, ICAO, and government bodies must adapt their regulations to accommodate, recognize, and implement these tools as acceptable aids and training methods. As the technology is enhanced further, improvements are made, and more use cases emerge, AR, MR, and VR will become a huge part of the learning environment in the next several years.
REFERENCES

Alliance of Advanced BioMedical Engineering. (2017). Augmented Reality to Revolutionize the Health Care. The Alliance of Advanced BioMedical Engineering; Frost & Sullivan. https://aabme.asme.org/posts/novel-augmented-reality-technology-to-revolutionize-the-health-care-industry
Augmented Reality for Enterprise Alliance. (2015, August 20). Augmented Reality Can Increase Productivity. AREA. https://thearea.org/augmented-reality-can-increase-productivity/
Ayers, J. W., Leas, E. C., Dredze, M., Allem, J.-P., Grabowski, J. G., & Hill, L. (2016). Pokémon GO: A New Distraction for Drivers and Pedestrians. JAMA Internal Medicine, 176(12), 1865. https://doi.org/10.1001/jamainternmed.2016.6274
Boeing. (2020). Pilot and Technician Outlook. Boeing. https://www.boeing.com/resources/boeingdotcom/market/assets/downloads/2020_PTO_PDF_Download.pdf
Ceruti, A., Marzocca, P., Liverani, A., & Bil, C. (2019). Maintenance in aeronautics in an Industry 4.0 context: The role of Augmented Reality and Additive Manufacturing. Journal of Computational Design and Engineering, 6(4), 516–526. https://doi.org/10.1016/j.jcde.2019.02.001
Chang, E., Kim, H. T., & Yoo, B. (2020). Virtual Reality Sickness: A Review of Causes and Measurements. International Journal of Human–Computer Interaction, 36(17), 1658–1682. https://doi.org/10.1080/10447318.2020.1778351
Conner, M. (2015, September 29). Fused Reality. NASA. https://www.nasa.gov/centers/armstrong/features/fused_reality.html
Ellis, C. (2019, October 9). Are VR flight simulators the future of pilot training? | Air Charter Service. Aircharterservice.com. https://www.aircharterservice.com/about-us/news-features/blog/are-vr-flight-simulators-the-future-of-pilot-training
Federal Aviation Administration. (1996). FAA Historical Chronology. https://www.faa.gov/about/history/chronolog_history/media/b-chron.pdf
Federal Aviation Administration. (2017). Sim City. Federal Aviation Administration Safety Briefing. https://www.faa.gov/news/safety_briefing/2017/media/novdec2017.pdf
Federal Register. (1996, July). Aircraft flight simulator use in pilot training, testing, and checking at training centers; Final rule. Federal Register, 61(128), 34508–34568.
Foundry4. (2019, February 19). 7 Augmented Reality Companies to Watch. Foundry4.com. https://foundry4.com/7-augmented-reality-companies-to-watch
Frasca Flight Simulation. (2021, February 20). How Much Does a Frasca Simulator Cost? Frasca Flight Simulation. https://www.frasca.com/how-much-does-a-frasca-simulator-cost/
intel. (2019). Virtual Reality Vs. Augmented Reality Vs. Mixed Reality. intel. https://www.intel.com/content/www/us/en/tech-tips-and-tricks/virtual-reality-vs-augmented-reality.html
The Influence of New Realities
329
Jones, D., & Lowe, R. (2019). Maximizing Virtual Reality Cabin Crew Training: A Case Study. WATS 2019. Quantified Design Solutions and American Airlines Presentation https://www.wats-event.com /wp-content/uploads/2019/05/Jones_ Lowe.pdf Lim, H. K., Ji, K., Woo, Y. S., Han, D., Lee, D.-H., Nam, S. G., & Jang, K.-M. (2021). Test-Retest Reliability of the Virtual Reality Sickness Evaluation Using Electroencephalography (EEG). Neuroscience Letters, 743, 135589. ScienceDirect. https://doi.org/10.1016/j .neulet.2020.135589 Losey, S. (2018, September 30). The Air Force Is Revolutionizing the Way Airmen Learn to Be Aviators. Air Force Times. https://www.airforcetimes.com /news/your-air-force /2018/09/30/the-air-force-is-revolutionizing-the-way-airmen-learn-to-be-aviators/ Macchiarella, N., Liu, D., & Gangadharan, S. (2005) Augmented Reality as a Training Medium for Aviation/Aerospace Application. Murphy, C. (2020, January 30). Simulators on Campus: UND Aerospace Launches VR FlightTrainer. UND Today. Robertson, T., Bischof, J., Geyman, M., & Lise, E. (2018). Reducing Maintenance Error with Wearable Technology. 2018 Annual Reliability and Maintainability Symposium (RAMS). pp. 1–6. https://doi.org/10.1109/ram.2018.8463068 U.S Department of Transportation. (1991). Advisory Circular 120-40B. Federal Aviation Administration. https://www.faa.gov/documentLibrary/media /Advisory_Circular/120 -40B1.pdf VIVE. (2021, March 11). Find the right high-end VR system for you | VIVE United States. http://www.vive.com /us/product/ Vora, J., Nair, S., Gramopadhye, A. K., Duchowski, A. T., Melloy, B. J., & Kanki, B. (2002). Using Virtual Reality Technology for Aircraft Visual Inspection Training: Presence and Comparison Studies. Applied Ergonomics, 33(6), 559–570. https://doi.org/10.1016 /s0003- 6870(02)00039-x Weirauch, C. (2020, April 27). Enhanced Pilot Training Via VR. The Journal for Civil Aviation Training, 31(2). Halldale Group. Yung, R., & Khoo-Lattimore, C. (2017). 
New Realities: A Systematic Literature Review on Virtual Reality and Augmented Reality in Tourism Research. Current Issues in Tourism, 22(17), 2056–2081. https://doi.org/10.1080/13683500.2017.1417359
13
Training, Stress, Time Pressure, and Surprise: An Accident Case Study
Julianne M. Fox and Mustapha Mouloua
CONTENTS
Introduction and Background
Training and System Design
An Accident Case Study: Colgan Air Flight 3407
Why Didn’t Previous Training Prevent This Accident?
Opportunities for Better Outcomes
    Training: A Countermeasure for Responding to Startle and Surprise
    Automation Design: Keeping the Human in the Loop Where Feasible
Highlighting the Lessons Learned
References
DOI: 10.1201/9781003401353-13
INTRODUCTION AND BACKGROUND
In all activities in which we engage, there is risk, and flying in an aircraft is no exception. However, perhaps contrary to public perception, flying in an aircraft is safer than everyday activities in which many of us regularly choose to engage, such as driving a car or traveling by rail, bus, or ferryboat, to name just a few (Savage, 2013). The aviation system is, by design, incredibly robust, and over the years safety in this sector has continued to improve (Savage, 2013). Catastrophic accidents have become increasingly rare. However, challenges in the system still exist. Over the last couple of decades, advances in automation technologies have driven improvements to the aviation system by decreasing the physical workload required to operate aircraft, facilitating all-weather flying, and increasing fuel efficiency, system reliability, and flight safety (Wiener, 1988; Mouloua et al., 1997, 2010). However, these advances have come at a cost. As flight operations have become more automated, flight crews have faced human factors challenges such as a deterioration of situation awareness (Endsley, 1999; Kaber & Endsley, 1997), automation-induced complacency (Wiener, 1977; Parasuraman et al., 1993; Mouloua et al., 1993a, 2010), and increased mental workload (Endsley, 1999). These unintended consequences of automation use have resulted in loss-of-control accidents following unexpected transitions to manual control (Parasuraman
et al., 1992; Mouloua et al., 2010). These challenges are compounded by the effects of interruption and distraction (Ferraro et al., 2017, 2018; Stader, 2014; Dismukes et al., 1998; Latorella, 1996; Dismukes, 2006), by automation reliability (Wickens & Dixon, 2007; Oakley et al., 2003; Ferraro et al., 2017, 2018), and by the impact of reliability upon operator trust in automation (Parasuraman, 1987; Parasuraman & Riley, 1997), which ultimately affects how automation is used. Despite continuous improvements in system design and reliability, information availability, procedures and training, and the ability to practice in simulated environments, approximately 60–80% of accidents, including aviation accidents, continue to stem from human error (Lautman & Gallimore, 1987; Rasmussen et al., 1994). To offset this known fallible link in the system (i.e., human error), part of a flight crew’s training is designed to present unexpected, challenging scenarios that require peak human performance for a successful response. The problem is that human perception and performance can be significantly degraded when expectations are violated (Martin et al., 2016; Dewar & Olson, 2007; Olson & Farber, 2003; Hole & Tyrell, 1995; Foyle et al., 2002), and even more so when stress and time pressure exist or are perceived to exist (Landman et al., 2017a; Casner et al., 2013; Beringer & Harris, 1999; Wickens et al., 1993; Sheridan, 1981; Easterbrook, 1959). The best-known countermeasure is training, where and when possible to a level of automaticity (Staal, 2004; Driskell et al., 1986; Holt & Rainey, 2002; Driskell et al., 1992; Dismukes, 2007). Because it is impossible to train for all possible scenarios, it is crucial to develop specific simulated training exercises.
Thus, it is recommended that flight crews be exposed to a wide array of practical training and simulation exercises that can target pilot skills with regard to unpredictable events, unforeseen situations, and emergencies. These scenarios will include framing mismatches following surprising and/or startling events (Landman et al., 2017a; Rankin et al., 2016; Kochan, 2005). It has long been established that the accumulation of knowledge and skill through practice and experience serves to offset pilot performance decrements in response to such surprising and/or startling events (Landman et al., 2017a). And flight simulators have long since enabled training organizations to expose flight crews to such scenarios in a realistic, but safe environment. In fact, as a result of a growing number of loss of control accidents with a startle/surprise element, regulatory agencies are now recommending that scenarios incorporating startle and surprise be included in training programs (Federal Aviation Administration, 2015; European Aviation Safety Agency, 2015; International Civil Aviation Organisation, 2013). However, the key challenge continues to be to capture the right scenarios for the training syllabus (both initial and recurrent training) and to replicate the surprise and/or startle element that would likely exist outside of the simulator if such a situation were to unfold (Driskell et al., 1992; Gainer & Sullivan, 1976). When an experienced flight crew is presented with a scenario that is within the same context and with the same attributes which they were exposed to in previous training, the likelihood of an effective response is higher than when the context and/or attributes are seen for a first time (i.e., a novel presentation of the scenario such as likely occurred during the accident
discussed in this case study). We attribute it to the way in which expertise manifests itself over time. Expert responses can be either helped or hindered in accuracy and response time by an expert’s usage of pattern recognition and intuition in formulating their response as compared to novices. The responses of experts, when accurate, can be faster and more effective, but when inaccurate can be strong, but wrong (Reason, 1990) which is evidenced by an absence of the cognitive flexibility required to be able to shift scripts when a response is not achieving the desired outcome. Many factors such as experience, expertise, attention, motivation, heuristics, framing, and biases are likely to influence the entire process related to a response to surprise (Mauro et al., 2001; Kochan, 2005). As a result, it is recommended that a portion of flight crew training needs to focus upon the training of reframing skills in response to framing mismatches following scenarios where surprising and/or startling events ensue (Landman et al., 2018; Landman et al., 2017a, 2017b, 2017c; Rankin et al., 2016; Casner et al., 2013).
TRAINING AND SYSTEM DESIGN
As a part of the design and certification process of an aircraft, the consequences of function loss are identified, and based upon the consequence level (Advisory Circular 25.1309-1A, 1988), the system’s allowable probability of failure is determined. For functions whose failure will have less of an impact on a given flight, a higher probability of failure is permissible. But for those where a loss of function is likely to result in a catastrophic aircraft accident (i.e., a hull loss), a more robust and/or redundant system design is mandated, one that by design (statistically) should never fail during the life of an aircraft. For those failures that by design can occur unexpectedly, at a point in the flight where a near-immediate and accurate response is required, and with the potential to surprise and/or startle the flight crew, repetitive training has historically been provided as a countermeasure (e.g., V1 engine cuts, in which an engine fails just prior to takeoff, and wing stall recovery). However, exposing flight crews to all possible perturbations (i.e., with the same attributes and within the same context) for each of these failures is not realistic. According to the regulations governing transport category aircraft certification, 14 Code of Federal Regulations (CFR) 25.1309 (2007), each system must be designed to meet its intended function, and in the event of a system failure, the flight crew must receive a warning alerting them to the unsafe system operating condition. According to 14 CFR 25.1302 (2013), the operational behavior of the system must be predictable, unambiguous, and designed to enable the flight crew to intervene in a manner appropriate to the task. When failures occur, the system must enable the flight crew to take corrective action, and as a part of the certification process, the potential for the failure to go unnoticed must be taken into account.
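The relationship described above, in which the allowable probability of failure tightens as the consequence of the failure grows, can be sketched in a few lines of Python. This is only an illustration: the severity names, the numerical ceilings (order-of-magnitude placeholders of the kind associated with certification guidance), and the function name are assumptions and are not values quoted in this chapter.

```python
from enum import Enum


class Severity(Enum):
    """Hypothetical consequence levels for a loss of function."""
    MINOR = "minor"
    MAJOR = "major"
    CATASTROPHIC = "catastrophic"


# Assumed, illustrative ceilings on failure probability per flight hour.
# The more severe the consequence, the smaller the allowable probability.
ASSUMED_MAX_PROBABILITY = {
    Severity.MINOR: 1e-3,
    Severity.MAJOR: 1e-5,
    Severity.CATASTROPHIC: 1e-9,
}


def meets_allowable_probability(severity: Severity, predicted: float) -> bool:
    """Return True when the predicted failure probability is within the
    assumed ceiling for the given consequence level."""
    return predicted <= ASSUMED_MAX_PROBABILITY[severity]
```

Under these assumed ceilings, a function whose loss would be catastrophic and whose predicted failure probability is 1e-7 per flight hour would not pass, whereas the same prediction would be acceptable for a function whose loss has only a major consequence.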
AN ACCIDENT CASE STUDY: COLGAN AIR FLIGHT 3407
To further the understanding of these challenges, we will present a case study of an aviation accident that occurred in Buffalo, New York, on February 12, 2009. In
this chapter, we focus upon how, on a particular day, a flight crew’s perception and performance were impacted by system design, automation use, distraction, loss of situational awareness, weather, and the flight crew’s training. Additionally, we will explore why the repetitive prior training did not serve as an adequate countermeasure on this day. We will describe the events that led to this accident and discuss the key human factors and system design challenges which contributed to this outcome. Finally, we will propose some guidelines for the design of system interfaces and training and highlight the lessons learned from the presented case study. This case study and the guidelines presented are applicable not only to the aviation domain but also to any domain where similar challenges exist (e.g., autonomous vehicles, spacecraft, and maritime vessels).
On Thursday, February 12, 2009, at approximately 10:17 pm, a Colgan Air Inc. Bombardier Dash 8-Q400 (N200WQ), operating as Continental Express Flight 3407, crashed while on approach into Buffalo Niagara International Airport. Four crew members, including the captain and first officer, and 45 passengers were fatally injured, and the aircraft was destroyed. One person on the ground was fatally injured as well. The flight was a Title 14 Code of Federal Regulations (CFR) Part 121 scheduled passenger flight (i.e., a transport category flight flown by a Continental Airlines commuter airline) from Newark Liberty International Airport (Newark) to Buffalo Niagara International Airport (Buffalo) and occurred during night instrument meteorological conditions (IMC). During the approach into Buffalo, the aircraft flew primarily in the clouds (i.e., IMC) and in and out of icing conditions. Night IMC prevented the flight crew from seeing the horizon outside of the aircraft, requiring them to monitor instrumentation in the aircraft for this information.
Having a visible horizon would have enabled them to more easily maintain a sense of their geographical orientation during their approach into a major metropolitan area by merely looking outside the aircraft. Simply put, flying in IMC is a higher workload task than flying in visual conditions, especially with respect to maintaining situational awareness (Rousseau et al., 2004; Endsley, 2000; Klein, 2000; Shebilske et al., 2000; Endsley, 1999; Durso & Alexander, 2010; Wickens, 2002; Kaber & Endsley, 1997). In accordance with this flight crew’s training and procedures, at the time that the accident sequence began, they were utilizing their autopilot (as opposed to flying manually), which in turn required them to monitor what the autopilot was commanding the aircraft to do as well. As discussed in the introductory chapter, the use of automation has many advantages; however, it also creates the opportunity for flight crews to lose track of what the automation is commanding the aircraft to do (i.e., to experience a loss of situational awareness), setting up the potential for what is referred to as automation surprise, which stems from automation misuse (i.e., a flight crew becoming surprised as a result of not monitoring the automation effectively) (Parasuraman & Riley, 1997). Additionally, at the time the accident sequence began to unfold, the first officer was likely focused upon requesting their landing data. At that time, she was also in the midst of determining the weather at their arrival destination, selecting and inputting the arrival runway, communicating with the captain about the weather and runway, and checking with the cabin crew to see if there were any “specials”
– all potentially required tasks that likely served to distract from the monitoring of the automation (Simons & Chabris, 1999; Dismukes, 2010; Dismukes, 2006; Dismukes, 2007; Dismukes et al., 1998).
During their approach into Buffalo, as a result of the potential for icing given the weather conditions, the flight crew turned on the aircraft’s deicing system by selecting the Reference Speed (REF SPD) switch to ON. Once this decision was made and the system was turned on, the flight crew also needed to request landing speeds that would account for the potential accumulation of ice on the surface of the wing during their final approach. Landing speeds for icing conditions are notably faster than those used when icing is not likely to exist. If ice were to begin to adhere to the aircraft’s wing during the approach (there is no evidence to suggest that this happened on this night), higher airspeeds would theoretically enable the aircraft to maintain greater lift than at the lower non-icing landing airspeeds, offsetting the decrement in lift caused by ice obstructing the smooth laminar airflow across the wing. To request landing airspeeds (either icing or non-icing), the pilot monitoring (in this case, the first officer) was required to make the request through an onboard system referred to as the Aircraft Communications Addressing and Reporting System (ACARS). To do so, the first officer was required to input the request into the Flight Management Computer (FMC) by manually typing it on a keypad. To request icing airspeeds, the first officer first had to remember that this information needed to be entered into the FMC, then select the correct page within the FMC, type “ICING” in the correct location using the correct spelling, and submit the request.
Unfortunately, either a misspelling such as “ICEING” (versus “ICING”) or forgetting to go to the correct page and enter the information altogether would result in the default non-icing airspeeds being provided to the flight crew. Although we don’t know whether the first officer forgot to make the request (a slip) or misspelled the input (an error of commission), we do know that on this night, 24 minutes before the aircraft crashed, the flight crew received non-icing airspeeds that were incongruent with the REF SPD switch position, and neither crew member detected the discrepancy. Slips and errors of commission are both types of human error (Norman, 1981; Reason, 1990). Regardless of which type occurred, the system interface design did not enable the error to be easily or readily noticed. This mismatch set the accident sequence in motion.
As discussed, setting the REF SPD switch to the ON position activates the aircraft’s deicing system; however, it also increases the airspeed at which stall warning information is conveyed to the flight crew. When the aircraft approaches an airspeed at which the stall warning system determines that the wing will stall, leaving the aircraft unable to sustain adequate lift to fly, a system called the stick shaker automatically activates. The stall warning is very salient in that it provides attention-getting tactile, visual, and audible feedback to the flight crew: the yoke of the aircraft vibrates (hence the name stick shaker), a loud vibrating sound is emitted, and visual feedback is provided via the flight displays. This feedback is designed to capture human attention and focus it upon these warnings.
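The interface weakness described above (an unrecognized entry silently falling back to non-icing speeds, and no cross-check against the REF SPD switch) can be illustrated with a minimal Python sketch. It does not represent the actual ACARS or FMC logic; the keyword set, function names, and alert wording are hypothetical and serve only to show how rejecting unknown input and comparing two pieces of configuration state could make such an error visible.

```python
from typing import Optional

# Hypothetical set of accepted request keywords (an assumption for illustration).
VALID_CONDITIONS = {"ICING", "NON-ICING"}


def parse_condition(raw: str) -> str:
    """Normalize a typed request and reject anything unrecognized,
    rather than silently defaulting to non-icing speeds."""
    token = raw.strip().upper()
    if token not in VALID_CONDITIONS:
        raise ValueError(
            f"Unrecognized condition {raw!r}; expected one of {sorted(VALID_CONDITIONS)}"
        )
    return token


def configuration_alert(ref_spd_switch_on: bool, condition: str) -> Optional[str]:
    """Return an alert message when the REF SPD switch position and the
    requested landing speeds disagree, or None when they are consistent."""
    speeds_are_icing = condition == "ICING"
    if ref_spd_switch_on and not speeds_are_icing:
        return "REF SPD switch ON but non-icing landing speeds received"
    if not ref_spd_switch_on and speeds_are_icing:
        return "REF SPD switch OFF but icing landing speeds received"
    return None
```

In this sketch, a misspelling such as "ICEING" raises an error at entry time, and the combination present on the accident flight (switch ON, non-icing speeds) produces an explicit alert instead of passing unnoticed.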
On this night, had icing speeds been requested, the VREF airspeed supplied to the flight crew would have been 138 knots; however, since this did not occur, a non-icing target airspeed of 118 knots was delivered to the first officer, which was 20 knots slower than it should have been given the aircraft’s configuration. As a result, the stall warning system activated at an airspeed 20 knots faster than the flight crew would have reasonably anticipated, and this framing mismatch, as evidenced by the flight crew’s reaction, both surprised and startled them. At the time the VREF airspeed target was received, the first officer set the speed bug, which depicts a fly-to target on the airspeed indicator (i.e., the airspeed tape). Because this target airspeed (118 knots) was out of view on the airspeed tape at the time it was being set (given the depicted airspeed resolution of the tape) and only the numerical presentation was visible, neither flight crew member had the opportunity to readily detect that a configuration error was occurring. Had the resolution of the airspeed tape enabled the 118-knot position on the tape to be visible at the time the bug was being set (or had feedback been provided in another form), the flight crew would have had the opportunity to see that the target airspeed they were attempting to set was too slow (i.e., within the low-speed cue) for their configuration. This feedback would likely have provided an opportunity to stop the accident sequence. Along the edge of the airspeed tape, a low-speed cue, which appears as a red and black barber pole, clearly identifies the airspeed region in which the stall warning system will activate, and it would be reasonable to expect that a flight crew would not intentionally set an airspeed bug at any airspeed within the low-speed cue.
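The arithmetic behind this mismatch, and the kind of feedback that was missing when the bug was set, can be sketched as follows. The 138-knot and 118-knot values come from the text above, and the 131-knot figure is the speed at which the stall warning activated on the accident flight as reported in this case study. Treating that activation speed as the top of the low-speed cue is a simplifying assumption, and the function and constant names are hypothetical.

```python
VREF_ICING_KT = 138        # VREF had icing speeds been requested (from the case study)
VREF_NON_ICING_KT = 118    # VREF actually delivered to the flight crew
STALL_WARNING_KT = 131     # speed at which the stall warning activated; assumed here
                           # to approximate the top of the low-speed cue


def bug_within_low_speed_cue(bug_speed_kt: float, cue_top_kt: float) -> bool:
    """Return True when the selected speed bug lies inside the low-speed cue,
    i.e., below the speed at which the stall warning would activate."""
    return bug_speed_kt < cue_top_kt


# Difference between the speed that should have been set and the one that was.
vref_shortfall_kt = VREF_ICING_KT - VREF_NON_ICING_KT

# Margin between the selected bug and the stall warning speed (negative means
# the warning activates before the target airspeed is reached).
margin_kt = VREF_NON_ICING_KT - STALL_WARNING_KT
```

With these numbers, the shortfall is 20 knots and the margin is negative 13 knots, so a check of this kind would have flagged the 118-knot bug at the moment it was set, whereas a 138-knot bug would have passed.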
Thus, as the flight crew was configuring the aircraft for landing, the autothrottles were commanding the aircraft to decelerate and the autopilot was subsequently commanding an increase in the pitch attitude of the aircraft, as required during any final approach phase. In the final sequence of events, the first officer lowered the landing gear as required and shortly thereafter extended the flaps to 10°, another required step in the process. As the aircraft decelerated below 131 knots (on its way to the 118-knot target airspeed that had already been selected), the stall warning system activated and the autopilot disconnected, instantaneously transferring control of the aircraft from the autopilot and autothrottles to the captain. This abrupt, unexpected transition of control required a near-immediate manual intervention to enact the appropriate recovery procedure for an approaching wing stall. This likely came as quite a surprise to the flight crew because, given the target airspeed they had received, the stall warning occurred at a significantly faster airspeed than they would have reasonably expected. Thus, their expectation was likely significantly violated by this automation surprise. Additionally, the stick shaker, given its attention-getting properties and its occurrence at an unexpected time, likely served to startle the flight crew as well. A perception of significant time pressure also likely existed, as a near-immediate response is usually required for such a recovery maneuver. The communication between the crew retrieved from the Cockpit Voice Recorder (CVR) and their performance recorded by the Digital Flight Data Recorder (DFDR) indicate that stress and perceived time pressure existed. From both the flight data recorder and the CVR, we know that the flight
crew’s response was the opposite of what was required to recover from an approaching wing stall. The captain should have decreased the pitch attitude of the aircraft (i.e., pushed the nose of the aircraft down), enabling an increase in lift, but instead he manually increased the pitch attitude to over 30°, which in turn induced a wing stall. Approximately 3.7 seconds after the stick shaker activated, the captain increased power (albeit to a setting less than appropriate), which was another error committed in this chain of events. As the captain increased the aircraft’s pitch to over 30°, the aircraft rolled into a 50° left bank (i.e., significantly greater than the 30° maximum bank angle typically flown during normal operations) until the stick pusher (another automated system, which automatically pushes the nose of the aircraft down in response to an approaching wing stall to reduce the angle of attack and enable the wing to produce lift) activated for the first time and the master caution warning (another attention-getting flight deck effect) also activated. At this point, the aircraft’s pitch attitude decreased to approximately 5° nose down and the aircraft rolled into a 100° right bank (resulting in the aircraft becoming inverted). At an airspeed of less than 110 knots (which would be considered quite slow for this aircraft in this configuration), the first officer inappropriately began to retract the flaps and then the landing gear, which were the final erroneous actions committed by this flight crew prior to the complete loss of control of the aircraft. Flaps enable the aircraft to fly at a slower airspeed, so retracting them would have resulted in the aircraft stalling sooner (i.e., at a faster speed). Within 26 seconds of the stick shaker first activating, the aircraft had crashed.
WHY DIDN’T PREVIOUS TRAINING PREVENT THIS ACCIDENT?
Given that all pilots receive extensive training in recovery from an approaching wing stall from the earliest stages of flight training and continue to receive such training throughout their entire careers (including as transport category airline pilots), we are interested in exploring, from a human factors standpoint, the underpinnings of this tragic accident. First, during our investigation, we noticed that the attributes of the approach to stall which existed on this night did not align with the typical scenarios used during most/all simulated stall training, so it is very likely that this flight crew had not previously experienced an approach to stall presented in this insidious manner. That is, the approach to stall developed while the autopilot was automatically trimming a slowly increasing pitch-up attitude, the autothrottle was decelerating the aircraft by retarding the throttles, and the aircraft was slowing to a purposefully selected (albeit mismatched) airspeed that would unexpectedly trigger the stall warning earlier than likely anticipated (given the misconfiguration), all at an altitude in close proximity to the ground, at night without a visual horizon, and in conditions susceptible to icing. Added to this likely novel set of attributes, a conversation was taking place between the flight crew which likely served to further distract them from what the autopilot was commanding the aircraft to do (resulting in poorer situational awareness and ultimately in an automation surprise). Thus, when the autopilot disconnected, the captain was ill prepared to take over and
manually control the aircraft. The startle/surprise likely experienced by the flight crew while in close proximity to the ground could, in and of itself, explain the captain erroneously increasing back pressure in response to this situation. However, the first officer’s decision to retract the flaps during the attempted recovery sequence raises the further question of why this may have occurred. Consequently, there is an additional factor that needs to be taken into consideration. This flight crew had received tailplane stall recovery training prior to the accident. However, the tailplane stall training was administered in the form of a video produced by NASA, and no hands-on training in the airplane or simulator was made available to this flight crew. Given that this flight crew had been required to watch this video and their response to the stall was more aligned with a tailplane stall recovery, we are left to ponder whether they were confused as to which type of stall warning they were encountering. The NASA Tailplane Icing Video (produced by Glenn Research Center) indicates that tailplane stalls are more likely to occur in icing conditions and during the approach phase as the aircraft decelerates. Crews are warned that some of the cues foreshadowing the conditions for this type of stall can be missed when the autopilot is in use. Flight crews are also warned that the differences between tailplane and wing stall are subtle; however, the recovery procedures are exact opposites. For an approach to wing stall, flight crews are instructed to add power and relax back pressure or push forward on the yoke or joystick, depending upon the trim setting (i.e., decrease pitch attitude). For a tailplane stall, pilots are instructed to do the very opposite: pull back on the yoke/joystick (i.e., increase pitch attitude), reduce flaps (to the last position), and reduce power (on some aircraft), as power aggravates the tail stall condition.
Additionally, during a tailplane stall, the nose of the aircraft will pitch down subsequently requiring significant back pressure to counteract it. In this accident, the activation of the stick pusher (another automated system installed on the accident aircraft which automatically pitches the nose of the aircraft down) may have served to confirm a bias that the aircraft was experiencing a tailplane stall (confirmation bias). Important to note is that flight crews received very little exposure to the stick pusher activation during their training. Also, due to the lack of effective crew resource management depicted during the recovery sequence during this accident flight (e.g., verbal communication between the two), we are uncertain as to whether a negative transfer of training (related to wing versus tailplane stall recovery procedures) may have had any bearing on this flight crew’s procedural steps, but we suggest that it should be considered as a possibility.
OPPORTUNITIES FOR BETTER OUTCOMES
Training: A Countermeasure for Responding to Startle and Surprise
We have briefly discussed the historic training approach for emergency and non-normal procedures where time pressure and stress are likely to exist at the time an accurate crew response is required. Historically, certain scenarios (such as recovery to an approaching wing stall) have been repeatedly practiced by flight crews such
that they are able to develop a level of automaticity in their response. This approach has served as the best countermeasure to prevent an ineffective, incorrect, or delayed response which could otherwise occur due to the normal, natural effects of stress and time pressure on human performance (e.g., hypervigilance, panic, etc.). However, this approach is effective as long as all of the attributes of the scenario practiced exist if and when the scenario presents itself during flight. Given the likelihood that a flight crew will not face a “textbook” scenario (i.e., at least some of the attributes are likely to be different), we have learned that the rigidity of this training approach needs to be addressed (Burratto & Graef, 2022; Landman et al., 2018). We have learned that flight crews are best served by experiencing variety in the setup for the scenarios in which they are presented during their training. Through experiencing a range of scenarios (e.g., in the attributes leading up to the stall warning), flight crews can train their ability to adapt to the variation in presentation, yet still apply a timely and appropriate response. This approach enables flight crews to train their ability to accommodate the differing attributes (i.e., make sense of them), reframe the situation, and apply an appropriate and timely response (Burratto & Graef, 2022; Landman et al., 2018). Over the last 20 years, the approach to flight crew training has been transitioning from a historically task-based approach, which focused upon providing repetitive training for predicted situations, to a competency-based approach which places a greater emphasis on training the skills which support enhanced performance when flight crews face startling, unanticipated situations (e.g., resilience resulting from demonstrating competence and confidence across a wide variety of scenarios). 
According to Burratto and Graef (2022), the Competency-Based Training and Assessment (CBTA) approach seeks to prepare flight crews for an infinite number of situations by developing a finite number of competencies that enable crews to successfully manage unexpected situations. By being exposed to a variety of situations, as opposed to the highly anticipated training scenarios that were historically presented, flight crews have demonstrated that they become better able to adapt to novel situations. Not only do flight crews benefit from this type of training, but research has also shown that there is opportunity for improvement in human performance when interfacing with automated systems (Stader et al., 2013).
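The idea of varying the setup of a training scenario, rather than repeating one "textbook" presentation, can be illustrated with a short Python sketch that enumerates combinations of scenario attributes and draws a different subset for each session. The attribute names and values below are hypothetical, loosely modeled on the attributes of the accident approach discussed earlier, and the sketch is not a description of any actual CBTA syllabus tool.

```python
import itertools
import random

# Hypothetical scenario attributes for an approach-to-stall exercise.
SCENARIO_ATTRIBUTES = {
    "automation": ["autopilot engaged", "manual flight"],
    "visibility": ["day visual", "night IMC"],
    "icing": ["no icing", "icing conditions"],
    "altitude": ["high altitude", "close to the ground"],
    "distraction": ["none", "concurrent task"],
}


def all_variations(attributes):
    """Enumerate every combination of attribute values as a list of dicts."""
    keys = list(attributes)
    value_lists = [attributes[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


def sample_session(attributes, count, seed=None):
    """Draw `count` distinct scenario variations for one training session."""
    rng = random.Random(seed)
    return rng.sample(all_variations(attributes), count)
```

Five two-valued attributes already yield 32 distinct presentations of the same underlying event, which gives a sense of why repeating a single setup covers only a small part of what a crew might encounter.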
Automation Design: Keeping the Human in the Loop Where Feasible
As previously described, enabling the operator to transition system control from the human to the machine has no doubt added a layer of safety and efficiency to flight operations, as it allows a demanding task to be offloaded during high-workload periods and enables greater precision of flight path control. In essence, it provides an additional resource to the flight crew at a time when they need and/or want it. Today’s modern aircraft allow a pilot to select the level of automation desired. Unfortunately, there are also downsides to this added resource. As standard operating practice has allowed (or even dictated) flight crews to use automation during a significant portion of their operations, the flight crew’s situational awareness and manual proficiency have been negatively impacted,
and this trend in reliance is likely to only be exacerbated by today’s worldwide shortage of expert pilots. Research into alternative designs (Stader et al., 2013) in which either the machine or the human can determine when best to transition control has shown great promise for leveraging the benefits of automation while limiting the associated decrements. Unlike the fixed, “static” automation reviewed previously, adaptive automation (AA) is more human-centered and dynamic: users and machines can mutually exchange tasks and roles, so that function allocation shifts between a user (e.g., an operator such as a pilot or driver) and a machine (computer, autopilot, intelligent tutoring system, etc.). This dynamic exchange, also referred to as adaptive function or task allocation, can be initiated by either a computer or a human operator, as prescribed by adaptive automation philosophy (Scerbo, 1996, 2007). Invoking adaptive automation relies on various task parameters, such as performance criteria and theoretically based models of adaptation (Mouloua et al., 1993b; Parasuraman et al., 1996; Mouloua et al., 2002; Scallen et al., 1995; Harris et al., 1995), and on subjective, physiological, and secondary-task measures of workload state (Morrison & Gluckman, 1994; Hilburn et al., 1997; Byrne & Parasuraman, 1996; Kaber & Riley, 1999). The mechanism for invoking AA is shared: it can be initiated by either the machine (adaptive) or the human (adaptable). An example currently in use is the pairing of the Automatic Ground Collision Avoidance System (Auto-GCAS) and the Pilot Activated Recovery System (PARS) installed on the F-16. It is designed to transition control away from the pilot when Controlled Flight Into Terrain (CFIT) is imminent, as can occur in military operations as a result of spatial disorientation and/or G-induced Loss of Consciousness (G-LOC).
When the pilot realizes they need assistance and is capable of soliciting it, the adaptable automation system PARS is available to them. However, when the pilot is unable to solicit automated assistance due to, for example, a loss of consciousness, Auto-GCAS automatically takes control of the aircraft (adaptive automation). By design, the rules for invoking AA pre-exist the scenario and can be triggered by algorithms using various performance metrics (such as detection accuracy, reaction time, flight path deviations, etc.), physiological responses (heart rate, HRV, EEG/ERP waveforms, fNIRS, etc.), workload-based measures (Byrne & Parasuraman, 1996; Kaber et al., 2001; Bailey et al., 2006), and/or modeling and performance-based adaptation methods (Parasuraman et al., 1996; Mouloua et al., 1993b; Kaber et al., 2005). These studies demonstrated that when users revert to manual control, even for a short period of time, and then switch back to automation control, their performance is markedly improved over the course of subsequent automation cycles. Additionally, operator workload could be regulated and situation awareness could be maintained or improved under automation control due to the benefits of AA (Kaber et al., 2001, 2005, 2007; Kaber & Endsley, 2004). This is mainly due to enhanced operator engagement in the task, as well as their active control in the supervisory monitoring task (Parasuraman et al., 1996; Mouloua et al., 2019).
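The distinction between adaptable (operator-invoked) and adaptive (machine-invoked) allocation can be sketched as a small rule set. The Python example below is a minimal illustration only: the Snapshot fields, the threshold values, and the function next_controller are assumptions made for this sketch, and it does not represent how Auto-GCAS, PARS, or any system in the cited studies is implemented.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    operator_request: bool  # operator asks for automation (adaptable path)
    path_deviation: float   # hypothetical normalized flight path deviation
    reaction_time_s: float  # hypothetical most recent reaction time, seconds


def next_controller(snapshot, max_deviation=0.5, max_reaction_s=3.0):
    """Decide who should hold control, and which path invoked the change.

    The thresholds are illustrative assumptions, fixed before the scenario
    begins, mirroring the idea that invocation rules pre-exist the scenario.
    """
    # Adaptable automation: the human explicitly hands over control.
    if snapshot.operator_request:
        return ("machine", "adaptable")
    # Adaptive automation: the machine takes control when a predefined
    # performance criterion is exceeded.
    if (snapshot.path_deviation > max_deviation
            or snapshot.reaction_time_s > max_reaction_s):
        return ("machine", "adaptive")
    # Otherwise the operator stays in the loop.
    return ("human", "none")


if __name__ == "__main__":
    print(next_controller(Snapshot(True, 0.1, 1.0)))
    print(next_controller(Snapshot(False, 0.8, 1.0)))
    print(next_controller(Snapshot(False, 0.1, 1.0)))
```

Checking the operator request first is a design choice of this sketch; a physiological or workload-based trigger could be added as a further condition in the same way.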
Moray et al. (2000) also examined the benefit of AA in a fault management task, varying automation reliability across both manual and automated control and measuring root mean square error, accident avoidance, false shutdowns, subjective trust in the system, and operator self-confidence. Their findings indicated that trust in automation, but not self-confidence, was strongly affected by automation reliability.
HIGHLIGHTING THE LESSONS LEARNED
In this chapter, we examined, from a training standpoint, why the training that these pilots received (both in and out of simulators) did not serve as a meaningful countermeasure to the normal and natural hypervigilant or panic reaction that can be elicited by an event requiring an accurate and near-immediate response, in a situation where a failed response is likely to lead to a fatal outcome. We also used the failures that occurred as a basis for discussing the related human factors literature. This chapter has been written to serve as a tool to better explain these concepts using a real-world example and, ideally, to provide information that helps prevent future accidents with similar causal and contributing factors (both within and outside the aviation domain). To that end, we offer the following guidance for the design of system interfaces and training as highlighted by this case study.
• Since we cannot train responses to all possible failures, we need to train operators to retain some cognitive flexibility in the process so that they are able to reframe a scenario when required. To do so, operators should be exposed to a wide variety of unpredictable, yet practical training to exercise reframing skills in response to framing mismatches following scenarios where surprising and/or startling events ensue.
• Systems fail; however, they need to remain tolerant of human error. It is imperative that human error can be detected and remedied and that operators remain capable of doing so. Through sound design and interface testing, it is imperative to eliminate, where feasible, the potential for undetectable framing mismatches to develop.
• Enable an automation design philosophy that is more adaptive than static, resulting in greater operator awareness and improved responses.
• Distraction comes in many forms. The potential for human performance decrements that accompany distraction and inattention needs to be emphasized during training (e.g., inattentional blindness, change blindness, prospective memory error).
• Automation can be incredibly effective at reducing operator workload, but the downside is that, through increased reliance upon automation, the operator will likely become less capable of successfully handling a transition back to manual control, especially when it occurs unexpectedly and under time pressure. Operational procedures need to take situational awareness, the potential for over-reliance and complacency, and the need for skill retention into account.
• Operators require a thorough understanding of the automation and must be adept in its use, including how to effectively handle failures and unexpected disconnections.
• Negative transfer of training must be a consideration during training development. Simulator fidelity and the accuracy of the system behavior presented to operators during training are imperative considerations.
• Specific to this accident and the aviation domain: both stick pusher and tailplane stall training need to be incorporated into flight crew training, and simulators need to be able to accommodate this training in a realistic way.
REFERENCES
Advisory Circular 25.1309-1A. (1988). System design and analysis. June 21.
Bailey, N. R., Scerbo, M. W., Freeman, F. G., Mikulka, P. J., & Scott, L. A. (2006). Comparison of a brain-based adaptive system and a manual adaptable system for invoking automation. Human Factors, 48(4), 693–709.
Beringer, D. B., & Harris, H. C. Jr. (1999). Automation in general aviation: Two studies of pilot responses to autopilot malfunctions. International Journal of Aviation Psychology, 9, 155–174.
Burratto, F., & Graef, R. (2022, January). Training pilots for resilience. Safety First: The Airbus Safety Magazine #33, 18–27.
Byrne, E. A., & Parasuraman, R. (1996). Psychophysiology and adaptive automation. Biological Psychology, 42(3), 249–268.
Casner, S. M., Geven, R. W., & Williams, K. T. (2013). The effectiveness of airline pilot training for abnormal events. Human Factors, 55, 477–485.
Dewar, R., & Olson, P. (2007). Human Factors in Traffic Safety (2nd Ed., pp. 11–32). Tucson, AZ: Lawyers and Judges Publishing Company.
Dismukes, K. (2006). Concurrent task management and prospective memory: Pilot error as a model for the vulnerability of experts. Proceedings of the Human Factors and Ergonomics Society 50th Annual Meeting, 909–913.
Dismukes, K. (2007). Prospective memory in aviation and everyday settings. In M. Kliegel, M. A. McDaniel, & G. O. Einstein (Eds.), Prospective Memory: Cognitive, Neuroscience, Developmental, and Applied Perspectives (pp. 411–431). Mahwah, NJ: Erlbaum.
Dismukes, R. K. (2010). Remembrance of things future: Prospective memory in the laboratory, workplace and everyday settings. In D. Harris (Ed.), Reviews of Human Factors and Ergonomics (Vol. 6, pp. 79–122). Santa Monica, CA: Human Factors and Ergonomics Society.
Dismukes, K., Young, G., & Sumwalt, R. (1998). Cockpit interruptions and distractions: Effective management requires a careful balancing act. ASRS Directline, 10, 1–26.
Driskell, J. E., Carson, R., & Moskal, P. J. (1986). Stress and Human Performance. Final report. Orlando, FL: Naval Training Systems Center.
Driskell, J. E., Willis, R. P., & Cooper, C. (1992). Effect of overlearning on retention. Journal of Applied Psychology, 77, 615–622.
Durso, F. T., & Alexander, A. L. (2010). Managing workload, performance, and situation awareness in aviation systems. In E. Salas & D. Maurino (Eds.), Human Factors in Aviation (pp. 217–247). Burlington, MA: Elsevier.
Easterbrook, J. A. (1959). The effect of emotion on cue utilization and the organization of behavior. Psychological Review, 66, 183–201.
Endsley, M. (1999). Situation awareness in aviation systems. In D. Garland, J. Wise, & V. D. Hopkin (Eds.), Handbook of Aviation Human Factors (pp. 257–276). Mahwah, NJ: Erlbaum.
Endsley, M. R. (2000). Situation models: An avenue to the modeling of mental models. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 44(1), 61–64.
European Aviation Safety Agency. (2015). Loss of control prevention and recovery training: Notice of Proposed Amendment 2015–13. Cologne, Germany.
FAR 25.1309 Equipment, systems, and installations. November 8, 2007.
FAR 25.1302 Installed Systems and Equipment for Use by the Flightcrew. May 3, 2013.
Federal Aviation Administration. (2015). Advisory circular (120/111). Washington, DC.
Ferraro, J., Christy, N., & Mouloua, M. (2017). Impact of auditory interference on automated task monitoring and workload. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 61(1), 1136–1140.
Ferraro, J., Clark, L., Christy, N., & Mouloua, M. (2018). Effects of automation reliability and trust on system monitoring performance in simulated flight tasks. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 62(1), 1232–1236.
Foyle, D., Hooey, B., Wilson, J., & Johnson, W. (2002). HUD symbology for surface operations: Command guidance vs. situation guidance formats. SAE Transactions: Journal of Aerospace, 111, 647–658.
Gainer, C. A., & Sullivan, D. J. (1976). Aircrew Training Requirements for Nap-of-the-Earth Flight. Final Report 203-1, Santa Barbara, CA: Anacapa Sciences, Inc. Army Research Institute For The Behavioral And Social Sciences.
Harris, W. C., Hancock, P. A., Arthur, E. J., & Caird, J. K. (1995). Performance, workload, and fatigue changes associated with automation. The International Journal of Aviation Psychology, 5(2), 169–185.
Hilburn, B., Jorna, P. G., Byrne, E. A., & Parasuraman, R. (1997). The effect of adaptive air traffic control (ATC) decision aiding on controller mental workload. In M. Mouloua & J. Koonce (Eds.), Human-Automation Interaction: Research and Practice (pp. 84–91). Mahwah, NJ: Erlbaum.
Hole, G. J., & Tyrell, L. (1995). The influence of perceptual ‘set’ on the detection of motorcyclists using daytime headlights. Ergonomics, 38(7), 1326–1341.
Holt, B. J., & Rainey, S. J. (2002). An overview of automaticity and implications for training and thinking process. Alexandria, VA: U.S. Army Research Institute Research Report 1790.
International Civil Aviation Organisation. (2013). Manual of Evidence-Based Training (Nr. 9995). Montreal, Canada: Author.
Kaber, D. B., & Endsley, M. R. (1997). Out-of-the-loop performance problems and the use of intermediate levels of automation for improved control system functioning and safety. Process Safety Progress, 16(3), 126–131.
Kaber, D. B., & Riley, J. M. (1999). Adaptive automation of a dynamic control task based on secondary task workload measurement. International Journal of Cognitive Ergonomics, 3(3), 169–187.
Kaber, D. B., Riley, J. M., Tan, K. W., & Endsley, M. R. (2001). On the design of adaptive automation for complex systems. International Journal of Cognitive Ergonomics, 5(1), 37–57.
Kaber, D. B., & Endsley, M. R. (2004). The effects of level of automation and adaptive automation on human performance, situation awareness and workload in a dynamic control task. Theoretical Issues in Ergonomics Science, 5(2), 113–153.
Kaber, D. B., Wright, M. C., Prinzel III, L. J., & Clamann, M. P. (2005). Adaptive automation of human-machine system information-processing functions. Human Factors, 47(4), 730–741.
Kaber, D. B., Perry, C. M., Segall, N., & Sheik-Nainar, M. A. (2007). Workload state classification with automation during simulated air traffic control. The International Journal of Aviation Psychology, 17(4), 371–390.
Klein, G. (2000). Analysis of situation awareness from critical incident reports. In M. Endsley & D. J. Garland (Eds.), Situation Awareness Analysis and Measurement (pp. 51–71). Mahwah, NJ: Erlbaum.
Kochan, J. A. (2005). The role of domain expertise and judgment in dealing with unexpected events (PhD thesis). University of Central Florida, Orlando. Use of Frames to explain performance during surprise events.
Landman, A., van Oorschot, P., van Paassen, M. M. R., Groen, E., Bronkhorst, A., & Mulder, M. (2018). Training pilots for unexpected events: A simulator study on the advantage of unpredictable and variable scenarios. Human Factors, 60(6), 793–805.
Landman, A., Groen, E., Van Paassen, M., Bronkhorst, A., & Mulder, M. (2017a). Dealing with unexpected events on the flight deck: A conceptual model of startle and surprise. Human Factors, 59(8), 1161–1172.
Landman, A., Groen, E., Van Paassen, M., Bronkhorst, A., & Mulder, M. (2017b). The influence of surprise on upset recovery performance in airline pilots. The International Journal of Aerospace Psychology, 27(1–2), 2–14.
Landman, A., Groen, E. L., van Paassen, M., Bronkhorst, A. W., & Mulder, M. (2017c). The effect of surprise on upset recovery performance. 19th International Symposium on Aviation Psychology, 37–42.
Latorella, K. A. (1996). Investigating interruptions: An example from the flight deck. Proceedings of the Human Factors and Ergonomics Society 40th Annual Meeting, 249–253.
Lautman, L. G., & Gallimore, P. L. (1987). Control of the crew caused accident: Results of a 12-operator study. Airliner, 56(10), 1–6.
Martin, W. L., Murray, P. S., Bates, P. R., & Lee, P. S. Y. (2016). A flight simulator study of the impairment effects of startle on pilots during unexpected critical events. Aviation Psychology and Applied Human Factors, 6(1), 24–32.
Mauro, R., Barshi, I., Pederson, S., & Bruininks, P. (2001). Affect, experience, and aeronautical decision-making. Proceedings of 11th International Symposium on Aviation Psychology, Columbus, OH: The Ohio State University.
Moray, N., Inagaki, T., & Itoh, M. (2000). Adaptive automation, trust, and self-confidence in fault management of time-critical tasks. Journal of Experimental Psychology: Applied, 6(1), 44.
Morrison, J. G., & Gluckman, J. P. (1994). Definitions and prospective guidelines for the application of adaptive automation. Human Performance in Automated Systems: Current Research and Trends, 256–263.
Mouloua, M., Ferraro, J., Parasuraman, R., Molloy, R., & Hilburn, B. (2019). Human monitoring of automated systems. In M. Mouloua & P. A. Hancock (Eds.), Human Performance in Automated and Autonomous Systems: Current Theory and Methods (pp. 1–26). Boca Raton, FL: CRC Press (Taylor & Francis Group).
Mouloua, M., Gilson, R., & Koonce, J. (1997). Automation, flight management and pilot training: Issues and considerations. In R. A. Telfer & P. J. Moore (Eds.), Aviation Training: Learners, Instruction and Organization (pp. 78–86). Aldershot: Avebury Aviation.
Mouloua, M., Hancock, P., Jones, L., & Vincenzi, D. (2010). Automation in aviation systems: Issues and considerations. In J. Wise, D. Garland, & D. V. Hopkin (Eds.), Handbook of Aviation Human Factors (pp. 8-1–8-11). Boca Raton, FL: CRC Press (Taylor & Francis Group).
Mouloua, M., Parasuraman, R., & Molloy, R. (1993a). Monitoring automation failures: Effects of single and multi-adaptive function allocation. Proceedings of the 37th Annual Meeting of the Human Factors Society. Santa Monica, CA: Human Factors and Ergonomics Society, 1–5.
Mouloua, M., Parasuraman, R., & Molloy, R. (1993b). Monitoring automation failures: Effects of task type on performance and subjective workload. Proceedings of the First Mid-Atlantic Human Factors Conference, 155–161.
Mouloua, M., Smither, J. A., Vincenzi, D. A., & Smith, L. (2002). Automation and aging: Issues and considerations. In E. Salas (Ed.), Advances in Human Performance and Cognitive Engineering Research: Automation (pp. 213–237). Oxford: Elsevier.
Norman, D. A. (1981). Categorization of action slips. Psychological Review, 88(1), 1–15.
Oakley, B., Mouloua, M., & Hancock, P. (2003). Effects of automation reliability on human monitoring performance. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 47(1), 188–190.
Olson, P., & Farber, E. (2003). Forensic Aspects of Driver Perception and Response (2nd Ed.). Tucson, AZ: Lawyers and Judges Publishing Company.
Parasuraman, R. (1987). Human-computer monitoring. Human Factors, 29(6), 695–706.
Parasuraman, R., Bahri, T., Deaton, J. E., Morrison, J. G., & Barnes, M. (1992). Theory and Design of Adaptive Automation in Aviation Systems. Washington, DC: Catholic University of America cognitive science lab.
Parasuraman, R., Mouloua, M., & Molloy, R. (1996). Effects of adaptive task allocation on monitoring of automated systems. Human Factors, 38(4), 665–679.
Parasuraman, R., & Riley, V. (1997). Humans and automation: Use, misuse, disuse, abuse. Human Factors, 39(2), 230–253.
Parasuraman, R., Molloy, R., & Singh, L. (1993). Performance consequences of automation-induced “complacency”. The International Journal of Aviation Psychology, 3(1), 1–23.
Rankin, A., Woltjer, R., & Field, J. (2016). Sensemaking following surprise in the cockpit: A re-framing problem. Cognition, Technology & Work, 18, 623–642. Use of Frames to explain performance during surprise events.
Rasmussen, J., Pejtersen, A. M., & Goodstein, L. P. (1994). Cognitive Systems Engineering. New York: John Wiley & Sons, Inc. 135, 144–146.
Reason, J. T. (1990). Human Error (pp. 1–18, 53–96). Cambridge: Cambridge University Press.
Rousseau, R., Tremblay, S., & Breton, R. (2004). Defining and modeling situation awareness: A critical review. In S. Banbury & S. Tremblay (Eds.), A Cognitive Approach to Situation Awareness: Theory and Application (pp. 3–21). Hampshire: Ashgate.
Savage, I. (2013). Reflections on the economics of transportation safety. Research in Transportation Economics, 43(1), 1–8.
Scallen, S. F., Hancock, P. A., & Duley, J. A. (1995). Pilot performance and preference for short cycles of automation in adaptive function allocation. Applied Ergonomics, 26(6), 397–403.
Scerbo, M. W. (1996). Theoretical perspectives on adaptive automation. In R. Parasuraman & M. Mouloua (Eds.), Automation and Human Performance: Theory and Applications (pp. 37–63). Hillsdale, NJ: Lawrence Erlbaum.
Scerbo, M. (2007). Adaptive automation. In R. Parasuraman & M. Rizzo (Eds.), Neuroergonomics: The Brain at Work (pp. 238–252). New York: Oxford University Press.
Shebilske, W. L., Goettl, B. P., & Garland, D. J. (2000). Situation awareness, automaticity, and training. In M. Endsley & D. Garland (Eds.), Situation Awareness Analysis and Measurement (pp. 303–323). Boca Raton, FL: CRC Press.
Sheridan, T. B. (1981). Understanding human error and aiding human diagnostic behavior in nuclear power plants. In J. Rasmussen & W. B. Rouse (Eds.), Human Detection and Diagnosis of System Failures (pp. 19–35). New York, NY: Plenum.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28, 1059–1074.
Staal, M. A. (2004). Stress, cognition, and human performance: A literature review and conceptual framework. NASA/TM-2004-212824, Moffett Field, CA: Ames Research Center. National Aeronautics and Space Administration.
Stader, S. (2014). Impacts of complexity and timing of communication interruptions on visual detection tasks. Unpublished Doctoral Dissertation, UCF Stars Library: Electronic Theses and Dissertations. 4571. Retrieved from https://stars.library.ucf.edu/etd/4571.
Stader, S., Leavens, J., Gonzalez, B., Fontaine, V., Mouloua, M., & Alberti, P. (2013). Effects of display and task features on system monitoring performance in the original multi-attribute task battery and MATB-II. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 57(1), 1435–1439.
Wickens, C. D. (2002). Situation awareness and workload in aviation. Current Directions in Psychological Science, 11(4), 128–133.
Wickens, C. D., Stokes, A., Barnett, B., & Hyman, F. (1993). The Effects of Stress on Pilot Judgment in a MIDIS Simulator. In O. Svenson & A. J. Maule (Eds.), Time Pressure and Stress in Human Judgment and Decision Making (pp. 271–292). Boston, MA: Springer.
Wickens, C. D., & Dixon, S. R. (2007). The benefits of imperfect diagnostic automation: A synthesis of the literature. Theoretical Issues in Ergonomics Science, 8(3), 201–212.
Wiener, E. L. (1977). Controlled flight into terrain accidents: System-induced errors. Human Factors, 19(2), 171–181.
Wiener, E. L. (1988). Cockpit automation. In E. L. Wiener & D. C. Nagel (Eds.), Human Factors in Aviation (pp. 433–461). San Diego, CA: Academic Press.