English Pages 616 [617] Year 2023
LNCS 14016
Hirohiko Mori Yumi Asahi (Eds.)
Human Interface and the Management of Information Thematic Area, HIMI 2023 Held as Part of the 25th HCI International Conference, HCII 2023 Copenhagen, Denmark, July 23–28, 2023 Proceedings, Part II
Lecture Notes in Computer Science Founding Editors Gerhard Goos Juris Hartmanis
Editorial Board Members
Elisa Bertino, Purdue University, West Lafayette, IN, USA
Wen Gao, Peking University, Beijing, China
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Moti Yung, Columbia University, New York, NY, USA
The series Lecture Notes in Computer Science (LNCS), including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics (LNBI), has established itself as a medium for the publication of new developments in computer science and information technology research, teaching, and education. LNCS enjoys close cooperation with the computer science R&D community; the series counts many renowned academics among its volume editors and paper authors and collaborates with prestigious societies. Its mission is to serve this international community by providing an invaluable service, mainly focused on the publication of conference and workshop proceedings and postproceedings. LNCS commenced publication in 1973.
Hirohiko Mori · Yumi Asahi Editors
Editors Hirohiko Mori Tokyo City University Tokyo, Japan
Yumi Asahi Tokyo University of Science Tokyo, Japan
ISSN 0302-9743 ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-031-35128-0 ISBN 978-3-031-35129-7 (eBook)
https://doi.org/10.1007/978-3-031-35129-7

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Foreword
Human-computer interaction (HCI) is acquiring an ever-increasing scientific and industrial importance, as well as having more impact on people’s everyday lives, as an ever-growing number of human activities are progressively moving from the physical to the digital world. This process, which has been ongoing for some time now, was further accelerated during the acute period of the COVID-19 pandemic. The HCI International (HCII) conference series, held annually, aims to respond to the compelling need to advance the exchange of knowledge and research and development efforts on the human aspects of design and use of computing systems.

The 25th International Conference on Human-Computer Interaction, HCI International 2023 (HCII 2023), was held in the emerging post-pandemic era as a ‘hybrid’ event at the AC Bella Sky Hotel and Bella Center, Copenhagen, Denmark, during July 23–28, 2023. It incorporated the 21 thematic areas and affiliated conferences listed below.

A total of 7472 individuals from academia, research institutes, industry, and government agencies from 85 countries submitted contributions, and 1578 papers and 396 posters were included in the volumes of the proceedings that were published just before the start of the conference; these are listed below. The contributions thoroughly cover the entire field of human-computer interaction, addressing major advances in knowledge and effective use of computers in a variety of application areas. These papers provide academics, researchers, engineers, scientists, practitioners and students with state-of-the-art information on the most recent advances in HCI.

The HCI International (HCII) conference also offers the option of presenting ‘Late Breaking Work’, and this applies both for papers and posters, with corresponding volumes of proceedings that will be published after the conference. Full papers will be included in the ‘HCII 2023 - Late Breaking Work - Papers’ volumes of the proceedings to be published in the Springer LNCS series, while ‘Poster Extended Abstracts’ will be included as short research papers in the ‘HCII 2023 - Late Breaking Work - Posters’ volumes to be published in the Springer CCIS series.

I would like to thank the Program Board Chairs and the members of the Program Boards of all thematic areas and affiliated conferences for their contribution towards the high scientific quality and overall success of the HCI International 2023 conference. Their manifold support in terms of paper reviewing (single-blind review process, with a minimum of two reviews per submission), session organization and their willingness to act as goodwill ambassadors for the conference is most highly appreciated.

This conference would not have been possible without the continuous and unwavering support and advice of Gavriel Salvendy, founder, General Chair Emeritus, and Scientific Advisor. For his outstanding efforts, I would like to express my sincere appreciation to Abbas Moallem, Communications Chair and Editor of HCI International News.

July 2023
Constantine Stephanidis
HCI International 2023 Thematic Areas and Affiliated Conferences
Thematic Areas
• HCI: Human-Computer Interaction
• HIMI: Human Interface and the Management of Information

Affiliated Conferences
• EPCE: 20th International Conference on Engineering Psychology and Cognitive Ergonomics
• AC: 17th International Conference on Augmented Cognition
• UAHCI: 17th International Conference on Universal Access in Human-Computer Interaction
• CCD: 15th International Conference on Cross-Cultural Design
• SCSM: 15th International Conference on Social Computing and Social Media
• VAMR: 15th International Conference on Virtual, Augmented and Mixed Reality
• DHM: 14th International Conference on Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management
• DUXU: 12th International Conference on Design, User Experience and Usability
• C&C: 11th International Conference on Culture and Computing
• DAPI: 11th International Conference on Distributed, Ambient and Pervasive Interactions
• HCIBGO: 10th International Conference on HCI in Business, Government and Organizations
• LCT: 10th International Conference on Learning and Collaboration Technologies
• ITAP: 9th International Conference on Human Aspects of IT for the Aged Population
• AIS: 5th International Conference on Adaptive Instructional Systems
• HCI-CPT: 5th International Conference on HCI for Cybersecurity, Privacy and Trust
• HCI-Games: 5th International Conference on HCI in Games
• MobiTAS: 5th International Conference on HCI in Mobility, Transport and Automotive Systems
• AI-HCI: 4th International Conference on Artificial Intelligence in HCI
• MOBILE: 4th International Conference on Design, Operation and Evaluation of Mobile Communications
List of Conference Proceedings Volumes Appearing Before the Conference
1. LNCS 14011, Human-Computer Interaction: Part I, edited by Masaaki Kurosu and Ayako Hashizume
2. LNCS 14012, Human-Computer Interaction: Part II, edited by Masaaki Kurosu and Ayako Hashizume
3. LNCS 14013, Human-Computer Interaction: Part III, edited by Masaaki Kurosu and Ayako Hashizume
4. LNCS 14014, Human-Computer Interaction: Part IV, edited by Masaaki Kurosu and Ayako Hashizume
5. LNCS 14015, Human Interface and the Management of Information: Part I, edited by Hirohiko Mori and Yumi Asahi
6. LNCS 14016, Human Interface and the Management of Information: Part II, edited by Hirohiko Mori and Yumi Asahi
7. LNAI 14017, Engineering Psychology and Cognitive Ergonomics: Part I, edited by Don Harris and Wen-Chin Li
8. LNAI 14018, Engineering Psychology and Cognitive Ergonomics: Part II, edited by Don Harris and Wen-Chin Li
9. LNAI 14019, Augmented Cognition, edited by Dylan D. Schmorrow and Cali M. Fidopiastis
10. LNCS 14020, Universal Access in Human-Computer Interaction: Part I, edited by Margherita Antona and Constantine Stephanidis
11. LNCS 14021, Universal Access in Human-Computer Interaction: Part II, edited by Margherita Antona and Constantine Stephanidis
12. LNCS 14022, Cross-Cultural Design: Part I, edited by Pei-Luen Patrick Rau
13. LNCS 14023, Cross-Cultural Design: Part II, edited by Pei-Luen Patrick Rau
14. LNCS 14024, Cross-Cultural Design: Part III, edited by Pei-Luen Patrick Rau
15. LNCS 14025, Social Computing and Social Media: Part I, edited by Adela Coman and Simona Vasilache
16. LNCS 14026, Social Computing and Social Media: Part II, edited by Adela Coman and Simona Vasilache
17. LNCS 14027, Virtual, Augmented and Mixed Reality, edited by Jessie Y. C. Chen and Gino Fragomeni
18. LNCS 14028, Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management: Part I, edited by Vincent G. Duffy
19. LNCS 14029, Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management: Part II, edited by Vincent G. Duffy
20. LNCS 14030, Design, User Experience, and Usability: Part I, edited by Aaron Marcus, Elizabeth Rosenzweig and Marcelo Soares
21. LNCS 14031, Design, User Experience, and Usability: Part II, edited by Aaron Marcus, Elizabeth Rosenzweig and Marcelo Soares
22. LNCS 14032, Design, User Experience, and Usability: Part III, edited by Aaron Marcus, Elizabeth Rosenzweig and Marcelo Soares
23. LNCS 14033, Design, User Experience, and Usability: Part IV, edited by Aaron Marcus, Elizabeth Rosenzweig and Marcelo Soares
24. LNCS 14034, Design, User Experience, and Usability: Part V, edited by Aaron Marcus, Elizabeth Rosenzweig and Marcelo Soares
25. LNCS 14035, Culture and Computing, edited by Matthias Rauterberg
26. LNCS 14036, Distributed, Ambient and Pervasive Interactions: Part I, edited by Norbert Streitz and Shin’ichi Konomi
27. LNCS 14037, Distributed, Ambient and Pervasive Interactions: Part II, edited by Norbert Streitz and Shin’ichi Konomi
28. LNCS 14038, HCI in Business, Government and Organizations: Part I, edited by Fiona Fui-Hoon Nah and Keng Siau
29. LNCS 14039, HCI in Business, Government and Organizations: Part II, edited by Fiona Fui-Hoon Nah and Keng Siau
30. LNCS 14040, Learning and Collaboration Technologies: Part I, edited by Panayiotis Zaphiris and Andri Ioannou
31. LNCS 14041, Learning and Collaboration Technologies: Part II, edited by Panayiotis Zaphiris and Andri Ioannou
32. LNCS 14042, Human Aspects of IT for the Aged Population: Part I, edited by Qin Gao and Jia Zhou
33. LNCS 14043, Human Aspects of IT for the Aged Population: Part II, edited by Qin Gao and Jia Zhou
34. LNCS 14044, Adaptive Instructional Systems, edited by Robert A. Sottilare and Jessica Schwarz
35. LNCS 14045, HCI for Cybersecurity, Privacy and Trust, edited by Abbas Moallem
36. LNCS 14046, HCI in Games: Part I, edited by Xiaowen Fang
37. LNCS 14047, HCI in Games: Part II, edited by Xiaowen Fang
38. LNCS 14048, HCI in Mobility, Transport and Automotive Systems: Part I, edited by Heidi Krömker
39. LNCS 14049, HCI in Mobility, Transport and Automotive Systems: Part II, edited by Heidi Krömker
40. LNAI 14050, Artificial Intelligence in HCI: Part I, edited by Helmut Degen and Stavroula Ntoa
41. LNAI 14051, Artificial Intelligence in HCI: Part II, edited by Helmut Degen and Stavroula Ntoa
42. LNCS 14052, Design, Operation and Evaluation of Mobile Communications, edited by Gavriel Salvendy and June Wei
43. CCIS 1832, HCI International 2023 Posters - Part I, edited by Constantine Stephanidis, Margherita Antona, Stavroula Ntoa and Gavriel Salvendy
44. CCIS 1833, HCI International 2023 Posters - Part II, edited by Constantine Stephanidis, Margherita Antona, Stavroula Ntoa and Gavriel Salvendy
45. CCIS 1834, HCI International 2023 Posters - Part III, edited by Constantine Stephanidis, Margherita Antona, Stavroula Ntoa and Gavriel Salvendy
46. CCIS 1835, HCI International 2023 Posters - Part IV, edited by Constantine Stephanidis, Margherita Antona, Stavroula Ntoa and Gavriel Salvendy
47. CCIS 1836, HCI International 2023 Posters - Part V, edited by Constantine Stephanidis, Margherita Antona, Stavroula Ntoa and Gavriel Salvendy
https://2023.hci.international/proceedings
Preface
Human Interface and the Management of Information (HIMI) is a Thematic Area of the International Conference on Human-Computer Interaction (HCII), addressing topics related to information and data design, retrieval, presentation and visualization, management, and evaluation in human-computer interaction in a variety of application domains, such as learning, work, decision, collaboration, medical support, and service engineering. This area of research is acquiring rapidly increasing importance towards developing new and more effective types of human interfaces addressing new emerging challenges, and evaluating their effectiveness. The ultimate goal is for information to be provided in such a way as to satisfy human needs and enhance quality of life.

The related topics include, but are not limited to, the following:
• Service Engineering: Business Integration; Community Computing; E-commerce; E-learning and E-education; Harmonized Work; IoT and Human Behavior; Knowledge Management; Organizational Design and Management; Service Applications; Service Design; Sustainable Design; User Experience Design
• New HI (Human Interfaces) and Human QOL (Quality of Life): Electronic Instrumentation; Evaluating Information; Health Promotion; E-health and its Application; Human-Centered Organization; Legal Issues in IT; Mobile Networking; Disasters and HCI
• Information in VR, AR, and MR: Application of VR, AR, and MR in Human Activity; Art with New Technology; Digital Museums; Gesture/Movement Studies; New Haptic and Tactile Interaction; Information of Presentation; Multimodal Interaction; Sense of Embodiment (SoE) in VR and HCI
• AI, Human Performance, and Collaboration: Automatic Driving Vehicles; Collaborative Work; Data Visualization and Big Data; Decision Support Systems; Human AI Collaboration; Human-Robot Interaction; Humanization of Work; Intellectual Property; Intelligent Systems; Medical Information Systems and Their Application; Participatory Design

Two volumes of the HCII 2023 proceedings are dedicated to this year’s edition of the HIMI Thematic Area. The first part focuses on topics related to information design and user experience, data visualization and big data, multimodal interaction, and interaction with AI and Intelligent Systems. The second part focuses on topics related to service design, knowledge in e-Learning and e-Education, as well as the support of work and collaboration.

Papers of these volumes are included for publication after a minimum of two single-blind reviews from the members of the HIMI Program Board or, in some cases, from members of the Program Boards of other affiliated conferences. We would like to thank all of them for their invaluable contribution, support, and efforts.

July 2023
Hirohiko Mori Yumi Asahi
Human Interface and the Management of Information Thematic Area (HIMI 2023)
Program Board Chairs: Hirohiko Mori, Tokyo City University, Japan and Yumi Asahi, Tokyo University of Science, Japan

Program Board:
• Takako Akakura, Tokyo University of Science, Japan
• Shinichi Fukuzumi, RIKEN, Japan
• Michitaka Hirose, University of Tokyo, Japan
• Yasushi Ikei, University of Tokyo, Japan
• Keiko Kasamatsu, Tokyo Metropolitan University, Japan
• Daiji Kobayashi, Chitose Institute of Science and Technology, Japan
• Yusuke Kometani, Kagawa University, Japan
• Ryosuke Saga, Osaka Metropolitan University, Japan
• Katsunori Shimohara, Doshisha University, Japan
• Takahito Tomoto, Tokyo Polytechnic University, Japan
• Kim-Phuong Vu, California State University, USA
• Tomio Watanabe, Okayama Prefectural University, Japan
• Takehiko Yamaguchi, Suwa University of Science, Japan
• Sakae Yamamoto, Tokyo University of Science, Japan
The full list with the Program Board Chairs and the members of the Program Boards of all thematic areas and affiliated conferences of HCII2023 is available online at:
http://www.hci.international/board-members-2023.php
HCI International 2024 Conference
The 26th International Conference on Human-Computer Interaction, HCI International 2024, will be held jointly with the affiliated conferences at the Washington Hilton Hotel, Washington, DC, USA, June 29 – July 4, 2024. It will cover a broad spectrum of themes related to Human-Computer Interaction, including theoretical issues, methods, tools, processes, and case studies in HCI design, as well as novel interaction techniques, interfaces, and applications. The proceedings will be published by Springer. More information will be made available on the conference website: http://2024.hci.international/.

General Chair
Prof. Constantine Stephanidis
University of Crete and ICS-FORTH
Heraklion, Crete, Greece
Email: [email protected]
https://2024.hci.international/
Contents – Part II
Service Design

Method for Assessing the Potential Impact of Changes in Software Requirements of Agile Methodologies Based Projects
Angelo Amaral and Ferrucio de Franco Rosa

Validation of Items of Aspects of Interests in Quality-In-Use -Stakeholder Needs of Each System Domain-
Shin-ichi Fukuzumi

Design Study of Wearable IV Pole: Service Design Perspective
Guizhi Hong and Hong Chen

Extensibility Challenges of Scientific Workflow Management Systems
Muhammad Mainul Hossain, Banani Roy, Chanchal Roy, and Kevin Schneider

The Effect of Color on the Visual Search Efficiency of Mobile Travel Service APP in Night Mode
Junyang Hou, Xiaofan Zhou, and Zhijuan Zhu

Research on Conversational Interaction Design Strategy of Shopping APP Based on Context Awareness
Fusheng Jia, Xinyu Chen, and Yongkang Chen

Influence of Different Language Labels on Perception of Product Value
Yen-Yu Kang and Yu-Dan Pan

Structural Equation Modeling for the Interplay Among Consumer Engagements with Multiple Engagement Objects in Consumer’s Fashion
Masahiro Kuroda, Akira Oyabu, and Ryohei Takahashi

Considerations for Health Care Services Related to the Menstrual Cycle
Mayu Moriya, Suzuka Mori, Momoka Nozawa, Kaito Ofusa, Miho Suto, Ayami Ejiri, Takeo Ainoya, and Keiko Kasamatsu

Dialogue-Based User Needs Extraction for Effective Service Personalization
Takuya Nakata, Sinan Chen, Sachio Saiki, and Masahide Nakamura
The Impact of External Networks on Product Innovation in Social Purpose Organizations: An Empirical Research on Japanese Museums
Shohei Oishi and Akitsu Oe

Does Guaranteeing Anonymity in SNS Use Contribute to Regional Revitalization?
Yurika Shiozu, Soichi Arai, Hiromu Aso, Yuto Ohara, Ichiro Inaba, and Katsunori Shimohara

Effects of Poor Visibility on Riding Anxiety in Riding a Bicycle that Can Be Ridden with Two Infants
Sakurako Toyoshima, Makoto Oka, and Hirohiko Mori

Wayfinding and Navigation in the Outdoors: Quantitative and Data Driven Development of Personas and Requirements for Wayfinding in Nature
Frode Volden and Ole E. Wattne

Knowledge in eLearning and eEducation

Analysis of Classroom Test Results for an Error-Based Problem Presentation System for Mechanics
Nonoka Aikawa, Shintaro Maeda, Tomohiro Mogi, Kento Koike, Takahito Tomoto, Isao Imai, Tomoya Horiguchi, and Tsukasa Hirashima

Using Interactive Flat Panel Display for STEM Education Based on SAMR Model
Yu-Hung Chien, Yu-Jui Chang, Hsunli Huang, Hsiang-Chang Lin, and Jyun-Ting Chien

Analysis of Effects of Raggedy Student CG Characters in Face-to-Face Lectures and Their On-Demand Streaming
Seishiro Hara, Ryoya Fujii, Saizo Aoyagi, and Michiya Yamamoto

Triangle Logic Recomposition Exercise for Three-Clause Argument and Its Experimental Evaluation
Tsukasa Hirashima, Takuya Kitamura, Tomohiro Okinaga, Reo Nagasawa, and Yusuke Hayashi

Proposal for a Semi-subjective Learning Support System with Operation Indices Targeting Vectors
Tomohito Jumonji, Nonoka Aikawa, and Takahito Tomoto

Instructional Design of a VR-Based Empathy Training Program to Primary School Children
Meng-Jung Liu, Chia-Hui Pan, and Le-Yin Ma
Classroom Practice Using a Code-Sharing Platform to Encourage Refinement Activities
Shintaro Maeda, Kento Koike, and Takahito Tomoto

A Learning Support System for Programming that Promotes Understanding of Source Code Function Through Behavior Modeling
Taiki Matsui, Shintaro Maeda, Kento Koike, and Takahito Tomoto

Proposal for Automatic Problem and Feedback Generation for Use in Trace Learning Support Systems
Tomohiro Mogi, Yuichiro Tateiwa, Takahito Tomoto, and Takako Akakura

Improving Educational Outcomes: Developing and Assessing Grading System (ProGrader) for Programming Courses
Fatema Nafa, Lakshmidevi Sreeramareddy, Sriharsha Mallapuram, and Paul Moulema

Development of VR Education System for Media Exchange in Cell Culture Operation
Akihiko Nakajima, Toru Kano, and Takako Akakura

Proposal of Learning Programs: Using the Senseware
Momoka Nozawa, Suzuka Mori, Miho Suto, Kaito Ofusa, Mayu Moriya, Keiko Kasamatsu, and Takeo Ainoya

Investigation of the Relationship Between Map Quality and Higher-Order Thinking in Kit-Build Concept Map
Nurmaya, Aryo Pinandito, Yusuke Hayashi, and Tsukasa Hirashima

Application of the Recomposition Method to Mind Map and Experimental Verification of Learning Effect
Kodai Watanabe, Aryo Pinandito, Nurmaya, Yusuke Hayashi, and Tsukasa Hirashima

Development of a VR Collaboration System to Support Reflection on the Learning Process of Oneself and Others
Yusuke Yagi, Yusuke Kometani, Saerom Lee, Naka Gotoda, Takayuki Kunieda, Masanori Yatagai, Teruhiko Unoki, and Rihito Yaegashi
Supporting Work and Collaboration

Design of an Interview Script Authoring Tool for a Job Interview Training Simulator Using Graph Transformations
Deeksha Adiani, Emily Tam Nguyen, Jessica Urban, Matthew Fadler, Amir Alam, Jonathan Garcia-Alamilla, Nilanjan Sarkar, and Medha Sarkar

Human Factors and Ergonomics Awareness Survey of Professional Personnel in a Large-Scale Company from the Aerospace Industry
Atakan Coşkun, Hacer Güner, and Mehmetcan Fal

Optimization of a Human-Machine Team for Geographic Region Digitization
Steven M. Dennis and Chris J. Michael

Crowdsourced Argumentation Feedback for Persuasive Writing
Hiroki Ihoriya and Yusuke Yamamoto

An Online Opinion-Learning Experiment Simulating Social Interaction on Emerging Technologies: A Case Study of Genome-Edited Crops
Kyoko Ito, Kazune Ezaki, and Tomiko Yamaguchi

Tasks Decomposition Approaches in Crowdsourcing Software Development
Abdullah Khanfor

Comparison of Nature and Office Environments on Creativity- A Field Study -
Ryosuke Konishi, Shinji Miyake, and Daiji Kobayashi

A Conceptual Design of Management Interface for Wireless Sensor Network System
Julia Lee and Lawrence Henschen

What Affects the Success of Programmers in Query Validation Process? An Eye Tracking Study
Deepti Mishra and Yavuz Inal

Comparative Analysis of Manipulation Skills of Experts and Non-experts in Cell Culture Using VR Gloves
Satoru Osada, Toru Kano, and Takako Akakura
Prototyping Process Analyzed from Dialogue and Behavior in Collaborative Design
Fuko Oura, Takeo Ainoya, Ahmad Eibo, and Keiko Kasamatsu

A Study on Visual Communication with Different Conveyance Under MR Remote Collaboration
Keigo Satomi and Hirohiko Mori

Developers Foraging Behavior in Code Hosting Sites: A Gender Perspective
Abim Sedhain, Shahnewaz Leon, Riley Raasch, and Sandeep Kaur Kuttal

Author Index
Contents – Part I
Information Design and User Experience

Cooperation Mode of 2D and 3D Interfaces on Destination Planning Tasks in the Location-Based AR Application
Fangyuan Cheng, Qing Gu, and Xiaohua Sun

Generalized Cohen’s Kappa: A Novel Inter-rater Reliability Metric for Non-mutually Exclusive Categories
Andrea Figueroa, Sourojit Ghosh, and Cecilia Aragon

Knowledge Graph-Based Machining Process Route Generation Method
Jiawei Guo, Jingjing Wu, Jixuan Bian, and Qichang He

How to Share a Color Impression Among Different Observers Using Simplicial Maps
Ryo Kamiyama and Jinhui Chao

Task-Based Open Card Sorting: Towards a New Method to Produce Usable Information Architectures
Christos Katsanos, Vasileios Christoforidis, and Christina Demertzi

Emotive Idea and Concept Generation
Tetsuya Maeshiro, Yuri Ozawa, and Midori Maeshiro

Survey on the Auditory Feelings of Strangeness While Listening to Music
Ryota Matsui, Yutaka Yanagisawa, Yoshinari Takegawa, and Keiji Hirata

Text Reconstructing System of Editorial Text Based on Reader’s Comprehension
Yuki Okaniwa and Tomoko Kojiri

Interfaces for Learning and Connecting Around Recycling
Israel Peña and Jaime Sánchez

Sound Logo to Increase TV Advertising Effectiveness Based on Audio-Visual Features
Kazuki Seto and Yumi Asahi

Research on Visualization Method for Empathetic Design
Miho Suto, Keiko Kasamatsu, and Takeo Ainoya
A Study on HCI of a Collaborated Nurture Game for Sleep Education with Child and Parent
Madoka Takahara and Shun Hattori

Analysis of Resilient Behavior for Interaction Design
Haruka Yoshida, Taiki Ikeda, Daisuke Karikawa, Hisae Aoyama, Taro Kanno, and Takashi Toriizuka

How Information Influences the Way We Perceive Unfamiliar Objects – An Eye Movement Study
Lanyun Zhang, Rongfang Zhou, Jingyi Yang, Zhizhou Shao, and Xuchen Wang

Data Visualization and Big Data

The Nikkei Stock Average Prediction by SVM
Takahide Kaneko and Yumi Asahi

What Causes Fertility Rate Difference Among Municipalities in Japan
Shigeyuki Kurashima and Yumi Asahi

Explore Data Quality Challenges Based on Data Structure of Electronic Health Records
Caihua Liu, Guochao (Alex) Peng, Chaowang Lan, and Shufeng Kong

Feature Analysis of Game Software in Japan Using Topic Model and Structural Equation Modeling for Reviews and Livestreaming Chat
Ryuto Miyake and Ryosuke Saga

Inductive Model Using Abstract Meaning Representation for Text Classification via Graph Neural Networks
Takuro Ogawa and Ryosuke Saga

Enhancing Visual Encodings of Uncertainty Through Aesthetic Depictions in Line Graph Visualisations
Joel Pinney, Fiona Carroll, and Esyin Chew

Satisfaction Analysis of Group/Individual Tutoring Schools and Video Tutoring Schools
Hiroyo Sano and Yumi Asahi

Zebrafish Meets the Ising Model: Statistical Mechanics of Collective Fish Motion
Hirokazu Tanaka
Research on New Design Methods for Corporate Value Provision in a DX (Digital Transformation) Society: Visualization of Value by Lifestyle Derived from Qualitative Analysis
    Akio Tomita, Keiko Kasamatsu, Takeo Ainoya, and Kunika Yagi
Evaluating User Experience in Information Visualization Systems: UXIV an Evaluation Questionnaire
    Eliane Zambon Victorelli and Julio Cesar dos Reis

Multimodal Interaction

Study of HMI in Automotive ~ Car Design Proposal with Usage by the Elderly ~
    Takeo Ainoya and Takumi Ogawa
Pilot Study on Interaction with Wide Area Motion Imagery Comparing Gaze Input and Mouse Input
    Jutta Hild, Wolfgang Krüger, Gerrit Holzbach, Michael Voit, and Elisabeth Peinsipp-Byma
Development of a Speech-Driven Communication Support System Using a Smartwatch with Vibratory Nodding Responses
    Yutaka Ishii, Kenta Koike, Miwako Kitamura, and Tomio Watanabe
Coordinated Motor Display System of ARM-COMS for Evoking Emotional Projection in Remote Communication
    Teruaki Ito and Tomio Watanabe
Fundamental Considerations on Representation Learning for Multimodal Processing
    Kenya Jin’no, Masato Izumi, Saki Okamoto, Mizuki Dai, Chisato Takahashi, and Tatsuro Inami
A Fundamental Study on Discrimination of Dominant Hand Based on Motion Analysis of Hand Movements by Image Analysis Using Deep Learning
    Takusige Katura
Glasses Encourage Your Choices: A System that Supports Indecisive Choosers by Eye-Tracking
    Tatsuya Komatsubara and Satoshi Nakamura
Physiological Measures in VR Experiments: Some Aspects of Plethysmogram and Heart Rate
    Shinji Miyake, Chie Kurosaka, and Hiroyuki Kuraoka
Effects of Visual and Personality Impressions on the Voices Matched to Animated Characters
    Hiyori Takahashi and Tetsuya Maeshiro
Effects of Gaze on Human Behavior Prediction of Virtual Character for Intention Inference Design
    Liheng Yang, Yoshihiro Sejima, and Tomio Watanabe

Interacting with AI and Intelligent Systems

Development of a Light-Emitting Sword Tip Accompanying Thrusts and a Device for Judging Valid Thrusts by Light Spectrum Detection Without an Electric Judge in the Foil Event of Fencing Competitions
    Seira Aguni, Tetsuo Nishikawa, Kaito Fujita, Ren Nakanishi, and Yumi Asahi
A Study on Human-Computer Interaction with Text-to/from-Image Game AIs for Diversity Education
    Shun Hattori and Madoka Takahara
A Generative Vase Design System Based on Users’ Visual Emotional Vocabulary
    Yinghsiu Huang
The Impact of AI Text-to-Image Generator on Product Styling Design
    Yu-Hsu Lee and Chun-Yao Chiu
Generating Various 3D Motions by Emergent Imitation Learning
    Ryusei Mitsunobu, Chika Oshima, and Koichi Nakayama
Personalized Sleep Stage Estimation Based on Time Series Probability of Estimation for Each Label with Wearable 3-Axis Accelerometer
    Iko Nakari, Masahiro Nakashima, and Keiki Takadama
Controllable Features to Create Highly Evaluated Manga
    Kotaro Nishizaki and Tetsuya Maeshiro
A Study on Trust Building in AI Systems Through User Commitment
    Ryuichi Ogawa, Shigeyoshi Shima, Toshihiko Takemura, and Shin-ichi Fukuzumi
Chatbot to Facilitate Opinion Formation in Web Search
    Yuya Okuse and Yusuke Yamamoto
A State-of-Art Review on Intelligent Systems for Drawing Assisting
    Juexiao Qin, Xiaohua Sun, and Weijian Xu
Discussion Support Framework Enabling Advice Presentation that Captures Online Discussion Situation
    Yuki Shoji, Yuki Hayashi, and Kazuhisa Seta
Triple Supportive Information for Matrix Factorization with Image, Text, and Social Networks
    Takuya Tamada and Ryosuke Saga
An Analysis of Factors Associated with Self-confidence in the Japanese
    Michiko Tsubaki, Naoki Hemmi, and Yumi Asahi
Detecting Signs of Depression for Using Chatbots – Extraction of the First Person from Japanese
    Min Yang and Hirohiko Mori

Author Index
Service Design
Method for Assessing the Potential Impact of Changes in Software Requirements of Agile Methodologies Based Projects

Angelo Amaral1(B) and Ferrucio de Franco Rosa1,2

1 University of Campo Limpo Paulista (UNIFACCAMP), Campo Limpo Paulista/SP, Brazil
[email protected], [email protected]
2 Renato Archer Information Technology Center, Campinas/SP, Brazil
Abstract. The main issues in impact analysis of scope changes are identifying the other requirements potentially affected by a change and determining how complex the change is. These challenges are amplified in projects based on agile methodologies, which are characterized by broad communication with the customer and therefore offer more opportunities for change requests during the software development life cycle. We propose a new method through which agile methodologies can benefit from requirement traceability and effort prediction on scope changes. By applying the method to a software development project, we synthesize the results into indicators showing how scope changes made across different development iterations impact other software requirements and how the complexity of the requirement being changed can be used to determine the effort estimate. The main contributions of our work are (i) a method aimed at providing a potential impact rate for scope changes in agile methodologies-based projects; (ii) a metric for ranking software requirements based on their potential impact, supporting effort prediction and impact analysis; (iii) a value range to determine the risk of accepting a scope change request; and (iv) a discussion of the results from applying the approach in an agile software development scenario. Our proposal is intended to be used by software development teams in the context of agile projects.

Keywords: Assessment · Requirement · Change · Impact · Effort · Agile
1 Introduction

When designing service applications, a key issue is evaluating the impact of scope changes in software requirements. This becomes critical when adopting agile development methodologies, where the focus on transparency and communication between the development team and the client may result in less-detailed documentation. According to Sommerville [1], requirements can be organized into a hierarchy composed of User Requirements, which describe end-user high-level expectations, and System Requirements, reflecting key implementation aspects. System requirements can be classified as Functional and Non-Functional, referring to implementations related to software functionalities and environment constraints, respectively.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 3–21, 2023. https://doi.org/10.1007/978-3-031-35129-7_1
This approach applies to the traditional software development lifecycle (SDLC) [2, 3], whereas projects based on agile methodologies (e.g., the Scrum Framework) use a model based on a high-level requirements backlog, which is decomposed iteratively during the SDLC [4–6]. A Product Backlog describes the user’s high-level requirements [5], similar to the System Requirements [1], and a Sprint Backlog describes items that must be implemented by the development team to deliver the software, corresponding to Sommerville’s System Requirement concept. Additionally, the Sprint Backlog usually contains Tasks, which can be paired with Functional and Non-Functional requirements. Based on this structure, we propose a method in which Requirement Analysts and other professionals working in the role of Product Owner on agile methodologies-based projects can benefit from requirement traceability to support impact analysis and effort prediction on scope changes [3].
2 Literature Review and Related Work

We analyzed 33 articles in our systematic literature review on impact analysis in change management for agile methodologies projects. The approach used in the literature review was based on the guidelines proposed by Kitchenham [7], and the research was driven by the question “How to evaluate the impact of scope changes in agile projects?”. We ran queries on the ACM Digital Library and IEEE Xplore, followed by an exploratory analysis classifying the articles according to 6 objectives: i) usage of a previous relationship model (Ru) to organize requirements, ii) proposal of a new relationship model (Rp) to organize requirements, iii) presentation of a model (M) to manage requirements, iv) presentation of a software tool (T) to manage the requirements, v) support of change effort prediction (EP), and vi) support of change impact analysis (IA). In addition to these objectives, 4 application domains were considered in our analysis: i) review of SDLC, ii) coverage of agile methodologies, iii) coverage of traditional methodologies, and iv) presence of an algorithm-based approach for requirement change in agile methodologies projects.

The query used to search the libraries was based on the string “(Impact AND (Analysis OR Evaluation OR Assessment) AND (Change OR Changing) AND (Agile OR SCRUM) AND Project)”. This search returned 44 articles, 29 from IEEE Xplore and 15 from the ACM Digital Library. Two articles were found in both libraries, as they were published at a joint IEEE-ACM international conference. After removing the redundancy, the preliminary list was composed of 42 articles. As Inclusion Criteria, we considered all articles returned plus 4 additional works [3, 8–10] that were cited by other returned articles (snowballing review process), totaling 46 evaluated works. As Exclusion Criteria, during the analysis, 13 articles were identified as non-adherent to our focus and were removed from this study.
We classified the 33 resulting articles as 24 analyzed works, listed in Appendix I (Table 6), and 9 related works, listed in Appendix I (Table 7). We also consulted 12 other works [1, 5, 7–9, 11–17] as methodological references, totaling 45 references.
The analyzed and related works combine new insights and the status quo of current research with the concept of an ideal systemic approach, setting key aspects for change impact analysis. The synthesis of these works can be found in the next subsections, and their classification is presented in Appendix I (Tables 6 and 7). From our literature review, we identified that adopting a requirement relationship model to support traceability in impact analysis is a promising strategy.

2.1 Analyzed Work

Sletholt et al. (2011) [18] present a literature review on impact analysis for requirement change in projects based on agile methodologies. They are strictly focused on impact analysis, whereas we are focused on impact analysis and effort estimation for scope changes, presenting an updated systematic mapping and a classification of the research works.

Basri et al. published two articles [19, 20] presenting the implementation of a model to predict the change effort and to support impact analysis on software development projects, running it on agile and traditional methodologies. Those articles extend the application of a previously proposed tool [8, 9] for change impact analysis to agile methodologies projects.

Kim et al. (2010) [21] and Wang et al. (2018) [10] propose methods and tools for change impact analysis, both starting with source code analysis to estimate the impact of scope changes. Neither article considers agile methodologies projects explicitly. A key difference between them is that Wang et al. (2018) [10] added the change effort prediction aspect to their method.

Krzanik et al. (2010) [22] present a method for agile impact analysis and effort estimation from a high-level perspective on project management processes. Silva et al. (2018) [23] present a test-based model to manage the impact of scope changes in agile projects, without a requirement relationship model to support traceability.

Cao et al. (2010) [24], Vasic et al. (2017) [25], Tufano et al. (2019) [26] and Habib & Romli (2021) [27] present models to support changes in agile projects, lacking association between requirements. Jonkers & Eftekhari Shahroudi (2021) [28] present a model to support changes in agile projects. It lacks an approach for effort estimation and does not propose a requirements relation.

Raj et al. (2015) [6], Duarte et al. (2011) [29], Abou Khalil (2019) [30], Hamed & Abushama (2013) [31], and Ochoa et al. (2021) [32] present detailed views of the overall SDLC, without focusing on the impact analysis of requirement changes.

Kaur et al. (2021) [33], Malhotra & Chug (2016) [34], Manisha et al. (2021) [35], and Murphy & Williams (2013) [36] support the discussion on impact analysis of change requests without major contributions. Rajlich (2010) [37], Rajlich (2013) [38], and Włodarski et al. (2020) [39] focus on teaching software engineering, covering the topics of our work at a basic level.
2.2 Related Work

As related work, we considered articles presenting contributions on supporting impact analysis and effort prediction on change requests, as summarized in Appendix I (Table 7) and detailed in the next paragraphs.

Díaz et al. (2013) [2] propose an approach for establishing relations between software requirements, aimed at agile methodology projects, focusing on impact analysis for changes and excluding the change effort prediction aspect.

Wang et al. (2017) [4] present the concept of Epic, which can be decomposed into one or more Stories, which are the basis for the Product Backlog and Sprint Backlog [5]. Their work also supports the usage of a requirement relation model, which relates to Sommerville’s requirement classification [1], and focuses on the documentation of agile methodologies requirements.

Rubasinghe (2018) [3] presents a traceability model for requirements, which is adherent to the approach presented by Wang et al. (2017) [4], adding a perspective focused on agile methodologies to the proposed model, under the concept of an iterative backlog decomposition [6, 18].

Ahmad et al. (2021) [40] consider change impact analysis and effort estimation as success factors for agile projects, presenting a high-level review of these approaches. Gary et al. (2015) [41] take a similar high-level approach regarding the focus of this article.

Zapotecas-Martinez et al. (2020) [42] and Benedicenti et al. (2016) [43] propose a relation structure to decompose requirements in agile methodologies, which is close to Díaz et al. (2013) [2]. Taromirad & Paige (2012) [44] present a method for requirement decomposition without focusing on change impact analysis.

Utz (2019) [45] defines a metamodel for business processes, capable of supporting change effort analysis and effort estimation. Although this work does not exclude agile methodologies projects, it is based on traditional approaches.
3 The Proposed Method

We propose the Method for Assessing the Potential Impact Rate of Changes in Software Requirements (MAPIR-CSR), which is focused on projects based on agile development methodologies. In Fig. 1, we present MAPIR-CSR and its i) Inputs – System Requirements (α), Software Requirements (β), and Story Points (γ); ii) Outputs – Potential Impact Rate of the Change (ε) and List of Potentially Impacted Requirements (δ).
Fig. 1. MAPIR-CSR – inputs and outputs.
Our proposal considers a requirement relationship model based on Sommerville’s approach [1], as in the running example presented in Fig. 2, with User, System, and Software requirements. We use software requirements (β) for impact assessment and their connection with system requirements (α) to support traceability, identifying which requirements (δ) are potentially impacted by a scope change. The third input parameter is the complexity measure, Story Points (γ), for each software requirement, enabling effort prediction through a potential impact rate of the change (ε). A Story Point is the relative estimate of the size of an activity compared to other activities in the project [5, 11]. The estimation is founded on previous successful estimations.
Fig. 2. Requirements Relationship Model for a Hypothetical E-Commerce System.
As we considered the usage of the Scrum Framework [5], client requests result in a Product Backlog, which is a set of high-level requirements needed to deliver the system. Product backlog items are decomposed and implemented in subsequent sprints. The requirements of a specific sprint compose the sprint backlog. Wang et al. [4] propose an association between Software Requirements (β) and the Sprint Backlog, while the System Requirements (α) can be related to the Product Backlog, as shown in Fig. 3, representing the requirements from Fig. 2.

Fig. 3. Product Backlog, Sprint Backlog, and the traditional Requirement Relationship.

In Scrum, requirement complexity shall not be estimated in implementation hours, but by relative estimation against other requirements in the sprint backlog [5]. This relative complexity is measured in Story Points [11, 12], used by development teams to rank the implementation effort of sprint backlog items, with the highest score assigned to the most complex requirement and the lowest score assigned to the least complex. These complexity scores (Story Points) are usually based on a Fibonacci Scale [11]. In our method, we consider {1, 2, 3, 5, 8, 13, 21} as the set of values that can be assigned. The process consists of selecting a mid-range value (5 or 8) as the reference value for a “normal task”, which neither bears uncertainty factors nor is considered trivial by the development team. The next step is selecting a software requirement that meets this definition and assigning the selected value to it. We adopted the approach proposed by Hannay et al. (2019) [13], setting 5 as the value in Story Points for the “normal task” (software requirement β04), and removed the value 21 from the range, setting it apart for tasks defined by Wang et al. [4] as Epics, or tasks with a high uncertainty level, which should be decomposed before moving forward with their implementation. From these definitions, in our running example, every other software requirement is compared to β04, the “normal task”, and is assigned a lower Story Point value if considered less complex than it and a higher value if considered more complex. The values for all requirements were then balanced to represent the comparison against each other [14], resulting in the values presented in Fig. 4 and listed in the requirement traceability matrix, in Appendix II (Table 8).
Fig. 4. Story Points (γ) assigned to each Software Requirement.
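The scale described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ALLOWED_STORY_POINTS`, `EPIC_STORY_POINTS`, `NORMAL_TASK_POINTS`, and `classify_estimate` are hypothetical and not part of MAPIR-CSR, and the Story Point values are those listed for the running example in Table 1.

```python
# Hypothetical sketch of the Story Point scale used in the running example.
ALLOWED_STORY_POINTS = (1, 2, 3, 5, 8, 13)  # 21 is set apart for Epics
EPIC_STORY_POINTS = 21
NORMAL_TASK_POINTS = 5  # reference value assigned to the "normal task" (β04)


def classify_estimate(points: int) -> str:
    """Relate an estimate to the 'normal task' reference value."""
    if points == EPIC_STORY_POINTS:
        # Epic or high-uncertainty task: decompose before implementing.
        return "decompose"
    if points not in ALLOWED_STORY_POINTS:
        raise ValueError(f"{points} is not on the adopted scale")
    if points < NORMAL_TASK_POINTS:
        return "less complex"
    if points > NORMAL_TASK_POINTS:
        return "more complex"
    return "normal"


# Story Points (γ) of the running example.
story_points = {"β01": 8, "β02": 13, "β03": 5, "β04": 5, "β05": 3}
for requirement, points in story_points.items():
    print(requirement, points, classify_estimate(points))
```

Rejecting off-scale values and flagging 21 separately mirrors the decision to keep Epics out of the estimation range until they are decomposed.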
Once the requirement relationship is defined and the complexity has been measured, tracking the change request of a software requirement (β) back to the system requirement (α) that originated it makes it possible to identify all other requirements derived from that system requirement and potentially impacted by the change. The potential impact rate of the change (ε) is defined as the sum of the story points of all potentially impacted software requirements (Σγ), divided by the total number of software requirements in the project (n): ε = (Σγ)/n.

This formula implies that the potential impact of a change is not determined solely by the number of other requirements that it can impact. The impact rate is reduced proportionally to the total number of software requirements in the project, assuming that a project with a small set of requirements tends to adopt coarser-grained requirements [21], while a larger set of requirements allows finer granularity, restricting the impact of each isolated change [9].

One of the results of applying MAPIR-CSR is a ranking of all sprint backlog items by their potential impact in case of change, as illustrated by Table 1. This approach can drive the discussion with the customer to clarify the impact of a change, in addition to supporting effort estimation.

Table 1. Software Requirements Ranked by the Potential Impact Rate

| Software Requirement (β) | Story Points (γ) | Potentially Impacted Requirements (δ) | Potential Impact Rate (ε) |
|---|---|---|---|
| β03 | 5 | {β01, β02, β03, β04} | 6.2 |
| β02 | 13 | {β01, β02, β03} | 5.2 |
| β01 | 8 | {β01, β02, β03} | 5.2 |
| β04 | 5 | {β03, β04, β05} | 2.6 |
| β05 | 3 | {β04, β05} | 1.6 |
In Table 1, we list the requirements of our running example, ranked by their Potential Impact Rate (ε). As examples, in the first row (β03), (8 + 13 + 5 + 5)/5 = 6.2; in the second row (β02), (8 + 13 + 5)/5 = 5.2. As shown in Table 1, although β02 is the most complex requirement of the system, with the highest Story Points (γ) value, it does not have the highest Potential Impact Rate (ε), which belongs to β03. Another output of our method is the list of Potentially Impacted Requirements (δ) of each requirement, aimed at supporting exploratory impact analysis. Section 4 presents a real-world application scenario, containing the product backlog decomposition and requirement relationship, as well as the assessment of changes in different sprints, based on the hypothetical requirements described above.
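The calculation behind Table 1 can be sketched as follows. This is a minimal illustration rather than the authors' implementation (the paper reports using a spreadsheet): the function names `impacted` and `impact_rate` are hypothetical, and the parent relationships are transcribed from the running example as listed with the Sprint 1 inputs in Table 2. The sketch assumes that δ contains every software requirement sharing at least one parent system requirement with the changed one, which is how the text describes the traceability step.

```python
from typing import Dict, List, Set

# Parent system requirements (α) of each software requirement (β),
# transcribed from the running example (see Table 2).
PARENTS: Dict[str, Set[str]] = {
    "β01": {"α01"},
    "β02": {"α01"},
    "β03": {"α01", "α02"},
    "β04": {"α02", "α03"},
    "β05": {"α03"},
}
# Story Points (γ) from Table 1.
STORY_POINTS: Dict[str, int] = {"β01": 8, "β02": 13, "β03": 5, "β04": 5, "β05": 3}


def impacted(target: str) -> List[str]:
    """δ: requirements sharing at least one parent α with target (itself included)."""
    return sorted(b for b, ps in PARENTS.items() if ps & PARENTS[target])


def impact_rate(target: str) -> float:
    """ε = (Σγ over δ) / n, where n is the number of software requirements."""
    return sum(STORY_POINTS[b] for b in impacted(target)) / len(PARENTS)


# Rank requirements by ε, highest first; equal rates keep their input order.
ranking = sorted(PARENTS, key=impact_rate, reverse=True)
for requirement in ranking:
    print(requirement, impacted(requirement), round(impact_rate(requirement), 1))
```

Because β01 and β02 have the same rate, their relative order in the ranking depends only on the tie-breaking rule; the sketch keeps input order, whereas Table 1 lists β02 first.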
10
A. Amaral and F. de Franco Rosa
4 Real-World Application Scenario

To demonstrate MAPIR-CSR and better illustrate how it can support change impact and effort prediction on agile methodologies, we emulated the development of an e-commerce system, with the requirements introduced in Sect. 3 (Fig. 2). All requirements were proposed by two senior software engineers, with over 20 years of field experience each. One of them holds a doctorate degree in software engineering, and the other holds multiple certifications in Scrum and project management. The software engineers assumed multiple personas [15] to represent the different roles of a team with 3 software developers, plus a Scrum Master and a Product Owner, adopting the Scrum Framework to run a set of Sprints with simulated events and intentional scope change requests. The method outputs for each sprint were calculated and stored in a Microsoft Excel spreadsheet, representing MAPIR-CSR processing. As the requirement relationship definition is a complex task, based on subjective factors [43], and critical for project success [40], real-world requirement engineering experience was the reference to structure the requirements as presented in the requirement traceability matrix (Appendix II - Table 8). The automation of requirement relationship mapping is out of the scope of this work.

The simulated development cycle assumes the objective of implementing an e-commerce system through the Scrum Framework [5], with a development team composed of 5 personas (3 software developers, 1 Scrum Master, and 1 Product Owner), working with a sprint duration of 2 weeks, which combines the smallest team size and minimum iteration duration recommended by the Scrum Guide [5]. This scenario aims to represent an ever-changing environment in which time is a key success factor in evaluating scope changes [40].
The Product Owner role adopts a human-centered design approach [16], presuming interviews, user observation, prototyping, and usability tests to support the project decisions, through the 3 sprints described in the next subsections (4.1 to 4.3).

4.1 Sprint 1

During the simulated planning session of the first sprint, the Product Owner presented the Product Backlog to the team, composed of System Requirements α01, α02, and α03, agreeing with the developers on an initial decomposition for these requirements, resulting in Software Requirements β01, β02, β03, β04 and β05, as shown in Fig. 4. As a technical decision, the development team stated to the Product Owner that the requirements should be aggregated to describe 2 intertwined pieces of software (an inventory control and a sales website), which compose the desired e-commerce system, as presented in Fig. 5. In response, the Product Owner requested the development team to focus on delivering the inventory control as soon as possible, as it would allow the customer to start working and consequently would add more value to the project. The agreement was that for Sprint 1 the Sprint Backlog would be composed of the Software Requirements β01 and β02, totaling 21 story points of complexity (8 and 13, respectively), as presented in Table 1.
Fig. 5. Requirements per Software under the e-Commerce System
Once the team agreed on the requirements structure and overall strategy to deliver the system, the potential impact rate of each requirement could be calculated by using MAPIR-CSR, based on the set of requirements currently mapped. Table 2 presents the inputs and resulting outputs at the beginning of Sprint 1, ordered by Potential Impact Rate. The method returns the list of potentially impacted requirements (δ) and the Potential Impact Rate (ε) of all software requirements present in the project, regardless of whether those requirements are addressed in the current sprint. In Table 2, we list the software requirements and their potentially impacted requirements (δ), ranked by the Potential Impact Rate (ε).

Table 2. Inputs and Outputs of MAPIR-CSR at the beginning of Sprint 1
(Inputs: β, α, γ. Outputs: δ, ε.)

| Software Requirement (β) | Parent System Requirements (α) | Story Points (γ) | Potentially Impacted Requirements (δ) | Potential Impact Rate (ε) |
|---|---|---|---|---|
| β03 | α01, α02 | 5 | {β01, β02, β03, β04} | 6.2 |
| β02 | α01 | 13 | {β01, β02, β03} | 5.2 |
| β01 | α01 | 8 | {β01, β02, β03} | 5.2 |
| β04 | α02, α03 | 5 | {β03, β04, β05} | 2.6 |
| β05 | α03 | 3 | {β04, β05} | 1.6 |
As shown in Table 2, Software Requirement β01 has a parent relationship with System Requirement α01, potentially impacting requirements (δ) β01 (itself), β02, and β03 in case of being subject to a scope change request. This potential impact is a direct result of β01, β02, and β03 sharing the parent relationship with α01 and is the first MAPIR-CSR output, driving the eventual impact analysis, in a qualitative approach.
The second output of our method is the Potential Impact Rate (ε) of a scope change related to β01, which is obtained by summing the Story Points (γ) assigned by the development team to each potentially impacted requirement (δ) of β01 (β01, β02 and β03), respectively 8, 13, and 5 Story Points (γ), totaling 26 Story Points (Σγ). This amount is then divided by the total number of input Software Requirements (n), which is 5 at this point (β01, β02, β03, β04 and β05), resulting in a Potential Impact Rate (ε) of 5.2 (26/5) for β01, as described by the equation ε = (Σγ)/n. Knowing the potential impact rate (ε) of changing each software requirement in the project, the development team has a quantitative argument to negotiate eventual trade-offs with the Product Owner during the sprint planning sessions, providing a clear indicator of how complex the change tends to be, as Story Points are, by definition, a complexity measure defined by developers [5, 17].

The same logic is used to calculate the outputs for all software requirements input to MAPIR-CSR. We point out that in this example β03 and β04 present parent relationships with more than one system requirement. This means that for β03 the list of potentially impacted requirements (δ) is {β01, β02, β03, β04}, and for β04 the list is {β03, β04, β05}, as presented in Table 2. Finally, the potential impact rate (ε) of β03 is obtained by summing the Story Points (γ) assigned to each of its potentially impacted requirements (δ), totaling 31 Story Points (Σγ), which is divided by 5, the total number of Software Requirements (n) input to the method, resulting in a Potential Impact Rate (ε) of 6.2.

4.2 Sprint 2

During the review session of Sprint 1, the development team presented the work done to the customer, who then asked for a change in the Product Backlog. The request was to create a new screen to configure the report generation.
This new requirement was added to the backlog as the Software Requirement β06 and named “Report Generation Configuration Screen”, under a parent relationship with System Requirement α01, as presented in the requirement traceability matrix (Appendix II - Table 8). After that, in the planning session for Sprint 2, the development team committed to the Product Owner to deliver the Software Requirements β03 (“Product Maintenance Screen”) and β06 (“Report Generation Configuration Screen”) and agreed on a complexity score of 2 Story Points for β06. In Table 3, we present the inputs and outputs of MAPIR-CSR, considering the new requirement added to the project.

Because a new requirement was added to the project, the method recalculated the potential impact rate of each requirement, reflecting the change in requirement granularity. As the objective of the change was the addition of a new requirement rather than the modification of an existing one, there is no formal change impact analysis to be conducted, but MAPIR-CSR provides early visibility of the potential impact of changes to the new set of requirements. All Software Requirements sharing a parent relationship with the same System Requirement as β06 (β01, β02, and β03) also had their list of Potentially Impacted Requirements (δ) updated by the method.
Table 3. Inputs and Outputs of MAPIR-CSR at the beginning of Sprint 2
(Inputs: β, α, γ; Outputs: δ, ε)

| Software Requirement (β) | Parent System Requirements (α) | Story Points (γ) | Potentially Impacted Requirements (δ) | Potential Impact Rate (ε) |
|---|---|---|---|---|
| β03 | α01, α02 | 5 | {β01, β02, β03, β04, β06} | 5.5 |
| β02 | α01 | 13 | {β01, β02, β03, β06} | 4.7 |
| β01 | α01 | 8 | {β01, β02, β03, β06} | 4.7 |
| β06 | α01 | 2 | {β01, β02, β03, β06} | 4.7 |
| β04 | α02, α03 | 5 | {β03, β04, β05} | 2.2 |
| β05 | α03 | 3 | {β04, β05} | 1.3 |
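The calculation behind Table 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `impacted_requirements` and `potential_impact_rate` are hypothetical, identifiers are written in ASCII (`b01` for β01, `a01` for α01), and δ is assumed to contain every requirement that shares at least one parent System Requirement with the requirement under analysis, which is the rule the worked example appears to follow.

```python
from typing import Dict, List, Set


def impacted_requirements(parents: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Assumed rule for delta: a requirement potentially impacts every
    requirement (itself included) sharing at least one parent with it."""
    return {
        req: sorted(
            other
            for other, other_parents in parents.items()
            if req_parents & other_parents
        )
        for req, req_parents in parents.items()
    }


def potential_impact_rate(
    story_points: Dict[str, int], impacted: Dict[str, List[str]]
) -> Dict[str, float]:
    """epsilon = (sum of Story Points over delta) / n, n = number of requirements."""
    n = len(story_points)
    return {
        req: sum(story_points[r] for r in delta) / n
        for req, delta in impacted.items()
    }


# Inputs at the beginning of Sprint 2, taken from Table 3 and Table 8.
parents = {
    "b01": {"a01"},
    "b02": {"a01"},
    "b03": {"a01", "a02"},
    "b04": {"a02", "a03"},
    "b05": {"a03"},
    "b06": {"a01"},
}
story_points = {"b01": 8, "b02": 13, "b03": 5, "b04": 5, "b05": 3, "b06": 2}

delta = impacted_requirements(parents)
epsilon = potential_impact_rate(story_points, delta)

for req in sorted(parents):
    print(req, delta[req], round(epsilon[req], 1))
```

Rounded to one decimal place, the printed rates agree with the ε column of Table 3. Deriving δ from the parent relationships, rather than storing it, means that adding a requirement such as β06 updates every affected list automatically.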
4.3 Sprint 3
After concluding Sprint 2, the customer asked the Product Owner for a new change to the project scope: the Product Maintenance Screen would need to include additional fields and formulas to comply with recently approved legislation, making this a change required by law. Based on MAPIR-CSR’s list of potentially impacted requirements (δ), the development team identified that changing requirement β01 (Product Maintenance Screen) could impact requirements β01 (itself), β02, β03 and β06. With this guidance, during the planning session for Sprint 3, the team agreed to raise the complexity score of β01 to 13, assuming that additional analysis effort and attention were needed to avoid impacting the other 3 listed requirements (β02, β03 and β06). The Product Owner updated the requirement traceability matrix, presented in Appendix II (Table 8), and then double-checked the specifications of β02, β03, and β06 to ensure that no additional change was needed on those requirements. The Product Owner also agreed with the team on the strategy of deploying the change to β01 and developing β04 and β05 (not impacted by the change) in Sprint 3. In Table 4, we present the changes in MAPIR-CSR outputs after the agreed changes to the project scope.
Table 4. Inputs and Outputs of MAPIR-CSR at the beginning of Sprint 3
(Inputs: β, α, γ; Outputs: δ, ε)

| Software Requirement (β) | Parent System Requirements (α) | Story Points (γ) | Potentially Impacted Requirements (δ) | Potential Impact Rate (ε) |
|---|---|---|---|---|
| β03 | α01, α02 | 5 | {β01, β02, β03, β04, β06} | 6.3 |
| β02 | α01 | 13 | {β01, β02, β03, β06} | 5.5 |
| β01 | α01 | 13 | {β01, β02, β03, β06} | 5.5 |
| β06 | α01 | 2 | {β01, β02, β03, β06} | 5.5 |
| β04 | α02, α03 | 5 | {β03, β04, β05} | 2.2 |
| β05 | α03 | 3 | {β04, β05} | 1.3 |
As no additional requirements were added to the project, the scenario in Sprint 3 differs from Sprint 2: β04 and β05 show no changes in their respective potential impact rates (ε), as they have no relationship with the requirement subject to the requested change (β01). The list of potentially impacted requirements (δ) did not change for any requirement, as the requested change was reflected only in the potential impact rates of β01, β02, β03, and β06.
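As a worked check of the Sprint 3 values, the rates can be recomputed from the equation ε = (Σγ)/n with n = 6, using the Story Points in the requirement traceability matrix (β01 raised to 13). The subscripted notation below is introduced only for this illustration and is not part of the original method description.

```latex
\[
\varepsilon(\beta_{01})
  = \frac{\gamma_{\beta_{01}} + \gamma_{\beta_{02}} + \gamma_{\beta_{03}} + \gamma_{\beta_{06}}}{n}
  = \frac{13 + 13 + 5 + 2}{6} = 5.5
\]
\[
\varepsilon(\beta_{03})
  = \frac{13 + 13 + 5 + 5 + 2}{6} \approx 6.3,
\qquad
\varepsilon(\beta_{04})
  = \frac{5 + 5 + 3}{6} \approx 2.2
\]
```

Since β01 does not belong to the list δ of β04 or β05, raising its Story Points leaves their rates unchanged.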
5 Discussion on the Results
MAPIR-CSR supported both qualitative and quantitative impact analysis of scope changes. Its “Potentially Impacted Requirements (δ)” output gave the development team visibility of which other requirements could be impacted by the change (qualitative analysis), and its “Potential Impact Rate (ε)” output for each requirement indicated how complex the requested change tends to be (quantitative analysis). In Table 5, we present how the potential impact rate (ε) of each requirement changed across the iterations. It offers an interesting insight into Sprint 2, when the impact rates of the requirements decreased because the requirement granularity changed to accommodate the newly added requirement (β06).
Table 5 supports the perception that a change in a specific requirement mainly impacts the requirements directly related to it, as happened in Sprint 3, while the addition of new requirements to the project scope affected the whole set of requirements in Sprint 2, demanding a broader qualitative analysis to ensure that no additional change results from that action. With this information, the Product Owner can also determine which requirements should not be changed, indicating when to search with the development team for alternative technical solutions to meet a customer change request. An example is requirement β03, which had the highest potential impact rate (ε) throughout the project, meaning it would be the riskiest requirement to change [12].

Table 5. Potential Impact Rate per Software Requirement (ε) on each iteration

| Iteration | β01 | β02 | β03 | β04 | β05 | β06 |
|---|---|---|---|---|---|---|
| Sprint 1 | 5.2 | 5.2 | 6.2 | 2.6 | 1.6 | N/A |
| Sprint 2 | 4.7 | 4.7 | 5.5 | 2.2 | 1.3 | 4.7 |
| Sprint 3 | 5.5 | 5.5 | 6.3 | 2.2 | 1.3 | 5.5 |
Based on the application scenario, it is feasible to set ranges for the potential impact rate calculated by MAPIR-CSR. An impact rate above 8 possibly points to a requirement that is intertwined with many other high-complexity requirements and is consequently risky to the project, while a rate below 3 tends to result from a change in a requirement related to other low-complexity requirements, which is potentially less risky. Adopting our method together with this range verification can lead to the definition of a strategy for accepting scope change requests.
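The range verification described above could be expressed as a small helper. The sketch below is only an illustration: the thresholds 8 and 3 come from the text, while the function name `classify_impact_rate`, the band labels, and the treatment of rates between 3 and 8 as an "intermediate" band are assumptions, since the paper does not characterize that middle range.

```python
def classify_impact_rate(rate: float, high: float = 8.0, low: float = 3.0) -> str:
    """Map a Potential Impact Rate to a coarse risk band.

    Thresholds follow the ranges suggested in the text; the label used for
    rates between `low` and `high` is an assumption made for this sketch.
    """
    if rate > high:
        return "potentially risky"
    if rate < low:
        return "potentially less risky"
    return "intermediate"


# Sprint 3 rates from Table 5, keyed by ASCII identifiers (b01 for β01, ...).
sprint3_rates = {"b01": 5.5, "b02": 5.5, "b03": 6.3, "b04": 2.2, "b05": 1.3, "b06": 5.5}

bands = {req: classify_impact_rate(rate) for req, rate in sprint3_rates.items()}
for req, band in bands.items():
    print(req, sprint3_rates[req], band)
```

In the example project no requirement exceeds the upper threshold, so a larger backlog would be needed to see all three bands in practice.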
6 Conclusion
We presented an innovative approach for projects based on agile methodologies, aimed at supporting development teams in both impact analysis and effort estimation for scope change requests. The main contributions of our work are (i) a method that provides a potential impact rate for scope changes in agile methodologies-based projects; (ii) a metric for ranking software requirements based on their potential impact, supporting effort prediction and impact analysis; (iii) a value range to determine the risk of accepting a scope change request; and (iv) a discussion of the results from applying the approach in an agile software development scenario.
As next steps, this method can be applied to larger projects and combined with automated issue tracker tools, such as Jira or Trello, to calculate and dynamically present the MAPIR-CSR outputs as project indicators for the development team. The definition of a strategy for accepting scope change requests, based on the value range of the Potential Impact Rate returned by the method, can also be the subject of future work.
Appendix I
Table 6. Summary of Analyzed Work
Columns: Authors; Objective (Ru, Rp, M, T, EP, IA); Application Domain (1, 2, 3, 4)
Works analyzed: Abou Khalil, 2019 [30]; Basri, Kama, Haneem, et al., 2016a [19]; Basri, Kama, Sarkan, et al., 2016b [20]; Cao et al., 2010 [24]; Duarte et al., 2011 [29]; Habib & Romli, 2021 [27]; Hamed & Abushama, 2013 [31]; Jonkers & Eftekhari Shahroudi, 2021 [28]; Kaur et al., 2021 [33]; Kim et al., 2010 [21]; Krzanik et al., 2010 [22]; Malhotra & Chug, 2016 [34]; Manisha et al., 2021 [35]; Murphy & Williams, 2013 [36]; Rajlich, 2010 [37]; Rajlich, 2013 [38]; Ochoa et al., 2021 [32]; Raj et al., 2015 [6]; Silva et al., 2018 [23]; Sletholt et al., 2011 [18]; Tufano et al., 2019 [26]; Vasic et al., 2017 [25]; C. Wang et al., 2018 [10]; Włodarski et al., 2020 [39]
[Table body: per-work X marks; column positions not recoverable]
Objective: Relationship model – using of (Ru); Relationship model - proposal of (Rp); Model (M); Tool (T); Effort Prediction (EP); Impact Analysis (IA). Application Domain: (1) Software Development Life Cycle; (2) Agile Methodologies; (3) Traditional Methodologies; (4) Algorithm Based

Table 7. Summary of Related Work
Columns: Authors; Objective (Ru, Rp, M, T, EP, IA); Application Domain (1, 2, 3, 4)
Works analyzed: Ahmad et al., 2021 [40]; Benedicenti et al., 2016 [43]; Díaz et al., 2013 [2]; Gary et al., 2015 [41]; Rubasinghe et al., 2018 [3]; Taromirad & Paige, 2012 [44]; Wang et al., 2017 [4]; Zapotecas-Martinez et al., 2020 [42]; Utz, 2019 [45]
[Table body: per-work X marks; column positions not recoverable]
Objective: Relationship model – using of (Ru); Relationship model - proposal of (Rp); Model (M); Tool (T); Effort Prediction (EP); Impact Analysis (IA). Application Domain: (1) Software Development Life Cycle; (2) Agile Methodologies; (3) Traditional Methodologies; (4) Algorithm Based
Appendix II
Table 8. Requirement Traceability Matrix

| Type | Id | Description | Story Points | Parent | Creation |
|---|---|---|---|---|---|
| User Requirement | ω01 | Create an e-Commerce with a sales website and inventory control | N/A | N/A | Sprint 1 |
| System Requirement | α01 | Inventory control must keep a product registry with stock amounts and allow queries | N/A | ω01 | Sprint 1 |
| System Requirement | α02 | Only products registered on inventory can be sold; Customers need an active register to buy | N/A | ω01 | Sprint 1 |
| System Requirement | α03 | Customers with bad debit must have their register blocked | N/A | ω01 | Sprint 1 |
| Software Requirement | β01 | Product Maintenance Screen | 8 (13 on Sprint 3) | α01 | Sprint 1 |
| Software Requirement | β02 | Stock Control Screen with Report Generation | 13 | α01 | Sprint 1 |
| Software Requirement | β03 | Product Stock Availability Query | 5 | α01, α02 | Sprint 1 |
| Software Requirement | β04 | Customer Maintenance Screen | 5 | α02, α03 | Sprint 1 |
| Software Requirement | β05 | Customer Register Block by Bad Debit | 3 | α03 | Sprint 1 |
| Software Requirement | β06 | Report Generation Configuration Screen | 2 | α01 | Sprint 2 |
References
1. Bauer, F.L., et al. (eds.): Software Engineering. LNCS, vol. 30. Springer, Heidelberg (1975). https://doi.org/10.1007/3-540-07168-7
2. Díaz, J., Pérez, J., Garbajosa, J., Yagüe, A.: Change-impact driven agile architecting. In: 2013 46th Hawaii International Conference on System Sciences, pp. 4780–4789 (2013)
3. Rubasinghe, I., Meedeniya, D., Perera, I.: Traceability management with impact analysis in DevOps based software development. In: 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 1956–1962 (2018)
4. Wang, Y., Bogicevic, I., Wagner, S.: A study of safety documentation in a scrum development process. In: Proceedings of the XP2017 Scientific Workshops. Association for Computing Machinery, New York, NY, USA (2017)
5. Schwaber, K., Sutherland, J.: The Scrum Guide the Definitive Guide to Scrum: The Rules of the Game (2020)
6. Raj, G., Yadav, K., Jaiswal, A.: Emphasis on testing assimilation using cloud computing for improvised agile SCRUM framework. In: 2015 International Conference on Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE), pp. 219–225 (2015)
7. Kitchenham, B.: Procedures for Performing Systematic Reviews. Keele (2004)
8. Kama, N., Azli, F.: A change impact analysis approach for the software development phase. In: 2012 19th Asia-Pacific Software Engineering Conference, pp. 583–592 (2012)
9. Asl, M.H., Kama, N.: A change impact size estimation approach during the software development. In: 2013 22nd Australian Software Engineering Conference, pp. 68–77 (2013)
10. Wang, C., Xie, X., Liang, P., Xuan, J.: Multi-perspective visualization to assist code change review. In: Proceedings - Asia-Pacific Software Engineering Conference, APSEC 2017-December, pp. 564–569 (2018). https://doi.org/10.1109/APSEC.2017.66
11. Zahraoui, H., Janati Idrissi, M.A.: Adjusting story points calculation in scrum effort & time estimation. In: 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA), pp. 1–8 (2015)
12. Popli, R., Chauahn, N.: Managing uncertainty of story-points in Agile software. In: 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 1357–1361 (2015)
13. Hannay, J.E., Benestad, H.C., Strand, K.: Agile uncertainty assessment for benefit points and story points. IEEE Softw. 36, 50–62 (2019). https://doi.org/10.1109/MS.2018.2875845
14. Jakobsen, C.R., Sutherland, J.: Scrum and CMMI going from good to great. In: 2009 Agile Conference, pp. 333–337 (2009)
15. Cleland-Huang, J.: Meet Elaine: a persona-driven approach to exploring architecturally significant requirements. IEEE Softw. 30, 18–21 (2013). https://doi.org/10.1109/MS.2013.80
16. Garcia, C.K.Z.: Human-centered product owner: how human-centered design can sharpen scrum methodology. In: Stephanidis, C. (ed.) HCI 2015. CCIS, vol. 528, pp. 409–413. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21380-4_69
17. Sjoberg, D.I.K., Odberg, E., Warlo, B.: The challenge of assessing and controlling complexity in a large portfolio of software systems. In: Proceedings of the 11th International Conference on Product Focused Software. Association for Computing Machinery, New York, NY, USA, pp. 71–74 (2010)
18. Sletholt, M.T., Hannay, J., Pfahl, D., et al.: A literature review of agile practices and their effects in scientific software development. In: Proceedings of the 4th International Workshop on Software Engineering for Computational Science and Engineering. Association for Computing Machinery, New York, NY, USA, pp. 1–9 (2011)
19. Basri, S., Kama, N., Haneem, F., Ismail, S.A.: Predicting effort for requirement changes during software development. In: Proceedings of the Seventh Symposium on Information and Communication Technology. Association for Computing Machinery, New York, NY, USA, pp. 380–387 (2016)
20. Basri, S., Kama, N., Sarkan, H.M., et al.: An algorithmic-based change effort estimation model for software development. In: 2016 23rd Asia-Pacific Software Engineering Conference (APSEC), pp. 177–184 (2016)
21. Kim, T., Kim, K., Kim, W.: An interactive change impact analysis based on an architectural reflexion model approach. In: 2010 IEEE 34th Annual Computer Software and Applications Conference, pp. 297–302 (2010)
22. Krzanik, L., Rodriguez, P., Simila, J., et al.: Exploring the transient nature of agile project management practices. In: 2010 43rd Hawaii International Conference on System Sciences, pp. 1–8 (2010)
23. Silva, A.G.F., Andrade, W.L., Alves, E.L.G.: A study on the impact of model evolution in MBT suites. In: Proceedings of the III Brazilian Symposium on Systematic and Automated Software Testing. Association for Computing Machinery, New York, NY, USA, pp. 49–56 (2018)
24. Cao, L., Ramesh, B., Abdel-Hamid, T.: Modeling dynamics in agile software development. ACM Trans. Manage. Inf. Syst. 1 (2010). https://doi.org/10.1145/1877725.1877730
25. Vasic, M., Parvez, Z., Milicevic, A., Gligoric, M.: File-level vs. module-level regression test selection for .NET. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. Association for Computing Machinery, New York, NY, USA, pp. 848–853 (2017)
26. Tufano, M., Sajnani, H., Herzig, K.: Towards predicting the impact of software changes on building activities. In: 2019 IEEE/ACM 41st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER), pp. 49–52 (2019)
27. Habib, B., Romli, R.: A systematic mapping study on issues and importance of documentation in agile. In: 2021 IEEE 12th International Conference on Software Engineering and Service Science (ICSESS), pp. 198–202 (2021)
28. Jonkers, R.K., Eftekhari Shahroudi, K.: A design change, knowledge, and project management flight simulator for product and project success. IEEE Syst. J. 15, 1130–1139 (2021). https://doi.org/10.1109/JSYST.2020.3006747
29. Duarte, A.P., Herrera, V.A.S., Herrera, R.S., Melendez, G.Y.T.: Using IBM rational application developer to develop enterprise applications with Java EE, Dojo server faces and interconnecting them using SOA. In: Proceedings of the 2011 Conference of the Center for Advanced Studies on Collaborative Research. IBM Corp., USA, pp. 373–374 (2011)
30. Abou Khalil, Z.: Studying the impact of policy changes on bug handling performance. In: 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 590–594 (2019)
31. Hamed, A.M.M., Abushama, H.: Popular agile approaches in software development: review and analysis. In: 2013 International Conference on Computing, Electrical and Electronic Engineering (ICCEEE), pp. 160–166 (2013)
32. Ochoa, O., Towhidneiad, M., Wilson, T., et al.: Adopting agility in academia through pilot projects. In: 2021 IEEE Frontiers in Education Conference (FIE), pp. 1–5 (2021)
33. Kaur, K., Manisha, K.M.: Impact of agile scrum methodology on time to market and code quality – a case study. In: 2021 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), pp. 1673–1678 (2021)
34. Malhotra, R., Chug, A.: Comparative analysis of agile methods and iterative enhancement model in assessment of software maintenance. In: 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 1271–1276 (2016)
35. Manisha, K.M., Kaur, K.: Impact of agile scrum methodology on team’s productivity and client satisfaction – a case study. In: 2021 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), pp. 1686–1691 (2021)
36. Murphy, B., Williams, L.: To branch or not to branch that is the question. In: 2013 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), p. 55 (2013)
37. Rajlich, V.: Teaching undergraduate software engineering. In: 2010 IEEE International Conference on Software Maintenance, pp. 1–2 (2010)
38. Rajlich, V.: Teaching developer skills in the first software engineering course. In: 2013 35th International Conference on Software Engineering (ICSE), pp. 1109–1116 (2013)
39. Włodarski, R., Poniszewska-Marańda, A., Falleri, J-R.: Comparative case study of plan-driven and agile approaches in student computing projects. In: 2020 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), pp. 1–6 (2020)
40. Ahmad, J., Khan, A.W., Khan, H.U.: Role of critical success factors in offshore quality requirement change management using SLR. IEEE Access 9, 99680–99698 (2021). https://doi.org/10.1109/ACCESS.2021.3096663
41. Gary, K.A., Xavier, S.: Agile learning through continuous assessment. In: 2015 IEEE Frontiers in Education Conference (FIE), pp. 1–4 (2015)
42. Zapotecas-Martinez, S., Garcia-Nájera, A., Cervantes, H.: Multi-objective optimization in the agile software project scheduling using decomposition. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion. Association for Computing Machinery, New York, NY, USA, pp. 1495–1502 (2020)
43. Benedicenti, L., Cotugno, F., Cianfrini, P., et al.: Applying scrum to the army - a case study. In: 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C), pp. 725–727 (2016)
44. Taromirad, M., Paige, R.F.: Agile requirements traceability using domain-specific modelling languages. In: Proceedings of the 2012 Extreme Modeling Workshop. Association for Computing Machinery, New York, NY, USA, pp. 45–50 (2012)
45. Utz, W.: Design of a domain-specific metamodel for industrial business process management. In: 2019 8th International Congress on Advanced Applied Informatics (IIAI-AAI), pp. 821–826 (2019)
Validation of Items of Aspects of Interests in Quality-In-Use - Stakeholder Needs of Each System Domain -
Shin-ichi Fukuzumi1,2(B)
1 Center for AIP, RIKEN, Tokyo, Japan
[email protected]
2 Tokyo Metropolitan University, Tokyo, Japan
Abstract. The target of the quality-in-use model has been expanded from direct users to other stakeholders such as organizations and society. The range of target systems considered from the viewpoint of quality-in-use has expanded as well. Against this background, the author prepared many terms for constructing quality-in-use characteristics from the viewpoint of each system domain. To validate these items of aspects of interests, the terms were evaluated through a questionnaire administered to about 5000 workers related to system development. The results show that, for all stakeholders, many of the terms were selected by over 50% of respondents and some by about 40%. Therefore, the validity of these terms could be verified. However, some terms were not supported by this research, and their usage needs further consideration.
Keywords: Software · Quality · Usability · Stakeholder · Ethics
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 22–34, 2023. https://doi.org/10.1007/978-3-031-35129-7_2
1 Introduction
Since the end of the last century, the importance of usability has been recognized not only in ergonomics but also in software engineering [1]. The quality model in the SQuaRE (Systems and software Quality Requirements and Evaluation) series, which is handled by ISO/IEC JTC 1/SC 7, defines effectiveness, efficiency, satisfaction, freedom from risk and context coverage as the elements of “Quality in Use” [2]. Figure 1 shows the Quality in Use model defined in ISO/IEC 25010. On the other hand, ISO 9241-11:2018 [3], one of the human-system interaction standards in the ergonomics area, describes the concept of usability shown in Fig. 2. This standard defines usability as the “extent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use”. The figure therefore explains that usability, whose elements are effectiveness, efficiency and satisfaction, is an outcome of “use” under an identified context of use. In this figure, “use” is emphasized.
Fig. 1. Quality in Use model defined in ISO/IEC 25010 [2]
Fig. 2. Usability concept (ISO 9241-11, modified) [3]
Comparing these two figures shows that they have common elements, i.e., effectiveness, efficiency and satisfaction. As described above, usability is part of the outcome of “use”, so the Quality in Use model is also focused on “use”, that is, direct interaction. In this case, freedom from risk means the influence of use on economy and health, and context coverage means whether the product, system or service can be used under the initially and explicitly identified context of use. As described above, the current quality-in-use model is mainly focused on direct users. However, as system and software products are widely used in our lives, their use influences not only direct users but also organizations and society. For example, in an electric power supply company, the operator who interacts with a control panel is the direct user, the organization that manages and distributes power using the results is the indirect user, and residents, government and/or the public, who are affected (e.g., by a blackout) by how the organization uses the results, are stakeholders other than direct and indirect users. In the case of a self-driving car, the direct user is the driver, the indirect users are the persons sitting in the passenger seats or the customers of a commercial vehicle, and the other stakeholders are the other cars on the road, pedestrians, and the government or local government that manages road and traffic networks. As the influence of use spreads widely, the manufacturer has a social responsibility to keep this influence as controllable as possible.
From this viewpoint, quality is regarded as the influence that use has on stakeholders. The purpose of quality in use is to enable manufacturers and managers to improve the quality of “use” by measuring and evaluating it [4].
1.1 Quality-in-Use Model
Those influenced by the use of systems and software are not only direct users but also various kinds of stakeholders. The content of the influence (quality characteristics) differs depending on who is influenced. Therefore, this paper classifies them into the four groups shown below and defines a quality in use model for each group.
– Operator of system and/or software
– Organization which has responsibility for system and/or software management
– Customer using system and/or software
– Society in which system and/or software exists
When “Quality in Use” is considered, it is necessary to clarify which group is in focus. This is because, even for the same product or system, the influence of use differs depending on the target, so the quality models for the use of systems and products differ among the groups described above. In this paper, terms are defined as follows:
– “Quality in Use”: Quality in Use is the influence on users and/or the public of using a product, system, or service. Here, influence is represented as quality.
When the scope of quality in use becomes wider, the model has to be changed. Figure 3 shows the correspondence of quality characteristics/sub-characteristics with each stakeholder’s general needs [5]. This figure shows the results of labeling and grouping stakeholder needs extracted from multiple examples, with generic names (right side) and quality characteristics and quality sub-characteristics (left side). A new quality-in-use model using these quality characteristics/sub-characteristics has been proposed [5]. Figure 4 is the proposed quality-in-use model. In this model, three quality characteristics are supported by related work [6].
1.2 Aim of this Study
The aim of this study is to validate the items of aspects of interests shown in Fig. 3. To clarify the rationale for these terms, the validity of the proposed model is examined.
Fig. 3. Correspondence quality characteristics/ sub-characteristics with each stakeholder’s general needs [5]
Fig. 4. Proposed quality in use models [5]
2 Procedure
Questionnaires were administered to 5027 persons: those engaged in system/software product development (808) and users of such systems (all). Their ages range from 16 to over 70. The target systems are shown below. Table 1 shows the participants' categories of business and the number of persons in each. All participants are users of some kind of system.

Table 1. The category of business of the participants

| Category of business | N |
|---|---|
| TOTAL | 5027 |
| Software, information processing, information service | 1878 |
| Financial, stock exchange | 444 |
| Business transact with financial organization | 16 |
| Retailing | 181 |
| Consumer products | 33 |
| Logistics | 47 |
| Storage | 391 |
| Medical and welfare | 215 |
| Pharmacy | 16 |
| General staff | 327 |
| Government/public servant, a civil servant | 983 |
| University teacher, staff | 40 |
| University student | 456 |
The target systems:
1. Financial system (banking, stock exchange, credit card, etc.)
2. Retail system (department store, supermarket, convenience store, including customer system)
3. Medical system used in a hospital or pharmacy (electronic chart, account system, diagnosis data analysis, online medical interview, etc.)
4. In house management system (human resources, general affairs, accounting and other business systems used by employees, e.g., account adjustment, attendance, etc.)
5. Monitoring and control system for traffic, air control, electric power generation, building management, etc.
6. University system (course registration, syllabus, performance, and other management used by teachers, staff and students)
7. Online shopping system like EC site
8. Other systems

Question
A) What do you think are the merits for your organization of installing the systems shown above? (Select from the terms shown below. Multiple answers allowed.) “Cost and benefit”, “man-hour for management”, “man-hour for operation”, “stock price”, “advantage”, “reliability”, “safety”, “security”, “confidentiality”, “maintainability”, “trust”, “transparency”, “accountability”, “brand image”, “corporate identification”, “traceability”, “service support”, “ethics”, “others”
B) When you use the systems listed above, what do you expect of the system? (Select from the terms shown below. Multiple answers allowed.) “usability”, “accessibility”, “adaptability”, “reliability”, “safety”, “privacy”, “security”, “self-authority”, “trust”, “transparency”, “ethics”, “tool for operation”, “manual for operation”, “education/training”, “others”
C) From the viewpoint of neutrality, when these systems are used in your region, what do you expect of the system? (For Government/public servant, a civil servant only. Select from the terms shown below. Multiple answers allowed.) “tax revenue”, “employ people”, “environment adaptability”, “few the number of accident or matters”, “trust”, “transparency”, “ethics”, “fairtrade”, “others”
3 Results
The aim of this study is to verify the validity of each stakeholder need, which is represented by a term. The results of the totalization and analysis for each question are shown below. In these tables, a filled cell indicates that the ratio of stakeholders who selected the term as a suitable stakeholder need is over 50%, and a bold cell indicates that the ratio is just under 50% (approximately 40% to 50%).
3.1 Results for Stakeholders as Organization
The aim of question A) is to learn the merits or needs from the viewpoint of the organization when these systems are installed in the organization. The result is shown in Table 2. This result shows that the terms other than “stock price”, “transparency”, “accountability”, “brand image”, “corporate identification”, “traceability” and “ethics” are suitable as stakeholder needs for at least one of the systems.
Table 2. The results about aspects of interests for organization. (Each cell: number, with percentage in parentheses)

| System | N | Cost and benefit | Man-hour for management | Man-hour for operation | Stock price | Advantage | Reliability |
|---|---|---|---|---|---|---|---|
| Financial system | 220 (100.0) | 67 (30.5) | 77 (35.0) | 98 (44.5) | 15 (6.8) | 90 (40.9) | 129 (58.6) |
| Retail system | 152 (100.0) | 56 (36.8) | 78 (51.3) | 93 (61.2) | 12 (7.9) | 54 (35.5) | 79 (52.0) |
| Medical system using hospital or pharmacy | 83 (100.0) | 14 (16.9) | 32 (38.6) | 34 (41.0) | 6 (7.2) | 23 (27.7) | 47 (56.6) |
| In house management system | 323 (100.0) | 47 (14.6) | 185 (57.3) | 189 (58.5) | 14 (4.3) | 84 (26.0) | 164 (50.8) |
| Monitoring and control system | 51 (100.0) | 11 (21.6) | 20 (39.2) | 23 (45.1) | 3 (5.9) | 24 (47.1) | 36 (70.6) |
| University system | 33 (100.0) | 4 (12.1) | 18 (54.5) | 14 (42.4) | 5 (15.2) | 15 (45.5) | 16 (48.5) |
| Online shopping system like EC site | 63 (100.0) | 28 (44.4) | 19 (30.2) | 18 (28.6) | 6 (9.5) | 15 (23.8) | 25 (39.7) |
| The other system | 276 (100.0) | 69 (25.0) | 135 (48.9) | 167 (60.5) | 6 (2.2) | 109 (39.5) | 153 (55.4) |

| System | Safety | Security | Confidentiality | Maintainability | Trust | Transparency |
|---|---|---|---|---|---|---|
| Financial system | 82 (37.3) | 139 (63.2) | 108 (49.1) | 89 (40.5) | 113 (51.4) | 46 (20.9) |
| Retail system | 36 (23.7) | 61 (40.1) | 39 (25.7) | 51 (33.6) | 48 (31.6) | 27 (17.8) |
| Medical system using hospital or pharmacy | 35 (42.2) | 47 (56.6) | 41 (49.4) | 34 (41.0) | 35 (42.2) | 28 (33.7) |
| In house management system | 89 (27.6) | 171 (52.9) | 124 (38.4) | 90 (27.9) | 95 (29.4) | 72 (22.3) |
| Monitoring and control system | 32 (62.7) | 33 (64.7) | 26 (51.0) | 24 (47.1) | 20 (39.2) | 13 (25.5) |
| University system | 15 (45.5) | 16 (48.5) | 15 (45.5) | 14 (42.4) | 11 (33.3) | 11 (33.3) |
| Online shopping system like EC site | 20 (31.7) | 38 (60.3) | 19 (30.2) | 21 (33.3) | 20 (31.7) | 14 (22.2) |
| The other system | 80 (29.0) | 141 (51.1) | 114 (41.3) | 101 (36.6) | 85 (30.8) | 38 (13.8) |

| System | Accountability | Brand image | Corporate identification | Traceability | Service support | Ethics | Others |
|---|---|---|---|---|---|---|---|
| Financial system | 28 (12.7) | 22 (10.0) | 22 (10.0) | 42 (19.1) | 46 (20.9) | 35 (15.9) | 1 (0.5) |
| Retail system | 19 (12.5) | 20 (13.2) | 11 (7.2) | 37 (24.3) | 38 (25.0) | 12 (7.9) | 2 (1.3) |
| Medical system using hospital or pharmacy | 24 (28.9) | 11 (13.3) | 12 (14.5) | 23 (27.7) | 20 (24.1) | 22 (26.5) | 0 (0.0) |
| In house management system | 36 (11.1) | 14 (4.3) | 22 (6.8) | 68 (21.1) | 69 (21.4) | 37 (11.5) | 2 (0.6) |
| Monitoring and control system | 8 (15.7) | 6 (11.8) | 7 (13.7) | 11 (21.6) | 18 (35.3) | 11 (21.6) | 0 (0.0) |
| University system | 10 (30.3) | 5 (15.2) | 8 (24.2) | 12 (36.4) | 14 (42.4) | 9 (27.3) | 0 (0.0) |
| Online shopping system like EC site | 7 (11.1) | 18 (28.6) | 8 (12.7) | 12 (19.0) | 17 (27.0) | 8 (12.7) | 0 (0.0) |
| The other system | 21 (7.6) | 16 (5.8) | 15 (5.4) | 56 (20.3) | 58 (21.0) | 30 (10.9) | 7 (2.5) |
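The percentages in Table 2 and the two highlighting bands described in Section 3 can be reproduced with a short calculation. The following Python sketch is only illustrative: the function names `selection_ratio` and `support_band` are hypothetical, and the exact boundaries (strictly above 50% for the upper band, 40% to 50% inclusive for the lower band) are assumptions, since the text describes the lower band only as "approximately 40% to 50%".

```python
def selection_ratio(count: int, respondents: int) -> float:
    """Percentage of respondents who selected a term, to one decimal place."""
    return round(100.0 * count / respondents, 1)


def support_band(percent: float) -> str:
    """Assumed banding: above 50% is the 'filled' band in the tables,
    40% to 50% the 'bold' band, and anything lower is unsupported."""
    if percent > 50.0:
        return "over 50%"
    if percent >= 40.0:
        return "about 40% to 50%"
    return "below 40%"


# Financial system row of Table 2 (N = 220): term -> number of selections.
financial_n = 220
financial_counts = {
    "reliability": 129,
    "man-hour for operation": 98,
    "stock price": 15,
}

financial_bands = {}
for term, count in financial_counts.items():
    percent = selection_ratio(count, financial_n)
    financial_bands[term] = (percent, support_band(percent))
    print(term, percent, financial_bands[term][1])
```

Applying the same banding to every system makes it easy to list which terms never reach 40% for any system, which is how the unsupported terms are identified in the text.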
3.2 Results for Users or Customers as Stakeholders
The aim of question B) is to learn what operators or customers expect of the system when they use these systems. The result is shown in Table 3. This result shows that the terms other than “accessibility”, “adaptability”, “self-authority”, “transparency”, “ethics”, “tool for operation”, “manual for operation” and “education/training” are suitable as stakeholder needs for at least one of the systems.
Table 3. The results about aspects of interests for operators or customers. (Each cell: number, with percentage in parentheses)

| System | N | Usability | Accessibility | Adaptability | Reliability | Safety | Privacy | Security |
|---|---|---|---|---|---|---|---|---|
| Financial system | 2356 (100.0) | 723 (30.7) | 393 (16.7) | 252 (10.7) | 1526 (64.8) | 1198 (50.8) | 1178 (50.0) | 1886 (80.1) |
| Retail system | 887 (100.0) | 378 (42.6) | 219 (24.7) | 182 (20.5) | 486 (54.8) | 396 (44.6) | 397 (44.8) | 587 (66.2) |
| Medical system using hospital or pharmacy | 539 (100.0) | 180 (33.4) | 119 (22.1) | 140 (26.0) | 340 (63.1) | 279 (51.8) | 341 (63.3) | 391 (72.5) |
| In house management system | 2006 (100.0) | 935 (46.6) | 483 (24.1) | 445 (22.2) | 1096 (54.6) | 771 (38.4) | 841 (41.9) | 1366 (68.1) |
| Monitoring and control system | 298 (100.0) | 122 (40.9) | 86 (28.9) | 83 (27.9) | 181 (60.7) | 173 (58.1) | 125 (41.9) | 206 (69.1) |
| University system | 335 (100.0) | 114 (34.0) | 86 (25.7) | 75 (22.4) | 174 (51.9) | 152 (45.4) | 205 (61.2) | 235 (70.1) |
| Online shopping system like EC site | 2097 (100.0) | 844 (40.2) | 442 (21.1) | 233 (11.1) | 117 (56.1) | 1047 (49.9) | 1235 (58.9) | 1687 (80.4) |

| System | Self-authority | Trust | Transparency | Ethics | Tool for operation | Manual for operation | Education/training | Others |
|---|---|---|---|---|---|---|---|---|
| Financial system | 163 (6.9) | 1545 (65.6) | 444 (18.8) | 240 (10.2) | 391 (16.6) | 248 (10.5) | 110 (4.7) | 5 (0.2) |
| Retail system | 84 (9.5) | 465 (52.4) | 205 (23.1) | 107 (12.1) | 200 (22.5) | 143 (16.1) | 80 (9.0) | 2 (0.2) |
| Medical system using hospital or pharmacy | 66 (12.2) | 306 (56.8) | 151 (28.0) | 133 (24.7) | 121 (22.4) | 98 (18.2) | 76 (14.1) | 0 (0.0) |
| In house management system | 140 (7.0) | 776 (38.7) | 416 (20.7) | 256 (12.8) | 635 (31.7) | 532 (26.5) | 298 (14.9) | 10 (0.5) |
| Monitoring and control system | 55 (18.5) | 176 (59.1) | 73 (24.5) | 62 (20.8) | 77 (25.8) | 62 (20.8) | 56 (18.8) | 0 (0.0) |
| University system | 43 (12.8) | 165 (49.3) | 83 (24.8) | 51 (15.2) | 101 (30.1) | 95 (28.4) | 81 (24.2) | 2 (0.6) |
| Online shopping system like EC site | 147 (7.0) | 1191 (56.8) | 392 (18.7) | 221 (10.5) | 313 (14.9) | 188 (9.0) | 54 (2.6) | 4 (0.2) |
3.3 Results for Government Employee as Stakeholders

The aim of question C) is to identify what is expected of the system from a neutral viewpoint (e.g. government/public servant) when these systems are used. The results are shown in Table 4.

Table 4. The results about aspects of interests by stakeholders from the viewpoint of neutrality. Cells show number (percentage).

| System | N | Tax revenue | Employ people | Environment adaptability | Few the number of accident or matters |
| --- | --- | --- | --- | --- | --- |
| Financial system | 983 (100.0) | 139 (14.1) | 76 (7.7) | 92 (9.4) | 253 (25.7) |
| Retail system | 983 (100.0) | 160 (16.3) | 160 (16.3) | 147 (15.0) | 193 (19.6) |
| Medical system using hospital or pharmacy | 983 (100.0) | 70 (7.1) | 82 (8.3) | 140 (14.2) | 331 (33.7) |
| In house management system | 983 (100.0) | 71 (7.2) | 140 (14.2) | 153 (15.6) | 244 (24.8) |
| Monitoring and control system | 983 (100.0) | 82 (8.3) | 99 (10.1) | 235 (23.9) | 544 (55.3) |
| University system | 983 (100.0) | 34 (3.5) | 55 (5.6) | 176 (17.9) | 213 (21.7) |
| Online shopping system like EC site | 983 (100.0) | 77 (7.8) | 49 (5.0) | 111 (11.3) | 237 (24.1) |

| System | Trust | Transparency | Ethics | Fairtrade | Others |
| --- | --- | --- | --- | --- | --- |
| Financial system | 748 (76.1) | 337 (34.3) | 201 (20.4) | 202 (20.5) | 18 (1.8) |
| Retail system | 650 (66.1) | 303 (30.8) | 175 (17.8) | 390 (39.7) | 18 (1.8) |
| Medical system using hospital or pharmacy | 738 (75.1) | 408 (41.5) | 366 (37.2) | 196 (19.9) | 16 (1.6) |
| In house management system | 708 (72.0) | 463 (47.1) | 344 (35.0) | 159 (16.2) | 14 (1.4) |
| Monitoring and control system | 684 (69.6) | 316 (32.1) | 241 (24.5) | 198 (20.1) | 16 (1.6) |
| University system | 712 (72.4) | 503 (51.2) | 386 (39.3) | 128 (13.0) | 20 (2.0) |
| Online shopping system like EC site | 749 (76.2) | 405 (41.2) | 219 (22.3) | 555 (56.5) | 14 (1.4) |
This result shows that all terms except “tax revenue”, “employ people”, “environment adaptability” and “ethics” are suitable as stakeholder needs for any system.
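As a quick arithmetic check on how the percentage in each cell relates to the count reported with it, the following minimal Python sketch divides a count by the number of respondents N. The counts (748, 650 and 139) and N = 983 are taken from the “trust” and “tax revenue” entries of Table 4; the function name `percentage` and the variable names are illustrative assumptions, not part of the original study.

```python
def percentage(count: int, n: int) -> float:
    """Share of respondents who selected a term, in percent (one decimal)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return round(100.0 * count / n, 1)


# N for question C), as reported in Table 4.
N_RESPONDENTS = 983

# Counts copied from the "trust" and "tax revenue" entries of Table 4.
trust_counts = [748, 650]
tax_revenue_count = 139

for count in trust_counts:
    print(count, percentage(count, N_RESPONDENTS))
print(tax_revenue_count, percentage(tax_revenue_count, N_RESPONDENTS))
```

The same calculation applies to Tables 2 and 3, with N replaced by the number of respondents for the corresponding system.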
Fig. 5. The relationship among aspects of interests for each stakeholder’s type
4 Discussions

4.1 Difference of Stakeholder Needs by Each Stakeholder

Figure 5 maps the results of the three questions onto the stakeholder needs. In the figure, the terms shown in bold text and grey color are those that received a positive score as stakeholder needs. For the responsible organization, the stakeholder needs related to “beneficialness” and “freedom from risk” are almost all accepted by the participants from the viewpoint of the organization. However, most of the terms related to “acceptability” are not accepted, the exceptions being “trust” and “service support”. A possible reason is that these items are so important for organizational activities, especially in enterprises, that there is no need to state them explicitly. According to the detailed data, “transparency”, “traceability” and “accountability” in “freedom from risk” received over 30% support. Therefore, it is not necessary to delete these terms from the list.
For operators and customers, the stakeholder needs related to “freedom from risk” are almost all accepted by the participants from the viewpoint of operators and customers. However, most of the terms related to “beneficialness” and “acceptability” are not accepted, the exceptions being “usability” and “trust”. For these stakeholders, not only “usability” and “trust” but also the terms related to “freedom from risk” are important. What about the other terms? The author considers “accessibility” to be one of the most important items for interactive systems. Although the share of respondents selecting this term is not high, its score is not particularly low for any system (20%–30%). The scores of “adaptability” and “transparency” are similar. Therefore, these terms can remain in the list.

For public and society, the stakeholder needs related to “acceptability” are almost all accepted by the participants from the viewpoint of neutral persons. However, most of the terms related to “beneficialness” and “freedom from risk” are not accepted, the exception being “the number of accident or matters”. In particular, no term related to “beneficialness” was accepted by the stakeholders representing public and society. A possible reason is that neutral persons are likely to pay attention to social benefit rather than to the benefit of individuals or private organizations (e.g. enterprises). For “environmental adaptability”, the ratio is not high. Although the reason is difficult to explain, the relationship between the use of a system and its influence on the environment may be unclear. Influence on the environment is very important when considering quality-in-use. Therefore, this item is not deleted from the list, but it should be given a sufficient explanation.

4.2 Ethics

The term “ethics” is included as an acceptability item in the stakeholder needs of all stakeholders in the original list. However, for all questions, the acceptance ratio is low. A possible reason is that the relation between ethics and the use of systems is difficult to explain. However, as described in Sect. 1, “It is responsibility for society of the manufacturer to be controllable these influences as much as possible.” It is therefore important to show that the use of any system has an influence on ethics. From the result of table X and table XX, the score related to ethics for the medical system using hospital or pharmacy is relatively higher than that of any other system. The use of this kind of system requires an ethical view. In addition, for public and society, the ratio of accepting “ethics” is not so low (20%–40%). For these reasons, “ethics” remains in the list of stakeholder needs.
5 Conclusion

From these results, the author finalized the terms of the aspects of interest shown in Fig. 6. However, there is no term in “beneficialness” for public and society. As a quality sub-characteristic, “suitability” is used for this cell (see Fig. 4), so suitable terms should appear there. The author will consider these terms in future work.
Fig. 6. The terms and aspects of interests for each stakeholders’ type verified by questionnaire
References

1. Fukuzumi, S., Hirasawa, N., Wada, N., Komiyama, T., Azuma, M.: Proposal of quality in use in software quality. In: Kurosu, M. (ed.) Human-Computer Interaction. Design and User Experience, HCII 2020, vol. 12181, pp. 431–438. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-49059-1_31
2. ISO/IEC 25010: Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - System and software quality (2011)
3. ISO 9241-11: Ergonomics of human-system interaction - Part 11: Usability for definition and concept (2018)
4. Fukuzumi, S., Hirasawa, N.: Usability in software development process - Proposal of society. stakeholder centered design, IWESQ2019 (2019)
5. Fukuzumi, S., Wada, N., Hirasawa, N.: Quality in Use - Issues and proposal -, IWESQ2020 (2020)
6. Karasawa, K.: Social acceptance toward autonomous cars: suggestions from a survey focused on risk/benefit perception. J. Soc. Autom. Eng. Jpn. 75(1), 23–28 (2021)
Design Study of Wearable IV Pole: Service Design Perspective

Guizhi Hong and Hong Chen

East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, Xuhui District, People’s Republic of China
Abstract. The current pandemic has caused a surge in the need for medical services, and this has led to heightened demand for intravenous infusion, a key medical treatment. The traditional intravenous infusion system is plagued by a lack of digitalization, complex processes, redundant tasks, and difficulty for patients to move around. This research examines the infusion process in a general ward, applies service design principles to the design of the IV pole, optimizes the infusion service through questionnaires, interviews, user journey maps, and service blueprints, and develops an IV pole and its accompanying system which can be operated with one hand, automating the intravenous infusion process and making it more accurate and efficient, thus minimizing errors. The implementation of this system boosts the precision and speed of infusion data management, cuts down on infusion errors and medical incidents, and elevates the quality of infusion services, thus granting patients better service and guaranteeing the security of infusion.

Keywords: Service Design · User Survey · IV Pole · Wearable Device
1 Current Product Situation and Usage of IV Poles

1.1 Overview of Infusion Background

Due to the current outbreak, the need for medical assistance has increased, and demand for intravenous infusion, one of the primary therapeutic interventions, is growing. It is estimated that 60–80% of hospitalized patients receive intravenous infusion therapy [1]. In China, this figure reaches up to 93.13% [2]. The entire procedure encompasses dispensing, checking patient information, administering the infusion, attaching the infusion bottle, visiting the patient, positioning, completing, and so forth. The lengthy infusion process consumes a considerable amount of nursing and medical resources, with nurses spending over 75% of their day on this task [3]. The traditional infusion model lacks an organic integration of multiple service elements such as patients, families, medical staff, equipment, and environment, which not only affects the infusion experience of patients but also consumes the energy of the medical staff, so the model needs to be optimized to meet the service needs of patients, families, and nurses. As the quality of life improves, people pay more attention to the service experience of products, and the service design concept is increasingly integrated into medical product design. As an important medium for optimizing infusion management services, IV poles greatly influence the accuracy of infusion information, the streamlining of the infusion management process, the control of infusion risks, and patient comfort. We can study the patient’s access process and access services, and transform the infusion process and management method through service design with the assistance of wearable smart IV poles.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 35–50, 2023. https://doi.org/10.1007/978-3-031-35129-7_3

1.2 Type of IV Pole

An IV pole is a device that holds the vessel containing the medication in place while the patient is being infused, with the goal of safely and efficiently delivering the fluid to the patient at a constant rate. Existing IV poles can be divided into two main categories: clinically used IV poles and intelligent IV poles. Clinical IV poles include but are not limited to floor-mounted IV poles, rail-mounted IV poles, and bedside IV poles [4], as shown in Fig. 1(a)-(c). The floor stand is the most prevalent IV pole in modern times. It combines a stand, a wheeled base, and hooks, with the hooks attached to the top of the stand and the wheeled base connected to its bottom [5]. Although floor stands are more affordable and common, they are inherently bulky and cannot be easily moved; tracked stands have a limited range of motion, as they must stay on the track and are only suitable for certain spaces; and bedside stands can only be used with specific beds [6]. Intelligent IV poles include the robotic IV pole [7] and the automatic monitoring IV pole. The robotic IV pole is designed to address the challenge of limited patient mobility, as shown in Fig. 1(d). Even though the IV pole’s mobility is enhanced, it is not fully automated and does not perfectly track the patient’s movements, so the lack of flexibility is still present.
Various designs have been proposed to address the issue of flexibility, such as backpack IV poles [8], wheelchair IV poles, hanging neck IV poles, and wall IV poles, as illustrated in Fig. 1(e)-(f). However, each serves only a single purpose and may present safety hazards such as slipping and blood backflow. Automated monitoring of IV poles is becoming increasingly prevalent, and the combination of IoT, barcode technology, data analysis, sensor devices, and medical applications could lead to a surge in the popularity of intelligent IV pole systems, thus significantly enhancing the accuracy, efficiency, and safety of medical services.

Fig. 1. Different intravenous IV poles. (a) Floor-mounted IV pole. (b) Orbital IV pole. (c) Bedside IV pole. (d) Robotic IV pole. (e) Backpack IV pole. (f) Neck IV pole.

1.3 Problems to Be Solved

Improve the Information Accuracy of Infusion Services. During intravenous infusion, there are a variety of potential issues, including incorrect bed assignment, inappropriate drug administration, incorrect dispensing due to a lack of rigorous verification, undetected issues associated with the infusion process, and misinterpreted data resulting from patients’ movements, such as going to the restroom. Intravenous infusion therefore demands high standards of medical and nursing operation, and to avoid safety risks in the process, the accuracy of infusion information should be strictly controlled. In particular, faster and more precise feedback regarding the infusion process is essential, such as confirming the medication, identifying any contraindications to infusion, tracking the infusion, responding to infusion-related emergencies, changing the medication, and reminding of needle removal.

Improve Medical Efficiency. The infusion process involves many steps, such as three double-checks, administering the drug, confirming data after the procedure, making regular visits during the infusion, being mindful of the IV flow rate, informing patients and their families that the rate should not be changed without authorization, keeping a close eye on any infusion reactions, managing IV treatment issues, and addressing any negative effects of drugs. Nurses are under greater strain as they must continually carry out the same mundane and repetitive infusion tasks.

Improve User Experience. The traditional infusion mode often encounters issues such as a lack of patient mobility, difficulty summoning assistance, inadequate communication with medical personnel, and discrepancies in information, likely as a result of its intricate infusion procedure and extended infusion treatment duration.

1.4 Research Tasks

This study explores how service design can be used to create sensor devices and systems for IV poles, optimizing infusion services through the use of sensors and terminals for monitoring, enabling accurate feedback of infusion information, eliminating manual tasks for nurses, and providing intelligent monitoring of patient status to reduce medical errors, improve the efficiency of healthcare workers, and increase patient comfort. This study takes the main stakeholders in the general ward environment as the design object, analyzes the current situation of infusion in the general ward and the application of existing IV poles, conducts research on patients, patients’ families, and medical and nursing staff through questionnaires and interviews, and then analyzes the research results using service design methods and tools to study the pain points and touchpoints in the infusion process, identify design opportunities, and propose an optimized design. Specific design strategies were proposed based on user needs, and the result was a wearable IV pole and an associated infusion information interface. Finally, the design solution was evaluated by means of usability testing, which showed that the model is capable of increasing the accuracy of infusion data, optimizing infusion efficiency, decreasing potential safety hazards, lessening the workload of healthcare personnel, and improving the overall infusion experience of patients.
2 User Research and Demand Analysis of Ward Infusion Services

2.1 User Research Methods

This user needs assessment was performed through a combination of qualitative and quantitative data collection and analysis. In the qualitative study, data was gathered from patients, family members, and nurses through semi-structured interviews. The quantitative research employed a survey that examined the sentiments and views of the participants (patients and their family members) of the infusion service regarding the entire infusion procedure.

2.2 User Research Participants

The user needs assessment combined the findings of semi-structured interviews and questionnaires. In the interview section, a total of 10 patients, 5 accompanying family members, and 5 nurses participated in the semi-structured interviews (Table 1). In the questionnaire section, a total of 500 participants with previous infusion experience were invited to complete the questionnaire online, and after screening out invalid questionnaires, the information of a total of 425 participants was recorded (Table 2).

2.3 Analysis of Interview Content and Results

The key stakeholders were interviewed online, with semi-structured questions being used as a reference. Before the interview, the participants were provided with information concerning the purpose, content, and procedures of the research. The interview outline included basic information, experiences during the infusion process, and infusion management work. A transcript of the interview was recorded to facilitate subsequent organization and analysis.
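Table 1. Background statistics of user interview participants.

| Related stakeholders | Participant | Gender | Age | Academic qualifications |
| --- | --- | --- | --- | --- |
| Patients | N1 | Female | 23 | Master |
| Patients | N2 | Female | 25 | Undergraduate |
| Patients | N3 | Female | 48 | Secondary School |
| Patients | N4 | Female | 21 | Undergraduate |
| Patients | N5 | Male | 22 | Master |
| Patients | N6 | Female | 22 | Master |
| Patients | N7 | Male | 52 | Secondary School |
| Patients | N8 | Male | 78 | University |
| Patients | N9 | Male | 81 | High School |
| Patients | N10 | Female | 23 | Master |
| Accompanying family members | N11 | Female | 22 | Master |
| Accompanying family members | N12 | Female | 48 | High School |
| Accompanying family members | N13 | Female | 53 | University |
| Accompanying family members | N14 | Female | 32 | College |
| Accompanying family members | N15 | Female | 50 | University |
| Nurse | N16 | Female | 34 | College |
| Nurse | N17 | Female | 26 | College |
| Nurse | N18 | Female | 28 | College |
| Nurse | N19 | Female | 24 | College |
| Nurse | N20 | Female | 38 | College |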
The interview results in Table 3 indicate that, for patients and their relatives, the major sources of dissatisfaction during infusion were the lack of prompt information feedback and limited mobility. 1. Regarding information feedback, the main points of conflict were the information asymmetry between patients and family members on one side and nurses on the other, the difficulty of establishing an intuitive and rapid communication channel with nurses, and the need to call the nurse multiple times when adverse reactions occurred or help was needed. 2. Patients tend to experience negative emotions when they are required to remain in the same position for an extended period due to their mobility restrictions. For nurses, the main problems are: 1. They cannot grasp the real-time infusion situation and need to go back and forth several times. 2. The need to attend to multiple patients simultaneously depletes their energy and prevents them from communicating promptly with the patients. 3. Patients may be unable to accurately identify the issues they are facing, which can slow down the process of addressing any adverse reactions. 4. They need to rely on manual experience to judge the infusion rate and other information, which is sometimes not accurate enough.
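Table 2. Statistics of background information of the participants of the questionnaire

| Category | Participants | N |
| --- | --- | --- |
| Age | Under 18 years old | 1 |
| Age | 18–24 years old | 174 |
| Age | 25–35 years old | 163 |
| Age | 35–60 years old | 70 |
| Age | 61 years old and above | 27 |
| Gender | Male | 145 |
| Gender | Female | 288 |
| Education Level | Elementary school and below | 3 |
| Education Level | Junior High School | 14 |
| Education Level | High school/junior high school/technical school | 35 |
| Education Level | University Specialists | 86 |
| Education Level | Undergraduate | 212 |
| Education Level | Master’s degree and above | 77 |
| Infusion experience | Yes | 425 |
| Infusion experience | No | 0 |
| Solo infusion experience | Yes | 350 |
| Solo infusion experience | No | 75 |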
2.4 Questionnaire Content and Analysis of Results

The survey distributed 500 questionnaires, 425 of which were returned with usable responses. The participants’ opinions on the infusion process were gathered with a questionnaire containing four single-choice questions, three multiple-choice questions, and one scale question. The questionnaire consisted of four main parts: the first part asked about the participants’ background, namely age, gender, educational background, and infusion experience; the second part asked about the problems the participants encountered during the infusion process; the third part addressed the efficiency of the health care provider’s response during the infusion service; and the fourth part covered patients’ expectations of the infusion product. The study objectives and questionnaires were distributed to participants online through Questionnaire Star, and all data obtained were statistically analyzed using IBM SPSS Statistics 25 and Microsoft Excel.
Table 3. Results of Stakeholder Interviews (problems encountered during infusion, by stakeholder)

Patients
• More patients, tighter infusion spaces in hospitals
• Long waiting time, not sure when to start the infusion
• It often happens that the nurse has not come to change the medicine after the drops are finished
• Uncomfortable going to the toilet when infusing alone
• Need help from family members when moving down to the ground
• The drop speed is too slow, I don’t know when it will be finished
• During infusion, the range of motion is restricted by the IV pole, so the infusion can only be kept in one position for a long time

Family members
• When asking the nurse about the situation, I feel that the nurse will be impatient if I ask too many questions
• Need to pay attention to the drops every once in a while
• Nurses sometimes do not respond to calls right away and need to be called some time in advance
• Concerns about drug side effects

Nurse
• Inability to keep track of the infusion process of all patients in real-time, requiring frequent rounds
• Too busy to respond to patient calls promptly
• Often need to rely on manual meter reading to calculate the drip rate, time-consuming and not accurate enough
• Sometimes the drip rate is too fast leading to adverse reactions in patients
• Patients cannot accurately respond to problems encountered during infusion and do not have enough trust in the explanations given by the healthcare provider
The following are the results of the questionnaire. Figure 2 shows the difficulties experienced by patients during the infusion procedure. 71.76% of the patients had experienced the stress of restricted mobility during infusion, more than 50% said they could not remind the doctor to change the medicine bottle in time when they were unaccompanied, and only 7.06% reported no distress.

Fig. 2. Trouble patients have encountered during infusion.

The questionnaire contained one scale question with 5 items to measure patient satisfaction with the nurses’ performance of intravenous infusion tasks. The items were scored on a 5-point Likert scale, with “1–5” representing “not satisfied at all, not satisfied, average, satisfied, and completely satisfied”, and the scale showed good reliability, with a Cronbach’s α coefficient of 0.801. The descriptive statistics of patients’ satisfaction with nurses performing intravenous infusion tasks (Table 4) revealed that the mean value of each item was between average (3) and satisfied (4), and that patients were most satisfied with the nurses’ timely handling of medication changes (M = 3.929, SD = 0.856). This was followed by the handling of infusion abnormalities (M = 3.894, SD = 0.951) and the responsiveness of nurses (M = 3.800, SD = 0.768), with the lowest satisfaction being for the number of rounds by nurses (M = 3.376, SD = 0.951) and for nurses informing patients in detail about the contents of the infusion and precautions before infusion (M = 3.541, SD = 0.853). Overall, participants were moderately satisfied with the nurses’ work during infusion, indicating that infusion services still need to be improved.

Table 4. Descriptive statistics of patients’ satisfaction with nurses performing infusion tasks

| Name | Average value | Standard deviation |
| --- | --- | --- |
| Q1 Did the nurse inform in detail about the contents of the infusion and precautions | 3.541 | 0.853 |
| Q2 When abnormal infusion occurs, is the nurse able to solve it in time | 3.894 | 0.951 |
| Q3 When medication needs to be changed, can the nurse solve it promptly | 3.929 | 0.768 |
| Q4 Is the number of nurse rounds up to standard | 3.376 | 0.951 |
| Q5 Whether the response is timely when calling the nurse | 3.800 | 0.856 |
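To make the reliability and descriptive statistics above concrete, the following minimal Python sketch computes per-item means and standard deviations and Cronbach’s α from a respondents-by-items matrix of Likert scores. The study itself used IBM SPSS Statistics 25; this sketch only illustrates the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function names and the small `demo_responses` matrix are hypothetical and are not the study’s data.

```python
from statistics import mean, stdev, variance


def item_summary(responses):
    """Return (mean, sample SD) for each item; rows are respondents."""
    columns = list(zip(*responses))
    return [(mean(col), stdev(col)) for col in columns]


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    columns = list(zip(*responses))
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of six respondents to the five items Q1..Q5 (1-5 scale).
demo_responses = [
    [4, 4, 5, 3, 4],
    [3, 4, 4, 3, 3],
    [5, 5, 5, 4, 5],
    [2, 3, 3, 2, 3],
    [4, 3, 4, 4, 4],
    [3, 4, 4, 3, 4],
]

for index, (m, sd) in enumerate(item_summary(demo_responses), start=1):
    print(f"Q{index}: M = {m:.3f}, SD = {sd:.3f}")
print(f"alpha = {cronbach_alpha(demo_responses):.3f}")
```

Sample variances are used for both the items and the totals, which is the usual convention; the ratio is the same if population variances are used consistently.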
Figure 3 shows patients’ expectations for infusion aids. The most expected feature was the reminder function, with 74% of patients expecting smart reminders to be included in their infusion devices, followed by visualization of infusion information and wearable features.

Fig. 3. Patient expectations of infusion aids.
2.5 User Journey Mapping Analysis

After a comprehensive assessment that included a literature review, semi-structured online interviews with patients, family members, and nurses, and questionnaires, the infusion process was broken down into individual stages and a user journey map was formulated (Fig. 4). This enabled us to evaluate how patients felt during each stage of their journey. The journey map shows that, throughout the infusion process, patients need to wait for the infusion, check the infusion information, and cooperate with nurses to complete operations such as puncturing, changing medication, and removing needles. This is a relatively long experience for patients, who are unable to pay attention to the infusion status at all times. With nurses tending to many patients simultaneously, some situations may go undetected or not be addressed promptly, resulting in feelings of worry, uncertainty, and frustration on the part of the patients. Consequently, the productivity of nurses must be enhanced by streamlining their procedures, strengthening the dialogue and feedback between patients and nurses, and keeping patients informed of the current progress of the infusion. In addition, exact data-oriented monitoring is employed to prevent operational errors resulting from human mistakes and to boost patients’ trust in infusion security. The inflexibility of the IV pole must also be addressed, as the limits it imposes on activities during infusion are a source of frustration.

Fig. 4. User Journey Map.

2.6 Service Blueprint and Touchpoints Analysis

Following the user journey map, the infusion service process was used to devise a service blueprint that allowed the service touchpoints at each stage to be identified and the user requirements to be further refined (Fig. 5). The service blueprint incorporates the crucial parties, namely patients, family members, and nurses, with nurses assuming the role of the service provider and patients and family members as the recipients of the service. The components of the service and the contact points at each stage of the infusion process are then evaluated. The touchpoints in the infusion preparation stage are interpersonal, digital, and physical. Interpersonal touchpoints are reflected in the delivery of medical advice information by nurses, feedback to patients on infusion readiness, and observation and feedback on patients’ physical status; digital touchpoints are reflected in the management of medical advice information by nurses’ stations and the display of patient infusion queue information by monitors; and physical touchpoints are reflected in the interaction of patients with beds and seats. In the infusion execution phase, interpersonal touchpoints include the nurse’s verification of medical information with the patient and the nurse’s informing the patient about the current infusion, and physical touchpoints include the use of infusion devices during venipuncture.
When administering an infusion, there are also interpersonal and physical touchpoints, with the interpersonal touchpoints being the nurse’s answers to the patient’s questions, the observation and feedback of the patient’s response to the infusion, and the nurse’s response to the patient’s call, and the physical touchpoints being the equipment used to maintain the infusion, such as the IV pole and catheter, and the pager used to call the nurse. The infusion completion phase involves interpersonal, digital, and physical touchpoints. Interpersonal touchpoints include feedback on infusion feelings and handling patient calls, digital touchpoints include infusion completion records and infusion information management, and physical touchpoints are mainly reflected in the pager. The analysis of the above-mentioned touchpoints revealed that the main needs of patients are in terms of infusion effectiveness and experience, and the most important needs to be improved during the infusion process are physical dexterity and promptness of front desk staff response. As the front desk staff, the main needs of nurses are accurate information feedback and efficient management of infusion information, so the needs are in terms of information accuracy and intuitiveness.
Fig. 5. Service Blueprint.
3 IV Pole Product Design Based on Service Design Theory

This phase is based on the feedback gathered during the user needs assessment and covers the IV pole design as well as the interface design for the application prototype. Feedback from all participants on the product prototype regarding service experience and usability was incorporated into the current product prototype. The design includes the IV pole and a digital interface, and the prototype solution was produced using Rhino, Keyshot, and Figma to implement all of the content and functionality identified in the user needs assessment.

3.1 System Architecture

The IV pole uses sensors to track the infusion state in real time and passes the data to the infusion management platform. The platform is linked to the hospital information system to obtain medical advice information, and it then sends the infusion state information and the medical advice information to the large screen and handheld terminal at the nurses’ station, giving nurses immediate access to the ongoing infusion data and facilitating quantitative management of the infusion process.

3.2 Wearable IV Pole Design

The infusion bag is fixed on the patient in a wearable way, which allows infusion information and the patient’s physiological data to be collected through sensors at any time. The system enables comprehensive tracking of the patient during the infusion process, ensuring a safe and secure infusion and allowing the patient more freedom of movement. A diagram of the IV pole design is presented in Fig. 6.

A. An ultra-micro gravity sensor senses the drip rate and residual volume of the drug, enabling accurate monitoring of infusion information and quantitative management of infusion.
B. The foldable design of the structure allows it to be tucked away when not needed, freeing up cramped medical areas.
C. Biosensors affixed to the IV pole collect the patient’s physiological information such as heart rate and body temperature, so that irregularities can be identified quickly and an alert sent to the nurses’ station.
D. The infusion bottle is securely fastened to the patient’s body via adjustable elastic straps and shoulder clips, allowing the patient and their family to move around without having to manually lift it. During infusion, the user can independently use the restroom and shift their body to reduce the discomfort from extended immobility, which results in a more pleasant medical experience.
E. The patient is alerted by light and sound, so that the patient does not need to be constantly aware of the residual volume.
F. Voice-enabled wireless call technology enables patients and their families to communicate with nurses conveniently and promptly, and permits nurses to answer inquiries remotely during times of high infusion demand.

Fig. 6. Wearable IV pole design.

Wearing method: The patient attaches the IV pole to their body with the shoulder clip and elastic band, rotates the stand to hang the infusion bottle on the hook, and turns the stand back to its full extension so the bottle reaches the correct height. The collapsible design of the infusion pole allows it to be taken away easily with a single hand, providing a convenient option for patients who need to be relocated while receiving the infusion. The patient can transfer the bottle from a regular IV pole to a wearable one with only one hand, due to the rotation of the unfolded IV pole, which allows them to use their other hand to put on the IV pole and place the bottle onto it for infusion, as shown in Fig. 7.
Fig. 7. IV pole storage and unfolding method.
3.3 Nurse Station Interactive Large Screen Interface Design
The interactive screen at the nurse’s station enables nurses to quickly gain an overview of each patient’s infusion process, visualize potential issues, and ultimately improve their efficiency in carrying out infusion tasks. The large interactive screen has a graphical interface that displays a variety of information, as depicted in Fig. 8, and illustrates the capability of intelligent infusion information management. Medical and nursing personnel can easily view each bed’s infusion process through the infusion monitoring interface, which displays the remaining dosage, expected time to completion, infusion data, and infusion status; moreover, the system can alert nurses about upcoming infusions through visuals and audio. Different infusion statuses are distinguished by eye-catching colors, and a flashing yellow dot appears above the bed number to prompt nurses when a patient calls.
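The paper does not say how the expected time to completion shown on the screen is derived. A minimal sketch, assuming the sensor reports a residual volume and a flow rate, is to divide one by the other. The class name, field names, status labels, and the 10-minute warning threshold below are all hypothetical and are not taken from the prototype.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InfusionReading:
    """One sensor reading for a bed (field names are illustrative assumptions)."""
    bed: str
    residual_ml: float       # remaining volume in millilitres
    rate_ml_per_min: float   # current flow rate in millilitres per minute


def minutes_to_completion(reading: InfusionReading) -> Optional[float]:
    """Return the expected minutes left, or None if nothing is flowing."""
    if reading.rate_ml_per_min <= 0:
        return None
    return reading.residual_ml / reading.rate_ml_per_min


def status(reading: InfusionReading, warn_minutes: float = 10.0) -> str:
    """Map a reading to a display status (labels and threshold are assumptions)."""
    if reading.residual_ml <= 0:
        return "finished"
    eta = minutes_to_completion(reading)
    if eta is None:
        return "stalled"
    return "ending soon" if eta <= warn_minutes else "running"
```

A display layer could then assign one color per status string; the actual color scheme of the prototype is the one shown in Fig. 8.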
Fig. 8. Design of interactive large screen interface of nurse station.
4 Usability Testing
To obtain user feedback on the interaction flow and the effectiveness of the interface, this paper uses interaction prototypes for formative usability testing, to identify design improvement opportunities and ensure that the design is moving in the right direction. Due to the stage of the design, the evaluation focused mainly on the interface prototype. The evaluation of the product prototype involved 20 patients, accompanying family members, and nurses. After the product system architecture and the purpose of the study were introduced and the contents and functions of the IV pole and application were briefly demonstrated, the participants were asked to rate their satisfaction with the visual experience of the interface in the following areas: intuitiveness of the infusion status, interface fit to the infusion environment, interface hierarchy, consistency of interface style, and overall interface effect. Satisfaction was rated in five categories (satisfied, relatively satisfied, average, relatively dissatisfied, and dissatisfied), and the ratings were converted to a five-point Likert scale on which the most satisfied rating scored 5 and the most dissatisfied rating scored 1.

Table 5. Interface evaluation
Evaluation object                        Average value   Standard deviation
Visualization of infusion status         4.053           0.786
Suitability for infusion environment     4.394           0.835
Interface style                          3.929           0.915
Operability                              4.376           0.765
Overall interface effect                 3.994           0.854
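The conversion from satisfaction categories to the averages and standard deviations in Table 5 can be sketched as follows. This is only an illustration: the responses are made up, and the paper does not state whether a sample or a population standard deviation was used (the sketch assumes the sample form).

```python
from statistics import mean, stdev

# Five-point mapping described in the text (5 = most satisfied, 1 = most dissatisfied).
LIKERT = {
    "satisfied": 5,
    "relatively satisfied": 4,
    "average": 3,
    "relatively dissatisfied": 2,
    "dissatisfied": 1,
}


def summarize(responses):
    """Return (mean, sample standard deviation) of the scored responses."""
    scores = [LIKERT[r] for r in responses]
    return round(mean(scores), 3), round(stdev(scores), 3)


# Hypothetical responses for one evaluation object, not data from the study.
example = ["satisfied", "relatively satisfied", "average", "satisfied"]
avg, sd = summarize(example)
```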
As shown in Table 5, in the feedback on the prototype, nurses attached more importance to the information display and operation habits, especially the intuitiveness of information and the simplicity of operation, and they tended to quickly understand the patient’s condition simply and efficiently, which required a higher level of information on the page and the targeting of the layout, while patients were more satisfied with the intuitiveness of infusion information and the interface effect.
5 Discussion and Conclusion
5.1 Implementation Improvements
With the rise of the service design concept, service design tools are increasingly applied in healthcare to enhance the quality and efficiency of healthcare services. Most current research on ward infusion design has focused on infusion monitoring technology and the design of IV pole shape and function, while research on the infusion service flow is still lacking. This study addresses the pain points encountered by various stakeholders in the infusion process in the general ward environment, redesigns the pathway for studying the infusion process, and proposes a new design solution for the IV pole, with specific improvements mainly in the following areas.
1. In terms of information accuracy: the use of sensors to collect infusion information and patient body data automates the operating procedure and information flow of the intravenous infusion process, and the monitoring of infusion information is more accurate and stable, which reduces infusion errors and medical accidents.
2. In terms of improving the efficiency of medical care: the system architecture transmits infusion data to the nurses’ stations through wireless communication between the sensors and the nurses’ workstations, eliminating repetitive operations such as back-and-forth inspections by nurses, helping nurses to complete their infusion tasks efficiently, and reducing the work pressure on medical staff.
3. In terms of optimizing the user experience: the wearable IV pole enlarges the patient’s activity space, satisfying needs such as going to the toilet alone during infusion and moving appropriately to relieve the soreness of prolonged stillness; timely information exchange with the nurse reduces the patient’s doubts about the safety of the infusion and gives the patient a more comfortable medical experience.
5.2 Shortcomings and Limitations of the Study
In the user study, due to the impact of the epidemic, questionnaires and interviews could only be conducted online, so the sample size and sample distribution had inevitable limitations. Second, in the evaluation of the design prototype, due to the project schedule, the evaluation could not closely approximate a test on the real device, and a more systematic evaluation in a more realistic infusion scenario is needed after further iterations.
5.3 Significance and Prospect of the Study
This study applies service design theory to identify the service needs of the various stakeholders in the infusion process, automates the operating procedures and information flow of the intravenous infusion process, improves the accuracy of infusion information delivery and infusion efficiency, reduces infusion errors and medical errors, and provides better-quality services to patients while ensuring infusion safety, thus further improving patient satisfaction. As costs allow, several improvements can be made in the future in IV pole data monitoring, automatic flow control, and more comfortable wearing methods to further improve the quality of infusion services.
Acknowledgements. This study was supported by Shanghai Summit Discipline in Design.
Extensibility Challenges of Scientific Workflow Management Systems
Muhammad Mainul Hossain, Banani Roy, Chanchal Roy, and Kevin Schneider
University of Saskatchewan, Saskatoon, SK, Canada
{mainul.hossain,banani.roy,chanchal.roy,kevin.schneider}@usask.ca
https://www.usask.ca/
Abstract. Researchers compose scientific workflows for complex scientific experiments and simulations by connecting tools and data in a pipeline. The usability of a scientific workflow management system (SWfMS) largely depends on the availability of necessary tools in the system and the simplicity of their usage. Scientific experiments are incredibly diverse and need a wide variety of tools. Due to an overwhelming number of available tools in the public domain, a SWfMS cannot preinstall all tools required for multifarious experiments. Hence an extensibility mechanism to integrate external tools is greatly important for the flexibility of SWfMS. Tools are independently developed by different development teams using their favorite or suitable programming languages and may run on different operating environments. The tool integration is challenging due to the myriad development languages used for tools and potentially varying operating environments of SWfMS and tools. The software tools may not be robust enough for workflow integration. The state-of-the-art SWfMSs such as Galaxy and KNIME are web-based and can simultaneously serve hundreds of users. The end-users may want to quickly integrate their code in a SWfMS as a tool and use it in a workflow model. But many tools require a system configuration change, which end-users are not authorized to do. The integrated tool must also fit the workflow pipeline with input and output datasets. End-users need an efficient user interface for tool integration by themselves. We created 50 workflows in image processing, bioinformatics, and software analytics domains using VizSciFlow SWfMS. We gathered the challenges we encountered while extending it by integrating tools for these workflows using its extensibility interface.
In this paper, we describe the challenges and propose solutions with the help of two case studies we conducted by developing two real-world workflow products - CoGe’s SynMap workflow in the Bioinformatics domain and source code clone detection and validation in the Software Analytics domain.
Keywords: SWfMS · Workflow Composition · Tool Development · Tool Integration
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 51–70, 2023. https://doi.org/10.1007/978-3-031-35129-7_4
1 Introduction
For complex scientific experiments and simulations, scientists connect various tools and specify their executions into a scientific workflow. Each workflow step involves data transformation and analysis operations based on a set of rules and the instructions for its execution [24]. The provenance information is captured and queried along the way and visualized afterward. Scientific Workflow Management Systems (SWfMSs) streamline the specification, registration, execution, visualization, and monitoring of scientific workflows [2,12,14,21]. Many scientific domains such as astronomy, bioinformatics, gravitational wave physics, geoinformatics, and chemoinformatics extensively use SWfMSs. Scientific experiments encompass a vast range of disciplines, so the tools available to scientists vary greatly. Despite this, researchers often tackle similar problems using different approaches, which can create numerous duplicative tools. While this duplication may seem redundant, it highlights the diversity of thought and methodology within the scientific community, allowing for multiple avenues of exploration and increasing the chances of making groundbreaking discoveries. Bioinformatics Links Directory [6,13] lists dozens to hundreds of tools for solving similar problems. For example, the directory lists 84 different multiple sequence aligners, 141 transcript expression analysis tools, and 182 resources for analyzing pathways and interactions [13,36]. The robustness of a software tool is crucial for ensuring its compatibility across various platforms and seamless integration with other tools. But in the explosive growth of tools, many suffer from a lack of robustness since tool developers frequently ignore the established principles of software engineering in creating robust tools, often due to limited time and budget constraints [36]. 
The primary objective of a scientific workflow management system (SWfMS) is to assist scientists in creating an abstract representation of their experiment, transforming this representation into a concrete workflow, and finally submitting it for execution on a runtime infrastructure. Ensuring a balance between flexibility and usability is crucial when designing any information system [17,30,34]. A system should be effective and efficient in supporting regular activities and flexible enough to accommodate the specific needs of its users [18]. Due to conflicting requirements, providing both with equal efficacy is often complicated, and this trade-off is well-known as the flexibility-usability trade-off [22]. The challenge lies in finding the right balance between flexibility and usability. An extensibility model is necessary for experienced users to exploit the system’s capabilities and extend the system to improve flexibility. Integrating the required tools for a scientific experiment is the primary means of achieving this extensibility in a SWfMS. Researchers or programmers usually develop software tools independently using different programming languages, libraries, and frameworks and execute them on various platforms to solve scientific problems. Reusable, composable, reproducible, and singly responsible tools are the best candidates for integration into a SWfMS [5]. The primary goal of tool creation is to solve specific problems rather than being designed for integration into a SWfMS. The workflow management system may have been developed using a
different programming language, libraries, and frameworks and may run on a different execution environment than the tools, which may lead to compatibility issues. Tool development does not always prioritize portability, which would make it easier to move software from one platform to another without losing functionality. Multiple factors must be taken into account during tool integration into a SWfMS. Integrating additional tools on a desktop system may not pose a significant challenge, as users typically have administrative privileges during tool installation. However, large-scale web-based SWfMSs often have many users without direct access to the system. The end-users must send their tools to a system administrator along with complete deployment instructions. This can be tedious and time-consuming for a large user base if the tool deployment varies greatly between users and tools.
We evaluated various popular and state-of-the-art scientific workflow management systems to uncover the challenges, difficulties, and issues associated with integrating external software tools into a SWfMS. We utilized VizSciFlow to develop two real-world workflows as part of our case studies and developed thirty-five tools. Our first case study involved the creation of CoGe’s SynMap workflow [26], which employed fifteen tools to generate a dot plot between two organisms, revealing their syntenic regions of DNA sequences. The second case study was to compose a source code clone detection and validation workflow utilizing the NiCad clone detection [32], CloneCognition [27], and ANNCloneValidation [28] software tools. We ran these tools during workflow execution and collected the corresponding provenance information. We formulate the following research questions for this work:
1. RQ1: What are the challenges of supporting extensibility in Scientific Workflow Management Systems?
2. RQ2: What measures can SWfMS take to address those challenges?
3. RQ3: Can the measures for addressing the extensibility challenges enable the successful composition and execution of real-world workflows?
The remainder of this document is organized as follows. Section 2 provides an overview of scientific workflows, SWfMS, and the extensibility of SWfMS. Section 3 outlines the challenges encountered in integrating new tools into popular and state-of-the-art SWfMSs. The proposed solutions to these challenges are described in Sect. 4. The results of our two case studies, in which we aimed to explore the challenges and assess the proposed solutions, are presented in Sect. 5. The results of the case studies are discussed in Sect. 6. Our future research directions are outlined in Sect. 7. Finally, we conclude the paper in Sect. 8.
2 Background
This section provides background to this work by discussing scientific workflows, scientific workflow management systems, and extensibility with tool integration. A scientific workflow composes a collection of interdependent tasks which acquire, generate, transform, or analyze complex datasets [25]. They can be
used during different phases of an extensive scientific process, i.e., the cycle of hypothesis formation, experiment design, execution, and data analysis [15]. A scientific workflow management system defines, modifies, manages, monitors, and executes scientific workflows by executing scientific tasks [23]. Extensibility is the mechanism of adding external features to an information system dynamically. This balances the requirement for flexibility and usability by providing sufficient support for both. The primary means of achieving extensibility in SWfMSs is through integrating tools. Researchers are constantly developing new tools to solve specific scientific problems. Extensibility through tool integration allows these tools to be registered within the SWfMS, extending the syntax of the workflow language to include these tools and enabling the execution of these tools during workflow execution. Tool Developers are responsible for designing, developing, and integrating these tools within SWfMSs. End-users may also assume the role of tool developers by developing and integrating their tools into the SWfMS. Tool developers should follow the best practices for creating workflow-ready tools [5]. In most scientific workflow management systems, only administrators or superusers can integrate tools.
3 Extensibility Challenges of Scientific Workflow Management Systems
Integrating external tools into a scientific workflow management system can be complex and challenging. The lack of adherence to guidelines for making tools workflow-ready [5] can result in difficulties integrating them into the system. Integration of tools needs tool-specific consideration of compatibility and interoperability issues. We mainly investigated Python-based SWfMSs to discover the tool integration challenges. Current state-of-the-art data-intensive SWfMSs are primarily developed using Python due to its popularity among data scientists and rich support for scientific data analysis. Moreover, Python is an interpreted language, which makes it easy to integrate and load a tool dynamically. The tool integration procedure can be divided into four phases: registration, composition, execution, and provenance. Each phase has its challenges. We found the following challenges that tool developers often encounter while integrating an externally developed tool.
3.1 Registration
The registration phase installs the tool into a scientific workflow management system and configures it for other workflow processes. It must transfer the tool to the appropriate location, bind the tool’s parameters with the internal data representation, and configure the tool’s dependencies for execution. The language system needs to be extended so that the new tool appears in the Tools Catalog and can be used readily in workflow composition.
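The registration steps above can be illustrated with a small in-memory catalog. This is a minimal sketch under stated assumptions, not the registration mechanism of VizSciFlow or any other SWfMS: `ToolSpec`, `ToolCatalog`, and the `count_lines` tool are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolSpec:
    """Metadata a registration step might record for a tool (illustrative)."""
    name: str
    func: Callable
    params: List[str]
    dependencies: List[str] = field(default_factory=list)


class ToolCatalog:
    """Hypothetical catalog mapping tool names to their specifications."""

    def __init__(self):
        self._tools: Dict[str, ToolSpec] = {}

    def register(self, name, params, dependencies=()):
        """Decorator that binds a callable and its parameter names to `name`."""
        def decorator(func):
            if name in self._tools:
                raise ValueError(f"tool {name!r} is already registered")
            self._tools[name] = ToolSpec(name, func, list(params), list(dependencies))
            return func
        return decorator

    def names(self):
        """Names that would appear in the catalog, in sorted order."""
        return sorted(self._tools)

    def run(self, name, **kwargs):
        """Check that all declared parameters are bound, then call the tool."""
        spec = self._tools[name]
        missing = [p for p in spec.params if p not in kwargs]
        if missing:
            raise TypeError(f"{name}: missing parameters {missing}")
        return spec.func(**kwargs)


catalog = ToolCatalog()


@catalog.register("count_lines", params=["text"])
def count_lines(text):
    return len(text.splitlines())
```

Once registered, the tool can be invoked by name, for example `catalog.run("count_lines", text="a\nb")`, which is the kind of lookup a workflow language would perform when it encounters the new vocabulary item.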
3.2 Composition
A software tool must seamlessly participate in workflow composition and interoperate with other system tools. Scientists must be able to connect one tool to another to create a workflow pipeline using the supported workflow language of the SWfMS. Two critical factors for composition are seamless input and output connections and seamless integration into the workflow language system, which we briefly explain below:
3.2.1 Bridging Inputs and Outputs
For use in a workflow, the inputs and outputs of a tool must be computer-addressable, for example through a standard I/O stream, the file system, an HTTP packet [5], or a database system. A standalone software tool may generate output in unexpected locations or require input from sources that are inaccessible or unknown to other tools in the workflow pipeline, making it challenging to integrate into a SWfMS. Besides, SWfMSs may have their own data storage mechanism, which makes it a significant challenge to redirect a tool’s inputs and outputs so that the SWfMS can process them, and vice versa.
3.2.2 Plugging into Workflow Language System
SWfMSs offer a graphical or textual language for workflow composition. With the visual languages of Galaxy [14], Kepler [2], VisTrails [7], Workspace [9], and SciWorCS [29], among others, users can create workflows by dragging and dropping graphical elements onto a canvas and connecting their inputs and outputs. On the other hand, the textual languages of VizSciFlow [18], Bpipe [33], QIIME 2 [8], and others allow users to specify workflows as text code, where inputs and outputs are connected through variable names. The registration process must extend the language system so the new tools can be composed into workflow models.
3.3 Execution
Execution of a workflow is a critical phase in a scientific experiment and poses the most challenges for integrating an external tool into the workflow system. Depending on the operations a tool performs, the technology used for its development, and its target execution environment, myriad difficulties may arise during its execution and runtime interaction with other tools. The dependence on the development languages of the target SWfMS is also a big concern. Our research and case studies focus on the challenges associated with the configuration, interpretation, adaptation, and execution models of Python-based SWfMS. However, it is important to note that tools can be programmed in any language. The challenges of tool integration for workflow execution are outlined below:
3.3.1 Function
A function or method is part of a larger program and cannot operate independently. Tool developers may want to integrate a function developed in their preferred language into the system. Integration is straightforward if the function is developed in the same language as the workflow system. But if the tool and SWfMS are programmed in different languages, it becomes very challenging to seamlessly integrate the function into the type system of the workflow language.
3.3.2 Program
A tool program can run independently on its target runtime environment. A SWfMS can spawn a child or a completely independent process for the execution of a tool program. The complexity of executing a program tool and communicating with it can increase drastically if the tool and the SWfMS target different runtime environments. Some tools cannot be integrated without taking extreme measures such as virtualization, containerization, and service-oriented deployment. Even a program tool and a SWfMS that target the same execution environment may pose integration challenges if they are developed using different programming languages, libraries, or frameworks.
3.3.3 Shell Script
The shell script is a common technique for running external tools from the command line. Many software tools offer package configuration options in a shell script to run them. The script and the software tool run in isolated processes. Since there is no direct connection between the SWfMS and the tool, the SWfMS must pass arguments in a form that the shell script can understand and forward to the external tool, and the script must finally return the execution results to the SWfMS.
3.3.4 Library Dependency
Software tools may depend on external libraries, often shared among other software. It is a well-established way to share source code and operations with multiple tools. Tool developers may also want to upload modules from their machines. Most development languages support online or offline repositories to download and install libraries. The best-known online repository for Python libraries is PyPI (https://pypi.org/). Special care must be taken to install the right version of a library. Any wrong tool installation may destabilize the system. The version conflicts among shared libraries on Windows systems are infamously known as DLL Hell [11].
3.3.5 Virtual Environment Dependency
A tool is targeted to run on an appropriate execution environment. Java programs need a Java Virtual Machine (JVM), and Python programs typically run in a virtual environment. C++ programs may run directly on the system execution environment. For many software tools, a shared virtual environment may not be appropriate. Some of those tools may need isolated virtual environments to execute. For example, the BlastToRaw, BlastZ, and GeneOrder tools of CoGe’s SynMap [16,26] need a Python 2.7 virtual environment, while the DagFormatter tool needs a Python 3.x virtual environment. It is a big challenge to accommodate customized virtual environments of multiple language systems in a SWfMS.
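One way to picture the process isolation discussed in Sects. 3.3.3 and 3.3.5 is to launch each tool as a child process under the interpreter of its own environment. The sketch below uses only Python's standard `subprocess` module; the function name `run_tool` and the example interpreter path in the comment are hypothetical, and this is not the mechanism used by VizSciFlow.

```python
import subprocess
import sys


def run_tool(interpreter, script, args, cwd=None, timeout=60):
    """Run `script` under `interpreter` in a child process and return its stdout.

    `interpreter` would be the Python executable of the tool's own virtual
    environment, e.g. a path such as /opt/envs/py27/bin/python (hypothetical).
    """
    completed = subprocess.run(
        [interpreter, script, *args],   # arguments are forwarded positionally
        cwd=cwd,                        # optional working directory for the tool
        capture_output=True,            # collect stdout/stderr for the caller
        text=True,
        timeout=timeout,
    )
    if completed.returncode != 0:
        raise RuntimeError(f"{script} failed: {completed.stderr.strip()}")
    return completed.stdout
```

Passing `sys.executable` as the interpreter runs the tool in the caller's own environment, which is convenient for testing; a tool that needs a different Python version would receive the path of that environment's interpreter instead.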
3.3.6 Different Development Languages for SWfMS and Tool
Most general-purpose languages have standard and documented ways to quickly hook one tool into another if both are programmed in the same language. For example, a Python program can use the standard runpy module to locate and run Python modules. The situation becomes more complex when the SWfMS and the software tool are developed using different languages. One big challenge is to pass the parameters in the proper order and return the result to the workflow system correctly. The input parameters and output results must be computer-addressable [5].
3.3.7 System Configuration Dependency
Some tools need unique system configurations to run. For example, NiCad [32] must run in its executable directory. Other tools may need to run in the data source directory. It is essential to decide carefully which configurations end-users can change and which they cannot. There are security concerns in allowing end-users to change sensitive system configurations.
3.3.8 Containerization and Virtualization
Some software tools may not execute on the same runtime infrastructure on which the SWfMS runs. Sometimes, a software tool may only be compatible with a specific operating system or architecture, while the SWfMS is on a different platform. The interoperability challenges between isolated execution environments affect the reliability and performance of data and resource sharing. Sharing in-memory data from one environment to another may cause serialization overhead, data format incompatibility, data loss and corruption, and concurrency and synchronization problems.
3.3.9 Web Service Tool
Many tools are available online as web services, each with unique input and output parameter types that impact their usability in a SWfMS. The National Center for Biotechnology Information (NCBI) provides a comprehensive collection of databases, tools, and resources for biotechnology and biomedicine research. Two prominent bioinformatics web service tools of NCBI are BLAST (Basic Local Alignment Search Tool), used to search for similar sequences in large biological databases, and Entrez, a search engine for the NCBI databases. The technology of each web service dictates the specific format in which it accepts parameters. XML-based SOAP and REST, typically with JSON payloads, are prominent approaches to data exchange in web services. However, transferring large amounts of data to and from a web service in data-intensive applications can have a negative impact on workflow performance.
3.3.10 Custom Installation
Custom installation is the process of installing software in a way that is tailored to the specific needs and requirements of the user. A custom installation procedure is necessary for many software tools. The procedure is usually interactive and needs administrative access rights to the system. It is extremely difficult, if not impossible, to allow end-users to install these tools without compromising security.
3.3.11 Source Compilation
Some tools must be compiled from source code before installation. Like custom installation, source compilation is often interactive and requires administrative access rights. It is quite challenging to let end-users integrate such software tools reliably in a SWfMS.
3.3.12 System Repository Dependency
A tool may depend on a particular system repository or a different version of an installed repository, which must be installed for the execution of the tool. Only the administrator or superuser of a system can update the repository. Allowing end-users to add, remove, or update a system repository can introduce security vulnerabilities and may result in compatibility issues if the wrong version of a repository is installed.
3.4 Provenance
SWfMSs systematically capture provenance information, which helps to recompute, reproduce, and validate the result of a scientific experiment. All tools of a SWfMS participate in provenance capture by providing necessary information about the origin, quality, authorship, and evolution of data and processes [10]. The information captured by the provenance system of a SWfMS is system-specific. It can be captured at different levels (activity, workflow, and operating system), in different structures (internal and external), at different granularities (fine-grained and coarse-grained), with different orientations (data and process), and so on [10]. Provenance management is already very challenging [1]. Because there is little or no direct connection between the external tool and the workflow system, tuning the type of information to capture is even more challenging. A SWfMS may need information that the software tool does not generate. Besides, the provenance system cannot capture information that the tool does not write to computer-addressable and documented locations.
Answer to RQ1
RQ1: What are the challenges of supporting extensibility in Scientific Workflow Management Systems?
Answer: The challenges of extensibility in SWfMS by tool integration can be grouped by the registration, composition, execution, and provenance phases. Tools must be registered and configured correctly to appear in the Tools Catalog and execute properly. The main challenges of extensibility occur during the execution of the tool and depend on the development language of the tool, the execution environment, and library and system repository dependencies. The workflow system has less control over the type, location, and amount of provenance data that an external tool generates.
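To make the provenance challenge of Sect. 3.4 concrete, the following sketch shows what a workflow system can still record when a tool exposes nothing beyond computer-addressable files: which files went in, which file came out, and when. The function names and the record layout are assumptions chosen for illustration and do not reflect the provenance model of any particular SWfMS.

```python
import hashlib
import time
from pathlib import Path


def file_digest(path):
    """SHA-256 digest of a file's contents, used to identify a data version."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_with_provenance(tool_name, func, inputs, output):
    """Call func(inputs, output) and return a provenance record for the call.

    Only information visible from outside the tool is recorded; anything the
    tool keeps internal cannot be captured this way.
    """
    started = time.time()
    t0 = time.perf_counter()
    func(inputs, output)
    duration = time.perf_counter() - t0
    return {
        "tool": tool_name,
        "inputs": {str(p): file_digest(p) for p in inputs},
        "output": {str(output): file_digest(output)},
        "started": started,
        "duration_s": duration,
    }
```

The returned dictionary contains only strings and numbers, so it can be serialized (for example as JSON) and stored alongside the workflow run.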
4 Proposed Solutions
We studied the tool integration mechanisms of popular and state-of-the-art scientific workflow management systems such as Galaxy [14], Kepler [2], VisTrails [7], VizSciFlow [18], and Workspace [9]. Galaxy introduced an AppStore-style storage for tool installation called Toolshed [4]. The tool developer submits a new tool to the Toolshed for approval. Once the tool is approved, a user with administrative access rights can add it from the Toolshed. Kepler, VisTrails, and Workspace have desktop-based GUIs, and tool developers have direct access to the system. Kepler and VisTrails each provide a subdirectory for new tools programmed in Java and Python, respectively, and load these tools at startup. Workspace provides a dialog to load a C++ tool. VizSciFlow is a web-based system; therefore, end-users do not have direct access to the system. We extended the VizSciFlow system so that end-users can integrate many tools.
VizSciFlow provides a textual domain-specific language (DSL) and a graphical IDE for composing workflows. The DSL provides minimal programming features and a limited vocabulary specific to a scientist’s domain knowledge, to increase productivity while decreasing the cognitive burden and reducing maintenance effort [18]. Integrating a new tool extends the vocabulary of the DSL. The core of the VizSciFlow system is written in Python. We created more than 50 workflows in the Bioinformatics, Image Processing, and Software Analytics domains to study the challenges of tool integration in VizSciFlow.
This section briefly describes the solutions we incorporated in VizSciFlow to address the extensibility challenges. VizSciFlow uses a service-oriented approach to extend its tool support by consuming external tools. Its plugin architecture allows new tools to be integrated on the fly, after which they are immediately available for workflow composition and execution.
We extended the VizSciFlow framework [18] to accommodate the mitigation measures for the extensibility challenges. A conceptual architecture of the new extensibility framework is depicted in Fig. 1. The Mapper component integrates a tool into the workflow system by configuring its input and output parameters and adding its definition to the Tools Catalog. It extends the workflow language with the tool’s signature, preparing the tool for workflow composition. The external tool definition is mapped to an internal representation using a JSON-based data structure. The Mapper also sets up the required virtual or container environment to ensure proper execution of the tool. The Composer component makes tools readily accessible for workflow composition by retrieving them from the Tools Catalog. VizSciFlow provides a graphical interface for setting the input and output parameters of a tool and inserting the tool into the workflow editor as a DSL code snippet. The Adapter component mediates tool execution by preparing the necessary input parameters and passing them to the tool. It loads the Tool Module and submits it to the runtime infrastructure, then retrieves the output and forwards it to the connected tools. The adapter is programmed in Python and dynamically loaded into the VizSciFlow execution model. The Tool Manager coordinates the actions of all components to ensure efficient tool integration and execution.
Fig. 1. Extensibility Framework of VizSciFlow
4.1 Registration
The registration phase maps the tool definition and its parameters to the internal representation of the workflow system, extends the language system with tool information, and configures the runtime environment for tool execution. The main challenge here is to coordinate the frontend and backend operations effectively. This requires a good understanding of the tool, its parameters, and its dependencies, as well as a thorough knowledge of the SWfMS and its architecture. The registration process must prepare the system so that the composition mechanism can connect the tool to its neighboring tools during workflow modeling and execution, load the tool into the appropriate runtime infrastructure, pass inputs from its neighbors, execute the tool, and finally pass its outputs to the other neighboring tools.
Different SWfMSs use different approaches to register new tools. Galaxy has an AppStore-like toolshed for tool storage, and a user with administrative access rights can install a tool from it into the system [4]. In desktop-based systems such as Kepler [2] and VisTrails [7], tools are copied into a subfolder of the installation folder and loaded as plugins. All of these approaches need administrative access rights to the workflow system.
VizSciFlow uses a JSON mapper to associate the services provided by tools or modules with the DSL vocabulary. An adapter written in Python mediates between the workflow runtime and the actual tool execution. A user interface dialog provides a JSON code editor for the mapping and a Python code editor for adapting an external tool to run with the internal system. The mapper and adapter are shown in Fig. 2 and Fig. 3, respectively. Users can add Python packages from the PyPI repository and upload local packages to the system.
Fig. 2. Mapper Interface of VizSciFlow Tool Registration
Fig. 3. Adapter Interface of VizSciFlow Tool Registration
Listing 1.1 shows an example mapping of the FastQC tool to the DSL. FastQC is a Java program used in Bioinformatics to check the quality of a genome sequence file. It takes a FASTQ file as input and generates an HTML file with the quality information of the FASTQ file and a ZIP file with all generated files. The mapper adds a new service, CheckQuality, to the DSL vocabulary and internally binds it to an adapter method named run_fastqc, which checks the arguments and forwards the call to the FastQC tool. An adapter class associated with each service is responsible for unpacking the arguments, running the tool with those arguments, and fetching the results. A compressed ZIP folder that contains the FastQC program is uploaded using the “Add Module” dialog.
{
  "name": "CheckQuality",
  "internal": "run_fastqc",
  "desc": "Measures quality of a FASTQ file",
  "params": [
    { "name": "input", "type": "file" }
  ],
  "returns": [
    { "name": "html", "type": "file" },
    { "name": "zip", "type": "file" }
  ]
}
Listing 1.1. VizSciFlow JSON Mapper for Tool Integration
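To make the binding between a mapper entry and its adapter concrete, the following minimal Python sketch registers the mapping of Listing 1.1 in a toy catalog and dispatches a call through it. It is only an illustration under stated assumptions: the ToolsCatalog class, its register and call methods, and the output file naming inside run_fastqc are hypothetical and are not taken from the VizSciFlow code base, and the adapter stub does not start FastQC.

```python
import json

# Mapper definition copied from Listing 1.1.
MAPPER_JSON = """
{
  "name": "CheckQuality",
  "internal": "run_fastqc",
  "desc": "Measures quality of a FASTQ file",
  "params": [{"name": "input", "type": "file"}],
  "returns": [{"name": "html", "type": "file"},
              {"name": "zip", "type": "file"}]
}
"""


def run_fastqc(input_path):
    """Hypothetical adapter stub; a real adapter would launch FastQC here."""
    # Placeholder naming only: no external program is started.
    stem = input_path.rsplit(".", 1)[0]
    return stem + "_fastqc.html", stem + "_fastqc.zip"


class ToolsCatalog:
    """Hypothetical catalog that maps DSL service names to adapter callables."""

    def __init__(self):
        self._services = {}

    def register(self, mapper_text, adapters):
        spec = json.loads(mapper_text)
        # "internal" names the adapter function that backs the DSL service.
        spec["callable"] = adapters[spec["internal"]]
        self._services[spec["name"]] = spec
        return spec["name"]

    def call(self, service, *args):
        spec = self._services[service]
        if len(args) != len(spec["params"]):
            raise TypeError(
                f"{service} expects {len(spec['params'])} argument(s)")
        results = spec["callable"](*args)
        # Pair each returned value with the output name declared in "returns".
        return dict(zip((r["name"] for r in spec["returns"]), results))
```

With such a catalog, a call to the DSL service CheckQuality resolves to run_fastqc, and the declared output names (html and zip) label the returned values so that neighboring tools could refer to them by name.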
4.2 Composition
In a graphical SWfMS, the tool manager lists the newly integrated tool in a searchable Tools Catalog. In visual workflow languages, users drag the tool from the catalog and drop it onto the workflow canvas during composition. They can then connect the tool with its neighbors. In VizSciFlow, users drag the tool from the catalog onto the code editor, which generates a corresponding DSL snippet in the editor. They can then write code to connect the input and output arguments with the neighboring tools.
4.3 Execution
The execution phase is the most critical and challenging part of tool integration. When the execution system encounters an integrated software tool, the tool manager finds the tool in the Tools Catalog, loads the module, and instantiates the tool’s adapter, which executes the external tool and collects the output and other provenance information. In doing so, it must overcome the challenges of tool integration.
4.3.1 Function
If the tool function and the SWfMS are developed in the same language, integrating the tool function is easy. In VizSciFlow, the tool developer can write the function code in the adapter file and call it from the adapter’s internal function. The outputs are simply the return values of the function. Python can also run a string of Python code using the exec function.
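The two in-language options mentioned for function tools, a direct call and exec on a code string, can be sketched as follows. The helper names (word_count, run_function_tool, run_code_string) are hypothetical and chosen only for this illustration; they are not part of VizSciFlow.

```python
def word_count(text):
    """Example tool function written directly in the adapter file."""
    return len(text.split())


def run_function_tool(func, *args):
    """Call an in-language tool; its return value is the tool's output."""
    return func(*args)


def run_code_string(source, func_name, *args):
    """Define a function from a string of Python code with exec, then call it."""
    namespace = {}
    # exec runs arbitrary code, so it should only see trusted input.
    exec(source, namespace)
    return namespace[func_name](*args)
```

Because exec executes whatever it is given, a system that accepts code strings from end-users would need additional safeguards that this sketch does not attempt.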
4.3.2 Program
Python has several constructs for executing an external Python script. The run_path function of the runpy module runs a Python script in the current process. The subprocess module, on the other hand, provides several functions, such as Popen, call, check_output, and run, that spawn a new process to run an external program. The inputs are passed as parameters to these function calls. The output depends on the program itself: it can write to the standard output stream (stdout), a file, or a database.
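As a minimal sketch of the two mechanisms named above, the code below runs the same small script once in the current process with runpy.run_path and once in a child process with subprocess.run. The script contents, the file name add.py, and the helper names are assumptions made for this example.

```python
import os
import runpy
import subprocess
import sys

# A tiny stand-in for an external Python program: sums its arguments.
SCRIPT = (
    "import sys\n"
    "result = sum(int(a) for a in sys.argv[1:])\n"
    "print(result)\n"
)


def write_script(directory):
    """Write the stand-in program to disk and return its path."""
    path = os.path.join(directory, "add.py")
    with open(path, "w") as handle:
        handle.write(SCRIPT)
    return path


def run_in_process(script_path, args):
    """Run the script in the current process and read a global it defines."""
    saved_argv = sys.argv
    sys.argv = [script_path, *args]
    try:
        # run_path returns the globals of the executed script.
        return runpy.run_path(script_path, run_name="__main__")["result"]
    finally:
        sys.argv = saved_argv


def run_in_subprocess(script_path, args):
    """Run the script in a new process and return what it wrote to stdout."""
    done = subprocess.run([sys.executable, script_path, *args],
                          capture_output=True, text=True, check=True)
    return done.stdout.strip()
```

The in-process variant gives direct access to Python objects, whereas the subprocess variant only sees text written to stdout, which matches the remark that the output depends on the program itself.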
4.3.3 Shell Script
A shell script calls one or more programs or scripts. VizSciFlow can run a shell script using the functions of the subprocess module. The input parameters are first passed to the shell script, which then forwards them to the target program. As with programs, the outputs of a shell script depend on the target program or script.
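The parameter forwarding described above can be illustrated with a short sketch. The wrapper script below is hypothetical: it simply forwards every argument it receives to the target command, and the Python helper runs it through subprocess. A POSIX sh on the PATH is assumed.

```python
import os
import shutil
import subprocess
import tempfile

# Hypothetical wrapper script: forwards all of its arguments to the target.
WRAPPER = '#!/bin/sh\nexec "$@"\n'


def run_via_shell_script(target_command, shell="sh"):
    """Write the wrapper to a temporary file and run the target through it."""
    if shutil.which(shell) is None:
        raise FileNotFoundError(f"shell not found: {shell}")
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "wrapper.sh")
        with open(script, "w") as handle:
            handle.write(WRAPPER)
        # The inputs go to the shell script first, which passes them on.
        done = subprocess.run([shell, script, *target_command],
                              capture_output=True, text=True, check=True)
    return done.stdout
```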
4.3.4 Library Dependency
Python programs or modules often depend on external Python packages. In VizSciFlow, tool developers can specify the required Python packages in the “Add Module” dialog by adding their pip package names. They can also upload a requirements file for pip install. The packages are installed during the registration process and used during execution. Tool developers can also upload the module itself, as a single file or a compressed ZIP folder, in the “Add Module” dialog.
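One way a registration step could turn the package names or requirements file into an installation command is sketched below. The function name pip_install_command is hypothetical; the sketch only builds the argument list and does not install anything, and how VizSciFlow actually invokes pip is not stated in the text.

```python
import sys


def pip_install_command(packages=(), requirements_file=None,
                        python=sys.executable):
    """Build (but do not run) a pip install command for one interpreter."""
    command = [python, "-m", "pip", "install"]
    if requirements_file:
        # -r tells pip to read the package list from a requirements file.
        command += ["-r", requirements_file]
    command += list(packages)
    if len(command) == 4:
        raise ValueError("nothing to install")
    return command
```

Passing the interpreter explicitly means the same helper could target a tool-specific virtual environment; the resulting list could be handed to subprocess.run with check=True.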
4.3.5 Virtual Environment Dependency
Python programs often need a dedicated virtual environment to run. The default virtual environment of the VizSciFlow system may not be appropriate for some software tools. We extended VizSciFlow so that tool developers can create a specific virtual environment (e.g., Python 2.7), install the required packages, activate the environment, and run a Python program. They can select the appropriate virtual environment in the adapter code. During execution, VizSciFlow creates a shell script as shown in Listing 1.2.
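Listing 1.2. Bash Script to Run a Tool in Virtual Environment
#!/bin/bash

# Activate the virtual environment given as the first argument.
source "$1"
# Drop that argument so "$@" holds the program and its parameters.
shift
python "$@"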
The location of the virtual environment is passed as the first parameter to this script, followed by the Python program and then the rest of the input parameters of the program.
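The argument order just described can be sketched from the adapter side as follows. This is an illustration under assumptions: the script name run_in_venv.sh, the helper names, and the use of the POSIX layout bin/activate inside the environment directory are choices made for this example, since the text does not say which file the first parameter points to.

```python
import os

# Same logic as the script in Listing 1.2.
RUNNER_SCRIPT = '#!/bin/bash\n\nsource "$1"\nshift\npython "$@"\n'


def write_runner(directory):
    """Write the runner script to disk and make it executable."""
    path = os.path.join(directory, "run_in_venv.sh")
    with open(path, "w") as handle:
        handle.write(RUNNER_SCRIPT)
    os.chmod(path, 0o755)
    return path


def venv_command(runner, venv_dir, program, *args):
    """Order the arguments: activation script, program, program parameters."""
    # Assumption: the environment follows the POSIX venv layout.
    activate = os.path.join(venv_dir, "bin", "activate")
    return ["bash", runner, activate, program, *args]
```

The returned list could be passed to subprocess.run; keeping command construction separate from execution makes the argument order easy to check without a real environment.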
4.3.6 System Configuration Dependency
Tool developers can be allowed to make simple system configuration changes that pose no security risk. For example, NiCad [32] needs to run from its “bin” subdirectory, so we set the current working directory to this folder. End-users should not be allowed to change properties that may pose security risks to the SWfMS machine.
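Setting the working directory for a single tool run, as in the NiCad example, does not require changing the directory of the whole workflow system. A minimal sketch using the cwd parameter of subprocess.run follows; the helper name run_in_directory is hypothetical.

```python
import subprocess


def run_in_directory(command, working_dir):
    """Run a command with its working directory set for that process only."""
    # cwd affects the child process; the caller's directory is unchanged.
    done = subprocess.run(command, cwd=working_dir,
                          capture_output=True, text=True, check=True)
    return done.stdout
```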
4.3.7 Containerization and Virtualization
Containers and virtual machines provide an isolated execution environment for a tool. Setting up containers and virtual machines without administrative access rights is complex. Tool developers can use existing preconfigured Docker containers [31] in VizSciFlow. For a custom container that meets their requirements, they must ask the administrator to provide one.
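When a preconfigured container is already running, an adapter can reach a tool inside it through the docker exec command. The sketch below only assembles such a command line; the function name and the container name used in the usage note are hypothetical, and nothing is executed.

```python
def docker_exec_command(container, command, workdir=None):
    """Build a `docker exec` argument list for a running container."""
    argv = ["docker", "exec"]
    if workdir:
        # --workdir sets the working directory inside the container.
        argv += ["--workdir", workdir]
    argv.append(container)
    argv += list(command)
    return argv
```

For instance, docker_exec_command("clone-validator", ["python", "validate.py"]) yields a list that could be passed to subprocess.run on a host where Docker is installed and the named container exists.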
4.3.8 Web Service Tool
Web service tools provide Application Programming Interfaces (APIs) over standard protocols such as REST and SOAP. Tool developers can use the Python requests module and follow the API specification to call an external web service. Some web services also provide Python modules that hide the low-level details of connection handling, parameter passing, and output retrieval. VizSciFlow uses the BioBlend package [35] to interact with Galaxy servers for data access and tool execution.
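A call to a REST-style tool can be sketched as follows. To keep the example free of third-party dependencies, it uses the standard-library urllib instead of the requests module mentioned above; the endpoint path, payload fields, and function names are hypothetical.

```python
import json
import urllib.request


def build_request(base_url, endpoint, payload):
    """Prepare a JSON POST request for a (hypothetical) tool endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/" + endpoint.lstrip("/"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_service(request, timeout=30):
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending lets an adapter validate parameters before any network traffic occurs.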
4.3.9 Custom Installation
Since custom installation is often interactive and needs administrative access rights, allowing end-users to install these tools without compromising security is challenging. Tool developers must contact an administrator to install these tools.
4.3.10 Source Compilation
Like custom installation, source compilation needs administrative access rights. Hence, end-users should not be allowed to integrate these tools themselves.
4.3.11 System Repository Dependency
A tool may depend on certain system repositories, or on a specific version of an already installed repository, that must be installed before the tool can execute. Debian-based Linux operating systems provide the Advanced Packaging Tool (APT) to install system repositories, but superuser access rights are required to do so. Therefore, end-users should ask the system administrator to install the required repositories.
4.4 Provenance
SWfMSs track the lineage of workflows and their data products using the provenance mechanism. Although workflow-specific provenance is captured automatically by the provenance system, tool-specific provenance information of an external software tool cannot be captured unless the tool is specifically prepared for this. VizSciFlow captures the standard output (stdout) and standard error (stderr) streams during the execution of the tool. An IDE-like composition and visualization environment was developed for VizSciFlow to compose provenance queries intuitively and visualize the results [19].
Answer to RQ2
RQ2: What measures can SWfMS take to address those challenges?
Answer: The developers of a SWfMS can provide a user interface to automate tool integration. In Python-based SWfMSs, Python modules and programs can run in the system environment or in an isolated virtual environment. Tools written in other languages can also be adapted in Python-based SWfMSs. Isolated execution environments in containers or virtual machines may be necessary for some tools. There are still some tools that only administrators can integrate into a SWfMS.
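The stdout and stderr capture mentioned in Sect. 4.4 could look roughly like the sketch below. The record layout (command, returncode, stdout, stderr, duration_s) is an assumption made for illustration and is not the VizSciFlow provenance schema.

```python
import subprocess
import time


def run_with_provenance(command):
    """Run a tool and return a small provenance record for the run."""
    started = time.time()
    # check is left off so that failing runs are recorded too.
    done = subprocess.run(command, capture_output=True, text=True)
    return {
        "command": list(command),
        "returncode": done.returncode,
        "stdout": done.stdout,
        "stderr": done.stderr,
        "duration_s": time.time() - started,
    }
```

Anything a tool writes elsewhere, such as log files in undocumented locations, stays invisible to this kind of capture, which is the limitation noted in the text.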
5 Case Studies
We investigated the challenges of tool integration by creating tools in the Image Processing, Bioinformatics, and Software Analytics domains. In particular, we conducted two case studies in VizSciFlow to discover the challenges and to take mitigation measures for adapting external software tools to the VizSciFlow system. Our first case study created CoGe’s SynMap workflow [16, 26] in VizSciFlow. It generates a dot plot between two organisms to show syntenic regions of DNA sequences. Fifteen independently developed tools were used in this workflow. These tools have diverse execution requirements, including Python 3.x and Python 2.7 virtual environments, Bash shell scripts, and Perl scripts. The challenges of integrating these tools are listed in Table 1. The solutions proposed in Sect. 4 were applied to address these challenges.
Table 1. Challenges of CoGe’s SynMap Workflow in VizSciFlow
Challenges | Tools
Function | Blastz, Last, ProcessDups
Program | BlastToRaw, Last, DAGFormatter, GeneOrder, DagChainer
Shell Script | GevoLink
Library Dependency | NumPy (ProcessDups), dotplot.py (Dot Plot)
Virtual Environment Dependency | Python 2 (BlastToRaw, Last, GeneOrder, DagChainer), Python 3 (DAGFormatter)
Different SWfMS and Tool Languages | Perl (Blast2Bed, GevoLink)
System Configuration Dependency | Set current directory (All tools)
In the second case study, we created a workflow for source code clone detection and validation using the NiCad [32], CloneCognition [27], and ANNCloneValidation [28] software. The NiCad clone detection tool provides several Bash scripts that make it easy to pass configuration parameters during execution. CloneCognition and ANNCloneValidation were developed using an older version of Python (Python 2.7), and they depend on several libraries that are incompatible with VizSciFlow. We used Docker containers to isolate their execution from the VizSciFlow core system. Table 2 lists the challenges of integrating these tools in VizSciFlow.
Table 2. Challenges of Clone Detection and Validation Workflow in VizSciFlow
Challenges | Tools
Function | Extract, FindClonePairs, ClusterPairs
Program |
Shell Script | Extract, Rename, Filter, Abstract, Normalize, FindClonePairs, GetSource, MakePairHTML, ClusterPairs
Library Dependency | TXL, pathlib
Virtual Environment Dependency | Python 2 (CloneCognition, ANNCloneValidation)
Different SWfMS and Tool Languages | Java
System Configuration Dependency | Extract, Rename, Filter, Abstract, Normalize, FindClonePairs, GetSource, MakePairHTML, ClusterPairs
Containerization and Virtualization | Docker (CloneCognition, ANNCloneValidation)
Web Service Tool | CloneCognition
6 Discussion
We integrated all of the tools from the SynMap and NiCad case studies using the solutions proposed in Sect. 4. The shell scripts and Python programs were adapted to the external tools. SynMap has many Perl scripts, which the tools’ adapters execute in VizSciFlow. Some of the tools need their own Python virtual environment to run. Tool developers can create a virtual environment using the “Add Module” dialog and run the Python program in that environment from the adapter. All tools generate their outputs in files or folders, which can easily be connected to neighboring tools. The standard output (stdout) and standard error (stderr) streams are captured and saved in the provenance repository. NiCad requires that FreeTxl (current version 10.8b) be installed in the system. If FreeTxl does not exist in the system, it must be integrated into VizSciFlow first.
CloneCognition and ANNCloneValidation need special care during tool registration and execution. These tools are programmed in an earlier version of Python (Python 2.7). CloneCognition is a web-based interactive graphical tool, so it cannot be integrated directly into VizSciFlow as a tool. We deployed it as a RESTful web service and then used its REST interface from VizSciFlow to run the tool. We asked the administrator of the VizSciFlow system to create a Docker container for ANNCloneValidation. We then used the docker exec command in a Bash script to run the tool from VizSciFlow. A volume links the data storage of VizSciFlow and ANNCloneValidation for quick access to their datasets.
Most tools integrated into VizSciFlow in our case studies are portable. Portable software is easier to integrate as a tool because it is designed to run directly, without installation and without modification of the host environment. However, plenty of tools are not designed to be portable. To integrate such tools, the system must either bypass its own security measures or provide an isolated environment for tool installation. Bypassing the security measures of a system may have severe consequences. An approach that automatically generates an isolated environment in a container or virtual machine may help tool developers select an appropriate execution environment for integrating non-portable tools into a scientific workflow management system.
Answer to RQ3
RQ3: Can the measures for addressing the extensibility challenges enable the successful composition and execution of real-world workflows?
Answer: Through our case studies, we demonstrated the practicality of our proposed solutions for addressing extensibility challenges. We created two real-world workflows from tools that we integrated using the measures outlined in Sect. 4. These results suggest that the proposed solutions can be applied effectively to integrate tools and derive real-world workflows in a scientific workflow management system.
7 Future Direction
The rapidly increasing number of independently developed software tools presents significant challenges for SWfMS and tool developers in tool integration. More research is needed to address the security risks of extensibility, including developing secure adapter code and restricting access to unauthorized datasets. One promising approach to secure tool operation is to allocate a Docker container to each user or tool. However, Docker containers are resource intensive. One Docker alternative, Singularity [20], has recently attracted much attention because it is optimized for High-Performance Computing (HPC) workloads and offers better security than Docker containers [3]. Running many containers with high workloads causes performance issues due to resource constraints. In future work, we will evaluate whether allocating a single container to a group of users or tools has performance or security benefits. More research is also needed to improve the usability of tool integration through on-demand, automatic container generation by end-users.
8 Conclusion
The extensibility of SWfMSs is challenging because tools and workflow systems differ in their development languages and have incompatible execution platforms. These extensibility challenges must be addressed for a system to be flexible and usable. In this paper, we discussed the extensibility challenges of the VizSciFlow SWfMS by creating two real-world workflows and demonstrated that those challenges can be addressed effectively for portable tools. Tools that need administrative access rights for installation cannot be integrated into a SWfMS directly by end-users. Allowing end-users to integrate these tools may expose the workflow system to security vulnerabilities. An isolated execution environment in a container can provide enhanced security for tool execution. More research is necessary in this area.
References
1. Alam, K., Roy, B.: Challenges of provenance in scientific workflow management systems. In: 2022 IEEE/ACM Workshop on Workflows in Support of Large-Scale Science (WORKS), pp. 10–18 (2022)
2. Altintas, I., Berkley, C., Jaeger, E., Jones, M., Ludascher, B., Mock, S.: Kepler: an extensible system for design and execution of scientific workflows. In: Proceedings of the 16th International Conference on Scientific and Statistical Database Management, p. 423- (2004). https://doi.org/10.1109/SSDBM.2004.44
3. Arango, C., Dernat, R., Sanabria, J.: Performance evaluation of container-based virtualization for high performance computing environments. ArXiv Preprint arXiv:1709.10140 (2017)
4. Blankenberg, D., et al.: Dissemination of scientific software with Galaxy ToolShed. Genome Biol. 15, 1–3 (2014)
5. Brack, P., et al.: Ten Simple Rules For Making a Software Tool Workflow-ready. Public Library of Science, San Francisco (2022)
6. Brazas, M.D., Yim, D., Yeung, W., Ouellette, B.F.: A decade of web server updates at the bioinformatics links directory: 2003–2012. Nucleic Acids Res. 40, W3–W12 (2012)
7. Callahan, S., Freire, J., Santos, E., Scheidegger, C., Silva, C., Vo, H.: VisTrails: visualization meets data management. In: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pp. 745–747 (2006)
8. Caporaso, J., et al.: QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335 (2010)
9. Cleary, P., Bolger, M., Hetherton, L., Rucinski, C., Thomas, D., Watkins, D.: Workspace: a platform for Delivering Scientific Applications. In: Proceedings EResearch (2014)
10. Cruz, S., Campos, M., Mattoso, M.: Towards a taxonomy of provenance in scientific workflow management systems. In: 2009 Congress on Services-I, pp. 259–266 (2009)
11. Eisenbach, S., Jurisic, V., Sadler, C.: Feeling the way through DLL Hell. In: Proceedings of the First Workshop on Unanticipated Software Evolution (USE 2002), Malaga, Spain (2002)
12. Fillbrunn, A., Dietz, C., Pfeuffer, J., Rahn, R., Landrum, G., Berthold, M.: KNIME for reproducible cross-domain analysis of life science data. J. Biotechnol. 261, 149–156 (2017)
13. Fox, J.A., Butland, S.L., McMillan, S., Campbell, G., Ouellette, B.F.: The bioinformatics links directory: a compilation of molecular biology web servers. Nucleic Acids Res. 33, W3–W24 (2005)
14. Giardine, B., et al.: Galaxy: a platform for interactive large-scale genome analysis. Genome Res. 15, 1451–1455 (2005)
15. Gil, Y., et al.: Examining the challenges of scientific workflows. Computer 40, 24–32 (2007)
16. Haug-Baltzell, A., Stephens, S., Davey, S., Scheidegger, C., Lyons, E.: SynMap2 and SynMap3D: web-based whole-genome synteny browsers. Bioinformatics 33, 2197–2198 (2017). https://doi.org/10.1093/bioinformatics/btx144
17. Heinl, P., Horn, S., Jablonski, S., Neeb, J., Stein, K., Teschke, M.A.: Comprehensive approach to flexibility in workflow management systems. In: Proceedings of the International Joint Conference on Work Activities Coordination and Collaboration, pp. 79–88 (1999). https://doi.org/10.1145/295665.295675
18. Hossain, M.M., Roy, B., Roy, C.K., Schneider, K.A.: VizSciFlow: a visually guided scripting framework for supporting complex scientific data analysis. Proc. ACM Human-Comput. Interact. 4, 1–37 (2020)
19. Hossain, M., Roy, B., Roy, C., Schneider, K.: A domain-specific composition environment for provenance query of scientific workflows. In: 2022 IEEE/ACM Workshop on Workflows in Support of Large-Scale Science (WORKS), pp. 19–26 (2022)
20. Kurtzer, G., Sochat, V., Bauer, M.: Singularity: scientific containers for mobility of compute. PloS One 12, e0177459 (2017)
21. Lawrence, P.: Workflow Handbook. John Wiley and Sons Inc. (1997)
22. Lidwell, W., et al.: Universal principles of design, revised and updated: 125 ways to enhance usability, influence perception, increase appeal, make better design decisions, and teach through design. Rockport Pub (2010)
23. Lin, C., et al.: A reference architecture for scientific workflow management systems and the VIEW SOA solution. IEEE Trans. Serv. Comput. 2, 79–92 (2009)
24. Liu, J., Pacitti, E., Valduriez, P., Mattoso, M.: A survey of data-intensive scientific workflow management. J. Grid Comput. 13, 457–493 (2015)
25. Ludäscher, B., Weske, M., McPhillips, T., Bowers, S.: Scientific workflows: business as usual? In: International Conference on Business Process Management, pp. 31–47 (2009)
26. Lyons, E., Bomhoff, M., Oliver, S., Lenards, A.: Comparative Genomics of Grass Genomes using CoGe. In: Handbook of Plant (2014)
27. Mostaeen, G., et al.: CloneCognition: machine learning based code clone validation tool. In: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1105–1109 (2019)
28. Mostaeen, G., Roy, B., Roy, C., Schneider, K., Svajlenko, J.: A machine learning based framework for code clone validation. J. Syst. Softw. 169, 110686 (2020)
29. Mostaeen, G., Roy, B., Roy, C., Schneider, K.: Designing for real-time groupware systems to support complex scientific data analysis. Proc. ACM Human-Comput. Interact. 3, 1–28 (2019)
30. Pesic, M., Schonenberg, H., Aalst, W.: Declarative workflow. In: Modern Business Process Automation, pp. 175–201 (2010)
31. Rad, B., Bhatti, H., Ahmadi, M.: An introduction to docker and analysis of its performance. Int. J. Comput. Sci. Netw. Secur. (IJCSNS) 17, 228 (2017)
32. Roy, C., Cordy, J.: NICAD: accurate detection of near-miss intentional clones using flexible pretty-printing and code normalization. In: 2008 16th IEEE International Conference on Program Comprehension, pp. 172–181 (2008)
33. Sadedin, S., Pope, B., Oshlack, A.: BPIPE: a tool for running and managing bioinformatics pipelines. Bioinformatics 28, 1525–1526 (2012)
34. Schonenberg, H., Mans, R., Russell, N., Mulyar, N., Aalst, W.: Process flexibility: a survey of contemporary approaches. Adv. Enterp. Eng. I, pp. 16–30 (2008)
35. Sloggett, C., Goonasekera, N., Afgan, E.: BioBlend: automating pipeline analyses within Galaxy and CloudMan. Bioinformatics 29, 1685–1686 (2013)
36. Taschuk, M., Wilson, G.: Ten Simple Rules For Making Research Software More Robust. Public Library of Science, San Francisco (2017)
The Effect of Color on the Visual Search Efficiency of Mobile Travel Service APP in Night Mode
Junyang Hou, Xiaofan Zhou, and Zhijuan Zhu(B)
School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, People’s Republic of China
[email protected]
Abstract. Objective: To determine how color influences the visual search efficiency of mobile travel apps in night mode, by studying the effect of color on human visual search efficiency in night mode and by carrying out and evaluating a design practice on the interface of Gaode Map. Methods: Taking the visual search efficiency of each representative color in night mode as the main research object, we used the experimental paradigms of joint search and reaction-time search, wrote the experimental program in E-Prime 2.0, and obtained the physiological and behavioral data of the subjects through a reaction-time test. We then took the interface of the Gaode Map app as an example and carried out a design practice that optimizes the colors of its night mode. Results: A night mode interface was designed on the basis of the color search efficiency results and the user needs in travel service apps, and the usability of the design was confirmed by testing against the W3C standard and by a feasibility analysis. Conclusion: Applying the experimental results on color search efficiency optimizes the user experience of the travel service app.
Keywords: Industrial design · Night mode · Visual search efficiency · Electronic navigation · Mobile
1 Introduction
The interface design of an excellent map service app should not only deliver functionality but also build on the study of visual search efficiency, improving the user’s safety during use while optimizing the user experience. Therefore, this paper focuses on the effect of color on visual search efficiency in the night mode of the mobile interface. Using the experimental paradigm of reaction-time search, we control the experimental variable of icon color and collect and analyze data on the subjects’ behavioral characteristics to draw conclusions. On this basis, we carry out an optimized design of the night mode interface of a map service app.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 71–89, 2023. https://doi.org/10.1007/978-3-031-35129-7_5
2 Travel APP Night Mode Status
In this paper, the night modes of three map apps that currently hold a large share of the market, Baidu Map, Gaode Map, and Tencent Map, were selected for analysis and investigation to further understand how navigation maps are currently perceived when driving at night. The main color of all three night modes is the same ink blue, which also provides a reference for the subsequent experimental design. As for the opening method, the night mode of all three apps must be turned on manually by the user, and it can only be turned on after entering the driving navigation interface (Fig. 1).
Fig. 1. Mobile map service APP night mode interface
After analyzing the questionnaire and interview results, we summarized the following points (Fig. 2).
1. Users usually turn on night mode to relieve visual fatigue and save power.
2. Most users think that night mode is necessary.
3. The reasons for not using night mode usually center on the hassle of turning it on and on simply not knowing about it.
4. Most users think the night mode of the navigation app is inadequate and needs to be improved.
Fig. 2. Structure of questionnaire and interview questions
3 Literature Review
In human-computer interaction, visual search is a complex task that is performed all the time to obtain information, and any small reduction in visual search time may result in significant cost savings and user experience improvements [1]. In the context of driving, visual search efficiency is even more relevant to safety and has greater significance for users. Among the many factors that affect visual search efficiency, color is the most effective way to encode information in graphic symbols, and studies have shown that background color has an important impact on visual search [2]. Studies of the color relationship between graphics and background in the visual search of graphic symbols found that the higher the contrast between graphics and background, the faster the visual search [3, 4]. Moreover, color is important for object recognition, and the correct use of color improves the speed of graphic classification and recognition [5, 6]. Because color coding provides perceptual cues for graphic classification and recognition, it makes the user’s search more efficient [7].
Regarding color contrast and color combinations, Huang investigated the influence of the color relationship between an icon’s graphic and its background on visual search. The search time was longer for the white/yellow and white/blue color pairings and shorter for the black/blue and black/yellow pairings, which indicates that color pairing is an important factor in the efficiency of icon search; the search was faster, and the target easier to detect, when the contrast between graphic and background was higher [8]. The eye-movement experiments of Bhattacharyya et al. showed that the higher the color contrast between an icon’s graphic and its background, the more efficient the subjects’ search. For dark graphics on a bright background, blue and red graphics on a white background showed the best results, while for bright graphics on a dark background, white graphics on a blue or green background showed the best results [9].
J. Hou et al.
4 Study of Color Visual Search Efficiency in Night Mode

Visual search is a complex cognitive process: the act of actively searching through complicated information to pick out the information that meets one’s needs [10]. Visual search can be divided into two types, feature search and conjunction search [11]; this paper uses conjunction search. Of the two widely recognized visual search paradigms, one based on reaction time and one based on accuracy of judgment [12], this paper uses the reaction-time paradigm to investigate the effect of color on visual search efficiency in night mode.

4.1 Research Methodology

We first used a questionnaire to study the proposed problem and to clarify the novelty and necessity of the research direction. The experimental program was then written in E-Prime 2.0, and the subjects’ behavioral data were collected in a reaction-time experiment. After their validity had been verified, all data were analyzed in SPSS.

4.2 Experimental Design and Procedure

Purpose of the Experiment. The experiment has two aims: first, to investigate the effect of color on visual search efficiency in night mode; second, to investigate the effect of different color relationships on visual search efficiency in night mode. The reaction-time search paradigm is used to explore people’s sensitivity to different colors. Based on the results, design principles for the human-computer interface of mobile map service apps are proposed to optimize the user experience and improve safety of use.

Experimental Subjects. There were 39 subjects. To ensure the accuracy of the results, the subjects met the following requirements: (1) the ratio of men to women was close to 1:1; (2) they were between 18 and 50 years old; (3) all had used a mobile map service app, with a frequency of use greater than once a month; (4) they had normal vision, with no color blindness or color weakness; (5) they could carry out basic computer operations skillfully. Information about the subjects, including gender, age, and frequency of travel service app use, was collected before the experiment, and the subjects were guided through the experimental samples and procedure beforehand (Fig. 3).
Fig. 3. Basic information statistics of experimental subjects
Experimental Environment, Equipment, and Materials. Stimuli were presented on a 15.6-inch color screen with a resolution of 3840 × 2160 and a refresh rate of 120 Hz. The experimental program was written in E-Prime and ran on the Windows 11 operating system. To better simulate nighttime conditions, the experiments were conducted at night under low light (Fig. 4).
Fig. 4. Experimental environment shot
After a comprehensive survey of the night-mode colors of mobile apps currently on the market, ink blue was chosen as the base color to simulate night mode, and six iconic colors adapted for night mode were selected as the main experimental variables. To account for the interaction between colors, three kinds of visual search interface were created: “single color”, “two-color”, and “color”. This allowed the effect of color on visual search efficiency in night mode to be investigated in more detail. The experimental materials were created in Adobe Illustrator (AI) 2020 at a design size of 1000 × 1000 px. The materials were divided into four parts: a guide interface, a gaze point interface, a target cue interface, and a visual search interface.
(a) Guide interface: presents the instructions for the experiment, informing the subject of its content and procedure.
(b) Gaze point interface: a white “+” is presented in the center of the interface to fix the subject’s gaze.
(c) Target cue interface: prompts the subject with the search task.
(d) Visual search interface: the subject searches visually for the target and clicks on it (Fig. 5).
Fig. 5. Part of the experimental material
Experimental Variables. The independent variable was the color of the icon, and the dependent variable was the subjects’ search performance (search time and accuracy) when completing the task. The following preparations were made before the experiment.
1. To avoid the influence of differences in icon content, the icon style was kept consistent across pages and the preset target icons were the same. The icons chosen are all common in current map service apps and have the same priority there, which minimizes the influence of the icons themselves on the results.
2. To prevent subjects from memorizing the order and positions of the target icons, the materials of the three experimental tasks were presented in random order within a single session.
3. A gaze point interface with a “+” mark was inserted between every two visual search trials and remained on screen for three seconds, so that drift in the subject’s gaze would not affect the results.
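The presentation order described in the preparations above can be sketched as follows. This is a minimal Python illustration, not the authors' E-Prime script: the function name `build_trial_sequence`, the screen labels, and the stimulus names are hypothetical, and only the random ordering and the three-second gaze point screen are taken from the text.

```python
import random

FIXATION_MS = 3000  # the "+" screen stays on screen for three seconds (from the text)


def build_trial_sequence(stimuli, seed=None):
    """Return a list of (screen, payload) steps for one session.

    Stimuli are shuffled so that subjects cannot memorize target order or
    position, and a gaze point screen precedes every cue/search pair.
    """
    rng = random.Random(seed)
    order = list(stimuli)
    rng.shuffle(order)

    steps = [("guide", None)]  # instructions are shown once at the start
    for stimulus in order:
        steps.append(("fixation", FIXATION_MS))
        steps.append(("target_cue", stimulus))
        steps.append(("search", stimulus))
    return steps


if __name__ == "__main__":
    # Hypothetical stimulus names, for illustration only.
    for step in build_trial_sequence(["red", "yellow", "purple"], seed=1):
        print(step)
```

Passing a fixed `seed` makes a session reproducible for debugging; leaving it as `None` gives a fresh order for each subject.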
4.3 Experimental Data and Analysis

Data Processing. During the experiment, some subjects had difficulty finding a target for a long time and became "stuck". Therefore, for each task, each subject's reaction times across repetitions were sorted from largest to smallest, and the longest reaction time was compared with the second longest. Where the difference exceeded 3000 ms, the subject's longest reaction time in that task was treated as invalid and removed during data cleaning (Fig. 6).
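The cleaning rule can be illustrated with a short Python sketch. It is an interpretation of the description above rather than the authors' SPSS procedure: the function name `drop_stuck_trial` and the sample reaction times are hypothetical, and only the 3000 ms gap between the two longest reaction times comes from the text.

```python
GAP_THRESHOLD_MS = 3000  # gap between the two longest reaction times (from the text)


def drop_stuck_trial(reaction_times_ms, gap_ms=GAP_THRESHOLD_MS):
    """Remove the longest reaction time if it exceeds the runner-up by more than gap_ms.

    The input holds one subject's repetitions of one task; the original trial
    order of the remaining values is preserved.
    """
    if len(reaction_times_ms) < 2:
        return list(reaction_times_ms)

    ordered = sorted(reaction_times_ms, reverse=True)
    longest, second_longest = ordered[0], ordered[1]

    cleaned = list(reaction_times_ms)
    if longest - second_longest > gap_ms:
        cleaned.remove(longest)  # drops a single occurrence only
    return cleaned


# Made-up example: 5200 ms is 3800 ms longer than the runner-up, so it is dropped.
print(drop_stuck_trial([1200, 950, 5200, 1400]))
```

The rule removes at most one value per subject and task, which matches the description of comparing only the longest and second-longest reaction times.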
Fig. 6. Experimental flow and tasks of the Department
To understand the subjects’ visual search efficiency under each color condition, the data of Task 1, Task 2, and Task 3 were sorted by size and divided into four equal levels: “very fast, fast, average, and slow”. Each level was then assigned a score of 1, 2, 3, or 4. When calculating the score for a color, all data points are weighted equally; each is scored according to its level, and the resulting total is the reaction-time grading score for that color. After processing and analysis, the three most important indicators were extracted for each color: the mean, the median, and the grading score of the reaction times. The colors were ranked from largest to smallest on each of these three indicators, and scores were assigned from smallest to largest according to the ranking order. With the mean, median, and grading score given exactly equal weights, a composite score was calculated; this final result is the color’s visual sensitivity score.

Data Analysis.
1. Task 1. Ordered by mean reaction time from smallest to largest, the six colors in Task 1 are: yellow, pink, green, blue, red, and purple. Ordered by median from smallest to largest, they are: yellow, red, green, pink, blue, and purple. After the gradings were converted to numerical values and combined, the scores for red, yellow, purple, pink, green, and blue in Task 1 were 162, 126, 194, 155, 148, and 176, giving the order from smallest to largest: yellow, green, pink, red, blue, purple. In summary, in night mode with an ink-blue background, a shorter visual search response time indicates higher sensitivity to a color, and sensitivity ranks from highest to lowest as: yellow, green, pink, red, blue, and purple. We use the standard deviation as an indicator of generality: the smaller the standard deviation, the less the visual search efficiency for that color is affected by differences between individuals. Ordered by standard deviation from smallest to largest, the six colors are: blue, purple, pink, green, red, and yellow. Combining the two indicators of generality and sensitivity, we identified pink and green as the colors with higher visual search efficiency in Task 1 (Fig. 7 and Fig. 8).
Fig. 7. Monochrome experimental data analysis
Fig. 8. Monochromatic experiment color visual sensitivity scores
2. Task 2. Ordered by mean reaction time from smallest to largest, the six colors in Task 2 are: green, blue, pink, red, purple, and yellow. Ordered by median from smallest to largest, they are: green, blue, red, pink, purple, and yellow. After the gradings were converted to numerical values and combined, the scores for red, yellow, purple, pink, green, and blue in Task 2 were 75, 84, 78, 73, 60, and 70, giving the order from smallest to largest: green, blue, pink, red, purple, yellow. In summary, in night mode with an ink-blue background, sensitivity ranks from highest to lowest as: green, blue, pink, red, purple, and yellow. Ordered by standard deviation, the stability of search efficiency ranks as: green, red, blue, pink, purple, and yellow. Combining the two indicators of generality and sensitivity, we identified green, blue, red, and pink as the colors with higher visual search efficiency in Task 2 (Fig. 9).
3. Task 3. In Task 3, subjects were asked to find the target among icons each composed of two colors. The position of each icon was randomized, and the icon content did not change; only the colors were varied. The main purpose was to examine the effect of the overall color composition after color matching on visual search efficiency in night mode.
Fig. 9. Color experimental data analysis
According to the mean and median data, in night mode with the ink-blue background, the color schemes ordered by reaction time from smallest to largest are: blue-yellow, purple-orange, red-green, and blue-purple. People therefore have the shortest reaction time when looking for icons composed of yellow and blue, and longer reaction times when looking for blue-purple and red-green icons. We identified several possible reasons for the longer response times. First, in the blue-purple and red-green schemes, the two colors in the experimental material had almost no difference in lightness. Second, against the ink-blue background, the analogous blue-purple scheme is too harmonious overall, which lowers the subjects’ visual search efficiency. Third, contrasting colors strengthen the contrast in hue, so that no focal point stands out and the subject’s attention slackens, which reduces search efficiency. Ordered by standard deviation from smallest to largest, the schemes are: purple-orange, blue-yellow, red-green, and blue-purple. Therefore, among the two-color combinations tested, visual search with blue-yellow and purple-orange is more stable and more efficient than with red-green and blue-purple. The comparison also shows that although people respond fastest to icons composed of yellow and blue, the response time for red and green icons, which are likewise complementary colors, is longer. The colors themselves should therefore be considered together with the color relationship (Fig. 10).
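The grading and composite scoring used across the three tasks can be sketched in Python as below. This is one possible reading of the procedure, not the authors' SPSS workflow: the function names, the pooling of all reaction times before splitting them into four levels, the rank direction (1 for the smallest value), and the sample data are all assumptions made for illustration.

```python
from statistics import mean, median


def quartile_grades(times_by_color):
    """Pool all reaction times, split the sorted pool into four equal levels
    (1 = fastest quarter ... 4 = slowest quarter), and sum the levels per color."""
    pooled = sorted(
        (t, color) for color, times in times_by_color.items() for t in times
    )
    n = len(pooled)
    totals = {color: 0 for color in times_by_color}
    for index, (_, color) in enumerate(pooled):
        totals[color] += index * 4 // n + 1
    return totals


def rank_scores(values):
    """Assign 1 to the smallest value, 2 to the next, and so on (ties keep input order)."""
    ordered = sorted(values, key=values.get)
    return {color: position + 1 for position, color in enumerate(ordered)}


def composite_sensitivity(times_by_color):
    """Add the rank scores of the mean, median, and grading score with equal weights.

    A lower composite score means shorter reaction times, i.e. higher sensitivity.
    """
    indicators = [
        {c: mean(t) for c, t in times_by_color.items()},
        {c: median(t) for c, t in times_by_color.items()},
        quartile_grades(times_by_color),
    ]
    composite = {color: 0 for color in times_by_color}
    for indicator in indicators:
        for color, score in rank_scores(indicator).items():
            composite[color] += score
    return composite


# Made-up reaction times in ms, for illustration only.
sample = {"yellow": [800, 900, 1000, 1100], "purple": [1500, 1600, 1700, 1800]}
print(quartile_grades(sample))
print(composite_sensitivity(sample))
```

Because the three indicators are reduced to ranks before they are added, a single very slow trial affects the composite less than it would affect a raw mean.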
Fig. 10. Two-color experimental data analysis
5 Optimized Design of Visual Search Efficiency Based on Gaode Map Interface

5.1 Interface Optimization Design Process

The design practice follows the design process for interactive products and comprises three parts: preliminary research, interaction prototype analysis, and visual design. The specific process is as follows (Fig. 11).
Fig. 11. Design practice flow chart (author’s drawing)
First, in the preliminary research, we combined a survey of the current situation, literature research, and a questionnaire plus interviews to understand electronic navigation night mode more comprehensively, discover user needs, and determine the design direction. Next, the color blocks and information layers were separated in a low-fidelity interface, and the information framework was summarized to clarify how the existing Gaode Map interface expresses information and to determine the color selection range for each layer. The low-fidelity draft was then refined, with colors replaced according to the determined color ranges, to complete the high-fidelity visual design. Finally, the design was evaluated with a color-to-background contrast ratio assessment and a feasibility questionnaire to check the reliability of the design solution (Fig. 12).
Fig. 12. Information framework analysis (author’s drawing)
5.2 Interaction Prototyping

Interaction prototyping is divided into information framework analysis and information layer analysis. After the information framework was mapped, a low-fidelity version of the night-mode interface was drawn, with layers differentiated by gray-level brightness, as follows (Fig. 13).
Fig. 13. Information hierarchy analysis (author’s drawing)
In the interface, font size as well as brightness contrast affects the information level. The self-drawn low-fidelity interface uses PingFang SC, a standard and highly recognizable sans-serif font, in five sizes (12, 13, 16, 19, and 21) to distinguish information levels. Combined with color and the specific circumstances of each interface, the information is finally divided into four levels: the map layer, the background layer, third- and fourth-level information, and first- and second-level information.

5.3 Visual Design

Based on the preliminary experimental results and the analysis of information layers, the author drew up the corresponding color range for each layer (Fig. 14). Compared with an ordinary software interface, an electronic navigation interface adds the map as a level to consider; here the map and the ordinary search interface are treated as the same level but considered separately.
(1) Map: The information level of the map is closely related to its compositional level, so it is considered as four layers: point layer, line layer, surface layer, and background layer.
Fig. 14. Color selection table (author’s drawing)
(2) Search interface: Here the hierarchy follows the preceding layer analysis. Note that this color selection range does not include gray, a common text color; the layers here concern mainly the hierarchy of emphasized prompts.
While considering the visual efficiency of color, interface design also needs to consider the psychological impact of color on people to achieve a better user experience. In the optimized interface, the overall color scheme is based on the calming colors blue and green, which convey a sense of security and reduce the user’s fatigue when navigating at night. The main color scheme also targets the age group that uses electronic navigation [13]; according to research, older people prefer blue and green. The background is ink blue #232936, and the text colors are three light grays: #E3E3E3, #E9E9E9, and #F6F6F6. Small amounts of yellow, pink, and purple are used at points that need attention, which keeps the interface coherent overall while enriching its look and feel (Fig. 15).
Fig. 15. Effect display (author’s drawing)
5.4 Overall Effect

With the brightness, saturation, and contrast of each color controlled, the optimized interface meets the brightness requirements of night mode. Because it does not use many highly saturated, bright colors on the dark background, it is less stimulating to the eyes than the previous interface and relieves visual fatigue better. In addition, the information hierarchy is clear and readability is good.

5.5 Evaluation of Design Practice Results

Evaluation Method Design. To verify the practicality and effectiveness of the design strategies derived from the research and experiments, and to check whether the color-matching strategies meet the needs of electronic navigation users, this section evaluates the design practice with a combined objective and subjective approach based on two evaluation indicators: a contrast ratio assessment and a feasibility scale (Fig. 16).
Fig. 16. Evaluation Methodology (Author’s drawing)
Because the qualifying criteria differ considerably, the contrast ratio assessment is divided into two parts: text color contrast and icon color contrast. In both cases the background color is #232936 (Fig. 17 and Fig. 18).
Fig. 17. Color Contrast Ratio Evaluation of Graphic Components (Author’s drawing)
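A contrast ratio assessment of this kind can be reproduced with the WCAG 2.x relative luminance and contrast ratio formulas. The Python sketch below applies them to the background and text colors reported for the optimized interface; the helper names are hypothetical, and the 4.5:1 threshold is the WCAG 2.1 level AA minimum for normal-size text, which is assumed here to be the text criterion the authors applied (WCAG 2.1 sets 3:1 for graphical components).

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color written as '#RRGGBB'."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1:1 (identical colors) up to 21:1 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


BACKGROUND = "#232936"   # ink-blue night-mode background from the paper
AA_NORMAL_TEXT = 4.5     # assumed text criterion (WCAG 2.1 level AA)

for text_color in ("#E3E3E3", "#E9E9E9", "#F6F6F6"):
    ratio = contrast_ratio(text_color, BACKGROUND)
    verdict = "passes" if ratio >= AA_NORMAL_TEXT else "fails"
    print(f"{text_color} on {BACKGROUND}: {ratio:.2f}:1 {verdict} AA for normal text")
```

All three light grays come out well above 4.5:1 against the ink-blue background. The same function can be applied to the icon colors with the 3:1 threshold.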
The contrast ratio evaluation of all colors indicates that the color optimization of the night mode interface is sound and usable. It also takes the WCAG standard into account, so that the color scheme is more easily accepted by most color-blind people.

Fig. 18. Text color contrast ratio evaluation (author’s drawing)

The QUIS standardized scale is divided into five dimensions: overall response, screen, system information, learning, and system capability. Because the optimized design in this paper mainly changes the colors of the Gaode Map app, with only minor changes to layout, typography, and content, the system information part of the QUIS scale was deleted and the questions in the other parts were modified. The questionnaire investigated respondents’ impressions of the optimized night mode design, with the standardized scale fine-tuned to the needs of this study.
(1) The first part collects respondents’ basic information, to facilitate effective recovery of the questionnaires later and to improve the reliability and validity of the questionnaire.
{1} What is your gender?
{2} What is your age?
(2) The second part investigates respondents’ overall impression of the interface.
{1} How do you feel about the product as a whole? The rating pairs were “bad - excellent”, “difficult - easy”, “frustrated - satisfied”, “inadequate - complete”, “dull - exciting”, and “stereotypical - flexible”, to capture a holistic, general impression of the user’s experience.
(3) The final part splits the interface into components and questions respondents from multiple perspectives.
{1} Text: ease of reading the text on the screen; organization of the information; use of terms in the system.
{2} Modes of operation: simplifying the task; learning the operation of the product; input prompts; supplementary references; simple and clear operation of the task.
{3} Icon clusters: on-screen help information; screen sequences; location of on-screen information.
To reduce errors caused by respondents forgetting the interface, the interface is shown again after every three questions to ensure the validity and accuracy of the results (Fig. 19).
Fig. 19. Part of the questionnaire display (author’s drawing)
Analysis of Results. A total of 54 valid questionnaires were collected, from 31 male and 23 female respondents; most respondents (92.25%) were aged 18–30. The results were analyzed against the reference index of the QUIS standardized scale [14] (Fig. 20 and Fig. 21).
Fig. 20. Reference index of the QUIS standardized scale
Comparing the QUIS assessment scores with the standard data shows that the score for every index is greater than the standard value, which supports the feasibility and validity of the research and design in this study.
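The comparison amounts to checking, item by item, that the evaluation score exceeds the standard score. As a small Python illustration, the sketch below does this for the six overall-response items, using the evaluation and standard scores reported in Fig. 21; the dictionary layout and the function name `margins` are hypothetical.

```python
# Overall-response items: (evaluation score, standard score), values from Fig. 21.
overall_response = {
    "bad - excellent": (6.57, 5.16),
    "difficult - easy": (6.7, 4.63),
    "frustrated - satisfied": (6.69, 4.56),
    "inadequate - complete": (6.94, 5.34),
    "dull - exciting": (5.8, 5.59),
    "stereotypical - flexible": (5.94, 4.87),
}


def margins(scores):
    """Return evaluation minus standard for each item, rounded to two decimals."""
    return {item: round(evaluation - standard, 2)
            for item, (evaluation, standard) in scores.items()}


result = margins(overall_response)
for item, margin in result.items():
    print(f"{item}: +{margin}")
print("all above standard:", all(margin > 0 for margin in result.values()))
```

The smallest margin belongs to the "dull - exciting" item, which is also the only overall-response item rated "better" rather than "Good" in the figure.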
Item                                      Evaluation score   Standard score   Analysis of results
Bad - Excellent                           6.57               5.16             Good
Difficulties - Easy                       6.7                4.63             Good
Frustrated - Satisfaction                 6.69               4.56             Good
Insufficient functionality - Perfection   6.94               5.34             Good
Dull - Excitement                         5.8                5.59             Better
Stereotype - Flexible                     5.94               4.87             Good

Item                                      Evaluation score   Standard score   Analysis of results
Reading screen text is easy               7.17               4.91             Good
Mission Simplification                    5.8                4.38             Good
Information Organization Clarity          6.13               5.34             Better
Screen Sequence Clarity                   6.09               5.38             Better
Consistent use of terminology             6.8                5.5              Good

Item                                      Evaluation score   Standard score   Analysis of results
Message Location Consistent               6.65               5.31             Good
Input Tips Clarity                        6.43               3.35             Good
Operation Learning Easy                   7.37               4.47             Good
Operation Simple and clear                6.57               4.5              Good
Helpful Information Effective             6.43               4.31             Good
Additional Information Clarity            6.22               4.25             Good
Fig. 21. QUIS assessment score (author’s drawing)
6 Conclusion

By applying experimental results on color search efficiency, this study offers a reference for optimizing the user experience of travel service apps. The optimized night mode design for the mobile map service app is limited to color, but the elements of user experience are not isolated from one another; future work should study user experience optimization for the night mode system as a whole.
References

1. Jie, L.: A Study on the User Visual Search Performances for Web Design. Tsinghua University, Department of Industrial Engineering, Beijing (2005). (in Chinese)
2. De Vries, J.P., Hooge, I.T.C., Wertheim, A.H., et al.: Background, an important factor in visual search. Vision. Res. 86(10), 128–138 (2013)
3. Huang, K.C.: Effects of computer icons and figure/background area ratios and color combinations on visual search performance on an LCD monitor. Displays 29(3), 237–242 (2008)
4. Huang, K.C., Chiang, S.Y., Chen, C.F.: Icon flickering, flicker rate, and color combinations of an icon’s symbol/background in visual search performance. Percept. Mot. Skills 106(1), 117–127 (2008)
5. Jinhong, D., Zhongxian, L.: A study of color, shape, and texture representational properties of graphs. J. Psychol. 32(3), 253–257 (2000)
6. Huan, Y., Yina, L., Kang, Z.: Color application in visualization design. J. Comput.-Aided Des. Graph. 27(9), 1587–1596 (2015)
7. Yamani, Y., McCarley, J.S.: Visual search asymmetries within color-coded and intensity-coded displays. J. Exp. Psychol. Appl. 16(2), 124–132 (2010)
8. Huang, K.C.: Effects of computer icons and figure/background area ratios and color combinations on visual search performance on an LCD monitor. Displays 29(3), 237–242 (2008)
9. Bhattacharyya, D., Chowdhury, B., Chatterjee, T., et al.: Selection of character/background colour combinations for on screen searching tasks: an eye movement, subjective and performance approach. Displays 35(3), 101–109 (2014)
10. Pashler, H., Johnston, J.: Attention limitations in dual task performance. Attention, Psychol. Press, 155–189 (1998)
11. Morawski, T.B., Drury, C.G., Karwan, M.H.: Predicting search performance for multiple targets. Hum. Factors 22, 707–719 (1980)
12. Wolfe, J.M.: Changing your mind: on the contributions of top-down and bottom-up guidance in visual search for feature singletons. J. Exp. Psychol. Hum. Perception Perf. 29, 483–502 (2003)
13. Wanya, D., Fang, Y.: Research on the application of color psychology in UI interface design. Design 34(16), 96–98 (2021)
14. Feizi, A., Wong, C.Y.: Usability of user interface styles for learning a graphical software application. 1089–1094 (2012). https://doi.org/10.1109/ICCISci.2012.6297188
Research on Conversational Interaction Design Strategy of Shopping APP Based on Context Awareness

Fusheng Jia1, Xinyu Chen2, and Yongkang Chen3(B)

1 School of Design, Hunan University, Changsha 410000, China
2 Television School, Communication University of China, Beijing 10024, China
3 School of Design and Creativity, Tongji University, Shanghai 200092, China
Abstract. This paper discusses the application of situational awareness theory to conversational interaction in shopping apps. It proposes a conversational interaction design process for shopping apps based on the situational awareness model and the KANO-AHP model, obtains user needs both qualitatively and quantitatively, and explores and summarizes a user experience-oriented conversational interaction design strategy for shopping apps. On the basis of situational awareness theory, the situational awareness model is used to derive the set of user experience demands along four dimensions of conversational interaction (vision, interaction, function, and emotion) from three kinds of situational factors: user, product, and environment. The KANO model is used to screen and classify the resulting demand set and to obtain the hierarchy of user demands for conversational interaction in shopping apps. The analytic hierarchy process is then used to calculate the weight of each demand and rank the demands by importance, and finally the design strategy is proposed. Having analyzed user needs with situational awareness theory and the KANO-AHP model, the authors put forward strategies and suggestions for the conversational interaction experience design of shopping apps in four respects: visual interface experience, functional experience, interactive experience, and emotional experience. In particular, for emotional experience, strategies such as personality characteristics, a high-context conversation mechanism, and a same-emotion feedback mechanism are proposed, so as to improve the user experience of and satisfaction with conversational interaction in shopping apps.

Keywords: Situational awareness · KANO-AHP model · Conversational interaction · Shopping APP · User experience
1 Research Background

Chatbots interact or talk with users through a conversational interface. They are computer-mediated applications that use artificial intelligence or machine learning to simulate conversations with users and communicate with them in natural language.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 90–103, 2023. https://doi.org/10.1007/978-3-031-35129-7_6
By studying the message push mechanism of conversational chatbots, Dong Hao summarized functional requirements covering interaction mode, message format, information transmission, and other aspects [1]. Tan Menghua et al. analyzed the dialogue strategies of six chatbots using cluster analysis and proposed that dialogue strategies should use everyday, emotionally expressive language and should avoid the robot answering its own questions or giving irrelevant answers [2]. When users interact with conversational chatbots, accurate feedback, adaptive prompts, and emotionally expressive language can improve the usability and ease of use of chatbots [3]. In the language system of human-computer interaction, the context level of a dialogue is closely related to the interaction scene and the content of the interaction. The system achieves a mixed-context communication environment between human and machine through the computation of situation awareness, consciousness perception, and emotion perception [4]. A literature search and review shows that domestic scholars have done much research on voice interaction robots, covering the design and implementation of voice interaction, system development, interaction design experience, and other aspects. However, few scholars have studied chatbots based on text conversational interaction, and current work still focuses on deep learning and key technologies. Research on shopping chatbots likewise focuses on developing key technologies; domestic academic research is almost nonexistent and remains at the development stage. Abroad, Marwade et al. enhanced product recommendation for e-commerce by analyzing customer personality, focusing on personalization for the individual consumer [5]. In e-commerce, machine learning algorithms are mainly used to provide product suggestions.
With the help of chatbots, personalized products are recommended to users based on historical orders and conversation records. Bhawiyuga et al. worked on the design and application of an interactive e-commerce chatbot system that automatically outputs responses matching the questions and information queried by users, with good matching accuracy [6]. Setiaji embedded a knowledge model of human interaction into a chatbot, which could recognize text and respond automatically when users communicate with it, and which carried out deep learning on each input and output dialogue to achieve higher matching similarity [7]. Moore studied the relationship between text information processing modes and user experience and built a Natural Conversation Framework (NCF) based on conversation science. It provides a general conversation pattern library, independent of any specific technology platform, for improving the user experience of conversational interaction, including a basic interaction model, general methods for navigating conversational interfaces, a library of recurring conversation patterns, and an interaction performance metric model [8]. Chai and Lin studied the role of natural language dialogue interfaces in online sales. A navigation system based on natural language text interaction helps users query product and service information quickly, and users prefer natural language navigation to fixed menu navigation [9]. Natural language dialogue interfaces can provide powerful personalization options for traditional menu-driven or search-based applications.

With the rapid development of artificial intelligence in China, the application of conversational chatbots will grow rapidly, especially in e-commerce shopping, and the response to them is increasing day by day. Further improving the user-friendliness and intelligence of human-computer conversational interaction will help increase people’s acceptance. Faced with the new conversational interface interaction, there is no complete and systematic strategy to guide chatbot design. Many designers refer to the methods of social chat software, an approach that lacks pertinence and makes it difficult to ensure reasonable interaction, resulting in poor user experience. Based on differences in user behavior and cognitive characteristics, this paper analyzes the problems affecting users’ cognitive process and identifies users’ needs in that process: sensory and aesthetic, attention, learning and memory, and thinking. Based on the analysis of cognitive characteristics and the experience needs of conversational interface interaction, corresponding experience design strategies are proposed in four respects: visual experience in the perception stage, interactive experience in the cognition stage, functional experience in the reaction stage, and emotional experience in the decision stage. These strategies can help guide effective design practice for other types of conversational interactive applications. Conversational chatbots are poised to become a major application scenario in service industries such as finance, retail, education, business management, and entertainment.
2 Demand Research and Analysis

2.1 Research Idea

Through direct contact with users, the author focuses on understanding users’ cognition, behavior habits, pain points and expectations regarding conversational chatbots during interaction, explores users’ overall shopping characteristics and the usability of the conversational interactive interface, and comprehensively analyzes the survey results to obtain users’ interface experience needs, interactive experience needs, functional experience needs and emotional experience needs.

The main target group of this survey is users between 20 and 35 years old, who use e-commerce platforms most frequently and have strong purchasing power. The 20–25 age group consists mainly of college students, who are keen to try new things. Most people aged 26 to 30 are office workers who have recently graduated and entered society; as their income grows, so does their consumer demand, and they pursue a higher quality of life. People between 31 and 35 have a stable income but have had relatively little exposure to online shopping, and to a large extent they find it difficult to accept the convenience brought by new technologies. However, because conversational interaction has a low learning cost, they are likely to become experienced users of conversational chatbots.

The in-depth interview combines structured and unstructured interviews. The former mainly covers how users use chatbots, while the latter mainly explores users’ actual needs. Through interaction and communication with users, their experience, evaluation, pain points and suggestions regarding the visual performance, function, interaction and emotion of conversational chatbots can be obtained.
2.2 Questionnaire Survey and In-depth Interview

The purpose of the questionnaire survey is to understand users’ cognitive and behavioral characteristics regarding conversational chatbots, and to further explore user needs from their shopping situation and their experience of conversational interface interaction. The results of the questionnaire analysis provide a reference for setting the content of the subsequent in-depth interviews and guidance for constructing the design strategy.

Before designing the questionnaire materials, the author surveyed e-commerce shopping apps in the App Store in mainland China and selected the top three apps in the free shopping ranking as questionnaire materials: mobile Taobao, Jingdong and Pinduoduo. In addition, Uniqlo IQ, a popular shopping chatbot abroad, was selected. In the conversational interfaces of the four apps, the product being consulted is displayed in different forms. Taobao’s conversational interaction actively recommends current new products; Jingdong’s actively informs users that the current chat partner is online customer service; Pinduoduo’s indicates “this message is sent by the robot” at the end of the message dialog box; and Uniqlo IQ’s uses a commodity-oriented interface with a selection of attributes.

To study the perceived visual complexity of the interface, the measurement method proposed by Wang et al. [10] is used as a reference. It has four measurement dimensions (complexity, density, interactivity and diversity), as shown in Table 1. Users are asked to evaluate and score the existing interactive interfaces on Likert scales.

Table 1. Interactive interface visual complexity measurement scale

| Interface visual complexity (PVC) | Test question |
|---|---|
| P1 | I think it’s a complicated interface |
| P2 | I think this interface is very information-intensive |
| P3 | I think the interface is very interactive |
| P4 | I think the layout of elements in this interface is diverse |
Among the 78 valid questionnaires collected, the average age of users was 22.80 years (SD = 1.58); 62 users were between 20 and 28 years old, and most were students or new office workers. There were 42 female users (53.8%) and 36 male users (46.2%), so female users slightly outnumbered male users but the overall distribution was fairly even. In terms of educational background, 37 (47.4%) had a bachelor’s degree, 39 (50%) had a master’s degree, and two (2.6%) had a doctoral degree or above. This basic information shows that the distribution of the surveyed user group is in line with the objectives of the survey.
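Scoring of the Table 1 scale can be illustrated with a minimal sketch. It assumes a 7-point Likert scale and assumes that the perceived visual complexity of an interface is the average of the four item means; the paper states neither the number of scale points nor the aggregation rule, and the response values below are invented for illustration.

```python
from statistics import mean

ITEMS = ("P1", "P2", "P3", "P4")  # test questions from Table 1

# Hypothetical responses, one dict per respondent (1 = strongly disagree,
# 7 = strongly agree). These numbers are not data from the survey.
responses = [
    {"P1": 5, "P2": 6, "P3": 4, "P4": 5},
    {"P1": 3, "P2": 4, "P3": 5, "P4": 4},
    {"P1": 4, "P2": 5, "P3": 3, "P4": 4},
]


def item_means(rows):
    """Mean score of each test question across respondents."""
    return {item: mean(row[item] for row in rows) for item in ITEMS}


def pvc_score(rows):
    """Assumed aggregate: the average of the four item means."""
    return mean(item_means(rows).values())


print(item_means(responses))
print(round(pvc_score(responses), 2))
```

Computing one such score per app would allow the four interfaces to be compared on a common scale, under the stated aggregation assumption.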
3 Requirement Type Analysis of Conversational Chatbot Based on KANO and AHP Model

3.1 Requirement Type Identification of Conversational Chatbot Based on KANO Model

To determine the type of each user need quickly, the KANO questionnaire results from the user research are analyzed against the KANO model demand evaluation table. The questionnaire is designed according to the KANO questionnaire design criteria, using the user needs obtained from the questionnaire survey and in-depth interviews. Users evaluate each question from both a positive and a negative aspect, namely their subjective feelings when the need is met and when it is not met. M represents necessary needs, O expected needs, A charm needs, I undifferentiated needs and R reverse needs. According to the KANO model demand evaluation table and the relative customer satisfaction coefficient, the KANO types of users’ demands are counted and calculated. The statistical results are shown in Table 2. The types of user requirements are listed in Table 3, and different requirements are filtered according to different principles to better improve user experience satisfaction.

3.2 Demand Weight Calculation of Conversational Chatbot Based on AHP Model

Based on the first-level and second-level demands of each KANO demand type, the demands are compared to determine an appropriate scale; the first-level demands and the second-level demands are then compared in pairs, and the final judgment matrices and results are obtained by combining the user ratings. The first-level demands of users for conversational chatbots rank as (O) > (M) > (A) > (I). The more fully expected demands are fulfilled, the higher user satisfaction will be, whereas fulfilling necessary demands more fully does not increase satisfaction.
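The pairwise comparison step can be sketched as follows. The paper does not publish its judgment matrices, so the matrix below is invented; the row geometric mean is used as a common approximation of the AHP priority vector and is not necessarily the authors’ procedure, and the function names are illustrative. The random index values are Saaty’s standard ones.

```python
import math

# Saaty's random consistency index (RI) by matrix order.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate the priority vector with normalized row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """CR = CI / RI, with lambda_max estimated as the mean of (A w)_i / w_i."""
    n = len(matrix)
    if RANDOM_INDEX[n] == 0:
        return 0.0  # matrices of order 1 or 2 are always consistent
    aw = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(x / w for x, w in zip(aw, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical judgment matrix for the first-level demands in the order
# O, M, A, I. Entry [i][j] states how much more important demand i is than
# demand j on Saaty's 1-9 scale; the values are made up for illustration.
labels = ["O", "M", "A", "I"]
judgment = [
    [1,     2,     3,     5],
    [1 / 2, 1,     2,     4],
    [1 / 3, 1 / 2, 1,     2],
    [1 / 5, 1 / 4, 1 / 2, 1],
]

weights = ahp_weights(judgment)
cr = consistency_ratio(judgment, weights)
for label, w in zip(labels, weights):
    print(f"{label}: {w:.3f}")
print(f"CR = {cr:.3f} (conventionally acceptable if below 0.1)")
```

A judgment matrix is conventionally accepted when its consistency ratio is below 0.1; otherwise the pairwise comparisons are revisited.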
Conversely, when necessary demands are poorly fulfilled, the user experience is degraded, while a high degree of fulfilment of charm demands brings users unexpected surprises and increases their satisfaction. In the experience design of conversational chatbots, users’ expected demands should therefore be considered first, and users’ charm needs should also be satisfied, so as to surprise users and improve their experience satisfaction as much as possible.

The results shown in Table 3 are obtained by weighting the secondary demands of the conversational chatbot. The consistency ratios of essential demand (M), expected demand (O), charm demand (A) and undifferentiated demand (I) are 0.083, 0.079, 0.063 and 0.051 respectively, all below 0.1, and the weight order is (O) > (M) > (A) > (I), which accords with the weight result of users’ first-level demands. The second-level need with the highest comprehensive weight is D6, “set specific personality characteristics (professionalism and trust), establish emotional connection with users, and improve professionalism and trust” (0.244), under the first-level necessary needs (M), which shows users’ high demand for emotional experience. The top ten secondary demands by comprehensive weight are all necessary demands or expected demands. From high to low, they are: set specific personality characteristics (professionalism, trust), establish emotional connection with users, and improve professionalism and trust (0.244); recommend quality products according to user characteristics, with recommendation reasons and a recommendation index, so that recommended content is not limited to the latest products (0.205); use appropriate natural language in dialogue, with a pleasant tone or expression, to make the dialogue warmer (0.181); highlight demand information and provide historical information in line with the user’s memory (0.173); display the user selection list centrally, rather than only in the history information (0.171); strong interactive interface performance (0.120); dynamic changes of the navigation bar according to the user’s shopping process or context information (0.115); warm and positive personality characteristics of the robot, which support its professionalism and improve users’ tolerance for its mistakes (0.112); personalized product recommendation and smart shopping reminders that avoid repeated recommendation of goods (0.105); and high-context dialogue forms that promote communication and create stronger affinity (0.091).

Table 3. KANO requirement type

| Type | Number | Requirement description |
|---|---|---|
| M Essential demand; O Desired demand; A Charm need | D1 | The interface has a strong sense of interaction |
| | D2 | The interface information is easy to read |
| | D3 | The navigation bar changes dynamically based on the user’s shopping process or contextual information |
| | D4 | Dialog history information can be filtered and viewed, enhancing flexibility |
| | D5 | The decision stage allows users to modify their selection, and the interface function information is consistent |
| | D6 | Set specific personality traits to establish emotional connection with users |
| | D7 | Warm and positive personality traits contribute to professionalism and increase users’ tolerance for the chatbot’s mistakes |
| | D8 | The visual interface embodies emotion, increases interest and arouses users’ interest in goods |
| | D9 | The interface layout is flexible, simple and professional |
| | D10 | Colors are used harmoniously and comfortably, avoiding visual fatigue |
| | D11 | Recommend high-quality products according to user characteristics, with recommendation reasons and a recommendation index |
| | D12 | Personalized product recommendation and intelligent shopping reminders, avoiding repeated recommendation of goods |
| | D13 | The user selection list is displayed centrally, rather than appearing only in the history information |
| | D14 | Add fault tolerance to chatbot design |
| | D15 | Demand information is highlighted to provide historical information in line with the user’s memory |
| | D16 | Use a pleasant tone of voice or expression to warm up the conversation |
| | D17 | The system uses different greetings for different users, such as new users and returning users |
| | D18 | Use high-context dialogue forms to promote communication and a stronger sense of closeness |
| | D19 | Visual perception of picture and text information is consistent |
| | D20 | The information layout should be neat and rigorous, avoiding density |
| | D21 | Automatically output a comparison of the advantages and disadvantages of goods, providing a commodity-assisted decision mechanism |
| | D22 | The selection of goods should be multiple choice, reducing the number of conversation turns |
| | D23 | Give emotional feedback when the input is invalid or the system cannot recognize it |
| | D24 | The system guides users to provide key information to improve interaction quality and efficiency |
| I Undifferentiated demand | D25 | Chatbots take the initiative to state their identity, such as intelligent assistant or intelligent robot |
| | D26 | The system shows matching emotional feedback according to the user’s emotion and establishes emotional resonance |
| | D27 | Gain insight into user emotions and display personalized features, such as holiday greetings |
| | D28 | Use dynamic expressions or voice forms to create a natural state of interaction, as between people |
| | D29 | Explain the professional choices in the selection process |
| | D30 | The conversational forms of natural language gradually transition from low context to high context |
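The requirement types in Table 3 come from pairing each respondent’s answer to the positive question (need met) with the answer to the negative question (need not met), as described in Section 3.1. A minimal sketch of that step is given below. It assumes the commonly used five-option KANO evaluation table and the commonly used Better/Worse satisfaction coefficients; the paper does not reproduce its own evaluation table or raw answers, so the answer labels, the helper names `classify`, `requirement_type` and `better_worse`, and the sample data are illustrative only.

```python
from collections import Counter

ANSWERS = ("like", "must-be", "neutral", "live-with", "dislike")

# Rows: answer when the need is met. Columns (in ANSWERS order): answer when
# the need is not met. Q marks a questionable (contradictory) answer pair.
EVALUATION_TABLE = {
    "like":      ("Q", "A", "A", "A", "O"),
    "must-be":   ("R", "I", "I", "I", "M"),
    "neutral":   ("R", "I", "I", "I", "M"),
    "live-with": ("R", "I", "I", "I", "M"),
    "dislike":   ("R", "R", "R", "R", "Q"),
}


def classify(met, not_met):
    """KANO category for one respondent's pair of answers."""
    return EVALUATION_TABLE[met][ANSWERS.index(not_met)]


def requirement_type(pairs):
    """Most frequent category across respondents, plus the full tally."""
    counts = Counter(classify(met, not_met) for met, not_met in pairs)
    return counts.most_common(1)[0][0], counts


def better_worse(counts):
    """Satisfaction coefficients; assumes at least one A, O, M or I answer."""
    total = counts["A"] + counts["O"] + counts["M"] + counts["I"]
    better = (counts["A"] + counts["O"]) / total
    worse = -(counts["O"] + counts["M"]) / total
    return better, worse


# Hypothetical answers of five respondents for a single requirement.
sample = [
    ("like", "dislike"),
    ("like", "dislike"),
    ("neutral", "dislike"),
    ("like", "neutral"),
    ("neutral", "neutral"),
]

kano_type, tally = requirement_type(sample)
better, worse = better_worse(tally)
print(kano_type, dict(tally), better, worse)
```

In this sketch the category with the most votes is taken as the requirement type; the Better and Worse coefficients then indicate how strongly fulfilling or neglecting the requirement moves satisfaction.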
4 Construction of Conversational Interactive Design System for Shopping APP

4.1 Shopping APP Conversational Interactive Visual Interface Experience Design Principles

(1) Lightweight visual interface representation

When perceiving the visual interface, users hope that the interface information is easy to read and that the layout is flexible, concise and professional. Following the Gestalt principles of closure and continuity, the functional layer and the content layer are distinguished by elements such as lines and wireframes. Following the principles of similarity and proximity, the specific information in the content layer is organized and distributed so that users can easily perceive the relationship between similar information and detailed information. The layout of interface content can start from the classification of information: classify the information into a function layer and a content layer, organize and code the information in the content layer reasonably, and then establish a good visual flow for user navigation.
The user’s perception of color comes from the psychological activity of perceiving the physical properties of color. The use of color in the conversational interface should take into account the feelings and comfort that colors bring, so as to convey product recognition effectively. Many shopping platforms have their own brand colors, such as red for Jingdong and orange for Taobao. The interface color of Taobao is harmonious and comfortable and gives an impression of simplicity and freshness, while the interface color of Jingdong is monotonous. In the figure, different colors in Taobao’s product card express different dimensions of content. The source of the product is emphasized at the presentation level, and the link becomes the main reinforced information. Dark gray information is followed by light gray information; using different color brightness for text content shows the difference in information level.

(2) Improve the visual interface interaction

Giving conversational chatbots a visual image can enhance the sense of interaction and encourage users to perceive the chatbot’s personality. With the help of dynamic visual images, different emotions and feelings can be expressed. Conversational interface communication is carried out mainly in natural language, and information is input mainly as text or voice. In specific situations, users will input information by voice. A good auditory image can enhance users’ sense of immersion, create an interesting interaction process and meet different users’ experience needs in different situations.

4.2 Principles of Conversational Interactive Functional Experience Design of Shopping APP

(1) Build a personalized recommendation mechanism

A personalized recommendation mechanism is built through in-depth learning of users’ characteristics.
Specifically, the system dynamically adjusts its recommendation strategy according to users’ behavioral characteristics and their preferences among the commodities selected each time, so as to ensure that the recommended commodities meet users’ real needs. When users input information about a target product, the information often has fixed characteristics, such as clothing size or shoe size. By accumulating this information the system can form a basic user portrait, on the basis of which personalized recommendation can be optimized. For example, a sneaker enthusiast chooses size 40 sneakers every time, in a price range between 500 and 700 yuan. Through deep learning, the system analyzes and stores the user’s basic behavioral needs and obtains the user’s foot size and acceptable price range. The next time the user chooses sneakers or other footwear, the system can actively recommend products based on this information, and can also refer to it when recommending related products such as socks and foot pads.

(2) Provide a commodity-assisted decision mechanism

Before making the final decision, users go through stages of commodity perception and cognition, which are affected by many factors. On the one hand, commodity information may be poorly readable and chaotically displayed, so that it fails to attract users’ attention; on the other hand, the presentation of commodity information is diversified, making it difficult for users to judge the differences between products. Recommended goods mainly show their price, main functional features and sales volume. Faced with multiple goods, users want to spend the least time understanding the basic information of each product, so that they can gain a more detailed understanding of the selected target goods and make the final decision. Faced with multiple recommended products, users tend to conduct several rounds of screening, for example a first round based on price and a second round based on sales volume. As a result, users spend more time selecting products, which reduces their efficiency and inevitably their satisfaction. Providing comparison information for recommended products can highlight the main differences between products on the basis of the ranking mechanism and spare users the time and effort of comparing them.

4.3 Principles of Conversational Interactive Experience Design for Shopping APP

(1) Improve user’s sense of control over session orientation

In communication between users and conversational chatbots, the conversational initiative tends to rest with the chatbot. When the chatbot attempts to control the user’s behavior, its initiative may provoke users and make them feel uncomfortable. When expressing demand information, different types of users experience a different sense of control over the dialogue. Users with clear product goals show a self-oriented sense of control in the dialogue, while users without clear product goals express their demands vaguely and feel that the robot is in control. During communication, users should be given more control so that they can freely express the information they need.
When they encounter information on which they find it difficult to decide, the dialogue can switch to a system-led mode. Given the particular cognitive process of users with unclear goals, the initiative can be gradually transferred to them as their usage time grows. As these users form usage habits, their sense of control over the conversation will also be strengthened.

(2) Fault-tolerant design

Fault tolerance concerns the conversational chatbot’s ability to handle conflict or failure situations. Specifically, there are two cases: the system processes the demand information input by the user incorrectly, or the system cannot identify the demand information input by the user. Incorrect processing mainly means that the user’s demand exceeds the processing capacity of the robot. In this case, the system needs to guide the user towards a new topic and, with emotional expression, gently explain the reason to the user. A positive attitude in admitting mistakes and sincere language can increase users’ tolerance of the chatbot’s limited ability, win their recognition, allow the conversation to continue on a new topic and improve the usability of the system.
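The two failure cases can be sketched as a small reply policy. This is only an illustration: the `SUPPORTED_TOPICS` and `KNOWN_WORDS` sets, the keyword matching and the reply wording are hypothetical stand-ins for whatever intent recognition a real chatbot uses, and the paper does not prescribe an implementation.

```python
# Hypothetical capability list: topics this chatbot can actually handle.
SUPPORTED_TOPICS = {"price", "size", "color", "delivery"}
# Words the system recognizes, including topics it cannot handle itself.
KNOWN_WORDS = SUPPORTED_TOPICS | {"refund", "repair", "insurance"}


def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return set(cleaned.split())


def reply(user_text):
    """Return (case, message) for one user turn."""
    recognized = tokenize(user_text) & KNOWN_WORDS
    if not recognized:
        # The system cannot identify the demand: apologize and ask again.
        return ("unrecognized",
                "Sorry, I did not quite catch that. Could you tell me which "
                "product or detail you are interested in?")
    supported = recognized & SUPPORTED_TOPICS
    if not supported:
        # The demand exceeds the chatbot's ability: explain gently and
        # guide the user towards topics it can handle.
        topics = ", ".join(sorted(SUPPORTED_TOPICS))
        return ("out_of_scope",
                "I'm sorry, that is beyond what I can help with for now. "
                f"I can help with: {topics}. Shall we look at one of these?")
    return ("answer", f"Let me check the {sorted(supported)[0]} for you.")


for text in ("What is the price?", "Can I get a refund", "asdf qwer"):
    print(reply(text))
```

Both failure branches apologize before redirecting, reflecting the point that admitting mistakes sincerely raises users’ tolerance of the chatbot’s limits.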
4.4 Principles of Conversational Interactive Emotional Experience Design for Shopping APP

(1) Set certain personality traits

a. Meet the personality characteristics of the task operation scenario

The personality characteristics that are set should be consistent with the task operation scenario. The scenario is affected by factors such as the rhythm and manner of information presentation, the rhythm of the interaction process and the response speed of information. Similar operational tasks are also affected by service type, and different service types should convey different personality characteristics. For example, a chatbot for selling clothes should be extroverted, showing enthusiasm and active emotions, whereas a chatbot for electronic ticket sales should be rigorous, conveying emotional characteristics such as professionalism and responsibility.

b. Personality traits that fit the brand personality

Chatbot personality is closely related to the personality of the commodity brand, and its characteristics need to be consistent with the brand personality to some extent. For consumer electronics, the brand personality is mainly a sense of science and technology, and the main consumer groups are enthusiasts and technology lovers, so the chatbot should be given a more technological and youthful character. LG Smart TV sets the personality characteristics of “elegant in line, skillful in micro, modest in harmony” based on message notifications at the interactive level, which accurately expresses the image to be conveyed.

c. Multidimensional co-construction of personality characteristics

The multidimensional co-construction of personality characteristics should draw on the commodity brand, online store, official website, offline store and so on to build comprehensive personality characteristics jointly and help users form a multidimensional mental model.
Global experience refers to the reaction produced by the user’s contact with all the touchpoints of the product. It should also consider the expression and shaping of the product brand, presenting the chatbot as a professional assistant.

(2) An emotional form of conversation

Conversational chatbots mainly help users to solve relevant problems in time. When users ask for product information, the system matches the demand information against its own text knowledge base and then gives the correct feedback. When the knowledge base contains no highly matched answer for the demand information entered by the user, the system needs to choose one of its alternative answers to reply, so as to maintain the user’s interaction satisfaction. If the user inputs semantically similar text twice in a row, the system should judge that its feedback did not solve the user’s problem. At this point, the system should apologize to the user in a tactful tone and confirm the input with the user again, relieving the user’s negative emotions through emotional dialogue.

(3) High-context conversational mechanisms

In conversational interaction, the level of context is closely related to the content of the interactive information and the interactive situation. Based on situational awareness computing, cognitive perception computing and emotional perception computing, the interactive system establishes a mixed-context dialogue environment that combines the high-context dialogue form of humans with the low-context dialogue form of artificial intelligence. As the dialogue deepens, the accuracy with which both sides acquire information gradually increases, and so do the external stimuli of information. More importantly, multiple rounds of dialogue increase the number of context-related influencing factors and reduce the rate at which users accurately recognize information. It is therefore necessary to use appropriate forms of context to improve the efficiency of the dialogue. A high-context conversational mechanism can improve conversational fluency and the accurate recognition of text information.

The semantic understanding of interactive systems requires processes such as part-of-speech analysis, text analysis and emotion analysis. Expressions such as sarcasm and emphasis read very differently in different contexts, which affects the accuracy of text analysis and emotion analysis. In semantic transmission, the low-context conversational form of the chatbot tends to misinterpret the implicit and reserved high-context conversational form of user input. Users tend to use high-context conversational forms with a lot of implicit information, which conversational chatbots misunderstand, greatly reducing recognition speed and accuracy. When setting the context of a conversational chatbot system, the following aspects can be considered:

a. Determine the delivery type of information

The conversational interaction between users and conversational chatbots is mainly of the task type and the question-and-answer type, which take different forms of expression in high and low contexts. Because the two sides of the interaction differ in context level, the dialogue should use neutral words as far as possible and avoid strongly negative or positive words, so that users do not perceive irony. In the design of intonation and greetings, the low-context interaction system can present itself as a small assistant and communicate with users in a soft intonation and tone. The small assistant usually addresses users with an honorific form; for example, the online mobile game King of Glory calls its players “master”.

b. Be careful with fixed replies

Because the two sides of the interaction acquire content in different contexts, fixed language expressions should be used cautiously. Dialogue that is inconsistent with users’ everyday expression will cause strong aversion to chatbots. In the design process, text whose expression differs little between high and low contexts can be answered from the existing corpus. For text whose expression differs greatly between high and low contexts, multiple rounds of question and answer can be used to confirm users’ demand information, so that the system gradually grasps the logical relationships contained in users’ high-context conversational forms and then tries to use high-context conversational forms itself, improving the feedback performance and effect of chatbots in mixed contexts.

(4) Same emotional feedback mechanism

Users express their inner emotions through text messages, and they hope that the chatbot’s feedback will resonate emotionally so that they obtain self-satisfaction. Since conversational interaction is two-way, the emotional output of a conversational chatbot is related not only to external stimuli but also to the user’s next round of conversation, and the system should make predictions and judgments according to these influencing factors. According to the “principle of increase and decrease in interpersonal attraction” in social psychology, if users’ recognition, evaluation and praise of a conversational chatbot continue to increase, the chatbot will become more and more suitable to users; conversely, if they continue to decrease, the chatbot will become more and more unsuitable to users. Conversational chatbots improve user satisfaction by stimulating users emotionally and producing output in line with their emotional responses. For users, however, conversation is a two-way street. Satisfaction with the chatbot depends not only on the emotional stimulation of the previous round of conversation but also on the emotional feedback in the user’s next round; likewise, the satisfaction of users’ emotional experience depends not only on the emotional stimulus output by the chatbot in the previous round but also on the chatbot’s emotional feedback in the next round.

In psychological theory, three main factors affect interpersonal communication: charisma, proximity and acceptance, and similarity or complementarity. Similarity or complementarity is the factor that most easily establishes emotional resonance between people and chatbots. Similarity means that the two sides hold relatively consistent or close views on a matter or point of view, while complementarity means that the two sides complement each other in information needs or personality characteristics. On this basis, the two sides gradually establish a close emotional connection. The design of a chatbot’s emotional feedback mechanism should consider both complementarity and similarity between emotions. For users’ emotional cognitive experience, the chatbot should output the optimal text and emotional feedback according to the emotional context of the conversation and the emotion currently input by the user, so that conversational interaction between users and the chatbot becomes more natural and harmonious.
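The rule in (2), under which two semantically similar inputs in a row trigger an apology and a confirmation, can be made concrete with a short sketch. It uses `difflib.SequenceMatcher` from the Python standard library as a rough lexical stand-in for semantic similarity; the 0.8 threshold, the `Conversation` class and the reply texts are assumptions chosen for illustration rather than details from the paper.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.8  # assumed cut-off, not a value from the paper


def similar(a, b, threshold=SIMILARITY_THRESHOLD):
    """Lexical similarity check used here in place of semantic matching."""
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return ratio >= threshold


class Conversation:
    """Tracks the previous user input to detect a repeated question."""

    def __init__(self, answer_fn):
        self.answer_fn = answer_fn  # hypothetical knowledge-base lookup
        self.previous_input = None

    def respond(self, user_text):
        repeated = (self.previous_input is not None
                    and similar(self.previous_input, user_text))
        self.previous_input = user_text
        if repeated:
            # The last answer apparently did not help: apologize tactfully
            # and confirm the request instead of repeating the same answer.
            return ("apologize_and_confirm",
                    "I'm sorry my last answer did not help. "
                    f'Just to confirm, are you asking: "{user_text}"?')
        return ("answer", self.answer_fn(user_text))


chat = Conversation(lambda text: "Here is what I found.")
first = chat.respond("Do you have this shoe in size 40?")
second = chat.respond("do you have this shoe in size 40")
third = chat.respond("What is the delivery time?")
print(first[0], second[0], third[0])
```

A production system would replace the lexical ratio with a semantic similarity measure, but the control flow (compare with the previous turn, then apologize and confirm) stays the same.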
5 Conclusion

Based on user cognitive characteristics, user experience needs and the results of the user needs survey, this paper analyzes the mapping between users’ various requirements for shopping APP conversational interaction and the design principles, and derives design principles for the conversational interactive experience of shopping APPs, grounded in cognitive mechanisms, from four aspects: visual interface experience, functional experience, interactive experience and emotional experience. The principles specify the key points to attend to under each kind of user experience together with detailed design methods. All the design principles are in line with users’ cognitive psychology and characteristics; they can effectively guide the design of shopping chatbots and help improve user satisfaction.
References

1. Dong, H.: Research on Chatbot information push mechanism. China New Commun. 20(18), 146–147 (2018)
2. Tan, M.H., Pan, X.Y.: Research on dialogue reply strategy of text chat robot. Software 41(09), 51–55 (2020)
3. Fang, Z.W., Du, L.M.: Application of Grice’s cooperative principle and conversational rules in human-computer dialogue. J. Nat. Sci. Heilongjiang Univ. (06), 716–720 (2005)
4. Qin, J.Y., Wang, Y.Z.: Research on conversational voice interaction system in high and low context. Packag. Eng. 42(10), 85–91 (2021). https://doi.org/10.19554/j.cnki.1001-3563.2021.10.012
5. Marwade, A., Kumar, N., Mundada, S., et al.: Augmenting e-commerce product recommendations by analyzing customer personality. In: 2017 9th International Conference on Computational Intelligence and Communication Networks (CICN), pp. 174–180. IEEE (2017)
6. Bhawiyuga, A., Fauzi, M.A., Pramukantoro, E.S., et al.: Design of E-Commerce chat robot for automatically answering customer question. In: 2017 International Conference on Sustainable Information Engineering and Technology (SIET), pp. 159–162. IEEE (2017)
7. Setiaji, B., Wibowo, F.W.: Chatbot using a knowledge in database: human-to-machine conversation modeling. In: 2016 7th International Conference on Intelligent Systems, Modelling and Simulation (ISMS), pp. 72–77. IEEE (2016)
8. Moore, R.J.: A natural conversation framework for conversational UX design. In: Moore, R., Szymanski, M., Arar, R., Ren, G.J. (eds.) Studies in Conversational UX Design. Human–Computer Interaction Series. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-95579-7_9
9. Chai, J., Lin, J., Zadrozny, W., et al.: The role of a natural language conversational interface in online sales: a case study. Int. J. Speech Technol. 4(3), 285–295 (2001)
10. Wang, Q., Yang, S., Liu, M., et al.: An eye-tracking study of website complexity from cognitive load perspective. Decis. Support Syst. 62, 1–10 (2014)
Influence of Different Language Labels on Perception of Product Value
Yen-Yu Kang1(B) and Yu-Dan Pan2
1 Department of Industrial Design, National Kaohsiung Normal University, Kaohsiung City, Taiwan
[email protected]
2 Department of Geography, National Kaohsiung Normal University, Kaohsiung City, Taiwan
Abstract. This study examines the influence of different language labels and packaging design characteristics on the perception of product value. First, observation, information collection and market research were used to gather and analyze relevant material. A questionnaire survey was then used to evaluate the texture of a broad collection of shampoo packaging, and the analysis summarized 16 major comparative adjectives. Finally, an experiment was designed and analyzed, using a 7-point Likert scale to measure the effect of language on packaging design: how the use of different languages on the label affects the perception of product value by consumers in different countries, and the feeling that consumers receive from the product’s packaging labels. The contributions of this study are as follows: (1) Knowing which languages should or should not appear on the label when marketing a product in different countries allows packaging designers to influence consumers’ purchasing decisions strongly and inexpensively. (2) Packaging designers will be able to choose a proper language, or a combination of languages, to present on the labels of products sold in target regions. (3) For manufacturers, using different languages on labels will help attract consumers by conveying the impression of a higher product value compared with other products in the same category.
Keywords: Packaging design · Consumer Psychology · Different language · Text design · Consumer value · Communicating across cultures
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 104–113, 2023. https://doi.org/10.1007/978-3-031-35129-7_7

1 Introduction
When consumers search for certain products on supermarket shelves, the visual elements on packaging capture their attention and can lead them to choose one product over its competitors. A good design should arouse consumers’ perception of added product value (Valentya et al. 2014). Marketers should be involved in the packaging design process of their products because they understand the trends among the target consumers of those products (Council 2013). Differences in the language used in product design and packaging during the naming, launching and advertising of a product have a significant impact (Doole and Lowe 1999; Doole and Lowe 2012). It is important for producers to be aware of consumer trends in different countries regarding the use of different languages on packaging labels. For example, Run and Chin (2006) mentioned that “consumers prefer their native language to be presented on packaging”. This shows that the language on the packaging affects consumers’ value perception and purchase intention (Fahim et al. 2013).
Past research on packaging design has discussed the correlation between packaging image design and brand, and the views of consumers in a single country. However, there is little discussion of how packaging design with different languages can be transferred to the design and development of products or brands, or to the positioning of products with respect to consumers’ psychological aspirations. Therefore, this research presents the same packaging in different languages to consumers in different countries and examines their cognition of and preference for it. The results can indicate which language on the packaging of products sold in different countries is more attractive to consumers and may even affect their purchase intention.

1.1 Research Purpose
This study explores whether packaging in different languages affects consumers’ purchasing behavior, and analyzes the perceptions and preferences of consumers in different countries, such as Japanese and Taiwanese consumers, regarding foreign-language packaging of daily necessities.
1. Through a literature review, understand and discuss consumer sentiment, the definition of packaging, packaging design in different languages, the cross-lingual culture of brands, and the use of different languages. The review introduces the perception process of consumers when they buy products, and how visual elements affect consumers’ perception of product value.
2. Conduct interviews and questionnaires to analyze the differences in consumers’ perceptions of the value of daily necessities packaging in different languages in different countries.
3. Use two-factor ANOVA and regression analysis to explore the value perception differences of foreign packaging of household goods.
4. Construct a more efficient reference for future packaging and marketing designers based on the cognitive differences of foreign language packaging among consumers in different countries.
2 Literature Review
2.1 Packaging Design and Consumer Sentiment
Newman (1957) mentioned that the highest level of consumer need is psychological satisfaction, and that the placement of foreign words sets off the value of goods. To meet their emotional and psychological needs, consumers project their own characteristics onto the product, so that the product and its packaging become their avatar. Pleasant and unpleasant feelings may vary with the time of use and with familiarity with the product. For example, they may be felt during the use of the product, or they may be emotional feelings before or after use.

2.2 Visual Design Elements on Packaging
Visual elements can include all the features of the package that consumers see, such as color, text, shape, picture, and decoration (Sara, R. 1990). With these elements on the packaging, consumers can find the products they are looking for through visual perception (Chen, Y.C. 2005; Wang, R.W.Y. and Chou M.C. 2010). Packaging is mainly composed of two parts, shape and structure design and image design, and is the main mode of communication between manufacturers and consumers (Karimi, P. et al. 2013). Design elements can be divided into visual and functional elements (Wang, R.W.Y. and Chou, M.C. 2010). Other studies on packaging design elements show that the best elements for attracting consumers are packaging features and shapes (Mu Chien C. and Regina, W.Y. 2012).
Since typography is one of the visual design elements presented in the form of language, it can convey the marketing information of products to consumers of different cultural, social and ethnic backgrounds in a short time on three-dimensional media. Typography helps define product characteristics and where products come from. In addition, the typography used on the packaging can represent the aesthetics and message of the product, as language is a key factor in successful communication with consumers. Therefore, this study typesets the label of the same product in different languages in order to analyze whether language affects the value perceived by consumers from different countries.
2.3 Text Design in Packaging
The visual design of packaging comprises the form and arrangement of factors such as colour, shape, text, lines, illustrations and decoration (Kaneko 1998). Through them, the maker conveys the required information to consumers and creates a visual impact that draws their attention and interest and ultimately leads to the purchase action. The text on a package includes the logotype, trade name, captions, etc. Lettering can be divided into two types: custom corporate typefaces and printing typefaces.
Custom corporate typefaces include those specially designed for word marks, product names and so on. COOLWATER WASH is an example of a typeface for a trade name. Both kinds are designed to distinguish the company from other companies. The key points of custom typeface design include originality, readability, impression, aesthetics, reproducibility and sustainability.
Printing typefaces are mainly used to mark the purpose, usage, characteristics and quality related to the product content; the manufacturer, distributor, address, telephone number and country of production related to the source; and the ingredient label, contents, date of manufacture, shelf life, etc. related to statutory information. Because these descriptions contain many words and are strongly explanatory, it is appropriate to choose a concise, quickly readable printing typeface to convey the text. In addition, some trade names also use printing typefaces.
2.4 Emotional Needs of Consumers
Jordan (2000) brought Maslow’s hierarchy of human needs into human factors and proposed a corresponding pyramid of consumer needs. Beyond satisfying basic product functionality at the bottom level, the next consideration is usability, that is, whether the product is easy to use. These two levels correspond to physiological needs. The highest level, and the most difficult to achieve, is pleasure: psychological satisfaction, meaning that beyond meeting physiological needs, using the product generates a pleasant feeling that satisfies an inner psychological level.
Functionality: A product must provide good functionality to complete its task successfully. If the product cannot meet functional needs, consumers will be dissatisfied with the experience of using it.
Usability: Once the product functions properly, users want it to be easy to use so that they can use it as efficiently as possible.
Pleasure: In addition to functionality and usability, products should satisfy additional needs, including functional benefits and emotional satisfaction, which belong to the level of inner sensibility.
3 Research Procedures and Methods
This study focuses on consumers from different countries and tests their value cognition of loanword (foreign-language) packaging for bath products. It then analyzes how consumers of different nationalities perceive the value and emotional appeal of loanword packaging and how these perceptions relate, and discusses the subjective and objective factors that shape the response to packaging in different languages. Following the motivation and background of the study, an experimental design is used to analyze the differences between countries in the value perception and emotional perception of loanword packaging with different product attributes.
The subjects were 150 non-design students from three different countries: 50 students from Kaohsiung Normal University, 50 students from Iwate University, and 50 students from Shandong Art School in mainland China. To avoid bias in the results, the experiment was conducted in all three countries. Consumers in Taiwan, China and Japan were chosen on the grounds that they are more likely to encounter shampoo labeled in different languages. The average age of the participants was between 18 and 25, and none of them had a design background.
Color combinations affect consumers’ perception. Pairing two colors that lie 180° apart on the color spectrum is the most favorable color scheme (Favre and November 1979). For example, if red (R) is selected in the spectrum, the best color match, diagonally opposite at 180°, is teal (BG). High-contrast color combinations make letters more visible and attractive. In the study, white letters on a blue background were used on the experimental packaging labels, a combination that can help produce positive and pleasant visual effects (Laughery and Wogalter 2006). Tests in color psychology have found that almost no one is averse to blue (Hook 2010). Blue also conveys smoothness, a stable mood and serenity. Blue can evoke a feeling of freshness and cleanliness, and many manufacturers are aware that this color interpretation helps shampoos convey a clean, soothing impression, as shown in Fig. 1.
Fig. 1. The color spectrum (Color figure online)
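The 180° rule described above can be illustrated with a short sketch. It assumes the HSV hue circle of Python’s standard `colorsys` module, which is not the color spectrum used in this study, so the exact complementary hues may differ from those in Fig. 1. The function name `complementary_rgb` is hypothetical.

```python
import colorsys


def complementary_rgb(rgb):
    """Return the color 180 degrees opposite `rgb` on the HSV hue circle.

    `rgb` is a tuple of 0-255 integers. The HSV circle is an assumption
    made for illustration; it is not the spectrum shown in Fig. 1.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, saturation, value = colorsys.rgb_to_hsv(r, g, b)
    # Hue is expressed in [0, 1), so half a turn (180 degrees) is 0.5.
    opposite_hue = (hue + 0.5) % 1.0
    opposite = colorsys.hsv_to_rgb(opposite_hue, saturation, value)
    return tuple(round(channel * 255) for channel in opposite)


if __name__ == "__main__":
    # On this circle, pure red maps to a blue-green (cyan) hue.
    print(complementary_rgb((255, 0, 0)))
```

On this hue circle the complement of pure red is a blue-green, which is in line with the red (R) and teal (BG) pairing mentioned in the text.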
In the study, a sans serif font was used on the packaging labels because it is eye-catching and highly legible (Wang, R.W.Y. and Chou, M. C. 2011), as shown in Fig. 2. The outer packaging of each test sample is labelled “SHAMPOO”. Each sample used the same color and the same font size. The word “SHAMPOO” was chosen mainly because it is familiar to consumers in Taiwan, Japan and mainland China. To avoid interference when answering the questionnaire, the test samples are not products actually on sale and are used only in the questionnaire of this study, so that subjects are not affected by previous experience of use.
Fig. 2. The Sans Serif font and its features
There were seven label designs: six labeled in different languages and one presented unlabeled. The six languages selected were those that appeared most commonly in a market survey of sales channels in Taiwan and Japan, covering well-known sales websites in Taiwan, mainland China and Japan (lotte, yahoo auction market, tao outfit) and physical stores (Carrefour, the welfare center, Japan business supermarket, Japan Maxvalu supermarkets, stores). The languages are Japanese, Traditional Chinese, English, Korean, Simplified Chinese, and Thai, as shown in Fig. 3. These six languages are the most commonly translated languages in the retail market in the three countries tested. As the subjects of this study were consumers from Taiwan, China and Japan, the questionnaire was designed in both Japanese and Chinese versions, and the Japanese translation was made with the assistance of professors in Japanese-related fields from Iwate University.

Fig. 3. Test samples presented on packaging labels in different languages

In the questionnaire survey, the subject first observes the package samples and then chooses an appropriate product price according to the value perceived, quoting a reasonable price for each language sample. A rating measure is one of the easiest ways to collect data in most studies (Peterson 2000). A Likert scale was used for the evaluation. The scale ran from 1 to 5 points: 5 points corresponded to more than NT$160 (more than 800 Japanese yen; more than RMB 147) and 1 point to less than NT$100 (less than 500 Japanese yen; less than RMB 23). The most attractive and the least attractive language packaging were thus revealed by the product price evaluation, which reflects consumers’ value perception. The prices shown in Table 1 are based on the results of market research and on the cost of living in each country.

Table 1. Price rating scales for different currencies
Currency | 1 | 2 | 3 | 4 | 5
Taiwan, New Taiwan dollar | Less than 100 | 101–120 | 121–140 | 141–160 | More than 160
Japan, Japanese yen | Less than 500 | 501–600 | 601–700 | 701–800 | More than 800
China, Renminbi | Less than 23 | 24–64 | 65–105 | 106–146 | More than 147
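The mapping from a quoted price to a rating level in Table 1 can be written as a small lookup. The following is a minimal sketch, not the authors’ procedure: the currency codes, the names `UPPER_BOUNDS` and `price_rating`, and the handling of prices that fall exactly on a bound or in a gap left open by the table (for example exactly 100 New Taiwan dollars) are all assumptions.

```python
from bisect import bisect_left

# Inclusive upper bounds of rating levels 1-4 for each currency, read from
# Table 1. Any price above the last bound receives rating 5.
# Assumption: values the table leaves unassigned (e.g. exactly 100 TWD or
# exactly 147 CNY) are placed in the lower of the two neighbouring levels.
UPPER_BOUNDS = {
    "TWD": [100, 120, 140, 160],
    "JPY": [500, 600, 700, 800],
    "CNY": [23, 64, 105, 147],
}


def price_rating(price, currency):
    """Map a quoted price to the 1-5 rating level of Table 1."""
    # bisect_left counts how many bounds are strictly below the price.
    return bisect_left(UPPER_BOUNDS[currency], price) + 1


if __name__ == "__main__":
    for price, currency in [(90, "TWD"), (650, "JPY"), (150, "CNY")]:
        print(currency, price, price_rating(price, currency))
```

Keeping the bounds in one table per currency makes it easy to compare the three scales side by side and to adjust the boundary convention if the study used a different one.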
3.1 Questionnaire Implementation Method
The experiment was conducted through an open questionnaire on the Internet. At the beginning of the questionnaire, participants were asked for basic information: gender, age, education, frequency of shampooing, main shampoo brands used and the price of the shampoo currently in use. The label samples for each language then appeared one at a time, and the subjects did not know the order of the samples in advance. The main reason for using a random order is that past studies have shown that people try to remember the first and last images presented and compare them. In addition, research has found that people remember their favorite sample and compare it with other samples. Previous studies have also shown that subjects will not favor any sample if they do not know the display order.
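A random presentation order of the kind described above could be generated per participant as follows. This is a minimal sketch under stated assumptions: the study does not describe its randomization procedure, and the names `SAMPLES` and `presentation_order`, as well as seeding by a participant identifier, are hypothetical choices made for illustration.

```python
import random

# The seven label samples used in the study (six languages plus unlabeled).
SAMPLES = [
    "Japanese", "Traditional Chinese", "English", "Korean",
    "Simplified Chinese", "Thai", "Unlabeled",
]


def presentation_order(participant_id, samples=SAMPLES):
    """Return an independently shuffled sample order for one participant.

    Seeding with the participant id is an illustrative choice that makes
    each order reproducible; it is not taken from the study.
    """
    rng = random.Random(participant_id)
    # sample() with k equal to the list length returns a shuffled copy
    # and leaves the original list untouched.
    return rng.sample(samples, k=len(samples))


if __name__ == "__main__":
    for participant in range(3):
        print(participant, presentation_order(participant))
```

Because every participant receives a different order, no label is systematically shown first or last, which addresses the position effects mentioned in the text.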
4 Analysis of Research Results
4.1 Analysis of Questionnaire Results
The subjects were 150 non-design students from three different countries: 50 students from Kaohsiung Normal University, 50 students from Iwate University, and 50 students from Shandong Art School in mainland China. To avoid bias in the results, the experiment was conducted in all three countries. The average age of the participants was between 18 and 25, and none of them had a design background.

4.2 Perceived Value of Products in Different Languages on Packaging Labels
ANOVA was used to investigate the influence of label language on the respondents’ perception of product value. The results showed that Taiwanese respondents valued English and Japanese language labels more than other language labels, whereas “Simplified Chinese” labels received the lowest value evaluation. Among Japanese respondents, “English” and “Japanese” labels likewise had the highest value perception, while “Simplified Chinese” labels had the lowest. However, no significant difference was found between “unlabeled language” and “Korean.” The “Simplified Chinese” packaging label sample thus had the lowest value perception for both Taiwanese and Japanese participants. There were no significant differences among the “unlabeled language”, “Korean” and “Japanese” label samples. The lowest price evaluation was obtained for the sample labeled in “Simplified Chinese”, as shown in Fig. 4. In mainland China, respondents had the highest value perception for the “English” and “Simplified Chinese” packaging label samples and the lowest for the “unlabeled language” sample, as shown in Table 2.
Fig. 4. Price rating scales for labels in different languages

Table 2. Price rating scores for labels in different languages
Country | Score | Score | Score | Score | Score | Score | Score
Taiwan | English 2.8 | Japanese 2.8 | Korean 2.2 | Traditional Chinese 1.9 | Thai 2 | Blank 1.7 | Simplified Chinese 1.3
Japan | English 2.2 | Japanese 2.05 | Thai 1.9 | Korean 1.6 | Traditional Chinese 1.46 | Blank 1.5 | Simplified Chinese 1.26
China | English 2.5 | Simplified Chinese 2.5 | Traditional Chinese 2.2 | Japanese 2.05 | Thai 1.62 | Korean 1.3 | Blank 1.15
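As a quick cross-check of the statements in Sect. 4.2 against Table 2, the highest- and lowest-rated labels per country can be extracted programmatically. The sketch below transcribes the scores from Table 2; the names `SCORES` and `extremes` are hypothetical, and ties are reported as lists.

```python
# Price-rating scores transcribed from Table 2.
SCORES = {
    "Taiwan": {
        "English": 2.8, "Japanese": 2.8, "Korean": 2.2,
        "Traditional Chinese": 1.9, "Thai": 2.0, "Blank": 1.7,
        "Simplified Chinese": 1.3,
    },
    "Japan": {
        "English": 2.2, "Japanese": 2.05, "Thai": 1.9, "Korean": 1.6,
        "Traditional Chinese": 1.46, "Blank": 1.5,
        "Simplified Chinese": 1.26,
    },
    "China": {
        "English": 2.5, "Simplified Chinese": 2.5,
        "Traditional Chinese": 2.2, "Japanese": 2.05, "Thai": 1.62,
        "Korean": 1.3, "Blank": 1.15,
    },
}


def extremes(country):
    """Return (highest-rated labels, lowest-rated labels) for a country."""
    scores = SCORES[country]
    top, bottom = max(scores.values()), min(scores.values())
    # Collect every label that shares the extreme value, so ties are kept.
    highest = sorted(label for label, s in scores.items() if s == top)
    lowest = sorted(label for label, s in scores.items() if s == bottom)
    return highest, lowest


if __name__ == "__main__":
    for country in SCORES:
        print(country, extremes(country))
```

The output shows English at the top in all three groups (tied with Japanese in Taiwan and with Simplified Chinese in China), Simplified Chinese at the bottom for Taiwan and Japan, and the blank label at the bottom for China.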
5 Conclusion
The results of this study show that using different languages on packaging labels affects consumers’ perception of product value. When labels are presented in different languages, consumers assign very different values to products. The study also finds that consumers of different nationalities perceive product value differently. Consumers in Taiwan consider that English and Japanese on labels indicate a higher price than other languages. Both mainland Chinese and Japanese respondents rated English-labeled packaging as the most valuable. Nearly 7,000 known languages are spoken and recorded worldwide (Anderson, S. R. 2010). Therefore, manufacturers must consider how brands are presented in different languages to the global market (Ellis, J. 2014). According to the results on the perception of labels in different languages, consumers in Taiwan, Japan and mainland China tend to favor “English” on packaging labels, which can increase the value perception of goods. Respondents in Taiwan and Japan consider Simplified Chinese the least valuable, whereas respondents in mainland China gave unlabeled packaging the lowest value perception.
Since World War II, which began in 1939, and the Korean War, Asian countries have been strongly influenced by American culture. Generally speaking, most foreign immigrants in Asian countries come from other Asian countries, and only a few come from English-speaking countries. English is seen as foreign and exotic in Asian countries because of the small number of native English speakers. Young Asians are familiar with English mainly because pop music, movies and even school education emphasize English. All these factors have a certain influence on young Asians. According to a 1999 study by the National Language Institute of Japan, English is considered the most effective language for global communication. Because English is learned for education, business and various other purposes, about 350 million people in Asian countries can speak English (Nobuyuki H. 2005).
Several key factors revealed in this study can be used to design packaging labels for future domestic and international markets. (1) English is the most influential language in packaging design because it is considered attractive and generates a higher perception of product value. (2) Consumers in different countries perceive product value significantly differently when different languages are displayed on packaging labels, so the choice of language has a significant impact on perceived product value. (3) Putting English on the packaging labels of goods, even as decorative language, may help to improve the value image of products.
References
Anderson, S.R.: How many languages are there in the world? (2010). http://www.linguisticsociety.org/content/how-many-languages-are-there-world
Chen, Y.C.: A study on comprehensibility and interestingness of design from visual trope. Taiwan J. Arts 77, 1–11 (2005)
Council, D.: The Power of Packaging Design (2013). http://www.designcouncil.com. Accessed 13 Sep 2013
Doole, I., Lowe, R.: International Marketing Strategy. International Thomson Business Press, London (1999)
Doole, I., Lowe, R.: International Marketing Strategy, vol. 7. Cengage Learning, USA (2012)
Ellis, J.: What is brand language? (2014). http://www.wisegeek.com/what-is-brand-language.htm
Fahim, A., Cooke, A., Huang, H.H.: How can private labels increase their value? – the role of a brand name and packaging design. In: Proceeding of European Marketing Academy Conference. Istanbul, Turkey (2013)
Favre, J.-P., November, A.: Color and communication =: Color und communication = Color et communication. Ed. ABC, Zurich (1979)
Hook: (2010). http://td026544.pixnet.net/blog/
Jordan, P.W.: Designing Pleasurable Products: An Introduction to the New Human Factors. CRC Press, Boca Raton (2000)
Kaneko, S.: Package Design (L. Zhizhong) Taiwan, Taipei. Science and technology book co., ltd. (1998)
Karimi, P., Mahdieh, O., Rahmani, M.: The study of relationship between packaging elements and purchase behavior: consumers of food, cosmetics and health products. Interdiscip. J. Contemp. Res. Bus. 5(3), 281–295 (2013)
Laughery, K.R., Wogalter, M.S.: Designing effective warnings. Rev. Hum. Factors Ergon. 2(1), 241–271 (2006)
Mu Chien, C., Regina, W.Y.W.: The findability of food packaging design. J. Sci. Des. Bull. JSSD 59(3), 11 (2012)
Nobuyuki, H.: English as a multicultural language in Asia and intercultural literacy. Intercult. Commun. Stud. XIV(2), 73–89 (2005)
Peterson, R.A.: Creating Effective Questionnaires. Sage, Thousand Oaks, CA (2000)
Run, E.C.D., Chin, S.F.: Language use in packaging: the reaction of Malay and Chinese consumers in Malaysia. Sunway Acad. J. 3, 133–145 (2006)
Sara, R.: Packaging as a retail marketing tool. Int. J. Phys. Distrib. Logist. Manag. 20(8), 29–30 (1990). https://doi.org/10.1108/EUM0000000000372
Valentya, F., Lestari, N., Gotama, T., Kumar, S.: Packaging as an attractive language to stimulate consumer preference on perfume: a survey on young adult respondents in the area of Jabodetabek (Jakarta, Bogor, Depok, Tangerang, Bekasi) Indonesia. Int. J. Sci. Res. Publ. 4(1), 1 (2014)
Wang, R.W.Y., Chou, M.C.: The comprehension modes of visual elements: how people know about the contents by product packaging. Int. J. Bus. Res. Manag. (IJBRM) 1(1), 1–13 (2010)
Structural Equation Modeling for the Interplay Among Consumer Engagements with Multiple Engagement Objects in Consumer’s Fashion
Masahiro Kuroda(B), Akira Oyabu, and Ryohei Takahashi
Okayama University of Science, 1-1 Ridai-cho, Kita-ku, Okayama 700-0005, Japan
{kuroda,ohyabu,r-takahashi}@ous.ac.jp
Abstract. Consumer engagement (CE) is attracting much attention from academics and practitioners. Early research on CE focused on online contexts such as online brand communities. Recent research on CE addresses not only online but also various offline settings. This research is mainly conceptual and is concerned with CE with a single engagement object; there is little understanding of CEs with multiple engagement objects. We explore how consumers engage with multiple engagement objects in a fashion consumption context and empirically examine the interplay among the CEs with them simultaneously. We define CE as a consumer’s cognitive, emotional and behavioral investments in fashion-related interactions. Both qualitative and quantitative research are conducted from this CE perspective. In the qualitative research, we draw on semi-structured interviews and extract three focal objects. The results are used to develop conceptual models of the interplay among CEs with these objects. The quantitative research collects data for measuring CEs on a Likert-type scale. Structural equation modeling (SEM) measures how well the conceptual models fit the data. Moreover, we propose a new approach, NPCA-SEM, that incorporates nonlinear principal component analysis (NPCA) into SEM. NPCA finds optimally scaled data by quantifying the Likert-type scale data and computes principal component scores of the optimally scaled data. SEM then treats the scores as observed data in place of the Likert scale data. In the analysis of our research data, we compare the goodness of fit of the conceptual models from NPCA-SEM with that from ordinal SEM and examine the performance of NPCA-SEM.
Keywords: Consumer engagements · Multiple engagement objects · Consumer’s fashion context · Structural equation modeling · Nonlinear principal component analysis
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 114–126, 2023. https://doi.org/10.1007/978-3-031-35129-7_8

1 Introduction
Consumer engagement (CE) is attracting much attention from academics and practitioners. In the marketing literature, CE is defined as follows: “A customer’s motivationally driven, volitional investment of focal operant resources (including cognitive, emotional, behavioral, and social knowledge and skills), and operand resources (e.g., equipment) into brand interactions in service systems.” (Hollebeek et al. [7]). Early research on CE focused on online contexts such as online brand communities. Recent research on CE addresses not only online but also various offline settings. This research is mainly conceptual and is concerned with CE with a single engagement object. However, there is little understanding of CEs with multiple engagement objects (Heinonen [6]). Our study explores how consumers engage with multiple engagement objects in a fashion consumption context and empirically examines the interplay among the CEs with them simultaneously (Bowden et al. [3]).
In the theoretical framework, CE is characterized as interactive, multidimensional and directed at focal objects or agents with which consumers engage (Brodie et al. [1]). In fashion consumption, we consider CE with brands and online brand communities concurrently. We define CE as a consumer’s cognitive, emotional and behavioral investments in fashion-related interactions. These investments are part of a multidimensional concept. From this CE perspective, we conduct both qualitative and quantitative research. In the qualitative research, we draw on semi-structured interviews and extract three focal objects: Fashion, Brand and Sales assistant. The results are used to develop conceptual models of the interplay among CEs with these focal objects. The quantitative research collects data for measuring CEs on a Likert-type scale. We then measure how well the conceptual models fit the data by using structural equation modeling (SEM).
In SEM analysis, data are assumed to be approximately normally distributed. Likert scale data are ordinal and do not necessarily resemble a normal distribution. We therefore propose to quantify the Likert scale data. Optimal scaling is a quantification technique that assigns numerical values to the Likert scale levels and nonlinearly transforms qualitative data into quantitative data. When optimal scaling is used in the nonlinear principal component analysis (NPCA) of Young et al. [13], optimally scaled data are obtained by quantifying the Likert scale data, and principal component scores are computed from the optimally scaled data. This approach thus replaces factor analysis in the measurement equation model with NPCA and treats the scores as observed data in place of the Likert scale data. We refer to SEM combined with NPCA as NPCA-SEM. In the analysis of our research data, we compare the goodness of fit of the conceptual models obtained from NPCA-SEM with that from ordinal SEM and examine the performance of NPCA-SEM.
This paper is organized as follows: Section 2 develops hypotheses and conceptual models of the interplay among CEs with the objects Fashion, Brand and Sales assistant. Section 3 gives second-order SEM and NPCA-SEM for the conceptual models. Section 4 shows the estimation results from SEM and NPCA-SEM. Section 5 presents our concluding remarks.
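The two steps named above, quantifying Likert levels and then computing principal component scores, can be pictured with a small sketch. It is deliberately not NPCA itself: in NPCA the scale values are estimated from the data, whereas here they are fixed, hypothetical numbers, and only the first principal component is computed (by power iteration). All function names, the responses and the scale values are assumptions made for illustration.

```python
import math


def quantify(responses, scale_values):
    """Replace each Likert level (1-based) by its assigned numerical value."""
    return [[scale_values[level - 1] for level in row] for row in responses]


def standardize(data):
    """Center each column and scale it to unit variance."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / n)
           for j in range(p)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in data]


def first_component(data, iterations=200):
    """Return (scores, weights) of the first principal component."""
    z = standardize(data)
    n, p = len(z), len(z[0])
    # Correlation matrix of the standardized columns.
    corr = [[sum(row[a] * row[b] for row in z) / n for b in range(p)]
            for a in range(p)]
    # Power iteration converges to the leading eigenvector of `corr`.
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    scores = [sum(row[j] * w[j] for j in range(p)) for row in z]
    return scores, w


# Hypothetical 5-level responses of six respondents to three items.
responses = [[1, 2, 1], [2, 2, 3], [3, 3, 3], [4, 5, 4], [5, 4, 5], [3, 4, 2]]
# Hypothetical monotone, unequally spaced values for levels 1 to 5.
scale_values = [-1.5, -0.4, 0.0, 0.6, 1.8]

scores, weights = first_component(quantify(responses, scale_values))
```

In NPCA-SEM as described in the text, scores of this kind would be passed to SEM as observed data in place of the raw Likert responses.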
2 Development of Hypotheses and Conceptual Models
We develop hypotheses and conceptual models by employing the literature review on CE and the qualitative research. In our qualitative research, we con-
116
M. Kuroda et al.
ducted semi-structured interviews with 18 consumers interested in fashion in February 2017. The interviewees were asked to talk about their experiences of fashion in everyday life over the past few decades. The analysis focuses on examining what objects consumers engage with, and how multiple CEs are related to each other. From the analysis of the data obtained from the interviews, three focal objects were extracted: Fashion, Brand and Sales assistant. Brand is a specific name that distinguishes a product from others. Examples are Louis Vuitton and UNIQLO, the name of a clothing store. Many of the respondents were involved with a particular brand. I was impressed by the design itself, not because nobody else was wearing it, but because I had never had such an idea in my head (27 years old, male, part-time worker). Fashion is also about dressing up and being fashionable. Consumers who engage with fashion do not focus on brands, but on styles that suit them, comfort and balance, which is different from the focus on brands mentioned above. This is different from the brand focus mentioned above. The most enjoyable moment for me is when I feel I look good in my clothes or when I feel cute in my clothes (24 years old, female, office worker). Sales assistant is the person who works on the sales floor in clothes shops. Some respondents imitated the fashion of a particular member of staff, or engaged in fashion discussions or non-fashion small talk with their favorite member of staff. Some respondents told that they go to stores to imitate the fashion of a particular member of staff, or to engage in fashion discussions or non-fashion small talk with their favorite sales assistant. I feel like I’m being taught how to enjoy about clothes and how to live as a human being (19 years old, male, student). Furthermore, recent research suggests that consumers engage with multiple focal objects simultaneously (Chandler and Lusch [4]; Naumann et al. [11]). 
It has also been suggested that consumers' engagement with one object influences their engagement with other objects. For example, Bowden et al. [2] argued for the existence of a spillover effect from consumers' engagement with online brand communities to their engagement with brands. The respondents also stated that the more interested they were in fashion, the more attached they were to the brand, and at the same time, the more attached they were to the brand, the more interested they were in fashion. Therefore, we hypothesize the following:
– H1: CE with Fashion influences CE with Brand.
– H2: CE with Brand influences CE with Fashion.
It has also been noted that engagement with a company or brand has a positive impact on direct interaction with the company and on customer satisfaction (e.g., Jaakkola and Alexander [8]). In particular, in the context of fashion, customers who engage with fashion and brands tend to emphasize conversations and close relationships with store staff. Therefore, we hypothesize the following:
– H3: CE with Brand influences CE with Sales assistant.
– H4: CE with Fashion influences CE with Sales assistant.
We denote CE with Brand, Fashion and Sales assistant by BCE, FCE and SCE, respectively.
3 Structural Equation Modeling for Consumer Engagement
We use structural equation modeling (SEM) to fit a conceptual model to observed data. Table 1 gives the observed variables and latent factors for engagement with each object. Cognitive is measured with three items, Emotional with four items, and Action with four or five items. Let $f_{FCE}$, $f_{BCE}$ and $f_{SCE}$ denote the latent factors of FCE, BCE and SCE, respectively. We describe the structural equation models for the CEs by

\begin{align}
f_{FCE} &= \alpha_{FC} f_{FC} + \alpha_{FE} f_{FE} + \alpha_{FA} f_{FA} + e_{FCE}, \tag{1} \\
f_{BCE} &= \alpha_{BC} f_{BC} + \alpha_{BE} f_{BE} + \alpha_{BA} f_{BA} + e_{BCE}, \tag{2} \\
f_{SCE} &= \alpha_{SC} f_{SC} + \alpha_{SE} f_{SE} + \alpha_{SA} f_{SA} + e_{SCE}. \tag{3}
\end{align}

3.1 Structural Equation Modeling
SEM integrates structural and measurement equation models into a single model. The measurement equation model is simply a factor analysis model. A latent factor in a structural equation model is not measured directly; instead, it is measured through several observed variables that serve as indicators of the factor. We give the measurement equation models for Fashion, Brand and Sales assistant:
– Fashion

\begin{equation}
\left\{
\begin{aligned}
X_{FC} &= \lambda_{FC} f_{FC} + e_{FC} = (\lambda_{FC1}, \lambda_{FC2}, \lambda_{FC3}) f_{FC} + (e_{FC1}, e_{FC2}, e_{FC3}), \\
X_{FE} &= \lambda_{FE} f_{FE} + e_{FE} = (\lambda_{FE1}, \lambda_{FE2}, \lambda_{FE3}, \lambda_{FE4}) f_{FE} + (e_{FE1}, e_{FE2}, e_{FE3}, e_{FE4}), \\
X_{FA} &= \lambda_{FA} f_{FA} + e_{FA} = (\lambda_{FA1}, \lambda_{FA2}, \lambda_{FA3}, \lambda_{FA4}) f_{FA} + (e_{FA1}, e_{FA2}, e_{FA3}, e_{FA4}).
\end{aligned}
\right.
\tag{4}
\end{equation}
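Table 1. Observed variables and latent factors

| Object | Engagement | Factor | Observed variable |
|---|---|---|---|
| Fashion | Cognitive | $f_{FC}$ | $X_{FC} = (X_{FC1}, X_{FC2}, X_{FC3})$ |
| Fashion | Emotional | $f_{FE}$ | $X_{FE} = (X_{FE1}, X_{FE2}, X_{FE3}, X_{FE4})$ |
| Fashion | Action | $f_{FA}$ | $X_{FA} = (X_{FA1}, X_{FA2}, X_{FA3}, X_{FA4})$ |
| Brand | Cognitive | $f_{BC}$ | $X_{BC} = (X_{BC1}, X_{BC2}, X_{BC3})$ |
| Brand | Emotional | $f_{BE}$ | $X_{BE} = (X_{BE1}, X_{BE2}, X_{BE3}, X_{BE4})$ |
| Brand | Action | $f_{BA}$ | $X_{BA} = (X_{BA1}, X_{BA2}, X_{BA3}, X_{BA4}, X_{BA5})$ |
| Sales assistant | Cognitive | $f_{SC}$ | $X_{SC} = (X_{SC1}, X_{SC2}, X_{SC3})$ |
| Sales assistant | Emotional | $f_{SE}$ | $X_{SE} = (X_{SE1}, X_{SE2}, X_{SE3}, X_{SE4})$ |
| Sales assistant | Action | $f_{SA}$ | $X_{SA} = (X_{SA1}, X_{SA2}, X_{SA3}, X_{SA4}, X_{SA5})$ |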
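– Brand

\begin{equation}
\left\{
\begin{aligned}
X_{BC} &= \lambda_{BC} f_{BC} + e_{BC} = (\lambda_{BC1}, \lambda_{BC2}, \lambda_{BC3}) f_{BC} + (e_{BC1}, e_{BC2}, e_{BC3}), \\
X_{BE} &= \lambda_{BE} f_{BE} + e_{BE} = (\lambda_{BE1}, \lambda_{BE2}, \lambda_{BE3}, \lambda_{BE4}) f_{BE} + (e_{BE1}, e_{BE2}, e_{BE3}, e_{BE4}), \\
X_{BA} &= \lambda_{BA} f_{BA} + e_{BA} = (\lambda_{BA1}, \lambda_{BA2}, \lambda_{BA3}, \lambda_{BA4}, \lambda_{BA5}) f_{BA} + (e_{BA1}, e_{BA2}, e_{BA3}, e_{BA4}, e_{BA5}).
\end{aligned}
\right.
\tag{5}
\end{equation}

– Sales assistant

\begin{equation}
\left\{
\begin{aligned}
X_{SC} &= \lambda_{SC} f_{SC} + e_{SC} = (\lambda_{SC1}, \lambda_{SC2}, \lambda_{SC3}) f_{SC} + (e_{SC1}, e_{SC2}, e_{SC3}), \\
X_{SE} &= \lambda_{SE} f_{SE} + e_{SE} = (\lambda_{SE1}, \lambda_{SE2}, \lambda_{SE3}, \lambda_{SE4}) f_{SE} + (e_{SE1}, e_{SE2}, e_{SE3}, e_{SE4}), \\
X_{SA} &= \lambda_{SA} f_{SA} + e_{SA} = (\lambda_{SA1}, \lambda_{SA2}, \lambda_{SA3}, \lambda_{SA4}, \lambda_{SA5}) f_{SA} + (e_{SA1}, e_{SA2}, e_{SA3}, e_{SA4}, e_{SA5}).
\end{aligned}
\right.
\tag{6}
\end{equation}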
The measurement equation models estimate the latent factors $f_{Fi}$, $f_{Bi}$ and $f_{Si}$ ($i = C, E, A$) in Equations (1) to (3). We specify second-order SEM as the structural equation models for hypotheses {H1, H3, H4} and {H2, H3, H4}. The equations are as follows:
– {H1, H3, H4}:

\begin{equation}
\begin{aligned}
f_{BCE} &= \beta_{FCE} f_{FCE} + e_{BCE}, \\
f_{SCE} &= \beta_{BCE} f_{BCE} + \beta_{FCE} f_{FCE} + e_{SCE}.
\end{aligned}
\tag{7}
\end{equation}

– {H2, H3, H4}:

\begin{equation}
\begin{aligned}
f_{FCE} &= \beta_{BCE} f_{BCE} + e_{FCE}, \\
f_{SCE} &= \beta_{BCE} f_{BCE} + \beta_{FCE} f_{FCE} + e_{SCE}.
\end{aligned}
\tag{8}
\end{equation}
3.2 Structural Equation Modeling Including Nonlinear Principal Component Analysis
Likert-type scale data are not strictly quantitative data but ordinal qualitative data. SEM treats Likert-type scale data as quantitative and assumes that they are approximately normally distributed. Thus, SEM may not correctly fit a conceptual model to the data when they do not resemble a normal distribution. Optimal scaling can be used to nonlinearly transform such ordered qualitative data into quantitative data. We propose a new approach that uses the nonlinear principal component analysis (NPCA) of Young et al. [13] in SEM. NPCA is PCA with optimal scaling and yields principal component scores of optimally scaled data. Appendix A introduces NPCA and gives the alternating least squares (ALS) algorithm for estimating the parameters of NPCA. In the measurement equation model, we replace factor analysis with NPCA and use the component scores of the optimally scaled data from NPCA as observed data. Thus, we do not examine the relationship between latent factors and observed variables as in factor analysis, but construct composite variables of the observed variables with NPCA. We refer to this approach as NPCA-SEM.
Let $X$ denote the observed Likert scale data of engagement. NPCA obtains optimally scaled data $X^*$ by quantifying $X$ and computes a principal component score vector $Z$ and a component loading vector $A$ on $r = 1$ component. The ALS algorithm then finds the least squares estimates of $X^*$, $Z$ and $A$ by minimizing the loss function $\sigma(Z, A, X^*)$ for engagement. We give the loss function for each object:
– Fashion: $\sigma_F = \sigma(Z_{FC}, A_{FC}, X^*_{FC}) + \sigma(Z_{FE}, A_{FE}, X^*_{FE}) + \sigma(Z_{FA}, A_{FA}, X^*_{FA})$.
– Brand: $\sigma_B = \sigma(Z_{BC}, A_{BC}, X^*_{BC}) + \sigma(Z_{BE}, A_{BE}, X^*_{BE}) + \sigma(Z_{BA}, A_{BA}, X^*_{BA})$.
– Sales assistant: $\sigma_S = \sigma(Z_{SC}, A_{SC}, X^*_{SC}) + \sigma(Z_{SE}, A_{SE}, X^*_{SE}) + \sigma(Z_{SA}, A_{SA}, X^*_{SA})$.
Then, the measurement equations (4) to (6) are replaced by the following ones:
– Fashion:

\begin{equation}
\left\{
\begin{aligned}
Z_{FC} &= \gamma_{FC} f_{FC} + e_{FC}, \\
Z_{FE} &= \gamma_{FE} f_{FE} + e_{FE}, \\
Z_{FA} &= \gamma_{FA} f_{FA} + e_{FA}.
\end{aligned}
\right.
\tag{9}
\end{equation}
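– Brand:

\begin{equation}
\left\{
\begin{aligned}
Z_{BC} &= \gamma_{BC} f_{BC} + e_{BC}, \\
Z_{BE} &= \gamma_{BE} f_{BE} + e_{BE}, \\
Z_{BA} &= \gamma_{BA} f_{BA} + e_{BA}.
\end{aligned}
\right.
\tag{10}
\end{equation}

– Sales assistant:

\begin{equation}
\left\{
\begin{aligned}
Z_{SC} &= \gamma_{SC} f_{SC} + e_{SC}, \\
Z_{SE} &= \gamma_{SE} f_{SE} + e_{SE}, \\
Z_{SA} &= \gamma_{SA} f_{SA} + e_{SA}.
\end{aligned}
\right.
\tag{11}
\end{equation}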
We use Equation (7) as the structural equation model for {H1, H3, H4} and Equation (8) as the model for {H2, H3, H4}, respectively.
4 Model Results
We fit the conceptual models {H1, H3, H4} and {H2, H3, H4} to data by using SEM. The data were collected in January 2018 through an online survey of members registered with an internet research company who enjoy fashion. Responses were obtained from 263 males and 234 females. Hollebeek et al. [7] provide Likert-type scales to measure CEs. We use a five-level Likert scale for all variables of CE. SEM is performed with the R package lavaan of Rosseel [12]. The R package homals of de Leeuw and Mair [5] computes principal component scores from NPCA.
We check the normality of the data using skewness. Table 2 shows the skewness of the data for Fashion. We see that none of the variables is normally distributed. The same results are obtained for the other objects, Brand and Sales assistant. Therefore, ordinary maximum likelihood estimation is not suitable for the data. Robust maximum likelihood estimation is suitable for second-order SEM, and the robust weighted least squares method is available for NPCA-SEM.
The optimal scaling in NPCA assigns numerical values to Likert scale levels. The values are called category quantifications. Table 3 shows the category quantifications for $X_{FC}$, $X_{FE}$ and $X_{FA}$. The five Likert scale levels are replaced with the corresponding category quantifications. NPCA computes the principal component scores of the optimally scaled data, and NPCA-SEM then estimates the model parameters using the scores as observed data.

Table 2. Skewness of the data for Fashion.

| Engagement | Variable | Skewness | p-value |
|---|---|---|---|
| Cognitive | $X_{FC1}$ | 1.251 | 0.000 |
| Cognitive | $X_{FC2}$ | 1.582 | 0.000 |
| Cognitive | $X_{FC3}$ | 1.369 | 0.000 |
| Emotional | $X_{FE1}$ | 1.153 | 0.000 |
| Emotional | $X_{FE2}$ | 1.354 | 0.000 |
| Emotional | $X_{FE3}$ | 1.243 | 0.000 |
| Emotional | $X_{FE4}$ | 0.629 | 0.000 |
| Action | $X_{FA1}$ | 0.785 | 0.000 |
| Action | $X_{FA2}$ | 1.085 | 0.000 |
| Action | $X_{FA3}$ | 0.785 | 0.000 |
| Action | $X_{FA4}$ | -0.110 | 0.000 |
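The skewness check can be illustrated with a minimal Python sketch. The paper does not state which skewness estimator or which test produced the p-values in Table 2, so the moment-based coefficient $g_1 = m_3 / m_2^{3/2}$ used here is an assumption, and the responses are hypothetical rather than the survey data.

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (assumed estimator)."""
    n = len(values)
    mean = sum(values) / n
    # Second and third central moments
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical five-level Likert responses, concentrated on the low levels
responses = [1, 1, 1, 1, 2, 2, 2, 3, 3, 5]
skew = sample_skewness(responses)
print(f"skewness = {skew:.3f}")  # positive: long tail toward the high levels
```

A positive value indicates that responses pile up at the low levels, which is the pattern reported for most Fashion items in Table 2.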
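Table 3. Category quantifications for $X_{FC}$, $X_{FE}$ and $X_{FA}$.

| Level | $q_{FC1}$ | $q_{FC2}$ | $q_{FC3}$ | $q_{FE1}$ | $q_{FE2}$ | $q_{FE3}$ | $q_{FE4}$ | $q_{FA1}$ | $q_{FA2}$ | $q_{FA3}$ | $q_{FA4}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.014 | 0.013 | 0.013 | 0.018 | 0.016 | 0.017 | 0.018 | 0.023 | 0.022 | 0.024 | 0.034 |
| 2 | -0.011 | -0.013 | -0.010 | -0.016 | -0.019 | -0.018 | -0.001 | -0.007 | -0.009 | -0.005 | 0.001 |
| 3 | -0.045 | -0.031 | -0.033 | -0.028 | -0.029 | -0.033 | -0.013 | -0.018 | -0.017 | -0.017 | -0.011 |
| 4 | -0.057 | -0.066 | -0.033 | -0.028 | -0.029 | -0.036 | -0.013 | -0.018 | -0.018 | -0.017 | -0.011 |
| 5 | -0.378 | -0.195 | -0.255 | -0.074 | -0.059 | -0.036 | -0.013 | -0.018 | -0.021 | -0.019 | -0.011 |

(Columns $q_{FC1}$ to $q_{FC3}$ are Cognitive, $q_{FE1}$ to $q_{FE4}$ are Emotional, and $q_{FA1}$ to $q_{FA4}$ are Action.)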
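The replacement of Likert levels by category quantifications can be sketched in a few lines of Python. The mapping below assumes that the first column of Table 3 gives the quantifications of $X_{FC1}$ for levels 1 to 5; the response vector and the helper name `optimally_scale` are hypothetical and only illustrate the lookup, not the estimation of the quantifications themselves.

```python
# Category quantifications for X_FC1, read from Table 3 (levels 1 to 5).
Q_FC1 = {1: 0.014, 2: -0.011, 3: -0.045, 4: -0.057, 5: -0.378}


def optimally_scale(responses, quantifications):
    """Replace each Likert level by its category quantification."""
    return [quantifications[level] for level in responses]


# Hypothetical responses of five individuals to item X_FC1
responses = [1, 2, 2, 5, 3]
scaled = optimally_scale(responses, Q_FC1)
print(scaled)
```

The scaled values, not the raw levels, are what NPCA uses when it computes the principal component scores passed on to SEM.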
Figures 1 and 2 show the standardized parameter estimates for the conceptual model {H1, H3, H4} obtained from second-order SEM and NPCA-SEM. The figures illustrate that NPCA-SEM can describe the model with fewer parameters than second-order SEM. When the number of observed variables is larger than the number of individuals, factor analysis is not applicable because the model parameters are not identified. NPCA obtains composite variables of the observed variables and can thus avoid the identification problem in SEM. Identification is an important issue in specifying structural and measurement equations. Tables 4 and 5 give the evaluation of the conceptual models {H1, H3, H4} and {H2, H3, H4} from second-order SEM and NPCA-SEM. The third column of the tables gives the unstandardized parameter estimates of the structural equation model. The hypothesis tests are performed at a significance level of 0.05. Under NPCA-SEM, the null hypothesis $\beta_{FCE} = 0$ corresponding to H4 is accepted, which means that CE with Fashion does not influence CE with Sales assistant.
Fig. 1. Parameter estimates for the conceptual model {H1, H3, H4} obtained from second-order SEM.
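Table 4. Evaluation of the conceptual model {H1, H3, H4} from SEM and NPCA-SEM.

| Method | Hypothesis | β | SE | z-value | p-value | Result |
|---|---|---|---|---|---|---|
| SEM | H1 | 1.073 | 0.121 | 8.880 | 0.000 | Reject |
| SEM | H3 | 1.416 | 0.232 | 6.106 | 0.000 | Reject |
| SEM | H4 | −0.708 | 0.262 | −2.699 | 0.007 | Reject |
| NPCA-SEM | H1 | 1.285 | 0.190 | 6.767 | 0.000 | Reject |
| NPCA-SEM | H3 | 0.746 | 0.215 | 3.472 | 0.001 | Reject |
| NPCA-SEM | H4 | −0.452 | 0.302 | −1.500 | 0.134 | Accept |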
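The z-values and p-values in Table 4 are consistent with a Wald-type test of the null hypothesis that a path coefficient is zero, in which $z = \beta / \mathrm{SE}$ and the two-sided p-value comes from the standard normal distribution. The Python sketch below reproduces that arithmetic for the H4 row of NPCA-SEM. It assumes the reported p-values are two-sided normal p-values; the function name `wald_test` is illustrative, and small differences from the table arise because the table entries are rounded.

```python
import math


def wald_test(beta, se):
    """Return the z-value and two-sided normal p-value for H0: beta = 0."""
    z = beta / se
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# H4 under NPCA-SEM in Table 4: beta = -0.452, SE = 0.302
z, p = wald_test(-0.452, 0.302)
decision = "Reject" if p < 0.05 else "Accept"
print(f"z = {z:.3f}, p = {p:.3f}, H0: beta = 0 -> {decision}")
```

In the tables, "Reject" and "Accept" refer to this null hypothesis, so "Accept" for H4 means that the data do not support an influence of FCE on SCE.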
Fig. 2. Parameter estimates for the conceptual model {H1, H3, H4} obtained from NPCA-SEM.

Table 5. Evaluation of the conceptual model {H2, H3, H4} from SEM and NPCA-SEM.

| Method | Hypothesis | β | SE | z-value | p-value | Result |
|---|---|---|---|---|---|---|
| SEM | H2 | 0.624 | 0.074 | 8.427 | 0.000 | Reject |
| SEM | H3 | 1.416 | 0.232 | 6.106 | 0.000 | Reject |
| SEM | H4 | −0.708 | 0.262 | −2.699 | 0.007 | Reject |
| NPCA-SEM | H2 | 0.633 | 0.082 | 7.747 | 0.000 | Reject |
| NPCA-SEM | H3 | 0.746 | 0.215 | 3.472 | 0.001 | Reject |
| NPCA-SEM | H4 | −0.452 | 0.302 | −1.500 | 0.134 | Accept |
Table 6 shows the goodness-of-fit index values of the conceptual models {H1, H3, H4} and {H2, H3, H4}. RMSEA and CFI of NPCA-SEM are better than those of SEM, whereas SRMR is slightly higher for NPCA-SEM (0.072 versus 0.069). The results illustrate that the use of the principal component scores of quantified Likert scale data can improve the RMSEA and CFI values.
Table 6. Index values of goodness of fit of the conceptual models {H1, H3, H4} and {H2, H3, H4}.

(a) {H1, H3, H4}

| Method | SRMR | RMSEA | CFI |
|---|---|---|---|
| SEM | 0.069 | 0.043 | 0.955 |
| NPCA-SEM | 0.072 | 0.000 | 0.973 |

(b) {H2, H3, H4}

| Method | SRMR | RMSEA | CFI |
|---|---|---|---|
| SEM | 0.069 | 0.043 | 0.950 |
| NPCA-SEM | 0.072 | 0.000 | 0.973 |

5 Concluding Remarks
In this paper, we focused on CEs with multiple engagement objects in a fashion consumption context. We explored how consumers engage with multiple engagement objects and empirically examined the interplay among the CEs with them simultaneously. We defined CE as a consumer's cognitive, emotional and behavioral investments in fashion-related interactions. From this CE perspective, we conducted qualitative and quantitative research. In the qualitative research, we drew on semi-structured interviews and extracted the focal objects Fashion, Brand and Sales assistant. Then, we developed conceptual models of the interplay among CEs with these focal objects. The quantitative research collected data for measuring CEs with Likert-type scales. SEM evaluated how well the conceptual models fit the data.
In applying SEM to observed Likert scale data that do not resemble a normal distribution, we proposed NPCA-SEM, which includes NPCA in the SEM computation. This approach does not use observed variables as indicators of latent factors but constructs composite variables of the observed variables using NPCA. That is, instead of estimating latent factors in measurement equation models, NPCA optimally scales the Likert scale data and computes the principal component scores of the optimally scaled data. SEM then estimates the parameters of a given conceptual model using the scores as observed data.
In the analysis of our research data, we employed second-order SEM and NPCA-SEM. The estimation results indicated that NPCA-SEM improves the goodness of fit of the conceptual model to the data. When the observed ordinal data are skewed and do not resemble a normal distribution, NPCA can optimally scale the observed data, and the principal component scores of the optimally scaled data can be treated as observed data in the SEM computation. NPCA can be applied to data in which the number of variables is larger than the number of individuals, whereas factor analysis is not available for such data. Therefore, NPCA-SEM may avoid the identification problem in SEM.
Acknowledgments. This work was supported by JSPS KAKENHI Grant Number JP21K11800.
A Nonlinear Principal Component Analysis
Let $X = (X_1, X_2, \cdots, X_p)$ be an observation matrix on $n$ objects and $p$ variables. We code $X_j$ of the qualitative variable $j$ with $K_j$ categories by using an $n \times K_j$ indicator matrix

\[
G_j = (g_{j1} \cdots g_{jK_j}) =
\begin{pmatrix}
g_{j11} & \cdots & g_{j1K_j} \\
\vdots & \ddots & \vdots \\
g_{jn1} & \cdots & g_{jnK_j}
\end{pmatrix},
\]

where

\[
g_{jik} =
\begin{cases}
1 & \text{if object } i \text{ belongs to category } k, \\
0 & \text{if object } i \text{ belongs to some other category } k' \ (\neq k).
\end{cases}
\]
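The indicator coding can be made concrete with a short Python sketch. It assumes that the categories of variable $j$ are the Likert levels $1, \ldots, K_j$; the function name and the example responses are hypothetical.

```python
def indicator_matrix(responses, n_categories):
    """Build the n x K indicator matrix G_j for one qualitative variable.

    Row i has a single 1 in the column of the category that object i
    belongs to, and 0 elsewhere.
    """
    G = []
    for level in responses:
        if not 1 <= level <= n_categories:
            raise ValueError(f"level {level} is outside 1..{n_categories}")
        row = [0] * n_categories
        row[level - 1] = 1
        G.append(row)
    return G


# Hypothetical responses of four objects on a five-level item
G = indicator_matrix([2, 5, 1, 2], n_categories=5)
for row in G:
    print(row)
```

Each row sums to one, and each column sum gives the number of objects in that category.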
We find $K_j \times 1$ category quantifications $q_j$ under restrictions imposed by the measurement level of variable $j$ and transform $X_j$ into the optimally scaled vector $X_j^* = G_j q_j$. We can use the monotone regression method of Kruskal [9] for quantifying ordinal scale data because of the monotonicity restriction.
For PCA of $X$ of $n$ objects by $p$ quantitative variables, we approximate $X \approx Z A^\top$, where $Z$ is an $n \times r$ matrix of $n$ component scores on $r$ ($1 \le r \le p$) components and $A$ is a $p \times r$ matrix of $p$ component loadings on $r$ components. Then, PCA is formulated in terms of the loss function

\begin{equation}
\sigma(Z, A) = \mathrm{tr}\, (X - Z A^\top)^\top (X - Z A^\top). \tag{12}
\end{equation}

The loss function (12) is minimized over $Z$ and $A$ by means of the singular value decomposition of $X$ or the eigen-decomposition of $X^\top X$. In the presence of qualitative variables, NPCA requires the quantification of the qualitative data to obtain $X^*$. Then, the loss function (12) is replaced by

\begin{equation}
\sigma(Z, A, X^*) = \mathrm{tr}\, (X^* - Z A^\top)^\top (X^* - Z A^\top) \tag{13}
\end{equation}

and is minimized over $Z$, $A$ and $X^*$ under the restrictions

\begin{equation}
X^{*\top} 1_n = 0_p \quad \text{and} \quad \mathrm{diag}\left( \frac{X^{*\top} X^*}{n} \right) = I_p, \tag{14}
\end{equation}

where $1_n$ and $0_p$ are vectors of ones and zeros of length $n$ and $p$, respectively, and $I_p$ is the $p \times p$ identity matrix. Optimal scaling for $X^*$ can be performed for each variable separately and independently, and therefore the loss function (13) can also be rewritten as

\begin{equation}
\sigma(Z, A, X^*) = \sum_{j=1}^{p} (X_j^* - Z A_j^\top)^\top (X_j^* - Z A_j^\top) = \sum_{j=1}^{p} \sigma_j(Z, A_j, X_j^*). \tag{15}
\end{equation}
Thus, we can obtain the minimum of $\sigma(Z, A, X^*)$ by independently minimizing each $\sigma_j(Z, A_j, X_j^*)$ under the measurement restrictions on variable $j$. The minimization problem for the loss function (13) has no closed-form solution for $Z$, $A$ and $X^*$ simultaneously. The alternating least squares (ALS) algorithm is therefore used to obtain the least squares estimates. To minimize the loss function (13) over $Z$, $A$ and $X^*$ under the restriction (14), the ALS algorithm alternates between two estimation steps. The first step computes $Z$ and $A$ as in ordinary PCA, and the second finds the optimally scaled data $X^*$. The algorithm iterates the following two steps:
Step 1: Obtain $A^{(t+1)}$ by solving the eigen-decomposition of $X^{*(t)\top} X^{*(t)} / n$ or the singular value decomposition of $X^{*(t)}$. Compute $Z^{(t+1)} = X^{*(t)} A^{(t+1)}$. Update $\hat{X}^{(t+1)} = Z^{(t+1)} A^{(t+1)\top}$.
Step 2: Find $X^{*(t+1)}$ by separately estimating $X_j^*$ for each variable $j$. Compute $q_j^{(t+1)}$ by

\[
q_j^{(t+1)} = \left( G_j^\top G_j \right)^{-1} G_j^\top \hat{X}_j^{(t+1)}.
\]

Recompute $q_j^{(t+1)}$ by using the monotone regression method. Obtain $X_j^{*(t+1)} = G_j q_j^{(t+1)}$.
Check the convergence by

\[
\sigma^{(t)} - \sigma^{(t+1)} < \delta,
\]

where $\sigma^{(t)} = \sigma(Z^{(t)}, A^{(t)}, X^{*(t)})$ and $\delta$ is the desired accuracy. The detailed derivation of the ALS algorithm can be found in Kuroda et al. [10].
References
1. Brodie, R.J., Hollebeek, L.D., Juric, B., Ilic, A.: Customer engagement: conceptual domain, fundamental propositions, and implications for research. J. Serv. Res. 14, 252–271 (2011)
2. Bowden, J.L.H., Conduit, J., Hollebeek, L.D., Luoma-aho, V., Solem, B.A.: Engagement valence duality and spillover effects in online brand communities. J. Serv. Theory Pract. 27, 877–897 (2017)
3. Bowden, J.L.H., Conduit, J., Hollebeek, L.D., Luoma-aho, V., Solem, B.A.: The role of social capital in shaping consumer engagement within online brand communities. In: Johnston, K.A., Taylor, M. (eds.) The Handbook of Communication Engagement, pp. 491–450. Wiley & Sons (2018)
4. Chandler, J.D., Lusch, R.F.: Service systems: a broadened framework and research agenda on value propositions, engagement, and service experience. J. Serv. Res. 18, 6–22 (2015)
5. de Leeuw, J., Mair, P.: GIFI methods for optimal scaling in R: the package homals. J. Stat. Softw. 31, 1–20 (2009)
6. Heinonen, K.: Positive and negative valence influencing consumer engagement. J. Serv. Theory Pract. 28, 147–169 (2018)
7. Hollebeek, L.D., Glynn, M., Brodie, R.J.: Consumer brand engagement in social media: conceptualization, scale development and validation. J. Interact. Mark. 28, 149–165 (2014)
8. Jaakkola, E., Alexander, M.: The role of customer engagement behavior in value co-creation: a service system perspective. J. Serv. Res. 17, 247–261 (2014)
9. Kruskal, J.B.: Nonmetric multidimensional scaling: a numerical method. Psychometrika 29, 115–129 (1964)
10. Kuroda, M., Mori, Y., Iizuka, M., Sakakihara, M.: Alternating least squares in nonlinear principal components. WIREs Comput. Stat. 5, 456–464 (2013)
11. Naumann, K., Bowden, J., Gabbott, M.: Expanding customer engagement: the role of negative engagement, dual valences and contexts. Eur. J. Mark. 54, 1469–1499 (2020)
12. Rosseel, Y.: lavaan: an R package for structural equation modeling. J. Stat. Softw. 48, 1–36 (2012)
13. Young, F.W., Takane, Y., de Leeuw, J.: The principal components of mixed measurement level multivariate data: an alternating least squares method with optimal scaling features. Psychometrika 43, 279–281 (1978)
Considerations for Health Care Services Related to the Menstrual Cycle
Mayu Moriya1(B), Suzuka Mori1, Momoka Nozawa1, Kaito Ofusa1, Miho Suto1, Ayami Ejiri2, Takeo Ainoya3, and Keiko Kasamatsu1
1 Tokyo Metropolitan University, 6-6, Asahigaoka, Hino-shi, Tokyo 191-0065, Japan
2 Fukushima University, 1, Kanayagawa, Fukushima-shi, Fukushima 960-1248, Japan
3 Tokyo University of Technology, 5-23-22, Nishikamata, Ota-ku, Tokyo 144-8535, Japan
Abstract. This study focused on information provision, which is an issue in promoting women's work styles that take account of health issues related to the menstrual cycle. We examined how to provide information in a way that considers the convenience of both users and service purchasers, incorporating the process of party research based on the HCD process, and devised a health care service. First, a survey of problems and support cases related to the menstrual cycle was conducted to understand the usage situation. As a result, we found that users felt "nanka-gomen," defined as "a feeling of sorry or guilt for causing inconvenience to others due to physiological phenomena that are out of one's control, difficult to control, and hard to explain," and that there were no products or services that directly support such problems. In addition, "nanka-gomen" was categorized into three factors: "physical burden to the surroundings," "mental burden to the surroundings," and "hygienic consideration to the surroundings." Next, to clarify the requirements, we structured the feeling on the basis of its definition and its emotional factors, and identified "lack of knowledge" and "difficulty in communication" as requirements. This led to the proposal of "a self-check service based on the degree of impact on life during PMS and menstruation" and "communication icons for communicating non-verbal health problems" as information provision methods and health care services. These services are expected to enable women to work without strain by making it easier for them to understand their condition, know how to cope appropriately, and communicate their health problems.
Keywords: Menstrual Cycle · Mental Model · Health Care Services · User Experience
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 127–138, 2023. https://doi.org/10.1007/978-3-031-35129-7_9
1 Introduction
1.1 Background
Femtech, meaning products or services that use technology to solve women's health challenges, is currently attracting attention. While its market is expanding overseas, femtech is still not well known in Japan. On the other hand, Japanese companies are becoming increasingly aware of the health aspects related to the menstrual cycle and the way women work, as evidenced by the introduction of a menstrual leave system. However, taking menstrual leave requires disclosure of information about one's menstrual cycle to the company and may require an explanation to gain the understanding of supervisors and coworkers. Until now, many Japanese women have felt uncomfortable talking about menstruation to others, or have had a sense that it must be hidden, so it is expected to be difficult for this information to be provided smoothly.
1.2 Purpose
The purpose of this study is to focus on the provision of information, which could be an issue in promoting women's work styles that take account of health issues related to the menstrual cycle, as described above; to explore how to provide information in a way that considers the convenience of both users and service purchasers, incorporating the process of party research based on the HCD (Human Centered Design) process; and to devise a health care service. We propose a service concept as a solution by understanding the usage situation and clarifying the requirements in the HCD process, based on a survey including discussions with people who menstruate (parties) and people who do not (non-parties).
2 Surveys to Understand and Specify Context of Use (1)
2.1 Method
First, to understand and specify the context of use, a total of six people (five women aged 22–24 who were parties to the study and one man aged 23 who was not) listed specific examples of problems related to menstrual cycles, and these were grouped by the KJ method. Next, we categorized these problems by applying Maslow's five-level needs and surveyed whether there were services or products that could support the problems.
2.2 Results
The specific examples and the grouping of problems related to the menstrual cycle are shown in Fig. 1 and Fig. 2. The problems were divided into the following groups: cycle prediction, schedule changes, difficulty in taking a break and guilt, difficulty to understand, actual symptoms, anxiety about new goods, pills, purchasing menstrual products, bulk of luggage, eating during menstruation, expenses, effects on clothes and bedding, sanitary boxes on the go, places to dump menstrual products while on the go, and what to wear during menstruation.
Fig. 1. List of problems related to menstrual cycle
Fig. 2. Diagram with grouping of problems
When these 15 groups were classified according to Maslow's five-level needs, most of the problems were categorized as lower-level needs (Fig. 3). Those classified as physiological needs were actual symptoms and difficulties related to symptoms that appear in relation to menstruation, such as cycle prediction, pills, and eating during menstruation. Safety needs included anxiety about new goods, purchasing menstrual products, bulk of luggage, expenses, impact on clothing and bedding, sanitary boxes on the go, places to dispose of menstrual products on the go, cycle prediction, pills, menstrual clothing, eating during menstruation, and other problems related to coping with menstruation. Social needs included difficulties with schedule changes and understanding, difficulty in taking a break and guilt, and difficulties related to the impact on surroundings, such as cycle prediction, pills, and dressing during menstruation.
Fig. 3. Five-level need classification chart for problems related to menstrual cycle
Next, existing services and products that support these problems were surveyed and categorized, again applying Maslow's five-level needs. Painkillers were found as support for actual symptoms. Support for coping with menstruation comprised sanitary cloth napkins, menstrual cups, absorbent panties, basal thermometers, and menstrual management apps. Products related to the need for self-actualization were those treated as femtech, such as sanitary cloth napkins, menstrual cups, and absorbent panties. Thus, it was found that few services or products exist that provide direct support regarding the impact on surroundings (Fig. 4). In the discussion, cloth sanitary napkins, menstrual cups, and absorbent panties, known as femtech, were considered good for the environment. However, at present it is difficult for users to manage the washing and sterilization of these items, and the environment is not conducive to replacing them on the go. For these reasons, the respondents were reluctant to use them.
Fig. 4. Five-level need classification chart for support tools for problems related to menstrual cycles
2.3 Discussion
These results indicate that problems related to the menstrual cycle fall into the lower-level needs (physiological, safety, and social needs), and that there are no services or products that provide direct support for problems related to social needs, especially those related to the impact on surroundings. In terms of understanding the menstrual cycle, basal thermometers and menstrual management applications can provide indirect support for problems related to schedule changes and their impact on surroundings. However, direct support is needed for the overall problems related to understanding, difficulty in taking time off, guilt, and other difficulties related to the impact on those around them. Femtech products were also classified as a higher-level need because of their low environmental impact and their social contribution aspect. This suggests that the reason for the low awareness and limited use of femtech in Japan is that the lower-level needs have not yet been met, so the higher-level needs do not arise.
3 Surveys to Understand and Specify Context of Use (2)
3.1 Method
In Survey (1), it became clear that services and products that provide direct support for the overall problems related to the impact on surroundings are needed. Therefore, to clarify the factors that contribute to these problems, the episodes related to the problems listed in Survey (1) were divided into scenes by five female respondents aged 22–24 and grouped by the KJ method for analysis.
3.2 Results
Figure 5 shows the results of grouping the episodes related to the problems into scenes. Scenes were divided into the following categories: home, office, school, on a trip, in a car, in a crowd, commuting, gynecology, exercise, and always.
3.3 Discussion
Based on the above results, the factors of distress regarding the impact on surroundings were categorized into feelings of "nanka-gomen," lack of recognition, and others. The "nanka-gomen" feelings are episodes such as "I have inconvenienced others by changing my schedule due to menstrual cramps," "I have been hard on people due to my depressed mood caused by my menstrual cycle," and "I have trouble finding a place to throw away used sanitary products when I go to someone's house." The lack of recognition covers episodes such as "Sometimes it is difficult for women to understand each other because symptoms vary from person to person," and "It is a little depressing to be always worried because the symptoms are not severe every time." The others were episodes such as "it is hard to buy sanitary products when the clerk is male" and "I can't wear white clothes." It was also found that the feeling of "nanka-gomen" was felt regardless of the scene.
Fig. 5. Diagram of episodes of problems related to impact on surroundings classified by scene and grouping of factors
4 Surveys to Understand and Specify Context of Use (3)
4.1 Method
In Survey (2), it was found that the respondents felt "nanka-gomen" regardless of the scene, as a problem related to the impact on surroundings. To investigate what kind of feelings these were, five women aged 22–24 who were parties to the study listed and analyzed episodes related to their menstrual cycle in which they felt "nanka-gomen" and what they felt it toward, dividing the time of the feeling into phases of the menstrual cycle: ovulation, premenstrual phase, during menstruation, and postmenstrual phase. In addition, the episodes were grouped with a focus on "what they felt it toward" in order to define these feelings and to identify their factors.
4.2 Results
The following is a list of menstrual cycle-related episodes in which the respondents felt "nanka-gomen" and what they felt it about, broken down by the menstrual cycle (Fig. 6).
Fig. 6. Diagram of episodes and times when they felt “nanka-gomen”
Next, the following diagram shows the grouping of those listed as “what they felt it toward” (Fig. 7).
Fig. 7. Diagram of grouping of “what they felt it toward”
4.3 Discussion
Based on the above results and the specific experiences mentioned above, "nanka-gomen" was defined as "a feeling of apology and guilt for feeling that one has inconvenienced others due to physiological phenomena that are difficult to control and explain." In addition, it was categorized into three main categories: physical burden to surroundings, mental burden to surroundings, and hygienic consideration to surroundings (Fig. 8). "Physical burden" refers to placing a physical burden on one's surroundings, such as an increase in a partner's workload because of one's absence. "Mental burden" refers to burdening the partner mentally, such as burdening the partner with one's emotional changes. "Hygienic consideration" refers to the possibility of making the other person feel hygienically uncomfortable. According to "The National Attitude Survey" conducted by Japan's Ministry of Land, Infrastructure, Transport and Tourism in 2019, the top things Japanese people consider to be traditional Japanese sensibilities were consideration for others, traditional culture and customs, harmony and cooperation, and love for nature [1]. Thus, the feeling of "nanka-gomen" when one's own physiological phenomenon causes a burden to those around them is thought to be a feature of Japan's culture, which emphasizes consideration for others and group harmony and cooperation.
Fig. 8. Factors contributing to “nanka-gomen”
5 Identify User Requirements
5.1 Method
In Survey (3), “nanka-gomen” was defined as “a feeling of apology and guilt for feeling that one has inconvenienced others due to physiological phenomena that are difficult to control and explain.” Based on this definition, user requirements were identified by structuring the feeling of “nanka-gomen.”
5.2 Results
The following figure shows the result of structuring “nanka-gomen” (Fig. 9). The feelings and problems that are factors in the definition of “nanka-gomen” were structured around four aspects: “difficult to control,” “difficult to explain,” “don’t want to explain,” and “think about the impact on surroundings.”
Fig. 9. Diagram of structuring “nanka-gomen”
5.3 Discussion
As the above results showed, one factor behind the feeling of “nanka-gomen” was a lack of knowledge underlying the “difficulty in explaining,” such as “not many opportunities to obtain information about menstruation” and “lack of a common language to explain.” As for the “impact on surroundings,” the factors were “more work that cannot be done” and, at such times, a “difficulty in communication” such as “difficulty in explaining.” Therefore, we proposed service concepts based on the belief that the user requirements were to resolve this lack of knowledge and difficulty in communication.
6 Proposal of Service Concepts
6.1 Service Concept A
The first service concept is a self-check service based on PMS (Premenstrual Syndrome) and the degree of impact on one’s life during menstruation. Regarding the “lack of knowledge” that is a factor in the feeling of “nanka-gomen,” the discussion revealed that symptoms related to menstruation vary greatly from person to person, that it is difficult to know the severity of one’s own symptoms, that information may be difficult to obtain depending on one’s environment, and that there may be no one with similar symptoms to refer to for coping strategies. Therefore, we thought it would be ideal for users to learn their type and position among the various menstrual symptoms through a self-check and to obtain information appropriate to that type, which led us to propose this concept. The service also enables communities to form among people of the same type, allowing them to share their problems and coping strategies and to obtain new information. To use the system, users first answer questions about their actual symptoms of PMS and menstruation, such as the degree of the symptoms that appear and the activities that the symptoms make impossible. Next, users can check their own type and retrieve information about it. It is also possible to provide type-specific counseling by AI, and to share information on problems and countermeasures by forming a community with people of the same type. Although type diagnosis services of this kind already exist, compared to prior examples [2–5], this concept is expected to expand the range of ways to interact with those around the user, in that it supports communication with people of the same type (Table 1). The following illustration is the service scenario for Service Concept A (Fig. 10).
Table 1. Comparison with prior examples of type diagnosis services
Fig. 10. Service Scenario for Service Concept A
6.2 Service Concept B
The second service concept is communication icons that convey health problems non-verbally. In structuring “nanka-gomen,” the results showed that the factors were “more work that cannot be done” and “difficulty in communication” such as “difficulty in explaining.” Some of the actual episodes included: “It’s hard to say I’m not feeling well because of menstruation,” “I don’t want everyone to worry,” and “I have menstrual pain and can participate in online classes by only listening, but it’s hard to speak up.” In response, we thought it would be ideal to be able to indicate in internal communication tools such as Slack and Microsoft Teams that one will work according to one’s physical condition that day, without having to go to the trouble of saying so in words. In fact, Slack has a status entry feature that allows users to indicate with a single icon what kind of situation they are working in, such as being on vacation, in a meeting, or on sick leave. With reference to this precedent, we propose a communication icon that can present to others what kind of working conditions are possible. To use this icon, the user first applies to the company’s occupational physician for permission to use it, for a reason such as “I have severe menstrual pain” or “I have a headache,” and receives approval. Next, when the user is not feeling well, the user sets the icon and the range of possible tasks, and the supervisor and co-workers are informed that the user will work according to his/her physical condition on that day. If a supervisor or colleague wants to know the scope of possible work in more detail, he or she can view it by performing an operation (Fig. 11). Because the icon is not limited to menstruation-related health problems and can be used by people of any gender, including people with chronic illnesses, it also prevents others from finding out that the user is menstruating. Such nonverbal communication is expected to reduce the psychological burden of communicating one’s condition to the other party.
Fig. 11. Service Scenario for Service Concept B
7 Conclusion
In this study, we explored how to provide information and health care services related to the menstrual cycle, considering the convenience of both users and service purchasers, by incorporating party research into the HCD process. First, in understanding and specifying the context of use, the problems of the parties concerned were categorized as lower-order needs in Maslow’s five-level hierarchy of needs. Among these lower-order needs, no products or services directly supported problems related to the impact on surroundings, which fall under social needs.
The factors that contribute to the problems related to the impact on surroundings are the feeling of “nanka-gomen” and a lack of recognition. “Nanka-gomen” was defined as “a feeling of apology and guilt for feeling that one has inconvenienced others due to physiological phenomena that are difficult to control and explain.” This feeling was classified into three factors: “physical burden to the surroundings,” “mental burden to the surroundings,” and “hygienic consideration for the surroundings.” Next, in order to clarify the requirements, the definition of “nanka-gomen” and the structuring of “nanka-gomen” in terms of emotional factors were used, and “lack of knowledge” and “difficulty in communication” were identified as requirements. Based on the above, we proposed the “Self-check service based on PMS and the degree of impact on one’s life during menstruation” and “Communication icons to communicate health problems nonverbally” as methods of providing information and health care services for the convenience of both users and service purchasers. We believe that these services will help women understand their own bodies and learn how to deal with them appropriately, so that they will be able to work comfortably even before and after menstruation.
Acknowledgments. This research was conducted in collaboration with TOPPAN Edge Inc. We hereby express our gratitude to TOPPAN Edge Inc.
References
1. Japan’s Ministry of Land, Infrastructure, Transport and Tourism: National Land Transport White Paper, p. 20 (2019)
2. LunaLuna. https://sp.lnln.jp/brand. Accessed 10 Feb 2023
3. PAIRCARE. https://paircare.jp. Accessed 10 Feb 2023
4. Health & Rights Inc. https://healthandrights.jp/careme-/. Accessed 10 Feb 2023
5. Sofy. https://www.sofy.jp/ja/app/sofy/type.html. Accessed 10 Feb 2023
Dialogue-Based User Needs Extraction for Effective Service Personalization
Takuya Nakata1(B), Sinan Chen2, Sachio Saiki3, and Masahide Nakamura2
1 Graduate School of Engineering, Kobe University, 1-1 Rokkodai, Nada, Kobe 657-8501, Japan
[email protected]
2 Center of Mathematical and Data Sciences, Kobe University, 1-1 Rokkodai, Nada, Kobe 657-8501, Japan
[email protected], [email protected]
3 School of Data and Innovation, Kochi University of Technology, 185 Miyanokuchi, Tosayamada, Kami, Kochi, Japan
[email protected]
Abstract. Research on service personalization is flourishing owing to the development of machine learning and natural language processing. Although much prior research is based on deep learning and dialogue, it remains challenging to reconcile the weaknesses of machine learning, such as limited explainability, with the strength of utilizing big data. This research proposes a user needs model that incorporates three elements: user readability, ease of extraction through dialogue, and the potential for advancement in machine learning. Additionally, a voice dialogue-based extraction method is designed and constructed to extract the proposed needs. Specifically, adopting the 6W1H format for the needs model yields a simple yet powerful dialogue flow and enables existing services and needs to be compared in the same format. The main modules of the system are a voice dialogue agent, a dialogue system, and a natural language processing-based needs extraction API. Through designing, implementing, and integrating each module, this study realizes a needs extraction system in Japanese. Furthermore, by operating the realized system, a simple evaluation of the needs model and the system is carried out. As a result of this research, needs can be extracted in a form that is highly readable for both the user and the system, contributing to the realization of user-friendly and effective service personalization.
Keywords: Personalization · Needs · Voice dialogue · Natural language processing · Virtual agent
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 139–153, 2023. https://doi.org/10.1007/978-3-031-35129-7_10
1 Introduction
Recently, there has been a proliferation of research into service personalization, driven by advancements in machine learning [6] and natural language processing technologies [22]. Service personalization involves customizing the features of services to fit the individual, with the aim of increasing the consumer’s willingness to use the services. Research into personalization can be broadly categorized into two areas. Recommendation by machine learning learns from the user’s history of service usage and presents services with high recommendation priority. Conversational personalization, on the other hand, involves interaction between the user and the system through an agent, customizing services to the individual based on explicit or implicit information provided through the conversational history. There has also been a growing body of research on conversational recommendation systems that combine machine learning and conversation [11]. A long-standing challenge in service personalization is that the weaknesses of machine learning, such as limited interpretability and fairness, have not been effectively reconciled with the advantages of utilizing big data [9]. Therefore, this study proposes and designs a needs model that combines user readability with suitability for machine learning, and aims to realize a conversational system that extracts it. The key idea of the research is to extract and accumulate needs in the 6W1H format (how, what, when, where, who, whom, why) through a voice dialogue agent. The 6W1H format is not only easy for many users to understand, but also enables a simple needs extraction dialogue flow in a fill-in-the-blank format. Additionally, it can serve as a form of expression for context-aware services, which can lead to service recommendation based on the comparison of needs and services. The approach of this study is as follows:
(A1) Design of a needs model
(A2) Design of a needs extraction dialogue system
(A3) Design of a dialogue flow
(A4) Design of a needs extraction API
(A5) Implementation and evaluation of the dialogue system.
Initially, we design the 6W1H needs model and then design a conversational system for extracting needs. The system’s main modules consist of a voice conversational agent, a dialogue-based extraction system, and a needs extraction API. In (A3), we design the dialogue flow between the user and the voice conversational agent. In (A4), we explain how natural language processing in the needs extraction API converts needs statements into the 6W1H needs model. Specifically, we describe the method of extracting each 6W1H element through a morphological analyzer and a syntax analyzer. Finally, by actually implementing and operating the conversation system with a focus on Japanese, we perform a simple evaluation of the needs model and system.
2 Preliminaries
2.1 Personalization
Personalization, in a broad sense, refers to making something in accordance with the needs of a particular person [1]. Personalization in digital technology is a long-studied area, in which it means the process of modifying a system’s functionality, interface, information access, content, and distinctiveness to increase its personal relevance to an individual or a group [4]. Personalizing a service is expected to increase consumer satisfaction and the motivation to use the service. Classical approaches to personalization include users changing service settings themselves, estimating and recommending individual preferences with machine learning [7], and adaptation through conversational agents based on natural language processing.
2.2 Machine Learning-Based Personalization
In machine learning, research on personalization using deep learning has been especially active in recent years [7]. The main technique is recommendation using collaborative filtering [2]: based on the user’s history of service usage, such as service subscriptions and product purchases, deep learning recommends appropriate services to the user. Collaborative filtering is used especially on online shopping sites [8]. Personalization using machine learning has several issues. Because service usage data is referenced for learning, some consumers have concerns about privacy and security, which decreases their motivation to use the service [3]. Additionally, it is difficult for machine learning to fully grasp the user’s intention, and erroneous recommendations risk reducing purchase behavior. Furthermore, attention is being paid to the explainability of how a recommendation was made and to the fairness of learning results, which can suffer from bias in the user groups represented in the training data [10,15]. In response to these problems, research on causal inference and on conversational recommender systems (CRS), which combine conversation and machine learning, has been attracting attention in recent years [5,11].
2.3 Conversational Personalization
A great deal of research has been conducted on personalization based on natural language dialogue between users and systems. One example of a recommendation technique is the conversational search system (CSS) [21]. A CSS establishes the target in the initial conversation, infers the desired service through subsequent questions, answers, and searches, and finally presents the results to the user and obtains feedback on their satisfaction. Another form of personalization is research that adapts the agent itself, which conducts the conversation with the user, through explicit questions and implicit adaptations in the conversation [13]. Adapting the agent personalizes the entire process, including the messages the agent speaks and its recommendation functions. The adaptation of agents is particularly advanced in health care applications [20]. In our laboratory, we are actively engaged in research on individual adaptation and agents, with expected application to conversational personalization. Examples include research on a personalization framework for smart systems [18] and on agents that support conversations for the elderly [19].
CRS is a recommendation system that combines machine learning-based individual adaptation and conversational personalization. The architecture of the CRS is shown in Fig. 1. It applies machine learning to dialogue logs to make recommendations, while enabling rich questions and feedback through dialogue. Conventional research combining machine learning and dialogue frequently used one-shot conversations, which estimate the user’s preferences from past behavior and unilaterally propose high-priority services. Recent CRS research, in contrast, has the user and the system interact richly through questions and feedback, and can make more personalized recommendations by grasping the user’s actual needs and feedback.
Fig. 1. Conversational recommender system.
2.4 Issues and Challenges of Conventional Research
A summary of the problems in the conventional research on personalization is presented. Firstly, in machine learning, there may be a lack of fairness, such as a dependence of results on learning data and the occurrence of biases in recommended services. Additionally, needs are not extracted as strings but as parameters, making it difficult to explain how the service recommendation came about. Furthermore, user participation in the recommendation process is difficult, making it difficult to correct errors in recommendations and fine-tune services to user preferences. The drawback of conversational personalization is that the accuracy of recommendations decreases when machine learning is not used. CRS, which performs recommendations by directly linking existing services and conversation content through machine learning, also has problems with fairness and explainability, like machine learning. Considering these issues, the construction of a new personalization that can jointly solve the disadvantages of machine learning and conversational personalization is a challenge.
Fig. 2. Overview of the proposed system.
4 Proposal Method
4.1 Goal and Key Idea
The aim of this study is to construct a method for obtaining user demands for services through dialogue with a system and extracting and accumulating them as a needs model that is effective for service personalization and understandable to the user. The target services for personalization are not limited to the personalization of conversational agents, which have been mainly studied in traditional conversational personalization, but include all services provided on a wide scale. The key idea is to obtain the content of the dialogue through a voice conversational agent and extract and accumulate the needs in the 6W1H format, which is easily understandable to the user, using natural language processing that does not rely on machine learning. An overview of the proposed system is shown in Fig. 2. The approach of this study is as follows:
(A1) Design of a needs model
(A2) Design of a needs extraction dialogue system
(A3) Design of a dialogue flow
(A4) Design of a needs extraction API
(A5) Implementation and evaluation of the dialogue system.
4.2 (A1) Design of a Needs Model
A single need of a user for a specific service is specified by the 6W1H elements. The 6W1H elements are what, when, where, who, whom, why, and how, each of which has the following meaning:
– how: How the service will be executed (the executed service)
– what: Specifically what will be done in the service
– when: When the service will be executed
– where: Where the service will be executed
– who: Who will execute the service as the main subject
– whom: To whom the service will be executed
– why: Why the service will be executed (the conditions and purpose for executing the service)
Fig. 3. The architecture of the needs extraction dialogue system.
For example, representing the need of “I want the smart speaker to announce the weather forecast in a loud voice that can be heard by Grandpa in the living room if it rains every morning at 6:30” using the 6W1H elements would be as follows:
– how: Smart speaker
– what: I want it to announce in a loud voice that can be heard
– when: Every morning at 6:30
– where: Living room
– who: None
– whom: Grandpa
– why: If it rains
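As an illustration only, the 6W1H needs model and the example above could be held in a small record type. The following is a minimal sketch in Python; the class name `Need` and the helper `missing` are hypothetical and do not come from the paper's implementation.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


# Hypothetical record type for one need; the field names follow the 6W1H
# elements in the text, everything else is an assumption for illustration.
@dataclass
class Need:
    how: Optional[str] = None    # the executed service
    what: Optional[str] = None   # what will be done in the service
    when: Optional[str] = None
    where: Optional[str] = None
    who: Optional[str] = None    # main subject that executes the service
    whom: Optional[str] = None   # target of the service
    why: Optional[str] = None    # conditions and purpose

    def missing(self) -> List[str]:
        """Return the names of the elements that are not filled in."""
        return [name for name, value in asdict(self).items() if value is None]


# The smart speaker example from the text; "who" is left unset.
weather_need = Need(
    how="Smart speaker",
    what="I want it to announce in a loud voice that can be heard",
    when="Every morning at 6:30",
    where="Living room",
    whom="Grandpa",
    why="If it rains",
)
print(weather_need.missing())
```

Keeping every element optional mirrors the fact that a need does not have to contain all seven elements.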
Not every need has all of the 6W1H elements; some elements may be missing, like the who element in the above example. Furthermore, a user’s needs may change with time and conditions. Additionally, a service that the user considered necessary may turn out to be unnecessary once the user has actually received it [17]. Therefore, a user may hold multiple, competing needs for a specific service.
4.3 (A2) Design of a Needs Extraction Dialogue System
The architecture of the proposed needs extraction dialogue system is illustrated in Fig. 3. The main actors and modules of the proposed architecture are as follows:
– User: Possesses a need for a specific service and conveys the request to the agent.
– Agent: Acquires the user’s speech and transmits it to the system. The agent also verbally utters the response.
– Dialogue-based extraction system: The core system that controls the needs extraction flow. It carries out API calls and database accesses, and generates the response based on the extracted 6W1H elements.
– Needs extraction API: Extracts 6W1H elements from the need statement through natural language analysis.
– Needs database: A database that accumulates needs as 6W1H elements.
The operation of the proposed architecture during one response and needs extraction process consists of the following five stages:
1. Listening to the need sentence
2. Extracting the 6W1H elements
3. Accumulating and updating needs
4. Generating the response sentence
5. Speaking the response sentence
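The five stages can be sketched as a single function that handles one turn. This is a rough illustration under stated assumptions, not the authors' Kotlin backend: the names `run_turn`, `merge_needs`, `generate_response`, and `stub_extract` are hypothetical, the extraction API is replaced by a lookup table, the database is a plain dictionary keyed by a user ID, and the merge rule (a newly extracted value replaces a stored one) is one possible reading of "merged".

```python
from typing import Callable, Dict, Optional

Need = Dict[str, Optional[str]]
ELEMENTS = ("how", "what", "when", "where", "who", "whom", "why")


def merge_needs(old: Need, new: Need) -> Need:
    """Stage 3: a newly extracted value replaces the stored one; gaps keep the stored value."""
    return {e: new.get(e) or old.get(e) for e in ELEMENTS}


def generate_response(need: Need) -> str:
    """Stage 4: ask about missing elements, or ask for confirmation if none are missing."""
    missing = [e for e in ELEMENTS if not need.get(e)]
    if missing:
        return "Do you have any additional information, such as " + ", ".join(missing) + "?"
    return "Is this request correct? " + ", ".join(f"{e}: {need[e]}" for e in ELEMENTS)


def run_turn(text: str, user_id: str,
             extract: Callable[[str], Need], database: Dict[str, Need]) -> str:
    """Stages 2 to 4 for one utterance that the agent has already turned into text (stage 1)."""
    extracted = extract(text)                                               # stage 2
    database[user_id] = merge_needs(database.get(user_id, {}), extracted)   # stage 3
    return generate_response(database[user_id])   # stage 4; the agent then speaks it (stage 5)


def stub_extract(text: str) -> Need:
    """Hypothetical stand-in for the needs extraction API."""
    table: Dict[str, Need] = {
        "Please turn on the lights at 6:30 every morning":
            {"how": "lights", "what": "turn on", "when": "6:30 every morning"},
        "In the bedroom": {"where": "bedroom"},
    }
    return table.get(text, {})


db: Dict[str, Need] = {}
first = run_turn("Please turn on the lights at 6:30 every morning", "user-1", stub_extract, db)
second = run_turn("In the bedroom", "user-1", stub_extract, db)
print(first)
print(second)
```

Passing the extractor in as a function keeps the turn logic independent of the language-specific API, which matches the separation between the dialogue-based extraction system and the needs extraction API in the architecture.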
First, in the stage of listening to the need sentence, the user speaks a need sentence to the agent, and the agent converts the spoken content into text through a voice recognition API. The text is sent to the dialogue-based extraction system. Next, in the stage of extracting the 6W1H elements, the dialogue-based extraction system sends the need sentence to the needs extraction API, which extracts the 6W1H elements from the need sentence by natural language processing. The extracted 6W1H elements are returned to the dialogue-based extraction system. Then, in the stage of accumulating and updating needs, the 6W1H elements are accumulated in the needs database. If 6W1H elements were already accumulated in an earlier round of need extraction, the old and new needs are merged and stored as an updated need. In the stage of generating the response sentence, the dialogue-based extraction system generates an appropriate response sentence by referring to the extracted needs. The specific content of the response sentence will be described later. The generated response sentence is passed on to the agent. Finally, the agent speaks the response sentence to the user. The user can give the agent a new reaction to the response sentence, and the system operates from the first stage again. The system operates by repeating this process multiple times.
4.4 (A3) Design of a Dialogue Flow
The flow of user-agent interaction proceeds as follows:
Step 1. (User’s speech) The user expresses a need to the agent (e.g. “Please turn on the lights at 6:30 every morning”). If the speech mentions no specific service and is deemed irrelevant to a need, the conversation terminates.
Step 2. (Agent’s speech) The agent asks the user for any missing 6W1H elements (e.g. “Do you have any additional information, such as the location?”).
Step 3. (User’s speech) The user informs the agent of any missing 6W1H elements (e.g. “In the bedroom”). If there are still missing elements, the conversation returns to step 2. If no further elements are necessary to express the need, the user informs the agent (e.g. “I have no further requests”) and the conversation proceeds to step 4.
Step 4. (Agent’s speech) The agent presents the extracted need to the user and asks whether the correct need has been extracted (e.g. “Your request is to turn on the lights in the bedroom every morning at 6:30, correct?”).
Step 5. (User’s speech) The user answers the agent and ends the conversation by confirming or denying the presented need (e.g. “Yes, that’s correct”). If the need is incorrect, the user can go back to step 1 and express the need from the beginning.
At each step of the interaction, the needs extraction process and response generation described in the design of the needs extraction dialogue system (A2) are carried out. The response text varies with the interaction flow: texts such as “Do you have any additional conditions regarding time and location?” are generated and spoken to ask the user for missing 6W1H elements, and texts such as “Do you desire such things?” to confirm the extracted needs.
4.5 (A4) Design of a Needs Extraction API
The needs extraction API uses natural language processing to extract the 6W1H elements from a need sentence. In this implementation, a simplified needs extractor was created, with Japanese as the target language, that extracts the five elements how, when, where, whom, and what and excludes who and why. The natural language processing uses the Japanese morphological analyzer JUMAN++ [16] and the syntactic analyzer KNP [12]. A morphological analyzer is a tool that analyzes parts of speech and conjugation forms. A syntactic analyzer is a tool that analyzes the grammatical relationships between clauses. The extraction method for each element is described below:
– how: Extract a pre-registered service name that exactly matches a character string in the sentence.
– when: Extract clauses related to time and combine them using their grammatical relationships.
– where: Extract clauses containing morphemes related to place and combine them using their grammatical relationships.
– whom: Extract clauses containing morphemes related to people and combine them using their grammatical relationships.
– what: Combine clauses related to the predicate that are not included in the other 6W1H elements.
Fig. 4. Screen of needs extraction dialogue.
For example, if a virtual agent service named “Mei-chan” is registered, the following elements are extracted from the need sentence “Please ask my father’s physical condition at home every morning using Mei-chan”:
– how: Mei-chan
– when: every morning
– where: home
– whom: my father
– what: ask about physical condition
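The assignment rules above can be sketched independently of the Japanese analyzers. The following Python illustration assumes that morphological and syntactic analysis has already split the sentence into clauses and attached category tags; it does not call JUMAN++ or KNP. The tag names, the `Clause` type, and the function `extract_needs` are hypothetical, and the priority order used when a clause carries several tags is an assumption.

```python
from typing import Dict, List, Optional, Set, Tuple

# (surface text, category tags); in the paper's system such information would
# come from JUMAN++ and KNP, here it is written by hand for illustration.
Clause = Tuple[str, Set[str]]

REGISTERED_SERVICES = {"Mei-chan"}   # pre-registered service names (assumed)


def extract_needs(clauses: List[Clause]) -> Dict[str, Optional[str]]:
    """Assign each clause to at most one of how, when, where, whom, what."""
    parts: Dict[str, List[str]] = {"when": [], "where": [], "whom": [], "what": []}
    how: Optional[str] = None
    for text, tags in clauses:
        if text in REGISTERED_SERVICES:      # how: exact string match only
            how = text
        elif "time" in tags:                 # when: clauses related to time
            parts["when"].append(text)
        elif "place" in tags:                # where: clauses related to place
            parts["where"].append(text)
        elif "person" in tags:               # whom: clauses related to people
            parts["whom"].append(text)
        elif "predicate" in tags:            # what: remaining predicate clauses
            parts["what"].append(text)
    need: Dict[str, Optional[str]] = {"how": how}
    for element, texts in parts.items():
        need[element] = " ".join(texts) if texts else None
    return need


# Hand-tagged clauses for the "Mei-chan" example sentence in the text.
example = [
    ("every morning", {"time"}),
    ("home", {"place"}),
    ("my father", {"person"}),
    ("ask about physical condition", {"predicate"}),
    ("Mei-chan", set()),
]
result = extract_needs(example)
print(result)
```

Because the how element relies on exact string matching, a service name that is written differently from the registered one is not assigned to how in this sketch.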
4.6 (A5) Implementation and Evaluation of the Dialogue System
Here, we briefly describe the implementation of the dialogue system we constructed. The agent was realized by utilizing our laboratory’s existing research on a voice dialogue agent (MMDAgent: Mei-chan) [14,19]. Its processing is written in JavaScript and runs in a browser. It receives the user’s voice input and converts it into text using a voice recognition API. Additionally, the voice dialogue agent is linked to a user management service, allowing it to identify the user in conversation. The dialogue-based extraction system is a backend system written in Kotlin and runs on a Tomcat web server. The extracted needs are stored in a MySQL database. The needs extraction API performs natural language processing using JUMAN++ and KNP in Python and returns the results. Figure 4 shows a screen of the Japanese needs extraction dialogue. The translation of the dialogue contents is as follows:
– User: “When I get home, I want you to play some funny videos on YouTube.”
– Agent: “Are there any additional details such as time and people regarding the YouTube execution?”
– User: “In the evening.”
– Agent: “Are there any additional details such as people regarding the YouTube execution?”
– User: “No.”
– Agent: “To summarize your request, you would like to play some funny videos on the service YouTube at home in the evening after you get home, is that correct?”
– User: “Yes, that’s right.”
By repeating the dialogue, the needs were extracted in the 6W1H format, and the extracted needs could be presented to the user in a form that was easy to understand. Figure 4 depicts a successful extraction of needs; however, there were also instances of failed extraction. Figure 5 depicts a dialogue in which voice recognition failed. The content of the conversation, translated, is as follows:
– User: “Can you tell me about yesterday’s news, Mei-chan (in hiragana)?”
– Agent: “Which service would you like to use?”
– User: “Mei-chan (in katakana).”
– Agent: “For Mei-chan (in katakana), if you have any additional conditions such as time, place, people, or what you would like to do, please let me know.”
Fig. 5. Failed dialogue in speech recognition.
Figure 6 depicts a dialogue in which needs extraction failed. The content of the conversation, translated, is as follows:
– User: “I want heating turned on with air conditioning service.”
– Agent: “Regarding the execution of air conditioning service, are there any additional conditions such as time and place?”
– User: “No.”
– Agent: “To summarize your request, you’d like to have an air conditioning service available for people like ‘air conditioning service’ and ‘heating’, is that correct?”
Fig. 6. Failed dialogue in needs extraction.
In Fig. 5, the correct service name is “Mei-chan” in katakana, but voice recognition produced the hiragana string “Mei-chan”, which is a homophone, so the service was not recognized. When voice recognition was conducted once again, the correct result was obtained. In Fig. 6, the extraction of needs failed: “air conditioning service,” which should have been extracted as how, and “heating,” which should have been extracted as part of what, were mistakenly extracted as whom.
5 Consideration
5.1 Advantages
In this study, the proposed needs extraction dialogue system makes it simple to realize needs extraction with natural language processing: needs are extracted from conversation in a form that is easily comprehensible to humans. This not only lets users participate in the individual adaptation process through conversation, but also contributes to the explainability and fairness of the extracted needs data. The generation of needs confirmation statements, as shown in Fig. 1, is a simple process that connects the 6W1H elements with particles, and it explains the result well because needs expressed as 6W1H elements have a structure that users can easily understand. The validity of the 6W1H needs model can be attributed to its familiarity and ease of understanding among many people, as the 6W1H concept expands
on the 5W1H. Additionally, needs can be extracted through a simple conversation flow in which the system repeatedly asks questions and the user responds with 6W1H elements in a fill-in-the-blank format. Furthermore, the 6W1H elements are capable of representing the context of things. The trigger and execution conditions of a context-aware service can also be expressed in the same format, so the format is considered appropriate for representing services. This also opens the way to comparing and analyzing needs and services. Because the proposed dialogue flow extracts needs in a fill-in-the-blank form with 6W1H elements, it is easy to identify missing elements in the extracted needs.
5.2 Limitations
In this implementation, for the sake of simplicity, the extraction of the who element, which represents the acting body, and the why element, which represents the reason, was omitted. It is therefore necessary to improve the needs extraction API so that these elements can be extracted. The extraction of who and why is considered possible by analyzing particles. In addition, defects in the extraction algorithm, such as extraction failures of the whom element, need to be corrected. Although the needs extraction API was implemented here as a microservice for the Japanese language, needs extraction for other languages such as English can be made possible by implementing corresponding needs extraction APIs and using them in the proposed architecture. Furthermore, it was found that extraction of the how element, which represents the execution service, may fail depending on the results of speech recognition, so the extraction of how needs to be made more flexible. One method is to match by word similarity rather than by exact match. Additionally, to prevent speech recognition errors from affecting the accuracy of needs extraction, the accuracy of the recognized text needs to be improved. Utilizing a superior recognition API or correcting the recognized text may be necessary. The future challenges are twofold. Firstly, it is necessary to confirm that the needs are correctly detected by the proposed system. Secondly, a recommendation system should be constructed that expresses services in the 6W1H format and compares them with the extracted needs.
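Matching the how element by similarity rather than exact match could look something like the sketch below. It is an illustration under stated assumptions, not the method used in the paper: the kana spellings “メイちゃん” and “めいちゃん”, the helper names, and the 0.6 threshold are all hypothetical, and the similarity measure is simply the ratio from Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def to_hiragana(text: str) -> str:
    """Map katakana (U+30A1..U+30F6) onto hiragana by the fixed 0x60 offset."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch for ch in text
    )


def normalize(text: str) -> str:
    """Lower-case and fold kana so that homophonous spellings compare equal."""
    return to_hiragana(text).lower()


def match_service(
    heard: str, services: Iterable[str], threshold: float = 0.6
) -> Optional[str]:
    """Return the registered service most similar to the recognized text.

    The threshold is an assumed tuning value; None means no service is
    similar enough and the agent should ask again.
    """
    best_name, best_score = None, 0.0
    for name in services:
        score = SequenceMatcher(None, normalize(heard), normalize(name)).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Assumed registered service names, for illustration only.
SERVICES = ["メイちゃん", "YouTube"]
kana_hit = match_service("めいちゃん", SERVICES)   # hiragana homophone
split_hit = match_service("you tube", SERVICES)    # spacing error
no_hit = match_service("weather", SERVICES)        # nothing similar enough
```

Kana folding addresses the hiragana/katakana homophone case directly, while the similarity ratio tolerates small recognition errors; a production system would likely need a reading-based (phonetic) comparison as well.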
6 Conclusion
In this study, we designed a 6W1H needs model that is easy for users to understand and has the potential to be extended to machine learning. Additionally, we designed a needs extraction system, a needs extraction dialogue flow, and a natural language processing API, built a system that targets the Japanese language, and carried out a simple evaluation. As a result, it was found that the proposed model and system are effective in extracting needs that are readable. In particular, the dialogue flow based on the 6W1H element gap-filling
format was found to have the advantages of being simple in design and capable of deep mining of user needs. Remaining challenges for the dialogue system include the extraction of the who and why elements, which were omitted for simplicity, and the improvement of extraction accuracy, including speech recognition accuracy. However, since this study found that needs extraction is possible to some extent, we consider that other tasks should be prioritized. Specifically, these are evaluations of the extracted needs and research on personal adaptation using the extracted needs. Further development of this study can lead to the realization of advanced and practical personal service adaptation that combines user subjectivity and the strength of machine learning, thereby creating high user satisfaction and service value.
Acknowledgements. This research was partially supported by JSPS KAKENHI Grant Numbers JP19H01138, JP20H05706, JP20H04014, JP20K11059, JP22H03699, JP19K02973, Grant-in-Aid for JSPS Research Fellow (No.22J13217), and Tateishi Science and Technology Foundation (C) (No.2207004).
References
1. Cambridge Dictionary: English dictionary, translations & thesaurus. https://dictionary.cambridge.org/. Accessed 22 Dec 2022
2. Ajaegbu, C.: An optimized item-based collaborative filtering algorithm. J. Ambient. Intell. Humaniz. Comput. 12(12), 10629–10636 (2021). https://doi.org/10.1007/s12652-020-02876-1
3. Antwi-Boasiako, E., Zhou, S., Liao, Y., Liu, Q., Wang, Y., Owusu-Agyemang, K.: Privacy preservation in distributed deep learning: a survey on distributed deep learning, privacy preservation techniques used and interesting research directions. J. Inf. Secur. Appl. 61, 102949 (2021). https://doi.org/10.1016/j.jisa.2021.102949
4. Fan, H., Poole, M.S.: What is personalization? Perspectives on the design and implementation of personalization in information systems. J. Organ. Comput. Electron. Commer. 16(3–4), 179–202 (2006). https://doi.org/10.1080/10919392.2006.9681199
5. Fang, H., Chen, C., Long, Y., Xu, G., Xiao, Y.: DTCRSKG: a deep travel conversational recommender system incorporating knowledge graph. Mathematics 10, 1402 (2022). https://doi.org/10.3390/math10091402
6. Ferrari, A., Micucci, D., Mobilio, M., Napoletano, P.: On the personalization of classification models for human activity recognition. IEEE Access. 32066–32079 (2020). https://doi.org/10.1109/ACCESS.2020.2973425
7. Goldenberg, D., Kofman, K., Albert, J., Mizrachi, S., Horowitz, A., Teinemaa, I.: Personalization in practice: methods and applications. In: WSDM 2021, pp. 1123–1126, May 2021. https://doi.org/10.1145/3437963.3441657
8. Guo, G.: Application of e-commerce personalized recommendation algorithm based on collaborative filtering. In: Xu, Z., Alrabaee, S., Loyola-González, O., Zhang, X., Cahyani, N.D.W., Ab Rahman, N.H. (eds.) CSIA 2022. LNDECT, vol. 125, pp. 959–966. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-97874-7_140
9. Hollis, K., Soualmia, L., Séroussi, B.: Artificial intelligence in health informatics: hype or reality? Yearbook Med. Inform. 28, 003–004 (2019). https://doi.org/10.1055/s-0039-1677951
10. Iosifidis, V., Ntoutsi, E.: AdaFair: cumulative fairness adaptive boosting. In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pp. 781–790. CIKM 2019, Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3357384.3357974
11. Jannach, D., Manzoor, A., Cai, W., Chen, L.: A survey on conversational recommender systems. ACM Comput. Surv. 54, 1–36 (2021). https://doi.org/10.1145/3453154
12. Kawahara, D., Kurohashi, S.: A fully-lexicalized probabilistic model for Japanese syntactic and case structure analysis, vol. 21, pp. 176–183 (2006). https://doi.org/10.3115/1220835.1220858
13. Kocaballi, A.B., et al.: The personalization of conversational agents in health care: systematic review. J. Med. Internet Res. 21, e15360 (2019). https://doi.org/10.2196/15360
14. Lee, A., Oura, K., Tokuda, K.: MMDAgent: a fully open-source toolkit for voice interaction systems, pp. 8382–8385 (2013). https://doi.org/10.1109/ICASSP.2013.6639300
15. Lipton, Z.C.: The mythos of model interpretability: in machine learning, the concept of interpretability is both important and slippery. Queue 16(3), 31–57 (2018). https://doi.org/10.1145/3236386.3241340
16. Morita, H., Kawahara, D., Kurohashi, S.: Morphological analysis for unsegmented languages using recurrent neural network language model, pp. 2292–2297 (2015). https://doi.org/10.18653/v1/D15-1276
17. Nakata, T., Chen, S., Nakamura, M.: Uni-Messe: unified rule-based message delivery service for efficient context-aware service integration. Energies 15(5), 1729 (2022). https://doi.org/10.3390/en15051729
18. Nakata, T., Saiki, S., Nakamura, M.: Characterizing smart systems with interactive personalization. In: 22nd IEEE-ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel Distributed Computing (SNPD 2021), pp. 8–14, November 2021
19. Ozono, H., Chen, S., Nakamura, M.: Encouraging elderly self-care by integrating speech dialogue agent and wearable device. In: Gao, Q., Zhou, J. (eds.) 8th International Conference, ITAP 2022, Held as Part of the 24th HCI International Conference, HCII 2022. LNCS, vol. 13331, pp. 52–70. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-05654-3_4
20. Tudor Car, L., et al.: Conversational agents in health care: scoping review and conceptual analysis. J. Med. Internet Res. 22(8), e17158 (2020). https://doi.org/10.2196/17158, https://www.jmir.org/2020/8/e17158/
21. Zhang, Y., Chen, X., Ai, Q., Yang, L., Croft, W.: Towards conversational search and recommendation: system ask, user respond. In: CIKM 2018, pp. 177–186, October 2018. https://doi.org/10.1145/3269206.3271776
22. Zhou, M., Duan, N., Liu, S., Shum, H.Y.: Progress in neural NLP: modeling, learning, and reasoning. Engineering 6, 275–290 (2020). https://doi.org/10.1016/j.eng.2019.12.014
The Impact of External Networks on Product Innovation in Social Purpose Organizations: An Empirical Research on Japanese Museums
Shohei Oishi and Akitsu Oe
Tokyo University of Science, 1-11-2 Fujimi, Chiyoda-ku, Tokyo 102-0071, Japan
Abstract. This study aims to explore the sources of knowledge related to the adoption of advanced technology for exhibits and services, a necessary product innovation for museums to thrive as social purpose organizations (SPOs). The study also intends to demonstrate how collaboration with external organizations influences the knowledge characteristics of museums as SPOs and the resulting impact on product innovation. Using a negative binomial distribution model for 1,416 organizations, including 103 museums in the Chubu and Kinki regions of Japan, network analysis and multiple regression analysis revealed that the fewer tourism resources an area has, the more collaboration occurs between museums and other organizations. Furthermore, the study demonstrated the positive impact of collaboration with mass media and private companies on knowledge acquisition from other communities and the marginal and interaction effects of such collaborations. The study also found a positive impact of collaboration with the mass media on product innovation. By visualizing the regional networks of museums in Japan, the study highlights the importance of collaboration among organizations with diverse areas of knowledge and strengths in generating innovation. The theoretical contributions of the study lie in organizational behavior theory, innovation theory, and museum management theory.
Keywords: Museum management · Social purpose organization · Product innovation · Technology adoption · Cross-industry collaboration
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 154–172, 2023. https://doi.org/10.1007/978-3-031-35129-7_11
1 Introduction
With the rise of the SDGs in recent years, social purpose organizations (SPOs) have become increasingly important. SPOs are organizations that have both a social mission and commercial revenue goals [1]. The museums under study are SPOs with commercial goals such as increasing the number of visitors to continue their social activities of collecting, exhibiting, and disseminating educational materials. While there are examples of foreign museums enhancing their commercial activities to achieve their social mission as SPOs and reducing their dependence on increased revenues and donations by introducing digitalization [2], the concept of SPOs is not widespread in Japan. Product innovation is a necessary element for the growth of SPOs and is defined as “a product
innovation is a new or improved good or service that is significantly different from the firm’s previous goods or services and has been introduced into the market” [3, p. 21]. According to this definition, the main innovation of museums is the adoption of technology to improve exhibitions and services [4], which is considered a product innovation. Camarero and Garrido [5] demonstrated that museums’ orientation to donors promotes technological innovation, which is equivalent to product innovation. However, Japanese museums receive fewer donations than their foreign counterparts, and increasing their income from admission fees and other sources has been cited as a challenge [6]. Therefore, for museums to continue their social activities in the future, they need to find ways to generate product innovation as SPOs. The research question of this study is thus where the knowledge of product innovation in museums, that is, the adoption of advanced technology in exhibitions and services, comes from. Accordingly, this paper aims to demonstrate the characteristics of knowledge that collaboration with external organizations brings to a museum as an SPO and its impact on product innovation.
2 Theory and Hypothesis
2.1 Collaboration and Knowledge Acquisition at the Museum’s Temporary Exhibition
For organizations, the key to generating innovation is incorporating and using new knowledge [7]. Organizations continuously search for knowledge about the external environment and acquire new knowledge to innovate, adapt, and ensure competitiveness (e.g., [8]). There are various methods of searching for and acquiring knowledge, such as observing other organizations, partnering, and exchanging personnel. When an organization finds and acquires new knowledge from other organizations, it engages in interorganizational learning and organizational learning to develop the acquired knowledge as its own. However, organizations are not free to incorporate new knowledge at any time. Certain conditions are necessary to bring new knowledge into the organization. The ability of an organization to incorporate technological knowledge from outside is called absorptive capacity, which refers to an organization’s ability to recognize, assimilate, and apply new knowledge as a product or service to create new products and innovations [9]. For an organization to assimilate knowledge, it needs an overlap between new and existing knowledge to understand it [9]. Oishi and Oe [10] demonstrated that collaboration among staff, which is the internal knowledge recognition of the organization, and collaboration with other organizations in the museum, which is the external knowledge acquisition, promote product innovation in the museum. Then, how does a museum’s collaboration with other organizations start? Various private companies are involved in museums in various aspects, such as providing and maintaining facilities for exhibitions, providing products sold in the museum store, private companies supporting publishing, and private companies supplying food to the museum restaurant. Among them, this study focuses on private companies involved in temporary exhibitions in museums.
A temporary exhibition in a museum is an exhibition held for a limited period in a corner of an exhibition room or a temporary exhibition room, as opposed
to the permanent exhibitions set up in the museum since its establishment. Museums exhibit to their visitors, and their mission is to maintain and increase the number of visitors on a sustainable basis, even if they are nonprofit organizations. The customers of private companies largely correspond to the visitors to today’s museums. Museums need to survey and evaluate visitor satisfaction and implement improvement activities, and these activities should be consistent with the museum’s management strategy. However, it is difficult to frequently renew the permanent exhibitions in museums to maintain and increase the number of visitors. Therefore, by establishing a temporary exhibition separate from the permanent exhibition and changing the theme according to the period, visitors can have a highly satisfying viewing experience without getting bored, which will lead to the maintenance and expansion of the number of visitors. For example, as part of a museum’s management strategy, a temporary exhibition may be held to create a new market by highlighting features that are different from those of conventional museums. If the management strategy is to create a new market to attract potential and new visitors, planning an exhibition or special exhibition with a theme that has never been seen before is necessary. A good example of creating a new market is the exhibitions on ghosts and Yokai that have been actively held in Japan in recent years. The “Special Exhibition: Obake, Yokai, Ghosts” was held at the Hyogo Prefectural Museum of Art in 1987 and was a great success [11]. This type of exhibition has been held in many places and is now a summer staple in museums [12, 13]. Also in summer 2019, the National Museum of Japanese History’s “Summer of Mononoke: Ghosts and Yokai in Edo Culture” and the Kawasaki City Museum’s “Yokai/Human: From Fantasy to Reality” were held. This is expected to attract potential visitors who have not previously been interested in museums. 
Since Yokai and ghosts are handed down in different parts of the country, museums can easily arrange them according to each region. Consequently, museums in different regions hold special exhibitions with similar themes in the summer, creating a market for exhibitions in this mysterious world. The museum’s temporary exhibition is usually prepared two to three years before the exhibition, and the museum cooperates with related organizations in formulating a business plan. In addition, cooperation with related organizations, which is indispensable for the realization of the museum’s temporary exhibition, is often carried out not only once but also in subsequent temporary exhibitions. Granovetter [14], in his theory of embeddedness, which explains the pattern of organizational ties, introduces “relational embedding,” which is repeated connection with a partner once connected, and “positional embedding,” which is advantageous for information flow when network centrality is high. Relational embedding through connections in temporary museum exhibitions is advantageous for knowledge acquisition. Knowledge brought to a museum from other communities is different in nature from the knowledge of the surrounding communities and may be an asset that other museums do not have. In private companies, knowledge acquisition from other communities is actively pursued because it leads to innovation. As mentioned above, museums need to be active in research and educational activities in addition to their exhibition activities, and the acquisition of new knowledge of a different nature is essential for developing these activities.
Camarero et al. [4] cite management expertise and hiring staff with non-museum experience as examples of organizational innovation in museums. However, museum management resources are limited, and it is not easy to increase the museum’s internal management resources, such as hiring new staff. Therefore, one possible solution is to collaborate with other organizations. Collaboration with other organizations in a museum can promote the absorption of knowledge and lead to innovation. In some areas of Japan, museums are strengthening their activities by networking with other organizations. In Kumamoto Prefecture, the Kumamoto Prefecture Museum Network Center was established in 2014 to revitalize museum activities in the prefecture through the mutual use of each museum’s collections [15]. There are numerous examples of collaboration in permanent networks and individual initiatives such as temporary exhibitions. These include a temporary exhibition in which museums sponsor each other through a collaboration agreement [16]; a mobile-type temporary exhibition co-sponsored by multiple museums under the same theme and separate exhibition periods [17]; and a temporary exhibition held in collaboration with foreign embassies [18]. Temporary museum exhibitions are held not only by their own museums but also through co-sponsorship by other museums, support and cooperation from private companies, and grants from foundations. Although it is difficult for museums to establish continuous partnerships with other museums due to a lack of management resources, it is believed that museums can actively absorb knowledge and work to improve their services by collaborating with other organizations through temporary exhibitions. It is difficult to realize the adoption of exhibitions and services using advanced technologies with limited management resources, and there are examples of museums introducing advanced technologies by collaborating with private companies [19].
Collaboration through temporary exhibitions could be a motivation to absorb such knowledge. Therefore, when museums collaborate and network with other organizations, product innovation is expected to occur through the absorption and utilization of new knowledge.
2.2 Knowledge Acquisition and Product Innovation Through Collaboration with Other Organizations in Museums
In addition to cases where the museum takes the initiative in creating new markets, there are also cases where the mass media take the initiative in creating new markets. Temporary exhibitions in museums can be planned internally by museums based on research results, or they can be planned by mass media such as newspapers and publishers. In this study, we focus on the latter case and consider that various information brought in by the mass media plays an important role in acquiring knowledge. Temporary exhibitions brought in by mass media are mainly held by borrowing collections and other materials from foreign museums [20]. If the borrowed collection is large, the exhibition is held as a traveling exhibition at several museums in Japan, and the museums and regional newspapers and publishers that serve as co-sponsors work together to promote the exhibition [20]. As the mass media gather information from various sources due to the nature of their business, knowledge acquisition from other communities is likely to occur. Publicity is essential to increase the number of visitors to temporary exhibitions. It is believed that museums not only gain from publicity but also acquire knowledge through
the formation of a network connecting different museums through mass media, and the following hypotheses are proposed regarding the connection between museums and mass media:
H1: Collaboration with mass media in the museum’s temporary exhibitions promotes the museum’s knowledge acquisition from other communities.
H2: Collaboration with mass media in the museum’s temporary exhibitions promotes product innovation in the museum.
There are various forms of collaboration with private companies in temporary exhibitions. There are cases in which large museums, such as national museums, and general private companies collaborate to hold temporary exhibitions and create product innovation [19], and there are also cases in which small regional museums collaborate with NPOs and private companies to create innovation [21]. Such medium- to long-term collaboration is thought to develop from “exhibition collaboration” at temporary exhibitions, in which companies provide their products and technology, into knowledge absorption and technology adoption, and is considered to be the next step for collaboration between museums and private companies. The following hypotheses are proposed for the collaboration between museums and private companies:
H3: Collaboration with private companies in the museum’s temporary exhibitions promotes the museum’s knowledge acquisition from other communities.
H4: Collaboration with private companies in the museum’s temporary exhibitions promotes product innovation in the museum.
3 Data and Analysis
The data are based on the results of “The Second Questionnaire Survey on the Characteristics of Museum Management that Generate Innovation,” conducted by e-mail with 2,048 museums in Japan between September 23 and November 30, 2021, and on information on temporary exhibitions held in 2019, including cooperative relationships. The questionnaire is based on the survey conducted by Camarero and Garrido [22] in European museums. Information was collected on 634 temporary exhibitions held by public museums in the Chubu (Niigata, Toyama, Ishikawa, Fukui, Yamanashi, Nagano, Gifu, Aichi, Shizuoka) and Kinki (Kyoto, Nara, Mie, Osaka, Wakayama, Hyogo) regions of Japan from January 1 to December 31, 2019. Temporary exhibitions, special exhibitions, and exhibitions were included in the analysis; exhibitions that began in 2018 or ended in 2020 were included if part of their period fell in 2019. Among public museums, all museums established by the national and prefectural governments are included in the collection, while municipal museums are included only if they responded to the questionnaire survey. The information collected included the name and dates of the temporary exhibition, the URL of the page containing the information, other information listed on the homepage and flyers of each museum, and the names of organizations listed under sponsorship, co-sponsorship, cooperation, support, funding, and collaboration. The number of temporary exhibitions in the 157 museums in the collection was 634, and the number of related organizations, including temporary exhibition-holding museums, was 1,416.
The purpose of the network analysis is to construct a network connecting museums holding temporary exhibitions and related organizations, to visualize the differences in how each center or region is connected, and to calculate a network centrality index for use in multiple regression analysis. Nodes in the network are the organizations that have relationships, such as sponsorship, co-sponsorship, cooperation, support, funding, and collaboration, with the temporary exhibitions of the target museums, and undirected ties connect the nodes. Gephi (ver. 0.9.7) was used as the analysis software to calculate network indices and draw network diagrams. The multiple regression analysis uses a negative binomial distribution model, and STATA (ver. 16.1) was used as the analysis software. The dependent variables are “Eigenvector centrality” as a network indicator for “Knowledge acquisition from other communities” in H1 and H3, and the “Number of advanced technology adoptions” as an indicator of the museum’s technology adoption for “Product innovation” in H2 and H4. Eigenvector centrality is an index under which nodes connected to nodes with high network centrality are themselves highly central [23]. A high value of eigenvector centrality indicates that a node is connected to a central organization in each network and is more likely to gather information and knowledge from different communities, so it is used as a variable for knowledge acquisition from other communities. To apply a negative binomial distribution model, the variables are standardized and then shifted by the minimum value so that they are greater than or equal to zero. The “Number of advanced technology adoptions” is based on the number of advanced technology adoptions related to museum exhibits and services as asked in the questionnaire survey.
First, for a total of eleven technologies that have been adopted in museums, dummy variables were created from the responses to the four options of “have adopted,” “plan to adopt,” “have no plan but would like to adopt in the future,” and “do not plan to adopt,” with “have adopted” as 1 and the other options as 0. Next, to analyze the impact of adopting technologies that few museums have adopted and that are in high demand, the technologies in the bottom 50% of the total of 11 technologies were considered “advanced technologies.” The six advanced technologies in this study are “digitization catalogs of materials and exhibits,” “viewing exhibits using 3D technology (including VR, AR, and MR technology),” “spatial presentation using digital technology,” “adoption of projection mapping technology,” “virtual visit (asynchronous) that allows visitors to view the museum on a website,” and “entrance reservation system.” The “Number of advanced technology adoptions” is the sum of these technology adoptions. However, “spatial presentation using digital technology” and “adoption of projection mapping technology” were counted together as 1 even if both were “have adopted” because of the similarity of the question content. The independent variables are “Rate of relations with mass media in the temporary exhibition,” which indicates the status of relations with mass media in H1 and H2, and “Rate of relations with private companies in the temporary exhibition,” which indicates the status of relations with private companies in H3 and H4. The “Rate of relations with mass media in the temporary exhibition” is the number of private companies whose industry is television, radio, or newspaper out of the total number of organizations involved in the temporary exhibition held in each museum and represents the number of
mass media per event. The “Rate of relations with private companies in the temporary exhibition” is the number of private companies excluding mass media mentioned above and represents the number of private companies per event. Both independent variables are standardized. The control variables are “Prefectural dummy,” “Chubu dummy,” “Designated administrator dummy,” “Volunteer recruit dummy,” “Number of visitors in 2019,” “Budget shortfall level,” “Importance of exhibition activities dummy,” and “Importance of education activities dummy.” First, the “Prefectural dummy” is set to 1 if the prefecture establishes the museum and 0 otherwise. The “Chubu dummy” is set to 1 if the museum is in the Chubu region (Niigata, Toyama, Ishikawa, Fukui, Yamanashi, Nagano, Gifu, Aichi, and Shizuoka) and 0 otherwise. The “Designated administrator dummy” is set to 1 if the responding museum has a designated administrator system and 0 otherwise. The “Volunteer recruit dummy” is set to 1 if the responding museum recruits full-time or part-time volunteers and 0 otherwise. The “Number of visitors in 2019” is the number of visitors in 2019, the year before the spread of the novel coronavirus, and is standardized. The “Budget shortfall level” is a Likert variable that is defined as 4 for “strongly agree,” 3 for “agree,” 2 for “neither agree nor disagree,” 1 for “strongly disagree,” and 0 for “completely disagree” to the question “budget is not sufficient to introduce new technologies.” The “Importance of exhibition activities dummy” is set to 1 if the respondents answered “exhibition activities” as the most important activity among the museum’s main activities, which are “collection and conservation activities,” “research and investigation activities,” “exhibition activities,” “education and extension activities,” and “recreation activities,” and to 0 otherwise.
The “Importance of education activities dummy” is set to 1 if the respondents answered that the most important activity among the major activities is “educational activities,” and 0 otherwise.
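The coding rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the short technology keys, the function names, and the use of the population standard deviation for standardization are all hypothetical choices made for this sketch.

```python
from statistics import mean, pstdev

# Hypothetical short keys for the six advanced technologies named in the text.
ADVANCED = [
    "digital_catalog", "3d_viewing", "spatial_presentation",
    "projection_mapping", "virtual_visit", "entrance_reservation",
]
# The two items that the text counts together because the questions are similar.
MERGED = ("spatial_presentation", "projection_mapping")


def adoption_dummy(response):
    """Return 1 only for "have adopted"; every other option maps to 0."""
    return 1 if response == "have adopted" else 0


def advanced_adoptions(responses):
    """Sum the adoption dummies, counting the merged pair at most once."""
    dummies = {key: adoption_dummy(responses.get(key, "")) for key in ADVANCED}
    merged = max(dummies[key] for key in MERGED)
    rest = sum(value for key, value in dummies.items() if key not in MERGED)
    return rest + merged


def standardize(values):
    """Z-score a list of numbers (population standard deviation assumed)."""
    centre, spread = mean(values), pstdev(values)
    return [(value - centre) / spread for value in values]


# A made-up museum that adopted a catalog and both merged technologies.
museum = {
    "digital_catalog": "have adopted",
    "spatial_presentation": "have adopted",
    "projection_mapping": "have adopted",
    "virtual_visit": "plan to adopt",
}
print(advanced_adoptions(museum))  # 2
```

Taking the maximum over the merged pair is one simple way to express "counted as 1 even if both were adopted"; a logical OR would behave identically.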
4 Results
4.1 Network Analysis
Figure 1 illustrates the results of the network analysis. Each organization is represented by a color-coded node: red for national museums, pink for prefectural museums, orange for municipal museums, yellow for other museums, blue for mass media such as newspapers and television, light blue for private companies, and yellow-green for government and educational institutions. The node’s size indicates its eigenvector centrality, and the thickness of each edge, drawn as a line, indicates the number of times the two organizations were associated through temporary exhibitions: the larger the number of collaborations, the thicker the line. The names of the organizations are given for the large and characteristic nodes. In addition, the network is divided according to modularity [24], and communities are extracted. Some networks have large communities, and the greater the number of nodes (i.e., the number of organizations) that make up a community, the greater the number of mass media it contains. This finding indicates that the mass media cooperate in temporary exhibitions that involve many related organizations. In the network in Niigata Prefecture, located in the center of Fig. 1, the network consists mainly of the Niigata City Museum, and many private companies are involved, while in the network in Yamanashi Prefecture, located next to Niigata
Fig. 1. Museums Network (Clustering by Modularity)
Fig. 2. Museums Network (Clustering by Prefectures)
Prefecture, many administrative organizations (shown in yellow-green) are involved, indicating regional differences. In particular, several museums with high eigenvector centrality are found in each municipality. Many of them are prefectural museums, such as the Yamanashi Prefectural Museum, Yamanashi Prefectural Museum of Archaeology, Nagano Prefectural Museum of Art, and Hyogo Prefectural Museum of Art. National museums such as the Nara National Museum and the National Museum of Modern Art, Kyoto, also have large nodes, indicating that they often cooperate with museums with large nodes in other communities outside the municipality. Similarly, among the mass media, the Asahi Shimbun has a large node and often collaborates with museums with large nodes in communities beyond its own. Many of the communities are composed mainly of prefectural museums, and we can observe that they cooperate with national museums and mass media to transfer knowledge to other museums in the region. Figure 2 illustrates the network divided by prefecture, indicating that there are relationships across regions and that there is also cooperation between regions. Osaka Prefecture, the core of Japan’s second-largest metropolitan area, has two large communities, indicating that communities are dispersed even within the prefecture. As it has few connections to other prefectures, it cooperates less across prefectural borders than other prefectures do. Furthermore, its nodes, whose size indicates eigenvector centrality, are not noticeably large. This indicates that the cooperative activities of museums in Osaka Prefecture with other organizations are weak. The area consisting of Aichi Prefecture, the core of Japan’s third-largest metropolitan area, and the neighboring Mie Prefecture has only one large community with numerous connections to other prefectures.
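As a rough illustration of the eigenvector centrality [23] that sets the node sizes in Figs. 1 and 2, the sketch below runs a power iteration on a small weighted, undirected edge list. It is not the authors' implementation: the node names and edge weights are invented, and each step adds the node's own previous score (a common variant, equivalent to iterating with the adjacency matrix plus the identity) so that the iteration also settles on bipartite museum-organization graphs.

```python
def eigenvector_centrality(edges, max_iter=500, tol=1e-10):
    """Power iteration on a weighted, undirected edge list.

    edges: iterable of (node_a, node_b, weight) tuples.
    Returns a dict of scores scaled so that the largest score is 1.
    """
    neighbors = {}
    for a, b, weight in edges:
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight

    score = {node: 1.0 for node in neighbors}
    for _ in range(max_iter):
        # Own score plus the weighted scores of all neighbours.
        updated = {
            node: score[node] + sum(w * score[other] for other, w in adjacent.items())
            for node, adjacent in neighbors.items()
        }
        largest = max(updated.values())
        updated = {node: value / largest for node, value in updated.items()}
        if max(abs(updated[n] - score[n]) for n in updated) < tol:
            return updated
        score = updated
    return score


# Hypothetical toy network: weights are numbers of joint temporary exhibitions.
edges = [
    ("museum_A", "newspaper", 3),
    ("museum_A", "company", 1),
    ("museum_B", "newspaper", 1),
]
centrality = eigenvector_centrality(edges)
for node, value in sorted(centrality.items()):
    print(node, round(value, 3))
# company 0.303, museum_A 1.0, museum_B 0.303, newspaper 1.0
```

In this toy case the museum and the newspaper that collaborate three times receive the highest scores, which mirrors the reading in the text that thick edges between large nodes mark the hubs of a community.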
Aichi, Mie, and Gifu prefectures, which are often referred to as the three Tokai prefectures and form a single cultural area, are strongly connected in terms of museums. In addition, there is strong cooperation with Shizuoka Prefecture, which borders the three Tokai prefectures. Particularly noteworthy are Niigata Prefecture and Yamanashi Prefecture. Niigata Prefecture is located on the Sea of Japan side and has many disadvantages, such as in its transportation networks, compared with the Pacific side of the central region. Because of its disadvantageous location, Niigata Prefecture is trying to establish new private companies and develop tourism resources by focusing on regional cooperation [25]. Yamanashi Prefecture also lacks tourism resources. The Yamanashi Prefectural Museum of Art, which opened in 1978, purchased a painting by Jean-François Millet, world-famous for “The Sower” and “The Gleaners,” and attracted wide attention as the “Millet Museum” [26]. In 2009, the museum opened the Millet Pavilion featuring Millet’s works, and the number of annual visitors increased by 39% over the previous year [27]; the museum now constantly receives visitors from all over Japan as a “model” public museum [28]. Toward the 50th anniversary of its opening, the museum is developing activities focusing on museums and art galleries, including digital technology and regional cooperation [29]. The results of such activities can be seen in the dense community formation, cooperation with other organizations, and high eigenvector centrality illustrated in the network diagrams in Figs. 1 and 2. It can be seen that regional areas such as Niigata Prefecture and Yamanashi Prefecture
are more active in the use of mass media than ancient Japanese cities such as Kyoto and Nara, which have long been tourist destinations, and Ishikawa Prefecture, which is called “Little Kyoto.” It is believed that regions with unfavorable conditions consciously implement cooperation with mass media, compared to regions with abundant tourism resources.
4.2 Multiple Regression Analysis Results
Table 1. Basic Statistics

| No. | Variables | Mean | S.D. | Min | Max |
|---|---|---|---|---|---|
| 1 | Knowledge acquisition from other communities | 0.50 | 1.00 | 0.00 | 5.32 |
| 2 | Number of advanced technology adoptions | 0.53 | 0.85 | 0.00 | 3.00 |
| 3 | Prefectural dummy | 0.14 | 0.34 | 0.00 | 1.00 |
| 4 | Chubu dummy | 0.68 | 0.47 | 0.00 | 1.00 |
| 5 | Designated administrator dummy | 0.27 | 0.45 | 0.00 | 1.00 |
| 6 | Volunteer recruit dummy | 0.54 | 0.50 | 0.00 | 1.00 |
| 7 | Number of visitors in 2019 | 0.00 | 1.00 | –0.38 | 9.22 |
| 8 | Budget shortfall level | 2.91 | 1.37 | 0.00 | 4.00 |
| 9 | Importance of exhibition activities dummy | 0.65 | 0.48 | 0.00 | 1.00 |
| 10 | Importance of education activities dummy | 0.16 | 0.36 | 0.00 | 1.00 |
| 11 | Rate of relations with mass media | 0.57 | 1.00 | –0.57 | 3.36 |
| 12 | Rate of relations with private companies | 0.49 | 1.00 | –0.49 | 4.14 |

Notes: Number of museums = 103
Table 1 presents the basic statistics, and Table 2 presents the correlations among the variables used in the analysis. As the maximum absolute value of the correlation coefficients is 0.59, we checked the variance inflation factor (VIF) and found that its maximum value is 1.66, which is below the strict threshold of 3 [30]. Table 3 presents the results of the analysis. The dependent variables are “Knowledge acquisition from other communities” for Models I through V and “Number of advanced technology adoptions” for Models VI and VII. Models I and VI include the control variables only, while Models II and VII add the independent variables. Comparing Model I with Models II to V, and Model VI with Model VII, the log-likelihood values increase and the AIC and BIC values decrease, so we consider that there is no problem regarding the models’ fit.
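The fit statistics mentioned above follow standard definitions, which the short sketch below spells out. It is an illustration rather than the authors' code. The log-likelihood of –83.94 and the sample size of 103 are read from Table 3 for what appears to be the controls-only Model I; the parameter count k = 10 (eight controls, a constant, and one scale parameter) is an assumption that happens to reproduce the reported AIC and BIC to within rounding. The last function gives the VIF implied by a single pairwise correlation, which is only a lower bound on the full-model VIF reported in the text.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood


def pairwise_vif(r):
    """VIF of a predictor whose only collinearity is one correlation r."""
    return 1.0 / (1.0 - r * r)


# Log-likelihood and n as read from Table 3; k = 10 is an assumption.
print(round(aic(-83.94, 10), 2))       # 187.88 (reported: 187.89)
print(round(bic(-83.94, 10, 103), 2))  # 214.23 (reported: 214.23)
print(round(pairwise_vif(0.59), 2))    # 1.53, below the reported maximum of 1.66
```

Lower AIC and BIC indicate a better trade-off between fit and the number of parameters, which is the comparison the text draws between the controls-only models and the models with independent variables.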
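Table 2. Correlation Tables

| No. | Variables | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Knowledge acquisition from other communities | 1 | | | | | | | | | | | |
| 2 | Number of advanced technology adoptions | | | | | | | | | | | | |
| 3 | Prefectural dummy | .17 | .22 | 1 | | | | | | | | | |
| 4 | Chubu dummy | .19 | .11 | –.03 | 1 | | | | | | | | |
| 5 | Designated administrator dummy | .00 | .08 | .01 | .05 | | | | | | | | |
| 6 | Volunteer recruit dummy | .19 | .16 | .19 | –.13 | .03 | 1 | | | | | | |
| 7 | Number of visitors in 2019 | .00 | .26 | .05 | –.18 | .01 | –.02 | 1 | | | | | |
| 8 | Budget shortfall level | .07 | –.11 | –.02 | .00 | –.17 | .16 | | | | | | |
| 9 | Importance of exhibition activities dummy | .24 | –.07 | .17 | .04 | | | | | | | | |
| 10 | Importance of education activities dummy | –.07 | –.02 | –.09 | .01 | –.02 | .12 | –.06 | –.03 | –.59 | 1 | | |
| 11 | Rate of relations with mass media | .31 | .28 | .00 | .05 | –.12 | .07 | .05 | –.13 | .07 | .09 | 1 | |
| 12 | Rate of relations with private companies | .23 | .14 | .07 | .09 | .08 | –.04 | .45 | –.06 | .09 | .12 | .21 | 1 |

Notes: Number of museums = 103
Remaining entries of rows 2, 5, 8, and 9, whose columns cannot be determined from the source layout: –.09, 1, .02, 1, .05, –.10, .08, 1, –.02, 1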
In Model II, where the dependent variable is “Knowledge acquisition from other communities,” “Rate of relations with mass media in the temporary exhibition” has a strong positive effect (p < 0.01) on the dependent variable, and “Rate of relations with private companies in the temporary exhibition” has a weak positive effect (p < 0.10). Therefore, H1 and H3 are supported. In Model VII, where the dependent variable is the “Number of advanced technology adoptions,” “Rate of relations with mass media in the temporary exhibition” has a strong positive effect (p < 0.01) on the dependent variable, so H2 is supported. However, the “Rate of relations with private companies in the temporary exhibition” is not significant, and H4 is not supported. As for the control variables, in Model II, the “Chubu dummy,” “Volunteer recruit dummy,” and “Importance of exhibition activities dummy” positively impact “Knowledge acquisition from other communities.” In Model VII, “Prefectural dummy,” “Chubu dummy,” and “Number of visitors in 2019” have a positive effect, while “Importance of exhibition activities dummy” has a negative effect on “Number of advanced technology adoptions.”
Table 3. Analysis Results

Dependent variables: “Knowledge acquisition from other communities” (Models I–V) and “Number of advanced technology adoptions” (Models VI–VII)

| No. | Independent variables | I | II | III | IV | V | VI | VII |
|---|---|---|---|---|---|---|---|---|
| 3 | Prefectural dummy | 0.29 [0.41] | 0.41 [0.40] | 0.21 [0.37] | 0.54 [0.39] | 0.36 [0.38] | 0.73* [0.39] | 0.85** [0.36] |
| 4 | Chubu dummy | 1.16** [0.45] | 0.95** [0.45] | 0.70 [0.45] | 1.06** [0.46] | 0.79* [0.46] | 0.79** [0.38] | 0.90** [0.38] |
| 5 | Designated administrator dummy | 0.06 [0.38] | 0.28 [0.38] | 0.07 [0.36] | 0.45 [0.37] | 0.01 [0.37] | 0.10 [0.33] | 0.36 [0.31] |
| 6 | Volunteer recruit dummy | 0.92** [0.37] | 0.83** [0.37] | 0.90** [0.36] | 1.02*** [0.39] | 0.84** [0.35] | 0.59* [0.33] | 0.45 [0.32] |
| 7 | Number of visitors in 2019 | 0.12 [0.18] | –0.02 [0.21] | 0.15 [0.36] | –0.08 [0.18] | 0.07 [0.27] | 0.33*** [0.11] | 0.35*** [0.11] |
| 8 | Budget shortfall level | 0.07 [0.13] | 0.19 [0.13] | 0.08 [0.12] | 0.22* [0.12] | 0.07 [0.12] | –0.13 [0.10] | –0.08 [0.09] |
| 9 | Importance of exhibition activities dummy | 2.55** [1.05] | 2.17** [1.05] | 1.14 [1.08] | 2.11** [1.06] | 0.86 [1.10] | –0.61 [0.37] | –0.77** [0.36] |
| 10 | Importance of education activities dummy | 1.85 [1.13] | 0.95 [1.18] | 0.31 [1.17] | 1.38 [1.16] | 0.32 [1.17] | –0.43 [0.48] | –0.76 [0.51] |
| 11 | Rate of relations with mass media | | 0.52*** [0.16] | 2.28*** [0.60] | 0.71*** [0.18] | 2.69*** [0.64] | | 0.41*** [0.11] |
| 12 | Rate of relations with private companies | | 0.26* [0.16] | 0.25*** [0.48] | 0.56*** [0.19] | 1.69*** [0.53] | | –0.05 [0.14] |
| 13 | Marginal effect of rate of relations with mass media | | | –0.61*** [0.19] | | –0.63*** [0.18] | | |
| 14 | Marginal effect of rate of relations with private companies | | | –0.34** [0.13] | | –0.32** [0.14] | | |
| 15 | Rate of relations with mass media X Rate of relations with private companies | | | | –0.29** [0.13] | –0.49** [0.23] | | |
| | Constant | –4.74*** [1.21] | –4.76*** [1.20] | –2.85** [1.22] | –5.16*** [1.22] | –2.66** [1.24] | –0.98* [0.55] | –1.15** [0.54] |
| | Number of museums | 103 | 103 | 103 | 103 | 103 | 103 | 103 |
| | Log likelihood | –83.94 | –76.55 | –60.40 | –74.16 | –58.08 | –92.36 | –87.24 |
| | AIC | 187.89 | 177.10 | 148.80 | 174.31 | 146.16 | 204.72 | 198.48 |
| | BIC | 214.23 | 208.72 | 185.68 | 208.56 | 185.69 | 231.07 | 230.10 |

Notes: Standard error in brackets. *** p < 0.01, ** p < 0.05, * p < 0.10
For H1 and H3, where the dependent variable is “Knowledge acquisition from other communities,” additional analyses of marginal and interaction effects are conducted. Model III includes the marginal effect variables, that is, the squared terms of the independent variables. For the dependent variable “Knowledge acquisition from other communities,” the squared term of “Rate of relations with mass media in the temporary exhibition” has a strong negative effect (p < 0.01), and the squared term of “Rate of relations with private companies in the temporary exhibition” has a negative effect (p < 0.05). Figures 3 and 4 show the marginal effects of the two independent variables. Each graph depicts an inverted U-shaped marginal effect, indicating that too great an increase in collaboration with mass media or with private companies in the temporary exhibition inhibits knowledge acquisition from other communities. Comparing Figs. 3 and 4, the vertex in Fig. 3 is higher than that in Fig. 4. This indicates that the relationship with mass media has a greater impact on knowledge acquisition from other communities than the relationship with private companies.
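The location of the vertex of such an inverted U follows directly from the linear and squared coefficients. The sketch below is a hedged illustration, not the authors' calculation: it takes 2.28 and –0.61, which appear to be the linear and squared mass media coefficients of Model III in Table 3, holds all other variables fixed, and ignores the constant, so only the position and the relative height of the peak are meaningful.

```python
def turning_point(linear, squared):
    """x at which linear*x + squared*x**2 peaks (needs squared < 0)."""
    if squared >= 0:
        raise ValueError("an inverted U needs a negative squared-term coefficient")
    return -linear / (2 * squared)


def peak_height(linear, squared):
    """Value of linear*x + squared*x**2 at its turning point."""
    x = turning_point(linear, squared)
    return linear * x + squared * x * x


# Coefficients as read from Table 3 (assumed to belong to Model III).
print(round(turning_point(2.28, -0.61), 2))  # 1.87
print(round(peak_height(2.28, -0.61), 2))    # 2.13
```

Because the variable is standardized, a turning point near 1.87 means the estimated effect of mass media relations stops rising at roughly two standard deviations above the mean.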
Fig. 3. Marginal Effect of Rate of Relations with Mass Media
Model IV includes a variable for the interaction effect. The product of the two independent variables has a negative effect on “Knowledge acquisition from other communities.” Fig. 5 illustrates the interaction effect of the two independent variables. For the “Rate of relations with private companies in the temporary exhibition,” the high level is set to the mean plus one standard deviation and the low level to the mean minus one standard deviation. Figure 5 illustrates that, when a museum has many relationships with private companies other than mass media in the temporary exhibition (+1 standard deviation), knowledge acquisition from other communities becomes easier as the number of relationships with mass media increases. However, once relationships with mass media become very numerous, having fewer relationships with private companies (−1 standard deviation) is more conducive to knowledge acquisition from other communities than having more. Therefore, if a museum’s temporary exhibition has very many relationships with the mass media, the museum is more likely to acquire knowledge from other communities when it has fewer relationships with private companies. However, as the two lines intersect at a rate of relations of approximately 2, and the vertices of the inverted U-shapes in Figs. 3 and 4 also lie at a rate of relations of approximately 2, the marginal effect and the interaction effect set in almost simultaneously. Therefore, up to that point, the more relationships the museum has with both mass media and private companies, the easier it is to acquire knowledge from other communities, indicating a synergistic effect. Model V is a full model that includes both marginal and interaction effect variables. Compared to Model II, Model V with marginal and interaction effects has a lower p-value (p < 0.01) for the independent variable “Rate of relations with private companies in the temporary exhibition,” indicating a strong positive effect on the dependent variable.
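The crossing point of the two lines in an interaction plot of this kind can be derived from the coefficients alone. The following sketch is illustrative and not the authors' code: it uses 0.71, 0.56, and –0.29, which appear to be the Model IV coefficients for mass media, private companies, and their product in Table 3, and it evaluates the private company variable at +1 and –1 because the variable is standardized.

```python
def predicted_part(x, z, b_x, b_z, b_xz):
    """Contribution of the two predictors and their product to the outcome."""
    return b_x * x + b_z * z + b_xz * x * z


def crossover(b_z, b_xz):
    """x at which the z = +1 SD and z = -1 SD lines intersect.

    Setting b_z + b_xz * x equal to -(b_z + b_xz * x) gives x = -b_z / b_xz.
    """
    return -b_z / b_xz


# Coefficients as read from Table 3 (assumed to belong to Model IV).
B_MEDIA, B_PRIVATE, B_INTERACTION = 0.71, 0.56, -0.29

x_star = crossover(B_PRIVATE, B_INTERACTION)
print(round(x_star, 2))  # 1.93
```

Below the crossing point the +1 standard deviation line lies higher, and above it the –1 standard deviation line lies higher, which matches the description that the lines intersect at a rate of relations of approximately 2.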
Fig. 4. Marginal Effect of Rate of Relations with Private Companies
Fig. 5. Interaction Effect between Independent Variables
5 Discussion
This study aims to demonstrate the characteristics of the knowledge that external networks, formed through partnerships with external organizations, bring to museums, which are social purpose organizations, and the mechanisms that generate product innovation. The analysis reveals that museums’ linkages with the mass media and private companies facilitate their knowledge acquisition from other communities and that linkages with the mass media also facilitate their product innovation. We also demonstrate marginal and interaction effects in the impact of mass media linkages and private company linkages on knowledge acquisition from other communities. There are four main contributions of this study. The first is the demonstration that collaboration with other organizations directly impacts product innovation. While previous studies have demonstrated that museums’ attitudes toward collaboration with other organizations indirectly influence their technological innovation [5], this study demonstrates that collaboration with other organizations directly influences product innovation. Camarero and Garrido [5] demonstrated that museums’ awareness of and efforts toward donors promote technological innovation in museums, but Japanese museums have lower donor revenues than foreign museums, making it difficult for them to make a rapid turnaround in such efforts. Temporary exhibitions are held in many museums, and this study, using actual data on temporary exhibition collaborations, suggests a way to generate product innovation without major changes in organizational structure or business models; we consider that this will promote the generation of product innovation in museums. Furthermore, prior studies have not demonstrated which characteristics of organizations and partnerships with them influence museum innovation. The
present results indicate that collaboration with mass media, which have high information-gathering and advertising capabilities, promotes museum product innovation. As product innovation is not about novelty to society but about novelty to the organization in question, it is thought that museums can achieve product innovation by acquiring a wide range of information from mass media through collaboration and by selecting and introducing advanced technologies that fit the organization. This is a theoretical contribution to the study of product innovation, as it demonstrates how product innovation can be promoted in nonprofit organizations that do not have profitable products or services. Second, the visualization of museums’ cooperation with other organizations and of the networks thus constructed has revealed the characteristics of museums in each region. Until now, there has been no visualization of which organizations museums work and network with. The network analysis focused on some Japanese regions, and the differences in the state of cooperation among the regions made it possible to visualize regional characteristics. The network analysis results indicated that local museums with fewer tourism resources are more active in collaborating with other museums and with organizations in different industries and are more active in acquiring knowledge from other communities. The results of the multiple regression analysis demonstrated that the “Chubu dummy” positively impacted “Knowledge acquisition from other communities” and “Number of advanced technology adoptions,” indicating that regional characteristics are related to the knowledge acquisition and product innovation of museums. Showing, from their network positions, which organizations museums collaborate with and acquire knowledge from helps unravel the museum’s knowledge absorption mechanism, which is a theoretical contribution to organizational behavior and network theories.
Furthermore, we demonstrated the marginal and interaction effects of collaboration with organizations in different industries on the museum’s knowledge acquisition from other communities. The results demonstrated inverted U-shaped effects of both relationships with mass media and relationships with private companies in the temporary exhibition on knowledge acquisition from other communities, and also indicated that, when there are more relationships with mass media, having fewer relationships with private companies makes it easier to acquire knowledge from other communities. Museums have limited management resources, and it is difficult to rapidly expand the scale of exhibitions and increase the number of organizations with which they collaborate. This study’s results demonstrate that more active collaboration with the media can promote knowledge acquisition, and this is a theoretical contribution that links museum management theory and organizational behavior theory by showing how limited management resources should be utilized. However, the marginal and interaction effects occurred almost simultaneously, indicating that, up to a certain level, there is a synergistic effect between the relationship with mass media and the relationship with private companies. This indicates that it is important for knowledge absorption to successfully combine the characteristics of the organization aiming to acquire knowledge and the characteristics of the partner organization. This is a theoretical contribution to the theories of organizational behavior and museum management and a practical contribution to museum collaboration.
Finally, we demonstrated the impact of collaboration with organizations from different industries on the museum’s knowledge acquisition. This study demonstrated that collaboration in a temporary exhibition among organizations in different industries, each with different knowledge in its own field of expertise, impacts knowledge acquisition from other communities. This study’s results indicate that the combination of the museum’s research and exhibition capabilities, the mass media’s information-gathering and advertising capabilities, and the private companies’ technological development capabilities in a museum’s temporary exhibition provides the museum with the capacity to search for and acquire external knowledge. Although the theory of knowledge absorptive capacity in firms [9] has been tested in many studies (e.g., [31, 32]), demonstrating that the theory can be applied to museums is novel. This is a theoretical contribution showing that absorptive capacity theory can be adapted to museums, which are not-for-profit organizations, and it also demonstrates how knowledge can be acquired in organizations with limited management resources. One limitation of this study is that, although the impact of external networks on knowledge acquisition and innovation is demonstrated, their impact on performance is not. In this study, the control variable “Importance of exhibition activities dummy” has a strong positive impact on “Knowledge acquisition from other communities” but not on “Number of advanced technology adoptions.” Therefore, museums may choose between “external collaboration in temporary exhibitions” and “adoption of advanced technology in exhibitions.” This indicates that museums choose the business process that fits their organization within their limited resources.
Therefore, future research is needed on the impact of knowledge acquisition from other communities on the economic performance of museums, such as an increase in the number of visitors. The above limitation does not detract from the contribution of this study.
Acknowledgment. This work was supported by JSPS KAKENHI Grant Numbers JP20K20763, 21H00744.
References
1. Young, D.R.: Organizational identity in nonprofit organizations: strategic and structural implications. Nonprofit Manag. Leadersh. 12(2), 139–157 (2001)
2. Alshawaaf, N., Lee, S.H.: Business model innovation through digitisation in social purpose organisations: a comparative analysis of Tate Modern and Pompidou Centre. J. Bus. Res. 125, 597–608 (2021)
3. OECD & Eurostat.: Oslo Manual 2018: Guidelines for collecting, reporting and using data on innovation, 4th Edition, The measurement of scientific, technological and innovation activities, pp. 85–102. OECD Publishing, Paris (2018)
4. Camarero, C., Garrido, M.J., Vicente, E.: How cultural organizations’ size and funding influence innovation and performance: the case of museums. J. Cult. Econ. 35(4), 247–266 (2011)
5. Camarero, C., Garrido, M.J.: Fostering innovation in cultural contexts: market orientation, service orientation, and innovations in museums. J. Serv. Res. 15(1), 39–58 (2012)
6. Japan Association of Museums.: Comprehensive Survey Report of Japanese Museums in the First Year of Reiwa. Japan Association of Museums (2020). https://www.j-muse.or.jp/02program/pdf/R2sougoutyousa.pdf. Accessed 21 Jan 2023
7. Katila, R., Ahuja, G.: Something old, something new: a longitudinal study of search behavior and new product introduction. Acad. Manag. J. 45(6), 1183–1194 (2002)
8. Lane, P.J., Koka, B.R., Pathak, S.: The reification of absorptive capacity: a critical review and rejuvenation of the construct. Acad. Manag. Rev. 31(4), 833–863 (2006)
9. Cohen, W.M., Levinthal, D.A.: Absorptive capacity: a new perspective on learning and innovation. Adm. Sci. Q. 35(1), 128–152 (1990)
10. Oishi, S., Oe, A.: Management innovation and technological innovation in museums – Results of a questionnaire survey on museum characteristics that generate innovation. Japan Association for Cultural Economics (2021)
11. Nomura, M.: Ethnographer at Work: Vol. 4 Masaki Kondo – Planning exhibits at the museum. National Museum of Ethnology. https://older.minpaku.ac.jp/museum/showcase/fieldnews/shigotoba/kondo/wp11. Accessed 21 Jan 2023
12. Furukawa, M.: Feature Exhibit: Summer of Mononoke – Ghosts and Specters in Edo Culture. Internet Museum (2019). https://www.museum.or.jp/report/1220. Accessed 21 Jan 2023
13. Monga, M.: Summer! Ghosts! Ghosts! Let’s go to an exhibition that looks into the other world! Japanese culture entrance magazine Waraku web (2019). https://intojapan-waraku.com/culture/17142/. Accessed 21 Jan 2023
14. Granovetter, M.: Economic action and social structure: the problem of embeddedness. Am. J. Sociol. 91(3), 481–510 (1985)
15. Kumamoto Prefecture Museum Network Center.: Kumamoto Museum Network Center Midterm Plan FY 2019 – FY 2023. Kumamoto Prefecture General Museum Network Portal Site (2015). https://kumamoto-museum.net/wp-content/uploads/2015/09/tyuuki.pdf. Accessed 21 Jan 2023
16. Tokyo University of Agriculture and Technology.: [2019.1.5–3.30] Reeling Yarn from Cocoons: Technological Transition and the Future, a special exhibition commemorating the collaboration between the Tokyo University of Agriculture and Technology Science Museum and Okaya Silk Yarn Museum. Tokyo University of Agriculture and Technology (2018). https://www.tuat-museum.org/%E6%9D%B1%E4%BA%AC%E8%BE%B2%E5%B7%A5%E5%A4%A7%E5%AD%A6%E7%A7%91%E5%AD%A6%E5%8D%9A%E7%89%A9%E9%A4%A8%E3%83%BB%E5%B2%A1%E8%B0%B7%E8%9A%95%E7%B3%B8%E5%8D%9A%E7%89%A9%E9%A4%A8%E9%80%A3%E6%90%BA%E8%A8%98/. Accessed 21 Jan 2023
17. Niigata Prefectural Museum of History.: Children’s Jomon Research Exhibition 2019 Jomon’s Message, My Message. Niigata Prefectural Museum of History (2019). http://nbz.or.jp/?p=21500. Accessed 21 Jan 2023
18. Niigata City Art Museum.: Come, Bauhaus - the origin of art and design. Niigata City Art Museum (2019). http://www.ncam.jp/exhibition/5040/. Accessed 21 Jan 2023
19. TOPPAN.: National Museum of Nature and Science and Toppan Printing to offer new dinosaur experiences through VR content and online courses. TOPPAN (2021). https://www.toppan.co.jp/news/2021/01/newsrelease_210119.html. Accessed 21 Jan 2023
20. Suzuki, S.: Chapter 5: Museum exhibition theory. Introduction to modern museum studies (Awata, S. ed.), pp. 135–173, Minerva Shobo, Tokyo (2019)
21. Mizuho Research & Technologies.: Survey on Sustainable Museum Management: Identification of Issues Facing Museums and Examples of Efforts to Solve Them. Mizuho Research & Technologies (2019). https://www.mizuho-ir.co.jp/publication/mhri/sl_info/working_papers/pdf/report20190401.pdf. Accessed 21 Jan 2023
22. Camarero, C., Garrido, M.J.: The role of technological and organizational innovation in the relation between market orientation and performance in cultural organizations. Eur. J. Innov. Manag. 11(3), 413–434 (2008)
23. Bonacich, P.: Factoring and weighting approaches to status scores and clique identification. J. Math. Sociol. 2(1), 113–120 (1972)
24. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004)
25. Niigata Prefecture.: Action Plan for Promotion of Tourism in Niigata Prefecture. Niigata Prefecture (2021). https://www.pref.niigata.lg.jp/uploaded/attachment/264706.pdf. Accessed 21 Jan 2023
26. Yamanashi Prefectural Museum of Art.: The History of Yamanashi Prefectural Museum of Art Part 1: Birth of the Museum! -A record of the opening of the museum. Yamanashi Prefectural Museum of Art (2018). https://www.art-museum.pref.yamanashi.jp/permanent/history.pdf. Accessed 21 Jan 2023
27. Tanaka, Y.: Prefectural Art Museum: 40% increase in visitors due to the Millet Pavilion effect. Yamanashi Nichinichi Shimbun (2009). http://sannichi.lekumo.biz/minamialps/2009/12/30ken.html. Accessed 21 Jan 2023
28. Fujita, R.: Not only The Sower but also Millet’s masterpieces in Yamanashi: Unknown Amazing Art of Japan (12th). JBpress Autograph (2020). https://jbpress.ismedia.jp/articles/-/62652. Accessed 21 Jan 2023
29. Yamanashi Prefectural Museum of Art.: Yamanashi Prefectural Museum of Art as a creative base for culture and art Formulation of a vision for the 50th anniversary and the Metaverse Experimental Project. Yamanashi Prefectural Museum of Art (2022). https://www.art-museum.pref.yamanashi.jp/news/2022/09/detail_872.html. Accessed 12 Jan 2023
30. Zuur, A.F., Ieno, E.N., Elphick, C.S.: A protocol for data exploration to avoid common statistical problems. Methods Ecol. Evol. 1(1), 3–14 (2010)
31. Forés, B., Camisón, C.: Does incremental and radical innovation performance depend on different types of knowledge accumulation capabilities and organizational size? J. Bus. Res. 69(2), 831–848 (2016)
32. Szulanski, G.: Exploring internal stickiness: impediments to the transfer of best practice within the firm. Strateg. Manag. J. 17(S2), 27–43 (1996)
Does Guaranteeing Anonymity in SNS Use Contribute to Regional Revitalization?
Yurika Shiozu1(B), Soichi Arai2, Hiromu Aso3, Yuto Ohara3, Ichiro Inaba3, and Katsunori Shimohara3
1 Kyoto Sangyo University, Kyoto 603-8555, Japan
[email protected]
2 Akita University, Akita 010-8502, Japan
3 Graduate School of Science and Engineering, Doshisha University, Kyotanabe 610-0394, Japan
Abstract. To avoid problems on social networking services (SNSs), people are warned not to disclose their personal information, while it is also said that if everyone used SNSs under their real names, inappropriate use would be discouraged. In fact, users use their real names when they intend to interact with others and remain anonymous when they intend to gather information. By allowing users to use their real names or anonymous names for different functions within the same application, could SNSs be utilized to revitalize local communities? In this paper, we developed our own application and conducted social experiments with three groups in two regions. The results of the analysis suggest that when people who know each other offline communicate with each other in online communities, there are no age or regional differences in the use of functions that can be used anonymously, but there are regional differences in the use of functions that can only be used non-anonymously. In other words, the results indicate that when utilizing SNSs for the purpose of revitalizing local communities, it is effective to implement functions that consider regional differences in social networks, even if people already know each other offline.
Keywords: Anonymity · Social network · Regional difference
1 Introduction
Recently, social networking services (SNSs) have been used not only for web marketing by companies but also for public relations activities by governments and municipalities, the activation of local communities, and other non-personal purposes. One possible explanation for this surge is that SNSs have been adopted by a wide spectrum of users in all age groups. According to [6], the usage rate of LINE exceeds 80% for all age groups from teens to 60s, while the usage rate of YouTube exceeds 60% for all age groups from teens to 60s. On the other hand, the Twitter usage rate exceeds 50% among young people in their teens to 30s but is less than 50% among those in their 40s and older. A similar trend is seen in Instagram usage. Thus, the usage of certain SNSs is no longer skewed by age.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 173–187, 2023. https://doi.org/10.1007/978-3-031-35129-7_12
Y. Shiozu et al.
As the use of SNSs has become increasingly widespread, problems have also become apparent. For example, as pointed out by [8], some SNS users are overly dependent on SNSs for “likes” and other types of praise from other users, which can negatively affect their mental and physical health. There are also people who post inappropriate content and cause social problems. A study of the relationship between SNS dependence and age in Japan [2] found that the rate of SNS dependence was highest among teenagers and lowest among those in their 60s. SNS dependence is a problem not only in Japan but also in other countries. Communication in online communities was originally intended to be a benevolent exchange. However, dependence on SNSs is increasing, and social problems related to SNS use (such as slander, flaming, and inappropriate posts) have been on the rise as well. Recently, measures for addressing these problems have been studied and implemented. For example, one study [3] proposed displaying “likes” as a percentage rather than as a count, and suggested implementing a function that displays the names of the users who expressed likes. In addition, as a matter of Internet literacy, users have been advised not to release personal information on SNSs in order to avoid problems. On the other hand, it has also been suggested that having all SNS users use their real names would likely deter slander and inappropriate posts. In fact, SNS users use anonymous names, real names, or nicknames from which real names can be guessed, depending on their motivation. They tend to use their real names when the purpose is to expand their business network or to communicate with offline friends, while they lean toward anonymity when the purpose is to collect information or pass the time.
A series of surveys have been made by the Ministry of Internal Affairs and Communications for determining whether people communicate in online communities anonymously or under their real names, depending on their region and age. (See [5] and [6]). These surveys collected real-name usage rates by the SNS type, using questionnaires, and showed that in other countries, real-name usage is more common than in Japan for any SNS and that in Japan the rate of anonymous use of SNSs by type is age-independent. These results suggest that real-name use should be assumed when using SNSs to activate offline interactions within a community. However, since people tend to use SNSs anonymously when the purpose is to collect information, it is unclear whether SNSs could be utilized for regional revitalization by allowing anonymous use by function within the same application, although real names are assumed to be used. Sociological findings revealed regional differences in real-life social networks. For functions that can be used anonymously within the same application for the purpose of community activation, and functions that can only be used with real names, are there age or regional differences in usage?
Does Guaranteeing Anonymity in SNS
The research questions addressed by this study were whether regional and age differences exist when communicating in online communities, even when communicating anonymously, and whether regional and age differences exist when using communication functions that share the participants’ real names and information. To answer these research questions, we developed a unique social media application to activate communication in a local community. The application introduced points as an element of gamification; the points have no economic value, and the point system was implemented as an indicator of intentionality. We then conducted a social experiment using the originally developed application with three groups: 1) members of a community development non-profit organization (NPO), mainly in their 70s, 2) a group of university students located in the same area (University A), and 3) a group of university students located in another area (University B). The participants in the experiment knew each other offline. Analysis of the data obtained did not reveal statistically significant differences by region or age in the use of anonymous functions. On the other hand, with respect to the function of visualizing connections between individuals in a non-anonymous manner using points that had no economic value, the results revealed no age-related differences but suggested regional differences. Regional differences in social networks were observed when the network structures formed by each group's use of the non-anonymous function were analyzed using social network analysis. These results suggest that when people who know each other offline communicate with each other in online communities, there are no age or regional differences associated with the use of universal functions that can be used anonymously; however, there are regional differences associated with the use of functions that can only be used non-anonymously.
In other words, when utilizing SNSs to revitalize local communities, it is effective to implement functions that account for regional differences in social networks, even if users already know each other offline. The social experiment is described in Sect. 2. Section 3 presents the results of the data analysis. Section 4 provides a discussion. Finally, Sect. 5 presents the conclusions and discusses future issues.
2 Social Experiment

2.1 CSD App

In this study, an original app called Community System Design (CSD) was developed, and a social experiment was conducted. The CSD app aims to activate communication within a local community. Points are awarded for actions in the app; the points themselves have no economic value and are only meant as an indicator of intentionality. The rubric according to which CSD points were awarded is listed in Table 1.
Table 1. Rubric for CSD points

Function  Action                                                   Number of points
checkin   Submit a photo                                           30
comment   Write comments when giving points to other participants  20
game      Take a position in a game of camp                        30
gift      Receive points from others                               Optional points
like      Like a posted photo                                      5
meet      Meet face-to-face with other participants                50
step      Record the number of steps taken with an activity meter  1 point for every 10 steps
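The rubric in Table 1 can be read as a simple lookup from actions to points. The following is a minimal sketch of that reading, not the CSD app's actual implementation: the function name `daily_points`, the rounding down of partial step blocks, and the example counts are all assumptions made for illustration.

```python
# Fixed point values taken from Table 1. "gift" is handled separately because
# its amount is optional and chosen by the giver.
POINTS_PER_ACTION = {"checkin": 30, "comment": 20, "game": 30, "like": 5, "meet": 50}
STEPS_PER_POINT = 10  # Table 1: 1 point for every 10 steps


def daily_points(action_counts, steps=0, gifted_points=0):
    """Total points for one participant on one day (hypothetical helper).

    action_counts maps a function name to how often it was used that day.
    Partial blocks of fewer than 10 steps are dropped (an assumption).
    """
    fixed = sum(POINTS_PER_ACTION[name] * count for name, count in action_counts.items())
    return fixed + steps // STEPS_PER_POINT + gifted_points


# Hypothetical day: one photo, three likes, one face-to-face meeting, 4205 steps.
print(daily_points({"checkin": 1, "like": 3, "meet": 1}, steps=4205))  # 515
```

Keeping the rubric in one dictionary makes it easy to see that a single face-to-face meeting is worth as much as ten likes, which is relevant when the point totals are later interpreted as an indicator of intentionality.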
2.2 Subjects and Duration of the Social Experiments

A social experiment was conducted on three groups of participants, using the CSD app. The first group consisted of 24 NPO participants. The second group consisted of nine university seniors from Region A who were taking the same class. The third group consisted of 11 first-year university students from Region B who were also taking the same class. Participants from the NPO and University A groups were given the option of borrowing a smartphone with the CSD app installed or installing the app on their personally owned smartphones, and were given Omron activity meters for the social experiment. Participants from the University B group were given smartphones with the CSD app installed and Omron activity meters. A training session was held in advance for all participants to explain the device operation. Before starting any social experiments, ethical screening was conducted at Doshisha University and Kyoto Sangyo University, and the experiments were conducted only after receiving approval. Summary statistics of participants and the duration of the experiment are listed in Table 2.

Table 2. Overview of participants

Group         Number of participants  Period                         Age
NPO           Male: 10, Female: 14    Jan. 1, 2022 – Mar. 31, 2022   30s–80s
University A  Male: 7, Female: 2      June 2, 2022 – June 16, 2022   20s
University B  Male: 4, Female: 7      June 30, 2022 – Oct. 7, 2022   10s–20s
2.3 Use of Functions

Summary statistic plots for the participants are shown in Fig. 1. R 4.3.1 was used for the analysis.
Fig. 1. Histograms, probability density distribution graphs and boxplots
The bottom row in Fig. 1 shows the histograms, and the graphs on the diagonal show the probability density distributions. Because the experimental periods differed across the groups, the number of days in the GROUP histogram was smaller for the University A group (shown in pink) than for the other groups. Inspection of the histograms for checkin, comment, game, gift, like, meet, and step showed that, for checkin, comment, gift, and like, days with no recorded use were the most frequent in every group. However, the distributions of games and meets varied for the NPO group, and the steps varied across the NPO and University A groups. Considering the different experimental periods of each group, we also checked the probability density distributions and found that not all distributions were normal. The distributions of data within groups by function were also confirmed by inspecting the box-and-whisker diagrams, which are shown in the first column on the right side of Fig. 1. Checkins and likes exhibited similar distributions of data among the three groups, but there were differences between the box-and-whisker diagrams of the different groups with respect to the other functions. Comments and gifts exhibited similar distributions across the NPO and University B groups, but games and meets exhibited similar distributions across the University A and University B groups. The data distributions across the University A and University B groups were similar. This suggests that there may be differences between the functions used by university students and adults. The descriptive statistics are listed in Table 3.

Table 3. Descriptive statistics

NPO (N = 90)
Function  mean   sd     median  mode  min  max  skew   kurtosis  se
checkin   0.07   0.36   0       0     0    3    6.53   46.91     0.04
comment   0.21   0.63   0       0     0    3    2.79   6.57      0.07
game      24.22  26.90  15      0     0    134  1.61   2.75      2.84
gift      0.32   1.03   0       0     0    6    3.78   14.84     0.11
like      0.60   2.72   0       0     0    21   6.05   38.63     0.29
meet      7.56   8.87   4       0     0    47   1.80   3.74      0.93
step      6.90   2.45   8       8     0    10   −1.82  2.83      0.26

Univ. A (N = 15)
Function  mean   sd     median  mode  min  max  skew   kurtosis  se
checkin   0.00   0.00   0       0     0    0    NaN    NaN       0.00
comment   2.53   4.36   0       0     0    13   1.19   −0.21     1.12
game      0.13   0.35   0       0     0    1    1.95   1.93      0.09
gift      2.60   4.75   0       0     0    15   1.36   0.50      1.23
like      5.73   19.85  0       0     0    77   3.06   8.08      5.13
meet      2.13   7.20   0       0     0    28   3.07   8.11      1.86
step      2.73   1.58   2       2     1    8    2.34   5.36      0.41

Univ. B (N = 93)
Function  mean   sd     median  mode  min  max  skew   kurtosis  se
checkin   0.83   3.69   0       0     0    32   6.80   52.27     0.38
comment   0.03   0.31   0       0     0    3    9.33   86.06     0.03
game      0.19   1.87   0       0     0    18   9.33   86.06     0.19
gift      0.11   0.94   0       0     0    9    9.17   83.91     0.10
like      0.49   2.99   0       0     0    27   7.79   64.35     0.31
meet      0.02   0.21   0       0     0    2    9.33   86.06     0.02
step      0.23   0.85   0       0     0    6    5.37   30.89     0.09
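As a small consistency check on Table 3, the se column can be related to the sd column and the group sizes. The sketch below assumes that se is the standard error of the mean, sd / sqrt(N), which the paper does not state explicitly; the helper name `standard_error` is introduced here for illustration only.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean, assuming se = sd / sqrt(n)."""
    return sd / math.sqrt(n)


# (label, sd, N, se as printed in Table 3)
rows = [
    ("NPO checkin", 0.36, 90, 0.04),
    ("NPO game", 26.90, 90, 2.84),
    ("Univ. A like", 19.85, 15, 5.13),
    ("Univ. B checkin", 3.69, 93, 0.38),
]
for label, sd, n, printed in rows:
    print(label, round(standard_error(sd, n), 2), printed)
```

Under this assumption, the recomputed values agree with the printed se values for the rows checked, which also supports the reading that N in Table 3 counts daily records per group rather than participants.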
3 Analysis

3.1 Group Difference Tests

In this section, we first examine whether the use of each function differed across the experimental groups, to clarify whether such differences were owing to age or regional differences. As shown in Table 2, the experimental periods for the different groups were not the same, and as shown in Fig. 1, the data did not follow a normal distribution; therefore, the non-parametric Kruskal–Wallis test was used. R 4.3.1 was used for the analysis. The test results are listed in Table 4.

Table 4. Results of the Kruskal–Wallis test

function  chi-squared  p-value
checkin   5.1707       0.07537
comment   20.994       0.00002762
game      124.87       0.0000*
gift      13.713       0.001053
like      2.0065       0.3667
meet      122.62       0.0000*
step      139.26       0.0000*

* represents 0.0000000000000000000000022.
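To make the test in Table 4 concrete, the following is a minimal pure-Python sketch of the tie-corrected Kruskal–Wallis statistic. It is not the authors' R code, and the demo data are hypothetical. With three groups the reference distribution is chi-squared with 2 degrees of freedom, whose upper tail is exp(-H/2), so the p-values printed for checkin, comment, gift, and like can be recomputed directly from the printed chi-squared values.

```python
import math
from collections import Counter
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    offset = 0
    for group in groups:
        rank_sum = sum(ranks[offset:offset + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        offset += len(group)
    h = 12.0 / (n * (n + 1)) * weighted_rank_sums - 3 * (n + 1)
    # Tie correction: many days have zero use, so ties are common in such data.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def p_value_three_groups(h):
    """Chi-squared upper tail with 2 degrees of freedom: exp(-h / 2)."""
    return math.exp(-h / 2.0)


# Hypothetical daily counts for three groups (not the study's data).
demo = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_demo = kruskal_wallis_h(demo)
print(round(h_demo, 3), round(p_value_three_groups(h_demo), 4))

# Chi-squared values printed in Table 4 for checkin, comment, gift, and like.
for chi_squared in (5.1707, 20.994, 13.713, 2.0065):
    print(chi_squared, p_value_three_groups(chi_squared))
```

The recomputed tail probabilities agree with the p-values listed for those four functions, which is consistent with a three-group test with 2 degrees of freedom.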
The results in Table 4 suggest that there was a difference in the usage of the COMMENT, GAME, GIFT, MEET, and STEP functions between the groups, at the 1% level of statistical significance. On the other hand, there was no significant difference in the use of likes and checkins. This indicates that there might be regional differences in the use of comments, games, gifts, meets, and steps, but not regional or age differences associated with the use of likes and checkins. The Kruskal–Wallis test indicates whether a difference exists among the groups but does not indicate which specific pairs of groups differ. Thus, and given that the data were not normally distributed (Fig. 1), the non-parametric Wilcoxon test was used for comparing pairs of experimental groups. For post-hoc analysis, we used the Bonferroni correction and the Holm correction. The Bonferroni adjustment was performed using Eq. (1), in which $p_B$ is the Bonferroni-corrected p-value, $p$ is the p-value reported by the Wilcoxon test, and $N$ is the number of group pairs. Similarly, the significance level was divided by $N$ to obtain the Bonferroni-corrected significance level.

\[ p_B = \frac{p}{N} \tag{1} \]

The Holm method was implemented as in Eq. (2), where $p_H$ is the Holm-adjusted p-value and rank represents the rank of the p-value in the Wilcoxon test when the p-values are ranked in ascending order.

\[ p_H = \frac{p \cdot \mathrm{rank}}{N} \tag{2} \]

The results of the tests are listed in Table 5.
Table 5. Adjusted p-values obtained after the Bonferroni and Holm corrections

comment              p-value  rank  Bonferroni correction  Holm correction
NPO vs. Univ. A      0.0127   3     0.0042                 0.0127
NPO vs. Univ. B      0.0049   2     0.0016                 0.0098
Univ. A vs. Univ. B  0.0000   1     0.0000                 0.0000

game                 p-value  rank  Bonferroni correction  Holm correction
NPO vs. Univ. A      0.0000   2     0.0000                 0.0000
NPO vs. Univ. B      0.0000   1     0.0000                 0.0000
Univ. A vs. Univ. B  0.0088   3     0.0029                 0.0088

gift                 p-value  rank  Bonferroni correction  Holm correction
NPO vs. Univ. A      0.0964   3     0.0321                 0.0964
NPO vs. Univ. B      0.0049   2     0.0016                 0.0097
Univ. A vs. Univ. B  0.0001   1     0.0000                 0.0004

meet                 p-value  rank  Bonferroni correction  Holm correction
NPO vs. Univ. A      0.0000   2     0.0000                 0.0001
NPO vs. Univ. B      0.0000   1     0.0000                 0.0000
Univ. A vs. Univ. B  0.0003   3     0.0001                 0.0003

step                 p-value  rank  Bonferroni correction  Holm correction
NPO vs. Univ. A      0.0000   3     0.0000                 0.0000
NPO vs. Univ. B      0.0000   1     0.0000                 0.0000
Univ. A vs. Univ. B  0.0000   2     0.0000                 0.0000
For the pairwise comparison of the three groups, the Bonferroni-adjusted significance level corresponding to the 5% level was 0.0167 (0.05/3). Thus, there was no significant difference between the distributions of GIFT usage by the NPO and University A groups, whereas significant differences were observed for all other combinations. Because the Bonferroni correction is quite conservative, we also considered the Holm method. The Holm method likewise reported no significant difference between the distributions of GIFT use by the NPO and University A groups and significant differences for all other combinations. That is, the same results were obtained with both the Holm and Bonferroni methods. Therefore, we concluded that there were regional differences associated with the use of the COMMENT, GAME, MEET, and STEP functions as well as differences owing to the participants’ age. In addition, there were regional differences associated with the use of the GIFT function.
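The two multiple-comparison procedures can be sketched in a few lines. The code below computes the Bonferroni-adjusted significance level and the textbook Holm step-down adjustment, in which the k-th smallest of N p-values is multiplied by N - k + 1 and the adjusted values are kept non-decreasing. Using this form of the Holm adjustment is an assumption on our part; applied to the rounded comment p-values printed in Table 5, it gives the same values as the Holm column for that function.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Significance level after dividing by the number of comparisons."""
    return alpha / n_comparisons


def holm_adjust(p_values):
    """Textbook Holm step-down adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order, start=1):
        value = min(1.0, (m - rank + 1) * p_values[index])
        running_max = max(running_max, value)  # keep adjusted values monotone
        adjusted[index] = running_max
    return adjusted


# Threshold for three pairwise comparisons at the 5% level.
alpha_b = bonferroni_alpha(0.05, 3)
print(round(alpha_b, 4))  # 0.0167

# Raw Wilcoxon p-values for "comment" as printed in Table 5:
# NPO vs. Univ. A, NPO vs. Univ. B, Univ. A vs. Univ. B.
comment_p = [0.0127, 0.0049, 0.0000]
print([round(p, 4) for p in holm_adjust(comment_p)])  # [0.0127, 0.0098, 0.0]

# The only non-significant pair reported is gift, NPO vs. Univ. A.
print(0.0964 < alpha_b)  # False
```

Comparing raw p-values against alpha / N and comparing N-times-inflated p-values against alpha lead to the same decisions, so either presentation of the Bonferroni correction supports the conclusion drawn for the GIFT function.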
3.2 Social Network Analysis

In the CSD application, the GIFT function can be used to award points from Individual A to Individual B, and each transfer of points is associated with the individual to whom the points are awarded. Therefore, social network analysis can reveal the network structure of each group associated with the use of the GIFT function. Since there was no difference in the use of the GIFT function between the NPO and University A groups, whereas there were differences between the NPO and University B groups and between the University A and University B groups, the network structures were compared. Following the method of [10], a directed graph was created for each group, with the points gifted by any individual A to another individual B as edge weights. Note that all NPO participants who used the GIFT function were in Region A. The network characteristics of the NPO, University A, and University B groups are summarized in Table 6. Gephi 0.10 was used for the analysis.

Table 6. Overview of each network structure

                                NPO    Univ. A  Univ. B
Average degree                  1.5    1.444    0.833
Average weighted degree         175    38.444   55.833
Diameter                        3      3        3
Density                         0.167  0.181    0.076
Modularity class                2      2        3
Average clustering coefficient  0.235  0.23     0
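The degree and density rows of Table 6 can be illustrated with a short sketch for a weighted directed graph. The conventions used here are assumptions rather than a statement of how the analysis tool computes them: average degree is taken as directed edges per node, average weighted degree as total gifted points per node, and density as the share of the n(n - 1) possible directed links that are present. The edge list is hypothetical.

```python
def network_summary(edges, n_nodes):
    """Summary of a directed gift network.

    edges maps (giver, receiver) to the total points gifted along that link.
    """
    n_edges = len(edges)
    return {
        "average_degree": n_edges / n_nodes,
        "average_weighted_degree": sum(edges.values()) / n_nodes,
        "density": n_edges / (n_nodes * (n_nodes - 1)),
    }


# Hypothetical four-person group: A and B gift each other, A also gifts C,
# and D takes no part in gifting.
edges = {("A", "B"): 100, ("B", "A"): 50, ("A", "C"): 30}
summary = network_summary(edges, n_nodes=4)
print(summary)
```

Under these conventions, density equals the average degree divided by n - 1. For the University A group, with nine participants, 1.444 / 8 is approximately 0.18, which is in line with the density of 0.181 listed in Table 6.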
Table 6 shows that the network diameter was 3 for all of the experimental groups, indicating that the individuals in each group were connected through at most three links and that the sizes of the networks were similar. However, the average degree (the average number of point-giving links per member within a group) was above one for the NPO and University A groups, but below one for the University B group. The average weighted degree, which reflects the number of points given, exceeded 100 for the NPO group but was under 100 for both the University A and University B groups. The density, that is, the proportion of possible point-giving combinations within a group in which points were actually given, was above 0.1 for the NPO and University A groups, but below 0.1 for the University B group. The number of subgroups within a group was two for the NPO and University A groups, while it was three for the University B group. The average clustering coefficient $C$ measures the connectivity of a person’s neighborhood (excluding the person) and can be calculated using Eq. (3). Let $v_i$ denote the degree of node $i$, $e$ denote the number of real edges between node $i$ and its neighbors, and $N$ denote the overall number of nodes in the analyzed group.

\[ C = \frac{1}{N} \sum_{i=1}^{N} \frac{e}{\frac{1}{2} v_i (v_i - 1)} \tag{3} \]
Thus, for the structure in Fig. 2, the average cluster coefficient is 1, while for the structure in Fig. 3, it is 0.
Fig. 2. Perfect graph
Fig. 3. Star graph
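The two limiting cases in Figs. 2 and 3 can be checked with a minimal sketch of Eq. (3) for an undirected graph. Two assumptions are made here: $e$ is read as the number of edges that exist among the neighbors of node $i$, and nodes with fewer than two neighbors contribute 0 to the average (tools differ on this point). The node names are hypothetical.

```python
from itertools import combinations


def average_clustering(adjacency):
    """Average over all nodes of e / (v_i * (v_i - 1) / 2).

    adjacency maps each node to the set of its neighbours (undirected graph).
    """
    total = 0.0
    for neighbours in adjacency.values():
        degree = len(neighbours)
        if degree < 2:
            continue  # no pair of neighbours to connect; counted as 0
        # e: number of edges that actually exist among this node's neighbours
        links = sum(
            1 for a, b in combinations(sorted(neighbours), 2) if b in adjacency[a]
        )
        total += links / (degree * (degree - 1) / 2)
    return total / len(adjacency)


# Four mutually connected members, as in the fully connected case of Fig. 2.
complete = {n: {m for m in "ABCD" if m != n} for n in "ABCD"}
# One hub connected to four members who are not connected to each other (Fig. 3).
star = {"hub": {"A", "B", "C", "D"}, "A": {"hub"}, "B": {"hub"}, "C": {"hub"}, "D": {"hub"}}

print(average_clustering(complete))  # 1.0
print(average_clustering(star))      # 0.0
```

In the star case every neighbor pair of the hub is unconnected, so each term in the sum is zero, which is why an average clustering coefficient of 0 points to a hub-and-spoke structure.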
From Table 6, the average clustering coefficient was above 0 for the NPO and University A groups, but 0 for the University B group. Therefore, the network of University B had the shape shown in Fig. 3. Furthermore, the clustering coefficients of the individuals in each group were calculated, and the individuals were ranked based on these values. Figures 4, 5, and 6 show the rankings for the NPO, University A, and University B groups, respectively. In these figures, which were produced with Gephi 0.10, pink indicates low-ranking clustering coefficients, darker green indicates higher-ranking clustering coefficients, and thicker lines indicate higher weights. In Figs. 4 and 5, the blue dotted group consists of individuals with high clustering coefficients, while the red dotted group consists of individuals with low clustering coefficients. For both the NPO group and the University A group, we observed that members are interconnected within the blue dotted group. However, when T is left out of the NPO group and I is left out of the University A group, the cohesion within the red dotted group is weakened. In particular, in the University A group, A, B, C, F, and G are isolated vertices. This indicates that the NPO group and the University A group have a central vertex, but also have connections among members outside the central vertex. The results also indicate that the NPO group and the University A group have mutual gifting of points, albeit only between some members.
Fig. 4. Ranking of cluster coefficient for NPO
Fig. 5. Ranking of cluster coefficient for University A
Fig. 6. Ranking of clustering coefficient for University B
However, in the University B group, the clustering coefficients for all group members had the same rank. In fact, when the clustering coefficients of the different individuals were checked, all the clustering coefficients had a value of 0. In other words, the shape of the network was considered to be a variant of the shape shown in Fig. 3. Since the modularity class is 3, the group is divided into three subgroups, each of which is drawn with a dotted line in Fig. 6. Leaving out I, H, D, and J produces isolated vertices, and the group is divided into three subgroups that are not connected to each other. In addition, all point grants between members were in one direction.
4 Discussion

The CSD app required real-name registration to activate communication within the community. We examined whether communication activation within a group was observed in the three studied groups using each function of the CSD app. The test revealed that there was no difference associated with the usage of photo postings (checkin) and “likes” among the three groups. This might be because not only Twitter and Instagram but also LINE and YouTube are used by a wide spectrum of ages, and because these SNSs have photo posting and “like” functions, the use of the photo posting and “like” functions in the CSD app was not affected by age or region. Other functions, however, such as those similar to game applications or linked to health-management applications, were not SNS-specific and were more affected by personal preferences. In other words, those who originally liked games and those who were
health-conscious tended to use these functions, while those who were not interested in them did not tend to use these functions. Since the groups in this study were not formed on the basis of personal preferences, it is likely that the use of these functions depends not on age or regional differences but on how many people in the group like games or are health-conscious. One of the features of the CSD app allows an individual to gift points to others in the individual’s name, and the giving behavior is disclosed to other members of the group. Although other apps implement the function of gifting points that have no economic value, these points are often a reward for some action and not a direct gift to a specific individual. In addition, many apps allow points to be gifted using nicknames rather than real names. All three groups in this study knew each other offline and knew each other’s real names (not just the nicknames in the online community). However, there were statistically significant differences in the use of the GIFT function between the NPO and University B groups, and between the University A and University B groups. A comparison of the University A and University B groups showed that the use of the GIFT function differed between two groups of university students with no significant difference in the number of participants, suggesting that the differences in the use of the GIFT function were not driven by age. Since the NPO and University A groups were located in the same area and there was no difference in the use of the GIFT function, and since the University A and University B groups were located in different areas, the difference in the use of the GIFT function might be owing to regional differences. Wellman and Leighton [11] and Fischer [1] proposed that regional differences exist in social networks, and numerous studies have suggested the existence of such differences.
These previous studies have shown that there are regional differences associated with the distance from friends’ residences and with the number of friends who provide support in addition to relatives. This study suggests that there are regional differences even in the relatively low-burden behavior of giving points through the CSD app, as opposed to providing support in difficult situations such as illness or distress. When the point-giving behavior was analyzed using social network analysis and the network structures were compared across the different experimental groups, it was found that the University B group had a clustering coefficient of 0 for all participants. This means that the social network of the University B group was fragile in terms of connections among the group members, and the network could easily disintegrate if the central vertex left. According to residents of Region B, people in Region B tend to be shy. Given that there was no difference in the use of the “like” function between Regions A and B, it is possible that the University B group did not use the GIFT function because they did not want their names and point gifts to be known within the group. In addition, it is also possible that the students had only been at the university for a short period of time and had difficulty building relationships outside of class owing to the COVID-19 pandemic, which might have affected their use of the GIFT function. For the NPO and University A groups, we observed that some of those who received points gave points in return. However, for the University B group, no reciprocal point-giving was observed. In general, gifts between individuals are subject to reciprocity. However, because the CSD app does not have a function to notify the recipient that
he/she has received points, it is possible that the recipient was unaware of any received points. Therefore, we can infer that regional differences are not the only reason explaining why two-way point giving was not observed.
5 Conclusion and Remarks

This study examined whether regional and age differences arise when communicating in online communities, even when communicating anonymously, and whether regional and age differences arise when communication is performed using real names; the study was performed using an originally developed application. Namely, we conducted a social experiment with three groups in two regions. When communicating anonymously in an online community, regional and age differences did not arise for features that were available in existing SNSs and were used by users with a wide range of ages, such as posting photos and granting likes. However, with respect to the functions influenced by users’ interests, such as games and health management, the results showed that usage differed depending on whether there were interested participants in the group, rather than on the influence of age or region. The results also suggest the possibility of regional rather than age differences in the use of communication functions that use real names and share information among the participants. Prior research has pointed to four reasons why people merely view and express “likes” of others’ posts: environmental influences, individual preferences, individual and group relationships, and security aspects (see [9]). In particular, those that affect individual preferences point to personality-related issues such as being concerned about the content of comments from others and being shy (see [7]). The results of this study indicate that users in regions with more shyness avoid using communication functions that share real names and information among participants compared with users in other regions, which is consistent with the conclusions of previous studies. However, it is possible that personality effects could be moderated if participants were able to build relationships offline over time. Whether the effect of personality or time is stronger remains unclear.
In addition, gifting between individuals is reciprocal. However, the application used in this study did not implement a notification function, so it is possible that participants who were approached by others through the app were not aware of it. In future studies, notification functions will be implemented for both the anonymous and non-anonymous modes, allowing us to better examine the impact of anonymity and non-anonymity on the activation of communication. We will continue our research on these two issues.

Acknowledgement. The authors acknowledge and thank all participants of our social experiments for their cooperation in preparation of this paper. We received great help from Vitalify Asia Co. Ltd. in the development of the app, and we would like to express our gratitude to them. This study was funded by JSPS Kakenhi (Grants-in-Aid for Scientific Research by Japan Society for the Promotion of Science) No. JP21K12554.
References

1. Fischer, C.S.: What do we mean by ‘friend’? An inductive study. Soc. Netw. 3(4), 287–306 (1982)
2. Hashimoto, Y.: Internet addiction in Japan; focusing on SNS addiction. Stresskagakukenkyu 33, 10–14 (2018). in Japanese
3. Kawabata, K., Nakata, Y., Kitani, Y.: Study about the psychological influences of like in SNS and its graphic design. Bull. JSSD 2017, 236–237 (2017). in Japanese
4. Ministry of Internal Affairs and Communications: 2014 White Paper on Information and Communications (2014)
5. Ministry of Internal Affairs and Communications: 2015 White Paper on Information and Communications (2015)
6. Ministry of Internal Affairs and Communications: Report on Survey on Information and Communications Media Usage Time and Information Behavior in FY2021 (2022)
7. Nonnecke, B., Preece, J.: Lurker demographics: counting the silent. In: Proceedings of CHI00: Human Factors in Computing Systems, pp. 73–80. Association for Computing Machinery, New York (2000)
8. Rosen, L.D., Carrier, L.M., Cheever, N.A.: iDisorder: Understanding our Obsession with Technology and Overcoming its Hold on us. Macmillan, New York (2012)
9. Sun, N., Rau, P.P.-L., Ma, L.: Understanding lurkers in online communities: a literature review. Comput. Hum. Behav. 38, 110–117 (2014)
10. Wasserman, S., Faust, K.: Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge (1994)
11. Wellman, B., Leighton, B.: Networks, neighborhoods, and communities: approaches to the study of the community question. Urban Aff. Q. 14(3), 363–390 (1979)
Effects of Poor Visibility on Riding Anxiety in Riding a Bicycle that Can Be Ridden with Two Infants

Sakurako Toyoshima, Makoto Oka, and Hirohiko Mori

Tokyo City University, Tokyo, Japan
Abstract. The position of the front seat installation and the attachment of a rain cover while riding a bicycle that can be ridden with two infants make it difficult to check the road ahead due to poor visibility. Therefore, in order to clarify the effect on riding anxiety caused by poor visibility due to the rain cover attached to the front seat, a riding experiment was conducted using a styrene board that resembled a rain cover. The eye mark recorder was used to measure the line of sight, and the participants were asked to answer a questionnaire after riding, and an evaluation was made based on these results. As a result, it was found that poor forward visibility of riders affects their sense of riding anxiety, leading to the conclusion that for riders who install a rain cover or high front seats, those with a forward viewable distance of 5 m or less are recommended. Keywords: Bicycle · Poor visibility · Eye mark recorder
1 Background

Bicycles that can be ridden with two infants can carry one infant in front of and one behind the rider, so that up to three people can ride at the same time. These bicycles are often used in Japan for picking up and dropping off infants and for shopping in nearby areas. Since bicycles are “light vehicles,” they are subject to traffic rules [1], and riders are encouraged to use bicycles with safety markings such as the BAA mark [2], which is affixed to bicycles that meet bicycle safety standards, and the Car Mark [3], which is affixed to bicycles that meet the requirements of the National Police Agency for bicycles that can be ridden with two infants. However, many falling and tumbling accidents occur. A survey by the Consumer Affairs Agency in Japan [4] compared and evaluated the risks of falling due to differences in bicycle specifications. In that survey, subjective evaluations of riding a bicycle with an infant showed that a high child-seat position causes poor visibility and makes riding difficult. In particular, when a rain cover is attached to the front seat as shown in Fig. 1, the rider’s eye line and the rain cover of the front seat are almost at the same height, as indicated by the line in the figure, making it difficult to check the view ahead. However, the effect on riding anxiety of the poor visibility caused by the rain cover on the front seat has not been clarified.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 188–198, 2023. https://doi.org/10.1007/978-3-031-35129-7_13
Effects of Poor Visibility on Riding Anxiety
Fig. 1. Bicycle with rain cover installed
2 Related Research
Aichi et al. [5] and Ohkawa et al. [6] used an eye mark recorder to study the rider's viewpoint when riding a typical one-person bicycle. They reported, respectively, that approximately 60% of riders looked within 10 m ahead when facing down and that the gaze distance depended on the riding speed. However, these studies did not deal with bicycles carrying infant passengers. The Consumer Affairs Agency, on the other hand, surveyed the riding stability of bicycles carrying infants [4] and obtained subjective evaluations of forward visibility, but it did not evaluate them with objective data, nor did it take into account the rider's feelings of riding anxiety when visibility is poor due to a rain cover. Thus, no research has been conducted on visibility when riding bicycles with infants, and the relationship between poor visibility and riding anxiety is not known.
3 Purpose of Research
The purpose of this study is to clarify the effect of poor visibility on riding anxiety when a rain cover is attached to the front seat of a bicycle for two infants. We conducted a riding experiment to verify whether the sense of riding anxiety differs depending on the distance at which the rider can check the road ahead. Based on the experimental results, we propose an index of the forward viewable distance within which the effect on riding anxiety remains small.
S. Toyoshima et al.
4 Experiment
To clarify the effect of poor visibility on riding anxiety when a rain cover is attached to the front seat of a bicycle for two infants, we conducted a riding experiment to investigate how far ahead the rider must be able to see for the effect on riding anxiety to be minimized.
4.1 Subjects
Since many women are shorter than men, we considered that female riders are more likely to have their vision blocked. The subjects were three women, whose heights were 152 cm, 159 cm, and 161 cm, and two men, whose heights were 170 cm and 177 cm. Before the experiment, the subjects practiced riding a bicycle for two infants so that there would be no difference due to familiarity with riding. Each subject was asked to wear protective gear while riding.
4.2 A Bicycle
Styrene boards designed to resemble a rain cover were installed on the front seat of a bicycle for two infants. One styrene board was fixed to the bicycle, and a second board was clipped to it so that the overall height of the board could be changed. The board is slightly wider than the front seat, as a rain cover would be. For this experiment we used a bicycle for two infants that is designed to be used with a front seat, because its low front seat allows the styrene board to be placed in a low position. Figures 2 and 3 show a side view of the bicycle with a styrene board installed and the second styrene board secured with a clip. Like a rain cover, the styrene board can obstruct the rider's forward view, and its position can be adjusted.
Fig. 2. A side view of the bicycle with the styrene board installed
Fig. 3. The second styrene board secured with a clip
4.3 A Forward Viewable Distance
We define the forward viewable distance as the distance from the rider to the nearest point at which the rider can see the road ahead, as shown in Fig. 4. In the case of Fig. 4, the forward viewable distance is 8 m. The height of the styrene board was adjusted so that each subject rode six times, at forward viewable distances of 3 m, 4 m, 5 m, 6 m, 7 m, and 8 m.
Fig. 4. A forward viewable distance
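The relation between the board height and the forward viewable distance can be illustrated with a small geometric sketch. It is not part of the reported experiment: it assumes flat ground, a single eye point, and a straight sight line that grazes the top edge of the board, and the eye heights and board offset used below are hypothetical values chosen only for illustration. Under these assumptions, similar triangles give D = h_e · d_b / (h_e − h_b), where h_e is the eye height, h_b the height of the top edge of the board, and d_b the horizontal distance from the eye to the board.

```python
import math


def viewable_distance(eye_height_m, board_offset_m, board_top_height_m):
    """Nearest visible ground point ahead of the eye (flat-ground assumption)."""
    if board_top_height_m >= eye_height_m:
        # The sight line over the board never reaches the ground.
        return math.inf
    return eye_height_m * board_offset_m / (eye_height_m - board_top_height_m)


def board_top_height(eye_height_m, board_offset_m, target_distance_m):
    """Board height that yields the requested forward viewable distance."""
    if target_distance_m <= board_offset_m:
        raise ValueError("target distance must lie beyond the board")
    return eye_height_m * (1.0 - board_offset_m / target_distance_m)


# Hypothetical geometry: eye heights and board offset are NOT from the paper.
BOARD_OFFSET_M = 0.6
for eye_height in (1.35, 1.50):
    for distance in (3, 4, 5, 6, 7, 8):
        height = board_top_height(eye_height, BOARD_OFFSET_M, distance)
        print(f"eye {eye_height:.2f} m, D = {distance} m -> board top {height:.3f} m")
```

In this simplified model, a larger forward viewable distance corresponds to a higher board, and a lower eye height calls for a lower board at the same distance.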
4.4 Infants
The purpose of this experiment was to clarify the effect on riding anxiety when the rider's forward vision is obstructed. If the experiment were conducted under conditions that simulate carrying an infant, those conditions could themselves affect the rider's feelings of riding anxiety. Therefore, the experiment was conducted without carrying a dummy doll or other objects as heavy as infant passengers.
4.5 Road to Ride
The course was a circuit track combining straight sections and curves, with a 5 cm step placed on it. As shown in Fig. 5, the circuit track is a 2 m wide travel path around a 10 m square, with cones placed at the corners of the square and at the curves. The 5 cm step was made by combining boards, as shown in Fig. 6, and represents the step from the roadway up to the sidewalk.
Fig. 5. A circuit track
Fig. 6. A 5 cm step
4.6 Questionnaire
Because each subject rode six times, at forward viewable distances of 3 to 8 m, the subjects answered a questionnaire after each ride. Each question was rated on a five-point scale: “Very applicable,” “Applicable,” “Neither applicable nor not applicable,” “Not applicable,” and “Not applicable at all.” The questions were as follows.
1. Was it scary when you were going straight ahead?
2. Was it difficult to see in front of you when you were going straight?
3. Was it difficult to ride straight ahead?
4. Were you afraid of steps?
5. Was it difficult to see the road?
6. Was it difficult to ride on steps?
7. Were you afraid of curves?
8. Was it difficult to see the curves?
9. Was it difficult to ride on curves?
4.7 Eye Mark Recorder
The subject's viewpoint was measured with an eye mark recorder (EMR-9). This gaze measurement device detects the subject's eye movements using a sensor attached to the head and displays the subject's gaze point as an eye mark on the image captured by a field-of-view camera [7].
5 Experimental Results
5.1 Subjective Evaluation
After each ride, the participants responded to the questionnaire on the five-point scale. For the analysis, each response was assigned a score from 1 to 5, with the maximum score of 5 for “Very applicable” and the minimum score of 1 for “Not applicable at all.” To eliminate the influence of differences in scoring between subjects, the scores were standardized to a mean of 0 and a variance of 1 for each subject and each question, and then analyzed as a whole.
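A minimal sketch of this scoring and standardization step, followed by the F statistic of a one-way analysis of variance across distance levels, is given below. The responses are made-up illustrative values, not the study's data. Using the population standard deviation, and mapping a subject whose answers are all identical to zeros, are assumptions, since the paper does not state these details.

```python
import math
from statistics import mean, pstdev

# Score assigned to each answer on the five-point scale.
LIKERT_SCORES = {
    "Very applicable": 5,
    "Applicable": 4,
    "Neither applicable nor not applicable": 3,
    "Not applicable": 2,
    "Not applicable at all": 1,
}


def standardize(scores):
    """Rescale one subject's scores for one question to mean 0, variance 1."""
    center = mean(scores)
    spread = pstdev(scores)  # assumption: population standard deviation
    if spread == 0:
        return [0.0 for _ in scores]  # assumption: constant answers map to 0
    return [(s - center) / spread for s in scores]


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA; each group holds one level's values."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return math.inf
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical raw scores for ONE question: one list per subject,
# ordered by forward viewable distance 3, 4, 5, 6, 7, 8 m.
DISTANCES_M = [3, 4, 5, 6, 7, 8]
raw = {"A": [1, 2, 2, 3, 4, 5], "B": [2, 2, 3, 3, 5, 5], "C": [1, 1, 2, 2, 4, 4]}

z = {subject: standardize(scores) for subject, scores in raw.items()}
by_distance = [[z[s][i] for s in raw] for i in range(len(DISTANCES_M))]
for d, group in zip(DISTANCES_M, by_distance):
    print(f"{d} m: mean standardized score {mean(group):+.2f}")
print("F statistic:", round(one_way_anova_f(by_distance), 2))
```

A multiple comparison test would then be applied to pairs of distance levels; it is omitted here to keep the sketch short.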
For each question, the mean value was calculated for each forward viewable distance, and a one-way analysis of variance and multiple comparison tests were conducted to determine the differences between the levels. The results of the analysis for each question item are presented below.
Question 1: Significant difference between 5 m and 7 m in fear while riding straight.
Question 2: Significant difference between 5 m and 7 m in difficulty of seeing when riding straight.
Question 3: Significant difference between 5 m and 7 m in riding difficulty when riding straight.
Question 4: Significant difference in analysis of variance for fear of steps, but no significant difference between levels.
Question 5: Significant difference in difficulty seeing steps between 5 m and 7 m.
Question 6: No significant difference between levels in analysis of variance for difficulty riding on steps.
Question 7: Significant difference in fear of curves between 3 m and 7 m.
Question 8: Significant difference in difficulty seeing curves between 5 m and 7 m.
Question 9: Significant difference in analysis of variance for difficulty riding curves, but no significant difference between levels.
In summary, significant differences between levels were found in six of the nine items. Figure 7 shows the results for fear when riding straight, with a significant difference between 5 m and 7 m; most of the other items also differed significantly between 5 m and 7 m. Figure 8 shows the results for the difficulty of riding on steps. No significant differences were found for this item, either in the analysis of variance or in the multiple comparison test.
5.2 Styrene Board
During the experiment, the position of the styrene board was adjusted for each subject and each ride. At a forward viewable distance of 3 m (the condition each subject rode first), the position of the styrene board clearly differed between males and females, and at the other forward viewable distances the board was also lower for females than for males.
5.3 Eye Mark Recorder
Figures 9, 10, 11, 12, 13 and 14 show the rider's view at the different forward viewable distances on the section of the travel path leading from a curve to the step. The three cones in each figure mark the curve, and the plate on the left is the 5 cm step. The “EYE MARK” label in the figure indicates the rider's gaze point.
Fig. 7. Fear of straight lines
Fig. 8. The difficulty of riding on steps
The eye marks in the figures show that, at forward viewable distances of 7 m and 8 m, the subject looks at the boundary line between the styrene board and the ground ahead of the curve and the step. The eye mark also indicates that the subject is looking at the curve immediately ahead rather than at the step beyond it. Looking at the entire screen instead of the eye mark, the position of the styrene board on the screen changes somewhat across forward viewable distances of 3 to 5 m, but the rest of the forward view remains similar, and the rider's view while riding does not change much. The proportion of the screen occupied by the styrene board increases suddenly from a forward viewable distance of 6 m, and at 7 m and 8 m the front side of the step is hidden by the styrene board (Fig. 14).
Fig. 9. 3 m (labels in the figure: STEP, CURVE, EYE MARK)
Fig. 10. 4 m
Fig. 11. 5 m
Fig. 12. 6 m
6 Discussion
6.1 Discussion on Subjective Evaluation
Among the questions for which significant differences were found, the differences in visibility lay between 5 m and 7 m for straight sections, steps, and curves alike. We attribute this to riders often looking 5 to 7 m ahead to check the situation in front of them while riding, so that they suddenly felt anxious about riding when they could no longer see this area. For the
Fig. 13. 7 m
Fig. 14. 8 m
question about straight riding, significant differences were found between 5 m and 7 m for fear, difficulty seeing ahead, and difficulty riding. A likely cause is that riders want to know the situation ahead when riding straight, so the increase in riding anxiety when the road ahead is hard to see is readily observed. For the fear of curves, there was a significant difference between 3 m and 7 m. The riders felt a similar level of fear of curves from 3 m (the first condition, with the styrene board at its lowest) to 6 m. They appear to have been looking at the area within 7 m ahead; at a forward viewable distance of 7 m this area suddenly became difficult to see, and their fear of curves increased. From these results, we speculate that riders often look less than 7 m ahead and that their riding anxiety increases when it becomes difficult to see within this range. We also examine the questions for which no significant differences between levels were found. First, Questions 4 and 6 relate to steps, and as the results for Question 6 (Fig. 8) show, no significant differences were found. This indicates that steps are a scary and difficult riding surface regardless of the height of the styrene board, i.e., whether or not a rain cover is installed. Next, Question 9 asked whether curves were difficult to ride. Although the fear of curves in Question 7 differed significantly between 3 m and 7 m, the difficulty of riding on curves did not differ between levels, which indicates that it does not depend on the visibility of the road ahead. Curves, like steps, are therefore difficult to ride on regardless of whether a rain cover is installed.
6.2 Discussion on Riders’ Height
The position of the styrene board needed to maintain the same front view was lower for women than for men. This means that even if the height of the front seat and rain cover is not an obstacle for tall riders, it is an obstacle for riders of short stature. This is because
the saddle position is lower for small riders and they also have a lower sitting height. We believe that riders of short stature need to be more careful when riding because they have a narrower range of vision in front of them.
6.3 Discussion on Eye Mark Recorder
At forward viewable distances of 7 m and 8 m, the rider looked at the boundary line between the styrene board and the ground ahead of the curve and the step, so the height of the styrene board is considered to affect the rider's forward visibility. Furthermore, since the rider was looking at the curve immediately in front of the step, rather than at the step he had been looking at before, we believe that the rider's range of vision ahead became narrower and that he became more alert to the curve. The proportion of the screen occupied by the styrene board increased suddenly from a forward viewable distance of 6 m, and at 7 m and 8 m the front side of the step was hidden by the styrene board, suggesting that the height of the styrene board began to strongly influence the rider's forward visibility from a forward viewable distance of 7 m. Based on the eye mark recorder results, the forward viewable distance begins to have a significant effect on the rider's forward vision from 7 m, and the rider stays aware of danger by looking ahead while riding.
6.4 Comparison of Subjective Evaluation and Eye Mark Recorder
According to the subjective evaluation, the subjects found going over the step scary and difficult regardless of the forward viewable distance. While there was no significant difference between levels in the fear of steps, the fear of curves differed significantly between 3 m and 7 m. The eye mark recorder results showed that, just before going over the step, the subjects looked at the front part of the step, where the front wheel first contacts it. In addition, they tended to look at that part of the step more frequently when the step came just after a curve. These results suggest that, regardless of the forward viewable distance, the rider pays close attention to the road situation just ahead, such as curves and steps. In particular, since riders fear steps more than curves, we consider that being unable to see the step at the moment the front wheel rides up onto it increases the rider's fear more than other obstacles do.
7 Conclusion
The above results indicate that the sense of riding anxiety differs depending on the minimum distance at which the rider can check the road ahead, and that poor forward visibility affects the rider's sense of riding anxiety. In addition, since riders look ahead within a range of 5 to 7 m to confirm safety, they feel particularly anxious when the minimum distance at which they can check the road ahead is 7 m or more. Therefore, a rider who installs a rain cover or a high front seat is recommended to install it so that the forward viewable distance is 5 m or less.
8 Future Issues
In this study, we conducted an experiment under the hypothesis that the effect on riding anxiety may increase as the rider's forward visibility decreases, but we did not evaluate which part of the forward range that the rider can or cannot see has a particular effect on his/her feeling of anxiety. For example, even if the rider cannot see the far side of the road, he/she may feel less anxious if he/she can see the front wheel. In terms of the experimental conditions, on the course prepared for this study it was easy to see the next step while riding around a curve because of the open field of vision and the short straight sections. On ordinary roads, however, it is more common not to be able to see the end of a turn. Therefore, adding a device that makes it difficult to see beyond a turn, or increasing the length of the straight sections, may have a greater impact on the subjects' sense of riding anxiety. In addition, when riding in an environment different from that of this study, such as in the rain or at night, various other anxiety factors may be added: the rider's range of vision may differ from that in clear weather because the road surface is slippery in the rain, and the rider may have more difficulty seeing far ahead at night than in the daytime. In such cases, it is not known to what extent the rider's visibility affects the sense of riding anxiety. Furthermore, when an infant actually rides with the rider, the rider may be distracted by the infant's sudden movements, which may narrow the field of vision ahead, or the center of gravity may shift and make the rider more unsteady. It is necessary to study these issues in the future.
References
1. Metropolitan Police Department Homepage. https://www.keishicho.metro.tokyo.lg.jp/kotsu/jikoboshi/bicycle/menu/rule.html. Accessed 09 Feb 2023
2. BAA Homepage. https://www.baa-bicycle.com/. Accessed 09 Feb 2023
3. Consumer Product Safety Association Homepage. https://www.sg-mark.org/. Accessed 09 Feb 2023
4. Consumer Safety Research Commission: Report on the Investigation into the Cause of Accidents, etc., Pursuant to Article 23, Paragraph 1 of the Consumer Affairs Agency Safety Act - Accidents involving electrically power assisted bicycles while an infant is riding with the bicycle (2020)
5. Toshiyuki Aichi, Hideo Yamanaka, Hiromichi Kitama, Yusuke Kanda: Gaze analysis of bicycling and evaluation of sign types. Civ. Eng. Soc. Proc. 68(5), 909–916 (2012)
6. Takanori Ohkawa, Nagahiro Yoshida, Yasuo Hino, Takashi Uchida: Experimental study on user's gaze and behavior characteristics in bicycle traffic space. Civ. Eng. Soc. Proc. 69(5), 571–578 (2013)
7. eyemark Homepage. https://www.eyemark.jp/product/emr_9/. Accessed 06 Feb 2023
Wayfinding and Navigation in the Outdoors
Quantitative and Data Driven Development of Personas and Requirements for Wayfinding in Nature
Frode Volden and Ole E. Wattne
Norwegian University of Science and Technology, Gjøvik, Norway
[email protected]
Abstract. Persona development in human-centered design processes is mostly done qualitatively, through procedures like interviews, focus groups and workshops. These methods are criticized for being prone to biases, for not being based on rigorous empirical data, and for using small and possibly non-representative populations. Quantitative approaches are an alternative or supplement to qualitative methods. Through a survey (n = 693) we have investigated how people navigate and find their way in nature. The questionnaire contains questions on demographics, activities, and wayfinding behaviors when out in nature. The study's aim was twofold: First, we wanted to investigate the use of a quantitative approach for exploring user behaviors and attitudes when a sufficiently large data set is available. Second, we wanted to provide a basis for persona development for wayfinding systems used in nature. The methods applied in this study are a combination of Principal Component Analyses (PCA) and Cluster Analyses (CA). Based on these methods, three factors were identified, which in turn led to three clusters of respondents. The study concludes that, with access to quantitative data of the kind obtained in this study, the combination of PCA and CA is an efficient and precise way to describe requirements and develop personas. Results also indicate significant effects of demographic variables like age and gender on technology preferences, as well as on confidence in one's abilities when navigating and finding the way in nature.
Keywords: Quantitative persona development · Cluster analyses · Wayfinding
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 199–210, 2023. https://doi.org/10.1007/978-3-031-35129-7_14
1 Introduction
Personas are fictional characters that represent a particular segment of a user population, and they are widely used in design projects. They can be used as a tool for understanding the population and for developing products and services that meet its needs. Research indicates that the most common approach to persona development is to use a variety of qualitative methodologies like interviews and focus groups. Quantitative methods are reported to be less used, although many organizations and companies today gather and maintain large amounts of data on their users. Quantitative methods in persona development involve collecting and analyzing data from a large number of people to identify patterns and trends. This approach helps to ensure that personas are based on evidence and are representative of the target audience, rather than on assumptions or personal opinions that result in biased or stereotypical personas. Using quantitative methods can also increase the reliability and validity of personas, making them more useful for design and decision-making.
Our case for exploring quantitative approaches to describing user segments and requirements is how people behave when navigating in and using the outdoors. Finding your way and navigating in nature is different from finding your way in an urban environment. Turn-by-turn navigation, as found in many GPS systems, may occasionally be helpful in finding the way, but it may also cause dangerous situations and interfere with the experience of being in nature. The user's dependence on digital technology for navigation in the outdoors can be a security hazard if the technology is not working, for example because of empty batteries or a lack of signal or map coverage, or if the technology becomes unavailable in severe weather conditions. From an experiential point of view, GPS technology and the often-used mode of turn-by-turn directions might interfere with the user's reading and awareness of the surroundings: the focus and attention are on the digital system instead of on the surrounding nature. People's approaches to solving the problem of being lost in the outdoors are also of interest when designing systems for navigation and wayfinding in the outdoors, and knowledge about problem solving in these situations and contexts can be of value when designing for other wayfinding and navigation settings and scenarios and for safety and rescue procedures.
Through a survey (n = 693) we have investigated how people navigate and find their way in nature. The questionnaire contains questions on demographics, activities, and wayfinding behaviors when out in nature. We also asked questions on preferences for wayfinding and navigation strategies and for technology use. Descriptions of how the respondents recovered and re-established their bearings and location after getting lost were also collected. The study delivers survey data of a quantity and quality that allows for statistical treatment. The study's aim was twofold: First, we wanted to investigate the use of a quantitative approach for exploring user behaviors and attitudes when a sufficiently large data set is available. Second, we wanted to provide a basis for persona development for wayfinding systems used in nature. By following recommendations for method usage in data-driven persona development, we want to explore the usefulness of quantitative methods, as well as the insights these methods can provide for wayfinding systems in nature.
2 Background
2.1 Qualitative and Quantitative Methods
Both qualitative and quantitative methods can be used for persona development. Brickey et al. [1] found that 81% of efforts to develop personas evaluated in current academic literature applied qualitative methods. Manual persona development (MPD) has been criticized for developing personas that are not based on rigorous empirical data [2] because MPD often uses small samples, one-time data collection, and non-algorithmic methods. Qualitative methods, such as interviews and focus groups, rely more on subjective interpretations of small sets of data and may not be as representative of the larger population. Quantitative methods, such as surveys and data analysis, can provide more reliable and objective data on a target audience, as they involve analyzing large sets of numerical data. Additionally, quantitative methods can be more efficient for creating personas for large target audiences, as they can help identify patterns and trends in the data that can be used to create a generalizable persona. On the other hand, qualitative methods can provide more in-depth and detailed information about specific individuals or groups, and a mixed-method approach is often recommended [3]. Salminen et al. (2021) recommend a combination of quantitative approaches for data-driven persona development in their review of 77 data-driven persona research articles from the period 2005–2020.
2.2 Wayfinding
Wayfinding and navigation are activities and behaviors people employ when finding their way from one point to another. Since Kevin Lynch's introduction of the term “way-finding” in the book “The image of the city” [4], wayfinding has been used to describe activities where the users depend on signs and clues read from the surrounding man-made environment, whereas navigation has been used for activities where the users employ knowledge, tools and behaviors to plan and execute journeys in or across nature.
In many instances it is neither possible nor fruitful to identify a strict division between wayfinding and navigation, and people finding their way will most often use a mix of wayfinding and navigation behaviors, knowledge and tools in the different phases of their journey and according to their skills and preferences [5–7]. From this point of view, one can argue that navigation is a subset of wayfinding, as the broader view of finding one's way will include navigational activities. Also, with the introduction of consumer digital navigation systems based on Global Navigation Satellite Systems (GNSS) [8], one can argue that the user experiences of wayfinding in a man-made environment and of navigation in nature are becoming more alike, as both have become more reliant on the same digital platforms and tools (in most instances a smartphone running apps) that provide the same modalities and user experiences across these contexts. This is also confirmed in the data of our survey. At the same time, wayfinding and navigation in nature also differ from the same activities in urban or other man-made settings. Turn-by-turn wayfinding procedures might be optimal while driving in an urban landscape, but they are not necessarily ideal while rambling in nature, and they may also be an obstacle to an optimal experience and enjoyment of the situation and the surroundings. The information available to the user in nature is also different from that in a man-made environment. In an urban environment the landscape will be dominated by buildings, streets, intersections, plazas, cars, public transport systems, rail lines, signs and such, whereas in nature it will be dominated by hills, mountains, fields, forests, water, paths, plains, valleys, rivers, oceans etc. (depending, of course, on situation, location and landscape). The two different settings offer
different types of information for the user to draw on in wayfinding activities. At the same time, true, “unspoiled” nature with absolutely no presence of man-made objects or influence is rare, if not nonexistent, in our age: in most locations there will be visible man-made phenomena and landmarks like roads, power lines, paths, cairns, signs, markings, buildings and artificial lights, and there will be “invisible” information technologies like GNSS and telephone signals available. The ubiquity of man-made elements gives the experience of wayfinding in nature similarities to wayfinding in urban areas, but the purposes of, and scenarios for, wayfinding often differ between the two, and the consequences of getting lost in nature can be more severe than the inconvenience of getting lost in an urban environment. Getting lost is commonly experienced as uncomfortable in both urban and natural settings, and in the outdoors it can potentially be critical. Most designs for wayfinding are an attempt at reducing the risk of ending up in this situation.
2.3 Personas
Personas are fictional characters created to represent the behaviors, motivations, and goals of user groups in a design project. They are used as a tool to help designers understand and empathize with the needs and expectations of their users. Personas are based on research and data collected from real users and are used to guide design decisions and ensure that the final product meets the needs of the target audience. The term “persona” was first introduced by Alan Cooper in the late 1980s as part of his book, “About Face: The Essentials of Interaction Design” [9]. Since then, the use of personas has become widespread in the fields of human-centered design, user experience (UX) design, and user-centered product development.
Personas are now considered an important part of the design process and are used by designers and product teams to ensure that their products meet the needs of their users. Personas are often used in user-centered design methodologies, where the focus is on understanding and addressing the needs of the user throughout the design process. They help designers empathize with their users and create designs that are tailored to the needs of specific segments of the target audience. A key element in persona descriptions is that personas should be assigned names and additional descriptors beyond the tasks they perform in the context of the concrete development project. In our case we refrain from this and stop at describing user segments, i.e., clusters of traits and behaviors. Developing these further into proper personas with names and other details lies beyond the scope of this paper, but we should bear in mind the role of a persona as a synthesis of the knowledge we have about the specific user segment it is set to typify.
2.4 User Segmentation and Cluster Analyses
Cluster analysis is a statistical technique used in design and other fields to identify patterns and group similar objects, elements, or data points into clusters based on their similarity or distance. The goal of cluster analysis is to separate a large dataset into smaller, homogeneous subgroups or clusters, where the items within each cluster are more similar to each other than they are to items in other clusters. This process can
be useful in design to identify common themes, patterns, or trends in data, and to help inform design decisions based on insights from the data. K-means clustering is a popular algorithm for clustering or grouping data points into similar clusters based on their features. It was first proposed by Stuart Lloyd in 1957 as a technique for vector quantization in signal processing. In the late 1960s and early 1970s, K-means was independently re-discovered and applied to cluster analysis in computer science and statistics. Since then, it has become a widely used algorithm for data clustering and has been extended and modified for various applications. Even though K-means was first proposed over 60 years ago, it is still one of the most widely used algorithms for clustering. Ease of implementation, simplicity, efficiency, and empirical success are the main reasons for its popularity [10]. K-means clustering can be used for user segmentation by treating user data as points in a multidimensional space and grouping similar users into the same cluster based on their characteristics, such as demographics, behavior, or preferences. The resulting clusters represent segments of the user population with similar traits.
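As a minimal sketch of the procedure described above, the following Python code implements a plain Lloyd-style K-means on a handful of made-up 5-point Likert answers. It is not the SPSS routine used in this study; the function names, the `respondents` data, and the random initialization with a fixed seed are illustrative assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100, seed=0):
    """Lloyd-style K-means. Returns (centers, labels)."""
    rng = random.Random(seed)
    # Assumption: start from k randomly chosen data points.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Hypothetical answers to three Likert items (1 = disagree, 5 = agree).
respondents = [(5, 5, 1), (5, 4, 1), (4, 5, 2), (1, 1, 5), (1, 2, 5), (2, 1, 4)]
centers, labels = kmeans(respondents, k=2)
```

Each entry of `labels` is the segment assigned to the respondent at the same position, and each row of `centers` is the mean answer profile of one segment, which is the kind of information a table of final cluster centers reports.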
3 Methods and Research Design

3.1 Survey

Through an on-line questionnaire we asked about people’s (self-assessed) skills, strategies, preferences and use of technology when navigating and finding their way in the outdoors (in nature). The purpose of the questionnaire was to investigate people’s use and perceptions of technologies and phenomena in relation to finding the way in the outdoors. The questionnaire was available in both an English and a Norwegian version. A large majority used the Norwegian version (622 out of the total of 693 respondents).

3.2 Questions Used for Clustering

Fourteen questions were 5-point Likert-type questions, and these were used to categorize the respondents into different clusters. The 14 statements concerned behaviors and attitudes towards using nature and towards technology for support and navigation. The final cluster assignments of the individual cases, based on the 5-level Likert scales, were then analyzed against a number of questions with dichotomous or categorical answers. These were demographic questions such as gender and age group, and questions on recovery strategies when losing one’s direction.

3.3 Software, Participants and Recruitment

The tool Nettskjema (https://nettskjema.no/), developed and run by the University of Oslo, was used for questionnaire development and data collection. The questionnaire was distributed among 71 Master’s students in a university course in scientific methodology. Each student was given the task of recruiting 15 respondents from their network in addition to answering the survey themselves. Students were instructed to make the invitations personal, and not just
general invitations posted on social media. This gave a total of 1136 potential respondents (71 × 15 = 1065 invited respondents, plus the 71 students themselves). The survey finally received 693 responses (304 from females), which gives a response rate of 61%, something we consider rather good. Based on the method of recruitment, we consider it likely that the participants in the study are biased towards an above-average educational level. The study also showed a clear age bias, as is often the case for studies involving students: 50% of the respondents reported being under the age of 30, while 20% were 50 years or older.

3.4 Data Analyses

The statistical software SPSS v28 was used for data analyses. Following the suggestions for data analyses given by Salminen et al. [11] in their review article, several procedures were used. First, a Principal Component Analysis (PCA) was performed to indicate the number of hidden structures in our data set. Second, a K-means cluster analysis was used to describe clusters, and cluster membership as well as distance from cluster centres were written back to the dataset. In addition to the methods suggested for segmentation, we ran ANOVA and t-tests of cluster membership against demographic and behaviour variables not included in the cluster analyses, for further insight into the characteristics of subjects in the different clusters.

3.5 Ethics

The answers were anonymous, as the students who recruited never had access to answers from the participants they recruited, and no direct or indirect identifying data was recorded. Participants were informed about the purpose of the study, voluntary participation, and anonymity before being directed to the on-line survey.
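The recruitment figures reported in Sect. 3.3 can be re-derived with a few lines of Python. This is only a worked check of the arithmetic; the function and variable names are illustrative assumptions, and all numbers are taken from the text.

```python
def potential_respondents(recruiters, invitations_each):
    """Invited respondents plus the recruiting students themselves."""
    return recruiters * invitations_each + recruiters


def response_rate(responses, potential):
    """Share of potential respondents who actually answered."""
    return responses / potential


potential = potential_respondents(71, 15)  # 1065 invited + 71 students
rate = response_rate(693, potential)       # about 0.61
female_share = 304 / 693                   # about 0.44
```

Note that the rate is computed against the upper bound of potential respondents, which assumes that every student really invited 15 people.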
4 Results

4.1 Principal Component Analyses

For the PCA, two standard methods were applied to determine the number of factors present in the dataset. The first was an eigenvalue criterion (eigenvalues above 1), the second an evaluation of a scree plot. The results indicated that either 3 or 4 structures were present: 4 factors had eigenvalues above 1, but the scree plot suggested an “elbow” at 3 factors. Based on this we moved on with both a 3-cluster and a 4-cluster solution, to investigate the difference between the two.

4.2 The 3-Cluster Solution

The 3-cluster solution gave us 3 rather different segments of users of the outdoors, with clear differences between the clusters. Table 1 shows the mean scores of the different
Table 1. Final Cluster Centers
questions for each of the three clusters. High scores are shown as green, and low scores as red (Table 2 and Table 3). Based on the within-cluster score of the questions, we have tried to name and describe central characteristics of the three clusters: Cluster 1: The Recreational Walker Does not get lost but is also the one that is least enthusiastic towards using maps as well
Table 2. Recovery by asking someone else

Cluster Number   No (N)   No (%)   Yes (N)   Yes (%)
1                40       16.5%    7         23.3%
2                122      50.2%    7         23.3%
3                81       33.3%    16        53.3%
Total            243      100.0%   30        100.0%
as the phone. Enjoys using the outdoors, but neither place names nor cardinal directions are important. As a group, these are the ones least dependent on signs and markings.

Cluster 2: The Traditionalist
Describes themselves as having a good sense of direction and place. The traditionalist enjoys paper maps, and seldom gets lost. Males are overrepresented in this cluster, and its members are also somewhat older. This user segment scores high when it comes to using all sorts of maps and wayfinding technologies, and they navigate and take the lead rather than follow others. When getting lost, the traditionalist is the only one who applies cardinal directions, map and compass for recovery.

Cluster 3: The Unsecure
Scores low on both sense of place and direction, and their use of the outdoors is affected by this. They use digital maps but find it hard to use and communicate cardinal directions. When getting lost, these are the ones most likely to ask others for the way. Members of cluster 3 are more likely to be female (Fig. 1 and Fig. 2).

4.3 4-Cluster Solution

A 4-cluster solution was also developed, but we found it a bit more difficult to describe the four user segments with this solution. That said, the 4-cluster solution had some interesting qualities too, as it produced a “minority segment” (51 subjects) that had rather unique scores on a few of the variables. This minority segment might be a very important one, as it differs from the other clusters on only a few variables, but significantly so: its members score considerably lower on all questions regarding the use of place names than the other clusters. This might call for the development of a fourth persona with this trait.

4.4 Most Important Variables for Segmentation

In the cluster analyses we found it useful to produce an ANOVA table to explore which of the variables used in the clustering process showed the clearest group differences based on the final cluster assignment.
All variables showed a clear significant effect of the final clusters, but this is of course to be expected from the clustering algorithm. The F tests should be used only for descriptive purposes because the clusters have been chosen to
Table 3. ANOVA table of variables included in the 3-cluster solution

F         Variable
122.834   I like to use paper maps
35.575    I like to use digital maps
159.337   I have a good sense of place
145.563   I have a good sense of direction
34.294    I like to use the mobile phone to find my way in the outdoors
167.834   I struggle with reading and interpreting maps
167.450   My insecurity about finding my way limits my use of the outdoors
120.366   I often actively use place names when I am in the outdoors, i.e. to talk about a route or name a landmark
41.985    Signs and markings are important for me to find my way in nature
217.824   I often use the cardinal directions (north, south, east, west) to get my bearings in the outdoors
149.845   Place names are important to me to memorize destinations and routes
maximize the differences among cases in different clusters. The observed significance levels are not corrected for this and thus cannot be interpreted as tests of the hypothesis that the cluster means are equal. The F-values differed quite a bit, though, and variables with relatively higher F-values should indicate higher importance. In our case we got a large effect on quite a few variables, but, perhaps more importantly, for some of the variables
Fig. 1. Cluster membership and gender
Fig. 2. Cluster 2, The traditionalist takes the lead
the effect was rather small. This goes for variables like use of mobile phones and digital maps, but maybe a bit more of a surprise also for the variable “Signs and markings are
important for me to find my way in nature”. We interpret this to mean that, for the three clusters produced in this project, these variables were of little importance for telling the clusters apart. Mobile phones, digital maps, signs and markings seem to be of roughly equal importance to all.
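To make the descriptive use of F-values in Sect. 4.4 concrete, the sketch below computes a one-way ANOVA F statistic for one variable from the scores of each cluster's members. It is a hand-written illustration, not the SPSS output behind Table 3; the function name and the example scores are assumptions.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the cluster means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own cluster mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical Likert scores on one item, grouped by cluster.
separating_item = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
flat_item = [[2, 3, 4], [3, 3, 3], [2, 4, 3]]

f_separating = anova_f(separating_item)
f_flat = anova_f(flat_item)
```

An item whose cluster means lie far apart relative to the spread inside each cluster gets a large F, while an item answered similarly in all clusters gets a small one. As noted in the text, such values only rank the variables descriptively, because the clusters were formed to maximize exactly these differences.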
5 Discussion and Conclusion

The methodology applied for user segmentation and description in this study proved to be rather straightforward to perform. Through the combination of methods applied, we were able to identify user segments that could be taken further into persona descriptions. We were also able to describe which elements were most and least important for differentiating the user clusters. This, we believe, would be much more difficult with qualitative methods. Whether the methodology applied here would lead to different personas and persona descriptions compared with purely qualitative approaches is not possible to say from this study. We are nevertheless fairly certain that methods like the ones applied here will at least work as a supplement and correction in persona development. We therefore conclude that, at least in cases where quantitative data describing a population’s demographics, behaviors and/or attitudes are available, these data should be used.

Data filtering methods to maximize group (persona) uniqueness should be tested further, as there seems to be a balance between rich and representative group (segment/persona) descriptions and their usefulness for communicating group differences. For example, applying and changing cut-off levels for cluster membership would be possible, as each cluster member’s distance from its cluster center is available. By applying a low cut-off level, e.g. less than 1 SD away from the cluster center, quite stereotypical representatives of each cluster would be kept, while those representing a more atypical pattern would be lost. The effect of such procedures is not investigated in this study, but would be interesting to investigate further.
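The cut-off idea could be sketched as follows. The text does not define how "1 SD" is measured, so this example makes an explicit assumption: the spread of a cluster is taken as the root-mean-square distance of its members from the cluster center, and a member is kept if its distance is below `cutoff_sd` times that value. The function name, the tuple layout and the example distances are hypothetical.

```python
import math


def filter_typical(cases, cutoff_sd=1.0):
    """Keep cases lying close to their cluster center.

    cases: list of (case_id, cluster, distance_to_center) tuples.
    Returns a list of (case_id, cluster) for the retained cases.
    """
    by_cluster = {}
    for case_id, cluster, distance in cases:
        by_cluster.setdefault(cluster, []).append((case_id, distance))

    kept = []
    for cluster, members in by_cluster.items():
        # Assumed spread measure: RMS distance to the cluster center.
        spread = math.sqrt(sum(d * d for _, d in members) / len(members))
        kept.extend(
            (case_id, cluster)
            for case_id, d in members
            if d < cutoff_sd * spread
        )
    return kept


# Hypothetical distances written back from a cluster analysis.
cases = [("a", 1, 1.0), ("b", 1, 1.0), ("c", 1, 4.0), ("d", 2, 0.5), ("e", 2, 2.0)]
typical = filter_typical(cases)
```

Lowering `cutoff_sd` keeps fewer and more stereotypical members per cluster, which is the trade-off between representativeness and distinctiveness discussed above.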
6 Further Work

The current study does not compare personas created by purely qualitative methods with those created from quantitative methods. Further research should be devoted to such a comparison.
References

1. Brickey, J., Walczak, S., Burgess, T.: Comparing semi-automated clustering methods for persona development. IEEE Trans. Softw. Eng. 38(3), 537–546 (2012)
2. Chapman, C.N., Milham, R.P.: The persona’s new clothes: methodological and practical arguments against a popular method. In: Proceedings of the Human Factors and Ergonomics Society 50th Annual Meeting, pp. 634–636 (2006)
3. Salminen, J., et al.: A survey of 15 years of data-driven persona development. Int. J. Hum. Comput. Interact. 37(18), 1685–1708 (2021)
4. Lynch, K.: The Image of the City. M.I.T. Press, Cambridge, Mass. (1960)
5. Möllerup, P.: Wayshowing: A Guide to Environmental Signage Principles & Practices. Lars Müller Publishers, Baden (2005)
6. Rinne, K., Memmert, D., Bock, O.: Proficiency of allocentric and egocentric wayfinding: a comparison of schoolchildren with young adults and older adults. J. Navig. 1–12 (2022)
7. Barker, A.: Navigating life: a taxonomy of wayfinding behaviours. J. Navig. 72(3), 539–554 (2019)
8. Bonnor, N.: A brief history of global navigation satellite systems. J. Navig. 65(1), 1–14 (2012)
9. Cooper, A., Reimann, R., Cronin, D.: About Face 3: The Essentials of Interaction Design. Wiley, Indianapolis (2007)
10. Jain, A.K.: Data clustering: 50 years beyond K-means. Pattern Recogn. Lett. 31(8), 651–666 (2010)
11. Salminen, J., et al.: A literature review of quantitative persona creation. In: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI ’20), pp. 1–14 (2020)
Knowledge in eLearning and eEducation
Analysis of Classroom Test Results for an Error-Based Problem Presentation System for Mechanics

Nonoka Aikawa1(B), Shintaro Maeda1, Tomohiro Mogi1, Kento Koike1, Takahito Tomoto2, Isao Imai3, Tomoya Horiguchi4, and Tsukasa Hirashima5

1 Graduate School of Engineering, Tokyo Polytechnic University, Atsugi, Kanagawa, Japan
[email protected]
2 Faculty of Engineering, Tokyo Polytechnic University, Atsugi, Kanagawa, Japan
3 Chiba Municipal Satsukigaoka Junior High School, Chiba, Chiba, Japan
4 Graduate School of Maritime Sciences, Kobe University, Kobe, Hyogo, Japan
5 Graduate School of Advanced Science and Engineering, Hiroshima University, Higashi-hiroshima, Hiroshima, Japan
https://www.takahito.com/members/aikawa/

Abstract. It is possible in learning for an impasse to occur during a learner’s trial-and-error process. This is a situation in which a learner cannot make progress in solving a problem because they cannot properly correct an error. In this study, we developed a learning support system for mechanics that helps learners overcome impasses by presenting auxiliary problems based on their errors. This system was implemented in a class of 86 third-year middle school students. To evaluate the problem graph implemented in the system, we analyzed the results of a test that dealt with drafting problems based on the strength of each entry in the results of the class practice. We found that this system was effective for learning gravity, normal force, the net force exerted by external forces, and frictional force from the ground. However, there were problems in learning frictional forces among objects themselves and forces propagating to other objects.

Keywords: Auxiliary problem · Error-based Simulation · Practice use

1 Introduction
It is essential in learning that learners engage in trial and error to deepen their understanding of a topic. Error-based simulation (EBS) is a learning support framework that encourages learners to learn by trial and error [4,6]. EBS is a framework that encourages learners to become aware of errors by presenting a simulation of incorrect behavior based on the learner’s incorrect answers to exercises. EBS systems have been developed for various subjects. For example, the EBS system developed in the field of physics has been shown to enhance learning during classroom practice [5,7].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 213–224, 2023. https://doi.org/10.1007/978-3-031-35129-7_15
N. Aikawa et al.
However, during the trial-and-error process in EBS, an impasse, a state in which no progress is made in solving a problem, may occur when learners are unable to properly correct their errors. Such an impasse may lead to a decrease in the learner’s motivation, which may then cause the learner to give up learning [3]. It is common practice to give a learner who is stuck a clue pertaining to their error or to give them the correct answer outright [9]. Appropriate clues are needed to resolve the impasse, but if correct answers are provided as clues, then the learner’s trial-and-error process in the EBS may be hindered. In this study, we assessed the effectiveness of providing learners with auxiliary problems to help them overcome impasses during the trial-and-error process. One method for achieving advanced problem control is the dynamic whole-task selection approach, in which “the most complex problem is solved first, and then a task is dynamically given to the learner to learn partial solutions according to the problem-solving status” [8]. In this study, in addition to the dynamic whole-task selection approach, we developed and employed a system that automatically diagnoses the learner’s error points and presents auxiliary problems accordingly. We developed a learning support system for mechanics in which learners who are stuck on a problem are given auxiliary problems based on their errors, enabling them to overcome their impasse through continuous trial and error [1]. To verify the effectiveness of this system, we conducted a class practice for 86 third-year students at a middle school [2]. The system was designed so that the auxiliary problems correspond to the arrows representing forces in the original problem.
Thus, when a learner makes a mistake in a problem, the system retrieves the force in the learner’s answer that had not been drawn and presents an auxiliary problem based on it. However, reports from previous studies have provided only an overview of this system’s learning effects during classroom practice, and have not fully considered what specific skills the learners have come to understand by using the system. Analyzing the test results for the force arrows and identifying which forces are effective allows us to clarify which auxiliary problems are effective and which ones need improvement. In this paper, we focus on the test results for drawing problems from the classroom practice and analyze each force entered.
2 System Used

In this section, we provide an overview of the learning support system developed in this study (Fig. 1) [1,2]. This system is a combination of a conventional EBS [5] and an auxiliary problem presentation function.

2.1 Error-Based Simulation (EBS)
In this system, the learner first performs the exercises in a conventional EBS. The system uses a mechanical drawing problem in which the learner uses arrows to
Analysis of Test Results for an Error-Based Problem Presentation System
Fig. 1. System screen.
Fig. 2. Example of an EBS simulation.
draw the force acting on an object from some phenomenon. The system responds to the learner’s input and presents a simulation based on the solution. If the learner’s answer is correct, then the simulation will also be correct. Therefore, the learner can notice errors in their responses and correct them through trial and error. For example, in a situation where an object is at rest on a floor, as shown in Fig. 2, gravity and the normal force act on the object. When the learner mistakenly inputs only gravity (a1), EBS presents a simulation (a2) in which the object slips through the floor and falls according to gravity. The learner can observe the unnatural simulation and realize that their answer is incorrect from the lack of upward force.
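The example of the object resting on the floor can be illustrated with a very small simulation. This is only a sketch of the idea behind EBS, not the simulator used in the system: it assumes one-dimensional vertical motion, a unit mass, simple stepwise integration, and force values chosen for illustration.

```python
def simulate_height(drawn_forces, mass=1.0, dt=0.1, steps=10):
    """Move an object according to the forces the learner drew.

    drawn_forces: vertical force components in newtons (upward positive).
    Returns the height after `steps` time steps, starting at rest at 0.
    """
    acceleration = sum(drawn_forces) / mass
    height = 0.0
    velocity = 0.0
    for _ in range(steps):
        velocity += acceleration * dt
        height += velocity * dt
    return height


GRAVITY = -9.8  # hypothetical weight of the object (downward)
NORMAL = 9.8    # normal force from the floor (upward)

correct_answer = simulate_height([GRAVITY, NORMAL])  # object stays at rest
gravity_only = simulate_height([GRAVITY])            # object sinks through the floor
```

With both forces drawn the net force is zero and the object stays where it is. With gravity alone the height becomes negative, which corresponds to the unnatural behavior in (a2) that lets the learner notice the missing upward force.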
Fig. 3. Problem graph.
2.2 Auxiliary Problem Presentation Function
When a learner incorrectly answers a problem a certain number of times, the system offers the learner the option to either “attempt to solve the current problem again” or “attempt to solve an auxiliary problem.” If the learner chooses to solve the auxiliary problem, then the system analyzes the learner’s previous answer history and extracts the most frequently incorrect force arrows. The most frequently incorrect forces are understood to be those that the learner understands the least. To help them understand the extracted forces, it is effective to solve two problems that differ in the presence or absence of the force. Therefore, the system selects the extracted forces for the two auxiliary problems by referring to the problem graph in Fig. 3. Additionally, the system presents the auxiliary problems at both ends of the link to the learner, starting with the easier problem. For example, if the learner works on the original problem in Fig. 3 and makes the most errors on force (3), then the system refers to the problem graph in Fig. 3 and selects the link connecting support problems 7 and 6. The system then follows the link and presents support problem 7 first, followed by support problem 6. The learner can learn about force (3) by solving the problems that do not contain it, then solving the ones that do. If the learner repeatedly makes mistakes while solving the support problem, then a simpler support problem is presented after analyzing their answer history. When the learner correctly answers an auxiliary problem, the system presents a more complex problem from the problem graph. For example, if the learner correctly answers support problem 6, then the system presents support problem 5, following the arrows in Fig. 3. Since the problem graph is designed to return to the original problem, the learner can solve the problems presented by the system and eventually achieve the correct answer to the original problem.
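The selection step described above can be sketched as follows. This is a hedged illustration rather than the system's implementation: it assumes that an "incorrect" force is one that is missing from an attempt, it encodes only the single link mentioned in the text (force (3), support problems 7 and 6), and all function and variable names are hypothetical.

```python
from collections import Counter

# Force number -> (easier problem without the force, problem with the force).
# Only the link given as an example in the text is included here.
LINKS = {3: ("support problem 7", "support problem 6")}


def most_frequent_missing(history, correct_forces):
    """Return the force most often missing across the learner's attempts.

    history: list of sets, each holding the forces drawn in one attempt.
    """
    missing = Counter()
    for drawn in history:
        missing.update(correct_forces - drawn)
    return missing.most_common(1)[0][0] if missing else None


def select_auxiliary(history, correct_forces, links=LINKS):
    """Auxiliary problems for the weakest force, easier problem first."""
    force = most_frequent_missing(history, correct_forces)
    return list(links.get(force, ()))


correct = {1, 2, 3, 4}
attempts = [{1, 2}, {1, 2, 4}, {1, 2, 4}]  # force (3) is never drawn
weakest = most_frequent_missing(attempts, correct)
to_present = select_auxiliary(attempts, correct)
```

Here force (3) is missing in all three attempts, so the learner would first see the problem without that force and then the one that contains it. A fuller version would also need the rest of the problem graph and the rule for stepping to simpler or more complex problems.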
Fig. 4. Learning tasks.
Fig. 5. Transfer tasks.

3 Practical Use

3.1 Procedure
The following is an overview of the classroom practice conducted in this study. To investigate the learning effects of this system for middle school students, we conducted a three-period (135 min) class practice for three middle school third-grade classes: two with 29 students and one with 28 students, totaling 86 students. The class practice consisted of a greeting and class explanation (41 min), a pre-test (10 min), the use of the system (64 min), a post-test (10 min), and a questionnaire (10 min). Additionally, a delayed test was administered eight weeks after the class to assess whether the understanding that learners gained from the system had taken root. Learning effectiveness was evaluated based on the results of the pre-test, post-test, and delayed test. The content of each test was the same, consisting of four learning tasks (Fig. 4) and three transfer tasks (Fig. 5), for a total of seven problems. The learning tasks (Fig. 4) were four of the problems handled by the system: the original problem in Fig. 3, support problem 2, support problem 5, and support problem 7. The transfer tasks (Fig. 5) were three more advanced problems that were not included in the system. All of these were drawing problems, and the correct answers to them are indicated by arrows in the diagrams in Figs. 4 and 5.

3.2 Results
The results of the pre-, post-, and delayed tests are presented in this section. The total number of participants was 74 students, taking into account those who were absent from class. First, the aggregate results show that most students improved their scores by using the system, with a mean (SD) of 0.66 (0.95) on the pre-test and 3.42 (1.07)
Fig. 6. Results from testing for each problem.
on the post-test. Additionally, no student’s score decreased from the pre-test to the post-test. A t-test showed a significant difference between the pre-test and the post-test (p < 0.001). The number and percentage of correct answers for each problem in the pre-, post-, and delayed tests are shown in Fig. 6. Learning task 4 was implemented in the system as the target problem in Fig. 3. Figure 6 shows that 28 students answered this problem correctly in the post-test, even though no one answered correctly in the pre-test. This suggests that initially unsolvable problems may become solvable through the use of this system. Next, for learning task 1 (the easiest problem, corresponding to support problem 7 in Fig. 3), 30 students answered correctly in the pre-test and all 74 students answered correctly in the post-test. Furthermore, 52 students answered correctly on the delayed test administered eight weeks later, suggesting that many students’ understanding of learning task 1 was firmly established. Similarly, only two students answered correctly in the pre-test for learning task 3 (corresponding to support problem 5 in Fig. 3), whereas 62 students answered correctly in the post-test. Twenty-six correct answers were obtained in the delayed test, which is an increase over the pre-test. This suggests that the system is capable of assisting students in solving difficult problems. On the other hand, there was no significant improvement in the performance of the transfer tasks. Learning task 4 (corresponding to the target problem in Fig. 3) also had only 28 correct answers in the post-test, which dropped to 2 in the delayed test. A simple examination of the system’s usage history for learning task 4 revealed that some students were unable to answer the question correctly on the post-test, even though they had answered it correctly while using the system.
However, in the problem-by-problem scoring, the correct answers were those in which all the forces in the problem were drawn correctly. For example, when forces (1) and (2) exist in a problem, an answer that fails to plot either of them is considered to be equally incorrect. Therefore, it is not possible to determine whether either of these forces specifically was made clear by the system simply by analyzing the post- and delayed tests. By clarifying whether there were cases in which students were able to draw the forces in incorrectly answered problems, or, conversely, which forces were not, it is possible to determine the auxiliary problems that were effective and the ones that need to be improved. Therefore, in the next section, we present the results of a force-by-force analysis of the test results.
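The difference between the two ways of scoring can be shown with a short sketch. The function names and the two example answers are assumptions made for illustration; the force numbers follow the example of forces (1) and (2) given above.

```python
def problem_correct(drawn, expected):
    """Problem-by-problem scoring: correct only if every force matches."""
    return set(drawn) == set(expected)


def per_force_correct(drawn, expected):
    """Force-by-force scoring: one verdict per expected force."""
    return {force: force in drawn for force in sorted(expected)}


expected = {1, 2}
answer_a = {1}  # force (2) is missing
answer_b = {2}  # force (1) is missing

whole_a = problem_correct(answer_a, expected)
whole_b = problem_correct(answer_b, expected)
detail_a = per_force_correct(answer_a, expected)
detail_b = per_force_correct(answer_b, expected)
```

Both answers count as equally wrong at the problem level, while the force-level result shows which force each learner had already drawn correctly. This is the information needed to judge individual auxiliary problems.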
4 Force-by-Force Analysis of the Test Results

4.1 Problem-by-Problem Analysis of Force
First, each force in every problem of the test is numbered in Figs. 4 and 5, and the results of scoring them are shown in Figs. 7 and 8. We can see from Fig. 7 that the percentage of correct answers is high for all problems in (a) learning tasks 1 to 3. The correct response rate for the post-test was particularly high for all problems, which is consistent with the results for learning tasks 1–3 in Fig. 6. Next, for learning task 4 in (b), forces (1)–(6) and (9) had high percentages of correct responses in the post-test, but forces (7) and (8) did not. That for force (8) was particularly low at just 12%, suggesting that this was a major factor in the low percentage of correct responses to learning task 4 in the delayed test. The correct response rates for forces (2) and (5) improved significantly in the post-test despite the low correct response rates in the pre-test, and they were maintained in the delayed test. On the other hand, the correct response rate for force (3) increased in the post-test and decreased in the delayed test. Next, Fig. 8 shows that for transfer task 1 in (c), forces (1), (3), and (5) had high percentages of correct responses in the post- and delayed tests, especially force (5), which also had a high percentage of correct responses in the pre-test. On the other hand, the correct response rate for force (6) was only in the 30% range for both the post-test and the delayed test. Finally, as shown in (d) and (e) in Fig. 8, there were almost no fully correct answers for transfer tasks 2 and 3, but correct responses for forces (1), (4), and (6) in transfer task 2 and forces (1), (4), (7), and (9) in transfer task 3 were above 80% in both the post-test and the delayed test. This suggests that when a problem was answered incorrectly, the learner understood part of it, though not all of it. As shown above, there was a difference in the percentage of correct answers for each force, even within a single problem. Therefore, in the next section, we assign the forces to categories such as “gravity,” “normal force,” and “friction,” and analyze the percentage of correct responses.

4.2 Categorical Analysis of Forces
In this section, we categorize the forces and summarize the percentages of correct responses for them in Fig. 9. The categories are (a) “gravity,” (b) “normal force,” (c-1) the “force exerted by an object above,” (c-2) the “force exerted by an external force,” (d-1) the “friction force with the ground resisting a pushing force,” (d-2) the “friction force with an object above resisting a pushing force,” (d-3) the “friction force propagating to the upper object,” (d-4) the “frictional force propagating to the lower object,” (d-5) the “frictional force with the ground resisting the force propagating to the lower object,” and (e) the “action and reaction between objects and between objects and walls.”
Fig. 7. Percentage of correct answers per force in each problem (learning tasks).
The percentage of correct answers for gravity and the normal force was high for all problems in both the post- and delayed tests. The percentage for gravity in Fig. 9(a) was more than 80%, and that for the normal force was above 80% and 70% in the post- and delayed tests, respectively. For (c-1), all problems in the pre-test had less than 20% correct responses, whereas all problems in the post-test had more than 70%, which indicates that the system helped learners to understand (c-1). The percentage of correct answers in the delayed test was lower than that in the post-test, but it was still more than twice as high as that in the pre-test. The percentage of correct answers for (c-2) was more than 80% in both the post-test and the delayed test, which was as high as that for gravity and normal force. The correct response rate for (d-1) was generally good, but the rates for the other frictional forces were generally low, with that for (d-3) being higher than the others.
5 Consideration
The correct response rates were high for (a) gravity, (b) normal force, (c-2) the force exerted by an external force, and (d-1) frictional force with the ground resisting the pushing force. These are the forces that are included in the series of links of support problems 2, 3, and 7 in Fig. 3. (a) and (b) are intended to be taught in support problem 7, and (c-2) is intended to be taught in the link between support problems 7 and 3, whose purpose is to teach (d-1). Ultimately, we consider this linkage of these problems to be effective. On the other hand, the response rates for the frictional forces other than (d-1) were generally low. Of these forces, (d-2) is the one that is intended to
Fig. 8. Percentage of correct answers per force in each problem (transfer tasks).
be learned in the link between support problems 2 and 1. Therefore, it can be assumed that force (8) in the target problem was not learned well because of the large difference between the supporting problems at both ends of this link. The correct response rate for (d-2) was low, but in comparison, that for (d-3) was high. (d-3) is the force that we aimed to have the students learn in the link between support problems 7 and 4, so we consider this link to be effective. Because the upper object did not move properly in the simulation unless (d-3) was included, we believe that this force was particularly easy to learn. Furthermore, (d-4) and (d-5) appeared only in the developmental problems. Responses for these were particularly poor, suggesting that they were difficult to learn from this problem graph. Additionally, the correct response rate for (c-1) increased significantly in the post-test, while it decreased in the delayed test. This is the force that was intended to be learned in the link between support problems 7 and 6 in Fig. 3.
Fig. 9. Percentage of correct answers classified by category of force.
However, it is not only this link, but also the link up to support problem 5, that seems to have a significant effect on this. During class practice, learners increased their correct response rate in the post-test by learning through this series of links. There were problems in the delayed test that corresponded to support problems 7 and 5 in the assignment, but not to support problem 6. Support problem 6 is an auxiliary problem in which the force being exerted by the object above is replaced by an external force. We can conclude that solving the series of auxiliary problems from support problem 7 to support problem 5 was effective.
6 Conclusion
In this paper, we developed a system for presenting auxiliary problems based on errors made by learners stuck in EBS and implemented it in a middle school class. The results of the test were analyzed for each category of force. They showed that the problem graph implemented in this system was effective in helping learners to understand gravity, the normal force, the force exerted by an external force, and the friction force with the ground resisting a pushing force, but there were problems in reinforcing an understanding of the friction force with an upper object and the forces propagating to other objects. In future work, we will create and assess a more effective problem graph for the frictional forces that were not effectively reinforced by the current system. Additionally, we will further analyze the results of the class practice from two perspectives. First, we will use the logs of the system used in the class practice to analyze the learning activities of the learners and discuss them in connection with the test results. Second, we will perform a force-by-force analysis of the test results based on a causal series of forces. Since the problem graph implemented in this system is based on the causal relationship of forces [1], we will determine whether the learner has understood the causal aspect of these forces by assessing the forces in their answers relative to the causal sequence.

Acknowledgements. This work was supported by JSPS KAKENHI Grant Numbers JP22K12322, JP21H03565, JP20H01730.
References

1. Aikawa, N., Koike, K., Tomoto, T., Tomoya, H., Tsukasa, H.: A presentation system for auxiliary problems to resolving learners’ stuck in error-based simulation for mechanics. IEICE Trans. Inf. Syst. (Japanese edition) 106(2), 132–143 (2023). (in Japanese)
2. Aikawa, N., et al.: Practical use of an error-based problem presentation system in mechanics. In: Proceedings of the International Conference on Computers in Education ICCE, vol. 2022, pp. 118–123 (2022)
3. Beck, J.E., Gong, Y.: Wheel-spinning: Students who fail to master a skill. In: Proceedings of the 16th International Conference on Artificial Intelligence in Education, pp. 431–440 (2013)
4. Hirashima, T., Horiguchi, T., Kashihara, A., Toyoda, J.: Error-based simulation for error-visualization and its management. Int. J. Artif. Intell. Educ. 9(1–2), 17–31 (1998)
5. Hirashima, T., Imai, I., Horiguchi, T., Tomoto, T.: Error-based simulation to promote awareness of error in elementary mechanics and its evaluation. In: Proceedings of International Conference on Artificial Intelligence in Education, pp. 409–416 (2009)
6. Hirashima, T., Shinohara, T., Yamada, A., Hayashi, Y., Horiguchi, T.: Effects of error-based simulation as a counterexample for correcting MIF misconception. In: André, E., Baker, R., Hu, X., Rodrigo, M.M.T., du Boulay, B. (eds.) AIED 2017. LNCS (LNAI), vol. 10331, pp. 90–101. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61425-0_8
7. Horiguchi, T., Imai, I., Toumoto, T., Hirashima, T.: Error-based simulation for error-awareness in learning mechanics: an evaluation. Educ. Technol. Soc. 17(3), 1–13 (2014)
8. Salden, R.J.C.M., Paas, F., van Merriënboer, J.J.G.: A comparison of approaches to learning task selection in the training of complex cognitive skills. Comput. Hum. Behav. 22(3), 321–333 (2006)
9. Shute, V.J.: Focus on formative feedback. Rev. Educ. Res. 78(1), 153–189 (2008)
Using Interactive Flat Panel Display for STEM Education Based on SAMR Model

Yu-Hung Chien1(B), Yu-Jui Chang1, Hsunli Huang2, Hsiang-Chang Lin2, and Jyun-Ting Chien3

1 National Taiwan Normal University, Taipei 106, Taiwan
[email protected]
2 BenQ Corporation, Taipei 114, Taiwan
3 City University of Hong Kong, Kowloon, Hong Kong, China
Abstract. This study introduces the design of a STEM (Science, Technology, Engineering and Mathematics) course that integrates educational technology through an Interactive Flat Panel Display (IFPD) to enhance student-teacher interaction, using an 8-h cloud-light hands-on activity as an example. The course was based on the SAMR model (Substitution, Augmentation, Modification, and Redefinition) to develop interactive teaching activities that differ from those in traditional teaching. We invited fifteen K-12 in-service technology teachers to participate in this course for their professional development and to confirm the effective learning interaction between teachers and students. This IFPD STEM course will be practically applied in teaching, addressing the shortcoming of limited student-teacher interaction in traditional teaching and promoting the effectiveness of the IFPD in STEM education.

Keywords: Interactive flat panel display · SAMR Model · STEM education
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 225–234, 2023. https://doi.org/10.1007/978-3-031-35129-7_16

1 Introduction

An Interactive Flat Panel Display (IFPD) is a large-format touchscreen display that can replace traditional blackboards and projectors with a higher-quality display, enhanced connectivity, and built-in software solutions. The IFPD can handle interactivity from multiple users and is typically compatible with a more comprehensive range of apps and connectivity software. Moreover, the IFPD is usually associated with a cloud-based platform that allows logging in from anywhere to access and share files. The highly interactive performance between IFPDs and tablets can make students more involved in learning and keep teachers well informed about students’ learning situations.

The term ‘STEM education’ refers to teaching and learning in the fields of Science, Technology, Engineering, and Mathematics. STEM education allows students to develop increasingly essential skills, such as complex problem-solving, communication, and collaboration. Policymakers and educational researchers worldwide increasingly focus on ensuring students’ persistence and success in STEM [1] and students’ preparation for the labor market in which STEM takes a prominent place [2].
Engaging students in STEM is an urgent need in society, and there is an increasing worldwide trend toward technology-mediated learning that gives students exposure to technology-enhanced interactive learning [3]. It is vital to use various ways of understanding how technologies influence students’ learning and engagement within STEM subjects in order to engage students in thinking through complex concepts, develop a deep knowledge of STEM, and promote student engagement with STEM [4]. Good technology integration is not about using the fanciest tool; it is about being aware of the range of options and picking the right strategies for the lesson. The SAMR model is a framework that categorizes four different degrees of classroom technology integration [5]. The SAMR model can be particularly powerful during remote and blended learning, when integrated classroom technology makes teaching and learning a more seamless experience for educators and students [6]. Consequently, we developed an IFPD STEM course based on the SAMR model in this study. Fifteen K-12 in-service technology teachers were invited to participate in this course for their professional development and to confirm the effective learning interaction between teachers and students. This IFPD STEM course will be practically applied in teaching, addressing the shortcoming of limited student-teacher interaction in traditional education and promoting the effectiveness of the IFPD in interactive learning.
2 Theoretical Background

2.1 STEM Education

The domains of science, technology, engineering, and mathematics, known collectively as STEM, have been deemed essential to preparing students for the future workforce [7]. The term STEM originated at the National Science Foundation of the United States at the beginning of the 1990s and refers to an integrated educational experience [8]. In traditional educational approaches, subjects are presented separately, often resulting in incoherent learning of knowledge and skills. When students face problems in real life, they cannot solve them effectively if they consider problem-solving methods separately within each discipline. In contrast, the interdisciplinary learning method breaks the barriers between disciplines, and students can establish meaningful knowledge connections, enabling them to solve multidisciplinary problems in real contexts [9].

2.2 The SAMR Model

The letters SAMR stand for Substitution, Augmentation, Modification, and Redefinition, a 4-level taxonomy used to select, utilize, and evaluate technology in the K-12 classroom [10]. Substitution and Augmentation are considered ‘Enhancement’ steps, while Modification and Redefinition are ‘Transformation’ steps. Figure 1 explains the details of the four steps of the SAMR model.

• Substitution: A step of enhancing your lessons with technology. It is easy to see how an educator can move a classroom activity to substitution, such as changing a classroom lecture from a traditional class to a video conference.
Fig. 1. The SAMR model [10].
• Augmentation: A step of enhancing your lessons with technology. Taking a digital worksheet is a sort of functional improvement.
• Modification: A step of transforming lessons with technology. In this step, teachers move toward mastery of technology rather than merely using technology. Meanwhile, teachers may also ask students to design, create, and innovate with technology in this step.
• Redefinition: A step of transforming your lessons with technology. By redefining a lesson, teachers can fundamentally change a lesson by using technology to create new tasks that were previously inconceivable. With redefinition, teachers can create new learning possibilities and opportunities for students [10].
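The four-level taxonomy and its two groups can be written down as a small data structure. The following is only an illustrative sketch: the names `SamrLevel` and `samr_group` are hypothetical and do not come from the SAMR literature or from the course materials, and the two tagged activities are the examples given in the list above.

```python
from enum import Enum


class SamrLevel(Enum):
    """The four SAMR levels, in the order the model lists them."""
    SUBSTITUTION = 1
    AUGMENTATION = 2
    MODIFICATION = 3
    REDEFINITION = 4


def samr_group(level: SamrLevel) -> str:
    """Return the group that the text assigns to a level."""
    if level in (SamrLevel.SUBSTITUTION, SamrLevel.AUGMENTATION):
        return "Enhancement"
    return "Transformation"


# Example activities taken from the list above, tagged with their level.
activities = {
    "lecture moved to a video conference": SamrLevel.SUBSTITUTION,
    "digital worksheet": SamrLevel.AUGMENTATION,
}

for name, level in activities.items():
    print(f"{name}: {level.name.title()} ({samr_group(level)})")
```

Tagging each planned classroom activity in this way is one possible means of checking that a course design covers all four levels.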
3 Course

3.1 Content of the IFPD STEM Course

Today, the IFPD serves as an instructional aid to facilitate interactive learning. The IFPD has been gradually installed in every classroom and has become the essential digital teaching equipment to replace the blackboard. Accordingly, an IFPD STEM course based on the SAMR model, named ‘Cloud Light’ (Fig. 2), was developed in this study. In this course, we used the SAMR model to reflect on several aspects, including:

• What benefits does the IFPD bring?
• What does the IFPD offer that the old technology does not?
• How should the original classroom tasks be modified in response to the IFPD?
• What are the new classroom tasks and activities?
Fig. 2. The IFPD-based STEM course ‘Cloud Light.’
The ‘Cloud Light’ course included an introduction to Ohm’s law and the photoelectric effect for the learning field of natural science; the knowledge of electricity and control for the learning fields of science and technology; the use of multimeters and welding skills for the learning field of technology; and the calculation of the electrical resistance value for the learning fields of technology and mathematics. Table 1 shows the SAMR-model-based STEM course design.

Table 1. SAMR-model-based STEM course design.

Step          Content
Substitution  1. Replace blackboard and projector functions with the IFPD
              2. During the teaching process of the multimeter, the geometric figure drawing function was used to ask questions
              3. The timer built into the IFPD was used to control the activity time
Augmentation  1. Use the video recording function of the IFPD to record the course and play it directly on the IFPD
              2. Access the Internet through a browser to supplement the course content with information from the Internet
              3. Use Slido [11] and Kahoot [12] for gamification learning
Modification  1. Import the recorded video from the cloud-based platform to allow students to conduct individualized learning through the tablet
              2. Import pictures of parts to the IFPD for demonstration and explanation
              3. Use the built-in functions to create the scratch card and the tiles
Redefinition  1. Integrate the IFPD, tablet, and cloud platforms, such as asking students questions about courses promptly through Slido
              2. Use the wireless projection software InstaShare 2 [13] to allow teachers and students to conduct two-way communication such as screen sharing, cloud whiteboard, and collaboration
              3. Teachers in different learning fields can teach, supplement, or integrate resources through the cloud-based platform by joining the function of the collaborative platform
              4. Students’ reports can be presented live on the IFPD
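The course content mentions Ohm’s law and the calculation of an electrical resistance value. As a minimal sketch of that kind of calculation, the code below applies R = V / I to size a series resistor. The LED scenario, the function names, and the numeric values (5 V supply, 2 V forward voltage, 20 mA) are assumptions chosen for illustration; the paper does not specify the circuit used in the ‘Cloud Light’ activity.

```python
def resistance_ohms(voltage_v: float, current_a: float) -> float:
    """Ohm's law solved for resistance: R = V / I."""
    if current_a <= 0:
        raise ValueError("current must be positive")
    return voltage_v / current_a


def series_resistor_ohms(supply_v: float, forward_v: float, current_a: float) -> float:
    """Resistor that drops the remaining voltage (supply minus forward voltage)
    at the target current. Hypothetical helper for illustration."""
    return resistance_ohms(supply_v - forward_v, current_a)


# Assumed example values, not taken from the course materials.
r = series_resistor_ohms(supply_v=5.0, forward_v=2.0, current_a=0.020)
print(f"Series resistor: {r:.0f} ohm")
```

In a classroom setting, the computed value would then be rounded to the nearest available standard resistor.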
3.2 Implementation of the Course

The ‘Cloud Light’ course was 8 h long. After completing the course planning, development, knowledge content, and design of interactive activities, we invited fifteen K-12 in-service technology teachers to participate in this course as students for their professional development and to confirm the effective learning interaction between educators and learners. The schedule of the course is described below.

• Hour 1: The course teacher first gave the materials of the hands-on activity to the students and then checked the materials with the students through the IFPD (Fig. 3). The teacher started to teach the use of a multimeter and randomly asked the students questions, using the built-in drawing function to pose the questions on the IFPD (Fig. 4). Students tested their materials after they learned the use of the multimeter. The teacher used the timer function to control the activity time. Students who had problems with their materials or had questions could use tablets to inform the teacher. The teacher then taught the resistor color code. The teacher calculated the resistance value by using the built-in calculator function and asked the students for the resistance value after changing the resistor’s color (Fig. 5). The teacher used the built-in scoreboard function to create interactive activities between the students and the teacher, increasing interactivity in the class (Fig. 6).
Fig. 3. Checking materials on the IFPD.
• Hour 2: In the second hour of the course, interdisciplinary knowledge teaching was carried out through the collaborative platform (Fig. 7). Students can collaboratively move the tiles and use the scratch card function to enhance interactive learning (Fig. 8). The teacher can use another digital device to send additional course materials to the IFPD and to a web platform such as LearnMode [14], so that students can download those materials with the QR code shown on the IFPD. Students can review the course in a self-regulated manner, according to their learning progress, with their digital devices anytime and anywhere. In addition, web information such as YouTube [15] and pictures can be shown on the IFPD for better cross-field knowledge exploration (Fig. 9).
Fig. 4. The use of built-in drawing function on the IFPD (annotated steps: choose the built-in drawing function; choose the line function; simulate the pointer).
Fig. 5. Changing the resistor’s color on the IFPD (annotation: change the color and calculate the resistance value).
Fig. 6. The use of built-in scoreboard function on the IFPD.
• Hour 3: The circuit’s layout was mainly introduced after introducing the interdisciplinary knowledge used for the ’Cloud Light’ hands-on activity in this course. The above information was synchronously presented on the IFPD and the students’ tablets. The students could also use the tablets to control the IFPD under the teacher’s supervision (Fig. 10).
Fig. 7. The collaborative platform.
Fig. 8. The tiles and the scratch card function.
Fig. 9. The integration of web information on the IFPD.
Fig. 10. Synchronously presented information on the IFPD and tablets.
• Hour 4: The students assembled and tested their circuits during this hour. The IFPD could show all of the information on one page to assist students in finishing their tasks.
• Hour 5: Welding instructions were given through animation. Then the teacher used a physical projector to demonstrate the skill of welding, and the demonstration was recorded simultaneously (Fig. 11). The IFPD can play back videos or store videos in the cloud-based platform, allowing students to customize their learning progress through the tablet.
• Hour 6: Hands-on activity of welding. The students used tablets to review the course materials, such as the welding record.
Fig. 11. The demonstration of welding.
• Hour 7: During this period, the students continued their hands-on activities. Meanwhile, the students could ask questions with Slido [11] through the tablet, and the teacher could respond immediately.
• Hour 8: After the students completed the work, they explained, displayed, and shared their works by taking pictures through the tablet and projecting the photos of their works on the IFPD with InstaShare 2 [13]. The teacher could also mark and explain the works presented on the IFPD concurrently (Fig. 12). At the same time, the students could use the tablet to see the synchronous display from the IFPD.
Fig. 12. The students’ works on the IFPD.
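The resistor color code taught in Hour 1 can be illustrated with a short calculation. The sketch below decodes the first three bands of a four-band resistor using the standard digit values of the color code; the function name `four_band_resistance` is hypothetical, and the tolerance band and the gold and silver multipliers are left out for brevity. It is not the calculator built into the IFPD.

```python
# Standard digit values of the resistor color code (black = 0 ... white = 9).
COLOR_DIGITS = {
    "black": 0, "brown": 1, "red": 2, "orange": 3, "yellow": 4,
    "green": 5, "blue": 6, "violet": 7, "grey": 8, "white": 9,
}


def four_band_resistance(band1: str, band2: str, multiplier: str) -> int:
    """Nominal resistance in ohms from the first three bands of a 4-band resistor.

    The first two bands are digits and the third is a power-of-ten multiplier.
    The tolerance band and the gold/silver fractional multipliers are omitted.
    """
    digits = 10 * COLOR_DIGITS[band1.lower()] + COLOR_DIGITS[band2.lower()]
    return digits * 10 ** COLOR_DIGITS[multiplier.lower()]


print(four_band_resistance("brown", "black", "red"))      # 10 x 10^2 ohm
print(four_band_resistance("yellow", "violet", "brown"))  # 47 x 10^1 ohm
```

Changing one band color and recomputing mirrors the question-and-answer activity described for Fig. 5.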
4 Conclusions

After the global COVID-19 epidemic, the use of technology to implement online learning has gradually become a teaching norm. Teachers should be familiar with using educational technologies as teaching aids to make courses more interactive and to remedy the shortcomings of traditional teaching. This study demonstrated how to use the IFPD to make teaching interactions more frequent and exciting. For example, students can answer teachers’ questions through the tablet, which helps students who are reluctant to raise their hands to speak and helps when teachers do not have enough time to answer every student. Furthermore, the IFPD can connect to the Internet to enrich teaching sources and to support student-teacher interactions. The IFPD can record the whole class, allowing students to review classroom materials without time and space constraints after returning home. However, we should be aware that students and teachers need to become familiar with the operation method and content, according to the pre-training principle [16], before they use the IFPD and tablets in classes, in order to prevent additional cognitive load [17].

University entrance examination subjects still dominate the current education situation in most Asian countries. How can a new educational paradigm of STEM education be promoted in this educational environment, and how can the higher-order thinking skills (HOTS) of K-12 students be cultivated? How can students become talents of the 21st century who compete with global talents and connect with the world? Using the non-university-entrance subject of technology education to promote the new educational paradigm of STEM teaching may be a feasible way for Asian countries to cultivate STEM talents.
Secondly, Taiwan’s Ministry of Education began to promote a 4-year 20 billion ‘classes have Internet and students use tablets’ policy; Taiwan’s capital Taipei City also announced a 1.3 billion ‘classes have big screens’ education policy white paper. Technology education would serve as an essential driving force for promoting the use of software and hardware in this related policy. In response to the education policy of having the IFPD in each class, the IFPD STEM course in this study can be used as an example in the actual teaching field to enable teachers to practice responding to the IFPD-based teaching trend.
References

1. Skinner, E., Saxton, E., Currie, C., Shusterman, G.: A motivational account of the undergraduate experience in science: brief measures of students’ self-system appraisals, engagement in coursework, and identity as a scientist. Int. J. Sci. Educ. 39(17), 2433–2459 (2017)
2. World Economic Forum: Realizing human potential in the fourth industrial revolution: An agenda for leaders to shape the future of education, gender and work. http://www3.weforum.org/docs/WEF_EGW_Whitepaper.pdf. Accessed 15 Dec 2022
3. Gauthier, A., Porayska-Pomsta, K., Dumontheil, I., Mayer, S., Mareschal, D.: Manipulating interface design features affects children’s stop-and-think behaviours in a counterintuitive-problem game. ACM Trans. Comput.-Hum. Interact. 29(2), 1–22 (2022)
4. Vahidy, J.: Enhancing STEM learning through technology. Technol. Curriculum Summer 2019 (2019)
5. Tsybulsky, D., Levin, I.: SAMR framework for study technology integration in science education. In: New Perspectives in Science Education Conference Proceedings of the 5th Edition (2016)
6. Aladé, F., Lauricella, A.R., Beaudoin-Ryan, L., Wartella, E.: Measuring with murray: touchscreen technology and preschoolers’ STEM learning. Comput. Hum. Behav. 62, 433–441 (2016)
7. Martín-Páez, T., Aguilera, D., Perales-Palacios, F.J., Vílchez-González, J.M.: What are we talking about when we talk about STEM education? a review of literature. Sci. Educ. 103(4), 799–822 (2019)
8. Baran, E., Bilici, S.C., Mesutoğlu, C., Ocak, C.: Moving STEM beyond schools: students’ perceptions about an out-of-school STEM education program. Int. J. Educ. Math. Sci. Technol. 4(1), 9–19 (2016)
9. Beane, J.A.: Curriculum integration and the disciplines of knowledge. Phi Delta Kappan 76(8), 616–622 (1995)
10. Place, S.: What is the SAMR model of technology integration? https://www.bookwidgets.com/blog/2022/03/what-is-the-samr-model-of-technology-integration. Accessed 15 Dec 2022
11. Slido: https://www.slido.com/?experience_id=12-z. Accessed 15 Dec 2022
12. Kahoot: https://kahoot.com/schools-u/. Accessed 15 Dec 2022
13. InstaShare 2: https://www.benq.com/en-ap/business/ifp/instashare.html. Accessed 15 Dec 2022
14. LearnMode: https://www.learnmode.net/home/. Accessed 15 Dec 2022
15. YouTube: https://www.youtube.com/. Accessed 15 Dec 2022
16. Mayer, R.E.: Cognitive Theory of Multimedia Learning. The Cambridge Handbook of Multimedia Learning, pp. 43–71. Cambridge University Press, Cambridge (2005)
17. Sweller, J., van Merrienboer, J.J., Paas, F.: Cognitive architecture and instructional design. Educ. Psychol. Rev. 10, 251–296 (1998)
Analysis of Effects of Raggedy Student CG Characters in Face-to-Face Lectures and Their On-Demand Streaming

Seishiro Hara1(B), Ryoya Fujii1, Saizo Aoyagi2, and Michiya Yamamoto3

1 School of Science and Technology, Kwansei Gakuin University, Sanda, Hyogo, Japan
[email protected]
2 Faculty of Global Media Studies, Komazawa University, Setagaya-Ku, Tokyo, Japan
[email protected]
3 School of Engineering, Kwansei Gakuin University, Sanda, Hyogo, Japan
[email protected]
Abstract. With the changing circumstances of the COVID-19 disaster, many conventional face-to-face lectures are being conducted, but on-demand streaming of lectures is also expected to continue in the future. The authors have conducted on-demand lectures in which student characters’ raggedy actions are introduced, from the viewpoint that these actions are important, but the effect of the raggedy actions has not yet been clarified. In this study, we analyzed the effects of introducing student characters’ raggedy actions by conducting face-to-face lectures and on-demand streaming based on the recorded face-to-face lectures. We analyzed students’ behavior in a face-to-face class by defining “DOLA (Degree of the Learning Attitudes),” which is a scale of learning attitudes. In addition, we analyzed on-demand classes using the audience retention. We then compared these results. The results suggest that changing the actions of the characters in the class can attract students’ interest and encourage them to continue watching.

Keywords: on-demand lectures · face-to-face lectures · audience retention · degree of the learning attitudes · student characters
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 235–251, 2023. https://doi.org/10.1007/978-3-031-35129-7_17

1 Introduction

Due to the COVID-19 pandemic, the style of classes in schools has changed in various ways. For example, there are face-to-face lessons in classrooms, on-demand lessons in which students study at home using video teaching materials, simultaneous interactive lessons on the Internet using an online conference system, and hybrid lessons combining these methods [1]. In 2022, the proportion of face-to-face classes at most universities has increased in Japan. However, Kwansei Gakuin University, to which the authors belong, made a statement: “For some subjects that are recognized to be more effective using online, will be conducted as an online class at the discretion of the department where the course is held” [2]. It acknowledged the diversity of class methods. A nationwide survey
of university students that the National Federation of University Co-operative Associations (NFUCA) conducted in July 2022 also indicated that around 20% of students, regardless of grade, wished to take face-to-face and online classes [3]. In this way, even if we return to the pre-coronavirus situation, a certain demand will remain to conduct classes using online technology. Researchers have studied online education for a long time, and some such studies have become widely used. For example, MOOCs (massive open online courses) [4] are free online learning tools millions of people all over the world use. In them, various class content is enriched. In addition, the “metaverse” has become popular in recent years. The University of Tokyo established the Metaverse School of Engineering, where students of all ages can study engineering [5]. On the other hand, the lack of students physically attending classes together is a common problem of these systems. We have clarified the importance of embodiment of CG characters as colleagues who take classes together in generating and controlling communications [6–8]. For example, we have made clear the importance of raising hands in class situations [9, 10], which can be a solution in problems arising in MOOCs, for example. From 2020 to 2021, when on-demand classes became the mainstream class format, we streamed 360-degree videos of an on-demand class with a student’s CG character in a CG classroom that simulates an actual classroom [11]. We found that this made it possible to evoke the feeling of being in class with other students, compared to the conventional streaming of only 2D videos. Here, we showed the effectiveness of CG characters’ communicative motions and actions, such as raggedy responses. However, the method for applying these effects to hybrid-type classes has not been clarified even though they are expected to continue being used. 
In particular, it is necessary to clarify the difference between the situation of an actual class many students take and the situation of an on-demand class one person takes. Therefore, in this study, we conducted a face-to-face class, streamed the class on demand, and made clear the effect of introduction of raggedy CG student characters. First, we conducted a face-to-face class and proposed an index for analyzing students’ learning attitudes based on the movements of each part of the students’ body. Next, we analyzed the change in the index and created a video with CG characters based on it. Then, we conducted an on-demand class using the content and analyzed the results and the audience retention. Finally, we compared face-to-face classes and on-demand classes to propose how to apply the findings to on-demand classes.
2 Related Works

2.1 Utilization of Characters

Under the influence of the COVID-19 pandemic, online streaming of content has increased in various fields, such as the entertainment field. For example, ZAIKO, a video streaming service, provides real-time streaming of live concerts and recorded content after the original live stream [13]. In addition, a video streaming service, VARK, has started offering a service in which users can watch a live performance in a VR space next to a cast (CG character) [14]. This service is still being offered and is expected to continue to be used as a system that takes advantage of the benefits of online streaming
even when the self-restraint stemming from the COVID-19 pandemic has begun to ease to some extent. In the field of education, research continues on how to make the best use of characters. Watanabe et al. showed that taking classes with agents that provide advice and technical terms on the learner’s computer can improve students’ motivation [15]. Mayer et al. showed that when the agent behaves more like a human, the test score increases compared to when it does not [16]. Many studies have shown CG characters’ effectiveness in the entertainment and educational fields. However, few examples demonstrate CG characters’ effectiveness in terms of on-demand education. In this study, we clarify whether the CG characters are effective in on-demand education.

2.2 Educational-Assistance Technology

With the development of information technology, various forms of learning have been developed and used. In addition to traditional paper textbooks, educational institutions are using a combination of video and other media. For example, combinations of tablet terminals with textbooks have been introduced in the GIGA School Program in elementary and junior high schools. The Open University of Japan provides broadcast lecture courses taken via TV and radio, face-to-face courses taken in a face-to-face format with a friend, and online courses in which students watch videos at their own pace [17]. In addition, at Kadokawa Dwango Gakuen N High School, many classes are held online. Using a VR headset, students can “touch” content in a virtual space and conduct experiments that are difficult to conduct in reality [18]. In both examples, face-to-face schooling is conducted in combination with other formats. On the other hand, various forms of learning have caused harmful effects. In a survey that Waseda University conducted, students said, “I can’t get any information or learn how to learn from my classmates’ learning (in Japanese)” in online classes [19].
In addition, Tanaka showed that a disadvantage of on-demand teaching is the lack of communication between students, caused by the change in teaching style that accompanied the spread of the COVID-19 pandemic [20]. The importance of taking classes with others, including friends, has also been clarified in prior studies. For example, Ryo Okada clarified that mutual learning with friends is related to a sense of fulfillment in learning [21]. In addition, Shinsaku Okada states that interaction with others is effective in acquiring knowledge for adult learners, including the elderly [22]. We also conducted a survey in which about one-third of the students answered that the most important people in a classroom were the students [23]. Therefore, the relationship between students in class is clearly important, and it is important for students to take classes with others.
2.3 Previous Research by the Authors
In our previous research, conducted in January 2021, we held a VR lesson in our laboratory using a VR HMD and an on-demand lesson in which 360-degree videos of raggedy student characters uploaded to YouTube were watched (Fig. 1) [11].
Fig. 1. On-demand movie type
The results of the mini-quizzes on the class content, which were the same for each class, were significantly higher for the students who took the class with synthesized videos than for those who took the VR class. When we compared the audience retention of composite videos and 360-degree videos on YouTube [24], it was 74.53% for composite videos and 82.83% for 360-degree videos; that is, 360-degree videos had the higher retention.
Figure 2 shows some of the results of a questionnaire on character behaviors conducted after class. In this questionnaire, the subjects watched three kinds of videos: “characters only” videos, in which the characters are merely present; “nodding characters” videos, in which the characters nod; and “raggedy characters” videos, in which the characters nod and look around their surroundings. They then rated the evaluation items on a 7-point scale. Figure 2 presents the average scores of the evaluation items, and the error bars represent the standard deviation. According to the Wilcoxon signed-rank test with the Bonferroni method, the item “I feel as if I’m taking the class with others,” which concerns relationships between students, showed no significant difference between videos with nodding characters and videos with raggedy characters. On the other hand, for the item “The class is active,” which concerns the quality of the class, nodding characters were rated the highest. Table 1 shows the p values and statistics for this test.
Summarizing these results, we can state that 360-degree videos delivered on demand were the most likely to maintain students’ interest and improve the quality of the class. Therefore, we use 360-degree videos in this study.
Fig. 2. Questionnaire results of CG character motions.
Table 1. P values and statistics of questionnaire regarding CG character motions.
Item | Characters only vs. Nodding characters | Characters only vs. Raggedy characters | Nodding characters vs. Raggedy characters
I feel as if I’m taking the class with others | p = 0.000, Z = −5.699 | p = 0.000, Z = −6.396 | p = 0.190, Z = −1.856
The class is active | p = 0.021, Z = −2.684 | p = 0.000, Z = −4.957 | p = 0.000, Z = −6.749
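As a rough illustration of the Bonferroni method named in Sect. 2.3, the sketch below adjusts the p values of a family of pairwise comparisons by multiplying each by the number of comparisons. It is not the authors' analysis code: the function name `bonferroni_adjust`, the significance level of 0.05, and the raw p values are hypothetical, and the source does not state whether the values in Table 1 are raw or adjusted.

```python
def bonferroni_adjust(p_values, alpha=0.05):
    """Return (adjusted p, significant?) for each raw p value.

    Each raw p value is multiplied by the number of comparisons m and
    capped at 1.0; a comparison counts as significant when its adjusted
    p value is below alpha.
    """
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)
        results.append((adjusted, adjusted < alpha))
    return results


# Hypothetical raw p values for three pairwise comparisons
# (not the values reported in Table 1).
raw_p = [0.004, 0.030, 0.200]
for adjusted, significant in bonferroni_adjust(raw_p):
    print(f"adjusted p = {adjusted:.3f}, significant = {significant}")
```

With three comparisons, a raw p value must fall below roughly 0.0167 to remain significant at the 0.05 level under this correction.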
3 Implementation and Analysis of Face-to-Face Classes
3.1 Lesson Implementation
First, we conducted a face-to-face class, shot videos for on-demand streaming, and analyzed how students attended the class in the classroom. For the analysis of the face-to-face class, we selected “Human data analysis (starting April 9, 2021)” at the School of Science and Technology, Kwansei Gakuin University, for which the author Yamamoto was in charge. The class consists of five chapters, including the opening part, which explains the contents of the class and the research that Yamamoto has been working on, including “What is a human interface?” and “Embodiment of media technology.” To prevent the spread of COVID-19, this class was held under stricter rules than in fiscal year 2022, so students were seated one seat apart. As a result, students basically did not communicate during the class. In addition, the target audience for this class was second-year university students. They were freshmen at the time of the university lockdown caused by the COVID-19 pandemic, so this class may have been their first face-to-face class in their college careers. Few students slept during class, probably because of the resulting tension. In the next section, we quantify and analyze these learning attitudes.
To record synthesized videos and 360-degree videos for on-demand streaming, we captured the slides and recorded the professor’s speech using the AV system installed in the classroom. In addition, we shot the movements of the professor and the screen using a video camera (HDR-PJ790V, SONY) from the back of the classroom. At the same time, to analyze the students’ behavior during class, we recorded the students using a video camera from the front of the classroom. For the analysis in this section, we chose eight students whose face orientation and hand movements were all recorded (red circles in Fig. 3).
After the class, we obtained consent from the students on the following three points: that they would be subjects of analysis in this study, that the attitudes analyzed would not affect their grades, and that we would write about them in a way that would not identify individuals. We have not consulted the Kwansei Gakuin University Regulations for Behavioral Research with Human Participants because this study does not meet the criteria for the regulations.
Fig. 3. 8 students to be analyzed.
3.2 Students’ Learning Attitude Analysis
Many studies have analyzed the behavior of students during class. For example, Tomohiro Yamamoto et al. analyzed the transition of the degree of concentration in specific situations from the viewpoint of children’s positive or negative behavior during class, with concrete examples [25]. They analyzed overall body movements. In this study, for a more detailed analysis, we analyzed the learning attitude separately for the “head,” “body,” and “hands.” Then, based on video viewing and discussion by the authors and one cooperator, we defined an index for analyzing the learning attitudes of the eight participating students (hereafter referred to as “DOLA [degree of learning attitudes]”).
The method is as follows. First, we observe the learning attitude of each participant in three parts: “head,” “body,” and “hand.” We score the observed movement of each part on a five-point scale (Table 2). We calculate the subject’s DOLA at a given time by totaling the scores of the three body parts. We then sum every subject’s DOLA to calculate the DOLA of the classroom. Figure 4 shows an example of a student with a positive DOLA, while Fig. 5 provides an example of a student with a negative DOLA.
Table 2. DOLA (degree of learning attitudes)
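The DOLA calculation described in Sect. 3.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the score values, the student identifiers, and the function names `student_dola` and `classroom_dola` are hypothetical, and the actual scoring criteria are those of Table 2.

```python
from typing import Dict

# The three observed body parts named in the text.
PARTS = ("head", "body", "hand")


def student_dola(scores: Dict[str, int]) -> int:
    """Total the scores of the three body parts for one student."""
    return sum(scores[part] for part in PARTS)


def classroom_dola(all_scores: Dict[str, Dict[str, int]]) -> int:
    """Sum every student's DOLA to obtain the DOLA of the classroom."""
    return sum(student_dola(scores) for scores in all_scores.values())


# Hypothetical observations at one time point. Positive values stand for
# active behavior (e.g. taking notes) and negative values for inactive
# behavior (e.g. resting an elbow on the desk); the real values come
# from Table 2.
observed = {
    "student_a": {"head": 1, "body": 1, "hand": 2},
    "student_b": {"head": -1, "body": 0, "hand": -2},
}
print(student_dola(observed["student_a"]))
print(classroom_dola(observed))
```

Repeating this calculation at each observed time point would give a classroom-level time series of the kind plotted for each lecture part.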
Fig. 4. An example of taking notes.
Fig. 5. An example of placing one’s elbow on the desk.
Figures 6, 7 and 8 show the DOLA of the classroom in each lecture part as a time series. Figure 6 (Parts 1 and 2) shows the DOLA in the first half of the lecture. The DOLA is mostly positive, so the proportion of time spent actively working on the lesson is high. On the other hand, Fig. 7 (Part 4) and Fig. 8 (Part 5) show the DOLA in the latter half of the lecture. The DOLA is mostly negative, so much of the time is spent with a non-positive attitude in class. In addition, the DOLA tended to increase in the final stage of every lecture part.
Fig. 6. Proactivity in parts 1 and 2.
Fig. 7. Proactivity in parts 3 and 4.
Fig. 8. Proactivity in part 5.
4 Face-to-Face Classes On-Demand
4.1 Lecture Video Creation
We created videos for on-demand streaming of the face-to-face class described in the previous section. First, we created synthesized videos from the videos shot in the face-to-face class, using the same method as in the previous study [11]. We adopted the synthesized videos in this study as a general control for a hybrid class. In addition, we made 360-degree videos, which the previous study showed to be effective [11]. In the previous study, one comment was that “the student characters have only one type of motion, which is unnatural”; thus, we increased the number of student characters’ motions in the lecture videos. Specifically, building on the class in the previous study, we developed two modes: a new mode of “raggedy characters” (Fig. 9) and a mode of “nodding characters.” We introduced the raggedy characters because they more closely resemble the actual classroom situation. Moreover, in the questionnaire on character behavior in the previous study, the raggedy videos were rated as high as the nodding videos on the item about the relationship between students.
Fig. 9. Raggedy videos.
4.2 Lesson Implementation
We conducted on-demand streaming of the face-to-face class in “Introduction to Computer Science (231 students)” at the School of Engineering, Kwansei Gakuin University, and “Body Studies (227 students)” at the Faculty of Culture and Information Science, Doshisha University, from July 8 to 14, 2021. These classes consisted of an introductory section and five lecture sections with the same content as the face-to-face class. We uploaded each video described in the previous section to YouTube, and we made a different introductory section for each class. Students took the course in different modes depending on their student ID numbers. The groups were a “group of synthesized” (watching the entire lecture section with synthesized videos), a “group of nodding” (watching the entire lecture section with nodding videos), and a “group of raggedy” (watching the lecture parts alternately with nodding videos and raggedy videos) (Table 3).
Table 3. Group of lecture videos to watch.
We prepared multiple-choice mini-quizzes on the contents of the lecture videos for each lecture, as in the previous study. There were 17 questions in total, worth 20 points. We uploaded the videos and mini-quizzes to Microsoft Forms and streamed the videos for each group (Table 3). This experiment does not meet the criteria for the Kwansei Gakuin University Regulations for Behavioral Research with Human Participants.
4.3 Analysis of the Performance of Each Course Group
First, we analyzed the results of the mini-quizzes during the on-demand lecture. We summed the scores and tested them with an analysis of variance for grades and groups. Figure 10 shows that the score for the “group of synthesized” was higher than the score for the “group of nodding.” Figure 11 shows no significant differences in the sections where students watched only synthesized videos or only nodding videos. Figure 12 shows that significant differences were found in the total results for the sections where the three different videos were watched, with the score for the “group of synthesized” higher than that for the “group of nodding.”
Fig. 10. Total score of all lectures.
Fig. 11. Total score for watching only nodding videos.
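To make the analysis of variance mentioned in Sect. 4.3 concrete, the following sketch computes the F statistic of a one-way ANOVA by hand. It is a simplified illustration with a single factor (the viewing group), which may differ from the design the authors used; the function name `one_way_anova_f` and the scores are hypothetical and are not the data from the experiment.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    F is the ratio of the between-group mean square to the
    within-group mean square.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical mini-quiz scores for three viewing groups.
synthesized = [5, 6, 7]
nodding = [1, 2, 3]
raggedy = [2, 3, 4]
f_value, df_b, df_w = one_way_anova_f([synthesized, nodding, raggedy])
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```

The resulting F value would then be compared against the F distribution with the two degrees of freedom to obtain a p value, a step omitted here to keep the sketch free of external libraries.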
Next, we summed the scores for the section in which the raggedy video was watched for the first time and, separately, for the section in which it was watched for the second time, and tested each with an analysis of variance. Figure 13 shows significant differences: the score of the “group of raggedy” was higher than the score of the “group of nodding.” Figure 14 shows no significant differences. Our previous study showed that introducing a nodding CG character in the video improved the score. However, the results when the “group of raggedy” watched the raggedy video for the first time differed from our previous study. We attribute this result to the novelty of the increased number of character actions.
Fig. 12. Total score for watching three types of videos.
Fig. 13. Score for watching the raggedy video for the first time.
Fig. 14. Score for watching the raggedy video for the second time.
4.4 Audience Retention Analysis
We used the audience retention of the lecture videos reported by YouTube Analytics as an index for analyzing students’ learning attitudes in the on-demand class. Figure 15 shows an example. “Mountain” indicates a part of the video that someone watched many times
or shared, while “valley” indicates a part where someone skipped or stopped watching entirely [24]. The graph plots the audience retention when the “group of synthesized” first watched the synthesized video. Figure 15 shows that the audience retention increased around the answers to the mini-quizzes and decreased after the answers to the mini-quizzes ended, which was the same as in our previous study [11].
Fig. 15. An example of audience retention.
Next, we applied the Kruskal-Wallis test to the audience retention of each group when they watched the raggedy video for the first and second time. Figure 16 shows that there was a significant difference between all groups when they first watched the raggedy video; the highest retention was for the “group of raggedy.” Figure 17 shows that there were also significant differences between all groups when they watched the raggedy video for the second time; the highest retention was for the “group of nodding.” In both cases, the “group of synthesized” had the lowest audience retention, and this feature was remarkable when they watched the raggedy video for the second time. For the “group of raggedy,” the audience retention decreased when they watched the raggedy video for the second time. Figures 13 and 14 show the effectiveness of the nodding video in scenes other than the first viewing. From these results, the audience retention shows the effectiveness of nodding, which was shown only in the results for the mini-quizzes in our previous research [6].
Fig. 16. Audience retention for watching the raggedy video for the first time.
Fig. 17. Audience retention for watching the raggedy video for the second time.
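The Kruskal-Wallis test used in Sect. 4.4 ranks all observations together and compares the rank sums of the groups. The sketch below computes the H statistic by hand as an illustration only: the function names `average_ranks` and `kruskal_wallis_h` and the retention values are hypothetical, tied values receive average ranks, and the tie correction of the full test is omitted for brevity.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return the Kruskal-Wallis H statistic (without tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical audience retention values (%) for three viewing groups.
synthesized = [41.0, 45.0, 48.0]
nodding = [52.0, 55.0, 58.0]
raggedy = [61.0, 64.0, 67.0]
print(round(kruskal_wallis_h([synthesized, nodding, raggedy]), 2))
```

H is then compared with a chi-squared distribution with one fewer degree of freedom than the number of groups.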
5 Positive Comparison of Face-to-Face and On-Demand Classes
We compared the results of the DOLA in face-to-face classes with the audience retention of modes including nodding videos and raggedy videos. Figures 18, 19 and 20 show changes in the audience retention and the DOLA. The second half of Part 4 shows a scene where both the DOLA and the audience retention increase. On the other hand, the end of Part 2 shows a scene where the DOLA is high and the audience retention in the on-demand classes is low.
Fig. 18. Comparison between DOLA and audience retention (parts 1 and 2).
Fig. 19. Comparison between DOLA and audience retention (parts 3 and 4).
Fig. 20. Comparison between DOLA and audience retention (part 5).
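The comparison in Sect. 5 looks for scenes where the DOLA is high while the audience retention is low. One possible way to flag such scenes automatically is sketched below. The function name `mismatched_scenes`, the use of each series' mean as the threshold, and the sample values are all assumptions made for illustration; the source describes the comparison only through Figs. 18, 19 and 20.

```python
from statistics import mean


def mismatched_scenes(dola, retention):
    """Return indices where DOLA is above its mean but retention is below its mean.

    Both series are assumed to be sampled at the same time points.
    """
    if len(dola) != len(retention):
        raise ValueError("series must be sampled at the same time points")
    dola_mean = mean(dola)
    retention_mean = mean(retention)
    return [
        i
        for i, (d, r) in enumerate(zip(dola, retention))
        if d > dola_mean and r < retention_mean
    ]


# Hypothetical classroom DOLA and audience retention (%) per time segment.
dola = [4, 6, -2, -5, 7, 3]
retention = [80.0, 55.0, 60.0, 70.0, 90.0, 50.0]
print(mismatched_scenes(dola, retention))
```

Segments returned by such a check would be candidates for the scenes that on-demand viewers skip even though the face-to-face class treated them as important.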
6 Discussion
6.1 Positivity in Face-to-Face Classes
In this research, we quantified positive and negative behaviors and summed them to calculate DOLA. In the first half of the lecture, the total DOLA remained at a relatively high level. The students were more conscious of taking notes at the beginning of Part 1 because the professor encouraged them to do so. On the other hand, DOLA remained at a low level at the end of the class. In addition, as a common trend in all lecture parts, DOLA shifted in the positive direction at the end of each lecture part. This is probably because the professor was summarizing the lecture part, which made it the best place to review its contents.
6.2 Positivity of On-demand Classes
In this research, we structured the content by dividing the videos, following Tanaka, who clarified that dividing lecture videos makes it easier for students to work [20]. The results showed that audience retention increased in the scenes related to the content of the mini-quizzes in the lessons and decreased in other scenes. This behavior makes sense from the viewpoint of improving grades. However, from the viewpoint of the teacher, there are no unnecessary scenes in the teaching materials, and the answers are assumed to be given after viewing the entire class. Therefore, the audience retention in this experiment is not desirable. The challenge is to keep students watching the parts not related to the mini-quizzes, especially the scenes where DOLA increased during the face-to-face class.
When considering audience retention in terms of character action, audience retention was significantly higher than in the other groups when the “group of raggedy” watched the raggedy video for the first time. We concluded that the “group of raggedy” paid attention to the changes in the character action. On the other hand, audience retention was highest in the “group of nodding” when the “group of raggedy” watched the raggedy video for the second time. The raggedy action is assumed to lead to a negative mood, as in an actual classroom. Therefore, it is possible for us to construct the pattern we intended.
6.3 In-Person vs On-Demand
When we compared DOLA in face-to-face classes with the audience retention in on-demand classes, there were situations in which DOLA was high but the audience retention was low. This result indicates that students in the on-demand classes were not watching scenes in which DOLA was high, that is, scenes in which the teacher was talking about what the students felt was important.
6.4 Findings of Character Behavior
The findings from the above analysis can be summarized as follows.
– The scenes where students show interest are different between face-to-face classes and on-demand classes.
– Changing character actions is effective in keeping students watching, but nodding actions are important throughout the class.
– There were scenes when DOLA increased in the face-to-face classes while the audience retention decreased in the on-demand classes. This indicates that in the on-demand streaming, the students missed out on scenes that they felt were important in the face-to-face class.
– We may expect to maintain viewing by changing the character’s actions at key scenes.
7 Conclusion
In this research, we analyzed the effect of introducing nodding student characters in on-demand streaming. We expect to continue using this in the future. As a method for this purpose, we defined an index called DOLA for analyzing students’ learning attitudes in
face-to-face classes. In addition, we used the YouTube audience retention as an index for analyzing students’ learning attitudes in on-demand streaming. DOLA tended to be higher in the first half of the lecture and lower in the second half, and it tended to increase at the end of each lecture part. The audience retention increased in the scenes related to the mini-quizzes in the lessons and was lower in the other scenes. When we compared the two methods, there were scenes where DOLA was high and the audience retention was low, so students may have missed important content in the on-demand streaming. As a next step, we want to validate the findings on designing characters’ actions or motions by conducting on-demand classes.
Acknowledgements. This research was partially supported by JSPS KAKENHI 20H04096, 20K20121, etc. We thank the students of “Human data analysis” at the School of Science and Technology, Kwansei Gakuin University, “Introduction to Computer Science” at the School of Engineering, Kwansei Gakuin University, and “Body Studies” at the Faculty of Culture and Information Science, Doshisha University.
References
1. Center for the Promotion of Excellence in Higher Education, Kyoto University: What is a Hybrid Class?, CONNECT. http://www.highedu.kyoto-u.ac.jp/connect/en/teachingonline/hybrid.html. Accessed 24 Jan 2023
2. Kwansei Gakuin University: Class Policies for the AY2022 Spring Semester, Kwansei Gakuin University. https://global.kwansei.ac.jp/AY2022/classes.html. Accessed 24 Jan 2023
3. National Federation of University Co-operative Associations: Let’s Deliver: Survey on Life as a University Student of the Corona Disaster Aggregate Result Report, National Federation of University Co-operative Associations. https://www.univcoop.or.jp/covid19/survey/pdf/pdf_report2208.pdf. Accessed 24 Jan 2023 (in Japanese)
4. edX: About MOOCs, MOOC.org. https://www.mooc.org/about-moocs. Accessed 13 Dec 2023
5. Faculty of Engineering, The University of Tokyo: Faculty of Metaverse Engineering Main Website, The University of Tokyo. https://www.t.u-tokyo.ac.jp/meta-school. Accessed 24 Jan 2023 (in Japanese)
6. Yamamoto, M., Watanabe, T.: A learning support system with speech-driven embodied entrainment characters superimposed on images. Trans. Inform. Process. Soc. Japan 47(8), 2769–2778 (2006) (in Japanese)
7. Yamamoto, M., Watanabe, T.: Development of an edutainment system with InterActors of a teacher and a student in which a user plays a double role of them. Trans. Inform. Process. Soc. Japan 54(4), 1677–1685 (2013) (in Japanese)
8. Aoyagi, S., Yamamoto, M., Watanabe, T.: A real-space sharing group communication system based on make-believe play via CG characters. Trans. Inform. Process. Soc. Japan 57(12), 2859–2869 (2016) (in Japanese)
9. Aoyagi, S., Kawabe, R., Yamamoto, M., Fukumori, S.: Development of a hand-raising robot by representing embodied motions of active hand-rising. Trans. Inform. Process. Soc. Japan 58(5), 994–1002 (2017) (in Japanese)
10. Minamide, K., Aoyagi, S., Fukumori, S., Yamamoto, M.: Effect of arm height and number of people on impression of hand raising in group communication. Trans. Inform. Process. Soc. Japan 61(6), 1216–1225 (2020) (in Japanese)
11. Fujii, R., Hirose, H., Aoyagi, S., Yamamoto, M.: On-demand lectures that enable students to feel the sense of a classroom with students who learn together. In: Yamamoto, S., Mori, H. (eds.) HCII 2021. LNCS, vol. 12765, pp. 268–282. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-78321-1_21
12. President Online Editorial Department: College Students Satisfied with Online Classes Say They Can Concentrate Better at Twice the Speed, PRESIDENT Online. https://president.jp/articles/-/37681. Accessed 24 Jan 2023 (in Japanese)
13. ZAIKO PTE Ltd.: About Us, ZAIKO PTE Ltd. https://business.zaiko.io/en/about. Accessed 24 Jan 2023
14. VARK Inc.: Cinderella switch 〜hololive Created by Everyone〜, VARK Inc. https://lp.vark.co.jp/minnade_tsukuru_hololive/. Accessed 24 Jan 2023 (in Japanese)
15. Watanabe, H., Mizugaki, M.: Supporting Learners in Self-Learning Courses Using Agent Character, FIT2004, N-024 (2004) (in Japanese)
16. Mayer, R.E., DaPra, C.S.: An embodiment effect in computer-based learning with animated pedagogical agents. J. Experim. Psychol. 18(3) (2012)
17. The Open University of Japan: OUJ Brochure, The Open University of Japan. https://www.ouj.ac.jp/doc/en/OUJ_Brochure.pdf. Accessed 24 Jan 2023
18. KADOKAWA DWANGO Educational Institute: The World’s Most Advanced Online Learning, KADOKAWA DWANGO Educational Institute N High School, S High School. https://nnn.ed.jp/learning/vr/. Accessed 24 Jan 2023 (in Japanese)
19. Institute for Advanced Studies in Education Waseda University (supervision): Current Status and Prospects of Online Education in Universities, pp. 10, GAKUBUNSHA (2022) (in Japanese)
20. Tanaka, S.: Practice and challenges of on-demand lectures (video lectures). Folia Pharmacol. Jpn. 156(6), 330–334 (2021) (in Japanese)
21. Okada, R.: Autonomous motivation in learning activities with friends. Japan. J. Educ. Psychol. 56(1), 14–22 (2008) (in Japanese)
22. Okada, S., Ando, T.: Motivational effect of talking with people on adult learners - Case study of a local language school. Yokohama J. Technol. Manage. Stud. 16, 17–23 (2017) (in Japanese)
23. Hirose, H., Aoyagi, S., Yamamoto, M.: A study on streaming 360 degree videos of classrooms for online classes. In: Proceedings of Human Interface Cyber Colloquium 2020, pp. 445–456 (2020) (in Japanese)
24. Google: Measure key moments for audience retention, YouTube Help. https://support.google.com/youtube/answer/9314415?hl=en. Accessed 24 Jan 2023
25. Yamamoto, T., Shimizu, Y.: A study on concentration and behavior analysis in learning scenes utilizing IT: From an analysis of classes utilizing IT in 5th grade social studies. Japan Soc. Educ. Technol. 30, 93–96 (2006) (in Japanese)
Triangle Logic Recomposition Exercise for Three-Clause Argument and Its Experimental Evaluation
Tsukasa Hirashima(B), Takuya Kitamura, Tomohiro Okinaga, Reo Nagasawa, and Yusuke Hayashi
Hiroshima University, Hiroshima, Japan
[email protected]
Abstract. A framework of triangle logic, a recomposition exercise environment for the triangle logic, and an experimental evaluation of the exercise are reported in this paper. The triangle logic is a graphical representation of a three-clause argument composed of Ground, Warrant, and Conclusion, which simplifies the Toulmin model. The three clauses are located at the three vertices of a triangle, respectively. The recomposition exercise requires students to complete a triangle by selecting the necessary clauses from provided candidates. The exercise environment can logically diagnose a recomposed triangle logic and give feedback to the student. We used a quasi-experimental pretest-posttest design to evaluate the exercise. Sixteen participants in the experimental group received the pretest, exercise, and posttest, while fifteen participants in the control group received the pretest and posttest only. The interval between the pretest and posttest was a week. The exercise was conducted for 40 min just before the posttest. As a result, the posttest score of the experimental group was statistically significantly improved compared to the pretest score, and the effect size was large, whereas the posttest score of the control group was not significantly improved compared to the pretest score.
Keywords: Toulmin Model · Three-Clause Argument · Triangle Logic · Assembling Exercise · Open Information Structure Approach
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 252–262, 2023. https://doi.org/10.1007/978-3-031-35129-7_18
1 Introduction
Logic helps us understand the meaning and worth of an argument and distinguish between good and bad arguments [1, 2]. Logical understanding of an argument is also an essential ability for critical thinking [3]. The Toulmin model [4] is well known as a concrete and externalized representation of logical argument. Although the original model is composed of six components, it is also accepted that three of them, that is, Ground, Warrant and Conclusion, are the main components [5, 6]. It is commonly agreed that the minimum condition for a logical argument is that Ground and Conclusion exist. An argument composed of Ground and Conclusion is often called a “two-clause argument,” and its basic form is “Ground, therefore, Conclusion”. In addition to the two components, a logical argument includes Warrant, which is the reason why Conclusion can be derived from Ground. The
component to explain the reason is usually called Warrant. Although Warrant is often omitted in an argument, it is an indispensable component for the argument to be logical and exists even when it is omitted. An argument that explicitly includes Warrant is called a “three-clause argument”. Triangle logic is a graphical representation of the three-clause argument. Figure 1 is an example of the triangle logic. Ground is located at the bottom left vertex of the triangle, and Warrant at the bottom right vertex. Conclusion is located at the top vertex, opposite the base. The representation of the triangle logic provides students with visualized and manipulable clauses.
In this paper, we formalize the triangle logic as having a deduction structure, that is, (1) modus ponens (mp) {p, p → q, q}, or (2) syllogism (or multiple modus ponens (mmp)) {p → q, q → r, p → r}. In the formalization, Ground is the minor premise, Warrant is the major premise, and Conclusion is the conclusion. We call the formalized triangle logic “deductive triangle logic” (DTL). DTL visualizes three kinds of reasoning: (1) deductive reasoning derives Conclusion from Ground and Warrant, (2) ground reasoning derives Ground from Conclusion and Warrant, and (3) warrant reasoning derives Warrant from Ground and Conclusion. In the classification of syllogisms [7], deductive reasoning corresponds to AAA in the first figure, ground reasoning corresponds to AAA in the second figure, and warrant reasoning corresponds to AAA in the third figure (Fig. 2).
Fig. 1. Triangle logic for modus ponens (mp).
Fig. 2. Triangle logic for multiple modus ponens (mmp).
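The two deduction structures given in the formalization, modus ponens {p, p → q, q} and syllogism {p → q, q → r, p → r}, can be checked mechanically. The sketch below is one possible encoding, not the system described in this paper: the class `Implies` and the function `is_valid_dtl` are hypothetical names, atomic propositions are plain strings, and Ground, Warrant, and Conclusion are taken as the minor premise, major premise, and conclusion, as stated in the text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Implies:
    """An implication 'antecedent -> consequent' between atomic propositions."""
    antecedent: str
    consequent: str


Proposition = Union[str, Implies]


def is_valid_dtl(ground: Proposition, warrant: Proposition,
                 conclusion: Proposition) -> bool:
    """Check whether a triangle has the mp or mmp deduction structure."""
    # In both structures the Warrant (major premise) is an implication.
    if not isinstance(warrant, Implies):
        return False
    # Modus ponens: Ground p, Warrant p -> q, Conclusion q.
    if isinstance(ground, str) and isinstance(conclusion, str):
        return (ground == warrant.antecedent
                and conclusion == warrant.consequent)
    # Syllogism (mmp): Ground p -> q, Warrant q -> r, Conclusion p -> r.
    if isinstance(ground, Implies) and isinstance(conclusion, Implies):
        return (ground.consequent == warrant.antecedent
                and conclusion.antecedent == ground.antecedent
                and conclusion.consequent == warrant.consequent)
    return False


print(is_valid_dtl("p", Implies("p", "q"), "q"))
print(is_valid_dtl(Implies("p", "q"), Implies("q", "r"), Implies("p", "r")))
print(is_valid_dtl("p", Implies("q", "p"), "q"))
```

Because validity depends only on how the three vertices fit together, the same check applies whichever vertex a learner fills in last, which matches the deductive, ground, and warrant readings of the triangle.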
A recomposition exercise of DTL that uses a set of provided propositions (that is, clauses) has already been implemented as technology-enhanced learning, with diagnosis of the composed logic and feedback based on the diagnosis. To evaluate the exercise, a quasi-experimental pretest-posttest design was used. As a result, although the posttest score of the control group was not significantly improved compared to the pretest score, the posttest score of the experimental group was statistically significantly improved compared to the pretest score, and the effect size was large.
In this paper, triangle logic as a graphical representation of the three-clause argument is explained first. Then, a triangle logic recomposition exercise is introduced. An experimental evaluation of the exercise is also reported.
2 Triangle Logic for Three-Clause Argument
2.1 Two-Clause Argument and Its Exercises
A logical argument is a series of statements that ends with a Conclusion, and the Conclusion should be supported by Ground in the argument. A logical argument composed of Ground and Conclusion is called a two-clause argument. In the following example (Ex1), the Ground is “ATM fees make their customers unhappy” and the Conclusion is “banks shouldn’t charge ATM fees”.
(Ex1) Warranted argument: Banks shouldn’t charge ATM fees because the fees make their customers unhappy.
Reference [8] states that Ex1 is a logically valid argument with a warrant, whereas Ex2 below is not a logical one because it lacks a warrant.
(Ex2) Banks shouldn’t charge ATM fees because banks are financial institutions.
Several investigations have indicated that, even for two-clause arguments, it is often difficult for university students to judge logical validity [9, 10]. Based on such investigations, several evaluation exercises for two-clause arguments have been implemented [8, 10]. The results suggested that immediate feedback on a student’s response plays an important role in the effectiveness of learning. Therefore, technology-enhanced learning with diagnosis and feedback facilities for students’ responses is a promising approach to exercises for argument learning.
2.2 Triangle Logic for Three-Clause Argument
Although the warrant is treated as implicit in the exercises for two-clause arguments, several studies of argument learning describe the warrant explicitly. For example, in the Toulmin model, Warrant is one of the six components. Moreover, Toulmin mentioned that “The step to the original claim or conclusion is an appropriate and legitimate one. … These may normally be written very briefly (in the form ‘If D, then C’) … ‘Data such as D entitle one to draw conclusions, or make claims, such as C’” [4]. This remark suggests that the basic structure of a logical argument is composed of Ground (or Data), Conclusion (or Claim), and Warrant.
An argument composed of the three components is often called a three-clause argument. This research deals with the three-clause argument. Graphical representation has several advantages for learning logical argument [11]. Triangle logic is a promising representation for the three-clause argument. Triangle logic is categorized into two types: Linguistic Triangle Logic (LTL) and Deductive Triangle Logic (DTL). Figure 3 is an example of LTL. Triangle logic requires a clause to be described as Warrant, explaining why Conclusion can be derived from Ground. In the case of LTL, Warrant must be explicitly described, but it is not required that Conclusion be derivable from Ground in the sense of formal logic. In the case of Fig. 3, the Warrant looks reasonable, but it does not directly connect the Ground and Conclusion, and
Triangle Logic Recomposition Exercise for Three-Clause
255
there are several missing premises. In short, LTL requires to describe Warrant explicitly, but allow including missing premises. In DTL, Warrant should be able to derive Conclusion from Ground as formal logic. By adding missing premises, the LTL shown in Fig. 3 is translated into the DTL shown in Fig. 4.
Fig. 3. Linguistic Triangle Logic (LTL).
Fig. 4. Deductive Triangle Logic (DTL).
Because LTL is easy to describe, it is easy to use in the classroom. However, its logical validity cannot be examined formally. Because DTL requires the details of the argument to be described, it is not easy to use in the classroom. However, it has the important advantage that its logical validity can be examined formally. Because DTL allows automatic diagnosis of the logic recomposed by a student, we have adopted DTL for the exercise in technology-enhanced learning.
Although the triangle of DTL is regarded as the result of deductive reasoning, it can also be interpreted as three kinds of reasoning: (1) conclusion reasoning (that is, deductive reasoning) derives the Conclusion from the Ground and Warrant, (2) ground reasoning derives the Ground from the Conclusion and Warrant, and (3) warrant reasoning derives the Warrant from the Ground and Conclusion.
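The three kinds of reasoning over a DTL triangle can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that every proposition has the form "an X is a Y" and is stored as an (antecedent, consequent) pair. The class `Rule` and all function names are hypothetical; the source does not describe how the actual system performs its diagnosis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Assumed encoding of a proposition 'an X is a Y' as X -> Y."""
    antecedent: str
    consequent: str


def conclusion_reasoning(ground: Rule, warrant: Rule) -> Optional[Rule]:
    """Deduce the Conclusion by chaining Ground and Warrant, if they chain."""
    if ground.consequent == warrant.antecedent:
        return Rule(ground.antecedent, warrant.consequent)
    return None


def ground_reasoning(conclusion: Rule, warrant: Rule) -> Optional[Rule]:
    """Propose a Ground that, together with the Warrant, yields the Conclusion."""
    if conclusion.consequent == warrant.consequent:
        return Rule(conclusion.antecedent, warrant.antecedent)
    return None


def warrant_reasoning(ground: Rule, conclusion: Rule) -> Optional[Rule]:
    """Propose a Warrant that connects the Ground to the Conclusion."""
    if ground.antecedent == conclusion.antecedent:
        return Rule(ground.consequent, conclusion.consequent)
    return None


def is_valid_triangle(ground: Rule, warrant: Rule, conclusion: Rule) -> bool:
    """A triangle is accepted only if the Conclusion follows deductively."""
    return conclusion_reasoning(ground, warrant) == conclusion


# Illustrative propositions in the style of the Socrates example.
ground = Rule("Socrates", "human")
warrant = Rule("human", "mortal")
conclusion = Rule("Socrates", "mortal")
wrong_warrant = Rule("creature", "mortal")  # needs additional premises to connect
```

In this sketch, `is_valid_triangle(ground, warrant, conclusion)` holds, while replacing the warrant with `wrong_warrant` fails because the Ground no longer chains to it without additional premises.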
3 Triangle Logic Recomposition Exercise
3.1 Outline of Recomposition Exercise
The exercise adopts the open information structure approach [12]. In this approach, a learning environment is designed to allow a learner to operate on an information structure that the learner is required to understand. Several applications have already been reported [13, 14]. In this case, the triangle logic is the information structure. In the exercise implemented in this research, a learner is provided with several components and completes a triangle logic by recomposing them logically. Figure 5 shows the beginning stage of the exercise. Four propositions are provided on the left side of the screen as components of a triangle logic. On the right side, there is a triangle with three blanks: Ground, Warrant, and Conclusion. The learner is required to fill in the blanks with the propositions. Figure 6 is an example of a successfully completed triangle logic, and Fig. 7 is an example of a wrong triangle logic. In Fig. 7, “a creature is mortal” is not a correct component for recomposing the triangle logic. To use the proposition correctly, the triangle logic shown in Fig. 8 is necessary. In Fig. 8, “a human is a creature” and “Socrates is a creature” are additional propositions, that is, missing premises.
Fig. 5. A beginning stage of recomposition exercise of triangle logic.
Fig. 6. Example of completed triangle logic.
Fig. 7. Example of wrong triangle logic and feedback.
Fig. 8. Example of multiple triangle logic.
3.2 Exercise Structure
The exercise is composed of (Ex1) a partial recomposition exercise, (Ex2) a full recomposition exercise, and (Ex3) a multiple triangle logic exercise. This subsection explains these exercises. In Ex1, one or two vertices of a triangle are filled with propositions beforehand, and the learner is required to fill in the remaining blanks. Because the prefilled propositions support the learner in recomposing the triangle logic, Ex1 is the beginning stage of the recomposition exercise. Theoretically, in the case of two prefilled propositions, there are three types of reasoning: (a) conclusion reasoning with prefilled ground and warrant, (b) ground reasoning with prefilled conclusion and warrant, and (c) warrant reasoning with prefilled ground and conclusion. Ground reasoning and warrant reasoning can be formalized as abduction [15]. Although the differences between abductive and deductive reasoning are important for logical thinking, they are not dealt with explicitly in the exercises. An exercise that deals with these differences is one of the most important items of future work. In Ex2, the learner is required to fully recompose one triangle logic. This is the main activity of the exercise. In Ex1 and Ex2, the set of components provided to the learner includes dummy propositions. Dummy propositions are generated by the following procedure: (i) prepare a correct triangle logic with three propositions; (ii) decompose the propositions into smaller components and generate new propositions by recombining these components, for example, making {r → p} from {p → q, q → r} (dummy proposition type 1); (iii) generate new propositions by changing a positive proposition component into a negative one and vice versa, for example, making {p → q} from {p → ~q} (dummy proposition type 2); and (iv) replace a proposition component with a related one, for example, making “a creature is mortal” from “a human is mortal” as shown in Fig. 8.
In Ex3, multiple triangle blocks are connected. Figure 9 shows an example of Ex3. In the figure, the learner is required to fill in the five blanks in two connected triangle blocks by using the five provided propositions. Currently, Ex3 contains no dummy propositions.
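The first two dummy-proposition types described for Ex1 and Ex2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' generator: propositions are assumed to be implications between atomic terms, and the names `Implication`, `type1_dummies`, and `type2_dummies` are hypothetical. The replacement of a component with a related concept is omitted because it would require a concept dictionary that the source does not specify.

```python
from dataclasses import dataclass, replace
from itertools import permutations
from typing import List


@dataclass(frozen=True)
class Implication:
    """Assumed encoding: antecedent -> consequent, or antecedent -> ~consequent."""
    antecedent: str
    consequent: str
    negated: bool = False


def type1_dummies(correct: List[Implication]) -> List[Implication]:
    """Recombine the terms of the correct propositions into new implications."""
    terms = sorted({t for p in correct for t in (p.antecedent, p.consequent)})
    candidates = [Implication(a, c) for a, c in permutations(terms, 2)]
    # Keep only recombinations that are not part of the correct triangle.
    return [d for d in candidates if d not in correct]


def type2_dummies(correct: List[Implication]) -> List[Implication]:
    """Flip the polarity of each proposition (positive <-> negative)."""
    return [replace(p, negated=not p.negated) for p in correct]


# A correct triangle over the abstract terms p, q, r.
triangle = [Implication("p", "q"), Implication("q", "r"), Implication("p", "r")]
```

Here `type1_dummies(triangle)` contains `Implication("r", "p")`, which mirrors the example of making {r → p} from {p → q, q → r}.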
4 Experimental Evaluation
4.1 Procedure of the Experiment
The subjects were thirty-one undergraduate or graduate students in an information science course. The undergraduate students were in their fourth year. Because programming is the main subject of the information science course, the subjects were regarded as having had sufficient training in logical thinking in a general sense. The subjects were divided into two groups, an experimental group and a control group. The sixteen participants in the experimental group took the pretest, the exercise, and the posttest, whereas the fifteen participants in the control group took only the pretest and posttest. The interval between the pretest and posttest was one week. The exercise (40 min) was conducted just before the posttest. Just after the posttest, two questionnaires, one about the exercise and one about flow experience, were also administered.
As the pretest and posttest, we used twenty-two questions in the category of linguistic representation from a survey of logical thinking conducted by the National Institute for Educational Policy Research [16]. This survey was conducted with 5,575 second-year high school students. The pretest and posttest were identical, and each took 40 min. For the experimental group, the triangle logic and the exercise environment were explained for ten minutes before the exercise. The explanation of the triangle logic covered the roles of the ground, warrant, and conclusion in logic, and several wrong cases were also introduced. In the introduction of the exercise environment, several examples of recomposition of the triangle logic were demonstrated. The exercise was composed of the three types of exercises introduced in Sect. 3.2.
4.2 Results and Analysis
Pretest and Posttest
The results of the pretest and posttest are shown in Table 1. A two-way ANOVA was conducted with one between-subjects factor, Exercise (exercise (experimental group) and no exercise (control group)), and one within-subjects factor, Test (pretest and posttest). Because there was a statistically significant interaction between Exercise and Test (p < .001), simple main effects were analyzed for each factor. Although there was no significant difference between the experimental group and the control group in the pretest (p = .059 > .05), there was a significant difference between them in the posttest (p = .02 < .05). In the experimental group, there was a significant difference between the pretest and posttest (p < .001), and the effect size was large (Cohen’s d = 1.30). In contrast, in the control group, there was no significant difference between them (p = .84 > .05).
Table 1. Scores of pretest and posttest (full marks: 22).
Correlation between Exercise Time and Test Score
All subjects completed the exercise, and the average time was 27.4 min (SD = 18.3). The exercise provides feedback when the recomposed triangle logic is not correct and requests the subject to repeat the recomposition until it is complete. Therefore, the exercise time is an indicator of the difficulty of the exercise for the learner. Table 2 shows the correlation between exercise time and test score. There was a statistically significant strong negative correlation between exercise time and pretest score. This means that a subject who scored higher in the pretest tended to complete the exercise in a shorter time. The same correlation was observed for the posttest, after the scores had improved.
Table 2. Correlation between exercise time and test score (n = 15).
                Pretest              Posttest
Exercise time   −0.71 (p = 0.003)    −0.79 (p = 0.0005)
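For readers who wish to reproduce this kind of analysis, the Pearson correlation coefficient can be computed as in the following minimal Python sketch. The function name `pearson_r` and the sample values are hypothetical and serve only to illustrate the formula; they are not the data of this study, and the source does not state which software was used.

```python
from math import sqrt
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only; NOT the study's data.
exercise_minutes = [12, 18, 25, 33, 47]
pretest_scores = [20, 18, 17, 14, 11]
r = pearson_r(exercise_minutes, pretest_scores)
```

A negative `r` indicates that longer exercise times go together with lower test scores, which is the direction of the relationship reported in Table 2.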
Results of Questionnaire
Table 3 shows the questions and the average responses to the 5-point Likert scale items (1: Strongly Agree, 2: Agree, 3: Neutral, 4: Disagree, 5: Strongly Disagree). The results for (1) and (2) show that the participants thought the exercises were helpful for logical thinking and learning. The response for (3) is not so high, but there was no negative response. On the other hand, the responses for (4) and (5) show that the exercises need to be improved. The response for (6) is as expected.
Table 3. Results of questionnaire.
Questions                                                                                       Average (n = 15)
(1) Logical thinking was necessary in the exercises                                             4.4
(2) Logical thinking ability was developed by the exercises                                     4.1
(3) The exercises were better than the usual exercises of logical thinking you had experienced  3.8
(4) Various types of reasoning could be conducted                                               3.6
(5) Feedback in the exercises was adequate                                                      3.0
(6) Professional knowledge was necessary for the exercises                                      2.0
Results of Flow Experiences
We evaluated the flow experiences of the participants with the Flow Short Scale [17]. The results are shown in Table 4. The other three values are the results of concept map recomposition activities measured with the same scale [18]. These results suggest that the participants had flow experiences.
Table 4. Flow experiences.
             Triangle logic   Concept map recomposition
                              Improved 1   Improved 2   Original
Flow state   4.5              4.2          4.3          3.6
5 Conclusion and Remarks
In this paper, we reported a framework of triangle logic, a recomposition exercise environment for the triangle logic, and an experimental evaluation of the exercise. In the exercise, a learner recomposes the triangle logic by using the provided components. The environment can diagnose the recomposed triangle logic and give feedback based on the diagnosis. A quasi-experiment in the form of a pretest-posttest control group design was used to evaluate the exercise. As a result, the posttest score of the experimental group was significantly higher than its pretest score, with a large effect size, whereas the posttest score of the control group was not significantly higher than its pretest score. Through this study, we have confirmed that the triangle logic recomposition exercise is a promising approach for promoting logical and critical thinking.
References
1. Duplass, J.A., Zeidler, D.I.: Critical thinking and the role of logical argument in social studies education. Int. J. Soc. Educ. 15(1), 113–27 (2000)
2. Cerutti, F., Gaggl, S.A., Thimm, M., Wallner, J.: Foundations of implementations for formal argumentation. IfCoLog J. Logics Appl. 4(8), 2623–2705 (2017)
3. Greenlaw, S.A., DeLoach, S.B.: Teaching critical thinking with electronic discussion. J. Econ. Educ. 34(1), 36–52 (2003)
4. Toulmin, S.E.: The Philosophy of Science, vol. 14. Genesis Publishing (1958)
5. Govier, T.: The Philosophy of Argument. Vale Press, Newport News (1999)
6. Henderson, J.B., Osborne, J., MacPherson, A., Szu, E.: A new learning progression for student argumentation in scientific contexts. In: Proceedings of the ESERA 2013 Conference: Science Education Research for Evidence-Based Teaching and Coherence in Learning, pp. 726–742 (2014)
7. De Morgan, A.: On the Syllogism: And Other Logical Writings. Routledge (2019)
8. Britt, M.A., Kurby, C.A., Dandotkar, S., Wolfe, C.R.: I agreed with what? Memory for simple argument claims. Discourse Process. 45(1), 52–84 (2007)
9. Britt, M.A., Larson, A.A.: Constructing representations of arguments. J. Mem. Lang. 48, 794–810 (2003)
10. Larson, M., Britt, M.A., Larson, A.A.: Disfluencies in comprehending argumentative texts. Read. Psychol. 25, 205–224 (2004)
11. Hoffmann, M.H.: Analyzing framing processes in conflicts and communication by means of logical argument mapping. Framing matters: Perspectives on negotiation research and practice in communication, pp. 136–164 (2011)
12. Hirashima, T.: Design of learning by logical empathic understanding in technology enhanced learning. In: Yamamoto, S., Mori, H. (eds.) HCII 2021. LNCS, vol. 12766, pp. 38–49. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-78361-7_4
13. Hirashima, T.: Reconstructional concept map: automatic assessment and reciprocal reconstruction. Int. J. Innov. Creat. Change 5, 669–682 (2019)
14. Supianto, A.A., Hayashi, Y., Hirashima, T.: Model-based analysis of thinking in problem posing as sentence integration focused on violation of the constraints. Res. Pract. Technol. Enhanc. Learn. 12(1), 1–21 (2017). https://doi.org/10.1186/s41039-017-0057-5
15. Hirata, K.: A classification of abduction: abduction for logic programming. Mach. Intell. 14, 405 (1993)
16. NIER (National Institute for Educational Policy Research). A survey of logical thinking (2013). https://nier.repo.nii.ac.jp/?action=repository_uri&item_id=469&file_id=22&file_no=1 (In Japanese)
17. Rheinberg, F., Engeser, S., Vollmeyer, R.: Measuring components of flow: the Flow-Short-Scale. In: Proceedings of the 1st international positive psychology summit (2002)
18. Furtado, P.G.F., Hirashima, T., Hayashi, Y.: Reducing cognitive load during closed concept map construction and consequences on reading comprehension and retention. IEEE Trans. Learn. Technol. 12(3), 402–412 (2018)
Proposal for a Semi-subjective Learning Support System with Operation Indices Targeting Vectors
Tomohito Jumonji1(B), Nonoka Aikawa1, and Takahito Tomoto2
1 Graduate School of Engineering, Tokyo Polytechnic University, Atsugi, Kanagawa, Japan
[email protected]
2 Faculty of Engineering, Tokyo Polytechnic University, Atsugi, Kanagawa, Japan
Abstract. It is important that learners engage in trial and error when solving mathematical problems. However, when learning ordinary mathematics, learners often cannot sufficiently perform trial and error because the correct answer is given to them by the textbook or the teacher after they have solved the problem. Therefore, we aim to encourage trial and error by producing a visualization of the learner’s solution and having them interact with it. In this study, we give an overview of how we developed a learning support system that visualizes errors made by learners studying vector addition.
Keywords: high-school mathematics · vector · error visualization
1 Introduction
In learning, it is important to reflect on one’s answers and to engage in trial and error. However, in conventional learning methods, such as those involving textbooks or classroom teaching, the textbook or the teacher will often only give the correct answer and its explanation when a learner answers incorrectly, leaving little in the way of adaptive feedback concerning their specific error. If the textbook or the teacher gives correct answers and explanations only, then the learner becomes a “passive learner” who readily accepts those answers, leaving open the possibility that the learner will not fully reflect on their own errors. If learners do not fully reflect on their incorrect answers, then they will not consider why their answers are wrong, which then tempts them to simply try to memorize the correct answers. Essentially, when learners make a mistake, they have a great opportunity to deepen their understanding by reflecting on their own answers. Therefore, it is important to provide an environment in which learners can proactively reflect on their own incorrect answers through trial and error, rather than simply giving them the correct answer when they make a mistake. To encourage learners to perform trial and error, it is necessary to provide them with clues that will guide them to the correct answer, rather than giving it to them directly. However, providing them with these clues may make them correct what is wrong, but it will not necessarily make them reflect on why it is wrong.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 263–273, 2023. https://doi.org/10.1007/978-3-031-35129-7_19
Hirashima et al. [1] proposed error-based simulation (EBS) as a framework for error visualization, which focuses on what kind of strange behavior would occur if the learner’s incorrect answer were actually correct. Most often, feedback is negative and serves to negate the learner’s answer, for example, the message “the learner’s answer is incorrect." In error visualization, however, feedback is positive and serves to affirm the learner’s answer, such as the message “if the learner’s answer is correct, what kind of problems will occur?” The learner is able to visualize their own answer and is expected to notice its “wrongness” by themselves. They are also expected to reflect on why their answer is wrong. Having their errors visualized is expected to motivate learners to correct those errors on their own, and to encourage them to engage regularly in trial and error. Hirashima et al. developed EBS mainly for mechanics [2, 3] and verified its effectiveness through many classroom practice sessions [4, 5]. Kurokawa et al. [6–8] developed and evaluated a system that encourages learners to engage in trial and error by visualizing errors in trajectories in mathematics. Unlike in mechanics, mathematics does not target phenomena, so it is usually impossible to generate specific behaviors. Consequently, it is more difficult to suggest errors to learners than when using EBS for mechanics. Kurokawa et al. focused on two of the five types of mathematical representations captured by Nakahara [9]: symbolic representations using mathematical expressions and graphic representations using graphs, which are commonly used in mathematics learning. They developed a system that converts the symbolic and graphic expressions from a learner’s answer into other expressions, thereby allowing them to visualize errors and encouraging trial and error (hereafter, the “mathematical expression conversion system”). 
In the mathematical expression conversion system, the graph is visualized according to certain constraints, which are rules in the mathematical equation that the learner has formulated, and the diagram can be manipulated within the constraints to suggest errors to the learner. For example, if part of the process for determining the correct answer to a problem leads to “point P moves on y = 2x + 3” but the learner’s answer is “point P moves on y = −2x + 3,” then the behavior of point P in the diagram in which the correct answer is visualized will differ from the behavior of point P in the diagram in which the learner’s answer is visualized. The difference in behavior can be confirmed by the learner’s own manipulation of point P. By confirming the difference in behavior through manipulation, we expect the learner to notice the error and begin the trial-and-error process. We have developed a learning support system that uses error visualization for vectors in mathematics based on the work of Kurokawa et al., and have verified its effect on the learning process [10]. We successfully visualized errors by representing the direction of vectors in a figure as an arrow. In Kurokawa’s study, the graph itself was defined in a relatively unique way for visualization because it targeted trajectories. However, the restrictions for vectors are loose because they are concepts that include only direction and magnitude, and it is often impossible to plot vectors uniquely on a graph. Therefore, it is more important to have a function that allows the learner to manipulate the vectors themselves. Both Kurokawa et al.’s and our previous studies have utilized a function that allows learners to manipulate figures, but they did not control whether the learners themselves actually perform the manipulation. Moreover, conventional mechanics EBSs do not
guarantee that the learner will observe the strange behavior that is visualized. If the learner only checks whether the feedback is correct or incorrect and does not fully reflect on their own solution, then the learning process becomes passive in that the learner only accepts the correctness or incorrectness presented. If learners themselves do not fully recognize the relationship between their own answers and the visualized behaviors and figures, then they may not understand why their answers are incorrect. In this study, we propose a method that requires the learner to manipulate figures so that they may then recognize the relationship between their answer and the visualized figure. We expect that this will allow learners to proactively discover why their answers are incorrect, rather than having them passively accept correct or incorrect answers. (We call this activity "semi-subjective" in this study because the activity itself is required by the system.) We are specifically proposing a learning support system that explicitly requires the learner to perform operations involving vector composition in mathematics.
2 Error Visualization in Mathematics
2.1 Error Visualization for Linear Algebra
Kurokawa et al. [6–8] developed and evaluated a system that encourages learners to engage in trial and error by visualizing their errors when calculating trajectories in mathematics. They focused on two of the five mathematical representations identified by Nakahara [9]: symbolic representations, which are representations of mathematical formulas, and graphic representations, which are representations of figures and graphs. They developed a conversion system for mathematical expressions that encourages trial and error by converting the symbolic and graphic expressions in a learner’s answer and visualizing their errors. However, unlike in dynamics, the mathematics in Kurokawa et al.’s study focused on graphs, not phenomena. Therefore, unlike EBS, which is a system using error visualization for dynamics, it is usually not possible to generate behaviors, and consequently, it is more difficult to suggest errors to the learner. The mathematical expression conversion system visualizes graphs according to mathematical rules called “constraints” that exist in the mathematical equations constructed by the learner, and at the same time, it suggests errors to the learner by allowing them to manipulate those graphs within the constraints. As an example, take the situation where the correct answer to a question leads to “point P moves on y = 2x + 3” but the learner inputs “point P moves on y = −2x + 3.” The difference in behavior between these two formulations is visualized by the system, and the learner is made aware of it through direct interaction. This helps the learner notice their error and begin the cycle of trial and error.
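The point P example can be made concrete with a small Python sketch. It is only an illustration of manipulating a point within a constraint: the function names `correct_constraint`, `learner_constraint`, and `drag_point` are hypothetical and are not taken from Kurokawa et al.'s system.

```python
from typing import Callable, List, Sequence, Tuple


def correct_constraint(x: float) -> float:
    """Constraint from the correct answer: P moves on y = 2x + 3."""
    return 2 * x + 3


def learner_constraint(x: float) -> float:
    """Constraint from the learner's answer: P moves on y = -2x + 3."""
    return -2 * x + 3


def drag_point(constraint: Callable[[float], float],
               xs: Sequence[float]) -> List[Tuple[float, float]]:
    """Positions of P as it is dragged to each x while obeying the constraint."""
    return [(x, constraint(x)) for x in xs]


drag_xs = [0, 1, 2]
correct_path = drag_point(correct_constraint, drag_xs)
learner_path = drag_point(learner_constraint, drag_xs)
# Positions at which the two behaviors differ.
differences = [(c, l) for c, l in zip(correct_path, learner_path) if c != l]
```

Both constraints place P at (0, 3), so the error is invisible until the point is moved; dragging P to x = 1 or x = 2 exposes the difference, which is why the manipulation itself matters.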
2.2 Error Visualization for Vectors
In our previous study, we visualized errors for vectors by representing their direction as an arrow in a figure. The trajectories handled in Kurokawa et al.’s study can be visualized in such a way that the graph is uniquely determined for a given mathematical expression. For example, the trajectory is uniquely determined when the formula y = 2x + 3 is input. Therefore, if that is the correct answer for a given problem but the learner’s answer is y = −2x + 3, then it is easy to visualize the difference between the correct graph and the graph based on the learner’s answer [6–8]. However, since vectors are concepts that involve only direction and magnitude, the restrictions are loose, and it is often not possible to plot them uniquely on a graph. For example, take the following problem: Given an equilateral triangle consisting of three points A, B, and C, find $\overrightarrow{XX}$ for $\overrightarrow{AB} = \overrightarrow{XX} + \overrightarrow{XX}$. If the correct answer is $\overrightarrow{AB} = \overrightarrow{AC} + \overrightarrow{CB}$ and the learner’s answer is $\overrightarrow{AB} = \overrightarrow{AC} + \overrightarrow{BC}$, then the learner’s answer is inconsistent with the condition “an equilateral triangle consisting of three points ABC” given by the problem, and thus it cannot be visualized using the method of Kurokawa et al. Therefore, it is important to have a function that the learners themselves can manipulate. Our learning support system provides such a function. However, although Kurokawa et al.’s and our previous studies provide a function that allows learners to manipulate figures, the systems only convert the answers (mathematical expressions) constructed by the learners into figures, and it was left to the learners whether they actually performed the manipulations. Furthermore, the conventional mechanics EBS does not guarantee that the learner will observe the strange behavior that is visualized.
If they only check the correctness of the given feedback and do not fully reflect on their solution, then the learning becomes passive, in the sense that they simply accept what is presented to them as being correct. If the learner does not fully recognize the relationship between their answers and the visualized behaviors and figures, then they may not be able to understand why their answers are wrong. This then makes it likely that they will make similar mistakes with similar problems. Therefore, it is important to provide specific activities that make the learner aware of the relationship between their answers and the shapes.
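The inconsistency in the equilateral-triangle example can also be checked algebraically. The following short derivation is a sketch that relies only on the identity $\overrightarrow{BC} = -\overrightarrow{CB}$; it is added for illustration and is not part of the original system.

```latex
\begin{align}
\overrightarrow{AC} + \overrightarrow{CB} &= \overrightarrow{AB}, \\
\overrightarrow{AC} + \overrightarrow{BC}
  &= \bigl(\overrightarrow{AC} + \overrightarrow{CB}\bigr) + 2\,\overrightarrow{BC}
   = \overrightarrow{AB} + 2\,\overrightarrow{BC}.
\end{align}
```

Because B and C are distinct points, $2\,\overrightarrow{BC} \neq \vec{0}$, so the sum $\overrightarrow{AC} + \overrightarrow{BC}$ cannot equal $\overrightarrow{AB}$ for the triangle given in the problem.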
3 Proposed Methodology The activity that we propose to make learners recognize the relationship between their answers and the figures is explicitly requiring them to manipulate the figures, thereby allowing them to perform semi-autonomous learning activities. This requires learners to consider the visualization of their errors, and we expect that they will then proactively discover why their answers are wrong. The operations required by this method are designed based on the model that is to be learned. For example, if we want the learner to understand a relationship like "in a vector addition-subtraction formula, a triangle can be drawn by the group of vectors composed by the formula," then the operation must be able to draw such a triangle. Specifically, when the learner makes an error, the law that the error preserves is visualized as an
incorrect diagram. The learner performs the operation by superimposing the portions that visualize the components of their answer onto the required relationship. In the vector addition/subtraction example, the learner visualizes each term as a vector based on the equation they have constructed and draws the triangles that are possible with that equation. By manipulating the vectors for the components of their equation and placing them on the drawn triangle, the learner confirms that they constitute an incorrect triangle. We expect that doing this will make them aware of the relationship between their solution and the diagram they visualized and cause them to proactively reflect on their solution. We expect that this method will become a general method that will be used in many fields, but in this paper we focus specifically on vector composition in mathematics. Requiring the learner to manipulate the figure guarantees that it will be part of their problem-solving process, which has been learner-dependent in studies up until now. In other words, the learner can reflect on the difference between the correct figure and that generated from their own solution by directly manipulating it. In this study, manipulation indices were presented to the learner as explicit steps for manipulating figures. Manipulation indices are the criteria by which the learner performs the manipulation of a shape. The operation index is the final goal of the manipulation, and it is left to the learner to decide how to perform the manipulation to achieve it. Letting the learner decide allows them to be proactive in the process of manipulating the shapes, which will help them independently discover why their solution is wrong. Manipulation indices are presented in the form of certain activities that model the laws and knowledge required to solve the problem in question. 
The Triangle Law of vector addition states that triangles can be formed from the vectors on either side of expressions for vector addition. As an example, suppose that we are given a regular pentagon ABCDE and told to determine $\overrightarrow{XX}$ for $\overrightarrow{AC} = \overrightarrow{XX} + \overrightarrow{XX}$ using point B. If the correct answer to this problem is $\overrightarrow{AC} = \overrightarrow{AB} + \overrightarrow{BC}$, then a triangle can be formed from vector $\overrightarrow{AC}$ on the left side and the two vectors $\overrightarrow{AB}$ and $\overrightarrow{BC}$ on the right side, which is demonstrated in Fig. 1. Triangles like this will function as the operational indicator for problems focusing on vector composition. The learner manipulates the visualized vectors until they fit into the triangle corresponding to the correct solution, thereby allowing them to learn the proper vector addition rule. Furthermore, if the learner gives an incorrect answer, then the triangle formed will not correspond to the correct answer indicator. In the above problem for example, when the learner incorrectly answers $\overrightarrow{AC} = \overrightarrow{AB} + \overrightarrow{CB}$, the visualization and manipulation indexes shown in Fig. 2 are possible. In this case, it is possible to form a triangle with $\overrightarrow{AC}$, $\overrightarrow{AB}$, and $\overrightarrow{CB}$ by manipulating the visualized vectors (upper part of Fig. 2), but it is not possible to fit the triangle with $\overrightarrow{AC}$, $\overrightarrow{AB}$, and $\overrightarrow{BC}$, which is the correct triangle (lower part of Fig. 2). It is in this way that the learner comes to recognize the errors in their solution and, through trial and error, learns how to formulate the correct solution.
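The pentagon example can be checked numerically with the following minimal Python sketch. The coordinates (a regular pentagon inscribed in a unit circle with A at the top) and all function names are assumptions made for illustration; the source does not specify how the system represents the figure.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def regular_pentagon(radius: float = 1.0) -> Dict[str, Point]:
    """Vertices A..E of a regular pentagon, counter-clockwise, A at the top."""
    return {
        name: (radius * math.cos(math.pi / 2 + k * 2 * math.pi / 5),
               radius * math.sin(math.pi / 2 + k * 2 * math.pi / 5))
        for k, name in enumerate("ABCDE")
    }


def vec(points: Dict[str, Point], name: str) -> Point:
    """Vector named by two vertex letters; 'AB' points from A to B."""
    (x0, y0), (x1, y1) = points[name[0]], points[name[1]]
    return (x1 - x0, y1 - y0)


def add(u: Point, v: Point) -> Point:
    """Component-wise vector sum."""
    return (u[0] + v[0], u[1] + v[1])


def satisfies_triangle_law(points: Dict[str, Point], lhs: str,
                           rhs1: str, rhs2: str, tol: float = 1e-9) -> bool:
    """True if lhs equals rhs1 + rhs2 as vectors, within a tolerance."""
    total = add(vec(points, rhs1), vec(points, rhs2))
    target = vec(points, lhs)
    return math.hypot(total[0] - target[0], total[1] - target[1]) < tol


pentagon = regular_pentagon()
correct_ok = satisfies_triangle_law(pentagon, "AC", "AB", "BC")
learner_ok = satisfies_triangle_law(pentagon, "AC", "AB", "CB")
```

With these assumptions, `correct_ok` is true and `learner_ok` is false, which corresponds to the learner's vectors failing to fit the triangle for the correct answer.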
Fig. 1. The operational index for vector composition.
Fig. 2. Operational indicators in vector composition for erroneous answers.
4 Development Systems
Figure 3 shows the screen displayed by the implemented system. Based on the learning support system for vectors described in Sect. 2.2 [10], this system provides an environment in which learners can learn semi-autonomously by requesting explicit operations from them. In addition, the target range of this system is vector composition in mathematics. This allows the system to request explicit manipulation from the learner as a presentation of the manipulation index, since the model to be presented as the manipulation index is fixed to one. First, the system provides the learner with a problem statement and asks them to find its solution. In conventional learning support systems, the system simultaneously presents the figure that will be generated when the correct answer is given. The learner constructs a solution while attempting to match the presented operation index. Once they input an answer, the system presents both the operation indices based on that answer and the operation indices for the correct answer. The system then asks them to manipulate two figures, the manipulation indicator for the learner’s answer and that for the correct answer.
In the example shown in Fig. 3, the learner was given a regular pentagon ABCDE and told to determine $\overrightarrow{XX}$ for $\overrightarrow{AC} = \overrightarrow{XX} + \overrightarrow{XX}$ using point B. Then, the triangles formed by the vectors $\overrightarrow{AC}$, $\overrightarrow{AB}$, and $\overrightarrow{BC}$, which are generated when the correct answer to this question is $\overrightarrow{AC} = \overrightarrow{AB} + \overrightarrow{BC}$, are presented as the operation indices. We believe that this system can expand the learner’s search space more than conventional learning support systems, which only provide answers based on the correct figure.
Fig. 3. Screen output from the proposed system.
Fig. 4. The learner’s operating index (correct answers).
The learner constructs their answer by looking at the question text and the operational indicators, using the combo boxes in the answer input form at the bottom of the screen (Fig. 3). The system visualizes errors by converting the learner’s constructed answers (mathematical expressions) into vectors (figures) (left part of Fig. 4). At the same time, the system also presents the triangles that can be formed by the vectors in the learner’s answer as operation indicators (right part of Fig. 4). Figures 4 and 5 show the operational indices presented when the learner answers the question in Fig. 3. Figure 4 shows the correct answer $\overrightarrow{AC} = \overrightarrow{AB} + \overrightarrow{BC}$, while Fig. 5 shows the incorrect answer $\overrightarrow{AC} = \overrightarrow{AB} + \overrightarrow{CB}$. In Fig. 4, the triangles that can be formed from $\overrightarrow{AC}$, $\overrightarrow{AB}$, and $\overrightarrow{BC}$, which are the same as those that can be formed when the correct answer is given, are presented as the operational index. However, in Fig. 5, the triangles that can be formed from $\overrightarrow{AC}$, $\overrightarrow{AB}$, and $\overrightarrow{CB}$ are clearly different from those that can be formed when the correct answer is given. This visualizes the difference between the correct answer and the learner’s answer. The learner forms a triangle by superimposing the generated vectors on the sides of the triangle, as shown in Fig. 6, targeting such an operation index. We believe that this allows the learner to recognize the relationship between their solution (mathematical formula) and the graph (graphic). In this case, the learner is requested to perform operations on the operation indices both for their answers and for the correct answers. Doing so for their answers teaches them how they relate to the visualization, while doing so for the operation index associated with a correct answer makes them aware that their answer is incorrect, since they cannot apply it to the operation index if that is the case (Fig. 7).
These operations allow the learner to proactively discover why their answer is wrong by manipulating the figure, and to reflect on it. Moreover, since the learner can only move on to the next answer by forming a triangle, we expect this to solve the problem found in conventional learning support systems in which the learner does not check the visualization of the presented error nor reflect on it.
Fig. 5. The learner’s operating index (misinterpretation).
Fig. 6. The output after vector manipulation (correct answer).
Fig. 7. The output after vector manipulation (misinterpretation).
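The fitting step described in this section, in which the learner drags vectors onto the sides of an indicator triangle, can be sketched as follows. This is an illustrative model under stated assumptions rather than the system's actual logic: the coordinates in `POINTS` are hypothetical, the sides of the indicator are assumed to be directed, and dragging is assumed to translate a vector without changing its length or direction.

```python
import math

# Hypothetical coordinates for three pentagon vertices; any non-collinear points would do.
POINTS = {"A": (0.0, 1.0), "B": (-0.95, 0.31), "C": (-0.59, -0.81)}

def displacement(name):
    """Displacement of the directed segment named by two letters, e.g. 'AB'."""
    (x0, y0), (x1, y1) = POINTS[name[0]], POINTS[name[1]]
    return (x1 - x0, y1 - y0)

def can_overlay(vector_name, side_name, tol=1e-6):
    """A translated vector fits a directed side only if both displacements match."""
    vx, vy = displacement(vector_name)
    sx, sy = displacement(side_name)
    return math.hypot(vx - sx, vy - sy) <= tol

def unmatched_vectors(answer, indicator):
    """Return the vectors of the answer that fit no remaining side of the indicator."""
    remaining = list(indicator)
    leftovers = []
    for v in answer:
        match = next((s for s in remaining if can_overlay(v, s)), None)
        if match is None:
            leftovers.append(v)
        else:
            remaining.remove(match)
    return leftovers

correct_indicator = ["AC", "AB", "BC"]
print(unmatched_vectors(["AC", "AB", "BC"], correct_indicator))  # []
print(unmatched_vectors(["AC", "AB", "CB"], correct_indicator))  # ['CB']
```

Under these assumptions, the vector left over for the erroneous answer is the one that cannot be placed on the correct-answer indicator, which corresponds to the situation illustrated in Fig. 7.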
5 Conclusion
In ordinary learning, such as from textbooks or classroom teaching, the instructor (textbook or teacher) will often only give the correct answer and its explanation when a learner gives an incorrect answer, and there is little adaptive feedback for the learner's individual error. If the instructor only gives correct answers and explanations, then the learner will become a passive learner who implicitly accepts the correct answers, which makes it likely that they will simply try to remember the correct answers without fully reflecting on their own errors. Therefore, it is important to create an environment in which learners can fully reflect on why their answers are wrong. Kurokawa et al. [6–8], and the learning support system developed by us [10], visualized errors by converting learners’ answers (mathematical expressions) into graphs (figures). This provided an environment that encouraged learners to think about why their answers were wrong and encouraged trial and error.
However, in previous systems using error visualization, learners were only presented with figures that were visualized by the system until they reflected on their own answers. If they only check the correctness of the given feedback and do not fully reflect on their own answers, then there is a risk that they will only engage in passive learning, which will not lead to deeper understanding and will make it likely that they will make the same mistake in similar problems. For learners to fully reflect on their own answers, they need to recognize the relationship between those answers and the visualized figures. Therefore, it is important to provide specific activities that enable learners to recognize the relationship between their answers and the shapes. In this paper, we propose a learning support system for vector composition in mathematics that requires learners to explicitly manipulate figures, thus allowing them to perform semi-autonomous learning activities. By having the system explicitly request that these figures be manipulated, learners can independently discover why their answers are wrong and can reflect on their answers. In our system, errors are first visualized by converting the learner’s answers (mathematical formulas) into figures (vectors) [10]. A triangle can then be formed and is presented to the learner as an operation indicator. The operation index is a model of the laws that exist in the problem given by the system to the learner, and it is the criterion for how to perform the operation. The Triangle Law of vector composition states that a triangle can be formed from the components on either side of a vector addition expression. The learner manipulates the vectors generated by the system by dragging and moving them, overlapping them with the triangles formed by the vectors in their solution, which is presented as a manipulation index. This is the basis of the trial-and-error process of this system. 
We believe that this allows the learner to recognize the relationship between their answers and the graphs. Next, a triangle that can be formed from the correct answer is presented as a manipulation index. If the learner’s answer is incorrect, then their triangle will not match that for the correct answer. This allows the learner to proactively discover why their solution is incorrect by manipulating the figure and to reflect on it. In future research, we will conduct evaluation experiments to verify the learning effect of this system. We will also explore the extent to which the method of presenting operation indicators can be applied. Acknowledgements. This work was supported by JSPS KAKENHI Grant Numbers JP22K12322, JP21H03565, and JP20H01730.
References
1. Tomoya, H., Tsukasa, H.: Simulation-based learning environment for assisting error-correction management of error-based simulation considering the cause of errors. Japan. Soc. Artific. Intell. (Japanese edition) 17(4), 462–472 (2002). (in Japanese)
2. Hirashima, T., Horiguchi, T., Kashihara, A., Toyoda, J.: Error-based simulation for error-visualization and its management. Int. J. Artific. Intell. Educ. 9(1–2), 17–31 (1998)
3. Hirashima, T., Shinohara, T., Yamada, A., Hayashi, Y., Horiguchi, T.: Effects of error-based simulation as a counterexample for correcting MIF misconception. In: André, E., Baker, R., Hu, X., Rodrigo, M.M.T., du Boulay, B. (eds.) AIED 2017. LNCS (LNAI), vol. 10331, pp. 90–101. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61425-0_8
4. Hirashima, T., Imai, I., Horiguchi, T., Tomoto, T.: Error-based simulation to promote awareness of error in elementary mechanics and its evaluation. In: Proceedings of International Conference on Artificial Intelligence in Education, pp. 409–416 (2009)
5. Horiguchi, T., Imai, I., Toumoto, T., Hirashima, T.: Error-based Simulation for error-awareness in learning mechanics: an evaluation. Educ. Technol. Soc. 17(3), 1–13 (2014)
6. Kai, K., Takahito, T., Tomoya, H., Tsukasa, H.: Development and evaluation of a learning support system with functions for conversion of expression and facilitation of active error-awareness in locus problem in mathematics. Inst. Electron. (Japanese edition) J101-D(6), 864–873 (2018). (in Japanese)
7. Kai, K., Takahito, T., Tomoya, H., Tsukasa, H.: Development of learning support system with drawing interface and conversion function from graphic to symbolic sentence in locus problem. The Japan. Soc. Artific. Intell. (Japanese edition) 9–14 (2018). (in Japanese)
8. Kai, K., Takahito, T., Tomoya, H., Tsukasa, H.: Development of a mathematical solution environment to understand symbolic expressions in mathematics. In: Proceedings of Human Interface and the Management of Information: HIMI2018, pp. 288–299
9. Tadao, N.: Research on Constructive Approaches in Arithmetic and Mathematics. Seibunsha (1995)
10. Tomohito, J., Nonoka, A., Takahito, T.: Development of a learning support system with error visualization for learning for vector in mathematics. Japan. Soc. Artific. Intell. (Japanese edition) 1–6 (2022). (in Japanese)
Instructional Design of a VR-Based Empathy Training Program to Primary School Children
Meng-Jung Liu1(B), Chia-Hui Pan2, and Le-Yin Ma1
1 Department of Special Education, National Kaohsiung Normal University, Kaohsiung, Taiwan
[email protected]
2 Department of Educational Psychology and Counseling, National Taiwan Normal University, Taipei, Taiwan
Abstract. A variety of empathy training programs have been designed to explicitly teach empathy. These programs are generally based on perspective-taking, and aim to help participants feel others’ emotions, understand them, and regulate their own feelings. Virtual reality (VR) allows users to have an “embodied experience”. Embodied technology may be a key feature of VR that allows users to practice and improve their cognitive empathy skills, specifically perspective taking. Research that has included the use of VR interventions has found increased empathy towards people with special needs. In the present study, we used VR to develop an empathy curriculum for primary school students. The study applied the instructional design principles of the ADDIE model, including analysis, design, development, implementation, and evaluation phases, to develop an empathy training program that helps educators meet learning objectives through iterative user feedback and instructional designers’ reflection. The study recruited seven interdisciplinary team members with experience in empathy curriculum design, class teaching, academic research, school counseling, and instructional design. The main instructional content includes eight VR videos presenting four common school conflicts in first-person and bystander view, and an interactive function at the end of the VR videos for students to choose a voluntary response in a particular conflict situation. The class instruction consists of four sections: VR experiencing, self-reflection, group discussion, and a group activity. The results show that the VR-based empathy training program highly engages students and effectively helps them change perspectives and raise empathic concern. The design of this program sheds light on how to integrate VR technology into classroom teaching to meet instructional goals.
Keywords: Virtual reality · Empathy training · Perspective-taking · Empathic concern · ADDIE · Instructional design
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 274–285, 2023. https://doi.org/10.1007/978-3-031-35129-7_20
1 Introduction
Empathy is composed of both emotional and cognitive factors, whose integration contributes to sharing and understanding another person’s perspective, and it plays a key role in preserving human social relationships. Researchers have different opinions on the definition and core competence of empathy. Some consider that empathy consists of two elements, sharing and understanding the feelings of others (Buffel du Vaure et al. 2017); others emphasize the ability to understand other people’s feelings or thoughts and to understand other people’s perspectives (Hsieh et al. 2013), or understanding others’ emotional experience and responding to it (Preis et al. 2013). Rodriguez (2013) conceptualized empathy as emotional responses and perspective taking. Williams et al. (1996) considered empathy as the ability to understand the feelings of others, to communicate emotionally with other people’s experiences, and to take over the perspective of others. Gladstein (1983) proposed that empathy should be regarded as a multi-level interpersonal interaction process, involving emotional infection, emotional recognition, and role-taking, and divided empathy into cognitive and affective domains. Davis (1983) compiled the Interpersonal Reactivity Index (IRI) scale, which includes four parts: perspective-taking, fantasy, empathic concern, and personal distress. Perspective-taking in the scale is considered to represent cognitive empathy, while empathic concern is considered to represent affective empathy. Modern research, begun in the 1950s, reveals three types of perspective taking. The first is self-focused: imagining that the stimuli impinging on the victim are impinging on oneself evokes an empathic response, which can be enhanced by association with similar events in one’s own past. The second is other-focused; it consists of attending to the victim’s feelings, current life condition, and behavior in similar situations. This may be more cognitive than affective empathy, except when the victim is present and affect is recruited from preverbal modes activated by the victim’s face, voice, and posture. The third type of perspective taking focuses on both self and other; it consists of co-occurring, parallel processes that benefit from the emotional intensity of self-focused and the sustained attention to the victim of other-focused perspective taking (Hoffman 2008).
Cognitive empathy, like many other skills, benefits from practice, and that practice should occur within the zone of proximal development (Vygotsky 1978). Scholars have described empathy as a muscle, and as such it should be capable of growth and even regeneration with sufficient effort (Konrath et al. 2011). A variety of empathy training programs have been designed to explicitly teach empathy. These programs are generally based on taking the perspective of someone else with the aim of feeling the other’s emotions, understanding them, and regulating one’s own feelings (Lam et al. 2011). Virtual reality (VR) has been claimed to be “the ultimate empathy machine” (Ted 2015). VR allows users to see and hear as if they were experiencing someone else’s point of view in the real world, in other words, to have an “embodied experience” (Ahn et al. 2013). Embodied technology may be a key feature of VR that allows users to practice and improve their cognitive empathy skills. There are three possible strategies that can be used to alter bodily self-consciousness by using VR: (1) mindful embodiment, which consists in the modification of the bodily experience by facilitating the availability of its content in the working memory; (2) augmented embodiment, which is based on the enhancement of bodily self-consciousness by altering/extending its boundaries; and (3) synthetic embodiment, which aims to replace one’s own body with synthetic self-consciousness (Riva et al. 2016). VR could facilitate cognitive empathy, specifically perspective taking, in challenging situations by offering a compelling experience of what it feels like to walk in someone else’s shoes (Ahn et al. 2013). Empathic distress is a multidetermined and hence reliable prosocial motive (Hoffman 2000). Some VR-based empathy training programs have found increased empathy towards patients with specific conditions such as Alzheimer’s disease and hearing and vision loss among carers and medical and nursing students (Dyer et al. 2018). In the present study, we applied VR to develop an empathy training program for primary school students.
2 Methods
2.1 Participants
In school settings, special education teachers, counseling teachers, and regular education teachers often cooperate to help students with emotional difficulties, interpersonal conflicts and communication issues. To design the VR-based empathy training program, the design team comprised seven interdisciplinary members with experience in empathy curriculum design, class teaching, academic research, school counseling, and instructional design. One of them, a professor from the department of special education at a university, led the team in developing creative ideas and the research design. One team member was the project manager, with a background in instructional technology. There were also two special education teachers, two counseling teachers, and one regular education teacher in the team. The VR-based empathy training program was introduced to 10 classes of 4th-grade students (aged 9 to 10) from 5 elementary schools in Kaohsiung city, Taiwan. There were 137 male students and 121 female students who participated in the study. Among the 5 elementary schools, 3 were located in the urban area while the other 2 were located in the suburban area. All students provided signed parental permission and personal consent forms before the intervention.
2.2 Training Program Development
Applying instructional design principles to the development of an empathy training program helps educators meet learning objectives through iterative user feedback and instructional designers’ reflection (Wang & Hsu 2008). Among many instructional design models, we applied the ADDIE model because it offers a well-defined systematic framework and is a widely used approach (Peterson 2003). Most importantly, the ADDIE model is a learner-centered approach that makes the instructional delivery relevant to the learning context and feasible to implement (Peterson 2003).
The ADDIE model consists of five phases: analysis, design, development, implementation, and evaluation. Given that the ADDIE model is a cyclical process evolving over time, the development in each phase is summarized in the following sections.
Analysis Phase
The analysis phase is fundamental in the ADDIE model for understanding the needs of both students and teachers and for investigating available instructional content. Accordingly, instructional goals are established and the amount of instruction is determined (Peterson 2003).
Interviews with 15 elementary teachers, including homeroom teachers, special education teachers, and school counsellors with over 10 years of teaching experience, were conducted to explore issues of bullying behavior and the implementation of bullying intervention programs in elementary schools in Taiwan. Most teachers argued that interpersonal conflicts were an important early signal of bullying behavior and suggested that intervening in interpersonal conflicts is more effective than intervening in school bullying. In addition, a lack of systematic instructional content makes teachers less effective in preparing students to solve interpersonal problems, which contributes to misunderstandings and conflicts in the classroom. The conventional lecture on the importance of empathy and perspective-taking is too dogmatic and tedious for students and thus fails to address this issue. Among primary students, 4th-grade students are recommended to start with the empathy training program since they have the cognitive ability of self-other distinction. According to Bailey & Bailenson (2009), children under age 7 may not have fully developed a sense of self and probably have difficulty understanding how the self is presented in VR. Meanwhile, 4th-grade students are about to deal with complex social relationships in their preadolescence. Therefore, 4th-grade students were the target of the pilot project of empathy training. The insights learned from these interviews reshaped the goal of the training program from bullying intervention to bullying prevention, which focused on developing empathic concern and perspective-taking for primary students. It is intended to be a school-based training program providing each student with opportunities to learn how to interpret others’ intentions correctly and to respond appropriately when encountering interpersonal conflicts.
Most importantly, the empathy training program needs to engage 4th-grade students to learn and internalize the knowledge and skills so that they are able to deal with real-world situations.
Design Phase
The instructional designer plays an important role in this phase: identifying instructional objectives, determining how the objectives are to be met, and deploying appropriate instructional strategies (Peterson 2003). Experiential learning is adopted as the pedagogical approach to engage students and connect to individual experiences so that students will be able to learn different perspectives. In a formal 40-min class for 4th-grade students, there are four sections: experiencing a common school conflict situation, self-reflection, class discussion, and a group activity. It is a challenge to keep the event experience personal in an open classroom setting without the dominant influence of opinion leaders. Another challenge is to design an effective way of demonstrating different perspectives toward the same situation so that students experience different viewpoints and the consequences of their responses. Hoffman’s empathy theory was employed in the curriculum design to develop empathy for 4th-grade students. According to Hoffman (2000), there are five arousal patterns: mimicry, classical conditioning, direct association, mediated association, and role taking. This program focuses on direct association and role taking. Direct association refers to feeling another’s emotion in a specific situation by associating it with one’s own similar past experiences (Hoffman 2000). Our hypothesis is that the more experiences students have, the more likely they are to feel empathic concern and engage in prosocial behaviors. The conventional teacher-led instruction and text-based content is not engaging and vivid enough to trigger emotions and similar experiences. Therefore, it is a critical design task to simulate real-world situations in the classroom where students feel safe to experience complex social conflicts. Another arousal pattern addressed in this training program is role taking, which is a way to think about someone else’s feelings in a particular situation. In other words, one has to change perspective to see things from other people’s situation so that one can feel the same way as the others and better understand their intentions and actions. Accordingly, the instructional design of the training program has to meet the pedagogical goals in terms of perspective-taking and authentic experiences. Virtual reality (VR) technology was introduced to address the above-mentioned instructional needs and to design a virtual environment where students may safely experience complex social situations, choose a voluntary response, and deal with the consequences. VR is an effective medium that delivers immersive and interactive experiences in a well-structured simulation environment (Liu et al. 2023). In addition, different perspectives toward the same social conflict are provided so that students experience both the first-person perspective and the bystander perspective. For example, in the story of a character who has Prader-Willi syndrome (PWS) and is often criticized for his chubby appearance, students taking the first-person role face classmates’ mockery and exclusion from the class. On the other hand, when experiencing the bystander role, students see the character suffering from hunger pangs and loneliness at home. Students experience the same conflict twice, once in the first-person role and once in the bystander role.
The VR videos efficiently bring the whole class to experience the same situation at the same time and thus create a teachable moment for changing perspectives and understanding the consequences of their response in the particular situation. Assessment is another critical facet of this design phase, including what to measure, how to measure, and the format of measurement (Peterson 2003). Formative assessment is adopted to continuously monitor students’ feedback and learning progress. Students have to complete a 3-page worksheet during each class and fill out a survey form at the end of each class. The data collected not only inform students’ learning but also provide feedback for teachers to improve their teaching and instructional content design.
Development Phase
The main instructional content includes eight VR videos presenting four common school conflicts (3 min for each video) and an interactive function at the end of the VR videos for students to choose a voluntary response in a particular conflict situation. Instructional tools also include PowerPoint slides for classroom instruction, VR equipment for the whole class, and worksheets for students. The VR videos were collaboratively developed by the design team and an animation company. Mobile VR is deployed in this program because of its affordability and mobility. It is worth mentioning that most primary schools in Taiwan may not have enough wireless Internet bandwidth for real-time streaming. The eight VR videos are therefore integrated in a mobile app, which is installed on the mobile phones before classes. The mobile app is easy to use and works smoothly offline. The empathy training program consists of 8 classes (40 min for each class) presenting four themes; each theme describes a typical school conflict from both the first-person (see Fig. 1) and the bystander (see Fig. 2) perspective. When students take the first-person role, unlike the bystander viewpoint, they experience what the main character has experienced in the VR videos. For example, one of the themes describes a friendly boy with Prader-Willi syndrome (PWS), who would like to make friends with others but is often mocked by his classmates for his chubby appearance. In the simulation environment, the viewer in the first-person view sees two classmates right in front of them, looking directly at the viewer with contempt and complaining about his body odor in a sarcastic way (see Fig. 1). Even those who have had few experiences similar to the main character’s would immediately sense the disrespect and tense interaction in the VR environment. A sense of presence in the VR environment makes the mockery vivid so that the viewers receive authentic experiences. On the other hand, when a student takes the bystander view, he/she witnesses what happened to the main character in the VR videos. In the simulation world, the viewer, standing near the boy, watches two classmates with undisguised contempt telling the boy that he is smelly and asking him to go away (see Fig. 2). At the end of the bystander-perspective videos, the viewers are given three alternatives for responding to what they witness and then watch the corresponding consequences.
Fig. 1. The first-person view
Fig. 2. The bystander (third-person) view
Implementation Phase
The class instruction of the pilot classes consists of four sections: VR experiencing, self-reflection, group discussion, and a group activity. At the beginning of each class, each student is immersed in a VR video simulating a social conflict and writes down their personal emotions and thoughts afterward. It is a valuable teachable moment to transform emotion into empathic concern through changing perspective. Sharing and communicating different viewpoints without judgement is the cornerstone of the instruction. Students who share their feelings or thoughts in the class receive tangible rewards right after class. There is a group activity at the end of the class to highlight the class takeaways and reinforce students’ learning experiences. The structure of the empathy training program is shown in Fig. 3.
Fig. 3. Structure of the empathy training program
Before the formal teaching, two trial teachings were conducted to improve the design of the training program. An important lesson learned from the trial classes is that students are easily distracted by classroom rewards. Classroom rewards made students compete for teachers’ attention and left them unable to reflect on themselves and listen to others. Therefore, no classroom reward is deployed in this program. Another lesson learned is that teachers may miss some critical clues in the VR videos and thus fail to elaborate on the intentions and responses to the conflict in terms of the first-person view and the bystander view. Accordingly, lesson plans describing the clues in the VR videos and appropriate inquiries are provided for instructors, for example: “Did you see any difference between the first-person view and the bystander view? Why is that?”
3 Results
The ADDIE model is an iterative process that continuously collects feedback from teachers and students to improve instructional content and pedagogy. The feedback collected in the implementation and evaluation phases is described below.
3.1 Implementation Phase
The VR-based empathy training program was introduced to 10 classes of 4th-grade students from 5 primary schools in Taiwan from March to May, 2017. Ten elementary teachers with over eight years of teaching experience were trained as course instructors and conducted the courses every week for 258 students in total. Each training course requires two teachers: one is in charge of teaching, class management, and class discussion, while the other assists with the use of VR equipment and group discussion. In order to strictly control the internal validity of the teaching process, the instructional procedures and activities used by the teachers were completely consistent (Table 1). Instructors’ weekly reflection reports were collected and a weekly meeting was held to gather instructors’ and students’ feedback for modifying the training program. Some instructors reported that students were confused in the first two classes and had difficulty identifying the difference between the first-person view and the bystander view. Thus, more instruction is provided before the VR video so that students better understand which perspective they are taking. Moreover, classroom time management is a challenge for instructors, who are busy transitioning from one activity to another and may rush the classroom discussion in order to complete the lesson. The design team decided to simplify the worksheets and group activities to leave adequate time for classroom discussion. Also, the class feedback survey was collected every two weeks instead of weekly. Finally, student feedback data show that students requested longer VR videos for deeper immersion and for practicing perspective change. However, each VR video remained 3 min long because the design team and course instructors agreed that merely increasing the sense of immersion is not enough to effectively develop empathy in 4th-grade students.
Table 1. Consistent class procedure in each class.
Section | Activity | Time | To-do
1 | A 3-min VR video presenting a common school conflict situation | 8–10 min | To distribute and take back VR equipment for the whole class
2 | Self-reflection | 5 min | Students write down their emotion and thoughts without any discussion
3 | Group sharing and discussion | 15 min | Teachers facilitate class discussion
4 | Group activity | 10 min | A group activity to reinforce the class takeaway
3.2 Evaluation Phase
The evaluation phase is multidimensional and can take place during the development phase or the implementation phase with the assistance of teachers and students (Peterson 2003). The primary task in this phase is to decide whether the instructional delivery meets the instructional objectives. The mission of the design team was to create an engaging empathy training program for 4th-grade students by evoking empathic concern and encouraging perspective change. The teacher feedback collected from both homeroom teachers and course instructors supports the effectiveness of the training program and its maintenance effect. The weekly reflection reports of course instructors summarized the influences of the training program: (i) The motivation of students to participate was improved. For example, teacher A mentioned: “Students are more able to control their inattentive behaviors and to participate in the class”; teacher B mentioned: “Students do not need additional reinforcement and more likely volunteer to help teachers in classroom settings.” (ii) The expression of emotions and the sharing of experiences were richer.
For example, teacher A mentioned: “Students were willing to share their emotion after watching VR stories” Teacher C mentioned: “In terms of emotion recognition and expression, students can better recognize the reasons behind the emotions.“ Teacher E mentioned: “Students can empathize the main characters’ emotions in the VR videos”. (iii) Improving the ability of perspectivetaking. For example, teacher B mentioned: “Empathic concerns were evoked to the boy with Prader-Willi Syndrome in the VR video because students discovered the unseen story by changing perspective.“
M.-J. Liu et al.
Based on the post survey for homeroom teachers, all of them agreed that classroom dynamics had improved. Furthermore, 87.6% of the teachers agreed that the improvement persisted for more than one month. For the few students who had social difficulties, 87.5% of the teachers indicated that other classmates showed more empathic concern and respect toward them. 87.5% of the teachers were willing to continue this program in the coming semester (the reasons not to continue include a tight schedule in the next semester and no longer being a homeroom teacher), and 100% of the teachers would recommend this program to other teachers. In addition, all of the teachers considered the training program appropriate for middle primary students.
The student feedback indicated that this VR-based empathy training program engaged students and developed their ability to change perspective. The data collected from students showed that students with or without experiences similar to the school conflicts in the VR videos understood the cause and effect in different conflict situations. One third of the students had had social conflict experiences similar to those presented in the VR videos, while over 80% of the students without similar experiences fully comprehended the situation and the intentions of the characters from the first-person and bystander perspectives. In other words, students effectively learned different perspectives through the instructional content even though they had never encountered such situations. It is worth noting that 70% of students reported deep immersion and a sense of presence. Words alone cannot convey complex emotions and dynamic social interactions. The VR simulation not only provides an authentic and interactive environment but also readily evokes emotions in individuals with or without similar past experiences. Moreover, 91% of the students agreed that they had adequate time for class sharing and discussion. Class sharing and discussion is the critical step in moving from emotions to empathic concern by taking others’ perspectives. This program gives students the opportunity to safely expose themselves to complex social situations and to practice taking different perspectives. Learning motivation stands out from the data analysis in terms of participation (96%), enjoyment (96%), and anticipation (97%).
4 Discussion and Conclusion
The aim of the school-based empathy training program is to engage students in the development of empathy by evoking empathic concern and changing perspectives in school conflict contexts. To create a teachable moment for the whole class, the VR videos provide deep immersion and interactivity in both first-person view and bystander view so that each student can easily connect to their own experiences and emotions; even those who are new to a specific school conflict can better understand how other people feel in that situation. The better students are able to understand and share others’ feelings, the less prejudice and conflict there is and the more positive their social relationships are (Feshbach & Feshbach 2009). In addition, discussing different viewpoints and sharing emotions is a catalyst for changing perspectives and seeing things from other people’s position. Being able to take someone else’s point of view increases levels of empathy (Feshbach & Feshbach 2009).
The development of empathy should take students’ age and social context into consideration. It is appropriate for 4th-grade students to take the VR-based empathy training
Instructional Design of a VR-Based Empathy Training Program
program. First, they are facing more complex social situations in their preadolescence and need to equip themselves with better empathic skills. Second, they are cognitively mature enough to understand how the self is represented in the VR environment and to understand perspective changes. Children aged 7 to 11 have increased cognitive skills, which promote deeper perspective taking (Hoffman 2000). Finally, learning from authentic experiences engages students and prepares them to respond appropriately when solving real-world problems. The locally developed program readily relates to students’ everyday life at school, whether or not they have experienced a specific social situation. Furthermore, compared to lecture-based instruction, the VR-based training program provides a sense of presence and embodiment. Students can also respond to the characters and see the consequences of their choices in the VR videos. The immersive experience and interactivity engage students in learning and help them internalize what they learn.
How to effectively develop empathy in pupils has long been a concern of, and has been investigated by, educators and researchers (Konrath et al. 2011; Lam et al. 2011). The instructional design of the program addresses this issue through systematic teaching based on evidence-based theories of empathy development and on structured instruction procedures and content. Among the five empathy arousal patterns proposed by Hoffman (2000), this program focuses on direct association and role taking, in consideration of students’ cognitive ability. The deep immersion provided by VR videos helps students associate the simulated context with similar past experiences, so that they spontaneously feel others’ emotions. However, feeling other people’s emotions is not enough to demonstrate empathic behaviors. Emotional empathy can be automatically aroused when immersing oneself in the VR environment, while cognitive empathy requires more effortful engagement (Martingano et al. 2021). Accordingly, students in this training program are required to experience two different views in the VR videos: first-person and bystander. Moreover, class discussion and sharing facilitated by trained instructors play an important role in transferring emotions to empathic concern by taking others’ perspectives.
The VR-based training program in this study is the first, but hopefully not the last, school-based empathy development program for primary students in Taiwan. There are several limitations in the instructional design. First, there are too many activities in each class, so both instructors and students are busy transitioning from one activity to another. Although some activities were removed in the development phase, the scope and sequence of each lesson were apparently still not streamlined enough for instructors and students when the courses took place in the implementation phase. Such a tight schedule may dampen learning outcomes. Second, to effectively facilitate VR-based courses, instructors need to be confident in using technology and to equip themselves with basic skills for troubleshooting VR equipment. Having little knowledge of and training in VR technology may be one of the reasons that teachers are less motivated to adopt this program. Third, another common reason for not implementing this program is the lack of VR equipment. Purchasing VR equipment for a whole class is too costly for many primary schools. Fourth, the development of empathy is important for pupils but not mandatory in the school curriculum. Therefore, teachers are less motivated to implement this program in their tight class schedules. Integrating the empathy training with the current school curriculum is one direction for future work. Fifth, there are other
facets of social problem-solving that need to be incorporated into this training program. Educators may develop more engaging and effective training courses in the future, covering topics such as emotion communication, emotion regulation, coping strategies, and helping behaviors. Lastly, the efficacy of the training program needs to be investigated with a rigorous and thorough experimental design. The demand from educators for evidence-based empathy training programs demonstrates the priority of, and the opportunity for, future research.
Acknowledgement. The research was supported by a grant from the Hon Hai Education Foundation.
References
Ahn, S.J., Le, A.M.T., Bailenson, J.: The effect of embodied experiences on self-other merging, attitude, and helping behavior. Media Psychol. 16(1), 7–38 (2013). https://doi.org/10.1080/15213269.2012.755877
Bailey, J.O., Bailenson, J.N.: Immersive virtual reality and the developing child. In: Cognitive Development in Digital Contexts, pp. 181–200. Academic Press (2017)
Buffel du Vaure, C., et al.: Promoting empathy among medical students: a two-side randomized controlled study. J. Psychosom. Res. 103, 102–107 (2017). https://doi.org/10.1016/j.jpsychores.2017.10.008
Davis, M.H.: Measuring individual differences in empathy: Evidence for a multidimensional approach. J. Personal. Soc. Psychol. 44(1), 113–126 (1983). https://doi.org/10.1037/0022-3514.44.1.113
Dyer, E., Swartzlander, B.J., Gugliucci, M.R.: Using virtual reality in medical education to teach empathy. J. Med. Lib. Assoc. 106, 498 (2018). https://doi.org/10.5195/JMLA.2018.518
Feshbach, N.D., Feshbach, S.: Empathy and education. In: Decety, J., Ickes, W. (eds.) The Social Neuroscience of Empathy. MIT Press Scholarship (2009). https://doi.org/10.7551/mitpress/9780262012973.003.0008
Gladstein, G.A.: Understanding empathy: Integrating counseling, developmental, and social psychology perspectives. J. Counsel. Psychol. 30(4), 467–482 (1983). https://doi.org/10.1037/0022-0167.30.4.467
Hoffman, M.L.: Empathy and Moral Development: Implications for Caring and Justice. Cambridge University Press, New York (2000)
Hoffman, M.L.: Empathy and prosocial behavior. In: Lewis, M., Haviland-Jones, J.M., Barrett, L.F. (eds.) Handbook of Emotion, pp. 440–455. The Guilford Press (2008)
Hsieh, S., Irish, M., Daveson, N., Hodges, J.R., Piguet, O.: When one loses empathy: its effect on carers of patients with dementia. J. Geriatric Psychiatry Neurol. 26, 174–184 (2013). https://doi.org/10.1177/0891988713495448
Kang, S.R., Kim, S.J., Lee, J.: Development of a violence prevention educational program for elementary school children using empathy (VPEP-E). Child Health Nurs. Res. 26(4), 422 (2020). https://doi.org/10.4094/chnr.2020.26.4.422
Konrath, S.H., O’Brien, E.H., Hsing, C.: Changes in dispositional empathy in American college students over time: a meta-analysis. Personal. Soc. Psychol. Rev. 15(2), 180–198 (2011). https://doi.org/10.1177/1088868310377395
Lam, T.C.M., Kolomitro, K., Alamparambil, F.C.: Empathy training: methods, evaluation practices, and validity. J. Multidiscipl. Eval. 7(16), 162–200 (2011). https://journals.sfu.ca/jmde/index.php/jmde_1/article/view/314
Liu, Y.L., Chang, C.Y., Wang, C.Y.: Using VR to investigate bystander behavior and the motivational factors in school bullying. Comput. Educ. 194 (2023). https://doi.org/10.1016/j.compedu.2022.104696
Martingano, A.J., Hererra, F., Konrath, S.: Virtual reality improves emotional but not cognitive empathy: a meta-analysis. Technol. Mind Behav. 2(1) (2021). https://doi.org/10.1037/tmb0000034
Peterson, C.: Bringing ADDIE to Life: instructional design at its best. J. Educ. Multimedia Hypermedia 12(3), 227–241 (2003). https://www.learntechlib.org/primary/p/2074/. Retrieved 8 Feb 2023
Preis, M.A., Schmidt-Samoa, C., Dechent, P., Kroener-Herwig, B.: The effects of prior pain experience on neural correlates of empathy for pain: an fMRI study. Pain 154, 411–418 (2013)
Riva, G., Baños, R.M., Botella, C., Mantovani, F., Gaggioli, A.: Transforming experience: the potential of augmented reality and virtual reality for enhancing personal and clinical change. Front. Psych. 7, 164 (2016). https://doi.org/10.3389/fpsyt.2016.00164
Rodriguez, C.M.: Analog of parental empathy: association with physical child abuse risk and punishment intentions. Child Abuse Neglect 37, 493–499 (2013). https://doi.org/10.1016/j.chiabu.2012.10.004
Ted: How virtual reality can create the ultimate empathy machine (2015). https://www.youtube.com/watch?v=iXHil1TPxvA&ab_channel=TED
Vygotsky, L.S.: Mind in Society: The Development of Higher Psychological Processes. Harvard University Press (1978)
Wang, Hsu, H.-Y.: Using the ADDIE model to design second life activities for online learners. Tech Trends 53(6), 76–81 (2009). https://doi.org/10.1007/s11528-009-0347-x
Williams, L.K., Palomares, S., Schilling, D.: Caring and Capable Kids: An Activity Guide for Teaching Kindness, Tolerance, Self-Control and Responsibility. Innerchoice Publishing (1996)
Yu, S.J., Hsueh, Y.L., Sun, C.Y., Liu, H.Z.: Developing an intelligent virtual reality interactive system based on the ADDIE model for learning pour-over coffee brewing. Comput. Educ. Artific. Intell. 2, 100030 (2021). https://doi.org/10.1016/j.caeai.2021.100030
Classroom Practice Using a Code-Sharing Platform to Encourage Refinement Activities
Shintaro Maeda1(B), Kento Koike1, and Takahito Tomoto2
1 Graduate School of Engineering, Tokyo Polytechnic University, Atsugi, Kanagawa, Japan
[email protected]
2 Faculty of Engineering, Tokyo Polytechnic University, Atsugi, Kanagawa, Japan
https://shintaro.maeda.app/
Abstract. When learning programming, it is important to improve one’s own code. In this study, we have proposed an evaluation function that uses robot programming and a ranking system based on evaluations, and have developed and evaluated a code-sharing platform that allows users to learn only code with similar ranks. Although the evaluation confirmed a certain learning effect of the developed system in an experimental environment with a small number of participants, it is desirable to increase the number of participants in an environment where the code is shared. Therefore in this paper, we conducted a class for second-year university students and verified whether the same learning effects could be obtained in a large-group environment. The evaluation results suggest that the learning effect and the motivation to learn are enhanced in the same way as in a small-group environment.
Keywords: Behavior Visualization · Learning Programming · Ranking · Code Reading · Practice Use
1 Introduction
When learning programming, the structure of code is rarely focused on in evaluations. In most cases, it is considered good if a code can be constructed that achieves the objective of the assignment. However, if the goal is simply to write code that works, then code-refinement activities that apply one’s knowledge will be neglected. Therefore, we considered it important to conduct refinement activities that would improve existing code. In this study, we focused on learning from others’ code and developed a code-sharing platform that encourages code-refinement activities. Many studies have reported the usefulness of learning from others [1–3]. In particular, reading code written by experts has been shown to lead to better learning in learners [4–6].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
H. Mori and Y. Asahi (Eds.): HCII 2023, LNCS 14016, pp. 286–297, 2023. https://doi.org/10.1007/978-3-031-35129-7_21
Using a Code-Sharing Platform in the Classroom
Thus, although the learning effect of reading the code of a proficient learner is obvious, we believe that doing so as a first step does not necessarily lead to learning, owing to the gap in level between a novice and a proficient learner. To solve this problem, it is desirable to provide slightly better code that is close to the learner’s level in an environment that allows the learner to learn step by step. Therefore, the authors have proposed an evaluation function that uses virtual robot programming and a system that ranks the code written by the learner based on the evaluation, and have implemented these as a platform that encourages the learner’s refinement activities [7,8]. The developed platform includes “virtual robot programming,” which visualizes the behavior of code execution results; “quality indicators,” which evaluate the code according to the task design of that robot programming; and a “ranking function,” which shares code according to the learner’s level based on the indicators. The quality indicators and ranking function allow users to share code that is close to their own level, thus helping them learn from others’ code step by step.
The systems developed to date have been tested in small-group environments, which confirmed their learning effectiveness. On the other hand, their applicability to actual programming lectures has not yet been confirmed, and their learning effects for learners who are still at the stage of learning programming have not yet been examined. In this study, the developed system is put to practical use with second-year university students who are learning programming, and we provide an analysis of the results.
2 Code-Sharing Platform
2.1 Functional Requirements of the Platform
The following requirements are important for learning from others’ code and improving your own code.
(A) An environment for sharing code must be prepared.
(B) The shared code of others must match the level of the learner who is trying to learn from it.
First, based on (A), we need to prepare a learning environment in which others’ code can be shared. Also, based on (B), it is necessary to control the learning environment in (A) so that code close to the learner’s level is shared. Shared code that is close to the learner’s level serves as a worked example. In general, it is known that presenting learners with examples of solutions to a problem allows them to learn with a reduced cognitive load, thus producing a high learning effect [9]. A similar learning effect of worked examples has been reported for learning programming [10,11]. To learn from the shared code of others, it is also beneficial to understand the aspects of the shared code that are superior, for which the following requirements are important.
S. Maeda et al.
(C) Understanding of how other people’s shared code behaves at runtime.
(D) Understanding of which aspects of the shared code are superior.
Therefore, in this study, we prepared a virtual program execution environment and proposed a crop harvesting game that uses robot programming as a task. The proposed virtual robot programming is expected to encourage learners to understand the code, since the robot’s movements are reflected in its execution, which corresponds to requirement (C). In addition, we propose quality indicators based on robot programming that can be evaluated multidimensionally (D). Quality indicators make it easier for learners to understand which aspects of the shared code are superior (e.g., productivity or cost).
2.2 System Overview
The developed system is intended to be used by multiple learners, such as in a programming lecture. For this reason, we adopted a client–server model that separates multiple clients (learners) from the server (Fig. 1). Specifically, the client side is implemented in Unity3D using C# and is responsible for rendering. On the server side, MongoDB is used as the database to manage the code written by learners and their evaluations. In this system, learners write code according to C# rules. When the code written by the learner is executed, it is evaluated on the client. When the code is evaluated without any errors, the executed code and its evaluation are transferred to the server (database). Since this study uses the system during a classroom practice session in which many learners participate, parts of the system configuration differ from those in our earlier work [7,8]. For the purpose of load balancing, Roslyn, which is responsible for code compilation and execution, was moved from the server to the clients.
Fig. 1. System Configuration Chart
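The client-side flow described in this section (run the code, evaluate it locally, and transfer the code and its evaluation only when no error occurs) can be sketched as follows. This is a minimal illustration written in Python for brevity, not the platform's own implementation; the record fields and the `evaluate` and `upload` callables are hypothetical stand-ins, not the platform's actual API or database schema.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Submission:
    """Record sent to the server; the field names are assumptions."""
    learner: str
    code: str
    harvest_points: int
    cost: int


def run_and_submit(
    learner: str,
    code: str,
    evaluate: Callable[[str], Tuple[int, int]],
    upload: Callable[[Dict], None],
) -> Optional[Submission]:
    """Evaluate code on the client and upload it only if no error occurred."""
    try:
        # Hypothetical client-side evaluator returning (harvest points, cost).
        harvest_points, cost = evaluate(code)
    except Exception:
        return None  # a run that fails is not transferred to the server
    submission = Submission(learner, code, harvest_points, cost)
    upload(asdict(submission))  # hypothetical stand-in for the database write
    return submission
```

Keeping evaluation on the client means the server only stores results, which matches the load-balancing motivation given above for moving compilation and execution to the clients.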
2.3 System Implementation
To understand the code of others, it is important to understand how the code behaves. In this study, we visualize the code written by the learner as a behavior using robot programming that operates virtually. Robot programming is expected to support the understanding of others’ code because the difference between their code and one’s own can be recognized as a difference in the robot’s behavior. Figure 2 shows an example of the interface of a code-sharing platform that encourages users to learn from others’ code and to refine their own. The developed system is based on robot programming, in which the robot is controlled in a virtual space. The learner programs a robot to go around a field, plant seeds, and harvest crops. The system presents the learner with field panels, in which crops can be planted, and puddle panels, which cancel execution when the robot enters them. Therefore, the programming activity required of the learner is to write code that successfully moves the robot to the field panels while avoiding the puddle panels, plants seeds in the field, and harvests the crops. The system has its own functions for operating the robot (Fig. 3). First, the robot is provided with the No. 1–4 movement functions needed to move it. The learner writes code to patrol the field using these functions. Next, we prepared the No. 5 (Planting) and No. 6 (Harvest) functions, which plant seeds and harvest crops in the field panels, respectively. The Planting function plants a seed if a field panel exists at the robot’s current location. The Harvest function performs harvesting if a field panel exists at the robot’s
Fig. 2. Example of the code-sharing platform interface.
current location and if seeds or a growing crop exists. The planted seeds grow as the robot repeatedly moves and plants seeds. In other words, it is important to construct the seed planting phase and the harvesting phase well as an algorithm.
Fig. 3. Example of operating a robot.
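The robot functions described above can be illustrated with a small simulation. The following is a hedged sketch in Python, not the platform's implementation: the class name `RobotField`, the single `move(dx, dy)` method standing in for the four movement functions, the start position, and the numeric rules (each call costs 1, each planted crop grows by 1 per move or planting call, and a harvest is worth the crop's growth) are all assumptions made for illustration.

```python
class ExecutionCancelled(Exception):
    """Raised when the robot enters a puddle panel (execution is cancelled)."""


class RobotField:
    def __init__(self, field_panels, puddle_panels=(), start=(0, 0)):
        self.field_panels = set(field_panels)
        self.puddle_panels = set(puddle_panels)
        self.pos = start
        self.crops = {}           # panel -> growth of the crop planted there
        self.harvest_points = 0
        self.cost = 0

    def _grow(self):
        # Assumption: crops grow whenever the robot moves or plants.
        for panel in self.crops:
            self.crops[panel] += 1

    def move(self, dx, dy):
        """Stand-in for the four movement functions (No. 1-4)."""
        self.cost += 1
        self._grow()
        self.pos = (self.pos[0] + dx, self.pos[1] + dy)
        if self.pos in self.puddle_panels:
            raise ExecutionCancelled(f"entered puddle at {self.pos}")

    def planting(self):
        """No. 5: plant a seed if a field panel exists at the current location."""
        self.cost += 1
        self._grow()
        if self.pos in self.field_panels and self.pos not in self.crops:
            self.crops[self.pos] = 0

    def harvest(self):
        """No. 6: harvest if a field panel with a seed or crop is here."""
        self.cost += 1
        if self.pos in self.field_panels and self.pos in self.crops:
            self.harvest_points += self.crops.pop(self.pos)
```

Under these assumed rules, planting at (0, 0), moving right, planting again, moving back, and harvesting yields 3 Harvest Points at a cost of 5, which shows why the order of the planting and harvesting phases matters.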
It has been reported that visualization of the execution results of code written by the learner can help beginners learn programming [12]. This system is expected to facilitate understanding of code behavior.
2.4 Quality Indicators
When learning from others’ code, it is desirable to learn from code that is close to one’s own level. However, in an ordinary programming lecture, learners of various levels, from beginners to experts, might participate, so simply sharing code is not conducive to learning. Therefore, we propose evaluation indicators that score the code written by learners and thereby serve as an approximate measure of how close their levels are. In this system, based on robot programming, “Harvest Points,” “Cost,” and “Total Points” were defined as quality indicators. Harvest Points is an indicator of how well the robot harvests the crop and is used to evaluate the success of the code. Cost is an indicator that increases each time a function is called, such as for harvesting or moving the robot, and evaluates the cost of the code. Finally, Total Points is an index that combines Harvest Points and Cost. Thus, in this system, good code is defined as that which allows the robot to harvest a large number of crops at a low cost. It is expected that the quality indicators will allow the code written by the learner to be represented as a pseudo-level.
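As a rough illustration of how the three indicators relate, the sketch below computes them for a single run in Python. The text states only that Total Points combines Harvest Points and Cost; the subtraction used here is an assumption chosen so that more harvest and less cost both raise the total, and the names `Quality` and `evaluate_run` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Quality:
    harvest_points: int
    cost: int

    @property
    def total_points(self) -> int:
        # Assumption: "combines" is modelled as harvest minus cost.
        return self.harvest_points - self.cost


def evaluate_run(function_calls: Sequence[str], harvest_points: int) -> Quality:
    """Cost grows by one for every robot function call made during the run."""
    return Quality(harvest_points=harvest_points, cost=len(function_calls))
```

With this model, two programs that harvest the same amount are separated by how many function calls they needed, which mirrors the definition of good code as a large harvest at a low cost.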
2.5 Ranking
Figure 4 shows the interface of the ranking function. The ranking function ranks the code of all learners using the system based on the scores calculated by the quality indicators, and learners can view the ranking for each indicator. The information that can be viewed includes the name, score, and code content, but only the code ranked immediately above the learner’s own rank and the code ranked below it can be viewed. By selecting Total Points, Harvest Points, or Cost on the left side, the ranking for each indicator can be viewed. In the center, the data are presented in a ranking format, and when an entry is selected, its code is shown on the right side. In the example shown in the figure, the Total Points ranking is being viewed, and the code in 21st position is shared. As mentioned above, the code that can be viewed is limited. In the example shown in the figure, the user’s highest ranking is 22nd, so all code ranked 21st and lower can be viewed. Code ranked 20th and above is not shared even if it is selected. This ranking function is expected to lead to step-by-step learning.
Fig. 4. Example of the interface for the ranking function.
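The viewing restriction in the example above (a learner whose best rank is 22nd can open code ranked 21st and lower, but not 20th and above) can be written as a simple rule. The Python sketch below is illustrative only: the function names, the dictionary-based entries, and the treatment of Cost as lower-is-better are assumptions, and ties are not handled.

```python
def ranking(entries, indicator, lower_is_better=False):
    """Return (rank, entry) pairs, best first; rank 1 is the best."""
    ordered = sorted(
        entries,
        key=lambda entry: entry[indicator],
        reverse=not lower_is_better,
    )
    return list(enumerate(ordered, start=1))


def can_view(own_best_rank, target_rank):
    """Code is viewable if it is at most one place above the learner's best rank."""
    return target_rank >= own_best_rank - 1
```

For instance, `can_view(22, 21)` is true and `can_view(22, 20)` is false, so each improvement in the learner's own rank unlocks exactly one slightly better program to study.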
3 Classroom Practice
Fig. 5. Classroom practices conducted (mixed offline and online).
4 Motivation
The developed system has been confirmed through experiments to have a certain learning effect. However, it is not certain whether the system can be used in practice in the intended programming course, because the previous experiments had only 12 subjects, all of whom were university seniors who had already completed a programming course. In addition, the level of learners attending programming lectures generally varies from novices to experts. Since the experiments to date were conducted in an experimental environment with a small number of subjects, we considered it necessary to conduct additional experiments in an environment with a large number of subjects in order to confirm the reliability of the developed system. As an additional evaluation, the system was used in programming lectures (Fig. 5), and we assess whether the same learning effects can be obtained.
5 Experiment Flow
The class practice was conducted as a game application experience in the “Introduction to Game Application Design” course offered by the Tokyo Polytechnic University, where the authors are enrolled. Considering the burden on the participants, the class practice was spread over two days: on the first day, a tutorial (10 minutes) was followed by pre-session system use (30 minutes). The second day consisted of a tutorial (60 minutes), post-use of the system (30 minutes), and a questionnaire. In the tutorial, an overview of the system and how to use it was given using a projector in the classroom, and how to write code for the system in C# was also explained. In the pre- and post-system learning, the use of the ranking function was restricted. In the system use study, however, no restrictions were placed on the system, and all functions were available. As in previous experiments, this study examines learning effects by comparing pre- and post-system use learning and by using a questionnaire. A total of 77 students participated in the class practice; after accounting for absences and consent forms, data from 65 participants could be analyzed.
5.1 Assignment Design
The number of tasks prepared for the learners in the class practice was three. The tasks presented to the learners are shown in Fig. 6. The first problem is an arrangement of field panels laid out like a rectangle. The length of one side of this assignment varies from case to case. The second problem is a mesh-like arrangement with spaces in the middle field panels, based on the task design of the first problem. Finally, the third problem consists of a square arrangement of field panels with puddle panels dotted throughout. The placement of the puddle panels differs from case to case in this problem. The design intent of this assignment is to increase the number of robot control methods required of the learner as the assignment progresses.
6 Results
6.1 Change in the Number of Harvests
Table 1 shows the difference in the number of harvests between pre-system and post-system use. The results showed that the number of yields improved for all problems. This suggests that the ranking function produces a certain learning effect within the scope of the task design for robot programming. In particular, in problem 3, which requires a higher degree of control than problems 1 and 2, such as avoiding puddles, the scores increased as in the other questions. Thus, the results suggest that the system can be used by learners who are trying to learn the target concept.
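To make the pre/post comparison concrete, the sketch below shows one way the per-learner difference and its spread could be summarized in Python. It is only an illustration: the scores in the usage note are made up, the function name `paired_summary` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state how the values in Table 1 were computed.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_summary(pre: Sequence[float], post: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of per-learner (post - pre) differences."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per learner")
    diffs = [after - before for before, after in zip(pre, post)]
    return mean(diffs), stdev(diffs)
```

For made-up scores such as `pre = [0, 100, 200]` and `post = [500, 700, 900]`, the mean difference equals the difference of the two means. The same relation holds for the Rectangle row of Table 1, where 791.84 − 150.61 = 641.23. The standard deviation of the differences, by contrast, cannot be derived from the two separate standard deviations and has to be computed from the paired scores.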
Fig. 6. The created assignment.
Table 1. Differences in Harvest Points for pre- and post-system use. (Std. dev.)
No.  Design  Pre-test  Post-test  Difference  p-value
6.2
1
Rectangle 150.61 (279.89) 791.84 (451.04) 641.23 (466.34)