Fuzzy Management Methods Series Editors: Andreas Meier · Witold Pedrycz · Edy Portmann
Moreno Colombo
Phenotropic Interaction
Improving Interfaces with Computing with Words and Perceptions
Fuzzy Management Methods

Series Editors
Andreas Meier, Fribourg, Switzerland
Witold Pedrycz, Edmonton, Canada
Edy Portmann, Bern, Switzerland
With today’s information overload, it has become increasingly difficult to analyze the huge amounts of data and to generate appropriate management decisions. Furthermore, the data are often imprecise and will include both quantitative and qualitative elements. For these reasons it is important to extend traditional decision making processes by adding intuitive reasoning, human subjectivity and imprecision. To deal with uncertainty, vagueness, and imprecision, Lotfi A. Zadeh introduced fuzzy sets and fuzzy logic. In this book series “Fuzzy Management Methods” fuzzy logic is applied to extend portfolio analysis, scoring methods, customer relationship management, performance measurement, web reputation, web analytics and controlling, community marketing and other business domains to improve managerial decisions. Thus, fuzzy logic can be seen as a management method where appropriate concepts, software tools and languages build a powerful instrument for analyzing and controlling the business.
Moreno Colombo
Phenotropic Interaction
Improving Interfaces with Computing with Words and Perceptions
Moreno Colombo
Human-IST Institute
University of Fribourg
Fribourg, Fribourg, Switzerland
ISSN 2196-4130  ISSN 2196-4149 (electronic)
Fuzzy Management Methods
ISBN 978-3-031-42818-0  ISBN 978-3-031-42819-7 (eBook)
https://doi.org/10.1007/978-3-031-42819-7

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Paper in this product is recyclable.
To the kind souls who bring chocolat to accompany my pain.
Foreword
Artificial Intelligence (AI) is nowadays pervading many aspects of our society. This poses the challenge of ensuring that people are not put aside when their own data are processed by AI systems whose decisions may result in harmful discrimination. Accordingly, multidisciplinary teams (including, among others, engineers, mathematicians, linguists, and computer scientists, but also domain experts and lawyers) face two major challenges: (i) how to extract valuable knowledge from data in a lawful, ethical, and robust way, and (ii) how to reuse existing AI-based resources in both academia and industry for the sake of efficiency and sustainability.

Explainable AI (XAI)¹ is an endeavor to evolve AI methodologies and technology by focusing on the development of agents capable of generating decisions that a human can understand in each context, and of explicitly explaining such decisions. This way, it is possible to scrutinize the underlying intelligent systems and verify whether they make decisions in agreement with accepted rules and principles, so that decisions can be trusted and their impact justified. XAI is a multidisciplinary research field where Data and Social Sciences naturally meet.² In the context of XAI, an explanation is a presentation of (aspects of) the reasoning, functioning, and/or behavior of an explicandum (i.e., data, model, and/or prediction) in human-understandable terms, to be conveyed from the explainer (i.e., the generator of the explanation) to the explainee (i.e., the recipient of the explanation). Accordingly, Human-Machine Interaction (HMI) must be carefully considered in the quest for effective explanations. Notice that, even if XAI and HMI are flourishing research fields, we still need to move a step forward in developing human-centric Trustworthy AI³ that is lawful, ethical, and robust, being aware not only of technical but also of ethical, legal, socioeconomic, and cultural issues. In this regard, it is worth noting that the new European regulation on AI (i.e., the AI Act⁴) highlights the need to push for a human-centric, responsible AI that empowers citizens to make them more informed and thus ready to make better decisions. With respect to technical issues, there are three main open research problems: (i) designing explainable and trustworthy algorithms; (ii) implementing explainable and trustworthy human-machine interfaces; and (iii) defining novel methods for the evaluation of human-centric explanations.

In this context, the Theory of Computing with Words,⁵ rooted in the solid principles of Fuzzy Set Theory,⁶ can play a crucial role, mainly when dealing with the vague and imprecise linguistic expressions that humans naturally use to interpret and describe perceptions in their everyday life. In the context of Trustworthy AI, Explainable Fuzzy Systems⁷ deal naturally with uncertainty and approximate reasoning (as humans do) through computing with words and perceptions. This way, they make it easier for humans to scrutinize the underlying intelligent models and verify whether automated decisions can be trusted.

This comprehensive book is of interest to researchers, practitioners, and students working in the research field of Trustworthy AI, with special emphasis on fuzzy-grounded knowledge representation and reasoning associated with explainable and trustworthy human-machine interfaces. The author makes a significant step ahead in the search for natural HMI thanks to the formalization and implementation of the so-called Phenotropic interaction. Moreover, at a time when we live surrounded by more and more intelligent gadgets (e.g., smart watches, smart TVs, smart cars, smart homes, and so on) which are, however, still far from dealing with vagueness, imprecision, and uncertainty in a natural way, Phenotropic interaction can naturally enrich HMI with the capability of computing with words and perceptions. Reading this book carefully, readers will learn not only to appreciate the fundamentals of Phenotropic interaction but also its close relation with the Theory of Computing with Words. Moreno Colombo shows how a practical implementation of Phenotropic interaction is feasible, and how outstanding results can be achieved when empowering intelligent gadgets with the ability to handle vagueness, imprecision, and uncertainty while interacting with humans in real-world applications.

¹ The acronym XAI became popular with the following program published by the US Defense Advanced Research Projects Agency (DARPA): D. Gunning, E. Vorm, J.Y. Wang, M. Turek, "DARPA's explainable AI (XAI) program: A retrospective", Applied AI Letters, 2021, https://doi.org/10.1002/ail2.61.
² T. Miller, "Explanation in Artificial Intelligence: Insights from the Social Sciences", Artificial Intelligence, 267:1–38, 2019, https://doi.org/10.1016/j.artint.2018.07.007.
³ S. Ali, T. Abuhmed, S. El-Sappagh, K. Muhammad, Jose M. Alonso-Moral, R. Confalonieri, R. Guidotti, J. Del Ser, N. Díaz-Rodríguez, F. Herrera, "Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence", Information Fusion, 2023, https://doi.org/10.1016/j.inffus.2023.101805.
⁴ On 14 June 2023, the EU Parliament voted on the AI Act: 499 votes in favor, 28 against, and 93 abstentions (https://artificialintelligenceact.eu/the-act/).
⁵ L.A. Zadeh, "Fuzzy logic = Computing with Words", IEEE Transactions on Fuzzy Systems, 4(2):103–111, 1996.
⁶ E. Trillas, L. Eciolaza, "Fuzzy Logic. An Introductory Course for Engineering Students", Springer, 2015, https://link.springer.com/book/10.1007/978-3-319-14203-6.
⁷ J.M. Alonso, C. Castiello, L. Magdalena, C. Mencar, "Explainable Fuzzy Systems - Paving the Way from Interpretable Fuzzy Systems to Explainable AI Systems", Studies in Computational Intelligence, Springer International Publishing, 2021, https://www.springer.com/gp/book/9783030710972.

Santiago de Compostela, Spain
June 2023
Jose Maria Alonso Moral
Preface
In today's world, the interaction between humans and artificial systems has become increasingly important for collaborative problem-solving. People seek assistance from these systems, but the communication process has often been hindered by the reliance on unnatural concepts like protocols. As a result, users find interacting with artificial systems demanding, especially when compared to their everyday interactions with other people.

This book delves into the exploration of a novel interaction paradigm called "phenotropic interaction," whose primary objective is to enable individuals to engage in more natural interactions with artificial systems. To achieve this goal, a framework is proposed, built on literature research, that addresses the fundamental aspects required to enhance the naturalness of the interaction between humans and machines and its similarity to human communication.

Furthermore, this work implements various methods to effectively understand and reason with human perceptions. These methods are essential for artificial systems to understand and react in a satisfying way to the needs and desires of people, and incorporate well-established theories such as computing with words and perceptions and analogical reasoning.

Some practical applications of the phenotropic interaction framework are also showcased in this book. These include the development of a natural interactive virtual assistant prototype and the interaction between citizens and the smart city ecosystem. User studies are conducted on these artifacts to analyze the impact of phenotropics on the quality of interaction with these intelligent systems.

Ultimately, the findings of this work highlight the potential of phenotropic interaction in creating interfaces that are natural, adaptive, and easily comprehensible for individuals. By leveraging simple and sustainable bio-inspired methods, this research demonstrates the possibilities for bridging the gap between humans and artificial systems, ultimately empowering users in their interactions with technology.

Fribourg, Switzerland
June 2023
Moreno Colombo
Acknowledgments
Despite a PhD being an important personal achievement, nobody can do it alone. Luckily, in these last years, I was never left alone: behind the curtains, many people contributed to the success of this dissertation with their outstanding support.

First and foremost, I wish to express my deepest gratitude to my supervisor Edy Portmann, for his continuous support, for guiding me in this journey, and for always providing interesting and unconventional topics for discussion and research directions. I would also like to thank Denis Lalanne for his teachings about the importance of human-centricity in technology, which I will keep pursuing in the future. I also thank Dr. José María Alonso Moral for his precious feedback and for contributing with his expertise in explainable artificial intelligence and fuzzy systems to the improvement of this research effort.

I am also grateful to all the colleagues that I got to know at Human-IST and with whom I could work, have fun, or share the highs and lows of being a PhD student. I cherish the scientific, philosophical, and cultural exchanges that we had, despite the reduced meeting occasions due to the global pandemic. Particularly, I am extremely grateful to have shared this journey with Jhonny (Dr. Gionno Salamino) and Tue(cillo). Thank you for the collaborations, inspiring discussions, and especially for always being there in the good and bad moments. To many more good ones together!

A huge thank you to Sara for helping me get started with my research, as well as to Julien and Michael for the nice moments we spent together working in the super Lucideles team, but also for their outstanding support in the most stressful periods. Additionally, I particularly appreciated the opportunity of collaborating with colleagues external to our university on projects that contributed to broadening my horizons: Saskia Hurle, Elias Schäfer, Laura Iseli, Oleg Lavrovski, Prof. Joris Van Wezemael, and David Abele. Thanks also to the students that I had the luck to supervise or work with.

A heartfelt thank you goes to my beloved Prisca, for always being by my side and pushing me with her strength to complete this project. Together we found our balance and managed to face many difficult moments and share our successes.
A huge thanks definitely goes to my parents Elena and Enrico, who have never stopped supporting and motivating me in all the choices I've made for 30 years, and with whom I have always been able to share joys and sorrows. I would never have made it here without you. Thanks also to Simone, who has always been my role model and made it easy for me to choose my path. Even though we are far apart, it's always nice to know that I can count on you. And thanks to Jenny, Thomas, Jari, and Grace for the good moments together and for introducing us to new places and cultures.

I would like to thank all the friends who made this period a little more carefree. Without you, everything would have been much more complicated. Special thanks to the Bümpliz Ost gang, Flo, Giulia, and Ruben, for always being there to share a pizza, a "sprizzino," a run, a hike, some board games, and much more to relieve tension or celebrate our successes. Also, thanks to Viola for always being there to share beautiful and difficult moments. Finally, thanks to the friends from the "Catantena," Lorenz, Cam, and Lollo, for bringing some normality to a "pandemic" period when nothing was easy, with regular games of Catan (danke Klaus Teuber for this awesome game).

It was sometimes hard to see the light at the end of the tunnel in these years, especially in the middle of a global pandemic, but all together, we made it. Thank you, it's been a hell of a ride!
Contents
Part I  Motivation and Objectives

1  Introduction .... 3
   1.1  Background and Motivation .... 3
   1.2  Research Objectives .... 7
   1.3  Methodology .... 9
        1.3.1  Design Science Research .... 9
        1.3.2  Toward Antidisciplinary Research .... 10
   1.4  Outline .... 11
   1.5  Own Research Contribution .... 13
   References .... 16

Part II  Theory of Naturalness

2  Phenotropic Interaction .... 21
   2.1  Phenotropics .... 23
   2.2  Design Principles of Phenotropic Interaction .... 25
   2.3  Phenotropic Interaction Framework .... 28
   2.4  Concluding Remarks .... 29
   References .... 30

3  Cognitive and Perceptual Computing .... 33
   3.1  Conversation Theory .... 34
   3.2  Computing with Words and Perceptions .... 35
   3.3  Automated Reasoning .... 37
   3.4  Interactive Machine Learning .... 39
   3.5  Explainable Artificial Intelligence .... 40
   3.6  Concluding Remarks .... 41
   References .... 42

Part III  Natural Language Conversations

4  Semantic Similarity Measures .... 49
   4.1  Conceptual Similarity Measures .... 50
   4.2  Spectral Similarity Measures .... 52
   4.3  A Novel Spectral Similarity Measure .... 53
        4.3.1  Choice of Knowledge Base .... 53
        4.3.2  Spectral Semantic Similarity Measure .... 54
   4.4  Practical Implementation Challenges .... 55
        4.4.1  Thesaurus Selection .... 55
        4.4.2  Multiple Meanings .... 56
        4.4.3  Order of Synonymy .... 57
   4.5  Evaluation .... 59
        4.5.1  Methodology .... 60
        4.5.2  Results .... 63
   4.6  Concluding Remarks .... 66
   References .... 67

5  Automatic Precisiation of Meaning .... 71
   5.1  Automatic Precisiation of Meaning 1.0 .... 72
        5.1.1  Basis Selection and Precisiation .... 73
        5.1.2  The APM 1.0 Algorithm .... 75
   5.2  Automatic Precisiation of Meaning 2.0 .... 76
        5.2.1  Algorithm Choices .... 77
        5.2.2  The APM 2.0 Algorithm .... 78
   5.3  Evaluation .... 80
        5.3.1  Methodology .... 80
        5.3.2  Results .... 81
   5.4  Concluding Remarks .... 83
   References .... 84

6  Fuzzy Analogical Reasoning .... 87
   6.1  Analogical Reasoning .... 88
   6.2  The FAR Prototype .... 89
        6.2.1  Conceptual Analogies .... 89
        6.2.2  Spectral Analogies .... 91
        6.2.3  Prototype Interface .... 94
   6.3  Evaluation .... 96
        6.3.1  Methodology .... 96
        6.3.2  Results .... 97
   6.4  Concluding Remarks .... 98
   References .... 100

Part IV  Applications of Phenotropic Interaction

7  Phenotropic Interaction in Virtual Assistants .... 105
   7.1  Extension of If This Then That Rules .... 106
        7.1.1  Query Preprocessing .... 107
        7.1.2  Query Matching .... 108
        7.1.3  Rule Adaptation with Fuzzy Analogical Reasoning .... 109
   7.2  The FVA Prototype .... 111
   7.3  Evaluation .... 114
        7.3.1  Methodology .... 114
        7.3.2  Results .... 115
   7.4  Concluding Remarks .... 120
   References .... 122

8  Phenotropic Interaction in Smart Cities .... 123
   8.1  Jingle Jungle Maps .... 124
        8.1.1  Problem Statement .... 125
        8.1.2  Architecture .... 125
        8.1.3  Evaluation .... 129
        8.1.4  Phenotropic Interaction .... 129
   8.2  Streetwise .... 131
        8.2.1  Problem Statement .... 131
        8.2.2  Architecture .... 132
        8.2.3  Evaluation .... 137
        8.2.4  Phenotropic Interaction .... 138
   8.3  Concluding Remarks .... 139
   References .... 140

Part V  Conclusions

9  Outlook and Conclusions .... 145
   9.1  Summary .... 145
   9.2  Alignment with Research Questions .... 147
   9.3  Future Developments .... 153
   9.4  Outlook and Conclusions .... 155
   Reference .... 156

Glossary .... 157
A  Survey: Ordering of Scalar Terms .... 159
B  Semantic Similarity Evaluation Details .... 161
C  Evaluation of the FVA Prototype .... 163
Acronyms
AI       Artificial Intelligence
API      Application Programming Interface
APM 1.0  Automatic Precisiation of Meaning 1.0
APM 2.0  Automatic Precisiation of Meaning 2.0
CNN      Convolutional Neural Network
CC       Cognitive Computing
CWW      Computing with Words and Perceptions
DSR      Design Science Research
FAR      Fuzzy Analogical Reasoning
FVA      Flexible Virtual Assistant
HCI      Human-Computer Interaction
HSC      Human Smart City
IFTTT    If This Then That
ML       Machine Learning
NLP      Natural Language Processing
PC       Perceptual Computing
Per-C    Perceptual Computer
PI       Phenotropic Interaction
SC       Smart City
SWOT     Strengths, Weaknesses, Opportunities, and Threats
VA       Virtual Assistant
List of Figures
Fig. 1.1  The design science research framework (Adapted from [15]) .... 9
Fig. 2.1  The phenotropic interaction framework, integrating conversation theory, CWW, reasoning, interactive ML, and explainable AI .... 28
Fig. 3.1  Structure of conversations (Adapted from [33]) .... 34
Fig. 3.2  The CWW pipeline .... 36
Fig. 4.1  Example of data used for the computation of conceptual semantic similarity .... 51
Fig. 4.2  Example of the semantic similarity between adjectives on the spectrum of heat, sim(w0, w1) ∝ 1/dist(w0, w1) (Adapted from [9]) .... 52
Fig. 4.3  Survey for ranking words based on their similarity to an extreme .... 61
Fig. 4.4  Accuracy of rankings using various semantic similarity measures .... 65
Fig. 5.1  Example of precisiation of the meaning of tall as value for a person's height .... 73
Fig. 5.2  Example of precisiation of hot using contextual information .... 76
Fig. 5.3  Example of precisiations of hot with APM 2.0 .... 78
Fig. 5.4  Example of precisiation of terms describing temperature using APM 2.0 .... 79
Fig. 5.5  Accuracy of rankings using precisiation of meaning .... 82
Fig. 6.1  The analogical scheme (Adapted from [3]) .... 88
Fig. 6.2  Fuzzy analogical reasoning for spectral analogies (Adapted from [6]) .... 93
Fig. 6.3  Analogical reasoning with conceptual analogies in the prototype, including textual explanation (Reprinted from [6], ©2020 IEEE) .... 95
Fig. 6.4  Analogical reasoning with spectral analogies in the prototype, including textual explanation (Reprinted from [6], ©2020 IEEE) .... 95
Fig. 7.1  Example of partial matching process .... 109
Fig. 7.2  Architecture of the FVA prototype .... 113
Fig. 7.3  Interface of the FVA prototype .... 113
Fig. 7.4  Answers to the question "Did the assistant perform the requested task correctly?" for each of the three tasks in both scenarios .... 116
Fig. 7.5  Comparison of the perception of the FVA and the traditional prototypes (5 → strongly agree, 1 → strongly disagree). The stars indicate statistical significance of the t-test comparison of the prototypes (i.e., * → p < .05, ** → p < .01, ns → not significant) .... 118
Fig. 7.6  Comparison of the perception of the interaction with the FVA and the traditional prototypes (5 → strongly agree, 1 → strongly disagree). The stars indicate statistical significance of the t-test comparison of the prototypes (i.e., * → p < .05, ** → p < .01) .... 119
Fig. 8.1  Architecture for the generation of the Jingle Jungle soundscape map (Adapted from [40]) .... 126
Fig. 8.2  Overview of the Per-C module for the estimation of the sound level from the sound-bearing and distance-related words (Adapted from [40]) .... 127
Fig. 8.3  Visualization of the soundscape for the city of Bern, Switzerland (Reprinted from [40], ©2020 IEEE) .... 128
Fig. 8.4  Architecture for the estimation of the perceived spatial quality of cities .... 133
Fig. 8.5  Architecture of the Siamese CNN used for the automatic comparison of images (Adapted from [11]) .... 134
Fig. 8.6  Accuracy and loss curves of the Siamese CNN on the training and validation sets with an 80/20 split .... 135
Fig. 8.7  Visualization of the perceived atmosphere in cities with different data aggregation techniques .... 136
List of Tables
Table 2.1  Comparison of the features of traditional and phenotropic interfaces .... 27
Table 4.1  Parameters for the computation of sim(w1, w2) with w1 = freezing and w2 = cold for different meanings of w2 .... 56
Table 4.2  Results of the synonymy order selection .... 59
Table 4.3  State-of-the-art similarity measures used in the evaluation .... 64
Table 4.4  Mean accuracy of rankings using some semantic similarity measures .... 65
Table 5.1  Mean distance between manual precisiations and algorithmic ones .... 81
Table 6.1  SWOT analysis of the FAR prototype for conceptual analogies .... 98
Table 6.2  SWOT analysis of the FAR prototype for spectral analogies .... 99
Table B.1  Mean accuracy of rankings using various semantic similarity measures .... 161
List of Algorithms
Algorithm 4.1  Selection of optimal synonymy order i for sim_i (Adapted from [9]) .... 58
Algorithm 5.1  Selection of the basis for C as in [3] .... 74
Algorithm 6.1  Analogical reasoning with conceptual resemblance relations [6] .... 90
Algorithm 6.2  Analogical reasoning with spectral resemblance relations [6] .... 92
Algorithm 7.1  Fuzzy analogical reasoning for the estimation of a value x' of an extended rule r_i' .... 110
Part I
Motivation and Objectives
Chapter 1
Introduction
This introductory chapter presents an overview of the background, motivation, scope, and methodology followed in this book. The book applies principles from the field of phenotropics to the interaction between humans and intelligent artificial systems in order to obtain an interaction paradigm that is more natural for people. The structure of the proposed approach is also introduced, alongside the author's published contributions to different research fields.

The background and motivation for phenotropic interaction in general and for this book are introduced in Sect. 1.1, followed by the research objectives and questions that this book tries to answer in Sect. 1.2. The applied research methodology is then detailed in Sect. 1.3, and the structure of the book with a short overview of each chapter is presented in Sect. 1.4. Finally, the list of research papers published by the author in the context of this project is summarized in Sect. 1.5.
1.1 Background and Motivation

Since its early years, computer science has been a discipline centered not only on the study of computers and computation, but also on essential aspects of the communication and interaction between computer components, groups of computers, or computers and humans. Unlike other specialized tools with unique uses, computers can solve many different types of problems, which makes it fundamental for them to hold conversations of a certain level with people, so that the operations to be performed can be specified correctly, for example, through programming.

Fundamental to the improvement of this sort of conversation between people and computers is the concept of interaction. Interaction refers to an exchange of information (e.g., speech, body language) between actors and their influence on each other.
Three main categories of interaction can be identified, depending on the nature of the involved actors: the interaction between biological systems, the interaction between machines, and the interaction between systems of diverse nature.

The interaction between biological systems describes how different natural actors pursue a common goal, such as how people interact with the environment they are immersed in [20]. It also includes the conversation model between people who exchange information using multiple modalities, such as words and body language [19]. This type of interaction is characterized by the flexibility and adaptivity of the actors to one another. For example, in communication between humans, a common language is used most of the time and some social norms are respected. Still, these are not crisply defined nor known a priori; they emerged as mere suggestions over years of interactions and keep evolving. Everyone has their own interpretations and perceptions of these aspects of communication (e.g., stylistic features, accents). Despite this, people adapt to the peculiarities of others to understand them as well as possible, by finding a common ground among the participants in a conversation.

On the opposite side of the spectrum lies the interaction between machines, intended as the exchange of information between computer hardware or software components. This interaction is based on the workings of computer hardware, which is built on the concept of sequentially letting current flow through wires. To give meaning to this artificial representation of information, it is fundamental to employ protocols that provide the instructions to encode and decode information expressed as a sequence of bits. Furthermore, these protocols must be defined in a rigorous way and be known a priori by all components in order to make sense of the exchanged data [34]. No flexibility is present in this type of interaction. The necessity of using this paradigm in the communication between hardware components directly translated into the emergence of the same hard, protocol-based structure in the interaction between different pieces of software. These are thus designed knowing the exact protocols used by the other modules they interact with, to allow for a successful exchange of information.

Mixed interaction refers to exchanges between agents of different nature, in this case biological and artificial systems. This includes, among others, human–computer interaction (HCI), smart cities (SC), and human–building interaction [24]. In these fields, the tendency is more and more that of designing interactions around people, as in human smart cities (HSC) [25]. The process of constructing interfaces that are useful, safe, usable, ethical, and functional for users [38] involves the study and application of methodologies and design principles that consider the needs and preferences of the user to improve the usability of an interface [12]. Many of these methods are derived from the study of human-to-human interaction, from which fundamental properties familiar to people are directly translated to the interface with the artificial agent.

However, despite these disciplines' goal of designing people-centered interaction, they still provide solutions centered around crisp protocols that the user must learn and adapt to. Indeed, the process of creating user-centered interaction consists mainly in creating interfaces that are natural, easy to understand, and usable for the average user while still being easy to interpret for the artificial system (i.e., with a defined structure).
In other words, these approaches can be reduced to defining a crisp protocol that is easy to learn (i.e., intuitive) for system users. This approach tries to construct a mechanistic model describing the natural interaction between people and apply it to another context. However, it has been shown that abstractions of natural phenomena most of the time do not capture them fully, as they leave an unexplained remainder that represents what makes nature more complex and varied than artificial systems (their "living component") [11]. In this specific case, the designed interactive systems simulate features of human-to-human interaction but omit the fundamental flexibility observed in human conversation, where the participants synchronize themselves and adapt to each other [28]. As a proof of this synchronization between people in a conversation, it has been observed that when one person stops speaking, the other one begins on average after 0.2 seconds, which is 5 times quicker than the average response time to a traffic light [11]. A more natural approach to this problem is necessary to make the interaction more similar to what people are used to when interacting with their peers.

Furthermore, the common protocols for designing human-centered interaction are built on the preferences of the average user. However, outliers exist, and they are in some cases very well represented, for example, by the whole category of women [29], as the concept of the average user is often strongly biased toward a certain type of person. In the best case, outliers simply have to renounce their preferences and learn to interact with the system differently than they would naturally. In the worst case, they might not be able to use the system at all to reach their goal, for example, in the case of people with a particular disability for which the system was not explicitly designed.

Adaptive systems, also represented by the field of Intelligent User Interfaces [36], try to overcome these limitations by designing "highly usable systems for people with different needs and characteristics in different context of use" [13]. Most solutions to this problem consist in adapting the presentation of information in graphical user interfaces and the responses of the interactive system, based on models of individual users describing some of their characteristics. For instance, works exist on the adaptation of the hypermedia content shown to individual users based on their knowledge and goals [3], on the merging of operations often performed in succession by a user into macros, and on the automatic adaptation of interface parameters and layout to user preferences, in order to make the interaction more effective and less error-prone [14, 39]. However, the interaction with most interactive adaptive systems still strongly relies on predefined protocols, even though the interface can vary for different users based on their needs and preferences. One can argue that a more semantics-aware interaction between users and machines can provide an additional layer of adaptation to the user's way of communicating.

In the cases where strict protocols are not employed to structure the communication with the interactive system, for example, in intelligent Natural Language Processing (NLP) applications, the apparent flexibility is in most cases achieved with deep learning, as in most intelligent systems developed in recent years.
However, the inference power emerging from these systems is a mere combination of high computational power and big datasets, not of a particularly intelligent handling of the data [35]. The increased use of such techniques, which require enormous amounts of energy and private data to effectively train models, is not sustainable from either an environmental [33] or an ethical [37] perspective. Nature, on the other hand, is very efficient: the human brain uses a reduced amount of energy to function, and infants learn to associate names with objects from a very small set of examples [40]. One can argue that the use of nature-inspired methods for the development of intelligent behaviors in computers is a possible solution to the problems of deep learning, and could provide a more sustainable future direction for computer science. For this reason, the methods researched in this book for the development of more natural interfaces rely on bio-inspired theories whenever possible.

To tackle the presented limitations of traditional interactive systems, phenotropic interaction is proposed as a new approach allowing for an interaction paradigm inspired by the model of human conversation, centered on the adaptivity of the artificial agent, so as to allow for a conversation that is more flexible and adapted to the needs of particular users. This can reduce the gap between natural and artificial communication, thus overcoming some of the negative implications of the latter [11]. Phenotropic interaction is based on the concept of phenotropic software [22], a theory suggesting techniques to achieve bio-inspired communication between software modules. Furthermore, the proposed interaction paradigm relies on the structure of conversation theory [28] and on various models for the automated handling and representation of the semantics of conversational exchanges, as this is believed to be a fundamental aspect of enabling computer systems to bypass crisp protocols in the handling of interaction. The idea of phenotropic interaction is that of extending the well-studied principles of various disciplines (e.g., HCI) to include the desired flexibility and adaptivity.

In general, phenotropic interaction aims to provide the basis for extending traditional mixed systems by enabling computers to understand and adapt to the users' needs and ways of communicating. This can lead to systems that reach a higher level of customization and are accessible by design. For example, one can imagine a system designed to be used with vocal input to execute a limited set of commands. Understanding the semantics of communication on the artificial agent's side could allow a person who cannot talk to hold a conversation with the system using gestures, which would then be mapped to the approximate meaning they represent. This could be achieved, for example, through the use of a precisiated natural language [41], a symbolic language representing the semantics of concepts expressed in different types of natural language. The obtained information could then be matched with the meaning of the predefined list of commands that the system knows in terms of speech, in order to execute the command corresponding to the input expressed as gestures.
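As a purely illustrative sketch of this matching step (and not of the methods developed in this book), the following Python fragment compares the approximate meaning extracted from a non-verbal input with the descriptions of the commands a system knows and selects the closest one. The similarity function used here is a naive string-overlap placeholder, and the command list and phrasings are invented for the example; a phenotropic interface would instead rely on a meaning-based measure such as those discussed in Chap. 4.

```python
# Minimal sketch: match an extracted approximate meaning to the closest known
# command. semantic_similarity() is a naive stand-in based on string overlap,
# NOT one of the semantic similarity measures developed in this book.
from difflib import SequenceMatcher


def semantic_similarity(a: str, b: str) -> float:
    """Placeholder similarity in [0, 1] based on character overlap."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_command(perceived_meaning: str, known_commands: dict[str, str]) -> str:
    """Return the known command whose description is closest to the input meaning."""
    return max(
        known_commands,
        key=lambda cmd: semantic_similarity(perceived_meaning, known_commands[cmd]),
    )


if __name__ == "__main__":
    # Hypothetical commands of a simple voice-controlled system.
    commands = {
        "lights_on": "turn on the lights",
        "lights_off": "turn off the lights",
        "play_music": "play some music",
    }
    # Approximate meaning extracted, e.g., from a gesture instead of speech.
    print(match_command("switch the lights on", commands))  # -> lights_on
```

Notably, a surface-level placeholder like this one barely distinguishes "turn on the lights" from "turn off the lights," which is precisely why measures operating on meaning rather than on character overlap are needed (Chap. 4).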
To achieve these results, concepts from the field of fuzzy systems can be applied. These can handle the intrinsic imprecision of human cognition, making it possible to bridge the gap between hard and human sciences by working with linguistic quantities [35], which could represent, for example, the social components of the conversation.
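To give a minimal illustration of what working with linguistic quantities can look like in code, the sketch below represents the perception warm as a trapezoidal fuzzy set over temperature, so that a crisp measurement is mapped to a degree of membership in the linguistic term. This is a generic fuzzy-set example, not the precisiation-of-meaning algorithms presented later in this book (Chap. 5), and the breakpoints are assumed values chosen only for demonstration.

```python
# Minimal sketch: a linguistic quantity ("warm") represented as a trapezoidal
# fuzzy set over temperature in degrees Celsius. The breakpoints are
# illustrative assumptions, not values taken from this book.

def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def warm(temperature_celsius: float) -> float:
    """Degree (between 0 and 1) to which a temperature is perceived as warm."""
    return trapezoid(temperature_celsius, a=15.0, b=20.0, c=26.0, d=32.0)


if __name__ == "__main__":
    for t in (12, 18, 24, 30, 35):
        print(f"{t} °C -> warm to degree {warm(t):.2f}")
```

A statement such as "it is warm in here" can then be treated as a soft constraint over temperature rather than as a crisp value, which is the starting point of computing with words and perceptions as used throughout this book.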
1.2 Research Objectives

Phenotropic interaction being a novel and very vast field of study, the main research objective of this effort lies in its precise definition in the general use-case of exchanges between people and artificial systems. This includes identifying the existing knowledge and theories that can contribute to the successful development of interfaces that are natural, adaptive, and able to understand and correctly process human conversations. Additionally, to provide a more profound understanding and analysis of some aspects of the proposed framework for phenotropic interaction, the focus of the research presented in this book is then set on the development of new methods for handling adaptivity, approximation, and robustness through automated semantic understanding and reasoning, especially on perceptions, which are fundamental for human reasoning and communication. This is followed by the implementation and analysis of phenotropic interaction in practical use-cases from different contexts to better understand and verify the developed principles and models. In the definition of concepts and practical implementations, the following research questions are to be answered:

RQ1. What Are the Main Features of Phenotropic Interaction, and How Can They Be Modeled? To address this question, the concept of phenotropics and existing applications have to be analyzed to extract the properties that best apply to improving the limitations of traditional interaction between natural and artificial systems. This information can be used to lay out a model of the different elements of phenotropic interaction, as presented in Chap. 2.

RQ2. Which Methods and Theories Are Suitable for the Implementation of Phenotropic Interaction? Given the cognitive and perceptual nature of the problem of understanding human communication, cognitive computing and perceptual computing methods can be used for the successful implementation of phenotropic interfaces. Based on the criteria defined in the phenotropic interaction design principles and the phenotropic interaction framework, a set of suitable methods from cognitive computing and perceptual computing should be selected for the handling of the identified problems, allowing interfaces to better adapt to user needs and handle imprecision. General theories and their corresponding impact on the development of phenotropic interaction are introduced in Chap. 3.
RQ3. How Can the Semantic Similarity Between Perceptions Expressed as Words Be Computed in a Meaningful, Human-Like Way? As a more specific component of phenotropic interaction, artificial agents must be able to automatically understand the meaning of perceptions expressed by people through natural language. An important artifact for the processing of semantics consists of semantic similarity measures. A study of existing and newly developed measures should be performed, and a measure suitable for a human-like understanding of perceptions selected. Different approaches can be evaluated through a prototype implementation and a user experiment, as presented in Chap. 4.

RQ4. What Methods Are Suitable for the Automated Understanding and Representation of the Semantics of Perceptions? Semantic similarity measures alone are insufficient for a complete understanding and representation of the semantics of perceptions that allows their advanced handling (e.g., analogical reasoning); methods for the adequate automated understanding of the meaning of perceptions therefore need to be researched. These should provide, among others, a representation of semantics that can be used for the computation of related properties and for processing the extracted meaning to improve the adaptivity of the interface. The automation of the Computing with Words (CWW) pipeline is contemplated as an answer to this question, which can be evaluated against human judgment in user studies, as presented in Chap. 5.

RQ5. How Can Automated Reasoning Theories Be Implemented in Practice to Empower the Reasoning Process in a Phenotropic Interface? The ability of artificial systems to reason, up to a certain degree, with the information retrieved from a conversation is fundamental for the continuous improvement and adaptation to user needs in phenotropic interaction. Although many automated reasoning theories exist, few of them are effectively implemented in usable software that is generic enough to reason in a context that is not domain-specific, such as a free conversation. Therefore, an analysis of existing automated reasoning theories should be performed, followed by the implementation of a prototype applying a reasoning scheme to the elements of a conversation. To evaluate this, one can rely on a prototype implementation and user studies to verify the ability to reason in a sound way using the chosen methodology, as presented in Chap. 6.

RQ6. How Can Phenotropic Interaction Principles Be Implemented in Use-Cases from the Real World? Do These Provide an Improvement Over Traditional Solutions? These features should be implemented in prototypes covering different use-cases to showcase and analyze the potential advantages of phenotropic interaction principles not only from a theoretical perspective but also in practice. This way, it is easier to comprehend the practical components necessary for the effective implementation of phenotropic interfaces and to compare the use of this new paradigm to more traditional solutions not centered on adaptivity to the user in the conversation. Practical applications of phenotropic interaction are presented and evaluated with use-cases from HCI (Chap. 7) and SC (Chap. 8).
1.3 Methodology

The research presented in this book mainly follows the methodological guidelines of design science research for information systems [16], combined with an antidisciplinary approach [18]. Both are presented in more detail in the following sections.
1.3.1 Design Science Research

The design science research paradigm represents a structured way of solving problems (see Fig. 1.1). It aims to enhance knowledge and practice through the design, implementation, and study of usable innovative artifacts. Artifacts are intended as any item that can be used directly to solve a practical problem, including algorithms, prototypes, interfaces, frameworks, and design principles. The development process and the evaluation of these artifacts provide insights into why they represent an improvement for the studied application context, contributing to the extension and understanding of the underlying theories.

Design science research consists of an iterative process, where a fundamental problem is approached with a tentative solution in the form of an artifact built on the objectives identified as necessary for solving the problem. The artifact is then employed as a basis for evaluating the proposed solution. Insights from the different phases of the design science research process are used for communicating the discoveries regarding the artifact and the underlying theories, as well as for identifying the limitations of the proposed solution. Moreover, they can reveal new problems emerging from the found solution, which leads to the start of a new iteration in the design research cycle.

Fig. 1.1 The design science research framework (Adapted from [15])

In this research effort, the design science research approach was chosen to answer most of the research questions because it is a methodology that yields both practical and theoretical results relatively quickly. Indeed, since phenotropic interaction is a new concept, it is fundamental to set a solid theoretical basis on which subsequent developments in this direction can rely, but at the same time it is essential to implement some practical solutions that both showcase the concepts and serve as a basis for evaluating them in their early development phase. This can be achieved with the artifact-centered design science research approach and its short development–evaluation loop.

The general problem to be solved in the presented research is improving the adaptivity of artificial interfaces to humans. A suggestion to solve this problem comes from adapting the concepts of phenotropic software to the studied field. This approach is developed as a set of design principles and a framework (representing the artifacts for the definition of phenotropic interaction), presented in Chap. 2 and built on existing cognitive and perceptual computing theories (Chap. 3), which are then evaluated through practical implementations of the design principles (Chaps. 7 and 8). Finally, results are communicated through peer-reviewed scientific publications and, in more detail, in this book, which provides a comprehensive overview of the design and evaluation of the phenotropic interaction artifact.

In parallel to the overall design science research process for phenotropic interaction, other minor cycles are employed for the solution of more specific problems, such as the automated understanding of and reasoning with perceptions and the practical implementation of phenotropic interaction principles in the SC context. Each of Chaps. 4–8 describes an instance of the design science research process applied to a sub-problem of phenotropic interaction, where algorithms and prototypes are developed and assessed through user-based evaluation.
1.3.2 Toward Antidisciplinary Research One of the problems identified in traditional academic research is the importance attributed to publications in top-tier peer-reviewed journals [31]. This is well described by the commonly used expression that faculty must “publish or perish.” This race to publish pushes researchers to focus their efforts on providing incremental improvements to existing theories and methods and on proving the value of their research to a small number of experts in their field, rather than taking higher-risk, unconventional approaches to more general problems, which are often more relevant for society [18]. This divides academia into disciplinary silos that become increasingly narrow in focus and have difficulty connecting with one another. However, most real-world problems either concern the interplay between different disciplines or do not fit within any of them. From this observation originated the ideas of interdisciplinary, transdisciplinary, and antidisciplinary research. Interdisciplinary research refers to the integration by teams or individuals of “information, data, techniques, tools, perspectives, concepts and/or theories from two or more disciplines or bodies of specialized knowledge” [23] to find solutions to problems that are out of the scope of a single academic discipline. In other words, it consists of tackling a problem by combining different points of view.
In addition to the properties of interdisciplinary research, transdisciplinary research focuses on the relevance of the problem to be solved. Indeed, it is defined as the interaction between “members of different cultures [...] to co-produce knowledge” [30]. This is mainly intended as the collaboration between the scientific community and society (e.g., citizens, companies), where societal and scientific problems are aligned, and solutions are sought on a goal-oriented basis through co-production [21]. This can finally be used to solve societal problems and improve scientific practice simultaneously. Antidisciplinary research takes a slightly different approach, in which it is not necessarily essential to impact an academic discipline. Instead, the focus is on finding relevant problems in any space, spanning different disciplines or lying between them, and searching for a concrete solution, not necessarily employing methods from a predefined research field. The aim is thus not to provide a publishable but insignificant increment to a discipline, but to solve a relevant problem, employing any suitable method [18]. The expression “deploy or die” summarizes the idea of antidisciplinary research [17] and emphasizes the practice-oriented nature of this approach, which aligns it closely with the principles of design science research, where artifacts are central to the advance of research toward the resolution of problems relevant to society. The research presented in this book follows the antidisciplinary approach: research topics are selected to solve problems identified as fundamental for the effective development of phenotropic interaction to improve the collaboration between people and complex technological systems. Like many HCI-related studies, these problems involve aspects from several disciplines, including mathematics, computer science, linguistics, and psychology, but also aspects that are not studied in any traditional academic discipline.
1.4 Outline This book, besides the introductory and conclusive chapters, is composed of three main parts, divided into chapters presenting more specific topics. In Part II, the concepts of phenotropic interaction, central to the whole book, are introduced and structured. Next, the building blocks of phenotropic interaction are presented in Part III, in the form of general theories fundamental for the implementation of this interaction paradigm in any context, as well as more specific developments toward the modeling of, understanding of, and reasoning with human perceptions. Finally, Part IV describes the implementation and evaluation of phenotropic interaction, employing the building blocks from Part III, in two different contexts: a human–computer interface with a virtual assistant, and the interaction between citizens and a SC.
Part II: Theory of Naturalness In this first part, a theory for the implementation of more natural interaction is introduced, including the general techniques that would allow this paradigm to be applied concretely to intelligent interfaces between technology and people:
• Chapter 2—Phenotropic Interaction. This chapter builds on the concept of phenotropics to define the design principles of an interaction paradigm that is more natural and adaptive for humans.
• Chapter 3—Cognitive and Perceptual Computing. The main theories on which phenotropic interaction relies for the provision of adaptivity to people’s communication are based on the adequate understanding of human conversation. This requires the practical understanding of and reasoning with language and human perceptions. This chapter presents the main families of methods for achieving this goal.
Part III: Natural Language Conversations The focus of this part lies in the development of new techniques for the understanding of and reasoning with perceptions expressed in the form of natural language, which are fundamental for the practical implementation of the phenotropics design principles in the context of natural language-based interaction between systems:
• Chapter 4—Semantic Similarity Measures. For effective handling of semantics in conversation, an overview of semantic similarity measures is presented in this chapter. Additionally, since there is a lack of similarity measures able to correctly treat the semantics of human perceptions expressed as scalar adjectives and adverbs, a new method to solve this problem is presented and evaluated.
• Chapter 5—Automatic Precisiation of Meaning. In this chapter, a method from CWW to provide a higher level of understanding of words is introduced and extended with the ability to infer semantics automatically. Two iterations of the design science research process are presented on an algorithm for solving the problem of automatic precisiation of meaning.
• Chapter 6—Fuzzy Analogical Reasoning. For a better understanding of the implications of information expressed by humans in the context of a conversation, artificial systems need to perform automated reasoning. This chapter introduces some theories for automated reasoning and presents a prototype demonstrating the ability of computers to perform automated approximate reasoning to a certain degree.
Part IV: Applications of Phenotropic Interaction Practical applications of phenotropics to use-cases in the natural language-based interaction between users and virtual assistants, as well as between citizens and smart cities, are analyzed in this part. Here, the concrete measures for the application of the phenotropic interaction design principles to different contexts are presented, and their impact on the interaction is studied:
• Chapter 7—Phenotropic Interaction in Virtual Assistants. This chapter presents the introduction of the design principles of phenotropic interaction into
the design of a prototype of a flexible virtual assistant able to handle custom If This Then That (IFTTT) rules in a more natural way. Additionally, the details of the implementation, including the use of the building blocks presented in Part III, and an evaluation of different aspects of the prototype and of phenotropic interaction are presented in this chapter.
• Chapter 8—Phenotropic Interaction in Smart Cities. The benefits of applying phenotropic interaction principles to the development of projects in the context of HSCs are shown in this chapter through the presentation and analysis of two concrete projects implementing them.
Part V: Conclusions Concluding remarks, including a general discussion of the results and future research directions, are presented in this concluding part:
• Chapter 9—Outlook and Conclusions. This chapter summarizes the content of the book and the answers to the research questions and provides concluding remarks and ideas for future developments of phenotropic interaction.
1.5 Own Research Contribution This section presents a list of the publications that the author contributed to within the scope of this work:
• Colombo, M., & Portmann, E. (2020) Semantic Similarity Between Adjectives and Adverbs—The Introduction of a New Measure [7]. This study analyzes existing semantic similarity measures and their inability to effectively treat scalar adjectives and adverbs, mostly used to describe perceptions, in a human-like way. A new semantic similarity measure based on the overlaps of second-order synonyms of words is introduced and evaluated in a user study demonstrating its markedly higher accuracy compared to state-of-the-art methods. This research provides the basis for automated perception understanding and reasoning, fundamental for developing phenotropic interfaces. This research item is presented in detail in Chap. 4.
• Spring, T., Ajro, D., Pincay, J., Colombo, M., & Portmann, E. (2020) Jingle Jungle Maps—Capturing Urban Sounds and Emotions in Maps [32]. This research item presents a methodology for the creation of a city’s soundscape (e.g., for the analysis of noise pollution), based on the analysis of human perceptions expressed with natural language using CWW techniques on geotagged social media posts. In addition, this use-case describes a practical implementation of the phenotropic interaction design principles in the context of SCs, as presented in Chap. 8. The author’s role in this project was to provide ideas and guidance regarding the use of fuzzy systems and CWW techniques in the development of the proposed solution, as well as to design and execute the evaluation of the developed prototype.
• Colombo, M., Hurle, S., Portmann, E., & Schäfer, E. (2020) A Framework for a Crowdsourced Creation of Smart City Wheels [5]. This work introduces a framework that allows estimating the smartness of a city based on feedback from various stakeholders by combining crisp measures, rough estimation, and perceptions of people toward different aspects of the city and their impact on the city ecosystem. The crowdsourced scoring of the city of Basel, Switzerland, is presented as a proof of concept. This demonstrates how citizens’ perceptions can be employed for a more targeted development of SC solutions and was used as a basis for the development of the works presented in Chap. 8.
• Colombo, M., & Portmann, E. (2020) An Algorithm for the Automatic Precisiation of the Meaning of Adjectives [6]. This research builds on the previously developed semantic similarity measure to empower the automatic precisiation of the meaning of scalar adjectives and adverbs, allowing the practical implementation of the CWW pipeline in use-cases without a strictly defined and restricted dictionary. The proposed algorithm is evaluated in a user study that demonstrates its performance in the automatic assessment and representation of the meaning of words describing human perceptions. This article constitutes one of the practical bases on which implementations of phenotropic interaction build, as it allows artificial systems to understand the meaning of words, as presented in Chap. 5.
• Colombo, M., D’Onofrio, S., & Portmann, E. (2020) Integration of Fuzzy Logic in Analogical Reasoning: A Prototype [4]. In this article, the necessary operations for the development of an automated reasoning strategy in an imprecise context (e.g., reasoning with perceptions) are studied and implemented in a prototype. This aims to provide the first implementation of fuzzy analogical reasoning and evaluate it with users. The fuzzy analogical reasoning implementation represents one of the main building blocks for the practical realization of automatic understanding and adaptation to user needs in phenotropic interaction, as presented in Chap. 6.
• Colombo, M., Pincay, J., Lavrovsky, O., Iseli, L., Van Wezemael, J. E., & Portmann, E. (2021) Streetwise: Mapping Citizens’ Perceived Spatial Qualities [9]. This research presents a practical use-case of phenotropic interaction in the context of citizens’ perception of urban spatial quality. It details the methodology followed for estimating citizens’ perception of the atmosphere in different areas of Swiss cities based on street-level imagery. More precisely, the methods include the collection of citizens’ perceptions through crowdsourcing, their extension with a siamese convolutional neural network (CNN), the scoring of the perceived atmosphere via random pair voting, the aggregation of the data, and the visualization of the final results on a map. This research item is presented in more detail in Chap. 8 as a practical application of phenotropic interaction principles to the field of SCs.
• Andreasyan, N., Dorado Dorado, A. F., Colombo, M., Terán, L., Pincay, J., Nguyen, M. T., Portmann, E. (2021) Framework for Involving Citizens in Human Smart City Projects Using Collaborative Events [1]. This article introduces a framework to structure and facilitate collaboration between citizens and the public administration to solve problems linked to the city ecosystem, based on
the concept of co-production, implemented through collaborative events and crowdsourcing campaigns. This creates a structured way of providing contact points between citizens and city administration to improve the HSC ecosystem, which was used as a basis for the development of the citizen-centric SC projects presented in Chap. 8. The contribution of this book’s author to this project consisted of participating in the selection of the fundamental elements of the framework, as well as in refining the connections between them.
• Colombo, M., Pincay, J., Lavrovsky, O., Iseli, L., Van Wezemael, J. E., & Portmann, E. (2022) A Methodology for Mapping Perceived Spatial Qualities [10]. This publication is an extended version of [9], where the employed methodology for estimating the perceived spatial quality of a city is improved and presented as a universal approach for the analysis of different types of spatial quality. Furthermore, the methodology is verified by applying it to estimating two different perceptual dimensions: the perceived atmosphere and the perceived safety. This constitutes a methodology to apply phenotropic interaction principles to the analysis of perceptions of the urban space and is presented in more detail in Chap. 8.
Additionally, during the development of the research projects presented in this book, the author participated in other research projects not directly related to the content of this book, but which had an essential role in learning new skills fundamental for further practical and research-related activities. These include:
• The SPICE: Scaling smart city Projects—from Individual pilots toward a Common strategy of industry Emergence (SNF NRP 77 “Digital Transformation”1 ) project,2 to study solutions for the implementation and scaling of citizen-centered SC projects and to create a set of guidelines simplifying this task for all the stakeholders in SCs [1]. In this project, the role of the author of the book was to study and structure methods for the digital participation of citizens in the different stages of SC projects.
• The LUCIDELES: Leveraging user-centric intelligent daylight and electric lighting for energy savings3 project (SFOE IEA Task 61 “Integrated Solutions for Daylighting and Electric Lighting”), where efficient and human-centered solutions to automated lighting as a combination of daylight and artificial light are sought in the context of office environments [26, 27], through the use of sunlight modeling and predictive lighting models based on machine learning techniques [2]. The author’s role was to design and implement the technical infrastructure and the control strategies for the optimal management of lighting to provide a good balance between adequate user comfort and reduced energy consumption. Additionally, the author participated in the design and execution of various user experiments testing the automatic lighting system.
1 https://snf.ch/en/hRMuYd5Qqjpl1goQ/page/researchinFocus/nrp/nrp77. 2 https://spice.human-ist.ch. 3 https://smartlivinglab.ch/en/projects/lucideles.
• Participation in the first Mindfire4 mission, where an antidisciplinary approach to the artificial general intelligence problem was studied, together with its technical, ethical, and legal implications [8]. The author’s role was to approach the artificial general intelligence problem from an interaction science and perceptual computing perspective, in discussion with a transdisciplinary team of researchers and practitioners from several other disciplines (e.g., medicine, law, ethics).
References 1. Andreasyan, N., et al. (2021). Framework for involving citizens in human smart city projects using collaborative events. In Eighth International Conference on eDemocracy eGovernment (ICEDEG) (pp. 103–109). https://doi.org/10.1109/ICEDEG52154.2021.9530860 2. Basurto, C., et al. (2021). Implementation of machine learning techniques for the quasi realtime blind and electric lighting optimization in a controlled experimental facility. Journal of Physics: Conference Series 2042(1), 012112. https://doi.org/10.1088/1742-6596/2042/1/ 012112 3. Brusilovsky, P. (1998). Methods and techniques of adaptive hypermedia (pp. 1–43). Springer (1998). https://doi.org/10.1007/978-94-017-0617-9_1 4. Colombo, M., D’Onofrio, S., & Portmann, E. (2020). Integration of fuzzy logic in analogical reasoning: A prototype. In IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP) (pp. 5–11). https://doi.org/10.1109/ICCP51029.2020. 9266156 5. Colombo, M., Hurle, S., Portmann, E., & Schäfer, E. (2020). A framework for a crowdsourced creation of smart city wheels. In Seventh International Conference on eDemocracy eGovernment (ICEDEG) (pp. 305–308). https://doi.org/10.1109/ICEDEG48599.2020.9096754 6. Colombo, M., & Portmann, E. (2020). An algorithm for the automatic precisiation of the meaning of adjectives. In Joint 11th International Conference on Soft Computing and Intelligent Systems and 21st International Symposium on Advanced Intelligent Systems (SCISISIS) (pp. 1–6). https://doi.org/10.1109/SCISISIS50064.2020.9322674 7. Colombo, M., & Portmann, E. (2021). Semantic similarity between adjectives and adverbs— the introduction of a new measure. In V. Kreinovich, & N. Hoang Phuong (Eds.), Soft Computing for Biomedical Applications and Related Topics (pp. 103–116). Springer. http:// doi.org/10.1007/978-3-030-49536-7_10 8. Colombo, M., Portmann, E., & Kaufmann, P. (2020). Artificial intelligence—the Mindfire foundation and other initiatives. In E. Portmann, & S. D’Onofrio (Eds.), Cognitive computing (pp. 67–85). Springer (2020). https://doi.org/10.1007/978-3-658-27941-7_3 9. Colombo, M., et al. (2021). Streetwise: Mapping citizens’ perceived spatial qualities. In Proceedings of the 23rd International Conference on Enterprise Information Systems—Volume 1: ICEIS (pp. 810–818). INSTICC, SciTePress. https://doi.org/10.5220/0010532208100818 10. Colombo, M., et al. (2022). A methodology for mapping perceived spatial qualities. In ´ J. Filipe, M. Smiałek, A. Brodsky, & S. Hammoudi (Eds.), Enterprise Information Systems, Lecture Notes in Business Information Processing. Springer. https://doi.org/10.1007/978-3031-08965-7_10 11. Desmet, M. (2022). The psychology of totalitarianism. Chelsea Green Publishing.
4 https://mindfire.global.
12. D’Amico, G., Bimbo, A. D., Dini, F., Landucci, L., & Torpei, N. (2010). Natural human— computer interaction. In Multimedia interaction and intelligent user interfaces (pp. 85–106). Springer (2010). https://doi.org/10.1007/978-1-84996-507-1_4 13. Gullà, F., Cavalieri, L., Ceccacci, S., Germani, M., & Bevilacqua, R. (2015). Method to design adaptable and adaptive user interfaces. In HCI International 2015—Posters’ Extended Abstracts (pp. 19–24). Springer. https://doi.org/10.1007/978-3-319-21380-4_4 14. Gullà, F., Ceccacci, S., Germani, M., & Cavalieri, L. (2015). Design adaptable and adaptive user interfaces: A method to manage the information (pp. 47–58). Springer. https://doi.org/10. 1007/978-3-319-18374-9_5 15. Hevner, A., & Chatterjee, S. (2010). Design science research frameworks. In Design research in information systems (pp. 23–31). Springer. https://doi.org/10.1007/978-1-4419-5653-8_3 16. Hevner, A., March, S. T., Park, J., Ram, S., et al. (2004). Design science research in information systems. MIS Quarterly, 28(1), 75–105. https://doi.org/10.2307/25148625 17. Ito, J. (2014). Antidisciplinary. https://doi.org/10.31859/20141002.1939. Visited on June 2022 18. Ito, J. (2016). Design and science. Journal of Design and Science, 1. https://doi.org/10.21428/ f4c68887 19. Jolly, S. (2000). Understanding body language: Birdwhistell’s theory of kinesics. Corporate Communications: An International Journal, 5, 133–139. https://doi.org/10.1108/ 13563280010377518 20. Kirlik, A., Miller, R., & Jagacinski, R. (1993). Supervisory control in a dynamic and uncertain environment: A process model of skilled human-environment interaction. IEEE Transactions on Systems, Man, and Cybernetics, 23(4), 929–952. https://doi.org/10.1109/21.247880 21. Lang, D. J., et al. (2012). Transdisciplinary research in sustainability science: Practice, principles, and challenges. Sustainability Science, 7(1), 25–43. https://doi.org/10.1007/s11625-0110149-x 22. Lanier, J. (2003). Why Gordian software has convinced me to believe in the reality of cats and apples. https://www.edge.org. Visited on Apr. 2022 23. National Academy of Sciences, National Academy of Engineering, Institute of Medicine. (2005). Facilitating interdisciplinary research. The National Academies Press. https://doi.org/ 10.17226/11153 24. Nembrini, J., & Lalanne, D. (2017). Human-building interaction: When the machine becomes a building. In R. Bernhaupt, G. Dalvi, A. Joshi, D.K. Balkrishan, J. O’Neill, & M. Winckler (Eds.), 16th IFIP Conference on Human-Computer Interaction (INTERACT), HumanComputer Interaction, vol. LNCS-10514 Part II (pp. 348–369). Springer. https://doi.org/10. 1007/978-3-319-67684-5_21 25. Oliveira, Á., & Campolargo, M. (2015). From smart cities to human smart cities. In 48th Hawaii International Conference on System Sciences (pp. 2336–2344). IEEE. https://doi.org/ 10.1109/HICSS.2015.281 26. Papinutto, M., et al. (2021). Towards the integration of personal task-lighting in an optimised balance between electric lighting and daylighting: A user-centred study of emotion, visual comfort, interaction and form-factor of task lights. In Journal of Physics: Conference Series (Vol. 2042, p. 012115). IOP Publishing. https://doi.org/10.1088/1742-6596/2042/1/012115 27. Papinutto, M., et al. (2022). Saving energy by maximising daylight and minimising the impact on occupants: An automatic lighting system approach. Energy and Buildings, 268, 112176. https://doi.org/10.1016/j.enbuild.2022.112176 28. Pask, G. (1975). 
Conversation, cognition and learning. Elsevier. 29. Perez, C. C. (2019). Invisible women: Data bias in a world designed for men. Abrams. https:// doi.org/10.1111/1475-4932.12620 30. Pohl, C. (2008). From science to policy through transdisciplinary research. Environmental Science & Policy, 11(1), 46–53. https://doi.org/10.1016/j.envsci.2007.06.001 31. Rawat, S., & Meena, S. (2014). Publish or perish: Where are we heading? Journal of Research in Medical Sciences: the Official Journal of Isfahan University of Medical Sciences, 19(2), 87.
32. Spring, T., Ajro, D., Pincay, J., Colombo, M., & Portmann, E. (2020). Jingle jungle maps— capturing urban sounds and emotions in maps. In Seventh International Conference on eDemocracy eGovernment (ICEDEG) (pp. 36–42). https://doi.org/10.1109/ICEDEG48599. 2020.9096770 33. Strubell, E., Ganesh, A., McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3645–3650). 34. Tanenbaum, A. S. (2003). Computer networks. Pearson Education India. 35. Trillas, E., Termini, S., Tabacchi, M. E., & Seising, R. (2015). Fuzziness, cognition and cybernetics: an outlook on future. In 2015 Conference of the International Fuzzy Systems Association and the European Society for Fuzzy Logic and Technology (IFSA-EUSFLAT-15) (pp. 1413–1418). Atlantis Press. https://doi.org/10.2991/ifsa-eusflat-15.2015.200 36. Völkel, S. T., Schneegass, C., Eiband, M., & Buschek, D. (2020). What is “intelligent” in intelligent user interfaces? A meta-analysis of 25 years of IUI. In Proceedings of the 25th International Conference on Intelligent User Interfaces, IUI ’20 (pp. 477–487). Association for Computing Machinery. https://doi.org/10.1145/3377325.3377500 37. Wallimann-Helmer, I., Terán, L., Portmann, E., Schübel, H., & Pincay, J. (2021). An integrated framework for ethical and sustainable digitalization. In 2021 Eighth International Conference on eDemocracy & eGovernment (ICEDEG) (pp. 156–162). IEEE. https://doi.org/10.1109/ ICEDEG52154.2021.9530972 38. Wania, C. E., Atwood, M. E., & McCain, K. W. (2006). How do design and evaluation interrelate in HCI research? In Proceedings of the 6th Conference on Designing Interactive Systems (pp. 90–98). https://doi.org/10.1145/1142405.1142421 39. Yen, G. G., & Acay, D. (2009). Adaptive user interfaces in complex supervisory tasks. ISA Transactions, 48(2), 196–205. https://doi.org/10.1016/j.isatra.2008.11.002 40. Yurovsky, D., Smith, L. B., & Yu, C. (2013). Statistical word learning at scale: The baby’s view is better. Developmental Science, 16(6), 959–966. https://doi.org/10.1111/desc.12036 41. Zadeh, L. A. (2004). Precisiated natural language (PNL). AI Magazine, 25(3), 74–74. https:// doi.org/10.1609/aimag.v25i3.1778
Part II
Theory of Naturalness
Chapter 2
Phenotropic Interaction
Traditional interactive systems are built mainly by employing user-centered design, which means that they are designed to satisfy a set of requirements tailored to the system’s average target user. This means that, for example, if the target users are primarily young people, the requirements are different than if they are mainly older adults. In general, the designed interface is complex, and sometimes even impossible, to use for people not belonging to the target group or having different needs (e.g., people with a disability). For instance, a virtual assistant designed for voice interaction can be challenging to use for people who cannot talk or who have a limited vocabulary. Of course, this is not the case for the average user, but many still have limitations in using such a system. Moreover, traditional interactive systems are primarily static, so they do not improve over time and generally do not adapt to the user once deployed. Optimization for the average target user is carried out during the interface design process; the obtained results are then implemented in the system without adapting the interaction to the needs of a specific person. The user-centered design process thus consists in defining a strict protocol for the interaction with the system that the user must follow. Although this protocol is designed around the needs and capabilities of the average user, it is still a strict set of rules that the actual user has to conform to and learn in order to be able to use the system. This is easily observed in the case of interaction with natural language, where often only a limited set of keywords is accepted or understood by the artificial system and the user has to conform to it, whereas in natural interaction between people, language is less restricted, in terms of both the words used and accents. Interactions between humans follow a different approach than those with artificial systems: at each exchange, the receiver of information iteratively adapts to the other actor’s way of communicating until they reach conversational alignment [10]. In other words, people do not agree on a strict communication
protocol before starting an interaction, but they iteratively adapt until they reach a good enough alignment that allows them to exchange the desired information. For instance, if person A starts talking with person B, who answers with hand gestures, A might understand that B cannot hear or talk and will thus try to communicate with gestures, even if A does not know any sign language. B will in turn adapt to the “language” used by A by trying to extract meaning from the gestures and by adapting their own gestures to those used by A. This type of interaction results in an imprecise or incomplete exchange of information. Still, it allows an exchange to happen, whereas having to follow a strict protocol would have meant either that B would not understand anything, as they could not hear the voice of the interlocutor, or that A would not understand, as they did not know the protocol (i.e., sign language) used by B. In addition to allowing people with disabilities to successfully interact with technology, adaptivity can also improve the experience of other categories of people who do not correspond to the average user. Indeed, the notion of the average user often refers to Caucasian men with a certain level of physical ability and education, which has contributed to an industry developing products that discriminate against certain categories of users, including women [24]. This chapter defines a set of design principles based on observations from interactions between humans and other natural entities for applying a similar natural interaction paradigm to interfaces between humans and artificial systems. These are built on the concept of phenotropics, from which the name phenotropic interaction is derived. This new paradigm could improve the ability of artificial agents to understand and adapt to the needs of people, thereby enabling humans and artificial systems to collaborate in a more natural way [5]. According to media naturalness theory [14, 15], increasing the naturalness of the interaction between people and machines can significantly improve the interaction itself, by reducing the required cognitive effort and the ambiguity of the communication. Thus, the added naturalness that a phenotropic interface aims to achieve could contribute to the improvement of traditional interaction paradigms. Phenotropics and their origin in the context of software engineering are introduced in Sect. 2.1, according to the observations of Lanier [16] and Basman [2]. The observations and design principles needed to apply phenotropics to the interaction between systems of diverse nature are then devised in Sect. 2.2, followed by a framework for the practical development of phenotropic interaction in Sect. 2.3, providing its structure and the main building blocks in the form of theories and models for the effective implementation of the defined design principles. Finally, a summary and a critical outlook conclude the chapter in Sect. 2.4.
2.1 Phenotropics The term phenotropics, a combination of the prefix pheno- (referring to appearance) and the suffix -tropics (referring to changes, but also used for interaction), means “the interaction of surfaces,” as defined by the virtual reality pioneer Jaron Lanier [16]. This word was coined in the context of a reflection on the limitations of traditional software, and a solution for these was elaborated with the concept of phenotropic software. Similar criticisms and ambitions regarding traditional software were raised independently by Basman [2]. Traditional software engineering is directly derived from the structure of the underlying hardware. For example, modern computers consist of wires sequentially transmitting electrical signals, so that information is encoded in a series of electrical signals over time. Likewise, software is based on protocols that depend on sequences of bits, each of which may or may not be related to the previous or following one. These protocols indicate a strict way for different software modules to communicate with one another. However, the use of protocols poses several problems for the robustness and scalability of software, as observed in [2, 16]. Indeed, a bit set to the wrong value (e.g., because of a programming or input error) has a significant impact on the outcome of a particular exchange, and it could also completely break an interaction between two pieces of software. This is referred to by Lanier [16] as the brittleness of software built on strict protocols, which tends to “break before it bends” [16]. Moreover, the use of strict protocols limits the possibility of information exchange. Take, for example, two teams working independently on two modules that need to communicate and control one another. The standard way to exchange information between them consists in creating an application programming interface (API) for each module, allowing a minimal part of its state variables to be read and written, exchanging the specifications of this API with the other team, and using the API to monitor and control the module. However, this has the disadvantage that only predefined parts of the modules can be controlled, and any new information needed from a module would require its development team to develop a new API to access it. The potential solution proposed by Lanier [16] to these issues of protocol-based exchange of information consists in taking inspiration from the interaction between biological systems, where interaction happens more in a surface-based fashion (e.g., in the cornea), instead of using protocols on sequences of information bits. The idea is to convey information in a surface-inspired manner, i.e., through a structure with many bits where the meaning of each bit partially depends on nearby elements. The interface between two systems should expose their states as much as possible instead of providing a few single points of contact that limit the knowledge about the system (e.g., as in an API) [2]. This allows interaction by measurement, where any feature of interest to the observer module can be probed. It must be noted that what is meant by the term surface here is not necessarily a surface in the mathematical sense but simply a structure to which pattern recognition
can be applied [17]. This includes the interface between user and computer, where, for instance, a search engine can be made able to react to noisy searches, even if these are not represented in the form of a two-dimensional surface. For instance, if the query “Miguel Phelps” is submitted, the search engine recognizes that the user is more likely to be interested in the well-known former swimmer Michael Phelps, so it returns output in the form “Did you mean Michael Phelps?”. General features can be extracted from such a structure with pattern recognition instead of relying on strict rules that must be exchanged before the start of the interaction. Moreover, the use of pattern recognition allows features to be identified even if there is an error in a bit, as pattern recognition assumes a constant presence of minor noise in the analyzed data and tries to identify the underlying patterns regardless of the noise. This approach does not provide perfect information exchange but a fuzzy-statistical solution to the problem, where imprecise information is retrieved with a certain probability. This is less precise than a protocol-based exchange but, at the same time, allows imprecisely communicated data, which would break a protocol-based interaction, to be interpreted with sufficiently high accuracy. For instance, in a situation where a smart city API is deployed in many different cities, a small error like a bit change in an underlying system might cause the collapse of the whole smart city infrastructure, with important consequences for a considerable number of people. The similarity with human interaction can be observed, for instance, in the case where someone mispronounces a word. With the phenotropic approach, the information receiver would adapt and understand with a certain confidence that the other person has mispronounced that word, whereas, with the traditional protocol-based approach, the receiver would simply fail to understand the information communicated with the mispronounced word. So, phenotropic software applies the concept of “bending before breaking,” which provides flexibility and robustness to an interaction. As both a consequence of and an opportunity arising from the imprecision intrinsic to phenotropic software, evolutionary self-improvement can be obtained by coupling it with feedback loops that improve its accuracy and reliability over time. This is possible in phenotropics because each system tries to predict, up to a certain degree, the input it will receive from the other involved actor and can then adjust the prediction based on its feedback (i.e., the actual input). Finally, outside of the predefined contact points and under the surface, traditional software is just an “incomprehensible world of blinking lights” [17], which indicates that a certain degree of explainability or interpretability [1] is needed in any system to improve the potential for other systems to understand it on a deeper level, allowing the improvement of the exchanges between them. Although only a few exist, some applications of phenotropics can be found in the fields of sensor networks, be it the Internet or the Web of Things [9], and software design. In particular, research in the extension of sensor networks has focused on the recognition of items and patterns to increase the flexibility of the network, identifying new sensors or actuators (including their properties) based
on analog features perceived by other sensors, such as their appearance [25] or their energy consumption patterns [11, 12], instead of relying on the communication of their nature by means of a protocol. A more literal application of phenotropics, as proposed by Lanier, is observed in the design of software, where research has focused on the formalization of phenotropic software and on the possible design implications and technological advances necessary to implement it [7, 8]. Beyond the existing applications in sensor networks and software design, phenotropics are believed to have the potential to significantly improve the interactions between humans and computers, as there is a constant need to adapt to and understand the interaction patterns of people, who interact with systems with a certain amount of noise, which can be, for example, a manifestation of subjectivity (e.g., the use of different words from user to user) or imprecision (e.g., touching different parts of a button to click it). In general, it also seems reasonable to implement bio-inspired interaction understanding techniques in an artificial system that handles interaction with a biological system, as this would theoretically allow the interaction to be better adapted to the needs of the biological element in the exchange (i.e., the human in this case). Phenotropic interaction represents the application of phenotropics to HCI.
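The noisy-query example above can be made concrete with a small sketch. The following snippet is not taken from the book; it is a minimal illustration, using Python’s standard difflib, of how a tolerant, pattern-based match can replace a strict, protocol-like lookup and “bend before breaking” on a misspelled query. The vocabulary and the similarity cutoff are illustrative assumptions.

```python
import difflib

# Illustrative vocabulary of "known" entities (an assumption for this sketch).
KNOWN_ENTITIES = ["Michael Phelps", "Michael Jordan", "Roger Federer"]

def interpret(query: str) -> str:
    """Return an exact match if possible, otherwise suggest the closest known entity."""
    if query in KNOWN_ENTITIES:
        return query  # strict, protocol-like match succeeds
    # Tolerant, pattern-based match: accept noisy input above a similarity cutoff.
    close = difflib.get_close_matches(query, KNOWN_ENTITIES, n=1, cutoff=0.6)
    if close:
        return f"Did you mean {close[0]}?"
    return "No interpretation found"

print(interpret("Miguel Phelps"))  # -> Did you mean Michael Phelps?
```

A strict protocol would reject the misspelled query outright; the pattern-based fallback instead retrieves an imprecise but useful interpretation, mirroring the fuzzy-statistical behavior described above.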
2.2 Design Principles of Phenotropic Interaction Based on the theory proposed by Lanier for the development of phenotropic software, combined with an analysis of the weaknesses identified in [2] and of the applications of phenotropics to various fields, a set of design principles is defined for the implementation of phenotropic interaction. These encompass the properties that an interface providing a more natural interaction, centered around the peculiarities of each user, should possess. An interface providing an interaction respecting the phenotropic principles should be:
• Not using crisp protocols
• Approximation safe
• Robust by design
• Improving over time
• Multimodal
In the following, the reasons for and implications of each of these points are discussed in more detail. Not Using Crisp Protocols Similarly to what has been observed in software engineering, the use of rigid protocols is a possible culprit for the failure of communication between two entities. Indeed, any minor violation of the protocol completely breaks the exchange of information. In traditional HCI, the protocol is unidirectional, in the sense that
the machine provides a set of rules or commands to be strictly followed, and the human party in the interaction has to adapt to those. Humans usually do not communicate following strict protocols, so they easily become frustrated when a slight deviation from what the system expects causes the interaction to fail [4]. For this reason, the principal focus of phenotropic interaction should lie in developing an interactive system that is flexible and adapts to how humans want to communicate with it, rather than providing a predefined protocol that users must follow. This makes the interaction more accessible and lighter for humans, who can focus their energy on the task they want to perform instead of adapting to the machine’s language. Moreover, this potentially allows reducing the number of interaction failures, which can improve user satisfaction and productivity [13]. The proposed approach relies on understanding, at least to a certain degree, the semantics of the objects of interaction. It can be inspired by how people communicate with one another: they receive feedback and adapt their way of communicating accordingly, in a process similar to the cybernetics loop [22, 27]. Approximation Safe Since human perceptions are approximate due to the very imprecise nature of the senses, people handle perceptions and communication in an approximate way [29]. For example, someone might complain that it is very hot in summer rather than say that they are uncomfortable because it is 34.56 °C. To correctly handle the meaning expressed by human users, a phenotropic system thus has to understand approximate perceptions. Despite this being easy for humans to do, as everybody perceives the world in an approximate way and constantly treats approximate information granules, it can be more difficult for digital computers built for exact computation. Nevertheless, treating approximate and imprecise perceptions in computers can be achieved with the frameworks of fuzzy logic [29], CWW [20], and granular computing [23, 28]. Robust by Design The use of pattern recognition allows errors, noise, and imprecision to be treated thanks to the understanding of the overall semantics and patterns existing in the exchanged data. Patterns and meaning can also be identified in noisy data and can then be used as a basis for reasoning, combined with existing knowledge of the world, to put them in context and extract meaningful information from the communication. For example, in the case of an interaction based on voice, a word said with the stress on the wrong syllable should not compromise the communication, similar to what happens in exchanges between humans. The same should hold when, for instance, a word with the wrong meaning is used but its intended meaning can be extrapolated from the context, or when synonyms and words with a meaning similar to those “known” by the receiver of the information are used.
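As a minimal illustration of the “approximation safe” principle described above, the sketch below (not part of the original text; the breakpoints are arbitrary assumptions) shows how a fuzzy set can map a crisp temperature such as 34.56 °C to a degree of membership in the perception “hot weather,” instead of imposing a sharp true/false threshold.

```python
def hot_membership(temp_c: float) -> float:
    """Degree to which a temperature belongs to the fuzzy set 'hot weather'.

    Assumed (illustrative) breakpoints: not hot at all below 25 °C,
    fully hot above 32 °C, with a linear transition in between.
    """
    if temp_c <= 25.0:
        return 0.0
    if temp_c >= 32.0:
        return 1.0
    return (temp_c - 25.0) / (32.0 - 25.0)

print(hot_membership(34.56))  # -> 1.0: clearly "hot" for this fuzzy set
print(hot_membership(28.0))   # -> ~0.43: only somewhat "hot"
```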
Improving Over Time In the spirit of evolution and adaptation, the understanding of someone’s communication should improve over time, while knowledge about the common features of the used language is accumulated and inferred. In the concrete use-case of HCI, the system should adapt to understand and react increasingly well to the actual needs, intentions, and perceptions of a specific user. This can generally be achieved with feedback loops, as seen in interactive machine learning (ML) [6], cybernetics [27], and conversation theory [22]. The feedback can be either explicit, meaning that the user gives such feedback, for instance, to correct errors, or implicit, meaning that data (e.g., sensor measurements) are used in combination with the objects of interaction to understand their meaning better (e.g., which temperature range roughly corresponds to the user’s perception of “hot weather”); a minimal sketch of such a feedback-driven update is given at the end of this section. The feedback loops allow the interface to improve over time, reducing the need for extensive initial design work to discover the best-suited features for the average user. Furthermore, this continuous adaptation to the user’s needs can allow for the emergence of more personal user interfaces over time. The learning should thus be moved from the user’s to the system’s side, contrary to what happens with protocol-dependent interfaces. Multimodal To have an even more flexible, robust, and natural interactive system, an important feature to be considered is multimodality. Indeed, it has been shown that nonverbal communication substantially impacts the overall interaction experience in several contexts [3, 18, 26]. Moreover, the ability of a system to automatically handle several modalities makes it considerably more flexible, as it can adapt to users who do not have access to a specific modality. For instance, a system designed with voice as the sole interaction modality does not have any way of holding a conversation with people with a hearing impairment, whereas a multimodal system has the channels to potentially learn and adapt to communicate with gestures. The details of phenotropic interaction, compared to common traditional interactive methods, are summarized in Table 2.1. The presented design principles overlap with one another to some extent, in the sense that, indirectly, they are all consequences of or techniques to obtain
Table 2.1 Comparison of the features of traditional and phenotropic interfaces
Traditional interfaces:
– Using hard protocols
– Sharp
– Learning on user’s side
– Low robustness
– Static
– Quality of interaction depends on design of the system
Phenotropic interfaces:
– Not using protocols
– Fuzzy, approximation safe
– Learning on artificial system’s side
– High robustness
– Adapting to single users, evolving
– Good interaction emerges from natural adaptation to the users
an interaction that does not rely on perfect syntactic matching as in a strict protocol. The more these principles are followed, the more the designed interaction can be considered phenotropic and the more flexibility and robustness it provides. However, one can argue that even partly satisfying one of the presented criteria can already characterize a phenotropic interaction, as it adds some flexibility or robustness compared to interactions based on sharp protocols, which in turn provides some benefits for the user.
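To illustrate the “improving over time” principle, the following hypothetical sketch (class and parameter names, as well as the learning rate, are assumptions made for this example and not the book’s method) extends the fuzzy set for “hot weather” sketched earlier with a simple feedback rule that gradually shifts its breakpoints toward what a specific user actually means.

```python
class HotPerceptionModel:
    """Per-user fuzzy set for 'hot weather', refined over time from feedback."""

    def __init__(self, low: float = 25.0, high: float = 32.0, lr: float = 0.2):
        self.low, self.high, self.lr = low, high, lr

    def membership(self, temp_c: float) -> float:
        if temp_c <= self.low:
            return 0.0
        if temp_c >= self.high:
            return 1.0
        return (temp_c - self.low) / (self.high - self.low)

    def feedback(self, temp_c: float, user_says_hot: bool) -> None:
        """Nudge the breakpoints toward the user's (explicit or implicit) perception."""
        if user_says_hot and self.membership(temp_c) < 1.0:
            # The user calls this temperature hot: move the upper breakpoint toward it.
            self.high += self.lr * (temp_c - self.high)
        elif not user_says_hot and self.membership(temp_c) > 0.0:
            # The user does not call it hot: move the lower breakpoint toward it.
            self.low += self.lr * (temp_c - self.low)

model = HotPerceptionModel()
before = model.membership(29.0)           # roughly 0.57 before any feedback
model.feedback(29.0, user_says_hot=True)  # this user already finds 29 °C hot
after = model.membership(29.0)            # higher than before: the model adapted
print(before, after)
```

The learning thus happens on the system’s side: the user never has to learn a protocol, while the interface converges toward that user’s own use of the word “hot.”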
2.3 Phenotropic Interaction Framework To effectively embody the design principles presented in Sect. 2.2, existing theories can be used to design an interface implementing phenotropic interaction; these are summarized in the framework presented in this section. The phenotropic interaction framework describes the surface that handles, in a protocol-less way, the interactions between humans and artificial agents, as shown in Fig. 2.1. In Fig. 2.1, the interaction between the human and the machine is represented as a loop, which has a twofold meaning. Indeed, it illustrates:
• The two-way communication between the human and the machine, including the conversational alignment [10] between the two agents, which try to understand and adapt to each other’s way of communicating
• The feedback loop process as seen in interactive ML [6], inspired by the cybernetics loop [27], which allows the system to better adapt to the user, and the user to understand the choices of the system, as well as to guide them
Fig. 2.1 The phenotropic interaction framework, integrating conversation theory, CWW, reasoning, interactive ML, and explainable AI
To provide the interaction between the human and the artificial system with a structure familiar to the user, the exchange between them is modeled on Pask’s conversation theory [21, 22], as presented in more detail in Sect. 3.1, which describes the alignment process [10] that people go through when communicating with one another. This brings the structure of the interaction between a user and an artificial system closer to that observed in natural conversations between people. To facilitate the exchange of information between the involved actors, the phenotropic interface acts as the surface where the conversation and adaptation happen on the machine’s side. This surface aims to handle the conversation in a protocol-less and multimodal way. The minimum requirement for multimodality is that at least two interaction channels (e.g., video and audio) are available in the system. This can provide an even more advanced level of adaptivity to peculiar user needs, such as the inability to use a specific modality, or provide redundancy. The main requirements for a successful protocol-less exchange are the ability to handle approximations and robustness. The fuzzy logic toolbox, more specifically CWW (Sect. 3.2) and approximate reasoning (Sect. 3.3), comes in handy for satisfying these requirements, as they are strongly dependent on a successful understanding of the semantics of the objects of interaction and of the imprecision of human cognition and communication. These aim to foster a high level of cointension between the user and the interactive system, by finding an accurate alignment between the semantics and the representation of the information conveyed by the user and that interpreted by the interactive system [1, 19, 30]. Additionally, the improvement over time of phenotropic interaction can be achieved thanks to a feedback mechanism. This can be based on inputs from external sensors, which can be used to make sense of the interaction and learn the meaning of specific patterns, or inputs from the user, who can report errors and express preferences. Interactive ML is a theory that can handle both these types of feedback mechanisms [6] (Sect. 3.4). Moreover, to let users give better feedback and correctly identify errors in the understanding of their queries, effective explainability of the artificial agent’s reasoning process [1] is fundamental (Sect. 3.5). The overall methods to be used in the different phases of the implementation of a phenotropic interface are presented in more detail in Chap. 3, followed by more specific techniques from CWW and approximate reasoning that were developed for a more accurate handling of the semantics of the objects of interaction in Chaps. 4–6.
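To make the structure of the framework more tangible, the following skeleton is a purely conceptual sketch (all parameter names are placeholders introduced here, not an implementation from this book) of how the components discussed above — conversational exchange, CWW-style precisiation, approximate reasoning, explanation, and feedback-driven learning — could be wired into a single interaction loop.

```python
from typing import Any, Callable, Dict

Percept = Dict[str, Any]  # a loose container for precisiated, fuzzy meaning

def phenotropic_iteration(
    listen: Callable[[], str],              # receive input over any available modality
    precisiate: Callable[[str], Percept],   # CWW: map imprecise words to fuzzy meaning
    reason: Callable[[Percept], Percept],   # approximate reasoning on the precisiated input
    act: Callable[[Percept], None],         # execute the inferred intent
    explain: Callable[[Percept], str],      # expose the reasoning behind the action
    get_feedback: Callable[[str], str],     # collect explicit or implicit user feedback
    learn: Callable[[Percept, str], None],  # interactive-ML-style adaptation to the user
) -> None:
    """One conceptual iteration of the loop depicted in Fig. 2.1."""
    utterance = listen()
    meaning = precisiate(utterance)   # tolerate imprecision instead of enforcing a protocol
    decision = reason(meaning)
    act(decision)
    feedback = get_feedback(explain(decision))
    learn(decision, feedback)         # close the loop: adapt to this specific user over time
```

Each callable corresponds to one building block of the framework; concrete realizations of these blocks are the subject of Chaps. 3–6.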
2.4 Concluding Remarks In this chapter, the concept of phenotropics has been introduced and adapted to the development of more natural, human-centric, and robust HCI. The fundamentals of phenotropic interaction, devised from the principal features of phenotropic software, were described in the form of the five design principles
of phenotropic interaction, which provide a simple checklist for the verification of the “phenotropicness” of an interface. These provide a basis for the definition of the goals of new interactive systems and describe all the criteria that such a system should satisfy to be considered perfectly phenotropic. However, satisfying only part of the listed features can already partially provide the flexibility and robustness typical of phenotropic interfaces, which can be a good enough result, especially for improving already existing interfaces without completely overhauling them. The defined principles constitute the basis for the implementation of bio-inspired interaction between people and artificial systems, making it more natural by removing the need for hard protocols, which are not natural for people, who are used to flexibility and adaptivity in their interactions. As a guide for the practical implementation of phenotropic interfaces, a framework comprising the technical theories and models well suited to be employed in the different fundamental components of phenotropic interaction was introduced. This sets the theoretical basis for the definition of the principal building blocks necessary for the implementation of phenotropic interfaces able to provide: natural interaction using concepts from conversation theory; flexibility and robustness based on the understanding of the semantics of and reasoning on the objects of interaction; and self-improvement of the artificial agent employing explainability and feedback loops. The listed building blocks of phenotropic interaction are presented in general in Chap. 3. These are followed by more precise developments belonging to the overlap between CWW and phenotropic interaction (Chaps. 5–6), which have the goal of specifically understanding and reasoning on human perceptions, mostly expressed with scalar adjectives and adverbs, to improve the level of understanding of people’s needs and goals in the interaction. The overall perceptions of people toward phenotropic interaction compared to more traditional HCI are difficult to assess, as they most likely depend on specific implementations. The devised framework and design principles are thus verified in an experiment on the practical application of phenotropic interaction to a virtual assistant use-case in Chap. 7.
References 1. Alonso Moral, J. M., Castiello, C., Magdalena, L., & Mencar, C. (2021). Toward explainable artificial intelligence through fuzzy systems. In Explainable fuzzy systems: Paving the way from interpretable fuzzy systems to explainable AI systems (pp. 1–23). Springer. https://doi.org/10. 1007/978-3-030-71098-9_1 2. Basman, A. (2016). Building software is not a craft. Proceedings of the Psychology of Programming Interest Group. 3. Calero, H. H. (2005). The power of nonverbal communication: How you act is more important than what you say. Silver Lake Publishing. 4. Ceaparu, I., Lazar, J., Bessiere, K., Robinson, J., & Shneiderman, B. (2004). Determining causes and severity of end-user frustration. International Journal of Human–Computer Interaction, 17(3), 333–356. https://doi.org/10.1207/s15327590ijhc1703_3
5. Epstein, S. L. (2015). Wanted: Collaborative intelligence. Artificial Intelligence, 221, 36–45. https://doi.org/10.1016/j.artint.2014.12.006 6. Fails, J. A., & Olsen Jr., D. R. (2003). Interactive machine learning. In Proceedings of the 8th International Conference on Intelligent User Interfaces (pp. 39–45). https://doi.org/10.1145/ 604045.604056 7. Gabriel, R. P. (2006). Design beyond human abilities. In Proceedings of the 5th International Conference on Aspect-Oriented Software Development (Vol. 20, p. 2). https://doi.org/10.1145/ 1119655.1119658 8. Gabriel, R. P., & Goldman, R. (2006). Conscientious software. SIGPLAN Notices, 41, 433–450. https://doi.org/10.1145/1167515.1167510 9. Guinard, D., & Trifa, V. (2009). Towards the Web of Things: Web mashups for embedded devices. In Workshop on Mashups, Enterprise Mashups and Lightweight Composition on the Web (MEM), in Proceedings of WWW (International World Wide Web Conferences), Madrid, Spain (Vol. 15, p. 8). 10. Henderson, A., & Harris, J. (2011). Conversational alignment. Interactions, 18(3), 75–79. 11. Hu, Z., Frénot, S., Tourancheau, B., & Privat, G. (2011). Iterative model-based identification of building components and appliances by means of sensor-actuator networks. https://hal.inria.fr/ inria-00636055. 2nd Workshop on Buildings Data Models, CIB W078 - W102, Sophia Antipolis - France 12. Hu, Z., Privat, G., Frénot, S., & Tourancheau, B. (2012). Representation and self-configuration of physical entities in extended smart grid perimeter. In 3rd IEEE PES Innovative Smart Grid Technologies Europe (ISGT Europe) (pp. 1–7). https://doi.org/10.1109/ISGTEurope. 2012.6465861 13. Johnson, C., Johnson, T., & Zhang, J. (2000). Increasing productivity and reducing errors through usability analysis: A case study and recommendations. In Proceedings of the AMIA Annual Symposium (pp. 394–398) 14. Kock, N. (2005). Media richness or media naturalness? The evolution of our biological communication apparatus and its influence on our behavior toward e-communication tools. IEEE Transactions on Professional Communication, 48(2), 117–130. https://doi.org/10.1109/ TPC.2005.849649 15. Kock, N. (2011). Media naturalness theory: human evolution and behaviour towards electronic communication technologies. Applied Evolutionary Psychology 381–398. 16. Lanier, J. (2003). Why Gordian software has convinced me to believe in the reality of cats and apples. https://www.edge.org. Visited on Apr. 2022 17. Lewis, C. H. (2018). Phenotropic programming? In PPIG 2018—29th Annual Workshop 18. Mast, M. S. (2007). On the importance of nonverbal communication in the physician—patient interaction. Patient Education and Counseling, 67(3), 315–318. https://doi.org/10.1016/j.pec. 2007.03.005 19. Mencar, C., Castiello, C., Cannone, R., & Fanelli, A. (2011). Design of fuzzy rule-based classifiers with semantic cointension. Information Sciences, 181(20), 4361–4377. https://doi. org/10.1016/j.ins.2011.02.014 20. Mendel, J. (2007). Computing with words: Zadeh, Turing, Popper and Occam. IEEE Computational Intelligence Magazine, 2(4), 10–17. https://doi.org/10.1109/MCI.2007.9066897 21. Pangaro, P. (2018). Making chatbot humane—adopting the technology of human conversation. Bots Brazil. 22. Pask, G. (1975). Conversation, cognition and learning. Elsevier (1975) 23. Pedrycz, W. (2001). Granular computing: an introduction. In Proceedings Joint 9th IFSA World Congress and 20th NAFIPS International Conference (Cat. No. 01TH8569) (Vol. 3, pp. 1349– 1354). IEEE. 
https://doi.org/10.1109/NAFIPS.2001.943745 24. Perez, C. C. (2019). Invisible women: Data bias in a world designed for men. Abrams. https:// doi.org/10.1111/1475-4932.12620 25. Privat, G. (2012). Phenotropic and stigmergic webs: The new reach of networks. Universal Access in the Information Society, 11(3), 323–335. https://doi.org/10.1007/s10209-011-02401
32
2 Phenotropic Interaction
26. Wang, H. (2009). Nonverbal communication and the effect on interpersonal communication. Asian Social Science, 5(11), 155–159. https://doi.org/10.5539/ass.v5n11p155 27. Wiener, N. (1948). Cybernetics 28. Zadeh, L. A. (1997). Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic. Fuzzy Sets and Systems, 90(2), 111–127. https://doi.org/10. 1016/S0165-0114(97)00077-8 29. Zadeh, L. A. (1999). Fuzzy logic = computing with words. In Computing with Words in Information/Intelligent Systems (Vol. 1, pp. 3–23). Springer. https://doi.org/10.1109/91.493904 30. Zadeh, L. A. (2008). Is there a need for fuzzy logic? Information Sciences, 178(13), 2751– 2779. https://doi.org/10.1016/j.ins.2008.02.012
Chapter 3
Cognitive and Perceptual Computing
As presented in the phenotropic interaction framework in Chap. 2, many different components are necessary for developing phenotropic interfaces. In particular, because phenotropic interaction represents a bio-inspired and human-centered way of handling interaction between diverse systems, it is essential to shape the interaction around the fundamental aspects of human conversation, handle the imprecision of language (understood as any means used to communicate), understand the semantics of the exchanged information, and explain any reasoning process to clarify actions and conveyed information. These points can be addressed with techniques belonging to the families of cognitive computing and perceptual computing. Cognitive computing refers to the ability of computers to mimic the mechanisms of the brain [11] and to extend human abilities in a symbiotic way [17], and thus relies strongly on natural interaction and collaboration with people. Cognitive computing also provides the basis for the practical application of cognition and learning theories (e.g., connectivism [39] and learning algorithms [41]) to computer systems with the use of soft computing methods. Perceptual computing, on the other hand, concerns the ability of computers to compute and reason with perceptions and imprecise data [27]. The two are strongly intertwined: replicating brain mechanisms depends on the ability to handle perceptions and fuzziness, and, conversely, reasoning on approximate data is best performed with mind-inspired methods. This chapter presents the general methods from cognitive computing and perceptual computing that constitute the principal building blocks of phenotropic interaction introduced in the framework in Fig. 2.1, and that allow its design principles to be satisfied. In particular, conversation theory is introduced in Sect. 3.1, followed by frameworks for handling the meaning of words (i.e., CWW in Sect. 3.2, automated reasoning in Sect. 3.3). The fundamental components of trust building and of phenotropic interaction's improvement over time based on user feedback, namely interactive ML and explainability, are then introduced in Sects. 3.4 and 3.5.
A summary of the importance of the presented methods for the phenotropic interaction framework concludes the chapter in Sect. 3.6.
3.1 Conversation Theory

Conversation theory [34] does not technically belong to the fields of cognitive or perceptual computing. However, it is a fundamental theory describing the conversation mechanism between humans and is therefore a good inspiration for implementing an information exchange technique between people and cognitive computing systems, so that they can collaborate in a way similar to how humans cooperate. Conversation theory is a generalization of the structure of human conversation, in which a change occurs in at least one of the participants, who, for example, learns something new, understands a concept, or aligns with the other participant's intent or values; this stands in contrast to communication, which is a mere exchange of messages [33]. The conversation structure modeled according to conversation theory is summarized in Fig. 3.1. Conversation theory is modeled on the cybernetic loop [46] and can be simplified as an exchange that aligns two people toward a common goal. It starts with a participant having a goal, and the conversation that follows has the objective of pursuing that goal. The conversation starts with a context (a situation, place, or shared history), from which a language (a way of exchanging information) is selected. This begins a back-and-forth exchange using that modality, during which the participants adapt their way of communicating [21] and align to the goal until they reach an agreement (a shared understanding of intent or values). The agreement then leads to collaboratively performing an action or transaction to accomplish the original goal.
Fig. 3.1 Structure of conversations (Adapted from [33])
Conversation theory sets the basis for phenotropic interaction, which is modeled on making the interaction between humans and artificial systems more natural for the human participant. Conversational alignment, adaptation to the understanding of the person's goals, and agreement on the operations to be performed to pursue those goals are therefore fundamental for the implementation of a phenotropic interface, which should understand and continuously adapt to user needs and desires without the use of strict protocols.
3.2 Computing with Words and Perceptions

For the practical understanding of people's ways of reasoning and communicating, which rely on the exchange of perceptions that are fuzzy because of the imprecision of human sensory inputs, a method for handling fuzziness on the artificial side of the human–computer conversation loop is necessary [42]. The CWW pipeline [48] provides a methodology for handling precisely this type of problem. Indeed, CWW is based on fuzzy set theory, where imprecision is represented mathematically through membership functions indicating the degree to which a particular item belongs to a set. It is important to note that the term CWW is used here to include both the theory of computing with words [48] and the computational theory of perceptions [43], as both build on the computing with words pipeline. The difference between the two is simply that the computational theory of perceptions refers to perceptions expressed in natural language, whereas computing with words encompasses any type of word.

Definition 3.1 Let U = {x1, x2, . . . , xn} be a reference set called the universe of discourse. Then a fuzzy set A ⊂ U is defined as a set of ordered pairs {(xi, μA(xi))}, where μA : U → [0, 1] is the membership function of A and μA(x) ∈ [0, 1] is the membership degree of x in A.

Fuzzy logic thus allows partially true statements to be represented, as opposed to traditional logic, where a statement is either entirely true or false. For example, the degree to which a person with a height of 178 cm can be considered to belong to the set of tall people can be represented by the membership degree μtall(178 cm) = 0.3, meaning that a 178 cm tall person belongs to the set of tall people only partially (at 30%). This partial belonging to a set can be interpreted in several ways [12]: it can indicate that 30% of the population described person X, with a height of 178 cm, as being tall; that according to 30% of the population, 178 cm is in the range of heights of tall people; or that 178 cm is at a normalized distance of 0.7 from the closest ideal prototype of a tall person.
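As a minimal illustration of Definition 3.1, the following sketch implements a membership function for the fuzzy set tall; the piecewise-linear shape and the 172–192 cm breakpoints are assumptions chosen only so that μtall(178 cm) = 0.3 matches the example above.

```python
def mu_tall(height_cm: float) -> float:
    """Membership degree of a height in the fuzzy set 'tall'.

    Piecewise-linear shape (an assumption for illustration): 0 below 172 cm,
    1 above 192 cm, linear in between, so mu_tall(178) = (178 - 172) / 20 = 0.3.
    """
    if height_cm <= 172:
        return 0.0
    if height_cm >= 192:
        return 1.0
    return (height_cm - 172) / 20


print(mu_tall(178))  # 0.3
print(mu_tall(185))  # 0.65
```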
Important basics in fuzzy systems are fuzzy numbers and fuzzy intervals, which allow concepts to be described simply as approximate values, and linguistic variables, which take terms and words instead of numerical values:

Definition 3.2 A fuzzy number ∗a, a ∈ N, is a fuzzy set representing a value of approximately a, for which the membership function decreases on both sides of the number a. A fuzzy interval [∗a, ∗b] is an interval whose limits are characterized by the fuzzy numbers ∗a and ∗b. The membership to the interval is maximal between a and b and gradually decreases outside this interval.

For instance, the number 2.5 can partially belong to the set ∗3 (approximately 3). Similarly, the fuzzy interval [∗3, ∗5] is defined with a smooth transition; for example, the numbers in the ranges [2, 3] and [5, 6] only partially belong to the set. Different shapes exist to describe fuzzy membership functions; the most used ones include triangular, trapezoidal, and Gaussian functions [50], although other, big-data-generated shapes are also used in some cases [13, 25, 40].

Definition 3.3 Let X be the name of a variable, T the set of terms that X can take, U the universe of discourse, G a syntactic rule for the generation of the names of terms, and M a semantic rule associating each term with its meaning (a fuzzy set on U). Then the quintuple (X, T, U, G, M) is called a linguistic variable.

For example, a linguistic variable describing temperature could be defined by the quintuple (X, T, U, G, M), where X is temperature, T is the set {cold, medium, hot} generated by G, and M specifies, for each term, the membership degrees of all the values in the universe of discourse U = [−100 °C, 100 °C]. To treat the semantics of natural language in a mathematical way, the concept of CWW was developed [48, 49].

Definition 3.4 CWW is a process that allows computations to be performed on words, phrases, and propositions drawn from a natural language, which describe people's perceptions of different aspects of the context that surrounds them.

As shown in Fig. 3.2, the CWW pipeline is composed of two main components: precisiation of meaning and computation. Precisiation of meaning refers to transforming words describing perceptions q into a fuzzy-logic-based representation of their meaning q*, on which computation can be performed.
Fig. 3.2 The CWW pipeline
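To make Definitions 3.2 and 3.3 above more tangible, the following sketch models the linguistic variable temperature as the semantic rule M mapping each term of T to a fuzzy set on U; the trapezoidal shapes and breakpoints are illustrative assumptions, not values prescribed by the text.

```python
def trapezoid(a, b, c, d):
    """Return a trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(x):
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        return (x - a) / (b - a) if x < b else (d - x) / (d - c)
    return mu


# Linguistic variable (X, T, U, G, M): X = "temperature", T = {cold, medium, hot},
# U = [-100, 100] degrees Celsius, M maps each term to a fuzzy set on U.
M = {
    "cold": trapezoid(-100, -100, 5, 15),
    "medium": trapezoid(5, 15, 22, 30),
    "hot": trapezoid(22, 30, 100, 100),
}

x = 25  # a crisp temperature reading
print({term: round(mu(x), 2) for term, mu in M.items()})
# {'cold': 0.0, 'medium': 0.62, 'hot': 0.38}
```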
For the practical implementation of the CWW pipeline [28], and in particular of the precisiation of meaning, Mendel proposes a methodology called the Perceptual Computer (Per-C) [26].

Definition 3.5 The Per-C is a framework for the practical implementation of the CWW pipeline.

According to this model, to precisiate words, the following steps have to be performed:
1. A vocabulary containing all of the application-dependent words is constructed.
2. A set of people define the meaning of all the words in the vocabulary in numeric terms (e.g., each person inputs the range of heights that for them corresponds to tall people).
3. The collected data are merged to create fuzzy sets describing the meaning of all the words in the vocabulary.
The obtained vocabulary can then be used for the precisiation of the meaning of words by simply mapping each word to the fuzzy set describing its meaning. The computation step, performed on the precisiated values, then produces an answer to the original question, expressed in words, by applying mathematical operations to the resulting fuzzy sets. This can be done with arithmetic on fuzzy sets or with more elaborate techniques, such as fuzzy IF-THEN rules [32]. In the Per-C framework, after the computation step the obtained data are transformed again to provide the answer to the original question in a more user-friendly way, for example by summarizing the data with words describing it at a high level [31], alongside the raw data supporting and explaining the output. However, creating a vocabulary of application-dependent words and manually collecting data for their precisiation provide no flexibility or adaptivity to CWW and are costly and time-consuming, especially for complex applications. This poses a substantial limitation to the application of CWW to general problems, despite its powerful means of handling semantics. To address this limitation, a way to generalize the precisiation-of-meaning step is proposed in Chap. 5.
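As a rough illustration of steps 1–3 above, the sketch below builds a fuzzy set for the word tall from interval estimates collected from several people. The survey values and the simple aggregation (averaging interval endpoints into a trapezoid) are assumptions made for illustration only; the Per-C literature describes more principled aggregation methods, for example based on interval type-2 fuzzy sets.

```python
# Step 2: each person states the range of heights (in cm) they consider "tall".
survey_tall = [(175, 185), (178, 190), (180, 195), (176, 188)]


def precisiate(intervals):
    """Step 3 (simplified): merge the collected intervals into one trapezoid (a, b, c, d).

    The core [b, c] is the average interval; the support [a, d] is widened to the most
    extreme answers so that every opinion is at least partially covered. A right-open
    shoulder (everyone above d is fully 'tall') would be a more natural choice here.
    """
    lo_core = sum(lo for lo, _ in intervals) / len(intervals)
    hi_core = sum(hi for _, hi in intervals) / len(intervals)
    lo_supp = min(lo for lo, _ in intervals)
    hi_supp = max(hi for _, hi in intervals)
    return (lo_supp, lo_core, hi_core, hi_supp)


# Step 1 + precisiation: the application vocabulary maps each word to its fuzzy set.
vocabulary = {"tall": precisiate(survey_tall)}
print(vocabulary["tall"])  # (175, 177.25, 189.5, 195)
```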
3.3 Automated Reasoning

Today's interactive systems are limited in their ability to process information in a way similar to human reasoning. This limits not only the understanding of people's needs and desires but also the effective extraction of knowledge about links and similarities between different situations. These are fundamental aspects for a better alignment of goals in the conversation theory loop, and thus in the phenotropic interaction framework modeled on it, as understanding the semantics of the exchanged information is not enough; inferring how to react to this information is also essential.
Indeed, the automatic extraction of meaning and relationships from processed information is fundamental for cognitive computing, as modeled in CWW and implemented in the semantic web [8]. However, information should be treated in a more advanced way to extract knowledge from it similarly to how humans do, by creating cognitive links between different concepts and pieces of information [15]. This potentially allows artificial systems to show particular human-like learning abilities (e.g., analogical learning), which help obtain a more natural and adaptive interaction, thus facilitating the achievement of mutual support between humans and machines [10, 35], a fundamental component of conversation theory and phenotropic interaction.

Definition 3.6 According to the structure-mapping theory [18], analogies are a mapping of knowledge from a familiar domain (base) to an unfamiliar one (target), under the assumption that relationships existing between elements in the base also hold between elements in the target. The analogy, the mapping of the relational structure between different domains, is complemented by the appearance, the mapping of object features.

An analogy exists if the relationship between a pair of elements is structurally similar to that between another pair of items, as in the expression "king is to man as queen is to woman." Analogical reasoning is at the root of most human reasoning processes and is used to form conclusions in unknown situations by matching them to known ones. This is common in problem-solving, where knowledge of a solved problem is applied to solve a new one, and in classification problems, where analogies between the features of classified objects and new objects are sought and used to infer the class of the new items. Similarly, analogical reasoning can be used to solve the problems of agreement on a particular goal and of conversational alignment in a conversation-based interaction between humans and artificial systems. Some works make use of analogical reasoning to solve NLP problems such as denominal verb interpretation [24] and word sense disambiguation [5]. However, one can argue that these methods are not close to human cognition, as they work with crisp data and do not consider the intrinsic fuzziness of human perceptions and reasoning. For this reason, to better model analogies in complex and uncertain environments, a promising approach consists in applying the framework of approximate reasoning [47], which is based on the use of fuzzy systems to automate and simulate as closely as possible the uncertainty and vagueness of human reasoning. Many methods attempt to reason with fuzzy information [1, 4, 23, 30, 44], but only a limited amount of work exists on reasoning with a combination of fuzziness and analogies [9, 14, 45]. Unfortunately, these approaches work on fuzzy sets, so they do not allow reasoning directly with concepts expressed in natural language. Furthermore, practical implementations of the presented theories do not exist, making them difficult to use in real-life scenarios.
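As a toy illustration of Definition 3.6, the sketch below transfers the relations observed in a familiar base domain onto a target domain through an object correspondence; the domains, relations, and mapping are invented examples, not taken from the cited works.

```python
# Relations in a familiar base domain (solar system) and an object mapping onto the
# target domain (atom). Structure mapping assumes the relations carry over.
base = {
    ("sun", "planet"): "attracts",
    ("planet", "sun"): "revolves_around",
}
mapping = {"sun": "nucleus", "planet": "electron"}  # base -> target correspondence


def transfer(base_relations, mapping):
    """Project every base relation onto the target domain via the object mapping."""
    return {(mapping[a], mapping[b]): rel for (a, b), rel in base_relations.items()}


print(transfer(base, mapping))
# {('nucleus', 'electron'): 'attracts', ('electron', 'nucleus'): 'revolves_around'}
```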
Applications that use CWW to reason on perceptions expressed as words, first transforming natural language data into fuzzy sets and then performing analogical reasoning on them, can prove beneficial for extending a system's knowledge and for providing a better understanding of and adaptation to the language, perceptions, and desires of people. An implementation of this idea is proposed in Chap. 6.
3.4 Interactive Machine Learning

In the cybernetic loop of conversation theory, corresponding to the feedback signal used by a participant in the conversation to better align with the counterpart, there is the need for a mechanism to react to feedback. In practice, this can be obtained with interactive ML techniques [16]. The traditional ML pipeline starts with a considerable amount of data, collected or provided by domain experts, relative to a particular problem or application. Relevant features are then selected from the data through a collaboration between ML practitioners and domain experts. On this basis, the best-suited models are selected and trained by tuning parameters and tweaking features to improve some performance metric. Finally, once both the ML practitioners and the domain experts are satisfied with the developed model, it is deployed to the users and rarely improved. If users discover issues in the deployed model, or if new techniques to solve the specified problem emerge from research, the development process is repeated from the beginning, taking these new pieces of information into account. If better results are obtained, the deployed model is updated with the new iteration.

In contrast, the interactive ML pipeline is more centered on the users as participants in the improvement of the model. As a result, updates to deployed models are quicker (the model is updated in response to user feedback), more focused (only the part of the model related to the user feedback is updated), and incremental (updates to the model are minor; a single update does not change it much) [3]. The idea is to let users impact the ML model by adapting it to their specific needs through interaction. This allows users to see the effect of their actions on how the model works until they observe the desired behavior, allowing people with little or no ML experience to contribute to the improvement of the model through trial and error. A simple example of interactive machine learning is Netflix's movie recommendations, where people can give feedback on the suggested titles by specifying whether they like or dislike them. The recommendation engine uses this information to improve the suggestions for the specific user. If this harms the recommendations, the user can remedy it by changing the rating they gave or by adapting future ratings to steer the system in the direction they want.
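The recommendation example can be sketched as a minimal incremental update, in which each piece of like/dislike feedback adjusts only the scores of the genres of the rated title; the update rule and the toy catalog are deliberately naive assumptions used purely for illustration.

```python
genres = {"Drama": 0.0, "Sci-Fi": 0.0, "Comedy": 0.0}
catalog = {"Space Saga": ["Sci-Fi"], "Office Laughs": ["Comedy"], "Deep Waters": ["Drama"]}


def feedback(title, liked, lr=0.5):
    """Quick, focused, incremental update: only the genres of the rated title change."""
    for g in catalog[title]:
        genres[g] += lr if liked else -lr


def recommend():
    """Rank titles by the summed scores of their genres."""
    return sorted(catalog, key=lambda t: -sum(genres[g] for g in catalog[t]))


feedback("Space Saga", liked=True)
feedback("Office Laughs", liked=False)
print(recommend())  # ['Space Saga', 'Deep Waters', 'Office Laughs']
```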
3.5 Explainable Artificial Intelligence

As presented in Chap. 2, one of the essential features of the phenotropic interaction framework that allows people to better provide feedback to the interactive system is the system's ability to explain its choices and reasoning process in a user-friendly way. Indeed, this allows the identification of misunderstandings and errors on the side of the artificial actor, to which the user can respond either by providing targeted feedback to improve the system's understanding of and reaction to the exchanged information (system adaptation) or by adapting their communication in order to be correctly understood (user adaptation).

Basic Artificial Intelligence (AI) implementations are based on simple and intuitive models, for example on lists or trees of easy-to-understand rules, or on simple mathematical models and algorithms. However, when the complexity of the problems to be tackled and the required flexibility increase, so does the complexity of the methods needed to approach them effectively. In an era where deep learning dominates AI research, especially in the fields of NLP and image analysis, it is increasingly challenging to understand the inference process of the models in use, even when the mathematical basis on which they are built is known [22]. For instance, in most modern NLP inference systems, words are transformed into multi-dimensional vectors describing their meaning, which are impossible for humans to interpret directly. An insight into the mathematical operations performed by the model therefore does not serve as a human-understandable explanation of the reasoning process that produced the answer to the problem, making it a black-box operation.

Definition 3.7 Interpretability of an AI system refers to the ability of people to gain a human-understandable insight into the model's rationale for producing a certain output [6].

The main objective of a system's interpretability is to allow people to easily verify that some fundamental requisites are observed, such as the respect of ethical and moral principles [36], as well as the system's correctness and potential causes of failure [2]. Furthermore, interpretability has been shown to strongly contribute to building trust toward AI systems and to improve the emotional confidence of users [38]. The following requisites are to be considered for the implementation of a successful explainable AI system [20]:
• Transparency: A language that is understandable by people should be used to represent the decision-making process; this includes, for example, fuzzy rule-based systems or simple data visualization techniques.
• Causality: Not only the inferences of a model trained on data are essential, but also insights into the underlying cause–effect relationships.
• Bias: It should be possible to verify whether the AI system learned a biased view of the world due, for example, to biased training data or model parameters [37].
• Fairness: It should be possible for humans to verify the fairness of the choices made by an AI system; people should not have to blindly trust and accept decisions made by an artificial system.
• Safety: To be able to trust the reliability of a system, one has to understand the reasoning process happening behind the scenes. Indeed, it could be strongly flawed despite working correctly on training data, and fail dangerously on new data.

The most direct methods for providing interpretability rely on transparent models, such as decision trees, classification rules, and linear models [2]. In this case, the knowledge used for the inference process is easily accessible to the user, who may understand it to a greater or lesser extent depending on their expertise level. For instance, decision trees can easily be interpreted by following the sequence of conditions applied to the input to reach the output. However, these methods often do not perform as accurately as black-box models when applied to complex data such as natural language or images. Another approach to explainable AI aims to achieve higher inference accuracy by using black-box models, such as deep neural networks, while still providing explanations in an explicit way (i.e., not intrinsic to the employed model) [19]. Various methods exist for achieving this:
• Model explanation: A transparent model is used as a surrogate model [7] imitating the black-box results.
• Outcome explanation: Given a particular input, a transparent model is used to explain the output related to that specific instance without having to describe the logic of the whole model.
• Model inspection: A visual or textual representation of details regarding the inner workings of the black-box model is provided, such as class activation maps [29].

Ideally, following Occam's razor, when transparent models provide good enough accuracy, they should be selected over black-box models with additional modules providing interpretability of the outputs. The second option should be used only if accuracy is of utmost importance (e.g., in biomedical applications) or transparent models perform poorly (e.g., for complex image analysis). These principles should guide the implementation of phenotropic interfaces, where the understanding and reasoning processes of the artificial system with respect to the conversation elements should be easily interpretable by people, to allow them to provide meaningful feedback for the improvement and adaptation of the system to their needs.
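To make the surrogate-model idea (model explanation) concrete, the following sketch trains a shallow decision tree to imitate the predictions of a black-box classifier and prints the resulting human-readable rules; it assumes scikit-learn is available and uses a synthetic dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# A synthetic problem and a "black-box" model.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Global surrogate: a transparent model fitted on the black box's own predictions.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how closely the surrogate reproduces the black-box behaviour.
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"fidelity to the black box: {fidelity:.2f}")
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(4)]))
```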
3.6 Concluding Remarks

In this chapter, the theories constituting the building blocks of phenotropic interaction have been presented and contextualized in the phenotropic interaction framework. These building blocks for a more intelligent and natural handling of conversations, originating in the fields of cognitive and perceptual computing, constitute the general methods that foster the implementation of phenotropic interfaces, allowing people to interact with artificial systems in a way closer to their interactions with other people.
Among these theories is the basic structure of conversation between people (i.e., conversation theory), which phenotropic interaction aims to replicate to make the interaction between humans and artificial interfaces more similar to a conversation between humans, with its understanding, adaptation, and agreement between the involved actors to reach a common goal. Additionally, means of understanding, elaborating, and reasoning on imprecise information (e.g., information described in natural language) were introduced with the concepts of CWW and automated reasoning, which allow a more human-like handling of information coming from an imprecise perception of the world. These are fundamental for the artificial system's understanding of and alignment with people's perceptions and needs. Finally, theories supporting feedback from people, which improve the ability of the artificial counterpart in the conversation to adapt to human needs, were presented with the concepts of interactive ML and explainable AI. These represent ways for the system to react to feedback and improve over time based on people's needs, guided by an explanation of the system's reasoning process to the user, which allows more targeted feedback and increases trust and confidence in the system. The introduced theories are merely the basis on which phenotropic interaction is built; more precise developments and application-specific methods derived from these theories should be used for a successful implementation of phenotropic interfaces. Examples of more precise, newly developed methods explicitly targeted at the understanding and processing of human perceptions are introduced in the following Chaps. 4–6.
References 1. Ahmad, R., & Rahimi, S. (2005). A perception based, domain specific expert system for question-answering support. In Proceedings of the North American Fuzzy Information Processing Society Conference NAFIPS (pp. 454–459). IEEE. https://doi.org/10.1109/WI.2006. 22 2. Alonso Moral, J. M., Castiello, C., Magdalena, L., & Mencar, C. (2021). Toward explainable artificial intelligence through fuzzy systems. In Explainable Fuzzy Systems: Paving the Way from Interpretable Fuzzy Systems to Explainable AI Systems (pp. 1–23). Springer. https://doi. org/10.1007/978-3-030-71098-9_1 3. Amershi, S., Cakmak, M., Knox, W. B., & Kulesza, T. (2014). Power to the people: The role of humans in interactive machine learning. AI Magazine, 35(4), 105–120. https://doi.org/10. 1609/aimag.v35i4.2513 4. Baldwin, J. (1979). A new approach to approximate reasoning using a fuzzy logic. Fuzzy Sets and Systems, 2(4), 309–325. https://doi.org/10.1016/0165-0114(79)90004-6 5. Barbella, D., & Forbus, K. (2013). Analogical word sense disambiguation. Advances in Cognitive Systems, 2(1), 297–315.
6. Barredo Arrieta, A., et al. (2017). Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82–115. https://doi.org/10.1016/j.inffus.2019.12.012 7. Basurto, C., et al. (2021). Implementation of machine learning techniques for the quasi realtime blind and electric lighting optimization in a controlled experimental facility. Journal of Physics: Conference Series, 2042(1), 012112. https://doi.org/10.1088/1742-6596/2042/1/ 012112 8. Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic web. Scientific American, 284(5), 34–43. 9. Bouchon-Meunier, B., & Valverde, L. (1999). A fuzzy approach to analogical reasoning. Soft Computing, 3(3), 141–147. https://doi.org/10.1007/s005000050062f 10. Cooley, M. (1996). On human-machine symbiosis. In Human machine symbiosis (pp. 69–100). Springer. https://doi.org/10.1007/978-1-4471-3247-9_2 11. Denning, P. J. (2014). Surfing toward the future. Communications of the Association for Computing Machinery, 57(3), 26–29. https://doi.org/10.1145/2566967 12. Dubois, D., & Prade, H. (1994). Fuzzy sets—a convenient fiction for modeling vagueness and possibility. IEEE Transactions on Fuzzy Systems, 2(1), 16–21. https://doi.org/10.1109/91. 273117 13. Dyck, R., Sadiq, R., Rodriguez, M., Simard, S., & Tardif, R. (2017). A comparison of membership function shapes in a fuzzy-based fugacity model for disinfection byproducts in indoor swimming pools. International Journal of System Assurance Engineering and Management, 8(4), 2051–2063. https://doi.org/10.1007/s13198-014-0318-2 14. D’Onofrio, S., Müller, S. M., Papageorgiou, E. I., & Portmann, E. (2018). Fuzzy reasoning in cognitive cities: An exploratory work on fuzzy analogical reasoning using fuzzy cognitive maps. In IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) (pp. 1–8). IEEE. https://doi.org/10.1109/FUZZ-IEEE.2018.8491474 15. D’Onofrio, S., & Portmann, E. (2017) Cognitive computing in smart cities. InformatikSpektrum, 40(1), 46–57. https://doi.org/10.1007/s00287-016-1006-1 16. Fails, J. A., & Olsen Jr., D. R. (2003). Interactive machine learning. In Proceedings of the 8th International Conference on Intelligent User Interfaces (pp. 39–45). https://doi.org/10.1145/ 604045.604056 17. Farrell, R. G., et al. (2016). Symbiotic cognitive computing. AI Magazine, 37, 81–93. https:// doi.org/10.1609/aimag.v37i3.2628 18. Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7(2), 155–170. https://doi.org/10.1016/S0364-0213(83)80009-3 19. Guidotti, R., et al. (2018). A survey of methods for explaining black box models. Association for Computing Machinery Computing Surveys, 51(5). https://doi.org/10.1145/3236009 20. Hagras, H. (2018). Toward human-understandable, explainable AI. Computer, 51(9), 28–36. https://doi.org/10.1109/MC.2018.3620965 21. Henderson, A., & Harris, J. (2011). Conversational alignment. Interactions, 18(3), 75–79. 22. Holzinger, A. (2018). From machine learning to explainable AI. In World Symposium on Digital Intelligence for Systems and Machines (DISA) (pp. 55–66). https://doi.org/10.1109/ DISA.2018.8490530 23. Khorasani, E. S., Rahimi, S., & Gupta, B. (2009). A reasoning methodology for CW-based question answering systems. In International Workshop on Fuzzy Logic and Applications (pp. 328–335). Springer. https://doi.org/10.1007/978-3-642-02282-1_41 24. McFate, C. J., & Forbus, K. D. (2016). 
Analogical generalization and retrieval for denominal verb interpretation. Cognitive Science. 25. Medasani, S., Kim, J., & Krishnapuram, R. (1998). An overview of membership function generation techniques for pattern recognition. International Journal of Approximate Reasoning, 19(3), 391–417. https://doi.org/10.1016/S0888-613X(98)10017-8
26. Mendel, J. (2001). The perceptual computer: An architecture for computing with words. In 10th IEEE International Conference on Fuzzy Systems. (Cat. No.01CH37297) (Vol. 1, pp. 35–38). https://doi.org/10.1109/FUZZ.2001.1007239 27. Mendel, J., & Wu, D. (2010). Perceptual computing: Aiding people in making subjective judgments. John Wiley & Sons. 28. Mendel, J., Zadeh, L. A., Trillas, E., Yager, R., Lawry, J., Hagras, H., & Guadarrama, S. (2010). What computing with words means to me [discussion forum]. IEEE Computational Intelligence Magazine, 5(1), 20–26. https://doi.org/10.1109/MCI.2009.934561 29. Muhammad, M. B., & Yeasin, M. (2020). Eigen-CAM: Class activation map using principal components. In International Joint Conference on Neural Networks (IJCNN) (pp. 1–7). IEEE. 30. Mukaidono, M., Ding, L., & Shen, Z. (1990). Approximate reasoning based on revision principle. In Proceedings of the North American Fuzzy Information Processing Society Conference NAFIPS (Vol. 1, pp. 94–97). 31. Novák, V. (2016). Linguistic characterization of time series. Fuzzy Sets and Systems, 285, 52– 72. https://doi.org/10.1016/j.fss.2015.07.017 32. Novák, V., & Lehmke, S. (2006). Logical structure of fuzzy if-then rules. Fuzzy Sets and Systems, 157(15), 2003–2029. https://doi.org/10.1016/j.fss.2006.02.011 33. Pangaro, P. (2017). Questions for conversation theory or conversation theory in one hour. Kybernetes, 46, 1578–1587. https://doi.org/10.1108/K-10-2016-0304 34. Pask, G. (1975). Conversation, cognition and learning. Elsevier. 35. Pedrycz, W., & Gomide, F. (2007). Fuzzy systems engineering: Toward human-centric computing. John Wiley & Sons. 36. Portmann, E., & D’Onofrio, S. (2022). Computational ethics. HMD Praxis der Wirtschaftsinformatik, 59(2), 447–467. 37. Roselli, D., Matthews, J., & Talagala, N. (2019), Managing bias in AI. In Companion Proceedings of the 2019 World Wide Web Conference (pp. 539–544). https://doi.org/10.1145/ 3308560.3317590 38. Shin, D. (2021). The effects of explainability and causability on perception, trust, and acceptance: Implications for explainable AI. International Journal of Human-Computer Studies, 146, 102551. https://doi.org/10.1016/j.ijhcs.2020.102551 39. Siemens, G. (2017). Connectivism. Foundations of learning and instructional design technology. 40. Stanojevic, B., & Stanojevi´c, M. (2021). Approximate membership function shapes of solutions to intuitionistic fuzzy transportation problems. International Journal of Computers, Communications and Control, 16(1). https://doi.org/10.15837/ijccc.2021.1.4057 41. Thrun, S., & Pratt, L. (2012). Learning to learn. Springer. https://doi.org/10.1007/978-1-46155529-2 42. Trillas, E., Termini, S., Tabacchi, M. E., & Seising, R. (2015). Fuzziness, cognition and cybernetics: An outlook on future. In 2015 Conference of the International Fuzzy Systems Association and the European Society for Fuzzy Logic and Technology (IFSA-EUSFLAT-15) (pp. 1413–1418). Atlantis Press. https://doi.org/10.2991/ifsa-eusflat-15.2015.200 43. Trivino, G., & Sugeno, M. (2013). Towards linguistic descriptions of phenomena. International Journal of Approximate Reasoning, 54(1), 22–34. https://doi.org/10.1016/j.ijar.2012.07.004 44. Turksen, I., & Lucas, C. (1991). A pattern matching inference method and its comparison with known inference methods. In Proceedings of the International Fuzzy Systems Association World Congress. 45. Turksen, I., & Zhong, Z. (1988). An approximate analogical reasoning approach based on similarity measures. 
IEEE Transactions on Systems, Man, and Cybernetics, 18(6), 1049–1056. https://doi.org/10.1109/21.23107 46. Wiener, N. (1948). Cybernetics. 47. Zadeh, L. (1983). The role of fuzzy logic in the management of uncertainty in expert systems. Fuzzy Sets and Systems, 11(1), 199–227. https://doi.org/10.1016/S0165-0114(83)80081-5 48. Zadeh, L. A. (1999). Fuzzy logic = computing with words. In Computing with Words in Information/Intelligent Systems (Vol. 1, pp. 3–23). Springer. https://doi.org/10.1109/91.493904
49. Zadeh, L. A. (2012). Computing with words: Principal concepts and ideas. Springer. https:// doi.org/10.1007/978-3-642-27473-2 50. Zhao, J., & Bose, B. (2002). Evaluation of membership functions for fuzzy logic controlled induction motor drive. In IEEE 28th Annual Conference of the Industrial Electronics Society IECON (vol. 1, pp. 229–234). https://doi.org/10.1109/IECON.2002.1187512
Part III
Natural Language Conversations
Chapter 4
Semantic Similarity Measures
The conversation between two actors is much more than just the data being exchanged (e.g., words): it provides a context that transforms data into information (e.g., sentences), and it reaches its final goal of transferring knowledge (e.g., the concepts described by a sentence) when the actors extract meaning from the exchanged information, which allows them to align their needs to reach a common goal [28]. For this reason, in the context of phenotropic interaction, which aims to bring more naturalness to artificial exchanges, the possibility for artificial systems to correctly converse with users and adapt to their needs and desires depends strongly on the ability of the system to understand the semantics of the exchanged information, to acknowledge the shared knowledge, and to react accordingly. This aligns with a highly relevant challenge for today's computer systems, namely understanding semantics in humans' natural ways of communicating, as shown by recent developments in Natural Language Processing [5, 37] and gestural communication [8, 15]. An essential tool for better treating semantics in communication is semantic similarity. This notion allows one to understand the semantics of a particular concept relative to another; it is an estimate indicating the closeness of the meaning of two elements (e.g., words, sentences, concepts, gestures, and pieces of text) [6]. Semantic similarity measures are fundamental for the automatic processing of communication in a human-like way and for extending knowledge to unknown concepts and perceptions. For example, they are at the base of the precisiation phase of the CWW pipeline [48] and of the task of finding relationships between concepts in approximate reasoning [11], as will be presented in more detail in Chaps. 5 and 6. Similarity estimation can be used to model people's perceptions about the relationships and commonalities between elements [13]. However, despite a certain degree of correspondence in the interpretation of perceptions by different people, perceptions have a strong subjective component, which implies that similarity cannot be defined as a crisp value; it should instead have a fuzzy nuance that allows this subjectivity to be modeled.
Moreover, a similarity of 0.6 between two words does not mean anything if taken out of context; it acquires a meaning only when compared with the similarity between two other words, where one could say that one pair is more similar than the other. Correspondingly, human perceptions make it possible to quickly say that two words are more similar than two others, but not to say that, in general and without any context, two words are very similar. This chapter focuses on the effective computation of semantic similarity between people's perceptions expressed in natural language. This constitutes the basis for the implementation of adaptivity and robustness in phenotropic interfaces, as these rely on a deep-level understanding of human communication. Some considerations about two different types of similarity (conceptual and spectral) are presented in Sects. 4.1 and 4.2, including an overview of mainstream semantic similarity measures. A new algorithm for accurately computing spectral semantic similarity, to whose development the author contributed and which was published in [9], is presented in Sect. 4.3, along with some practical implementation challenges in Sect. 4.4. A user-centered evaluation of its performance compared to state-of-the-art measures is reported in Sect. 4.5. Finally, critical remarks about the developed measure and future improvements conclude the chapter in Sect. 4.6.
4.1 Conceptual Similarity Measures

Semantic similarity between words belonging to distinct parts of speech builds on different features, as the types of relationships between words of the same class depend on the nature of the concepts being represented. In particular, nouns are related to one another through the types of relationships that (directly or indirectly) exist between the concepts they represent, which might be, for example, of type is-a, has-a, part-of, or synonym. In contrast, the similarity between scalar adjectives and adverbs [45], that is, words that can be placed on a spectrum representing all the terms describing the same feature, is more naturally based on their closeness on that spectrum. The former will be referred to as conceptual similarity [47], and the latter as spectral similarity. Regarding the computation of conceptual similarity, three main categories of approaches to the problem exist [7]: context-based, knowledge-based, and hybrid methods.

Context-Based Methods
These make use of co-occurrence statistics of words in large corpora, under the assumption that the more often two words w1 and w2 occur close to one another in the same documents, the more similar they are (see Fig. 4.1a). Common data sources for computing co-occurrence statistics include various web search engines and corpora with information expressed in everyday language. Turney [42] computes the pointwise mutual information as the ratio between the numbers of hits returned by web searches containing the two terms to be compared, combined and independently, and employs it as a measure of the degree of synonymy.
Fig. 4.1 Example of data used for the computation of conceptual semantic similarity
Bollegala et al. [4] feed several relatedness metrics based on search result counts into support vector machines, which rank the similarity of the target words. Sahami and Heilman [38] employ search results to build vectors representing the searched word, which can then be used to compute the similarity between vector pairs, and thus between the terms they represent. Similarly, Mikolov et al. [24] and Pennington et al. [29] use several corpora (e.g., tweets, news articles, Wikipedia dumps) to generate vector representations of words, which have interesting properties such as the possibility of solving word analogies using vector arithmetic (e.g., of the form king - man + woman = queen [12]) and of computing semantic similarity with cosine similarity [17]. More recent ways of generating this type of word embedding [10, 35] rely on transformer models [44].

Knowledge-Based Methods
These are built on ontologies and other artificial constructs that represent basic knowledge in a structured way (e.g., WordNet [25, 26] and ConceptNet [40]). Various models compute the distance or the amount of shared information in ontologies [14, 19, 30, 31] (see Fig. 4.1b), and many others take the graph of Wikipedia links between pages as a base of information for similar tasks [16, 34, 46]. Since most of these knowledge bases are generated or corrected through manual input, the data on which the semantic similarity measures are based are reliable. However, a particular connection between two concepts in an ontology or similar knowledge structure is always either present or absent, whereas in reality some relationships of the same type can be stronger or weaker depending on several factors, such as the probability that a specific connection exists. A possible solution that better reflects this reality would be to use fuzzy ontologies, where relationships can be weighted according to their strength and level of certainty [33].
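As an illustration of the vector-space approach described above, the sketch below computes cosine similarity and the king - man + woman analogy on tiny hand-made three-dimensional vectors; real embeddings such as Word2Vec or GloVe have hundreds of dimensions and are learned from large corpora.

```python
import numpy as np

# Toy 3-dimensional "embeddings" (dimensions loosely: royalty, maleness, femaleness).
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}


def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


print(round(cosine(emb["king"], emb["queen"]), 2))  # 0.66

target = emb["king"] - emb["man"] + emb["woman"]  # analogy arithmetic
# In practice the query words (king, man, woman) are usually excluded from the candidates.
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)  # 'queen'
```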
Hybrid Methods
These methods try to overcome the limitations of context-based and knowledge-based methods by combining the two in the same model, for example by first using a context-based method for an approximate clustering and then a knowledge-based method to relocate outliers and refine the clustering [46], or by computing the amount of shared information between two terms based on the number of shared words in their dictionary definitions [22], extended with the glosses of related concepts in WordNet [1, 2].
4.2 Spectral Similarity Measures

As described in Sect. 4.1, spectral similarity represents for humans a more natural way of perceiving similarity when it comes to quantitative and qualitative adjectives, as well as adverbs of manner and frequency. In particular, it refers to how close the meanings of two terms are on the spectrum of all possible words describing the same feature. For instance, when taking the qualitative adjectives hot, cold, and burning, which all describe the same feature (heat), most people would agree that the meaning of burning is much closer to that of hot than the meaning of cold is, as shown in Fig. 4.2. This means that, based on people's perceptions, sim(burning, hot) > sim(cold, hot), where sim represents the spectral semantic similarity and is in this case inversely proportional to the semantic distance between the considered words on their spectrum. This type of semantic similarity measure is fundamental for CWW and perceptual computing, as human perceptions are best described using spectral adjectives and adverbs. Despite this, research in the area is not very active. Only a limited number of works exist that explicitly handle similarities between adjectives or adverbs [32, 43]. These methods rely on the connections between the analyzed words and nouns, exploiting the relationships between nouns instead of directly considering the relationships between the adjectives and adverbs in their spectral representation; thus, the resulting measure does not capture spectral similarity but only the conceptual similarity between related nouns. Word embeddings such as Word2Vec [24] and GloVe [29] are also technically usable for computing the cosine similarity between adjectives and adverbs. However, by experimenting with these similarities, one quickly notices that they do not always align with the spectral similarity.
Fig. 4.2 Example of the semantic similarity between adjectives on the spectrum of heat, sim(w0, w1) ∝ 1/dist(w0, w1) (Adapted from [9])
Indeed, many examples exist for which the observed results are the opposite of what humans would expect, even for some very commonly agreed-on comparisons. For instance, sim(always, never) > sim(always, often), while for the vast majority of people the opposite is true, as always and often are almost synonyms, while always and never are antonyms [41]. Because of the lack of research explicitly targeting the computation of spectral similarity, which is a natural way for humans to find relationships between words describing perceptions and is fundamental, for instance, for understanding and reasoning with these terms, one can argue that it is crucial to develop an effective way of computing spectral semantic similarity between adjectives and adverbs. An attempt to fill this gap is presented in Sect. 4.3.
4.3 A Novel Spectral Similarity Measure

To overcome the lack of similarity measures explicitly targeting spectral similarity between adjectives and adverbs, a new approach was introduced in [9] and is presented in detail in the following. First, the chosen knowledge base on which the similarity measure is built is described, followed by the definition of the measure itself and the optimization of some of its parameters.
4.3.1 Choice of Knowledge Base

One can safely assume that the choice of a good source of information has an important impact on the accuracy of any semantic similarity measure. In ontologies and similar knowledge bases, it is not clear whether and how relationships between adjectives and adverbs can be successfully represented, so these are hardly a good basis for the practical computation of spectral semantic similarity. Dictionary-based methods such as [2] could seem well suited for this task, as scalar adjectives and adverbs that are perceived as similar should have many words in common in their definitions. However, the quality of such a measure strongly depends on the exact wording used in the dictionary definitions, so one can argue that a better data source should be selected. Instead of a traditional dictionary, a knowledge base giving a more uniform description of the semantics of terms (i.e., one not depending on the arbitrary wording of dictionary definitions) would provide better results. For this reason, the data source selected for the computation of spectral semantic similarity between adjectives and adverbs is the thesaurus. A thesaurus describes the meaning of words through their synonyms and antonyms. This makes it a uniform knowledge base, as the meaning of every word is characterized by two sets of other words, its synonyms and its antonyms, without the ambiguity of complete sentences.
4.3.2 Spectral Semantic Similarity Measure

Relying on this assumption, one can proceed to define a novel spectral similarity measure based on the overlaps between word synonyms from a thesaurus. Indeed, one can safely assume that the more words the lists of synonyms of two words w1 and w2 have in common, the more w1 and w2 are semantically similar in a spectral representation.

Definition 4.1 A set of terms W belonging to the same semantic category is the set of all words used to describe different grades of the same feature (e.g., adjectives describing temperature, W = {hot, cold, burning, . . .}).

Definition 4.2 Let w1, w2 ∈ W be two words belonging to the same semantic category. w1 and w2 are similar up to a certain degree defined by the similarity function sim(w1, w2) ∈ [0, 1]. The more similar w1 and w2, the higher the value of sim(w1, w2). In the special case where w1 = w2, their similarity is maximal: sim(w1, w2) = 1.

Using these observations about spectral semantic similarity, one can define the 0-th order spectral semantic similarity between two terms.

Definition 4.3 Let w1, w2 ∈ W. The 0-th order spectral semantic similarity between w1 and w2, noted sim_0(w1, w2), is defined as

sim_0(w1, w2) = 1 if w1 = w2, and sim_0(w1, w2) = 0 otherwise.
The 0-th order semantic similarity measure can only capture the similarity between identical words. However, its granularity can be improved to handle similarity between non-identical words by using sets of synonyms and the similarity between them.

Definition 4.4 Let w ∈ W be a word. The operator SYN(w) represents the set of all synonyms of w. Likewise, SYN^i(w) represents the i-th order synonyms of w; for example, SYN^2(w) = SYN(SYN(w)).

Definition 4.5 The 0-th order similarity between two sets of words V, U ⊂ W corresponds to counting the number of overlapping terms (with repetition) in the two sets:

sim_0(V, U) = Σ_{v ∈ V} Σ_{u ∈ U} sim_0(v, u).    (4.1)
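Definition 4.5 translates almost directly into code; the following sketch counts the overlapping terms, with repetition, between two synonym lists.

```python
from collections import Counter


def sim0_sets(V, U):
    """Eq. 4.1: number of overlapping terms (with repetition) between two word lists."""
    cv, cu = Counter(V), Counter(U)
    # Equivalent to the double sum over all pairs (v, u) of sim_0(v, u).
    return sum(cv[w] * cu[w] for w in cv)


print(sim0_sets(["icy", "chilly", "cool"], ["chilly", "cool", "bitter"]))  # 2
```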
For the computation of the spectral semantic similarity, first-order synonyms might not be enough, as they may not provide sufficient coverage of the whole spectrum. Indeed, with first-order synonyms alone, cold and freezing could end up having the same similarity (0) with hot, while one of the two is clearly closer to hot than the other.
Using synonyms of synonyms provides better coverage of the spectrum because of the accumulation of imperfect synonymy. Thus, a general definition of the i-th order similarity measure is provided.

Definition 4.6 The i-th order spectral similarity measure is defined as

sim_i(w1, w2) = (1/2) (σ / |SYN^i(w1)| + σ / |SYN^i(w2)|),    (4.2)

with σ = sim_0(SYN^i(w1), SYN^i(w2)).

An overlap between lower-order synonyms indicates a higher degree of similarity than an overlap between higher-order synonyms. This means that a greater weight should be given to overlaps in lower-order synonyms in the similarity computation. In Definition 4.6, this weighting is implicitly provided by the fact that if an overlapping word v occurs at the j-th order of synonymy, then at the (j + 1)-th order all the synonyms of v will overlap, which gives v more weight than a single word overlapping only at the (j + 1)-th level.
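A minimal sketch of Definition 4.6 follows; synonyms is assumed to be a function returning the first-order synonyms of a word (e.g., a thesaurus lookup), and the i-th order synonym list is built by repeated expansion, keeping repetitions.

```python
from collections import Counter


def syn_i(word, i, synonyms):
    """SYN^i(word): i-fold synonym expansion, kept as a list (with repetitions)."""
    words = [word]
    for _ in range(i):
        words = [s for w in words for s in synonyms(w)]
    return words


def sim_i(w1, w2, i, synonyms):
    """i-th order spectral similarity of Definition 4.6 / Eq. 4.2."""
    s1, s2 = syn_i(w1, i, synonyms), syn_i(w2, i, synonyms)
    c1, c2 = Counter(s1), Counter(s2)
    sigma = sum(c1[w] * c2[w] for w in c1)  # sim_0(SYN^i(w1), SYN^i(w2)), Eq. 4.1
    return 0.5 * (sigma / len(s1) + sigma / len(s2))
```

Note that for i = 0 the expansion returns the words themselves, so the formula reduces to the 0-th order similarity of Definition 4.3.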
4.4 Practical Implementation Challenges

Several choices and challenges have to be faced to implement the defined semantic similarity measure effectively and usably. These include the selection of a well-suited thesaurus, the handling of words with multiple meanings, and the selection of the generally best order of synonymy i in Eq. 4.2.
4.4.1 Thesaurus Selection

The thesaurus used as a data source for the implementation of the introduced semantic similarity measure should:
• Be complete, meaning that it contains a sufficiently high number of synonyms for each word.
• Offer diversity in the proposed synonyms, as the presence of both perfect and partial synonyms lets imperfection in synonymy accumulate quickly, making it possible to estimate similarity with a lower order of synonymy.
• Be usable from a programming language, thus obtainable online and accessible through an API.
Table 4.1 Parameters for the computation of sim(w1, w2) with w1 = freezing and w2 = cold, for different meanings of w2

Meaning of w2       Overlaps   |SYN(w1)|   |SYN(w2)|   sim_1(w1, w2)
Temperature adj.    23         26          54          0.66
Emotion adj.        4          26          34          0.14
Weather noun        1          26          26          0.04
All meanings        24         26          106         0.55
Of several explored alternatives, the English-language thesaurus provided by Dictionary.com1 was selected as the one that best satisfies these conditions. This thesaurus is built and constantly updated by a team of lexicographers and supplemented by additional sources. The English language is targeted in this implementation, but the same techniques could be used to compute semantic similarity in other languages, provided that a data source similar to the selected one is available. To retrieve the thesaurus data using Python 3, the open-source module thesaurus-api2 is used. This module allows retrieving the synonyms and antonyms of words, which can be filtered in different ways based on their relevance, complexity, length, and part of speech. For words with multiple meanings, a set of synonyms is returned for each meaning. In the implementation of the similarity measure, a local database was used to cache the synonyms to avoid unnecessary and time-consuming requests to the online thesaurus.
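The caching idea mentioned above can be sketched as follows; fetch_synonyms_online is a hypothetical stand-in for the actual thesaurus-api call, and a simple JSON file replaces the local database.

```python
import json
import os

CACHE_FILE = "synonym_cache.json"


def _load_cache():
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE) as f:
            return json.load(f)
    return {}


_cache = _load_cache()


def fetch_synonyms_online(word):
    """Hypothetical placeholder for the real thesaurus request; returns canned data here."""
    demo = {"freezing": ["cold", "icy", "frosty", "glacial"],
            "cold": ["chilly", "icy", "frosty", "cool"]}
    return demo.get(word, [])


def synonyms(word):
    """Return the synonyms of a word, querying the online thesaurus at most once."""
    if word not in _cache:
        _cache[word] = fetch_synonyms_online(word)
        with open(CACHE_FILE, "w") as f:
            json.dump(_cache, f)
    return _cache[word]
```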
4.4.2 Multiple Meanings

Words with multiple meanings (i.e., homonyms) can create problems when semantic similarity is computed from overlaps between synonyms. This can be seen in Table 4.1, where the similarity between freezing and cold is computed as an example. When the list of all synonyms is taken ("All meanings" in Table 4.1), the ratio of overlapping synonyms is much smaller than when only the temperature-related meaning ("Temperature" in Table 4.1) is considered for the two words, as the few overlaps involving the other meanings ("Emotion" and "Weather" in Table 4.1) dramatically reduce the ratio compared to the temperature-related meaning alone.
1 https://thesaurus.com. 2 https://github.com/bradleyfowler123/thesaurus-api.
To overcome this issue, only one meaning is considered in the semantic similarity computation. A two-point heuristic is used to choose the best meaning for each of the considered words:
1. Only meanings representing the same part of speech (e.g., adjectives, adverbs) are selected for the computation of the similarity between two words.
2. The combination of meanings that produces the highest similarity estimate is taken.
For instance, in the example with freezing and cold, the noun meaning of cold is discarded because it is not compatible with the part of speech of the only meaning of freezing (adjective). The similarity is then computed for both remaining meanings of cold, and the meaning with the highest similarity (temperature, adj., similarity = 0.66) is considered the one describing the same property for both freezing and cold. In this example, it is easy to confirm that this is correct, as the best-selected meaning of cold corresponds to the adjective describing temperature, the same feature that freezing describes.
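The two-point heuristic can be sketched as follows; each meaning is assumed to be provided by the thesaurus as a (part of speech, synonym list) pair, and the similarity of a meaning pair is estimated with the first-order overlap ratio of Eq. 4.2.

```python
from collections import Counter
from itertools import product


def overlap_ratio(syn1, syn2):
    """Eq. 4.2 applied to two (non-empty) first-order synonym lists."""
    c1, c2 = Counter(syn1), Counter(syn2)
    sigma = sum(c1[w] * c2[w] for w in c1)
    return 0.5 * (sigma / len(syn1) + sigma / len(syn2))


def best_meaning_pair(meanings1, meanings2):
    """Two-point heuristic: same part of speech only, then the pair maximizing similarity.

    Each meaning is a (part_of_speech, synonym_list) tuple. Raises ValueError when the
    two words share no part of speech.
    """
    candidates = [
        (overlap_ratio(s1, s2), (pos1, s1), (pos2, s2))
        for (pos1, s1), (pos2, s2) in product(meanings1, meanings2)
        if pos1 == pos2  # point 1: compatible parts of speech
    ]
    return max(candidates, key=lambda c: c[0])  # point 2: highest similarity estimate
```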
4.4.3 Order of Synonymy
For practical computations of spectral semantic similarity, one has to select the order of synonymy i that yields the best distribution of similarity values in the range [0, 1]. To optimize the order of synonymy, the following three criteria are proposed:
1. Simplicity: The number of synonyms grows exponentially as a function of i, limited by the total number of terms describing the same feature. The computation of higher-order semantic similarity requires more time and resources; thus, a smaller i is preferable.
2. Coverage: To capture differences in similarity between words whose meanings are far apart, ideally all pairs of words should share at least one synonym, so that the similarity measure is always greater than 0 but approaches this value very closely when two words have opposed meanings. For this to hold, for any couple of words w1, w2 ∈ W, SYN^i(w1) ∪ SYN^i(w2) should cover the whole spectrum of W. This can be achieved by incrementing i, which increases the spectrum coverage thanks to the accumulation of imperfect synonyms, until the desired condition is met.
3. Completeness: Choosing a too-large value of i comes with the risk that all possible words w ∈ W describing the same feature belong to the synonym sets of the analyzed words w1, w2 ∈ W. In this case, the sets of synonyms would overlap heavily, independently of the real semantic similarity sim(w1, w2). To prevent this, i should be as small as possible.
To search for the optimal order of synonymy i, the semantic similarity measure was implemented in Python 3 with the help of the thesaurus-api module. First, the implemented similarity measure was applied to a list of 108 random adjectives and adverbs, manually categorized by the author into groups of words describing the same feature. The identified categories were: price, temperature, size, quality, difficulty, speed, look, quantity, age_people, age_things, brightness, height, weight, and width. For instance, the words selected in the category of adjectives describing quality were: perfect, bad, good, acceptable, mediocre, OK, great, and wonderful. Then, the similarity with different values of i was computed for all combinations of words belonging to the same category (384 word pairs in total) to select the i for which the three proposed criteria were best satisfied. Algorithm 4.1 is used for selecting the synonymy order i for sim_i satisfying the requirements of simplicity, coverage, and completeness. To simplify this, the smallest value of i that satisfies the coverage criterion is searched for, which implicitly accounts for simplicity. After that, the completeness criterion is also checked on the found value to ensure that all the requirements are reasonably satisfied.

Algorithm 4.1: Selection of optimal synonymy order i for sim_i (Adapted from [9])
Input: l, a matrix of words, where all rows belong to the same category; thresh, the maximum no. of pairs without common synonyms
Output: the smallest i satisfying the criterion of coverage
i = −1;
do
    i++;
    zeros = 0;
    for li in l do
        c = combinations(li, 2);
        for (w1, w2) in c do
            if sim_i(w1, w2) == 0 then
                zeros++;
            end
        end
    end
while zeros > thresh;
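A possible Python rendering of Algorithm 4.1; the sim callable (assumed here to take the order i and two words) stands for the implemented spectral similarity measure and is not part of the original listing.

from itertools import combinations

def select_synonymy_order(categories, sim, thresh):
    """Smallest order i for which at most `thresh` same-category word pairs have no
    common i-th order synonym (the coverage criterion of Algorithm 4.1).

    categories: list of word lists, each list describing one feature.
    sim:        callable sim(i, w1, w2) -> float, the order-i spectral similarity.
    thresh:     maximum tolerated number of pairs with similarity 0
                (set to 1% of the evaluated pairs in this chapter).
    """
    i = -1
    while True:
        i += 1
        zeros = 0
        for words in categories:
            for w1, w2 in combinations(words, 2):
                if sim(i, w1, w2) == 0:
                    zeros += 1
        if zeros <= thresh:   # do-while condition of Algorithm 4.1: loop while zeros > thresh
            return i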
To check for coverage on the i-th order synonymy, the number of word pairs w1, w2 ∈ W for which no overlaps occur (sim_i(w1, w2) = 0) is considered. In Algorithm 4.1, a threshold of 1% is selected, meaning that when more than 99% of the considered word pairs have at least one i-th order synonym in common, a good suboptimal i is found heuristically. Similarly, the level of completeness is indicated by pairs of words w1, w2 ∈ W that have an extremely high number of overlaps in their synonyms (sim_i(w1, w2) > θ) despite potentially not being close in meaning. Thus, if the majority (n > 50%) of the pairwise comparisons have a similarity higher than θ, the level of completeness is too high. Consequently, the similarity measure with the chosen order of synonymy would be inadequate. This ratio should be kept as low as possible while satisfying the other criteria.

Table 4.2 Results of the synonymy order selection

i   Ratio of sim_i(w1, w2) = 0   Ratio of sim_i(w1, w2) > 0.9
0   100%                         0%
1   50.8%                        0.5%
2   0.5%                         4.4%

The result of the optimization process on the 384 random word pairs with a threshold of 1% returns the minimum value of i satisfying the requirements as i = 2, as shown in Table 4.2. This value is also checked against the completeness criterion using θ = 0.9; the criterion is satisfied, as only 4.4% of the total comparisons have a similarity higher than this limit. Consequently, the second order of synonymy is the trade-off solution that satisfies all three criteria to a high degree while allowing the practical computation of the proposed spectral semantic similarity measure. For this reason, in future implementations (i.e., in the rest of this chapter and the following ones), the spectral semantic similarity measure between two words w1, w2 ∈ W that will be used is sim_2(w1, w2) when not specified differently. However, this observation is valid only when the same data source as here is used. If another thesaurus is employed as a basis for the estimation of word similarity, the optimal order of synonymy should be computed again, as it depends on how strong the relationship between two terms has to be for them to be listed as synonyms in the considered thesaurus. Based on the observations from this section, a script for the computation of the spectral semantic similarity between words, available online,3 was implemented as a module usable for the computation of the semantic similarity, for instance, for evaluation purposes.
4.5 Evaluation A test comparing various measures and human performance on a similarity-based word ordering task was conducted to evaluate the newly proposed spectral similarity measure. For this purpose, a survey with groups of scalar adjectives and adverbs to be sorted was created. This was then executed by human raters and several semantic similarity measures, including the one introduced in this chapter, and their results were compared.
3 https://github.com/colombmo/semanticsimilarity.
4.5.1 Methodology
As previously mentioned, similarity measures are relative: they do not mean much when considered alone but become meaningful when compared to other estimates. In other words, the main power of the semantic similarity measure lies in assessing whether the similarity between a specific pair of words is higher or lower than that between another pair. This property can easily be used for sorting words by their similarity to another word, which is a task that is easy for humans. For this reason, the validity of semantic similarity measures on scalar adjectives and adverbs is analyzed based on their capacity to perform word ranking tasks, compared to humans executing the same assignment, as demonstrated in [9, 45]. As a base for this evaluation, 15 sets of 5 scalar adjectives or adverbs belonging to the same category were pseudo-randomly selected by the author and verified by three collaborators, taking care to have variability in the following criteria between the different sets:
• Similarity: The selected words should have different degrees of similarity between them. For example, in the set {never, often, sometimes, regularly, always}, the words are much further apart in meaning than in the set {gigantic, massive, big, very big, huge}.
• Categories: The selected categories of words should be varied. In the specific case, sets of words belonging to the categories of frequency, temperature, size, quality, quantity, difficulty, and speed were picked.
For each of the sets of 5 elements, one of the two extremes was randomly selected based on the author's perceptions and confirmed by the assessment of three collaborators external to the project. For example, from the set {freezing, cold, mild, hot, burning}, freezing was selected as an extreme. The picked word was used as the element to which the others were compared for creating a ranking. This choice was made to ensure that all participants used the same reference system for their ranking, so that all differences in their answers are indeed due to diverging perceptions and not simply to different choices of reference system. A survey was created to collect people's perceptions based on the defined adjectives and adverbs and the selected extremes. Some demographic information, such as age and English level, was first collected; after this, the central part of the survey asked participants to order four scalar adjectives or adverbs from the one closest to the one furthest in meaning from the given word (i.e., the selected extreme). The wording of these questions was in the form "Please order the following words from the one which is the closest to the one which is the furthest in meaning from [word]." Figure 4.3 depicts a question investigating the ranking of the group {massive, very big, big, huge} with respect to gigantic, and the full list of the 15 questions asked is available in Appendix A. A script was implemented in Python 3 to answer the same questions using different similarity measures. A ranking can be obtained using similarity measures
Fig. 4.3 Survey for ranking words based on their similarity to an extreme
by simply computing the semantic similarity between the extreme word and the other four words in the set. Then the ranking corresponds to the elements sorted from the one with the highest to the one with the lowest similarity estimate. To compare different rankings, several methods exist. The most recurrent in the literature are Kendall's [18] and Spearman's [39] rank correlation coefficients. These indicate how closely correlated two orderings are, which can help compare the rankings of two distinct humans or of an algorithm and a human. In these methods, crisp rankings are considered, meaning that integers are used to indicate a word's position in the rankings. This implies that, whether two words have a very similar or a very different meaning, an inversion between them with respect to the ground truth has the same effect on the computation of the correlation. However, an inversion between very similar words can occur due to different perceptions on a fine-grained level, whereas an inversion between words that the large majority of people consider different is most likely an error in one of the compared rankings. Additionally, one can assume that such an error is not present in the ground truth. A solution to this issue involves using a weighted adaptation of Kendall's distance between rankings [20]. This measure can be employed to compare the rankings obtained with an algorithm RA to a ground truth RG (e.g., the answers of single participants in the survey and the average answer of all participants), counting the number of pairwise inversions in the compared rankings and weighting them based on the similarity between the exchanged terms.

Definition 4.7 Let [n] = {1, ..., n} be a set of indices corresponding to the terms to be ordered. Let Sn be the set of permutations of [n]; for σG, σA ∈ Sn, σG(i), resp. σA(i), indicates the crisp rank of element i in the ground truth ranking RG, resp. RA. Let D be a metric indicating the cost of a swap between two elements i, j. The Kendall distance between rankings weighted by the element inversion cost is defined as

KD(RG, RA) = Σ_{σG(i) < σG(j)} Dij · [σA(i) > σA(j)].    (4.3)

Consequently, one can say that BC should be composed of two elements, BC = {b1, b2}. To easily cover the whole spectrum, the best choice for b1, b2 is to take the two opposite extremes of C, in such a way that any new term w ∈ C lies between the elements of the basis (w* ∈ [b1*, b2*]) and that a certain symmetry is ensured in the selected basis.
Definition 5.1 Let C be the set of all terms describing the same property and C* the precisiated values of such terms. A fuzzy basis BC of C is composed of two elements. For simplicity, these are considered to be the words b1, b2 ∈ C such that their precisiated values correspond to the minimum min(C*) and the maximum max(C*) of C*. Considering the extremes of C as the components of BC allows using a simple heuristic for their precisiation. Indeed, under the previous assumption that the precisiation of scalar terms consists in mapping their meaning to a percentage of the spectrum, one can precisiate b1* = *0% (approximately 0%) and b2* = *100%, or vice versa, as the orientation of the spectrum is subjective and irrelevant if only percentages are considered. With this in mind, one can approximate the extremes with the help of the spectral semantic similarity measure sim2 and precisiate them as *0% and *100% to obtain a precisiated basis BC* on the domain C. The procedure for doing this in an automatic way is summarized in Algorithm 5.1 and consists in computing the similarity between all terms in a vocabulary describing the same feature and selecting the pair with the lowest similarity as the basis.
Algorithm 5.1: Selection of the basis for C as in [3]
Data: LC: a list of terms LC ⊆ C
Result: BC: a basis of C
min_similarity = ∞;
BC = ();
for all couples (w1, w2), w1, w2 ∈ LC do
    if sim(w1, w2) < min_similarity then
        BC = {w1, w2};
        min_similarity = sim(w1, w2);
    end
end
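A possible Python rendering of Algorithm 5.1, assuming a sim callable that implements the spectral semantic similarity of Chap. 4:

from itertools import combinations

def select_basis(terms, sim):
    """Pick as fuzzy basis the pair of terms in the vocabulary L_C with the lowest
    spectral semantic similarity (an approximation of the extremes of C).

    terms: list of words describing the same property (L_C ⊆ C).
    sim:   callable sim(w1, w2) -> float, e.g. the sim2 measure of Chap. 4.
    """
    basis, min_similarity = (), float("inf")
    for w1, w2 in combinations(terms, 2):
        s = sim(w1, w2)
        if s < min_similarity:
            basis, min_similarity = (w1, w2), s
    return basis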
One can notice that a vocabulary .LC ⊆ C has to be defined for this algorithm, similarly to one of the listed limitations of the Per-C. However, the vocabulary does not have to be application-specific or complete in this case. Indeed, only a small vocabulary has to be created for each domain of spectral terms. This is universal in the sense that it is common to any application and user. Also, missing words are not a problem, as those can be automatically precisiated using APM 1.0. Because of these properties, the vocabulary .LC could easily be manually constructed whenever needed, adding antonyms of the selected words to ensure
variability. Alternatively, online resources could be used to automatically build this, as lists of terms describing a specific property are available in various locations. However, when LC ≠ C, one cannot be sure that the selected basis corresponds to the actual extremes of C; only a good enough approximation is obtained. For this reason, BC is called a fuzzy basis. Although the term basis is borrowed from linear algebra because of its spanning property (i.e., all words in C can be represented as a combination of the elements of BC), BC is not a proper basis in this sense, as its elements are not linearly independent. With the selected basis, it is then possible to precisiate all the unknown words in the same category C on the spectrum C*, as a linear combination of the basis BC.
5.1.2 The APM 1.0 Algorithm Like the construction of vectors as a linear combination of basis elements, the APM 1.0 algorithm proposes to precisiate all words belonging to the same category as a combination of their precisiated basis elements, depending on their similarity. Definition 5.2 Let C be a domain of scalar terms describing the same property and w ∈ C any word belonging to that category. Let .BC = {b1 , b2 } be the basis of C, and .BC∗ = {b1∗ , b2∗ } the precisiated basis of C. The term w can be precisiated as follows, considering that its precisiated value lies between the elements of .BC :
w* = (sim2(b1, w) · b1* + sim2(b2, w) · b2*) / (sim2(b1, w) + sim2(b2, w)).    (5.1)
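Equation (5.1) translates into a few lines of Python; the sketch below hard-codes the sim2 values from the freezing/scorching/hot example discussed next, purely for illustration.

def precisiate_apm1(word, basis, basis_values, sim):
    """Eq. (5.1): precisiate `word` as a similarity-weighted combination of the
    precisiated basis elements.

    basis:        pair of basis words (b1, b2), e.g. ("freezing", "scorching").
    basis_values: their precisiated values (b1*, b2*), e.g. (0.0, 1.0) or (-5, 40).
    sim:          callable sim(w1, w2) -> float, the sim2 measure of Chap. 4.
    """
    (b1, b2), (v1, v2) = basis, basis_values
    s1, s2 = sim(b1, word), sim(b2, word)
    return (s1 * v1 + s2 * v2) / (s1 + s2)

# Hard-coded similarities, taken from the freezing/scorching/hot example below.
sim2 = lambda a, b: {("freezing", "hot"): 0.30, ("scorching", "hot"): 0.99}[(a, b)]
print(precisiate_apm1("hot", ("freezing", "scorching"), (0.0, 1.0), sim2))    # ~0.77
print(precisiate_apm1("hot", ("freezing", "scorching"), (-5.0, 40.0), sim2))  # ~30 (°C)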
For instance, with C the set of adjectives describing temperature, BC = {freezing, scorching} and BC* = {*0%, *100%}, one can precisiate the meaning of hot, with sim2(freezing, hot) = 0.30 and sim2(scorching, hot) = 0.99, as hot* = *77%. This could be put in the context of ambient temperature, where BC = {freezing, scorching} and BC* = {*−5 °C, *40 °C}, obtaining hot* = *30 °C, as shown in Fig. 5.2.

Definition 5.3 Let Cat be the set of all domains Ci of words describing the same feature. The heuristic for the selection of the correct basis for the precisiation of a word w consists in looking for the category Ci with the maximum number of overlaps between a small set of words Si ⊆ Ci from that category and the synonyms SYN(w) of w:

C = arg max_{Ci ∈ Cat} |Si ∩ SYN(w)|.    (5.2)
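A minimal sketch of this category-selection heuristic, assuming each known category is represented by a small set of sample words S_i:

def identify_category(word_synonyms, categories):
    """Eq. (5.2): choose the category whose sample words S_i overlap most with the
    synonyms SYN(w) of the word to be precisiated.

    word_synonyms: set of first-order synonyms of w.
    categories:    dict mapping a category name to a small set of sample words S_i.
    """
    return max(categories, key=lambda c: len(categories[c] & word_synonyms))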
Once C is found to be the category to which w belongs, its basis BC is selected to precisiate the meaning of w. With the presented approach, the results obtained by the algorithm are all in the form of fuzzy numbers (e.g., *10%), so to convert these to fuzzy sets, a ratio model [7, 8] is taken as a basis for the definition of a triangular membership function corresponding to *a. In the specific case, a triangular membership function T(a − 20%, a, a + 20%) is chosen, with an arbitrary width of 20% of the spectrum, as shown in the example of Fig. 5.2.

Fig. 5.2 Example of precisiation of hot using contextual information

However, this approach does not reflect people's real perceptions, as different words have a variable granularity level [5]. For example, freezing has a much narrower meaning than cold: a freezing temperature is also cold, but a cold temperature is not necessarily freezing. This was also described as one of the main limitations of this solution in a workshop with six experts in the field of CWW. Thus, mainly to improve the estimation of the granularity level of the precisiated words, but also as an attempt to increase the accuracy of APM 1.0, a second iteration of the algorithm was developed; it is presented in Sect. 5.2.
5.2 Automatic Precisiation of Meaning 2.0 This second iteration of the proposed algorithm for the precisiation of meaning focuses on the correct estimation of the width of the fuzzy set representing the meaning of the precisiated word. One can use the first-order synonyms to estimate the granularity level of a particular word and thus the width of its membership function. Indeed, it is possible to observe that, in general, words with low granularity (i.e., with a more general meaning), such as cold, have a broader spectrum of synonyms than more precise words, such as freezing. Other aspects to be considered for the improvement of APM 1.0 are the shape and symmetry of the chosen membership functions. For simplicity, only the width of the fuzzy sets is studied in this section, so an arbitrary symmetric shape is assumed for the searched membership functions. The proposed improvement of APM 1.0, Automatic Precisiation of Meaning 2.0 (APM 2.0), consists in precisiating all the synonyms .SYN(w) of the target word
w to be precisiated. The synonyms are precisiated using APM 1.0, and their mean and standard deviation are then exploited to define the fuzzy set .w ∗ precisiating w. To avoid confusion between the precisiation with the accurate estimate of the membership function’s shape and the one obtained with APM 1.0, one can refer to the latter as pre-precisiation and to the former simply as precisiation. Definition 5.4 Let .SYN(w) = {v1 , v2 , . . . , vn } be the set of the first-order synonyms of word w. Then .SYN∗ (w) = {v1∗ , v2∗ , . . . , vn∗ } is the set of synonyms of w pre-precisiated using APM 1.0. In the following, some design choices for defining the second version of the APM algorithm are motivated.
5.2.1 Algorithm Choices
One can assume that the meanings of the synonyms of a word w follow a normal distribution around the meaning of w: the probability of a word being considered a synonym of w decreases the farther its meaning is from the meaning of w, in both directions of the spectrum. To apply this observation, the chosen membership function representing the meaning of w has a Gaussian shape. It has to be noted that this is an approximation based on the assumption of symmetry of the membership functions, which does not necessarily correspond to the real perceptions of people. Since the pre-precisiations obtained with the APM 1.0 algorithm only lie in the range [0, 1], simply computing the mean and standard deviation of a set of pre-precisiated synonyms pushes toward the center of the spectrum the words whose precisiated values should, in reality, lie closer to the limits, as one can observe in Fig. 5.3a. To remedy this, one can extend the data obtained from the pre-precisiation to values outside of the range by mirroring them around the median. This solution allows centering the precisiation around the median and adjusting the standard deviation accordingly. Furthermore, to reduce the influence of outliers, the 1% of the data furthest from the median is discarded.

Definition 5.5 Let SYN*(w) be the set of pre-precisiated synonyms of w and v ∈ SYN*(w) a single element of this set. Then SYN*'(w) is the extension of this set obtained by mirroring it around its median (Mdn):

SYN*'(w) = SYN*(w) ∪ {2 · Mdn(SYN*(w)) − v | v ∈ SYN*(w)}.    (5.3)
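The mirroring of Eq. (5.3) and the subsequent trimming of the 1% of values furthest from the median can be sketched as follows (the function names are illustrative):

from statistics import median

def mirror_around_median(values):
    """Eq. (5.3): extend the pre-precisiated synonyms by mirroring them around their
    median, so that words near the limits of [0, 1] are not pulled toward the center
    when mean and standard deviation are computed."""
    mdn = median(values)
    return list(values) + [2 * mdn - v for v in values]

def trim_outliers(values, fraction=0.01):
    """Discard the given fraction of values furthest from the median (outlier removal)."""
    mdn = median(values)
    ordered = sorted(values, key=lambda v: abs(v - mdn))
    n_drop = int(len(ordered) * fraction)
    return ordered[:len(ordered) - n_drop] if n_drop else ordered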
An example of the results of this process is depicted in Fig. 5.3b. Furthermore, to improve the precisiation of the elements of the basis, which are always the extremes of the spectrum, a particular heuristic is used to ensure that those are indeed the extremes also when precisiated. To do so, if the word w to be precisiated is an element of the basis, then its membership function is set as
a Gaussian function centered around the pre-precisiated basis element (i.e., 0% or 100% of the spectrum), with a standard deviation obtained from the standard deviation of the pre-precisiated synonyms mirrored around this value.

Fig. 5.3 Example of precisiations of hot with APM 2.0
5.2.2 The APM 2.0 Algorithm Combining the presented choices, the APM 2.0 algorithm can be defined as follows.
Fig. 5.4 Example of precisiation of terms describing temperature using APM 2.0
Definition 5.6 Let C be a domain of scalar terms describing the same property and w ∈ C a word belonging to that category. Let BC = {b1, b2} be the basis of C and BC* = {b1*, b2*} the pre-precisiated basis. Then the membership function describing w can be computed with APM 2.0 as follows:

w*(x) = 1/(σ·√(2π)) · exp(−(1/2)·((x − bi*)/σ)²),   if w = bi,
w*(x) = 1/(σ·√(2π)) · exp(−(1/2)·((x − μ)/σ)²),     otherwise,    (5.4)
where μ and σ are, respectively, the mean and the standard deviation of SYN*'(w). This algorithm allows precisiating any spectral term on a normalized spectrum when the basis is precisiated automatically as {0%, 100%}. This can then be translated to other contexts by mapping it to other minimum and maximum limits of the spectrum, such as the height of adults, lying, for instance, between 120 and 200 cm. APM 2.0 allows precisiating different terms with a varying width of their membership functions, reflecting the granularity of their meanings. An example of some precisiated terms describing temperature can be found in Fig. 5.4, which was generated using a Python 3 implementation of the algorithm, available online.1 This does not represent a fuzzy partition [1] of the spectrum of temperature, but the precisiation of a random subset of the possible words, with different granularity levels, that people can use to describe perceived temperature.
1 https://github.com/colombmo/precisiation.
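A compact sketch of Definition 5.6, assuming the pre-precisiated synonym values from APM 1.0 are already available as a list; the 1% outlier trimming is omitted for brevity, and the function names are illustrative rather than those of the published implementation.

from math import exp, pi, sqrt
from statistics import mean, median, stdev

def precisiate_apm2(word, synonyms_pre, basis, basis_values):
    """Gaussian membership function for `word` as in Definition 5.6.

    synonyms_pre: list of pre-precisiated synonym values SYN*(w) from APM 1.0,
                  assumed to contain at least a handful of values.
    basis:        the two basis words (b1, b2); basis_values: their values, e.g. (0.0, 1.0).
    Returns (mu, sigma, membership_function).
    """
    pinned = dict(zip(basis, basis_values))
    if word in pinned:
        mu = pinned[word]                                     # basis words sit at 0% or 100%
        mirrored = list(synonyms_pre) + [2 * mu - v for v in synonyms_pre]
    else:
        mdn = median(synonyms_pre)
        mirrored = list(synonyms_pre) + [2 * mdn - v for v in synonyms_pre]  # Eq. (5.3)
        mu = mean(mirrored)                                   # equals the median of SYN*(w)
    sigma = stdev(mirrored)

    def membership(x):
        return (1 / (sigma * sqrt(2 * pi))) * exp(-0.5 * ((x - mu) / sigma) ** 2)

    return mu, sigma, membership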
5.3 Evaluation
To evaluate the accuracy of the word precisiation, two strategies are used. The first one compares the algorithms' results with manual precisiations done by humans, similarly to [3]. The second uses the same principle of sorted scalar adjectives as used in Chap. 4. The methodology for the two types of evaluation is presented first, followed by the results for both iterations of the presented algorithm.
5.3.1 Methodology In the first experiment, a workshop with a group of experts in CWW was conducted, in which they were first asked to manually precisiate a set of 20 spectral terms with the help of hand-drawn triangular functions. After this, the participants were shown the automatic precisiations with APM 1.0, and then feedback and possible algorithm improvements were discussed as reported in [3]. These comments were then used as a basis to build APM 2.0. The manual precisiations produced by the experts are compared to the results of the two iterations of the algorithms using the distance between their membership functions defuzzified with the mean of maxima method [20]. This allows having a rough estimate of the correctness of the location of the precisiated values on the spectrum compared to human raters. This process is done in comparison with single experts and the average of all participants and is compared to the distance between different raters as a baseline. In the second evaluation method, a similar process to the one used in the evaluation of the spectral semantic similarity measure (Chap. 4) is employed. 15 sets of 4 spectral terms are ordered with respect to a randomly selected extreme belonging to the same category, based on their precisiation, and the obtained results are compared to the same task performed by people. The employed ground truth corresponds to that described in Sect. 4.5. Definition 5.7 For the computation of a crisp ordering, the center of the membership function of each precisiated word is considered, and their crisp position .σA (i) with respect to one another is considered as their rank .τA (i) = σA (i). This way, the crisp ranking .RA,c is defined. The computation of a similarity-based ranking requires more complex computations. In this case, the idea is to consider that a different ordering in words with a very similar meaning can be explained with the subjectivity of perceptions and does not have to be considered as big an error as an inversion between words with very disparate meanings. One can observe that the similarity between two words, solely based on their precisiated values, can be related to the amount of overlap between their membership functions, as seen in Fig. 5.4. In this example, one can observe a reduced overlap
between two different words, such as hot and cold, and a high overlap between hot and scorching.

Definition 5.8 To exploit this property to create a ranking RA,s that considers the similarity between the analyzed words, it is possible to compute the similarity-based rank τA(i) of word wi, using the crisp rank σA(i) and the overlapping proportions of two membership functions ov(wi*, wj*), with the following equation:

τA(i) = (1 / Σ_{j=1}^{4} ov(wi*, wj*)) · Σ_{j=1}^{4} ov(wi*, wj*) · σA(j).    (5.5)
With this method, the similarity-based ranking .RA,s is defined. The accuracy of the rankings obtained with APM 1.0 and APM 2.0 compared to the ground truth can be computed as in Chap. 4 using the Kendall distance-based accuracy with Eq. 4.8.
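A minimal sketch of Eq. (5.5), assuming an overlap callable that returns the overlapping proportion of two precisiated membership functions:

def similarity_based_ranks(crisp_ranks, overlap):
    """Eq. (5.5): soften crisp ranks with the pairwise overlap of the precisiated
    membership functions, so that inversions between near-synonyms weigh less than
    inversions between clearly different words.

    crisp_ranks: dict word -> crisp rank sigma_A(word).
    overlap:     callable overlap(w_i, w_j) -> overlapping proportion of w_i*, w_j*.
    """
    words = list(crisp_ranks)
    tau = {}
    for wi in words:
        weights = {wj: overlap(wi, wj) for wj in words}
        total = sum(weights.values())
        tau[wi] = sum(weights[wj] * crisp_ranks[wj] for wj in words) / total
    return tau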
5.3.2 Results
Six researchers who are familiar with the concepts of CWW participated in a workshop where they gave feedback about the APM 1.0 algorithm. The experts precisiated 19 pseudo-randomly selected words using a triangular membership function. These data were used to compare the distance between the manual precisiations of single participants, respectively of their average, and three different conditions: manual, the baseline to assess the level of agreement between different individuals; the results of the APM 1.0 algorithm; and the results of the APM 2.0 algorithm. The results of this comparison are summarized in Table 5.1, where one can observe that the APM 2.0 algorithm performs slightly better than APM 1.0, but both perform on average worse than the inter-participant agreement level.

Table 5.1 Mean distance between manual precisiations and algorithmic ones

Algorithm   Distance from single rater   Distance from average rater
Humans      0.0767 (SD = 0.0704)         0.0626 (SD = 0.0529)
APM 1.0     0.1288 (SD = 0.0933)         0.1291 (SD = 0.0821)
APM 2.0     0.1102 (SD = 0.0842)         0.1060 (SD = 0.0770)

However, one can argue that the mean error of the algorithms, ranging between 10.60% and 12.91%, is still a reasonably good result compared to the inter-participant agreement level (7.67% with a pairwise comparison between participants, 6.26% with a comparison to the average rating). Indeed, the maximum distance between the precisiations of two experts is 35.00%, and that with the average of all participants is 25.00%, and similar results are obtained with the automatic precisiation algorithms. Despite this quite significant difference in the absolute precisiation of words, people still understand what others mean, so one could reasonably argue that a difference between precisiations of up to at least 20% can be attributed to subjectivity, and any difference smaller than this should be considered a fairly accurate estimation of the meaning of a word. It should also be noted that this comparison considers the defuzzified values of the precisiated words, while the width of the obtained membership functions also accounts in part for the differences. To give more weight to the width of the membership functions describing the meaning of the precisiated words in the evaluation of their accuracy, a possibility is to consider the accuracy of the ranking of sets of four words compared to a ground truth obtained from human raters, similarly to the evaluation in Chap. 4. Employing the methods from Sect. 5.3.1, notably the ranking based on membership function overlap from Eq. 5.5, one can compare the accuracy of the rankings obtained with the two iterations of the precisiation algorithm to the rankings obtained with Colombo and Portmann's semantic similarity measure [4] and to the inter-rater agreement. The results of this comparison are reported in Fig. 5.5, for the cases using the crisp and similarity-based rankings, compared to single raters and the average of all participants.
Fig. 5.5 Accuracy of rankings using precisiation of meaning
The results with this method roughly correspond to those obtained with the orderings using the semantic similarity measure and the inter-rater agreement. A slightly improved accuracy is observed in the second iteration of the APM algorithm compared to the first iteration. For example, the accuracy compared to the average human rater using the similarity-based ranking is .0.9260 (SD = 0.1581) for APM 1.0, .0.9670 (SD = 0.0833) for APM 2.0, .0.9623 (SD = 0.0611) for the spectral semantic similarity measure, and .0.9772 (SD = 0.0529) for the inter-rater agreement. These observations prove the quality of the presented method for automatically precisiating the meaning of spectral terms, especially in the case of the APM 2.0 algorithm, which also provides an estimate of the granularity of the precisiated terms.
5.4 Concluding Remarks
In the current chapter, the first developments of an algorithm for the automatic precisiation of meaning [3] of spectral terms, based on the spectral semantic similarity measure [4], are presented and extended with a second iteration of the algorithm based on the observations of six experts. This work provides an essential step toward inferring the meaning of unknown words describing human perceptions based on their relations with other words belonging to the same category. Thanks to simple heuristics for the automatic selection and precisiation of a basis, previous knowledge of related words is not even necessary. This approach allows an in-depth estimation and representation of the meaning of words that can increase the level of cointension [11] between the user and the phenotropic system, making it possible to implement adaptivity and robustness in the phenotropic interface, which provides a more natural interaction from the point of view of the user. Although the results show that this automated technique provides results close to the manual precisiation of meaning, human input is still needed to put the precisiated words in specific contexts. For instance, precisiating the word tall does not provide the heights in cm that correspond to a tall person. A possible solution to this problem could consist of using a database with the minimum and maximum values of all specific spectra (e.g., human height) to put the normalized, percentage-based precisiation into the context of measurable quantities. In the proposed solution for estimating the meaning of a word w, the membership function is built with the standard deviation of the synonyms of all the meanings of w, and not only around the most relevant meaning. This might sometimes give a too broad estimation of the width of the membership function, but at the same time, this choice allows including all relevant meanings, as sometimes more than one meaning of a word is relevant for its precisiation. The choice of employing all meanings of w gives overall better results, as it is judged better to have slightly too wide membership functions than to underrepresent meanings of words. Additionally,
the choices employed in APM 2.0 make it so that the obtained membership functions have a Gaussian and symmetric shape, but these are not necessarily perfect representations of the real perception of people toward the meaning of words, as this could be slightly asymmetric in some cases. In the second iteration of the APM algorithm, a method to estimate the granularity of words has been introduced. This has the advantage of allowing the discovery of a hierarchy in words describing the same feature. However, the correctness of such a granulation should be further explored with more specific experiments. Moreover, according to Mendel’s observations, type-2 fuzzy sets should be used to describe words to better represent the subjectivity factor (i.e., the fact that words have slightly different meanings for different people) [13]. Thus, in future iterations, the proposed model for automatic precisiation of meaning should be extended to use type-2 fuzzy membership functions. This could be achieved by considering the synonyms of w divided by the several meanings, creating a different membership function for each connotation, and then using this variability to define the bounding functions of the type-2 membership function. Despite future works that should be developed to further improve the accuracy and correctness of the proposed algorithm, the current results already allow for a better understanding of spectral adjectives and adverbs. With this, it is possible to understand even more unknown words thanks to the application of analogical reasoning, which will be presented in Chap. 6. This provides a basis to better understand human perceptions and desires without requiring users to explicitly create rules to translate (precisiate) words to a language that a computer can easily use for computation and other operations.
References 1. Alonso Moral, J. M., Castiello, C., Magdalena, L., & Mencar, C. (2021). Interpretability constraints and criteria for fuzzy systems. In Explainable fuzzy systems (pp. 49–89). Springer. https://doi.org/10.1007/978-3-030-71098-9_3 2. Buecheler, T., Sieg, J. H., Füchslin, R. M., & Pfeifer, R. (2010). Crowdsourcing, open innovation and collective intelligence in the scientific method: a research agenda and operational framework. In The 12th International Conference on the Synthesis and Simulation of Living Systems (pp. 679–686). MIT Press. https://doi.org/10.21256/zhaw-4094 3. Colombo, M., & Portmann, E. (2020). An algorithm for the automatic precisiation of the meaning of adjectives. In Joint 11th International Conference on Soft Computing and Intelligent Systems and 21st International Symposium on Advanced Intelligent Systems (SCISISIS) (pp. 1–6). https://doi.org/10.1109/SCISISIS50064.2020.9322674 4. Colombo, M., & Portmann, E. (2021). Semantic similarity between adjectives and adverbs— the introduction of a new measure. In V. Kreinovich, N. Hoang Phuong (Eds.), Soft computing for biomedical applications and related topics (pp. 103–116). Springer. http://doi.org/10.1007/ 978-3-030-49536-7_10 5. Kaurova, O., Alexandrov, M., & Ponomareva, N. (2010). The study of sentiment word granularity for opinion analysis (a comparison with Maite Taboada works). International Journal on Social Media. MMM: Monitoring, Measurement, and Mining, 1(1), 45–57.
6. Kennedy, C., & McNally, L. (2005). Scale structure and the semantic typology of gradable predicates. Language, 81, 345–381. https://doi.org/10.1353/lan.2005.0071 7. Krifka, M. (2007). Approximate interpretation of number words (pp. 111–126). HumboldtUniversität zu Berlin, Philosophische Fakultät II. http://dx.doi.org/10.18452/9508 8. Lefort, S., Lesot, M. J., Zibetti, E., Tijus, C., & Detyniecki, M. (2017). Interpretation of approximate numerical expressions: Computational model and empirical study. International Journal of Approximate Reasoning, 82, 193–209. https://doi.org/10.1016/j.ijar.2016.12.004 9. Magdalena, L. (1997). Adapting the gain of an FLC with genetic algorithms. International Journal of Approximate Reasoning, 17(4), 327–349. Genetic Fuzzy Systems for Control and Robotics. https://doi.org/10.1016/S0888-613X(97)00001-7 10. Magdalena, L. (2002). On the role of context in hierarchical fuzzy controllers. International Journal of Intelligent Systems, 17(5), 471–493. https://doi.org/10.1002/int.10033 11. Mencar, C., Castiello, C., Cannone, R., & Fanelli, A. (2011). Design of fuzzy rule-based classifiers with semantic cointension. Information Sciences, 181(20), 4361–4377. Special Issue on Interpretable Fuzzy Systems. https://doi.org/10.1016/j.ins.2011.02.014 12. Mendel, J. (2001). The perceptual computer: an architecture for computing with words. In 10th IEEE International Conference on Fuzzy Systems. (Cat. No.01CH37297) (Vol. 1, pp. 35–38). https://doi.org/10.1109/FUZZ.2001.1007239 13. Mendel, J. (2007). Computing with words: Zadeh, Turing, Popper and Occam. IEEE Computational Intelligence Magazine, 2(4), 10–17. https://doi.org/10.1109/MCI.2007.9066897 14. Mendel, J., Zadeh, L. A., Trillas, E., Yager, R., Lawry, J., Hagras, H., & Guadarrama, S. (2010). What computing with words means to me [discussion forum]. IEEE Computational Intelligence Magazine, 5(1), 20–26. https://doi.org/10.1109/MCI.2009.934561 15. Novák, V. (2016). Linguistic characterization of time series. Fuzzy Sets and Systems, 285, 52– 72. https://doi.org/10.1016/j.fss.2015.07.017 16. Novák, V., & Lehmke, S. (2006). Logical structure of fuzzy if-then rules. Fuzzy Sets and Systems, 157(15), 2003–2029. https://doi.org/10.1016/j.fss.2006.02.011 17. Shabaninia, F. (2014). Z-mouse : A new tool in fuzzy logic theory. World Journal of Computer Application and Technology, 2(1), 22–27. https://doi.org/10.13189/wjcat.2014.020104 18. Trillas, E., Termini, S., Tabacchi, M. E., & Seising, R. (2015). Fuzziness, cognition and cybernetics: An outlook on future. In 2015 Conference of the International Fuzzy Systems Association and the European Society for Fuzzy Logic and Technology (IFSA-EUSFLAT-15) (pp. 1413–1418). Atlantis Press. https://doi.org/10.2991/ifsa-eusflat-15.2015.200 19. Zadeh, L. A. (1999). Fuzzy logic = computing with words. In Computing with words in information/intelligent systems (Vol. 1, pp. 3–23). Springer. https://doi.org/10.1109/91.493904 20. Zhao, R., & Govind, R. (1991). Defuzzification of fuzzy intervals. Fuzzy Sets and Systems, 43(1), 45 – 55. https://doi.org/10.1016/0165-0114(91)90020-Q
Chapter 6
Fuzzy Analogical Reasoning
The automatic precisiation of the meaning of words is a powerful technique for understanding perceptions described with natural language. A comprehensive understanding and better treatment of the meaning of certain words can be obtained with reasoning, which allows inferring the causes or consequences of the expressed perceptions. This process can prove particularly useful for an artificial entity to better adapt the interaction to the needs and desires of a user, as it allows mapping a known situation (base) to an unknown one (target). For instance, the knowledge about the reaction of an artificial entity expected by a user in a specific context can be extrapolated to react in an analogous way to other situations where the user's expectations are unknown. In the current chapter, a way of automatically handling reasoning by analogies is presented, which is a fundamental process for developing adaptive interfaces that try to infer the users' needs and react accordingly. This aspect also has the advantage of providing a more robust interaction, as the artificial entity tries to react to unknown situations in a meaningful way instead of failing without trying. This coincides with the fundamental idea of phenotropics of "trying to be an ever better guesser instead of a perfect decoder" [18]: instead of following precise protocols (here, reacting in a predefined way to known situations), the system decodes patterns in the current interaction, reflecting the behavior observed in human interactions rather than that observed in machines. In Sect. 6.1, the basics of fuzzy analogical reasoning are presented, and these are used in the implementation of the fuzzy analogical reasoning (FAR) prototype, detailed in Sect. 6.2, to which the author contributed in [6]. A Strengths, Weaknesses, Opportunities, and Threats (SWOT) analysis of the FAR prototype based on expert feedback is summarized in Sect. 6.3. A critical outlook on the FAR prototype and future improvements conclude the chapter in Sect. 6.4.
6.1 Analogical Reasoning
This section presents the general scheme on which fuzzy analogical reasoning is based. This consists of the fuzzy analogical scheme developed by D'Onofrio et al. [11], based on the work of Bouchon-Meunier and Valverde [4]. It strongly relies on the concept of resemblance between the problems to be solved and on mapping resemblance relations from one domain to another using an analogy. In this type of reasoning, the resemblance relations are based mainly on the semantics of the used relations. Furthermore, as the analogical scheme presented in [4, 11] is based on reasoning between fuzzy sets, a first idea for the extension of the same scheme to CWW is also introduced.

Definition 6.1 (Bouchon-Meunier et al. [3]) Let V and W be two linguistic variables defined on the universes X and Y, respectively. Let [0, 1]^X and [0, 1]^Y be the sets of fuzzy sets on X and Y, respectively. Then, for the given relations β ⊂ [0, 1]^X × [0, 1]^Y, RX ⊂ [0, 1]^X × [0, 1]^X, and RY ⊂ [0, 1]^Y × [0, 1]^Y, an analogical scheme is a mapping

R_{β,RX,RY} : [0, 1]^X × [0, 1]^Y × [0, 1]^X → [0, 1]^Y,    (6.1)
such that ∀A ∈ [0, 1]^X and ∀B ∈ [0, 1]^Y satisfying A β B, and ∀A' ∈ [0, 1]^X satisfying A RX A', the following properties hold:
1. B = R_{β,RX,RY}(A, B, A).
2. B' = R_{β,RX,RY}(A, B, A') satisfies A' β B' and B RY B'.
This can be used to find B' that resembles B through RY, such that A', B' are linked by β, whenever A, B are known to be related to one another by β and A, A' are related through RX. In other words, knowing the relations RX between A, A' and β between A, B, it is possible to find B' that is related to A' analogously to how B is related to A, as depicted in Fig. 6.1.

Fig. 6.1 The analogical scheme (Adapted from [3])

Definition 6.2 Let RX(A, A') be the resemblance relation between A and A' in X and β the relation between A and B. Then, by analogy, a B' exists such that the resemblance relation RY(B, B') between B and B' in Y is equivalent to RX(A, A'):

RX(A, A') = RY(B, B').    (6.2)

This property is used to construct a B' related to B by RY and to A' by β.
For instance, consider the following analogy “Knowing that the price of big houses is high, what is the price of small houses?”. In this case, the variable V on X represents the size of a house, while W on Y represents the price, .A = big, ' = small, .B = high. Then, an answer .B ' to the question can be found .A that satisfies .RX (A, A' ) = RY (B, B ' ). In this case, a possible solution could be ' .B = low. This is obtained assuming that .big, small, high, and low are labels of predefined fuzzy sets. The same method can be used to compute the corresponding analogy when the fuzzy sets are a priori not known, but only their labels. To do this, one can first use the APM 2.0 algorithm to precisiate the meaning of the used words, compute the analogy on the obtained fuzzy sets, and finally, select the answer .B ' . To transform ' .B to a word output, the label from a list of precisiated candidates can be selected such that its precisiated value is the closest to .B ' .
6.2 The FAR Prototype As a proof of concept for the practical application of analogical reasoning to real use-cases, a prototype implementing the analogical scheme on the task of reasoning with words is presented in this section. This has been implemented in Python 3 with an interface allowing people to input analogies and get as an output an answer solving them, including an explanation of the result. The script implementing the prototype is accessible online.1 First, some possible resemblance relations to be used as a basis for the reasoning process are presented for the two cases of conceptual and spectral analogies. These correspond to the type of similarities introduced in Chap. 4, the former handling the similarity based on the type of relations existing between concepts (e.g., is-a, capable-of, part-of), the latter relative to the positioning of scalar adjectives and adverbs on the spectrum of the possible values of the corresponding linguistic variable. Finally, the inputs, outputs, and some considerations on the explainability of the developed prototype handling both types of analogies are described.
6.2.1 Conceptual Analogies Possible resemblance relations between terms that the analogical scheme is built on include measures of distance, similarity, or other types of relatedness on the semantics of the related words. For the case of reasoning with concepts, a good data source describing the relations between concepts can be found in ontologies [20], knowledge graphs [26], and fuzzy ontologies [21]. These allow knowing the type,
1 https://github.com/colombmo/Analogical_Reasoning_Prototype.
and also the strength in the case of fuzzy ontologies, of the relationships existing between the various analyzed concepts. As also seen in Chap. 4, the conceptual semantic similarity is often computed employing these constructs, proving that they provide a good description of the effective relatedness of concepts. Assuming a complete enough knowledge base more or less precisely representing the knowledge about the state of the world, described by the relationships and interactions between objects and related concepts, this can be used as a data source to describe the resemblance relations to be employed for the analogical reasoning process. For example, imagine that a search engine knows that people who often search for the term engine are generally interested in cars. Thus it is economically beneficial to show them advertisements related to this topic (user profiling can have ethical implications [22, 27], not considered in this example). When a new user of the system searches for the term pedals, the engine can use analogical reasoning to infer the possible interests of this user to adapt the shown advertisement campaigns accordingly. In order to find the interests related to this search item, the system has to answer the question “what is to pedals as car is to engine,” also represented as “engine : car :: pedals : ?” using the standard notation for proportional analogies. A simple algorithm to provide an answer to this question consists in finding in a knowledge base the relationship existing between engine and car, for example, part-of, meaning that engine is part-of car. By analogy, one can then apply the same relationship to pedals in the same knowledge base to find an answer to the original question. In this case, one might find the answer bicycle; thus, the engine would infer that the user is interested in bicycles. If several answers are present in the knowledge base, they are all returned. These could be sorted by relevance if the employed knowledge base is a fuzzy ontology, an ontology containing information about the strength of relations between items, but this distinction cannot be obtained with crisp bases as all relationships are represented with the same strength and relevance. Although some alternatives for the computation of conceptual analogies exist in the literature [5, 17], the simple algorithm from the example is chosen for the practical computation of conceptual relatedness because of its simplicity and the explainability [1] associated with it. This is generalized in Algorithm 6.1.
Algorithm 6.1: Analogical reasoning with conceptual resemblance relations [6]
Input: An analogy with concepts A : A' :: B : ? as in Fig. 6.1; O, an extensive knowledge base.
Output: B', a set of possible results B' as in Fig. 6.1.
RX ← Find relationship between A and A' in O;
RY ← RX;
B' ← Apply RY to B in O.
Algorithm 6.1 is limited to the most straightforward cases where a direct relationship exists between A, A', and it can return an empty set if the same relation applied to B does not provide any result. To overcome the first limitation, one could, for example, describe RX, and consequently RY, as a concatenation of relationships in O. However, this is out of the scope of this work, as the main focus is on the computation of spectral analogies for the correct handling of human perceptions expressed with scalar adjectives and adverbs. Returning a set of answers to the original question, and not simply selecting the unique best answer, is motivated by the fact that, most of the time, several alternatives are equally correct. For instance, engine is not only part of cars, but also of other vehicles such as motorbikes. In an experiment, it has been shown that even though all people understand the analogical question in the same way, they do not always agree on a single answer; the results contain a subjective factor and are thus not identical between different people (e.g., often synonyms are used, but also concepts with different semantics) [11, 12]. For this reason, Algorithm 6.1 returns a list of all the possible answers instead of losing information by selecting only one. Algorithm 6.1 was implemented in the FAR prototype with Python 3.9, using ConceptNet 5.5 [26] as a knowledge base for the retrieval of the ontological relationships between words. ConceptNet is a multilingual knowledge graph representing common-sense relationships between concepts, constructed from information from various sources, such as crowd-sourced resources, games with a purpose [30], and expert-generated content [20]. This was selected as the most complete and accurate general-purpose knowledge base publicly available, since it extends some popular general-purpose ontologies such as WordNet [29] and covers any domain better than domain-specific fuzzy ontologies [23]. In the prototype, the expected input is in the form "A : A' :: B : ?." Some preprocessing is applied to the input to extract and lowercase the words corresponding to A, A', and B. These are then used to implement Algorithm 6.1 by retrieving the type of relationship between them using the ConceptNet API.2 Then the retrieved relationship type is used as an input to the ConceptNet API to obtain B' from B. Examples of conceptual analogical reasoning can be found in Sect. 6.2.3.
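A possible sketch of Algorithm 6.1 against the public ConceptNet REST endpoint, assuming the api.conceptnet.io /query route and the third-party requests package; for brevity, the direction of the retrieved relations is ignored and error handling is omitted.

import requests

API = "http://api.conceptnet.io"

def relations_between(a, b, lang="en"):
    """Relation types linking concept a and concept b in ConceptNet (R_X)."""
    edges = requests.get(f"{API}/query",
                         params={"node": f"/c/{lang}/{a}", "other": f"/c/{lang}/{b}"}
                         ).json()["edges"]
    return {edge["rel"]["@id"] for edge in edges}

def apply_relation(b, rel, lang="en", limit=20):
    """Concepts reached from b through the given relation (R_Y applied to B)."""
    edges = requests.get(f"{API}/query",
                         params={"start": f"/c/{lang}/{b}", "rel": rel, "limit": limit}
                         ).json()["edges"]
    return [edge["end"]["label"] for edge in edges]

def conceptual_analogy(a, a_prime, b):
    """Algorithm 6.1 sketch for 'a : a_prime :: b : ?', grouped by relation type."""
    return {rel: apply_relation(b, rel) for rel in relations_between(a, a_prime)}

# e.g. conceptual_analogy("engine", "car", "pedal") is expected to return, among
# others, candidates reached through /r/PartOf, such as bicycle.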
6.2.2 Spectral Analogies As described in Chap. 4, relationships between scalar adjectives and adverbs are generally not well represented in ontologies and similar knowledge bases, while for humans, it is straightforward to compute the spectral similarity between them. For this reason, a different approach than that for the conceptual resemblance relations
2 https://github.com/commonsense/conceptnet5/wiki/API.
has to be taken when computing the spectral resemblance relations between these types of terms. A good candidate for the computation of spectral resemblance relations could be Colombo and Portmann's [8] spectral semantic similarity measure presented in Chap. 4. However, semantic similarity measures are relative and not absolute: they are good at representing whether two items are closer or further in meaning than two other elements from the same domain, but they are not comparable across different domains. In other words, when considering two different linguistic variables V and W, the similarity between two linguistic terms A, A' in V cannot be transferred to two linguistic terms B, B' in W, as the linguistic variables V, W represent different domains in which the numbers representing the semantic similarity vary. As a consequence, the mapping RX(A, A') = RY(B, B') cannot be executed, and thus the spectral semantic similarity measure is not a good candidate resemblance relation for the case of spectral analogies. A domain-independent, normalized measure should be used as a resemblance relation to solve this problem. The proposed resemblance relation for computing analogies between scalar terms is based on the precisiated value of words, for example, precisiated with the APM 2.0 algorithm presented in Chap. 5 [7]. Indeed, this method precisiates the meaning of terms on a normalized scale (i.e., from 0 to 1) for any category of words describing the same feature, which overcomes the limitation of non-uniformity between distinct linguistic variables that the similarity measure presents. The idea of this method consists in precisiating the words of the analogy A, A', B, computing the distance between the precisiated values A*, A'* of A, A' as RX, and finding B''* such that RY applied to B* satisfies the relationship RX(A*, A'*) = RY(B*, B''*). If an answer in natural language form is expected, the term B' whose precisiated value B'* is closest to the target value B''* is selected from a list of linguistic terms describing the same linguistic variable W. The whole process is described in Algorithm 6.2 and depicted in Fig. 6.2.
Algorithm 6.2: Analogical reasoning with spectral resemblance relations [6]
Input: An analogy with scalar terms A : A' :: B : ? as in Fig. 6.1; O, an extensive knowledge base.
Output: A scalar term B' solving the analogy.
V ← Identify the category of A, A' with Eq. 5.2;
W ← Identify the category of B with Eq. 5.2;
A*, A'*, B* ← Precisiate A, A', B with APM 2.0;
RX ← dist(A*, A'*);
RY ← RX;
B''* ← B* ∓ RY;
B' ← select from a subset of candidates of W the word B' s.t. dist(B'*, B''*) is minimal.
The signed distance between fuzzy sets dSGD [31] is used to compute the distance between the precisiated values A*, A'* and B'*, B''*.
Fig. 6.2 Fuzzy analogical reasoning for spectral analogies (Adapted from [6])
Definition 6.3 Let A, B be two fuzzy sets, with Aα, Bα their α-cuts and Aα^L, Aα^R, Bα^L, Bα^R the left and right endpoints of the α-cuts, respectively. Then, the signed distance between A and B is defined as a mapping such that:

dSGD(A, B) = (1/2) · ∫₀¹ [Aα^L(α) + Aα^R(α) − Bα^L(α) − Bα^R(α)] dα.    (6.3)
In the presented algorithm, the membership functions obtained with APM 2.0 are symmetric, which means that, for any fuzzy set A constructed this way, ∀α ∈ [0, 1], (1/2)·[Aα^L(α) + Aα^R(α)] = μA, where μA is the center of the Gaussian membership function, as shown in Fig. 6.2. As a consequence of this observation, the signed distance can in this case be simplified as

dSGD(A, B) = μA − μB.    (6.4)
human intervention. The candidates for retrieving the answer B' to the analogy are selected from the short list of random terms belonging to the same category as B, which are also used to identify the category of B. In the prototype, the expected input is in the form "P(A, A') : Q(B, ?)", using the notation from Fig. 6.1 to differentiate this case from the conceptual analogy. However, P and Q are used only for presentation purposes and are not necessary for the reasoning process. For instance, the analogy "knowing that a Ferrari is expensive because it is fast, what is the price of a Fiat, knowing it is slow?" should be written as "Ferrari (fast, expensive) : Fiat (slow, ?)". Some preprocessing is applied to the input to extract and lowercase the words corresponding to A, A', B. These are then used to execute Algorithm 6.2 by precisiating them, computing their distance, mapping it to the other domain to find B''*, and consequently computing the best candidate B'.
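As an illustration of how this spectral analogy computation could be assembled, the following is a minimal Python sketch, not the published FAR code: the `precisiate` argument stands in for APM 2.0 and is assumed to return the center of the Gaussian membership function of a word on the normalized [0, 1] scale of its category.

```python
# Minimal sketch of Algorithm 6.2 (spectral analogies). `precisiate` is a
# placeholder for APM 2.0: it is assumed to map a word to its precisiated
# value in [0, 1]; it is not part of the original prototype code.

def signed_distance(mu_a: float, mu_b: float) -> float:
    """Simplified signed distance (Eq. 6.4) for symmetric membership functions."""
    return mu_a - mu_b

def solve_spectral_analogy(a, a_prime, b, candidates, precisiate):
    """Solve A : A' :: B : ? for scalar terms.

    `candidates` is a list of words describing the same linguistic variable
    as B; the answer is the candidate closest to the mapped target value.
    """
    a_star, a_prime_star, b_star = precisiate(a), precisiate(a_prime), precisiate(b)

    # Relationship between A and A' on the normalized scale.
    r_x = signed_distance(a_star, a_prime_star)

    # Map the relationship to the target domain (the "∓" of Algorithm 6.2):
    # keep the same direction if the result stays in [0, 1], otherwise flip it.
    b_target = b_star - r_x
    if not 0.0 <= b_target <= 1.0:
        b_target = b_star + r_x

    # Return the candidate whose precisiated value is closest to the target.
    return min(candidates, key=lambda w: abs(precisiate(w) - b_target))
```

For the analogy "motorbike (lightweight, fast) : tank (bulky, ?)", a call such as solve_spectral_analogy("lightweight", "fast", "bulky", candidates, precisiate) would return the candidate term whose precisiated value best satisfies the mapped relationship.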
6.2.3 Prototype Interface
The two presented algorithms were implemented in a prototype with a simple command-line interface. This allowed users to input an analogy in the form "A : A' :: B : ?" for conceptual analogies and "P(A, A') : Q(B, ?)" for spectral analogies. With simple regular expression matching, the system can tell which of the two types of analogy is to be employed and redirects the input terms to the corresponding algorithm after having lowercased them to prevent possible issues. The output of the FAR prototype consists not only of the answer to the proposed analogy (i.e., the set of answers for conceptual analogies and the term B' for spectral analogies) but also of an explanation of the process by which that answer has been obtained. In this way, the user can understand why a particular answer has been provided, allowing them to decide whether to accept or reject the machine's decision [1], which helps create a trusting relationship in the human–machine collaboration [13]. Furthermore, explanations can be helpful to see what process was involved in the search for the provided answer. The advantage of seeing this schema includes the possibility for users to identify the differences between their reasoning and that of the system when their answers do not match perfectly and to confirm, challenge, or make new assumptions based on the system's explanations [24]. This process also allows, for example, restricting the analogy computation in the conceptual case to a specific type of relationship. Explanations are provided in the prototype through transparency of the algorithms involved and textual explanations of the decisions taken in the fuzzy analogical reasoning process. The textual explanations are based on templates, as the targeted users for the evaluation are experts who are first instructed on the use of the prototype, but employing natural language generation libraries would be a fundamental addition to make these explanations easier to understand for the general public [14, 15].
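For instance, the dispatching between the two input formats could be sketched with regular expressions as follows; the patterns are illustrative assumptions, not the exact expressions used in the prototype.

```python
import re

# Illustrative patterns for the two accepted input formats; the actual
# expressions used in the FAR prototype may differ.
CONCEPTUAL = re.compile(r"^\s*(\w+)\s*:\s*(\w+)\s*::\s*(\w+)\s*:\s*\?\s*$")
SPECTRAL = re.compile(
    r"^\s*\w+\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*:\s*\w+\s*\(\s*(\w+)\s*,\s*\?\s*\)\s*$"
)

def dispatch(query: str):
    """Return the analogy type and the lowercased terms (A, A', B)."""
    query = query.lower()
    if m := CONCEPTUAL.match(query):
        return "conceptual", m.groups()
    if m := SPECTRAL.match(query):
        return "spectral", m.groups()
    raise ValueError("Unrecognized analogy format")

# dispatch("nucleus : atom :: sun : ?")              -> ("conceptual", ("nucleus", "atom", "sun"))
# dispatch("motorbike (lightweight, fast) : tank (bulky, ?)") -> ("spectral", ("lightweight", "fast", "bulky"))
```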
Fig. 6.3 Analogical reasoning with conceptual analogies in the prototype, including textual explanation (Reprinted from [6], ©2020 IEEE)
Fig. 6.4 Analogical reasoning with spectral analogies in the prototype, including textual explanation (Reprinted from [6], ©2020 IEEE)
The structure of the explanations in the case of the conceptual analogy consists of the identified relations R_X between A, A' and of the results B' of the analogy, displayed grouped by the type of relationship that allowed reaching that answer. An example for the analogy "nucleus : atom :: sun : ?" is presented in Fig. 6.3. In the case of spectral analogies, the explanations are a bit less straightforward, and they include, in addition to the answer to the analogy:
• The identified category of A, A' and B
• The precisiated values A*, A'* and B*, B'*, both as numbers (defuzzified values) and visually on the spectrum of all linguistic terms describing the same linguistic variable
• The distances dist(A*, A'*) and dist(B*, B'*)
An example for the analogy "motorbike (lightweight, fast) : tank (bulky, ?)" is presented in Fig. 6.4.
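As an illustration, a template of the kind used for spectral analogies could look like the following sketch; the wording and field names are hypothetical, not the prototype's actual templates.

```python
# Hypothetical template for the textual explanation of a spectral analogy;
# the wording used by the FAR prototype may differ.
SPECTRAL_TEMPLATE = (
    "The terms '{a}' and '{a_prime}' both describe {category_v}; "
    "their precisiated values are {a_star:.2f} and {a_prime_star:.2f}, "
    "so the relationship between them is {r_x:+.2f}. "
    "Applying the same relationship to '{b}' ({b_star:.2f}, describing {category_w}) "
    "gives a target value of {b_target:.2f}, whose closest term is '{answer}'."
)

def explain_spectral(**values) -> str:
    """Fill the template with the intermediate values of Algorithm 6.2."""
    return SPECTRAL_TEMPLATE.format(**values)
```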
6.3 Evaluation
In this section, a tentative evaluation of the current state of the analogical reasoning prototype is presented, and user feedback to be incorporated in future iterations of the design science research process is discussed.
6.3.1 Methodology Some datasets containing analogies exist in the literature, which allow comparing automated analogical reasoning models with ground truth, for instance, the Google analogy test set [19], the Bigger analogy test set [16], and the SAT analogy questions [28]. However, the use of these datasets as ground truth for fuzzy analogical reasoning has been highly criticized for various reasons [25]. These include, among others, the fact that some datasets mainly consider syntactic relationships instead of semantic ones and their assumption that linguistics and psychology are binary. Thus answers are either correct (corresponding to the ground truth) or incorrect (not corresponding to the ground truth). However, this has been proven to be untrue [32]. For instance, a group of humans performing the same analogical reasoning task mostly do not come up with the same answer but with different solutions, either using synonyms and similar words or taking different approaches in the reasoning [12]. These observations make it extremely difficult to compare automatic models for computing analogies to ground truth. Additionally, none of the existing datasets designed for this type of task includes analogies between scalar terms, the computation of which is the primary goal of the presented prototype. These reasons combined motivate not to execute a traditional evaluation of the FAR prototype based on the accuracy against a ground truth. Moreover, this being the first iteration of the FAR prototype, one of the most interesting points to explore is the feedback of potential users of the system in order to identify the points to be improved in future iterations. For the presented reasons, the current version of the prototype was tested and challenged in an unstructured way by eight researchers with backgrounds in applied fuzzy systems, machine learning, human–computer interaction, psychology, and business in the context of a workshop. The participants were instructed on using the FAR prototype and one with the same interface based on Word2Vec [19]. The Word2Vec-based version of the prototype was provided in order to have a baseline for comparison. This version only provided the answers to the analogy question, without an explanation of the results, as the fact of being based on word embeddings does not allow for a straightforward explanation of its reasoning process. After learning how to use the prototypes, the participants were allowed to use them freely for computing conceptual and spectral analogies for about 10 minutes.
This was followed by an unstructured discussion about the FAR prototype, in general, and compared to the alternative (i.e., Word2Vec-based).
6.3.2 Results For the case of the conceptual analogies, the feedback from the participants was generally positive. The results were similar with both prototypes, in the sense that the accuracy of both was very variable. Sometimes, the provided list of answers contained also the one that the participant would have given, along with other reasonable ones (e.g., “nucleus is to atom as sun is to?” returned “the universe” and “solar system”). Sometimes it contained the one that the participant would have given, but also some that did not make any sense according to them (e.g., “car is to engine as human is to?” returned “five fingers on each hand,” “two arms,” . . . , “gone to the moon”). In general, the provided analogies were far from perfect. However, the participants unanimously affirmed that the provision of textual explanations in the prototype added much value, as it allowed them to easily understand the reason for some mistakes, and based on that, they could tell which answers were potentially correct. Thanks to this feature, some participants noticed that many errors were caused by missing, incorrect, or too general relationships (e.g., of the type relatedTo) between words in the knowledge base. Additionally, some testers observed that answers could have been generally improved by considering the differences in the granularity level of A and .A' when computing .B ' from B. For instance, the answer to “nucleus is to atom as sun is to?” should be only “solar system,” as the answer “the universe” is too high in the hierarchy compared to “atom” with respect to “nucleus.” A possible solution to this could be using a knowledge base with information about the strength of relationships between elements, as the granular knowledge cube [10]. Further possible improvements of the system include handling relationships between elements that are not directly connected in the knowledge base but have a third concept in common and using similar relationships when an exact matching .RX = RY does not provide any answer. The results of the discussion with the experts regarding the FAR prototype with concepts are summarized in the SWOT analysis in Table 6.1. In the case of spectral analogies, participants were generally satisfied with the accuracy of the analogies computed by the prototype, which was most of the time returning synonyms or words with a very similar meaning to what they would have answered. From this point of view, the results of the alternative prototype were insufficient. Additionally, as in the previous case, the users found the explanations of the reasoning process beneficial for the interpretation of the correctness of the algorithm, although requiring slightly longer to understand than the ones for conceptual analogies.
Table 6.1 SWOT analysis of the FAR prototype for conceptual analogies
Strengths
– Simplicity
– Explainability
Weaknesses
– Lack of precision in some relationships in the knowledge base
– Strength of relationships missing
Opportunities
– Introduction of strength of relationships in the knowledge base
– Use of similar relationships
Threats
– Inability to handle indirect relationships, which the alternative can do (although not always well)
– Incomplete knowledge base
One common comment between the different participants was that sometimes the answer to an analogy was an uncommon word (e.g., scorching), when using a combination between a modifier and an adjective or adverb would be more natural for humans (e.g., extremely hot) [9]. As this would allow having a finer granularity in the provided answers and the user inputs, it would be worth extending either the semantic similarity measure [8] on which the FAR prototype is indirectly relying or using a set of rules allowing to adapt the membership functions of the precisiated terms [7] when they are combined with modifiers. Opportunities for improving the FAR prototype in terms of the number of possible candidate answers include, other than the ability to handle modifiers, the creation of new fuzzy knowledge bases with the spectral relationships between scalar terms (or their precisiated values) already included in their structure, as it would allow not only to accelerate the reasoning process but also to make it easier for users to observe the knowledge of the system and eventually take action to correct erroneous data [2]. A further aspect that none of the experts observed during the experiment but that came up in the discussion phase as a potential future improvement is the ability of the system to reason with a mix of scalar terms and other concepts, for instance, in the analogy “what is the speed of a tank, knowing that motorcycles are fast?”. This could be based on the combined attributes of the concepts (e.g., size, weight, the number of occupants) to compute the analogy. However, the complexity of such a reasoning process is very high, except if a universal similarity measure uniformly covering both conceptual and spectral similarity is first developed. The results of the discussion with the experts regarding the FAR prototype with spectral analogies are summarized in the SWOT analysis in Table 6.2.
6.4 Concluding Remarks
In the current chapter, the first iteration of the FAR prototype was presented, as well as a qualitative evaluation with eight experts. This represents a promising approach for computing analogies on scalar adjectives and adverbs using automatically
Table 6.2 SWOT analysis of the FAR prototype for spectral analogies
Strengths
– Accuracy
– Explainability
Weaknesses
– Inability to process modifiers
Opportunities
– Introduction of spectral similarity or precisiated values in the knowledge base
– Mechanism to let users correct data
Threats
– Inability to handle analogies of mixed type
precisiated words with the APM 2.0 algorithm. This prototype's novelty consists of the complete automation of the reasoning process on words describing perceptions, which was previously possible only on a theoretical basis or on analogies expressed as fuzzy sets instead of the representation employing words, which is more natural for humans. The developed prototype provides essential steps toward the automated understanding of relationships between data and the use of this knowledge to apply analogous relationships to unknown situations to be solved. For instance, assume that a user created a rule for which a virtual assistant sets the lighting intensity to 100% when the user says "it is too dark in here." Then fuzzy analogical reasoning can be used to infer a rule that applies when the same user issues the unknown (but similar to the original) command "it is too bright in here." This process corresponds to solving the analogy dark : 100% :: bright : ?. Fuzzy analogical reasoning is used in this form in Chap. 7 to increase the robustness and flexibility of a virtual assistant implementing phenotropic principles compared to a traditional assistant. With the FAR prototype being in the first iteration of the design science research process, the qualitative evaluation results are promising for the new case of spectral analogies and allow a fundamental analysis of the important points to address in future iterations of the prototype. The main improvement to be made lies in the ability to handle modifiers (e.g., very, quite, extremely) in addition to single adjectives and adverbs, as these are very commonly used in combination with a limited set of simple terms by people to express perceptions with more granularity and in a more straightforward way than with a more extensive set of terms. Similarly, a technique for handling combinations of different terms (e.g., between cold and hot, medium or high, not good) should also be developed. Furthermore, an effective way of exploiting the explanations provided by the prototype about its reasoning would be to allow people to correct the errors that can sometimes occur in the fuzzy analogical reasoning process in an interactive machine learning fashion [2]. Indeed, explanations have been observed to help people better identify and debug possible problems [1]. The combination of the presented practical implementation of fuzzy analogical reasoning, the automatic precisiation of meaning, and the spectral semantic similarity measure based on the concepts of CWW and perceptual computing can be
used in several ways to improve the understanding and correct handling of human perceptions and desires by artificial systems. This is fundamental for creating more understanding, personal and adaptive systems, which can provide robustness and flexibility in line with the design principles of phenotropic interaction. In the following Part IV, practical applications of the phenotropic interaction design principles are presented, building on the concepts introduced in the current Part III.
References 1. Alonso Moral, J. M., Castiello, C., Magdalena, L., & Mencar, C. (2021). Toward explainable artificial intelligence through fuzzy systems. In Explainable fuzzy systems: Paving the way from interpretable fuzzy systems to explainable AI systems (pp. 1–23). Springer. https://doi.org/10. 1007/978-3-030-71098-9_1 2. Amershi, S., Cakmak, M., Knox, W. B., & Kulesza, T. (2014). Power to the people: The role of humans in interactive machine learning. AI Magazine, 35(4), 105–120. https://doi.org/10. 1609/aimag.v35i4.2513 3. Bouchon-Meunier, B., Delechamp, J., Marsala, C., & Rifqi, M. (1997). Several forms of fuzzy analogical reasoning. In Proceedings of 6th International Fuzzy Systems Conference (Vol. 1, pp. 45–50). IEEE. https://doi.org/10.1109/FUZZY.1997.616342 4. Bouchon-Meunier, B., & Valverde, L. (1999). A fuzzy approach to analogical reasoning. Soft Computing, 3(3), 141–147. https://doi.org/10.1007/s005000050062f 5. Chang, C. Y., Lee, S. J., & Lai, C. C. (2017). Weighted word2vec based on the distance of words. In International Conference on Machine Learning and Cybernetics (ICMLC) (Vol. 2, pp. 563–568). https://doi.org/10.1109/ICMLC.2017.8108974 6. Colombo, M., D’Onofrio, S., & Portmann, E. (2020). Integration of fuzzy logic in analogical reasoning: A prototype. In IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP) (pp. 5–11). https://doi.org/10.1109/ICCP51029.2020. 9266156 7. Colombo, M., & Portmann, E. (2020). An algorithm for the automatic precisiation of the meaning of adjectives. In Joint 11th International Conference on Soft Computing and Intelligent Systems and 21st International Symposium on Advanced Intelligent Systems (SCISISIS) (pp. 1–6). https://doi.org/10.1109/SCISISIS50064.2020.9322674 8. Colombo, M., & Portmann, E. (2021). Semantic similarity between adjectives and adverbs— the introduction of a new measure. In V. Kreinovich, N. Hoang Phuong (Eds.), Soft computing for biomedical applications and related topics (pp. 103–116). Springer. http://doi.org/10.1007/ 978-3-030-49536-7_10 9. De Cock, M., & Kerre, E. E. (2004). Fuzzy modifiers based on fuzzy relations. Information Sciences, 160(1–4), 173–199. https://doi.org/10.1016/j.ins.2003.09.002 10. Denzler, A., Wehrle, M., & Meier, A. (2015). Building a hierarchical, granular knowledge cube. International Journal of Computer and Information Engineering, 9(6), 334–340. https:// doi.org/10.5281/zenodo.1107720 11. D’Onofrio, S., Müller, S. M., Papageorgiou, E. I., & Portmann, E. (2018). Fuzzy reasoning in cognitive cities: an exploratory work on fuzzy analogical reasoning using fuzzy cognitive maps. In IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) (pp. 1–8). IEEE. https://doi.org/10.1109/FUZZ-IEEE.2018.8491474 12. D’Onofrio, S., Müller, S. M., & Portmann, E. (2018). A fuzzy reasoning process for conversational agents in cognitive cities. In International Conference on Enterprise Information Systems (pp. 104–129). Springer. https://doi.org/10.1007/978-3-030-26169-6_6
13. Epstein, S. L. (2015). Wanted: Collaborative intelligence. Artificial Intelligence, 221, 36–45. https://doi.org/10.1016/j.artint.2014.12.006 14. Gatt, A., & Krahmer, E. (2018). Survey of the state of the art in natural language generation: Core tasks, applications and evaluation. Journal of Artificial Intelligence Research, 61, 65–170. 15. Gatt, A., & Reiter, E. (2009). SimpleNLG: A realisation engine for practical applications. In Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009) (pp. 90–93). 16. Gladkova, A., Drozd, A., & Matsuoka, S. (2016). Analogy-based detection of morphological and semantic relations with word embeddings: What works and what doesn’t. In Proceedings of the NAACL Student Research Workshop (pp. 8–15). Association for Computational Linguistics. https://doi.org/10.18653/v1/N16-2002 17. Honda, H., & Hagiwara, M. (2021). Analogical reasoning with deep learning-based symbolic processing. IEEE Access, 9, 121859–121870. https://doi.org/10.1109/ACCESS.2021.3109443 18. Lanier, J. (2003). Why Gordian software has convinced me to believe in the reality of cats and apples. https://www.edge.org. Visited on Apr. 2022 19. Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. In Proceedings of Workshop at International Conference on Learning Representations (ICLR). 20. Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the Association for Computing Machinery, 38(11), 39–41. https://doi.org/10.1145/219717.219748 21. Portmann, E. (2012). The FORA framework: A fuzzy grassroots ontology for online reputation management. Springer. 22. Portmann, E., & D’Onofrio, S. (2022). Computational ethics. HMD Praxis der Wirtschaftsinformatik, 59(2), 447–467. 23. Quan, T. T., Hui, S. C., & Cao, T. H. (2004). FOGA: A fuzzy ontology generation framework for scholarly semantic web. In Proceedings of the Knowledge Discovery and Ontologies Workshop (Vol. 24). Citeseer. https://doi.org/10.1109/TKDE.2006.87 24. Rieg, T., Frick, J., Baumgartl, H., & Buettner, R. (2020). Demonstration of the potential of white-box machine learning approaches to gain insights from cardiovascular disease electrocardiograms. PLOS ONE, 15(12), 1–20. https://doi.org/10.1371/journal.pone.0243615 25. Rogers, A., Drozd, A., & Li, B. (2017). The (too many) problems of analogical reasoning with word vectors. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (SEM) (pp. 135–148). Association for Computational Linguistics. https://doi.org/ 10.18653/v1/S17-1017 26. Speer, R., Chin, J., & Havasi, C. (2017). ConceptNet 5.5: An open multilingual graph of general knowledge. In Proceedings of the Thirty-First Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (pp. 4444–4451). https://doi.org/10.48550/ arXiv.1612.03975 27. Teran, L., Pincay, J., Wallimann-Helmer, I., & Portmann, E. (2021). A literature review on digital ethics from a humanistic and sustainable perspective. In 14th International Conference on Theory and Practice of Electronic Governance (pp. 57–64). https://doi.org/10.1145/ 3494193.3494295 28. Turney, P. D., Littman, M. L., Bigham, J., & Shnayder, V. (2003). Combining independent modules to solve multiple-choice synonym and analogy problems. arXiv preprint cs/0309035. 29. Van Miltenburg, E. (2016). WordNet-based similarity metrics for adjectives. In Proceedings of the 8th Global WordNet Conference (GWC) (pp. 419–423). 30. 
Von Ahn, L. (2006). Games with a purpose. Computer, 39(6), 92–94. https://doi.org/10.1109/MC.2006.196 31. Yao, J. S., & Wu, K. (2000). Ranking fuzzy numbers based on decomposition principle and signed distance. Fuzzy Sets and Systems, 116(2), 275–288. https://doi.org/10.1016/S0165-0114(98)00122-5 32. Zadeh, L. (1983). The role of fuzzy logic in the management of uncertainty in expert systems. Fuzzy Sets and Systems, 11(1), 199–227. https://doi.org/10.1016/S0165-0114(83)80081-5
Part IV
Applications of Phenotropic Interaction
Chapter 7
Phenotropic Interaction in Virtual Assistants
Voice-based virtual assistants are one of the main conversation-based interfaces between people and technology, used mainly in pervasive control of home automation and hands-free interactions (e.g., while driving). The virtual assistant mostly interfaces with the user using a conversational engine (often, but not always, voice-based). There exist two main types of queries that the assistant can respond to: question, which indicates a request for information (e.g., "what time is it?"), and control, which asks the virtual assistant to act as a control hub changing the state of other devices (e.g., "set the thermostat to 20 °C"). In both cases, the virtual assistant interfaces with other systems via Application Programming Interface (API) requests to either request information or control a device and returns an answer to the user. To allow more customization, virtual assistants provide the possibility of easily programming some behaviors, the so-called skills. These allow the creation of simple rules specifying the type of task to be executed by the assistant when a specific event is observed. This includes the reaction to particular voice commands and other events (e.g., a measurement from a sensor). Skills are based on IF This Then That (IFTTT) rules, which structure the custom rule in an "IF trigger THEN action" fashion, such as "IF I say 'set the lighting to bright' THEN set the intensity of light XY to 100%," where "set the intensity of light XY to 100%" is, in reality, an API call to the service to set the intensity of light 1231 to 100%. In general, the accuracy of virtual assistants is not very good, both in understanding queries and in retrieving information. For instance, as analyzed in Perficient's 2019 study on the accuracy of virtual assistants [1], voice assistants are often not able to answer queries (they attempt an answer in only 40–80% of the queries, depending on the assistant) or provide incomplete or incorrect answers (50–95% of the attempted answers are correct and complete, depending on the assistant), when confronted with questions regarding general knowledge. These inaccuracies can increase even more in the case of personalized rules, as single rules are usually defined with only one possible wording, which has to be replicated precisely to
execute the wanted task. For example, if a rule “IF I say ‘set the lighting to bright in the living room’ THEN set the intensity of light XY to 100%” is created, then if a user uses the command “set a bright lighting in the living room,” the virtual assistant does not recognize this, as the wording is not the same. In this chapter, some techniques from the previous chapters are applied to the treatment of inputs to a virtual assistant in order to associate custom IFTTT rules with similar inputs, in terms of queries with the same semantics and a different syntax but also trying to extract some meaning from created rules and apply them to different situations. For example, the proposed technique aims at automatically extending the rule “IF I say ‘set the lighting to bright in the living room’ THEN set the intensity of light XY to 100%” to the case when the query is “set the lighting to dark in the living room,” by estimating the correct output of the rule for this case. This represents a way to make this type of interaction with an artificial agent more similar to natural interaction between two people, as the receiver of the information is not limited to performing an exact matching between its knowledge and the received query, but applies reasoning to try and understand the query, by matching it with similar knowledge. The presented improvement of classical IFTTT rules represents a tentative prototype, the Flexible Virtual Assistant (FVA) prototype, and a demonstration of a phenotropic human–computer interaction based on a conversational interface that aims to improve a virtual assistant’s robustness, flexibility, and adaptability. The presented prototype serves as proof that the introduction of the phenotropic interaction design principles can improve the interaction with virtual assistants. The base techniques used for extending and adapting the IFTTT rules are presented in Sect. 7.1, followed by the architecture and further details of the prototype of a virtual assistant implementing these techniques in Sect. 7.2. Next, a user evaluation based on some scenarios performed on the presented prototype is developed in Sect. 7.3. Finally, a critical outlook on the FVA prototype and future improvements conclude the chapter in Sect. 7.4.
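To fix ideas, a custom rule of the "IF trigger THEN action" form could be represented as in the following minimal sketch; the class and field names are illustrative and not taken from any specific IFTTT implementation.

```python
from dataclasses import dataclass

@dataclass
class IFTTTRule:
    """Minimal 'IF trigger THEN action' rule; field names are illustrative."""
    trigger: str   # natural-language command, e.g. "set the lighting to bright"
    device: str    # target device or API endpoint, e.g. "light XY"
    value: float   # value the device is set to when the trigger matches

# e.g., IFTTTRule(trigger="set the lighting to bright", device="light XY", value=100.0)
```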
7.1 Extension of If This Then That Rules
Two strategies are used to provide more flexibility to IFTTT rules in adapting physical systems to the perceptions or desires of people expressed with scalar terms. The first tries to find a match between two queries with a different syntax but similar semantics, which allows executing the proper operation when a query with the same meaning as the trigger of a known rule is asked. The second is used to infer new rules from the defined ones, based on APM 2.0 and fuzzy analogical reasoning, which allows, for example, inferring a rule for the query "set the lighting in the living room to dark" when only the rule for the query "set the lighting in the living room to bright" is known.
In order to handle both these strategies, several operations are executed in sequence. These include:
1. Query preprocessing
2. Query matching
3. Rule adaptation with fuzzy analogical reasoning
These steps are presented in detail in the following.
7.1.1 Query Preprocessing
To allow the personalization of the interaction with a virtual assistant, IFTTT rules can be created manually. As mentioned before, each rule is composed of a trigger, mostly a vocal command in the case of virtual assistants, and an action, which is a change in the state of another device. These rules can be created with simple interfaces like the one provided by IFTTT.1 In the following, the trigger will be considered to be in the form of a query asked to the virtual assistant. To structure these rules in a way that simplifies their matching with new queries and allows the inference of similar rules, the triggers and new queries are preprocessed in the same way. The preprocessing of queries consists in:
1. Query tokenization
2. Part-of-speech tagging
3. Extraction of the scalar adjective w used to describe the current perception or the desired output
4. Selection of the category W corresponding to w
As the first step, the query is divided into tokens [5], which corresponds to dividing the whole sentence of the query into the set of words used. Then, an intelligent and context-aware part-of-speech tagging is performed on the retrieved tokens. This allows identifying with reasonable accuracy the part of speech of each word and its type (e.g., comparative adjective, possessive pronoun). Using this information, a heuristic is used to select from the query the scalar adjective or adverb w describing the perception or desire of the user. This is done by selecting from the list of tokens the last one tagged as belonging to the parts of speech that can be used to describe scalar terms (i.e., adjective, comparative adjective, superlative adjective, predeterminer, adverb, comparative adverb, superlative adverb), excluding some elements that have been observed to be common in this type of query and that belong to the selected parts of speech, but not to scalar terms (e.g., in, here).
1 https://ifttt.com/.
Using the same technique presented in Eq. 5.2, the feature W (e.g., temperature) described by w is selected. The reason for this operation is twofold. W is used to group the triggers in a subset to speed up the matching with queries by first comparing them with known triggers categorized in the same W . Moreover, W is employed to mask the identified scalar term to improve the matching between similar queries. For example, if the trigger “set the lighting in the living room to dark” is registered in the system, the similar query “set the lighting in the living room to bright” might not be partially matched (more about partial matching in Sect. 7.1.2) with the known query, as the meaning of dark and bright is very different. Substituting the identified scalar terms in both queries with their category allows a perfect matching, as both are transformed to “set the lighting in the living room to brightness.” This is a useful operation to ensure that similar queries with different scalar terms are correctly matched, allowing these rules to be extended. The same operations are executed on both the trigger of newly created rules and new queries, the results from the former are saved along with the other information about the created rules, and the ones from the latter are directly used for the search for a match with the rules in the rules base.
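A minimal sketch of this preprocessing step could look as follows, assuming the spaCy model mentioned later in this chapter; `get_category` is a placeholder for the category selection of Eq. 5.2 and is not part of the actual prototype code.

```python
import spacy

# Sketch of the query preprocessing; requires the en_core_web_sm model to be
# installed. `get_category` is assumed to return e.g. "brightness" for "bright".
nlp = spacy.load("en_core_web_sm")

SCALAR_TAGS = {"JJ", "JJR", "JJS", "PDT", "RB", "RBR", "RBS"}
NON_SCALAR_EXCEPTIONS = {"in", "here"}  # common false positives mentioned in the text

def preprocess(query: str, get_category):
    """Extract the scalar term w, its category W, and the masked query."""
    doc = nlp(query.lower())
    scalar_tokens = [
        t for t in doc
        if t.tag_ in SCALAR_TAGS and t.text not in NON_SCALAR_EXCEPTIONS
    ]
    if not scalar_tokens:
        return None, None, query.lower()
    w = scalar_tokens[-1].text                   # last candidate token is kept
    category = get_category(w)                   # e.g. "brightness" for "bright"
    masked = query.lower().replace(w, category)  # mask the scalar term with its category
    return w, category, masked
```

With a suitable `get_category`, both "set the lighting in the living room to dark" and "set the lighting in the living room to bright" would be masked to "set the lighting in the living room to brightness", as described above.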
7.1.2 Query Matching
A match between any new query q and any trigger t_i from the saved rules base R allows executing the correct operation when q is asked. It is also crucial to try to infer the operation to be executed when a sentence similar to an existing trigger is queried. Two central problems are to be solved in the matching phase: queries with the same semantics but a different syntax should be correctly matched with triggers (e.g., the correspondence between "set a bright lighting in the living room" and "set the light in the living room to bright" should be identified), and queries with similar semantics but describing different perceptions should be partially matched to allow the inference of new rules (e.g., a partial correspondence should be identified between "set a dark lighting in the living room" and "set the light in the living room to bright"). Different types of matching are executed to solve these problems. Initially, a perfect matching (i.e., finding q and t_i with the same syntax) between a query q and any trigger t_i in the rules from R is tested. If a match is found, the action from the corresponding rule r_i is executed. If no perfect matching is found, a rule with a trigger with the same semantics but a different syntax than the query is searched. To do so, q and the triggers t_k corresponding to the rules including a scalar term describing the same feature W identified in q, {r_k | r_k ∈ R s.t. w ∈ t_k and w ∈ W}, are compared in search of a semantic matching. The semantic matching is obtained using the transformer-based sentence embedding model SBERT [4].
SBERT transforms the sentences q and t_k into 384-dimensional vectors q' and t_k' representing their meanings. The semantic similarity between q and t_k is then considered to correspond to the cosine similarity [3] between q' and t_k'. If the semantic similarity between q and t_k is maximal and higher than a certain threshold, fixed at 0.6 by trial and error, then a match between q and t_k is obtained. This means that a rule with a trigger with the same meaning as the query q has been found, and the action from the corresponding rule is executed. If no match has been found in the same category W, the same operation is tried on all the t_i's from the r_i's with scalar terms from any category. This can overcome the problem of a category not having been identified correctly. If no semantic matching has been found, then partial matches are searched. The same procedure as in the semantic matching is carried out to obtain this, first within the same category and then within all categories. The only difference is that the identified scalar terms in the query q and the triggers t_i are substituted by their category before searching for the matching using SBERT and the cosine similarity. If a match is found in this way, further steps are required to adapt the rule corresponding to the matched trigger to the different scalar term used in q to describe the user's perceptions, as presented in Sect. 7.1.3. If no partial match is found, then the query is considered unknown. An example of the process for partial matching is summarized in Fig. 7.1.
Fig. 7.1 Example of partial matching process
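The semantic matching step could be sketched as follows with the sentence-transformers model named in Sect. 7.2 and the 0.6 threshold reported above; this is an illustrative reconstruction, not the prototype's exact code.

```python
from sentence_transformers import SentenceTransformer, util

# Sketch of the semantic matching step; the model name and the 0.6 threshold
# are the ones reported in this chapter.
model = SentenceTransformer("all-MiniLM-L12-v2")
THRESHOLD = 0.6

def best_match(query: str, triggers: list[str]):
    """Return (index, similarity) of the best-matching trigger, or None."""
    if not triggers:
        return None
    embeddings = model.encode([query] + triggers, convert_to_tensor=True)
    similarities = util.cos_sim(embeddings[0], embeddings[1:])[0]
    best = int(similarities.argmax())
    if float(similarities[best]) >= THRESHOLD:
        return best, float(similarities[best])
    return None  # no semantic match; fall back to partial matching on masked queries
```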
7.1.3 Rule Adaptation with Fuzzy Analogical Reasoning
When a partial match is obtained, rules must be adapted using fuzzy analogical reasoning. Indeed, directly executing the actions corresponding to the matched rule in the rules base is inappropriate, as the operation desired by the user is probably different or even opposite. It is enough to think of the case where the query q "set the lighting to dark" is matched with the trigger t_i "set the light to bright" in the rule
base. In this case, executing the action of rule r_i (e.g., set the intensity of the light bulb to 100%) would be perceived as an incorrect operation by the user. To perform the correct operation, a relationship between the adjectives bright and dark can be found and mapped to the action of the rule r_i, to create a new rule r_i'. To map a rule to a similar rule handling a different perception, fuzzy analogical reasoning can be used on the identified scalar terms w in the query q and w' in the trigger t_i.
Definition 7.1 Let r_i be a rule, composed of a trigger t_i and an action a_i, in the form "IF t_i THEN a_i," with a_i in the form "set state of device d to x." Then the extended rule r_i' is triggered by the query q and executes the adapted action a_i' of the form "set state of device d to x'." The limits of the actions are defined by the minimum and maximum valid values for the status of device d, x, x' ∈ [x_min, x_max].
For example, for the case of the lighting intensity of a smart light bulb, the limits of x, x' are most likely [0%, 100%]. Fuzzy analogical reasoning is applied to w, w', and x in order to estimate x'. As x is expressed as a non-normalized number x ∈ [x_min, x_max], a scaling of the normalized values is carried out as part of the fuzzy analogical reasoning process. The algorithm used for this process is presented in Algorithm 7.1.
Algorithm 7.1: Fuzzy analogical reasoning for the estimation of the value x' of an extended rule r_i'
Input: Scalar terms w, w'; W, W' the categories of w, w'; x the value to which device d is set by rule r_i; x_min, x_max the limits of the values to which d can be set.
Output: x', the value to which device d is set by the extended rule r_i'.
w*, w'* ← Precisiate w, w' with APM 2.0 and their categories W, W';
R_X ← dist(w*, w'*);
R_Y ← R_X · |x_max − x_min|;
if x − R_Y ∈ [x_min, x_max] then x' ← x − R_Y else x' ← x + R_Y end
For instance, using the example of the query q "it is too bright in this room" matched with the rule "IF I say 'it is too dark in here' THEN set the intensity of light XY to 100%," Algorithm 7.1 can be used to infer a rule r_i' to be executed when the query q is used. To do so, bright and dark are precisiated with APM 2.0, returning, for example, bright* = 1 and dark* = 0. Then, the relationship

R_X = dist(dark*, bright*) = −1

between the two is computed with the signed distance [6]. The resulting relationship between the values x, x' of the actions a_i, a_i' can then be computed as

R_Y = R_X · |100% − 0%| = −100%.

Finally, the relationship R_Y is applied to x to obtain x' such that |R_Y| = dist(x, x') = |x − x'| and x' ∈ [x_min, x_max]:

x' = x ∓ R_Y = 0%.

Thus, the obtained extended rule r_i' built on r_i to be executed for query q is "IF I say 'it is too bright in this room' THEN set the intensity of light XY to 0%." The ∓ sign allows this type of reasoning to be executed on opposite scales, as shown in the previous example, where the direction of the signed distance between the precisiated words is not necessarily the same as between the associated values. Moreover, this process allows tentative reasoning on words that do not necessarily belong to the same scale, which aims to improve the flexibility of the system and its feature of being a good guesser of people's intentions. For instance, the query "set a low lighting" could be matched with the rule "set the lighting to bright," and the reasoning process would thus try to find a relationship between low and bright and use that as a basis to infer the corresponding rule. This can be done because the precisiated values of low and bright both lie on a normalized scale, and thus the distance between them can be computed even if they are not on the same spectrum. This provides an accurate estimation of r_i' in the cases where only one of the possible options x' = x − R_Y and x' = x + R_Y lies in the interval [x_min, x_max]. In the opposite case, there is a 50% chance of guessing the right rule, as the system possesses no knowledge about the correspondence of the orientation of the spectra of height (low) and brightness (bright); indeed, high could correspond to either bright or dark. The presented approach for the extension of IFTTT rules was implemented in the FVA prototype in order to test its concepts in a concrete user experiment.
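A minimal sketch of Algorithm 7.1 in Python could look as follows; `precisiate` again stands in for APM 2.0, and the signed distance of the precisiated values is simplified as in Eq. 6.4.

```python
# Minimal sketch of Algorithm 7.1; `precisiate` is a placeholder for APM 2.0,
# assumed to return the precisiated value of a word on a normalized [0, 1] scale.

def adapt_rule_value(w, w_prime, x, x_min, x_max, precisiate):
    """Estimate the value x' of the extended rule from the matched rule's value x.

    w is the scalar term of the new query, w' the one of the matched trigger,
    and x the value set by the matched rule, with x in [x_min, x_max].
    """
    w_star, w_prime_star = precisiate(w), precisiate(w_prime)
    r_x = w_star - w_prime_star      # relationship on the normalized scale
    r_y = r_x * abs(x_max - x_min)   # scaled to the device's value range

    x_new = x - r_y
    if not x_min <= x_new <= x_max:  # the "∓" of Algorithm 7.1
        x_new = x + r_y
    return x_new

# With bright* = 1 and dark* = 0 (as in the worked example above) and the matched
# rule setting the light to 100 % in the range [0, 100] %, the function returns 0 %,
# matching the adapted rule derived in the text.
```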
7.2 The FVA Prototype
Based on the presented technique for the extension of IFTTT rules, a prototype of a virtual assistant able to handle imprecision in IFTTT rules, named the Flexible Virtual Assistant (FVA) prototype, was developed. This section presents the implementation details, architecture, and interface of the FVA prototype. In the FVA prototype, interfaces for creating new rules and executing queries are needed, along with the execution of complex logic in the background to implement the procedure from Sect. 7.1, and the possibility of storing newly created rules in a
structured way. To satisfy these features, the Django framework v4.0.32 was chosen as a basis for the development of the FVA prototype as a web application. Django is a web framework based on Python, which allows a secure and straightforward development of web applications composed of a back-end handling the logic of the application in Python and the data storage in various available database technologies and a front-end for the creation of the user interface with a templating system based on HTML5,3 CSS3,4 and JavaScript.5 For this project, SQLite v3.36.06 was employed for the storage of the custom IFTTT rules, as being the lightest and simplest available technology in the Django framework, perfectly adapted for the low data storage requirements of the prototype. The overall architecture of the FVA prototype is presented in Fig. 7.2, where the two types of possible user interactions (i.e., rule creation and query execution) are separated. The interface for the creation of a new IFTTT depicted in Fig. 7.3a represents a structured way for users to define a new IFTTT rule with a text input. This is in the form “IF say trigger THEN set device to value,” where trigger is a sentence in natural language, device is a previously defined device (e.g., light bulb in the living room), and value is the value to which the device’s status is set when the corresponding trigger is queried. A Python script then preprocesses this new rule with the method described in Sect. 7.1.1 and saves it to an SQLite database. To better test IFTTT rules extension without considering the problems that could arise from the voice recognition model, the queries to the FVA are made through the textual query input in Fig. 7.3b. First, the query is preprocessed as described in Sect. 7.1.1 to identify and extract scalar terms. Then, with SBERT sentence embedding and cosine similarity, a matching with triggers from the database is searched. If none is found, a message specifying that the query was not understood is returned as in Fig. 7.3d. Otherwise, the action from the matched rule is directly returned in a textual manner in case of a semantic match, or the action is first inferred with fuzzy analogical reasoning as in Sect. 7.1.3 and adapted to be returned as in Fig. 7.3c. For the part-of-speech tagging in the search for scalar terms, the spaCy v3.3 module7 was employed, with the en_core_web_sm pre-trained tokenizer, tagger, parser, and named entity recognizer model. The sentence embeddings using sBERT were performed with the sentence-transformers v2.2.0 module,8 pre-trained all-MiniLM-L12-v2 model.9 Fuzzy analogical reasoning was based on the
2 https://djangoproject.com. 3 https://html.spec.whatwg.org. 4 https://w3.org/Style/CSS. 5 https://ecma-international.org. 6 https://sqlite.org. 7 https://spacy.io/. 8 https://sbert.net/. 9 https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2.
scripts for the spectral semantic similarity and the precisiation of meaning developed for Chaps. 4 and 5.
Fig. 7.2 Architecture of the FVA prototype
Fig. 7.3 Interface of the FVA prototype
The prototype was installed on a web server to be easily accessible by several users: the page for the creation of rules is accessible at http://smartassistant.ga/ifttt, and queries can be tested at http://smartassistant.ga/assistant/A. Each set of rules is associated with a user by setting and reading a cookie, so that the rules of different users are not mixed up while a single person can still set and use their own rules. The code used to implement this prototype is available online10 in a version easily runnable on a local machine.
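As an illustration of the rule storage, a Django model along the following lines could back the SQLite database; the model and field names are assumptions for the sketch and are not taken from the published code.

```python
from django.db import models

class Rule(models.Model):
    """Illustrative Django model for a stored IFTTT rule (field names assumed)."""
    owner_cookie = models.CharField(max_length=64)          # associates rules with a user
    trigger = models.TextField()                             # original trigger sentence
    masked_trigger = models.TextField()                      # trigger with the scalar term masked
    scalar_term = models.CharField(max_length=64, blank=True)
    category = models.CharField(max_length=64, blank=True)   # e.g. "brightness"
    device = models.CharField(max_length=128)
    value = models.FloatField()
    value_min = models.FloatField(default=0.0)
    value_max = models.FloatField(default=100.0)
```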
7.3 Evaluation
The presented prototype of a virtual assistant with the capability to extend IFTTT rules can be used as a basis to evaluate the accuracy of the developed methodology and the perceptions of people toward some aspects of the inclusion of phenotropic principles in the interaction between them and a virtual assistant. This section presents the methodology and the results of such a user evaluation in detail.
7.3.1 Methodology To analyze the extension of IFTTT rules and the perceptions about the flexibility and other aspects of the FVA implementing this, the used methodology consists of a user experiment comparing various aspects of the FVA prototype to an identical prototype implementing traditional IFTTT rules (i.e., recognizing rules only when an exact match between the query and the triggers exists). A within-user experiment was designed in which each participant had to execute some tasks in the context of two scenarios using the FVA and the traditional virtual assistant prototypes and subsequently answer some questions about their experience and perceptions of the prototype. The order of the prototypes is randomized. The first scenario is based on the control of the temperature of a room with a command and requires the creation of a rule to set a hot temperature, followed by the use of the same rule, a similar query to set the temperature to a different value (i.e., warm), and intentionally using a different wording of the trigger to set a hot temperature as if some time has passed since the definition of the rule and the exact wording has been forgotten. The second scenario is based on the control of the illuminance in a room with a command and requires the creation of a rule to set a bright illuminance. Similar to the first scenario, this is followed by the use of the query interface to express a
10 https://github.com/colombmo/smartassistant.
request to set a bright light in the room, a similar query to set the lighting to dark, and finally using a similar query to set a high brightness. The instructions in both scenarios are kept very vague on purpose to avoid influencing the wording that the people could use (e.g., “create a rule for which when you ask to set a ‘hot’ temperature in the living room, the heating is set to the wanted power (i.e., in the range [0–100%])”). Both scenarios are repeated by each participant using both prototypes. After each task, the user is asked if the virtual assistant performed the action they expected based on their query. In case an unexpected result is obtained, the participants are asked if they would prefer the virtual assistant to autonomously learn to understand their query and execute the corresponding action or if they would instead define a new IFTTT rule for every single request and wording that they might need to use, before performing any query. At the end of the interaction with each prototype, general questions about the virtual assistant’s perceived intelligence and flexibility and the interaction’s qualities with each prototype are asked. This allows a direct comparison of the perceptions of the extended and traditional versions of IFTTT. A complete list of the questions investigated in this survey can be found in Appendix C.
7.3.2 Results
28 persons fluent in English and with experience with virtual assistants were selected through the platform Prolific.11 Of these, the results of 7 participants were discarded based on their answers to a control question. Finally, the results of 21 non-English mother tongue participants in the experiment were obtained, aged between 20 and 53 years old (M = 29.4, SD = 7.9), primarily males (14 males, 7 females) and living on different continents but mainly in Europe (Europe: 16, Africa: 3, Asia: 1, South America: 1). The feedback of participants regarding the correct execution of the query, collected after they executed each task, could be used to compare the accuracy of the prototype handling IFTTT rules in a traditional way and that of the FVA prototype, which automatically extends and handles the IFTTT rules in a flexible way. The detailed results can be found in Fig. 7.4, where for each task, the percentages of "Yes," "No, it did not understand the query," and "No, it performed the wrong action" answers to the question "Did the assistant perform the requested task correctly?" are reported. Over all tasks combined, the FVA prototype performed the correct action programmed by users in IFTTT rules nearly twice as often as the traditional handling of IFTTT rules (FVA prototype: 81.7% of correct actions, traditional prototype: 42.1% of correct actions).
11 https://prolific.co.
Fig. 7.4 Answers to the question “Did the assistant perform the requested task correctly?” for each of the three tasks in both scenarios
Of the queries rated as unsuccessful, in the FVA prototype the majority were due to the wrong action being performed (43.5% unknown query, 56.5% wrong performed action). The opposite was observed in the errors of the traditional prototype (64.4% unknown query, 35.6% wrong performed action). These observations, combined with the analysis of the results of the single tasks from Fig. 7.4, indicate that the flexibility provided by the extension of the IFTTT rules yields an improvement in the understanding of queries. This is true for the cases where:
• A different perception or desire is expressed through a rule similar to the registered one (i.e., task S1.2, resp. S2.2, "Ask the assistant to set a 'warm' temperature in the living room, resp., 'dark' lighting in the office," when a rule was created only to react to when the user wants to set a "hot" temperature, resp., a "bright" lighting).
• A different wording than the original rule is used to express the same meaning (i.e., task S1.3 "Ask the assistant to set a 'hot' temperature. However, this time, imagine that a long time passed since when you defined the rule for this, so you should use a slightly different wording compared to the one that you defined, as if you forgot the exact wording that you defined for the query").
• A different type of scalar term is used to express the same feature (i.e., task S2.3 "Brightness, like other properties, could also be described using adjectives such as 'high brightness.' For us, it is easy to translate from 'bright' to 'high brightness,' for example. Try to ask the assistant to set a 'high brightness' in the office," when the defined rule used the term "bright" to express this trigger).
Also in tasks S1.1 and S2.1, where the participants are allowed, but not forced, to use the exact wording as in the trigger they defined, one can see a difference between
the two approaches. This is because the rules for both scenarios were created at the beginning of the first execution of each scenario and reused the second time the same scenario was performed with the other prototype. This approach allowed simulating a situation where some time has passed since the rule definition, and the users might not remember the exact wording they used. As one can observe in Fig. 7.4, approximately 30–40% of users were observed to use a variation of the rule that they defined, which was thus handled in the wrong way by the traditional prototype, but correctly by the FVA prototype, showing its ability to understand the semantics of the user queries. By observing the rules created by users and the queries they performed, it was possible to identify some of the most common limitations in the handling of queries by the FVA prototype. The following two main categories of errors were identified:
1. Although the instructions of the scenarios indicated to create a rule for which, when the user asks to set a hot temperature (scenario 1) or a bright lighting (scenario 2), a specific action is performed, some participants created rules that not only did not contain the word hot or bright but did not even contain any scalar term (e.g., "heat us up!", "brighten my day"). As the FVA prototype is built so that fuzzy analogical reasoning is applied only to scalar terms, queries partially matching these triggers cannot be correctly processed, as they do not contain any scalar term. This problem could be partially solved by making the prototype able to perform fuzzy analogical reasoning also on verbs and nouns derived from scalar terms, by transforming these words into the scalar terms they are derived from (e.g., heat → hot, cool → cold, brighten → bright, darken → dark).
2. Some queries were formulated with entirely different semantics to the trigger supposedly corresponding to them. For instance, for the trigger "Set the temperature to hot," a query "It's cold in here" was executed. When a person fully understands the semantics of these two sentences, it is clear that the expected action to be taken is similar. However, on a shallower level, the meaning of the two sentences is entirely different, as the first one expresses a command or a desire, and the second expresses a feeling or a perception. Because of this, the system cannot establish a match between the query and the trigger. The match between the two sentences could only be obtained by comparing the expected action to be performed when they are executed. However, a system able to find such a match would need to be able to infer the actions to be performed for any query without any other information from the user. This would render user-defined IFTTT rules useless, as such a system would need to have a high enough knowledge and understanding of the world to create its own meaningful set of IFTTT rules.
To investigate the opinion of the participants toward the principles behind the extended IFTTT rules, they were asked the question "Since you created a rule similar to the one that you just tried, would you prefer the assistant to learn from that rule how to interpret this new request and react correctly accordingly, or would you rather define a new rule for every specific case (e.g., hot, cold, medium, warm, chilly, freezing, burning, ... temperatures)?", after the assistant did not understand
the query or performed the wrong action. All participants were asked this question at least once. Those who were asked this more than once always gave the same answer. According to 76.2% of the participants, the system should learn to adapt the actions performed based on the similarity to the rules that were created. The other 23.8% prefer the system not to adapt the rules automatically, but would instead specifically define all the rules they might need. At the end of the use of each prototype, participants were asked to rate on a 1 (i.e., strongly disagree) to 5 (i.e., strongly agree) Likert scale how much they agreed with different statements regarding the last prototype they used (i.e., it behaved as expected, it was robust, it was flexible, it was understanding, it was unpredictable, it was reasoning in a human-like way) and regarding their perception of the interaction (i.e., it was natural, it was human-like, it required a high cognitive load, it was frustrating). The differences between the perceptions of the users regarding the FVA prototype and the one implementing traditional IFTTT rules were compared with a two-tailed paired t-test with Holm–Bonferroni correction for multiple comparisons, as reported in Fig. 7.5. According to the participants' perceptions, the implementation with the extension of the IFTTT rules behaved significantly more as expected than the one with the traditional IFTTT handling (FVA: M = 4.00, SD = 0.98, traditional: M = 2.95, SD = 1.00, p = 0.01). The FVA prototype was perceived as being
Fig. 7.5 Comparison of the perception of the FVA and the traditional prototypes (5 → strongly agree, 1 → strongly disagree). The stars indicate statistical significance of the t-test comparison of the prototypes (i.e., * → p < .05, ** → p < .01, ns → not significant)
significantly superior to the traditional one also in terms of being more robust (FVA: M = 3.76, SD = 0.97, traditional: M = 2.95, SD = 0.72, p = 0.03), flexible (FVA: M = 3.95, SD = 0.95, traditional: M = 2.67, SD = 1.08, p = 0.003), understanding (FVA: M = 3.81, SD = 1.14, traditional: M = 2.76, SD = 1.15, p = 0.03), and reasoning in a human-like way (FVA: M = 3.47, SD = 1.10, traditional: M = 2.05, SD = 0.95, p = 0.001). No significant difference was found between the two prototypes in terms of unpredictability (FVA: M = 2.05, SD = 1.05, traditional: M = 2.71, SD = 1.31, p = 0.07). This can be seen as a positive result, as a possible danger of the automatic extension of IFTTT rules is that unpredictable behavior is observed.

As for the prototypes, some features of the interaction with them were also compared with two-tailed paired t-tests with Holm–Bonferroni correction for multiple comparisons, as reported in Fig. 7.6. According to the participants' perceptions, the interaction with the FVA prototype was significantly more natural (FVA: M = 3.81, SD = 1.05, traditional: M = 2.81, SD = 1.37, p = 0.02) and more human-like (FVA: M = 3.48, SD = 1.18, traditional: M = 2.52, SD = 1.10, p = 0.02) than with the prototype implementing IFTTT rules in the traditional way. Moreover, the interaction with the FVA prototype was also significantly superior in terms of requiring a slightly lower cognitive load (FVA: M = 2.76, SD = 1.15, traditional: M = 3.52, SD = 1.21, p = 0.03), for instance, to remember the exact wording to be used for the queries and to understand why a query was not recognized, and it was perceived as being less
Fig. 7.6 Comparison of the perception of the interaction with the FVA and the traditional prototypes (5 → strongly agree, 1 → strongly disagree). The stars indicate statistical significance of the t-test comparison of the prototypes (i.e., * → p < .05, ** → p < .01)
frustrating thanks to the smaller number of unrecognized or incorrectly handled queries (FVA: M = 1.95, SD = 1.13, traditional: M = 3.00, SD = 1.35, p = 0.03). These combined observations about the users' perceptions of the two prototypes indicate that the FVA prototype successfully implements the phenotropic design principles in a human–machine conversational interface. Although some improvements to the employed algorithm are possible, and an extension to the case of voice input should be provided, this experiment provides first evidence that including the IFTTT rules extension in virtual assistants would increase their flexibility, robustness, and adaptivity, with various benefits to the users.
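For reference, the sketch below shows one possible way to carry out this kind of comparison in Python, with one paired t-test per statement and Holm–Bonferroni correction across statements. The ratings are toy placeholders, not the study data, and the use of scipy and statsmodels is an assumption about tooling rather than the authors' actual analysis script.

```python
# Sketch of the statistical comparison: one two-tailed paired t-test per statement,
# with Holm-Bonferroni correction across statements (toy ratings, not the study data).
import numpy as np
from scipy.stats import ttest_rel
from statsmodels.stats.multitest import multipletests

# ratings[statement] = (Likert ratings with the FVA prototype, with the traditional prototype),
# one value per participant, paired by participant.
ratings = {
    "behaved as expected": (np.array([4, 5, 3, 4, 5]), np.array([3, 3, 2, 4, 3])),
    "was robust":          (np.array([4, 4, 3, 4, 4]), np.array([3, 3, 3, 2, 3])),
    "was unpredictable":   (np.array([2, 1, 3, 2, 2]), np.array([3, 2, 4, 2, 3])),
}

raw_p = [ttest_rel(fva, trad).pvalue for fva, trad in ratings.values()]
reject, corrected_p, _, _ = multipletests(raw_p, alpha=0.05, method="holm")

for statement, p, significant in zip(ratings, corrected_p, reject):
    print(f"{statement}: corrected p = {p:.3f} ({'significant' if significant else 'ns'})")
```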
7.4 Concluding Remarks

A new method for handling IFTTT rules in a smart and flexible way based on fuzzy analogical reasoning and CWW was presented in this chapter in the form of a virtual assistant prototype. This allowed showcasing a practical application of phenotropic interaction design principles to a conversational interface between humans and machines. The results of a comparative user evaluation, based on the interaction with the developed prototype and with an alternative prototype implementing IFTTT rules in the traditional way, showed that the inclusion of the phenotropic interaction design principles made the system more flexible, robust, natural, and human-like in the perceptions of the users. The proposed system was also easier and less frustrating to interact with. These are encouraging first results in favor of introducing phenotropic principles not only in the specific case of virtual assistants but also in interactive systems in general. Indeed, they show that applying phenotropic interaction design principles allows conversations with artificial systems to be more natural and closer to those between people than conversations with the more widespread mechanistic interfaces, which do not handle the conversation in an intelligent, bio-inspired way.

Furthermore, the experiment made it possible to identify limitations and possible improvements to the presented prototype, as well as possible future research directions for the concrete development of phenotropic human–computer interfaces. One of the limitations observed in the extension of IFTTT, and thus in the FVA prototype, is that only rules containing one-word scalar terms are adapted. Handling composed terms built with modifiers, as well as nouns or verbs derived from scalar terms, could strongly improve the usability and naturalness of this prototype, as such terms are widely used for the expression of perceptions and desires (e.g., instead of saying "scorching," most people say "very hot"). Solving this limitation would require improving the algorithm for APM 2.0 and recognizing terms that are not directly scalar adjectives or adverbs but other word categories derived from them or a combination of these with modifiers.
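As a rough illustration of the kind of normalization this would require, the following sketch maps derived verbs or nouns and modifier compounds back to one-word scalar terms on which fuzzy analogical reasoning could then operate. The word lists and the function are illustrative assumptions and are not part of the FVA prototype.

```python
# Toy normalization of derived words and modifier compounds to base scalar terms
# (illustrative word lists; not part of the FVA prototype).
DERIVED_TO_SCALAR = {"heat": "hot", "cool": "cold", "brighten": "bright", "darken": "dark"}
SCALAR_TERMS = {"hot", "warm", "cold", "bright", "dark", "dim"}
MODIFIERS = {"very", "slightly", "extremely", "really"}

def normalize_scalar_term(phrase):
    """Return (modifiers, scalar term) if the phrase reduces to a scalar term, else None."""
    words = phrase.lower().split()
    mods = [w for w in words if w in MODIFIERS]
    rest = [w for w in words if w not in MODIFIERS]
    if len(rest) != 1:
        return None
    word = DERIVED_TO_SCALAR.get(rest[0], rest[0])
    return (mods, word) if word in SCALAR_TERMS else None

print(normalize_scalar_term("very hot"))    # (['very'], 'hot')
print(normalize_scalar_term("brighten"))    # ([], 'bright')
print(normalize_scalar_term("heat"))        # ([], 'hot')
```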
The handling of errors in the IFTTT rules extension is also an important aspect to be considered to improve the presented prototype, as it would allow the users to understand the reasons behind the errors and to slightly adapt the rules so that the system becomes better attuned to their way of expressing queries. This could be achieved by providing a good textual explanation of the reasoning process involved in the query processing and in the estimation of a new rule, similar to the approach adopted in Chap. 6, either on demand (e.g., the user asks "please explain why you took this decision") or automatically when an error of the type "unknown query" occurs.

In the specific case of virtual assistants for the control of building settings, a loop mechanism could be used to continuously adapt the system to the user's perceptions. This could be achieved by mapping percentages from the automatic precisiation of meaning back to (context-dependent) real-world values using sensor data. Also, when someone uses a command such as "it's hot," the precisiation of the meaning of hot could be improved by adding the temperature from a sensor to the data used for the APM 2.0 algorithm. This would add data about the subjective perception of the meaning of hot for the current user, improving the accuracy of the precisiation of the meaning of hot while allowing the system to learn about the subjective nuance of the expressed perception. This process could make the virtual assistant more and more personal over time. Moreover, in contexts where sensor data are available to the virtual assistant, comparing the user's interaction with the system against the sensor data could help discover relationships between them and create new rules that would allow the automation of the room control. For instance, a virtual assistant could observe that whenever the temperature in a room is approximately 23 °C or higher, the user almost always uses a command to reduce the temperature. The virtual assistant could then start constantly monitoring the temperature and act to prevent it from reaching 23 °C, without requiring the user to give an explicit command. This could improve the experience and the comfort of the user, as the temperature would always be in the desired interval without time being spent giving commands to the virtual assistant, and it would improve the interaction with the building [2], all while allowing the inferred rule to change depending on user feedback (i.e., new queries).

Ideas to allow even more flexibility in human–computer interfaces through phenotropics include the automatic handling of several modalities by finding correlations between them based on the underlying semantics. For example, an interface that has been designed to be used with voice could adapt to a user who cannot talk or who is in a noisy environment by extracting meaning from their gestures (not necessarily sign language) and matching them with the expected vocal commands. Alternatively, a combination of modalities could be achieved by a similar system, where a virtual assistant knowing a rule to be executed when the user says "turn on the light in the kitchen" could infer that the same rule should also be executed when the user points at the kitchen light while saying "turn this on." Also in this case, a match between the semantics of the rule and the multimodal query would need to be found to perform the correct action.
Although future work can be done to improve the extension of IFTTT following the phenotropic interaction design principles, the prototype presented in this chapter serves as a solid base for future developments and as proof that the introduction of these principles can improve the interaction not only with virtual assistants but also with other interactive systems.
References

1. Enge, E. (2019). Rating the smarts of the digital personal assistants in 2019. https://www.perficient.com/insights/research-hub/digital-personal-assistants-study. Visited on Apr. 2022
2. Nembrini, J., & Lalanne, D. (2017). Human-building interaction: When the machine becomes a building. In R. Bernhaupt, G. Dalvi, A. Joshi, D. K. Balkrishan, J. O'Neill, & M. Winckler (Eds.), 16th IFIP Conference on Human-Computer Interaction (INTERACT), Human-Computer Interaction, vol. LNCS-10514 Part II (pp. 348–369). Springer. https://doi.org/10.1007/978-3-319-67684-5_21
3. Rahutomo, F., Kitasuka, T., & Aritsugi, M. (2012). Semantic cosine similarity. In The 7th International Student Conference on Advanced Science and Technology ICAST (Vol. 4, p. 1).
4. Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 3982–3992). Association for Computational Linguistics. https://doi.org/10.18653/v1/D19-1410
5. Webster, J. J., & Kit, C. (1992). Tokenization as the initial phase in NLP. In The 14th International Conference on Computational Linguistics (Vol. 4). https://doi.org/10.3115/992424.992434
6. Yao, J. S., & Wu, K. (2000). Ranking fuzzy numbers based on decomposition principle and signed distance. Fuzzy Sets and Systems, 116(2), 275–288. https://doi.org/10.1016/S0165-0114(98)00122-5
Chapter 8
Phenotropic Interaction in Smart Cities
Modern cities and communities constitute a complex interplay between human, natural, and technical factors. They can be seen as one of the most advanced forms of sociotechnical system [37], in which a good balance between the various stakeholders is deemed fundamental. For example, a technological development introduced without considering the related social aspects does not necessarily improve the sociotechnical system as a whole.

Smart cities (SCs) represent a concept addressing efficiency problems of modern cities [17]. This concept originated from the application of ICT methods to solve city challenges. It has since been of significant interest to IT and telecommunication companies, which launched several market initiatives for more intelligent cities [32]. For this reason, the notion of SC is generally centered around a very technocratic view of urban management. However, the most critical component of cities is the people living in, working in, or visiting the urban area, which has been defined as the "living component" of the city and which adds complexity and creativity to the city ecosystem [13]. Thus, the smart development of cities should be centered on the well-being, quality of life, and sense of inclusion of the citizens, rather than solely on the overall optimization of processes and ICT use. To effectively achieve this, it is believed that citizens should be actively engaged in the development of the city [35] through a process of co-production and collaboration between the citizens, the city administration, and the other involved stakeholders [3]. The concept of cities centered around the needs of the citizens takes the name of human smart city (HSC) [9].

For the effective inclusion of people in the HSC development process, points of contact allowing exchanges between the city as an entity and the citizens have to be provided. These often consist of competitive or collaborative events and experiments [3], such as makeathons and living labs [26]. However, establishing digital points of contact can improve citizens' participation, as these are faster and more easily fitted into people's daily lives, allowing an even more significant part of the population to participate. These include, among others, crowdsourcing
campaigns [5, 6] and eVoting [29], but also methods for the collection of feedback regarding the city infrastructure [1]. Moreover, to establish more targeted developments of the SC, the smart city wheel can be employed. This provides a scoring of the different primary sectors of the city, allowing the identification of the sectors with a margin for improvement based on several criteria. Of course, the criteria selected for the assessment of the smartness of a city can vary [8]. Still, citizens' feedback regarding their perceptions of the city is fundamental for creating a citizen-centric overview of the HSC [9].

To handle the citizen–city digital points of contact in a robust, flexible, and citizen-centered way, the principles of phenotropic interaction can be applied to such an interface. This can allow a more extensive understanding of the citizens' needs, desires, and perceptions of the urban environment. A good enough adaptation of the city to citizen communications can even lead to an understanding of the reasons behind citizens' feedback, allowing the city to directly predict problems of a similar nature from latent city data, instead of relying solely on citizen communication.

In this chapter, two projects applying the phenotropic interaction framework and principles, in whose development the author participated, are presented. The first one, Jingle Jungle Maps [40] (Sect. 8.1), focuses on the use of data crowdsourced from citizens for the estimation of urban sounds where direct measurements are missing, as a form of indirect communication between citizens and the city. The second project, Streetwise [10] (Sect. 8.2), presents a methodology for creating a map of the citizens' perceptions of various areas of the city using street-level imagery. A critical outlook on the presented citizen–city phenotropic interfaces and the implications of such methods concludes the chapter in Sect. 8.3.
8.1 Jingle Jungle Maps

Noise pollution is a primary concern impacting the health and well-being [22] of millions of people worldwide [16], approximately 90% of whom live in urban areas. Significant sources of noise pollution include road, railway, and air traffic. Thus, it is fundamental for the city administration to monitor the soundscape of the urban area to identify areas where action should be taken to reduce the amount of noise pollution. Solutions to monitor noise mainly consist of sensor deployments in predefined locations in the area to be analyzed, complemented with computational estimates based on traffic models [33, 41], citizen-powered noise monitoring through a smartphone application [31], or public surveys involving citizens in the estimation of their perceptions of city noise. However, these solutions are costly in terms of infrastructure, people's engagement, and the time necessary for data collection. Moreover, ambient sensors consider only the strength of sounds, not how people perceive them. Indeed, the distinction between sound and noise depends on the intensity as well as on other features [28].
To overcome these limitations, the Jingle Jungle Maps project [40] focuses on creating an approximate, perception-based, and always up-to-date city soundscape based on the analysis of the social media feeds of people inside the city. This has the advantage of using data that citizens do not generate explicitly for noise pollution analysis but that can be reused for this task. Furthermore, Jingle Jungle Maps represents a phenotropic interface between citizens and the city, as it allows a seamless exchange of information without the citizens having to use a particular protocol. Indeed, the soundscape analysis is executed on data expressed completely freely in natural language, not necessarily directly describing sounds.
8.1.1 Problem Statement

To solve the problems related to noise pollution in cities, the city administration has to analyze the perceived soundscape in the urban area to identify problematic areas. To reduce the costs compared to traditional installations of ambient noise sensors and to allow the map to be generated often (e.g., to observe changes in the situation after the implementation of new measures or to analyze seasonal patterns in the soundscape), a new method for the accurate mapping of the city soundscape is researched. This could be used directly for the estimation of the soundscape or for the identification of selected, potentially noisy areas where ambient sensors could be placed. The tentative solution to this problem is based on analyzing text published on social media containing sound-bearing words. A careful analysis of such text snippets allows the estimation of the type and distance of sounds; combined with sentiment analysis, this makes it possible to estimate the perceived sound level, which has been proven to be strongly influenced by emotions [18, 27, 42].
8.1.2 Architecture

The proposed solution for the social media-based estimation of the soundscape of a city aims at analyzing sentences from social media posts, associated with the location where they were written, in order to estimate through CWW the perceived sound level at different locations in the urban area and subsequently visualize it on a map, using the architecture depicted in Fig. 8.1. As a source of social media posts, Flickr,1 a popular photo exchange platform, has been employed in the implementation of the Jingle Jungle prototype. This choice is motivated by the possibility of retrieving and filtering an unlimited number of data points based on sound-bearing words, including their geolocation, via a public API.
1 https://flickr.com.
Fig. 8.1 Architecture for the generation of the Jingle Jungle soundscape map (Adapted from [40])
As input to Jingle Jungle, the text of a social media post is taken (i.e., the image captions in Flickr), carefully selecting only posts containing sound-bearing words, on which three different operations are performed:

• Sound identification
• Distance recognition
• Emotion detection

Each operation is executed on the whole input, and its result is subsequently used as a feature to estimate the perceived sound level.

Sound Identification The sound described in the analyzed sentence is identified using the urban sound taxonomy [2]. This consists of matching all sound-bearing words (e.g., train, drilling) from the sentence with the ones present in the taxonomy, including the category they belong to among music, nature, transport, human, and mechanical. In addition, each sound-bearing word is also associated with a typical range of sound level and frequency [4].

Distance Recognition Words describing distance (e.g., near) are extracted from the input text and reduced for simplicity to three categories: close, normal, and far.

Emotion Detection As the perception of sound level has been proven to be influenced by the emotions of the listener [18, 27, 42], sentiment analysis is performed on the input sentence to extract emotions from the social media post. In the prototype, this is implemented using Indico's2 API, which extracts from a piece of text the probability that its overall sentiment corresponds to anger, fear, joy, sadness, or surprise.

The results from these three analyses are then recombined in two subsequent CWW inference modules to estimate the perceived loudness of sounds, considering their nature, distance, frequency, and emotional connotation. Both CWW inference modules are modeled on the Per-C architecture, consisting of an encoder of the information into fuzzy terms, a CWW engine performing the computations, and a
2 https://indicodata.ai.
Fig. 8.2 Overview of the Per-C module for the estimation of the sound level from the sound-bearing and distance-related words (Adapted from [40])
decoder transforming the results into a more human-friendly representation. These are implemented using the scikit-fuzzy3 Python module. The first Per-C, shown in Fig. 8.2, takes care of the estimation of the sound level from the type of identified sound (i.e., the sound-bearing word) and the term describing its distance from the observer. In the encoding phase, the typical ranges of frequency and sound level of the sound-bearing word are retrieved from the database [4]. The sound level range is then divided into three triangular membership functions describing the low, normal, and high values in the range. Based on the observation that the distance from the source impacts the intensity of a sound but not its frequency, and on the dampening effect of high and low frequencies on the sound level, a set of 9 fuzzy IF-THEN rules [34] is implemented in the CWW module to estimate the intensity of the sound as a function of the estimated distance of the observer from the source. For example, if the distance is far and the frequency is low, then the sound level will be low, and the frequency will remain low. If no words describing distance are present in the data, then a normal distance is assumed. In the decoding phase, the fuzzy result is transformed back into a crisp value representing the estimated sound level using centroid defuzzification [43].

The second Per-C module handles the effects of frequency and emotions on the perceived loudness of sounds. Indeed, it has been observed, as described by the equal-loudness curves [23], that the same sound level at higher frequencies is perceived as louder than at lower frequencies and, similarly, that negative emotions increase the perceived sound level [18, 27, 42]. Therefore, the outputs of the previous Per-C (i.e., the first approximation of the sound intensity and frequency) are used as inputs to the second module, combined with the emotions extracted from the text through sentiment analysis. In the CWW engine, these are used to compute the estimated perceived sound level employing 22 fuzzy IF-THEN rules combining loudness, frequency, and emotions. For example, if the
3 https://pythonhosted.org/scikit-fuzzy.
estimated sound level from the sound-bearing word and distance is loud and the frequency is in the range of bass, then the resulting perceived sound level is normal. Moreover, when negative emotions (i.e., anger, fear, and sadness) are identified in the text at a medium or high degree, this increases the perceived loudness of the sound. Finally, the fuzzy estimated perceived sound level obtained with this method is defuzzified with centroid defuzzification, so that a crisp value can be returned in decibels, along with a linguistic variable describing the sound level in words (e.g., very loud).

The information obtained from the original text at the end of this process consists of the coordinates of the event, the estimated perceived sound level, the sentiment of the text, and the category of the identified sounds. This can be used to visualize the soundscape of the city on a map, as shown in Fig. 8.3, by showing a dot at the location where the post was published or where the corresponding picture was taken, colored according to the category of the sound with the highest perceived level among the various sounds in the post. When clicking on a specific dot, more details about the corresponding estimation can be retrieved, such as the perceived sound intensity for the different sound-bearing word categories and the estimated emotional state of the message.

The presented architecture was implemented in the form of a prototype allowing the estimation of the soundscape of a city in order to provide an evaluation of
Fig. 8.3 Visualization of the soundscape for the city of Bern, Switzerland (Reprinted from [40], ©2020 IEEE)
the employed methodology. The code used to implement the prototype is available online.4
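To make the structure of such a Per-C module more concrete, the sketch below assembles a miniature version of the first module with scikit-fuzzy: triangular membership functions over assumed universes and two of the nine IF-THEN rules, defuzzified with the centroid method. The universes, membership parameters, and rule choices are illustrative assumptions, not the values used in the prototype.

```python
# Miniature sketch of the first Per-C module with scikit-fuzzy (assumed universes and
# membership parameters; only two of the nine IF-THEN rules are shown for illustration).
import numpy as np
import skfuzzy as fuzz
from skfuzzy import control as ctrl

distance = ctrl.Antecedent(np.arange(0, 11, 1), 'distance')        # 0 = very close, 10 = very far
frequency = ctrl.Antecedent(np.arange(0, 20001, 100), 'frequency')  # Hz
level = ctrl.Consequent(np.arange(30, 101, 1), 'level')             # estimated sound level in dB

distance['close'] = fuzz.trimf(distance.universe, [0, 0, 5])
distance['normal'] = fuzz.trimf(distance.universe, [2, 5, 8])
distance['far'] = fuzz.trimf(distance.universe, [5, 10, 10])

frequency['low'] = fuzz.trimf(frequency.universe, [0, 0, 5000])
frequency['medium'] = fuzz.trimf(frequency.universe, [2000, 8000, 14000])
frequency['high'] = fuzz.trimf(frequency.universe, [10000, 20000, 20000])

level['low'] = fuzz.trimf(level.universe, [30, 30, 60])
level['normal'] = fuzz.trimf(level.universe, [45, 65, 85])
level['high'] = fuzz.trimf(level.universe, [70, 100, 100])

rules = [
    ctrl.Rule(distance['far'] & frequency['low'], level['low']),     # far, low-frequency sound
    ctrl.Rule(distance['close'] & frequency['high'], level['high']), # close, high-frequency sound
]

sim = ctrl.ControlSystemSimulation(ctrl.ControlSystem(rules))
sim.input['distance'] = 8       # e.g., the caption mentions a "distant" sound
sim.input['frequency'] = 300    # e.g., a bass-heavy sound such as traffic rumble
sim.compute()                   # centroid defuzzification is the scikit-fuzzy default
print(f"Estimated sound level: {sim.output['level']:.1f} dB")
```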
8.1.3 Evaluation

As no ground truth about the perceived sound intensity at different locations is available to evaluate the proposed solution for the estimation of sound levels from social media data, the evaluation methodology aims at verifying the accuracy of the sound category identification. To do so, the fact that Flickr is mainly a picture-sharing platform is exploited. Five participants were asked to rate the correctness of the categories of sounds identified in the captions by looking at the corresponding picture. For example, if a post was recognized as containing mainly nature-related sounds and, by looking at the image, the rater estimated that the main sound one would hear in that location comes from nature (e.g., birds, a river), then the categorization was marked as successful; otherwise, it was marked as unsuccessful. Each participant rated 100 categorizations, for a total of 500. By combining their answers, the accuracy of Jingle Jungle's sound categorization is computed as Accuracy = correct classifications / total classifications = 79.4%. This result is encouraging for the creation of the soundscape of a city; however, the estimation of the perceived sound intensity from social media data is difficult to verify, and further studies should be executed in this direction. Additionally, the direct analysis of the pictures from Flickr or of street-level imagery could be employed instead of using the captions as the data to be analyzed. This would possibly allow a more precise estimation of the sound categories, but with added complexity in the analysis.
8.1.4 Phenotropic Interaction

Jingle Jungle represents an application of phenotropic interaction principles to an interface between the citizens, who indirectly participate in the estimation of the soundscape without having to expressly invest their precious time in this task, and the city, for which a more precise knowledge of its perceived soundscape can be generated. An overview of the phenotropic design principles applied in this project is presented in the following.

Not Using Crisp Protocols Citizens are not required to directly interact with the city through a fixed interface, such as a website or a survey where they have to provide structured feedback on their
4 https://github.com/colombmo/jinglejungle.
perceptions of noise. Indeed, they can communicate their perceptions indirectly, simply by expressing them naturally in free-form natural language on social media. The city can then extract these public data to analyze the perceptions, satisfaction, and desires of citizens.

Approximation Safe The ability of Jingle Jungle to effectively handle natural language, the inference process based on CWW and fuzzy rules, the analysis of the ratio between the various emotions perceived in the text, as well as the possibility of obtaining as output the perceived sound level estimation in the form of linguistic variables, make Jingle Jungle a flexible system that is very well suited for working with approximations.

Robust by Design The approximate nature of the system's predictions makes it more robust, as it tries to provide a good approximation of the perceived sound levels and categories rather than a perfect measure that several external factors can influence. Moreover, the map representing the soundscape of a city does not have to be extremely precise, so data aggregation on the map could be performed to provide an estimation of the on-average dominant type of sounds in various areas of the city, with the level of granularity selected to find a good compromise between the precision of the representation and the amount of noise. Indeed, data aggregation can hide incorrect classifications (outliers) by considering their closeness in space to other estimations. This can increase the robustness of the soundscape creation, as it provides more informative and correct content by reducing its granularity.

Improving Over Time Jingle Jungle naturally improves over time simply because the amount of data that can be analyzed is constantly increasing; thus, the generated map can be updated any time new data are available, and the granularity of the map increases with the amount of data. In addition to this improvement, Jingle Jungle has a certain ability to adapt to the natural language used by users, despite it being variable and subjective. Moreover, measurements from ambient sensors placed in some areas of the city could provide a feedback loop that would improve the system's estimation of the perceived sound level. The same could be obtained by letting people provide their perceptions or direct feedback on the data they can observe on the Jingle Jungle map.

Multimodal Multimodality is not expressly handled in Jingle Jungle. However, the estimation via social media posts could be extended by complementing the text data with an analysis of the pictures posted on Flickr. Moreover, the output map could also be generated by combining two different methods for estimating the perceived sound levels, for instance, by mixing ambient sensor measurements with the values calculated through the social media analysis.
8.2 Streetwise

The perception of spaces in a city positively or negatively influences the people living in or visiting those places. Indeed, it has been observed that spatial qualities have an essential effect on the ability to shape feelings and memories [20] and that the perceived atmosphere of a space is connected with crime [19]; for instance, a place with signs of civil disorder is likely to foster crime in the area. These observations show the importance of leveraging collective perception for constructing a comprehensive perceptual map of a city, which can be used as a basis for the targeted planning of measures to improve the city. The same analysis could also be applied to planned improvements to the city. For example, 3D renderings of several alternative implementations can be analyzed to choose the one with the best impact on the desired features.

Analyzing the perceptions of the built environment is not a straightforward task, as it is based on abstract concepts, such as atmosphere or comfort, which are perceptions that are not clearly defined in terms of the physical features of the city. In other words, there exists no list of elements (e.g., the height of buildings, the number of cars) that allows one to automatically estimate how a space's atmosphere is perceived. Still, when immersed in this environment, be it directly or through pictures, people can easily judge concrete situations, for example, in terms of their quality of stay or sense of security.

To perform this task, the Streetwise project [10] aims at creating a map of the perceived spatial quality of Switzerland, based on street-level imagery as the data from which people estimate their perceptions of different aspects of the city, extended with the help of machine learning (ML) techniques to obtain comprehensive results quickly, without the need for a considerable number of participants, which would be expensive and time-consuming. Furthermore, Streetwise provides a phenotropic interface between the citizens and the city, as it allows analyzing people's perceptions of the built environment by learning from and predicting their interaction with the interface, which allows it to become an "ever better guesser than a perfect decoder" [30] of citizens' perceptions.
8.2.1 Problem Statement

To improve the built environment of the city, including, among others, streets and parks, the current situation should be analyzed in terms of people's perceptions of different spatial features, such as atmosphere (how willing they are to spend some of their free time in the area) and safety (how safe they feel walking in the area). To perform this analysis, street-level imagery (e.g., Google Street View), which is publicly available in most Swiss cities, can be used as an adequate substitute for concrete physical immersion in different situations, as shown by Dubey et al. [15]. The accurate analysis of the perceptions of several (ideally all) geotagged
street-level images of the city allows the creation of a comprehensive map of perceived spatial quality, which serves as a basis for the identification of problematic or exemplary areas in the city.

Two central problems arise when the described goal is pursued. The first regards the difficulty of collecting people's perceptions of spatial quality from a picture. Indeed, similarly to the case of semantic similarity presented in Chap. 4, it is difficult for people to accurately describe a spatial quality in absolute terms. For instance, asking people "Please rate on a scale from 1 to 5 your perception of how safe the place depicted in the picture feels" is a complex task, which adds a layer of subjectivity to the interpretation of the question and of the possible answers. To overcome this limitation, people's perceptions are collected in relative terms, based on comparisons within pairs of different situations, as proposed by Salesses et al. [38].

The second problem concerns the collection of large amounts of data for the adequate coverage of urban areas of varying dimensions. For example, in the city of Zurich, several hundred thousand street-level images are available, and all of them would need to be analyzed to obtain a detailed map of perceptions of the whole city. However, the number of people available to participate in this tedious task is limited, as is the number of images that a single person can evaluate, since the task is very time-consuming. For this reason, the Streetwise interface aims to learn the features that make people perceive situations in one way rather than another and thus to be able to independently predict the perceived spatial quality of new pictures, without the need for human input after a learning and adaptation period. This is achieved using deep learning techniques for image processing and feature identification. In the description of the proposed solution, the focus is put on the computation of the perceived atmosphere of the city (i.e., the answer to the comparative question "Where would you rather stay?"). Nevertheless, the same pipeline can be used to compute different perceived features, such as safety [11] or bikeability [24].
8.2.2 Architecture

The proposed solution for estimating the perceived spatial quality of cities aims to analyze street-level imagery of the urban landscape to discover how the depicted places are perceived in terms of spatial quality, using the architecture depicted in Fig. 8.4. As a source of street-level imagery, Mapillary,5 a platform hosting crowdsourced street-level imagery and map data, has been employed for the retrieval of the images to be analyzed. This choice is motivated by the possibility of downloading and processing an unlimited number of images, as well as by the public nature of the available pictures, which anybody can upload in several formats, while the crowd filters out low-quality data.
5 https://mapillary.com.
Fig. 8.4 Architecture for the estimation of the perceived spatial quality of cities
In the presented architecture, street-level imagery is first retrieved from Mapillary for the area one is interested in (i.e., several cities and villages in the German-speaking part of Switzerland in our case), along with the corresponding coordinates. The images are then filtered and preprocessed to eliminate blurry pictures, equalize their colors, and crop borders to hide parts of the vehicle from which they were taken.

Using a random subset of all the collected images, a web interface is developed to facilitate the crowdsourced analysis of the perceived quality in the images (i.e., the perceived atmosphere in this case). First, the interface provides the user with an overview of the project goals and instructions on how to use the crowdsourcing application. After this, ten random pairs of street-level images are shown to the participant, who has to indicate for each pair their answer to the question "In which of the two presented places would you rather stay?" Either both or neither of the images can also be selected as an answer, with the possibility to specify a reason for an equal rating of two images and to flag a poor-quality image, which is then removed from the dataset. After the ten ratings are completed, a participant can choose whether they want to participate in a draw to win a smartphone and, after that, decide whether to continue with other ratings or stop their participation in the crowdsourcing experiment. During the crowdsourcing experiment, 10,766 pairwise image comparisons were collected from 1834 participants.

Since the number of image comparisons necessary for estimating the perceived atmosphere of most areas is very high (e.g., approximately 6 million comparisons for the analysis of the city of Zurich), the number of participants and the time needed if only crowdsourcing were employed would be extremely high. For this reason, the proposed solution aims at training an ML model able to automatically simulate the
(Figure content: Siamese branch 1 and Siamese branch 2, each performing feature extraction (e.g., VGG19), followed by concatenation and fully connected layers FC 1024, FC 1024, FC 256, and FC 2, producing the outputs P(top) and P(bottom).)
Fig. 8.5 Architecture of the Siamese CNN used for the automatic comparison of images (Adapted from [11])
crowdsourcing process with reasonable accuracy. This way, the data collected in the crowdsourcing are used to train and validate the model, which can then be employed for the fast computation of the perceived atmosphere in all images of the dataset, and even on new ones when they are uploaded. This allows the map to be constantly updated.

Similar to the task executed in the crowdsourcing, the proposed model takes a pair of images as input and returns the probability that either of them is judged as having a better perceived atmosphere than the other. To perform this operation, the Siamese Convolutional Neural Network (CNN) architecture depicted in Fig. 8.5 is employed. This takes as input two images, from which fundamental features are extracted using the feature extraction layers of the VGG19 architecture [39], a 19-layer deep CNN for object recognition, pre-trained on ImageNet [12]. The extracted features of the two images are then merged and fed to a feature comparison subnetwork. In this part, the model learns the image features that are important for estimating the better perceived atmosphere. The network's output is then the probability that the image on the top or bottom branch of the Siamese CNN is the one with the better perceived atmosphere.

The training of the Siamese CNN is performed using the data from the crowdsourcing as ground truth, employing an 80/20 training/validation split. Data augmentation is executed by mirroring the comparisons in the crowdsourcing in such a way that if two images have been compared with one in the top branch and the other in the bottom, a new entry is added to the ground truth where the position of the pictures is inverted, along with the result. Transfer learning is applied, meaning that only the feature comparison subnetwork is trained, while the VGG19 layers are frozen. In the case of the perceived atmosphere, the training was executed using batches of 64 data points with an initial learning rate of 0.0006, reduced by half every time the validation loss stagnated for ten epochs. After 400 epochs of training, the model's accuracy reached 69.09%, and the softmax loss 0.6853, as shown in Fig. 8.6.
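A minimal sketch of such a Siamese comparison network is given below, written here in Keras. The layer sizes follow the labels of Fig. 8.5, while the input resolution, optimizer, and other details are assumptions rather than the exact implementation used in Streetwise.

```python
# Minimal Keras sketch of the Siamese comparison network (layer sizes follow Fig. 8.5;
# input size, optimizer, and other details are assumptions, not the exact implementation).
import tensorflow as tf
from tensorflow.keras import layers, Model

# Shared VGG19 feature extractor, pre-trained on ImageNet and frozen (transfer learning).
backbone = tf.keras.applications.VGG19(weights="imagenet", include_top=False, pooling="avg")
backbone.trainable = False

img_top = layers.Input(shape=(224, 224, 3), name="top_image")
img_bottom = layers.Input(shape=(224, 224, 3), name="bottom_image")

# The two Siamese branches share the same (frozen) feature extractor.
features = layers.Concatenate()([backbone(img_top), backbone(img_bottom)])

# Feature comparison subnetwork: the only part that is trained.
x = layers.Dense(1024, activation="relu")(features)
x = layers.Dense(1024, activation="relu")(x)
x = layers.Dense(256, activation="relu")(x)
p_top_bottom = layers.Dense(2, activation="softmax", name="p_top_p_bottom")(x)

model = Model(inputs=[img_top, img_bottom], outputs=p_top_bottom)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=6e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])

# Halve the learning rate whenever the validation loss stagnates for ten epochs,
# mirroring the training schedule described above.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=10)
# model.fit([top_imgs, bottom_imgs], labels, batch_size=64, epochs=400,
#           validation_split=0.2, callbacks=[reduce_lr])
```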
Fig. 8.6 Accuracy and loss curves of the Siamese CNN on the training and validation sets with an 80/20 split
The obtained trained model can then be used to automatically estimate the perceived atmosphere in the different places represented by their street-level images. To rank the places with the best and worst atmosphere from pairwise comparisons, the TrueSkill score [21], a method commonly used to create rankings in gaming, is employed. With a set of at least 30 pairwise comparisons per image, this converges to an accurate scoring of the perceived atmosphere of each image. The principle is based on the fact that if an image "wins" a comparison (i.e., is recognized by the CNN as having the better perceived atmosphere), its score increases by an amount depending on the current difference in scores between the two considered pictures, and vice versa if the comparison is lost. After the scoring process is completed for all the pictures in the area to be analyzed, each image is associated with a score representing how good the perceived atmosphere at that location is (0 meaning a very bad atmosphere, 50 a very good one).

Thanks to the obtained scores and the information on their physical location (geographical coordinates), it is finally possible to visualize the perceived atmosphere of the different areas of the city on a map. A raw visualization can be produced by displaying each data point as a point on the map, as shown in Fig. 8.7a, color-coded based on the estimated perceived atmosphere at that location. This is useful in scenarios where results with a high granularity are wanted. However, it contains noise due to the relative subjectivity of perceptions, the poor quality of part of the used image data, and the imperfect accuracy of the employed CNN model, making the results more challenging to read and interpret.

To smooth the visualization, data can be aggregated. Aggregation can filter out the noise by analyzing more extensive areas and showing the mean score in each area. This has the advantage that bigger parts of the city can be analyzed without the risk that a bad-quality image strongly impacts the generated map. A fixed square tessellation is proposed as a possible data aggregation technique, as shown in Fig. 8.7b, where the map is divided into squares of a fixed size, colored based on the average score of all data points inside each square. However, the fixed size of the employed squares poses a problem, as squares containing a single data point amplify its impact on the overall visualization, and meaningful variations within the same square are erased.
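A small sketch of the TrueSkill-based scoring step described above is given below, using the trueskill Python package, which is assumed here as one possible implementation choice; the image identifiers and the list of comparisons are placeholders.

```python
# Sketch of turning pairwise CNN comparisons into per-image atmosphere scores with TrueSkill
# (the trueskill package is an assumed implementation choice; inputs are placeholders).
import trueskill

env = trueskill.TrueSkill(mu=25.0, sigma=25.0 / 3)     # default scale: mu roughly in [0, 50]

image_ids = ["img_001", "img_002", "img_003"]
# (top_id, bottom_id, p_top) triples, where p_top = P(top image has the better atmosphere).
cnn_comparisons = [("img_001", "img_002", 0.81), ("img_003", "img_001", 0.34)]

ratings = {img_id: env.create_rating() for img_id in image_ids}

for top_id, bottom_id, p_top in cnn_comparisons:
    winner, loser = (top_id, bottom_id) if p_top >= 0.5 else (bottom_id, top_id)
    # The winner's score increases (and the loser's decreases) depending on the current gap.
    ratings[winner], ratings[loser] = env.rate_1vs1(ratings[winner], ratings[loser])

scores = {img_id: rating.mu for img_id, rating in ratings.items()}   # mu used as the final score
print(scores)
```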
Fig. 8.7 Visualization of the perceived atmosphere in cities with different data aggregation techniques
To overcome these limitations, a fuzzy clustering-based approach is proposed, as shown in Fig. 8.7c. This has the advantage of providing an estimate of the perceived atmosphere also in locations where image data are missing, based on their closeness to existing data, as well as of showing all the critical data while smoothing out the noise. The approach of estimating the perceived atmosphere in locations with missing data is similar to humans' tendency to fill in the gaps [14]. It focuses on creating a fuzzy clustering of the data points based on their location and score, computing the weighted average score of each cluster, and visualizing for each location on the map the estimated score at that location. This is computed assuming that a point placed at that location with a neutral (25) perceived atmosphere score belongs to a certain degree to all clusters. The score of this new, fictitious point is then computed as the average of the mean scores of the clusters, weighted by the membership degree of the point in each cluster.

Ideally, the three presented visualization techniques could each provide helpful information for different use cases. Moreover, as they all represent different granularity levels, they could be used in a map visualization to represent different levels of information. For example, a very high-level view could be produced with the fuzzy clustering-based technique, the tessellation aggregation could be used for a more precise, but still general, view at a medium "zoom level," and the raw visualization could be provided, for example, when someone focuses on a single street.
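The sketch below shows one possible realization of this idea with scikit-fuzzy's c-means implementation. The number of clusters, the fuzzifier, and the helper function itself are illustrative assumptions rather than the exact Streetwise code.

```python
# Sketch of the fuzzy clustering-based estimation with scikit-fuzzy's c-means
# (cluster count and fuzzifier m are assumptions, not the values used in Streetwise).
import numpy as np
import skfuzzy as fuzz

def estimate_score_at(points, query_lonlat, n_clusters=8, m=2.0):
    """points: (n, 3) array with columns (longitude, latitude, atmosphere score)."""
    # scikit-fuzzy expects the data as features x samples.
    cntr, u, *_ = fuzz.cluster.cmeans(points.T, n_clusters, m, error=1e-5, maxiter=1000)

    # Mean score of each cluster, weighted by the membership of every point in that cluster.
    cluster_scores = (u @ points[:, 2]) / u.sum(axis=1)

    # Fictitious point at the query location with a neutral score of 25.
    query = np.array([[query_lonlat[0]], [query_lonlat[1]], [25.0]])
    u_query, *_ = fuzz.cluster.cmeans_predict(query, cntr, m, error=1e-5, maxiter=1000)

    # Estimated score: cluster mean scores averaged with the fictitious point's memberships.
    weights = u_query[:, 0]
    return float(np.dot(weights, cluster_scores) / weights.sum())
```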
In the raw visualization implemented for Streetwise, it is possible to view the picture corresponding to each of the colored dots on the map and to give feedback by indicating whether one agrees with the obtained score of the picture. This can potentially be used as data to continuously improve the automatic estimation of the perceived atmosphere, but as the amount of feedback in the Streetwise implementation was limited, this has not been tested yet. The source code used to implement the Streetwise prototype is available online, divided into the infrastructure to facilitate the crowdsourcing campaign6 and a module for the training and automatic mapping of the perceived quality of cities.7
8.2.3 Evaluation

As the obtained accuracy of the Siamese CNN in the replication of the crowdsourcing task is relatively low (i.e., 69.09%) compared to most mainstream image classification models, it is worth examining in more detail what this accuracy means and whether this low value can be attributed solely to the subjectivity of perceptions or whether the model is indeed not performing the task well enough. To do so, an experiment is proposed to compare the level of agreement between different people performing the same task with that between a person and the CNN.

An evaluation of the perceived atmosphere on the same 100 pairs of randomly selected street-level images was performed by 15 participants between 20 and 64 years old (9 males and 6 females) and by the Siamese CNN. First, the agreement within pairs of different people was compared with the agreement between the CNN and single participants' responses. Then, the results of single humans were aggregated to find for each rating the most commonly selected answer (considered to be the answer of the average person), and this result was compared with the answers of single people (with the ratings of the considered person not included in the average answer calculation) and of the CNN. These operations allow estimating the accuracy of the CNN with respect to a single person performing the same task.

As a result, the mean level of agreement (i.e., of corresponding answers) between single people was 54.75% (SD = 7.85%), with a minimum of 36% agreement and a maximum of 74%. The level of agreement of the ML model with single people was slightly higher, with a mean of 58.20% (SD = 7.57%), a minimum of 46%, and a maximum of 70%. The mean level of agreement between people and the average human rating was 63.20% (SD = 6.80%), with a minimum of 52% and a maximum of 82%. That between the ML model and the average human rating was, also in this case, slightly higher, with an agreement ratio of 65%.
6 https://github.com/Streetwise/streetwise-app. 7 https://github.com/Streetwise/streetwise-score.
These results confirm that, despite the relatively low accuracy of the Siamese CNN on the validation set, the trained model still performs the task similarly to an average human rater. Thus, the advantages of using the proposed model compared to relying on crowdsourcing only, such as speed and cost of operation, are not counterbalanced by a loss of accuracy, and the apparent inaccuracy of the model is mainly due to the subjectivity of spatial perceptions. Additionally, the maps generated at the end of the analysis of different cities could be rated qualitatively by people who know the concerned cities; according to them, the results seem to be quite correct, as the perceived atmosphere is generally higher in parks and close to rivers and lakes, while it is lower in high-traffic areas and in locations with a lot of industry and the like. Unfortunately, no ground truth on the perception of the atmosphere in different areas of cities is available, so a more precise quantitative evaluation of this aspect is not (yet) possible. Finally, it has to be considered that spatial quality can depend on culture and on how people are used to living; thus, there is possibly a strong local component, which is why Streetwise focused on a relatively local scale (i.e., Switzerland). However, one can argue that the same process could be applied in different contexts with results of similar quality, as long as the crowdsourcing phase uses data and people from the areas to be analyzed.
8.2.4 Phenotropic Interaction

Streetwise represents an application of phenotropic interaction principles to an interface between the citizens, who participate in the estimation of the perceived quality of the city, and the city, for which a more precise knowledge of its perceived quality can be generated and made available for use in different contexts for the improvement of the livability of the city. An overview of the phenotropic design principles applied in this project is presented in the following.

Not Using Crisp Protocols Streetwise does not follow a strict protocol where the spatial quality of the city is computed from the mere count of features in a particular location (e.g., trees, cars). Instead, it is based on the collection of people's perceptions from street-level imagery taken in an unconstrained format by people using the streets every day, combined with a soft computing-based solution that tries to replicate the human perception of space in an approximate but accurate way. Also, the estimated spatial quality is not absolute but relative to other areas in the city.

Approximation Safe The solution proposed in this project for estimating the perceived spatial quality has an intrinsically approximate nature, as it works with perceptions, which have a strong subjective component. Moreover, the use of various aggregation methods in the visualization of the results allows better handling of approximate data and more
informative visualizations thanks to the approximation of the results and the filtering of noise.

Robust by Design Robustness in estimating the perceived spatial quality is handled in two consecutive steps. The first consists of scoring each image with more than 30 pairwise comparisons with other randomly selected images, which ensures that the TrueSkill score converges to a meaningful value [21], even if some of the images in the comparisons are of bad quality. However, if the quality of the image being rated is bad, or if the image does not correctly represent the area being analyzed, its score is not representative of the perceived spatial quality at that location. To overcome this problem, data aggregation is used in the visualizations, allowing noise and outliers to be filtered out and making the overall estimation of the spatial quality more robust.

Improving Over Time Streetwise improves over time, as the crowdsourcing data are used only for learning purposes, and the trained ML model can then be used to estimate people's perceptions of space in new situations, including areas not considered in the crowdsourcing or more recent pictures. Like Jingle Jungle's, Streetwise's maps can also be continuously and automatically updated with the most recent data to follow the temporal evolution of the perceived spatial quality of a city. As shown in [11], Streetwise can easily be adapted, with new crowdsourcing campaigns, to any spatial quality that the citizens want to analyze, as it represents a generalized methodology and model for the handling of spatial quality from street-level imagery. One could even think of estimating the city's soundscape, similarly to Jingle Jungle, with the Streetwise methodology. To further support the continuous improvement of the CNN model used in Streetwise, users can observe the pictures used to generate each point on the raw visualization and provide feedback indicating whether they think the rating is correct or incorrect. This helps both in discovering the reasons behind bad and good ratings in some areas and in collecting people's feedback that can be used to improve the model in an interactive ML fashion.

Multimodal Multimodality is not implemented in Streetwise in its current state. However, one could include it by combining different data sources in the analysis, such as direct feedback from people on specific areas, police reports, visitor numbers, and the pictures analyzed in the presented version of Streetwise.
8.3 Concluding Remarks

This chapter presented two projects for mapping different perceptual features of the city using CWW and perceptual computing principles. This allowed showcasing two practical applications of phenotropic interaction design principles to SC interfaces.
The inclusion of phenotropic interaction principles in these projects allowed the development of adaptive and robust solutions for mapping features with a subjective component. Furthermore, both solutions allow the creation of an interface between the citizens and the city through the citizens' indirect participation in the analysis. Indeed, in the case of Jingle Jungle, data that people would share anyway, and that are not directly related to the analyzed dimension, are used as a basis. In contrast, Streetwise uses the direct participation of a small number of citizens to let the system learn to perform the same task, so that no direct input from people is needed for future assessments of perceived spatial quality.

The presented applications of phenotropic interaction to SC interfaces focus only on aspects related to the smart living sector [9]. However, the same principles could easily be used to develop other types of interfaces covering other sectors of the smart city wheel [7], such as human-centered logistics and mobility solutions [25, 36] in the smart mobility sector. The described examples of the application of phenotropic interaction in the SC context show the utility of applying the developed phenotropic interaction framework not only to the interaction between humans and personal devices (e.g., computers, virtual assistants) but also to the exchanges between people and more complex systems, such as the city infrastructure, seen as an artificial system that can benefit from a more bio-inspired internal working and interface toward its biological components, including citizens and the natural environment.

Therefore, future work in the direction of phenotropic interaction in SCs should focus on making it more and more straightforward for people to communicate their needs, desires, and perceptions to the city as a complex entity, and more specifically to all the city stakeholders, including other citizens. This can contribute to setting a basis on which new developments for the practical improvement of the city can be implemented in a genuinely citizen-centered way, rather than focusing solely on the optimization of processes and resources without considering the citizens' well-being, quality of life, and sense of inclusion.
References

1. Abu-Tayeh, G., Portmann, E., & Stürmer, M. (2017). Züri wie neu: Public value von Onlinepartizipation. HMD Praxis der Wirtschaftsinformatik, 54(4), 530–543. https://doi.org/10.1365/s40702-017-0324-3
2. Aiello, L. M., Schifanella, R., Quercia, D., & Aletta, F. (2016). Chatty maps: Constructing sound maps of urban areas from social media data. Royal Society Open Science, 3(3), 150690. https://doi.org/10.1098/rsos.150690
3. Andreasyan, N., et al. (2021). Framework for involving citizens in human smart city projects using collaborative events. In Eighth International Conference on eDemocracy eGovernment (ICEDEG) (pp. 103–109). https://doi.org/10.1109/ICEDEG52154.2021.9530860
4. Berger, E. H., Neitzel, R., & Kladden, C. A. (2006). Noise navigator sound level database with over 1700 measurement values. University of Washington, Department of Environmental & Occupational Health Sciences, Seattle, 8, 20.
5. Brabham, D. C. (2013). Crowdsourcing. MIT Press.
6. Buecheler, T., Sieg, J. H., Füchslin, R. M., & Pfeifer, R. (2010). Crowdsourcing, open innovation and collective intelligence in the scientific method: A research agenda and operational framework. In The 12th International Conference on the Synthesis and Simulation of Living Systems (pp. 679–686). MIT Press. https://doi.org/10.21256/zhaw-4094 7. Cohen, B. (2012). The top 10 smart cities on the planet. https://www.fastcompany.com/ 90186037/the-top-10-smart-cities-on-the-planet. Visited on June 2022 8. Cohen, B. (2014). The smartest cities in the world 2015: Methodology. https://www. fastcompany.com/3038818/the-smartest-cities-in-the-world-2015-methodology. Visited on May 2022 9. Colombo, M., Hurle, S., Portmann, E., & Schäfer, E. (2020). A framework for a crowdsourced creation of smart city wheels. In Seventh International Conference on eDemocracy eGovernment (ICEDEG) (pp. 305–308). https://doi.org/10.1109/ICEDEG48599.2020.9096754 10. Colombo, M., et al. (2021). Streetwise: Mapping citizens’ perceived spatial qualities. In Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS (pp. 810–818). INSTICC, SciTePress. https://doi.org/10.5220/0010532208100818 11. Colombo, M., et al. (2022). A methodology for mapping perceived spatial qualities. In J. Filipe, ´ M. Smiałek, A. Brodsky, & S. Hammoudi (Eds.), Enterprise information systems, lecture notes in business information processing. Springer. https://doi.org/10.1007/978-3-031-08965-7_10 12. Deng, J., et al. (2009). ImageNet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255). https://doi.org/10. 1109/CVPR.2009.5206848 13. Desmet, M. (2022). The psychology of totalitarianism. Chelsea Green Publishing. 14. Dilks, D. D., Baker, C. I., Liu, Y., & Kanwisher, N. (2009). Referred visual sensations: rapid perceptual elongation after visual cortical deprivation. Journal of Neuroscience, 29(28), 8960– 8964. https://doi.org/10.1523/JNEUROSCI.1557-09.2009 15. Dubey, A., et al. (2016). Deep learning the city: Quantifying urban perception at a global scale. In European Conference on Computer Vision (pp. 196–212). Springer. https://doi.org/10.1007/ 978-3-319-46448-0_12 16. European Environment Agency. (2017). Managing exposure to noise in Europe. https://www. eea.europa.eu/publications/managing-exposure-to-noise-in-europe/noise-in-europe-updatedpopulation-exposure. Visited on May 2022 17. Finger, M., & Portmann, E. (2016). What are cognitive cities? (pp. 1–11). Springer. https://doi. org/10.1007/978-3-319-33798-2_1 18. Gabrielsson, A., & Juslin, P. N. (1996). Emotional expression in music performance: Between the performer’s intention and the listener’s experience. Psychology of Music, 24(1), 68–91. https://doi.org/10.1177/0305735696241007 19. Gau, J. M., & Pratt, T. C. (2010). Revisiting broken windows theory: Examining the sources of the discriminant validity of perceived disorder and crime. Journal of Criminal Justice, 38(4), 758–766. https://doi.org/10.1016/j.jcrimjus.2010.05.002 20. Goldhagen, S. W., & Gallo, A. (2017). Welcome to your world: How the built environment shapes our lives. Harper. 21. Herbrich, R., Minka, T., & Graepel, T. (2006). Trueskill™: A Bayesian skill rating system. In B. Schölkopf, J. Platt, & T. Hoffman (Eds.), Advances in neural information processing systems (Vol. 19). MIT Press. https://proceedings.neurips.cc/paper/2006/file/ f44ee263952e65b3610b8ba51229d1f9-Paper.pdf 22. Hänninen, O., et al. (2014). 
Environmental burden of disease in Europe: Assessing nine risk factors in six countries. Environmental Health Perspectives. https://doi.org/10.1289/ehp. 1206154 23. ISO 226:2003. (2003). Acoustics—normal equal-loudness-level contours. Standard, International Organization for Standardization. 24. Ito, K., & Biljecki, F. (2021). Assessing bikeability with street view imagery and computer vision. Transportation Research Part C: Emerging Technologies, 132, 103371.
25. Jaha, A., Jaha, D., Pincay, J., Terán, L., & Portmann, E. (2021). Privacy-friendly delivery plan recommender. In Eighth International Conference on eDemocracy & eGovernment (ICEDEG) (pp. 146–151). https://doi.org/10.1109/ICEDEG52154.2021.9530869 26. Keyson, D., Guerra-Santin, O., & Lockton, D. (2017). Living labs. In Design and assessment of sustainable living. Springer. https://doi.org/10.1007/978-3-319-33527-8 27. Konecni, V. J. (1975). The mediation of aggressive behavior: Arousal level versus anger and cognitive labeling. Journal of Personality and Social Psychology, 32(4), 706–712. https://doi. org/10.1037/0022-3514.32.4.706 28. Kosko, B. (2006). Noise. Penguin (2006). 29. Kshetri, N., & Voas, J. (2018). Blockchain-enabled e-voting. IEEE Software, 35(4), 95–99. https://doi.org/10.1109/MS.2018.2801546 30. Lanier, J. (2003). Why Gordian software has convinced me to believe in the reality of cats and apples. https://www.edge.org. Visited on Apr. 2022 31. Maisonneuve, M., Stevens, M., Niessen, M., Hanappe, P., & Steels, L. (2009). Citizen noise pollution monitoring. In Proceedings of the 10th Annual International Conference on Digital Government Research, Partnerships for Public Innovation (pp. 96–103). https://doi.org/10. 5555/1556176.1556198 32. Montes, J. (2020). A historical view of smart cities: Definitions, features and tipping points. Features and Tipping Points. https://doi.org/10.2139/ssrn.3637617 33. Morley, D., et al. (2015). International scale implementation of the CNOSSOS-EU road traffic noise prediction model for epidemiological studies. Environmental Pollution, 206, 332–341. https://doi.org/10.1016/j.envpol.2015.07.031 34. Novák, V., & Lehmke, S. (2006). Logical structure of fuzzy if-then rules. Fuzzy Sets and Systems, 157(15), 2003–2029. https://doi.org/10.1016/j.fss.2006.02.011 35. Oliveira, Á., & Campolargo, M. (2015). From smart cities to human smart cities. In 48th Hawaii International Conference on System Sciences (pp. 2336–2344). IEEE. https://doi.org/ 10.1109/HICSS.2015.281 36. Pincay, J., Portmann, E., & Terán, L. (2021). Fuzzifying geospatial data to identify critical traffic areas. In 19th World Congress of the International Fuzzy Systems Association (IFSA), 12th Conference of the European Society for Fuzzy Logic and Technology (EUSFLAT), and 11th International Summer School on Aggregation Operators (AGOP) (pp. 463–470). Atlantis Press. https://doi.org/10.2991/asum.k.210827.061 37. Ropohl, G. (1999). Philosophy of socio-technical systems. Society for Philosophy and Technology Quarterly Electronic Journal, 4(3), 186–194. https://doi.org/10.5840/techne19994311 38. Salesses, P., Schechtner, K., & Hidalgo, C. A. (2013). The collaborative image of the city: Mapping the inequality of urban perception. PloS One, 8(7), e68400. https://doi.org/10.1371/ journal.pone.0119352 39. Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 40. Spring, T., Ajro, D., Pincay, J., Colombo, M., & Portmann, E. (2020). Jingle jungle maps— capturing urban sounds and emotions in maps. In Seventh International Conference on eDemocracy eGovernment (ICEDEG) (pp. 36–42). https://doi.org/10.1109/ICEDEG48599. 2020.9096770 41. Swiss Federal Office for the Environment. (2014). Lärm-berechnung sonbase ermittelt lärmbelastung in der schweiz. https://www.bafu.admin.ch/bafu/en/home/topics/noise/state/gislaermdatenbank-sonbase.html. Visited on May 2022 42. 
Tajadura-Jiménez, A., Väljamäe, A., Asutay, E., & Västfjäll, D. (2010). Embodied auditory perception: The emotional impact of approaching and receding sound sources. Emotion, 10(2), 216–229. https://doi.org/10.1037/a0018422 43. Zhao, R., & Govind, R. (1991). Defuzzification of fuzzy intervals. https://doi.org/10.1016/ 0165-0114(91)90020-Q
Part V
Conclusions
Chapter 9
Outlook and Conclusions
In this final chapter, the discoveries and developments obtained throughout this project are summarized to provide a general overview, concluding remarks, and possible future developments of the newly introduced paradigm of phenotropic interaction. In Sect. 9.1, the main points of this research are summarized. This is followed in Sect. 9.2 by aligning the presented subprojects with the original research questions and discussing how they are answered. Considering these aspects, future work toward developing more phenotropic interfaces is proposed in Sect. 9.3. An outlook on phenotropic interaction concludes the main content of the book in Sect. 9.4.
9.1 Summary Interaction between people and artificial systems is vital for providing an environment where people and technology can effectively collaborate to solve problems relevant to humans. However, this type of interaction is generally still very limited compared to the conversation between humans, as it is static and structured around rigorous protocols. A solution to these limitations was proposed in this book with a new interaction paradigm providing adaptability and flexibility of the artificial interface toward a more natural expression of human needs and perceptions through the principles of phenotropic interaction. In this book, the primary focus was on defining the essential design principles of phenotropic interaction and on creating a basis for applying these to different types of interfaces between humans and artificial systems. To this effect, several subprojects were defined in the form of design science research cycles for which artifacts were designed and implemented in the form of algorithms, frameworks, and prototypes, to be then evaluated in order to extract some practical and theoretical
knowledge from them for the improvement, validation, and implementation of phenotropic interfaces. In the development of specific techniques for the inclusion of phenotropic interaction in interfaces between humans and artificial systems, it was chosen to focus on the ability of the system to provide flexibility, robustness, understanding of human perceptions, and partially on the adaptation to the needs and way of communicating of single users. In addition, some ideas regarding multimodality, explainability, and feedback loops for the targeted improvement of the system were presented but not explicitly developed. This choice was made to keep the performed research from becoming too broad. However, the topics that were not explicitly targeted are as crucial for the emergence of phenotropic interaction as the researched ones. Thus, they are worth developing more in future research works. To solve the overall problems of interaction between humans and artificial systems presented in this book in a way inspired by the conversations between biological actors, the topic of phenotropics, the idea of applying nature-inspired interaction to the communication between artificial, by design protocol-centric machines, was introduced in Chap. 2. In the same chapter, these principles were adapted to the context of the communication between people and artificial systems to make the interaction more natural for humans and structured in the form of a framework comprising the main theories and models to be used to provide the system with the capabilities necessary for phenotropic interaction. To familiarize with the main high-level components necessary for phenotropic interaction, these were researched and introduced in Chap. 3. However, the presented theories are vast, and inside these, specific methods for implementing phenotropic interaction principles in a concrete context (e.g., handling of languagebased conversations) do not necessarily already exist. This was the case for techniques able to correctly let computers represent and process human perceptions expressed with natural language, which constitute the basis on which several phenotropic interaction principles are built, such as flexibility and robustness. New techniques for the understanding of perceptions based on CWW were thus developed, implemented, and tested in Chaps. 4 and 5. On this basis, a technique allowing for the simulation of the human reasoning process to infer the expected reactions of the system in a conversation was implemented in Chap. 6. This set the basis for adapting basic knowledge of the system to new, unknown situations, in a situation of uncertainty, to constantly improve its capabilities and understanding of people’s needs, desires, and perceptions. Finally, the phenotropic interaction framework and the newly created methods for the handling of human perceptions were combined with other components from the building blocks of this paradigm to demonstrate the practical implementation of phenotropic interaction principles in the context of human–computer interaction and smart cities. In Chap. 7, a virtual assistant was developed, able to automatically extend customized IFTTT rules for better flexibility and robustness of the system. This was used as a prototype for the analysis of the impact of the included phenotropic interaction principles on the perception of users toward the system and the interac-
tion with it, compared to the more static type of interaction implemented in widespread virtual assistants and IFTTT rules. In Chap. 8, two different projects developed in the context of human smart cities were presented, including an analysis of which components of the developed systems satisfy the design principles of phenotropic interaction and of possible improvements to make these solutions more phenotropic. Combining the results obtained in the different parts of the book, one can dare to say that the foundations for the successful development of phenotropic interfaces in different use-cases have been built and demonstrated, as has the positively perceived impact of such interfaces on people's interaction with artificial systems.
9.2 Alignment with Research Questions In this section, the answers to the research questions presented in Sect. 1.2 are summarized based on the artifacts and the knowledge linked to them obtained in the various design science research cycles executed in the frame of this book. RQ1. What Are the Main Features of Phenotropic Interaction, and How Can They Be Modeled? To create the basis on which to build an interaction paradigm more natural for humans, the concept of phenotropics in its original conception—applied to the improvement of the robustness of software—has been analyzed in Sect. 2.1, including the different (although few) developments in different fields such as the Internet of Things. From this analysis, the fundamental features of a phenotropic system have been extracted and put in the context of the interaction between people and artificial systems to define the fundamental design principles of phenotropic interaction defined in Sect. 2.2. These provide a clear set of the main features that an interactive system should have to provide a more natural conversation with humans and describe the rationale and possible consequences of each of them. The features that have been identified as necessary for a system to be able to handle the flexibility, subjectivity, imprecision, and multimodality of human communication require interfaces to be: not using crisp protocols, approximation safe, robust by design, improving over time, and able to handle multimodal interaction. These design principles represent a direct transposition of the principles of phenotropic software combined with the features necessary to overcome the limited adaptivity of traditional mixed systems. Further research might identify additional design principles for phenotropic interaction, but it was shown with the experiment in Sect. 7.3 that satisfying only part of these design principles can already significantly improve the interaction. This means that, even in the case where the presented set of design principles was incomplete, one could argue that its application would still provide a more natural interaction than traditional systems.
Furthermore, in Sect. 2.3, a framework for phenotropic interaction built on conversation theory is presented to provide a structure to the design of phenotropic interaction, including the fundamental phases and building blocks of such a conversation. This allows defining a model for phenotropic interaction built on the model of conversations between people, which helps thus to provide a paradigm that is more natural for people. RQ2. Which Methods and Theories Are Suitable for the Implementation of Phenotropic Interaction? For the successful implementation of phenotropic interaction in practice, some existing methods and theories can be employed to satisfy the defined design principles. Given the nature of human conversation, which is firmly based on the expression and understanding of thoughts and perceptions, cognitive and perceptual computing methods are considered the building blocks for phenotropic interaction, as presented in Chap. 3. More specifically, theories from these fields are considered for implementing the defined design principles following the phenotropic interaction framework. The interaction is modeled on the conversation theory to allow a more natural exchange. This represents the basis for a protocol-less communication between people and artificial systems, where the understanding of the other actor’s goals is achieved iteratively through conversational alignment. These iterations also contribute to the improvement of the interaction over time, as the feedback loops allow for continuous learning of the other actor’s preferences and ways of communicating and to the resolution of misunderstandings (robustness), as the conversation ideally continues until an agreement has been reached and the actors finally act together toward the common goal. The pipeline of CWW is suggested for handling uncertainty in the conversation. This allows for the mathematical representation and understanding of words and perceptions. Indeed, CWW represents a fundamental theory for working with the semantics of the exchanged information. This high level of understanding of exchanges’ meaning leads to a less restricted conversation in terms of syntax (protocol-less), as the exact syntax loses importance in favor of semantics. Similarly, the same argument could be used for the improvement of robustness, as understanding the meaning allows overcoming failures due to incorrect syntax, as long as the system understands the semantics. Moreover, the imprecise nature of communication and perceptions is considered in CWW, allowing the interface to be approximation safe, as this pipeline builds on the concepts of fuzzy sets theory, a mathematical field for treating imprecise data. Further understanding of the conversation exchanges and their relationships with other concepts (e.g., the expected reactions) can be reached with a certain level of reasoning. This allows a more robust interaction and a constant improvement and adaptation of the interaction to the user, where reactions to unknown inputs can be inferred based on the relationship between the unknown query and the known ones. Furthermore, automated reasoning is crucial for an automatic adaptation of the system to use alternative modalities (empowering thus multimodality). Indeed,
relationships between the meaning of information exchanged through an unknown and known modality can be employed to better understand and treat the message received through the unknown modality. Finally, the theories of explainable artificial intelligence and interactive machine learning can strongly contribute to the continuous improvement of the interaction, as this is based on feedback loops following the conversation theory. Interpretability of the system’s reactions in an exchange with a user allows the trust in the system to be improved and for the person to identify the origin of a problem in case of failure. This way, their feedback to the artificial system can be adapted to be more beneficial for improving its interface. The effective handling of this type of user feedback for the improvement of the interaction can be based on techniques of interactive machine learning, which consist exactly of models able to learn and improve through feedback loops. Additionally, interactive machine learning techniques could be employed to teach a system to better understand communication modalities that were not initially designed to understand, allowing for adaptive multimodality. In specific use-cases of phenotropic interaction, employing other theories and models not listed in this book could be necessary. However, one can argue that the ones presented as the building blocks of phenotropic interaction in Chap. 3 are sufficient for most use-cases; as combined, they allow for a way to handle each of the design principles of phenotropic interaction in at least two ways. RQ3. How Can The Semantic Similarity Between Perceptions Expressed as Words Be Computed in a Meaningful, Human-Like Way? As part of the tentative of allowing artificial systems to better understand the semantics of human perceptions expressed with words, state-of-the-art semantic similarity measures were researched in Sects. 4.1 and 4.2. Two categories of semantic similarity measures were identified, either relying on the conceptual similarity between concepts or the spectral similarity of scalar terms. Perceptions being primarily expressed with scalar adjectives and adverbs (e.g., hot), a spectral similarity measure able to closely replicate the perception of people toward these terms was sought. However, it was discovered that, despite being accurate in estimating conceptual similarity, state-of-the-art methods are inaccurate for estimating spectral similarity, and no measure was found that is specifically addressing the problem of spectral similarity. For this reason, a new algorithm for estimating the spectral semantic similarity measure, based on the overlap between second-order synonyms of the words one wants to compute the similarity of, was developed in Sect. 4.3. The proposed algorithm was implemented and evaluated in a user experiment consisting of ordering a set of scalar terms based on their perceived similarity to a fixed word describing the same feature, as described in Sect. 4.5. The evaluation results showed a significant improvement in the accuracy of the newly proposed semantic similarity measure compared to state-of-the-art methods, reaching results close to the level of agreement between different participants. This shows the new measure’s ability to correctly estimate the objective component of similarity, which is in part
also influenced by subjective judgment, as shown by the different perceptions of similarity between distinct people. A limitation of the proposed solution is the ability to perform semantic similarity estimation only on single adjectives or adverbs. Indeed, in reality, people often use combinations of words, for instance, by employing modifiers, to express their perceptions with a higher granularity (e.g., very hot instead of scorching). Despite this, the proposed similarity measure represents a sound basis for building a more accurate understanding of the semantics of spectral terms. RQ4. What Methods Are Suitable for the Automated Understanding and Representation of the Semantics of Perceptions? Semantic similarity allows a partial understanding of the semantics of perceptions. However, this is limited to the representation of the meaning of one word relative to another. In order to express the absolute meaning of spectral terms representing perceptions as well as their imprecision given by the subjectivity of perceptions, the use of CWW techniques was explored in Chap. 5. Scalar terms describing perceptions are defined as words whose meaning can be represented on a scale or spectrum of all words describing the same characteristic (e.g., words describing perceived temperature). This property makes them suitable to be represented mathematically on a scale from 0 to 1, where 0 and 1 are the extremes of the category with opposite meanings, employing fuzzy membership functions to represent the imprecision of perceptions and natural language. In Sects. 5.1 and 5.2, two iterations of an algorithm for the estimation of the representation of words on their corresponding scale (i.e., precisiation of meaning) were presented, based on the semantic similarity measure introduced in Chap. 4. These were evaluated compared to the human estimation of the meaning of a set of words and with an ordering task in Sect. 5.3, showing satisfying results in both evaluations, close to the level of agreement between different people performing the same tasks. The level of understanding of perceptions provided by this technique is not perfect. Indeed, it has the same limitation as to the used semantic similarity measure of being unable to handle modifiers. Additionally, it allows to map word meanings only to a scale from 0 to 1 instead of using more standard units (e.g., a tall person is represented as .∗ 0.9, not .∗ 190 cm). Finally, it employs type-1 fuzzy sets when type-2 fuzzy sets are suggested to better represent the subjectivity of perceptions. However, the experiment in Sect. 7.3 showed that applying the automatic precisiation of meaning algorithm to a concrete use-case can provide satisfying results in most cases, despite the algorithm’s limitations. Indeed, for most use-cases, it is more important to know the position of the meaning of a word relative to the spectrum than to understand the exact measure corresponding to it in real-world units and the estimation of the subjectivity factor is not relevant. Moreover, single words are mainly used to express simple perceptions, whereas modifiers are added to communicate more complex perceptions.
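As an illustration of what such a precisiated representation can look like in practice, the snippet below encodes a few scalar temperature terms as type-1 triangular fuzzy sets on the normalized [0, 1] scale and defuzzifies each to a crisp position. The membership parameters and the centroid defuzzification are assumptions made for the example; they are not the output of the algorithm presented in Chap. 5.

```python
import numpy as np

def triangular(universe: np.ndarray, a: float, b: float, c: float) -> np.ndarray:
    """Triangular membership function with feet at a and c and peak at b."""
    left = np.clip((universe - a) / max(b - a, 1e-9), 0.0, 1.0)
    right = np.clip((c - universe) / max(c - b, 1e-9), 0.0, 1.0)
    return np.minimum(left, right)

# Normalized perception scale: 0 = one extreme of the category, 1 = the other extreme.
x = np.linspace(0.0, 1.0, 1001)

# Hypothetical precisiated meanings of a few temperature words as type-1 fuzzy sets.
terms = {
    "cold": triangular(x, 0.0, 0.15, 0.35),
    "mild": triangular(x, 0.30, 0.50, 0.70),
    "hot":  triangular(x, 0.65, 0.85, 1.00),
}

def centroid(universe: np.ndarray, membership: np.ndarray) -> float:
    """Defuzzify a fuzzy set into a single position on the scale (centroid method)."""
    return float(np.sum(universe * membership) / np.sum(membership))

for word, mu in terms.items():
    print(f"{word}: position of about {centroid(x, mu):.2f} on the [0, 1] scale")
```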
RQ5. How Can Automated Reasoning Theories Be Implemented in Practice to Empower the Reasoning Process in a Phenotropic Interface? With the focus on improving the understanding of the relationships between different items, particularly with the practical use-case of phenotropic interaction of Chap. 7 in mind, a tentative providing automated reasoning capabilities to artificial systems was developed. Because of the vast extent of analogies in human reasoning and of the imprecise and subjective nature of the words and perceptions object of the reasoning process, the chosen approach to this problem was to employ the theory of fuzzy analogical reasoning. A practical implementation of analogical reasoning for the case of objective, respectively spectral, analogies was proposed in Sect. 6.1, based on the use of ontologies, respectively, the developed algorithm for the automatic precisiation of meaning. This was implemented in a prototype for analogical reasoning where people can input an analogy in the form “A is to B as C is to?” and the prototype provides an answer to this question by applying the analogical scheme to the input data, as presented in Sect. 6.2. The outputs of the prototype not only provide an answer to the question but also an explanation of the reasoning process behind the computation of the results, in the form of a description of the identified relationships between A and B, as well as the application of the same relationship to C. This explanation, when included in an interactive phenotropic system, can help the user identify errors in the reasoning process of the system, allowing them to provide punctual feedback for the correction of such mistakes. This would be fundamental for the constant improvement and adaptation of the interaction to the user’s needs through feedback loops. In the presented prototype, the explanation of the result was included to improve the user evaluation so that participants could see how correct or incorrect the output was, and in case of imprecisions, they could identify their origin. The prototype for fuzzy analogical reasoning was evaluated in Sect. 6.3 in a qualitative way by a group of experts in CWW who could use the prototype freely before entertaining an interview with the author, where they could provide feedback to the system. Collecting the various participants’ feedback was then employed to perform a SWOT analysis of the fuzzy analogical reasoning process. This can be used as a basis for further improvements to the prototype and its algorithm. Overall, participants in the evaluation were satisfied with the accuracy and explainability of the system in the case of reasoning with spectral analogies but found that the prototype should be able to understand modifiers (e.g., very) to let them better express their perceptions. Moreover, the ability to handle analogies concerning relationships between scalar and objective terms could further increase the fields of application of fuzzy analogical reasoning. For example, this could be implemented by enhancing the base data source for objective similarity— ontologies—with information about the precisiated meaning or the spectral relationships between scalar terms. As shown in the experiment results in Sect. 7.3, the application of fuzzy analogical reasoning to the automatic extension of custom IFTTT rules contributed to improving the user experience with the interface. This means that the reasoning process worked most likely as expected by the users.
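A minimal sketch of the spectral case of this kind of analogical reasoning is given below: the words are assumed to already have a precisiated position on their respective [0, 1] scales, the shift from A to B is transferred onto C, and the closest term of a small hand-made vocabulary is returned together with a textual explanation. The vocabularies, their positions, and the nearest-term selection are illustrative assumptions, not the actual prototype of Chap. 6.

```python
# Hypothetical precisiated positions of scalar terms on their [0, 1] spectra.
TEMPERATURE = {"freezing": 0.05, "cold": 0.2, "mild": 0.5, "warm": 0.7, "hot": 0.9}
BRIGHTNESS = {"dark": 0.1, "dim": 0.3, "bright": 0.75, "dazzling": 0.95}

def spectral_analogy(a: str, b: str, c: str,
                     scale_ab: dict[str, float],
                     scale_c: dict[str, float]) -> tuple[str, str]:
    """Solve 'a is to b as c is to ?' by transferring the spectral shift a -> b onto c."""
    shift = scale_ab[b] - scale_ab[a]
    target = min(max(scale_c[c] + shift, 0.0), 1.0)  # keep the result on the scale
    # Pick the known term whose precisiated position is closest to the target value.
    answer = min(scale_c, key=lambda w: abs(scale_c[w] - target))
    explanation = (f"'{b}' lies {shift:+.2f} from '{a}' on its scale; applying the same "
                   f"shift to '{c}' gives {target:.2f}, whose closest known term is '{answer}'.")
    return answer, explanation

word, why = spectral_analogy("cold", "hot", "dim", TEMPERATURE, BRIGHTNESS)
print(word)  # 'dazzling' with these toy positions
print(why)
```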
RQ6. How Can Phenotropic Interaction Principles Be Implemented in UseCases from the Real World? Do These Provide an Improvement Over Traditional Solutions? Practical applications of the phenotropic interaction design principles, following the defined framework, were implemented in the form of prototypes for a use-case related to the interaction between a person and a virtual assistant (Chap. 7) and for two use-cases covering the interaction between citizens and smart city infrastructure (Chap. 8). In the case of the interaction with a virtual assistant using custom IFTTT rules, presented in Sect. 7.2, the focus was put on the understanding of the semantics of the elements of conversation, mainly employing the methods developed in Chaps. 4–6 for the improvement of the understanding of people’s perceptions expressed with words, along with other more common NLP methods. This prototype’s main goal was to verify the feasibility of practical implementation of phenotropic interfaces in an HCI setting and to analyze the impact of the applied design principles on the users’ perceptions of the interface and its interaction with it. For simplicity, and based on the main developments presented in this book, the aspect of multimodality was not considered in this application, despite being potentially very important for the development of phenotropic interaction. In Sect. 7.3, a user experiment was designed to let people perform some guided tasks using the prototype developed following the phenotropic interaction principles and an alternative prototype implemented to handle IFTTT rules in the traditional way that can be found nowadays in various devices. This experiment observed that the virtual assistant implementing the design principles of phenotropic interaction performed all the submitted tasks of different types better than the traditional virtual assistant. Moreover, it was perceived as being significantly more robust, flexible, understanding, reasoning in a human-like way, and behaving as expected than the traditional solution, and the interaction with it was perceived as being significantly more natural and human-like, all while reducing the amount of required cognitive load and frustration. This observation proves the positive impact of implementing the phenotropic interaction design principles on the user experience and the ability to perform the desired tasks successfully. Possible improvements to the presented prototype would include improving the explainability of the system’s choices, the possibility for people to provide feedback to make the virtual assistant learn their preferences or correct mistakes, and the automatic handling of multiple interaction modalities. Two projects were developed in the optics of phenotropic interaction for the use-cases related to smart city problems. The first focused on the estimation of the soundscape of a city based on social media data, and the second was mapping the perceived urban spatial quality of cities employing street-level imagery. Both were implemented following the design principles of phenotropic interaction to allow citizens to participate more easily in the city’s evolution and become more adapted to their needs by providing their opinions and perceptions and letting the city learn from them. This allowed showing the potential of phenotropic interaction not only in simple one-on-one conversations between user and computer but also in a more
complex interaction of several people with an artificial system such as the smart city, where the correct understanding of people’s needs and perceptions toward different aspects of the urban area is crucial. The successful application of some of the phenotropic interaction design principles to extremely varied use-cases spacing from the interaction of single people with a virtual assistant to the interaction of citizens with their artificial ecosystem shows the universality of the defined principles, which one could dare to say that they could be applied—although using different strategies and theories as a basis to implement them—to any use-case where interaction between any number of people and an artificial system is envisaged.
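To make the rule extension of the Chap. 7 prototype, recalled above, more tangible, the snippet below sketches one way a query containing an unseen scalar term could be mapped onto an existing custom IFTTT rule: the action parameter defined for the known term is rescaled according to the precisiated positions of the two terms on their shared spectrum. The rule format, the precisiated values, and the linear rescaling are assumptions chosen for the illustration, not the prototype's actual matching logic.

```python
# Hypothetical precisiated positions of temperature words on a [0, 1] scale.
PRECISIATED = {"freezing": 0.05, "cold": 0.2, "warm": 0.65, "hot": 0.9}

# A single user-defined rule: asking for a "hot" temperature sets the heating to 90% power.
rule = {"trigger_term": "hot", "device": "heating", "value": 90}

def extend_rule(query_term: str, rule: dict) -> dict:
    """Derive an action for an unseen scalar term from the closest user-defined rule."""
    known = PRECISIATED[rule["trigger_term"]]
    asked = PRECISIATED[query_term]
    # Scale the action parameter proportionally to the terms' positions on the spectrum.
    scaled = round(rule["value"] * asked / known)
    return {"device": rule["device"], "value": max(0, min(100, scaled))}

print(extend_rule("warm", rule))  # {'device': 'heating', 'value': 65}
print(extend_rule("cold", rule))  # {'device': 'heating', 'value': 20}
```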
9.3 Future Developments In light of the answers to the research questions and their analysis, it is possible to identify some essential directions for the further development of phenotropic interaction, allowing a more straightforward implementation of its design principles in new interfaces between humans and artificial systems. As the central part of this book focused on methods to apply the theories of CWW and automated reasoning to the understanding and processing of human perceptions, future work should include studies about the employment of the other building blocks of phenotropic interaction that were identified, such as the implementation of flexible multimodal interaction, the improvement of the explainability of the system, and the use of interactive machine learning methods for the improvement of the interface based on user feedback. These should ideally not only be implemented for specific use-cases but developed further to provide some guidelines or standardized models usable in different contexts. Moreover, an improvement to the conversation that would provide more certainty about the reached alignment of intentions of the user and the system would consist in making the artificial agent able to estimate how sure they are about their understanding of the expected goal and ask for clarifications or confirmations in uncertain cases, in a way similar to what happens in human conversations. These aspects were partially addressed in this book. When they were not explicitly considered in the development phase, some ideas on how these could be included in various interfaces were presented. However, further practical developments in these directions are fundamental to address all of the design principles of phenotropic interaction and implement more natural and adaptive interfaces. An improvement that could provide an important contribution to the adaptivity of phenotropic interfaces would be the automatic handling of multimodal interaction based on the reasoning on the semantics of information shared through different modalities. This could allow interactive systems to use all the available interaction modalities even if the system has been designed with a specific modality in mind. For instance, retaking the example of flexible virtual assistants, one could imagine that virtual assistant devices have a camera available so that they can potentially
handle also gestures as an input modality, despite them being designed to be used with natural language via voice input. Automatic handling of the unexpected input modality would consist in this case in, whenever the user for any reason interacts with the device using gestures instead of voice, extracting the semantics from the identified gestures (e.g., by means of a sign language recognizer [1]) in the form of a combination of natural language and precisiated natural language. This can then be used, similarly as for the natural language use-case presented in Sect. 7.2, to find a match between the detected input and existing natural language-based rules, perform some basic reasoning on this match, and execute an operation as a reaction to the received input. This process could strongly improve the adaptivity of the phenotropic interface, as an interface designed to be used with a certain modality could automatically be used with an alternative modality, adapting thus to the needs and preferences of the users toward interaction modalities. A further important aspect to be considered in future developments, which would allow scaling the concepts of phenotropic interaction and include them in various interfaces more efficiently, is the development of a complete toolkit (e.g., a library) containing a set of the base components necessary for the successful implementation of phenotropic interfaces in various contexts. This would, for example, contain some functions implementing the algorithms developed during this research project for the computation of semantic similarity, the precisiation of meaning, and the reasoning with perceptions in a context of uncertainty, but also further methods helpful in handling explainability, multimodality, and improvement of the interaction based on user feedback. This would enormously simplify the emergence of phenotropic interfaces and provide a modular approach, which is well suited for a gradual improvement of the methods used to apply the design principles of phenotropic interaction. The first version of such a toolkit is provided by the collection of the algorithms presented in this book, but this should be extended between others with methods targeting the interaction with modalities other than natural language, and helping with the building of a trust relationship between the user and the machine. For the specific case of the understanding and reasoning with human perceptions expressed with natural language that was focused on in this book in Chaps. 4–6, a common limitation to all of the developed methods was identified in the inability of these to correctly process perceptions if they are not expressed through a single scalar term. For example, expressions using a combination of a modifier and a scalar term (e.g., very hot) or words derived from scalar adjectives and adverbs but not belonging to this category (e.g., heat, brighten) are not understood correctly by the presented methods. As these are widespread ways of communicating perceptions, it is essential for future developments to allow systems to understand and reason with these to allow for a more natural, robust, and accurate conversation with people. Ideas to approach this problem include an accurate analysis and modeling of how the use of modifiers impacts the perceived meaning of scalar terms and a process similar to analogical reasoning on the relationships between words derived from scalar terms and the scalar terms themselves to infer the meaning of the exchanged information.
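One classical starting point for the modifier limitation mentioned above is Zadeh's linguistic hedges, in which an intensifier such as "very" concentrates a fuzzy set and a weakener such as "somewhat" dilates it. The sketch below applies these textbook operators to an assumed precisiated meaning of "hot"; it is offered purely as an illustration of a possible direction, not as the modeling proposed in this book.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 1001)

# Assumed precisiated meaning of "hot" on a normalized temperature scale.
hot = np.clip(1.0 - np.abs(x - 0.85) / 0.2, 0.0, 1.0)  # triangular set peaking at 0.85

def very(membership: np.ndarray) -> np.ndarray:
    """Concentration hedge: squaring lowers partial memberships, sharpening the set."""
    return membership ** 2

def somewhat(membership: np.ndarray) -> np.ndarray:
    """Dilation hedge: the square root raises partial memberships, widening the set."""
    return np.sqrt(membership)

def centroid(membership: np.ndarray) -> float:
    """Crisp position of a fuzzy set on the scale (centroid defuzzification)."""
    return float(np.sum(x * membership) / np.sum(membership))

print(f"hot: {centroid(hot):.2f}, "
      f"very hot: {centroid(very(hot)):.2f}, "
      f"somewhat hot: {centroid(somewhat(hot)):.2f}")
```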
9.4 Outlook and Conclusions Human augmentation through their collaboration with intelligent systems simplifies people’s daily tasks, moving their focus from doing simple or repetitive tasks to other, more interesting, challenging operations. To achieve good cooperation between humans and artificial systems, like the collaboration between people, communication, and understanding of each other and shared goals are crucial. This is achieved through successful interaction, which has to be natural, adaptive, flexible, and robust, meaning that it is fundamental for the receiver of the information to understand and process it correctly, despite it not being necessarily structured following a strict protocol pre-shared between the actors participating in the exchange. Improving the interaction between people and artificial systems should not only enhance the user experience of collaborating with intelligent systems, but it should allow them to have a higher level of understanding of the system and its reasoning processes, which can let people have more control over various aspects of the system, including the correctness of its choices and suggestions. Furthermore, this gives people a means to judge the reliability, the satisfaction of the ethical principles, and the trustworthiness of such a system. A further feature of successful interaction consists in the ability of the actors to improve constantly by learning from one another. This continuous improvement through feedback loops, if using the correct learning methods, could ideally go as far as when the knowledge and reasoning capabilities of the artificial system become similar to those of the humans with whom they interacted, allowing them to engage in conversations with people in the same way as humans would. In this research work, a tentative of reaching an interaction paradigm more natural for people based on the flexibility of the interface and the understanding and processing of the semantics of human communication, derived from the idea of phenotropic software, was performed with quite encouraging results. The research was focused on some sub-problems regarding the representation, understanding, and reasoning on human perceptions, which are fundamental elements for the communication of people’s needs and desires, and on the partial application of the design principles of phenotropic interaction in three use-cases from different areas. The problem of phenotropic interaction is much more vast than what was researched during this work. Nevertheless, the following key aspects constitute the main contributions of this book: • A set of design principles and a framework detailing the implementation of the concept of phenotropic interaction to make the interaction between humans and artificial systems more natural and adapted to people • An open-source toolbox for the automated representation, understanding, and reasoning with the semantics of human perceptions, which are naturally imprecise, to improve the ability for interfaces to adapt to the users’ needs and allow a more robust interaction
• The practical implementation of the methods from the developed toolbox, combined with others, to provide a phenotropic interaction-based solution to three concrete use-cases from different fields of application
These points demonstrated the possibility for the interaction between people and artificial systems to be improved by removing the necessity for strict protocols. Indeed, these are not natural for people, who typically do not follow them when communicating with other people or interacting with their environment; they are in place for the sole purpose of making interactions easier to process for machines. The abolition of strict protocols in this interaction has moreover been shown to be perceived positively by people, who judged the interaction with a phenotropic interface more natural and human-like than a traditional interaction with a machine. The toolbox and use-cases for phenotropic interaction represent only some examples of components and practical implementations of phenotropic interaction. These were developed to provide a basis for the understanding of, and future developments in the direction of, an interaction paradigm that is more adaptive and more attentive to people's needs, thanks to the application of cognitive and perceptual computing methods.
Reference 1. Joudaki, S., bin Mohamad, D., Saba, T., Rehman, A., Al-Rodhaan, M., & Al-Dhelaan, A. (2014). Vision-based sign language classification: A directional review. IETE Technical Review, 31(5), 383–391. https://doi.org/10.1080/02564602.2014.961576
Glossary
Cognitive computing A set of theories and techniques to let computers mimic the mechanisms of the human brain. It provides the basis for the practical application of cognition and learning theories to computer systems with the use of soft computing methods.
Computing with words and perceptions A process allowing computations to be performed on words, phrases, and propositions drawn from a natural language, which describe people's perceptions of different aspects of the context surrounding them. It is based on the fuzzy logic toolbox and allows the representation of, and operations on, the meaning of words.
Convolutional neural network A class of neural networks commonly used for image analysis that relies on convolution operations to extract features from data.
Design science research A research methodology that aims to enhance knowledge and practice through the design, implementation, and study of usable innovative artifacts.
Explainable artificial intelligence The study of the interpretability and understandability by users of the reasoning behind the operations performed by an artificially intelligent system, allowing them to have a certain control over the ethics, fairness, bias, and safety of the decisions taken by such a system.
Fuzzy logic An extension of classical binary logic, where the truth value of propositions cannot only be completely true or false, but also partially true and false to varying degrees.
Fuzzy system A system that relies on the methods of fuzzy logic.
Human smart city Development of smart solutions for the improvement of the well-being, quality of life, and sense of inclusion of people living, working, or
visiting a city, by applying a citizen-centric approach where citizens actively contribute to finding the ideal solutions to their problems.
Linguistic variable A variable composed of a linguistic label describing it (e.g., temperature) and a set of terms that represent the values that the variable can take (e.g., hot, medium, cold), along with the fuzzy sets representing the meaning of these terms.
Perceptual computing A set of theories and techniques allowing computers to compute and reason with perceptions and imprecise data.
Precisiation of meaning The search for a mapping from words describing perceptions into a fuzzy set representing their meaning in a formal way, on which computation can be performed.
Smart city The collection of smart solutions applied to the improvement of the efficiency of cities through the use of information technologies. This approach is traditionally very technocentric.
Virtual assistant A (mostly voice-based) conversational interface acting as a control hub for several digital services (e.g., smart lighting, weather forecast). Examples of these assistants include Google Assistant, Alexa (Amazon), and Siri (Apple).
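As a complement to the definition of a linguistic variable above, the following is a small sketch of how such a variable could be represented as a data structure: a label plus a set of terms, each carrying a fuzzy set over a normalized universe of discourse. The term names and membership breakpoints are illustrative assumptions.

```python
from dataclasses import dataclass
import numpy as np

UNIVERSE = np.linspace(0.0, 1.0, 101)  # normalized universe of discourse

def fuzzy_set(points: list[float], degrees: list[float]) -> np.ndarray:
    """Piecewise-linear membership function defined by (point, degree) breakpoints."""
    return np.interp(UNIVERSE, points, degrees)

@dataclass
class LinguisticVariable:
    label: str                     # e.g., "temperature"
    terms: dict[str, np.ndarray]   # term name -> membership values over UNIVERSE

# Hypothetical linguistic variable with three terms on a normalized temperature scale.
temperature = LinguisticVariable(
    label="temperature",
    terms={
        "cold":   fuzzy_set([0.0, 0.3, 0.5], [1.0, 1.0, 0.0]),
        "medium": fuzzy_set([0.3, 0.5, 0.7], [0.0, 1.0, 0.0]),
        "hot":    fuzzy_set([0.5, 0.7, 1.0], [0.0, 1.0, 1.0]),
    },
)

# Degree to which a crisp normalized reading (0.6 here) belongs to each term.
reading = 0.6
for name, membership in temperature.terms.items():
    degree = float(np.interp(reading, UNIVERSE, membership))
    print(f"{name}: {degree:.2f}")
```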
Appendix A
Survey: Ordering of Scalar Terms
The questions of the survey introduced in Sect. 4.5 used to build a ground truth about the ordering of scalar terms based on their perceived meaning are presented in the following. The collected data have been employed to compare people's judgment with the results obtained with different semantic similarity measures executing the same task (Sect. 4.5) and employing the automatically precisiated meaning of scalar terms (Sect. 5.3):
1. Please order the following words from the one that is the closest to the one that is the furthest from "never": always, sometimes, often, regularly
2. Please order the following words from the one that is the closest to the one that is the furthest from "freezing": burning, mild, cold, hot
3. Please order the following words from the one that is the closest to the one that is the furthest from "giant": short, tall, enormous, tiny
4. Please order the following words from the one that is the closest to the one that is the furthest from "enormous": little, small, large, miniature
5. Please order the following words from the one that is the closest to the one that is the furthest from "perfect": acceptable, good, mediocre, bad
6. Please order the following words from the one that is the closest to the one that is the furthest from "extremely": little, very, moderately, barely
7. Please order the following words from the one that is the closest to the one that is the furthest from "easy": difficult, arduous, tough, straightforward
8. Please order the following words from the one that is the closest to the one that is the furthest from "burning": freezing, very warm, warm, chilly
9. Please order the following words from the one that is the closest to the one that is the furthest from "usually": occasionally, rarely, never, regularly
10. Please order the following words from the one that is the closest to the one that is the furthest from "quick": rapid, speedy, torpid, slow
11. Please order the following words from the one that is the closest to the one that is the furthest from "always": regularly, normally, usually, often
12. Please order the following words from the one that is the closest to the one that is the furthest from "sometimes": rarely, seldom, occasionally, never
13. Please order the following words from the one that is the closest to the one that is the furthest from "chill": heated, mild, temperate, warm
14. Please order the following words from the one that is the closest to the one that is the furthest from "OK": excellent, perfect, good, great
15. Please order the following words from the one that is the closest to the one that is the furthest from "gigantic": very big, big, massive, huge
Appendix B
Semantic Similarity Evaluation Details
In Table B.1, the detailed results, including mean M and standard deviation SD, of the evaluation of Sect. 4.5 are reported. Table B.1 Mean accuracy of rankings using various semantic similarity measures
Algorithm  |      | Accuracy, crisp rank R_{A,c}  | Accuracy, similarity-based rank R_{A,s}
           |      | R_{G,h}      R_{G,avg}        | R_{G,h}      R_{G,avg}
Humans     | M    | 0.9086       0.9705           | 0.9086       0.9705
           | SD   | 0.1879       0.0956           | 0.1879       0.0956
Colombo    | M    | 0.8730       0.9094           | 0.92044      0.9569
           | SD   | 0.1772       0.1299           | 0.1391       0.0644
Lesk       | M    | 0.6038       0.6151           | 0.6038       0.6151
           | SD   | 0.3081       0.288            | 0.3081       0.288
Word2Vec   | M    | 0.6824       0.6841           | 0.6824       0.6841
           | SD   | 0.2786       0.2863           | 0.2786       0.2863
GloVe      | M    | 0.6845       0.6798           | 0.6845       0.6798
           | SD   | 0.2671       0.2869           | 0.2671       0.2869
Fasttext   | M    | 0.6957       0.6902           | 0.6957       0.6902
           | SD   | 0.2698       0.2957           | 0.2698       0.2957
BERT-eng   | M    | 0.6924       0.6902           | 0.6924       0.6902
           | SD   | 0.268        0.2874           | 0.268        0.2874
BERT-mult  | M    | 0.6815       0.6809           | 0.6815       0.6809
           | SD   | 0.2742       0.2877           | 0.2742       0.2877
S-BERT     | M    | 0.6864       0.6884           | 0.6864       0.6884
           | SD   | 0.2726       0.2833           | 0.2726       0.2833
Appendix C
Evaluation of the FVA Prototype
The survey introduced in Sect. 7.3 is presented in the following. It was used to guide the participants in the experiment with the FVA prototype and to analyze the accuracy of the prototype, the users' perceptions of the prototype, and their thoughts regarding their interaction with it. The survey compares an assistant implementing phenotropic interaction with one implementing a traditional approach; each participant used the two in random order.
Smart Assistants In this survey, you will have an interaction with two different smart assistants, for which you can create some "If This Then That" (IFTTT) rules. You will be guided in completing a couple of tasks with both assistants, and some questions will be asked about your experience with each of them. You will first have to create some custom rules and then have an interaction with the assistant. The interaction is text-based as this is an early-stage prototype, but you could imagine that the exact same could be done with voice-based assistants such as Google Assistant, Siri, Alexa, etc. During the study, you will be asked to do some things on an external website. Since the links to the external website will change during the experiment, please always use the provided link to access the IFTTT interface and the assistant. Smart Assistant A This virtual assistant allows you to create some custom rules, and based on those, it tries to understand the meaning of similar queries and react accordingly. For example, if you create a rule for which you turn on the air conditioning at 100% of its maximum power when you say "It's very hot in here," this assistant tries to understand what should be done when the command you provide is "It's cold in here" (turn off the air conditioning). You will now be guided in doing some tasks with this smart assistant and will be asked some questions about this interaction.
Smart Assistant A—Scenarios 1. Go to http://smartassistant.ga/ifttt, and create a rule for which when you ask to set a “hot” temperature in the living room, the heating is set to the wanted power (i.e., in the range [0–100%]). Please create only one rule. 2. Ask the assistant http://smartassistant.ga/assistant/A to set a “hot” temperature in the living room, according to the created rule. 3. Did the assistant perform the requested task correctly? .□ Yes .□ No, it did not understand the query .□ No, it performed the wrong action Comment: 4. Without creating a new rule, ask the assistant http://smartassistant.ga/assistant/ A to set a “warm” temperature in the living room. 5. Did the assistant perform the requested task correctly? .□ Yes .□ No, it did not understand the query .□ No, it performed the wrong action Comment: 6. If “No” in the previous answer: Since you created a rule similar to the one that you just tried, would you prefer the assistant to learn from that rule how to interpret this new request and react correctly accordingly, or would you rather define a new rule for every specific case (e.g., hot, cold, medium, warm, chilly, freezing, burning, . . . temperatures)? .□ The assistant should learn to understand the query .□ I would rather define exactly all the rules that I need Comment: 7. Ask once again the assistant http://smartassistant.ga/assistant/A to set a “hot” temperature. However, this time, imagine that a long time passed since when you defined the rule for this, so you should use a slightly different wording compared to the one that you defined, as if you forgot the exact wording that you defined for the query. 8. Did the assistant perform the requested task correctly? .□ Yes .□ No, it did not understand the query .□ No, it performed the wrong action Comment: 9. If “No” in the previous answer: Since you created a rule similar to the one that you just tried, would you prefer the assistant to learn from that rule how to interpret this new request and react correctly accordingly, or would you rather define a new rule for every specific case (e.g., hot, cold, medium, warm, chilly, freezing, burning, . . . temperatures)? .□ The assistant should learn to understand the query .□ I would rather define exactly all the rules that I need Comment: 10. Go to http://smartassistant.ga/ifttt, and create a rule for which when you ask to set a “bright” lighting in the office, the minimum illuminance threshold in the room is set to the desired value (i.e., in the range [100 lux (darkest)–3000 lux
(brightest)]) in order to automatically control the lighting in the room with the specified illuminance value. Please create only one rule. 11. Ask the assistant http://smartassistant.ga/assistant/A to set a "bright" lighting in the office, according to the rule that you created. 12. Did the assistant perform the requested task correctly? □ Yes □ No, it did not understand the query □ No, it performed the wrong action Comment: 13. Without creating a new rule, ask the assistant http://smartassistant.ga/assistant/A to set a "dark" lighting in the office. 14. Did the assistant perform the requested task correctly? □ Yes □ No, it did not understand the query □ No, it performed the wrong action Comment: 15. If "No" in the previous answer: Since you created a rule similar to the one that you just tried, would you prefer the assistant to learn from that rule how to interpret this new request and react correctly accordingly, or would you rather define a new rule for every specific case (e.g., hot, cold, medium, warm, chilly, freezing, burning, ... temperatures)? □ The assistant should learn to understand the query □ I would rather define exactly all the rules that I need Comment: 16. Brightness, like other properties, could also be described using adjectives such as "high brightness," "medium brightness," and "low brightness." For us, it is easy to translate from "bright" to "high brightness" for example. Try to ask the assistant http://smartassistant.ga/assistant/A to set a "high brightness" in the office. 17. Did the assistant perform the requested task correctly? □ Yes □ No, it did not understand the query □ No, it performed the wrong action Comment: 18. If "No" in the previous answer: Since you created a rule similar to the one that you just tried, would you prefer the assistant to learn from that rule how to interpret this new request and react correctly accordingly, or would you rather define a new rule for every specific case (e.g., hot, cold, medium, warm, chilly, freezing, burning, ... temperatures)? □ The assistant should learn to understand the query □ I would rather define exactly all the rules that I need Comment:
Smart Assistant A—General Questions
How much do you agree with the following statements? (1 = strongly disagree, 5 = strongly agree)
Statement                                                              1    2    3    4    5
The assistant behaved as expected                                      □    □    □    □    □
The assistant was robust                                               □    □    □    □    □
The assistant was flexible                                             □    □    □    □    □
The assistant was understanding                                        □    □    □    □    □
The assistant was unpredictable                                        □    □    □    □    □
The assistant was reasoning in a human-like way                        □    □    □    □    □
The interaction with the assistant was natural                         □    □    □    □    □
The interaction with the assistant was human-like                      □    □    □    □    □
The interaction with the assistant required a high cognitive load
  (e.g., remembering the queries)                                      □    □    □    □    □
The interaction with the assistant was frustrating                     □    □    □    □    □
Smart Assistant B
This smart assistant allows you to create custom rules and replicates the tasks that you programmed when you give the corresponding command. For example, if you create a rule that turns on the air conditioning at 100% of its maximum power when you say “It’s very hot in here,” the assistant will learn this rule and execute the task whenever this command is used. You will now be guided through some tasks with this smart assistant and asked some questions about the interaction.
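As a reading aid, the behaviour described above can be pictured with the following minimal sketch. It reuses the hypothetical rule format from the earlier sketch; the function name, the verbatim-matching logic, and the example values are assumptions of this illustration, not the prototype's implementation.

# Minimal sketch of a rule-replicating assistant: a task is executed only when
# the command matches a learned trigger verbatim. Names and logic are illustrative.
def handle_command(command, rules):
    """Execute a rule only if the command matches a learned trigger exactly."""
    for rule in rules:
        if command.strip().lower() == rule["trigger"].lower():
            return f"Setting {rule['device']} in {rule['room']} to {rule['value']}"
    return "Sorry, I did not understand the query."

rules = [{
    "trigger": "it's very hot in here",
    "device": "air conditioning",
    "room": "living room",
    "value": 100,   # percent of maximum power, as in the example above
}]

print(handle_command("It's very hot in here", rules))    # matches the learned rule
print(handle_command("It's quite warm in here", rules))  # unseen wording: no rule fires

Under such verbatim matching, a request with unseen wording triggers no rule; the scenarios below probe exactly this kind of situation.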
Smart Assistant B—Scenarios

1. Go to http://smartassistant.ga/ifttt and create a rule so that, when you ask to set a “hot” temperature in the living room, the heating is set to the desired power (in the range [0–100%]). Please create only one rule.
2. Ask the assistant http://smartassistant.ga/assistant/B to set a “hot” temperature in the living room, according to the rule you created.
3. Did the assistant perform the requested task correctly?
   □ Yes  □ No, it did not understand the query  □ No, it performed the wrong action
   Comment:
4. Without creating a new rule, ask the assistant http://smartassistant.ga/assistant/B to set a “warm” temperature in the living room.
5. Did the assistant perform the requested task correctly?
   □ Yes  □ No, it did not understand the query  □ No, it performed the wrong action
   Comment:
6. If you answered “No” to the previous question: since you already created a rule similar to the one you just tried, would you prefer the assistant to learn from that rule how to interpret this new request and react accordingly, or would you rather define a new rule for every specific case (e.g., hot, cold, medium, warm, chilly, freezing, burning, . . . temperatures)?
   □ The assistant should learn to understand the query
   □ I would rather define exactly all the rules that I need
   Comment:
7. Ask the assistant http://smartassistant.ga/assistant/B once again to set a “hot” temperature. This time, imagine that a long time has passed since you defined the corresponding rule, so use a slightly different wording from the one you defined, as if you had forgotten the exact wording of the query.
8. Did the assistant perform the requested task correctly?
   □ Yes  □ No, it did not understand the query  □ No, it performed the wrong action
   Comment:
9. If you answered “No” to the previous question: since you already created a rule similar to the one you just tried, would you prefer the assistant to learn from that rule how to interpret this new request and react accordingly, or would you rather define a new rule for every specific case (e.g., hot, cold, medium, warm, chilly, freezing, burning, . . . temperatures)?
   □ The assistant should learn to understand the query
   □ I would rather define exactly all the rules that I need
   Comment:
10. Go to http://smartassistant.ga/ifttt and create a rule so that, when you ask to set a “bright” lighting in the office, the minimum illuminance threshold in the room is set to the desired value (in the range [100 lux (darkest)–3000 lux (brightest)]), so that the lighting in the room is automatically controlled to reach the specified illuminance. Please create only one rule.
11. Ask the assistant http://smartassistant.ga/assistant/B to set a “bright” lighting in the office, according to the rule that you created.
12. Did the assistant perform the requested task correctly?
    □ Yes  □ No, it did not understand the query  □ No, it performed the wrong action
    Comment:
13. Without creating a new rule, ask the assistant http://smartassistant.ga/assistant/B to set a “dark” lighting in the office.
14. Did the assistant perform the requested task correctly?
    □ Yes  □ No, it did not understand the query  □ No, it performed the wrong action
    Comment:
15. If you answered “No” to the previous question: since you already created a rule similar to the one you just tried, would you prefer the assistant to learn from that rule how to interpret this new request and react accordingly, or would you rather define a new rule for every specific case (e.g., hot, cold, medium, warm, chilly, freezing, burning, . . . temperatures)?
    □ The assistant should learn to understand the query
    □ I would rather define exactly all the rules that I need
    Comment:
16. Brightness, like other properties, could also be described with phrases such as “high brightness,” “medium brightness,” and “low brightness.” For us, it is easy to translate “bright” into “high brightness,” for example. Try asking the assistant http://smartassistant.ga/assistant/B to set a “high brightness” in the office (an illustrative sketch of such a wording translation is given after the questions below).
17. Did the assistant perform the requested task correctly?
    □ Yes  □ No, it did not understand the query  □ No, it performed the wrong action
    Comment:
18. If you answered “No” to the previous question: since you already created a rule similar to the one you just tried, would you prefer the assistant to learn from that rule how to interpret this new request and react accordingly, or would you rather define a new rule for every specific case (e.g., hot, cold, medium, warm, chilly, freezing, burning, . . . temperatures)?
    □ The assistant should learn to understand the query
    □ I would rather define exactly all the rules that I need
    Comment:

Smart Assistant B—General Questions
How much do you agree with the following statements? (1 = strongly disagree, 5 = strongly agree)

Statement                                                              1    2    3    4    5
The assistant behaved as expected                                      □    □    □    □    □
The assistant was robust                                               □    □    □    □    □
The assistant was flexible                                             □    □    □    □    □
The assistant was understanding                                        □    □    □    □    □
The assistant was unpredictable                                        □    □    □    □    □
The assistant was reasoning in a human-like way                        □    □    □    □    □
The interaction with the assistant was natural                         □    □    □    □    □
The interaction with the assistant was human-like                      □    □    □    □    □
The interaction with the assistant required a high cognitive load
  (e.g., remembering the queries)                                      □    □    □    □    □
The interaction with the assistant was frustrating                     □    □    □    □    □
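To illustrate the kind of wording generalisation that items 16–18 above ask about (treating, for example, “bright” and “high brightness” as the same request), the sketch below maps different phrasings to a common (property, degree) intent. The vocabulary, the mapping, and the function are assumptions of this illustration and do not reflect how the prototype actually resolves wordings.

# Purely illustrative sketch: normalise different wordings to one intent so that
# "bright" and "high brightness" lead to the same action. Assumed vocabulary only.
LEXICON = {
    "bright": ("brightness", "high"),
    "high brightness": ("brightness", "high"),
    "dark": ("brightness", "low"),
    "low brightness": ("brightness", "low"),
    "hot": ("temperature", "high"),
    "warm": ("temperature", "medium-high"),
}

def normalise(request):
    """Return the (property, degree) intent of the first known term in the request."""
    text = request.lower()
    # Check longer phrases first so "high brightness" is not shadowed by "bright".
    for term in sorted(LEXICON, key=len, reverse=True):
        if term in text:
            return LEXICON[term]
    return None

print(normalise("Set a bright lighting in the office"))   # ('brightness', 'high')
print(normalise("Set a high brightness in the office"))   # ('brightness', 'high')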